This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Remove redundant register copies in late peephole
Needs ReviewPublic

Authored by nemanjai on Nov 8 2017, 1:57 AM.

Details

Summary

In a number of cases, we will end up with redundant register copies quite late in the pipeline (i.e. after AADB, etc.).
A sequence such as this:

mr r30, r3
...        # no clobbering of r30 or r3
mr r3, r30 # this is redundant

Earlier passes sometimes leave sequences such as this behind because the sequence in between clobbers r3, but AADB will rewrite it.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai created this revision.Nov 8 2017, 1:57 AM
inouehrs added inline comments.
lib/Target/PowerPC/PPCPreEmitPeephole.cpp
80

Do you eliminate only the second mr, not both mrs?
We have additional opportunity to erase first mr by rewriting uses of r3 with r30.

nemanjai added inline comments.Nov 9 2017, 6:04 AM
lib/Target/PowerPC/PPCPreEmitPeephole.cpp
80

This isn't safe to do in general. The motivating test case for example has a call. That call needs the value to be in r3, so I can't just tell it to use r30.

This is fine except that it's O(N^2). Can you please either rewrite this to do a scan first to collect candidate pairs and then check them (which I'd prefer) or add a threshold for the forward search (which would be okay too: given a sufficiently large BB, the chances of having these opportunities with a large separation seems low).

stefanp added inline comments.
lib/Target/PowerPC/PPCPreEmitPeephole.cpp
90

Once the second mr is deleted the first mr could become a trivially dead instruction. (It is certainly not guaranteed to be dead but it is possible that it is dead.) Is there a pass after this to clean that up? I think this is just pre-emit so I don't think there is anything else after this.

It may be worth doing a quick test to see if the first mr is dead once we have removed the second one.

130

Maybe a silly question but...
I don't understand why these flags are being cleared.
In this case MI is the first mr instruction. So from what I see:

mr r30, r3

The r30 is being overwritten but the r3 is only a use. Why would these flags even be set to true in the first place for r3?

test/CodeGen/PowerPC/remove-cyclic-mr.ll
21

I'm debating if this test is over-specified.
We are testing

mr 30, 3
... 
mr 3, 30

Do we need anything above the first mr and below the second mr?
I just don't want this test to fail every now and then when we change something in scheduling or register allocation (or any other pass that might make the code a little different.)