Page MenuHomePhabricator

MachineCopyPropagation: Catch copies of the form A<-B;A<-B
ClosedPublic

Authored by MatzeB on Feb 19 2016, 8:07 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 48573.Feb 19 2016, 8:07 PM
MatzeB retitled this revision from to MachineCopyPropagation: Catch copies of the form A<-B;A<-B.
MatzeB updated this object.
MatzeB added a reviewer: qcolombet.
MatzeB set the repository for this revision to rL LLVM.
MatzeB added a subscriber: llvm-commits.
qcolombet added inline comments.Feb 22 2016, 11:06 AM
lib/CodeGen/MachineCopyPropagation.cpp
55 ↗(On Diff #48573)

Please document this function.
Although based on the uses, I found Src and Def being confusing because they do not actually represent the source and destination of MI.

209 ↗(On Diff #48573)

Unless I am missing something, the previous comment is a plain copy of the previous one, whereas what you want is:

//  %ECX<def> = COPY %EAX
//  ... nothing clobbered EAX.
//  %ECX<def> = COPY %EAX

I.e., same def and src for both copies.

That being said, it seems strange to me that we have to support the full reg case that you exposed in the example. Indeed, this does not look like a copy propagation, but, like you said, a redundant copy. In other words, I would have expected we catch this in MachineCSE and if this is not the case, I believe it makes more sense to understand why and do it thereI.
Anyhow, MachineCSE would still be useful in cases where sub register are involved. I.e., I am fine with the code living here, but we would need a test case with sub register.
E.g.,
eax = ecx
al = cl

arsenm added a subscriber: arsenm.Feb 22 2016, 12:00 PM

Does this obsolete r248611?

qcolombet edited edge metadata.Feb 22 2016, 1:17 PM
qcolombet added a subscriber: qcolombet.

Hi Matt,

No, this is complementary to r248611 because one works before regalloc and the other after.

Cheers,
-Quentin

junbuml added inline comments.Feb 22 2016, 3:09 PM
lib/CodeGen/MachineCopyPropagation.cpp
155 ↗(On Diff #48573)

Cleaning kill for Def is not always correct. In the second call for eraseRedundantCopy() below, MI's Src is passed as the second parameter, which is Def here, but we need to clean kill for MI's Def instead of MI's Src.

194 ↗(On Diff #48573)

Looks like the first eraseRedundantCopy() is for :

%A = COPY %B

:

%B = COPY %A,
while the second eraseRedundantCopy() is for :

%A = COPY %B

:

%A = COPY %B.
So, it will good to have comment for each.

junbuml added inline comments.Feb 23 2016, 10:06 AM
lib/CodeGen/MachineCopyPropagation.cpp
146 ↗(On Diff #48573)

In MRI->isReserved(Def), we should use the actual register defined in MI, instead of the Def passed as a parameter. In case of the second call for eraseRedundantCopy() below in line 209, MI's Src is passed as Def here.

MatzeB updated this revision to Diff 48856.Feb 23 2016, 4:12 PM
MatzeB edited edge metadata.
MatzeB marked 4 inline comments as done.

New versions addressing comments, cleaning up the code, improving comments and extending the tests.

MatzeB updated this revision to Diff 48874.Feb 23 2016, 6:30 PM
MatzeB marked an inline comment as done.

Update revision which turns an assert in isNopCopy back to an if() and adjusts two AMDGPU testcases to the improved copy elimination.

MatzeB added inline comments.Feb 23 2016, 6:30 PM
lib/CodeGen/MachineCopyPropagation.cpp
146 ↗(On Diff #48573)

I think we cannot eliminate any copy to/from a reserved reg, that's what I put in place now.

155 ↗(On Diff #48573)

Good catch! Should be fixed now.

209 ↗(On Diff #48573)

I reorganized the whole comment hope it is easier to understand now.

junbuml added inline comments.Feb 23 2016, 8:52 PM
lib/CodeGen/MachineCopyPropagation.cpp
212 ↗(On Diff #48874)

Shouldn't we need to add !MRI->isReserved(Src) as well ?

MatzeB marked an inline comment as done.Feb 24 2016, 3:46 PM
MatzeB added inline comments.
lib/CodeGen/MachineCopyPropagation.cpp
212 ↗(On Diff #48874)

This is fine. Elminating a redundant copy of a reserved register is bad because you either write to a reserved register which may have effects you don't know about so you just cannot erase the write, or the redundant copy reads from a reserved register in which case we cannot be sure that is still has the previous value.

In the case of a dead copy reading form a reserved register we do not care though: We don't know the value that is read but nobody reads the value produced by the dead copy anyway.

junbuml added inline comments.Feb 25 2016, 9:16 AM
lib/CodeGen/MachineCopyPropagation.cpp
125 ↗(On Diff #48874)

Would it make sense to give an exception when Src is zero register? In AArch64, I can see redundant zero copies after RA like :

%X8<def> = COPY %XZR
STRXui %X8<kill>, <fi#36>, 0
%X8<def> = COPY %XZR
STRXui %X8<kill>, <fi#34>, 0

Hi Jun,

I’d rather not start messing with reserved registers in a generic way.

If you want to catch your example, maybe this is something you can teach in the AArch64RedundantCopyElimination pass.

Cheers,
-Quentin

test/CodeGen/X86/machine-copy-prop.mir
17 ↗(On Diff #48874)

Didn’t you fixed those typos in a previous commit?!
Forgot to rebase maybe?

I’d rather not start messing with reserved registers in a generic way.
If you want to catch your example, maybe this is something you can teach in the AArch64RedundantCopyElimination pass.

Make sense to me.

MatzeB marked an inline comment as done.Feb 25 2016, 10:57 AM

Hi Jun,

I’d rather not start messing with reserved registers in a generic way.

If you want to catch your example, maybe this is something you can teach in the AArch64RedundantCopyElimination pass.

Cheers,
-Quentin

Uh, I think this has probably worked before and fails after my changes, I don't want to cause regressions here...

Uh, I think this has probably worked before and fails after my changes, I don't want to cause regressions here...

I don't think the case I mentioned above was handled even before your patch.

Uh, I think this has probably worked before and fails after my changes, I don't want to cause regressions here...

I don't think the case I mentioned above was handled even before your patch.

You are right, this wasn't possible before of course, because it was an A<-B; A<-B pattern.

I just discussed this a bit with Quentin and we think that eliminating the second copy in an "A<-B; A<-B" is invalid if B is a reserved register, because we cannot be sure that we are actually reading the same value. For an example look at ARM32s PC register. I'll probably take a few minutes to create a patch that adds some comments to TargetRegisterInfo that define what reserved registers mean.

gberry added a subscriber: gberry.Feb 25 2016, 11:41 AM

Couldn't you add a check for isConstantPhysReg() to handle the case Jun mentioned?

That’ll work, though it should be a follow-up patch IMO.

Q.

Adding isConstantPhysReg() would be good in this pass. As far as I check, however, isConstantPhysReg() returns false for WZR and XZR in AArch64, which I think we need to fix (Bug 25457).

Adding isConstantPhysReg() would be good in this pass. As far as I check, however, isConstantPhysReg() returns false for WZR and XZR in AArch64, which I think we need to fix (Bug 25457).

Yes let's do this one in follow up commits. Is it okay to push this one?

This looks good to me, assuming other reviewers are OK with adding isConstantPhysReg() in a follow-up patch.

This looks good to me, assuming other reviewers are OK with adding isConstantPhysReg() in a follow-up patch.

I'm not convinced that isConstantPhysReg() is the answer. This function currently simply checks whether there is more than one definition for a register. This still does not allow us to assume the same value for arbitrary reserved registers IMO (think of the PC register on ARM) it would also fail if we have more than 1 write of the zero registers (not sure about ARM but on sparc it is common to write to the zero register). See also the llvm-dev I just posted about unallocatable/reserved registers.

qcolombet accepted this revision.Feb 25 2016, 7:01 PM
qcolombet edited edge metadata.

LGTM plus maybe a test for reserved registers (i.e., don’t do anything for those).

Q.

test/CodeGen/AMDGPU/llvm.amdgcn.rsq.clamp.ll
34 ↗(On Diff #48874)

I guess this is the rebase messing around.

test/CodeGen/X86/machine-copy-prop.mir
184 ↗(On Diff #48874)

Could you add a test with Reserved register for both cases source and destination?

This revision is now accepted and ready to land.Feb 25 2016, 7:01 PM
This revision was automatically updated to reflect the committed changes.