Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
56 | Please document this function. | |
208 | Unless I am missing something, the previous comment is a plain copy of the previous one, whereas what you want is: // %ECX<def> = COPY %EAX // ... nothing clobbered EAX. // %ECX<def> = COPY %EAX I.e., same def and src for both copies. That being said, it seems strange to me that we have to support the full reg case that you exposed in the example. Indeed, this does not look like a copy propagation, but, like you said, a redundant copy. In other words, I would have expected we catch this in MachineCSE and if this is not the case, I believe it makes more sense to understand why and do it thereI. |
Hi Matt,
No, this is complementary to r248611 because one works before regalloc and the other after.
Cheers,
-Quentin
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
168 | Cleaning kill for Def is not always correct. In the second call for eraseRedundantCopy() below, MI's Src is passed as the second parameter, which is Def here, but we need to clean kill for MI's Def instead of MI's Src. | |
193 | Looks like the first eraseRedundantCopy() is for : %A = COPY %B : %B = COPY %A, %A = COPY %B : %A = COPY %B. |
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
159 | In MRI->isReserved(Def), we should use the actual register defined in MI, instead of the Def passed as a parameter. In case of the second call for eraseRedundantCopy() below in line 209, MI's Src is passed as Def here. |
New versions addressing comments, cleaning up the code, improving comments and extending the tests.
Update revision which turns an assert in isNopCopy back to an if() and adjusts two AMDGPU testcases to the improved copy elimination.
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
211 | Shouldn't we need to add !MRI->isReserved(Src) as well ? |
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
211 | This is fine. Elminating a redundant copy of a reserved register is bad because you either write to a reserved register which may have effects you don't know about so you just cannot erase the write, or the redundant copy reads from a reserved register in which case we cannot be sure that is still has the previous value. In the case of a dead copy reading form a reserved register we do not care though: We don't know the value that is read but nobody reads the value produced by the dead copy anyway. |
lib/CodeGen/MachineCopyPropagation.cpp | ||
---|---|---|
125 | Would it make sense to give an exception when Src is zero register? In AArch64, I can see redundant zero copies after RA like : %X8<def> = COPY %XZR STRXui %X8<kill>, <fi#36>, 0 %X8<def> = COPY %XZR STRXui %X8<kill>, <fi#34>, 0 |
Hi Jun,
I’d rather not start messing with reserved registers in a generic way.
If you want to catch your example, maybe this is something you can teach in the AArch64RedundantCopyElimination pass.
Cheers,
-Quentin
test/CodeGen/X86/machine-copy-prop.mir | ||
---|---|---|
17 | Didn’t you fixed those typos in a previous commit?! |
I’d rather not start messing with reserved registers in a generic way. If you want to catch your example, maybe this is something you can teach in the AArch64RedundantCopyElimination pass.
Make sense to me.
Uh, I think this has probably worked before and fails after my changes, I don't want to cause regressions here...
Uh, I think this has probably worked before and fails after my changes, I don't want to cause regressions here...
I don't think the case I mentioned above was handled even before your patch.
You are right, this wasn't possible before of course, because it was an A<-B; A<-B pattern.
I just discussed this a bit with Quentin and we think that eliminating the second copy in an "A<-B; A<-B" is invalid if B is a reserved register, because we cannot be sure that we are actually reading the same value. For an example look at ARM32s PC register. I'll probably take a few minutes to create a patch that adds some comments to TargetRegisterInfo that define what reserved registers mean.
Adding isConstantPhysReg() would be good in this pass. As far as I check, however, isConstantPhysReg() returns false for WZR and XZR in AArch64, which I think we need to fix (Bug 25457).
This looks good to me, assuming other reviewers are OK with adding isConstantPhysReg() in a follow-up patch.
I'm not convinced that isConstantPhysReg() is the answer. This function currently simply checks whether there is more than one definition for a register. This still does not allow us to assume the same value for arbitrary reserved registers IMO (think of the PC register on ARM) it would also fail if we have more than 1 write of the zero registers (not sure about ARM but on sparc it is common to write to the zero register). See also the llvm-dev I just posted about unallocatable/reserved registers.
LGTM plus maybe a test for reserved registers (i.e., don’t do anything for those).
Q.
test/CodeGen/AMDGPU/llvm.amdgcn.rsq.clamp.ll | ||
---|---|---|
34 ↗ | (On Diff #48874) | I guess this is the rebase messing around. |
test/CodeGen/X86/machine-copy-prop.mir | ||
184 | Could you add a test with Reserved register for both cases source and destination? |
Please document this function.
Although based on the uses, I found Src and Def being confusing because they do not actually represent the source and destination of MI.