The COPY inserted in the epilog block before the return instruction as part
of ABI lowering can be removed during machine copy propagation if the same
register is used earlier in a wwm operation, which requires a prolog/epilog
wwm-spill store/restore to preserve the register's inactive lanes. Because
the spill restore in the epilog redefines the register, the preceding COPY
appears to be dead during machine-cp. To avoid this, mark the same register
as a tied-op in the spill restore instruction so that the COPY has a use.
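
To illustrate why the COPY looks dead, here is a toy C++ model. It is not LLVM code: `ToyInst`, `findDeadDefs`, and `buildEpilog` are hypothetical names invented for this sketch, and lanes and EXEC masking are not modeled at all. It assumes a simplified rule that a definition is dead when the same register is redefined before any instruction reads it, which is only an approximation of what machine-cp does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction; this is not LLVM's MachineInstr.
struct ToyInst {
  std::string Name;
  std::vector<int> Defs; // registers written
  std::vector<int> Uses; // registers read
};

static bool contains(const std::vector<int> &Regs, int Reg) {
  return std::find(Regs.begin(), Regs.end(), Reg) != Regs.end();
}

// Returns the indices of instructions whose every def is overwritten before
// it is read within the block. A def that reaches the end of the block
// without being redefined is conservatively treated as live.
std::vector<std::size_t> findDeadDefs(const std::vector<ToyInst> &Block) {
  std::vector<std::size_t> Dead;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Block[I].Defs.empty())
      continue;
    bool AllDead = true;
    for (int Reg : Block[I].Defs) {
      bool RegDead = false;
      for (std::size_t J = I + 1; J < Block.size(); ++J) {
        if (contains(Block[J].Uses, Reg))
          break; // read before being overwritten: live
        if (contains(Block[J].Defs, Reg)) {
          RegDead = true; // overwritten before any read: dead
          break;
        }
      }
      if (!RegDead)
        AllDead = false;
    }
    if (AllDead)
      Dead.push_back(I);
  }
  return Dead;
}

// Builds the epilog shape described in the summary:
//   COPY into the return register, wwm spill restore, RETURN.
// TieRestore models adding the restored register as a tied use.
std::vector<ToyInst> buildEpilog(bool TieRestore) {
  const int VGPR0 = 0; // return register (toy numbering)
  const int Val = 100; // register holding the return value (toy numbering)
  ToyInst Copy{"COPY", {VGPR0}, {Val}};
  ToyInst Restore{"WWM_RESTORE", {VGPR0}, {}};
  if (TieRestore)
    Restore.Uses.push_back(VGPR0); // the tied use keeps the COPY alive
  ToyInst Ret{"RETURN", {}, {VGPR0}};
  return {Copy, Restore, Ret};
}
```

In this model, `findDeadDefs(buildEpilog(false))` reports the COPY at index 0, while `findDeadDefs(buildEpilog(true))` reports nothing, because the restore now reads the register before overwriting it.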
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Unit Tests
Time | Test |
---|---|
60,040 ms | x64 debian > MLIR.Examples/standalone::test.toy |
Event Timeline
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
1669 | Using LiveRegs here won't help. The CSRs (including wwm-regs) are marked LiveOut by initLiveRegs() for an epilog block. Because of that, the liveness query would always hold true and we would add the tied-op unconditionally for all spill reloads. The intention of this patch is to tie the return value register to its own spill restore (if it has one) to avoid an incorrect optimization currently performed on it because of the wwm-restore inserted afterward. |
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
1669 | I'm having trouble following this. The live-outness is already represented by the $vgpr0 use on the return instruction. You're special-casing a search for the return instruction in the low-level return utility. Is this just to filter out spills in other contexts, and is that why the liveness query doesn't work? If you performed the liveness query in buildEpilogRestore and passed in that you needed the tied operand, would that let you use the liveness query? |
llvm/test/CodeGen/AMDGPU/tied-op-for-wwm-scratch-reg-spill-restore.mir | ||
37 | Can you add some additional tests with returned register tuples, and one where the def'd register is in the CSR range? Also, add a test where the spilled register isn't used as the return value. |
Used MachineInstr::readsRegister to check whether the RETURN instruction reads the spill-reg.
Also added more tests.
LGTM with nit
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
1669 | MBB.isReturnBlock is redundant with MI->isReturn. isReturnBlock just checks if the last instruction is a return |
Exact register equality is almost always wrong
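
The returned-tuple case requested above shows why an equality check can miss a read. The following is a toy C++ sketch, not LLVM code: `ToyReg`, `ToyReturn`, `readsExactly`, and `readsOverlapping` are hypothetical names, and registers are modeled as bitmasks of register units purely for illustration. The overlap check is meant to mirror, in spirit, what an overlap-aware query such as MachineInstr::readsRegister is understood to do when given target register info; that correspondence is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register: one bit per register unit it covers.
struct ToyReg {
  std::uint32_t Units;
};

const ToyReg VGPR0{0x1};
const ToyReg VGPR1{0x2};
const ToyReg VGPR2{0x4};
const ToyReg VGPR0_VGPR1{0x3}; // 64-bit tuple covering VGPR0 and VGPR1

bool sameReg(ToyReg A, ToyReg B) { return A.Units == B.Units; }
bool overlaps(ToyReg A, ToyReg B) { return (A.Units & B.Units) != 0; }

// Hypothetical return instruction that only records the registers it reads.
struct ToyReturn {
  std::vector<ToyReg> Reads;
};

// Exact comparison: misses a spill register that is part of a returned tuple.
bool readsExactly(const ToyReturn &Ret, ToyReg Reg) {
  for (ToyReg R : Ret.Reads)
    if (sameReg(R, Reg))
      return true;
  return false;
}

// Overlap comparison: also matches sub- and super-registers.
bool readsOverlapping(const ToyReturn &Ret, ToyReg Reg) {
  for (ToyReg R : Ret.Reads)
    if (overlaps(R, Reg))
      return true;
  return false;
}

// A return that reads the tuple $vgpr0_vgpr1 (toy encoding).
ToyReturn tupleReturn() {
  ToyReturn Ret;
  Ret.Reads.push_back(VGPR0_VGPR1);
  return Ret;
}
```

With `tupleReturn()`, `readsExactly(Ret, VGPR1)` is false even though the return value occupies that register, whereas `readsOverlapping(Ret, VGPR1)` is true and `readsOverlapping(Ret, VGPR2)` stays false.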