Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I can see why you would put it in SIShrinkInstructions, but this isn't really a shrinking problem. Can you add a comment that this is really just a generic post-RA peephole that should probably be in a separate pass?
lib/Target/AMDGPU/SIInstrInfo.h | ||
---|---|---|
219 ↗ | (On Diff #167773) | !isReg() and continue to reduce indentation |
226–227 ↗ | (On Diff #167773) | I think this is a bit difficult to read with the () and .any() on the temporary. Can you use a variable for the & |
234–238 ↗ | (On Diff #167773) | How is this different from MachineInstr::readsRegister? Does that just to add a subreg operand? |
246–247 ↗ | (On Diff #167773) | RegSubRegPair instead of std::pair? |
286 ↗ | (On Diff #167773) | use_nodbg_operands? |
401 ↗ | (On Diff #167773) | ST.hasSwap() or something implemented by checking gfx9-insts |
test/CodeGen/AMDGPU/v_swap_b32.mir | ||
502 ↗ | (On Diff #167773) | Needs a test where the mov/copy has extra implicit uses / defs. I think this should probably skip them if there are. In particular I'm worried about the super-register copy pattern where there's an implicit def of the super-register on the first copy in the sequence |
lib/Target/AMDGPU/SIInstrInfo.h | ||
---|---|---|
234–238 ↗ | (On Diff #167773) | Yes, reads/modifiesRegister would return true if a super-register is modified even if a particular lane of interest does not. Some of the foldings in the test would fail if I use it. |
lib/Target/AMDGPU/AMDGPUSubtarget.h | ||
---|---|---|
519 ↗ | (On Diff #167882) | There's a specific GFX9Insts feature (although we have some others that should be checking it instead of the generation already) |
lib/Target/AMDGPU/SIInstrInfo.h | ||
234–238 ↗ | (On Diff #167773) | This needs a comment explaining it then |
286 ↗ | (On Diff #167773) | I still see it as use_operands? |
test/CodeGen/AMDGPU/v_swap_b32.mir | ||
521 ↗ | (On Diff #167882) | %2 needs another use which might break |
Relaxed match to allow multiple uses of temp register and keep initial mov to temp:
mov t, x
mov x, y
mov y, t
>
mov t, x (t is potentially dead and move eliminated)
v_swap_b32 x, y