Page MenuHomePhabricator

[AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering.
ClosedPublic

Authored by mjbedy on Dec 11 2019, 7:35 PM.

Details

Summary
  • SI Whole Quad Mode phase is replacing WQM pseudo instructions with v_mov instructions.

While this is necessary for the special handling of moving results out of WWM live ranges,
it is not necessary for WQM live ranges. The result is a v_mov from a register to itself after every
WQM operation. This change uses a COPY psuedo in these cases, which allows the register
allocator to coalesce the moves away.

Diff Detail

Event Timeline

mjbedy created this revision.Dec 11 2019, 7:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 11 2019, 7:35 PM
mjbedy added a reviewer: foad.Dec 11 2019, 7:43 PM
mjbedy updated this revision to Diff 233490.Dec 11 2019, 7:58 PM

Fix typo in commit message.

mjbedy retitled this revision from [AMDGPU] Remove unnessary v_mov from a register to itself in WQM lowering. to [AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering..Dec 11 2019, 7:59 PM
arsenm added inline comments.Dec 11 2019, 8:47 PM
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
876–877

These should only ever have been moves to begin with? Just remove the one exec operand?

This basically LGTM

llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
876–877

They were move-like: WQM, SOFT_WQM, V_SET_INACTIVE_xx. I think it makes sense though to just remove that one operand and maybe assert there are no other implicit uses.

mjbedy added inline comments.Dec 13 2019, 11:30 AM
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
876–877

Now that I look closer at this, I'm not even sure what this loop is doing (or the one in the case above that I modeled it on)? The exec operands are implicit so this doesn't touch them. This is what comes into the phase:

%12:vgpr_32 = WQM %10:vgpr_32, implicit $exec, implicit $exec

And this is what comes out:

%12:vgpr_32 = COPY %10:vgpr_32, implicit $exec, implicit $exec

foad added inline comments.Dec 16 2019, 3:36 AM
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
876–877

The loop removes any extra explicit operands after the first two, which will become the destination and source of the copy. Currently the SET_INACTIVE instructions have an extra explicit operand called "inactive".

Several passes add an implicit use of exec without checking whether there was one already, so I'm not surprised this pass does not bother to remove any implicit operands. You can see instructions with multiple "implicit $exec" operands in debug output and even in some of our autogenerated lit tests.

mjbedy updated this revision to Diff 235220.Dec 24 2019, 8:12 AM
  • Change operand removel code to be less generic, only V_SET_INACTIVE with an undef source needs an operand removed.
mjbedy updated this revision to Diff 235222.Dec 24 2019, 8:22 AM

Remove unintentially included file.

mjbedy marked 4 inline comments as done.Dec 30 2019, 9:12 PM
mjbedy added inline comments.
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
876–877

I fixed the code to only expect this one case, and added a test for it.

mjbedy marked an inline comment as done.Jan 9 2020, 5:57 AM

Ping.

nhaehnle accepted this revision.Jan 9 2020, 6:46 AM

LGTM, with one minor nitpick.

llvm/test/CodeGen/AMDGPU/wqm.mir
53

*its

This revision is now accepted and ready to land.Jan 9 2020, 6:46 AM
mjbedy marked an inline comment as done.Jan 10 2020, 7:39 PM
This revision was automatically updated to reflect the committed changes.