This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] WQM/WWM: Fix marking of partial definitions
ClosedPublic

Authored by critson on Jan 26 2021, 8:53 PM.

Details

Summary

Track lanes when processing definitions for marking WQM/WWM.
If all lanes have been defined then marking can stop.
This prevents marking unnecessary instructions as WQM/WWM.

In particular this fixes a bug where values passing through
V_SET_INACTIVE would me marked as requiring WWM.

Diff Detail

Event Timeline

critson created this revision.Jan 26 2021, 8:53 PM
critson requested review of this revision.Jan 26 2021, 8:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2021, 8:53 PM
foad added a comment.Jan 27 2021, 3:01 AM

Typo "me" in description.

llvm/test/CodeGen/AMDGPU/wqm.mir
248

Remove most of this? Usually you only need name, tracksRegLiveness and body.

critson marked an inline comment as done.Jan 27 2021, 8:58 PM
critson updated this revision to Diff 319752.Jan 27 2021, 8:58 PM
  • Tidy test.
arsenm added inline comments.Jan 28 2021, 10:06 AM
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
295–296

You lost this assert (but we should convert it to the pass required properties set anyway)

382–389

Should this skip all implicit operands?

critson marked 2 inline comments as done.Jan 28 2021, 4:44 PM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
295–296

Yes, while I think I deleted the assertion by accident, I believe it is technically wrong.
I will add clearedProperties to the pass in D94746 which is where !IsSSA is definitely required.

382–389

No we must process implicit uses as well and mark them as WWM/WQM with $vcc being the primary use case, e.g. $vcc input of WWM VOP2 v_addc_co must also be in computed in WWM.
At the moment $exec is the only register I am confident that can be ignored as it explicitly managed by this pass; however, it is possible other register (e.g. $m0 and $mode) might also be valid here, but I'd prefer to leave that as exercise for later.

piotr added a comment.EditedFeb 17 2021, 9:39 AM

Looks good to me (with two nits inline).

llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
337

DefLanes is a bit too similar to DefinedLanes. Can you rename one of them please?

343

HasDef |= Overlap.any();

critson updated this revision to Diff 324928.Feb 19 2021, 2:36 AM
critson marked 2 inline comments as done.
  • Rebase
  • Address reviewer comments
critson marked 2 inline comments as done.Feb 19 2021, 2:38 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
337

Changed to OpLanes

piotr accepted this revision.Feb 19 2021, 2:40 AM
This revision is now accepted and ready to land.Feb 19 2021, 2:40 AM
This revision was landed with ongoing or failed builds.Feb 19 2021, 3:47 AM
This revision was automatically updated to reflect the committed changes.
critson marked an inline comment as done.