This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add llvm.amdgcn.wqm.demote intrinsic
ClosedPublic

Authored by critson on Jan 14 2021, 10:05 PM.

Details

Summary

Add intrinsic which demotes all active lanes to helper lanes.
This is used to implement demote to helper Vulkan extension.

In practice demoting a lane to helper simply means removing it
from the mask of live lanes used for WQM/WWM/Exact mode.
Where the shader does not use WQM, demotes just become kills.

Additionally add llvm.amdgcn.live.mask intrinsic to complement
demote operations. In theory llvm.amdgcn.ps.live can be used
to detect helper lanes; however, ps.live can be moved by LICM.
The movement of ps.live cannot be remedied without changing
its type signature and such a change would require ps.live
users to update as well.

Diff Detail

Event Timeline

critson created this revision.Jan 14 2021, 10:05 PM
critson requested review of this revision.Jan 14 2021, 10:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2021, 10:05 PM
foad added inline comments.Jan 15 2021, 6:40 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
389

Doesn't really matter but you can end the line with a ; instead of {}.

llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
820–827

Please put the isDemote case first. Negated conditions with "else"s make my head hurt.

critson marked 2 inline comments as done.Jan 15 2021, 5:37 PM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIInstructions.td
389

A legacy of when this had some flags.

critson updated this revision to Diff 317127.Jan 15 2021, 5:38 PM
critson marked an inline comment as done.
  • Address reviewer comments.
critson updated this revision to Diff 317781.Jan 19 2021, 11:23 PM
  • Resync with parent patch.
  • Fix a bug where demotes did not block if simplification in control flow lowering.
arsenm added inline comments.Jan 20 2021, 6:08 AM
llvm/include/llvm/IR/IntrinsicsAMDGPU.td
1358–1362

Should ps_live eventually be replaced? I would rather make that clear rather than adding another intrinsic to void updating users

foad added a reviewer: piotr.Jan 22 2021, 6:38 AM

Looks reasonable to me.

piotr added a comment.Jan 25 2021, 2:31 AM

LGTM with a few nits - feel free to ignore them.

llvm/include/llvm/IR/IntrinsicsAMDGPU.td
1353

Mark as deprecated?

1358

If the old ps_live is slated to go away, it would be better to not refer to it here.

llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
814

You can avoid calling .contains() by adding a local variable in the outer "for" loop (or slightly refactoring the first inner loop and testing the iterator).

critson updated this revision to Diff 319506.Jan 27 2021, 2:24 AM
  • Address review comments
  • Mark ps.live as deprecated.
piotr accepted this revision.Jan 27 2021, 11:16 PM
This revision is now accepted and ready to land.Jan 27 2021, 11:16 PM
This revision was landed with ongoing or failed builds.Feb 14 2021, 4:40 PM
This revision was automatically updated to reflect the committed changes.