This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add pseudo wavemode to optimize strict_wqm
ClosedPublic

Authored by critson on Oct 26 2022, 9:03 PM.

Details

Summary

Strict WQM does not require a WQM transition if it occurs within
an existing WQM section.
This occurs heavily in GFX11 pixel shaders with LDS_PARAM_LOAD,
which leads to unnecessary EXEC mask manipulation.

To avoid these transitions, detect WQM -> Strict WQM -> WQM
and substitute new ENTER_PSEUDO_WM/EXIT_PSEUDO_WM markers instead.
These are treated similarly by the WWM register pre-allocation pass,
but do not manipulate EXEC or use registers to save EXEC state.
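
The detection described above can be illustrated with a small standalone sketch. This is not the code from the patch: the `Mode` enum, `isNestedInWQM`, and `chooseMarkers` are hypothetical names, the block is modelled as a flat list of regions with adjacent regions assumed to have different modes, and the `ENTER_STRICT_WQM`/`EXIT_STRICT_WQM` strings stand in for the markers that would otherwise be used. It only shows the shape of the WQM -> Strict WQM -> WQM check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: each entry is the execution mode required by one
// consecutive region of a basic block. Not a type from the LLVM sources.
enum class Mode { Exact, WQM, StrictWQM };

// True when the Strict WQM region at index i is bracketed by WQM on both
// sides, i.e. the WQM -> Strict WQM -> WQM pattern from the summary.
bool isNestedInWQM(const std::vector<Mode> &regions, std::size_t i) {
  assert(i < regions.size() && regions[i] == Mode::StrictWQM);
  const bool prevIsWQM = i > 0 && regions[i - 1] == Mode::WQM;
  const bool nextIsWQM = i + 1 < regions.size() && regions[i + 1] == Mode::WQM;
  return prevIsWQM && nextIsWQM;
}

// Returns the enter/exit marker names chosen for every Strict WQM region,
// in order. Nested regions get the pseudo markers, which in the real patch
// avoid touching EXEC; all others keep the ordinary strict markers.
std::vector<std::string> chooseMarkers(const std::vector<Mode> &regions) {
  std::vector<std::string> markers;
  for (std::size_t i = 0; i < regions.size(); ++i) {
    if (regions[i] != Mode::StrictWQM)
      continue;
    if (isNestedInWQM(regions, i)) {
      markers.push_back("ENTER_PSEUDO_WM");
      markers.push_back("EXIT_PSEUDO_WM");
    } else {
      markers.push_back("ENTER_STRICT_WQM");
      markers.push_back("EXIT_STRICT_WQM");
    }
  }
  return markers;
}
```

In this model, a Strict WQM region that follows an Exact region still gets the ordinary markers, since EXEC has to be switched to whole quad mode there; only the fully nested case is substituted.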

Event Timeline

critson created this revision. Oct 26 2022, 9:03 PM
Herald added a project: Restricted Project. Oct 26 2022, 9:03 PM
critson requested review of this revision. Oct 26 2022, 9:03 PM
piotr accepted this revision. Oct 27 2022, 12:05 AM

LGTM

This revision is now accepted and ready to land. Oct 27 2022, 12:05 AM
foad added inline comments. Oct 27 2022, 1:57 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
193–194

Does it need the implicit use and def?

critson added inline comments. Oct 27 2022, 2:04 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
193–194

Probably not, but I wasn't sure if the EXEC modification was being used to avoid something else happening in the backend passes.

Technically all the EXIT_ types should be marked as implicit def of EXEC too.

This revision was landed with ongoing or failed builds. Oct 27 2022, 5:46 PM
This revision was automatically updated to reflect the committed changes.