This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Teach the WQM pass about Whole Wavefront Mode and wqm_ctrl
Abandoned, Public

Authored by cwabbott on Jun 27 2017, 3:48 PM.

Details

Reviewers
tstellar
arsenm
Summary

Whole Wavefront Mode (WWM) is required for implementing wavefront
reductions in non-uniform control flow, where we need to use the
inactive lanes to propagate intermediate results, so they need to be
enabled. We need to propagate WWM to uses (unless they're explicitly
marked as exact) so that they also propagate intermediate results
correctly. We do the analysis and exec mask munging during the WQM pass,
since we may get other, non-WWM instructions mixed in with the WWM
instructions, and we'd like to avoid the overhead of switching back and
forth if we can, but only the WQM pass has this information. For
simplicity, WWM is entirely block-local -- WWM is never enabled on entry to
or exit from a block, and WWM is not propagated to block inputs/outputs.
This means that computations involving WWM cannot involve control flow,
but we only ever plan to use WWM for a few limited purposes (none of
which involve control flow) anyways.
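
As a rough illustration of the block-local scheme described above, here is a minimal standalone C++ sketch. It is not the code from this revision and does not use any LLVM API: the `Need` enum, the `Inst` struct, and the functions `propagateWWM` and `planBlock` are all hypothetical names invented for the example, and the `Either` state (an instruction that does not care which mode it runs in) is an assumption made to model "non-WWM instructions mixed in with the WWM instructions". The sketch first propagates WWM to uses unless they are marked exact, then inserts mode switches only where the required state actually changes, always starting and ending the block outside WWM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-instruction requirement (not the pass's real data types).
// Either is an assumed "don't care" state for instructions that can run in
// whichever mode is currently active.
enum class Need { Exact, WWM, Either };

struct Inst {
  Need need;
  // Indices of earlier instructions in the same block whose results are used.
  std::vector<std::size_t> operands;
};

// Propagate WWM from definitions to their uses within one block.
// Instructions explicitly marked Exact are left alone.
void propagateWWM(std::vector<Inst> &block) {
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].need == Need::Exact)
      continue;
    for (std::size_t op : block[i].operands) {
      assert(op < i && "operands must be defined earlier in the same block");
      if (block[op].need == Need::WWM) {
        block[i].need = Need::WWM;
        break;
      }
    }
  }
}

// Emit the block as a list of labels, with "enter_wwm"/"exit_wwm" markers
// wherever the mode has to change. The block starts and ends outside WWM.
std::vector<std::string> planBlock(const std::vector<Inst> &block) {
  std::vector<std::string> out;
  bool inWWM = false; // WWM is never enabled on entry to a block
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].need == Need::WWM && !inWWM) {
      out.push_back("enter_wwm");
      inWWM = true;
    } else if (block[i].need == Need::Exact && inWWM) {
      out.push_back("exit_wwm");
      inWWM = false;
    }
    // Need::Either keeps the current mode, which avoids a needless switch.
    out.push_back("inst" + std::to_string(i));
  }
  if (inWWM)
    out.push_back("exit_wwm"); // WWM is always disabled before the block ends
  return out;
}
```

In this toy model, an `Either` instruction sitting between two WWM instructions stays in WWM, so only one enter/exit pair is emitted for the whole run. That is the "avoid switching back and forth" goal in miniature.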

Right now, the only way to specify WWM is through a pseudo operand on
DPP instructions (added in a separate change). This commit also adds
support for WQM on DPP instructions through wqm_ctrl.

Event Timeline

cwabbott created this revision. Jun 27 2017, 3:48 PM
cwabbott updated this revision to Diff 104968. Jun 30 2017, 5:26 PM

Actually disable WWM on exit of a block.

arsenm edited edge metadata. Jun 30 2017, 7:27 PM

Needs tests

cwabbott abandoned this revision. Jul 17 2017, 5:58 PM

Abandon in favor of D35524. While I based that change off of this one, things have changed so much that it's probably better to abandon this and do the review there.