This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Allow hoisting of some VALU compare instructions
ClosedPublic

Authored by critson on Feb 4 2022, 1:18 AM.

Details

Summary

Conversatively allow hoisting/sinking of VALU comparisons.
If the result of a comparison is masked with exec, narrowing the
set of active lanes, then it is safe to hoist it as the masking
instruction will never by hoisted.

Heuristically this is also true for sinking, as we do not expect
the result of a sunk comparison that is masked with exec to be
used outside of the loop.

Diff Detail

Event Timeline

critson created this revision.Feb 4 2022, 1:18 AM
critson requested review of this revision.Feb 4 2022, 1:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2022, 1:18 AM
rampitec added inline comments.Feb 4 2022, 11:30 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
132–133

The function probably needs a different name now.

133–157

This needs a comment.

135

Check it is virtual.

critson updated this revision to Diff 406336.Feb 7 2022, 12:11 AM
critson marked 3 inline comments as done.
  • Address reviewer comments.

I suspect that V_CNDMASK uses are fairly common, and those implicitly mask with EXEC as well.

This also doesn't address more complex cases like

%0 = V_CMP ...
%1 = V_CMP ...
%2 = S_OR %0, %1
V_CNDMASK lhs, rhs, %2

though perhaps the change is good as-is for now.

rampitec accepted this revision.Feb 7 2022, 1:04 PM

LGTM with a nit.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
142

use_nodbg_instructions, otherwise result may be different with and without debug.

This revision is now accepted and ready to land.Feb 7 2022, 1:04 PM
critson marked an inline comment as done.Feb 7 2022, 6:25 PM

I suspect that V_CNDMASK uses are fairly common, and those implicitly mask with EXEC as well.

This also doesn't address more complex cases like

%0 = V_CMP ...
%1 = V_CMP ...
%2 = S_OR %0, %1
V_CNDMASK lhs, rhs, %2

though perhaps the change is good as-is for now.

Thanks for pointing this out.

I think it makes sense to look at additional patterns in a follow up patch.
IMO it's safer to expand the set of allowable hoists iteratively (testing as we go).

critson updated this revision to Diff 406671.Feb 7 2022, 6:27 PM
  • Address reviewer comments.
This revision was landed with ongoing or failed builds.Feb 7 2022, 6:28 PM
This revision was automatically updated to reflect the committed changes.