This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Optimize SI_IF lowering for simple if regions

Authored by rampitec on Jul 25 2017, 3:54 PM.



Currently SI_IF is lowered to an s_and_saveexec_b64 followed by an s_xor_b64.
The xor is used to extract only the changed bits. In case of a simple
if region where the only use of that value is in the SI_END_CF to
restore the old exec mask, we can omit the xor and perform an or of
the exec mask with the original exec value saved by the
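
The mask arithmetic behind this summary can be sketched with plain integers. This is only an illustrative model, not the backend code: the helper names (`lower_if_with_xor`, `lower_if_simple`, `end_cf`) are hypothetical, and it assumes that s_and_saveexec_b64 copies the old exec into its destination before ANDing exec with the condition, and that the body of the if region leaves exec unchanged.

```python
MASK64 = (1 << 64) - 1  # exec is modeled as a 64-bit lane mask


def s_and_saveexec_b64(exec_mask, cond):
    """Model of the instruction: return (saved, new_exec).

    saved is the exec value before the instruction; new_exec is exec & cond.
    """
    saved = exec_mask & MASK64
    return saved, saved & cond & MASK64


def lower_if_with_xor(exec_mask, cond):
    """Current lowering: s_and_saveexec_b64 followed by s_xor_b64."""
    saved, new_exec = s_and_saveexec_b64(exec_mask, cond)
    saved ^= new_exec  # keep only the lanes that were switched off
    return saved, new_exec


def lower_if_simple(exec_mask, cond):
    """Lowering without the xor: keep the original exec value as saved."""
    return s_and_saveexec_b64(exec_mask, cond)


def end_cf(exec_mask, saved):
    """Model of SI_END_CF: OR the saved value back into exec."""
    return (exec_mask | saved) & MASK64


if __name__ == "__main__":
    old_exec, cond = 0xF0, 0xAA
    for lower in (lower_if_with_xor, lower_if_simple):
        saved, inner_exec = lower(old_exec, cond)
        restored = end_cf(inner_exec, saved)
        print(lower.__name__, hex(saved), hex(inner_exec), hex(restored))
```

Because the exec mask inside the region is always a subset of the original exec, ORing it with either the xor result or the original value restores the same mask, which is why the xor is redundant when SI_END_CF is the only user.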

Event Timeline

rampitec created this revision. Jul 25 2017, 3:54 PM
arsenm accepted this revision. Jul 26 2017, 1:43 PM


This revision is now accepted and ready to land. Jul 26 2017, 1:43 PM
This revision was automatically updated to reflect the committed changes.