This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Translate s_and/s_andn2 to s_mov in vcc optimisation
Closed, Public

Authored by critson on Jul 14 2020, 10:49 PM.

Details

Summary

When SCC is dead but VCC is still required, replace s_and / s_andn2
with an s_mov into VCC when the mask value is 0 or -1.
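
The rule in the summary can be sketched as a small standalone model. This is not the code from SIPreEmitPeephole.cpp: the names `MaskOp`, `Fold`, `evalMaskOp` and `foldToMov` are hypothetical, and the sketch assumes a wave64 (64-bit) mask with the operation written as `vcc = op exec, mask`, so that s_and yields `exec & mask` and s_andn2 yields `exec & ~mask`. Under those assumptions, a mask of 0 or -1 makes the result either 0 or a copy of exec, which a move can produce as long as nothing reads the SCC value the original instruction would have written.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not taken from the patch.
enum class MaskOp { And, AndN2 };
enum class Fold { None, MovZero, MovExec };

// Reference semantics assumed here: s_and   -> exec & mask
//                                   s_andn2 -> exec & ~mask
inline uint64_t evalMaskOp(MaskOp op, uint64_t exec, uint64_t mask) {
  return op == MaskOp::And ? (exec & mask) : (exec & ~mask);
}

// Decide whether the masking instruction can become a plain move into VCC.
// The fold is only considered when SCC is dead, because the move does not
// produce the SCC result that s_and / s_andn2 would have written.
inline Fold foldToMov(MaskOp op, uint64_t mask, bool sccDead) {
  if (!sccDead)
    return Fold::None;
  // s_andn2 inverts its mask operand, so normalise to the s_and case.
  if (op == MaskOp::AndN2)
    mask = ~mask;
  if (mask == 0)
    return Fold::MovZero;   // exec & 0 == 0
  if (mask == ~uint64_t{0})
    return Fold::MovExec;   // exec & -1 == exec
  return Fold::None;        // any other mask still needs the AND
}
```

In this model `foldToMov(MaskOp::AndN2, 0, true)` yields `Fold::MovExec`, because s_andn2 with a zero mask leaves exec unchanged. A wave32 variant would use a 32-bit mask in the same way.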

Event Timeline

critson created this revision. Jul 14 2020, 10:49 PM
arsenm added inline comments. Jul 15 2020, 7:19 AM
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
140–141

Can we do this earlier? Removing the SCC def earlier would be more useful.

It seems the wave32 failure is real.

critson updated this revision to Diff 278453. Jul 16 2020, 6:37 AM

Add missing test diffs.

critson marked an inline comment as done. Jul 16 2020, 6:54 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
140–141

To the best of my understanding, the earliest this optimisation becomes available is after "Branch Probability Basic Block Placement", which is not much earlier.

This revision is now accepted and ready to land. Jul 16 2020, 8:28 AM
This revision was automatically updated to reflect the committed changes.