This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Optimize AtomicOptimizer
ClosedPublic

Authored by Flakebi on Mar 11 2020, 1:47 AM.

Details

Summary

Mark the ctpop as convergent so that it does not get moved into the
single-lane basic block. This currently saves one instruction.
Another way to save this instruction would be to reuse the saved exec
register from the inserted control flow (the output of s_saveexec). That is
currently hard to do, though it might work once GlobalISel is used.
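
For readers unfamiliar with the pass, here is a minimal host-side sketch of the ordering described above. It is not LLVM code and not what the pass emits; `LaneMask`, `countActiveLanes`, `lowestActiveLane` and `waveAtomicAdd` are hypothetical names chosen for illustration, and the sketch assumes every active lane adds the same uniform value. It only shows where the lane count is computed relative to the single-lane region.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not LLVM code: one bit per active lane,
// standing in for the ballot / exec value of a wave.
using LaneMask = std::uint64_t;

// Models ctpop on the ballot: the number of active lanes.
inline std::uint32_t countActiveLanes(LaneMask active) {
    std::uint32_t count = 0;
    for (; active != 0; active &= active - 1) // clear the lowest set bit each step
        ++count;
    return count;
}

// Models narrowing execution to a single lane (the lowest active one).
inline LaneMask lowestActiveLane(LaneMask active) {
    assert(active != 0 && "need at least one active lane");
    return active & (~active + 1);
}

// Models a wave-wide atomic add of a uniform value.
// Returns the value that was in memory before the add.
inline std::uint32_t waveAtomicAdd(std::uint32_t &memory, LaneMask active,
                                   std::uint32_t value) {
    if (active == 0)
        return memory; // no lane is active, nothing to do

    // Computed before entering the single-lane region. Keeping the count
    // here, rather than inside that region, is the ordering the summary
    // describes.
    const std::uint32_t laneCount = countActiveLanes(active);

    // Single-lane region: one lane performs one combined add for the wave.
    const LaneMask single = lowestActiveLane(active);
    (void)single; // the model has a single thread; this only names the lane
    const std::uint32_t old = memory;
    memory += value * laneCount;
    return old;
}
```

For example, with three active lanes (mask `0x7`) each adding 5, the model performs a single add of 15.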

Event Timeline

Flakebi created this revision. Mar 11 2020, 1:47 AM
Herald added a project: Restricted Project. Mar 11 2020, 1:47 AM
foad accepted this revision. Mar 11 2020, 2:36 AM

LGTM.

This revision is now accepted and ready to land. Mar 11 2020, 2:36 AM
Flakebi closed this revision. Mar 18 2020, 3:06 AM

Unfortunately, this no longer works with the updated ballot intrinsic. I’ll leave this for later; see also D65088.

arsenm added inline comments. Mar 18 2020, 8:05 AM
llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
530

This won't be preserved in any meaningful way to the backend; this should be removed.