This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Make exec mask optimzations more resistant to block splits
ClosedPublic

Authored by arsenm on Mar 25 2019, 9:44 AM.

Details

Reviewers
rampitec
Summary

Also improve the check for SALU instructions to also ignore
implicit_def and other fake instructions.

Diff Detail

Event Timeline

arsenm created this revision.Mar 25 2019, 9:44 AM
rampitec added inline comments.Mar 25 2019, 10:29 AM
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
144

Yes, in case block is only preceded bit itself. We do not generate this way, but I can write such IR manually. Anyway layout successor means there will be no branch.

test/CodeGen/AMDGPU/collapse-endcf.mir
146

I am not sure autogenerated test really tests anything, as there is no GCN-NEXT. The copy may easily remain and be untested.

arsenm marked an inline comment as done.Mar 25 2019, 10:32 AM
arsenm added inline comments.
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
144

It doesn't mean there's no branch. It still allows

S_CBRANCH
<no S_BRANCH>

<fallthrough block>.

If both used unconditional branches it shouldn't be any different

arsenm marked an inline comment as done.Mar 25 2019, 10:33 AM
arsenm added inline comments.
test/CodeGen/AMDGPU/collapse-endcf.mir
146

The important part is the number of s_or_b64

rampitec added inline comments.Mar 25 2019, 10:35 AM
test/CodeGen/AMDGPU/collapse-endcf.mir
146

But how you can be sure about the number if there can be some others in between of the checks?

arsenm marked 2 inline comments as done.Mar 25 2019, 11:00 AM
arsenm added inline comments.
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
144

I think what's really necessary here is something like:

end_cf sources are both si_if
First si_if dominates the second, and the second end_cf post dominates the first.

This requires moving the pass before the control flow pseudos are lowered

test/CodeGen/AMDGPU/collapse-endcf.mir
146

I think it's pretty unlikely this can break with everything checked. update_mir_test_checks should probably be emitting -NEXT, but that's a bigger problem to work on

This revision is now accepted and ready to land.Mar 25 2019, 11:03 AM
arsenm closed this revision.Mar 28 2019, 7:00 AM

r357170