This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] SIRemoveShortExecBranches should not remove branches exiting loops
ClosedPublic

Authored by critson on Jan 19 2020, 4:13 AM.

Details

Summary

Check that a s_cbranch_execz is not a loop exit before removing it.
As the pass is generating infinite loops.

Diff Detail

Event Timeline

critson created this revision.Jan 19 2020, 4:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 19 2020, 4:13 AM

You are right, the s_cbranch_execz shouldn't be removed from this loop exit BB.

But depending on the MachineLoopInfo to fix it is not good in general. There are irreducible control flows and the check may fail.

There is already a special check in function mustRetainExeczBranch, looking for loop code if the opcode is S_CBRANCH_VCCNZ & S_CBRANCH_VCCZ.
We should include S_CBRANCH_SCC0 & S_CBRANCH_SCC1 here. I hope this guarantees to be working for such cases.

If everyone agrees, I can create a patch and fix it.

There is already a special check in function mustRetainExeczBranch, looking for loop code if the opcode is S_CBRANCH_VCCNZ & S_CBRANCH_VCCZ.
We should include S_CBRANCH_SCC0 & S_CBRANCH_SCC1 here. I hope this guarantees to be working for such cases.

That would work for my test case (essentially a slightly more complex version of lit test).
I am happy with your proposed change, if others agree then I will abandon this one.

You are right, the s_cbranch_execz shouldn't be removed from this loop exit BB.

But depending on the MachineLoopInfo to fix it is not good in general. There are irreducible control flows and the check may fail.

There is already a special check in function mustRetainExeczBranch, looking for loop code if the opcode is S_CBRANCH_VCCNZ & S_CBRANCH_VCCZ.
We should include S_CBRANCH_SCC0 & S_CBRANCH_SCC1 here. I hope this guarantees to be working for such cases.

If everyone agrees, I can create a patch and fix it.

This is probably sufficient

You are right, the s_cbranch_execz shouldn't be removed from this loop exit BB.

But depending on the MachineLoopInfo to fix it is not good in general. There are irreducible control flows and the check may fail.

There is already a special check in function mustRetainExeczBranch, looking for loop code if the opcode is S_CBRANCH_VCCNZ & S_CBRANCH_VCCZ.
We should include S_CBRANCH_SCC0 & S_CBRANCH_SCC1 here. I hope this guarantees to be working for such cases.

If everyone agrees, I can create a patch and fix it.

This is probably sufficient

Also needs to catch execz/execnz/ Maybe this should juts check isBranch?

critson updated this revision to Diff 239224.Jan 20 2020, 7:23 PM

Update patch to current state of discussion.

arsenm added inline comments.Jan 20 2020, 7:30 PM
llvm/lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp
94

Special casing execnz just doesn't make sense to me. execz/execnz are morally equivalent

critson updated this revision to Diff 239226.Jan 20 2020, 8:08 PM

Remove C_BRANCH_EXECNZ special case.
This revives a c_branch_execz in one lit test.

Alternatively we special case both exec_nz/exec_z.
However this ceases to feel much better than simply stating all the accepted branch types.

nhaehnle accepted this revision.Jan 21 2020, 12:35 AM

LGTM. You'll have to re-submit D68092 as well then.

This revision is now accepted and ready to land.Jan 21 2020, 12:35 AM
This revision was automatically updated to reflect the committed changes.