Page MenuHomePhabricator

AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops.
ClosedPublic

Authored by cfang on May 1 2018, 4:11 PM.

Details

Summary

The current StructurizeCFG pass only works for CFG with one exit. AMDGPUUnifyDivergentExitNodes combines multiple "return" blocks and/or "unreachable" blocks
to one exit block for the Structurizer to work. However, infinite loop is another kind of special "exit", and if we don't handle it, the case of multiple exits will prevent the structurizer from working.

In this work, for each infinite loop, we add a dummy edge to the "return" block, and thus the AMDGPUUnifyDivergentExitNodes pass will work with infinite loops.
This will make CFG with infinite loops be structurized.

Diff Detail

Event Timeline

cfang created this revision.May 1 2018, 4:11 PM

So this does not cover all possible cases of "irreducible" infinite loops, e.g. consider the case with basic blocks A and B where both end with a conditional branch that can go to either A or B. I.e., there are no unconditional branches in the loop.

A more robust approach would be to also transform conditional branches like this:

br i1 %cc, label %A, label %B

becomes

  br i1 true, label %DummyReturnBB, label %local_dummy
local_dummy:
  br i1 %cc, label %A, label %b
lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
194

Shouldn't this use eraseFromParent()?

cfang updated this revision to Diff 145114.May 3 2018, 4:38 PM

Handle the case that the infinite loop is controlled by a conditional branch. In such case, the two edges of the branch
are both backedges.

We create a "TransitionBB" to hold the conditional branch. And at the place of the original conditional branch,
we create a new conditional branch with the conditional always true, the the true edge connects to the TransitionBB, and the false edge to
the dummy return block.

Thanks to Nicolai for pointing out the case as well as the approach.

cfang marked an inline comment as done.May 3 2018, 4:44 PM

So this does not cover all possible cases of "irreducible" infinite loops, e.g. consider the case with basic blocks A and B where both end with a conditional branch that can go to either A or B. I.e., there are no unconditional branches in the loop.

A more robust approach would be to also transform conditional branches like this:

br i1 %cc, label %A, label %B

becomes

  br i1 true, label %DummyReturnBB, label %local_dummy
local_dummy:
  br i1 %cc, label %A, label %b

Thank you so much for pointing out this case and providing the algorithm. Done as suggested.

lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
194

Right. There are other places in the same file need to do the same thing using eraseFromParent(). Will do in a separate patch.

Thanks. This change introduces more branches, but since it should only do so in an infinite loop it should be fine.

I did notice some minor style problems though.

lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
186–192

The indentation looks wrong here. Please check with clang-format/clang-format-diff (I prefer to use clang-format-diff, because otherwise you're quite likely to get notifications about formatting errors in code that you didn't actually change).

199

Typo: *conditional

cfang marked 3 inline comments as done.May 7 2018, 9:28 AM
lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
186–192

Nice catch! Corrected. Thanks.

199

Thanks.

cfang updated this revision to Diff 145482.May 7 2018, 9:29 AM
cfang marked 2 inline comments as done.

Correct indentation error and typo.

cfang added a comment.May 9 2018, 2:54 PM

Any additional comments/suggestions? Thanks!

Ping.

We need this fix to unblock some important tasks. So if there is no additional comments, we need the permission
to integrate the patch. Thanks.

This revision is now accepted and ready to land.May 17 2018, 8:26 AM
This revision was automatically updated to reflect the committed changes.