This is an archive of the discontinued LLVM Phabricator instance.

Introduce unify-loop-exits pass.
ClosedPublic

Authored by sameerds on Mar 9 2020, 11:49 AM.

Details

Summary

For each natural loop with multiple exit blocks, this pass creates a
new block N such that all exiting blocks now branch to N, and then
control flow is redistributed to all the original exit blocks.

The bulk of the tranformation is a new function introduced in
BasicBlockUtils that an redirect control flow from a set of incoming
blocks to a set of outgoing blocks via a common "hub".

This is a useful workaround for a limitation in the structurizer which
incorrectly orders blocks when processing a nest of loops. This pass
bypasses that issue by ensuring that each natural loop is recognized
as a separate region. Since the structurizer is a region pass, it no
longer sees a nest of loops in a single region, and instead processes
each "level" in the nesting as a separate region.

The AMDGPU backend provides a new option to enable this pass before
the structurizer, which may eventually be enabled by default.

Diff Detail

Event Timeline

sameerds created this revision.Mar 9 2020, 11:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2020, 11:49 AM
arsenm added inline comments.Mar 9 2020, 12:05 PM
llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h
395

Could use some ASCII art graphs?

madhur13490 added inline comments.
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
1144

What about modularizing this function a bit to make it more readable? I see some logical blocks in this function.

madhur13490 marked an inline comment as done.Mar 11 2020, 12:57 AM
madhur13490 added inline comments.
llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h
367

A better word may "Sink"?

sameerds updated this revision to Diff 250351.Mar 14 2020, 3:01 AM

Addressed review comments:

  1. New ascii art.
  2. Split the large function CreateControlFlowHub into smaller functions.
sameerds marked 3 inline comments as done and an inline comment as not done.Mar 14 2020, 3:05 AM
sameerds added inline comments.
llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h
367

Not really. "Sink" would imply an end of a flow as opposed to "Source". This really is a hub where unrelated control flow paths are made to converge and then fanout again.

sameerds updated this revision to Diff 250402.Mar 14 2020, 9:30 PM
sameerds marked an inline comment as not done.

Fixed the ascii art example to be correct in terms of SSA. The extra
edge from In1 to Out2 would be nice to have, but it breaks the
pre-condition that Def dominates Use.

arsenm added inline comments.Mar 17 2020, 9:53 AM
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
175–192

Won't the verifier catch these after the pass anyway?

If not I think the all the verification calls should be moved to a separate function

sameerds marked an inline comment as done.Mar 17 2020, 10:40 AM
sameerds added inline comments.
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
175–192

This verification happens immediately after each loop is transformed, which is useful for reporting errors closer to where they occur. If we move the verification out of this function, the outer loop that calls this function could end up working on an invalid loop structure, and the errors reported would not be very useful.

arsenm added inline comments.Mar 19 2020, 2:56 PM
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
33

This seem to ignore the existence of switches (and call_br, but we don't handle that at all). Should this add a dependency on LowerSwitch (and preserve it)? Also should add a switch test or two

153

Printing empty string?

155

single quotes

159

single quotes

sameerds marked 2 inline comments as done.Mar 19 2020, 7:26 PM
sameerds added inline comments.
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
33

@nhaehnle had mentioned at some point that requiring LowerSwitch via AnalysisUsage is not reliable. The structurizer seems to be the only pass in LLVM that seems to do that, which does not help my confidence at all.

155

Does this really matter? Seems terribly pedantic to me, and would be surprised to see this in any coding standard. At least nothing specific showed up when I search in the LLVM, Linux and GNU coding standards.

sameerds updated this revision to Diff 251534.Mar 19 2020, 7:38 PM

Added a comment about requiring direct branches; removed an empty string from debug output

arsenm added inline comments.Mar 20 2020, 9:11 AM
llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
33

Unreliable in what way? I know there was something like a circular dependence issue in the context of one of the other passes, that might not exist here. I would still try to add it here, and this should probably error if it encounters a switch or something

155

It's not really important, but they're not identical. The char version does save a function call vs. the string

sameerds updated this revision to Diff 251958.Mar 23 2020, 1:01 AM

require and preserve lower-switch pass

This mostly looks good to me. I have some comments questions that affect the correctness of the code (at least in general); most of the rest is suggestions for possible simplifications / minor improvements.

llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
1151

How about: "These indicate that if the Hub is entered from InBB, then control is transferred to OutBB if the corresponding value is true."

This suggests a slight refactoring of the function to have an outer loop based on Incoming and then looking at the possible cases of the BranchInst terminator, to parallel the corresponding code in redirectIncomingToHub.

1175–1176

Since Inverted does not only contain inverted conditions, could you please consistently rename it? Maybe AllConditions?

1244–1257

Smart. I wonder if you could simplify this further by transposing the loops.

Basically, first create all the phis in a loop over the outgoing blocks.

Then in a separate loop nest, iterate over the Incoming blocks, then iterate over all outgoing phis, and add the appropriate condition. If you combine this with my comment in collectPredicates, it may even be quite suitable to merge the two functions into one. This could then elide the creation of the "OldPredicates" entirely. The code doesn't need that structure for anything else, do we? (I suppose this would require doing the redirectIncomingToHub later.)

1261

std::move?

1367–1368

Shouldn't there be an edge insertion from the last guard blocks to both the last and the second-last Outgoing block? What am I missing here?

llvm/lib/Transforms/Utils/Local.cpp
3036

I believe the not should be created directly after Inst instead of before its block's terminator, to ensure that the new instruction dominates everything that Inst dominates.

3041–3042

Same argument here: the not should be create at the *start* of the entry block rather than at the end.

llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
140–142

I don't understand this argument. getExitingBlocks also goes through all blocks in the loop... since you need both exiting *and* exit blocks, you may as well loop over all blocks in the loop directly. Though I guess this approach is not too bad, either, just the comment is confusing. If you want to argue that getting the exiting blocks and then enumerating successors to find exits is cheaper than enumerating exits and then enumerating predecessors, I suppose there are two arguments to be made:

  • Eliminating duplicates is more natural the way you're doing it
  • Iterating over successors is more efficient than iterating over predecessors in LLVM IR
197

This should be guarded by #ifndef NDEBUG.

A quick addendum to ask whether you've run this with LLVM_ENABLE_EXPENSIVE_CHECKS -- that's required to flush out any problems with the dom tree updates.

sameerds updated this revision to Diff 253474.Mar 29 2020, 7:38 PM

review comments: reorder loops and merge the predicate logic

sameerds marked 19 inline comments as done.Mar 29 2020, 7:46 PM
sameerds added inline comments.
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
1244–1257

That's brilliant! All three functions quickly collapsed into a single function now.

1367–1368

Yes, there should be, and there was one in the first version of this review. It got lost when I refactored into separate functions. This is fixed now. The whole thing manages to work correctly even if I "forget" to add any outgoing edge from the new guard blocks. My best guess is that the updater itself all new edges incident on a new node, and since I am updating the dom tree as late as possible, this always discovers all the new edges.

llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
140–142

I assume you are okay with the call to getExitingBlocks. If I were to traverse the whole loop body myself, that would just duplicate functionality that's already easily available. I rewrote the comments for clarity.

197

Wrapped this in EXPENSIVE_CHECKS instead.

sameerds updated this revision to Diff 253540.Mar 30 2020, 3:13 AM
sameerds marked 3 inline comments as done.

clang-format

nhaehnle accepted this revision.Mar 30 2020, 8:24 AM

Thanks, LGTM!

llvm/lib/Transforms/Utils/UnifyLoopExits.cpp
140–142

Yes.

197

Thanks.

This revision is now accepted and ready to land.Mar 30 2020, 8:24 AM
This revision was automatically updated to reflect the committed changes.