This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Make sinkRegion and hoistRegion non-recursive
ClosedPublic

Authored by majnemer on Jul 18 2017, 10:24 PM.

Details

Summary

Large CFGs can cause us to blow up the stack because we would have a
recursive step for each basic block in a region.

Instead, create a worklist and iterate it. This limits the stack usage
to something more manageable.

Event Timeline

majnemer created this revision.Jul 18 2017, 10:24 PM
davide edited edge metadata.Jul 18 2017, 10:45 PM

The idea is sound, I'll think a little bit more about the visitation order (tomorrow morning) but it seems you got it right.
Just a minor nit, I see this patch has some unrelated formatting, mind to revert that? (and/or commit separately)

The idea is sound, I'll think a little bit more about the visitation order (tomorrow morning) but it seems you got it right.
Just a minor nit, I see this patch has some unrelated formatting, mind to revert that? (and/or commit separately)

AFAICT, iterating over the worklist caused the nesting to increase which caused the formatting to change.

sanjoy added inline comments.Jul 18 2017, 11:13 PM
lib/Transforms/Scalar/LICM.cpp
371

getDescendants seems to be doing a BFS. Given that (and that DT is a tree), can you iterate Worklist in reverse to get a post order? If you do this, we'd have to spec getDescendants as doing a BFS though (and not just have it be an implementation detail).

OTOH if you write the BFS by hand (instead of relying on getDescendents), you can add an early exit for CurLoop->contains(BB).

majnemer added inline comments.Jul 19 2017, 9:40 AM
lib/Transforms/Scalar/LICM.cpp
371

Do you mean BFS or DFS?

sanjoy added inline comments.Jul 19 2017, 10:03 AM
lib/Transforms/Scalar/LICM.cpp
371

I meant BFS; since DT is a tree you can do a (slightly) cheaper in-place BFS as:

Order.push_back(N);
for (int i = 0; i < Order.size(); i++)
  Order.push_back(N->children());
majnemer added inline comments.Jul 19 2017, 10:21 AM
lib/Transforms/Scalar/LICM.cpp
371

I don't believe reverse iterating a BFS gives you a post-order.

If I understand the patch correctly, aren't we just iterating children->parent, therefore the order of siblings should be irrelevant in this case ?
(i.e. the worklist is sorted according to the dominance relation and shouldn't really matter if we process two siblings in the dominator in different order?)
I think Sanjoy's suggestion works if my reasoning is correct, although I'm also fine with the current version of the patch (I don't have a strong preference for one or the other).

efriedma added inline comments.
lib/Transforms/Scalar/LICM.cpp
378

properlyDominates is not a strict weak ordering, so this has undefined behavior.

majnemer updated this revision to Diff 107431.Jul 19 2017, 6:11 PM
  • Address review feedback
majnemer marked 2 inline comments as done.Jul 19 2017, 6:18 PM

LGTM, nice to have you back from the grave :)

lib/Transforms/Scalar/LICM.cpp
346

capital i maybe?

sanjoy added inline comments.Jul 19 2017, 6:24 PM
lib/Transforms/Scalar/LICM.cpp
349

Not sure that the lambda is actually useful here -- why not just:

if (CurLoop->contains(Child->getBlock()))
  Worklist.push_back(CurLoop);

?

If you wanted to go a bit further, I'd say even

for (size_t i = 0; i < Worklist.size(); i++)
  copy_if(Worklist[i]->getChildren(), <lambda to check if node is in loop>, std::back_inserter(Worklist));

may work.

davide added inline comments.Jul 19 2017, 6:27 PM
lib/Transforms/Scalar/LICM.cpp
349

I personally find the lambda version slightly more readable.

sanjoy accepted this revision.Jul 19 2017, 6:34 PM
sanjoy added inline comments.
lib/Transforms/Scalar/LICM.cpp
349

In that case SGTM, but please use camel case. :)

This revision is now accepted and ready to land.Jul 19 2017, 6:34 PM
This revision was automatically updated to reflect the committed changes.