This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] limit recursion depth when speculating instructions (PR26308)
ClosedPublic

Authored by spatel on Jan 27 2016, 10:10 AM.

Details

Summary

This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=26308

With the switch to using the TTI cost model in:
http://reviews.llvm.org/rL228826
...it became possible to hit a zero-cost cycle of instructions (gep -> phi -> gep...), so we need a cap for the recursion in DominatesMergePoint().

A recursion depth parameter was already added for a different reason in:
http://reviews.llvm.org/rL255660
...so I think we just need to set a limit.

I pulled "10" out of the air and made it an independent parameter that we can play with. It might be higher than it needs to be given the currently low default value of PHINodeFoldingThreshold (2). That's the starting cost value that we enter the recursion with, and most instructions have cost set to TCC_Basic (1), so I don't think we're going to speculate more than 2 instructions with the current parameters.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 46149. Jan 27 2016, 10:10 AM
spatel retitled this revision from to [SimplifyCFG] limit recursion depth when speculating instructions (PR26308).
spatel updated this object.
spatel added reviewers: majnemer, jmolloy.
spatel added a subscriber: llvm-commits.
majnemer accepted this revision. Jan 27 2016, 10:25 AM
majnemer edited edge metadata.

LGTM

We can always adjust 10 if it is too low.

This revision is now accepted and ready to land. Jan 27 2016, 10:25 AM

Errr, rather than limit recursion depth, why not just keep track of what's visited and avoid the cycles?

(You could also cheaply build SCCs of the instruction and its uses, and decide what to do for each SCC as a whole, which would avoid this issue as well.)

We run DominatesMergePoint on every two-entry phi node. Using a set of visited blocks would still be pretty expensive because, in the worst case, DominatesMergePoint could recurse all the way back to the entry block if there are a lot of cheap BBs. I think it makes sense to have a depth limit to handle such pathological cases.

If you do it on literally every two-entry phi node, why is the answer not:

- Number each node of the dominator tree with its depth in the dom tree (i.e. its level).
- Process the dominator tree in level order, visiting all nodes of a given level before the next deeper level:
  - Try to speculate a given phi node.
  - Stop any time you hit a phi node at a level above you (because you will have already speculated it if you could).
- Track visited phi nodes to avoid cycles.

Or something similar (a rough sketch follows below).
Over time, we seem to expand a lot of lazy-walking optimizations into these "walk possibly everything backwards" types of algorithms, and at the point where we want to walk *everything of a certain type* there is no point in doing it lazily.
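
A rough sketch of what that level-order walk could look like, for concreteness (the TrySpeculate callback is a placeholder for the per-phi speculation logic, not an existing LLVM API, and the traversal is simplified):

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Walk the dominator tree in level order (BFS), handing each two-entry phi to
// a speculation callback exactly once. The callback can consult the Visited
// set while walking operands so phi cycles are broken without a depth limit.
static void speculateInLevelOrder(
    DominatorTree &DT,
    function_ref<void(PHINode &, SmallPtrSetImpl<PHINode *> &)> TrySpeculate) {
  SmallVector<DomTreeNode *, 32> Worklist;
  Worklist.push_back(DT.getRootNode());

  SmallPtrSet<PHINode *, 16> Visited;
  // Indexing the vector like a FIFO queue gives a breadth-first (level-order)
  // traversal: all nodes at one level are processed before any deeper node.
  for (unsigned i = 0; i != Worklist.size(); ++i) {
    DomTreeNode *Node = Worklist[i];
    for (DomTreeNode *Child : *Node)
      Worklist.push_back(Child);

    for (Instruction &I : *Node->getBlock()) {
      auto *PN = dyn_cast<PHINode>(&I);
      if (!PN)
        break; // Phis are grouped at the top of the block.
      if (PN->getNumIncomingValues() != 2)
        continue; // Only the two-entry phis SimplifyCFG cares about.
      if (Visited.insert(PN).second)
        TrySpeculate(*PN, Visited);
    }
  }
}
```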

You will find no disagreement on this point from me :)
On the other hand, there is no reason not to apply this patch. The code as written obviously intended to do something with this recursion-depth counter; its disuse was an oversight.

This revision was automatically updated to reflect the committed changes.