Page MenuHomePhabricator

Accelerate isPotentiallyReachable when a DominatorTree is available.
Needs ReviewPublic

Authored by nicholas on Apr 5 2019, 8:38 PM.

Details

Reviewers
bruno
asbirlea
Summary

When we show that our stop block is not dominated by the current block, we don't need to visit any of the other dominated blocks except to visit their successors. Efficiently add those transitive successors to the worklist.

When both a dominator tree is available and we're in a loop, prefer to jump to the exit blocks of a loop. We could fuse the two by visiting only the non-dominated transitive successors of the loop, but it would have the same effect as simply adding the loop blocks to the worklist and going through the check again. I don't think the savings of redundant checks is worth the complexity, at worst we might have one extra call to DomTree::dominates per loop exit block.

In one sense there's no functionality change here, but isPotentiallyReachable now does more work per loop iteration which means that it may complete successfully instead of running into the block limit and giving up.

Diff Detail

Event Timeline

nicholas created this revision.Apr 5 2019, 8:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2019, 8:38 PM
nicholas edited the summary of this revision. (Show Details)Apr 6 2019, 12:50 AM
nicholas edited the summary of this revision. (Show Details)
nicholas added reviewers: bruno, asbirlea.EditedApr 16 2019, 10:16 PM

CFG.cpp doesn't have many non-cleanup authors, Bruno once made a change all the way back in 2015 to improve the performance of isPotentiallyReachable. Alina, I've added you since you reviewed my previous change to isPotentiallyReachable when it affected LICM. This series of patches improves performance of isPotentiallyReachable, primary by taking advantage of dominator tree properties.

The two properties of dominator trees that isPotentiallyReachable isn't presently taking advantage of (and fixed by this patch series) is the ability to proceed with the blocks that are not dominated by the current block (instead of the immediate successors), and the ability to quickly determine whether the block we just popped off the worklist is dominated by a block we have already visited.

I think it would make sense to have an overall idea of what the plan is in order to start the review, even though we don't need to cover all the details now. Some of the earlier patches temporarily make things worse overall in preparation of improvements that can't be realized until later patches land. As ASCII art:

┌ D60356: https://reviews.llvm.org/D60356
│ When we visit a block and have a DominatorTree, don't simply visit the block's successors. Build up a dominator-frontier (all the blocks that have edges leading to blocks not also dominated). This is done using df_iterator, which is intended for arbitrary DAGs so it keeps its own private Visited set, but that's unnecessary complexity (time/memory) when traversing a simple tree. Other parts of LLVM use DominatorTree::getDescendants.
│
├ D60357: https://reviews.llvm.org/D60357
│ isPotentiallyReachable was only adding blocks it popped off of WorkList to Visited. Also include the blocks that we effectively visited, such as those dominated by the block, and those in the same loop as the block. We should never revisit those either. This is logically the right idea, but could in practice be a large number of blocks.
│
├ D60752: https://reviews.llvm.org/D60752
│ Create a set of VisitedLoops separate from the set of visited blocks. Store the Loop*'s we visit there, saving us the memory of storing all those blocks. The cost is an extra call to getOutermostLoop before we can check whether our block is in a visited loop.
│
├ D60752: https://reviews.llvm.org/D60752
│ When the dominator tree is available, it simply doesn't make sense to walk the basic blocks. At this point, the unified implementation of isPotentiallyReachable with optional DominatorTree and optional LoopInfo no longer makes sense. Split this into a BB-based version ("slow") and a DT-based version ("fast"). The fast version never grows to support excluded blocks (is there a route from A to B without stepping on blocks in {X1, X2, ...}). 
│
│ ┌ D60754: https://reviews.llvm.org/D60754
│ │ Change DominatorTree to allow us to fix the inefficient walk of a DomTreeNode's descendants mentioned in D60356. We don't need the SmallVector of descendants, so we migrate DominatorTree::getDescendants()'s implementation into DominatorTree::forEachDescendant() and reimplement the former by calling the latter. There's a number of implementation options here, such as writing a df_tree_iterator or adding a GraphTraits to indicate that the graph is really a tree.
│ │
└─┼ D60755: https://reviews.llvm.org/D60755
  │ Use DominatorTree::forEachDescendant() to speed up the walk of descendants in isPotentiallyReachable.
  │
  ├ D60756: https://reviews.llvm.org/D60756
  │ Accelerate checking whether we've visited this block before by bisecting over the dominator tree node's dfs-in number. This takes advantage of the fact that the dfs-in .. dfs-out ranges are never partially overlapping: they are either entirely disjoint (for two blocks A and B, Ain < Aout < Bin < Bout, or swap A and B) or entirely overlapping (Ain < Bin < Bout < Aout, or swap A and B). Those are the only four cases. We store the visited list in a form where we never have a block that dominates another block, and sorted by dfs-in number.
  ┊
vsk added a subscriber: vsk.Apr 17 2019, 8:54 AM