This is an archive of the discontinued LLVM Phabricator instance.

Expose CFG Update struct. Define GraphTraits to get children given a snapshot CFG.
ClosedPublic

Authored by asbirlea on Aug 8 2018, 2:59 PM.

Details

Summary

Certain passes or analysis need to view a CFG snapshot rather than the actual CFG. This patch provides GraphTraits to offer such a view.

The patch defines GraphTraits for BasicBlock* and Inverse<BasicBlock*> to provide CFG successors and predecessors based on a list of CFG updates.

An Update is defined as a triple {InsertOrDeleteKind, BlockStartOfEdge, BlockEndOfEdge}.
A GraphDiff is defined as a list of Updates that has been preprocessed to treat the CFG as a graph rather than a multi-graph. As such, there can only exist a single Update given two nodes. All duplicates will be filtered and Insert/Delete edges that cancel out will be ignored.
The methods GraphDiff exposes are:

  • Determine if an existing child needs to be ignored, i.e. an Update exists in the correct direction to assume the removal of that edge.
  • Return a list of new children to be considered, i.e. an Update exists in the correct direction for each child in the list to assume the insertion of that edge.

Diff Detail

Repository
rL LLVM

Event Timeline

asbirlea created this revision.Aug 8 2018, 2:59 PM

Looking for feedback on this before moving forward. Some questions:

  1. Namespace naming (cfg) and placement of Update struct.
  2. Move WrappedPairNodeDataIterator somewhere in ADT, to reuse in LoopIterator.h?
  3. The part to pre-process the Updates is not part of this patch. I would like to move GenericDomTreeConstruction::LegalizeUpdates somewhere in Support outside DomTreeBuilder namespace, in order to reuse the functionality. Where is a good place to have this, or are there objections to moving it?

Should these follow-up patches be part of this one?

  1. Reference cfg::Update in DomTree, remove Update from DomTreeBuilder namespace.
  2. Teach IDF to use children based on a GraphDiff, given the new GraphTraits in this patch.

One thing that is not clear to me is how you decide whether to apply the updates on top of the current CFG or to reverse-apply them. DomTreeConstruction needs the latter, but I don't know what makes sense in your use-case.

I think that this should eventually replace the diffing functionality in the dominator tree construction. This should be also perhaps useful if we decide to go with CFG snapshots for easier incremental updates, like in @NutshellySima 's prototypes.

Looking for feedback on this before moving forward. Some questions:

  1. Namespace naming (cfg) and placement of Update struct.

Fine with me.

  1. Move WrappedPairNodeDataIterator somewhere in ADT, to reuse in LoopIterator.h?

Not familiar enough to suggest anything :/.

  1. The part to pre-process the Updates is not part of this patch. I would like to move GenericDomTreeConstruction::LegalizeUpdates somewhere in Support outside DomTreeBuilder namespace, in order to reuse the functionality. Where is a good place to have this, or are there objections to moving it?

Fine with migrating it somewhere. I would keep it close to the update struct.

Should these follow-up patches be part of this one?

  1. Reference cfg::Update in DomTree, remove Update from DomTreeBuilder namespace.

Fine with either now or a separate patch.

  1. Teach IDF to use children based on a GraphDiff, given the new GraphTraits in this patch.

Where does IDF calculation need the diffing functionality?

include/llvm/IR/CFGDiff.h
41 ↗(On Diff #159823)

It wan't obvious for me what this is for until I grepped for the field name and saw how it's used -- I'd add a comment saying why you need it.

71 ↗(On Diff #159823)

Why is returning a pointer better than reference? Can this ever return nullptr?

79 ↗(On Diff #159823)

I think it would be *extremely* useful to add a dump method here.

Right, I'm defining Inverse but not IsPostDom (from ChildrenGetter).
Adding a second bool is an easy option, and defining 2 more traits with:
std::pair<const GraphDiff<Inverse<BasicBlock *>> *, BasicBlock *>>
and
std::pair<const GraphDiff<Inverse<BasicBlock *>> *, Inverse<BasicBlock *>>>
These would set the second bool for the GraphDiff calls to reflect IsPostDom. NodeRef would remain the same.

The children call site in DomTree would look something like:
children<GraphDiffBBPair>({GD, BB})), with
using GraphDiffBBPair = std::pair<const GraphDiff<BasicBlock*> *, Inverse<BasicBlock*>; for predecessors in DomTree (and so on for all combos)

I agree that long term this can and should be reused in DomTree, so I'm trying to make it have the right functionality.
Another possibly related issue, is that I'm not reversing the children, as in DomTree, not sure the reason behind that one.

Another issue with the above traits: is there a way to avoid some of the code duplication, since the only differences are the bool arguments to GraphDiff and succ/pred iterators?

include/llvm/IR/CFGDiff.h
71 ↗(On Diff #159823)

I really wanted to define the GraphTraits in the other header as a pair of a *const* GraphDiff* and BasicBlock*. This meant that I needed to have the methods const as well and they cannot return a reference. No, this can never return null.

The alternative to all this is to drop the const in the GraphTraits std::pair below and all the related const attributes. Personal preference is to keep them, but if folks have strong preferences otherwise, I'm open to changing it.

kuhar added a comment.Aug 9 2018, 1:05 PM

Another issue with the above traits: is there a way to avoid some of the code duplication, since the only differences are the bool arguments to GraphDiff and succ/pred iterators?

I think you can create a base class with extra template parameters and then inherit from it, specializing the parameters you want. There's something similar in BlockFrequencyInfoImpl.h.

kuhar added a comment.Aug 9 2018, 1:07 PM

I agree that long term this can and should be reused in DomTree, so I'm trying to make it have the right functionality.
Another possibly related issue, is that I'm not reversing the children, as in DomTree, not sure the reason behind that one.

Sure. The children are being reversed to preserve the successor order in DFS, which uses stack and reverses them (again) internally. But this reverse can be moved somewhere else and you don't have to worry about it.

timshen added inline comments.Aug 9 2018, 5:08 PM
include/llvm/IR/CFGDiff.h
59 ↗(On Diff #159823)

Can we have a void verify() const that verifies the class invariants? e.g. whether duplicated edges are allowed, can one edge exist in both delete and insert, etc. We can call it with the guards of #ifdef EXPENSIVE_CHECKS.

71 ↗(On Diff #159823)

Would it be better to return iterator_range(SmallVectorImpl<NodePtr>::const_iterator)?

110 ↗(On Diff #159823)

Optional: The two GraphTraits specializations are highly duplicated. Maybe find a way to de-duplicate them?

The only differences between the two specializations are (1) succ_iterator vs succ_iterator, (2) succ_begin/succ_end vs pred_begin/pred_end, and (3) false vs true gets passed to getAddedChildren.

Both (1) and (2) can be abstracted in a common place. Maybe add some "inverse utility"?

include/llvm/Support/CFGUpdate.h
22 ↗(On Diff #159823)

Due to the use of PointerIntPair, it's better be a class and hides the data members.

38 ↗(On Diff #159823)

Can this be a print() and/or dump() function, consistent with other types?

asbirlea updated this revision to Diff 160164.Aug 10 2018, 11:54 AM
asbirlea marked 6 inline comments as done.

Address most comments.
Partially resolved GraphTraits duplication, working out if I can fully parametrize on succ/pred.

asbirlea added inline comments.Aug 10 2018, 11:54 AM
include/llvm/IR/CFGDiff.h
59 ↗(On Diff #159823)

I added the LegalizeUpdates call that should guarantee the specified behavior. Let me know if this is a good solution.

71 ↗(On Diff #159823)

Works for me, updated.

asbirlea marked 2 inline comments as done.Aug 10 2018, 11:55 AM
kuhar added inline comments.Aug 10 2018, 12:05 PM
include/llvm/IR/CFGDiff.h
60 ↗(On Diff #160164)

It's not clear to me what InverseGraph means here.

82 ↗(On Diff #160164)

return llvm::find(EdgesForBB, EdgeEnd) != EdgesForBB.end();

101 ↗(On Diff #160164)

missing )? :P

113 ↗(On Diff #160164)

please guard this with the macro for dump methods

include/llvm/Support/CFGUpdate.h
47 ↗(On Diff #160164)

Same as above - please guard this with the macro.

timshen accepted this revision.Aug 10 2018, 1:37 PM
timshen added inline comments.
include/llvm/IR/CFGDiff.h
60 ↗(On Diff #160164)

I imagine that if InverseGraph, then To is a predecessor of From. If not InverseGraph, To is a successor of From. Maybe you want to document that?

87 ↗(On Diff #160164)

"const" is not necessary here. It doesn't prevent the caller from removing the const anyway.

include/llvm/Support/CFGUpdate.h
72 ↗(On Diff #160164)

There is no need to swap?

95 ↗(On Diff #160164)

If the previous swap is removed, I guess this will always be Operations[{U.getFrom(), U.getTo()}] = int(i);.

This revision is now accepted and ready to land.Aug 10 2018, 1:37 PM
kuhar added inline comments.Aug 10 2018, 1:47 PM
include/llvm/IR/CFGDiff.h
60 ↗(On Diff #160164)

What confuses me is it's already possible to Pass Inverse<BB*> as the template arguments, so it's not clear if InverseGraph is for the graph, the updates, or if it means to swap insertions and deletions of edges.

It would be really helpful to add a comment about it above.

include/llvm/Support/CFGUpdate.h
72 ↗(On Diff #160164)

DomTree and PostDomTree allow you to pass the same update vector to both, but since PostDomTree operates on inverse CFG, you have to swap somewhere to make it work.

I believe this was just copied from the domtree code.

timshen requested changes to this revision.Aug 10 2018, 3:41 PM

Unintentional LGTM. But it's close. :)

This revision now requires changes to proceed.Aug 10 2018, 3:41 PM
asbirlea marked 5 inline comments as done.Aug 10 2018, 5:05 PM
asbirlea added inline comments.
include/llvm/IR/CFGDiff.h
60 ↗(On Diff #160164)

Let me try to clarify what I had in mind and please let me know if this makes sense in terms of generality. There is also a comment about this below which I'd be happy to move around and expand to give clarity.

There are 2 boolean values used consistently throughout this file: InverseGraph and InverseEdge. InverseGraph refers to having a normal graph, or viewing the graph in reverse, which is analogous with reverse-applying updates.
InverseEdge refers to the direction of the edges you're looking for. For a non-inversed graph, that's naturally successors when InverseEdge is false and predecessors when InverseEdge is true.

Below there are two classes that call into GraphDiff:
CFGViewSuccessors and CFGViewPredecessors, both can be parametrized to consider the graph inverted or not (i.e. InverseGraph). Successors implicitly has InverseEdge = false and Predecessors implicitly has InverseEdge = true (see calls to GraphDiff methods in there).

In the GraphTraits instantiations:

  • GraphDiff<BasicBlock *> is equivalent to InverseGraph = false
  • GraphDiff<Inverse<BasicBlock *>> is equivalent to InverseGraph = true
  • second pair item is BasicBlock *, then InverseEdge = false
  • second pair item is Inverse<BasicBlock *>, then InverseEdge = true

The 4 GraphTraits are as follows:

  1. std::pair<const GraphDiff<BasicBlock *> *, BasicBlock *>> : CFGViewSuccessors<false>

Regular CFG, children means successors, InverseGraph = false, InverseEdge = false.

  1. std::pair<const GraphDiff<Inverse<BasicBlock *>> *, BasicBlock *>> : CFGViewSuccessors<true>

Reverse the graph, get successors but reverse-apply updates, InverseGraph = true, InverseEdge = false.

  1. std::pair<const GraphDiff<BasicBlock *> *, Inverse<BasicBlock *>>> : CFGViewPredecessors<false>

Regular CFG, reverse edges, so children mean predecessors, InverseGraph = false, InverseEdge = true.

  1. std::pair<const GraphDiff<Inverse<BasicBlock *>> *, Inverse<BasicBlock *>> : CFGViewPredecessors<true>

Reverse the graph and the edges, InverseGraph = true, InverseEdge = true.

If this makes sense, I'll make this explanation into a comment before GraphDiff, rather than before CFGViewSuccessors.

include/llvm/Support/CFGUpdate.h
72 ↗(On Diff #160164)

The swap is intentional, for reverse-applying updates. It refers to the first bool: InverseGraph and should be used for PostDomTree.
Yes, the method is copied from the domtree code. I have a dependent patch to clean that up (remove Update struct and legalizeUpdates).

asbirlea updated this revision to Diff 160386.Aug 13 2018, 10:20 AM

Updated documentation in CFGDiff to describe InverseEdge adn InverseGraph.
Moved wrapper iterator to ADT/iterator.h

Thanks a lot for the comments so far!

asbirlea marked 10 inline comments as done.Aug 13 2018, 10:22 AM
kuhar accepted this revision.Aug 13 2018, 10:23 AM

Thanks for the changes. LGTM.

Thank you for the review, Kuba! I'll wait for Tim's input as well in case he has more comments.

timshen accepted this revision.Aug 13 2018, 11:02 AM
timshen added inline comments.
include/llvm/Support/CFGUpdate.h
105 ↗(On Diff #160386)

Just want to make sure that this sorts to descending order.

This revision is now accepted and ready to land.Aug 13 2018, 11:02 AM
asbirlea marked an inline comment as done.Aug 13 2018, 4:01 PM
asbirlea added inline comments.
include/llvm/Support/CFGUpdate.h
105 ↗(On Diff #160386)

Yes, that's right. The reason is that DomTree then applies updates from the end of the vector, so they will be applied in the original order.

My use case of GraphDiff does not care about update order and I don't want to change the order for DomTree, so I'll leave it like this in this patch.
I'm planning to refactor DomTree to use GraphDiff instead of its inner data structure (BatchUpdaterInfo), so then I can tell DT to use the Updates in order, and also reverse the sort here.

This revision was automatically updated to reflect the committed changes.
asbirlea marked an inline comment as done.