Page MenuHomePhabricator

[DDG] DirectedGraph as a base class for various dependence graphs such as DDG and PDG.
ClosedPublic

Authored by bmahjour on Jul 2 2019, 10:58 AM.

Details

Summary

This is an implementation of a directed graph base class with explicit representation of both nodes and edges. This implementation makes the edges explicit because we expect to assign various attributes (such as dependence type, distribution interference weight, etc) to the edges in the derived classes such as DDG and DIG. The DirectedGraph consists of a list of DGNode's. Each node consists of a (possibly empty) list of outgoing edges to other nodes in the graph. A DGEdge contains a reference to a single target node. Note that nodes do not know about their incoming edges so the DirectedGraph class provides a function to find all incoming edges to a given node.

This is the first patch in a series of patches that we are planning to contribute upstream in order to implement Data Dependence Graph and Program Dependence Graph.

More information about the proposed design can be found here: https://ibm.ent.box.com/v/directed-graph-and-ddg

Diff Detail

Repository
rL LLVM

Event Timeline

bmahjour created this revision.Jul 2 2019, 10:58 AM

Tests missing.

bmahjour edited the summary of this revision. (Show Details)Jul 4 2019, 6:55 AM
myhsu added a comment.Jul 4 2019, 7:24 AM

Is there any plan on supporting GraphTraits in this patch? I understand that sometimes it probably will be more suitable for derived class of DirectedGraph to implement GraphTraits. But I see no problem on providing a basic implementation of GraphTraits for DirectedGraph here.

Is there any plan on supporting GraphTraits in this patch?

Not in this patch, but subsequent patches where subclasses of DGNode, DGEdge and DirectedGraph are implemented will have corresponding graph-trait specializations.

I understand that sometimes it probably will be more suitable for derived class of DirectedGraph to implement GraphTraits. But I see no problem on providing a basic implementation of GraphTraits for DirectedGraph here.

The classes defined in this file are really meant to be used as base classes in an inheritance relationship that complete the CRTP idiom started here. In fact it's not possible to construct an object of type DirectedGraph with DGNode and DGEdge types, because each type requires a node type and an edge type as template parameters, and without corresponding subclasses their definitions would be recursive. When client nodes and edges are derived from DGNode and DGEdge, the CRTP idiom can be completed and the concrete node and edge types can be instantiated. For example the DDG implementation will define the graph nodes, edges and the graph-trait specialization as follows:

class DDGNode;
class DDGEdge;
using DDGNodeBase = DGNode<DDGNode, DDGEdge>;
using DDGEdgeBase = DGEdge<DDGNode, DDGEdge>;
using DDGBase = DirectedGraph<DDGNode, DDGEdge>;

class DDGNode : public DDGNodeBase { ... };
class DDGEdge : public DDGEdgeBase {...};
class DataDependenceGraph : public DDGBase {...};

template <> struct GraphTraits<DDGNode *> {...};
template <> struct GraphTraits<DataDependenceGraph *> : public GraphTraits<DDGNode *> {...};

The choice of the CRTP idiom is made to avoid introducing too many virtual function dispatches when interacting with the nodes and edges of the graph.

Tests missing.

The tests will be provided with the patches that extend the DirectedGraph and actually build an instance of it. I realize this patch introduces code without tests, but this is all in an effort to keep the reviews small. If there are any other suggestions on how to break up the reviews I'd be more than happy to look into it.

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

llvm/include/llvm/ADT/DirectedGraph.h
9 ↗(On Diff #207590)

Not just the interface; also a base implementation.

54 ↗(On Diff #207590)

Could this get a more descriptive name, such as Target/TargetNode?

61 ↗(On Diff #207590)

[style] Type declarations typically have the suffix Ty

65 ↗(On Diff #207590)

Make a doxygen comment?

72 ↗(On Diff #207590)

There is a move constructor, but no move-assignment overload.

80 ↗(On Diff #207590)

[style] I often create a protected member NodeType &getDerived() (an const NodeType &getDerived() const) to avoid casting on every use. See e.g. clang::ASTNodeTraverser::getDerived() for the technique.

93 ↗(On Diff #207590)

I understand you might want to re-use one implementation, but return *Edges.front() is a lot shorter (The assertion is redundant: SmallVector::front() already contains it).

118–120 ↗(On Diff #207590)

This makes me worry about scaling. Is this method used often? How large is the worst practical edge list? Did you think about using SetVector?

For the sake of avoiding premature optimization, we might worry about it when it becomes a problem.

129–134 ↗(On Diff #207590)

Use std::remove?

209 ↗(On Diff #207590)

Return size_t?

275–279 ↗(On Diff #207590)

This iterates over the list of outgoing edges 3 times. The find_if is redundant since addEdge already does this.

287 ↗(On Diff #207590)

Is clearing each individual node necessary? I wouldn't expect them to be re-used.

292 ↗(On Diff #207590)

Comment before the member.

bmahjour updated this revision to Diff 208800.Jul 9 2019, 1:47 PM

Address Michael's review comments.

fhahn added a comment.Jul 9 2019, 2:02 PM

Tests missing.

The tests will be provided with the patches that extend the DirectedGraph and actually build an instance of it. I realize this patch introduces code without tests, but this is all in an effort to keep the reviews small. If there are any other suggestions on how to break up the reviews I'd be more than happy to look into it.

I also think it would be good to at least have a simple unit test, that instantiates the template, otherwise we don't even test it builds. IIUC it should not be too hard to instantiate it with very simple node/edge types and test the various functions? That should not increase the size too much.

Another option would be to post follow-up patches with tests/uses and commit them together.

bmahjour marked 15 inline comments as done.Jul 9 2019, 2:07 PM
bmahjour added inline comments.
llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗(On Diff #207590)

This is a good question. The method is used fairly frequently when building the graph. I do not have comprehensive stats on the number of nodes and edges and their ratios when building large applications. The space complexity of the DDG depends on a number of factors such as:

  1. The number of instructions being analyzed. For example if DDG is run as a function pass it is more likely to result in larger number of nodes (and consequently edges) than if it is run as a loop pass. Of course this also depends on the size of the functions and loop bodies.
  2. The quality of the dependence analysis. If dependence analysis gives us pessimistic results we end up creating more edges between nodes.
  3. How well we are able to simplify the graph. Sometimes it's possible to collapse an edge and merge two nodes together. The more we can simplify the less nodes and edges we will have.

Using SetVector will likely help with the compile-time performance, but it comes with a memory trade off. My preliminary tests suggest that time-complexity may be more of an issue than memory consumption, so it maybe a good trade off.

I agree it's better not to do premature optimizations at this point, and instead consider such improvements when more comprehensive stats become available.

129–134 ↗(On Diff #207590)

I'll use the erase-remove idiom.

275–279 ↗(On Diff #207590)

Good point. I'll remove the find_if part.

287 ↗(On Diff #207590)

Not sure if I understand your question...we need to clear each node so that their outgoing edges are removed.

bmahjour marked 2 inline comments as done.Jul 9 2019, 2:14 PM

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

  1. Using pointers gives the clients freedom to use polymorphic behavior.
  2. Using pointers avoids the copy and moves you mentioned. Given that this class is intended to be extended by client code, it's probably better not to assume that the copy/moves will always be cheap.
  3. Using pointers allow us to do the edge optimization you mentioned. This is currently not being done, it's something we can look into if memory usage becomes an issue.
fhahn added inline comments.Jul 9 2019, 2:42 PM
llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗(On Diff #207590)

This is a good question. The method is used fairly frequently when building the graph. I do not have comprehensive stats on the number of nodes and edges and their ratios when building large applications. The space complexity of the DDG depends on a number of factors such as:

The number of instructions being analyzed. For example if DDG is run as a function pass it is more likely to result in larger number of nodes (and consequently edges) than if it is run as a loop pass. Of course this also depends on the size of the functions and loop bodies.

From experience, people pass all kind of crazy code to LLVM and functions as well as loop nests can be huge.

AFAIU you are planning to use this to also represent Def-Use dependencies? Have you considered integrating the existing information present in the IR and have the DDG just integrate it as an overlay?

The quality of the dependence analysis. If dependence analysis gives us pessimistic results we end up creating more edges between nodes.
How well we are able to simplify the graph. Sometimes it's possible to collapse an edge and merge two nodes together. The more we can simplify the less nodes and edges we will have.
Using SetVector will likely help with the compile-time performance, but it comes with a memory trade off. My preliminary tests suggest that time-complexity may be more of an issue than memory consumption, so it maybe a good trade off.

Intuitively I agree that compile-time will be a bigger issue than memory usage, especially as we only need to keep the DDG of a loop nest/function around at a time. IIRC SetVector 'just' uses roughly twice as much memory.

I agree it's better not to do premature optimizations at this point, and instead consider such improvements when more comprehensive stats become available.

I think it is definitely worth discussing/thinking about what the right data structure is here to start with. Compile-time problems, especially the edge cases tend to appear a while after the patches land in master, once they made it an a range of production compilers and are used to compile very large code bases. At that point it is hard to quickly fix the issue.

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

  1. Using pointers gives the clients freedom to use polymorphic behavior.

I don't see why you would go the way implementing compile-time template polymorphism, but introduce vtables in derived classes.

  1. Using pointers avoids the copy and moves you mentioned. Given that this class is intended to be extended by client code, it's probably better not to assume that the copy/moves will always be cheap.

Depends on the size of the edge objects. I'd expect the to be small, maybe just with an enum with the dependence kind. Those are cheap to copy compared to a heap allocation for each edges (cf. pass-by-value vs. pass-by-const-reference). Even if an edge object is large, it can still implement move ctors/assignment operators and a pimpl idiom.

  1. Using pointers allow us to do the edge optimization you mentioned. This is currently not being done, it's something we can look into if memory usage becomes an issue.

I do not recommend this as it raises the complexity significantly: Changing one edges' property might change edges coming from other nodes as well. Only edges with the same target node even could benefit from it.

There might be reasons to prefer allocated edges (such as having an identity), but it depends on how they are used.

This graph is only walkable in one direction. Are you sure you don't need the other direction as well to answer queries such as "which statements must be executed before the current statement" in addition to "which statements can only execute after the current statement"?

llvm/include/llvm/ADT/DirectedGraph.h
173 ↗(On Diff #208800)

The small capacity of the node list (and even better: the container implementation) is an implementation detail and should not be public.

118–120 ↗(On Diff #207590)

Depends on what If transitive dependencies are added explicitly, there an be a lot of edges in the graph.

Could delegate the responsibility of adding a an edge only once to the caller? Very often by construction (newly created edge) this is not even possible. Instead, check it in an assert.

287 ↗(On Diff #207590)

But there are no nodes in the graph after clear(); it is irrelevant what the previous nodes store, they are not part of the graph anymore.

bmahjour marked 6 inline comments as done.Jul 10 2019, 12:21 PM

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

  1. Using pointers gives the clients freedom to use polymorphic behavior.

I don't see why you would go the way implementing compile-time template polymorphism, but introduce vtables in derived classes.

I actually just realized that we cannot have a container of objects due to the CRTP idiom, because the edge type is not a complete type yet. If we don't use static polymorphism, then we need to use dynamic polymorphism and that requires a container of pointers. Either way we need to store pointers, it seems.

This graph is only walkable in one direction. Are you sure you don't need the other direction as well to answer queries such as "which statements must be executed before the current statement" in addition to "which statements can only execute after the current statement"?

Once pi-blocks are formed, we are left with a DAG that can be topologically sorted based on direction of dependencies. At that point the nodes are in a dependency-preserving order.

llvm/include/llvm/ADT/DirectedGraph.h
173 ↗(On Diff #208800)

Ok, I can make it protected, but note that the EdgeListTy would still be public because it's part of a public interface (see findIncomingEdgesToNode bellow).

118–120 ↗(On Diff #207590)

Have you considered integrating the existing information present in the IR and have the DDG just integrate it as an overlay?

Thanks for raising this point, although it is getting a little outside the scope of this patch. We have thought about this approach, but we have not prototyped it yet.

The def-use dependencies are important to track as they can form cycles in the dependence chains, and detecting those cycles is one of the primary jobs of the DDG. Our current prototype creates nodes for all instructions (including those that don't access memory) and materialize def-use edges between them. This has two main advantages: 1. Detecting cycles is easy as we can use an SCCIterator to identify nodes that are part of a cycle, and 2. Code generation is simpler since a topologically sorted DDG can fully represents all the dependent instructions in the right order.

The alternative I can think of is to create a DDG where nodes are created only for instructions that access memory. In order to be able to detect cycles, once the memory edges are established, the def-use chains need to be examined in the IR to see if any of the nodes need to be connected to each other due to def-use dependencies. A DDG in this form will not fully represent all the dependent instructions, so code gen would have to deal with the definitions of dependent uses separately.

We should have a more in-depth discussion about this at one of the loop group meetings and/or as part of the DDG code revision.

Intuitively I agree that compile-time will be a bigger issue than memory usage, especially as we only need to keep the DDG of a loop nest/function around at a time. IIRC SetVector 'just' uses roughly twice as much memory.

Ok, I'll use SetVector instead then.

118–120 ↗(On Diff #207590)

This would imply that the client would have to know about all the outgoing edges before being able to add them to the node. This sounds too restrictive, specially if the graph is to be updated post-construction.

287 ↗(On Diff #207590)

But there are no nodes in the graph after clear(); it is irrelevant what the previous nodes store, they are not part of the graph anymore.

Once the nodes are removed all the handles to the edges become inaccessible, so we need to clear the edges in each node (while we have a chance) before we remove the nodes from the graph.

fhahn added inline comments.Jul 10 2019, 1:31 PM
llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗(On Diff #207590)

Thanks for raising this point, although it is getting a little outside the scope of this patch. We have thought about this approach, but we have not prototyped it yet.

The def-use dependencies are important to track as they can form cycles in the dependence chains, and detecting those cycles is one of the primary jobs of the DDG. Our current prototype creates nodes for all instructions (including those that don't access memory) and materialize def-use edges between them. This has two main advantages: 1. Detecting cycles is easy as we can use an SCCIterator to identify nodes that are part of a cycle, and 2. Code generation is simpler since a topologically sorted DDG can fully represents all the dependent instructions in the right order.

The alternative I can think of is to create a DDG where nodes are created only for instructions that access memory. In order to be able to detect cycles, once the memory edges are established, the def-use chains need to be examined in the IR to see if any of the nodes need to be connected to each other due to def-use dependencies. A DDG in this form will not fully represent all the dependent instructions, so code gen would have to deal with the definitions of dependent uses separately.

We should have a more in-depth discussion about this at one of the loop group meetings and/or as part of the DDG code revision.

Yep we should discuss this in more detail when it comes to the actual DDG implementation. I just wanted to make sure we have this approach on the radar.

bmahjour updated this revision to Diff 209065.Jul 10 2019, 2:12 PM
bmahjour marked an inline comment as done.

Address second round of comments from Michael and Florian.

Meinersbur added inline comments.Jul 10 2019, 2:28 PM
llvm/include/llvm/ADT/DirectedGraph.h
173 ↗(On Diff #208800)

For findIncomingEdgesToNode, use SmallVectorImpl<EdgeType*>, such that the caller can use its own small capacity.

228 ↗(On Diff #208800)

[suggestion] Move the declaration of the list before the loop and clear() it within the loop. This allows reusing the list's memory allocation.

118–120 ↗(On Diff #207590)

The could be a hasEdge method that checks the presence in the outgoing list. Then, if the check is really necessary, it could do

if (!hasEdge(E))
  addEdge(E)

I think in most cases, the call to hasEdge is not even necessary, such as

auto E = new MyEdgeType();
Node->addEdge(E);

Since I just created the edge, it cannot be already be in the outgoing list.

However, this is discussing performance details depending on how it is used which is not part of this patch. Let's delay the discussion to when we do performance optimization.

287 ↗(On Diff #207590)

Once the nodes are removed all the handles to the edges become inaccessible, so we need to clear the edges in each node (while we have a chance) before we remove the nodes from the graph.

Why does the edge list need to be empty for nodes that are not in any graph? It's not for free'ing the SmallVector's memory, that will happen anyway when free'ing the node.

To not to be stuck in details and not block the dependence graph patches, we maybe should land the patch and work on it in-tree when its users materialize?

fhahn added a comment.Jul 10 2019, 3:16 PM

To not to be stuck in details and not block the dependence graph patches, we maybe should land the patch and work on it in-tree when its users materialize?

Sounds good, but I think to land it, it would be good if we have some unit tests for the current implementation, to make sure it builds, we don’t regress and have sanitizer coverage.

bmahjour updated this revision to Diff 209525.Jul 12 2019, 10:23 AM
bmahjour marked 8 inline comments as done.

Address more review comments.

To not to be stuck in details and not block the dependence graph patches, we maybe should land the patch and work on it in-tree when its users materialize?

Sounds good, but I think to land it, it would be good if we have some unit tests for the current implementation, to make sure it builds, we don’t regress and have sanitizer coverage.

Please see the updated patch with tests for various functions of the DirectedGraph plus a GraphTrait instantiation as well as a test of SCC iterator over the directed graph.

bmahjour marked an inline comment as done.Jul 12 2019, 10:26 AM
bmahjour added inline comments.
llvm/include/llvm/ADT/DirectedGraph.h
287 ↗(On Diff #207590)

Why does the edge list need to be empty for nodes that are not in any graph? It's not for free'ing the SmallVector's memory, that will happen anyway when free'ing the node.

Conceptually the graph is a collection of nodes and edges. One would expect the clear operation on the graph to behave as if removeNode is called on all the nodes which would result in the nodes and connected edges to be removed. I'd find it surprising if clear removes the nodes and leaves the edges in place. In any case, I've removed the clear function all together, since it's not all that useful anyway.

bmahjour marked an inline comment as done.Jul 12 2019, 10:30 AM
Meinersbur accepted this revision.Jul 12 2019, 1:12 PM

LGTM

llvm/unittests/ADT/DirectedGraphTest.cpp
288 ↗(On Diff #209525)
This revision is now accepted and ready to land.Jul 12 2019, 1:12 PM
fhahn added inline comments.Jul 15 2019, 2:35 AM
llvm/unittests/ADT/DirectedGraphTest.cpp
20 ↗(On Diff #209525)

using should not be needed, as the code below is wrapped in namespace llvm.

bmahjour updated this revision to Diff 209854.Jul 15 2019, 7:33 AM
bmahjour marked 2 inline comments as done.

Address comments on the unit test.

Meinersbur accepted this revision.Jul 17 2019, 8:54 AM

LGTM

llvm/unittests/ADT/DirectedGraphTest.cpp
20 ↗(On Diff #209525)

@bmahjour This was a small issue with obvious resolution; I don't think you would need to wait for another LGTM for this change.

This revision was automatically updated to reflect the committed changes.