This is an archive of the discontinued LLVM Phabricator instance.

Differential D64088

[DDG] DirectedGraph as a base class for various dependence graphs such as DDG and PDG.
ClosedPublic

Authored by bmahjour on Jul 2 2019, 10:58 AM.

Details

Reviewers

Meinersbur
myhsu
hfinkel
fhahn
jdoerfert
kbarton

Commits

rG8b288c7d11cc: [DDG] DirectedGraph as a base class for various dependence graphs such as DDG…
rL367043: [DDG] DirectedGraph as a base class for various dependence graphs such

Summary

This is an implementation of a directed graph base class with explicit representation of both nodes and edges. This implementation makes the edges explicit because we expect to assign various attributes (such as dependence type, distribution interference weight, etc) to the edges in the derived classes such as DDG and DIG. The DirectedGraph consists of a list of DGNode's. Each node consists of a (possibly empty) list of outgoing edges to other nodes in the graph. A DGEdge contains a reference to a single target node. Note that nodes do not know about their incoming edges so the DirectedGraph class provides a function to find all incoming edges to a given node.
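To make the layout described in the summary concrete, here is a minimal standalone sketch using only the standard library. It is not the code from this patch: `SketchNode`, `SketchEdge`, and `findIncomingEdges` are hypothetical names chosen for illustration. It only shows why finding incoming edges requires a scan over all nodes when each node records outgoing edges alone.

```cpp
#include <cassert>
#include <vector>

struct SketchNode;

// An edge stores only its target; its source is implied by the owning node.
struct SketchEdge {
  SketchNode *Target;
};

// A node stores only its (possibly empty) list of outgoing edges.
struct SketchNode {
  std::vector<SketchEdge *> Out;
};

// Incoming edges are not recorded anywhere, so collecting them means walking
// the outgoing lists of every other node in the graph.
inline std::vector<SketchEdge *>
findIncomingEdges(const std::vector<SketchNode *> &Nodes, const SketchNode &N) {
  std::vector<SketchEdge *> Result;
  for (SketchNode *Src : Nodes) {
    if (Src == &N)
      continue; // skip the node itself, so self-loops are not reported
    for (SketchEdge *E : Src->Out) {
      assert(E->Target && "every edge must have a target");
      if (E->Target == &N)
        Result.push_back(E);
    }
  }
  return Result;
}
```

The cost of this query grows with the total number of edges in the graph, which is the trade-off for keeping nodes unaware of their predecessors.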

This is the first patch in a series of patches that we are planning to contribute upstream in order to implement Data Dependence Graph and Program Dependence Graph.

More information about the proposed design can be found here: https://ibm.ent.box.com/v/directed-graph-and-ddg

Diff Detail

Repository: rL LLVM

Event Timeline

bmahjour created this revision. Jul 2 2019, 10:58 AM

Herald added subscribers: llvm-commits, kristina, dexonsmith. Jul 2 2019, 10:58 AM

Tests missing.

Herald added a subscriber: jsji. Jul 3 2019, 9:50 AM

bmahjour edited the summary of this revision. Jul 4 2019, 6:55 AM

Herald added a subscriber: wuzish. Jul 4 2019, 6:55 AM

Is there any plan on supporting GraphTraits in this patch? I understand that sometimes it probably will be more suitable for derived class of DirectedGraph to implement GraphTraits. But I see no problem on providing a basic implementation of GraphTraits for DirectedGraph here.

In D64088#1570392, @myhsu wrote:

Is there any plan on supporting GraphTraits in this patch?

Not in this patch, but subsequent patches where subclasses of DGNode, DGEdge and DirectedGraph are implemented will have corresponding graph-trait specializations.

I understand that sometimes it probably will be more suitable for derived class of DirectedGraph to implement GraphTraits. But I see no problem on providing a basic implementation of GraphTraits for DirectedGraph here.

The classes defined in this file are really meant to be used as base classes in an inheritance relationship that complete the CRTP idiom started here. In fact it's not possible to construct an object of type DirectedGraph with DGNode and DGEdge types, because each type requires a node type and an edge type as template parameters, and without corresponding subclasses their definitions would be recursive. When client nodes and edges are derived from DGNode and DGEdge, the CRTP idiom can be completed and the concrete node and edge types can be instantiated. For example the DDG implementation will define the graph nodes, edges and the graph-trait specialization as follows:

class DDGNode;
class DDGEdge;
using DDGNodeBase = DGNode<DDGNode, DDGEdge>;
using DDGEdgeBase = DGEdge<DDGNode, DDGEdge>;
using DDGBase = DirectedGraph<DDGNode, DDGEdge>;

class DDGNode : public DDGNodeBase { ... };
class DDGEdge : public DDGEdgeBase {...};
class DataDependenceGraph : public DDGBase {...};

template <> struct GraphTraits<DDGNode *> {...};
template <> struct GraphTraits<DataDependenceGraph *> : public GraphTraits<DDGNode *> {...};

The choice of the CRTP idiom is made to avoid introducing too many virtual function dispatches when interacting with the nodes and edges of the graph.
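As a rough illustration of the point about avoiding virtual dispatch, the following standalone sketch shows how a CRTP base can forward to a derived class at compile time. The names `SketchEdgeBase`, `PlainEdge`, `KindEdge`, and `sameAs` are hypothetical and are not part of the patch; the patch itself exposes this through `operator==`.

```cpp
#include <cassert>

// CRTP base: the derived type is a template parameter, so calls into it are
// resolved statically and no vtable is needed.
template <class EdgeType> class SketchEdgeBase {
public:
  // Forward to whatever isEqualTo the derived class provides.
  bool sameAs(const EdgeType &E) const { return getDerived().isEqualTo(E); }

  // Default behaviour: two edges are equal only if they are the same object.
  bool isEqualTo(const EdgeType &E) const { return &getDerived() == &E; }

protected:
  // Cast 'this' to the derived type once, instead of at every use.
  const EdgeType &getDerived() const {
    return *static_cast<const EdgeType *>(this);
  }
};

// Keeps the default identity comparison.
class PlainEdge : public SketchEdgeBase<PlainEdge> {};

// Hides the default with a comparison by value; still no virtual call.
class KindEdge : public SketchEdgeBase<KindEdge> {
public:
  explicit KindEdge(int K) : Kind(K) {}
  bool isEqualTo(const KindEdge &E) const { return Kind == E.Kind; }

private:
  int Kind;
};
```

Two distinct `PlainEdge` objects compare unequal under `sameAs`, while two `KindEdge` objects built with the same kind compare equal, and in both cases the call is bound at compile time.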

In D64088#1568644, @lebedev.ri wrote:

Tests missing.

The tests will be provided with the patches that extend the DirectedGraph and actually build an instance of it. I realize this patch introduces code without tests, but this is all in an effort to keep the reviews small. If there are any other suggestions on how to break up the reviews I'd be more than happy to look into it.

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

llvm/include/llvm/ADT/DirectedGraph.h
9 ↗	(On Diff #207590)	Not just the interface; also a base implementation.
54 ↗	(On Diff #207590)	Could this get a more descriptive name, such as `Target`/`TargetNode`?
61 ↗	(On Diff #207590)	[style] Type declarations typically have the suffix `Ty`
65 ↗	(On Diff #207590)	Make a doxygen comment?
72 ↗	(On Diff #207590)	There is a move constructor, but no move-assignment overload.
80 ↗	(On Diff #207590)	[style] I often create a protected member `NodeType &getDerived()` (and `const NodeType &getDerived() const`) to avoid casting on every use. See e.g. `clang::ASTNodeTraverser::getDerived()` for the technique.
93 ↗	(On Diff #207590)	I understand you might want to re-use one implementation, but `return *Edges.front()` is a lot shorter (The assertion is redundant: SmallVector::front() already contains it).
118–120 ↗	(On Diff #207590)	This makes me worry about scaling. Is this method used often? How large is the worst practical edge list? Did you think about using `SetVector`? For the sake of avoiding premature optimization, we might worry about it when it becomes a problem.
129–134 ↗	(On Diff #207590)	Use `std::remove`?
209 ↗	(On Diff #207590)	Return `size_t`?
275–279 ↗	(On Diff #207590)	This iterates over the list of outgoing edges 3 times. The `find_if` is redundant since `addEdge` already does this.
287 ↗	(On Diff #207590)	Is clearing each individual node necessary? I wouldn't expect them to be re-used.
292 ↗	(On Diff #207590)	Comment before the member.

Address Michael's review comments.

In D64088#1570715, @bmahjour wrote:

In D64088#1568644, @lebedev.ri wrote:

Tests missing.

The tests will be provided with the patches that extend the DirectedGraph and actually build an instance of it. I realize this patch introduces code without tests, but this is all in an effort to keep the reviews small. If there are any other suggestions on how to break up the reviews I'd be more than happy to look into it.

I also think it would be good to at least have a simple unit test, that instantiates the template, otherwise we don't even test it builds. IIUC it should not be too hard to instantiate it with very simple node/edge types and test the various functions? That should not increase the size too much.

Another option would be to post follow-up patches with tests/uses and commit them together.

bmahjour marked 15 inline comments as done. Jul 9 2019, 2:07 PM

bmahjour added inline comments.

llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗	(On Diff #207590)	This is a good question. The method is used fairly frequently when building the graph. I do not have comprehensive stats on the number of nodes and edges and their ratios when building large applications. The space complexity of the DDG depends on a number of factors, such as: (1) The number of instructions being analyzed. For example, if DDG is run as a function pass it is more likely to result in a larger number of nodes (and consequently edges) than if it is run as a loop pass. Of course this also depends on the size of the functions and loop bodies. (2) The quality of the dependence analysis. If dependence analysis gives us pessimistic results we end up creating more edges between nodes. (3) How well we are able to simplify the graph. Sometimes it's possible to collapse an edge and merge two nodes together. The more we can simplify, the fewer nodes and edges we will have. Using SetVector will likely help with the compile-time performance, but it comes with a memory trade off. My preliminary tests suggest that time-complexity may be more of an issue than memory consumption, so it may be a good trade off. I agree it's better not to do premature optimizations at this point, and instead consider such improvements when more comprehensive stats become available.
129–134 ↗	(On Diff #207590)	I'll use the erase-remove idiom.
275–279 ↗	(On Diff #207590)	Good point. I'll remove the find_if part.
287 ↗	(On Diff #207590)	Not sure if I understand your question...we need to clear each node so that their outgoing edges are removed.
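For readers unfamiliar with the erase-remove idiom mentioned in the reply on lines 129–134, here is a minimal standalone sketch on a `std::vector`. The helper name `eraseValue` is hypothetical. The committed diff ends up storing edges in a `SetVector` and calling its `remove`, so this only illustrates the idiom, not the final code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Remove every occurrence of Value from V and report how many were removed.
// Value must not refer to an element of V, since elements are moved around.
template <class T>
std::size_t eraseValue(std::vector<T> &V, const T &Value) {
  const std::size_t Before = V.size();
  // std::remove shifts the kept elements to the front and returns the new
  // logical end; erase then drops the leftover tail in one call.
  V.erase(std::remove(V.begin(), V.end(), Value), V.end());
  return Before - V.size();
}
```

Compared with a hand-written loop that erases one element at a time, this moves each kept element at most once.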

In D64088#1574555, @Meinersbur wrote:

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

Using pointers gives the clients freedom to use polymorphic behavior.
Using pointers avoids the copy and moves you mentioned. Given that this class is intended to be extended by client code, it's probably better not to assume that the copy/moves will always be cheap.
Using pointers allows us to do the edge optimization you mentioned. This is currently not being done; it's something we can look into if memory usage becomes an issue.

fhahn added inline comments. Jul 9 2019, 2:42 PM

llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗	(On Diff #207590)	"This is a good question. The method is used fairly frequently when building the graph. I do not have comprehensive stats on the number of nodes and edges and their ratios when building large applications. The space complexity of the DDG depends on a number of factors such as: The number of instructions being analyzed. For example if DDG is run as a function pass it is more likely to result in larger number of nodes (and consequently edges) than if it is run as a loop pass. Of course this also depends on the size of the functions and loop bodies."
From experience, people pass all kinds of crazy code to LLVM, and functions as well as loop nests can be huge. AFAIU you are planning to use this to also represent Def-Use dependencies? Have you considered integrating the existing information present in the IR and have the DDG just integrate it as an overlay?
"The quality of the dependence analysis. If dependence analysis gives us pessimistic results we end up creating more edges between nodes. How well we are able to simplify the graph. Sometimes it's possible to collapse an edge and merge two nodes together. The more we can simplify the fewer nodes and edges we will have. Using SetVector will likely help with the compile-time performance, but it comes with a memory trade off. My preliminary tests suggest that time-complexity may be more of an issue than memory consumption, so it may be a good trade off."
Intuitively I agree that compile-time will be a bigger issue than memory usage, especially as we only need to keep the DDG of a loop nest/function around at a time. IIRC SetVector 'just' uses roughly twice as much memory.
"I agree it's better not to do premature optimizations at this point, and instead consider such improvements when more comprehensive stats become available."
I think it is definitely worth discussing/thinking about what the right data structure is here to start with.
Compile-time problems, especially the edge cases, tend to appear a while after the patches land in master, once they have made it into a range of production compilers and are used to compile very large code bases. At that point it is hard to quickly fix the issue.

In D64088#1577050, @bmahjour wrote:

In D64088#1574555, @Meinersbur wrote:

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

Using pointers gives the clients freedom to use polymorphic behavior.

I don't see why you would go the way of implementing compile-time template polymorphism, but then introduce vtables in derived classes.

Using pointers avoids the copy and moves you mentioned. Given that this class is intended to be extended by client code, it's probably better not to assume that the copy/moves will always be cheap.

Depends on the size of the edge objects. I'd expect them to be small, maybe just with an enum with the dependence kind. Those are cheap to copy compared to a heap allocation for each edge (cf. pass-by-value vs. pass-by-const-reference). Even if an edge object is large, it can still implement move ctors/assignment operators and a pimpl idiom.

Using pointers allows us to do the edge optimization you mentioned. This is currently not being done; it's something we can look into if memory usage becomes an issue.

I do not recommend this as it raises the complexity significantly: Changing one edge's property might change edges coming from other nodes as well. Only edges with the same target node could even benefit from it.

There might be reasons to prefer allocated edges (such as having an identity), but it depends on how they are used.

This graph is only walkable in one direction. Are you sure you don't need the other direction as well to answer queries such as "which statements must be executed before the current statement" in addition to "which statements can only execute after the current statement"?

llvm/include/llvm/ADT/DirectedGraph.h
173 ↗	(On Diff #208800)	The small capacity of the node list (and even better: the container implementation) is an implementation detail and should not be public.
118–120 ↗	(On Diff #207590)	Depends on what If transitive dependencies are added explicitly, there can be a lot of edges in the graph. Could we delegate the responsibility of adding an edge only once to the caller? Very often by construction (newly created edge) this is not even possible. Instead, check it in an assert.
287 ↗	(On Diff #207590)	But there are no nodes in the graph after `clear()`; it is irrelevant what the previous nodes store, they are not part of the graph anymore.

In D64088#1577250, @Meinersbur wrote:

In D64088#1577050, @bmahjour wrote:

In D64088#1574555, @Meinersbur wrote:

To reduce the number of allocations, have you thought about making EdgeList contain the edges objects instead of pointers? The edges would have to be copied/moved into the list and edges could not be compared by identity. Is this semantic needed/are edge objects large?

Since the edge class does not contain the source node, the same edge object could be put into multiple outgoing edges lists. Is this supported?

I think we would be better off using pointers in this case, because of the following reasons:

Using pointers gives the clients freedom to use polymorphic behavior.

I don't see why you would go the way of implementing compile-time template polymorphism, but then introduce vtables in derived classes.

I actually just realized that we cannot have a container of objects due to the CRTP idiom, because the edge type is not a complete type yet. If we don't use static polymorphism, then we need to use dynamic polymorphism and that requires a container of pointers. Either way we need to store pointers, it seems.

This graph is only walkable in one direction. Are you sure you don't need the other direction as well to answer queries such as "which statements must be executed before the current statement" in addition to "which statements can only execute after the current statement"?

Once pi-blocks are formed, we are left with a DAG that can be topologically sorted based on direction of dependencies. At that point the nodes are in a dependency-preserving order.
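To illustrate what a dependency-preserving order means once the graph is acyclic, here is a minimal sketch of Kahn's algorithm over plain adjacency lists. It is an assumption-laden illustration, not the DDG implementation: nodes are integer ids, `Succ[U]` lists the targets of edges leaving `U`, and `topologicalOrder` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Returns the nodes in an order where every edge U -> V has U before V.
// Returns an empty vector if the input still contains a cycle.
inline std::vector<std::size_t>
topologicalOrder(const std::vector<std::vector<std::size_t>> &Succ) {
  const std::size_t N = Succ.size();

  // Count incoming edges; nodes only store outgoing ones.
  std::vector<std::size_t> InDegree(N, 0);
  for (const auto &Out : Succ)
    for (std::size_t V : Out) {
      assert(V < N && "edge target out of range");
      ++InDegree[V];
    }

  // Nodes without unprocessed predecessors are ready to be emitted.
  std::queue<std::size_t> Ready;
  for (std::size_t U = 0; U < N; ++U)
    if (InDegree[U] == 0)
      Ready.push(U);

  std::vector<std::size_t> Order;
  while (!Ready.empty()) {
    const std::size_t U = Ready.front();
    Ready.pop();
    Order.push_back(U);
    for (std::size_t V : Succ[U])
      if (--InDegree[V] == 0)
        Ready.push(V);
  }

  if (Order.size() != N)
    Order.clear(); // some node was never released, so a cycle remains
  return Order;
}
```

In such an order, everything a node depends on appears before it, so the "what must run before this statement" question can be answered by position rather than by walking edges backwards.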

llvm/include/llvm/ADT/DirectedGraph.h
173 ↗	(On Diff #208800)	Ok, I can make it protected, but note that the `EdgeListTy` would still be public because it's part of a public interface (see `findIncomingEdgesToNode` below).
118–120 ↗	(On Diff #207590)	"Have you considered integrating the existing information present in the IR and have the DDG just integrate it as an overlay?"
Thanks for raising this point, although it is getting a little outside the scope of this patch. We have thought about this approach, but we have not prototyped it yet. The def-use dependencies are important to track as they can form cycles in the dependence chains, and detecting those cycles is one of the primary jobs of the DDG. Our current prototype creates nodes for all instructions (including those that don't access memory) and materializes def-use edges between them. This has two main advantages: 1. Detecting cycles is easy as we can use an SCCIterator to identify nodes that are part of a cycle, and 2. Code generation is simpler since a topologically sorted DDG can fully represent all the dependent instructions in the right order. The alternative I can think of is to create a DDG where nodes are created only for instructions that access memory. In order to be able to detect cycles, once the memory edges are established, the def-use chains need to be examined in the IR to see if any of the nodes need to be connected to each other due to def-use dependencies. A DDG in this form will not fully represent all the dependent instructions, so code gen would have to deal with the definitions of dependent uses separately. We should have a more in-depth discussion about this at one of the loop group meetings and/or as part of the DDG code revision.
"Intuitively I agree that compile-time will be a bigger issue than memory usage, especially as we only need to keep the DDG of a loop nest/function around at a time. IIRC SetVector 'just' uses roughly twice as much memory."
Ok, I'll use SetVector instead then.
118–120 ↗	(On Diff #207590)	This would imply that the client would have to know about all the outgoing edges before being able to add them to the node. This sounds too restrictive, especially if the graph is to be updated post-construction.
287 ↗	(On Diff #207590)	"But there are no nodes in the graph after clear(); it is irrelevant what the previous nodes store, they are not part of the graph anymore."
Once the nodes are removed all the handles to the edges become inaccessible, so we need to clear the edges in each node (while we have a chance) before we remove the nodes from the graph.
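The reply above relies on strongly connected components to find the nodes that take part in a dependence cycle. As a rough standalone illustration of that idea, here is a sketch of Tarjan's algorithm over integer node ids. It does not use LLVM's SCC iterator; `SccFinder` and `Unvisited` are hypothetical names, and the recursion is kept simple rather than made robust for very deep graphs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t Unvisited = static_cast<std::size_t>(-1);

// Groups nodes into strongly connected components. A component with more
// than one node corresponds to a cycle in the graph.
class SccFinder {
public:
  explicit SccFinder(const std::vector<std::vector<std::size_t>> &Graph)
      : Succ(Graph), Index(Graph.size(), Unvisited), Low(Graph.size(), 0),
        OnStack(Graph.size(), false) {}

  std::vector<std::vector<std::size_t>> run() {
    for (std::size_t U = 0; U < Succ.size(); ++U)
      if (Index[U] == Unvisited)
        visit(U);
    return Components;
  }

private:
  void visit(std::size_t U) {
    Index[U] = Low[U] = Next++;
    Stack.push_back(U);
    OnStack[U] = true;
    for (std::size_t V : Succ[U]) {
      assert(V < Succ.size() && "edge target out of range");
      if (Index[V] == Unvisited) {
        visit(V);
        Low[U] = std::min(Low[U], Low[V]);
      } else if (OnStack[V]) {
        Low[U] = std::min(Low[U], Index[V]);
      }
    }
    if (Low[U] != Index[U])
      return; // U is not the root of its component
    // U is a root: everything above it on the stack is in its component.
    std::vector<std::size_t> Component;
    std::size_t W;
    do {
      W = Stack.back();
      Stack.pop_back();
      OnStack[W] = false;
      Component.push_back(W);
    } while (W != U);
    Components.push_back(Component);
  }

  const std::vector<std::vector<std::size_t>> &Succ; // must outlive the finder
  std::vector<std::size_t> Index, Low;
  std::vector<bool> OnStack;
  std::vector<std::size_t> Stack;
  std::vector<std::vector<std::size_t>> Components;
  std::size_t Next = 0;
};
```

A self-loop on a single node is also a cycle but yields a component of size one, so a real client would need to check for that case separately.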

fhahn added inline comments. Jul 10 2019, 1:31 PM

llvm/include/llvm/ADT/DirectedGraph.h
118–120 ↗	(On Diff #207590)	"Thanks for raising this point, although it is getting a little outside the scope of this patch. We have thought about this approach, but we have not prototyped it yet. The def-use dependencies are important to track as they can form cycles in the dependence chains, and detecting those cycles is one of the primary jobs of the DDG. Our current prototype creates nodes for all instructions (including those that don't access memory) and materializes def-use edges between them. This has two main advantages: 1. Detecting cycles is easy as we can use an SCCIterator to identify nodes that are part of a cycle, and 2. Code generation is simpler since a topologically sorted DDG can fully represent all the dependent instructions in the right order. The alternative I can think of is to create a DDG where nodes are created only for instructions that access memory. In order to be able to detect cycles, once the memory edges are established, the def-use chains need to be examined in the IR to see if any of the nodes need to be connected to each other due to def-use dependencies. A DDG in this form will not fully represent all the dependent instructions, so code gen would have to deal with the definitions of dependent uses separately. We should have a more in-depth discussion about this at one of the loop group meetings and/or as part of the DDG code revision."
Yep, we should discuss this in more detail when it comes to the actual DDG implementation. I just wanted to make sure we have this approach on the radar.

Address second round of comments from Michael and Florian.

Meinersbur added inline comments. Jul 10 2019, 2:28 PM

llvm/include/llvm/ADT/DirectedGraph.h
173 ↗	(On Diff #208800)	For `findIncomingEdgesToNode`, use `SmallVectorImpl<EdgeType*>`, such that the caller can use its own small capacity.
228 ↗	(On Diff #208800)	[suggestion] Move the declaration of the list before the loop and `clear()` it within the loop. This allows reusing the list's memory allocation.
118–120 ↗	(On Diff #207590)	There could be a `hasEdge` method that checks the presence in the outgoing list. Then, if the check is really necessary, it could do `if (!hasEdge(E)) addEdge(E);`. I think in most cases the call to `hasEdge` is not even necessary, such as in `auto E = new MyEdgeType(); Node->addEdge(E);`. Since I just created the edge, it cannot already be in the outgoing list. However, this is discussing performance details depending on how it is used, which is not part of this patch. Let's delay the discussion to when we do performance optimization.
287 ↗	(On Diff #207590)	"Once the nodes are removed all the handles to the edges become inaccessible, so we need to clear the edges in each node (while we have a chance) before we remove the nodes from the graph."
Why does the edge list need to be empty for nodes that are not in any graph? It's not for free'ing the SmallVector's memory, that will happen anyway when free'ing the node.
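The discussion on lines 118–120 settles on `SetVector` for the edge list. As a rough picture of the trade-off being discussed (no linear scan on insertion, at the price of extra memory), here is a standalone sketch of the same idea built from standard containers. `TinySetVector` is a hypothetical stand-in and does not reproduce the interface or internals of `llvm::SetVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// A vector keeps insertion order for iteration; a hash set answers
// "is this value already present?" without scanning the vector.
template <class T> class TinySetVector {
public:
  // Returns true if Value was added and false if it was already present,
  // so callers do not need a separate duplicate check before inserting.
  bool insert(const T &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }

  bool contains(const T &Value) const { return Seen.count(Value) != 0; }
  std::size_t size() const { return Order.size(); }

  const T &operator[](std::size_t I) const {
    assert(I < Order.size() && "index out of range");
    return Order[I];
  }

private:
  std::vector<T> Order;       // iteration order
  std::unordered_set<T> Seen; // membership test; this is the extra memory
};
```

Every element is stored twice, which matches the intuition voiced in the thread that such a container costs roughly double the memory of a plain vector.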

To not be stuck in details and not block the dependence graph patches, maybe we should land the patch and work on it in-tree when its users materialize?

In D64088#1579463, @Meinersbur wrote:

To not be stuck in details and not block the dependence graph patches, maybe we should land the patch and work on it in-tree when its users materialize?

Sounds good, but I think to land it, it would be good if we have some unit tests for the current implementation, to make sure it builds, we don’t regress and have sanitizer coverage.

Address more review comments.

Herald added a subscriber: mgorny. Jul 12 2019, 10:23 AM

In D64088#1579497, @fhahn wrote:

In D64088#1579463, @Meinersbur wrote:

To not be stuck in details and not block the dependence graph patches, maybe we should land the patch and work on it in-tree when its users materialize?

Sounds good, but I think to land it, it would be good if we have some unit tests for the current implementation, to make sure it builds, we don’t regress and have sanitizer coverage.

Please see the updated patch with tests for various functions of the DirectedGraph plus a GraphTrait instantiation as well as a test of SCC iterator over the directed graph.

bmahjour marked an inline comment as done. Jul 12 2019, 10:26 AM

bmahjour added inline comments.

llvm/include/llvm/ADT/DirectedGraph.h
287 ↗	(On Diff #207590)	"Why does the edge list need to be empty for nodes that are not in any graph? It's not for free'ing the SmallVector's memory, that will happen anyway when free'ing the node."
Conceptually the graph is a collection of nodes and edges. One would expect the `clear` operation on the graph to behave as if `removeNode` is called on all the nodes, which would result in the nodes and connected edges being removed. I'd find it surprising if `clear` removes the nodes and leaves the edges in place. In any case, I've removed the `clear` function altogether, since it's not all that useful anyway.

bmahjour marked an inline comment as done. Jul 12 2019, 10:30 AM

LGTM

llvm/unittests/ADT/DirectedGraphTest.cpp
288 ↗	(On Diff #209525)	[style] https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code

This revision is now accepted and ready to land. Jul 12 2019, 1:12 PM

fhahn added inline comments. Jul 15 2019, 2:35 AM

llvm/unittests/ADT/DirectedGraphTest.cpp
20 ↗	(On Diff #209525)	`using` should not be needed, as the code below is wrapped in `namespace llvm`.

Address comments on the unit test.

LGTM

llvm/unittests/ADT/DirectedGraphTest.cpp
20 ↗	(On Diff #209525)	@bmahjour This was a small issue with obvious resolution; I don't think you would need to wait for another LGTM for this change.

Closed by commit rL367043: [DDG] DirectedGraph as a base class for various dependence graphs such (authored by whitneyt). Jul 25 2019, 11:22 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/ADT/DirectedGraph.h	270 lines
llvm/trunk/unittests/ADT/CMakeLists.txt	1 line
llvm/trunk/unittests/ADT/DirectedGraphTest.cpp	295 lines

Diff 211795

llvm/trunk/include/llvm/ADT/DirectedGraph.h

//===- llvm/ADT/DirectedGraph.h - Directed Graph ----------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file defines the interface and a base class implementation for a
// directed graph.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_ADT_DIRECTEDGRAPH_H
#define LLVM_ADT_DIRECTEDGRAPH_H

#include "llvm/ADT/GraphTraits.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"

namespace llvm {

/// Represent an edge in the directed graph.
/// The edge contains the target node it connects to.
template <class NodeType, class EdgeType> class DGEdge {
public:
  DGEdge() = delete;
  /// Create an edge pointing to the given node \p N.
  explicit DGEdge(NodeType &N) : TargetNode(N) {}
  explicit DGEdge(const DGEdge<NodeType, EdgeType> &E)
      : TargetNode(E.TargetNode) {}
  DGEdge<NodeType, EdgeType> &operator=(const DGEdge<NodeType, EdgeType> &E) {
    TargetNode = E.TargetNode;
    return *this;
  }

  /// Static polymorphism: delegate implementation (via isEqualTo) to the
  /// derived class.
  bool operator==(const EdgeType &E) const { return getDerived().isEqualTo(E); }
  bool operator!=(const EdgeType &E) const { return !operator==(E); }

  /// Retrieve the target node this edge connects to.
  const NodeType &getTargetNode() const { return TargetNode; }
  NodeType &getTargetNode() {
    return const_cast<NodeType &>(
        static_cast<const DGEdge<NodeType, EdgeType> &>(*this).getTargetNode());
  }

protected:
  // As the default implementation use address comparison for equality.
  bool isEqualTo(const EdgeType &E) const { return this == &E; }

  // Cast the 'this' pointer to the derived type and return a reference.
  EdgeType &getDerived() { return *static_cast<EdgeType *>(this); }
  const EdgeType &getDerived() const {
    return *static_cast<const EdgeType *>(this);
  }

  // The target node this edge connects to.
  NodeType &TargetNode;
};

/// Represent a node in the directed graph.
/// The node has a (possibly empty) list of outgoing edges.
template <class NodeType, class EdgeType> class DGNode {
public:
  using EdgeListTy = SetVector<EdgeType *>;
  using iterator = typename EdgeListTy::iterator;
  using const_iterator = typename EdgeListTy::const_iterator;

  /// Create a node with a single outgoing edge \p E.
  explicit DGNode(EdgeType &E) : Edges() { Edges.insert(&E); }
  DGNode() = default;

  explicit DGNode(const DGNode<NodeType, EdgeType> &N) : Edges(N.Edges) {}
  DGNode(DGNode<NodeType, EdgeType> &&N) : Edges(std::move(N.Edges)) {}

  DGNode<NodeType, EdgeType> &operator=(const DGNode<NodeType, EdgeType> &N) {
    Edges = N.Edges;
    return *this;
  }
  DGNode<NodeType, EdgeType> &operator=(const DGNode<NodeType, EdgeType> &&N) {
    Edges = std::move(N.Edges);
    return *this;
  }

  /// Static polymorphism: delegate implementation (via isEqualTo) to the
  /// derived class.
  bool operator==(const NodeType &N) const { return getDerived().isEqualTo(N); }
  bool operator!=(const NodeType &N) const { return !operator==(N); }

  const_iterator begin() const { return Edges.begin(); }
  const_iterator end() const { return Edges.end(); }
  iterator begin() { return Edges.begin(); }
  iterator end() { return Edges.end(); }
  const EdgeType &front() const { return *Edges.front(); }
  EdgeType &front() { return *Edges.front(); }
  const EdgeType &back() const { return *Edges.back(); }
  EdgeType &back() { return *Edges.back(); }

  /// Collect in \p EL, all the edges from this node to \p N.
  /// Return true if at least one edge was found, and false otherwise.
  /// Note that this implementation allows more than one edge to connect
  /// a given pair of nodes.
  bool findEdgesTo(const NodeType &N, SmallVectorImpl<EdgeType *> &EL) const {
    assert(EL.empty() && "Expected the list of edges to be empty.");
    for (auto *E : Edges)
      if (E->getTargetNode() == N)
        EL.push_back(E);
    return !EL.empty();
  }

  /// Add the given edge \p E to this node, if it doesn't exist already. Returns
  /// true if the edge is added and false otherwise.
  bool addEdge(EdgeType &E) { return Edges.insert(&E); }

  /// Remove the given edge \p E from this node, if it exists.
  void removeEdge(EdgeType &E) { Edges.remove(&E); }

  /// Test whether there is an edge that goes from this node to \p N.
  bool hasEdgeTo(const NodeType &N) const {
    return (findEdgeTo(N) != Edges.end());
  }

  /// Retrieve the outgoing edges for the node.
  const EdgeListTy &getEdges() const { return Edges; }
  EdgeListTy &getEdges() {
    return const_cast<EdgeListTy &>(
        static_cast<const DGNode<NodeType, EdgeType> &>(*this).Edges);
  }

  /// Clear the outgoing edges.
  void clear() { Edges.clear(); }

protected:
  // As the default implementation use address comparison for equality.
  bool isEqualTo(const NodeType &N) const { return this == &N; }

  // Cast the 'this' pointer to the derived type and return a reference.
  NodeType &getDerived() { return *static_cast<NodeType *>(this); }
  const NodeType &getDerived() const {
    return *static_cast<const NodeType *>(this);
  }

  /// Find an edge to \p N. If more than one edge exists, this will return
  /// the first one in the list of edges.
  const_iterator findEdgeTo(const NodeType &N) const {
    return llvm::find_if(
        Edges, [&N](const EdgeType *E) { return E->getTargetNode() == N; });
  }

  // The list of outgoing edges.
  EdgeListTy Edges;
};

				/// Directed graph
				///
				/// The graph is represented by a table of nodes.
				/// Each node contains a (possibly empty) list of outgoing edges.
				/// Each edge contains the target node it connects to.
				template <class NodeType, class EdgeType> class DirectedGraph {
				protected:
				using NodeListTy = SmallVector<NodeType *, 10>;
				using EdgeListTy = SmallVector<EdgeType *, 10>;
				public:
				using iterator = typename NodeListTy::iterator;
				using const_iterator = typename NodeListTy::const_iterator;
				using DGraphType = DirectedGraph<NodeType, EdgeType>;

				DirectedGraph() = default;
				explicit DirectedGraph(NodeType &N) : Nodes() { addNode(N); }
				DirectedGraph(const DGraphType &G) : Nodes(G.Nodes) {}
				DirectedGraph(DGraphType &&RHS) : Nodes(std::move(RHS.Nodes)) {}
				DGraphType &operator=(const DGraphType &G) {
				Nodes = G.Nodes;
				return *this;
				}
				DGraphType &operator=(DGraphType &&G) {
				Nodes = std::move(G.Nodes);
				return *this;
				}

				const_iterator begin() const { return Nodes.begin(); }
				const_iterator end() const { return Nodes.end(); }
				iterator begin() { return Nodes.begin(); }
				iterator end() { return Nodes.end(); }
				const NodeType &front() const { return *Nodes.front(); }
				NodeType &front() { return *Nodes.front(); }
				const NodeType &back() const { return *Nodes.back(); }
				NodeType &back() { return *Nodes.back(); }

				size_t size() const { return Nodes.size(); }

				/// Find the given node \p N in the table.
				const_iterator findNode(const NodeType &N) const {
				return llvm::find_if(Nodes,
				[&N](const NodeType *Node) { return *Node == N; });
				}
				iterator findNode(const NodeType &N) {
				return const_cast<iterator>(
				static_cast<const DGraphType &>(*this).findNode(N));
				}

				/// Add the given node \p N to the graph if it is not already present.
				bool addNode(NodeType &N) {
				if (findNode(N) != Nodes.end())
				return false;
				Nodes.push_back(&N);
				return true;
				}

				/// Collect in \p EL all edges that are coming into node \p N. Return true
				/// if at least one edge was found, and false otherwise.
				bool findIncomingEdgesToNode(const NodeType &N, SmallVectorImpl<EdgeType*> &EL) const {
				assert(EL.empty() && "Expected the list of edges to be empty.");
				EdgeListTy TempList;
				for (auto *Node : Nodes) {
				if (*Node == N)
				continue;
				Node->findEdgesTo(N, TempList);
				EL.insert(EL.end(), TempList.begin(), TempList.end());
				TempList.clear();
				}
				return !EL.empty();
				}

				/// Remove the given node \p N from the graph. If the node has incoming or
				/// outgoing edges, they are also removed. Return true if the node was found
				/// and then removed, and false if the node was not found in the graph to
				/// begin with.
				bool removeNode(NodeType &N) {
				iterator IT = findNode(N);
				if (IT == Nodes.end())
				return false;
				// Remove incoming edges.
				EdgeListTy EL;
				for (auto *Node : Nodes) {
				if (*Node == N)
				continue;
				Node->findEdgesTo(N, EL);
				for (auto *E : EL)
				Node->removeEdge(*E);
				EL.clear();
				}
				N.clear();
				Nodes.erase(IT);
				return true;
				}

				/// Assuming nodes \p Src and \p Dst are already in the graph, connect node \p
				/// Src to node \p Dst using the provided edge \p E. Return true if \p Src is
				/// not already connected to \p Dst via \p E, and false otherwise.
				bool connect(NodeType &Src, NodeType &Dst, EdgeType &E) {
				assert(findNode(Src) != Nodes.end() && "Src node should be present.");
				assert(findNode(Dst) != Nodes.end() && "Dst node should be present.");
				assert((E.getTargetNode() == Dst) &&
				"Target of the given edge does not match Dst.");
				return Src.addEdge(E);
				}

				protected:
				// The list of nodes in the graph.
				NodeListTy Nodes;
				};

				} // namespace llvm

				#endif // LLVM_ADT_DIRECTEDGRAPH_H
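
The header above stores only outgoing edges on each node, so finding incoming edges requires a scan over all nodes. The following is a minimal standalone sketch of that idea, not the patch's implementation: `SketchNode`, `SketchEdge`, `SketchGraph`, and `countIncomingToFirstNode` are hypothetical names, `std::vector` stands in for the LLVM containers, and plain address identity stands in for `isEqualTo()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for DGEdge/DGNode/DirectedGraph (assumption: plain
// std::vector and address identity are good enough for illustration).
struct SketchNode;

struct SketchEdge {
  explicit SketchEdge(SketchNode &N) : Target(N) {}
  SketchNode &Target; // an edge knows only the node it points to
};

struct SketchNode {
  // Like addEdge(): refuse duplicates and report whether the edge was added.
  bool addEdge(SketchEdge &E) {
    if (std::find(Edges.begin(), Edges.end(), &E) != Edges.end())
      return false;
    Edges.push_back(&E);
    return true;
  }
  std::vector<SketchEdge *> Edges; // outgoing edges only
};

struct SketchGraph {
  // Like addNode(): a node is added at most once.
  bool addNode(SketchNode &N) {
    if (std::find(Nodes.begin(), Nodes.end(), &N) != Nodes.end())
      return false;
    Nodes.push_back(&N);
    return true;
  }

  // Nodes do not record incoming edges, so scan every other node's outgoing
  // list for edges whose target is N.
  std::vector<SketchEdge *> incomingEdges(const SketchNode &N) const {
    std::vector<SketchEdge *> Result;
    for (SketchNode *Node : Nodes) {
      if (Node == &N)
        continue; // as in the header, edges from N to itself are skipped
      for (SketchEdge *E : Node->Edges)
        if (&E->Target == &N)
          Result.push_back(E);
    }
    return Result;
  }

  std::vector<SketchNode *> Nodes;
};

// Builds the cycle N1 -> N2 -> N3 -> N1 and counts the edges coming into N1.
inline std::size_t countIncomingToFirstNode() {
  SketchNode N1, N2, N3;
  SketchEdge E1(N1), E2(N2), E3(N3);
  SketchGraph G;
  G.addNode(N1);
  G.addNode(N2);
  G.addNode(N3);
  N1.addEdge(E2);
  N2.addEdge(E3);
  N3.addEdge(E1);
  return G.incomingEdges(N1).size();
}
```

The scan is linear in the total number of edges, which is the cost of keeping nodes unaware of their incoming edges.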

llvm/trunk/unittests/ADT/CMakeLists.txt

add_llvm_unittest(ADTTests
  BitVectorTest.cpp
  BreadthFirstIteratorTest.cpp
  BumpPtrListTest.cpp
  DAGDeltaAlgorithmTest.cpp
  DeltaAlgorithmTest.cpp
  DenseMapTest.cpp
  DenseSetTest.cpp
  DepthFirstIteratorTest.cpp
  DirectedGraphTest.cpp
  EquivalenceClassesTest.cpp
  FallibleIteratorTest.cpp
  FoldingSet.cpp
  FunctionExtrasTest.cpp
  FunctionRefTest.cpp
  HashingTest.cpp
  IListBaseTest.cpp
  IListIteratorTest.cpp

llvm/trunk/unittests/ADT/DirectedGraphTest.cpp

				//===- llvm/unittest/ADT/DirectedGraphTest.cpp ------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines concrete derivations of the directed-graph base classes
				// for testing purposes.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/DirectedGraph.h"
				#include "llvm/ADT/GraphTraits.h"
				#include "llvm/ADT/SCCIterator.h"
				#include "llvm/ADT/SmallPtrSet.h"
				#include "gtest/gtest.h"

				namespace llvm {

				//===--------------------------------------------------------------------===//
				// Derived nodes, edges and graph types based on DirectedGraph.
				//===--------------------------------------------------------------------===//

				class DGTestNode;
				class DGTestEdge;
				using DGTestNodeBase = DGNode<DGTestNode, DGTestEdge>;
				using DGTestEdgeBase = DGEdge<DGTestNode, DGTestEdge>;
				using DGTestBase = DirectedGraph<DGTestNode, DGTestEdge>;

				class DGTestNode : public DGTestNodeBase {
				public:
				DGTestNode() = default;
				};

				class DGTestEdge : public DGTestEdgeBase {
				public:
				DGTestEdge() = delete;
				DGTestEdge(DGTestNode &N) : DGTestEdgeBase(N) {}
				};

				class DGTestGraph : public DGTestBase {
				public:
				DGTestGraph() = default;
				~DGTestGraph(){};
				};

				using EdgeListTy = SmallVector<DGTestEdge *, 2>;

				//===--------------------------------------------------------------------===//
				// GraphTraits specializations for the DGTest
				//===--------------------------------------------------------------------===//

				template <> struct GraphTraits<DGTestNode *> {
				using NodeRef = DGTestNode *;

				static DGTestNode *DGTestGetTargetNode(DGEdge<DGTestNode, DGTestEdge> *P) {
				return &P->getTargetNode();
				}

				// Provide a mapped iterator so that the GraphTrait-based implementations can
				// find the target nodes without having to explicitly go through the edges.
				using ChildIteratorType =
				mapped_iterator<DGTestNode::iterator, decltype(&DGTestGetTargetNode)>;
				using ChildEdgeIteratorType = DGTestNode::iterator;

				static NodeRef getEntryNode(NodeRef N) { return N; }
				static ChildIteratorType child_begin(NodeRef N) {
				return ChildIteratorType(N->begin(), &DGTestGetTargetNode);
				}
				static ChildIteratorType child_end(NodeRef N) {
				return ChildIteratorType(N->end(), &DGTestGetTargetNode);
				}

				static ChildEdgeIteratorType child_edge_begin(NodeRef N) {
				return N->begin();
				}
				static ChildEdgeIteratorType child_edge_end(NodeRef N) { return N->end(); }
				};

				template <>
				struct GraphTraits<DGTestGraph *> : public GraphTraits<DGTestNode *> {
				using nodes_iterator = DGTestGraph::iterator;
				static NodeRef getEntryNode(DGTestGraph *DG) { return *DG->begin(); }
				static nodes_iterator nodes_begin(DGTestGraph *DG) { return DG->begin(); }
				static nodes_iterator nodes_end(DGTestGraph *DG) { return DG->end(); }
				};

				//===--------------------------------------------------------------------===//
				// Test various modification and query functions.
				//===--------------------------------------------------------------------===//

				TEST(DirectedGraphTest, AddAndConnectNodes) {
				DGTestGraph DG;
				DGTestNode N1, N2, N3;
				DGTestEdge E1(N1), E2(N2), E3(N3);

				// Check that new nodes can be added successfully.
				EXPECT_TRUE(DG.addNode(N1));
				EXPECT_TRUE(DG.addNode(N2));
				EXPECT_TRUE(DG.addNode(N3));

				// Check that duplicate nodes are not added to the graph.
				EXPECT_FALSE(DG.addNode(N1));

				// Check that nodes can be connected using valid edges with no errors.
				EXPECT_TRUE(DG.connect(N1, N2, E2));
				EXPECT_TRUE(DG.connect(N2, N3, E3));
				EXPECT_TRUE(DG.connect(N3, N1, E1));

				// The graph looks like this now:
				//
				// +---------------+
				// v               |
				// N1 -> N2 -> N3 -+

				// Check that nodes already connected by the given edge are not connected
				// again (i.e. edges between nodes are not duplicated).
				EXPECT_FALSE(DG.connect(N3, N1, E1));

				// Check that there are 3 nodes in the graph.
				EXPECT_TRUE(DG.size() == 3);

				// Check that the added nodes can be found in the graph.
				EXPECT_NE(DG.findNode(N3), DG.end());

				// Check that nodes that are not part of the graph are not found.
				DGTestNode N4;
				EXPECT_EQ(DG.findNode(N4), DG.end());

				// Check that findIncomingEdgesToNode works correctly.
				EdgeListTy EL;
				EXPECT_TRUE(DG.findIncomingEdgesToNode(N1, EL));
				EXPECT_TRUE(EL.size() == 1);
				EXPECT_EQ(*EL[0], E1);
				}

				TEST(DirectedGraphTest, AddRemoveEdge) {
				DGTestGraph DG;
				DGTestNode N1, N2, N3;
				DGTestEdge E1(N1), E2(N2), E3(N3);
				DG.addNode(N1);
				DG.addNode(N2);
				DG.addNode(N3);
				DG.connect(N1, N2, E2);
				DG.connect(N2, N3, E3);
				DG.connect(N3, N1, E1);

				// The graph looks like this now:
				//
				// +---------------+
				// v               |
				// N1 -> N2 -> N3 -+

				// Check that there are 3 nodes in the graph.
				EXPECT_TRUE(DG.size() == 3);

				// Check that the target nodes of the edges are correct.
				EXPECT_EQ(E1.getTargetNode(), N1);
				EXPECT_EQ(E2.getTargetNode(), N2);
				EXPECT_EQ(E3.getTargetNode(), N3);

				// Remove the edge from N1 to N2.
				N1.removeEdge(E2);

				// The graph looks like this now:
				//
				// N2 -> N3 -> N1

				// Check that there are no incoming edges to N2.
				EdgeListTy EL;
				EXPECT_FALSE(DG.findIncomingEdgesToNode(N2, EL));
				EXPECT_TRUE(EL.empty());

				// Put the edge from N1 to N2 back in place.
				N1.addEdge(E2);

				// Check that E2 is the only incoming edge to N2.
				EL.clear();
				EXPECT_TRUE(DG.findIncomingEdgesToNode(N2, EL));
				EXPECT_EQ(*EL[0], E2);
				}

				TEST(DirectedGraphTest, hasEdgeTo) {
				DGTestGraph DG;
				DGTestNode N1, N2, N3;
				DGTestEdge E1(N1), E2(N2), E3(N3), E4(N1);
				DG.addNode(N1);
				DG.addNode(N2);
				DG.addNode(N3);
				DG.connect(N1, N2, E2);
				DG.connect(N2, N3, E3);
				DG.connect(N3, N1, E1);
				DG.connect(N2, N1, E4);

				// The graph looks like this now:
				//
				// +-----+
				// v     |
				// N1 -> N2 -> N3
				// ^           |
				// +-----------+

				EXPECT_TRUE(N2.hasEdgeTo(N1));
				EXPECT_TRUE(N3.hasEdgeTo(N1));
				}

				TEST(DirectedGraphTest, AddRemoveNode) {
				DGTestGraph DG;
				DGTestNode N1, N2, N3;
				DGTestEdge E1(N1), E2(N2), E3(N3);
				DG.addNode(N1);
				DG.addNode(N2);
				DG.addNode(N3);
				DG.connect(N1, N2, E2);
				DG.connect(N2, N3, E3);
				DG.connect(N3, N1, E1);

				// The graph looks like this now:
				//
				// +---------------+
				// v               |
				// N1 -> N2 -> N3 -+

				// Check that there are 3 nodes in the graph.
				EXPECT_TRUE(DG.size() == 3);

				// Check that a node in the graph can be removed, but not more than once.
				EXPECT_TRUE(DG.removeNode(N1));
				EXPECT_EQ(DG.findNode(N1), DG.end());
				EXPECT_FALSE(DG.removeNode(N1));

				// The graph looks like this now:
				//
				// N2 -> N3

				// Check that there are 2 nodes in the graph and only N2 is connected to N3.
				EXPECT_TRUE(DG.size() == 2);
				EXPECT_TRUE(N3.getEdges().empty());
				EdgeListTy EL;
				EXPECT_FALSE(DG.findIncomingEdgesToNode(N2, EL));
				EXPECT_TRUE(EL.empty());
				}

				TEST(DirectedGraphTest, SCC) {

				DGTestGraph DG;
				DGTestNode N1, N2, N3, N4;
				DGTestEdge E1(N1), E2(N2), E3(N3), E4(N4);
				DG.addNode(N1);
				DG.addNode(N2);
				DG.addNode(N3);
				DG.addNode(N4);
				DG.connect(N1, N2, E2);
				DG.connect(N2, N3, E3);
				DG.connect(N3, N1, E1);
				DG.connect(N3, N4, E4);

				// The graph looks like this now:
				//
				// +---------------+
				// v               |
				// N1 -> N2 -> N3 -+    N4
				//             |        ^
				//             +--------+

				// Test that there are two SCCs:
				// 1. {N1, N2, N3}
				// 2. {N4}
				using NodeListTy = SmallPtrSet<DGTestNode *, 3>;
				SmallVector<NodeListTy, 4> ListOfSCCs;
				for (auto &SCC : make_range(scc_begin(&DG), scc_end(&DG)))
				ListOfSCCs.push_back(NodeListTy(SCC.begin(), SCC.end()));

				EXPECT_TRUE(ListOfSCCs.size() == 2);

				for (auto &SCC : ListOfSCCs) {
				if (SCC.size() > 1)
				continue;
				EXPECT_TRUE(SCC.size() == 1);
				EXPECT_TRUE(SCC.count(&N4) == 1);
				}
				for (auto &SCC : ListOfSCCs) {
				if (SCC.size() <= 1)
				continue;
				EXPECT_TRUE(SCC.size() == 3);
				EXPECT_TRUE(SCC.count(&N1) == 1);
				EXPECT_TRUE(SCC.count(&N2) == 1);
				EXPECT_TRUE(SCC.count(&N3) == 1);
				EXPECT_TRUE(SCC.count(&N4) == 0);
				}
				}

				} // namespace llvm
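
The test types above derive from base classes that take the derived type as a template argument, which is how `getDerived()` and `isEqualTo()` in the header can work without virtual functions. Below is a small standalone sketch of that pattern under stated assumptions: `NodeBase`, `AddressNode`, `IdNode`, and the named method `equals()` are hypothetical and do not appear in the patch; only the idea of forwarding a comparison through a `getDerived()`-style cast is taken from the header.

```cpp
#include <cassert>

// Hypothetical base class: the comparison is forwarded to the derived type,
// which may shadow isEqualTo() to replace the default address comparison.
template <class Derived> class NodeBase {
public:
  bool equals(const Derived &Other) const {
    return getDerived().isEqualTo(Other);
  }

protected:
  // Default: two nodes are equal only if they are the same object.
  bool isEqualTo(const Derived &Other) const { return this == &Other; }

  // Cast 'this' to the derived type, as getDerived() does in the header.
  const Derived &getDerived() const {
    return *static_cast<const Derived *>(this);
  }
};

// Relies on the default address comparison.
class AddressNode : public NodeBase<AddressNode> {};

// Shadows isEqualTo() so that equality is decided by a payload instead.
class IdNode : public NodeBase<IdNode> {
public:
  explicit IdNode(int Id) : Id(Id) {}
  bool isEqualTo(const IdNode &Other) const { return Id == Other.Id; }

private:
  int Id;
};

// Two distinct default nodes: compared by address.
inline bool distinctAddressNodesCompareEqual() {
  AddressNode A, B;
  return A.equals(B);
}

// Two distinct nodes carrying the same id: compared by payload.
inline bool sameIdNodesCompareEqual() {
  IdNode A(7), B(7);
  return A.equals(B);
}
```

Because the call is resolved at compile time through the template argument, a derived node type can change what equality means without adding a vtable to every node.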