This is an archive of the discontinued LLVM Phabricator instance.

[DDG] Data Dependence Graph Basics
ClosedPublic

Authored by bmahjour on Jul 26 2019, 2:39 PM.

Details

Summary

This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:

D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.

This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.

The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Diff Detail

Repository
rL LLVM

Event Timeline

bmahjour created this revision.Jul 26 2019, 2:39 PM
Meinersbur added inline comments.Jul 29 2019, 4:24 PM
llvm/include/llvm/Analysis/DDG.h
69

[style] By LLVM's coding standard, method names start with lower case.

194

[serious] Please write something about the algorithmic complexity of this.

213–214

[style] Most often the LLVM uses the Ty suffix for typedefs/using.

230

[suggestion] Instead of marking the method as final, maybe mark the entire class as final?

274

[style] ResultTy?

299

This type is not a reference, but a pointer.

llvm/lib/Analysis/DDG.cpp
33

[nit] Parens around cast unnecessary

134

Could you either pass a non-const Function or use a list of constBasicBlock?

135

[subjective] This is a good illustration of one of the reasons why I dislike doing more than just initializing an object inside its constructor. When calling a virtual function, it will not call the overriden function of the class that will be returned, but the function that it statically resolves to.

In this case, the populateGraph method is marked as final, so there cannot be overriden in another subclass. But then, there is no point in making populateGraph virtual as it is protected and called nowhere else, not even in the base class.

My suggestion is to let a method of DDGBuilder return a populated graph instead of doing it in the constructor.

139–141

[suggestion] Instead of the long initializer expression, assign it inside the body?

158–160

[style] return DDGBase::addNode(N)

llvm/lib/Analysis/DependenceGraphBuilder.cpp
40

[style] LLVM does not make use of almost-always auto.

120

Why this const_cast? Can we avoid it?

myhsu added inline comments.Jul 31 2019, 8:47 AM
llvm/include/llvm/Analysis/DDG.h
69

Maybe we can use llvm::function_ref here?

bmahjour marked 23 inline comments as done.Jul 31 2019, 12:26 PM

Addressed comments from Michael and Min-Yih.

llvm/include/llvm/Analysis/DDG.h
194

Sure. I have put a comment on the populate function of AbstractDependenceGraphBuilder class.

213–214

If I change the suffix only for the usings, they would look inconsistent with the template type names. For example we'll have:

template <class GraphType> class AbstractDependenceGraphBuilder {
  using NodeTy = typename GraphType::NodeTy;

Notice GraphType vs NodeTy. Unless you'd like me to rename the template type names too, I think it looks and reads better the way it is right now. By the way, I couldn't find anything on this topic in the LLVM Coding Standards.

230

I will remove this function.

274

The pass manager interface requires each pass to have a type named Result. eg in include/llvm/IR/PassManagerInternal.h:

AnalysisResultModel<IRUnitT, PassT, typename PassT::Result

299

Again, the templated implementation of GraphTraits and related iterators require this type to be named NodeRef. Please see llvm/ADT/GraphTraits.h and llvm/ADT/SCCIterator.h for example.

llvm/lib/Analysis/DDG.cpp
134

Unfortunately many of LLVM interfaces take non-const Instructions (including DependenceInfo::depends), so even if I change the list types to be lists of const BasicBlocks and const Instructions, we would need to do a const_cast at some point to interface with the rest of LLVM. I'll change the type of the parameter F to be a non-const Function :(

135

Fair enough, I'll remove the populateGraph function all together.

139–141

Please note that I'm trying to call one of the base class constructors that takes arguments. I think the only way to call it without also calling the default constructor is to use the initializer list.

158–160

actually this function does nothing more than calling the base class addNode until we add the pi-blocks, so I'll remove it from this patch.

llvm/lib/Analysis/DependenceGraphBuilder.cpp
120

AbstractDependenceGraphBuilder holds a constant reference to the result of the dependence analysis (DI). Unfortunately DependenceInfo::depends is not marked as constant although it doesn't modify anything. A few ways to solve this:

  1. Make DI a non-const reference.
  2. Do a const_cast when calling depends.
  3. Make depends constant (can go viral and is beyond the scope of this work)

I personally prefer (2) as it keeps the const_cast where the const chain broke down. If depends is improved and made const we just need to drop this const_cast, and in the meantime we prevent code changes that inadvertently modify DI in this implementation.

bmahjour updated this revision to Diff 212641.Jul 31 2019, 12:28 PM
bmahjour marked 9 inline comments as done.

Addressed comments from Michael and Min-Yih.

A friendly reminder to kindly review and provide comments/approval. Thank you!

LGTM, with some nitpicks.

llvm/include/llvm/Analysis/DDG.h
108

[style] Why not

return InstrList;

instead of those casts?

Additionally, I don't see the harm of returning and empty instruction list. Especially if the function is used to add instructions to the node.

118

[suggestion] There often is also static bool classof(const SimpleDDGNode *) { return true; } for the case when the compiler already knows the type.

213–214

It's not in the coding standard document, but as mentioned it is something you see a lot in the existing code base. I find it useful as the compiler would be confused if one writes e.g.

using SomeList = SmallVector<...>;
...
SomeList SomeList;

However, I see you are trying to match the template argument names (And indeed I don't see the Ty suffix in template argument names in the code base). Therefore, I think your choice makes sense here.

274

OK, didn't notice this.

299

Sorry, didn't see this as a requirement of GraphTraits

llvm/include/llvm/Analysis/DependenceGraphBuilder.h
26–27

[style] The names "BasicBlockList" and "InstructionList" a very general to be put into the entire llvm:: namespace and could conflict with other headers that use lists of instructions and BBs. Maybe move them into one of the DDG classes?

Also the Ty suffix for typedefs would apply here.

llvm/lib/Analysis/DDG.cpp
139–141

OK.

llvm/lib/Analysis/DependenceGraphBuilder.cpp
103

[style] By coding standard , the end-iterator should be cached (the list of nodes is not modified within the body, right)?

120

DI is created by this patch in DDGAnalysis::run and don't see a reason to pass it as const & if the class was not designed for it. That is, I'd prefer (1).

fhahn added a comment.Aug 19 2019, 3:34 PM

This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.

I think it would be good to summarize the information in-tree as well, to ensure the information is accessible later on as well. Some of the docs fit in the headers, for some of it a new documentation page might be worth adding. Ideally it would include some info about design decisions, the intended/example uses cases and how the DDG helps and the benefits over the existing infrastructure.

A few additional questions:

I am not sure I see the direct benefit of duplicating the def-use edges in the DDG? Given a User, we already have access to its uses throughout the User itself and LLVM tries hard to maintain that relation very efficiently.

IIUC the plan is to build the DDG up front and then check the legality of a transformation on top of it. Currently, most passes bail out early once they detect a transformation cannot be applied and this helps to limit compile time. Could we do something similar with checks dependent on the DDG?

llvm/include/llvm/Analysis/DDG.h
102

Unless there is a good reason, I think it would be better to return an ArrayRef/SmallVectorImpl instead of a concrete SmallVector here (and other places, same for arguments).

200

Why do we need a copy here? Wouldn't a reference be enough?

234

Do we need dynamic memory allocation here? Can the DDG just own the nodes?

llvm/include/llvm/Analysis/DependenceGraphBuilder.h
27

The default seems a bit excessive. Do you expect most nodes to have multiple instructions?

90

Is there a reason to not use DenseMap here?

bmahjour updated this revision to Diff 216391.EditedAug 21 2019, 8:20 AM
bmahjour marked 24 inline comments as done.

Addressed the latest round of review comments. Also merged with the latest changes from master branch.

bmahjour added inline comments.Aug 21 2019, 8:24 AM
llvm/include/llvm/Analysis/DDG.h
102

Ok, I'll change functions to work with SmallVectorImpl.

108

We could just return InstList, but since the const version is already doing this, we use this trick to avoid code duplication. It's common practice in C++ codes, and I see instances of it in LLVM as well (eg. getInlineBuckets in DenseMap.h) .

Scott Meyers encourages this practice in his book "Effective C++" in section Avoiding Duplication in const and Non-const Member Functions.

The assert is useful because we intentionally avoid constructing a SimpleDDGNode node with no instruction and none of the member functions allow removing instructions from the node.

200

The reason is that the target of the reference may go out of scope, while the DDG (as an analysis result) lives on. For instance if you look at DDGAnalysis::run, an object of DependenceInfo is created inside the function which is used to construct the DDG. The function returns a unique_ptr to that DDG. The DDG lives on and needs to answer queries about the dependencies, while the DependenceInfo object is local to the function and gets destroyed upon return of the run function.

234

I don't see how DDG can be implemented without using dynamic memory allocation. Having the DDG own the nodes also creates problems for the builder design pattern, since the creation of the objects is meant to be delegated to the builder class.

llvm/include/llvm/Analysis/DependenceGraphBuilder.h
26–27

Good point. I'll fix it.

27

The default seems a bit excessive. Do you expect most nodes to have multiple instructions?

It depends on whether the builder is used to build a DDG or a PDG. Program Dependence Graphs tend to contain coarse-grained nodes containing many instructions. Merged DDG nodes after graph simplification (not shown in this patch) usually contain 2 or 3 instructions and based on my observation they constitute about less than a third of all the nodes. PDG nodes on the other hand could easily have in the order of tens or hundreds of instructions.

I'll change it to 2 instructions for now, since PDG is farther down the road.

90

Originally I thought DenseMap was implemented as a binary search tree, but looks like it's a true hash table. I'll switch to DenseMap.

llvm/lib/Analysis/DependenceGraphBuilder.cpp
103

Right, I'll use that style for the this and the DstIt loop.

120

Ok, I'll make DI non-const.

This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.

I think it would be good to summarize the information in-tree as well, to ensure the information is accessible later on as well. Some of the docs fit in the headers, for some of it a new documentation page might be worth adding. Ideally it would include some info about design decisions, the intended/example uses cases and how the DDG helps and the benefits over the existing infrastructure.

Sure I can create a page or two of documentation, however I'm not very familiar with the doc infrastructure in LLVM. Could you point me to some examples to follow? Would an rts file under llvm/docs/DDG be sufficient? Are they rendered by any tool and if so how can I test it?

A few additional questions:

I am not sure I see the direct benefit of duplicating the def-use edges in the DDG? Given a User, we already have access to its uses throughout the User itself and LLVM tries hard to maintain that relation very efficiently.

The def-use dependencies are important as they carry scalar dependencies but it is possible to only consider memory access instructions in the DDG and only follow def-use edges during graph construction to establish reaching defs from load-like instructions to store-like instructions. However doing that reduces the generality of DDG as transformations would need to do extra work during codegen to pull in all required instructions. If the def-use edges are explicitly represented in the graph, then codegen is simplified because a topological sort of the graph fully represents the whole program. Please note that, the DDG can potentially be used in many different transformations. Many of those transformations, such as instruction scheduling, care about all instructions (not just memory access instructions), and would not benefit from using this implementation of a DDG if a minimalistic approach is to be taken.

I actually implemented a prototype where I did the "minimal" implementation only considering memory instructions. I measured the difference in compile-time for a number of benchmarks with and without this approach, and I only noticed a small improvement. From what I observed and the feedback from several people, the gain is too small to justify the loss of generality and convenience of a full DDG.

IIUC the plan is to build the DDG up front and then check the legality of a transformation on top of it. Currently, most passes bail out early once they detect a transformation cannot be applied and this helps to limit compile time. Could we do something similar with checks dependent on the DDG?

The DDG would certainly help analyze legality of various transformations. It can go beyond answering the question of "whether a transformation is legal or not". It can actually help determine how to transform the code so that it preserves original program dependencies, a good example of this is loop distribution and loop vectorization. Other transformations, which only care about existence of a certain data dependency pattern, can use the DDG as well, but they would have to consider the benefits versus the compile-time cost of building it.

simoll added a subscriber: simoll.Aug 26 2019, 10:36 AM
bmahjour updated this revision to Diff 217966.Aug 29 2019, 1:37 PM

Created a documentation in tree with diagrams and words describing the high level design.

This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.

I think it would be good to summarize the information in-tree as well, to ensure the information is accessible later on as well. Some of the docs fit in the headers, for some of it a new documentation page might be worth adding. Ideally it would include some info about design decisions, the intended/example uses cases and how the DDG helps and the benefits over the existing infrastructure.

Sure I can create a page or two of documentation, however I'm not very familiar with the doc infrastructure in LLVM. Could you point me to some examples to follow? Would an rts file under llvm/docs/DDG be sufficient? Are they rendered by any tool and if so how can I test it?

Great thanks! Yep adding a .rts should be sufficient. I think you need sphinx installed to build the docs and set LLVM_BUILD_DOCS.

A few additional questions:

I am not sure I see the direct benefit of duplicating the def-use edges in the DDG? Given a User, we already have access to its uses throughout the User itself and LLVM tries hard to maintain that relation very efficiently.

The def-use dependencies are important as they carry scalar dependencies but it is possible to only consider memory access instructions in the DDG and only follow def-use edges during graph construction to establish reaching defs from load-like instructions to store-like instructions. However doing that reduces the generality of DDG as transformations would need to do extra work during codegen to pull in all required instructions. If the def-use edges are explicitly represented in the graph, then codegen is simplified because a topological sort of the graph fully represents the whole program. Please note that, the DDG can potentially be used in many different transformations. Many of those transformations, such as instruction scheduling, care about all instructions (not just memory access instructions), and would not benefit from using this implementation of a DDG if a minimalistic approach is to be taken.

Thanks for clarifying, I was not aware of the additional intended use cases.

I actually implemented a prototype where I did the "minimal" implementation only considering memory instructions. I measured the difference in compile-time for a number of benchmarks with and without this approach, and I only noticed a small improvement. From what I observed and the feedback from several people, the gain is too small to justify the loss of generality and convenience of a full DDG.

IIUC the plan is to build the DDG up front and then check the legality of a transformation on top of it. Currently, most passes bail out early once they detect a transformation cannot be applied and this helps to limit compile time. Could we do something similar with checks dependent on the DDG?

The DDG would certainly help analyze legality of various transformations. It can go beyond answering the question of "whether a transformation is legal or not". It can actually help determine how to transform the code so that it preserves original program dependencies, a good example of this is loop distribution and loop vectorization. Other transformations, which only care about existence of a certain data dependency pattern, can use the DDG as well, but they would have to consider the benefits versus the compile-time cost of building it.

llvm/include/llvm/Analysis/DDG.h
200

Ah right, it's unfortunate that the new pass manager does not really allow to get DI easily from a loop pass. I might be worth moving the DI into DependenceInfo, to make the ownership a bit more explicit.

bmahjour marked an inline comment as done.Sep 5 2019, 7:14 AM

This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.

I think it would be good to summarize the information in-tree as well, to ensure the information is accessible later on as well. Some of the docs fit in the headers, for some of it a new documentation page might be worth adding. Ideally it would include some info about design decisions, the intended/example uses cases and how the DDG helps and the benefits over the existing infrastructure.

Sure I can create a page or two of documentation, however I'm not very familiar with the doc infrastructure in LLVM. Could you point me to some examples to follow? Would an rts file under llvm/docs/DDG be sufficient? Are they rendered by any tool and if so how can I test it?

Great thanks! Yep adding a .rts should be sufficient. I think you need sphinx installed to build the docs and set LLVM_BUILD_DOCS.

No Problem. Thank you for bringing it up. The latest uploaded patch contains the .rst and the relevant images.

@fhahn @Meinersbur If there are no further comments I'd appreciate an approval so we can move on to the next patches. Thanks.

llvm/include/llvm/Analysis/DDG.h
200

might be worth moving the DI into DependenceInfo, to make the ownership a bit more explicit.

I'm not sure what you mean by moving the DI into DependenceInfo. Could you please clarify?

Thanks for the docs! Could you clarify the difference between paper and implementation? The implementations looks fine, apart a few nits.

llvm/docs/DependenceGraphs/DDG.rst
27–31 ↗(On Diff #217966)

[clarification needed] AFAICS, this implementation does not allow an arbitrary hierarchical depth; just one layer that allows group instructions into a single node (Could it be called a pi-node?). Could you clarify the difference between paper and implementation?

44 ↗(On Diff #217966)

The patch currently does not apply with arcanist. I only get 0 byte files for these images.

110 ↗(On Diff #217966)

[nit] "over-engineering" (with dash or as single word)

llvm/include/llvm/Analysis/DDG.h
114

[suggestion] getInstructions().front()

116

[suggestion] getInstructions().back()

257

[nit] Passing a NodeKind by const has no effect on the declaration.

llvm/lib/Analysis/DDG.cpp
50–52

[suggestion] Explicitly handle the NodeKind::Unknown case so the compiler can warn that about a missing switch case if the NodeKind is extended.

60

[suggestion] isa<SimpleDDGNode>(N) since there is an overload isa(const Y &Val).

62

[suggestion] cast<const SimpleDDGNode>(N).getInstructions()

bmahjour marked 7 inline comments as done.Sep 6 2019, 12:41 PM

Thanks for the docs! Could you clarify the difference between paper and implementation? The implementations looks fine, apart a few nits.

No problem. I've added a section on the differences between the paper and this implementation.

llvm/docs/DependenceGraphs/DDG.rst
27–31 ↗(On Diff #217966)

You are correct, the hierarchy is at most one level deep. There is no difference between the paper and this implementation in this regard. I'll clarify that pi-blocks cannot be nested and why.

44 ↗(On Diff #217966)

Sorry I forgot to use --binary when creating my patch. I'll fix it.

llvm/include/llvm/Analysis/DDG.h
257

True, but I usually try to keep the signature of the declaration and definition as similar as possible for consistency. Not sure why anyone would care to have const removed from by-value parameters on declarations!

bmahjour marked 3 inline comments as done.Sep 6 2019, 12:41 PM
bmahjour marked 5 inline comments as done.
bmahjour updated this revision to Diff 219156.Sep 6 2019, 12:46 PM

Addressed latest round of review comments.

bmahjour marked an inline comment as done.Sep 6 2019, 2:10 PM
bmahjour added inline comments.
llvm/docs/DependenceGraphs/DDG.rst
44 ↗(On Diff #217966)

There seems to be an issue with phabricator showing binary parts of a patch. I can recreate the .png files by reapplying the patch that I uploaded to this revision, but if I "Download Raw Diff" from this revision and apply it the images are null. I see the same issue in other reviews that added images eg. https://reviews.llvm.org/D49433.

For now you can see the images here: https://ibm.box.com/v/data-dependence-graph

This revision is now accepted and ready to land.Sep 12 2019, 5:32 PM
This revision was automatically updated to reflect the committed changes.