This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Add GraphTraits for FunctionSummaries
ClosedPublic

Authored by ncharlie on Aug 4 2017, 7:58 AM.

Details

Summary

Depends on D36233

Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Diff Detail

Event Timeline

ncharlie created this revision.Aug 4 2017, 7:58 AM
ncharlie updated this revision to Diff 109736.Aug 4 2017, 7:59 AM

Diff against D36233 rather than master.

Since this patch depends on D36233, I'm diffing against it to show minimal changes.

include/llvm/IR/ModuleSummaryIndex.h
509

There are some external functions (i.e. libc functions) that aren't included in the summary, so I create an empty functionsummary for them.

1086

Should the first callee be the entry node or the caller function itself?

ncharlie updated this revision to Diff 110114.Aug 7 2017, 5:14 PM

Make the entry node the function itself.

ncharlie marked an inline comment as done.Aug 7 2017, 5:15 PM

How should I test this and ensure it's working properly?

ncharlie updated this revision to Diff 110195.Aug 8 2017, 7:32 AM
ncharlie updated this revision to Diff 110226.Aug 8 2017, 10:26 AM

simplify code by using mapped_iterator rather than custom implementation.

ncharlie updated this revision to Diff 110294.Aug 8 2017, 3:37 PM
ncharlie edited the summary of this revision. (Show Details)

Added GraphTraits for ModuleSummaryIndex

include/llvm/IR/ModuleSummaryIndex.h
1122

This entry node isn't necessarily at the bottom of the callgraph. I'm not sure how to address this problem at the moment.

ncharlie marked an inline comment as done.Aug 15 2017, 7:29 AM
ncharlie added inline comments.
include/llvm/IR/ModuleSummaryIndex.h
1089

This is causing problems. The scc_iterator gets stuck on these functionsummaries since they're not in the original graph, which causes it to terminate before going through all the SCCs in the summary callgraph.

ncharlie added inline comments.Aug 15 2017, 10:14 AM
include/llvm/IR/ModuleSummaryIndex.h
1126

Looking more closely at the scc_iterator code, I don't think I need to describe nodes_begin/nodes_end. If that's the case I'll remove them (I'd like to get rid of my fsummary_iterator implementation if possible)

ncharlie updated this revision to Diff 111206.Aug 15 2017, 10:51 AM

removed fsummary_iterator since it's not needed for the scc_iterator

ncharlie marked 2 inline comments as done.EditedAug 15 2017, 10:58 AM

For some reason, the scc_iterator isn't visiting the whole graph, and only visits the children of the entryNode. How should I move across all SCCs in the Index?

ncharlie marked an inline comment as done.Aug 15 2017, 10:58 AM

Ah, I think I realized what the entryNode should be: I should topologically sort the nodes in the ModuleSummaryIndex and take the first FunctionSummary. Right now the entry node is arbitrary.

I don't think I can use a topological sort in this case since the FunctionSummary callgraph isn't necessarily a DAG, so if there's a cycle it would make the sort useless.

Instead, I tried using the main function as the entry node and that worked. Any suggestions on a clean way of implementing this rather than hard coding a lookup for "main" in the Index?

davide edited edge metadata.Aug 16 2017, 12:51 AM

I don't think I can use a topological sort in this case since the FunctionSummary callgraph isn't necessarily a DAG, so if there's a cycle it would make the sort useless.

No callgraph is guaranteed to be a DAG, what's a DAG is the condensed graph, where you contract every node of a component to a single vertex.
The data-flow algorithms generally work first propagating within a single SCC (using a worklist algorithm) until you hit a maximal fixpoint, then propagating topologically the facts you learned across the graphs of SCCs.

Instead, I tried using the main function as the entry node and that worked. Any suggestions on a clean way of implementing this rather than hard coding a lookup for "main" in the Index?

I'm not sure you're always guaranteed to have the main function as entry point of your program.

ncharlie added a comment.EditedAug 16 2017, 6:50 AM

I'm not sure you're always guaranteed to have the main function as entry point of your program.

Right - how would you recommend finding the root node of the callgraph so I can return that in the getEntryNode for the GraphTraits for the ModuleSummaryIndex? Since right now, it's just arbitrarily finding a FunctionSummary in the Index.

I tried implementing a search to find the root nodes in the Index callgraph. But there can be multiple root nodes in the callgraph and we can only return one node from GraphTraits<ModuleSummaryIndex *>::getEntry. Since I didn't know how to pick the correct root, I didn't go with that solution. Is there any information I can find in the Index that indicates the entry point?

tejohnson edited edge metadata.Aug 16 2017, 7:04 AM

I tried implementing a search to find the root nodes in the Index callgraph. But there can be multiple root nodes in the callgraph and we can only return one node from GraphTraits<ModuleSummaryIndex *>::getEntry. Since I didn't know how to pick the correct root, I didn't go with that solution. Is there any information I can find in the Index that indicates the entry point?

Looking at how the GraphTraits is implemented for the CFG, it returns a special "ExternalCallingNode", that is created to contain edges to all functions that may be called externally (e.g. external linkage or address taken). See CallGraph::addToCallGraph() in CallGraph.cpp. Perhaps you can create such a dummy node in the index, which has edges to all the root nodes, and return that as the entry node?

Perhaps you can create such a dummy node in the index, which has edges to all the root nodes, and return that as the entry node?

Yeah, I think that'll work! I've been staring at the CallGraph::addToCallGraph() for the last few days but didn't think to apply the same idea of a dummy root node to the Index :P

Thanks!

How should I test this and ensure it's working properly?

Maybe add an scc dumper facility for the module summary graph, which creates an scc_iterator from the module summary and just dumps the nodes in each scc, that can be checked using a Lit based test

include/llvm/IR/ModuleSummaryIndex.h
28

Not needed in this patch?

1089

In what way does it get stuck? The CallGraph.h has a dedicated CallsExternalNode for calls to external functions, but it seems like you are doing something similar here with a static ExternalFunction.

Maybe add an scc dumper facility for the module summary graph, which creates an scc_iterator from the module summary and just dumps the nodes in each scc, that can be checked using a Lit based test

I'll look into adding this.

include/llvm/IR/ModuleSummaryIndex.h
28

Oops - leftover from a different branch. I'll remove that.

1089

Sorry, nvm that comment - I was misattributing it to external nodes before, but it was really just because my entry point for the Index was wrong.

ncharlie updated this revision to Diff 111522.Aug 17 2017, 8:33 AM

Create dummy root for index callgraph entry point.

tejohnson added inline comments.Aug 17 2017, 1:31 PM
include/llvm/IR/ModuleSummaryIndex.h
583

This isn't the first callee, it is the function summary whose callees we will walk as children.

724

doxygen comment needed

780

Since you set FunctionHasParent on the line above, I think this would always already return true for isDiscovered.

796

Better to reverse sense of condition and do continue.

797

Why are you using original name here? I guess it is because you don't have easy access to the normal GUID? Can you fix this by changing FunctionHasParent to key off the ValueInfo instead of the FunctionSummary? I.e. then in the above loop over the index you would call discoverNodes with ValueInfo(&S). Then here you have direct access to the ValueInfo.

I guess the type of NodeRef and return type of fsumFromEdge need to change to ValueInfo correspondingly.

805

Add a comment describing the parameters you are passing constant values. E.g. "/*NotEligibleToImport=*/true".

ncharlie updated this revision to Diff 112344.Aug 23 2017, 6:07 AM

Updated according to comments
Added test.

ncharlie marked 6 inline comments as done.Aug 23 2017, 6:07 AM
davide added inline comments.Aug 23 2017, 6:39 AM
include/llvm/IR/ModuleSummaryIndex.h
166

Add a message.

569–577

This is not really readable. I wonder if you can make it shorter.

773

dyn_cast<> + assert != nullptr ?

lib/IR/ModuleSummaryIndex.cpp
87

Shouldn't this be under NDEBUG || LLVM_ENABLE_DUMP (or whatever it's called?)

Getting there, thanks!

include/llvm/IR/ModuleSummaryIndex.h
569–577

Not sure how to make the constructor invocation shorter, but perhaps the code that invokes make_unique could be extracted out to a helper and invoked both here and below where the dummy call graph root node is being constructed. It only needs to take a vector of edges since that is the only difference AFAICT between the 2 invocations.

776

Simplify the above by simply doing:

if (FunctionHasParent.find(C.first) != FunctionHasParent.end())

continue;

Actually - I don't think this is quite correct. We would have emplaced an entry for a function (with bool of false) if we encountered it first through the index walk in the loop below this. So I think you are going to need to assign the result of find() to a variable and then continue if it is != end() and is set to true.

Not sure if this fix is testable - I suppose you were ending up with more nodes than necessary connected to the CallGraphRoot, but I'm not sure offhand what the implication of that is.

784

Needs comment (I can't remember offhand why we would have an index entry with an empty SummaryList - oh, declarations right? I don't think those will show up in the combined index but still ok here but needs to be documented).

tools/llvm-lto2/llvm-lto2.cpp
396 ↗(On Diff #112344)

I'm not sure llvm-lto2 is the right place for this. Specifically because we normally want to dump these on the combined index that is usually in-memory. Maybe add an option to LTO.cpp and optionally dump it out at the start of runThinLTO?

(Note also this appears to be built on top of your llvm-lto2 with the dump-summary change, which unfortunately we have not gotten agreement to commit yet)

ncharlie added inline comments.Aug 23 2017, 12:42 PM
include/llvm/IR/ModuleSummaryIndex.h
569–577

Yeah - having a createDummyNode function (or something like that) would definitely help with the readability.

lib/IR/ModuleSummaryIndex.cpp
87

Won't that remove it in non-debug builds though? That would break the new llvm-lto2 dump-sccs subcommand. That's fine if llvm-lto2 isn't expected to be used except in debug builds.

tools/llvm-lto2/llvm-lto2.cpp
396 ↗(On Diff #112344)

Maybe add an option to LTO.cpp and optionally dump it out at the start of runThinLTO?

Yeah, I though it was a bit weird having it in here. I'll replace it with a hidden flag in LTO.cpp.

ncharlie updated this revision to Diff 112695.Aug 25 2017, 7:58 AM
ncharlie marked 11 inline comments as done.Aug 25 2017, 8:01 AM
ncharlie added inline comments.
test/ThinLTO/X86/module_summary_graph_traits.ll
33

Despite the indirect recursion in l1/l2, the SCCs only have 1 node in each. The loop is still detected (with the scc_iterator::hasLoop function), but there's an SCC for each function.

This looks really good, just a few minor comments. But I do have a concern about the test and it not showing the multiple node SCCs that I would expect. Can you investigate?

include/llvm/IR/ModuleSummaryIndex.h
434

document

761

Notice this is no longer lazily calculated. Was there a specific reason for the change? I suppose one reason not to do that is that you'd want any changes to the Index reflected in this in between subsequent SCC graph constructions (which should not happen frequently anyway).

776

comment

784

nit - move comment to preceding line.

test/ThinLTO/X86/module_summary_graph_traits.ll
7

Put a comment above each of these as to which function name it corresponds to.

33

Hmm, that seems wrong? Wouldn't we incorrectly think these were NoRecurse then? Wondering also why there are 3 functions marked as "has loop". I assume that is root() - but we haven't done any inlining that would cause this right?

ncharlie added inline comments.Sep 2 2017, 8:43 AM
include/llvm/IR/ModuleSummaryIndex.h
761

I was having trouble with the ownership of the unique_ptr node generated by this function, so I decided it was simpler to just pass the FunctionSummary directly rather than allocating and managing the memory for it.

test/ThinLTO/X86/module_summary_graph_traits.ll
33

I'll look into this.

ncharlie updated this revision to Diff 117291.Oct 1 2017, 1:15 PM

Added operator!= and operator<

Any updates on this @tejohnson @davide ?

Any updates on this @tejohnson @davide ?

Sorry - didn't realize this was ready for re-review! I had a question about the tests earlier (see below). The comment wasn't marked as done. Although I see now that the test has changed a bit - did you resolve this? I will take a look this morning.

test/ThinLTO/X86/module_summary_graph_traits.ll
33

Did you figure this out?

ncharlie marked 4 inline comments as done.Oct 13 2017, 8:44 AM

Any updates on this @tejohnson @davide ?

Sorry - didn't realize this was ready for re-review! I had a question about the tests earlier (see below). The comment wasn't marked as done. Although I see now that the test has changed a bit - did you resolve this? I will take a look this morning.

Ah sorry - forgot to mark it. Yeah - it's corrected now. Updated the tests accordingly.

Just minor comments left at this point, I noted a few places where you hadn't yet addressed my previous commenting nits, plus a question about paring down the new operators, and some suggestions for minorly beefing up the matching you are doing on the test lines.

include/llvm/IR/ModuleSummaryIndex.h
161

I see you have both const and non-const parameter versions of operator== and operator<, but only the non-const parameter version of operator!=. Why is that? Also, since ValueInfo::getGUID is const, I would think you should be able to have only a single version of each operator overload, all taking const ValueInfo.

434

Not done yet.

776

Not done yet

784

Not done yet

test/ThinLTO/X86/module_summary_graph_traits.ll
7

Please add this - I think what you are getting now makes sense, but it would be really helpful to have these documented. BTW, you can check this correlation by passing -print-summary-global-ids to llvm-lto2.

13

Add "{{$}}" after each of these that shouldn't be followed by " (has loop)".
Ditto for matching start of line for those that shouldn't be preceeded by "External". I.e. I think preceeding the GUID you are matching with "{{^}} " should do it (match start of line followed by one space which appears to be how these are getting emitted).

25

What is this node? Ah, the dummy call graph root? Document.

ncharlie updated this revision to Diff 121599.Nov 4 2017, 1:22 PM
ncharlie marked 14 inline comments as done.
ncharlie marked an inline comment as done.
ncharlie added inline comments.
include/llvm/IR/ModuleSummaryIndex.h
161

OK - it works without the non-const operators, so I removed them.

ncharlie marked an inline comment as done.Jan 6 2018, 6:52 AM

Any updates on this?

Sorry, I missed these being updated. Will take another pass through but it looks like you have addressed my comments.

Easwaran - can this be used to do the call graph walks you need for synthetic counts propagation? If not we need to figure out why, because I see you are going in a different direction in D42311.

tejohnson accepted this revision.Jan 21 2018, 3:19 PM

LGTM with a few suggestions below that don't require a re-review. Minor update needed to the ValueInfo comparisons due to a recent patch (see the comment I left there), and a few efficiency improvements. Go ahead and submit once those are fixed and tests pass and you've uploaded the new patch here for posterity.

include/llvm/IR/ModuleSummaryIndex.h
150–151

The asserts in these comparison operators will need a slight update after the changes to ValueInfo in r323062. I think you can just change Ref to getRef() on the given ValueInfo objects.

759–823

Add a comment above this documenting what the bool represents.

759–823

getValueInfo will lookup the given GUID in the GlobalValueMap - the same map we are iterating over in this loop. Can't we just do "discoverNodes(S.second)"?

759–823

Presumably more efficient to check if there is already a map entry and return early if so (we'll eventually skip all the callees below since they would already have been visited, but might as well not even do that processing). Also, no need for make_pair when using emplace. I.e. change this to:

auto S = FunctionHasParent.emplace(V, false);
// Stop if we have already discovered this node.
if (!S.second)

return;
759–823

We can actually skip the discoverNodes if it was in the map (even if S->second was false), since we would already have walked it's callees if it was added to the map when encountered in the for loop below over the index. Additionally, no need to do another map lookup to set the value to true, since you already have the entry via the iterator, although even better to just do via emplace. So you should be able to change this block to:

Insert the node if necessary
auto S = FunctionHasParent.emplace(C.first, true);
Skip nodes that already had parents
if (!S.second && S.first->second)

continue;

// If we just inserted this node, discover it, otherwise just note that it now has a parent.
if (S.second)

discoverNodes(C.first);

else

S.first->second = true;

I think this should work...

This revision is now accepted and ready to land.Jan 21 2018, 3:19 PM
eraman added inline comments.Jan 26 2018, 4:06 PM
include/llvm/IR/ModuleSummaryIndex.h
150–151

The current version of this file has a isSpecialKey(ValueInfo) in DenseMapInfo that compares a ValueInfo with tombstone and empty keys. This breaks with the == overload since the tombstone and empty keys don't have valid pointers and this tries dereferencing them. A fix is to not use == in isSpecialKey but call getRef and compare the pointers.

ncharlie updated this revision to Diff 131736.Jan 28 2018, 6:26 PM

Merged master, fixed tests, minor tweaks.

ncharlie marked 7 inline comments as done.Jan 28 2018, 6:33 PM
ncharlie added inline comments.
include/llvm/IR/ModuleSummaryIndex.h
1034

@tejohnson

I just wanted to double check that this functionality for getting a FunctionSummary from a ValueInfo is sane. It expects the first value in the SummaryList to be either a FunctionSummary (in which case it returns it) or an AliasSummary (in which case it gets its aliasee, expecting a FunctionSummary).

It looks like D36850 got accidentally merged into this version of the patch -make sure to remove that.

include/llvm/IR/ModuleSummaryIndex.h
760

s/Functinos/Functions
s/mark/marked/

1034

You can just invoke GlobalValueSummary::getBaseObject which will return the aliasee if it is an alias, otherwise the this pointer. It looks like you should just assert then that the resulting object is a FunctionSummary, since the caller expects it to be one (and looks like this is just called when walking call edges, so should always be the case).

eraman added inline comments.Jan 30 2018, 3:41 PM
include/llvm/IR/ModuleSummaryIndex.h
1056

From the comments in GraphTraits, NodeRef is "Type of Node token in the graph, which should be cheap to copy." ValueInfo is cheap to copy now, but if something later gets added to it it might not be cheap to copy anymore. It is preferable to keep NodeRef as a reference (or pointer) to ValueInfo. Also, you could use using instead of typedef.

It looks like D36850 got accidentally merged into this version of the patch -make sure to remove that.

Hi Charles,

I was just chatting with Easwaran - he's using your patch and building on top of it for frequency propagation changes. Would you be able to address my last round of comments which are all minor and don't need a re-review, and commit this soon (say in the next week)? If not, let me know if you would like one of us to commit it (attributing it to you). (Easwaran says don't worry about his comment regarding the NodeRef for this patch.)

Thanks!
Teresa

ncharlie updated this revision to Diff 133800.Feb 11 2018, 1:00 PM
ncharlie updated this revision to Diff 133803.Feb 11 2018, 1:16 PM

fixed borked diff

ncharlie updated this revision to Diff 133804.Feb 11 2018, 1:19 PM
ncharlie closed this revision.Feb 11 2018, 2:10 PM

Committed in rL324854

Thanks! Looks like the test accidentally got left out of the commit, so please commit that as a follow-on.

Thanks! Looks like the test accidentally got left out of the commit, so please commit that as a follow-on.

Ah, good catch - thanks. Committed the test as well.

ncharlie updated this revision to Diff 134819.Feb 17 2018, 1:35 PM

Use function_ref rather std::function, becuase of bug in clang.

Committed patch to use function_ref rather than std::function.

ncharlie updated this revision to Diff 134851.Feb 18 2018, 5:00 PM

Move lambda into static function.