This is an archive of the discontinued LLVM Phabricator instance.

[XRay] A graph Class for the llvm-xray graph

Authored by varno on Jan 22 2017, 10:23 PM.

Diff Detail


Event Timeline

varno created this revision.Jan 22 2017, 10:23 PM
varno updated this revision to Diff 85524.Jan 23 2017, 8:39 PM

Minimise the lib components, to add remaining changes on an as need

varno updated this revision to Diff 85528.Jan 23 2017, 9:03 PM
  • move inner class to llvm::xray::internal
varno updated this revision to Diff 85532.Jan 23 2017, 10:08 PM
  • fix Spelling mistakes
dberris edited edge metadata.Jan 23 2017, 10:51 PM

As @dblaikie already mentioned, it's a really good idea to give this a good set of unit tests given the complexity of the implementation of this type.

26–29 ↗(On Diff #85528)

Please put one space between the namespace name and the opening {.

32 ↗(On Diff #85528)


37 ↗(On Diff #85528)

The name seems to be misleading. I think this is a transforming iterator (applies a function on dereference). A lot of the machinery being implemented here seems unnecessary if you define this in terms of the already existing iterator adapter type in llvm/ADT/iterator.h -- have you considered doing that either for this iterator?

66 ↗(On Diff #85528)

Unnecessary empty line?

77–79 ↗(On Diff #85528)

I don't think you need this defined in the body of the constructor -- you can initialise H directly with this lambda.

80 ↗(On Diff #85528)

Spurious semicolon.

82 ↗(On Diff #85528)

You can probably just use non-static data member initialisation for these (instead of having a mix of an initialiser list and an in-constructor definition). That way you can =default this constructor instead.

Also you have a spurious semicolon at the end of a function definition.

84–85 ↗(On Diff #85528)

Please avoid using names with leading underscores.

85 ↗(On Diff #85528)

Spurious semicolon.

97 ↗(On Diff #85528)

Empty line after this?

105 ↗(On Diff #85528)

Empty line after this?

115–120 ↗(On Diff #85528)

For types that are user-facing, it's usually preferred that it's documented to at least contain the following:

  • Important semantic concepts. In this domain (of graphs), say what kind of graph this represents (labelled directed graph).
  • Usage examples (how to use this, say as how library users might use it).
  • Requirements on the parametric types, as this is a template.
120 ↗(On Diff #85528)

Space between identifier and {?

130 ↗(On Diff #85528)

Why the inheritance? Will this not work with composition instead? Or is this for the empty base optimisation?

131–142 ↗(On Diff #85528)

Since you're giving names anyway, consider using more descriptive ones (like getVertexId and getEdgeAttr).

145–163 ↗(On Diff #85528)

I may be wrong, but there seems to be a slight preference for using instead of typedef now.

166–170 ↗(On Diff #85528)

Probably document these a little better?

184–222 ↗(On Diff #85528)

This doesn't seem like an iterator to me, so the name is a little misleading. This seems to be more of a View into the data, or a range.

224–245 ↗(On Diff #85528)

I think you want to call this a view as well, instead of a proxy.

241 ↗(On Diff #85528)

This function may be always const?

287–289 ↗(On Diff #85528)


312 ↗(On Diff #85528)

Should be no space between operator and ++?

336–357 ↗(On Diff #85528)

I think this really ought to be named a View.

432–435 ↗(On Diff #85528)

Is this necessary to implement this way? Can we get away with making copies of the underlying data instead?

I mean, wouldn't it be valid if we just kept this = default also?

480 ↗(On Diff #85528)

I'm wondering whether you need to keep track of this value yourself (or whether it's something one of the data structures already keeps track of, that you can just return)?

79 ↗(On Diff #85528)

Consider using better names than generic terms like VertexAttribute -- in the context of XRay this might be FunctionStats, and EdgeAttribute might better be CallStats.

varno updated this revision to Diff 85867.Jan 26 2017, 1:05 AM
varno marked 4 inline comments as done.
  • move inner class to llvm::xray::internal
  • fix Spelling mistakes
  • Make Graph.h cleaner
  • fix bugs and respond to comments
  • add comments
  • fix
varno updated this revision to Diff 85868.Jan 26 2017, 1:09 AM
varno marked 9 inline comments as done.
  • clang-format
varno marked 9 inline comments as done.Jan 26 2017, 1:16 AM
varno added inline comments.
130 ↗(On Diff #85528)

It was not required, fixed.

131–142 ↗(On Diff #85528)

Explained in comment

432–435 ↗(On Diff #85528)

Sadly no, it does not work. :( (that was the first thing I tried, and it caused failed tests.)

480 ↗(On Diff #85528)

This is not tracked in a single place anywhere, It could be recalculated every time you want to query If you think that would be better.

varno marked an inline comment as done.Jan 26 2017, 1:17 AM
dberris added inline comments.Jan 26 2017, 5:55 PM
10 ↗(On Diff #85868)

This isn't entirely accurate -- now that this is in include/llvm/XRay/... then it's no longer just for the llvm-xray graph subcommand.

30 ↗(On Diff #85868)


34 ↗(On Diff #85868)

Each variable? Or each Vertex?

47 ↗(On Diff #85868)

Why this, as opposed to:

Graph<int, int, int> G;

59 ↗(On Diff #85868)

You don't really need this copy example, just state that Graph is CopyConstructible, CopyAssignable, MoveConstructible, and MoveAssignable (but has no strict weak partial ordering nor equality, which may or may not be something you'd want to implement anyway).

75–86 ↗(On Diff #85868)


Break up the run-on sentence into shorter sentences too.

85 ↗(On Diff #85868)

s/Please don't use them.//

278 ↗(On Diff #85868)

For classes, it's already implicit that members are private unless otherwise specified. You may omit this, and the one following it in line 293.

300 ↗(On Diff #85868)

Space before the {.

304 ↗(On Diff #85868)

This initialisation can happen in the initializer list too.

483 ↗(On Diff #85868)

Maybe use 'auto' here instead?

auto TI = Edges.find(Val.first.first);

Also, alternatively you can alias the members of the pairs with more descriptive names.

130 ↗(On Diff #85528)

I still see it doing inheritance -- was this an oversight?

145–163 ↗(On Diff #85528)

i.e. instead of:

typedef Foo X;


using X = Foo;

432–435 ↗(On Diff #85528)

Please say more -- it's not clear why an explicit copy of the members doesn't just work.

148 ↗(On Diff #85868)

Empty line before?

varno updated this revision to Diff 86391.Jan 31 2017, 12:58 AM
varno marked an inline comment as done.
  • move inner class to llvm::xray::internal
  • fix Spelling mistakes
  • Make Graph.h cleaner
  • fix bugs and respond to comments
  • add comments
  • fix
  • clang-format
  • address missed comments to the color commit
  • add at method to Graph
varno updated this revision to Diff 86541.Jan 31 2017, 5:49 PM

Rebase to master

dblaikie edited edge metadata.Feb 1 2017, 9:51 AM

Some drive by comments - up to you & Dean how you prioritize them, etc.

79–81 ↗(On Diff #86541)

I don't understand this explanation - I'm not sure how these member functions would "enable references to look like std::pair". Could you explain further?

87 ↗(On Diff #86541)

This seems like a strange class - it doesn't look like it provides any encapsulation (public inheritance) & the names don't provide any extra documentation. I'd suggest using std::pair directly without this extra type.

105 ↗(On Diff #86541)

Drop the 'class' keyword in all these typedefs

165–171 ↗(On Diff #86541)

Seems more common, perhaps a bit simpler/more obvious, to implement non-const iterator to const iterator conversion as a converting ctor rather than a conversion operator. Though probably no real concern either way, I guess? Haven't thought about it in great detail before.

179 ↗(On Diff #86541)

Should probably be a const member

187 ↗(On Diff #86541)

Drop the 'class' keyword here as it's not needed, I think? (& similarly below)

230 ↗(On Diff #86541)

Spurious semicolon

(actually every member function in this class has a semicolon at the end - probably need to do a pass to remove those across this change/others)

259–260 ↗(On Diff #86541)

const overloads?

263 ↗(On Diff #86541)

Spurious semicolon

316–318 ↗(On Diff #86541)

where possible, make operator overloads non-members (this ensures equal treatment of the LHS and RHS - as-is, this wouldn't be usable for comparisons with const_iterator on the LHS, for example)

330 ↗(On Diff #86541)

this should probably be const?

dblaikie added inline comments.Feb 1 2017, 9:51 AM
402–405 ↗(On Diff #86541)

Does two lookups (count and insert). Avoid that if you can (usually by using "insert" and checking the bool element of the pair returned)

414–417 ↗(On Diff #86541)

Similar double lookup

496–514 ↗(On Diff #86541)

Could use the unified assignment operator if you like, maybe (pass by value, then swap)

528–529 ↗(On Diff #86541)

calling insert might be more obvious. Explicitly calling operator overloads is generally a bit 'weird'.

dberris requested changes to this revision.Feb 2 2017, 7:04 PM

Aside from addressing @dblaikie's comments, please also make sure that the documentation questions raised earlier are addressed too.

Marking this with "Requested Changes" to remove from my "needs review" list.

366 ↗(On Diff #86541)

Please do not use names that start with _.

369 ↗(On Diff #86541)

This section s already public, why the repetition here?

476 ↗(On Diff #86541)

Why the empty line?

This revision now requires changes to proceed.Feb 2 2017, 7:04 PM
varno updated this revision to Diff 87318.Feb 6 2017, 3:27 PM
varno edited edge metadata.
varno marked 14 inline comments as done.
  • Address Reviewer comments
  • reply to comments
  • rebase on top of D29610 and simplify implementation.
varno edited the summary of this revision. (Show Details)Feb 6 2017, 3:28 PM
varno updated this revision to Diff 87321.Feb 6 2017, 3:29 PM

fix up dependancy

varno updated this revision to Diff 87323.Feb 6 2017, 3:30 PM

try again

varno updated this revision to Diff 87327.Feb 6 2017, 3:32 PM
  • Address Reviewer comments
  • reply to comments
varno marked 13 inline comments as done.Feb 6 2017, 4:21 PM
varno added inline comments.
87 ↗(On Diff #86541)

Please see the new Comments to explain this class class.

165–171 ↗(On Diff #86541)

In this case I cannot define this as a converting ctor, as I need to access a protected member of my parent, which I cannot do in a converting ctor :(=.

316–318 ↗(On Diff #86541)

Well actually in this case yes, it is defined symetrically due to the Boolean template behavior of this class. however this does expose a problem in iterator_facade_base which I have to work around.

varno updated this revision to Diff 87347.Feb 6 2017, 5:59 PM
  • Address Reviewer comments
  • reply to comments
  • clang-format
dberris requested changes to this revision.Feb 6 2017, 11:39 PM

While thinking about this more, it's going to be really hard to start understanding what the semantics we're optimising for are without unit tests. This class is sufficiently complex in implementation that I think we really ought to spend as much time as we can writing up unit tests that will exercise the parts of the code that are really important (in this case that the graph we're building is representing the graph we intend to build) and that we're able to implement the basic graph algorithms like breadth-first-search or depth-first-search on top of the interface of this graph type. The risk right now of not implementing unit tests is that any changes we make to this type's implementation will introduce subtle bugs that will break the basic use-cases while serving only the ones that the llvm-xray graph/graph-diff tools need.

I understand that there's some urgency here, I'd be really happy with covering basic tests covering the most basic of features of this type.

41–44 ↗(On Diff #87347)

This seems like it's leaking implementation details. Why should the user of the type need to know about DenseMapInfo?

49–61 ↗(On Diff #87347)

Some well-placed empty lines here would help readability.

/// Usage:
///    Graph<int, int, int> G;
///    G[1] = 0;
///    G[2] = 1;
///    G[{1, 2}] = 2;
///    for (const auto& V : G.vertices()) {
///       // V will be an int, do something with it here.
///    }
///    for (const auto &E : G.edges()) {
///        // E will be a std::pair<int, int>, do something with it here.
///    }
///    // show an example on what to do with the third int?
62 ↗(On Diff #87347)

While you do provide documentation for the views and for some of the member operations, please provide documentation about things like:

  • Space complexity/requirements.
  • Complexity of operations like insertion, removal if that's supported, point-wise lookup, etc.
  • What the semantics are for certain operations -- how do failures get conveyed, what are the invariants that are preserved that's useful to know from a user's perspective, etc.
71–73 ↗(On Diff #87347)

I don't know whether saying that this is derived from std::pair<...> is important to document. This seems like an implementation detail that doesn't help the user of this class at all. Probably say what this type is useful for -- probably important to say that this represents the vertex and the attributes associated with it.

77–79 ↗(On Diff #87347)

Likewise here, I don't see why it's important to tell the reader that you should treat this type as a pair.

80–88 ↗(On Diff #87347)

Why is this documentation on this internal class, as opposed to documentation on the operations on the graph type? i.e. why is it important to mention that this is a child of std::pair<...> -- as opposed to saying why this is derived from std::pair<...> instead?

90–95 ↗(On Diff #87347)

All this documentation seems like implementation detail documentation. This will not be helpful when the doxygen docs are generated and read by users of this type. Remember this is getting into the set of LLVM libraries, and can be used on its own as opposed to just serving the purposes of llvm-xray graph or similar requirements.

98–112 ↗(On Diff #87347)

We still don't have documentation as to "why" these member functions exist, and why we're not just using a DenseMap<pair<VertexIdentifier, VertexIdentifier>, EdgeAttribute>. This still isn't clear from the documentation that's been so far mentioned here. Also it's not clear why the adjacency list being stored isn't just an unordered_map<VertexIdentifier, pair<VertexIdentifier, EdgeAttribute>>, and lookup for all the adjacent vertices (or even out-edges) will be given by equal_range(VertexIdentifier).

Since we're already adapting iterators anyway, I'm sure you can turn an iterator that yields a pair<VertexIdentifier, pair<VertexIdentifier, EdgeAddtribute>> into something that yields a tuple<VertexIdentifier&, VertexIdentifier&, EdgeAttribute&> with appropriate const-ness of members.

For in-edge and out-edge and attribute storage, you could have done it this way:

using EdgeAttributeContainer = std::list<EdgeAttribute>;
std::list<EdgeAttribute> EdgeAttributes;
unordered_multimap<VertexIdentifier, pair<VertexIdentifier, EdgeAttributeContainer::iterator>> OutEdges;
unordered_multimap<VertexIdentifier, pair<VertexIdentifier, EdgeAttributeContainer::iterator>> InEdges;
unordered_multimap<pair<VertexIdentifier, VertexIdentifier>, EdgeAttributeContainer::iterator>> EdgeLookup;

Since you're already defining your own iterators, they could have been simpler lookup, and still have the consistent interface.

I'm not saying this is how you should do it, I'm asking why this isn't what's being done, and the documentation (either internal for the implementer/reader or external generated from this type) isn't answering this question.

139–140 ↗(On Diff #87347)

Consider moving this and all public declarations together for easier reading.

351–353 ↗(On Diff #87347)

Why can't this be defined as a non-member like so:

friend template <bool IsConstL, bool IsConstR>
bool operator==(const EdgeListIterator<IsConstL>&L, const EdgeListIterator<IsConstR>&R) {
  return ...;

This way it's defined with the class and it's also accessible only through ADL?

355 ↗(On Diff #87347)

If you defined this as a non-member, it would look like:

friend bool operator!=(...) {
  return !(L == R);
557 ↗(On Diff #87347)

LLVM Style wants local identifiers starting with capital letters.

300 ↗(On Diff #87347)

Pro-tip: when asserting, provide some human-readable explanation for what the expectation is. As in:

assert((/* complicated conditional */) && "We want a pony, but got a unicorn!");
This revision now requires changes to proceed.Feb 6 2017, 11:39 PM

Few more drive-by comments.

538–539 ↗(On Diff #87327)

two lookups (count + find) - avoid that by using find and then checking end, etc.

563 ↗(On Diff #87327)

Extra semicolon (there are a whole bunch in this class/generally - remove them all)

571 ↗(On Diff #87327)

Missing spaces (run this all through clang-format?)

578 ↗(On Diff #87327)

Extra semicolon

581 ↗(On Diff #87327)

Extra semicolon

87 ↗(On Diff #86541)

Sorry, it's still unclear to me what this class provides over using std::pair directly. (but it sounds like Dean's in on the discussion and has some higher-level feedback/ideas/etc that might motivate a different direction or help sure up this one)

497–499 ↗(On Diff #87327)

LLVM code usually skips {} on single line statements.

varno added inline comments.Feb 7 2017, 2:43 PM
41–44 ↗(On Diff #87347)

DenseSet and other Derived classes seem to do this too.
It does leak some implementation details, but really only that we use DenseMap,
From that you get the ability to implement for more VertexIdentifier types, the limitation
for key would be the same anyway.

98–112 ↗(On Diff #87347)

I cannot quite do this, however something simmilar probably could be done. (The problem with unordered_multimap is the cost of edge-deletion :(. The code complexity of this would increase slightly. If we were to go this way I would probably implement it more like:

DenseMap<VertexIdentifier, Set<VertexIdentifier>> OutEdges;
DenseMap<VertexIdentifier, Set<VertexIdentifier>> InEdges;
DenseMap<pair<VertexIdentifier, VertexIdentifier>, EdgeAttribute>> EdgeLookup;

Which would increase the size somewhat, but would probably work fine.

If you want I can go ahead with that?

varno updated this revision to Diff 87595.Feb 7 2017, 11:20 PM
varno edited edge metadata.
  • Address major reviewer comment and initial tests
varno marked 17 inline comments as done.Feb 7 2017, 11:26 PM
varno edited the summary of this revision. (Show Details)
dberris requested changes to this revision.Feb 8 2017, 11:55 PM

Please address all comments.

41–44 ↗(On Diff #87347)

I'm a little uneasy with the requirements being imposed on VI here. Let me ask a different question: can we just remove this? Why is this essential if we're not relying on DenseMap requirements anymore in the implementation? And if we are using DenseMap, can we get away with just not even mentioning this type in the interface of Graph?

I'm trying to further simplify this type because it's currently really hard to explain why VI is there just by looking at what this type is supposed to do.

49–61 ↗(On Diff #87347)

I don't see it done.

212–213 ↗(On Diff #87595)

Empty line in between?

220–221 ↗(On Diff #87595)

Empty line in between?

230 ↗(On Diff #87595)

You're doing two lookups here. Consider just doing one lookup using iterators.

315 ↗(On Diff #87595)

I'd remove this comment.

364 ↗(On Diff #87595)

LLVM Style uses capital letters for local variable names.

125 ↗(On Diff #87595)

Probably not for now, but it would be great if we can construct a graph with a terse interface. Something like:

Graph<int, int, int> G(
  {1, 2, 3, 4, 5},
  {{1, 2}, {2, 3}, {3, 4}});

Maybe write as a TODO on the Graph type?

1 ↗(On Diff #87595)

Please have more basic tests (construction, copy-construction, assignment, move construction, etc.), smaller and more focused test cases for more interface and semantics-related tests.

19 ↗(On Diff #87595)

This comment seems nothing to do with the graph. Sets? Hashing? This seems to be leaking implementation details, so let's avoid that by only keeping this about the graph.

24 ↗(On Diff #87595)

This comment seems out of place.

46–47 ↗(On Diff #87595)

Probably needs to be ASSERT_EQ because if this isn't true it doesn't make sense to proceed with the test case.

50–51 ↗(On Diff #87595)

Spurious whitespace.

52–53 ↗(On Diff #87595)

Use consumeError(...) instead?

60–83 ↗(On Diff #87595)

Break out into a different test case, please, give it a name that describes what the case is actually testing.

66 ↗(On Diff #87595)

Consider EXPECT_TRUE(bool{T}) instead.

This revision now requires changes to proceed.Feb 8 2017, 11:55 PM
varno updated this revision to Diff 87786.Feb 9 2017, 3:44 AM
varno edited edge metadata.
  • more tests and a fix
varno updated this revision to Diff 87917.Feb 9 2017, 4:49 PM
varno marked 13 inline comments as done.
  • Address Review Commits
41–44 ↗(On Diff #87347)

We are still using DenseMap, and so the question.

VI needs to be there, because the type is int for Graph and StrRef for Graph Diff.

315 ↗(On Diff #87595)

It was not a problem anymore anyway.

432–435 ↗(On Diff #85528)

Oh, in the new version it does work. That is why there is no constructors non-default anymore :)

dberris requested changes to this revision.Feb 9 2017, 7:22 PM

Almost ready to land. One thing that needs to be addressed are the spurious semicolons (will cause the buildbots that build with -pedantic to complain loudly).

The tests could be better, but those could be improved post-commit.

This revision now requires changes to proceed.Feb 9 2017, 7:22 PM
varno updated this revision to Diff 87954.Feb 9 2017, 7:54 PM
varno edited edge metadata.
  • spurious semicolons
dberris accepted this revision.Feb 9 2017, 9:48 PM
This revision is now accepted and ready to land.Feb 9 2017, 9:48 PM
This revision was automatically updated to reflect the committed changes.