This is an archive of the discontinued LLVM Phabricator instance.

[XRay] A tool for Comparing xray function call graphs

Authored by varno on Jan 31 2017, 1:31 AM.



This is a tool for comparing the function graphs produced by the
llvm-xray graph too. It takes the form of a new subcommand of the
llvm-xray tool 'graph-diff'.

This initial version of the patch is very rough, but it is close to
feature complete.

Depends on D29363

Diff Detail


Event Timeline

varno created this revision.Jan 31 2017, 1:31 AM
varno updated this revision to Diff 86542.Jan 31 2017, 5:50 PM

Rebase to master.

varno updated this revision to Diff 86548.EditedJan 31 2017, 6:28 PM
varno edited the summary of this revision. (Show Details)Jan 31 2017, 6:30 PM
varno edited the summary of this revision. (Show Details)
varno updated this revision to Diff 86550.Jan 31 2017, 6:34 PM
  • try that again
varno updated this revision to Diff 86561.Jan 31 2017, 7:29 PM
  • respond to comments
varno updated this revision to Diff 86563.Jan 31 2017, 7:32 PM

roll back mistaken commit

varno updated this revision to Diff 86564.Jan 31 2017, 7:41 PM


varno updated this revision to Diff 86565.Jan 31 2017, 7:43 PM

try again

dblaikie edited edge metadata.Feb 1 2017, 10:06 AM

Few drive by comments

351 ↗(On Diff #86565)

missing space between ')' and '{'

351–353 ↗(On Diff #86565)

Usually skip {} on single line blocks

417–419 ↗(On Diff #86565)

Unnecessary {} and/or missing space (between ')' and '{')

73 ↗(On Diff #86565)

missing space between '()' and '{'

75 ↗(On Diff #86565)

spurious semicolon

52 ↗(On Diff #86565)

Spurious semicolon

dberris requested changes to this revision.Feb 2 2017, 10:21 PM

Overall needs formatting changes, please use clang-format using the style defined in the LLVM codebase.

236–266 ↗(On Diff #86565)

I would really much prefer that we limit the repetition here for readability's sake. Factor the logic out into a single function, and provide the graphs as inputs?

285–287 ↗(On Diff #86565)

This deserves a comment I think.

391 ↗(On Diff #86565)

Maybe give a different name for the diff graph?

10–11 ↗(On Diff #86565)

Probably needs to be updated.

53 ↗(On Diff #86565)

Please don't use names that start with _.

62 ↗(On Diff #86565)

Now that you've defined a default constructor and a non-default constructor, you might as well define (or =default) a copy constructor, a move constructor, a copy+move assignment (probably the unified assignment operator), a swap overload, etc.

Can you say instead why you needed to do this now?

This revision now requires changes to proceed.Feb 2 2017, 10:21 PM
varno updated this revision to Diff 87363.Feb 6 2017, 9:16 PM
varno edited edge metadata.
  • Address reviewer comments
dberris added inline comments.Feb 7 2017, 12:45 AM
268 ↗(On Diff #87363)

s/relitively/relatively/ ?

271 ↗(On Diff #87363)

Now that you've documented this, it seems like this is a "bounded relative diff" or a means of normalizing distance within the range [-1, 1]. I'd urge you to find a more expressive/intuitive name than "divF".

371–373 ↗(On Diff #87363)

Am not a fan of this function's naming... consider spelling it out instead, truncateString seems just fine to me.

25 ↗(On Diff #87363)

Please document this class, how to use it, etc.

26 ↗(On Diff #87363)

nit: really ought to be constexpr size_t.

33 ↗(On Diff #87363)

A using statement to bring in this EdgeValueType into scope would really make this more readable.

varno updated this revision to Diff 87730.Feb 8 2017, 4:25 PM
  • fix mistake and address comments
  • address comments
varno marked an inline comment as done.Feb 8 2017, 5:30 PM
varno added inline comments.
62 ↗(On Diff #86565)

I was able to remove them by empty list-initializing them where used.

varno updated this revision to Diff 87750.Feb 8 2017, 6:38 PM
varno marked an inline comment as done.
  • Address reviewer comments
  • fix mistake and address comments
  • address comments
  • rebase
dberris requested changes to this revision.Feb 9 2017, 12:14 AM

Please address all comments by me and @dblaikie

222 ↗(On Diff #87750)

Magic number 2 for a reason? Maybe use the N in GraphDiffRenderer?

261 ↗(On Diff #87750)

Punctuation mark?

316 ↗(On Diff #87750)

What would be really helpful here is if you gave E.second a name. Is this an EdgeAttr? If so, if would be much easier to read something like:

auto &EdgeAttr = E.second;
if (EdgeAttr.Parents[0] == nullptr)
  return "green";
if (EdgeAttr.Parents[1] == nullptr)
  return "red";
327–329 ↗(On Diff #87750)

I'm not an expert here, but comparing to zero seems a little... odd because these are floating point numbers, but maybe not something important enough to worry about.

51 ↗(On Diff #87750)

Do you need to fully qualify N here?

dberris added inline comments.Feb 9 2017, 12:14 AM
416–418 ↗(On Diff #87750)

Lose the { and } ?

This revision now requires changes to proceed.Feb 9 2017, 12:14 AM
varno updated this revision to Diff 87956.Feb 9 2017, 7:59 PM
varno edited edge metadata.
varno marked 20 inline comments as done.
  • Address reviewer comments
    • fix mistake and address comments
    • address comments
    • rebase
    • address comments
    • -pedantic
    • rebase
dberris requested changes to this revision.Feb 24 2017, 8:41 PM

Lots of readability comments, and a couple of functionality questions.

222–238 ↗(On Diff #87956)

Looks like this is better merged:

for (auto i = 0; i < N; ++i) {
  const auto &G = G[i].get();
  for (const auto &V : G.vertices())
  for (const auto &E : G.edges())
223 ↗(On Diff #87956)

For readability purposes, I'd suggest you alias G[i].get() to reduce the visual noise.

228–229 ↗(On Diff #87956)

What is "S" and "E" here? It would really help if you use better names here. It will make more sense if you do something like:

auto &FromVertex = E.first.first;
auto &ToVertex = E.first.second;
auto FromVertexOrErr =;
auto ToVertexOrErr =;
236 ↗(On Diff #87956)

This is really unreadable. What is P?

242 ↗(On Diff #87956)

This is really opaque -- what does IM stand for? Please pick a more meaningful name. It's also not clear why we're using a monotonic number here.

244 ↗(On Diff #87956)

I suggest using enum names instead of 0 and 1 here, to aid in readability. Something like:

enum { LEFT = 0, RIGHT = 1 };
auto &VAttr = V.second;
if (VAttr.Parent[LEFT] == nullptr || VAttr.Parent[RIGHT] == nullptr)
244–245 ↗(On Diff #87956)

So let me get this right -- if one of the vertex attribute's parents doesn't have it, then we move on. So we don't indicate that the vertex is only seen in one of the graphs?

282–302 ↗(On Diff #87956)

All this repetition seems unnecessary. It seems you might be able to do this with a loop:

auto MemPtrs[] = { &TimeStat::Count, &TimeStat::Min, &TimeStat::Median, ... };
for (auto Mem : MemPtrs) {
  auto Diff = maxDiff(MA.*Mem, MB.*Mem);
  MA.*Mem = Diff.first;
  MB.*Mem = Diff.second;
324–327 ↗(On Diff #87956)

What is R and what is MR -- this code looks really hard to read. I can't even begin to explain what this is actually doing (what's EdgeAttr.P[0]->second.S ?).

334 ↗(On Diff #87956)

What is Manumim? Did you mean Maximum?

342–345 ↗(On Diff #87956)

Why are these colours hard-coded, when you already have a ColorHelper available?

369–372 ↗(On Diff #87956)

What is this meant to do? Did you mean to provide stringified labels? Why isn't this just always returning the empty string? Is this supposed to also provide the value of the difference, in relative terms as a string appropriately coloured?

382 ↗(On Diff #87956)

Looks like you might as well just use formatv here, as it's more confusing to read this as is.

405 ↗(On Diff #87956)

Why are we truncating the tooltip? It seems it might be better to provide the whole string in the tooltip so in cases where we can keep them (in SVG) then the truncated names show up in the code, and on hover will show more information.

38–44 ↗(On Diff #87956)

Please use a better identifier than P -- Is this meant to indicate Parent? If so, just use that to make it easier to read. If not, then please use a more descriptive name.

This revision now requires changes to proceed.Feb 24 2017, 8:41 PM

@varno -- are you able to make the changes, or are you alright with me taking over this patch?

varno planned changes to this revision.Mar 2 2017, 2:37 PM

Yep. This weekend

varno updated this revision to Diff 90568.Mar 4 2017, 12:30 AM
varno edited edge metadata.
  • address reviewer comments
  • clang-format
varno marked 11 inline comments as done.Mar 4 2017, 12:32 AM
varno added inline comments.
244–245 ↗(On Diff #87956)

That the Vertex is in only one of the graphs is encoded in the fact that the corresponding pointer is null. So Yeah We do indicate this, i.e. if there is no link the 0th or 1st vertex does not exist in the corresponding graph.

282–302 ↗(On Diff #87956)

Can't quite do this due to type issues but I have fixed this most of the way there.

342–345 ↗(On Diff #87956)

These are different colors not on the scale for if the vertex or edge does not exist (as opposed to encoding some difference. They should not come from the ColorHelper but are specified separately.

369–372 ↗(On Diff #87956)

I am not sure what to do here in labeling edges with differences, so I have a TODO here, If you can tell me what you think I should put there I will change this.

dberris requested changes to this revision.Mar 8 2017, 12:23 AM

More readability comments. In general all the escaped strings and broken up stream operations don't read well. A better option is using formatv(...) and consolidating those into format strings that are specified as raw string literals would make all the output generation code much more readable.

226 ↗(On Diff #90568)

What is V.second? Is this the vertex attribute? Aliasing this will make it much easier to read:

const auto& VAttr = V.second;
242–244 ↗(On Diff #90568)

Single line body; prefer no {}'s.

253–255 ↗(On Diff #90568)

As documented, it doesn't say what a "negative" value means -- so for values 1 and 2, it will result in:

(1/2) (0.5) - (2) = -1.5

I presume that the sign implies says the value on the left is larger? Or is that in reverse? It would be better if the comment actually stated this.

Alternatively, if you used a relative computation instead and yielded the percentage difference in one direction, and can use that as a label too. That would look something like:

-(Left - Right / Left)

If Left == 0, then it's +100%. If Right is 0 then it's -100%.

256 ↗(On Diff #90568)

Reading this now, I think 'difference' is a little misleading a name for what it does. It seems to be a scaled relative difference, because it's not just A - B (as the name "difference" would suggest). So one of: scaledRelativeDiff(...) or ratioDiff(...) would make this less confusing as it says what it means.

281–282 ↗(On Diff #90568)

Alternative way of spelling this:

std::tie(MA.*Mem, MB.*Mem) = DP;

Or just:

std::tie(MA.*Mem, MB.*Mem) =
    maxDiff(MA.*Mem, MB.*Mem, A.*Mem, B.*Mem);

This might also benefit from better names. Mem isn't the best name, but "Stat" might be more indicative of what it actually is. DoubleMemPtrs looks like it doesn't need to be given a name, the for loop could just have it there. Also, MA and MB don't mean much in terms of names. So rewriting that to be more succinct:

for (auto Stat : {&TimeStat::Min, ...})
  std::tie(StatLeft.*Stat, StatRight.*Stat) =
      maxDiff(StatLeft.*Stat, StatRight.*Stat, NewStatLeft.*Stat, NewStatRight.*Stat);
385 ↗(On Diff #90568)

Use raw string literals to avoid the problem of having to escape quotes.

385 ↗(On Diff #90568)

Can 40 not be hard-coded here? Consider a flag to control string length truncation.

412–415 ↗(On Diff #90568)

This is completely unreadable. Consider using the combination of formatv(...) and raw string literals instead. Also, please use aliases to make things like E.first.first say something a bit more readable. Something like:

const auto &SourceVertex = E.first.first;
const auto &DestinationVertex = E.first.second;
437–481 ↗(On Diff #90568)

Some of this duplication seems a little unnecessary. How about using control flow to reduce the repetition?

struct GraphData {
  bool KeepGoing;
  bool DeduceSiblings;
  string InstrMapFile;
  string Input;
  GraphRenderer::Factory Factory;
  Trace Trace;
  GraphRenderer Renderer;
  Graph Graph;
std::array<GraphData, 2> Graphs {
  { isSpecified(KeepGoing1), ... },
  { isSpecified(KeepGoing2), ...},
for (auto &Graph : Graphs) {
  Graph.Factory.KeepGoing = Graph.KeepGoing;
  auto TraceOrError = loadTraceFile(Graph.Input, true);
  if (!TraceOrError) ...
  Graph.Trace = std::move(*TraceOrError);
342–345 ↗(On Diff #87956)

I can reverse that question too -- why isn't the choice for what colour to use for when something is missing from one side not in the ColorHelper? It seems weird that these are the only colours hard-coded when I'd expect as a reader/maintainer of this that this information fits really well in the ColorHelper utility.

Something like a categorisation amongst "in both sets", "in set A", "in set B", "in neither". We could just use hard-coded colours to indicate those four states in the ColorHelper, and centralise where all the Colors are defined.

369–372 ↗(On Diff #87956)

How about providing the relative differences? Something like "+10%" or "-2.3%" in the same colour as the edge/vertex?

374–402 ↗(On Diff #90568)

Now that I read this, I'm more convinced of a mapping between the enum and data member pointers.

This revision now requires changes to proceed.Mar 8 2017, 12:23 AM

Hi @varno -- are you still working on this, or should I take this over instead?

varno marked 3 inline comments as done.Mar 17 2017, 3:34 AM

Am still working on this.

varno updated this revision to Diff 92261.Mar 18 2017, 11:42 PM
varno edited edge metadata.
varno marked an inline comment as done.
  • Address Reviewer Comments
varno marked 12 inline comments as done.Mar 18 2017, 11:45 PM

I have fixed most of the comments, and replied to all of them, Tests incoming soon.

253–255 ↗(On Diff #90568)

I have improved the comment. I chose this function because it is symmetrical (that is diff(A,B) = -diff(B,A)) and it behaves similarly to a

Right/Left - 1 which you suggest. behaves well if Right == 0 but not if left == 0. it does not treat the left and right hand sides symmetrically, nor close to it away from 1.

369–372 ↗(On Diff #87956)

I am not convinced that this is the right statistic to show, or whether this is the right way to go about coloring edges.

dberris requested changes to this revision.Mar 19 2017, 5:58 PM
dberris added inline comments.
265–266 ↗(On Diff #92261)

So this is always a positive number?

369–372 ↗(On Diff #87956)

Why not?

User gave you two traces. We're showing them differences. Using a made-up relative difference as opposed to ((A - B / A) * 100%) seems wrong and not helpful. We want to make sure the magnitude of the difference is what we use to highlight the edge or the vertex. Any other scheme would be trying to be too clever, and be actively harmful because it wouldn't make sense to the user trying to make sense of the data. If you don't provide the relative difference then it's effectively useless -- I as a user wouldn't be able to tell how severe the difference is in relative terms which is the whole point of the diff tool.

What makes the current way better than using the relative difference for the stat chosen?

This revision now requires changes to proceed.Mar 19 2017, 5:58 PM
varno added inline comments.Mar 20 2017, 3:46 PM
265–266 ↗(On Diff #92261)

Sorry, bad comment will fix.

369–372 ↗(On Diff #87956)

Hi Dean.

I Agree that for the textrual representation that RatioDiff is not the best answer, however for coloring the differences and determining what the largest difference is, this is just fine as both as the graph plotting ratioDiff vs (A-B)/A is monotonic. The properties of the chosen funciton Namely the symmetry allow us to compare the amount of change going positive and negative with each other, which is a useful property.

The reason that this is better is that the order of diff is mostly arbitrary, and so A/B - B;/A behaves like max(A,B)/min(A,B) away from where A and B are similar and properly when A, and B are close to each other then it behaves like (A-B)/A. However even here the behavior is better behaved. Also for choosing colors this is easier. the function (A-B)/ A just treats A, and B too differently to work well for coloring edges (I tried this first at google btw as it was the "obvious" answer).

However as a User in a textual representation I agree that we want to use a different metric, that is why I store both sides of the max and minimum statistics and not just the diff. There I am not sure whether we want to show percentage difference, absolute difference or just both numbers in the change. The decision here also will likely be different for edges and vertices, with edges probably wanting something simple like the percentage change or absolute difference, while the vertices you may want to have both the old and new numbers plus an amount of change.

varno requested review of this revision.Mar 20 2017, 3:59 PM
varno edited edge metadata.
varno added a comment.Mar 20 2017, 4:39 PM

Just doing the math now, (A/B - B/A)*100 is exactly 2x the average of x and y where x and y are defined such that :
A is x% faster than B, and B is x% slower than A. So this is really a reasonable relative difference function even intuitively, being the average of the two difference metrics.

(x = (A/B - 1)*100 and y= -(B/A - 1)*100, so (x+y)/2 = (A/B - B/A)/2)

varno planned changes to this revision.Mar 21 2017, 4:02 PM
varno updated this revision to Diff 92601.Mar 22 2017, 12:12 AM
  • Change coloring metric, add labels
dberris requested changes to this revision.Mar 26 2017, 11:13 PM

I've had a quick look, but haven't dug deep into the recent changes yet. Please add tests.

This revision now requires changes to proceed.Mar 26 2017, 11:13 PM

@varno -- are you still working on this?

varno added a comment.Apr 11 2017, 5:45 PM

I have been very busy recently but I have next week off. I will be working on it then.

varno updated this revision to Diff 96079.Apr 20 2017, 7:09 PM
varno edited edge metadata.
  • New branch
  • address reviewer comments
  • clang-format
  • Address Reviewer Comments
  • Change coloring metric, add labels
  • tests
dberris accepted this revision.Apr 23 2017, 10:26 PM

LGTM -- just a couple more nits, but I'll land it first and get the follow-up changes done myself. :)

Thanks for the work @varno!

310–315 ↗(On Diff #96079)

Why isn't the caller of this just doing:

llvm::find(Collection, nullptr) != Collection.end()
350 ↗(On Diff #96079)

Is it just me, or are the squiggly brackets not matched?

339–341 ↗(On Diff #96079)

This could be a static constexpr array, so it doesn't have to be initialised every time this function is called.

This revision is now accepted and ready to land.Apr 23 2017, 10:26 PM
This revision was automatically updated to reflect the committed changes.
varno added a comment.Apr 24 2017, 3:33 PM

Thanks for letting me finish this.

350 ↗(On Diff #96079)

two {{ escape a { in formatv, so yeah, but they have to be

jroelofs added inline comments.

'*' --> '/' copy-pasta?

sanjoy added a subscriber: sanjoy.Apr 28 2017, 1:43 PM
sanjoy added inline comments.
310–315 ↗(On Diff #96079)

Armchair comment: we also have llvm::is_contained in STLExtras.h