This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Visualize SLP trees with -view-slp-tree
ClosedPublic

Authored by anemet on Mar 7 2017, 11:13 PM.

Details

Summary

Analyzing larger trees is extremely difficult with the current debug output so
this adds GraphTraits and DOTGraphTraits on top of the VectorizableTree data
structure. We can now display the SLP trees with Graphviz as in
https://reviews.llvm.org/F3132765.

I decorated the graph where a value needs to be gathered for one reason or
another. These are the red nodes.

There other improvement I am planning to make as I work through my case here.
For example, I would also like to mark nodes that need to be extracted.

Diff Detail

Repository
rL LLVM

Event Timeline

anemet created this revision.Mar 7 2017, 11:13 PM
mkuper edited edge metadata.Mar 8 2017, 12:04 AM

Thank you, this will be incredibly helpful. I don't think it quite works, though - see inline.

(In any case, I can only review the SLP part, never implemented GraphTraits.)

lib/Transforms/Vectorize/SLPVectorizer.cpp
517 ↗(On Diff #90982)

tl;dr: I think this is broken, but it's not something you can easily fix here.

Fair warning: I haven't had time to really think this through, so maybe there's something trivial I'm missing here. Having said that - I just realized yesterday the tree is a lie. It isn't actually a tree, it's a DAG - so "the user" is not well-defined (see line 1095 on the left, and the discussion about the trouble it causes here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435431.html )

I think the whole thing should be rewritten, to be explicit about the graph it contains. But right now, I believe the only correct way to look up which tree nodes feed a node T is the way vectorizeTree() does it - look at the (scalar) operands of T's scalars, and map them back to the nodes through ScalarToTreeEntry. I'm not sure trying to wrap GraphTraits around that is a good idea, though.

We can either:

  1. Leave this as is, and be explicit about the fact that the rendered graph may be imprecise.
  2. Try to walk the graph "properly".
  3. Put this on hold until the underlying "graph" has a saner representation. I'm not sure when I'll have time to do it, though. If anybody else volunteers...

None of those sound appealing, TBH. Not sure what the least evil is.

2136 ↗(On Diff #90982)

Any reason not to render this as part of the graph regardless of NDEBUG?

mssimpso edited edge metadata.Mar 8 2017, 7:26 AM

Thanks for working on this Adam! I have drawn these kinds of graphs by hand more times than I care to admit. This will be very helpful.

lib/Transforms/Vectorize/SLPVectorizer.cpp
517 ↗(On Diff #90982)

I'd like to better understand the imprecision here. You're essentially saying that the graph output could be missing some edges because a tree entry could have multiple users? And that these uses need not even be of the same vector value (in the unordered load case)? How difficult would it be to walk the DAG properly?

anemet added inline comments.Mar 8 2017, 8:09 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
517 ↗(On Diff #90982)

tl;dr: I think this is broken, but it's not something you can easily fix here.

Well not more broken that the entire code then ;). We are badly missing a FIXME somewhere.

Fair warning: I haven't had time to really think this through, so maybe there's something trivial I'm missing here. Having said that - I just realized yesterday the tree is a lie. It isn't actually a tree, it's a DAG - so "the user" is not well-defined (see line 1095 on the left, and the discussion about the trouble it causes here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435431.html )

I think the whole thing should be rewritten, to be explicit about the graph it contains. But right now, I believe the only correct way to look up which tree nodes feed a node T is the way vectorizeTree() does it - look at the (scalar) operands of T's scalars, and map them back to the nodes through ScalarToTreeEntry. I'm not sure trying to wrap GraphTraits around that is a good idea, though.

I have actually seen that we're trying to recognize existing entries in the tree but my conclusion was that at that point we just just terminate the tree. I am curious to look at your testcase.

Anyhow, it's trivial to make the UsingTreeIdx a SmallVector<int, 1> to be able to represent a DAG without bloating the TreeEntry much. Hopefully we can find a place to easily update it too. It would also remove my C++ magic to make a single int an iterate-able "container".

We can either:

Leave this as is, and be explicit about the fact that the rendered graph may be imprecise.
Try to walk the graph "properly".
Put this on hold until the underlying "graph" has a saner representation. I'm not sure when I'll have time to do it, though. If anybody else volunteers...
None of those sound appealing, TBH. Not sure what the least evil is.

Let me look at this today. Hopefully the testcase you quote is a non-tree case (please confirm). I can use that to make this more precise.

Either way, I'd like to check this in one form or another as it's super-useful even if we make it experimental for now. We can improve it in tree. If we make the user node a container the data structure is equipped to deal wit this.

anemet updated this revision to Diff 91039.Mar 8 2017, 9:43 AM
anemet marked an inline comment as done.

This version allows for multiple users for a TreeEntry and hooks up
buildTree_rec to update it when there is a reuse. See
https://reviews.llvm.org/F3133339 how it looks for Michael's testcase.

The SLP part LGTM - it's less broken than the rest of the code, and we need to fix SLP itself to fix it (and avoid things like the Container hack...)

The GraphTraits implementation generally LGTM as well, but that may be because I don't know enough about it to tell a good one from a bad one...

lib/Transforms/Vectorize/SLPVectorizer.cpp
517 ↗(On Diff #90982)

@mssimpso : yes. I think we're going to unroll the unordered load case for now, so they'll always be the same value. And, apparently, not as difficult as I thought, because Adam just did it.

@anemet : Yes, not more broken than the entire code. I was going to write "but it's not your fault". :-)
And yes, you're completely right, it is trivial... I didn't consider the fact the initial walk is correct, even if the results it produces is nonsense, so you can just keep using the walk order as is. Thanks!

1188 ↗(On Diff #91039)

I think this is where we want the FIXME. :-)
Can you please add it -not for the entire thing, just that this here is a huge hack that makes the graph used for printing *more* precise than the graph actually used for vectorization...

This revision was automatically updated to reflect the committed changes.
anemet added inline comments.Mar 8 2017, 11:01 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1188 ↗(On Diff #91039)

OK, I added that. It would be good explain the shuffle-order thing as well but I will leave that to you or Shahid.

Thanks for the review!