This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Use SCEV to sort memory accesses
ClosedPublic

Authored by mkuper on Feb 1 2017, 2:17 PM.

Details

Summary

This generalizes memory access sorting to use SCEVs instead of relying on constant offsets.

I'm not entirely sure my use of SCEV here is sane, both in terms of using the cache (if it's ok to rely on the cache, I can get rid of the map and simplify the whole thing), and in terms of expecting "SCEV.getMinusSCEV(X,Y) returns a constant" to be transitive. @sanjoy , I'll really appreciate feedback on both.

Diff Detail

Repository
rL LLVM

Event Timeline

mkuper created this revision.Feb 1 2017, 2:17 PM
mkuper updated this revision to Diff 86717.Feb 1 2017, 2:19 PM

Now, with correct test.

ABataev added inline comments.Feb 2 2017, 2:07 AM
lib/Analysis/LoopAccessAnalysis.cpp
1064 ↗(On Diff #86717)

You can make OffValPairs.reserve(VL.size()) and Sorted.reserve(VL.size()) to avoid possible resize

1083 ↗(On Diff #86717)

Better to use OffValPairs.emplace_back(SE.getSCEV(Ptr), Val);

mssimpso added inline comments.Feb 2 2017, 7:37 AM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

Do we require constant differences? If not, you could also ask SCEV about the range:

return SE.isKnownPositive(SE.getMinusSCEV(Right.first, Left.first))
efriedma added inline comments.
lib/Analysis/LoopAccessAnalysis.cpp
1071 ↗(On Diff #86717)

The SCEV cache is reasonably fast, but it's obviously faster not to do a hashtable lookup at all.

1095 ↗(On Diff #86717)

std::sort requires your comparator to produce a strict weak ordering; otherwise, it has undefined behavior.

Thanks for the comments!

lib/Analysis/LoopAccessAnalysis.cpp
1064 ↗(On Diff #86717)

Right, thanks.

1071 ↗(On Diff #86717)

I know, but using the cache would simplify the code even more, so I wonder if the penalty for a cache lookup is worth it. I mean, a SCEV cache lookup is basically a hash lookup, right?

1083 ↗(On Diff #86717)

Sure.

1093–1095 ↗(On Diff #86717)

@mssimpso : right, I think this should work, thanks! I was stuck thinking about constants because if the differences are non-constant we won't be able to do anything with the sorted array anyway. But this is nicer assuming I can actually use it as a comparator (see answer to Eli).

@efriedma : I know. That's why I've wanted sanjoy to comment on whether we can rely on SCEV to give transitive results. I mean, this comparator is obviously irreflexive, and I'm rather sure it's antisymmetric. So it comes down to whether we can rely on isKnownPositive(getMinus(A,B)) and isKnownPositive(getMinus(B,C)) implying that isKnownPositive(getMinus(A,C)) (or the equivalent for the version I have now).
If not, then I see two options:
(a) roll my own sort that bails if two values are incomparable.
(b) do something like: take one of the pointers, compute the difference vs. all the others, if one of the differences is non-constant bail, otherwise compare based on the diffs.

I guess (b) would be the simplest and safest option.

efriedma added inline comments.Feb 2 2017, 12:01 PM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

I don't think we can rely on SCEV being transitive? At least, I wouldn't want to rely on it, then crash in some obscure case.

Anyway, your comparator has worse problems: currently, you can prove 0 == A == 1 with it.

mkuper added inline comments.Feb 2 2017, 12:12 PM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

I don't think we can rely on SCEV being transitive? At least, I wouldn't want to rely on it, then crash in some obscure case.

That worries me too, that's why I wanted a stamp of approval/disapproval from sanjoy :-)
Maybe I'll just go with (b) and avoid this can of worms.

Anyway, your comparator has worse problems: currently, you can prove 0 == A == 1 with it.

I don't see why. I mean, for {0, 1, A} the strict weak order is "0 < 1, A is incomparable with both 0 and 1". The only comparison that evaluates to true is cmp(0,1), and this should be fine.

mkuper added inline comments.Feb 2 2017, 12:14 PM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

Er, wait, you're right, that's not strict. But I think it still satisfies the explicit requirements for a comparator.

efriedma added inline comments.Feb 2 2017, 2:19 PM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

C++ [alg.sorting]p3; "For algorithms [...] to work correctly, comp has to induce a strict weak ordering on the values."

mkuper added inline comments.Feb 2 2017, 3:12 PM
lib/Analysis/LoopAccessAnalysis.cpp
1093–1095 ↗(On Diff #86717)

Ok, so "strict" has two possible meanings. One is irreflexivity (which holds for this order), the other is transitivity of incomparability (which doesn't).
The standard uses it to mean the first one: [alg.sorting]p4: "The term strict refers to the requirement of an irreflexive relation (!comp(x, x) for all x),"
So I think, if transitivity holds, this is fine.

Anyway, it's a moot point, I'm going to change this not rely on SCEV transitivity.

mkuper updated this revision to Diff 86902.Feb 2 2017, 3:19 PM

Updated not to rely on SCEV transitivity.
Also addressed other comments, except for the suggestion to use SE.isKnownPositive(), which won't work with this comparator.

mssimpso added inline comments.Feb 3 2017, 6:48 AM
lib/Analysis/LoopAccessAnalysis.cpp
1081 ↗(On Diff #86902)

Hi Michael,

I must have missed this the first time around, but this looks a little confusing to me. If the accesses aren't sortable, why are we adding them to Sorted? I would think the caller would need some signal that sorting was unsuccessful - either an empty Sorted or a returned bool. Am I missing something obvious?

mkuper added inline comments.Feb 3 2017, 9:02 AM
lib/Analysis/LoopAccessAnalysis.cpp
1081 ↗(On Diff #86902)

No, you're not missing anything.

It's just that the API currently doesn't allow for the possibility of failure, so the callers blindly use the "sorted" array. This made sense for the previous version because it did not explicitly fail (it would just produce a meaningless order in cases where things weren't really sortable), and I didn't notice this when reviewing the original patch.
I was planning to fix the API up in a follow-up patch, but if you want to see it in the review, I can update it with that.

mssimpso added inline comments.Feb 3 2017, 9:05 AM
lib/Analysis/LoopAccessAnalysis.cpp
1081 ↗(On Diff #86902)

A follow-on is perfectly fine - I just wanted to make sure I wasn't confusing myself. Thanks for the clarification!

mssimpso edited edge metadata.Feb 3 2017, 9:20 AM

I had a comment for the test case, otherwise this looks good to me.

test/Transforms/SLPVectorizer/X86/jumbled-load.ll
8 ↗(On Diff #86902)

It looks like the changes in this test are unrelated to the patch. Probably left-overs from the update script?

mkuper added inline comments.Feb 3 2017, 10:38 AM
test/Transforms/SLPVectorizer/X86/jumbled-load.ll
8 ↗(On Diff #86902)

Yes, it looks like the update script decided to pattern-match the first uses of function arguments instead of matching their names. :-S

sanjoy accepted this revision.Feb 3 2017, 10:44 AM

Going back a couple of diffs, I don't think SCEV can guarantee beyond "best effort" that f(A, B) = isKnownPositive(getMinus(A, B)) (or the variant of this that you had) is transitive. This is because there are places where SCEV gives up on canonicalization after a certain cutoff point (e.g. around large (x + y) * (a + b + c) like cases) and so it is possible that while both B - A and C - B simplify to a constant, C - A does not (PR25170 is a more extreme version of this issue, for instance).

However, what you have now looks fine to me.

include/llvm/Analysis/LoopAccessAnalysis.h
696 ↗(On Diff #86902)

I don't know general LAA conventions, but this contract seem weird -- why not return an error.

[edit: I noticed you've already answered this]

lib/Analysis/LoopAccessAnalysis.cpp
1096 ↗(On Diff #86902)

Do you need the std::make_pair? Can't you do emplace_back(Diff->getAPInt().getSExtValue(), Val)?

This revision is now accepted and ready to land.Feb 3 2017, 10:44 AM

Thanks!

lib/Analysis/LoopAccessAnalysis.cpp
1096 ↗(On Diff #86902)

Ugh, right, thanks!

This revision was automatically updated to reflect the committed changes.