Page MenuHomePhabricator

[Analysis]Add getPointersDiff function to improve compile time.
ClosedPublic

Authored by ABataev on Mar 19 2021, 10:33 AM.

Details

Summary

Added getPointersDiff function to LoopAccessAnalysis and used it instead
direct calculatoin of the distance between pointers and/or
isConsecutiveAccess function in SLP vectorizer to improve compile time
and detection of stores consecutive chains.

Part of D57059

Diff Detail

Event Timeline

ABataev created this revision.Mar 19 2021, 10:33 AM
ABataev requested review of this revision.Mar 19 2021, 10:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2021, 10:33 AM
ABataev added inline comments.Mar 19 2021, 10:34 AM
llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll
2–5

No need for the second run of SLPVectorizer pass after this change

Seems about right, some thoughts.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1129–1134
1158–1159
1161–1165

Why not just do this in APInt directly?

1167–1172

But this checks that OffsetDelta is a multiple of original pointed-to type,
not what the comment says?

I suppose this overly restrictive.

1180–1185

Would it be better to add the non-constant case as an else to if (PtrA1 == PtrB1) {,
and deduplicate this code?

ABataev added inline comments.Mar 19 2021, 12:40 PM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1129–1134

Ok, will do

1158–1159

It is used for Size

1161–1165

Will try

1167–1172
  1. Yes, the comment is wrong, will fix it.
  2. Unfortunately not. Caused by a real example.
1180–1185

I'll try

lebedev.ri added inline comments.Mar 19 2021, 12:51 PM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1132–1133

Feel free to ignore this one:
for opaque ptr types forward-compatibility, it might be good to just take element type as an extra argument
and not look at the elt types of the pointers.

1158–1159

Err, i should have left this as a comment.
I think this should mention that while we might have traversed into another address space,
we still want to keep the original index width.
Either that, or we need to refresh index width.

1167–1172

Err, i understand what this checks and that this is for sure currently needed in some cases,
i'm only saying that perhaps some code could deal with sub-element-misalignment.

ABataev added inline comments.Mar 19 2021, 12:53 PM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1158–1159

Ah, I see, thanks.

1167–1172

I can add a boolean param to the function to control this check

spatel added inline comments.Mar 21 2021, 6:22 AM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1137–1138

Drive-by comment: should this be the first thing we check? Is it valid for callers to pass in null values for one or both of these arguments, or can we assert that it should not happen?

ABataev updated this revision to Diff 332388.Mar 22 2021, 12:02 PM

Address comments, rebase, improvements

lebedev.ri accepted this revision.Mar 23 2021, 12:37 AM

Can isConsecutiveAccess() be reimplemented in terms of getPointersDiff() now?
LGTM, but someone else should take a look, too.
Nice, https://llvm-compile-time-tracker.com/compare.php?from=bde995c9c2a0f73c69f1ca60b94bec0bebf20537&to=de67ba4218b6a64361e0269624a6998acb493b51&stat=instructions

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1202

Why is this no longer llvm::SmallSet<>?

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
945–946

This looks like a separate change.

This revision is now accepted and ready to land.Mar 23 2021, 12:37 AM

Can isConsecutiveAccess() be reimplemented in terms of getPointersDiff() now?

Sure, I'll rework isConsecutiveAccess() to use getPointersDiff()

LGTM, but someone else should take a look, too.
Nice, https://llvm-compile-time-tracker.com/compare.php?from=bde995c9c2a0f73c69f1ca60b94bec0bebf20537&to=de67ba4218b6a64361e0269624a6998acb493b51&stat=instructions

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1202

Well, I thought that it is going to be more optimal to use std::set here directly. For SmallSet we cannot pass comparer directly (but this can be fixed, I think), for the small number of elements it uses SmallVector with linear search and if growth too big it switches to std::set, copying all the elements from the SmallVector. I think that here we can use std::set directly to avoid extra copying in case of the increasing number of elements and linear search for the small number of elements.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
945–946

Well, in general, yes but this also improves compile time as we do not need to get pointers diff in this case and we can return ScoreFail immediately.

ABataev updated this revision to Diff 332667.Mar 23 2021, 7:16 AM

Rebase + rework of isConsecutiveAccess()

LGTM - still OK with this @lebedev.ri ?

LGTM - still OK with this @lebedev.ri ?

Yep

This revision was landed with ongoing or failed builds.Mar 23 2021, 12:59 PM
This revision was automatically updated to reflect the committed changes.
RKSimon added inline comments.Mar 24 2021, 2:25 AM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1226

@ABataev IsConsecutive is false so isn't this line redundant?

Noticed by static analysis.

ABataev added inline comments.Mar 24 2021, 4:12 AM
llvm/lib/Analysis/LoopAccessAnalysis.cpp
1226

Yes, just forgot to remove this line after rework. Will do thisnlater today.

uabelho added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3155

Hi,
I'm seeing a miscompile with this patch for my out-of-tree target and I wonder if we perhaps should pass StrictCheck=true for this getPointerDiff call.

In the case I see I have two loads where Dist is calculated as 1, but actually it's 3/2, and the resulting vectorized load then doesn't load the correct words.

I'm having some troubles reproducing this on an in-tree target but I don't see why it shouldn't be possible.

I see that if I do pass StrictCheck=true here the miscopile goes away.

@ABataev , what do you think about this? Is it possible that the code broke here?

ABataev added inline comments.Mar 26 2021, 5:59 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3155

It should not be broken here. sortPtrAccesses should call getPointersDiff with StrictCheck already after 040c60d9b69e2ad570556f255a746929a4b10e82 from Richard Smith and we can get here only if getPointerDiff with StrictCheck==true returned defined result already. Try to update your repo. Plus, this code is for stores, not for loads. Anyway, your problem should be fixed already, just update

uabelho added inline comments.Mar 26 2021, 7:06 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3155

Aha, yep 040c60d9b69e2 seems to help. Great!
(And yes, the problem I saw was with the load case at line 2880 and not here in store.)

Looks like this can trigger an infinite loop: https://bugs.llvm.org/show_bug.cgi?id=49898

Looks like this can trigger an infinite loop: https://bugs.llvm.org/show_bug.cgi?id=49898

I'll fix it ASAP, thanks for the report and reproducer!