This is an archive of the discontinued LLVM Phabricator instance.

[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences
ClosedPublic

Authored by kmclaughlin on Apr 22 2021, 8:38 AM.

Details

Summary

Adds support for scalable vectorization of loops containing first-order recurrences, e.g:

for(int i = 0; i < n; i++)
  b[i] =  a[i] + a[i - 1]

This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into
account when inserting into and extracting from the last lane of a vector.
CreateVectorSplice has been added to construct a vector for the recurrence, which
returns a splice intrinsic for scalable types. For fixed-width the behaviour
remains unchanged as CreateVectorSplice will return a shufflevector instead.

The tests included here are the same as test/Transform/LoopVectorize/first-order-recurrence.ll

Diff Detail

Event Timeline

kmclaughlin created this revision.Apr 22 2021, 8:38 AM
kmclaughlin requested review of this revision.Apr 22 2021, 8:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2021, 8:38 AM
paulwalker-arm added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4187–4191

Hi @kmclaughlin, sorry for the flyby comment but I just wanted to highlight this block looks equally applicable to fixed length vectors and thus ideally I'd prefer to not introduce split control flow unless absolutely necessary.

There's a couple of other changes below that also look like the "scalable" path can be used for all vector types.

Matt added a subscriber: Matt.Apr 24 2021, 10:26 AM
  • Removed VF.isScalable() calls added to fixFirstOrderRecurrence and applied the 'scalable' path to all vector types
kmclaughlin marked an inline comment as done.Apr 26 2021, 7:52 AM

Hi @kmclaughlin, the latest patch looks a lot neater! I've not finished reviewing all the tests yet, but just a few minor comments so far ...

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4184

nit: I think we can use call Builder.GetSub here?

4253

nit: I think we can just call Builder.GetSub here instead?

llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll
121

nit: Perhaps better to use a named variables in this test instead of %vector.recur.init and %vector.recur as they chould change?

133

nit: Can just use %{{.*}} here I think.

179

I wonder if we need these negative tests (PR26734, PR27246 and no_sink_after), since they're already covered for fixed width vectors in llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll and in this patch you've only changed fixRecurrence, which only gets invoked when deciding to vectorise?

sdesmalen added inline comments.Apr 26 2021, 9:37 AM
llvm/include/llvm/IR/IRBuilder.h
2514–2515

a vector is extracted from concat(V1, V2), starting at Imm.

llvm/lib/IR/IRBuilder.cpp
1032

This variable isn't particularly useful. Maybe just write:

assert(isa<VectorType>(V1->getType()) && "Unexpected type");
1045

cast<FixedVectorType>(Ty)->getNumElements();

david-arm added inline comments.Apr 27 2021, 12:23 AM
llvm/lib/IR/IRBuilder.cpp
1035

I guess if you remove Ty as @sdesmalen suggests it then makes sense to write here:

if (auto VTy = dyn_cast<ScalableVectorType>(V1->getType()))

and replace the Ty below with VTy?

david-arm added inline comments.Apr 27 2021, 5:55 AM
llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll
259

nit: Again, I'm not sure if this test is really needed here - it doesn't seem like it adds much value over the original fixed width equivalent in llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll to be honest.

308

nit: Perhaps avoid referring to the variable names %vector.recur.init and %vector.recur?

paulwalker-arm added inline comments.Apr 27 2021, 6:08 AM
llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll
121

@david-arm I think this is typical behaviour, as used by the original tests. These names are hardcoded within the pass so can be relied upon. If nothing else, relying on these names means we can ensure the IR remains readable.

kmclaughlin marked 8 inline comments as done.

Addressed review comments from @sdesmalen & @david-arm:

  • Replaced uses of Builder.CreateBinOp with Builder.GetSub in fixFirstOrderRecurrence
  • Removed tests which added little value over the exisiting tests in test/Transform/LoopVectorize/first-order-recurrence.ll (PR26734, PR27246, PR30183 & no_sink_after)
  • Removed Ty from CreateVectorReverse
david-arm accepted this revision.Apr 28 2021, 6:13 AM

LGTM! Thanks for the changes - it looks like you've addressed everyone's comments.

llvm/lib/IR/IRBuilder.cpp
1036

nit: Maybe fix clang-tidy warning before merging?

1053

nit: Maybe fix clang-tidy warning before merging?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4268

nit: It looks like these comments have been included in the vector case, but I think they're intended for the scalar else if block below. Perhaps you can just move them into that block instead?

This revision is now accepted and ready to land.Apr 28 2021, 6:13 AM
fhahn added inline comments.Apr 28 2021, 6:46 AM
llvm/lib/IR/IRBuilder.cpp
1047

Is this case actually used? Do we need to support this case, or can we assert that Imm always is valid? (might also be good to mention in the comment what values of Imm are supported).

llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll
2

Is there anything AArch64 specific about the test? The VFs are all forced, so no cost-modeling should be needed, could this just use -force-target-supports-scalable-vectors?

kmclaughlin marked 4 inline comments as done.
  • Added an assert for out of range immediate values in CreateVectorSplice as we should never encounter this case & expanded the comment in IRBuilder.h to include the supported values of Imm
  • Removed -mtriple aarch64-unknown-linux-gnu -mattr=+sve from the tests where possible and used -force-target-supports-scalable-vectors instead
  • Fixed clang-tidy warnings
kmclaughlin added inline comments.Apr 30 2021, 6:07 AM
llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll
2

Where possible I've moved the tests into LoopVectorize/scalable-first-order-recurrence.ll and used the -force-target-supports-scalable-vectors flag. There were a couple of tests (@PR33613 & @PR34711) which require masked gathers, and without specifying a target architecture setCostBasedWideningDecision will assert as it decides to scalarise these, so I've kept these tests in AArch64/first-order-recurrence.ll.

david-arm accepted this revision.May 5 2021, 12:30 AM

LGTM! It looks like you've addressed everyone's review comments. Thanks for the changes!

fhahn requested changes to this revision.May 5 2021, 1:52 PM

Thanks for the updates! LGTM

This revision now requires changes to proceed.May 5 2021, 1:52 PM
fhahn accepted this revision.May 5 2021, 2:09 PM

(wrong button)

This revision is now accepted and ready to land.May 5 2021, 2:09 PM
This revision was landed with ongoing or failed builds.May 6 2021, 3:49 AM
This revision was automatically updated to reflect the committed changes.