Page MenuHomePhabricator

[LoopVectorize] Don't attempt to widen certain calls for scalable vectors
Needs ReviewPublic

Authored by david-arm on May 13 2021, 4:53 AM.

Details

Summary

When attempting to vectorise a loop containing a call to the following
intrinsic

tail call fast float @llvm.sin.f32(float %0)

we then crashed in tryToWidenCall because we don't have scalable vector
versions of these intrinsics and we also cannot scalarise them. Rather
than rely on the cost model returning Invalid for such cases we should
be explicitly rejecting scalable vectorisation upfront.

This patch adds a new TTI interface called isLegalVectorIntrinsic that
checks if it is legal to vectorise a particular intrinsic. We then
walk through all calls in the loop ensuring they are legal - if not
we return a zero VF from LoopVectorizationCostModel::getMaxLegalScalableVF.

Tests have been added here:

Transforms/LoopVectorize/AArch64/scalable-call.ll

Diff Detail

Event Timeline

david-arm created this revision.May 13 2021, 4:53 AM
david-arm requested review of this revision.May 13 2021, 4:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2021, 4:53 AM
Matt added a subscriber: Matt.May 18 2021, 7:29 AM
kmclaughlin added inline comments.May 19 2021, 4:11 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5671

nit: could this be moved below the comments beneath it so that it's above if !(canVectorizeReductions(VF))?

5689

Hi @david-arm, is it possible for there to be a VecID but getMappings(*CI) is not empty? Is it worth adding a comment about what should happen in this case?

llvm/test/Transforms/LoopVectorize/AArch64/scalable-call.ll
124

nit: should this test be also be called no_mapping if it is legal to vectorize?

david-arm updated this revision to Diff 346461.May 19 2021, 7:55 AM
  • Improved and added more comments to code + tests to explain what's going on.
  • Tried to restructure the code a little to be a bit clearer.
david-arm marked 3 inline comments as done.May 19 2021, 8:00 AM
david-arm added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5689

Yeah, so in the majority of cases I expect we don't have mappings for things like sqrt and sin intrinsics, at least not until there is some scalable vector library support. I've tried to restructure the code here to make the logic a bit clearer.

llvm/test/Transforms/LoopVectorize/AArch64/scalable-call.ll
124

Hi @kmclaughlin, so in this case there are no mappings attached to the call instruction, however it is legal to vectorise because there is hardware support for it. I've added comments that hopefully explains it better!

sdesmalen added inline comments.May 19 2021, 8:06 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1910

I'd rather write this the other way around, have a list here of intrinsics we do support and can safely return 'true' for, and fall back on BaseT's implementation for the 'default' case (which seems to be missing here), which is more pessimistic and only returns true if it knows the intrinsics can always be vectorized, i.e. the trivially vectorizable cases.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5669

MaxVF ?

5669

nit: s/isSclableLoopLegal/loopCanBeWidenedWithScalableVectors/ ?

5671

for eligible vectorization factors up to MaxVF

5672

Can you move this comment down to the CallInst case?

david-arm marked 2 inline comments as done.May 19 2021, 8:10 AM
david-arm added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5672

Yeah sure, although it's worth pointing out this is an existing comment that was referring to canVectorizeReductions, i.e. that we can also filter out the VFs not valid for the reductions too. I'm happy to move it down to the CallInst case though if you're not worried about the reduction case atm?

sdesmalen added inline comments.May 19 2021, 8:36 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5672

Ah I see. Looking at the implementation for canVectorizeReductions, and correspondingly to isLegalToVectorizeReduction, the VF itself isn't really used, so perhaps we may just as well remove it at this point. It's more the reduction-operation that is holding it back from vectorizing rather than the VF. Initially I imagined us having a function like you added here, which would filter the VFs based on capabilities, but it turned out that for the reductions that just wasn't really needed, but we kept the individual VF operand.

david-arm added inline comments.Thu, May 20, 1:28 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1910

So we do fall back on the BaseT version, it's just I didn't do it explicitly in the default: case. For some reason I thought this coding structure was the preferred way - I'm never really too sure about the coding style of switch statements in LLVM!

It's also worth pointing out that that the BaseT version calls isTriviallyVectorizable where even sin and cos will return true. Therefore, returning the BaseT version for those cases is not what we want. I could either:

  1. Fill in the complete set of entries here with both the false and true cases, or
  2. Change the BaseT version to always return false.

Any preferences?

sdesmalen added inline comments.Thu, May 20, 3:58 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1910

It's also worth pointing out that that the BaseT version calls isTriviallyVectorizable where even sin and cos will return true. Therefore, returning the BaseT version for those cases is not what we want.

That suggests to me that the BaseT implementation for isLegalScalableVectorIntrinsic is too lenient and shouldn't rely on isTriviallyVectorizable. Returning 'false' seems like a better starting point.

david-arm updated this revision to Diff 346758.Thu, May 20, 9:11 AM
  • Changed AArch64TargetTransformInfo function to return true for supported cases and default to the BaseT version for the rest.
  • Changed the BaseT version of isLegalScalableVectorIntrinsic to return false.
  • Removed a FIXME comment.
david-arm marked 7 inline comments as done.Thu, May 20, 9:12 AM
sdesmalen added inline comments.Thu, May 27, 1:54 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
665

nit: I read isLegalScalableVectorIntrinsic as asking if the "scalable vector intrinsic" is legal as if the behaviour of the intrinsic is specific to scalable vectors. Rather, what you're asking is if the intrinsic (not specific to scalable vectors) is legal to use with scalable vector arguments, so from that reasoning I prefer isIntrinsicLegalForScalableVectors.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5684

When debug-info is enabled, that will insert llvm.dbg intrinsic calls which aren't handled in your switch, but maybe it's better to avoid getting into that to begin with by discarding instructions where I.isDebugOrPseudoInst() == true.

5697

Instead of passing in Msg, can you just call reportVectorizationInfo here? Also, can you make the message more specific by telling which function is not vectorizable with scalable vectors?

5699

Shouldn't the code traverse the mappings in the VFdatabase to see if there is a scalable form available?

david-arm updated this revision to Diff 348295.Thu, May 27, 9:08 AM
  • Rebase + addressed review comments
david-arm marked 4 inline comments as done.Thu, May 27, 9:08 AM
sdesmalen added inline comments.Thu, May 27, 9:44 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
1523

nit: Ah I only now see it falls a bit out of style, so perhaps isLegalIntrinsicForScalableVectors is a better name. Sorry I didn't spot that before.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1912

nit: maybe drop the comments, because it's not complete (for example, smax is not bit-manipulation)

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1516

nit: is this an unrelated change?

5701–5708

nit: Can you use llvm::any_of() instead of the above loop?

5715

Can you add a FIXME that says this code is still too optimistic? It depends on the chosen VF whether the loop can vectorize (vis-a-vis whether a mapping is available for that specific VF). I expect we'll want to build up a set of suitable VFs (up to and including MaxVF) that are legal, so that we can later filter out VFs that should not be considered as candidates for vectorization.

david-arm added inline comments.Wed, Jun 16, 1:30 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1516

No, it's needed because the caller is also marked const

5701–5708

Sure. To be honest, in general I find the above loop more readable than using lambdas, which is why I tend to find it more natural to write code this way. However, I know lots of the codebase uses them. :)

david-arm updated this revision to Diff 352635.Thu, Jun 17, 1:12 AM
  • Rebase.
  • Addressed review comments.
david-arm marked 5 inline comments as done.Thu, Jun 17, 1:13 AM