Page MenuHomePhabricator

[LV] Prevent vectorization with unsupported element types.

Authored by kmclaughlin on May 11 2021, 9:53 AM.



This patch adds a TTI function, isElementTypeLegalForScalableVector, to query
whether it is possible to vectorize a given element type. This is called by
isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if
any of the instruction types in the loop are unsupported, e.g:

int foo(__int128_t* ptr, int N)
  #pragma clang loop vectorize_width(4, scalable)
  for (int i=0; i<N; ++i)
    ptr[i] = ptr[i] + 42;

This example currently crashes if we attempt to vectorize since i128 is not a
supported type for scalable vectorization.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kmclaughlin marked 4 inline comments as done.
kmclaughlin edited the summary of this revision. (Show Details)
  • Moved the changes which add isLegalToVectorizeElementType into a seperate patch (D102515)
  • Added canVectorizeInstructionTypes, which iterates through all instructions in the loop and queries whether it is legal to vectorize the element type.
  • Call canVectorizeInstructionTypes from getMaxLegalScalableVF to ensure we reject scalable vectorization if the loop contains any unsupported types.

Hello. I thought that the idea was going to be that the costmodel returned an invalid cost for types/operations that could not be handled. And because it had an invalid cost, the vectorizer would then not vectorize with scalable factors? Is that not still the idea?

Hi @dmgreen, thanks for taking a look at this! We should be returning invalid costs for operations that are not supported, though I think the idea is that we should also be trying to reject scalable vectorisation early in situations such as the one here so that we don't have to rely on what the cost model is returning. As we can also only cost operations that are legal to vectorise, if we get an invalid cost for an instruction that we considered legal, this suggests either the cost model is incomplete or legalisation was not strict enough.
I've added D102515 to ensure we are returning an invalid cost for memory ops with unsupported types.

Matt added a subscriber: Matt.May 14 2021, 12:13 PM
joechrisellis added inline comments.May 17 2021, 1:43 AM

I think this is equivalent to:

bool canVectorizeInstructionTypes(Loop *TheLoop, ElementCount VF) {
  for (BasicBlock *BB : TheLoop->blocks())
    for (Instruction &I : BB->instructionsWithoutDebug()) {
      auto *Ty = I.getType();
      if (!Ty->isVoidTy() &&
          !TTI.isLegalToVectorizeElementType(Ty, VF.isScalable()))
        return false;
  return true;

Has the nice side effect of getting rid of the double negative. ๐Ÿ™‚

1 โ†—(On Diff #345506)

nit: I usually see the triple being set with:

target triple = "aarch64-linux-gnu"

to simplify the RUN line. IMO in general it makes it a little nicer to run the test as a standalone thing rather than through llvm-lit. ๐Ÿ™‚

david-arm added inline comments.May 17 2021, 1:48 AM
1 โ†—(On Diff #345506)

I think many/most of our tests in llvm/test/Transforms/LoopVectorize/AArch64/ do it this way, i.e. specify the triple on the command line so @kmclaughlin's RUN line is more the norm here. But either way works!

@kmclaughlin - since this is AArch64 specific can you move this to the AArch64 directory?

david-arm added inline comments.May 17 2021, 1:51 AM

I don't think we need to pass TheLoop in here, since I think it's a member of the class already?

kmclaughlin marked 4 inline comments as done.

Addressing review comments from @joechrisellis & @david-arm:

  • Removed IllegalTy from canVectorizeInstructionTypes() & stopped passing TheLoop to the function
  • Moved the test file to the AArch64 directory & removed the triple from its RUN line
david-arm accepted this revision.May 18 2021, 6:12 AM

LGTM! Thanks for dealing with review comments!


nit: Could you add a simple comment above the function before committing? Something like

/// Returns true if all types found in the loop are legal to vectorize.



nit: Since this test is now in the AArch64 directory I think you can replace "-force-target-supports.." with "-mattr=+sve"?

This revision is now accepted and ready to land.May 18 2021, 6:12 AM
fhahn added a subscriber: fhahn.May 18 2021, 6:15 AM
fhahn added inline comments.

I think terminology used in other places is 'widening' instead of 'vectorizing' , both in the cost model and codegen. Does vectorizing here means something different? Also would be good to add a comment for the function..

david-arm added inline comments.May 18 2021, 6:41 AM

I guess they're the same? I hadn't deliberately chosen the word vectorize over widen to be honest. In this case I think we specifically want to know if the target has hardware support for vector instructions involving a given type because we cannot fall back on scalarisation.

In general, is there a preference to using Widen instead of Vectorize in naming schemes and comments? It's just I do see other functions in this file use the word Vectorize in functions and class names so I guess it seemed natural to create functions with the word Vectorize in them.

fhahn added inline comments.May 19 2021, 3:15 AM

I don't think there's a definitive answer. The current usage is not as clear cut as I thought. I was thinking about the terminology used for the various widenInstruction, widenGEP, widenIntOrFpInduction, widenPHIInstruction & co vs. vectorizeInterleaveGroup which does not simplify widen instructions, together with the widening decision terminology used in the cost model.

In any case, a doc-comment would definitely be helpful. Another thing to consider to align the name with TTI.isLegalToVectorizeElementType to something like isLegalTo(Vectorize/Widen)InstructionType

kmclaughlin marked an inline comment as done.
  • Renamed canVectorizeInstructionTypes to isLegalToVectorizeInstTypesForScalable & added a comment above it
  • Replaced -force-target-supports-scalable-vectors with -mattr=+sve in sve-illegal-type.ll
david-arm accepted this revision.May 24 2021, 8:11 AM

LGTM! Seems like outstanding review comments have been addressed. Thanks!


nit: I think perhaps you can just remove the VF argument now?

kmclaughlin marked an inline comment as done.
kmclaughlin edited the summary of this revision. (Show Details)
  • Moved the addition of isElementTypeLegalForScalableVector into this patch from D102515
  • Removed unused VF from isLegalToVectorizeInstTypesForScalable
  • Merged the isElementTypeLegalForScalableVector and isLegalElementTypeForSVE functions. Since isElementTypeLegalForScalableVector does not check the VF anymore, there was no additional benefit to them being separate.
sdesmalen added inline comments.Jun 8 2021, 9:58 AM

Missing doxygen comment.


To reduce the diff, can you just rename isLegalElementTypeForSVE?


I'd like to suggest to have a name that's more generic: canWidenLoopWithScalableVectors, so that it can be reused for other purposes. D102394 and subsequently D101916 do that already. If you make a similar change as in D102394 where you move canVectorizeReductions into this function, then @david-arm can rebase his patch when he's back.

kmclaughlin marked 2 inline comments as done.
  • Renamed isLegalElementTypeForSVE to isElementTypeLegalForScalableVector, but kept the implementation in AArch64TargetTransformInfo.h
  • Renamed isLegalToVectorizeInstTypesForScalable to canWidenLoopWithScalableVectors and moved canVectorizeReductions into this function, similar to D102394
sdesmalen added inline comments.Tue, Jun 29, 2:50 AM

Hi @kmclaughlin I tried out this patch and found there is a case that's missing, namely where the instruction is a StoreInst, which itself has void as its type, but then stores a loop invariant i128/f128 value to a variant address.

Also, looking at this condition again, why is IntegerTy(1) handled differently here? For SVE we have predicate vectors, so it should be covered by isElementTypeLegalForScalableVector?

kmclaughlin marked an inline comment as done.
kmclaughlin edited the summary of this revision. (Show Details)
  • Split getSmallestAndWidestTypes into two functions. collectAllElementTypesInLoop now iterates over every instruction in the loop and collects a list of the element types found, which is used later by canWidenLoopWithScalableVectors to disable scalable vectorization if any of the types are illegal. getSmallestAndWidestTypes now only returns the min & max widths found in the list of element types.
  • Moved the Ty->isIntegerTy(1) check into isElementTypeLegalForScalableVector
  • Added a test which stores an loop invariant i128 value to a variant address.
kmclaughlin added inline comments.Tue, Jun 29, 10:31 AM

Hi @sdesmalen, I've added a test for the missing store case which you found.
The IntegerTy(1) here should be covered by isElementTypeLegalForScalableVector - I've moved this accordingly.

david-arm added inline comments.Wed, Jun 30, 12:40 AM

@sdesmalen @kmclaughlin By doing this I think we now have a large test escape because we're essentially adding a whole new piece of functionality, i.e. telling the vectoriser we can also do masked loads/stores and gather/scatters with i1 types too. I think this means we now need to add a cost model for masked loads/store/gathers/scatter using i1 types, write vectoriser tests and ensure we get sensible codegen too?

  • Updated isLegalMaskedLoadStore & isLegalMaskedGatherScatter to ensure they do not return true for i1 types, now that isElementTypeLegalForScalableVector returns true for i1

I think the new changes look good @kmclaughlin! Just a couple more minor comments.


If we're relying upon the element types being collected for correctness here is it worth adding an assert:

assert(!ElementTypesInLoop.empty() && "Unable to calculate smallest and widest types");

Do we ever call this function more than once for a given loop? I wonder if it's worth clearing the list at the start just in case, i.e.

kmclaughlin marked an inline comment as done.
  • Rebased the patch & ensured the ElementTypesInLoop list is cleared at the start of collectAllElementTypesInLoop

Since we only add the element types of loads, stores and (reduction) phi nodes in collectAllElementTypesInLoop(), I think it is possible for ElementTypesInLoop to be empty here for a given loop. I tried adding the assert and did run into quite a few test failures with the change.


I don't think we ever call this function more than once for each loop, though I think it's still worth clearing the list as you suggest.

david-arm accepted this revision.Thu, Jul 1, 7:31 AM

LGTM! Thanks for making all the changes @kmclaughlin. Can you address the nits before merging?


nit: I think the comment probably needs updating now since we no longer return a remark. Maybe something like:

/// scalable vectorization factor MaxVF. If the loop is illegal the function
/// emits an appropriate error remark.

nit: Maybe best not to rely upon %4 and %5 here perhaps and use a named variable?


nit: Same thing as above about using %4 and %5

I left a few more comments, but other than that, I'm mostly happy with the approach taken in this patch.


nit: widening is


Given the recent change in direction to use invalid costs to avoid vectorization with scalable vectors, this function should be quite limited in size/scope, so maybe it's better not moving this out into a separate function like you've done here, but just add the extra condition to getMaxLegalScalableVF as in:


if (any_of(ElementTypesInLoop, [] (const Type *Ty) { return !Ty->isVoidTy() && !TTI.isElementTypeLegalForScalableVector(Ty); })) {
  return ElementCount::getScalable(0);

Could you split out the non-functional change which adds collectAllElementTypesInLoop and changes getSmallestAndWidestTypes accordingly, and commit this as a separate (NFC) patch?

kmclaughlin marked 5 inline comments as done.
kmclaughlin edited the summary of this revision. (Show Details)

Addressing new review comments from @david-arm & @sdesmalen:

  • Removed the canWidenLoopWithScalableVectors function, moving the condition to disable scalable vectorization of loops with illegal types into getMaxLegalScalableVF instead.
  • Removed the changes to add getSmallestAndWidestTypes and the ElementTypesInLoop list from this patch.
  • Improved the CHECK lines in sve-illegal-type.ll.
sdesmalen accepted this revision.Mon, Jul 5, 1:56 PM

LGTM, thanks @kmclaughlin!


nit: unnecessary whitespace change?

This revision was landed with ongoing or failed builds.Tue, Jul 6, 5:33 AM
This revision was automatically updated to reflect the committed changes.
kmclaughlin marked 2 inline comments as done.
kmclaughlin added inline comments.Tue, Jul 6, 5:34 AM

The extra whitespace was introduced in D105437, which I removed before committing.