This is an archive of the discontinued LLVM Phabricator instance.

Differential D102253

[LV] Prevent vectorization with unsupported element types.
ClosedPublic

Authored by kmclaughlin on May 11 2021, 9:53 AM.

Details

Reviewers

sdesmalen
david-arm
CarolineConcatto
joechrisellis
craig.topper

Commits

rGa7512401e5a2: [LV] Prevent vectorization with unsupported element types.

Summary

This patch adds a TTI function, isElementTypeLegalForScalableVector, to query
whether it is possible to vectorize a given element type. This is called by
isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if
any of the instruction types in the loop are unsupported, e.g:

void foo(__int128_t *ptr, int N) {
  #pragma clang loop vectorize_width(4, scalable)
  for (int i = 0; i < N; ++i)
    ptr[i] = ptr[i] + 42;
}

This example currently crashes if we attempt to vectorize since i128 is not a
supported type for scalable vectorization.
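
The kind of query described above can be illustrated with a small standalone model. This is a sketch, not LLVM code: `Kind` and `ElemType` are hypothetical stand-ins for `llvm::Type`, the `hasBF16` flag stands in for a subtarget feature, and the accepted types follow the AArch64 implementation shown in the final diff of this revision.

```cpp
// Hypothetical stand-in for an LLVM scalar type; not the real llvm::Type.
enum class Kind { Integer, Half, Float, Double, BFloat, Pointer };

struct ElemType {
  Kind kind;
  unsigned bits; // only inspected for Kind::Integer
};

// Simplified legality query: can this element type live in a scalable vector?
// hasBF16 models an optional target feature (an assumption of this sketch).
bool isElementTypeLegalForScalableVector(const ElemType &ty, bool hasBF16) {
  switch (ty.kind) {
  case Kind::Pointer:
  case Kind::Half:
  case Kind::Float:
  case Kind::Double:
    return true;
  case Kind::BFloat:
    return hasBF16;
  case Kind::Integer:
    return ty.bits == 1 || ty.bits == 8 || ty.bits == 16 || ty.bits == 32 ||
           ty.bits == 64;
  }
  return false;
}
```

Under this model an i128 element, as in the `foo` example, is rejected, while i64 and the common floating-point types are accepted.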

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kmclaughlin created this revision.May 11 2021, 9:53 AM

Herald added subscribers: danielkiss, hiraditya, kristof.beyls, inglorion.May 11 2021, 9:53 AM

kmclaughlin requested review of this revision.May 11 2021, 9:53 AM

Herald added a project: Restricted Project.May 11 2021, 9:53 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B103772: Diff 344447.May 11 2021, 10:39 AM

sdesmalen added inline comments.May 11 2021, 2:06 PM

llvm/include/llvm/Analysis/TargetTransformInfo.h
1327	Should this instead be changed to `isLegalToVectorizeElementType`? (that would match the comment at least)
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1734–1735	is this needed? specifically, this function is never called if `Ty->isVoidTy`
1737	nit: unnecessary braces

@kmclaughlin I hope you don't mind me retitling your patch. I wanted to clarify that this patch and the new interface is not specific to AArch64/SVE.

sdesmalen added a reviewer: craig.topper.May 11 2021, 2:10 PM

craig.topper added inline comments.May 11 2021, 5:07 PM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1737	Does this change the behavior of fixed length vectorization of unsupported types when SVE is enabled even when SVE isn't being used for fixed vectors?

david-arm added inline comments.May 12 2021, 1:57 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1737	It looks like it - I wonder if it's possible to delay the check for legal types until we're dealing with a VF, and then we can pass the VF as an extra parameter to the function?
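
The concern in this exchange can be made concrete with a small model. The sketch below is not LLVM code: element types are reduced to integer bit widths, the set of legal widths is an assumption, and `isLegalToVectorizeElementType` is a hypothetical query that takes an `isScalable` flag, along the lines of the suggestion to pass the VF. Under those assumptions fixed-width vectorization of an unsupported type is left alone and only scalable factors are restricted.

```cpp
// Simplified model: an element type is just an integer bit width.
// Which widths a scalable vector can hold is an assumption for illustration.
bool isLegalScalableElementWidth(unsigned bits) {
  return bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

// Hypothetical query that takes the kind of vectorization factor into account.
// Fixed-width factors are never rejected here; only scalable ones are checked.
bool isLegalToVectorizeElementType(unsigned bits, bool isScalable) {
  if (!isScalable)
    return true;
  return isLegalScalableElementWidth(bits);
}
```

With this shape, a loop over i128 values would still be a candidate for fixed-width vectorization even when SVE is enabled.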

Hello. I thought that the idea was going to be that the costmodel returned an invalid cost for types/operations that could not be handled. And because it had an invalid cost, the vectorizer would then not vectorize with scalable factors? Is that not still the idea?

Moved the changes which add isLegalToVectorizeElementType into a separate patch (D102515)
Added canVectorizeInstructionTypes, which iterates through all instructions in the loop and queries whether it is legal to vectorize the element type.
Call canVectorizeInstructionTypes from getMaxLegalScalableVF to ensure we reject scalable vectorization if the loop contains any unsupported types.

In D102253#2753897, @dmgreen wrote:

Hello. I thought that the idea was going to be that the costmodel returned an invalid cost for types/operations that could not be handled. And because it had an invalid cost, the vectorizer would then not vectorize with scalable factors? Is that not still the idea?

Hi @dmgreen, thanks for taking a look at this! We should be returning invalid costs for operations that are not supported, though I think the idea is that we should also be trying to reject scalable vectorisation early in situations such as the one here so that we don't have to rely on what the cost model is returning. As we can also only cost operations that are legal to vectorise, if we get an invalid cost for an instruction that we considered legal, this suggests either the cost model is incomplete or legalisation was not strict enough.
I've added D102515 to ensure we are returning an invalid cost for memory ops with unsupported types.
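
The relationship between the two layers described in this reply can be sketched as follows. Everything here is hypothetical and for illustration only: `Cost` is a stand-in for a cost that may be invalid (it is not LLVM's `InstructionCost`), and `classify` is not a function in the vectorizer.

```cpp
// Stand-in for a cost that may be invalid; not LLVM's InstructionCost.
struct Cost {
  bool valid;
  unsigned value;
};

// Possible outcomes when an early legality check is combined with a cost query.
enum class Outcome { Vectorize, RejectedEarly, CostModelOrLegalityGap };

// `legal` is the result of the early legality check; `cost` is the answer
// from the cost model for the same operation.
Outcome classify(bool legal, Cost cost) {
  if (!legal)
    return Outcome::RejectedEarly; // the cost model is never consulted
  if (!cost.valid)
    // Considered legal but cannot be costed: either the cost model is
    // incomplete or the legality check was not strict enough.
    return Outcome::CostModelOrLegalityGap;
  return Outcome::Vectorize;
}
```

The middle case is the one the reply treats as a sign of a bug in one of the two layers rather than as a normal way of rejecting a loop.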

kmclaughlin added a parent revision: D102515: [CostModel] Return an invalid cost for memory ops with unsupported types.May 14 2021, 11:32 AM

Harbormaster completed remote builds in B104549: Diff 345506.May 14 2021, 11:45 AM

Matt added a subscriber: Matt.May 14 2021, 12:13 PM

joechrisellis added inline comments.May 17 2021, 1:43 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529–1539	I think this is equivalent to:
  bool canVectorizeInstructionTypes(Loop *TheLoop, ElementCount VF) {
    for (BasicBlock *BB : TheLoop->blocks())
      for (Instruction &I : BB->instructionsWithoutDebug()) {
        auto *Ty = I.getType();
        if (!Ty->isVoidTy() && !TTI.isLegalToVectorizeElementType(Ty, VF.isScalable()))
          return false;
      }
    return true;
  }
Has the nice side effect of getting rid of the double negative. 🙂
llvm/test/Transforms/LoopVectorize/sve-illegal-type.ll
1 ↗	(On Diff #345506)	nit: I usually see the triple being set with: target triple = "aarch64-linux-gnu" to simplify the `RUN` line. IMO in general it makes it a little nicer to run the test as a standalone thing rather than through `llvm-lit`. 🙂

david-arm added inline comments.May 17 2021, 1:48 AM

llvm/test/Transforms/LoopVectorize/sve-illegal-type.ll
1 ↗	(On Diff #345506)	I think many/most of our tests in llvm/test/Transforms/LoopVectorize/AArch64/ do it this way, i.e. specify the triple on the command line so @kmclaughlin's RUN line is more the norm here. But either way works! @kmclaughlin - since this is AArch64 specific can you move this to the AArch64 directory?

david-arm added inline comments.May 17 2021, 1:51 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529	I don't think we need to pass `TheLoop` in here, since I think it's a member of the class already?

Addressing review comments from @joechrisellis & @david-arm:

Removed IllegalTy from canVectorizeInstructionTypes() & stopped passing TheLoop to the function
Moved the test file to the AArch64 directory & removed the triple from its RUN line

Harbormaster completed remote builds in B104839: Diff 345895.May 17 2021, 9:44 AM

LGTM! Thanks for dealing with review comments!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529	nit: Could you add a simple comment above the function before committing? Something like /// Returns true if all types found in the loop are legal to vectorize. maybe?
llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll
2	nit: Since this test is now in the AArch64 directory I think you can replace "-force-target-supports.." with "-mattr=+sve"?

This revision is now accepted and ready to land.May 18 2021, 6:12 AM

fhahn added a subscriber: fhahn.May 18 2021, 6:15 AM

fhahn added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529	I think terminology used in other places is 'widening' instead of 'vectorizing', both in the cost model and codegen. Does vectorizing here mean something different? Also, it would be good to add a comment for the function.

david-arm added inline comments.May 18 2021, 6:41 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529	I guess they're the same? I hadn't deliberately chosen the word `vectorize` over `widen` to be honest. In this case I think we specifically want to know if the target has hardware support for vector instructions involving a given type because we cannot fall back on scalarisation. In general, is there a preference to using `Widen` instead of `Vectorize` in naming schemes and comments? It's just I do see other functions in this file use the word `Vectorize` in functions and class names so I guess it seemed natural to create functions with the word `Vectorize` in them.

fhahn added inline comments.May 19 2021, 3:15 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1529	I don't think there's a definitive answer. The current usage is not as clear cut as I thought. I was thinking about the terminology used for the various `widenInstruction`, `widenGEP`, `widenIntOrFpInduction`, `widenPHIInstruction` & co vs. `vectorizeInterleaveGroup`, which does not simply widen instructions, together with the widening decision terminology used in the cost model. In any case, a doc-comment would definitely be helpful. Another thing to consider is aligning the name with `TTI.isLegalToVectorizeElementType`, to something like `isLegalTo(Vectorize/Widen)InstructionType`.

sdesmalen mentioned this in D102541: [TTI] NFC: Change getRegUsageForType to return InstructionCost..May 20 2021, 3:44 AM

Renamed canVectorizeInstructionTypes to isLegalToVectorizeInstTypesForScalable & added a comment above it
Replaced -force-target-supports-scalable-vectors with -mattr=+sve in sve-illegal-type.ll

LGTM! Seems like outstanding review comments have been addressed. Thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1531	nit: I think perhaps you can just remove the `VF` argument now?

Harbormaster completed remote builds in B105913: Diff 347396.May 24 2021, 8:32 AM

sdesmalen mentioned this in D102515: [CostModel] Return an invalid cost for memory ops with unsupported types.May 25 2021, 3:36 AM

Moved the addition of isElementTypeLegalForScalableVector into this patch from D102515
Removed unused VF from isLegalToVectorizeInstTypesForScalable

kmclaughlin removed a parent revision: D102515: [CostModel] Return an invalid cost for memory ops with unsupported types.Jun 2 2021, 8:59 AM

Harbormaster completed remote builds in B107259: Diff 349292.Jun 2 2021, 10:01 AM

Merged the isElementTypeLegalForScalableVector and isLegalElementTypeForSVE functions. Since isElementTypeLegalForScalableVector does not check the VF anymore, there was no additional benefit to them being separate.

Harbormaster completed remote builds in B108222: Diff 350618.Jun 8 2021, 9:05 AM

sdesmalen added inline comments.Jun 8 2021, 9:58 AM

llvm/include/llvm/Analysis/TargetTransformInfo.h
1330	Missing doxygen comment.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
305	To reduce the diff, can you just rename isLegalElementTypeForSVE?
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1531	I'd like to suggest to have a name that's more generic: `canWidenLoopWithScalableVectors`, so that it can be reused for other purposes. D102394 and subsequently D101916 do that already. If you make a similar change as in D102394 where you move `canVectorizeReductions` into this function, then @david-arm can rebase his patch when he's back.

Renamed isLegalElementTypeForSVE to isElementTypeLegalForScalableVector, but kept the implementation in AArch64TargetTransformInfo.h
Renamed isLegalToVectorizeInstTypesForScalable to canWidenLoopWithScalableVectors and moved canVectorizeReductions into this function, similar to D102394

Harbormaster completed remote builds in B108410: Diff 350888.Jun 9 2021, 8:14 AM

sdesmalen added inline comments.Jun 29 2021, 2:50 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5677	Hi @kmclaughlin I tried out this patch and found there is a case that's missing, namely where the instruction is a `StoreInst`, which itself has `void` as its type, but then stores a loop invariant i128/f128 value to a variant address. Also, looking at this condition again, why is IntegerTy(1) handled differently here? For SVE we have predicate vectors, so it should be covered by isElementTypeLegalForScalableVector?
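
The missing store case can be shown with a small model. This is a sketch under stated assumptions, not the vectorizer's code: `Inst` is a hypothetical instruction record, bit width 0 stands for `void`, and the set of legal widths is chosen for illustration.

```cpp
#include <vector>

// Hypothetical, simplified instruction: bit width 0 stands for `void`.
struct Inst {
  unsigned resultBits;      // width of the value the instruction produces
  unsigned storedValueBits; // for stores: width of the stored value, else 0
};

// Assumed set of widths that a scalable vector can hold.
bool isLegalWidth(unsigned bits) {
  return bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

// Naive check: looks only at the result type and skips void results.
bool naiveLoopIsLegal(const std::vector<Inst> &loop) {
  for (const Inst &i : loop)
    if (i.resultBits != 0 && !isLegalWidth(i.resultBits))
      return false;
  return true;
}

// Store-aware check: for a store, inspect the stored value's type instead.
bool storeAwareLoopIsLegal(const std::vector<Inst> &loop) {
  for (const Inst &i : loop) {
    unsigned bits = i.storedValueBits != 0 ? i.storedValueBits : i.resultBits;
    if (bits != 0 && !isLegalWidth(bits))
      return false;
  }
  return true;
}
```

A loop whose only 128-bit value is a loop-invariant operand of a store passes the naive check, because the store itself has a `void` result, and is caught by the store-aware one.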

Split getSmallestAndWidestTypes into two functions. collectAllElementTypesInLoop now iterates over every instruction in the loop and collects a list of the element types found, which is used later by canWidenLoopWithScalableVectors to disable scalable vectorization if any of the types are illegal. getSmallestAndWidestTypes now only returns the min & max widths found in the list of element types.
Moved the Ty->isIntegerTy(1) check into isElementTypeLegalForScalableVector
Added a test which stores a loop-invariant i128 value to a variant address.
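
The split described in this update can be sketched as two steps: collect the element types once, then derive the smallest and widest width from that list. The code below is an illustration only. `MemInst` and `WidthRange` are hypothetical, types are reduced to bit widths, and reporting an empty list through a `valid` flag is a choice made for this sketch rather than what LoopVectorize.cpp does.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical memory instruction: only the element width matters here.
struct MemInst {
  unsigned elementBits;
};

// Result of the width query; `valid` is false when no types were collected.
struct WidthRange {
  bool valid;
  unsigned smallest;
  unsigned widest;
};

// Step 1: gather the distinct element widths found in the loop.
std::vector<unsigned>
collectAllElementTypesInLoop(const std::vector<MemInst> &loop) {
  std::vector<unsigned> widths;
  for (const MemInst &i : loop)
    if (std::find(widths.begin(), widths.end(), i.elementBits) == widths.end())
      widths.push_back(i.elementBits);
  return widths;
}

// Step 2: compute the smallest and widest width from the collected list.
WidthRange getSmallestAndWidestTypes(const std::vector<unsigned> &widths) {
  if (widths.empty())
    return {false, 0, 0};
  auto range = std::minmax_element(widths.begin(), widths.end());
  return {true, *range.first, *range.second};
}
```

The collected list can then serve a second consumer, such as a legality check over the same types, without walking the loop again.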

kmclaughlin added inline comments.Jun 29 2021, 10:31 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5677	Hi @sdesmalen, I've added a test for the missing store case which you found. The IntegerTy(1) here should be covered by `isElementTypeLegalForScalableVector` - I've moved this accordingly.

Harbormaster completed remote builds in B111566: Diff 355284.Jun 29 2021, 11:09 AM

david-arm added inline comments.Jun 30 2021, 12:40 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
221	@sdesmalen @kmclaughlin By doing this I think we now have a large test escape because we're essentially adding a whole new piece of functionality, i.e. telling the vectoriser we can also do masked loads/stores and gather/scatters with i1 types too. I think this means we now need to add a cost model for masked loads/store/gathers/scatter using i1 types, write vectoriser tests and ensure we get sensible codegen too?

Updated isLegalMaskedLoadStore & isLegalMaskedGatherScatter to ensure they do not return true for i1 types, now that isElementTypeLegalForScalableVector returns true for i1
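
The effect of this update can be sketched with integer widths alone. This is a simplified model, not the AArch64 TTI code: both functions take a bit width instead of a `Type *`, and the SVE feature checks and fixed-vector handling are left out.

```cpp
// Simplified model of the element-type query, restricted to integer widths.
// i1 is accepted, since predicate vectors can hold it.
bool isElementTypeLegalForScalableVector(unsigned bits) {
  return bits == 1 || bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

// Masked loads and stores keep rejecting i1 data, so accepting i1 above does
// not advertise a new capability to the vectorizer.
bool isLegalMaskedLoadStore(unsigned bits) {
  return bits != 1 && isElementTypeLegalForScalableVector(bits);
}
```

The same guard applies to the gather/scatter query, which in the diff uses an identical condition.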

Harbormaster completed remote builds in B111736: Diff 355524.Jun 30 2021, 7:07 AM

I think the new changes look good @kmclaughlin! Just a couple more minor comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6268	If we're relying upon the element types being collected for correctness here is it worth adding an assert: assert(!ElementTypesInLoop.empty() && "Unable to calculate smallest and widest types");
6277	Do we ever call this function more than once for a given loop? I wonder if it's worth clearing the list at the start just in case, i.e. ElementTypesInLoop.clear();

Rebased the patch & ensured the ElementTypesInLoop list is cleared at the start of collectAllElementTypesInLoop

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6268	Since we only add the element types of loads, stores and (reduction) phi nodes in collectAllElementTypesInLoop(), I think it is possible for ElementTypesInLoop to be empty here for a given loop. I tried adding the assert and did run into quite a few test failures with the change.
6277	I don't think we ever call this function more than once for each loop, though I think it's still worth clearing the list as you suggest.

Harbormaster completed remote builds in B111967: Diff 355856.Jul 1 2021, 7:23 AM

LGTM! Thanks for making all the changes @kmclaughlin. Can you address the nits before merging?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1531	nit: I think the comment probably needs updating now since we no longer return a remark. Maybe something like: /// scalable vectorization factor MaxVF. If the loop is illegal the function /// emits an appropriate error remark.
llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll
14	nit: Maybe best not to rely upon `%4` and `%5` here perhaps and use a named variable?
41	nit: Same thing as above about using `%4` and `%5`

I left a few more comments, but other than that, I'm mostly happy with the approach taken in this patch.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1308	nit: widening is
5661	Given the recent change in direction to use invalid costs to avoid vectorization with scalable vectors, this function should be quite limited in size/scope, so maybe it's better not to move this out into a separate function like you've done here, but just to add the extra condition to `getMaxLegalScalableVF`, i.e.:
  if (any_of(ElementTypesInLoop, [&](Type *Ty) {
        return !Ty->isVoidTy() && !TTI.isElementTypeLegalForScalableVector(Ty);
      })) {
    reportVectorizationInfo(...);
    return ElementCount::getScalable(0);
  }
9867	Could you split out the non-functional change which adds `collectAllElementTypesInLoop` and changes `getSmallestAndWidestTypes` accordingly, and commit this as a separate (NFC) patch?

Addressing new review comments from @david-arm & @sdesmalen:

Removed the canWidenLoopWithScalableVectors function, moving the condition to disable scalable vectorization of loops with illegal types into getMaxLegalScalableVF instead.
Removed the changes to add getSmallestAndWidestTypes and the ElementTypesInLoop list from this patch.
Improved the CHECK lines in sve-illegal-type.ll.
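
The final shape of the check, an early exit inside the function that computes the maximum scalable VF, can be sketched as below. This is a standalone model under stated assumptions: `ElementCount` is a hypothetical two-field struct rather than LLVM's class, types are reduced to bit widths, and the remark emitted by the real code is omitted.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a vectorization factor; not llvm::ElementCount.
struct ElementCount {
  unsigned minValue;
  bool scalable;
};

// Assumed set of element widths that a scalable vector can hold.
bool isElementTypeLegalForScalableVector(unsigned bits) {
  return bits == 1 || bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

// Returns a scalable factor of 0 (meaning "do not use scalable vectors") as
// soon as any collected element type is illegal, otherwise the given maximum.
ElementCount getMaxLegalScalableVF(const std::vector<unsigned> &elementWidths,
                                   unsigned maxVF) {
  bool hasIllegalType =
      std::any_of(elementWidths.begin(), elementWidths.end(), [](unsigned bits) {
        return !isElementTypeLegalForScalableVector(bits);
      });
  if (hasIllegalType)
    return {0, true};
  return {maxVF, true};
}
```

Keeping the condition inline leaves a single place that decides whether scalable vectorization is attempted at all.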

Harbormaster completed remote builds in B112457: Diff 356517.Jul 5 2021, 9:45 AM

kmclaughlin mentioned this in D105437: [LV] Collect a list of all element types found in the loop (NFC).Jul 5 2021, 9:49 AM

kmclaughlin added a parent revision: D105437: [LV] Collect a list of all element types found in the loop (NFC).

LGTM, thanks @kmclaughlin!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6263–6264	nit: unnecessary whitespace change?

kmclaughlin mentioned this in rG17b701c43ca6: [LV] Collect a list of all element types found in the loop (NFC).Jul 6 2021, 2:46 AM

This revision was landed with ongoing or failed builds.Jul 6 2021, 5:33 AM

Closed by commit rGa7512401e5a2: [LV] Prevent vectorization with unsupported element types. (authored by kmclaughlin).

This revision was automatically updated to reflect the committed changes.

kmclaughlin marked 2 inline comments as done.

kmclaughlin added a commit: rGa7512401e5a2: [LV] Prevent vectorization with unsupported element types..

kmclaughlin added inline comments.Jul 6 2021, 5:34 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6263–6264	The extra whitespace was introduced in D105437, which I removed before committing.

Revision Contents

Path	Size
llvm/include/llvm/Analysis/TargetTransformInfo.h	7 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h	2 lines
llvm/lib/Analysis/TargetTransformInfo.cpp	4 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h	10 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp	2 lines
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp	17 lines
llvm/test/Transforms/LoopVectorize/AArch64/scalable-reductions.ll	2 lines
llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll	106 lines

Diff 356690

llvm/include/llvm/Analysis/TargetTransformInfo.h

(1,318 lines not shown)
bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes, Align Alignment,		bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes, Align Alignment,
unsigned AddrSpace) const;		unsigned AddrSpace) const;

/// \returns True if it is legal to vectorize the given store chain.		/// \returns True if it is legal to vectorize the given store chain.
bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes, Align Alignment,		bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes, Align Alignment,
unsigned AddrSpace) const;		unsigned AddrSpace) const;

/// \returns True if it is legal to vectorize the given reduction kind.		/// \returns True if it is legal to vectorize the given reduction kind.
bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,		bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const;		ElementCount VF) const;

		/// \returns True if the given type is supported for scalable vectors
		bool isElementTypeLegalForScalableVector(Type *Ty) const;

/// \returns The new vector factor value if the target doesn't support \p		/// \returns The new vector factor value if the target doesn't support \p
/// SizeInBytes loads or has a better vector factor.		/// SizeInBytes loads or has a better vector factor.
unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,		unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const;		VectorType *VecTy) const;

/// \returns The new vector factor value if the target doesn't support \p		/// \returns The new vector factor value if the target doesn't support \p
/// SizeInBytes stores or has a better vector factor.		/// SizeInBytes stores or has a better vector factor.
(367 lines not shown; context: public:)
virtual bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,		virtual bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,
Align Alignment,		Align Alignment,
unsigned AddrSpace) const = 0;		unsigned AddrSpace) const = 0;
virtual bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,		virtual bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes,
Align Alignment,		Align Alignment,
unsigned AddrSpace) const = 0;		unsigned AddrSpace) const = 0;
virtual bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,		virtual bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const = 0;		ElementCount VF) const = 0;
		virtual bool isElementTypeLegalForScalableVector(Type *Ty) const = 0;
virtual unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,		virtual unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const = 0;		VectorType *VecTy) const = 0;
virtual unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		virtual unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const = 0;		VectorType *VecTy) const = 0;
virtual bool preferInLoopReduction(unsigned Opcode, Type *Ty,		virtual bool preferInLoopReduction(unsigned Opcode, Type *Ty,
ReductionFlags) const = 0;		ReductionFlags) const = 0;
(535 lines not shown; context: bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes, Align Alignment,)
unsigned AddrSpace) const override {		unsigned AddrSpace) const override {
return Impl.isLegalToVectorizeStoreChain(ChainSizeInBytes, Alignment,		return Impl.isLegalToVectorizeStoreChain(ChainSizeInBytes, Alignment,
AddrSpace);		AddrSpace);
}		}
bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,		bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const override {		ElementCount VF) const override {
return Impl.isLegalToVectorizeReduction(RdxDesc, VF);		return Impl.isLegalToVectorizeReduction(RdxDesc, VF);
}		}
		bool isElementTypeLegalForScalableVector(Type *Ty) const override {
		return Impl.isElementTypeLegalForScalableVector(Ty);
		}
unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,		unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const override {		VectorType *VecTy) const override {
return Impl.getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);		return Impl.getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);
}		}
unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const override {		VectorType *VecTy) const override {
(136 lines not shown)

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

(715 lines not shown; context: bool isLegalToVectorizeStoreChain(unsigned ChainSizeInBytes, Align Alignment,)
return true;		return true;
}		}

bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,		bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const {		ElementCount VF) const {
return true;		return true;
}		}

		bool isElementTypeLegalForScalableVector(Type *Ty) const { return true; }

unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,		unsigned getLoadVectorFactor(unsigned VF, unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const {		VectorType *VecTy) const {
return VF;		return VF;
}		}

unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,		unsigned getStoreVectorFactor(unsigned VF, unsigned StoreSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
(443 lines not shown)

llvm/lib/Analysis/TargetTransformInfo.cpp

(997 lines not shown; context: return TTIImpl->isLegalToVectorizeStoreChain(ChainSizeInBytes, Alignment,)
AddrSpace);		AddrSpace);
}		}

bool TargetTransformInfo::isLegalToVectorizeReduction(		bool TargetTransformInfo::isLegalToVectorizeReduction(
const RecurrenceDescriptor &RdxDesc, ElementCount VF) const {		const RecurrenceDescriptor &RdxDesc, ElementCount VF) const {
return TTIImpl->isLegalToVectorizeReduction(RdxDesc, VF);		return TTIImpl->isLegalToVectorizeReduction(RdxDesc, VF);
}		}

		bool TargetTransformInfo::isElementTypeLegalForScalableVector(Type *Ty) const {
		return TTIImpl->isElementTypeLegalForScalableVector(Ty);
		}

unsigned TargetTransformInfo::getLoadVectorFactor(unsigned VF,		unsigned TargetTransformInfo::getLoadVectorFactor(unsigned VF,
unsigned LoadSize,		unsigned LoadSize,
unsigned ChainSizeInBytes,		unsigned ChainSizeInBytes,
VectorType *VecTy) const {		VectorType *VecTy) const {
return TTIImpl->getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);		return TTIImpl->getLoadVectorFactor(VF, LoadSize, ChainSizeInBytes, VecTy);
}		}

unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,		unsigned TargetTransformInfo::getStoreVectorFactor(unsigned VF,
(435 lines not shown)

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h

(202 lines not shown; context: public:)
void getPeelingPreferences(Loop *L, ScalarEvolution &SE,		void getPeelingPreferences(Loop *L, ScalarEvolution &SE,
TTI::PeelingPreferences &PP);		TTI::PeelingPreferences &PP);

Value *getOrCreateResultFromMemIntrinsic(IntrinsicInst *Inst,		Value *getOrCreateResultFromMemIntrinsic(IntrinsicInst *Inst,
Type *ExpectedType);		Type *ExpectedType);

bool getTgtMemIntrinsic(IntrinsicInst *Inst, MemIntrinsicInfo &Info);		bool getTgtMemIntrinsic(IntrinsicInst *Inst, MemIntrinsicInfo &Info);

bool isLegalElementTypeForSVE(Type *Ty) const {		bool isElementTypeLegalForScalableVector(Type *Ty) const {
if (Ty->isPointerTy())		if (Ty->isPointerTy())
return true;		return true;

if (Ty->isBFloatTy() && ST->hasBF16())		if (Ty->isBFloatTy() && ST->hasBF16())
return true;		return true;

if (Ty->isHalfTy() || Ty->isFloatTy() || Ty->isDoubleTy())		if (Ty->isHalfTy() || Ty->isFloatTy() || Ty->isDoubleTy())
return true;		return true;

if (Ty->isIntegerTy(8) || Ty->isIntegerTy(16) ||		if (Ty->isIntegerTy(1) || Ty->isIntegerTy(8) || Ty->isIntegerTy(16) ||
Ty->isIntegerTy(32) || Ty->isIntegerTy(64))		Ty->isIntegerTy(32) || Ty->isIntegerTy(64))
return true;		return true;

return false;		return false;
}		}

bool isLegalMaskedLoadStore(Type *DataType, Align Alignment) {		bool isLegalMaskedLoadStore(Type *DataType, Align Alignment) {
if (!ST->hasSVE())		if (!ST->hasSVE())
return false;		return false;

// For fixed vectors, avoid scalarization if using SVE for them.		// For fixed vectors, avoid scalarization if using SVE for them.
if (isa<FixedVectorType>(DataType) && !ST->useSVEForFixedLengthVectors())		if (isa<FixedVectorType>(DataType) && !ST->useSVEForFixedLengthVectors())
return false; // Fall back to scalarization of masked operations.		return false; // Fall back to scalarization of masked operations.

return isLegalElementTypeForSVE(DataType->getScalarType());		return !DataType->getScalarType()->isIntegerTy(1) &&
		isElementTypeLegalForScalableVector(DataType->getScalarType());
}		}

bool isLegalMaskedLoad(Type *DataType, Align Alignment) {		bool isLegalMaskedLoad(Type *DataType, Align Alignment) {
return isLegalMaskedLoadStore(DataType, Alignment);		return isLegalMaskedLoadStore(DataType, Alignment);
}		}

bool isLegalMaskedStore(Type *DataType, Align Alignment) {		bool isLegalMaskedStore(Type *DataType, Align Alignment) {
return isLegalMaskedLoadStore(DataType, Alignment);		return isLegalMaskedLoadStore(DataType, Alignment);
}		}

bool isLegalMaskedGatherScatter(Type *DataType) const {		bool isLegalMaskedGatherScatter(Type *DataType) const {
if (!ST->hasSVE())		if (!ST->hasSVE())
return false;		return false;

// For fixed vectors, scalarize if not using SVE for them.		// For fixed vectors, scalarize if not using SVE for them.
auto *DataTypeFVTy = dyn_cast<FixedVectorType>(DataType);		auto *DataTypeFVTy = dyn_cast<FixedVectorType>(DataType);
if (DataTypeFVTy && (!ST->useSVEForFixedLengthVectors() ||		if (DataTypeFVTy && (!ST->useSVEForFixedLengthVectors() ||
DataTypeFVTy->getNumElements() < 2))		DataTypeFVTy->getNumElements() < 2))
return false;		return false;

return isLegalElementTypeForSVE(DataType->getScalarType());		return !DataType->getScalarType()->isIntegerTy(1) &&
		isElementTypeLegalForScalableVector(DataType->getScalarType());
}		}

bool isLegalMaskedGather(Type *DataType, Align Alignment) const {		bool isLegalMaskedGather(Type *DataType, Align Alignment) const {
return isLegalMaskedGatherScatter(DataType);		return isLegalMaskedGatherScatter(DataType);
}		}
bool isLegalMaskedScatter(Type *DataType, Align Alignment) const {		bool isLegalMaskedScatter(Type *DataType, Align Alignment) const {
return isLegalMaskedGatherScatter(DataType);		return isLegalMaskedGatherScatter(DataType);
}		}
(29 lines not shown; context: public:)
bool shouldExpandReduction(const IntrinsicInst *II) const { return false; }		bool shouldExpandReduction(const IntrinsicInst *II) const { return false; }

unsigned getGISelRematGlobalCost() const {		unsigned getGISelRematGlobalCost() const {
return 2;		return 2;
}		}

bool supportsScalableVectors() const { return ST->hasSVE(); }		bool supportsScalableVectors() const { return ST->hasSVE(); }

bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,		bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const;		ElementCount VF) const;

InstructionCost getArithmeticReductionCost(		InstructionCost getArithmeticReductionCost(
unsigned Opcode, VectorType *Ty, bool IsPairwiseForm,		unsigned Opcode, VectorType *Ty, bool IsPairwiseForm,
TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput);		TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput);

InstructionCost getShuffleCost(TTI::ShuffleKind Kind, VectorType *Tp,		InstructionCost getShuffleCost(TTI::ShuffleKind Kind, VectorType *Tp,
ArrayRef<int> Mask, int Index,		ArrayRef<int> Mask, int Index,
VectorType *SubTp);		VectorType *SubTp);
/// @}		/// @}
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_LIB_TARGET_AARCH64_AARCH64TARGETTRANSFORMINFO_H		#endif // LLVM_LIB_TARGET_AARCH64_AARCH64TARGETTRANSFORMINFO_H

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

Show First 20 Lines • Show All 1,725 Lines • ▼ Show 20 Lines	if (const GetElementPtrInst *GEPInst = dyn_cast<GetElementPtrInst>(U)) {
break;		break;
}		}
}		}
}		}
return Considerable;		return Considerable;
}		}

bool AArch64TTIImpl::isLegalToVectorizeReduction(		bool AArch64TTIImpl::isLegalToVectorizeReduction(
const RecurrenceDescriptor &RdxDesc, ElementCount VF) const {		const RecurrenceDescriptor &RdxDesc, ElementCount VF) const {
if (!VF.isScalable())		if (!VF.isScalable())
		sdesmalen (Done): Is this needed? Specifically, this function is never called if `Ty->isVoidTy`.
return true;		return true;

		sdesmalen (Done): nit: unnecessary braces
		craig.topper (Not Done): Does this change the behavior of fixed length vectorization of unsupported types when SVE is enabled even when SVE isn't being used for fixed vectors?
		david-arm (Done): It looks like it - I wonder if it's possible to delay the check for legal types until we're dealing with a VF, and then we can pass the VF as an extra parameter to the function?
Type *Ty = RdxDesc.getRecurrenceType();		Type *Ty = RdxDesc.getRecurrenceType();
if (Ty->isBFloatTy() || !isLegalElementTypeForSVE(Ty))		if (Ty->isBFloatTy() || !isElementTypeLegalForScalableVector(Ty))
return false;		return false;

switch (RdxDesc.getRecurrenceKind()) {		switch (RdxDesc.getRecurrenceKind()) {
case RecurKind::Add:		case RecurKind::Add:
case RecurKind::FAdd:		case RecurKind::FAdd:
case RecurKind::And:		case RecurKind::And:
case RecurKind::Or:		case RecurKind::Or:
case RecurKind::Xor:		case RecurKind::Xor:
▲ Show 20 Lines • Show All 302 Lines • Show Last 20 Lines
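
To make the shape of the target hook concrete, here is a minimal standalone sketch of an element-type query and of the reduction check that defers to it. It does not use LLVM: `Kind`, `ScalarType`, the `hasBF16` parameter and `isLegalReductionType` are hypothetical stand-ins, and the set of accepted bit widths is an assumption made for illustration rather than something copied from the patch. The only behavior taken from this review is that i128 and fp128 are rejected, and that reductions reject bfloat before consulting the element-type query.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Type; only what the sketch needs.
enum class Kind { Void, Integer, Float, BFloat, Pointer };

struct ScalarType {
  Kind kind;
  unsigned bits; // 0 for Void
};

// Toy element-type query. The accepted widths below are an assumption
// for illustration; the real hook lives in AArch64TTIImpl.
inline bool isElementTypeLegalForScalableVector(ScalarType ty, bool hasBF16) {
  switch (ty.kind) {
  case Kind::Pointer:
    return true;
  case Kind::BFloat:
    return hasBF16;
  case Kind::Float:
    return ty.bits == 16 || ty.bits == 32 || ty.bits == 64;
  case Kind::Integer:
    return ty.bits == 1 || ty.bits == 8 || ty.bits == 16 ||
           ty.bits == 32 || ty.bits == 64;
  case Kind::Void:
    return false;
  }
  return false;
}

// Mirrors the first test in isLegalToVectorizeReduction above: bfloat is
// rejected outright, everything else defers to the element-type query.
// The switch over recurrence kinds that follows in the diff is omitted.
inline bool isLegalReductionType(ScalarType ty, bool hasBF16) {
  assert(ty.kind != Kind::Void && "a reduction always has a value type");
  return ty.kind != Kind::BFloat &&
         isElementTypeLegalForScalableVector(ty, hasBF16);
}
```

In this sketch a 128-bit integer or float is rejected regardless of `hasBF16`, which matches the outcome the new tests expect for i128 and fp128 loops.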

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 1,299 Lines • ▼ Show 20 Lines	public:
/// \return Returns information about the register usages of the loop for the		/// \return Returns information about the register usages of the loop for the
/// given vectorization factors.		/// given vectorization factors.
SmallVector<RegisterUsage, 8>		SmallVector<RegisterUsage, 8>
calculateRegisterUsage(ArrayRef<ElementCount> VFs);		calculateRegisterUsage(ArrayRef<ElementCount> VFs);

/// Collect values we want to ignore in the cost model.		/// Collect values we want to ignore in the cost model.
void collectValuesToIgnore();		void collectValuesToIgnore();

/// Collect all element types in the loop for which widening is needed.		/// Collect all element types in the loop for which widening is needed.
		sdesmalen (Done): nit: widening is
void collectElementTypesForWidening();		void collectElementTypesForWidening();

/// Split reductions into those that happen in the loop, and those that happen		/// Split reductions into those that happen in the loop, and those that happen
/// outside. In loop reductions are collected into InLoopReductionChains.		/// outside. In loop reductions are collected into InLoopReductionChains.
void collectInLoopReductions();		void collectInLoopReductions();

/// Returns true if we should use strict in-order reductions for the given		/// Returns true if we should use strict in-order reductions for the given
/// RdxDesc. This is true if the -enable-strict-reductions flag is passed,		/// RdxDesc. This is true if the -enable-strict-reductions flag is passed,
▲ Show 20 Lines • Show All 197 Lines • ▼ Show 20 Lines	bool isLegalGatherOrScatter(Value *V) {
auto *Ty = getLoadStoreType(V);		auto *Ty = getLoadStoreType(V);
Align Align = getLoadStoreAlignment(V);		Align Align = getLoadStoreAlignment(V);
return (LI && TTI.isLegalMaskedGather(Ty, Align)) ||		return (LI && TTI.isLegalMaskedGather(Ty, Align)) ||
(SI && TTI.isLegalMaskedScatter(Ty, Align));		(SI && TTI.isLegalMaskedScatter(Ty, Align));
}		}

/// Returns true if the target machine supports all of the reduction		/// Returns true if the target machine supports all of the reduction
/// variables found for the given VF.		/// variables found for the given VF.
bool canVectorizeReductions(ElementCount VF) {		bool canVectorizeReductions(ElementCount VF) const {
return (all_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {		return (all_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {
const RecurrenceDescriptor &RdxDesc = Reduction.second;		const RecurrenceDescriptor &RdxDesc = Reduction.second;
return TTI.isLegalToVectorizeReduction(RdxDesc, VF);		return TTI.isLegalToVectorizeReduction(RdxDesc, VF);
}));		}));
}		}

/// Returns true if \p I is an instruction that will be scalarized with		/// Returns true if \p I is an instruction that will be scalarized with
		david-arm (Done): I don't think we need to pass `TheLoop` in here, since I think it's a member of the class already?
		david-arm (Not Done): nit: Could you add a simple comment above the function before committing? Something like "/// Returns true if all types found in the loop are legal to vectorize." maybe?
		fhahn (Not Done): I think terminology used in other places is 'widening' instead of 'vectorizing', both in the cost model and codegen. Does vectorizing here mean something different? Also would be good to add a comment for the function.
		david-arm (Not Done): I guess they're the same? I hadn't deliberately chosen the word `vectorize` over `widen` to be honest. In this case I think we specifically want to know if the target has hardware support for vector instructions involving a given type because we cannot fall back on scalarisation. In general, is there a preference to using `Widen` instead of `Vectorize` in naming schemes and comments? It's just I do see other functions in this file use the word `Vectorize` in functions and class names so I guess it seemed natural to create functions with the word `Vectorize` in them.
		fhahn (Not Done): I don't think there's a definitive answer. The current usage is not as clear cut as I thought. I was thinking about the terminology used for the various `widenInstruction`, `widenGEP`, `widenIntOrFpInduction`, `widenPHIInstruction` & co vs. `vectorizeInterleaveGroup`, which does not simply widen instructions, together with the widening decision terminology used in the cost model. In any case, a doc-comment would definitely be helpful. Another thing to consider is aligning the name with `TTI.isLegalToVectorizeElementType`, to something like `isLegalTo(Vectorize/Widen)InstructionType`.
/// predication. Such instructions include conditional stores and		/// predication. Such instructions include conditional stores and
/// instructions that may divide by zero.		/// instructions that may divide by zero.
		david-arm (Done): nit: I think perhaps you can just remove the `VF` argument now?
		sdesmalen (Not Done): I'd like to suggest to have a name that's more generic: `canWidenLoopWithScalableVectors`, so that it can be reused for other purposes. D102394 and subsequently D101916 do that already. If you make a similar change as in D102394 where you move `canVectorizeReductions` into this function, then @david-arm can rebase his patch when he's back.
		david-arm (Done): nit: I think the comment probably needs updating now since we no longer return a remark. Maybe something like:
		    /// scalable vectorization factor MaxVF. If the loop is illegal the function
		    /// emits an appropriate error remark.
/// If a non-zero VF has been calculated, we check if I will be scalarized		/// If a non-zero VF has been calculated, we check if I will be scalarized
/// predication for that VF.		/// predication for that VF.
bool isScalarWithPredication(Instruction *I) const;		bool isScalarWithPredication(Instruction *I) const;

// Returns true if \p I is an instruction that will be predicated either		// Returns true if \p I is an instruction that will be predicated either
// through scalar predication or masked load/store or masked gather/scatter.		// through scalar predication or masked load/store or masked gather/scatter.
// Superset of instructions that return true for isScalarWithPredication.		// Superset of instructions that return true for isScalarWithPredication.
bool isPredicatedInst(Instruction *I) {		bool isPredicatedInst(Instruction *I) {
		joechrisellis (Done): I think this is equivalent to:
		    bool canVectorizeInstructionTypes(Loop *TheLoop, ElementCount VF) {
		      for (BasicBlock *BB : TheLoop->blocks())
		        for (Instruction &I : BB->instructionsWithoutDebug()) {
		          auto *Ty = I.getType();
		          if (!Ty->isVoidTy() &&
		              !TTI.isLegalToVectorizeElementType(Ty, VF.isScalable()))
		            return false;
		        }
		      return true;
		    }
		Has the nice side effect of getting rid of the double negative. 🙂
if (!blockNeedsPredication(I->getParent()))		if (!blockNeedsPredication(I->getParent()))
return false;		return false;
// Loads and stores that need some form of masked operation are predicated		// Loads and stores that need some form of masked operation are predicated
// instructions.		// instructions.
if (isa<LoadInst>(I) || isa<StoreInst>(I))		if (isa<LoadInst>(I) || isa<StoreInst>(I))
return Legal->isMaskRequired(I);		return Legal->isMaskRequired(I);
return isScalarWithPredication(I);		return isScalarWithPredication(I);
}		}
▲ Show 20 Lines • Show All 4,105 Lines • ▼ Show 20 Lines	reportVectorizationFailure("Runtime stride check for small trip count",
"this loop without such check by compiling with -Os/-Oz",		"this loop without such check by compiling with -Os/-Oz",
"CantVersionLoopWithOptForSize", ORE, TheLoop);		"CantVersionLoopWithOptForSize", ORE, TheLoop);
return true;		return true;
}		}

return false;		return false;
}		}

ElementCount		ElementCount
		sdesmalen (Done): Given the recent change in direction to use invalid costs to avoid vectorization with scalable vectors, this function should be quite limited in size/scope, so maybe it's better not moving this out into a separate function like you've done here, but just add the extra condition to `getMaxLegalScalableVF` as in:
		    if (any_of(ElementTypesInLoop, [](const Type *Ty) {
		          return !Ty->isVoidTy() &&
		                 !TTI.isElementTypeLegalForScalableVector(Ty);
		        })) {
		      reportVectorizationInfo(...);
		      return ElementCount::getScalable(0);
		    }
LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {		LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors) {		if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors) {
reportVectorizationInfo(		reportVectorizationInfo(
"Disabling scalable vectorization, because target does not "		"Disabling scalable vectorization, because target does not "
"support scalable vectors.",		"support scalable vectors.",
"ScalableVectorsUnsupported", ORE, TheLoop);		"ScalableVectorsUnsupported", ORE, TheLoop);
return ElementCount::getScalable(0);		return ElementCount::getScalable(0);
}		}

if (Hints->isScalableVectorizationDisabled()) {		if (Hints->isScalableVectorizationDisabled()) {
reportVectorizationInfo("Scalable vectorization is explicitly disabled",		reportVectorizationInfo("Scalable vectorization is explicitly disabled",
"ScalableVectorizationDisabled", ORE, TheLoop);		"ScalableVectorizationDisabled", ORE, TheLoop);
return ElementCount::getScalable(0);		return ElementCount::getScalable(0);
}		}

auto MaxScalableVF = ElementCount::getScalable(		auto MaxScalableVF = ElementCount::getScalable(
		sdesmalen (Done): Hi @kmclaughlin I tried out this patch and found there is a case that's missing, namely where the instruction is a `StoreInst`, which itself has `void` as its type, but then stores a loop invariant i128/f128 value to a variant address. Also, looking at this condition again, why is IntegerTy(1) handled differently here? For SVE we have predicate vectors, so it should be covered by isElementTypeLegalForScalableVector?
		kmclaughlin (Author, Done): Hi @sdesmalen, I've added a test for the missing store case which you found. The IntegerTy(1) here should be covered by `isElementTypeLegalForScalableVector` - I've moved this accordingly.
std::numeric_limits<ElementCount::ScalarTy>::max());		std::numeric_limits<ElementCount::ScalarTy>::max());

// Disable scalable vectorization if the loop contains unsupported reductions.
// Test that the loop-vectorizer can legalize all operations for this MaxVF.		// Test that the loop-vectorizer can legalize all operations for this MaxVF.
// FIXME: While for scalable vectors this is currently sufficient, this should		// FIXME: While for scalable vectors this is currently sufficient, this should
// be replaced by a more detailed mechanism that filters out specific VFs,		// be replaced by a more detailed mechanism that filters out specific VFs,
// instead of invalidating vectorization for a whole set of VFs based on the		// instead of invalidating vectorization for a whole set of VFs based on the
// MaxVF.		// MaxVF.

		// Disable scalable vectorization if the loop contains unsupported reductions.
if (!canVectorizeReductions(MaxScalableVF)) {		if (!canVectorizeReductions(MaxScalableVF)) {
reportVectorizationInfo(		reportVectorizationInfo(
"Scalable vectorization not supported for the reduction "		"Scalable vectorization not supported for the reduction "
"operations found in this loop.",		"operations found in this loop.",
"ScalableVFUnfeasible", ORE, TheLoop);		"ScalableVFUnfeasible", ORE, TheLoop);
return ElementCount::getScalable(0);		return ElementCount::getScalable(0);
}		}

		// Disable scalable vectorization if the loop contains any instructions
		// with element types not supported for scalable vectors.
		if (any_of(ElementTypesInLoop, [&](Type *Ty) {
		return !Ty->isVoidTy() &&
		!this->TTI.isElementTypeLegalForScalableVector(Ty);
		})) {
		reportVectorizationInfo("Scalable vectorization is not supported "
		"for all element types found in this loop.",
		"ScalableVFUnfeasible", ORE, TheLoop);
		return ElementCount::getScalable(0);
		}

if (Legal->isSafeForAnyVectorWidth())		if (Legal->isSafeForAnyVectorWidth())
return MaxScalableVF;		return MaxScalableVF;

// Limit MaxScalableVF by the maximum safe dependence distance.		// Limit MaxScalableVF by the maximum safe dependence distance.
Optional<unsigned> MaxVScale = TTI.getMaxVScale();		Optional<unsigned> MaxVScale = TTI.getMaxVScale();
MaxScalableVF = ElementCount::getScalable(		MaxScalableVF = ElementCount::getScalable(
MaxVScale ? (MaxSafeElements / MaxVScale.getValue()) : 0);		MaxVScale ? (MaxSafeElements / MaxVScale.getValue()) : 0);
if (!MaxScalableVF)		if (!MaxScalableVF)
▲ Show 20 Lines • Show All 545 Lines • ▼ Show 20 Lines	LoopVectorizationCostModel::selectEpilogueVectorizationFactor(
return Result;		return Result;
}		}

std::pair<unsigned, unsigned>		std::pair<unsigned, unsigned>
LoopVectorizationCostModel::getSmallestAndWidestTypes() {		LoopVectorizationCostModel::getSmallestAndWidestTypes() {
unsigned MinWidth = -1U;		unsigned MinWidth = -1U;
unsigned MaxWidth = 8;		unsigned MaxWidth = 8;
const DataLayout &DL = TheFunction->getParent()->getDataLayout();		const DataLayout &DL = TheFunction->getParent()->getDataLayout();
for (Type *T : ElementTypesInLoop) {		for (Type *T : ElementTypesInLoop) {
		david-arm (Not Done): If we're relying upon the element types being collected for correctness here is it worth adding an assert:
		    assert(!ElementTypesInLoop.empty() && "Unable to calculate smallest and widest types");
		kmclaughlin (Author, Not Done): Since we only add the element types of loads, stores and (reduction) phi nodes in collectAllElementTypesInLoop(), I think it is possible for ElementTypesInLoop to be empty here for a given loop. I tried adding the assert and did run into quite a few test failures with the change.
MinWidth = std::min<unsigned>(		MinWidth = std::min<unsigned>(
MinWidth, DL.getTypeSizeInBits(T->getScalarType()).getFixedSize());		MinWidth, DL.getTypeSizeInBits(T->getScalarType()).getFixedSize());
MaxWidth = std::max<unsigned>(		MaxWidth = std::max<unsigned>(
MaxWidth, DL.getTypeSizeInBits(T->getScalarType()).getFixedSize());		MaxWidth, DL.getTypeSizeInBits(T->getScalarType()).getFixedSize());
}		}
return {MinWidth, MaxWidth};		return {MinWidth, MaxWidth};
}		}
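
The exchange above about the assert turns on what this function returns when no element types were collected. The following standalone sketch reproduces only the min/max arithmetic, with plain bit widths in place of `Type *` and `DataLayout`; the name `smallestAndWidest` and its `std::vector<unsigned>` parameter are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Same arithmetic as getSmallestAndWidestTypes, but over plain bit widths.
// Starts from MinWidth = -1U and MaxWidth = 8, exactly as in the diff.
inline std::pair<unsigned, unsigned>
smallestAndWidest(const std::vector<unsigned> &elementBits) {
  unsigned minWidth = -1U; // wraps to the largest unsigned value
  unsigned maxWidth = 8;
  for (unsigned bits : elementBits) {
    assert(bits > 0 && "element types are expected to be sized");
    minWidth = std::min(minWidth, bits);
    maxWidth = std::max(maxWidth, bits);
  }
  return {minWidth, maxWidth};
}
```

With an empty input the loop body never runs, so the result is the pair of initial values rather than a failure. That is consistent with the author's observation that an assert on non-emptiness trips in existing tests.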

void LoopVectorizationCostModel::collectElementTypesForWidening() {		void LoopVectorizationCostModel::collectElementTypesForWidening() {
		sdesmalen (Done): nit: unnecessary whitespace change?
		kmclaughlin (Author, Done): The extra whitespace was introduced in D105437, which I removed before committing.
		david-arm (Done): Do we ever call this function more than once for a given loop? I wonder if it's worth clearing the list at the start just in case, i.e. ElementTypesInLoop.clear();
		kmclaughlin (Author, Done): I don't think we ever call this function more than once for each loop, though I think it's still worth clearing the list as you suggest.
ElementTypesInLoop.clear();		ElementTypesInLoop.clear();
// For each block.		// For each block.
for (BasicBlock *BB : TheLoop->blocks()) {		for (BasicBlock *BB : TheLoop->blocks()) {
// For each instruction in the loop.		// For each instruction in the loop.
for (Instruction &I : BB->instructionsWithoutDebug()) {		for (Instruction &I : BB->instructionsWithoutDebug()) {
Type *T = I.getType();		Type *T = I.getType();

// Skip ignored values.		// Skip ignored values.
▲ Show 20 Lines • Show All 3,573 Lines • ▼ Show 20 Lines	static bool processLoopInVPlanNativePath(
// TODO: CM is not used at this point inside the planner. Turn CM into an		// TODO: CM is not used at this point inside the planner. Turn CM into an
// optional argument if we don't need it in the future.		// optional argument if we don't need it in the future.
LoopVectorizationPlanner LVP(L, LI, TLI, TTI, LVL, CM, IAI, PSE, Hints,		LoopVectorizationPlanner LVP(L, LI, TLI, TTI, LVL, CM, IAI, PSE, Hints,
Requirements, ORE);		Requirements, ORE);

// Get user vectorization factor.		// Get user vectorization factor.
ElementCount UserVF = Hints.getWidth();		ElementCount UserVF = Hints.getWidth();

CM.collectElementTypesForWidening();		CM.collectElementTypesForWidening();
		sdesmalen (Done): Could you split out the non-functional change which adds `collectAllElementTypesInLoop` and changes `getSmallestAndWidestTypes` accordingly, and commit this as a separate (NFC) patch?

// Plan how to best vectorize, return the best VF and its cost.		// Plan how to best vectorize, return the best VF and its cost.
const VectorizationFactor VF = LVP.planInVPlanNativePath(UserVF);		const VectorizationFactor VF = LVP.planInVPlanNativePath(UserVF);

// If we are stress testing VPlan builds, do not attempt to generate vector		// If we are stress testing VPlan builds, do not attempt to generate vector
// code. Masked vector code generation support will follow soon.		// code. Masked vector code generation support will follow soon.
// Also, do not attempt to vectorize if no vector code will be produced.		// Also, do not attempt to vectorize if no vector code will be produced.
if (VPlanBuildStressTest || EnableVPlanPredication ||		if (VPlanBuildStressTest || EnableVPlanPredication ||
▲ Show 20 Lines • Show All 535 Lines • Show Last 20 Lines
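
As a rough illustration of the element-type gate added to `getMaxLegalScalableVF`, the sketch below applies the same `any_of` test to a list of collected types and returns 0 when any non-void type is rejected. It is not LLVM code: `ElemType`, `maxLegalScalableVF`, the `isLegal` callback and the `remark` out-parameter are hypothetical names standing in for `Type *`, the cost-model method, the TTI hook and `reportVectorizationInfo`. Only the remark text is taken from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an entry of ElementTypesInLoop.
struct ElemType {
  std::string name;
  bool isVoid;
};

// Returns maxVF when every non-void element type passes isLegal, and 0
// (no scalable vectorization) otherwise. On rejection, *remark receives
// the remark text used by the patch.
inline unsigned maxLegalScalableVF(
    const std::vector<ElemType> &elementTypesInLoop,
    const std::function<bool(const ElemType &)> &isLegal,
    unsigned maxVF, std::string *remark) {
  assert(remark != nullptr && "caller must provide a remark buffer");
  const bool hasIllegalType = std::any_of(
      elementTypesInLoop.begin(), elementTypesInLoop.end(),
      [&](const ElemType &ty) { return !ty.isVoid && !isLegal(ty); });
  if (hasIllegalType) {
    *remark = "Scalable vectorization is not supported for all element "
              "types found in this loop.";
    return 0;
  }
  return maxVF;
}
```

Because the gate only zeroes the scalable VF, a fixed-width factor can still be chosen afterwards, which is what the `loop_fixed_width_i128` test checks.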

llvm/test/Transforms/LoopVectorize/AArch64/scalable-reductions.ll

	; RUN: opt < %s -loop-vectorize -pass-remarks=loop-vectorize -pass-remarks-analysis=loop-vectorize -pass-remarks-missed=loop-vectorize -mtriple aarch64-unknown-linux-gnu -mattr=+sve -S -scalable-vectorization=on 2>%t | FileCheck %s -check-prefix=CHECK			; RUN: opt < %s -loop-vectorize -pass-remarks=loop-vectorize -pass-remarks-analysis=loop-vectorize -pass-remarks-missed=loop-vectorize -mtriple aarch64-unknown-linux-gnu -mattr=+sve,+bf16 -S -scalable-vectorization=on 2>%t | FileCheck %s -check-prefix=CHECK
	; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARK			; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARK

	; Reduction can be vectorized			; Reduction can be vectorized

	; ADD			; ADD

	; CHECK-REMARK: vectorized loop (vectorization width: vscale x 8, interleaved count: 2)			; CHECK-REMARK: vectorized loop (vectorization width: vscale x 8, interleaved count: 2)
	define i32 @add(i32* nocapture %a, i32* nocapture readonly %b, i64 %n) {			define i32 @add(i32* nocapture %a, i32* nocapture readonly %b, i64 %n) {
	▲ Show 20 Lines • Show All 391 Lines • Show Last 20 Lines

llvm/test/Transforms/LoopVectorize/AArch64/sve-illegal-type.ll

This file was added.

				; RUN: opt < %s -loop-vectorize -scalable-vectorization=on -mattr=+sve -force-vector-width=4 -pass-remarks-analysis=loop-vectorize -S 2>%t | FileCheck %s
				; RUN: cat %t | FileCheck %s -check-prefix=CHECK-REMARKS
				david-arm (Done): nit: Since this test is now in the AArch64 directory I think you can replace "-force-target-supports.." with "-mattr=+sve"?
				target triple = "aarch64-linux-gnu"

				; CHECK-REMARKS: Scalable vectorization is not supported for all element types found in this loop
				define dso_local void @loop_sve_i128(i128* nocapture %ptr, i64 %N) {
				; CHECK-LABEL: @loop_sve_i128
				; CHECK: vector.body
				; CHECK: %[[LOAD1:.*]] = load i128, i128* {{.*}}
				; CHECK-NEXT: %[[LOAD2:.*]] = load i128, i128* {{.*}}
				; CHECK-NEXT: %[[ADD1:.*]] = add nsw i128 %[[LOAD1]], 42
				; CHECK-NEXT: %[[ADD2:.*]] = add nsw i128 %[[LOAD2]], 42
				; CHECK-NEXT: store i128 %[[ADD1]], i128* {{.*}}
				; CHECK-NEXT: store i128 %[[ADD2]], i128* {{.*}}
				david-arm (Done): nit: Maybe best not to rely upon `%4` and `%5` here perhaps and use a named variable?
				entry:
				br label %for.body

				for.body:
				%iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
				%arrayidx = getelementptr inbounds i128, i128* %ptr, i64 %iv
				%0 = load i128, i128* %arrayidx, align 16
				%add = add nsw i128 %0, 42
				store i128 %add, i128* %arrayidx, align 16
				%iv.next = add i64 %iv, 1
				%exitcond.not = icmp eq i64 %iv.next, %N
				br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0

				for.end:
				ret void
				}

				; CHECK-REMARKS: Scalable vectorization is not supported for all element types found in this loop
				define dso_local void @loop_sve_f128(fp128* nocapture %ptr, i64 %N) {
				; CHECK-LABEL: @loop_sve_f128
				; CHECK: vector.body
				; CHECK: %[[LOAD1:.*]] = load fp128, fp128*
				; CHECK-NEXT: %[[LOAD2:.*]] = load fp128, fp128*
				; CHECK-NEXT: %[[FSUB1:.*]] = fsub fp128 %[[LOAD1]], 0xL00000000000000008000000000000000
				; CHECK-NEXT: %[[FSUB2:.*]] = fsub fp128 %[[LOAD2]], 0xL00000000000000008000000000000000
				; CHECK-NEXT: store fp128 %[[FSUB1]], fp128* {{.*}}
				; CHECK-NEXT: store fp128 %[[FSUB2]], fp128* {{.*}}
				david-arm (Done): nit: Same thing as above about using `%4` and `%5`
				entry:
				br label %for.body

				for.body:
				%iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
				%arrayidx = getelementptr inbounds fp128, fp128* %ptr, i64 %iv
				%0 = load fp128, fp128* %arrayidx, align 16
				%add = fsub fp128 %0, 0xL00000000000000008000000000000000
				store fp128 %add, fp128* %arrayidx, align 16
				%iv.next = add nuw nsw i64 %iv, 1
				%exitcond.not = icmp eq i64 %iv.next, %N
				br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0

				for.end:
				ret void
				}

				; CHECK-REMARKS: Scalable vectorization is not supported for all element types found in this loop
				define dso_local void @loop_invariant_sve_i128(i128* nocapture %ptr, i128 %val, i64 %N) {
				; CHECK-LABEL: @loop_invariant_sve_i128
				; CHECK: vector.body
				; CHECK: %[[GEP1:.*]] = getelementptr inbounds i128, i128* %ptr
				; CHECK-NEXT: %[[GEP2:.*]] = getelementptr inbounds i128, i128* %ptr
				; CHECK-NEXT: store i128 %val, i128* %[[GEP1]]
				; CHECK-NEXT: store i128 %val, i128* %[[GEP2]]
				entry:
				br label %for.body

				for.body:
				%iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
				%arrayidx = getelementptr inbounds i128, i128* %ptr, i64 %iv
				store i128 %val, i128* %arrayidx, align 16
				%iv.next = add nuw nsw i64 %iv, 1
				%exitcond.not = icmp eq i64 %iv.next, %N
				br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0

				for.end:
				ret void
				}

				define dso_local void @loop_fixed_width_i128(i128* nocapture %ptr, i64 %N) {
				; CHECK-LABEL: @loop_fixed_width_i128
				; CHECK: load <4 x i128>, <4 x i128>*
				; CHECK: add nsw <4 x i128> {{.*}}, <i128 42, i128 42, i128 42, i128 42>
				; CHECK: store <4 x i128> {{.*}} <4 x i128>*
				; CHECK-NOT: vscale
				entry:
				br label %for.body

				for.body:
				%iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
				%arrayidx = getelementptr inbounds i128, i128* %ptr, i64 %iv
				%0 = load i128, i128* %arrayidx, align 16
				%add = add nsw i128 %0, 42
				store i128 %add, i128* %arrayidx, align 16
				%iv.next = add i64 %iv, 1
				%exitcond.not = icmp eq i64 %iv.next, %N
				br i1 %exitcond.not, label %for.end, label %for.body

				for.end:
				ret void
				}

				!0 = distinct !{!0, !1}
				!1 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
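
For orientation, the `loop_invariant_sve_i128` case corresponds roughly to a source loop like the one below, where the only i128 use is the stored value and the store instruction itself has type `void`. This is a hedged reconstruction: the function name `storeInvariant` is made up, `__int128` is a GCC/Clang extension, and the IR in the test was not necessarily generated from this source. The pragma is guarded so that compilers other than Clang simply skip it.

```cpp
#include <cassert>
#include <cstddef>

// Stores a loop-invariant 128-bit value to a varying address. Under Clang
// the pragma requests scalable vectorization with a width of 4.
void storeInvariant(__int128 *ptr, __int128 val, std::size_t n) {
  assert((ptr != nullptr || n == 0) && "non-empty range needs a buffer");
#if defined(__clang__)
#pragma clang loop vectorize_width(4, scalable)
#endif
  for (std::size_t i = 0; i < n; ++i)
    ptr[i] = val;
}
```

With this patch applied, a loop of this shape is expected to fall back from scalable vectorization, since i128 is not a supported element type, rather than crash.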

Revision Contents

Diff 356690
