This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Induction variables: support arbitrary constant step
ClosedPublic

Authored by • HaoLiu on Jan 26 2015, 10:36 PM.

Details

Reviewers

nadav
aschwaighofer
hfinkel

Summary

Hi,

Currently, induction variables can only have step values of -1 and +1. For induction variables with other steps, LV does nothing.
Alexey wrote a patch in D6051 to support arbitrary induction variable steps. I think his patch is very useful, so I took it over and fixed some bugs (minor bugs in the induction calculation that caused some runtime failures in SPEC2000 and LNT). This patch now passes our internal tests.

There are two kinds of induction variables:

integer induction:
    for (int i = 0; i < 1024; i += 2) {
      int tmp = *A++;
      sum += i * tmp;
    }

"i" is an integer induction variable of step 2. Such a case can be vectorized well if we support arbitrary induction variable steps.

pointer induction:
    for (int i = 0; i < 1024; i++) {
      int tmp0 = *A++;
      int tmp1 = *A++;
      sum += tmp0 * tmp1;
    }

Pointer "A" is a pointer induction variable of step 2. Even if we support arbitrary steps, we currently still cannot vectorize such a case well: the LoopVectorizer will say "vectorization is possible but not beneficial". We can still force the LoopVectorizer to vectorize it, though.
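For the integer case above, the per-lane values that a widened induction variable has to hold can be sketched in plain C++. This is only an illustration of the arithmetic (base + (StartIdx + lane) * Step), not LLVM code; the helper name `stepVectorLanes` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM: returns the values held by the
// lanes of one vector part of an induction variable with a constant step.
// 'base' is the scalar value at lane 0 of part 0, 'startIdx' is the index
// of the first lane of this part (VF * part in the patch).
std::vector<int64_t> stepVectorLanes(int64_t base, int64_t step, int vf,
                                     int startIdx) {
  assert(vf > 0 && "vector width must be positive");
  assert(step != 0 && "a zero step is not an induction");
  std::vector<int64_t> lanes;
  lanes.reserve(vf);
  for (int lane = 0; lane < vf; ++lane)
    lanes.push_back(base + (startIdx + lane) * step);
  return lanes;
}
```

With a step of 2 and a vector width of 4, the first part holds 0, 2, 4, 6 and the second part (startIdx 4) holds 8, 10, 12, 14. A step of -1 reproduces the old reverse induction.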

If the target supports masked/interleaved load/store, we can vectorize the second case very well. For example, the AArch64 backend supports interleaved load/store. To vectorize "tmp0" and "tmp1", we only need one interleaved load such as "LD2 {V0, V1}, [X0]". Vector registers V0 and V1 will contain the de-interleaved data: V0 contains "A[0], A[2], A[4], A[6]", and V1 contains "A[1], A[3], A[5], A[7]".
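The data layout produced by such an interleaved load can be emulated in scalar C++. This is a sketch of the lane assignment only, assuming 4 lanes per register; it is not NEON code, and the name `deinterleave2` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Scalar emulation of a two-way de-interleaving load (hypothetical helper).
// The first result plays the role of V0 (even elements), the second the
// role of V1 (odd elements). Reads 2 * N consecutive ints starting at 'a'.
template <std::size_t N>
std::pair<std::array<int, N>, std::array<int, N>> deinterleave2(const int *a) {
  assert(a != nullptr && "source pointer must be valid");
  std::array<int, N> even{};
  std::array<int, N> odd{};
  for (std::size_t lane = 0; lane < N; ++lane) {
    even[lane] = a[2 * lane];     // A[0], A[2], A[4], ...
    odd[lane] = a[2 * lane + 1];  // A[1], A[3], A[5], ...
  }
  return {even, odd};
}
```

Multiplying the two results lane by lane and summing gives the same value as the scalar loop over tmp0 * tmp1 for those iterations.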

This patch has no big impact on performance according to tests on AArch64 targets. There are few cases like the first one. There are many cases like the second one, where the loop vectorizer thinks vectorization is not beneficial and will most likely just interleave. I think that if we support masked/interleaved load/store in the future, we can get many performance improvements, but the first step is for the LoopVectorizer to support arbitrary induction variable steps.

Review please.

Thanks,
-Hao

Diff Detail

Event Timeline

• HaoLiu updated this revision to Diff 18797. Jan 26 2015, 10:36 PM

• HaoLiu retitled this revision from to [LoopVectorize] Induction variables: support arbitrary constant step.

• HaoLiu updated this object.

• HaoLiu edited the test plan for this revision.

• HaoLiu added reviewers: nadav, hfinkel.

• HaoLiu added subscribers: volkalexey, Unknown Object (MLST).

Herald added a subscriber: aemerson. Jan 26 2015, 10:36 PM

hfinkel added a reviewer: aschwaighofer. Jan 27 2015, 6:21 AM

Thanks for continuing to work on this. I think this is a nice over-all code simplification (that also gives us some forward-looking added capabilities).

lib/Transforms/Vectorize/LoopVectorize.cpp
712	I prefer that is* function return a boolean, but this returns a (-1, 0, 1) value. Maybe getConsecutiveDirection() would be better?
722	Given that we don't vectorize loops with wrap-around index spaces, as far as I know, we should be able to add nsw (or nuw or both) flags on these operations. Please do so when possible.
1670	I know it was like this before, but nsw/nuw?
3205	nsw/nuw? (same with the others in the code below)
test/Transforms/LoopVectorize/arbitrary-induction-step.ll
8	If you're checking debug output, you'll need to add: ; REQUIRES: asserts to the entire test. Please break this out into a separate test file, if you'd like to do this, to limit the reduction in test coverage for non-asserts builds.

aschwaighofer added inline comments. Jan 27 2015, 8:03 AM

lib/Transforms/Vectorize/LoopVectorize.cpp
722	Hal, I don't think this is safe. Consider the following example. The canonical induction variable does not wrap but the derived one does.
    int test(int *a) {
      int r = 0;
      uint64_t idx2 = 0;
      for (uint64_t i = 0; i < UINT64_MAX; ++i) {
        r += a[idx2];
        idx2 += 2;
      }
      return r;
    }
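The wrap-around concern can be reproduced at a small scale. The sketch below uses 8-bit counters instead of 64-bit ones so that it finishes instantly; the function name `firstWrapIteration` is hypothetical and the code only illustrates the arithmetic, not the vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit analogue of the example: the canonical counter 'i'
// never wraps for tripCount <= 255, but the derived index with step 2 can.
// Returns the first iteration whose "idx2 += 2" wraps, or -1 if none does.
int firstWrapIteration(unsigned tripCount) {
  assert(tripCount <= 255 && "the canonical counter itself must not wrap");
  uint8_t idx2 = 0;
  for (uint8_t i = 0; i < tripCount; ++i) {
    uint8_t next = static_cast<uint8_t>(idx2 + 2);
    if (next < idx2)
      return i;  // the derived induction wrapped in this iteration
    idx2 = next;
  }
  return -1;
}
```

With a trip count of 255 the derived index wraps in iteration 127, about halfway through, which is why no-wrap flags cannot simply be copied from the canonical induction variable to the derived one.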

hfinkel added inline comments. Jan 27 2015, 8:10 AM

lib/Transforms/Vectorize/LoopVectorize.cpp
722	Agreed. But shouldn't we be able to get the nsw/nuw flags from the original scalar operation? (If so, but this departs too far from the scope of this patch, I'm fine with adding a FIXME and leaving this for follow-up)

karthikthecool added a subscriber: karthikthecool. Jan 27 2015, 7:18 PM

• HaoLiu updated this revision to Diff 18876. Jan 27 2015, 11:16 PM

• HaoLiu edited edge metadata.

Hi,

According to the comments, a new patch has been attached.
Review, please.

Thanks,
-Hao

lib/Transforms/Vectorize/LoopVectorize.cpp
722	Agree with you two. Several FIXMEs have been added.
test/Transforms/LoopVectorize/arbitrary-induction-step.ll
8	Hal, thanks for letting me know about this. As I think it's not worth adding another file just for testing the debug info, I removed such 'CHECK' lines and '-debug-only'.

LGTM.

This revision is now accepted and ready to land. Jan 28 2015, 7:22 AM

Hao, this patch LGTM. This is a major change to the vectorizer so please run the llvm test suite and try to identify miscompiles or regressions.

• HaoLiu closed this revision. Mar 15 2015, 7:19 PM

Revision Contents

Path	Size
lib/Transforms/Vectorize/LoopVectorize.cpp	262 lines
test/Transforms/LoopVectorize/arbitrary-induction-step.ll	150 lines
test/Transforms/LoopVectorize/gcc-examples.ll	3 lines
test/Transforms/LoopVectorize/reverse_induction.ll	4 lines

Diff 18876

lib/Transforms/Vectorize/LoopVectorize.cpp

(349 lines not shown)	protected:

/// Create a broadcast instruction. This method generates a broadcast		/// Create a broadcast instruction. This method generates a broadcast
/// instruction (shuffle) for loop invariant values and for the induction		/// instruction (shuffle) for loop invariant values and for the induction
/// value. If this is the induction variable then we extend it to N, N+1, ...		/// value. If this is the induction variable then we extend it to N, N+1, ...
/// this is needed because each iteration in the loop corresponds to a SIMD		/// this is needed because each iteration in the loop corresponds to a SIMD
/// element.		/// element.
virtual Value *getBroadcastInstrs(Value *V);		virtual Value *getBroadcastInstrs(Value *V);

/// This function adds 0, 1, 2 ... to each vector element, starting at zero.		/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)
/// If Negate is set then negative numbers are added e.g. (0, -1, -2, ...).		/// to each vector element of Val. The sequence starts at StartIndex.
/// The sequence starts at StartIndex.		virtual Value *getStepVector(Value *Val, int StartIdx, Value *Step);
virtual Value *getConsecutiveVector(Value *Val, int StartIdx, bool Negate);

/// When we go over instructions in the basic block we rely on previous		/// When we go over instructions in the basic block we rely on previous
/// values within the current basic block or on loop invariant values.		/// values within the current basic block or on loop invariant values.
/// When we widen (vectorize) values we place them in the map. If the values		/// When we widen (vectorize) values we place them in the map. If the values
/// are not within the map, they have to be loop invariant, so we simply		/// are not within the map, they have to be loop invariant, so we simply
/// broadcast them into a vector.		/// broadcast them into a vector.
VectorParts &getVectorValue(Value *V);		VectorParts &getVectorValue(Value *V);

(104 lines not shown)	InnerLoopUnroller(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
const TargetLibraryInfo *TLI, unsigned UnrollFactor) :		const TargetLibraryInfo *TLI, unsigned UnrollFactor) :
InnerLoopVectorizer(OrigLoop, SE, LI, DT, DL, TLI, 1, UnrollFactor) { }		InnerLoopVectorizer(OrigLoop, SE, LI, DT, DL, TLI, 1, UnrollFactor) { }

private:		private:
void scalarizeInstruction(Instruction *Instr,		void scalarizeInstruction(Instruction *Instr,
bool IfPredicateStore = false) override;		bool IfPredicateStore = false) override;
void vectorizeMemoryInstruction(Instruction *Instr) override;		void vectorizeMemoryInstruction(Instruction *Instr) override;
Value *getBroadcastInstrs(Value *V) override;		Value *getBroadcastInstrs(Value *V) override;
Value *getConsecutiveVector(Value *Val, int StartIdx, bool Negate) override;		Value *getStepVector(Value *Val, int StartIdx, Value *Step) override;
Value *reverseVector(Value *Vec) override;		Value *reverseVector(Value *Vec) override;
};		};

/// \brief Look for a meaningful debug location on the instruction or it's		/// \brief Look for a meaningful debug location on the instruction or it's
/// operands.		/// operands.
static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {		static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {
if (!I)		if (!I)
return I;		return I;
(107 lines not shown)	enum ReductionKind {
RK_IntegerMinMax, ///< Min/max implemented in terms of select(cmp()).		RK_IntegerMinMax, ///< Min/max implemented in terms of select(cmp()).
RK_FloatAdd, ///< Sum of floats.		RK_FloatAdd, ///< Sum of floats.
RK_FloatMult, ///< Product of floats.		RK_FloatMult, ///< Product of floats.
RK_FloatMinMax ///< Min/max implemented in terms of select(cmp()).		RK_FloatMinMax ///< Min/max implemented in terms of select(cmp()).
};		};

/// This enum represents the kinds of inductions that we support.		/// This enum represents the kinds of inductions that we support.
enum InductionKind {		enum InductionKind {
IK_NoInduction, ///< Not an induction variable.		IK_NoInduction, ///< Not an induction variable.
IK_IntInduction, ///< Integer induction variable. Step = 1.		IK_IntInduction, ///< Integer induction variable. Step = C.
IK_ReverseIntInduction, ///< Reverse int induction variable. Step = -1.		IK_PtrInduction ///< Pointer induction var. Step = C / sizeof(elem).
IK_PtrInduction, ///< Pointer induction var. Step = sizeof(elem).
IK_ReversePtrInduction ///< Reverse ptr indvar. Step = - sizeof(elem).
};		};

// This enum represents the kind of minmax reduction.		// This enum represents the kind of minmax reduction.
enum MinMaxReductionKind {		enum MinMaxReductionKind {
MRK_Invalid,		MRK_Invalid,
MRK_UIntMin,		MRK_UIntMin,
MRK_UIntMax,		MRK_UIntMax,
MRK_SIntMin,		MRK_SIntMin,
(73 lines not shown)	struct RuntimePointerCheck {
/// shared underlying object.		/// shared underlying object.
SmallVector<unsigned, 2> DependencySetId;		SmallVector<unsigned, 2> DependencySetId;
/// Holds the id of the disjoint alias set to which this pointer belongs.		/// Holds the id of the disjoint alias set to which this pointer belongs.
SmallVector<unsigned, 2> AliasSetId;		SmallVector<unsigned, 2> AliasSetId;
};		};

/// A struct for saving information about induction variables.		/// A struct for saving information about induction variables.
struct InductionInfo {		struct InductionInfo {
InductionInfo(Value *Start, InductionKind K) : StartValue(Start), IK(K) {}		InductionInfo(Value *Start, InductionKind K, ConstantInt *Step)
InductionInfo() : StartValue(nullptr), IK(IK_NoInduction) {}		: StartValue(Start), IK(K), StepValue(Step) {
		assert(IK != IK_NoInduction && "Not an induction");
		assert(StartValue && "StartValue is null");
		assert(StepValue && !StepValue->isZero() && "StepValue is zero");
		assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&
		"StartValue is not a pointer for pointer induction");
		assert((IK != IK_IntInduction || StartValue->getType()->isIntegerTy()) &&
		"StartValue is not an integer for integer induction");
		assert(StepValue->getType()->isIntegerTy() &&
		"StepValue is not an integer");
		}
		InductionInfo()
		: StartValue(nullptr), IK(IK_NoInduction), StepValue(nullptr) {}

		/// Get the consecutive direction. Returns:
		/// 0 - unknown or non-consecutive.
		/// 1 - consecutive and increasing.
		/// -1 - consecutive and decreasing.
		int getConsecutiveDirection() const {
		if (StepValue && (StepValue->isOne() || StepValue->isMinusOne()))
		return StepValue->getSExtValue();
		return 0;
		}

		/// Compute the transformed value of Index at offset StartValue using step
		/// StepValue.
		/// For integer induction, returns StartValue + Index * StepValue.
		/// For pointer induction, returns StartValue[Index * StepValue].
		/// FIXME: The newly created binary instructions should contain nsw/nuw
		/// flags, which can be found from the original scalar operations.
		Value *transform(IRBuilder<> &B, Value *Index) const {
		switch (IK) {
		case IK_IntInduction:
		assert(Index->getType() == StartValue->getType() &&
		"Index type does not match StartValue type");
		if (StepValue->isMinusOne())
		return B.CreateSub(StartValue, Index);
		if (!StepValue->isOne())
		Index = B.CreateMul(Index, StepValue);
		return B.CreateAdd(StartValue, Index);

		case IK_PtrInduction:
		if (StepValue->isMinusOne())
		Index = B.CreateNeg(Index);
		else if (!StepValue->isOne())
		Index = B.CreateMul(Index, StepValue);
		return B.CreateGEP(StartValue, Index);

		case IK_NoInduction:
		default:
		return nullptr;
		}
		}

/// Start value.		/// Start value.
TrackingVH<Value> StartValue;		TrackingVH<Value> StartValue;
/// Induction kind.		/// Induction kind.
InductionKind IK;		InductionKind IK;
		/// Step value.
		ConstantInt *StepValue;
};		};

/// ReductionList contains the reduction descriptors for all		/// ReductionList contains the reduction descriptors for all
/// of the reductions that were found in the loop.		/// of the reductions that were found in the loop.
typedef DenseMap<PHINode*, ReductionDescriptor> ReductionList;		typedef DenseMap<PHINode*, ReductionDescriptor> ReductionList;

/// InductionList saves induction variables and maps them to the		/// InductionList saves induction variables and maps them to the
/// induction descriptor.		/// induction descriptor.
(103 lines not shown)	private:
/// compare instruction to the select instruction and stores this pointer in		/// compare instruction to the select instruction and stores this pointer in
/// 'PatternLastInst' member of the returned struct.		/// 'PatternLastInst' member of the returned struct.
ReductionInstDesc isReductionInstr(Instruction *I, ReductionKind Kind,		ReductionInstDesc isReductionInstr(Instruction *I, ReductionKind Kind,
ReductionInstDesc &Desc);		ReductionInstDesc &Desc);
/// Returns true if the instruction is a Select(ICmp(X, Y), X, Y) instruction		/// Returns true if the instruction is a Select(ICmp(X, Y), X, Y) instruction
/// pattern corresponding to a min(X, Y) or max(X, Y).		/// pattern corresponding to a min(X, Y) or max(X, Y).
static ReductionInstDesc isMinMaxSelectCmpPattern(Instruction *I,		static ReductionInstDesc isMinMaxSelectCmpPattern(Instruction *I,
ReductionInstDesc &Prev);		ReductionInstDesc &Prev);
/// Returns the induction kind of Phi. This function may return NoInduction		/// Returns the induction kind of Phi and record the step. This function may
/// if the PHI is not an induction variable.		/// return NoInduction if the PHI is not an induction variable.
InductionKind isInductionVariable(PHINode *Phi);		InductionKind isInductionVariable(PHINode *Phi, ConstantInt *&StepValue);

/// \brief Collect memory access with loop invariant strides.		/// \brief Collect memory access with loop invariant strides.
///		///
/// Looks for accesses like "a[i * StrideA]" where "StrideA" is loop		/// Looks for accesses like "a[i * StrideA]" where "StrideA" is loop
/// invariant.		/// invariant.
void collectStridedAccess(Value *LoadOrStoreInst);		void collectStridedAccess(Value *LoadOrStoreInst);

/// Report an analysis message to assist the user in diagnosing loops that are		/// Report an analysis message to assist the user in diagnosing loops that are
(751 lines not shown)	if (Invariant)
Builder.SetInsertPoint(LoopVectorPreHeader->getTerminator());		Builder.SetInsertPoint(LoopVectorPreHeader->getTerminator());

// Broadcast the scalar into all locations in the vector.		// Broadcast the scalar into all locations in the vector.
Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");		Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");

return Shuf;		return Shuf;
}		}

Value *InnerLoopVectorizer::getConsecutiveVector(Value *Val, int StartIdx,		Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,
bool Negate) {		Value *Step) {
assert(Val->getType()->isVectorTy() && "Must be a vector");		assert(Val->getType()->isVectorTy() && "Must be a vector");
assert(Val->getType()->getScalarType()->isIntegerTy() &&		assert(Val->getType()->getScalarType()->isIntegerTy() &&
"Elem must be an integer");		"Elem must be an integer");
		assert(Step->getType() == Val->getType()->getScalarType() &&
		"Step has wrong type");
// Create the types.		// Create the types.
Type *ITy = Val->getType()->getScalarType();		Type *ITy = Val->getType()->getScalarType();
VectorType *Ty = cast<VectorType>(Val->getType());		VectorType *Ty = cast<VectorType>(Val->getType());
int VLen = Ty->getNumElements();		int VLen = Ty->getNumElements();
SmallVector<Constant*, 8> Indices;		SmallVector<Constant*, 8> Indices;

// Create a vector of consecutive numbers from zero to VF.		// Create a vector of consecutive numbers from zero to VF.
for (int i = 0; i < VLen; ++i) {		for (int i = 0; i < VLen; ++i)
int64_t Idx = Negate ? (-i) : i;		Indices.push_back(ConstantInt::get(ITy, StartIdx + i));
Indices.push_back(ConstantInt::get(ITy, StartIdx + Idx, Negate));
}

// Add the consecutive indices to the vector value.		// Add the consecutive indices to the vector value.
Constant *Cv = ConstantVector::get(Indices);		Constant *Cv = ConstantVector::get(Indices);
assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");		assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");
return Builder.CreateAdd(Val, Cv, "induction");		Step = Builder.CreateVectorSplat(VLen, Step);
		assert(Step->getType() == Val->getType() && "Invalid step vec");
		// FIXME: The newly created binary instructions should contain nsw/nuw flags,
		// which can be found from the original scalar operations.
		Step = Builder.CreateMul(Cv, Step);
		return Builder.CreateAdd(Val, Step, "induction");
}		}

/// \brief Find the operand of the GEP that should be checked for consecutive		/// \brief Find the operand of the GEP that should be checked for consecutive
/// stores. This ignores trailing indices that have no effect on the final		/// stores. This ignores trailing indices that have no effect on the final
/// pointer.		/// pointer.
static unsigned getGEPInductionOperand(const DataLayout *DL,		static unsigned getGEPInductionOperand(const DataLayout *DL,
const GetElementPtrInst *Gep) {		const GetElementPtrInst *Gep) {
unsigned LastOperand = Gep->getNumOperands() - 1;		unsigned LastOperand = Gep->getNumOperands() - 1;
(21 lines not shown)	int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {
// Make sure that the pointer does not point to structs.		// Make sure that the pointer does not point to structs.
if (Ptr->getType()->getPointerElementType()->isAggregateType())		if (Ptr->getType()->getPointerElementType()->isAggregateType())
return 0;		return 0;

// If this value is a pointer induction variable we know it is consecutive.		// If this value is a pointer induction variable we know it is consecutive.
PHINode *Phi = dyn_cast_or_null<PHINode>(Ptr);		PHINode *Phi = dyn_cast_or_null<PHINode>(Ptr);
if (Phi && Inductions.count(Phi)) {		if (Phi && Inductions.count(Phi)) {
InductionInfo II = Inductions[Phi];		InductionInfo II = Inductions[Phi];
if (IK_PtrInduction == II.IK)		return II.getConsecutiveDirection();
return 1;
else if (IK_ReversePtrInduction == II.IK)
return -1;
}		}

GetElementPtrInst *Gep = dyn_cast_or_null<GetElementPtrInst>(Ptr);		GetElementPtrInst *Gep = dyn_cast_or_null<GetElementPtrInst>(Ptr);
if (!Gep)		if (!Gep)
return 0;		return 0;

unsigned NumOperands = Gep->getNumOperands();		unsigned NumOperands = Gep->getNumOperands();
Value *GpPtr = Gep->getPointerOperand();		Value *GpPtr = Gep->getPointerOperand();
// If this GEP value is a consecutive pointer induction variable and all of		// If this GEP value is a consecutive pointer induction variable and all of
// the indices are constant then we know it is consecutive. We can		// the indices are constant then we know it is consecutive. We can
Phi = dyn_cast<PHINode>(GpPtr);		Phi = dyn_cast<PHINode>(GpPtr);
if (Phi && Inductions.count(Phi)) {		if (Phi && Inductions.count(Phi)) {

// Make sure that the pointer does not point to structs.		// Make sure that the pointer does not point to structs.
PointerType *GepPtrType = cast<PointerType>(GpPtr->getType());		PointerType *GepPtrType = cast<PointerType>(GpPtr->getType());
if (GepPtrType->getElementType()->isAggregateType())		if (GepPtrType->getElementType()->isAggregateType())
return 0;		return 0;

// Make sure that all of the index operands are loop invariant.		// Make sure that all of the index operands are loop invariant.
for (unsigned i = 1; i < NumOperands; ++i)		for (unsigned i = 1; i < NumOperands; ++i)
if (!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))		if (!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))
return 0;		return 0;

InductionInfo II = Inductions[Phi];		InductionInfo II = Inductions[Phi];
if (IK_PtrInduction == II.IK)		return II.getConsecutiveDirection();
return 1;
else if (IK_ReversePtrInduction == II.IK)
return -1;
}		}

unsigned InductionOperand = getGEPInductionOperand(DL, Gep);		unsigned InductionOperand = getGEPInductionOperand(DL, Gep);

// Check that all of the gep indices are uniform except for our induction		// Check that all of the gep indices are uniform except for our induction
// operand.		// operand.
for (unsigned i = 0; i != NumOperands; ++i)		for (unsigned i = 0; i != NumOperands; ++i)
if (i != InductionOperand &&		if (i != InductionOperand &&
(798 lines not shown)	case LoopVectorizationLegality::IK_IntInduction: {
break;		break;
}		}

// Not the canonical induction variable - add the vector loop count to the		// Not the canonical induction variable - add the vector loop count to the
// start value.		// start value.
Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,		Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,
II.StartValue->getType(),		II.StartValue->getType(),
"cast.crd");		"cast.crd");
EndValue = BypassBuilder.CreateAdd(CRD, II.StartValue , "ind.end");		EndValue = II.transform(BypassBuilder, CRD);
break;		EndValue->setName("ind.end");
}
case LoopVectorizationLegality::IK_ReverseIntInduction: {
// Convert the CountRoundDown variable to the PHI size.
Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,
II.StartValue->getType(),
"cast.crd");
// Handle reverse integer induction counter.
EndValue = BypassBuilder.CreateSub(II.StartValue, CRD, "rev.ind.end");
break;		break;
}		}
case LoopVectorizationLegality::IK_PtrInduction: {		case LoopVectorizationLegality::IK_PtrInduction: {
// For pointer induction variables, calculate the offset using		EndValue = II.transform(BypassBuilder, CountRoundDown);
// the end index.		EndValue->setName("ptr.ind.end");
EndValue = BypassBuilder.CreateGEP(II.StartValue, CountRoundDown,
"ptr.ind.end");
break;
}
case LoopVectorizationLegality::IK_ReversePtrInduction: {
// The value at the end of the loop for the reverse pointer is calculated
// by creating a GEP with a negative index starting from the start value.
Value *Zero = ConstantInt::get(CountRoundDown->getType(), 0);
Value *NegIdx = BypassBuilder.CreateSub(Zero, CountRoundDown,
"rev.ind.end");
EndValue = BypassBuilder.CreateGEP(II.StartValue, NegIdx,
"rev.ptr.ind.end");
break;		break;
}		}
}// end of case		}// end of case

// The new PHI merges the original incoming value, in case of a bypass,		// The new PHI merges the original incoming value, in case of a bypass,
// or the value at the end of the vectorized loop.		// or the value at the end of the vectorized loop.
for (unsigned I = 1, E = LoopBypassBlocks.size(); I != E; ++I) {		for (unsigned I = 1, E = LoopBypassBlocks.size(); I != E; ++I) {
if (OrigPhi == OldInduction)		if (OrigPhi == OldInduction)
(598 lines not shown)	void InnerLoopVectorizer::widenPHIInstruction(Instruction *PN,
// This PHINode must be an induction variable.		// This PHINode must be an induction variable.
// Make sure that we know about it.		// Make sure that we know about it.
assert(Legal->getInductionVars()->count(P) &&		assert(Legal->getInductionVars()->count(P) &&
"Not an induction variable");		"Not an induction variable");

LoopVectorizationLegality::InductionInfo II =		LoopVectorizationLegality::InductionInfo II =
Legal->getInductionVars()->lookup(P);		Legal->getInductionVars()->lookup(P);

		// FIXME: The newly created binary instructions should contain nsw/nuw flags,
		// which can be found from the original scalar operations.
switch (II.IK) {		switch (II.IK) {
case LoopVectorizationLegality::IK_NoInduction:		case LoopVectorizationLegality::IK_NoInduction:
llvm_unreachable("Unknown induction");		llvm_unreachable("Unknown induction");
case LoopVectorizationLegality::IK_IntInduction: {		case LoopVectorizationLegality::IK_IntInduction: {
assert(P->getType() == II.StartValue->getType() && "Types must match");		assert(P->getType() == II.StartValue->getType() && "Types must match");
Type *PhiTy = P->getType();		Type *PhiTy = P->getType();
Value *Broadcasted;		Value *Broadcasted;
if (P == OldInduction) {		if (P == OldInduction) {
// Handle the canonical induction variable. We might have had to		// Handle the canonical induction variable. We might have had to
// extend the type.		// extend the type.
Broadcasted = Builder.CreateTrunc(Induction, PhiTy);		Broadcasted = Builder.CreateTrunc(Induction, PhiTy);
} else {		} else {
// Handle other induction variables that are now based on the		// Handle other induction variables that are now based on the
// canonical one.		// canonical one.
Value *NormalizedIdx = Builder.CreateSub(Induction, ExtendedIdx,		Value *NormalizedIdx = Builder.CreateSub(Induction, ExtendedIdx,
"normalized.idx");		"normalized.idx");
NormalizedIdx = Builder.CreateSExtOrTrunc(NormalizedIdx, PhiTy);		NormalizedIdx = Builder.CreateSExtOrTrunc(NormalizedIdx, PhiTy);
Broadcasted = Builder.CreateAdd(II.StartValue, NormalizedIdx,		Broadcasted = II.transform(Builder, NormalizedIdx);
"offset.idx");		Broadcasted->setName("offset.idx");
}		}
Broadcasted = getBroadcastInstrs(Broadcasted);		Broadcasted = getBroadcastInstrs(Broadcasted);
// After broadcasting the induction variable we need to make the vector		// After broadcasting the induction variable we need to make the vector
// consecutive by adding 0, 1, 2, etc.		// consecutive by adding 0, 1, 2, etc.
for (unsigned part = 0; part < UF; ++part)		for (unsigned part = 0; part < UF; ++part)
Entry[part] = getConsecutiveVector(Broadcasted, VF * part, false);		Entry[part] = getStepVector(Broadcasted, VF * part, II.StepValue);
return;		return;
}		}
case LoopVectorizationLegality::IK_ReverseIntInduction:
case LoopVectorizationLegality::IK_PtrInduction:		case LoopVectorizationLegality::IK_PtrInduction:
case LoopVectorizationLegality::IK_ReversePtrInduction:
// Handle reverse integer and pointer inductions.
Value *StartIdx = ExtendedIdx;
// This is the normalized GEP that starts counting at zero.
Value *NormalizedIdx = Builder.CreateSub(Induction, StartIdx,
"normalized.idx");

// Handle the reverse integer induction variable case.
if (LoopVectorizationLegality::IK_ReverseIntInduction == II.IK) {
IntegerType *DstTy = cast<IntegerType>(II.StartValue->getType());
Value *CNI = Builder.CreateSExtOrTrunc(NormalizedIdx, DstTy,
"resize.norm.idx");
Value *ReverseInd = Builder.CreateSub(II.StartValue, CNI,
"reverse.idx");

// This is a new value so do not hoist it out.
Value *Broadcasted = getBroadcastInstrs(ReverseInd);
// After broadcasting the induction variable we need to make the
// vector consecutive by adding ... -3, -2, -1, 0.
for (unsigned part = 0; part < UF; ++part)
Entry[part] = getConsecutiveVector(Broadcasted, -(int)VF * part,
true);
return;
}

// Handle the pointer induction variable case.		// Handle the pointer induction variable case.
assert(P->getType()->isPointerTy() && "Unexpected type.");		assert(P->getType()->isPointerTy() && "Unexpected type.");
		// This is the normalized GEP that starts counting at zero.
// Is this a reverse induction ptr or a consecutive induction ptr.		Value *NormalizedIdx =
bool Reverse = (LoopVectorizationLegality::IK_ReversePtrInduction ==		Builder.CreateSub(Induction, ExtendedIdx, "normalized.idx");
II.IK);

// This is the vector of results. Notice that we don't generate		// This is the vector of results. Notice that we don't generate
// vector geps because scalar geps result in better code.		// vector geps because scalar geps result in better code.
for (unsigned part = 0; part < UF; ++part) {		for (unsigned part = 0; part < UF; ++part) {
if (VF == 1) {		if (VF == 1) {
int EltIndex = (part) * (Reverse ? -1 : 1);		int EltIndex = part;
Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);		Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);
Value *GlobalIdx;		Value *GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx);
if (Reverse)		Value *SclrGep = II.transform(Builder, GlobalIdx);
GlobalIdx = Builder.CreateSub(Idx, NormalizedIdx, "gep.ridx");		SclrGep->setName("next.gep");
else
GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx, "gep.idx");

Value *SclrGep = Builder.CreateGEP(II.StartValue, GlobalIdx,
"next.gep");
Entry[part] = SclrGep;		Entry[part] = SclrGep;
continue;		continue;
}		}

Value *VecVal = UndefValue::get(VectorType::get(P->getType(), VF));		Value *VecVal = UndefValue::get(VectorType::get(P->getType(), VF));
for (unsigned int i = 0; i < VF; ++i) {		for (unsigned int i = 0; i < VF; ++i) {
int EltIndex = (i + part * VF) * (Reverse ? -1 : 1);		int EltIndex = i + part * VF;
Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);		Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);
Value *GlobalIdx;		Value *GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx);
if (!Reverse)		Value *SclrGep = II.transform(Builder, GlobalIdx);
GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx, "gep.idx");		SclrGep->setName("next.gep");
else
GlobalIdx = Builder.CreateSub(Idx, NormalizedIdx, "gep.ridx");

Value *SclrGep = Builder.CreateGEP(II.StartValue, GlobalIdx,
"next.gep");
VecVal = Builder.CreateInsertElement(VecVal, SclrGep,		VecVal = Builder.CreateInsertElement(VecVal, SclrGep,
Builder.getInt32(i),		Builder.getInt32(i),
"insert.gep");		"insert.gep");
}		}
Entry[part] = VecVal;		Entry[part] = VecVal;
}		}
return;		return;
}		}
}		}

void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {		void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {
// For each instruction in the old loop.		// For each instruction in the old loop.
for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {		for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {
VectorParts &Entry = WidenMap.get(it);		VectorParts &Entry = WidenMap.get(it);
switch (it->getOpcode()) {		switch (it->getOpcode()) {
case Instruction::Br:		case Instruction::Br:
// Nothing to do for PHIs and BR, since we already took care of the		// Nothing to do for PHIs and BR, since we already took care of the
// loop control flow instructions.		// loop control flow instructions.
continue;		continue;
case Instruction::PHI:{		case Instruction::PHI: {
// Vectorize PHINodes.		// Vectorize PHINodes.
widenPHIInstruction(it, Entry, UF, VF, PV);		widenPHIInstruction(it, Entry, UF, VF, PV);
continue;		continue;
}// End of PHI.		}// End of PHI.

case Instruction::Add:		case Instruction::Add:
case Instruction::FAdd:		case Instruction::FAdd:
case Instruction::Sub:		case Instruction::Sub:
[... 104 lines collapsed ...]	case Instruction::BitCast: {
/// variable. Notice that we can only optimize the 'trunc' case		/// variable. Notice that we can only optimize the 'trunc' case
/// because: a. FP conversions lose precision, b. sext/zext may wrap,		/// because: a. FP conversions lose precision, b. sext/zext may wrap,
/// c. other casts depend on pointer size.		/// c. other casts depend on pointer size.
if (CI->getOperand(0) == OldInduction &&		if (CI->getOperand(0) == OldInduction &&
it->getOpcode() == Instruction::Trunc) {		it->getOpcode() == Instruction::Trunc) {
Value *ScalarCast = Builder.CreateCast(CI->getOpcode(), Induction,		Value *ScalarCast = Builder.CreateCast(CI->getOpcode(), Induction,
CI->getType());		CI->getType());
Value *Broadcasted = getBroadcastInstrs(ScalarCast);		Value *Broadcasted = getBroadcastInstrs(ScalarCast);
		LoopVectorizationLegality::InductionInfo II =
		Legal->getInductionVars()->lookup(OldInduction);
		Constant *Step =
		ConstantInt::getSigned(CI->getType(), II.StepValue->getSExtValue());
for (unsigned Part = 0; Part < UF; ++Part)		for (unsigned Part = 0; Part < UF; ++Part)
Entry[Part] = getConsecutiveVector(Broadcasted, VF * Part, false);		Entry[Part] = getStepVector(Broadcasted, VF * Part, Step);
propagateMetadata(Entry, it);		propagateMetadata(Entry, it);
break;		break;
}		}
/// Vectorize casts.		/// Vectorize casts.
Type *DestTy = (VF == 1) ? CI->getType() :		Type *DestTy = (VF == 1) ? CI->getType() :
VectorType::get(CI->getType(), VF);		VectorType::get(CI->getType(), VF);

VectorParts &A = getVectorValue(it->getOperand(0));		VectorParts &A = getVectorValue(it->getOperand(0));
[... 330 lines collapsed ...]	for (BasicBlock::iterator it = (bb)->begin(), e = (bb)->end(); it != e;
emitAnalysis(Report(it)		emitAnalysis(Report(it)
<< "control flow not understood by vectorizer");		<< "control flow not understood by vectorizer");
DEBUG(dbgs() << "LV: Found an invalid PHI.\n");		DEBUG(dbgs() << "LV: Found an invalid PHI.\n");
return false;		return false;
}		}

// This is the value coming from the preheader.		// This is the value coming from the preheader.
Value *StartValue = Phi->getIncomingValueForBlock(PreHeader);		Value *StartValue = Phi->getIncomingValueForBlock(PreHeader);
		ConstantInt *StepValue = nullptr;
// Check if this is an induction variable.		// Check if this is an induction variable.
InductionKind IK = isInductionVariable(Phi);		InductionKind IK = isInductionVariable(Phi, StepValue);

if (IK_NoInduction != IK) {		if (IK_NoInduction != IK) {
// Get the widest type.		// Get the widest type.
if (!WidestIndTy)		if (!WidestIndTy)
WidestIndTy = convertPointerToIntegerType(*DL, PhiTy);		WidestIndTy = convertPointerToIntegerType(*DL, PhiTy);
else		else
WidestIndTy = getWiderType(*DL, PhiTy, WidestIndTy);		WidestIndTy = getWiderType(*DL, PhiTy, WidestIndTy);

// Int inductions are special because we only allow one IV.		// Int inductions are special because we only allow one IV.
if (IK == IK_IntInduction) {		if (IK == IK_IntInduction && StepValue->isOne()) {
// Use the phi node with the widest type as induction. Use the last		// Use the phi node with the widest type as induction. Use the last
// one if there are multiple (no good reason for doing this other		// one if there are multiple (no good reason for doing this other
// than it is expedient).		// than it is expedient).
if (!Induction || PhiTy == WidestIndTy)		if (!Induction || PhiTy == WidestIndTy)
Induction = Phi;		Induction = Phi;
}		}

DEBUG(dbgs() << "LV: Found an induction variable.\n");		DEBUG(dbgs() << "LV: Found an induction variable.\n");
Inductions[Phi] = InductionInfo(StartValue, IK);		Inductions[Phi] = InductionInfo(StartValue, IK, StepValue);

// Until we explicitly handle the case of an induction variable with		// Until we explicitly handle the case of an induction variable with
// an outside loop user we have to give up vectorizing this loop.		// an outside loop user we have to give up vectorizing this loop.
if (hasOutsideLoopUser(TheLoop, it, AllowedExit)) {		if (hasOutsideLoopUser(TheLoop, it, AllowedExit)) {
emitAnalysis(Report(it) << "use of induction value outside of the "		emitAnalysis(Report(it) << "use of induction value outside of the "
"loop is not handled by vectorizer");		"loop is not handled by vectorizer");
return false;		return false;
}		}
[... 1,534 lines collapsed ...]	case Instruction::Select:
if (Kind != RK_IntegerMinMax &&		if (Kind != RK_IntegerMinMax &&
(!HasFunNoNaNAttr || Kind != RK_FloatMinMax))		(!HasFunNoNaNAttr || Kind != RK_FloatMinMax))
return ReductionInstDesc(false, I);		return ReductionInstDesc(false, I);
return isMinMaxSelectCmpPattern(I, Prev);		return isMinMaxSelectCmpPattern(I, Prev);
}		}
}		}

LoopVectorizationLegality::InductionKind		LoopVectorizationLegality::InductionKind
LoopVectorizationLegality::isInductionVariable(PHINode *Phi) {		LoopVectorizationLegality::isInductionVariable(PHINode *Phi,
		ConstantInt *&StepValue) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.		// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())
return IK_NoInduction;		return IK_NoInduction;

// Check that the PHI is consecutive.		// Check that the PHI is consecutive.
const SCEV *PhiScev = SE->getSCEV(Phi);		const SCEV *PhiScev = SE->getSCEV(Phi);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);
if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return IK_NoInduction;		return IK_NoInduction;
}		}
const SCEV *Step = AR->getStepRecurrence(*SE);

// Integer inductions need to have a stride of one.
if (PhiTy->isIntegerTy()) {
if (Step->isOne())
return IK_IntInduction;
if (Step->isAllOnesValue())
return IK_ReverseIntInduction;
return IK_NoInduction;
}

		const SCEV *Step = AR->getStepRecurrence(*SE);
// Calculate the pointer stride and check if it is consecutive.		// Calculate the pointer stride and check if it is consecutive.
const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);		const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);
if (!C)		if (!C)
return IK_NoInduction;		return IK_NoInduction;

		ConstantInt *CV = C->getValue();
		if (PhiTy->isIntegerTy()) {
		StepValue = CV;
		return IK_IntInduction;
		}

assert(PhiTy->isPointerTy() && "The PHI must be a pointer");		assert(PhiTy->isPointerTy() && "The PHI must be a pointer");
Type *PointerElementType = PhiTy->getPointerElementType();		Type *PointerElementType = PhiTy->getPointerElementType();
// The pointer stride cannot be determined if the pointer element type is not		// The pointer stride cannot be determined if the pointer element type is not
// sized.		// sized.
if (!PointerElementType->isSized())		if (!PointerElementType->isSized())
return IK_NoInduction;		return IK_NoInduction;

uint64_t Size = DL->getTypeAllocSize(PointerElementType);		int64_t Size = static_cast<int64_t>(DL->getTypeAllocSize(PointerElementType));
if (C->getValue()->equalsInt(Size))		int64_t CVSize = CV->getSExtValue();
return IK_PtrInduction;		if (CVSize % Size)
else if (C->getValue()->equalsInt(0 - Size))
return IK_ReversePtrInduction;

return IK_NoInduction;		return IK_NoInduction;
		StepValue = ConstantInt::getSigned(CV->getType(), CVSize / Size);
		return IK_PtrInduction;
}		}
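
The pointer branch of the new isInductionVariable() above can be modeled outside LLVM as a minimal sketch. The helper name `pointerStepInElements` and its plain-integer parameters are hypothetical stand-ins for the SCEVConstant/ConstantInt values in the patch; only the divisibility check and the signed division are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the pointer branch of isInductionVariable().
// Plain integers replace SCEVConstant/ConstantInt. Returns false where the
// patch returns IK_NoInduction.
bool pointerStepInElements(int64_t strideBytes, int64_t elemAllocSize,
                           int64_t &stepInElements) {
  assert(elemAllocSize > 0 && "element type must be sized");
  // Reject strides that are not a whole number of elements (CVSize % Size).
  if (strideBytes % elemAllocSize != 0)
    return false;
  // Signed division keeps the direction of a negative (reverse) stride.
  stepInElements = strideBytes / elemAllocSize;
  return true;
}
```

For a 4-byte i32 element, a stride of 8 bytes (as for %A.addr in @ptr_ind_plus2) gives a step of 2, and a stride of -4 bytes gives -1, which the old code classified separately as IK_ReversePtrInduction.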

bool LoopVectorizationLegality::isInductionVariable(const Value *V) {		bool LoopVectorizationLegality::isInductionVariable(const Value *V) {
Value *In0 = const_cast<Value*>(V);		Value *In0 = const_cast<Value*>(V);
PHINode *PN = dyn_cast_or_null<PHINode>(In0);		PHINode *PN = dyn_cast_or_null<PHINode>(In0);
if (!PN)		if (!PN)
return false;		return false;

[... 965 lines collapsed ...]
Value *InnerLoopUnroller::reverseVector(Value *Vec) {		Value *InnerLoopUnroller::reverseVector(Value *Vec) {
return Vec;		return Vec;
}		}

Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) {		Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) {
return V;		return V;
}		}

Value *InnerLoopUnroller::getConsecutiveVector(Value *Val, int StartIdx,		Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step) {
bool Negate) {
// When unrolling and the VF is 1, we only need to add a simple scalar.		// When unrolling and the VF is 1, we only need to add a simple scalar.
Type *ITy = Val->getType();		Type *ITy = Val->getType();
assert(!ITy->isVectorTy() && "Val must be a scalar");		assert(!ITy->isVectorTy() && "Val must be a scalar");
Constant *C = ConstantInt::get(ITy, StartIdx, Negate);		Constant *C = ConstantInt::get(ITy, StartIdx);
return Builder.CreateAdd(Val, C, "induction");		return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");
}		}
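
As a rough illustration of what getStepVector computes, the sketch below models it with plain integers. The scalar case (VF == 1) follows the InnerLoopUnroller code above: Val + StartIdx * Step. The per-lane formula for VF > 1 is an assumption here, since the vector version is not shown in this part of the diff; `stepVectorModel` is a hypothetical name, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical integer model of getStepVector(Val, StartIdx, Step).
// Assumption: lane i holds base + (startIdx + i) * step. For vf == 1 this
// reduces to the unroller's Val + StartIdx * Step shown in the diff.
std::vector<int64_t> stepVectorModel(int64_t base, int startIdx, int64_t step,
                                     unsigned vf) {
  assert(vf >= 1 && "vectorization factor must be at least 1");
  std::vector<int64_t> lanes;
  lanes.reserve(vf);
  for (unsigned lane = 0; lane < vf; ++lane) {
    int64_t idx = static_cast<int64_t>(startIdx) + static_cast<int64_t>(lane);
    lanes.push_back(base + idx * step);
  }
  return lanes;
}
```

With the call pattern used in the Trunc case, getStepVector(Broadcasted, VF * Part, Step), part 1 of a VF = 4 loop with step 2 would start at index 4 and, under this model, hold base + {8, 10, 12, 14}.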

test/Transforms/LoopVectorize/arbitrary-induction-step.ll

This file was added.

				; RUN: opt -S < %s -loop-vectorize 2>&1 | FileCheck %s
				; RUN: opt -S < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 | FileCheck %s --check-prefix=FORCE-VEC

				target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64--linux-gnueabi"

				; Test integer induction variable of step 2:
				; for (int i = 0; i < 1024; i+=2) {
				hfinkel: If you're checking debug output, you'll need to add: ; REQUIRES: asserts to the entire test. Please break this out into a separate test file, if you'd like to do this, to limit the reduction in test coverage for non-asserts builds.
				HaoLiu (author): Hal, thanks for letting me know about this. Since I don't think it's worth adding another file just to test the debug info, I removed those 'CHECK' lines and '-debug-only'.
				; int tmp = *A++;
				; sum += i * tmp;
				; }

				; CHECK-LABEL: @ind_plus2(
				; CHECK: load <4 x i32>*
				; CHECK: load <4 x i32>*
				; CHECK: mul nsw <4 x i32>
				; CHECK: mul nsw <4 x i32>
				; CHECK: add nsw <4 x i32>
				; CHECK: add nsw <4 x i32>
				; CHECK: %index.next = add i64 %index, 8
				; CHECK: icmp eq i64 %index.next, 512

				; FORCE-VEC-LABEL: @ind_plus2(
				; FORCE-VEC: %wide.load = load <2 x i32>*
				; FORCE-VEC: mul nsw <2 x i32>
				; FORCE-VEC: add nsw <2 x i32>
				; FORCE-VEC: %index.next = add i64 %index, 2
				; FORCE-VEC: icmp eq i64 %index.next, 512
				define i32 @ind_plus2(i32* %A) {
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%A.addr = phi i32* [ %A, %entry ], [ %inc.ptr, %for.body ]
				%i = phi i32 [ 0, %entry ], [ %add1, %for.body ]
				%sum = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%inc.ptr = getelementptr inbounds i32* %A.addr, i64 1
				%0 = load i32* %A.addr, align 4
				%mul = mul nsw i32 %0, %i
				%add = add nsw i32 %mul, %sum
				%add1 = add nsw i32 %i, 2
				%cmp = icmp slt i32 %add1, 1024
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				%add.lcssa = phi i32 [ %add, %for.body ]
				ret i32 %add.lcssa
				}
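
The constants in the CHECK lines for @ind_plus2 can be reproduced with a small arithmetic sketch. The helper names below are hypothetical, and the default configuration of VF = 4 with an interleave count of 2 is only inferred from the two <4 x i32> loads and the "add i64 %index, 8" in the expected output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of LLVM.

// Number of iterations of: for (i = start; i < end; i += step), step > 0.
int64_t tripCount(int64_t start, int64_t end, int64_t step) {
  assert(step > 0 && "this sketch only models a positive step");
  if (start >= end)
    return 0;
  return (end - start + step - 1) / step;
}

// The canonical index advances by VF * UF per vector-loop iteration.
int64_t indexIncrement(unsigned vf, unsigned uf) {
  return static_cast<int64_t>(vf) * uf;
}
```

Here tripCount(0, 1024, 2) is 512, matching "icmp eq i64 %index.next, 512", while indexIncrement(4, 2) is 8 for the default run and indexIncrement(2, 1) is 2 for the FORCE-VEC run.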


				; Test integer induction variable of step -2:
				; for (int i = 1024; i > 0; i-=2) {
				; int tmp = *A++;
				; sum += i * tmp;
				; }

				; CHECK-LABEL: @ind_minus2(
				; CHECK: load <4 x i32>*
				; CHECK: load <4 x i32>*
				; CHECK: mul nsw <4 x i32>
				; CHECK: mul nsw <4 x i32>
				; CHECK: add nsw <4 x i32>
				; CHECK: add nsw <4 x i32>
				; CHECK: %index.next = add i64 %index, 8
				; CHECK: icmp eq i64 %index.next, 512

				; FORCE-VEC-LABEL: @ind_minus2(
				; FORCE-VEC: %wide.load = load <2 x i32>*
				; FORCE-VEC: mul nsw <2 x i32>
				; FORCE-VEC: add nsw <2 x i32>
				; FORCE-VEC: %index.next = add i64 %index, 2
				; FORCE-VEC: icmp eq i64 %index.next, 512
				define i32 @ind_minus2(i32* %A) {
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%A.addr = phi i32* [ %A, %entry ], [ %inc.ptr, %for.body ]
				%i = phi i32 [ 1024, %entry ], [ %sub, %for.body ]
				%sum = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%inc.ptr = getelementptr inbounds i32* %A.addr, i64 1
				%0 = load i32* %A.addr, align 4
				%mul = mul nsw i32 %0, %i
				%add = add nsw i32 %mul, %sum
				%sub = add nsw i32 %i, -2
				%cmp = icmp sgt i32 %i, 2
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				%add.lcssa = phi i32 [ %add, %for.body ]
				ret i32 %add.lcssa
				}


				; Test pointer induction variable of step 2. As currently we don't support
				; masked load/store, vectorization is possible but not beneficial. If loop
				; vectorization is not enforced, LV will only do interleave.
				; for (int i = 0; i < 1024; i++) {
				; int tmp0 = *A++;
				; int tmp1 = *A++;
				; sum += tmp0 * tmp1;
				; }

				; CHECK-LABEL: @ptr_ind_plus2(
				; CHECK: load i32*
				; CHECK: load i32*
				; CHECK: load i32*
				; CHECK: load i32*
				; CHECK: mul nsw i32
				; CHECK: mul nsw i32
				; CHECK: add nsw i32
				; CHECK: add nsw i32
				; CHECK: %index.next = add i64 %index, 2
				; CHECK: %21 = icmp eq i64 %index.next, 1024

				; FORCE-VEC-LABEL: @ptr_ind_plus2(
				; FORCE-VEC: load i32*
				; FORCE-VEC: insertelement <2 x i32>
				; FORCE-VEC: load i32*
				; FORCE-VEC: insertelement <2 x i32>
				; FORCE-VEC: load i32*
				; FORCE-VEC: insertelement <2 x i32>
				; FORCE-VEC: load i32*
				; FORCE-VEC: insertelement <2 x i32>
				; FORCE-VEC: mul nsw <2 x i32>
				; FORCE-VEC: add nsw <2 x i32>
				; FORCE-VEC: %index.next = add i64 %index, 2
				; FORCE-VEC: icmp eq i64 %index.next, 1024
				define i32 @ptr_ind_plus2(i32* %A) {
				entry:
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%A.addr = phi i32* [ %A, %entry ], [ %inc.ptr1, %for.body ]
				%sum = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%i = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%inc.ptr = getelementptr inbounds i32* %A.addr, i64 1
				%0 = load i32* %A.addr, align 4
				%inc.ptr1 = getelementptr inbounds i32* %A.addr, i64 2
				%1 = load i32* %inc.ptr, align 4
				%mul = mul nsw i32 %1, %0
				%add = add nsw i32 %mul, %sum
				%inc = add nsw i32 %i, 1
				%exitcond = icmp eq i32 %inc, 1024
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				%add.lcssa = phi i32 [ %add, %for.body ]
				ret i32 %add.lcssa
				}

test/Transforms/LoopVectorize/gcc-examples.ll

[... 382 lines collapsed ...]	; <label>:1 ; preds = %1, %0
%lftr.wideiv = trunc i64 %indvars.iv.next to i32		%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, 1024		%exitcond = icmp eq i32 %lftr.wideiv, 1024
br i1 %exitcond, label %4, label %1		br i1 %exitcond, label %4, label %1

; <label>:4 ; preds = %1		; <label>:4 ; preds = %1
ret void		ret void
}		}

; Can't vectorize because of reductions.
;CHECK-LABEL: @example13(		;CHECK-LABEL: @example13(
;CHECK-NOT: <4 x i32>		;CHECK: <4 x i32>
;CHECK: ret void		;CHECK: ret void
define void @example13(i32** nocapture %A, i32** nocapture %B, i32* nocapture %out) nounwind uwtable ssp {		define void @example13(i32** nocapture %A, i32** nocapture %B, i32* nocapture %out) nounwind uwtable ssp {
br label %.preheader		br label %.preheader

.preheader: ; preds = %14, %0		.preheader: ; preds = %14, %0
%indvars.iv4 = phi i64 [ 0, %0 ], [ %indvars.iv.next5, %14 ]		%indvars.iv4 = phi i64 [ 0, %0 ], [ %indvars.iv.next5, %14 ]
%1 = getelementptr inbounds i32** %A, i64 %indvars.iv4		%1 = getelementptr inbounds i32** %A, i64 %indvars.iv4
%2 = load i32** %1, align 8		%2 = load i32** %1, align 8
[... 286 lines collapsed ...]

test/Transforms/LoopVectorize/reverse_induction.ll

	[... 91 lines collapsed ...]
	; --reverse_induction;			; --reverse_induction;
	; }			; }
	; }			; }

	; CHECK-LABEL: @reverse_forward_induction_i64_i8(			; CHECK-LABEL: @reverse_forward_induction_i64_i8(
	; CHECK: vector.body			; CHECK: vector.body
	; CHECK: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]			; CHECK: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
	; CHECK: %normalized.idx = sub i64 %index, 0			; CHECK: %normalized.idx = sub i64 %index, 0
	; CHECK: %reverse.idx = sub i64 1023, %normalized.idx			; CHECK: %offset.idx = sub i64 1023, %normalized.idx
	; CHECK: trunc i64 %index to i8			; CHECK: trunc i64 %index to i8

	define void @reverse_forward_induction_i64_i8() {			define void @reverse_forward_induction_i64_i8() {
	entry:			entry:
	br label %while.body			br label %while.body

	while.body:			while.body:
	%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %while.body ]			%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %while.body ]
	[... 10 lines collapsed ...]
	while.end:			while.end:
	ret void			ret void
	}			}

	; CHECK-LABEL: @reverse_forward_induction_i64_i8_signed(			; CHECK-LABEL: @reverse_forward_induction_i64_i8_signed(
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK: %index = phi i64 [ 129, %vector.ph ], [ %index.next, %vector.body ]			; CHECK: %index = phi i64 [ 129, %vector.ph ], [ %index.next, %vector.body ]
	; CHECK: %normalized.idx = sub i64 %index, 129			; CHECK: %normalized.idx = sub i64 %index, 129
	; CHECK: %reverse.idx = sub i64 1023, %normalized.idx			; CHECK: %offset.idx = sub i64 1023, %normalized.idx
	; CHECK: trunc i64 %index to i8			; CHECK: trunc i64 %index to i8

	define void @reverse_forward_induction_i64_i8_signed() {			define void @reverse_forward_induction_i64_i8_signed() {
	entry:			entry:
	br label %while.body			br label %while.body

	while.body:			while.body:
	%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %while.body ]			%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %while.body ]
	[... 13 lines collapsed ...]