This is an archive of the discontinued LLVM Phabricator instance.

Induction variables: support arbitrary constant step
Needs Review, Public

Authored by volkalexey on Oct 31 2014, 5:25 AM.

Details

Reviewers

nadav
hfinkel

Summary

Currently, induction variables may only have a step of +1 or -1.
This patch adds support for arbitrary constant step values.
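
To make the summary concrete, the following minimal sketch models in plain C++ how the lanes of a widened induction variable could be computed for an arbitrary constant step. It mirrors the arithmetic of the patched getConsecutiveVector in the diff below (a constant index vector multiplied by a splat of the step, then added to the broadcast value). The helper name widenInduction and its parameters are illustrative assumptions, not LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Scalar model of one widened induction value (hypothetical helper, not LLVM API).
//   base:     scalar induction value broadcast to every lane
//   startIdx: index of the first lane (VF * part for unroll part `part`)
//   step:     constant step of the induction variable
//   vf:       vectorization factor (number of lanes)
std::vector<int64_t> widenInduction(int64_t base, int startIdx, int64_t step,
                                    int vf) {
  assert(vf > 0 && "need at least one lane");
  std::vector<int64_t> lanes;
  lanes.reserve(vf);
  // Lane i holds base + (startIdx + i) * step.
  for (int i = 0; i < vf; ++i)
    lanes.push_back(base + (static_cast<int64_t>(startIdx) + i) * step);
  return lanes;
}
```

With a base of 0, a step of 8 and four lanes, the lanes are 0, 8, 16, 24. With a base of 10, a first-lane index of 4 and a step of -1, they are 6, 5, 4, 3.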

Diff Detail

Event Timeline

volkalexey updated this revision to Diff 15609. Oct 31 2014, 5:25 AM

volkalexey retitled this revision from to Induction variables: support arbitrary constant step.

volkalexey updated this object.

volkalexey edited the test plan for this revision.

volkalexey added reviewers: nadav, hfinkel.

volkalexey set the repository for this revision to rL LLVM.

volkalexey added subscribers: Unknown Object (MLST), zinovy.nis.

Can you please include a more-verbose description of what's going on here? Since we don't have masked stores, I assume this affects only loops without stores (which means reductions). For reductions with non-unit stride, we can indeed load data in larger vectors and then only use some of the vector lanes for the reduced value (and I agree this is a useful capability to have), but for that I'd expect to see changes in the code after this comment:

if (VF > 1) {
  // VF is a power of 2 so we can emit the reduction using log2(VF) shuffles
  // and vector ops, reducing the set of values being computed by half each
  // round.

but I don't see any changes there, so I don't understand how you're selecting out only the correct vector lanes.
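
The reduction scheme in the quoted comment, and the lane selection the comment asks about, can be sketched as a scalar model in C++. This is only an illustration under stated assumptions: reduceByHalving models the log2(VF) halving rounds for an add reduction, and maskToStride is a hypothetical helper showing one possible way to keep only the lanes that correspond to real loop iterations. Neither is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Add-reduce a vector whose length is a power of two by halving the number
// of live values each round, as the quoted comment describes.
int64_t reduceByHalving(std::vector<int64_t> v) {
  assert(!v.empty() && (v.size() & (v.size() - 1)) == 0 &&
         "length must be a power of two");
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2)
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half]; // models one shuffle plus one vector add
  return v[0];
}

// Hypothetical lane selection: keep every stride-th lane and replace the
// others with 0, the identity of addition, so they do not affect the sum.
std::vector<int64_t> maskToStride(std::vector<int64_t> v, std::size_t stride) {
  assert(stride > 0 && "stride must be positive");
  for (std::size_t i = 0; i < v.size(); ++i)
    if (i % stride != 0)
      v[i] = 0;
  return v;
}
```

For the lanes 1 through 8, the plain reduction yields 36, while masking to a stride of 2 first yields 1 + 3 + 5 + 7 = 16.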

Hi Alexey,

Did you verify the correctness and the performance impacts of this patch? You should run the LLVM test suite with your changes.

This patch looks very suspicious. The test that you are changing (example13) has code that looks like this:

for (j = 0; j < N; j+=8) {
  diff += (a[i][j] - b[i][j]);
}

How are you generating consecutive vector loads? This looks like a bug. Also, you did not explain what programs you are expecting to vectorize and how. This is a huge patch and you should explain what changes you made and why.

-Nadav
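
The concern about consecutive vector loads can be illustrated with a small C++ model of a single row of the loop above. This is a sketch of how the problem could manifest, not a reproduction of what the patch generates: sumScalar follows the scalar loop, while sumWithConsecutiveLoads models a hypothetical incorrect widening that loads VF adjacent elements per vector iteration even though the index advances by the step. All names here are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a row whose element values equal their indices: 0, 1, ..., n-1.
std::vector<int64_t> makeRow(std::size_t n) {
  std::vector<int64_t> row(n);
  for (std::size_t i = 0; i < n; ++i)
    row[i] = static_cast<int64_t>(i);
  return row;
}

// Reference semantics: visit a[0], a[step], a[2*step], ...
int64_t sumScalar(const std::vector<int64_t> &a, std::size_t step) {
  assert(step > 0 && "step must be positive");
  int64_t sum = 0;
  for (std::size_t j = 0; j < a.size(); j += step)
    sum += a[j];
  return sum;
}

// Incorrect widening (assumed for illustration): each vector iteration reads
// the consecutive elements a[j..j+vf-1] while j advances by vf * step.
int64_t sumWithConsecutiveLoads(const std::vector<int64_t> &a,
                                std::size_t step, std::size_t vf) {
  assert(step > 0 && vf > 0 && "step and vf must be positive");
  int64_t sum = 0;
  for (std::size_t j = 0; j + vf <= a.size(); j += vf * step)
    for (std::size_t lane = 0; lane < vf; ++lane)
      sum += a[j + lane];
  return sum;
}
```

On a 32-element row with a step of 8 and VF = 4, the scalar loop sums a[0], a[8], a[16], a[24] and yields 48, whereas the consecutive-load model sums a[0] through a[3] and yields 6. A correct vectorization would need a strided (gather-like) access or lane selection instead.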

Hi Alexey,

I'm investigating a similar issue, and I think your patch is great. I tested it for correctness on several benchmarks, but there are some failures in LNT (ClamAV, TSVC) and SPEC CPU2000 (164.gzip, 256.bzip2).
Since this patch was sent two months ago, I'm not sure whether you are still working on it. If not, I'd like to help fix these failures and send it out again.

Thanks,
-Hao

Hi Hao,

I am not working on this patch now.
You can take it and fix the problems.

Thanks,
Alexey

rengolin edited edge metadata. Jan 7 2015, 2:50 AM

rengolin added a subscriber: rengolin.

• HaoLiu mentioned this in D7193: [LoopVectorize] Induction variables: support arbitrary constant step. Jan 26 2015, 10:36 PM

Revision Contents

Path	Size
lib/Transforms/Vectorize/LoopVectorize.cpp	247 lines
test/Transforms/LoopVectorize/const-induction.ll	45 lines
test/Transforms/LoopVectorize/gcc-examples.ll	3 lines

Diff 15609

lib/Transforms/Vectorize/LoopVectorize.cpp

(349 lines not shown)
protected:

/// Create a broadcast instruction. This method generates a broadcast		/// Create a broadcast instruction. This method generates a broadcast
/// instruction (shuffle) for loop invariant values and for the induction		/// instruction (shuffle) for loop invariant values and for the induction
/// value. If this is the induction variable then we extend it to N, N+1, ...		/// value. If this is the induction variable then we extend it to N, N+1, ...
/// this is needed because each iteration in the loop corresponds to a SIMD		/// this is needed because each iteration in the loop corresponds to a SIMD
/// element.		/// element.
virtual Value *getBroadcastInstrs(Value *V);		virtual Value *getBroadcastInstrs(Value *V);

/// This function adds 0, 1, 2 ... to each vector element, starting at zero.		/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)
/// If Negate is set then negative numbers are added e.g. (0, -1, -2, ...).		/// to each vector element of Val.
/// The sequence starts at StartIndex.		virtual Value *getConsecutiveVector(Value *Val, int StartIdx, Value *Step);
virtual Value *getConsecutiveVector(Value *Val, int StartIdx, bool Negate);

/// When we go over instructions in the basic block we rely on previous		/// When we go over instructions in the basic block we rely on previous
/// values within the current basic block or on loop invariant values.		/// values within the current basic block or on loop invariant values.
/// When we widen (vectorize) values we place them in the map. If the values		/// When we widen (vectorize) values we place them in the map. If the values
/// are not within the map, they have to be loop invariant, so we simply		/// are not within the map, they have to be loop invariant, so we simply
/// broadcast them into a vector.		/// broadcast them into a vector.
VectorParts &getVectorValue(Value *V);		VectorParts &getVectorValue(Value *V);

(104 lines not shown)
InnerLoopUnroller(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
const TargetLibraryInfo *TLI, unsigned UnrollFactor) :		const TargetLibraryInfo *TLI, unsigned UnrollFactor) :
InnerLoopVectorizer(OrigLoop, SE, LI, DT, DL, TLI, 1, UnrollFactor) { }		InnerLoopVectorizer(OrigLoop, SE, LI, DT, DL, TLI, 1, UnrollFactor) { }

private:		private:
void scalarizeInstruction(Instruction *Instr,		void scalarizeInstruction(Instruction *Instr,
bool IfPredicateStore = false) override;		bool IfPredicateStore = false) override;
void vectorizeMemoryInstruction(Instruction *Instr) override;		void vectorizeMemoryInstruction(Instruction *Instr) override;
Value *getBroadcastInstrs(Value *V) override;		Value *getBroadcastInstrs(Value *V) override;
Value *getConsecutiveVector(Value *Val, int StartIdx, bool Negate) override;		Value *getConsecutiveVector(Value *Val, int StartIdx, Value *Step) override;
Value *reverseVector(Value *Vec) override;		Value *reverseVector(Value *Vec) override;
};		};

/// \brief Look for a meaningful debug location on the instruction or it's		/// \brief Look for a meaningful debug location on the instruction or it's
/// operands.		/// operands.
static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {		static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {
if (!I)		if (!I)
return I;		return I;
(107 lines not shown)
enum ReductionKind {
RK_FloatAdd, ///< Sum of floats.		RK_FloatAdd, ///< Sum of floats.
RK_FloatMult, ///< Product of floats.		RK_FloatMult, ///< Product of floats.
RK_FloatMinMax ///< Min/max implemented in terms of select(cmp()).		RK_FloatMinMax ///< Min/max implemented in terms of select(cmp()).
};		};

/// This enum represents the kinds of inductions that we support.		/// This enum represents the kinds of inductions that we support.
enum InductionKind {		enum InductionKind {
IK_NoInduction, ///< Not an induction variable.		IK_NoInduction, ///< Not an induction variable.
IK_IntInduction, ///< Integer induction variable. Step = 1.		IK_IntInduction, ///< Integer induction variable. Step = C.
IK_ReverseIntInduction, ///< Reverse int induction variable. Step = -1.		IK_PtrInduction ///< Pointer induction var. Step = C * sizeof(elem).
IK_PtrInduction, ///< Pointer induction var. Step = sizeof(elem).
IK_ReversePtrInduction ///< Reverse ptr indvar. Step = - sizeof(elem).
};		};

// This enum represents the kind of minmax reduction.		// This enum represents the kind of minmax reduction.
enum MinMaxReductionKind {		enum MinMaxReductionKind {
MRK_Invalid,		MRK_Invalid,
MRK_UIntMin,		MRK_UIntMin,
MRK_UIntMax,		MRK_UIntMax,
MRK_SIntMin,		MRK_SIntMin,
(73 lines not shown)
struct RuntimePointerCheck {
/// shared underlying object.		/// shared underlying object.
SmallVector<unsigned, 2> DependencySetId;		SmallVector<unsigned, 2> DependencySetId;
/// Holds the id of the disjoint alias set to which this pointer belongs.		/// Holds the id of the disjoint alias set to which this pointer belongs.
SmallVector<unsigned, 2> AliasSetId;		SmallVector<unsigned, 2> AliasSetId;
};		};

/// A struct for saving information about induction variables.		/// A struct for saving information about induction variables.
struct InductionInfo {		struct InductionInfo {
InductionInfo(Value *Start, InductionKind K) : StartValue(Start), IK(K) {}		InductionInfo(Value *Start, InductionKind K, ConstantInt *Step)
InductionInfo() : StartValue(nullptr), IK(IK_NoInduction) {}		: StartValue(Start), IK(K), StepValue(Step) {
		assert(IK != IK_NoInduction && "Not an induction");
		assert(StartValue && "StartValue is null");
		assert(StepValue && "StepValue is null");
		assert((IK != IK_IntInduction ||
		StartValue->getType() == StepValue->getType()) &&
		"StartValue type does not match StepValue type");
		assert((IK != IK_PtrInduction ||
		StartValue->getType()->isPointerTy()) &&
		"StartValue is not a pointer");
		assert((IK != IK_PtrInduction ||
		StepValue->getType()->isIntegerTy()) &&
		"StepValue is not an integer");
		}
		InductionInfo()
		: StartValue(nullptr), IK(IK_NoInduction), StepValue(nullptr) {}

		/// Return true if StepValue is a negative constant.
		bool isReverse() const { return StepValue && StepValue->isNegative(); }

		/// Compute the transformed value of Index at offset StartValue using step
		/// StepValue.
		Value *transform(IRBuilder<> &B, Value *Index) const {
		switch (IK) {
		case IK_IntInduction:
		assert(Index->getType() == StartValue->getType() &&
		"Index type does not match StartValue type");
		if (StepValue->isMinusOne())
		return B.CreateSub(StartValue, Index);
		if (!StepValue->isOne())
		Index = B.CreateMul(Index, StepValue);
		return B.CreateAdd(StartValue, Index);

		case IK_PtrInduction:
		if (StepValue->isMinusOne())
		Index = B.CreateNeg(Index);
		else if (!StepValue->isOne())
		Index = B.CreateMul(Index, StepValue);
		return B.CreateGEP(StartValue, Index);

		case IK_NoInduction:
		break;
		default:
		llvm_unreachable("Unknown induction");
		}
		}

/// Start value.		/// Start value.
TrackingVH<Value> StartValue;		TrackingVH<Value> StartValue;
/// Induction kind.		/// Induction kind.
InductionKind IK;		InductionKind IK;
		/// Step value.
		ConstantInt *StepValue;
};		};

/// ReductionList contains the reduction descriptors for all		/// ReductionList contains the reduction descriptors for all
/// of the reductions that were found in the loop.		/// of the reductions that were found in the loop.
typedef DenseMap<PHINode*, ReductionDescriptor> ReductionList;		typedef DenseMap<PHINode*, ReductionDescriptor> ReductionList;

/// InductionList saves induction variables and maps them to the		/// InductionList saves induction variables and maps them to the
/// induction descriptor.		/// induction descriptor.
(90 lines not shown)
private:
ReductionInstDesc isReductionInstr(Instruction *I, ReductionKind Kind,		ReductionInstDesc isReductionInstr(Instruction *I, ReductionKind Kind,
ReductionInstDesc &Desc);		ReductionInstDesc &Desc);
/// Returns true if the instruction is a Select(ICmp(X, Y), X, Y) instruction		/// Returns true if the instruction is a Select(ICmp(X, Y), X, Y) instruction
/// pattern corresponding to a min(X, Y) or max(X, Y).		/// pattern corresponding to a min(X, Y) or max(X, Y).
static ReductionInstDesc isMinMaxSelectCmpPattern(Instruction *I,		static ReductionInstDesc isMinMaxSelectCmpPattern(Instruction *I,
ReductionInstDesc &Prev);		ReductionInstDesc &Prev);
/// Returns the induction kind of Phi. This function may return NoInduction		/// Returns the induction kind of Phi. This function may return NoInduction
/// if the PHI is not an induction variable.		/// if the PHI is not an induction variable.
InductionKind isInductionVariable(PHINode *Phi);		InductionKind isInductionVariable(PHINode *Phi, ConstantInt *&StepValue);

/// \brief Collect memory access with loop invariant strides.		/// \brief Collect memory access with loop invariant strides.
///		///
/// Looks for accesses like "a[i * StrideA]" where "StrideA" is loop		/// Looks for accesses like "a[i * StrideA]" where "StrideA" is loop
/// invariant.		/// invariant.
void collectStridedAcccess(Value *LoadOrStoreInst);		void collectStridedAcccess(Value *LoadOrStoreInst);

/// Report an analysis message to assist the user in diagnosing loops that are		/// Report an analysis message to assist the user in diagnosing loops that are
(748 lines not shown)
Value *InnerLoopVectorizer::getBroadcastInstrs(Value *V) {

// Broadcast the scalar into all locations in the vector.		// Broadcast the scalar into all locations in the vector.
Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");		Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");

return Shuf;		return Shuf;
}		}

Value *InnerLoopVectorizer::getConsecutiveVector(Value *Val, int StartIdx,		Value *InnerLoopVectorizer::getConsecutiveVector(Value *Val, int StartIdx,
bool Negate) {		Value* Step) {
assert(Val->getType()->isVectorTy() && "Must be a vector");		assert(Val->getType()->isVectorTy() && "Must be a vector");
assert(Val->getType()->getScalarType()->isIntegerTy() &&		assert(Val->getType()->getScalarType()->isIntegerTy() &&
"Elem must be an integer");		"Elem must be an integer");
		assert(Step->getType() == Val->getType()->getScalarType() &&
		"Step has wrong type");
// Create the types.		// Create the types.
Type *ITy = Val->getType()->getScalarType();		Type *ITy = Val->getType()->getScalarType();
VectorType *Ty = cast<VectorType>(Val->getType());		VectorType *Ty = cast<VectorType>(Val->getType());
int VLen = Ty->getNumElements();		int VLen = Ty->getNumElements();
SmallVector<Constant*, 8> Indices;		SmallVector<Constant*, 8> Indices;

// Create a vector of consecutive numbers from zero to VF.		// Create a vector of consecutive numbers from zero to VF.
for (int i = 0; i < VLen; ++i) {		for (int i = 0; i < VLen; ++i)
int64_t Idx = Negate ? (-i) : i;		Indices.push_back(ConstantInt::get(ITy, StartIdx + i));
Indices.push_back(ConstantInt::get(ITy, StartIdx + Idx, Negate));
}

// Add the consecutive indices to the vector value.		// Add the consecutive indices to the vector value.
Constant *Cv = ConstantVector::get(Indices);		Constant *Cv = ConstantVector::get(Indices);
assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");		assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");
return Builder.CreateAdd(Val, Cv, "induction");		Step = Builder.CreateVectorSplat(VLen, Step);
		assert(Step->getType() == Val->getType() && "Invalid step vec");
		Step = Builder.CreateMul(Cv, Step);
		return Builder.CreateAdd(Val, Step, "induction");
}		}

/// \brief Find the operand of the GEP that should be checked for consecutive		/// \brief Find the operand of the GEP that should be checked for consecutive
/// stores. This ignores trailing indices that have no effect on the final		/// stores. This ignores trailing indices that have no effect on the final
/// pointer.		/// pointer.
static unsigned getGEPInductionOperand(const DataLayout *DL,		static unsigned getGEPInductionOperand(const DataLayout *DL,
const GetElementPtrInst *Gep) {		const GetElementPtrInst *Gep) {
unsigned LastOperand = Gep->getNumOperands() - 1;		unsigned LastOperand = Gep->getNumOperands() - 1;
(21 lines not shown)
int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {
// Make sure that the pointer does not point to structs.		// Make sure that the pointer does not point to structs.
if (Ptr->getType()->getPointerElementType()->isAggregateType())		if (Ptr->getType()->getPointerElementType()->isAggregateType())
return 0;		return 0;

// If this value is a pointer induction variable we know it is consecutive.		// If this value is a pointer induction variable we know it is consecutive.
PHINode *Phi = dyn_cast_or_null<PHINode>(Ptr);		PHINode *Phi = dyn_cast_or_null<PHINode>(Ptr);
if (Phi && Inductions.count(Phi)) {		if (Phi && Inductions.count(Phi)) {
InductionInfo II = Inductions[Phi];		InductionInfo II = Inductions[Phi];
if (IK_PtrInduction == II.IK)		return II.isReverse() ? -1 : 1;
return 1;
else if (IK_ReversePtrInduction == II.IK)
return -1;
}		}

GetElementPtrInst *Gep = dyn_cast_or_null<GetElementPtrInst>(Ptr);		GetElementPtrInst *Gep = dyn_cast_or_null<GetElementPtrInst>(Ptr);
if (!Gep)		if (!Gep)
return 0;		return 0;

unsigned NumOperands = Gep->getNumOperands();		unsigned NumOperands = Gep->getNumOperands();
Value *GpPtr = Gep->getPointerOperand();		Value *GpPtr = Gep->getPointerOperand();
// If this GEP value is a consecutive pointer induction variable and all of		// If this GEP value is a consecutive pointer induction variable and all of
// the indices are constant then we know it is consecutive. We can		// the indices are constant then we know it is consecutive. We can
Phi = dyn_cast<PHINode>(GpPtr);		Phi = dyn_cast<PHINode>(GpPtr);
if (Phi && Inductions.count(Phi)) {		if (Phi && Inductions.count(Phi)) {

// Make sure that the pointer does not point to structs.		// Make sure that the pointer does not point to structs.
PointerType *GepPtrType = cast<PointerType>(GpPtr->getType());		PointerType *GepPtrType = cast<PointerType>(GpPtr->getType());
if (GepPtrType->getElementType()->isAggregateType())		if (GepPtrType->getElementType()->isAggregateType())
return 0;		return 0;

// Make sure that all of the index operands are loop invariant.		// Make sure that all of the index operands are loop invariant.
for (unsigned i = 1; i < NumOperands; ++i)		for (unsigned i = 1; i < NumOperands; ++i)
if (!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))		if (!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))
return 0;		return 0;

InductionInfo II = Inductions[Phi];		InductionInfo II = Inductions[Phi];
if (IK_PtrInduction == II.IK)		return II.isReverse() ? -1 : 1;
return 1;
else if (IK_ReversePtrInduction == II.IK)
return -1;
}		}

unsigned InductionOperand = getGEPInductionOperand(DL, Gep);		unsigned InductionOperand = getGEPInductionOperand(DL, Gep);

// Check that all of the gep indices are uniform except for our induction		// Check that all of the gep indices are uniform except for our induction
// operand.		// operand.
for (unsigned i = 0; i != NumOperands; ++i)		for (unsigned i = 0; i != NumOperands; ++i)
if (i != InductionOperand &&		if (i != InductionOperand &&
(783 lines not shown)
case LoopVectorizationLegality::IK_IntInduction: {
break;		break;
}		}

// Not the canonical induction variable - add the vector loop count to the		// Not the canonical induction variable - add the vector loop count to the
// start value.		// start value.
Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,		Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,
II.StartValue->getType(),		II.StartValue->getType(),
"cast.crd");		"cast.crd");
EndValue = BypassBuilder.CreateAdd(CRD, II.StartValue , "ind.end");		EndValue = II.transform(BypassBuilder, CRD);
break;		EndValue->setName("ind.end");
}
case LoopVectorizationLegality::IK_ReverseIntInduction: {
// Convert the CountRoundDown variable to the PHI size.
Value *CRD = BypassBuilder.CreateSExtOrTrunc(CountRoundDown,
II.StartValue->getType(),
"cast.crd");
// Handle reverse integer induction counter.
EndValue = BypassBuilder.CreateSub(II.StartValue, CRD, "rev.ind.end");
break;		break;
}		}
case LoopVectorizationLegality::IK_PtrInduction: {		case LoopVectorizationLegality::IK_PtrInduction: {
// For pointer induction variables, calculate the offset using		EndValue = II.transform(BypassBuilder, CountRoundDown);
// the end index.		EndValue->setName("ptr.ind.end");
EndValue = BypassBuilder.CreateGEP(II.StartValue, CountRoundDown,
"ptr.ind.end");
break;
}
case LoopVectorizationLegality::IK_ReversePtrInduction: {
// The value at the end of the loop for the reverse pointer is calculated
// by creating a GEP with a negative index starting from the start value.
Value *Zero = ConstantInt::get(CountRoundDown->getType(), 0);
Value *NegIdx = BypassBuilder.CreateSub(Zero, CountRoundDown,
"rev.ind.end");
EndValue = BypassBuilder.CreateGEP(II.StartValue, NegIdx,
"rev.ptr.ind.end");
break;		break;
}		}
}// end of case		}// end of case

// The new PHI merges the original incoming value, in case of a bypass,		// The new PHI merges the original incoming value, in case of a bypass,
// or the value at the end of the vectorized loop.		// or the value at the end of the vectorized loop.
for (unsigned I = 1, E = LoopBypassBlocks.size(); I != E; ++I) {		for (unsigned I = 1, E = LoopBypassBlocks.size(); I != E; ++I) {
if (OrigPhi == OldInduction)		if (OrigPhi == OldInduction)
(617 lines not shown)
case LoopVectorizationLegality::IK_IntInduction: {
// extend the type.		// extend the type.
Broadcasted = Builder.CreateTrunc(Induction, PhiTy);		Broadcasted = Builder.CreateTrunc(Induction, PhiTy);
} else {		} else {
// Handle other induction variables that are now based on the		// Handle other induction variables that are now based on the
// canonical one.		// canonical one.
Value *NormalizedIdx = Builder.CreateSub(Induction, ExtendedIdx,		Value *NormalizedIdx = Builder.CreateSub(Induction, ExtendedIdx,
"normalized.idx");		"normalized.idx");
NormalizedIdx = Builder.CreateSExtOrTrunc(NormalizedIdx, PhiTy);		NormalizedIdx = Builder.CreateSExtOrTrunc(NormalizedIdx, PhiTy);
Broadcasted = Builder.CreateAdd(II.StartValue, NormalizedIdx,		Broadcasted = II.transform(Builder, NormalizedIdx);
"offset.idx");		Broadcasted->setName(II.isReverse() ? "reverse.idx" : "offset.idx");
}		}
Broadcasted = getBroadcastInstrs(Broadcasted);		Broadcasted = getBroadcastInstrs(Broadcasted);
// After broadcasting the induction variable we need to make the vector		// After broadcasting the induction variable we need to make the vector
// consecutive by adding 0, 1, 2, etc.		// consecutive by adding 0, 1, 2, etc.
for (unsigned part = 0; part < UF; ++part)		for (unsigned part = 0; part < UF; ++part)
Entry[part] = getConsecutiveVector(Broadcasted, VF * part, false);		Entry[part] = getConsecutiveVector(Broadcasted, VF * part,
		II.StepValue);
return;		return;
}		}
case LoopVectorizationLegality::IK_ReverseIntInduction:
case LoopVectorizationLegality::IK_PtrInduction:		case LoopVectorizationLegality::IK_PtrInduction:
case LoopVectorizationLegality::IK_ReversePtrInduction:
// Handle reverse integer and pointer inductions.
Value *StartIdx = ExtendedIdx;
// This is the normalized GEP that starts counting at zero.
Value *NormalizedIdx = Builder.CreateSub(Induction, StartIdx,
"normalized.idx");

// Handle the reverse integer induction variable case.
if (LoopVectorizationLegality::IK_ReverseIntInduction == II.IK) {
IntegerType *DstTy = cast<IntegerType>(II.StartValue->getType());
Value *CNI = Builder.CreateSExtOrTrunc(NormalizedIdx, DstTy,
"resize.norm.idx");
Value *ReverseInd = Builder.CreateSub(II.StartValue, CNI,
"reverse.idx");

// This is a new value so do not hoist it out.
Value *Broadcasted = getBroadcastInstrs(ReverseInd);
// After broadcasting the induction variable we need to make the
// vector consecutive by adding ... -3, -2, -1, 0.
for (unsigned part = 0; part < UF; ++part)
Entry[part] = getConsecutiveVector(Broadcasted, -(int)VF * part,
true);
return;
}

// Handle the pointer induction variable case.		// Handle the pointer induction variable case.
assert(P->getType()->isPointerTy() && "Unexpected type.");		assert(P->getType()->isPointerTy() && "Unexpected type.");
		// This is the normalized GEP that starts counting at zero.
// Is this a reverse induction ptr or a consecutive induction ptr.		Value *NormalizedIdx = Builder.CreateSub(Induction, ExtendedIdx,
bool Reverse = (LoopVectorizationLegality::IK_ReversePtrInduction ==		"normalized.idx");
II.IK);

// This is the vector of results. Notice that we don't generate		// This is the vector of results. Notice that we don't generate
// vector geps because scalar geps result in better code.		// vector geps because scalar geps result in better code.
for (unsigned part = 0; part < UF; ++part) {		for (unsigned part = 0; part < UF; ++part) {
if (VF == 1) {		if (VF == 1) {
int EltIndex = (part) * (Reverse ? -1 : 1);		int EltIndex = part;
Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);		Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);
Value *GlobalIdx;		Value *GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx);
if (Reverse)		Value *SclrGep = II.transform(Builder, GlobalIdx);
GlobalIdx = Builder.CreateSub(Idx, NormalizedIdx, "gep.ridx");		SclrGep->setName("next.gep");
else
GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx, "gep.idx");

Value *SclrGep = Builder.CreateGEP(II.StartValue, GlobalIdx,
"next.gep");
Entry[part] = SclrGep;		Entry[part] = SclrGep;
continue;		continue;
}		}

Value *VecVal = UndefValue::get(VectorType::get(P->getType(), VF));		Value *VecVal = UndefValue::get(VectorType::get(P->getType(), VF));
for (unsigned int i = 0; i < VF; ++i) {		for (unsigned int i = 0; i < VF; ++i) {
int EltIndex = (i + part * VF) * (Reverse ? -1 : 1);		int EltIndex = i + part * VF;
Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);		Constant *Idx = ConstantInt::get(Induction->getType(), EltIndex);
Value *GlobalIdx;		Value *GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx);
if (!Reverse)		Value *SclrGep = II.transform(Builder, GlobalIdx);
GlobalIdx = Builder.CreateAdd(NormalizedIdx, Idx, "gep.idx");		SclrGep->setName("next.gep");
else
GlobalIdx = Builder.CreateSub(Idx, NormalizedIdx, "gep.ridx");

Value *SclrGep = Builder.CreateGEP(II.StartValue, GlobalIdx,
"next.gep");
VecVal = Builder.CreateInsertElement(VecVal, SclrGep,		VecVal = Builder.CreateInsertElement(VecVal, SclrGep,
Builder.getInt32(i),		Builder.getInt32(i),
"insert.gep");		"insert.gep");
}		}
Entry[part] = VecVal;		Entry[part] = VecVal;
}		}
return;		return;
}		}
}		}

void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {		void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {
// For each instruction in the old loop.		// For each instruction in the old loop.
for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {		for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {
VectorParts &Entry = WidenMap.get(it);		VectorParts &Entry = WidenMap.get(it);
switch (it->getOpcode()) {		switch (it->getOpcode()) {
case Instruction::Br:		case Instruction::Br:
// Nothing to do for PHIs and BR, since we already took care of the		// Nothing to do for PHIs and BR, since we already took care of the
// loop control flow instructions.		// loop control flow instructions.
continue;		continue;
case Instruction::PHI:{		case Instruction::PHI: {
// Vectorize PHINodes.		// Vectorize PHINodes.
widenPHIInstruction(it, Entry, UF, VF, PV);		widenPHIInstruction(it, Entry, UF, VF, PV);
continue;		continue;
}// End of PHI.		}// End of PHI.

case Instruction::Add:		case Instruction::Add:
case Instruction::FAdd:		case Instruction::FAdd:
case Instruction::Sub:		case Instruction::Sub:
(104 lines not shown)
case Instruction::BitCast: {
/// variable. Notice that we can only optimize the 'trunc' case		/// variable. Notice that we can only optimize the 'trunc' case
/// because: a. FP conversions lose precision, b. sext/zext may wrap,		/// because: a. FP conversions lose precision, b. sext/zext may wrap,
/// c. other casts depend on pointer size.		/// c. other casts depend on pointer size.
if (CI->getOperand(0) == OldInduction &&		if (CI->getOperand(0) == OldInduction &&
it->getOpcode() == Instruction::Trunc) {		it->getOpcode() == Instruction::Trunc) {
Value *ScalarCast = Builder.CreateCast(CI->getOpcode(), Induction,		Value *ScalarCast = Builder.CreateCast(CI->getOpcode(), Induction,
CI->getType());		CI->getType());
Value *Broadcasted = getBroadcastInstrs(ScalarCast);		Value *Broadcasted = getBroadcastInstrs(ScalarCast);
		LoopVectorizationLegality::InductionInfo II =
		Legal->getInductionVars()->lookup(OldInduction);
		Constant *Step = ConstantInt::getSigned(CI->getType(),
		II.StepValue->getSExtValue());
for (unsigned Part = 0; Part < UF; ++Part)		for (unsigned Part = 0; Part < UF; ++Part)
Entry[Part] = getConsecutiveVector(Broadcasted, VF * Part, false);		Entry[Part] = getConsecutiveVector(Broadcasted, VF * Part, Step);
propagateMetadata(Entry, it);		propagateMetadata(Entry, it);
break;		break;
}		}
/// Vectorize casts.		/// Vectorize casts.
Type *DestTy = (VF == 1) ? CI->getType() :		Type *DestTy = (VF == 1) ? CI->getType() :
VectorType::get(CI->getType(), VF);		VectorType::get(CI->getType(), VF);

VectorParts &A = getVectorValue(it->getOperand(0));		VectorParts &A = getVectorValue(it->getOperand(0));
(321 lines not shown)
for (BasicBlock::iterator it = (*bb)->begin(), e = (*bb)->end(); it != e;
emitAnalysis(Report(it)		emitAnalysis(Report(it)
<< "control flow not understood by vectorizer");		<< "control flow not understood by vectorizer");
DEBUG(dbgs() << "LV: Found an invalid PHI.\n");		DEBUG(dbgs() << "LV: Found an invalid PHI.\n");
return false;		return false;
}		}

// This is the value coming from the preheader.		// This is the value coming from the preheader.
Value *StartValue = Phi->getIncomingValueForBlock(PreHeader);		Value *StartValue = Phi->getIncomingValueForBlock(PreHeader);
		ConstantInt *StepValue = 0;
// Check if this is an induction variable.		// Check if this is an induction variable.
InductionKind IK = isInductionVariable(Phi);		InductionKind IK = isInductionVariable(Phi, StepValue);

if (IK_NoInduction != IK) {		if (IK_NoInduction != IK) {
// Get the widest type.		// Get the widest type.
if (!WidestIndTy)		if (!WidestIndTy)
WidestIndTy = convertPointerToIntegerType(*DL, PhiTy);		WidestIndTy = convertPointerToIntegerType(*DL, PhiTy);
else		else
WidestIndTy = getWiderType(*DL, PhiTy, WidestIndTy);		WidestIndTy = getWiderType(*DL, PhiTy, WidestIndTy);

// Int inductions are special because we only allow one IV.		// Int inductions are special because we only allow one IV.
if (IK == IK_IntInduction) {		if (IK == IK_IntInduction && !StepValue->isNegative()) {
// Use the phi node with the widest type as induction. Use the last		// Use the phi node with the widest type as induction. Use the last
// one if there are multiple (no good reason for doing this other		// one if there are multiple (no good reason for doing this other
// than it is expedient).		// than it is expedient).
if (!Induction || PhiTy == WidestIndTy)		if (!Induction || PhiTy == WidestIndTy)
Induction = Phi;		Induction = Phi;
}		}

DEBUG(dbgs() << "LV: Found an induction variable.\n");		DEBUG(dbgs() << "LV: Found an induction variable.\n");
Inductions[Phi] = InductionInfo(StartValue, IK);		Inductions[Phi] = InductionInfo(StartValue, IK, StepValue);

// Until we explicitly handle the case of an induction variable with		// Until we explicitly handle the case of an induction variable with
// an outside loop user we have to give up vectorizing this loop.		// an outside loop user we have to give up vectorizing this loop.
if (hasOutsideLoopUser(TheLoop, it, AllowedExit)) {		if (hasOutsideLoopUser(TheLoop, it, AllowedExit)) {
emitAnalysis(Report(it) << "use of induction value outside of the "		emitAnalysis(Report(it) << "use of induction value outside of the "
"loop is not handled by vectorizer");		"loop is not handled by vectorizer");
return false;		return false;
}		}
(1,524 lines not shown)
case Instruction::Select:
if (Kind != RK_IntegerMinMax &&		if (Kind != RK_IntegerMinMax &&
(!HasFunNoNaNAttr || Kind != RK_FloatMinMax))		(!HasFunNoNaNAttr || Kind != RK_FloatMinMax))
return ReductionInstDesc(false, I);		return ReductionInstDesc(false, I);
return isMinMaxSelectCmpPattern(I, Prev);		return isMinMaxSelectCmpPattern(I, Prev);
}		}
}		}

LoopVectorizationLegality::InductionKind		LoopVectorizationLegality::InductionKind
LoopVectorizationLegality::isInductionVariable(PHINode *Phi) {		LoopVectorizationLegality::isInductionVariable(PHINode *Phi,
		ConstantInt *&StepValue) {
		StepValue = 0;

Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.		// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())
return IK_NoInduction;		return IK_NoInduction;

// Check that the PHI is consecutive.		// Check that the PHI is consecutive.
const SCEV *PhiScev = SE->getSCEV(Phi);		const SCEV *PhiScev = SE->getSCEV(Phi);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);
if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return IK_NoInduction;		return IK_NoInduction;
}		}

const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);
		const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);
		if (!C)
		return IK_NoInduction;

		ConstantInt *CV = C->getValue();

// Integer inductions need to have a stride of one.
if (PhiTy->isIntegerTy()) {		if (PhiTy->isIntegerTy()) {
if (Step->isOne())		StepValue = CV;
return IK_IntInduction;		return IK_IntInduction;
if (Step->isAllOnesValue())
return IK_ReverseIntInduction;
return IK_NoInduction;
}		}

// Calculate the pointer stride and check if it is consecutive.
const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);
if (!C)
return IK_NoInduction;

assert(PhiTy->isPointerTy() && "The PHI must be a pointer");		assert(PhiTy->isPointerTy() && "The PHI must be a pointer");
uint64_t Size = DL->getTypeAllocSize(PhiTy->getPointerElementType());		int64_t Size = DL->getTypeAllocSize(PhiTy->getPointerElementType());
if (C->getValue()->equalsInt(Size))		int64_t CVSize = CV->getSExtValue();
return IK_PtrInduction;		if (CVSize % Size)
else if (C->getValue()->equalsInt(0 - Size))
return IK_ReversePtrInduction;

return IK_NoInduction;		return IK_NoInduction;
		StepValue = ConstantInt::getSigned(CV->getType(), CVSize / Size);
		return IK_PtrInduction;
}		}
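The pointer branch above turns a byte stride into an element stride and rejects strides that are not a whole multiple of the element size. A minimal sketch of that arithmetic on plain integers, assuming nothing about SCEV or ConstantInt (the helper name `elementStep` and its signature are hypothetical, not part of the patch):

```cpp
#include <cstdint>

// Hypothetical helper, not from the patch: models the pointer-induction
// check with plain integers.
// byteStep: constant step of the pointer PHI, in bytes (may be negative).
// elemSize: allocation size of the pointee type, in bytes.
// On success, stores the step in elements in stepOut and returns true.
// Returns false when the byte step is not a multiple of the element size.
bool elementStep(int64_t byteStep, int64_t elemSize, int64_t &stepOut) {
  if (elemSize <= 0)
    return false; // guard added for the sketch; avoids division by zero
  if (byteStep % elemSize != 0)
    return false; // corresponds to returning IK_NoInduction
  stepOut = byteStep / elemSize; // corresponds to the new StepValue
  return true;
}
```

With 4-byte elements, a byte step of 8 gives an element step of 2 and a byte step of -8 gives -2, while a byte step of 6 is rejected.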

bool LoopVectorizationLegality::isInductionVariable(const Value *V) {		bool LoopVectorizationLegality::isInductionVariable(const Value *V) {
Value *In0 = const_cast<Value*>(V);		Value *In0 = const_cast<Value*>(V);
PHINode *PN = dyn_cast_or_null<PHINode>(In0);		PHINode *PN = dyn_cast_or_null<PHINode>(In0);
if (!PN)		if (!PN)
return false;		return false;

...
Value *InnerLoopUnroller::reverseVector(Value *Vec) {
return Vec;		return Vec;
}		}

Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) {		Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) {
return V;		return V;
}		}

Value *InnerLoopUnroller::getConsecutiveVector(Value *Val, int StartIdx,		Value *InnerLoopUnroller::getConsecutiveVector(Value *Val, int StartIdx,
bool Negate) {		Value *Step) {
// When unrolling and the VF is 1, we only need to add a simple scalar.		// When unrolling and the VF is 1, we only need to add a simple scalar.
Type *ITy = Val->getType();		Type *ITy = Val->getType();
assert(!ITy->isVectorTy() && "Val must be a scalar");		assert(!ITy->isVectorTy() && "Val must be a scalar");
Constant *C = ConstantInt::get(ITy, StartIdx, Negate);		Constant *C = ConstantInt::get(ITy, StartIdx);
return Builder.CreateAdd(Val, C, "induction");		return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");
}		}
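The unroller version above computes Val + StartIdx * Step for the scalar (VF = 1) case. A scalar model of how that might generalize to VF lanes is sketched below, assuming lane i holds start + (startIdx + i) * step; the function `steppedLanes` and its parameters are illustrative names, not code from the patch:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical scalar model, not from the patch: returns the per-lane values
// of an induction vector with an arbitrary constant step.
// Lane i holds start + (startIdx + i) * step.
std::vector<int64_t> steppedLanes(int64_t start, int startIdx, int64_t step,
                                  unsigned vf) {
  std::vector<int64_t> lanes;
  lanes.reserve(vf);
  for (unsigned i = 0; i < vf; ++i)
    lanes.push_back(start + (static_cast<int64_t>(startIdx) + i) * step);
  return lanes;
}
```

For start 0, step 2 and VF 4 the lanes are 0, 2, 4, 6. Adjacent lanes are Step apart, so memory accesses indexed by such a vector are consecutive only when the step is +1 or -1.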

test/Transforms/LoopVectorize/const-induction.ll

This file was added.

				; RUN: opt < %s -loop-vectorize -force-vector-width=4 -dce -instcombine -S | FileCheck %s

				target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.8.0"

				;CHECK-LABEL: @ind_plus(
				;CHECK: <4 x i32>
				define i32 @ind_plus(i32* nocapture readonly %a) #0 {
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%res.05 = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%i.04 = phi i32 [ 0, %entry ], [ %add.1, %for.body ]
				%arrayidx = getelementptr inbounds i32* %a, i32 %i.04
				%0 = load i32* %arrayidx, align 4
				%add = add nsw i32 %0, %res.05
				%add.1 = add nsw i32 %i.04, 2
				%cmp = icmp slt i32 %add.1, 101
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret i32 %add
				}

				;CHECK-LABEL: @ind_minus(
				;CHECK: <4 x i32>
				define i32 @ind_minus(i32* nocapture readonly %a) #0 {
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%res.05 = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%i.04 = phi i32 [ 100, %entry ], [ %sub, %for.body ]
				%arrayidx = getelementptr inbounds i32* %a, i32 %i.04
				%0 = load i32* %arrayidx, align 4
				%add = add nsw i32 %0, %res.05
				%sub = add nsw i32 %i.04, -2
				%cmp = icmp sgt i32 %sub, 0
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret i32 %add
				}

test/Transforms/LoopVectorize/gcc-examples.ll

...
; <label>:1 ; preds = %1, %0
%lftr.wideiv = trunc i64 %indvars.iv.next to i32		%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, 1024		%exitcond = icmp eq i32 %lftr.wideiv, 1024
br i1 %exitcond, label %4, label %1		br i1 %exitcond, label %4, label %1

; <label>:4 ; preds = %1		; <label>:4 ; preds = %1
ret void		ret void
}		}

; Can't vectorize because of reductions.
;CHECK-LABEL: @example13(		;CHECK-LABEL: @example13(
;CHECK-NOT: <4 x i32>		;CHECK: <4 x i32>
;CHECK: ret void		;CHECK: ret void
define void @example13(i32** nocapture %A, i32** nocapture %B, i32* nocapture %out) nounwind uwtable ssp {		define void @example13(i32** nocapture %A, i32** nocapture %B, i32* nocapture %out) nounwind uwtable ssp {
br label %.preheader		br label %.preheader

.preheader: ; preds = %14, %0		.preheader: ; preds = %14, %0
%indvars.iv4 = phi i64 [ 0, %0 ], [ %indvars.iv.next5, %14 ]		%indvars.iv4 = phi i64 [ 0, %0 ], [ %indvars.iv.next5, %14 ]
%1 = getelementptr inbounds i32** %A, i64 %indvars.iv4		%1 = getelementptr inbounds i32** %A, i64 %indvars.iv4
%2 = load i32** %1, align 8		%2 = load i32** %1, align 8
...