This is an archive of the discontinued LLVM Phabricator instance.

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	Why do we need this?
lib/Transforms/Utils/LoopUtils.cpp
758 ↗	(On Diff #60711)	Looking at the users of `isInductionPHI`, looks like it should be easy to pass in the `Loop *` directly? If so, can we just have this check be: `if (L->getHeader() != Phi->getParent()) return false;` ? That way we won't have to add the `getLoopFor` interface to SCEV (which does not look like it belongs there).

delena added inline comments. Jun 20 2016, 10:12 AM

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	I defined Step as an unknown SCEV, but its underlying value has a floating-point type. In this case expand() just returns the underlying value, but one of these two lines then fails.
lib/Transforms/Utils/LoopUtils.cpp
758 ↗	(On Diff #60711)	Yes, of course. I'll change.

mkuper added inline comments. Jun 20 2016, 12:42 PM

lib/Transforms/Utils/LoopUtils.cpp
675 ↗	(On Diff #60711)	Maybe `((IK == IK_FpInduction) || Step->getType()->isIntegerTy())` would be clearer.
678 ↗	(On Diff #60711)	Maybe put this on line 668, next to the two asserts for the other types?
733 ↗	(On Diff #60711)	As long as you're touching this - maybe hoist this assert for all 3 cases above the switch?
739 ↗	(On Diff #60711)	Why do we need this?
767 ↗	(On Diff #60711)	Some variable naming nitpicking: Bb -> either BB or bb? BEValueV -> BEValue, StartValueV -> StartValue.
772 ↗	(On Diff #60711)	Do we really need the if here? We expect L->contains(BB) to be true for one of the incoming values, and false for the other, right? So if we're in the "else", we're already in a bad shape, regardless of whether BEValueV == V or not. Or did I misunderstand?
776 ↗	(On Diff #60711)	Same as above.
780 ↗	(On Diff #60711)	Can we replace this with an early exit?
801 ↗	(On Diff #60711)	Are you sure this is always a safe place to insert? E.g. what if Addend is a PHI in an outer loop?
827 ↗	(On Diff #60711)	Why not isFloatingPointTy() here?
lib/Transforms/Vectorize/LoopVectorize.cpp
2154 ↗	(On Diff #60711)	ITy here stands for "IntegerTy", I guess. Perhaps rename, now that ITy can be float? (Unless ITy stands for IndexTy, which also makes sense, and in which case we should also probably rename. :-) )
2178 ↗	(On Diff #60711)	This looks a bit odd. If I understand correctly, you're relying on the step being FNeg to distinguish whether the original direction of the loop was positive or negative (by transforming a phi that feeds an fsub into a phi that feeds an fadd with an fneg). What happens if the scalar loop has an fadd, where the original step is a loop-invariant fneg?

mkuper added inline comments. Jun 20 2016, 12:42 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	That sounds a bit weird to me. It looks like the contract for expandCodeFor is that if you pass a type, you get the expansion cast to this type. What happens if you pass a type, but get a result that's of a different type than you passed (because it's not scevable)?

Michael, thanks a lot for your comments. I'll upload a new patch.

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	I don't know why we need to pass the type. Maybe for casting between i64 and PointerType? But an Unknown SCEV may have any type, right?
lib/Transforms/Utils/LoopUtils.cpp
678 ↗	(On Diff #60711)	All these checks belong to the Step type. See comment #669.
733 ↗	(On Diff #60711)	Ok
739 ↗	(On Diff #60711)	I should cover the fsub operation somehow, e.g. `for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; }`. I keep Step as Fneg(fp_inc) in this case.
772 ↗	(On Diff #60711)	I just copied this code from integer Phi together with weird variable names. Can we have multiple bbs inside loop?
801 ↗	(On Diff #60711)	This FNeg disappears after transformation. It is just a way to distinguish between FAdd and FSub.
827 ↗	(On Diff #60711)	isFloatingPointTy() includes more than half, float and double. I'm not sure I want to cover other types.
lib/Transforms/Vectorize/LoopVectorize.cpp
2178 ↗	(On Diff #60711)	I don't see any problem. x + (-a) is equal to x - a.
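
The identity behind this reply can be checked in isolation. The following is a minimal sketch, assuming IEEE 754 binary32 `float` and a build without fast-math options. The helper names `bitsOf`, `subDirect`, `subViaNeg`, and `checkSubForms` are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float.
inline std::uint32_t bitsOf(float V) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// The update "x -= a" written as the original subtraction (fsub).
inline float subDirect(float X, float A) { return X - A; }

// The same update written as an addition of the negated step (fadd + fneg).
inline float subViaNeg(float X, float A) { return X + (-A); }

// Usage sketch: both forms agree bit for bit on an ordinary input.
inline void checkSubForms() {
  assert(bitsOf(subDirect(1.5f, 0.25f)) == bitsOf(subViaNeg(1.5f, 0.25f)));
}
```

This only illustrates the arithmetic equivalence; it says nothing about where an fneg instruction may legally be inserted, which is the concern raised for line 801.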

Thanks, Elena.

I guess there's one thing I don't understand about the fnegs - why are you using an IR instruction as the marker of whether the original induction had an fadd or an fsub, instead of a property of the IV? It would make sense to me if you actually used the fneg to feed the vector induction, thus simplifying the code (not having to special-case the sub), but instead you special-case it anyway by looking through the fneg.

Another thing I forgot earlier - this should only fire when we're in unsafe/fast math mode, right? Is there a check for that?

lib/Transforms/Utils/LoopUtils.cpp
739 ↗	(On Diff #60711)	I understand, what I mean is - let's say the step is an FNeg. Why can't you feed the FNeg directly into the CreateFMul? Do we get worse code?
772 ↗	(On Diff #60711)	Right now, I think not. (and that needs to be fixed.) :-\ Anyway, if this is a verbatim copy from the int case, leave it be for now. Should probably be fixed separately. (Of course, that raises the question - can the int and the fp case share the code? Or is it not worth it?)
801 ↗	(On Diff #60711)	When does it disappear? If it disappears in later clean-up, we still don't want to insert it at an illegal location (like between two phis). If it's guaranteed to be deleted during the run of the vectorizer, I guess this will technically work, but I'd still prefer to avoid it.
lib/Transforms/Vectorize/LoopVectorize.cpp
2178 ↗	(On Diff #60711)	What I'm trying to say is that it's weird that the behavior would be different based on whether the step is an fneg. It could be an fneg because you added an fneg, it could be an fneg because there was already an fneg. Why does this code look through the fneg? It doesn't seem like this should be the vectorizer's job. Although if the fneg won't get cleaned up later, this is probably the right thing to do.

In D21330#462655, @mkuper wrote:

I guess there's one thing I don't understand about the fnegs - why are you using an IR instruction as the marker of whether the original induction had an fadd or an fsub, instead of a property of the IV? It would make sense to me if you actually used the fneg to feed the vector induction, thus simplifying the code (not having to special-case the sub), but instead you special-case it anyway by looking through the fneg.

What do you mean by "property of IV" ? Do you suggest to add a special field to InductionDescriptor?
Fneg should be a part of Step. But Step is a SCEV and I was not allowed to implement FP SCEV.

In D21330#462655, @mkuper wrote:

Another thing I forgot earlier - this should only fire when we're in unsafe/fast math mode, right? Is there a check for that?

FP reduction is allowed for +/- operators. Only max/min checks for unsafe.

What do you mean by "property of IV" ? Do you suggest to add a special field to InductionDescriptor?
Fneg should be a part of Step. But Step is a SCEV and I was not allowed to implement FP SCEV.

You know, I probably just don't understand the SCEV situation well enough.
Sanjoy, any chance you could take a look, even though part of it is in LoopVectorize.cpp? :-)

Changed handling of FSUB operation. I keep the original binary operation inside Induction Descriptor.
Allow FP induction in fast-math mode only.

Ping *

delena added reviewers: Ayal, dorit, gilr. Jul 5 2016, 1:06 AM

Hi Elena,

Sorry for the delay, I was out on vacation. I'm catching up on email now, and I will try to get to review this this week.

Thanks!

Some comments inline. Please let me know if you want me to look at something specific, but I'm not familiar enough with the code this patch touches to lgtm it.

../lib/Analysis/ScalarEvolutionExpander.cpp
1614 ↗	(On Diff #61935)	This is odd -- is it just to help keep the `Step` as a `SCEV *`? If so, I'd suggest solving that within `InductionDescriptor` itself (i.e. maybe support having the step as either a `SCEV *` or a `Value *`, depending on the type of the `InductionDescriptor`?).
../lib/Transforms/Utils/LoopUtils.cpp
762 ↗	(On Diff #61935)	(Not for fixing in this change) looks like a better interface would be to return an `Optional<InductionDescriptor>`?
787 ↗	(On Diff #61935)	Maybe use a `dyn_cast` here?
811 ↗	(On Diff #61935)	The condition looks inverted?

This revision now requires changes to proceed. Jul 15 2016, 12:43 AM

delena marked an inline comment as done. Jul 17 2016, 5:35 AM

delena added inline comments.

../lib/Analysis/ScalarEvolutionExpander.cpp
1614 ↗	(On Diff #61935)	I'm dropping this change; I don't need it anymore.
../lib/Transforms/Utils/LoopUtils.cpp
787 ↗	(On Diff #61935)	Ok.
811 ↗	(On Diff #61935)	hasUnsafeAlgebra() means that the instruction itself has the "fast" attribute; in that case we don't need an additional check. But if the BOp does not have the "fast" attribute, the legality of the FP transformation has to be allowed at the function level. I'll add a comment.

Some changes according to Sanjoy's comment. Thanks Sanjoy and Michael for review.
Still looking for somebody who can accept this patch. Added @sbaranga, who made similar changes in induction variables.

I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations.

If someone - e.g. sanjoy - OK's the design, I can LGTM the LV code change.

mkuper added inline comments. Jul 18 2016, 11:20 AM

../include/llvm/Transforms/Utils/LoopUtils.h
274 ↗	(On Diff #64249)	I think Instruction::BinaryOpsEnd may be better for an explicitly invalid BinaryOp. Not sure that's a good choice, but pretty sure 0 isn't.
../lib/Transforms/Utils/LoopUtils.cpp
787 ↗	(On Diff #64249)	I think what sanjoy meant was: `BinaryOperator *BOp = dyn_cast<BinaryOperator>(BEValue); if (!BOp) return false;`
../lib/Transforms/Vectorize/LoopVectorize.cpp
4100 ↗	(On Diff #64249)	Main -> Primary (I think we use that consistently)
6402 ↗	(On Diff #64249)	Can you add a test for this? All of the tests you added force UF == 1.
6405 ↗	(On Diff #64249)	Are you sure about this? I mean, it's true for vectorizing, but is it true here as well? (I'm not saying it isn't, just making sure this is intentional)

In D21330#487162, @mkuper wrote:

I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations.

The bottom line of the FP SCEV discussion was the point that FP SCEV is overkill for "secondary" IV (like in the example above). We'll need FP SCEV for primary FP IV like
for (float f =0.0; f < g; f+=0.5) {}.
But such loops are rare and most of them can be re-mapped to integers.
The suggestion was to include FP IV to the current InductionDescriptor.
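
To illustrate the remark about re-mapping a primary FP IV to integers, here is a minimal sketch. It assumes a step of 0.5 (exactly representable in binary floating point) and a small bound, so the two loops visit the same values; with a step such as 0.1 they generally would not. The function names `visitWithFloatIV`, `visitWithIntIV`, and `checkRemap` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// The FP-controlled loop from the comment: F itself is the primary IV.
inline std::vector<float> visitWithFloatIV(float G) {
  std::vector<float> Seen;
  for (float F = 0.0f; F < G; F += 0.5f)
    Seen.push_back(F);
  return Seen;
}

// Re-mapped form: an integer primary IV, with the FP value derived from it.
inline std::vector<float> visitWithIntIV(float G) {
  std::vector<float> Seen;
  for (int I = 0; 0.5f * static_cast<float>(I) < G; ++I)
    Seen.push_back(0.5f * static_cast<float>(I));
  return Seen;
}

// Usage sketch: for this step and a small bound both loops agree.
inline void checkRemap() {
  assert(visitWithFloatIV(2.0f) == visitWithIntIV(2.0f));
}
```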

delena marked 3 inline comments as done. Jul 19 2016, 5:04 AM

delena added inline comments.

../include/llvm/Transforms/Utils/LoopUtils.h
274 ↗	(On Diff #64249)	Thanks, I'll fix.
../lib/Transforms/Vectorize/LoopVectorize.cpp
6405 ↗	(On Diff #64249)	Even if you have only unrolling, and VF is 1, the value of the FP induction is calculated as `sitofp(PrimaryIV) * Increment`. The loop `for (int i=0; i<N; i++) { fp_ind += fp_inc; }` is transformed into something like this: `init = fp_inc; for (int i=0; i<N; i++) { fp_ind = init + i*fp_inc; }`. In this case we need unsafe math. I added tests for unrolling.
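
The reason the closed form needs unsafe math can be seen in a small standalone sketch. It assumes IEEE 754 binary32 `float` evaluated at single precision without fast-math options. The names `accumulateIV`, `closedFormIV`, and `checkExactStep` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>

// Scalar loop as written: the induction is advanced by repeated addition,
// so every iteration rounds the running value.
inline float accumulateIV(float Start, float Step, int N) {
  float V = Start;
  for (int I = 0; I < N; ++I)
    V += Step;
  return V;
}

// Closed form of the kind used after the transformation:
// Start + sitofp(I) * Step, with at most two roundings.
inline float closedFormIV(float Start, float Step, int N) {
  return Start + static_cast<float>(N) * Step;
}

// Usage sketch: with an exactly representable step the two forms agree.
inline void checkExactStep() {
  assert(accumulateIV(0.0f, 0.5f, 10) == closedFormIV(0.0f, 0.5f, 10));
}
```

With a step such as 0.1f the two functions can return different values after only ten iterations, because the rounding happens at different points. That reassociation is only permitted under a relaxed FP model.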

Updated the code according to Michael's comments.

LGTM

In D21330#488606, @mkuper wrote:

LGTM

I've just started looking at this too. Please give me a few mins. So far I only encountered minor things.

../include/llvm/Transforms/Utils/LoopUtils.h
308 ↗	(On Diff #64479)	Fp->FP
328–329 ↗	(On Diff #64479)	binary op -> induction op is better everywhere. Also I am assuming this is the op that advances the induction variable. You may want to spell this out somewhere.
../lib/Transforms/Utils/LoopUtils.cpp
770 ↗	(On Diff #64479)	this function is returning a bool

Sorry, didn't realize anyone else is still interested in looking at this, given how long this patch has been up.
Ignore my LGTM. :-)

In D21330#488622, @mkuper wrote:

Sorry, didn't realize anyone else is still interested in looking at this, given how long this patch has been up.
Ignore my LGTM. :-)

No, *I am* sorry for chiming in this late. I felt obliged because you mentioned that no one looked at this from the original llvm-dev thread and I felt bad ;). And thanks for reviewing it, this is looking pretty good.

So with these it should LGTM too. I haven't checked everything (most notably the unsafe math part).

I just wanted to see whether this was in line with the direction set in the original llvm-dev thread and it is! Thanks to all of you and sorry about the delay again.

../lib/Transforms/Utils/LoopUtils.cpp
816–818 ↗	(On Diff #64479)	I think that a better interface would be to take BOp (step instruction) optionally and then derive DI::hasUnsafeAlgebra and the opcode from that. This is OK as a follow-up if you prefer.
../lib/Transforms/Vectorize/LoopVectorize.cpp
4104–4105 ↗	(On Diff #64479)	by adding StepVal, you mean

Minor changes according to Adam's comments.

delena added inline comments. Jul 20 2016, 1:38 AM

../include/llvm/Transforms/Utils/LoopUtils.h
328–329 ↗	(On Diff #64479)	Ok. Changing it to getInductionBinOp()
../lib/Transforms/Utils/LoopUtils.cpp
770 ↗	(On Diff #64479)	Fixed. Thanks.
816–818 ↗	(On Diff #64479)	I did that at the beginning, but then preferred to follow the Reduction implementation, just to be consistent with the existing interface.
../lib/Transforms/Vectorize/LoopVectorize.cpp
4104–4105 ↗	(On Diff #64479)	I fixed the comment. thanks.

Added one more test where init value and step are constants.

You should probably also have a test for the case where there is no "fast" attribute on the step instruction but we can still vectorize with the hints.

../include/llvm/Transforms/Utils/LoopUtils.h
321–323 ↗	(On Diff #64670)	This comment seems incorrect/misleading. I think this flag "allows" a relaxed FP model not "requires" it. Do you agree?
328–329 ↗	(On Diff #64670)	I think you missed the part about "everywhere", i.e. making the member variable consistent with this name too: BinaryOp -> InductionOp
343–344 ↗	(On Diff #64670)	This comment is also ambiguous. This is again the induction/step instruction iff the instruction allows for a relaxed FP model.
../lib/Transforms/Utils/LoopUtils.cpp
819–821 ↗	(On Diff #64670)	Fair, but it does not make sense to pass different instructions for UAI and BO, and the current interface allows for that. At least then change the ctor to take a single instruction and derive UAI and BO from that.
../test/Transforms/LoopVectorize/float-induction.ll
12–13 ↗	(On Diff #64670)	Where is the result of these guys used? Would be good to check too.
16–17 ↗	(On Diff #64670)	Why not also match the result of the fsub and check it in the store?

Updated the patch according to Adam's comments.

anemet added inline comments. Jul 21 2016, 10:18 AM

../include/llvm/Transforms/Utils/LoopUtils.h
355 ↗	(On Diff #64853)	Comments are full sentences, please end with a period.
../test/Transforms/LoopVectorize/float-induction.ll
4 ↗	(On Diff #64853)	I think you can only have this under LoopVectorize/X86 (what if the X86 backend is not enabled in a build?). But more importantly, I don't understand why you need to formulate the non-fast-math case as an x86 test.

delena added inline comments. Jul 22 2016, 5:47 AM

../include/llvm/Transforms/Utils/LoopUtils.h
355 ↗	(On Diff #64853)	ok.
../test/Transforms/LoopVectorize/float-induction.ll
4 ↗	(On Diff #64853)	The "unsafe" function attribute works only for auto-vec. If you specify -force-vector-width=4 the loop will be vectorized anyway. So I added "X86" tests just to check combination of function attribute and "safe" FP induction in the auto-vectorization mode.

LGTM with the x86 test split into its own test under Transform/LoopVectorize/X86. Thanks!

../test/Transforms/LoopVectorize/float-induction.ll
4 ↗	(On Diff #64853)	Hmm, that is somewhat unexpected. I thought that the -force* stuff only overruled the cost model... @hfinkel, is it intentional/desired that -force-vector-width=>1 overrules (some of) the legality checks? We have different ways to bypass the legality checks, so perhaps keeping -force-vector-width as a way to overrule only the profitability checks is more desirable? I guess one concern is that this internal flag was advertised on Nadav's Auto-vectorizer blog post, so perhaps changing the behavior is not trivial at this point. Anyhow, coming back to this patch, you then need to split this part of the test into a Transform/LoopVectorize/X86 test.

Closed by commit rL276554: [Loop Vectorizer] Handling loops FP induction variables. (authored by delena). Jul 24 2016, 12:32 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h	58 lines
llvm/trunk/lib/Transforms/Scalar/LoopInterchange.cpp	2 lines
llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp	118 lines
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp	135 lines
llvm/trunk/test/Transforms/LoopVectorize/X86/float-induction-x86.ll	86 lines
llvm/trunk/test/Transforms/LoopVectorize/float-induction.ll	218 lines

Diff 65266

llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h

	/// A struct for saving information about induction variables.			/// A struct for saving information about induction variables.
	class InductionDescriptor {			class InductionDescriptor {
	public:			public:
	/// This enum represents the kinds of inductions that we support.			/// This enum represents the kinds of inductions that we support.
	enum InductionKind {			enum InductionKind {
	IK_NoInduction, ///< Not an induction variable.			IK_NoInduction, ///< Not an induction variable.
	IK_IntInduction, ///< Integer induction variable. Step = C.			IK_IntInduction, ///< Integer induction variable. Step = C.
	IK_PtrInduction ///< Pointer induction var. Step = C / sizeof(elem).			IK_PtrInduction, ///< Pointer induction var. Step = C / sizeof(elem).
				IK_FpInduction ///< Floating point induction variable.
	};			};

	public:			public:
	/// Default constructor - creates an invalid induction.			/// Default constructor - creates an invalid induction.
	InductionDescriptor()			InductionDescriptor()
	: StartValue(nullptr), IK(IK_NoInduction), Step(nullptr) {}			: StartValue(nullptr), IK(IK_NoInduction), Step(nullptr),
				InductionBinOp(nullptr) {}

	/// Get the consecutive direction. Returns:			/// Get the consecutive direction. Returns:
	/// 0 - unknown or non-consecutive.			/// 0 - unknown or non-consecutive.
	/// 1 - consecutive and increasing.			/// 1 - consecutive and increasing.
	/// -1 - consecutive and decreasing.			/// -1 - consecutive and decreasing.
	int getConsecutiveDirection() const;			int getConsecutiveDirection() const;

	/// Compute the transformed value of Index at offset StartValue using step			/// Compute the transformed value of Index at offset StartValue using step
	/// StepValue.			/// StepValue.
	/// For integer induction, returns StartValue + Index * StepValue.			/// For integer induction, returns StartValue + Index * StepValue.
	/// For pointer induction, returns StartValue[Index * StepValue].			/// For pointer induction, returns StartValue[Index * StepValue].
	/// FIXME: The newly created binary instructions should contain nsw/nuw			/// FIXME: The newly created binary instructions should contain nsw/nuw
	/// flags, which can be found from the original scalar operations.			/// flags, which can be found from the original scalar operations.
	Value *transform(IRBuilder<> &B, Value *Index, ScalarEvolution *SE,			Value *transform(IRBuilder<> &B, Value *Index, ScalarEvolution *SE,
	const DataLayout& DL) const;			const DataLayout& DL) const;

	Value *getStartValue() const { return StartValue; }			Value *getStartValue() const { return StartValue; }
	InductionKind getKind() const { return IK; }			InductionKind getKind() const { return IK; }
	const SCEV *getStep() const { return Step; }			const SCEV *getStep() const { return Step; }
	ConstantInt *getConstIntStepValue() const;			ConstantInt *getConstIntStepValue() const;

	/// Returns true if \p Phi is an induction. If \p Phi is an induction,			/// Returns true if \p Phi is an induction in the loop \p L. If \p Phi is an
	/// the induction descriptor \p D will contain the data describing this			/// induction, the induction descriptor \p D will contain the data describing
	/// induction. If by some other means the caller has a better SCEV			/// this induction. If by some other means the caller has a better SCEV
	/// expression for \p Phi than the one returned by the ScalarEvolution			/// expression for \p Phi than the one returned by the ScalarEvolution
	/// analysis, it can be passed through \p Expr.			/// analysis, it can be passed through \p Expr.
	static bool isInductionPHI(PHINode *Phi, ScalarEvolution *SE,			static bool isInductionPHI(PHINode *Phi, const Loop *L, ScalarEvolution *SE,
	InductionDescriptor &D,			InductionDescriptor &D,
	const SCEV *Expr = nullptr);			const SCEV *Expr = nullptr);

	/// Returns true if \p Phi is an induction, in the context associated with			/// Returns true if \p Phi is a floating point induction in the loop \p L.
	/// the run-time predicate of PSE. If \p Assume is true, this can add further			/// If \p Phi is an induction, the induction descriptor \p D will contain
	/// SCEV predicates to \p PSE in order to prove that \p Phi is an induction.			/// the data describing this induction.
				static bool isFPInductionPHI(PHINode *Phi, const Loop *L,
				ScalarEvolution *SE, InductionDescriptor &D);

				/// Returns true if \p Phi is a loop \p L induction, in the context associated
				/// with the run-time predicate of PSE. If \p Assume is true, this can add
				/// further SCEV predicates to \p PSE in order to prove that \p Phi is an
				/// induction.
	/// If \p Phi is an induction, \p D will contain the data describing this			/// If \p Phi is an induction, \p D will contain the data describing this
	/// induction.			/// induction.
	static bool isInductionPHI(PHINode *Phi, PredicatedScalarEvolution &PSE,			static bool isInductionPHI(PHINode *Phi, const Loop *L,
				PredicatedScalarEvolution &PSE,
	InductionDescriptor &D, bool Assume = false);			InductionDescriptor &D, bool Assume = false);

				/// Returns true if the induction type is FP and the binary operator does
				/// not have the "fast-math" property. Such operation requires a relaxed FP
				/// mode.
				bool hasUnsafeAlgebra() {
				return InductionBinOp &&
				!cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra();
				}

				/// Returns induction operator that does not have "fast-math" property
				/// and requires FP unsafe mode.
				Instruction *getUnsafeAlgebraInst() {
				if (!InductionBinOp ||
				cast<FPMathOperator>(InductionBinOp)->hasUnsafeAlgebra())
				return nullptr;
				return InductionBinOp;
				}

				/// Returns binary opcode of the induction operator.
				Instruction::BinaryOps getInductionOpcode() const {
				return InductionBinOp ? InductionBinOp->getOpcode() :
				Instruction::BinaryOpsEnd;
				}

	private:			private:
	/// Private constructor - used by \c isInductionPHI.			/// Private constructor - used by \c isInductionPHI.
	InductionDescriptor(Value *Start, InductionKind K, const SCEV *Step);			InductionDescriptor(Value *Start, InductionKind K, const SCEV *Step,
				BinaryOperator *InductionBinOp = nullptr);

	/// Start value.			/// Start value.
	TrackingVH<Value> StartValue;			TrackingVH<Value> StartValue;
	/// Induction kind.			/// Induction kind.
	InductionKind IK;			InductionKind IK;
	/// Step value.			/// Step value.
	const SCEV *Step;			const SCEV *Step;
				// Instruction that advances induction variable.
				BinaryOperator *InductionBinOp;
	};			};

	BasicBlock *InsertPreheaderForLoop(Loop *L, DominatorTree *DT, LoopInfo *LI,			BasicBlock *InsertPreheaderForLoop(Loop *L, DominatorTree *DT, LoopInfo *LI,
	bool PreserveLCSSA);			bool PreserveLCSSA);

	/// Ensures LCSSA form for every instruction from the Worklist in the scope of			/// Ensures LCSSA form for every instruction from the Worklist in the scope of
	/// innermost containing loop.			/// innermost containing loop.
	///			///

llvm/trunk/lib/Transforms/Scalar/LoopInterchange.cpp

bool LoopInterchangeLegality::findInductionAndReductions(
Loop *L, SmallVector<PHINode *, 8> &Inductions,		Loop *L, SmallVector<PHINode *, 8> &Inductions,
SmallVector<PHINode *, 8> &Reductions) {		SmallVector<PHINode *, 8> &Reductions) {
if (!L->getLoopLatch() || !L->getLoopPredecessor())		if (!L->getLoopLatch() || !L->getLoopPredecessor())
return false;		return false;
for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I) {		for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I) {
RecurrenceDescriptor RD;		RecurrenceDescriptor RD;
InductionDescriptor ID;		InductionDescriptor ID;
PHINode *PHI = cast<PHINode>(I);		PHINode *PHI = cast<PHINode>(I);
if (InductionDescriptor::isInductionPHI(PHI, SE, ID))		if (InductionDescriptor::isInductionPHI(PHI, L, SE, ID))
Inductions.push_back(PHI);		Inductions.push_back(PHI);
else if (RecurrenceDescriptor::isReductionPHI(PHI, L, RD))		else if (RecurrenceDescriptor::isReductionPHI(PHI, L, RD))
Reductions.push_back(PHI);		Reductions.push_back(PHI);
else {		else {
DEBUG(		DEBUG(
dbgs() << "Failed to recognize PHI as an induction or reduction.\n");		dbgs() << "Failed to recognize PHI as an induction or reduction.\n");
return false;		return false;
}		}

llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp

Value *RecurrenceDescriptor::createMinMaxOp(IRBuilder<> &Builder,
else		else
Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");		Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");

Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");		Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");
return Select;		return Select;
}		}

InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,		InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,
const SCEV *Step)		const SCEV *Step, BinaryOperator *BOp)
: StartValue(Start), IK(K), Step(Step) {		: StartValue(Start), IK(K), Step(Step), InductionBinOp(BOp) {
assert(IK != IK_NoInduction && "Not an induction");		assert(IK != IK_NoInduction && "Not an induction");

// Start value type should match the induction kind and the value		// Start value type should match the induction kind and the value
// itself should not be null.		// itself should not be null.
assert(StartValue && "StartValue is null");		assert(StartValue && "StartValue is null");
assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&		assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&
"StartValue is not a pointer for pointer induction");		"StartValue is not a pointer for pointer induction");
assert((IK != IK_IntInduction || StartValue->getType()->isIntegerTy()) &&		assert((IK != IK_IntInduction || StartValue->getType()->isIntegerTy()) &&
"StartValue is not an integer for integer induction");		"StartValue is not an integer for integer induction");

// Check the Step Value. It should be non-zero integer value.		// Check the Step Value. It should be non-zero integer value.
assert((!getConstIntStepValue() || !getConstIntStepValue()->isZero()) &&		assert((!getConstIntStepValue() || !getConstIntStepValue()->isZero()) &&
"Step value is zero");		"Step value is zero");

assert((IK != IK_PtrInduction || getConstIntStepValue()) &&		assert((IK != IK_PtrInduction || getConstIntStepValue()) &&
"Step value should be constant for pointer induction");		"Step value should be constant for pointer induction");
assert(Step->getType()->isIntegerTy() && "StepValue is not an integer");		assert((IK == IK_FpInduction || Step->getType()->isIntegerTy()) &&
		"StepValue is not an integer");

		assert((IK != IK_FpInduction || Step->getType()->isFloatingPointTy()) &&
		"StepValue is not FP for FpInduction");
		assert((IK != IK_FpInduction || (InductionBinOp &&
		(InductionBinOp->getOpcode() == Instruction::FAdd ||
		InductionBinOp->getOpcode() == Instruction::FSub))) &&
		"Binary opcode should be specified for FP induction");
}		}

int InductionDescriptor::getConsecutiveDirection() const {		int InductionDescriptor::getConsecutiveDirection() const {
ConstantInt *ConstStep = getConstIntStepValue();		ConstantInt *ConstStep = getConstIntStepValue();
if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))		if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))
return ConstStep->getSExtValue();		return ConstStep->getSExtValue();
return 0;		return 0;
}		}

ConstantInt *InductionDescriptor::getConstIntStepValue() const {		ConstantInt *InductionDescriptor::getConstIntStepValue() const {
if (isa<SCEVConstant>(Step))		if (isa<SCEVConstant>(Step))
return dyn_cast<ConstantInt>(cast<SCEVConstant>(Step)->getValue());		return dyn_cast<ConstantInt>(cast<SCEVConstant>(Step)->getValue());
return nullptr;		return nullptr;
}		}

Value *InductionDescriptor::transform(IRBuilder<> &B, Value *Index,		Value *InductionDescriptor::transform(IRBuilder<> &B, Value *Index,
ScalarEvolution *SE,		ScalarEvolution *SE,
const DataLayout& DL) const {		const DataLayout& DL) const {

SCEVExpander Exp(*SE, DL, "induction");		SCEVExpander Exp(*SE, DL, "induction");
		assert(Index->getType() == Step->getType() &&
		"Index type does not match StepValue type");
switch (IK) {		switch (IK) {
case IK_IntInduction: {		case IK_IntInduction: {
assert(Index->getType() == StartValue->getType() &&		assert(Index->getType() == StartValue->getType() &&
"Index type does not match StartValue type");		"Index type does not match StartValue type");

// FIXME: Theoretically, we can call getAddExpr() of ScalarEvolution		// FIXME: Theoretically, we can call getAddExpr() of ScalarEvolution
// and calculate (Start + Index * Step) for all cases, without		// and calculate (Start + Index * Step) for all cases, without
// special handling for "isOne" and "isMinusOne".		// special handling for "isOne" and "isMinusOne".
// But in the real life the result code getting worse. We mix SCEV		// But in the real life the result code getting worse. We mix SCEV
// expressions and ADD/SUB operations and receive redundant		// expressions and ADD/SUB operations and receive redundant
// intermediate values being calculated in different ways and		// intermediate values being calculated in different ways and
// Instcombine is unable to reduce them all.		// Instcombine is unable to reduce them all.

if (getConstIntStepValue() &&		if (getConstIntStepValue() &&
getConstIntStepValue()->isMinusOne())		getConstIntStepValue()->isMinusOne())
return B.CreateSub(StartValue, Index);		return B.CreateSub(StartValue, Index);
if (getConstIntStepValue() &&		if (getConstIntStepValue() &&
getConstIntStepValue()->isOne())		getConstIntStepValue()->isOne())
return B.CreateAdd(StartValue, Index);		return B.CreateAdd(StartValue, Index);
const SCEV *S = SE->getAddExpr(SE->getSCEV(StartValue),		const SCEV *S = SE->getAddExpr(SE->getSCEV(StartValue),
SE->getMulExpr(Step, SE->getSCEV(Index)));		SE->getMulExpr(Step, SE->getSCEV(Index)));
return Exp.expandCodeFor(S, StartValue->getType(), &*B.GetInsertPoint());		return Exp.expandCodeFor(S, StartValue->getType(), &*B.GetInsertPoint());
}		}
case IK_PtrInduction: {		case IK_PtrInduction: {
assert(Index->getType() == Step->getType() &&
"Index type does not match StepValue type");
assert(isa<SCEVConstant>(Step) &&		assert(isa<SCEVConstant>(Step) &&
"Expected constant step for pointer induction");		"Expected constant step for pointer induction");
const SCEV *S = SE->getMulExpr(SE->getSCEV(Index), Step);		const SCEV *S = SE->getMulExpr(SE->getSCEV(Index), Step);
Index = Exp.expandCodeFor(S, Index->getType(), &*B.GetInsertPoint());		Index = Exp.expandCodeFor(S, Index->getType(), &*B.GetInsertPoint());
return B.CreateGEP(nullptr, StartValue, Index);		return B.CreateGEP(nullptr, StartValue, Index);
}		}
		case IK_FpInduction: {
		assert(Step->getType()->isFloatingPointTy() && "Expected FP Step value");
		assert(InductionBinOp &&
		(InductionBinOp->getOpcode() == Instruction::FAdd ||
		InductionBinOp->getOpcode() == Instruction::FSub) &&
		"Original bin op should be defined for FP induction");

		Value *StepValue = cast<SCEVUnknown>(Step)->getValue();

		// Floating point operations had to be 'fast' to enable the induction.
		FastMathFlags Flags;
		Flags.setUnsafeAlgebra();

		Value *MulExp = B.CreateFMul(StepValue, Index);
		if (isa<Instruction>(MulExp))
		// We have to check, the MulExp may be a constant.
		cast<Instruction>(MulExp)->setFastMathFlags(Flags);

		Value *BOp = B.CreateBinOp(InductionBinOp->getOpcode() , StartValue,
		MulExp, "induction");
		if (isa<Instruction>(BOp))
		cast<Instruction>(BOp)->setFastMathFlags(Flags);

		return BOp;
		}
case IK_NoInduction:		case IK_NoInduction:
return nullptr;		return nullptr;
}		}
llvm_unreachable("invalid enum");		llvm_unreachable("invalid enum");
}		}
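The IK_FpInduction case above builds the closed form StartValue op (Step * Index), where op is the original fadd or fsub. A minimal standalone sketch of that arithmetic follows; it models plain floats rather than LLVM IR values, and the names FpOp and fpInductionAt are hypothetical, not part of the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the FAdd/FSub opcode kept in InductionBinOp.
enum class FpOp { Add, Sub };

// Models the value emitted for an FP induction: Start op (Step * Index).
// Replacing Index repeated additions with a single multiply reassociates
// the FP arithmetic, which is why the patch sets fast-math flags on both
// generated instructions.
inline float fpInductionAt(float Start, float Step, float Index, FpOp Op) {
  float Mul = Step * Index;
  return Op == FpOp::Add ? Start + Mul : Start - Mul;
}
```

With Start = 1.0 and Step = 0.5, index 4 gives 3.0 for an fadd induction and -1.0 for an fsub induction.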

bool InductionDescriptor::isInductionPHI(PHINode *Phi,		bool InductionDescriptor::isFPInductionPHI(PHINode *Phi, const Loop *TheLoop,
		ScalarEvolution *SE,
		InductionDescriptor &D) {

		// Here we only handle FP induction variables.
		assert(Phi->getType()->isFloatingPointTy() && "Unexpected Phi type");

		if (TheLoop->getHeader() != Phi->getParent())
		return false;

		// The loop may have multiple entrances or multiple exits; we can analyze
		// this phi if it has a unique entry value and a unique backedge value.
		if (Phi->getNumIncomingValues() != 2)
		return false;
		Value *BEValue = nullptr, *StartValue = nullptr;
		if (TheLoop->contains(Phi->getIncomingBlock(0))) {
		BEValue = Phi->getIncomingValue(0);
		StartValue = Phi->getIncomingValue(1);
		} else {
		assert(TheLoop->contains(Phi->getIncomingBlock(1)) &&
		"Unexpected Phi node in the loop");
		BEValue = Phi->getIncomingValue(1);
		StartValue = Phi->getIncomingValue(0);
		}

		BinaryOperator *BOp = dyn_cast<BinaryOperator>(BEValue);
		if (!BOp)
		return false;

		Value *Addend = nullptr;
		if (BOp->getOpcode() == Instruction::FAdd) {
		if (BOp->getOperand(0) == Phi)
		Addend = BOp->getOperand(1);
		else if (BOp->getOperand(1) == Phi)
		Addend = BOp->getOperand(0);
		} else if (BOp->getOpcode() == Instruction::FSub)
		if (BOp->getOperand(0) == Phi)
		Addend = BOp->getOperand(1);

		if (!Addend)
		return false;

		// The addend should be loop invariant
		if (auto *I = dyn_cast<Instruction>(Addend))
		if (TheLoop->contains(I))
		return false;

		// FP Step has unknown SCEV
		const SCEV *Step = SE->getUnknown(Addend);
		D = InductionDescriptor(StartValue, IK_FpInduction, Step, BOp);
		return true;
		}
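The operand matching in isFPInductionPHI accepts phi + x, x + phi and phi - x, but not x - phi, because subtraction is not commutative. A minimal standalone sketch of that rule, assuming operands are modeled as integer ids (the names Opcode, matchAddend and NoAddend are hypothetical and do not exist in LLVM):

```cpp
#include <cassert>

// Hypothetical opcode model; only FAdd and FSub can form an FP induction.
enum class Opcode { FAdd, FSub, Other };

// Sentinel returned when the backedge operation is not a recognized update.
const int NoAddend = -1;

// Returns the id of the addend when the backedge value has the shape
// phi + x, x + phi or phi - x; returns NoAddend otherwise.
inline int matchAddend(Opcode Op, int Lhs, int Rhs, int Phi) {
  if (Op == Opcode::FAdd) {
    if (Lhs == Phi)
      return Rhs;
    if (Rhs == Phi)
      return Lhs;
  } else if (Op == Opcode::FSub) {
    // x - phi is rejected: only the left operand may be the phi.
    if (Lhs == Phi)
      return Rhs;
  }
  return NoAddend;
}
```

The real function additionally requires the addend to be loop invariant before it builds the descriptor; that check is left out of this sketch.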

		bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,
PredicatedScalarEvolution &PSE,		PredicatedScalarEvolution &PSE,
InductionDescriptor &D,		InductionDescriptor &D,
bool Assume) {		bool Assume) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		// Handle integer and pointer inductions variables.
		// Now we handle also FP induction but not trying to make a
		// recurrent expression from the PHI node in-place.

		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy() &&
		!PhiTy->isFloatTy() && !PhiTy->isDoubleTy() && !PhiTy->isHalfTy())
return false;		return false;

		if (PhiTy->isFloatingPointTy())
		return isFPInductionPHI(Phi, TheLoop, PSE.getSE(), D);

const SCEV *PhiScev = PSE.getSCEV(Phi);		const SCEV *PhiScev = PSE.getSCEV(Phi);
const auto *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const auto *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);

// We need this expression to be an AddRecExpr.		// We need this expression to be an AddRecExpr.
if (Assume && !AR)		if (Assume && !AR)
AR = PSE.getAsAddRec(Phi);		AR = PSE.getAsAddRec(Phi);

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return false;		return false;
}		}

return isInductionPHI(Phi, PSE.getSE(), D, AR);		return isInductionPHI(Phi, TheLoop, PSE.getSE(), D, AR);
}		}

bool InductionDescriptor::isInductionPHI(PHINode *Phi,		bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,
ScalarEvolution *SE,		ScalarEvolution *SE,
InductionDescriptor &D,		InductionDescriptor &D,
const SCEV *Expr) {		const SCEV *Expr) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.		// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())
return false;		return false;

// Check that the PHI is consecutive.		// Check that the PHI is consecutive.
const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);		const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return false;		return false;
}		}

assert(AR->getLoop()->getHeader() == Phi->getParent() &&		assert(TheLoop->getHeader() == Phi->getParent() &&
"PHI is an AddRec for a different loop?!");		"PHI is an AddRec for a different loop?!");
Value *StartValue =		Value *StartValue =
Phi->getIncomingValueForBlock(AR->getLoop()->getLoopPreheader());		Phi->getIncomingValueForBlock(AR->getLoop()->getLoopPreheader());
const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);
// Calculate the pointer stride and check if it is consecutive.		// Calculate the pointer stride and check if it is consecutive.
// The stride may be a constant or a loop invariant integer value.		// The stride may be a constant or a loop invariant integer value.
const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);		const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);
if (!ConstStep && !SE->isLoopInvariant(Step, AR->getLoop()))		if (!ConstStep && !SE->isLoopInvariant(Step, TheLoop))
return false;		return false;

if (PhiTy->isIntegerTy()) {		if (PhiTy->isIntegerTy()) {
D = InductionDescriptor(StartValue, IK_IntInduction, Step);		D = InductionDescriptor(StartValue, IK_IntInduction, Step);
return true;		return true;
}		}

assert(PhiTy->isPointerTy() && "The PHI must be a pointer");		assert(PhiTy->isPointerTy() && "The PHI must be a pointer");

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp


Show First 20 Lines • Show All 396 Lines • ▼ Show 20 Lines	protected:
/// instruction (shuffle) for loop invariant values and for the induction		/// instruction (shuffle) for loop invariant values and for the induction
/// value. If this is the induction variable then we extend it to N, N+1, ...		/// value. If this is the induction variable then we extend it to N, N+1, ...
/// this is needed because each iteration in the loop corresponds to a SIMD		/// this is needed because each iteration in the loop corresponds to a SIMD
/// element.		/// element.
virtual Value *getBroadcastInstrs(Value *V);		virtual Value *getBroadcastInstrs(Value *V);

/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)		/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)
/// to each vector element of Val. The sequence starts at StartIndex.		/// to each vector element of Val. The sequence starts at StartIndex.
virtual Value *getStepVector(Value *Val, int StartIdx, Value *Step);		/// \p Opcode is relevant for FP induction variable.
		virtual Value *getStepVector(Value *Val, int StartIdx, Value *Step,
		Instruction::BinaryOps Opcode =
		Instruction::BinaryOpsEnd);

/// Compute scalar induction steps. \p ScalarIV is the scalar induction		/// Compute scalar induction steps. \p ScalarIV is the scalar induction
/// variable on which to base the steps, \p Step is the size of the step, and		/// variable on which to base the steps, \p Step is the size of the step, and
/// \p EntryVal is the value from the original loop that maps to the steps.		/// \p EntryVal is the value from the original loop that maps to the steps.
/// Note that \p EntryVal doesn't have to be an induction variable (e.g., it		/// Note that \p EntryVal doesn't have to be an induction variable (e.g., it
/// can be a truncate instruction).		/// can be a truncate instruction).
void buildScalarSteps(Value *ScalarIV, Value *Step, Value *EntryVal);		void buildScalarSteps(Value *ScalarIV, Value *Step, Value *EntryVal);

▲ Show 20 Lines • Show All 206 Lines • ▼ Show 20 Lines	InnerLoopUnroller(Loop *OrigLoop, PredicatedScalarEvolution &PSE,
: InnerLoopVectorizer(OrigLoop, PSE, LI, DT, TLI, TTI, AC, ORE, 1,		: InnerLoopVectorizer(OrigLoop, PSE, LI, DT, TLI, TTI, AC, ORE, 1,
UnrollFactor) {}		UnrollFactor) {}

private:		private:
void scalarizeInstruction(Instruction *Instr,		void scalarizeInstruction(Instruction *Instr,
bool IfPredicateStore = false) override;		bool IfPredicateStore = false) override;
void vectorizeMemoryInstruction(Instruction *Instr) override;		void vectorizeMemoryInstruction(Instruction *Instr) override;
Value *getBroadcastInstrs(Value *V) override;		Value *getBroadcastInstrs(Value *V) override;
Value *getStepVector(Value *Val, int StartIdx, Value *Step) override;		Value *getStepVector(Value *Val, int StartIdx, Value *Step,
		Instruction::BinaryOps Opcode =
		Instruction::BinaryOpsEnd) override;
Value *reverseVector(Value *Vec) override;		Value *reverseVector(Value *Vec) override;
};		};

/// \brief Look for a meaningful debug location on the instruction or it's		/// \brief Look for a meaningful debug location on the instruction or it's
/// operands.		/// operands.
static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {		static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {
if (!I)		if (!I)
return I;		return I;
▲ Show 20 Lines • Show All 1,358 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::widenIntInduction(PHINode *IV, VectorParts &Entry,
// in the loop in the common case prior to InstCombine. We will be trading		// in the loop in the common case prior to InstCombine. We will be trading
// one vector extract for each scalar step.		// one vector extract for each scalar step.
if (VF > 1 && ValuesNotWidened->count(IV)) {		if (VF > 1 && ValuesNotWidened->count(IV)) {
auto *EntryVal = Trunc ? cast<Value>(Trunc) : IV;		auto *EntryVal = Trunc ? cast<Value>(Trunc) : IV;
buildScalarSteps(ScalarIV, Step, EntryVal);		buildScalarSteps(ScalarIV, Step, EntryVal);
}		}
}		}

Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,		Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx, Value *Step,
Value *Step) {		Instruction::BinaryOps BinOp) {
		// Create and check the types.
assert(Val->getType()->isVectorTy() && "Must be a vector");		assert(Val->getType()->isVectorTy() && "Must be a vector");
assert(Val->getType()->getScalarType()->isIntegerTy() &&		int VLen = Val->getType()->getVectorNumElements();
"Elem must be an integer");
assert(Step->getType() == Val->getType()->getScalarType() &&		Type *STy = Val->getType()->getScalarType();
"Step has wrong type");		assert((STy->isIntegerTy() || STy->isFloatingPointTy()) &&
// Create the types.		"Induction Step must be an integer or FP");
Type *ITy = Val->getType()->getScalarType();		assert(Step->getType() == STy && "Step has wrong type");
VectorType *Ty = cast<VectorType>(Val->getType());
int VLen = Ty->getNumElements();
SmallVector<Constant *, 8> Indices;		SmallVector<Constant *, 8> Indices;

		if (STy->isIntegerTy()) {
// Create a vector of consecutive numbers from zero to VF.		// Create a vector of consecutive numbers from zero to VF.
for (int i = 0; i < VLen; ++i)		for (int i = 0; i < VLen; ++i)
Indices.push_back(ConstantInt::get(ITy, StartIdx + i));		Indices.push_back(ConstantInt::get(STy, StartIdx + i));

// Add the consecutive indices to the vector value.		// Add the consecutive indices to the vector value.
Constant *Cv = ConstantVector::get(Indices);		Constant *Cv = ConstantVector::get(Indices);
assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");		assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");
Step = Builder.CreateVectorSplat(VLen, Step);		Step = Builder.CreateVectorSplat(VLen, Step);
assert(Step->getType() == Val->getType() && "Invalid step vec");		assert(Step->getType() == Val->getType() && "Invalid step vec");
// FIXME: The newly created binary instructions should contain nsw/nuw flags,		// FIXME: The newly created binary instructions should contain nsw/nuw flags,
// which can be found from the original scalar operations.		// which can be found from the original scalar operations.
Step = Builder.CreateMul(Cv, Step);		Step = Builder.CreateMul(Cv, Step);
return Builder.CreateAdd(Val, Step, "induction");		return Builder.CreateAdd(Val, Step, "induction");
}		}

		// Floating point induction.
		assert((BinOp == Instruction::FAdd || BinOp == Instruction::FSub) &&
		"Binary Opcode should be specified for FP induction");
		// Create a vector of consecutive numbers from zero to VF.
		for (int i = 0; i < VLen; ++i)
		Indices.push_back(ConstantFP::get(STy, (double)(StartIdx + i)));

		// Add the consecutive indices to the vector value.
		Constant *Cv = ConstantVector::get(Indices);

		Step = Builder.CreateVectorSplat(VLen, Step);

		// Floating point operations had to be 'fast' to enable the induction.
		FastMathFlags Flags;
		Flags.setUnsafeAlgebra();

		Value *MulOp = Builder.CreateFMul(Cv, Step);
		if (isa<Instruction>(MulOp))
		// Have to check, MulOp may be a constant
		cast<Instruction>(MulOp)->setFastMathFlags(Flags);

		Value *BOp = Builder.CreateBinOp(BinOp, Val, MulOp, "induction");
		if (isa<Instruction>(BOp))
		cast<Instruction>(BOp)->setFastMathFlags(Flags);
		return BOp;
		}
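The floating-point path of getStepVector adds (or subtracts) a per-lane offset (StartIdx + i) * Step to the broadcast value. A minimal standalone sketch of the lane arithmetic, using std::vector<float> in place of an IR vector; the names StepOp and fpStepVector are hypothetical and only illustrate the computation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the FAdd/FSub opcode passed as BinOp.
enum class StepOp { FAdd, FSub };

// Lane i of the result is Val[i] op ((StartIdx + i) * Step), mirroring the
// fmul of the constant index vector by the splatted step, followed by the
// fadd or fsub with the broadcast induction value.
inline std::vector<float> fpStepVector(const std::vector<float> &Val,
                                       int StartIdx, float Step, StepOp Op) {
  std::vector<float> Out(Val.size());
  for (std::size_t I = 0; I < Val.size(); ++I) {
    float Offset = static_cast<float>(StartIdx + static_cast<int>(I)) * Step;
    Out[I] = Op == StepOp::FAdd ? Val[I] + Offset : Val[I] - Offset;
  }
  return Out;
}
```

With VF = 4 and an interleave count of 2, the second part is built with StartIdx = 4, which corresponds to the <4.0, 5.0, 6.0, 7.0> constant checked in the VEC4_INTERL2 test lines.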

void InnerLoopVectorizer::buildScalarSteps(Value *ScalarIV, Value *Step,		void InnerLoopVectorizer::buildScalarSteps(Value *ScalarIV, Value *Step,
Value *EntryVal) {		Value *EntryVal) {

// We shouldn't have to build scalar steps if we aren't vectorizing.		// We shouldn't have to build scalar steps if we aren't vectorizing.
assert(VF > 1 && "VF should be greater than one");		assert(VF > 1 && "VF should be greater than one");

// Get the value type and ensure it and the step have the same integer type.		// Get the value type and ensure it and the step have the same integer type.
Type *ScalarIVTy = ScalarIV->getType()->getScalarType();		Type *ScalarIVTy = ScalarIV->getType()->getScalarType();
▲ Show 20 Lines • Show All 1,055 Lines • ▼ Show 20 Lines	for (auto &InductionEntry : *List) {
PHINode *BCResumeVal = PHINode::Create(		PHINode *BCResumeVal = PHINode::Create(
OrigPhi->getType(), 3, "bc.resume.val", ScalarPH->getTerminator());		OrigPhi->getType(), 3, "bc.resume.val", ScalarPH->getTerminator());
Value *EndValue;		Value *EndValue;
if (OrigPhi == OldInduction) {		if (OrigPhi == OldInduction) {
// We know what the end value is.		// We know what the end value is.
EndValue = CountRoundDown;		EndValue = CountRoundDown;
} else {		} else {
IRBuilder<> B(LoopBypassBlocks.back()->getTerminator());		IRBuilder<> B(LoopBypassBlocks.back()->getTerminator());
Value *CRD = B.CreateSExtOrTrunc(CountRoundDown,		Type *StepType = II.getStep()->getType();
II.getStep()->getType(), "cast.crd");		Instruction::CastOps CastOp =
		CastInst::getCastOpcode(CountRoundDown, true, StepType, true);
		Value *CRD = B.CreateCast(CastOp, CountRoundDown, StepType, "cast.crd");
const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();
EndValue = II.transform(B, CRD, PSE.getSE(), DL);		EndValue = II.transform(B, CRD, PSE.getSE(), DL);
EndValue->setName("ind.end");		EndValue->setName("ind.end");
}		}

// The new PHI merges the original incoming value, in case of a bypass,		// The new PHI merges the original incoming value, in case of a bypass,
// or the value at the end of the vectorized loop.		// or the value at the end of the vectorized loop.
BCResumeVal->addIncoming(EndValue, MiddleBlock);		BCResumeVal->addIncoming(EndValue, MiddleBlock);
▲ Show 20 Lines • Show All 930 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::widenPHIInstruction(

// FIXME: The newly created binary instructions should contain nsw/nuw flags,		// FIXME: The newly created binary instructions should contain nsw/nuw flags,
// which can be found from the original scalar operations.		// which can be found from the original scalar operations.
switch (II.getKind()) {		switch (II.getKind()) {
case InductionDescriptor::IK_NoInduction:		case InductionDescriptor::IK_NoInduction:
llvm_unreachable("Unknown induction");		llvm_unreachable("Unknown induction");
case InductionDescriptor::IK_IntInduction:		case InductionDescriptor::IK_IntInduction:
return widenIntInduction(P, Entry);		return widenIntInduction(P, Entry);
case InductionDescriptor::IK_PtrInduction:		case InductionDescriptor::IK_PtrInduction: {
// Handle the pointer induction variable case.		// Handle the pointer induction variable case.
assert(P->getType()->isPointerTy() && "Unexpected type.");		assert(P->getType()->isPointerTy() && "Unexpected type.");
// This is the normalized GEP that starts counting at zero.		// This is the normalized GEP that starts counting at zero.
Value *PtrInd = Induction;		Value *PtrInd = Induction;
PtrInd = Builder.CreateSExtOrTrunc(PtrInd, II.getStep()->getType());		PtrInd = Builder.CreateSExtOrTrunc(PtrInd, II.getStep()->getType());
// This is the vector of results. Notice that we don't generate		// This is the vector of results. Notice that we don't generate
// vector geps because scalar geps result in better code.		// vector geps because scalar geps result in better code.
for (unsigned part = 0; part < UF; ++part) {		for (unsigned part = 0; part < UF; ++part) {
Show All 16 Lines	for (unsigned part = 0; part < UF; ++part) {
SclrGep->setName("next.gep");		SclrGep->setName("next.gep");
VecVal = Builder.CreateInsertElement(VecVal, SclrGep,		VecVal = Builder.CreateInsertElement(VecVal, SclrGep,
Builder.getInt32(i), "insert.gep");		Builder.getInt32(i), "insert.gep");
}		}
Entry[part] = VecVal;		Entry[part] = VecVal;
}		}
return;		return;
}		}
		case InductionDescriptor::IK_FpInduction: {
		assert(P->getType() == II.getStartValue()->getType() &&
		"Types must match");
		// Handle other induction variables that are now based on the
		// canonical one.
		assert(P != OldInduction && "Primary induction can be integer only");

		Value *V = Builder.CreateCast(Instruction::SIToFP, Induction, P->getType());
		V = II.transform(Builder, V, PSE.getSE(), DL);
		V->setName("fp.offset.idx");

		// Now we have scalar op: %fp.offset.idx = StartVal +/- Induction*StepVal

		Value *Broadcasted = getBroadcastInstrs(V);
		// After broadcasting the induction variable we need to make the vector
		// consecutive by adding StepVal*0, StepVal*1, StepVal*2, etc.
		Value *StepVal = cast<SCEVUnknown>(II.getStep())->getValue();
		for (unsigned part = 0; part < UF; ++part)
		Entry[part] = getStepVector(Broadcasted, VF * part, StepVal,
		II.getInductionOpcode());
		return;
		}
		}
}		}

void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {		void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {
// For each instruction in the old loop.		// For each instruction in the old loop.
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
VectorParts &Entry = WidenMap.get(&I);		VectorParts &Entry = WidenMap.get(&I);

switch (I.getOpcode()) {		switch (I.getOpcode()) {
▲ Show 20 Lines • Show All 469 Lines • ▼ Show 20 Lines
void LoopVectorizationLegality::addInductionPhi(		void LoopVectorizationLegality::addInductionPhi(
PHINode *Phi, const InductionDescriptor &ID,		PHINode *Phi, const InductionDescriptor &ID,
SmallPtrSetImpl<Value *> &AllowedExit) {		SmallPtrSetImpl<Value *> &AllowedExit) {
Inductions[Phi] = ID;		Inductions[Phi] = ID;
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
const DataLayout &DL = Phi->getModule()->getDataLayout();		const DataLayout &DL = Phi->getModule()->getDataLayout();

// Get the widest type.		// Get the widest type.
		if (!PhiTy->isFloatingPointTy()) {
if (!WidestIndTy)		if (!WidestIndTy)
WidestIndTy = convertPointerToIntegerType(DL, PhiTy);		WidestIndTy = convertPointerToIntegerType(DL, PhiTy);
else		else
WidestIndTy = getWiderType(DL, PhiTy, WidestIndTy);		WidestIndTy = getWiderType(DL, PhiTy, WidestIndTy);
		}

// Int inductions are special because we only allow one IV.		// Int inductions are special because we only allow one IV.
if (ID.getKind() == InductionDescriptor::IK_IntInduction &&		if (ID.getKind() == InductionDescriptor::IK_IntInduction &&
ID.getConstIntStepValue() &&		ID.getConstIntStepValue() &&
ID.getConstIntStepValue()->isOne() &&		ID.getConstIntStepValue()->isOne() &&
isa<Constant>(ID.getStartValue()) &&		isa<Constant>(ID.getStartValue()) &&
cast<Constant>(ID.getStartValue())->isNullValue()) {		cast<Constant>(ID.getStartValue())->isNullValue()) {

▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines	for (Instruction &I : *BB) {
if (RedDes.hasUnsafeAlgebra())		if (RedDes.hasUnsafeAlgebra())
Requirements->addUnsafeAlgebraInst(RedDes.getUnsafeAlgebraInst());		Requirements->addUnsafeAlgebraInst(RedDes.getUnsafeAlgebraInst());
AllowedExit.insert(RedDes.getLoopExitInstr());		AllowedExit.insert(RedDes.getLoopExitInstr());
Reductions[Phi] = RedDes;		Reductions[Phi] = RedDes;
continue;		continue;
}		}

InductionDescriptor ID;		InductionDescriptor ID;
if (InductionDescriptor::isInductionPHI(Phi, PSE, ID)) {		if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID)) {
addInductionPhi(Phi, ID, AllowedExit);		addInductionPhi(Phi, ID, AllowedExit);
		if (ID.hasUnsafeAlgebra() && !HasFunNoNaNAttr)
		Requirements->addUnsafeAlgebraInst(ID.getUnsafeAlgebraInst());
continue;		continue;
}		}

if (RecurrenceDescriptor::isFirstOrderRecurrence(Phi, TheLoop, DT)) {		if (RecurrenceDescriptor::isFirstOrderRecurrence(Phi, TheLoop, DT)) {
FirstOrderRecurrences.insert(Phi);		FirstOrderRecurrences.insert(Phi);
continue;		continue;
}		}

// As a last resort, coerce the PHI to a AddRec expression		// As a last resort, coerce the PHI to a AddRec expression
// and re-try classifying it a an induction PHI.		// and re-try classifying it a an induction PHI.
if (InductionDescriptor::isInductionPHI(Phi, PSE, ID, true)) {		if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID, true)) {
addInductionPhi(Phi, ID, AllowedExit);		addInductionPhi(Phi, ID, AllowedExit);
continue;		continue;
}		}

emitAnalysis(VectorizationReport(Phi)		emitAnalysis(VectorizationReport(Phi)
<< "value that could not be identified as "		<< "value that could not be identified as "
"reduction is used outside the loop");		"reduction is used outside the loop");
DEBUG(dbgs() << "LV: Found an unidentified PHI." << *Phi << "\n");		DEBUG(dbgs() << "LV: Found an unidentified PHI." << *Phi << "\n");
▲ Show 20 Lines • Show All 1,670 Lines • ▼ Show 20 Lines	void InnerLoopUnroller::vectorizeMemoryInstruction(Instruction *Instr) {

return scalarizeInstruction(Instr, IfPredicateStore);		return scalarizeInstruction(Instr, IfPredicateStore);
}		}

Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }		Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }

Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }		Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }

Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step) {		Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step,
		Instruction::BinaryOps BinOp) {
// When unrolling and the VF is 1, we only need to add a simple scalar.		// When unrolling and the VF is 1, we only need to add a simple scalar.
Type *ITy = Val->getType();		Type *Ty = Val->getType();
assert(!ITy->isVectorTy() && "Val must be a scalar");		assert(!Ty->isVectorTy() && "Val must be a scalar");
Constant *C = ConstantInt::get(ITy, StartIdx);
		if (Ty->isFloatingPointTy()) {
		Constant *C = ConstantFP::get(Ty, (double)StartIdx);

		// Floating point operations had to be 'fast' to enable the unrolling.
		Value *MulOp = addFastMathFlag(Builder.CreateFMul(C, Step));
		return addFastMathFlag(Builder.CreateBinOp(BinOp, Val, MulOp));
		}
		Constant *C = ConstantInt::get(Ty, StartIdx);
return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");		return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");
}		}

static void AddRuntimeUnrollDisableMetaData(Loop *L) {		static void AddRuntimeUnrollDisableMetaData(Loop *L) {
SmallVector<Metadata *, 4> MDs;		SmallVector<Metadata *, 4> MDs;
// Reserve first location for self reference to the LoopID metadata node.		// Reserve first location for self reference to the LoopID metadata node.
MDs.push_back(nullptr);		MDs.push_back(nullptr);
bool IsUnrollMetadata = false;		bool IsUnrollMetadata = false;

llvm/trunk/test/Transforms/LoopVectorize/X86/float-induction-x86.ll

				; RUN: opt < %s -O3 -mcpu=core-avx2 -mtriple=x86_64-unknown-linux-gnu -S | FileCheck --check-prefix AUTO_VEC %s

				; This test checks auto-vectorization with FP induction variable.
				; The FP operation is not "fast" and requires "fast-math" function attribute.

				;void fp_iv_loop1(float * __restrict__ A, int N) {
				; float x = 1.0;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x += 0.5;
				; }
				;}


				; AUTO_VEC-LABEL: @fp_iv_loop1(
				; AUTO_VEC: vector.body
				; AUTO_VEC: store <8 x float>

				define void @fp_iv_loop1(float* noalias nocapture %A, i32 %N) #0 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%x.06 = phi float [ %conv1, %for.body ], [ 1.000000e+00, %for.body.preheader ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.06, float* %arrayidx, align 4
				%conv1 = fadd float %x.06, 5.000000e-01
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				; The same as the previous, FP operation is not fast, different function attribute
				; Vectorization should be rejected.
				;void fp_iv_loop2(float * __restrict__ A, int N) {
				; float x = 1.0;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x += 0.5;
				; }
				;}

				; AUTO_VEC-LABEL: @fp_iv_loop2(
				; AUTO_VEC-NOT: vector.body
				; AUTO_VEC-NOT: store <{{.*}} x float>

				define void @fp_iv_loop2(float* noalias nocapture %A, i32 %N) #1 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%x.06 = phi float [ %conv1, %for.body ], [ 1.000000e+00, %for.body.preheader ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.06, float* %arrayidx, align 4
				%conv1 = fadd float %x.06, 5.000000e-01
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				attributes #0 = { "no-nans-fp-math"="true" }
				attributes #1 = { "no-nans-fp-math"="false" }

llvm/trunk/test/Transforms/LoopVectorize/float-induction.ll

				; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -dce -instcombine -S | FileCheck --check-prefix VEC4_INTERL1 %s
				; RUN: opt < %s -loop-vectorize -force-vector-interleave=2 -force-vector-width=4 -dce -instcombine -S | FileCheck --check-prefix VEC4_INTERL2 %s
				; RUN: opt < %s -loop-vectorize -force-vector-interleave=2 -force-vector-width=1 -dce -instcombine -S | FileCheck --check-prefix VEC1_INTERL2 %s

				; VEC4_INTERL1-LABEL: @fp_iv_loop1(
				; VEC4_INTERL1: %[[FP_INC:.*]] = load float, float* @fp_inc
				; VEC4_INTERL1: vector.body:
				; VEC4_INTERL1: %[[FP_INDEX:.*]] = sitofp i64 {{.*}} to float
				; VEC4_INTERL1: %[[VEC_INCR:.*]] = fmul fast float {{.*}}, %[[FP_INDEX]]
				; VEC4_INTERL1: %[[FP_OFFSET_IDX:.*]] = fsub fast float %init, %[[VEC_INCR]]
				; VEC4_INTERL1: %[[BRCT_INSERT:.*]] = insertelement <4 x float> undef, float %[[FP_OFFSET_IDX]], i32 0
				; VEC4_INTERL1-NEXT: %[[BRCT_SPLAT:.*]] = shufflevector <4 x float> %[[BRCT_INSERT]], <4 x float> undef, <4 x i32> zeroinitializer
				; VEC4_INTERL1: %[[BRCT_INSERT:.*]] = insertelement {{.*}} %[[FP_INC]]
				; VEC4_INTERL1-NEXT: %[[FP_INC_BCST:.*]] = shufflevector <4 x float> %[[BRCT_INSERT]], {{.*}} zeroinitializer
				; VEC4_INTERL1: %[[VSTEP:.*]] = fmul fast <4 x float> %[[FP_INC_BCST]], <float 0.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00>
				; VEC4_INTERL1-NEXT: %[[VEC_INDUCTION:.*]] = fsub fast <4 x float> %[[BRCT_SPLAT]], %[[VSTEP]]
				; VEC4_INTERL1: store <4 x float> %[[VEC_INDUCTION]]

				; VEC4_INTERL2-LABEL: @fp_iv_loop1(
				; VEC4_INTERL2: %[[FP_INC:.*]] = load float, float* @fp_inc
				; VEC4_INTERL2: vector.body:
				; VEC4_INTERL2: %[[INDEX:.*]] = sitofp i64 {{.*}} to float
				; VEC4_INTERL2: %[[VEC_INCR:.*]] = fmul fast float %{{.*}}, %[[INDEX]]
				; VEC4_INTERL2: fsub fast float %init, %[[VEC_INCR]]
				; VEC4_INTERL2: %[[VSTEP1:.*]] = fmul fast <4 x float> %{{.*}}, <float 0.000000e+00, float 1.000000e+00, float 2.000000e+00, float 3.000000e+00>
				; VEC4_INTERL2-NEXT: %[[VEC_INDUCTION1:.*]] = fsub fast <4 x float> {{.*}}, %[[VSTEP1]]
				; VEC4_INTERL2: %[[VSTEP2:.*]] = fmul fast <4 x float> %{{.*}}, <float 4.000000e+00, float 5.000000e+00, float 6.000000e+00, float 7.000000e+00>
				; VEC4_INTERL2-NEXT: %[[VEC_INDUCTION2:.*]] = fsub fast <4 x float> {{.*}}, %[[VSTEP2]]
				; VEC4_INTERL2: store <4 x float> %[[VEC_INDUCTION1]]
				; VEC4_INTERL2: store <4 x float> %[[VEC_INDUCTION2]]

				; VEC1_INTERL2-LABEL: @fp_iv_loop1(
				; VEC1_INTERL2: %[[FP_INC:.*]] = load float, float* @fp_inc
				; VEC1_INTERL2: vector.body:
				; VEC1_INTERL2: %[[INDEX:.*]] = sitofp i64 {{.*}} to float
				; VEC1_INTERL2: %[[STEP:.*]] = fmul fast float %{{.*}}, %[[INDEX]]
				; VEC1_INTERL2: %[[FP_OFFSET_IDX:.*]] = fsub fast float %init, %[[STEP]]
				; VEC1_INTERL2: %[[SCALAR_INDUCTION2:.*]] = fsub fast float %[[FP_OFFSET_IDX]], %[[FP_INC]]
				; VEC1_INTERL2: store float %[[FP_OFFSET_IDX]]
				; VEC1_INTERL2: store float %[[SCALAR_INDUCTION2]]

				@fp_inc = common global float 0.000000e+00, align 4

				;void fp_iv_loop1(float init, float * __restrict__ A, int N) {
				; float x = init;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x -= fp_inc;
				; }
				;}

				define void @fp_iv_loop1(float %init, float* noalias nocapture %A, i32 %N) #1 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.lr.ph, label %for.end

				for.body.lr.ph: ; preds = %entry
				%fpinc = load float, float* @fp_inc, align 4
				br label %for.body

				for.body: ; preds = %for.body, %for.body.lr.ph
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%x.05 = phi float [ %init, %for.body.lr.ph ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.05, float* %arrayidx, align 4
				%add = fsub fast float %x.05, %fpinc
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				;void fp_iv_loop2(float init, float * __restrict__ A, int N) {
				; float x = init;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x += 0.5;
				; }
				;}

				; VEC4_INTERL1-LABEL: @fp_iv_loop2(
				; VEC4_INTERL1: vector.body
				; VEC4_INTERL1: %[[index:.*]] = phi i64 [ 0, %vector.ph ]
				; VEC4_INTERL1: sitofp i64 %[[index]] to float
				; VEC4_INTERL1: %[[VAR1:.*]] = fmul fast float {{.*}}, 5.000000e-01
				; VEC4_INTERL1: %[[VAR2:.*]] = fadd fast float %[[VAR1]]
				; VEC4_INTERL1: insertelement <4 x float> undef, float %[[VAR2]], i32 0
				; VEC4_INTERL1: shufflevector <4 x float> {{.*}}, <4 x float> undef, <4 x i32> zeroinitializer
				; VEC4_INTERL1: fadd fast <4 x float> {{.*}}, <float 0.000000e+00, float 5.000000e-01, float 1.000000e+00, float 1.500000e+00>
				; VEC4_INTERL1: store <4 x float>

				define void @fp_iv_loop2(float %init, float* noalias nocapture %A, i32 %N) #0 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%x.06 = phi float [ %conv1, %for.body ], [ %init, %for.body.preheader ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.06, float* %arrayidx, align 4
				%conv1 = fadd fast float %x.06, 5.000000e-01
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				;void fp_iv_loop3(float init, float * __restrict__ A, float * __restrict__ B, float * __restrict__ C, int N) {
				; int i = 0;
				; float x = init;
				; float y = 0.1;
				; for (; i < N; ++i) {
				; A[i] = x;
				; x += fp_inc;
				; y -= 0.5;
				; B[i] = x + y;
				; C[i] = y;
				; }
				;}
				; VEC4_INTERL1-LABEL: @fp_iv_loop3(
				; VEC4_INTERL1: vector.body
				; VEC4_INTERL1: %[[index:.*]] = phi i64 [ 0, %vector.ph ]
				; VEC4_INTERL1: sitofp i64 %[[index]] to float
				; VEC4_INTERL1: %[[VAR1:.*]] = fmul fast float {{.*}}, -5.000000e-01
				; VEC4_INTERL1: fadd fast float %[[VAR1]]
				; VEC4_INTERL1: fadd fast <4 x float> {{.*}}, <float -5.000000e-01, float -1.000000e+00, float -1.500000e+00, float -2.000000e+00>
				; VEC4_INTERL1: store <4 x float>

				define void @fp_iv_loop3(float %init, float* noalias nocapture %A, float* noalias nocapture %B, float* noalias nocapture %C, i32 %N) #1 {
				entry:
				%cmp9 = icmp sgt i32 %N, 0
				br i1 %cmp9, label %for.body.lr.ph, label %for.end

				for.body.lr.ph: ; preds = %entry
				%0 = load float, float* @fp_inc, align 4
				br label %for.body

				for.body: ; preds = %for.body, %for.body.lr.ph
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%y.012 = phi float [ 0x3FB99999A0000000, %for.body.lr.ph ], [ %conv1, %for.body ]
				%x.011 = phi float [ %init, %for.body.lr.ph ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.011, float* %arrayidx, align 4
				%add = fadd fast float %x.011, %0
				%conv1 = fadd fast float %y.012, -5.000000e-01
				%add2 = fadd fast float %conv1, %add
				%arrayidx4 = getelementptr inbounds float, float* %B, i64 %indvars.iv
				store float %add2, float* %arrayidx4, align 4
				%arrayidx6 = getelementptr inbounds float, float* %C, i64 %indvars.iv
				store float %conv1, float* %arrayidx6, align 4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}

				; Start and step values are constants. There is no 'fmul' operation in this case
				;void fp_iv_loop4(float * __restrict__ A, int N) {
				; float x = 1.0;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x += 0.5;
				; }
				;}

				; VEC4_INTERL1-LABEL: @fp_iv_loop4(
				; VEC4_INTERL1: vector.body
				; VEC4_INTERL1-NOT: fmul fast <4 x float>
				; VEC4_INTERL1: %[[induction:.*]] = fadd fast <4 x float> %{{.*}}, <float 0.000000e+00, float 5.000000e-01, float 1.000000e+00, float 1.500000e+00>
				; VEC4_INTERL1: store <4 x float> %[[induction]]

				define void @fp_iv_loop4(float* noalias nocapture %A, i32 %N) {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%x.06 = phi float [ %conv1, %for.body ], [ 1.000000e+00, %for.body.preheader ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.06, float* %arrayidx, align 4
				%conv1 = fadd fast float %x.06, 5.000000e-01
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}
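
The CHECK lines for fp_iv_loop2 expect the vector body to rebuild the FP induction value from the integer index instead of carrying it through a scalar add chain. A minimal C sketch of that closed form follows, assuming VF=4 and no interleaving. The function names `fp_iv_scalar` and `fp_iv_closed_form_vf4` are hypothetical and written by hand for illustration; this models what the checked IR computes and is not the vectorizer's implementation.

```c
/* Scalar reference: the source loop of fp_iv_loop2 (x += 0.5 per iteration). */
void fp_iv_scalar(float init, float *A, int N) {
    float x = init;
    for (int i = 0; i < N; ++i) {
        A[i] = x;
        x += 0.5f;
    }
}

/* Hypothetical model of the VF=4 vector body checked above. */
void fp_iv_closed_form_vf4(float init, float *A, int N) {
    const float step = 0.5f;
    /* Per-lane offsets, as in the <0.0, 0.5, 1.0, 1.5> constant vector. */
    const float lane_offset[4] = {0.0f, 0.5f, 1.0f, 1.5f};
    int i = 0;
    for (; i + 4 <= N; i += 4) {
        /* sitofp index; fmul by the step; fadd to the start value. */
        float base = init + (float)i * step;
        /* Splat base across the lanes, then add the lane offsets. */
        for (int lane = 0; lane < 4; ++lane)
            A[i + lane] = base + lane_offset[lane];
    }
    /* Scalar remainder, resumed from the closed-form value at index i. */
    float x = init + (float)i * step;
    for (; i < N; ++i) {
        A[i] = x;
        x += step;
    }
}
```

With a step of 0.5 and small start values every intermediate is exactly representable, so both functions produce identical arrays. For a step such as the runtime value of fp_inc, the closed form can round differently from the add chain, which is presumably why the test IR carries the `fast` flag on the FP operations.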

Loop vectorization with FP induction variables
Closed, Public

Revision Contents

Diff 65266

llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h

llvm/trunk/lib/Transforms/Scalar/LoopInterchange.cpp

llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp

llvm/trunk/test/Transforms/LoopVectorize/X86/float-induction-x86.ll

llvm/trunk/test/Transforms/LoopVectorize/float-induction.ll
