This is an archive of the discontinued LLVM Phabricator instance.

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	Why do we need this?
lib/Transforms/Utils/LoopUtils.cpp
758 ↗	(On Diff #60711)	Looking at the users of `isInductionPHI`, looks like it should be easy to pass in the `Loop *` directly? If so, can we just have this check be: `if (L->getHeader() != Phi->getParent()) return false;` ? That way we won't have to add the `getLoopFor` interface to SCEV (which does not look like it belongs there).

delena added inline comments. Jun 20 2016, 10:12 AM

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	I defined Step as unknown SCEV, but underlying value is of floating point type. In this case expand() just returns the underlying value, but it fails on one of these two lines.
lib/Transforms/Utils/LoopUtils.cpp
758 ↗	(On Diff #60711)	Yes, of course. I'll change.

mkuper added inline comments. Jun 20 2016, 12:42 PM

lib/Transforms/Utils/LoopUtils.cpp
675 ↗	(On Diff #60711)	Maybe ((IK == IK_FpInduction) || Step->getType()->isIntegerTy()) would be clearer.
678 ↗	(On Diff #60711)	Maybe put this on line 668, next to the two asserts for the other types?
733 ↗	(On Diff #60711)	As long as you're touching this - maybe hoist this assert for all 3 cases above the switch?
739 ↗	(On Diff #60711)	Why do we need this?
767 ↗	(On Diff #60711)	Some variable naming nitpicking: Bb -> either BB or bb? BEValueV -> BEValue, StartValueV -> StartValue.
772 ↗	(On Diff #60711)	Do we really need the if here? We expect L->contains(BB) to be true for one of the incoming values, and false for the other, right? So if we're in the "else", we're already in a bad shape, regardless of whether BEValueV == V or not. Or did I misunderstand?
776 ↗	(On Diff #60711)	Same as above.
780 ↗	(On Diff #60711)	Can we replace this with an early exit?
801 ↗	(On Diff #60711)	Are you sure this is always a safe place to insert? E.g. what if Addend is a PHI in an outer loop?
827 ↗	(On Diff #60711)	Why not isFloatingPointTy() here?
lib/Transforms/Vectorize/LoopVectorize.cpp
2154 ↗	(On Diff #60711)	ITy here stands for "IntegerTy", I guess. Perhaps rename, now that ITy can be float? (Unless ITy stands for IndexTy, which also makes sense, and in which case we should also probably rename. :-) )
2178 ↗	(On Diff #60711)	This looks a bit odd. If I understand correctly, you're relying on the step being FNeg to distinguish whether the original direction of the loop was positive or negative (by transforming a phi that feeds an fsub into a phi that feeds an fadd with an fneg). What happens if the scalar loop has an fadd, where the original step is a loop-invariant fneg?

mkuper added inline comments. Jun 20 2016, 12:42 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	That sounds a bit weird to me. It looks like the contract for expandCodeFor is that if you pass a type, you get the expansion cast to this type. What happens if you pass a type, but get a result that's from a different type than you passed (because it's not scevable)?

Michael, thanks a lot for your comments. I'll upload a new patch.

lib/Analysis/ScalarEvolutionExpander.cpp
1610 ↗	(On Diff #60711)	I don't know why we need to pass the type. Maybe for casting between i64 and PointerType? But an Unknown SCEV may have any type, right?
lib/Transforms/Utils/LoopUtils.cpp
678 ↗	(On Diff #60711)	All these checks belong to the Step type. See comment #669.
733 ↗	(On Diff #60711)	Ok
739 ↗	(On Diff #60711)	I should cover the fsub operation somehow: for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; } I keep Step as FNeg(fp_inc) in this case.
772 ↗	(On Diff #60711)	I just copied this code from integer Phi together with weird variable names. Can we have multiple bbs inside loop?
801 ↗	(On Diff #60711)	This FNeg disappears after transformation. It is just a way to distinguish between FAdd and FSub.
827 ↗	(On Diff #60711)	isFloatingPointTy() includes more than half, float and double. I'm not sure I want to cover other types.
lib/Transforms/Vectorize/LoopVectorize.cpp
2178 ↗	(On Diff #60711)	I don't see any problem. x + (-a) is equal to x - a.
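
The claim that x + (-a) equals x - a can be checked at the bit level. The following is a minimal sketch, assuming IEEE 754 single precision with 32-bit float; the helper names bitsOf and subMatchesAddOfNeg are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: returns the raw bit pattern of a float.
std::uint32_t bitsOf(float V) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// Hypothetical helper: compares the fsub form against the fadd-of-fneg form.
// Returns true when both produce the same bit pattern.
bool subMatchesAddOfNeg(float X, float A) {
  float ViaSub = X - A;    // Shape of the original scalar loop (fsub).
  float ViaAdd = X + (-A); // Shape with the step kept as FNeg (fadd).
  return bitsOf(ViaSub) == bitsOf(ViaAdd);
}
```

This only illustrates the arithmetic identity for ordinary (non-NaN) inputs. It says nothing about where an fneg instruction may legally be inserted, which is the separate concern raised in the review.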

Thanks, Elena.

I guess there's one thing I don't understand about the fnegs - why are you using an IR instruction as the marker of whether the original induction had an fadd or an fsub, instead of a property of the IV? It would make sense to me if you actually used the fneg to feed the vector induction, thus simplifying the code (not having to special-case the sub), but instead you special-case it anyway by looking through the fneg.

Another thing I forgot earlier - this should only fire when we're in unsafe/fast math mode, right? Is there a check for that?

lib/Transforms/Utils/LoopUtils.cpp
739 ↗	(On Diff #60711)	I understand, what I mean is - let's say the step is an FNeg. Why can't you feed the FNeg directly into the CreateFMul? Do we get worse code?
772 ↗	(On Diff #60711)	Right now, I think not. (and that needs to be fixed.) :-\ Anyway, if this is a verbatim copy from the int case, leave it be for now. Should probably be fixed separately. (Of course, that raises the question - can the int and the fp case share the code? Or is it not worth it?)
801 ↗	(On Diff #60711)	When does it disappear? If it disappears in later clean-up, we still don't want to insert it at an illegal location (like between two phis). If it's guaranteed to be deleted during the run of the vectorizer, I guess this will technically work, but I'd still prefer to avoid it.
lib/Transforms/Vectorize/LoopVectorize.cpp
2178 ↗	(On Diff #60711)	What I'm trying to say is that it's weird that the behavior would be different based on whether the step is an fneg. It could be an fneg because you added an fneg, it could be an fneg because there was already an fneg. Why does this code look through the fneg? It doesn't seem like this should be the vectorizer's job. Although if the fneg won't get cleaned up later, this is probably the right thing to do.

In D21330#462655, @mkuper wrote:

I guess there's one thing I don't understand about the fnegs - why are you using an IR instruction as the marker of whether the original induction had an fadd or an fsub, instead of a property of the IV? It would make sense to me if you actually used the fneg to feed the vector induction, thus simplifying the code (not having to special-case the sub), but instead you special-case it anyway by looking through the fneg.

What do you mean by "property of IV" ? Do you suggest to add a special field to InductionDescriptor?
Fneg should be a part of Step. But Step is a SCEV and I was not allowed to implement FP SCEV.

In D21330#462655, @mkuper wrote:

Another thing I forgot earlier - this should only fire when we're in unsafe/fast math mode, right? Is there a check for that?

FP reduction is allowed for +/- operators. Only max/min checks for unsafe.

What do you mean by "property of IV" ? Do you suggest to add a special field to InductionDescriptor?
Fneg should be a part of Step. But Step is a SCEV and I was not allowed to implement FP SCEV.

You know, I probably just don't understand the SCEV situation well enough.
Sanjoy, any chance you could take a look, even though part of it is in LoopVectorize.cpp? :-)

Changed handling of FSUB operation. I keep the original binary operation inside Induction Descriptor.
Allow FP induction in fast-math mode only.

Ping *

delena added reviewers: Ayal, dorit, gilr. Jul 5 2016, 1:06 AM

Hi Elena,

Sorry for the delay, I was out on vacation. I'm catching up on email now, and I will try to get to review this this week.

Thanks!

Some comments inline. Please let me know if you want me to look at something specific, but I'm not familiar enough with the code this patch touches to lgtm it.

../lib/Analysis/ScalarEvolutionExpander.cpp
1614	This is odd -- is it just to help keep the `Step` as a `SCEV *`? If so, I'd suggest solving that within `InductionDescriptor` itself (i.e. maybe support having the step as either a `SCEV *` or a `Value *`, depending on the type of the `InductionDescriptor`?).
../lib/Transforms/Utils/LoopUtils.cpp
762	(Not for fixing in this change) looks like a better interface would be to return an `Optional<InductionDescriptor>`?
787	Maybe use a `dyn_cast` here?
811	The condition looks inverted?

This revision now requires changes to proceed. Jul 15 2016, 12:43 AM

delena marked an inline comment as done. Jul 17 2016, 5:35 AM

delena added inline comments.

../lib/Analysis/ScalarEvolutionExpander.cpp
1614	I'm dropping this change, I don't need it anymore.
../lib/Transforms/Utils/LoopUtils.cpp
787	Ok.
811	The hasUnsafeAlgebra() check means that the instruction itself has the "fast" attribute. In this case we don't need an additional check. But if the BOp does not have the "fast" attribute, the FP transformation must be allowed at the function level. I'll add a comment.

Some changes according to Sanjoy's comment. Thanks Sanjoy and Michael for review.
Still looking for somebody who can accept this patch. Added @sbaranga, who made similar changes in induction variables.

I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations.

If someone - e.g. sanjoy - OK's the design, I can LGTM the LV code change.

mkuper added inline comments. Jul 18 2016, 11:20 AM

../include/llvm/Transforms/Utils/LoopUtils.h
274	I think Instruction::BinaryOpsEnd may be better for an explicitly invalid BinaryOp. Not sure that's a good choice, but pretty sure 0 isn't.
../lib/Transforms/Utils/LoopUtils.cpp
787	I think what sanjoy meant was: BinaryOperator *BOp = dyn_cast<BinaryOperator>(BEValue); if (!BOp) return false;
../lib/Transforms/Vectorize/LoopVectorize.cpp
4347	Main -> Primary (I think we use that consistently)
6659	Can you add a test for this? All of the tests you added force UF == 1.
6662	Are you sure about this? I mean, it's true for vectorizing, but is it true here as well? (I'm not saying it isn't, just making sure this is intentional)

In D21330#487162, @mkuper wrote:

I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations.

The bottom line of the FP SCEV discussion was the point that FP SCEV is overkill for "secondary" IV (like in the example above). We'll need FP SCEV for primary FP IV like
for (float f =0.0; f < g; f+=0.5) {}.
But such loops are rare and most of them can be re-mapped to integers.
The suggestion was to include FP IV to the current InductionDescriptor.
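
To illustrate what "re-mapped to integers" can look like for a primary FP IV, here is a minimal sketch. It assumes a positive step whose multiples are exactly representable (such as 0.5) and small bounds; the function names are hypothetical and the code is not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: trip count of the FP-driven loop
//   for (float f = 0.0f; f < G; f += Step) {}
int tripCountFloatLoop(float G, float Step) {
  assert(Step > 0.0f && "sketch assumes a positive step");
  int Count = 0;
  for (float F = 0.0f; F < G; F += Step)
    ++Count;
  return Count;
}

// Hypothetical helper: the same loop driven by an integer IV. The trip count
// is computed up front and the FP value becomes a secondary value derived
// from the integer index.
int tripCountIntegerLoop(float G, float Step) {
  assert(Step > 0.0f && "sketch assumes a positive step");
  int N = G > 0.0f ? static_cast<int>(std::ceil(G / Step)) : 0;
  int Count = 0;
  for (int I = 0; I < N; ++I) {
    float F = static_cast<float>(I) * Step; // Secondary FP IV.
    (void)F;
    ++Count;
  }
  return Count;
}
```

When the step is not exactly representable, rounding in the accumulated sum can make the two trip counts differ, so this remapping is only straightforward under the stated assumptions.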

In D21330#487357, @delena wrote:

In D21330#487162, @mkuper wrote:

I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations.

The bottom line of the FP SCEV discussion was the point that FP SCEV is overkill for "secondary" IV (like in the example above). We'll need FP SCEV for primary FP IV like
for (float f =0.0; f < g; f+=0.5) {}.
But such loops are rare and most of them can be re-mapped to integers.
The suggestion was to include FP IV to the current InductionDescriptor.

delena marked 3 inline comments as done. Jul 19 2016, 5:04 AM

delena added inline comments.

../include/llvm/Transforms/Utils/LoopUtils.h
274	Thanks, I'll fix.
../lib/Transforms/Vectorize/LoopVectorize.cpp
6662	Even if you have only unrolling, and VF is 1, the value of the FP induction is calculated as: sitofp(PrimaryIV) * Increment. for (int i=0; i<N; i++) { fp_ind += fp_inc; } is transformed to something like this: init = fp_ind; for (int i=0; i<N; i++) { fp_ind = init + i*fp_inc; } In this case we need unsafe math. I added tests for unrolling.
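
The reason unsafe math is needed can be seen by comparing the two forms directly. The sketch below is illustrative only: the function names are hypothetical, and it models the rewrite described in the comment above, not the vectorizer's actual code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the scalar loop as written. Each iteration performs
// one rounded floating-point add, so rounding error accumulates step by step.
float iterativeInduction(float Start, float Step, int N) {
  assert(N >= 0 && "trip count must be non-negative");
  float X = Start;
  for (int I = 0; I < N; ++I)
    X += Step;
  return X;
}

// Hypothetical helper: the closed form Start + sitofp(I) * Step. It performs
// one multiply and one add, so it rounds differently from the loop above.
float closedFormInduction(float Start, float Step, int I) {
  assert(I >= 0 && "index must be non-negative");
  return Start + static_cast<float>(I) * Step;
}
```

For steps such as 0.5 the two forms agree exactly, but for steps such as 0.1 they are generally only close to each other. That difference is a reassociation of FP operations, which is why the rewrite is restricted to relaxed FP modes.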

Updated the code according to Michael's comments.

LGTM

In D21330#488606, @mkuper wrote:

LGTM

I've just started looking at this too. Please give me a few mins. So far I only encountered minor things.

../include/llvm/Transforms/Utils/LoopUtils.h
308	Fp->FP
328–329	binary op -> induction op is better everywhere. Also I am assuming this is the op that advances the induction variable. You may want to spell this out somewhere.
../lib/Transforms/Utils/LoopUtils.cpp
770	this function is returning a bool

Sorry, didn't realize anyone else is still interested in looking at this, given how long this patch has been up.
Ignore my LGTM. :-)

In D21330#488622, @mkuper wrote:

Sorry, didn't realize anyone else is still interested in looking at this, given how long this patch has been up.
Ignore my LGTM. :-)

No, *I am* sorry for chiming in this late. I felt obliged because you mentioned that no one looked at this from the original llvm-dev thread and I felt bad ;). And thanks for reviewing it, this is looking pretty good.

So with these it should LGTM too. I haven't checked everything (most notably the unsafe math part).

I just wanted to see whether this was in line with the direction set in the original llvm-dev thread and it is! Thanks to all of you and sorry about the delay again.

../lib/Transforms/Utils/LoopUtils.cpp
816–818	I think that a better interface would be to take BOp (step instruction) optionally and then derive DI::hasUnsafeAlgebra and the opcode from that. This is OK as a follow-up if you prefer.
../lib/Transforms/Vectorize/LoopVectorize.cpp
4354–4355	by adding StepVal, you mean

Minor changes according to Adam's comments.

delena added inline comments. Jul 20 2016, 1:38 AM

../include/llvm/Transforms/Utils/LoopUtils.h
328–329	Ok. Changing it to getInductionBinOp()
../lib/Transforms/Utils/LoopUtils.cpp
770	Fixed. Thanks.
816–818	I did it at the beginning and then prefer to follow Reduction implementation, just to be consistent with existing interface.
../lib/Transforms/Vectorize/LoopVectorize.cpp
4354–4355	I fixed the comment. thanks.

Added one more test where init value and step are constants.

You should probably also have a test for the case where there is no "fast" attribute on the step instruction but we can still vectorize with the hints.

../include/llvm/Transforms/Utils/LoopUtils.h
321–323	This comment seems incorrect/misleading. I think this flag "allows" a relaxed FP model not "requires" it. Do you agree?
328–329	I think you missed the part about "everywhere", i.e. making the member variable consistent with this name too: BinaryOp -> InductionOp
343–344	This comment is also ambiguous. This is again the induction/step instruction iff the instruction allows for a relaxed FP model.
../lib/Transforms/Utils/LoopUtils.cpp
816–818	Fair, but it does not make sense to pass different instructions for UAI and BO, and the current interface allows for that. At least then change the ctor to take a single instruction and derive UAI and BO from that.
../test/Transforms/LoopVectorize/float-induction.ll
13–14	Where is the result of these guys used? Would be good to check too.
17–18	Why not also match the result of the fsub and check it in the store?

Updated the patch according to Adam's comments.

anemet added inline comments. Jul 21 2016, 10:18 AM

../include/llvm/Transforms/Utils/LoopUtils.h
343	Comments are full sentences, please end with a period.
../test/Transforms/LoopVectorize/float-induction.ll
5	I think you can only have this under LoopVectorize/X86 (what if the X86 backend is not enabled in a build?). But more importantly, I don't understand why you need to formulate the non-fast-math case as an x86 test.

delena added inline comments. Jul 22 2016, 5:47 AM

../include/llvm/Transforms/Utils/LoopUtils.h
343	ok.
../test/Transforms/LoopVectorize/float-induction.ll
5	The "unsafe" function attribute works only for auto-vec. If you specify -force-vector-width=4 the loop will be vectorized anyway. So I added "X86" tests just to check combination of function attribute and "safe" FP induction in the auto-vectorization mode.

LGTM with the x86 test split into its own test under Transform/LoopVectorize/X86. Thanks!

../test/Transforms/LoopVectorize/float-induction.ll
5	Hmm, that is somewhat unexpected. I thought that the -force* stuff only overruled the cost model... @hfinkel, is it intentional/desired that -force-vector-width=>1 overrules (some of) the legality checks? We have different ways to bypass the legality checks so perhaps keeping -force-vector-width as a way to overrule the profitability checks is more desired? I guess one concern is that this internal flag was advertised on Nadav's Auto-vectorizer blog post so perhaps changing the behavior is not trivial at this point. Anyhow, coming back to this patch, you then need to split this part of the test into a Transform/LoopVectorize/X86 test.

Closed by commit rL276554: [Loop Vectorizer] Handling loops FP induction variables. (authored by delena). Jul 24 2016, 12:32 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
../include/llvm/Transforms/Utils/LoopUtils.h	48 lines
../lib/Analysis/ScalarEvolutionExpander.cpp	2 lines
../lib/Transforms/Scalar/LoopInterchange.cpp	2 lines
../lib/Transforms/Utils/LoopUtils.cpp	115 lines
../lib/Transforms/Vectorize/LoopVectorize.cpp	149 lines
../test/Transforms/LoopVectorize/float-induction.ll	146 lines
../test/Transforms/LoopVectorize/induction-step.ll	4 lines

Diff 61935

../include/llvm/Transforms/Utils/LoopUtils.h


	/// A struct for saving information about induction variables.			/// A struct for saving information about induction variables.
	class InductionDescriptor {			class InductionDescriptor {
	public:			public:
	/// This enum represents the kinds of inductions that we support.			/// This enum represents the kinds of inductions that we support.
	enum InductionKind {			enum InductionKind {
	IK_NoInduction, ///< Not an induction variable.			IK_NoInduction, ///< Not an induction variable.
	IK_IntInduction, ///< Integer induction variable. Step = C.			IK_IntInduction, ///< Integer induction variable. Step = C.
	IK_PtrInduction ///< Pointer induction var. Step = C / sizeof(elem).			IK_PtrInduction, ///< Pointer induction var. Step = C / sizeof(elem).
				IK_FpInduction ///< Floating point induction variable.
	};			};

	public:			public:
	/// Default constructor - creates an invalid induction.			/// Default constructor - creates an invalid induction.
	InductionDescriptor()			InductionDescriptor()
	: StartValue(nullptr), IK(IK_NoInduction), Step(nullptr) {}			: StartValue(nullptr), IK(IK_NoInduction), Step(nullptr),
				UnsafeAlgebraInst(nullptr), BinaryOp((Instruction::BinaryOps)0) {}

	/// Get the consecutive direction. Returns:			/// Get the consecutive direction. Returns:
	/// 0 - unknown or non-consecutive.			/// 0 - unknown or non-consecutive.
	/// 1 - consecutive and increasing.			/// 1 - consecutive and increasing.
	/// -1 - consecutive and decreasing.			/// -1 - consecutive and decreasing.
	int getConsecutiveDirection() const;			int getConsecutiveDirection() const;

	/// Compute the transformed value of Index at offset StartValue using step			/// Compute the transformed value of Index at offset StartValue using step
	/// StepValue.			/// StepValue.
	/// For integer induction, returns StartValue + Index * StepValue.			/// For integer induction, returns StartValue + Index * StepValue.
	/// For pointer induction, returns StartValue[Index * StepValue].			/// For pointer induction, returns StartValue[Index * StepValue].
	/// FIXME: The newly created binary instructions should contain nsw/nuw			/// FIXME: The newly created binary instructions should contain nsw/nuw
	/// flags, which can be found from the original scalar operations.			/// flags, which can be found from the original scalar operations.
	Value *transform(IRBuilder<> &B, Value *Index, ScalarEvolution *SE,			Value *transform(IRBuilder<> &B, Value *Index, ScalarEvolution *SE,
	const DataLayout& DL) const;			const DataLayout& DL) const;

	Value *getStartValue() const { return StartValue; }			Value *getStartValue() const { return StartValue; }
	InductionKind getKind() const { return IK; }			InductionKind getKind() const { return IK; }
	const SCEV *getStep() const { return Step; }			const SCEV *getStep() const { return Step; }
	ConstantInt *getConstIntStepValue() const;			ConstantInt *getConstIntStepValue() const;

	/// Returns true if \p Phi is an induction. If \p Phi is an induction,			/// Returns true if \p Phi is an induction in the loop \p L. If \p Phi is an
	/// the induction descriptor \p D will contain the data describing this			/// induction, the induction descriptor \p D will contain the data describing
	/// induction. If by some other means the caller has a better SCEV			/// this induction. If by some other means the caller has a better SCEV
	/// expression for \p Phi than the one returned by the ScalarEvolution			/// expression for \p Phi than the one returned by the ScalarEvolution
	/// analysis, it can be passed through \p Expr.			/// analysis, it can be passed through \p Expr.
	static bool isInductionPHI(PHINode *Phi, ScalarEvolution *SE,			static bool isInductionPHI(PHINode *Phi, const Loop *L, ScalarEvolution *SE,
	InductionDescriptor &D,			InductionDescriptor &D,
	const SCEV *Expr = nullptr);			const SCEV *Expr = nullptr);

	/// Returns true if \p Phi is an induction, in the context associated with			/// Returns true if \p Phi is a floating point induction in the loop \p L.
	/// the run-time predicate of PSE. If \p Assume is true, this can add further			/// If \p Phi is an induction, the induction descriptor \p D will contain
	/// SCEV predicates to \p PSE in order to prove that \p Phi is an induction.			/// the data describing this induction.
				static bool isFpInductionPHI(PHINode *Phi, const Loop *L,
				ScalarEvolution *SE, InductionDescriptor &D);

				/// Returns true if \p Phi is a loop \p L induction, in the context associated
				/// with the run-time predicate of PSE. If \p Assume is true, this can add
				/// further SCEV predicates to \p PSE in order to prove that \p Phi is an
				/// induction.
	/// If \p Phi is an induction, \p D will contain the data describing this			/// If \p Phi is an induction, \p D will contain the data describing this
	/// induction.			/// induction.
	static bool isInductionPHI(PHINode *Phi, PredicatedScalarEvolution &PSE,			static bool isInductionPHI(PHINode *Phi, const Loop *L,
				PredicatedScalarEvolution &PSE,
	InductionDescriptor &D, bool Assume = false);			InductionDescriptor &D, bool Assume = false);

				/// Returns true if the induction has unsafe algebra which requires a relaxed
				/// floating-point model.
				bool hasUnsafeAlgebra() { return UnsafeAlgebraInst != nullptr; }

				/// Returns unsafe algebra instruction.
				Instruction *getUnsafeAlgebraInst() { return UnsafeAlgebraInst; }

				/// Returns binary opcode.
				Instruction::BinaryOps getBinaryOpcode() const { return BinaryOp; }

	private:			private:
	/// Private constructor - used by \c isInductionPHI.			/// Private constructor - used by \c isInductionPHI.
	InductionDescriptor(Value *Start, InductionKind K, const SCEV *Step);			InductionDescriptor(Value *Start, InductionKind K, const SCEV *Step,
				Instruction *UAI = nullptr,
				Instruction::BinaryOps BOp = (Instruction::BinaryOps)0);

	/// Start value.			/// Start value.
	TrackingVH<Value> StartValue;			TrackingVH<Value> StartValue;
	/// Induction kind.			/// Induction kind.
	InductionKind IK;			InductionKind IK;
	/// Step value.			/// Step value.
	const SCEV *Step;			const SCEV *Step;
				// Induction has unsafe algebra.
				Instruction *UnsafeAlgebraInst;
				// Original Induction opcode.
				Instruction::BinaryOps BinaryOp;
	};			};

	BasicBlock *InsertPreheaderForLoop(Loop *L, DominatorTree *DT, LoopInfo *LI,			BasicBlock *InsertPreheaderForLoop(Loop *L, DominatorTree *DT, LoopInfo *LI,
	bool PreserveLCSSA);			bool PreserveLCSSA);

	/// \brief Simplify each loop in a loop nest recursively.			/// \brief Simplify each loop in a loop nest recursively.
	///			///
	/// This takes a potentially un-simplified loop L (and its children) and turns			/// This takes a potentially un-simplified loop L (and its children) and turns

../lib/Analysis/ScalarEvolutionExpander.cpp

Value *SCEVExpander::expandCodeFor(const SCEV *SH, Type *Ty,
assert(IP);		assert(IP);
Builder.SetInsertPoint(IP);		Builder.SetInsertPoint(IP);
return expandCodeFor(SH, Ty);		return expandCodeFor(SH, Ty);
}		}

Value *SCEVExpander::expandCodeFor(const SCEV *SH, Type *Ty) {		Value *SCEVExpander::expandCodeFor(const SCEV *SH, Type *Ty) {
// Expand the code for this SCEV.		// Expand the code for this SCEV.
Value *V = expand(SH);		Value *V = expand(SH);
if (Ty) {		if (Ty && SE.isSCEVable(Ty)) {
assert(SE.getTypeSizeInBits(Ty) == SE.getTypeSizeInBits(SH->getType()) &&		assert(SE.getTypeSizeInBits(Ty) == SE.getTypeSizeInBits(SH->getType()) &&
"non-trivial casts should be done with the SCEVs directly!");		"non-trivial casts should be done with the SCEVs directly!");
V = InsertNoopCastOfTo(V, Ty);		V = InsertNoopCastOfTo(V, Ty);
}		}
return V;		return V;
}		}

Value *SCEVExpander::FindValueInExprValueMap(const SCEV *S,		Value *SCEVExpander::FindValueInExprValueMap(const SCEV *S,

../lib/Transforms/Scalar/LoopInterchange.cpp

bool LoopInterchangeLegality::findInductionAndReductions(
Loop *L, SmallVector<PHINode *, 8> &Inductions,		Loop *L, SmallVector<PHINode *, 8> &Inductions,
SmallVector<PHINode *, 8> &Reductions) {		SmallVector<PHINode *, 8> &Reductions) {
if (!L->getLoopLatch() || !L->getLoopPredecessor())		if (!L->getLoopLatch() || !L->getLoopPredecessor())
return false;		return false;
for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I) {		for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I) {
RecurrenceDescriptor RD;		RecurrenceDescriptor RD;
InductionDescriptor ID;		InductionDescriptor ID;
PHINode *PHI = cast<PHINode>(I);		PHINode *PHI = cast<PHINode>(I);
if (InductionDescriptor::isInductionPHI(PHI, SE, ID))		if (InductionDescriptor::isInductionPHI(PHI, L, SE, ID))
Inductions.push_back(PHI);		Inductions.push_back(PHI);
else if (RecurrenceDescriptor::isReductionPHI(PHI, L, RD))		else if (RecurrenceDescriptor::isReductionPHI(PHI, L, RD))
Reductions.push_back(PHI);		Reductions.push_back(PHI);
else {		else {
DEBUG(		DEBUG(
dbgs() << "Failed to recognize PHI as an induction or reduction.\n");		dbgs() << "Failed to recognize PHI as an induction or reduction.\n");
return false;		return false;
}		}

../lib/Transforms/Utils/LoopUtils.cpp

Value *RecurrenceDescriptor::createMinMaxOp(IRBuilder<> &Builder,
else		else
Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");		Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");

Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");		Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");
return Select;		return Select;
}		}

InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,		InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,
const SCEV *Step)		const SCEV *Step, Instruction *UAI,
: StartValue(Start), IK(K), Step(Step) {		Instruction::BinaryOps BOp)
		: StartValue(Start), IK(K), Step(Step), UnsafeAlgebraInst(UAI),
		BinaryOp(BOp) {
assert(IK != IK_NoInduction && "Not an induction");		assert(IK != IK_NoInduction && "Not an induction");

// Start value type should match the induction kind and the value		// Start value type should match the induction kind and the value
// itself should not be null.		// itself should not be null.
assert(StartValue && "StartValue is null");		assert(StartValue && "StartValue is null");
assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&		assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&
"StartValue is not a pointer for pointer induction");		"StartValue is not a pointer for pointer induction");
assert((IK != IK_IntInduction || StartValue->getType()->isIntegerTy()) &&		assert((IK != IK_IntInduction || StartValue->getType()->isIntegerTy()) &&
"StartValue is not an integer for integer induction");		"StartValue is not an integer for integer induction");

// Check the Step Value. It should be non-zero integer value.		// Check the Step Value. It should be non-zero integer value.
assert((!getConstIntStepValue() || !getConstIntStepValue()->isZero()) &&		assert((!getConstIntStepValue() || !getConstIntStepValue()->isZero()) &&
"Step value is zero");		"Step value is zero");

assert((IK != IK_PtrInduction || getConstIntStepValue()) &&		assert((IK != IK_PtrInduction || getConstIntStepValue()) &&
"Step value should be constant for pointer induction");		"Step value should be constant for pointer induction");
assert(Step->getType()->isIntegerTy() && "StepValue is not an integer");		assert((IK == IK_FpInduction || Step->getType()->isIntegerTy()) &&
		"StepValue is not an integer");

		assert((IK != IK_FpInduction || Step->getType()->isFloatingPointTy()) &&
		"StepValue is not FP for FpInduction");
		assert((IK != IK_FpInduction || BinaryOp != 0) &&
		"Binary opcode should be specified for FP induction");
		"Binary opcode should be specified for FP induction");
}		}

int InductionDescriptor::getConsecutiveDirection() const {		int InductionDescriptor::getConsecutiveDirection() const {
ConstantInt *ConstStep = getConstIntStepValue();		ConstantInt *ConstStep = getConstIntStepValue();
if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))		if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))
return ConstStep->getSExtValue();		return ConstStep->getSExtValue();
return 0;		return 0;
}		}

ConstantInt *InductionDescriptor::getConstIntStepValue() const {		ConstantInt *InductionDescriptor::getConstIntStepValue() const {
if (isa<SCEVConstant>(Step))		if (isa<SCEVConstant>(Step))
return dyn_cast<ConstantInt>(cast<SCEVConstant>(Step)->getValue());		return dyn_cast<ConstantInt>(cast<SCEVConstant>(Step)->getValue());
return nullptr;		return nullptr;
}		}

Value *InductionDescriptor::transform(IRBuilder<> &B, Value *Index,		Value *InductionDescriptor::transform(IRBuilder<> &B, Value *Index,
ScalarEvolution *SE,		ScalarEvolution *SE,
const DataLayout& DL) const {		const DataLayout& DL) const {

SCEVExpander Exp(*SE, DL, "induction");		SCEVExpander Exp(*SE, DL, "induction");
		assert(Index->getType() == Step->getType() &&
		"Index type does not match StepValue type");
switch (IK) {		switch (IK) {
case IK_IntInduction: {		case IK_IntInduction: {
assert(Index->getType() == StartValue->getType() &&		assert(Index->getType() == StartValue->getType() &&
"Index type does not match StartValue type");		"Index type does not match StartValue type");

// FIXME: Theoretically, we can call getAddExpr() of ScalarEvolution		// FIXME: Theoretically, we can call getAddExpr() of ScalarEvolution
// and calculate (Start + Index * Step) for all cases, without		// and calculate (Start + Index * Step) for all cases, without
// special handling for "isOne" and "isMinusOne".		// special handling for "isOne" and "isMinusOne".
// But in the real life the result code getting worse. We mix SCEV		// But in the real life the result code getting worse. We mix SCEV
// expressions and ADD/SUB operations and receive redundant		// expressions and ADD/SUB operations and receive redundant
// intermediate values being calculated in different ways and		// intermediate values being calculated in different ways and
// Instcombine is unable to reduce them all.		// Instcombine is unable to reduce them all.

if (getConstIntStepValue() &&		if (getConstIntStepValue() &&
getConstIntStepValue()->isMinusOne())		getConstIntStepValue()->isMinusOne())
return B.CreateSub(StartValue, Index);		return B.CreateSub(StartValue, Index);
if (getConstIntStepValue() &&		if (getConstIntStepValue() &&
getConstIntStepValue()->isOne())		getConstIntStepValue()->isOne())
return B.CreateAdd(StartValue, Index);		return B.CreateAdd(StartValue, Index);
const SCEV *S = SE->getAddExpr(SE->getSCEV(StartValue),		const SCEV *S = SE->getAddExpr(SE->getSCEV(StartValue),
SE->getMulExpr(Step, SE->getSCEV(Index)));		SE->getMulExpr(Step, SE->getSCEV(Index)));
return Exp.expandCodeFor(S, StartValue->getType(), &*B.GetInsertPoint());		return Exp.expandCodeFor(S, StartValue->getType(), &*B.GetInsertPoint());
}		}
case IK_PtrInduction: {		case IK_PtrInduction: {
assert(Index->getType() == Step->getType() &&
"Index type does not match StepValue type");
assert(isa<SCEVConstant>(Step) &&		assert(isa<SCEVConstant>(Step) &&
"Expected constant step for pointer induction");		"Expected constant step for pointer induction");
const SCEV *S = SE->getMulExpr(SE->getSCEV(Index), Step);		const SCEV *S = SE->getMulExpr(SE->getSCEV(Index), Step);
Index = Exp.expandCodeFor(S, Index->getType(), &*B.GetInsertPoint());		Index = Exp.expandCodeFor(S, Index->getType(), &*B.GetInsertPoint());
return B.CreateGEP(nullptr, StartValue, Index);		return B.CreateGEP(nullptr, StartValue, Index);
}		}
		case IK_FpInduction: {
		assert(Step->getType()->isFloatingPointTy() && "Expected FP Step value");
		assert(BinaryOp &&
		"Original bin op should be defined for FP induction");

		Value *StepValue = cast<SCEVUnknown>(Step)->getValue();

		// Floating point operations had to be 'fast' to enable the induction.
		FastMathFlags Flags;
		Flags.setUnsafeAlgebra();

		Value *MulExp = B.CreateFMul(StepValue, Index);
		cast<Instruction>(MulExp)->setFastMathFlags(Flags);

		Value *BOp = B.CreateBinOp(BinaryOp, StartValue,
		MulExp, "induction");
		cast<Instruction>(BOp)->setFastMathFlags(Flags);

		return BOp;
		}
case IK_NoInduction:		case IK_NoInduction:
return nullptr;		return nullptr;
}		}
llvm_unreachable("invalid enum");		llvm_unreachable("invalid enum");
}		}
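
The `IK_FpInduction` case above computes `StartValue op (Step * Index)`, where `op` is the original `FAdd` or `FSub`. A minimal standalone sketch of that arithmetic follows, using plain `double` in place of LLVM values. The `FpOp` enum and the `transformFp` name are hypothetical and exist only for illustration; they are not part of the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for Instruction::BinaryOps, restricted to FAdd/FSub.
enum class FpOp { FAdd, FSub };

// Mirrors the shape of the IK_FpInduction case: Start op (Step * Index).
double transformFp(double Start, double Step, double Index, FpOp Op) {
  double Mul = Step * Index;
  return Op == FpOp::FAdd ? Start + Mul : Start - Mul;
}
```

The closed form can round differently from repeated addition in the scalar loop, which is presumably why the patch only enables it when the operations are marked 'fast'.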

bool InductionDescriptor::isInductionPHI(PHINode *Phi,		bool InductionDescriptor::isFpInductionPHI(PHINode *Phi, const Loop *TheLoop,
		sanjoy: (Not for fixing in this change) Looks like a better interface would be to return an `Optional<InductionDescriptor>`?
		ScalarEvolution *SE,
		InductionDescriptor &D) {

		// Here we only handle FP induction variables.
		assert(Phi->getType()->isFloatingPointTy() && "Unexpected Phi type");

		if (TheLoop->getHeader() != Phi->getParent())
		return nullptr;
		anemet: this function is returning a bool
		delena: Fixed. Thanks.

		// The loop may have multiple entrances or multiple exits; we can analyze
		// this phi if it has a unique entry value and a unique backedge value.
		if (Phi->getNumIncomingValues() != 2)
		return false;
		Value *BEValue = nullptr, *StartValue = nullptr;
		if (TheLoop->contains(Phi->getIncomingBlock(0))) {
		BEValue = Phi->getIncomingValue(0);
		StartValue = Phi->getIncomingValue(1);
		} else {
		assert(TheLoop->contains(Phi->getIncomingBlock(1)) &&
		"Unexpected Phi node in the loop");
		BEValue = Phi->getIncomingValue(1);
		StartValue = Phi->getIncomingValue(0);
		}

		if (!isa<BinaryOperator>(BEValue))
		sanjoy: Maybe use a `dyn_cast` here?
		delena: Ok.
		mkuper: I think what sanjoy meant was: `BinaryOperator *BOp = dyn_cast<BinaryOperator>(BEValue); if (!BOp) return false;`
		return false;

		BinaryOperator *BOp = cast<BinaryOperator>(BEValue);
		Value *Addend = nullptr;
		if (BOp->getOpcode() == Instruction::FAdd) {
		if (BOp->getOperand(0) == Phi)
		Addend = BOp->getOperand(1);
		else if (BOp->getOperand(1) == Phi)
		Addend = BOp->getOperand(0);
		} else if (BOp->getOpcode() == Instruction::FSub)
		if (BOp->getOperand(0) == Phi)
		Addend = BOp->getOperand(1);

		if (!Addend)
		return false;

		// The addend should be loop invariant
		if (auto *I = dyn_cast<Instruction>(Addend))
		if (TheLoop->contains(I))
		return false;

		const SCEV *Step = SE->getUnknown(Addend);

		Instruction *UAI = !BOp->hasUnsafeAlgebra() ? BOp : nullptr;
		sanjoy: The condition looks inverted?
		delena: hasUnsafeAlgebra() means that the instruction itself has the "fast" attribute; in that case we don't need an additional check. But if BOp does not have the "fast" attribute, the legality of the FP transformation must be allowed at function level. I'll add a comment.
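
As a reading aid for the exchange above: the ternary records the step instruction only when it lacks its own "fast" flag, so that a later check can decide whether function-level settings permit the transformation. A minimal sketch of that selection, assuming a hypothetical `FakeBinOp` type in place of `BinaryOperator`:

```cpp
#include <cassert>

// Hypothetical stand-in for a binary operator with a per-instruction "fast" flag.
struct FakeBinOp {
  bool Fast;
};

// Returns the instruction that still needs permission from elsewhere,
// or nullptr when its own "fast" flag already suffices.
const FakeBinOp *unsafeAlgebraInst(const FakeBinOp &BOp) {
  return !BOp.Fast ? &BOp : nullptr;
}
```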
		D = InductionDescriptor(StartValue, IK_FpInduction, Step, UAI,
		BOp->getOpcode());
		return true;
		}
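
sanjoy's remark on the interface of `isFpInductionPHI` (returning an optional descriptor rather than `bool` plus an out-parameter) could look roughly like the sketch below. It uses `std::optional` as a stand-in for `llvm::Optional`, and `FpInductionInfo`, `Update`, and `matchFpInduction` are hypothetical simplifications of the matching logic above, not the patch's API.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified descriptor; the real one holds LLVM values.
struct FpInductionInfo {
  double Start;
  double Step;
  bool IsSub;
};

// Hypothetical description of the backedge update "phi op addend".
struct Update {
  char Op;              // '+' for FAdd, '-' for FSub
  bool PhiIsLhs;        // true when the phi is operand 0
  double Addend;
  bool AddendInvariant; // true when the addend is defined outside the loop
};

// Returns a descriptor on success and std::nullopt otherwise.
std::optional<FpInductionInfo> matchFpInduction(double Start, const Update &U) {
  if (!U.AddendInvariant)
    return std::nullopt;
  if (U.Op == '+') // FAdd is commutative: the phi may be either operand.
    return FpInductionInfo{Start, U.Addend, false};
  if (U.Op == '-' && U.PhiIsLhs) // FSub only matches "phi - addend".
    return FpInductionInfo{Start, U.Addend, true};
  return std::nullopt;
}
```

With this shape a caller cannot read a descriptor that was never filled in, and a mismatched return such as the `return nullptr` flagged by anemet would not silently convert to `false`.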

		bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,
PredicatedScalarEvolution &PSE,		PredicatedScalarEvolution &PSE,
		anemet: I think that a better interface would be to take BOp (step instruction) optionally and then derive DI::hasUnsafeAlgebra and the opcode from that. This is OK as a follow-up if you prefer.
		delena: I did it that way at the beginning and then preferred to follow the Reduction implementation, just to be consistent with the existing interface.
		anemet: Fair, but it does not make sense to pass different instructions for UAI and BO, and the current interface allows for that. At least change the ctor to take a single instruction and derive UAI and BO from that.
InductionDescriptor &D,		InductionDescriptor &D,
bool Assume) {		bool Assume) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		// Handle integer and pointer inductions variables.
		// Now we handle also FP induction but not trying to make a
		// recurrent expression from the PHI node in-place.

		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy() &&
		!PhiTy->isFloatTy() && !PhiTy->isDoubleTy() && !PhiTy->isHalfTy())
return false;		return false;

		if (PhiTy->isFloatingPointTy())
		return isFpInductionPHI(Phi, TheLoop, PSE.getSE(), D);

const SCEV *PhiScev = PSE.getSCEV(Phi);		const SCEV *PhiScev = PSE.getSCEV(Phi);
const auto *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const auto *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);

// We need this expression to be an AddRecExpr.		// We need this expression to be an AddRecExpr.
if (Assume && !AR)		if (Assume && !AR)
AR = PSE.getAsAddRec(Phi);		AR = PSE.getAsAddRec(Phi);

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return false;		return false;
}		}

return isInductionPHI(Phi, PSE.getSE(), D, AR);		return isInductionPHI(Phi, TheLoop, PSE.getSE(), D, AR);
}		}
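
anemet's suggestion in the thread on this function's signature (a constructor that takes a single step instruction and derives both the opcode and the unsafe-algebra instruction from it) might be sketched as follows. `StepInst`, `Opcode`, and `FpDescriptor` are hypothetical stand-ins, not the types used in the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for Instruction::BinaryOps and BinaryOperator.
enum class Opcode { None, FAdd, FSub };
struct StepInst {
  Opcode Op;
  bool Fast;
};

class FpDescriptor {
  const StepInst *UnsafeAlgebraInst = nullptr;
  Opcode BinaryOp = Opcode::None;

public:
  // A single optional instruction: the opcode and the unsafe-algebra
  // instruction are derived from it, so they can never disagree.
  explicit FpDescriptor(const StepInst *Inst = nullptr) {
    if (!Inst)
      return;
    BinaryOp = Inst->Op;
    if (!Inst->Fast)
      UnsafeAlgebraInst = Inst;
  }

  Opcode getBinaryOpcode() const { return BinaryOp; }
  bool hasUnsafeAlgebra() const { return UnsafeAlgebraInst != nullptr; }
  const StepInst *getUnsafeAlgebraInst() const { return UnsafeAlgebraInst; }
};
```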

bool InductionDescriptor::isInductionPHI(PHINode *Phi,		bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,
ScalarEvolution *SE,		ScalarEvolution *SE,
InductionDescriptor &D,		InductionDescriptor &D,
const SCEV *Expr) {		const SCEV *Expr) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.		// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())
return false;		return false;

// Check that the PHI is consecutive.		// Check that the PHI is consecutive.
const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);		const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return false;		return false;
}		}

assert(AR->getLoop()->getHeader() == Phi->getParent() &&		assert(TheLoop->getHeader() == Phi->getParent() &&
"PHI is an AddRec for a different loop?!");		"PHI is an AddRec for a different loop?!");
Value *StartValue =		Value *StartValue =
Phi->getIncomingValueForBlock(AR->getLoop()->getLoopPreheader());		Phi->getIncomingValueForBlock(AR->getLoop()->getLoopPreheader());
const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);
// Calculate the pointer stride and check if it is consecutive.		// Calculate the pointer stride and check if it is consecutive.
// The stride may be a constant or a loop invariant integer value.		// The stride may be a constant or a loop invariant integer value.
const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);		const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);
if (!ConstStep && !SE->isLoopInvariant(Step, AR->getLoop()))		if (!ConstStep && !SE->isLoopInvariant(Step, TheLoop))
return false;		return false;

if (PhiTy->isIntegerTy()) {		if (PhiTy->isIntegerTy()) {
D = InductionDescriptor(StartValue, IK_IntInduction, Step);		D = InductionDescriptor(StartValue, IK_IntInduction, Step);
return true;		return true;
}		}

assert(PhiTy->isPointerTy() && "The PHI must be a pointer");		assert(PhiTy->isPointerTy() && "The PHI must be a pointer");
[... 177 lines not shown ...]

../lib/Transforms/Vectorize/LoopVectorize.cpp

[... 399 lines not shown ...]	protected:
/// instruction (shuffle) for loop invariant values and for the induction		/// instruction (shuffle) for loop invariant values and for the induction
/// value. If this is the induction variable then we extend it to N, N+1, ...		/// value. If this is the induction variable then we extend it to N, N+1, ...
/// this is needed because each iteration in the loop corresponds to a SIMD		/// this is needed because each iteration in the loop corresponds to a SIMD
/// element.		/// element.
virtual Value *getBroadcastInstrs(Value *V);		virtual Value *getBroadcastInstrs(Value *V);

/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)		/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)
/// to each vector element of Val. The sequence starts at StartIndex.		/// to each vector element of Val. The sequence starts at StartIndex.
virtual Value *getStepVector(Value *Val, int StartIdx, Value *Step);		/// \p Opcode is relevant for FP induction variable.
		virtual Value *getStepVector(Value *Val, int StartIdx, Value *Step,
		Instruction::BinaryOps Opcode =
		(Instruction::BinaryOps)0);

/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)		/// This function adds (StartIdx, StartIdx + Step, StartIdx + 2*Step, ...)
/// to each vector element of Val. The sequence starts at StartIndex.		/// to each vector element of Val. The sequence starts at StartIndex.
/// Step is a SCEV. In order to get StepValue it takes the existing value		/// Step is a SCEV. In order to get StepValue it takes the existing value
/// from SCEV or creates a new using SCEVExpander.		/// from SCEV or creates a new using SCEVExpander.
virtual Value *getStepVector(Value *Val, int StartIdx, const SCEV *Step);		/// \p Opcode is relevant for FP induction variable.
		virtual Value *getStepVector(Value *Val, int StartIdx, const SCEV *Step,
		Instruction::BinaryOps Opcode =
		(Instruction::BinaryOps)0);

/// Create a vector induction variable based on an existing scalar one.		/// Create a vector induction variable based on an existing scalar one.
/// Currently only works for integer induction variables with a constant		/// Currently only works for integer induction variables with a constant
/// step.		/// step.
/// If TruncType is provided, instead of widening the original IV, we		/// If TruncType is provided, instead of widening the original IV, we
/// widen a version of the IV truncated to TruncType.		/// widen a version of the IV truncated to TruncType.
void widenInductionVariable(const InductionDescriptor &II, VectorParts &Entry,		void widenInductionVariable(const InductionDescriptor &II, VectorParts &Entry,
IntegerType *TruncType = nullptr);		IntegerType *TruncType = nullptr);
[... 176 lines not shown ...]	InnerLoopUnroller(Loop *OrigLoop, PredicatedScalarEvolution &PSE,
: InnerLoopVectorizer(OrigLoop, PSE, LI, DT, TLI, TTI, AC, 1,		: InnerLoopVectorizer(OrigLoop, PSE, LI, DT, TLI, TTI, AC, 1,
UnrollFactor) {}		UnrollFactor) {}

private:		private:
void scalarizeInstruction(Instruction *Instr,		void scalarizeInstruction(Instruction *Instr,
bool IfPredicateStore = false) override;		bool IfPredicateStore = false) override;
void vectorizeMemoryInstruction(Instruction *Instr) override;		void vectorizeMemoryInstruction(Instruction *Instr) override;
Value *getBroadcastInstrs(Value *V) override;		Value *getBroadcastInstrs(Value *V) override;
Value *getStepVector(Value *Val, int StartIdx, Value *Step) override;		Value *getStepVector(Value *Val, int StartIdx, Value *Step,
Value *getStepVector(Value *Val, int StartIdx, const SCEV *StepSCEV) override;		Instruction::BinaryOps Opcode =
		(Instruction::BinaryOps)0) override;
		Value *getStepVector(Value *Val, int StartIdx, const SCEV *StepSCEV,
		Instruction::BinaryOps Opcode =
		(Instruction::BinaryOps)0) override;
Value *reverseVector(Value *Vec) override;		Value *reverseVector(Value *Vec) override;
};		};

/// \brief Look for a meaningful debug location on the instruction or it's		/// \brief Look for a meaningful debug location on the instruction or it's
/// operands.		/// operands.
static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {		static Instruction *getDebugLocFromInstOrOperands(Instruction *I) {
if (!I)		if (!I)
return I;		return I;
[... 1,548 lines not shown ...]	Value *InnerLoopVectorizer::getBroadcastInstrs(Value *V) {

// Broadcast the scalar into all locations in the vector.		// Broadcast the scalar into all locations in the vector.
Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");		Value *Shuf = Builder.CreateVectorSplat(VF, V, "broadcast");

return Shuf;		return Shuf;
}		}

Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,		Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,
const SCEV *StepSCEV) {		const SCEV *StepSCEV,
		Instruction::BinaryOps BinOp) {
const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();
SCEVExpander Exp(*PSE.getSE(), DL, "induction");		SCEVExpander Exp(*PSE.getSE(), DL, "induction");
Value *StepValue = Exp.expandCodeFor(StepSCEV, StepSCEV->getType(),		Value *StepValue = Exp.expandCodeFor(StepSCEV, StepSCEV->getType(),
&*Builder.GetInsertPoint());		&*Builder.GetInsertPoint());
return getStepVector(Val, StartIdx, StepValue);		return getStepVector(Val, StartIdx, StepValue, BinOp);
}		}

void InnerLoopVectorizer::widenInductionVariable(const InductionDescriptor &II,		void InnerLoopVectorizer::widenInductionVariable(const InductionDescriptor &II,
VectorParts &Entry,		VectorParts &Entry,
IntegerType *TruncType) {		IntegerType *TruncType) {
Value *Start = II.getStartValue();		Value *Start = II.getStartValue();
ConstantInt *Step = II.getConstIntStepValue();		ConstantInt *Step = II.getConstIntStepValue();
assert(Step && "Can not widen an IV with a non-constant step");		assert(Step && "Can not widen an IV with a non-constant step");
[... 22 lines not shown ...]	for (unsigned Part = 0; Part < UF; ++Part) {
LastInduction = Builder.CreateAdd(LastInduction, SplatVF, "step.add");		LastInduction = Builder.CreateAdd(LastInduction, SplatVF, "step.add");
}		}

VecInd->addIncoming(SteppedStart, LoopVectorPreHeader);		VecInd->addIncoming(SteppedStart, LoopVectorPreHeader);
VecInd->addIncoming(LastInduction, LoopVectorBody);		VecInd->addIncoming(LastInduction, LoopVectorBody);
}		}

Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,		Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,
Value *Step) {		Value *Step,
		Instruction::BinaryOps BinOp) {
assert(Val->getType()->isVectorTy() && "Must be a vector");		assert(Val->getType()->isVectorTy() && "Must be a vector");
assert(Val->getType()->getScalarType()->isIntegerTy() &&		assert((Val->getType()->getScalarType()->isIntegerTy() ||
"Elem must be an integer");		Val->getType()->getScalarType()->isFloatingPointTy()) &&
		"Induction Step must be an integer or FP");
assert(Step->getType() == Val->getType()->getScalarType() &&		assert(Step->getType() == Val->getType()->getScalarType() &&
"Step has wrong type");		"Step has wrong type");
// Create the types.		// Create the types.
Type *ITy = Val->getType()->getScalarType();		Type *STy = Val->getType()->getScalarType();
VectorType *Ty = cast<VectorType>(Val->getType());		VectorType *Ty = cast<VectorType>(Val->getType());
int VLen = Ty->getNumElements();		int VLen = Ty->getNumElements();
SmallVector<Constant *, 8> Indices;		SmallVector<Constant *, 8> Indices;
		auto CurrIP = Builder.saveIP();
		Builder.SetInsertPoint(LoopVectorPreHeader->getTerminator());

		if (Step->getType()->isIntegerTy()) {
// Create a vector of consecutive numbers from zero to VF.		// Create a vector of consecutive numbers from zero to VF.
for (int i = 0; i < VLen; ++i)		for (int i = 0; i < VLen; ++i)
Indices.push_back(ConstantInt::get(ITy, StartIdx + i));		Indices.push_back(ConstantInt::get(STy, StartIdx + i));

// Add the consecutive indices to the vector value.
Constant *Cv = ConstantVector::get(Indices);		Constant *Cv = ConstantVector::get(Indices);
assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");		assert(Cv->getType() == Val->getType() && "Invalid consecutive vec");
Step = Builder.CreateVectorSplat(VLen, Step);		Step = Builder.CreateVectorSplat(VLen, Step);
assert(Step->getType() == Val->getType() && "Invalid step vec");		assert(Step->getType() == Val->getType() && "Invalid step vec");
// FIXME: The newly created binary instructions should contain nsw/nuw flags,		// FIXME: The newly created binary instructions should contain nsw/nuw flags,
// which can be found from the original scalar operations.		// which can be found from the original scalar operations.
Step = Builder.CreateMul(Cv, Step);		Step = Builder.CreateMul(Cv, Step);
		Builder.restoreIP(CurrIP);
return Builder.CreateAdd(Val, Step, "induction");		return Builder.CreateAdd(Val, Step, "induction");
}		}

		// Floating point induction.
		assert(BinOp && "Binary Opcode should be specified for FP induction");
		// Create a vector of consecutive numbers from zero to VF.
		for (int i = 0; i < VLen; ++i)
		Indices.push_back(ConstantFP::get(STy, (double)(StartIdx + i)));

		// Add the consecutive indices to the vector value.
		Constant *Cv = ConstantVector::get(Indices);

		Step = Builder.CreateVectorSplat(VLen, Step);

		// Floating point operations had to be 'fast' to enable the induction.
		FastMathFlags Flags;
		Flags.setUnsafeAlgebra();

		Value *MulOp = Builder.CreateFMul(Cv, Step);
		if (isa<Instruction>(MulOp))
		cast<Instruction>(MulOp)->setFastMathFlags(Flags);
		Builder.restoreIP(CurrIP);

		Value *BOp = Builder.CreateBinOp(BinOp, Val, MulOp, "induction");
		cast<Instruction>(BOp)->setFastMathFlags(Flags);
		return BOp;
		}
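
The floating-point path of `getStepVector` above adds `(StartIdx + i) * Step` to lane `i` of `Val`, using the original `FAdd` or `FSub`. A minimal standalone sketch of the per-lane arithmetic, with `std::vector<double>` standing in for an LLVM vector value; `stepVectorFp` and the `IsSub` flag are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lane I of the result is Val[I] op ((StartIdx + I) * Step),
// where op is subtraction when IsSub is set and addition otherwise.
std::vector<double> stepVectorFp(const std::vector<double> &Val, int StartIdx,
                                 double Step, bool IsSub) {
  std::vector<double> Out(Val.size());
  for (std::size_t I = 0; I < Val.size(); ++I) {
    double Offset = static_cast<double>(StartIdx + static_cast<int>(I)) * Step;
    Out[I] = IsSub ? Val[I] - Offset : Val[I] + Offset;
  }
  return Out;
}
```

For unroll part `part`, the patch passes `VF * part` as `StartIdx`, so with `VF = 4` the second part covers indices 4 through 7.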

int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {		int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {
assert(Ptr->getType()->isPointerTy() && "Unexpected non-ptr");		assert(Ptr->getType()->isPointerTy() && "Unexpected non-ptr");
auto *SE = PSE.getSE();		auto *SE = PSE.getSE();
// Make sure that the pointer does not point to structs.		// Make sure that the pointer does not point to structs.
if (Ptr->getType()->getPointerElementType()->isAggregateType())		if (Ptr->getType()->getPointerElementType()->isAggregateType())
return 0;		return 0;

// If this value is a pointer induction variable, we know it is consecutive.		// If this value is a pointer induction variable, we know it is consecutive.
[... 1,022 lines not shown ...]	for (I = List->begin(), E = List->end(); I != E; ++I) {
PHINode *BCResumeVal = PHINode::Create(		PHINode *BCResumeVal = PHINode::Create(
OrigPhi->getType(), 3, "bc.resume.val", ScalarPH->getTerminator());		OrigPhi->getType(), 3, "bc.resume.val", ScalarPH->getTerminator());
Value *EndValue;		Value *EndValue;
if (OrigPhi == OldInduction) {		if (OrigPhi == OldInduction) {
// We know what the end value is.		// We know what the end value is.
EndValue = CountRoundDown;		EndValue = CountRoundDown;
} else {		} else {
IRBuilder<> B(LoopBypassBlocks.back()->getTerminator());		IRBuilder<> B(LoopBypassBlocks.back()->getTerminator());
Value *CRD = B.CreateSExtOrTrunc(CountRoundDown,		Value *CRD;
		if (II.getStep()->getType()->isIntegerTy())
		CRD = B.CreateSExtOrTrunc(CountRoundDown, II.getStep()->getType(),
		"cast.crd");
		else
		CRD = B.CreateCast(Instruction::SIToFP, CountRoundDown,
II.getStep()->getType(), "cast.crd");		II.getStep()->getType(), "cast.crd");
const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();
EndValue = II.transform(B, CRD, PSE.getSE(), DL);		EndValue = II.transform(B, CRD, PSE.getSE(), DL);
EndValue->setName("ind.end");		EndValue->setName("ind.end");
}		}

// The new PHI merges the original incoming value, in case of a bypass,		// The new PHI merges the original incoming value, in case of a bypass,
// or the value at the end of the vectorized loop.		// or the value at the end of the vectorized loop.
BCResumeVal->addIncoming(EndValue, MiddleBlock);		BCResumeVal->addIncoming(EndValue, MiddleBlock);
[... 963 lines not shown ...]	if (VF == 1 || P->getType() != Induction->getType() ||
Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep());		Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep());
} else {		} else {
// Instead of re-creating the vector IV by splatting the scalar IV		// Instead of re-creating the vector IV by splatting the scalar IV
// in each iteration, we can make a new independent vector IV.		// in each iteration, we can make a new independent vector IV.
widenInductionVariable(II, Entry);		widenInductionVariable(II, Entry);
}		}
return;		return;
}		}
case InductionDescriptor::IK_PtrInduction:		case InductionDescriptor::IK_PtrInduction: {
// Handle the pointer induction variable case.		// Handle the pointer induction variable case.
assert(P->getType()->isPointerTy() && "Unexpected type.");		assert(P->getType()->isPointerTy() && "Unexpected type.");
// This is the normalized GEP that starts counting at zero.		// This is the normalized GEP that starts counting at zero.
Value *PtrInd = Induction;		Value *PtrInd = Induction;
PtrInd = Builder.CreateSExtOrTrunc(PtrInd, II.getStep()->getType());		PtrInd = Builder.CreateSExtOrTrunc(PtrInd, II.getStep()->getType());
// This is the vector of results. Notice that we don't generate		// This is the vector of results. Notice that we don't generate
// vector geps because scalar geps result in better code.		// vector geps because scalar geps result in better code.
for (unsigned part = 0; part < UF; ++part) {		for (unsigned part = 0; part < UF; ++part) {
[... 16 lines not shown ...]	for (unsigned part = 0; part < UF; ++part) {
SclrGep->setName("next.gep");		SclrGep->setName("next.gep");
VecVal = Builder.CreateInsertElement(VecVal, SclrGep,		VecVal = Builder.CreateInsertElement(VecVal, SclrGep,
Builder.getInt32(i), "insert.gep");		Builder.getInt32(i), "insert.gep");
}		}
Entry[part] = VecVal;		Entry[part] = VecVal;
}		}
return;		return;
}		}
		case InductionDescriptor::IK_FpInduction: {
		assert(P->getType() == II.getStartValue()->getType() &&
		"Types must match");
		// Handle other induction variables that are now based on the
		// canonical one.
		assert(P != OldInduction && "Main induction can be integer only");
		mkuper: Main -> Primary (I think we use that consistently)

		Value *V = Builder.CreateCast(Instruction::SIToFP, Induction, P->getType());
		V = II.transform(Builder, V, PSE.getSE(), DL);
		V->setName("fp.offset.idx");

		Value *Broadcasted = getBroadcastInstrs(V);
		// After broadcasting the induction variable we need to make the vector
		// consecutive by adding 0, 1, 2, etc.
		anemet: by adding StepVal, you mean
		delena: I fixed the comment. Thanks.
		for (unsigned part = 0; part < UF; ++part)
		Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep(),
		II.getBinaryOpcode());
		return;
		}
		}
}		}

void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {		void InnerLoopVectorizer::vectorizeBlockInLoop(BasicBlock *BB, PhiVector *PV) {
// For each instruction in the old loop.		// For each instruction in the old loop.
for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {		for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {
VectorParts &Entry = WidenMap.get(&*it);		VectorParts &Entry = WidenMap.get(&*it);

switch (it->getOpcode()) {		switch (it->getOpcode()) {
[... 493 lines not shown ...]
void LoopVectorizationLegality::addInductionPhi(		void LoopVectorizationLegality::addInductionPhi(
PHINode *Phi, const InductionDescriptor &ID,		PHINode *Phi, const InductionDescriptor &ID,
SmallPtrSetImpl<Value *> &AllowedExit) {		SmallPtrSetImpl<Value *> &AllowedExit) {
Inductions[Phi] = ID;		Inductions[Phi] = ID;
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
const DataLayout &DL = Phi->getModule()->getDataLayout();		const DataLayout &DL = Phi->getModule()->getDataLayout();

// Get the widest type.		// Get the widest type.
		if (!PhiTy->isFloatingPointTy()) {
if (!WidestIndTy)		if (!WidestIndTy)
WidestIndTy = convertPointerToIntegerType(DL, PhiTy);		WidestIndTy = convertPointerToIntegerType(DL, PhiTy);
else		else
WidestIndTy = getWiderType(DL, PhiTy, WidestIndTy);		WidestIndTy = getWiderType(DL, PhiTy, WidestIndTy);
		}

// Int inductions are special because we only allow one IV.		// Int inductions are special because we only allow one IV.
if (ID.getKind() == InductionDescriptor::IK_IntInduction &&		if (ID.getKind() == InductionDescriptor::IK_IntInduction &&
ID.getConstIntStepValue() &&		ID.getConstIntStepValue() &&
ID.getConstIntStepValue()->isOne() &&		ID.getConstIntStepValue()->isOne() &&
isa<Constant>(ID.getStartValue()) &&		isa<Constant>(ID.getStartValue()) &&
cast<Constant>(ID.getStartValue())->isNullValue()) {		cast<Constant>(ID.getStartValue())->isNullValue()) {

[... 69 lines not shown ...]	for (BasicBlock::iterator it = (bb)->begin(), e = (bb)->end(); it != e;
if (RedDes.hasUnsafeAlgebra())		if (RedDes.hasUnsafeAlgebra())
Requirements->addUnsafeAlgebraInst(RedDes.getUnsafeAlgebraInst());		Requirements->addUnsafeAlgebraInst(RedDes.getUnsafeAlgebraInst());
AllowedExit.insert(RedDes.getLoopExitInstr());		AllowedExit.insert(RedDes.getLoopExitInstr());
Reductions[Phi] = RedDes;		Reductions[Phi] = RedDes;
continue;		continue;
}		}

InductionDescriptor ID;		InductionDescriptor ID;
if (InductionDescriptor::isInductionPHI(Phi, PSE, ID)) {		if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID)) {
addInductionPhi(Phi, ID, AllowedExit);		addInductionPhi(Phi, ID, AllowedExit);
		if (ID.hasUnsafeAlgebra())
		Requirements->addUnsafeAlgebraInst(ID.getUnsafeAlgebraInst());
continue;		continue;
}		}

if (RecurrenceDescriptor::isFirstOrderRecurrence(Phi, TheLoop, DT)) {		if (RecurrenceDescriptor::isFirstOrderRecurrence(Phi, TheLoop, DT)) {
FirstOrderRecurrences.insert(Phi);		FirstOrderRecurrences.insert(Phi);
continue;		continue;
}		}

// As a last resort, coerce the PHI to a AddRec expression		// As a last resort, coerce the PHI to a AddRec expression
// and re-try classifying it a an induction PHI.		// and re-try classifying it a an induction PHI.
if (InductionDescriptor::isInductionPHI(Phi, PSE, ID, true)) {		if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID, true)) {
addInductionPhi(Phi, ID, AllowedExit);		addInductionPhi(Phi, ID, AllowedExit);
continue;		continue;
}		}

emitAnalysis(VectorizationReport(&*it)		emitAnalysis(VectorizationReport(&*it)
<< "value that could not be identified as "		<< "value that could not be identified as "
"reduction is used outside the loop");		"reduction is used outside the loop");
DEBUG(dbgs() << "LV: Found an unidentified PHI." << *Phi << "\n");		DEBUG(dbgs() << "LV: Found an unidentified PHI." << *Phi << "\n");
[1,651 lines not shown]	void InnerLoopUnroller::vectorizeMemoryInstruction(Instruction *Instr) {
return scalarizeInstruction(Instr, IfPredicateStore);		return scalarizeInstruction(Instr, IfPredicateStore);
}		}

Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }		Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }

Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }		Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }

Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx,		Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx,
const SCEV *StepSCEV) {		const SCEV *StepSCEV,
		Instruction::BinaryOps BinOp) {
const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = OrigLoop->getHeader()->getModule()->getDataLayout();
SCEVExpander Exp(*PSE.getSE(), DL, "induction");		SCEVExpander Exp(*PSE.getSE(), DL, "induction");
Value *StepValue = Exp.expandCodeFor(StepSCEV, StepSCEV->getType(),		Value *StepValue = Exp.expandCodeFor(StepSCEV, StepSCEV->getType(),
&*Builder.GetInsertPoint());		&*Builder.GetInsertPoint());
return getStepVector(Val, StartIdx, StepValue);		return getStepVector(Val, StartIdx, StepValue);
}		}

Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step) {		Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step,
		Instruction::BinaryOps BinOp) {
// When unrolling and the VF is 1, we only need to add a simple scalar.		// When unrolling and the VF is 1, we only need to add a simple scalar.
Type *ITy = Val->getType();		Type *Ty = Val->getType();
assert(!ITy->isVectorTy() && "Val must be a scalar");		assert(!Ty->isVectorTy() && "Val must be a scalar");
Constant *C = ConstantInt::get(ITy, StartIdx);
		if (Ty->isFloatingPointTy()) {
		mkuper: Can you add a test for this? All of the tests you added force UF == 1.
		Constant *C = ConstantFP::get(Ty, (double)StartIdx);

		// Floating point operations had to be 'fast' to enable the unrolling.
		mkuper: Are you sure about this? I mean, it's true for vectorizing, but is it true here as well? (I'm not saying it isn't, just making sure this is intentional.)
		delena (author): Even if you have only unrolling, and VF is 1, the value of the FP induction is calculated as sitofp(PrimaryIV) * Increment. The loop `for (int i=0; i<N; i++) { fp_ind += fp_inc; }` is transformed into something like this: `init = fp_inc; for (int i=0; i<N; i++) { fp_ind = init + i*fp_inc; }`. In this case we need unsafe math. I added tests for unrolling.
		FastMathFlags Flags;
		Flags.setUnsafeAlgebra();

		Value *MulOp = Builder.CreateFMul(C, Step);
		if (isa<Instruction>(MulOp))
		cast<Instruction>(MulOp)->setFastMathFlags(Flags);
		Value *BOp = Builder.CreateBinOp(BinOp, Val, MulOp);
		cast<Instruction>(BOp)->setFastMathFlags(Flags);
		return BOp;
		}
		Constant *C = ConstantInt::get(Ty, StartIdx);
return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");		return Builder.CreateAdd(Val, Builder.CreateMul(C, Step), "induction");
}		}
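
The rewrite delena describes in the inline comment above (replacing the running sum with a closed form based on the integer induction variable) can be sketched in standalone C++. This is only an illustration, not code from the patch: the helper names `iterative` and `closedForm` are hypothetical. It suggests why the transformation is gated on fast-math: the two forms agree when every intermediate value is exactly representable, but nothing guarantees bit-identical results in general, because rounding happens at different points.

```cpp
// Hypothetical helper: value of the FP induction after n iterations,
// computed the way the original scalar loop does (repeated addition,
// rounding after every step).
float iterative(float init, float step, int n) {
  float x = init;
  for (int i = 0; i < n; ++i)
    x += step;
  return x;
}

// Hypothetical helper: the same value in closed form, mirroring
// sitofp(PrimaryIV) * Increment followed by a single add.
float closedForm(float init, float step, int n) {
  return init + static_cast<float>(n) * step;
}
```

With a step such as 0.5, which is exact in binary floating point, both helpers return the same value for small trip counts. With a step such as 0.1 the results may differ in the low bits, which is the reassociation that the 'fast' flag permits.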

../test/Transforms/LoopVectorize/float-induction.ll

				; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -dce -instcombine -S | FileCheck %s


				; CHECK-LABEL: @fp_iv_loop1(
				; CHECK: vector.body:
				anemet: I think you can only have this under LoopVectorize/X86 (what if the X86 backend is not enabled in a build?). But more importantly, I don't understand why you need to formulate the non-fast-math case as an x86 test.
				delena (author): The "unsafe" function attribute works only for auto-vec. If you specify -force-vector-width=4 the loop will be vectorized anyway. So I added "X86" tests just to check the combination of the function attribute and "safe" FP induction in the auto-vectorization mode.
				anemet: Hmm, that is somewhat unexpected. I thought that the -force* stuff only overruled the cost model... @hfinkel, is it intentional/desired that -force-vector-width > 1 overrules (some of) the legality checks? We have different ways to bypass the legality checks, so perhaps keeping -force-vector-width as a way to overrule only the profitability checks is more desirable? I guess one concern is that this internal flag was advertised in Nadav's Auto-vectorizer blog post, so perhaps changing the behavior is not trivial at this point. Anyhow, coming back to this patch, you then need to split this part of the test into a Transform/LoopVectorize/X86 test.
				; CHECK: %[[INDEX:.*]] = sitofp i64 {{.*}} to float
				; CHECK: %[[VEC_INCR:.*]] = fmul fast float %{{.*}}, %[[INDEX]]
				; CHECK: fsub fast float %init, %[[VEC_INCR]]
				; CHECK: store <4 x float>

				@fp_inc = common global float 0.000000e+00, align 4

				;void fp_iv_loop1(float init, float * __restrict__ A, int N) {
				; float x = init;
				anemet: Where is the result of these guys used? Would be good to check too.
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x -= fp_inc;
				; }
				anemet: Why not also match the result of the fsub and check it in the store?
				;}

				define void @fp_iv_loop1(float %init, float* noalias nocapture %A, i32 %N) #0 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.lr.ph, label %for.end

				for.body.lr.ph: ; preds = %entry
				%0 = load float, float* @fp_inc, align 4
				br label %for.body

				for.body: ; preds = %for.body, %for.body.lr.ph
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%x.05 = phi float [ %init, %for.body.lr.ph ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.05, float* %arrayidx, align 4
				%add = fsub fast float %x.05, %0
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				;void fp_iv_loop2(float init, float * __restrict__ A, int N) {
				; float x = init;
				; for (int i=0; i < N; ++i) {
				; A[i] = x;
				; x += 0.5;
				; }
				;}

				; CHECK-LABEL: @fp_iv_loop2(
				; CHECK: vector.body
				; CHECK: %[[index:.*]] = phi i64 [ 0, %vector.ph ]
				; CHECK: sitofp i64 %[[index]] to float
				; CHECK: %[[VAR1:.*]] = fmul fast float {{.*}}, 5.000000e-01
				; CHECK: %[[VAR2:.*]] = fadd fast float %[[VAR1]]
				; CHECK: insertelement <4 x float> undef, float %[[VAR2]], i32 0
				; CHECK: shufflevector <4 x float> {{.*}}, <4 x float> undef, <4 x i32> zeroinitializer
				; CHECK: fadd fast <4 x float> {{.*}}, <float 0.000000e+00, float 5.000000e-01, float 1.000000e+00, float 1.500000e+00>
				; CHECK: store <4 x float>

				define void @fp_iv_loop2(float %init, float* noalias nocapture %A, i32 %N) #0 {
				entry:
				%cmp4 = icmp sgt i32 %N, 0
				br i1 %cmp4, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%x.06 = phi float [ %conv1, %for.body ], [ %init, %for.body.preheader ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.06, float* %arrayidx, align 4
				%conv1 = fadd fast float %x.06, 5.000000e-01
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit: ; preds = %for.body
				br label %for.end

				for.end: ; preds = %for.end.loopexit, %entry
				ret void
				}

				;void fp_iv_loop3(float init, float * __restrict__ A, float * __restrict__ B, float * __restrict__ C, int N) {
				; int i = 0;
				; float x = init;
				; float y = 0.1;
				; for (; i < N; ++i) {
				; A[i] = x;
				; x += fp_inc;
				; y -= 0.5;
				; B[i] = x + y;
				; C[i] = y;
				; }
				;}
				; CHECK-LABEL: @fp_iv_loop3(
				; CHECK: vector.body
				; CHECK: %[[index:.*]] = phi i64 [ 0, %vector.ph ]
				; CHECK: sitofp i64 %[[index]] to float
				; CHECK: %[[VAR1:.*]] = fmul fast float {{.*}}, -5.000000e-01
				; CHECK: fadd fast float %[[VAR1]]
				; CHECK: fadd fast <4 x float> {{.*}}, <float -5.000000e-01, float -1.000000e+00, float -1.500000e+00, float -2.000000e+00>
				; CHECK: store <4 x float>

				define void @fp_iv_loop3(float %init, float* noalias nocapture %A, float* noalias nocapture %B, float* noalias nocapture %C, i32 %N) #0 {
				entry:
				%cmp9 = icmp sgt i32 %N, 0
				br i1 %cmp9, label %for.body.lr.ph, label %for.end

				for.body.lr.ph: ; preds = %entry
				%0 = load float, float* @fp_inc, align 4
				br label %for.body

				for.body: ; preds = %for.body, %for.body.lr.ph
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%y.012 = phi float [ 0x3FB99999A0000000, %for.body.lr.ph ], [ %conv1, %for.body ]
				%x.011 = phi float [ %init, %for.body.lr.ph ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
				store float %x.011, float* %arrayidx, align 4
				%add = fadd fast float %x.011, %0
				%conv1 = fadd fast float %y.012, -5.000000e-01
				%add2 = fadd fast float %conv1, %add
				%arrayidx4 = getelementptr inbounds float, float* %B, i64 %indvars.iv
				store float %add2, float* %arrayidx4, align 4
				%arrayidx6 = getelementptr inbounds float, float* %C, i64 %indvars.iv
				store float %conv1, float* %arrayidx6, align 4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
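
The CHECK lines for @fp_iv_loop2 above expect a scalar `sitofp`/`fmul`/`fadd` sequence, a broadcast, and then a vector `fadd` with the constant <0.0, 0.5, 1.0, 1.5>. A rough model of what one vector iteration computes for VF = 4 might look like the sketch below. The function name `vectorLanes` is hypothetical and the code only mirrors the arithmetic the checks imply, not the vectorizer's actual implementation.

```cpp
#include <array>

// Hypothetical model of one vector iteration with VF = 4:
// the scalar base init + float(index) * step is broadcast to all lanes,
// then the step vector <0*step, 1*step, 2*step, 3*step> is added.
std::array<float, 4> vectorLanes(float init, float step, long index) {
  const float base = init + static_cast<float>(index) * step;
  std::array<float, 4> lanes{};
  for (int j = 0; j < 4; ++j)
    lanes[j] = base + static_cast<float>(j) * step;
  return lanes;
}
```

For a step of 0.5 the per-lane offsets are 0.0, 0.5, 1.0 and 1.5, matching the constant vector in the fp_iv_loop2 checks; for fp_iv_loop3 the step of -0.5 applied from a start value already advanced by one step would give the -0.5 through -2.0 constants seen there.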

../test/Transforms/LoopVectorize/induction-step.ll

	; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=8 -S | FileCheck %s			; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=8 -S | FileCheck %s

	; int int_inc;			; int int_inc;
	;			;
	;int induction_with_global(int init, int *restrict A, int N) {			;int induction_with_global(int init, int *restrict A, int N) {
	; int x = init;			; int x = init;
	; for (int i=0;i<N;i++){			; for (int i=0;i<N;i++){
	; A[i] = x;			; A[i] = x;
	; x += int_inc;			; x += int_inc;
	; }			; }
	; return x;			; return x;
	;}			;}

	; CHECK-LABEL: @induction_with_global(			; CHECK-LABEL: @induction_with_global(
	; CHECK: %[[INT_INC:.*]] = load i32, i32* @int_inc, align 4			; CHECK: %[[INT_INC:.*]] = load i32, i32* @int_inc, align 4
	; CHECK: vector.body:			; CHECK: vector.ph:
	; CHECK: %[[VAR1:.*]] = insertelement <8 x i32> undef, i32 %[[INT_INC]], i32 0			; CHECK: %[[VAR1:.*]] = insertelement <8 x i32> undef, i32 %[[INT_INC]], i32 0
	; CHECK: %[[VAR2:.*]] = shufflevector <8 x i32> %[[VAR1]], <8 x i32> undef, <8 x i32> zeroinitializer			; CHECK: %[[VAR2:.*]] = shufflevector <8 x i32> %[[VAR1]], <8 x i32> undef, <8 x i32> zeroinitializer
	; CHECK: mul <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %[[VAR2]]			; CHECK: mul <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %[[VAR2]]

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"


	@int_inc = common global i32 0, align 4			@int_inc = common global i32 0, align 4
	[39 lines not shown]
	; }			; }
	; return x;			; return x;
	;}			;}

	; CHECK-LABEL: @induction_with_loop_inv(			; CHECK-LABEL: @induction_with_loop_inv(
	; CHECK: for.cond1.preheader:			; CHECK: for.cond1.preheader:
	; CHECK: %[[INDVAR0:.*]] = phi i32 [ 0,			; CHECK: %[[INDVAR0:.*]] = phi i32 [ 0,
	; CHECK: %[[INDVAR1:.*]] = phi i32 [ 0,			; CHECK: %[[INDVAR1:.*]] = phi i32 [ 0,
	; CHECK: vector.body:			; CHECK: vector.ph:
	; CHECK: %[[VAR1:.*]] = insertelement <8 x i32> undef, i32 %[[INDVAR1]], i32 0			; CHECK: %[[VAR1:.*]] = insertelement <8 x i32> undef, i32 %[[INDVAR1]], i32 0
	; CHECK: %[[VAR2:.*]] = shufflevector <8 x i32> %[[VAR1]], <8 x i32> undef, <8 x i32> zeroinitializer			; CHECK: %[[VAR2:.*]] = shufflevector <8 x i32> %[[VAR1]], <8 x i32> undef, <8 x i32> zeroinitializer
	; CHECK: mul <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %[[VAR2]]			; CHECK: mul <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %[[VAR2]]

	define i32 @induction_with_loop_inv(i32 %init, i32* noalias nocapture %A, i32 %N, i32 %M) {			define i32 @induction_with_loop_inv(i32 %init, i32* noalias nocapture %A, i32 %N, i32 %M) {
	entry:			entry:
	%cmp10 = icmp sgt i32 %M, 0			%cmp10 = icmp sgt i32 %M, 0
	br i1 %cmp10, label %for.cond1.preheader.lr.ph, label %for.end6			br i1 %cmp10, label %for.cond1.preheader.lr.ph, label %for.end6
	[44 lines not shown]

Loop vectorization with FP induction variables (Closed, Public)
Diff 61935

../include/llvm/Transforms/Utils/LoopUtils.h

../lib/Analysis/ScalarEvolutionExpander.cpp

../lib/Transforms/Scalar/LoopInterchange.cpp

../lib/Transforms/Utils/LoopUtils.cpp

../lib/Transforms/Vectorize/LoopVectorize.cpp

../test/Transforms/LoopVectorize/float-induction.ll

../test/Transforms/LoopVectorize/induction-step.ll