This is an archive of the discontinued LLVM Phabricator instance.

Differential D38948

[LV] Support efficient vectorization of an induction with redundant casts
ClosedPublic

Authored by dorit on Oct 16 2017, 5:29 AM.

Details

Reviewers

silviu.baranga
Ayal
mssimpso

Summary

D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose update chain involves casts; PSCEV can now build an AddRecurrence for some forms of such phi nodes, under the proper runtime overflow test. This means that we can identify such phi nodes as an induction, and the loop-vectorizer can now vectorize such inductions, however inefficiently. The vectorizer doesn't know that it can ignore the casts, and so it vectorizes them.

This patch records the casts in the InductionDescriptor, so that they could be marked to be ignored for cost calculation (we use vecValuesToIgnore for that) and ignored for vectorization/widening/scalarization (i.e. treated as TriviallyDead).

In addition to marking all these casts to be ignored, we also need to make sure that each cast is mapped to the right vector value in the vector loop body (be it a widened, vectorized, or scalarized induction). So whenever an induction phi is mapped to a vector value (during vectorization/widening/scalarization), we also map the respective cast instruction (if one exists) to that vector value.

Note: if the phi-update sequence of an induction involves more than one cast, then the above mapping to vector value is relevant only for the last cast of the sequence. (We also allow only the "last cast" to be used outside the induction update chain itself).

Lastly, we also record all these "last casts" in InductionCastsToIgnore; this is because some utilities, which need to query whether a Value is an Induction variable, may wish to be able to (quickly) identify that a casted induction value also behaves like an Induction (e.g. for cost calculation, such as in getAddressAccessCost).
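
The bookkeeping described in the summary can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyInductionDescriptor` and `ToyVectorizerState` are hypothetical stand-ins (plain strings instead of LLVM `Instruction`/`Value` pointers), and the convention that the "last cast" sits at index 0 of the recorded casts is assumed here for the sake of the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for LLVM instructions and vector values.
using Inst = std::string;
using VecValue = std::string;

struct ToyInductionDescriptor {
  // Redundant casts of the phi-update chain. Assumption for this sketch:
  // the "last cast" (the only one usable outside the chain) is at index 0.
  std::vector<Inst> RedundantCasts;
};

struct ToyVectorizerState {
  std::map<Inst, VecValue> VectorLoopValueMap; // scalar inst -> vector value
  std::set<Inst> VecValuesToIgnore;            // skipped by the vector cost model
  std::set<Inst> InductionCastsToIgnore;       // "last casts" that act like IVs

  // Mark every cast as ignorable and remember the last cast for quick queries.
  void recordCasts(const ToyInductionDescriptor &ID) {
    for (const Inst &Cast : ID.RedundantCasts)
      VecValuesToIgnore.insert(Cast);
    if (!ID.RedundantCasts.empty())
      InductionCastsToIgnore.insert(ID.RedundantCasts.front());
  }

  // Whenever the phi is mapped to a vector value, map the last cast to it too.
  void mapInduction(const Inst &Phi, const ToyInductionDescriptor &ID,
                    const VecValue &V) {
    assert(!V.empty() && "expected a named vector value");
    VectorLoopValueMap[Phi] = V;
    if (!ID.RedundantCasts.empty())
      VectorLoopValueMap[ID.RedundantCasts.front()] = V;
  }

  // Quick query: does this value behave like an induction variable?
  bool behavesLikeInduction(const Inst &I) const {
    return InductionCastsToIgnore.count(I) != 0;
  }
};
```

In this model, users of the last cast pick up the phi's vector value through the map, while the intermediate casts are never mapped and simply end up unused.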

This is the last step in addressing PR30654.

Event Timeline

dorit created this revision. Oct 16 2017, 5:29 AM

Herald added subscribers: hiraditya, sanjoy. Oct 16 2017, 5:29 AM

Ayal added inline comments. Nov 2 2017, 9:33 AM

../llvm/include/llvm/Transforms/Utils/LoopUtils.h
311 ↗	(On Diff #119140)	"\Casts" >> "\p Casts". Or better yet "\p CastsToIgnore"?
359 ↗	(On Diff #119140)	Better have "const SmallVectorImpl<Instruction *>" as the return type?
367 ↗	(On Diff #119140)	Use "Casts" instead of "CI" for consistency.
379 ↗	(On Diff #119140)	"CastInsts" >> "RedundantCasts"?
../llvm/lib/Analysis/ScalarEvolution.cpp
4389 ↗	(On Diff #119140)	"casts" >> "cast", or "and".
4405 ↗	(On Diff #119140)	Can you check `if (!Mask || !Mask.isMask())` instead?
4424 ↗	(On Diff #119140)	Forgot `%x` to And with 255?
4428 ↗	(On Diff #119140)	"%sext, m" >> "%tmp, m"?
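
The mask check suggested for line 4405 amounts to asking whether a constant is a non-empty run of ones starting at bit 0, which is what makes `and %x, 255` behave like a truncate-then-zero-extend. A minimal standalone sketch on plain 64-bit integers follows; `isLowBitMask` and `maskBitWidth` are hypothetical helpers written for this note, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// True if V looks like 0b0...01...1: at least one set bit, all set bits
// contiguous and starting at bit 0.
inline bool isLowBitMask(uint64_t V) {
  return V != 0 && (V & (V + 1)) == 0;
}

// Number of low bits kept by the mask, e.g. 8 for 255.
inline unsigned maskBitWidth(uint64_t V) {
  assert(isLowBitMask(V) && "not a low-bit mask");
  unsigned Bits = 0;
  while (V != 0) {
    ++Bits;
    V >>= 1;
  }
  return Bits;
}
```

With these helpers, 255 qualifies (width 8), while 0, 254, and 256 do not.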

fhahn added a subscriber: fhahn. Nov 3 2017, 2:47 AM

Incorporated Ayal's comments. Thanks!

ping :)

ping^2

thanks,
Dorit

A few additional minor comments..

lib/Analysis/ScalarEvolution.cpp
4378	Perhaps a clearer way of stating this: "This last instruction (is the only one which) may be used outside of the ExtTrunc cast sequence, so it is placed in position 0 of CastInsts, for special handling later on." Worth asserting that CastInsts is empty before pushing back, to make sure that the above statement holds?
4404	assert empty before push_back?
test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll
26	doit3 >> doit1
44	If we first check NOT to have the shl which defines TEST, seems useless to then check that ashr doesn't use TEST. Better check instead that the induction phi in vector.body (the one being stored into a[i]) feeds and consumes its bump directly?
94	doit3 >> doit2

Thanks Ayal. Incorporated your suggestions.

Sorry for not having a look earlier.

Maybe I'm missing something, but why not rewrite the induction variable using the SCEV that we get from PSE (something like what IndVarSimplify does)?
If we would take this approach then the casts would end up being dead code and removed.

lib/Analysis/ScalarEvolution.cpp
4470	(1) might not hold in the future, so this might end up being a problem?

Hi Silviu,

Maybe I'm missing something, but why not rewrite the induction variable using the SCEV that we get from PSE (something like what IndVarSimplify does)? If we would take this approach then the casts would end up being dead code and removed.

We can't do that on the original loop, which remains scalar and untouched, and will not be guarded by the scev predicates.
This means that the cast instructions will be there when we start vectorizing, and we need a way to tell that they will cost us nothing, and that we don't need to vectorize them but rather "look through them" as if they weren't there.

IIUC, what IndVarSimplify does is call the rewriter and then fix the users of the induction; This is similar in effect to what we do: We don't need to call the PSCEV rewriter again, we already have the nice AddRec for the induction phi (isInductionPhi() had already obtained it); The vectorization of the induction phi proceeds unchanged; The only thing we are adding is the def-use wiring so that in the vectorized loop, any users of the cast instructions will be users of the vectorized phi. We never vectorize the casts, and we never actively remove them -- they will end up dead code in the vectorized loop because they will not be used.

Comment at: lib/Analysis/ScalarEvolution.cpp:4470
+/// (1) The value of the phi at iteration i is:
+/// (Ext ix (Trunc iy ( Start + i*Step ) to ix) to iy)

+/// (2) Under the runtime predicate, the above expression is equal to:

(1) might not hold in the future, so this might end up being a problem?

Do you mean this will not hold once additional forms of "casted-inductions" are supported by createAddRecFromPhiWithCasts()?
In such a case this function will not find the IR sequence that it expects, and we will have a missed optimization (namely, the vectorizer will vectorize the cast instructions instead of ignoring them). But we won't have a correctness problem.

I think I should probably add a comment to the documentation of createAddRecFromPhiWithCasts() to remind us that if/when it is extended to support more forms of "casted-induction" SCEV-Exprs, then it is recommended to also extend getCastsForInductionPHI() to look for the additional IR patterns that can result in these SCEV-Exprs. Would that address your concern?

Thanks,
Dorit

Ayal added inline comments. Nov 19 2017, 10:31 AM

lib/Transforms/Utils/LoopUtils.cpp
880	Set CastInsts to nullptr or the address of a SmallVector created earlier depending on the conditions below, or invoke isInductionPHI() once using the default nullptr for its last parameter, or else with such an address.
933	Suggest to keep the 'nullptr' parameter as the last argument and continue to use its default, or document /* what-this-nullptr-is */
lib/Transforms/Vectorize/LoopVectorize.cpp
619	Placing 'Part' next to 'Lane' seems more logical.
1592	The name "isCastedInductionVariable()" may be confused with "isInductionVariable()" - the latter deals with a header Phi, whereas the former deals with a redundant Cast. (The Phi feeding an isCastedInductionVariable() answers true to isInductionVariable(), right?) Perhaps have "isInductionPhi()", and change "isInductionVariable()" to answer true for both Phi and Cast Instructions of an IV?
5273	In other words, all casts could be recorded here for ignoring, but suffices to record only the first.
6999	VecValuesToIgnore holds Instructions whose cost should be ignored only if widened, and for VF's where they are actually widened, IIUC their use in calculateRegisterUsage(). Should the redundant IV casts be added to ValuesToIgnore instead, being redundant in both scalar and vector types? In any case, having expectedCost() neglect the cost of VecValuesToIgnore may affect cases unrelated to this patch's casted Inductions (related to Reductions), so probably better done in a separate patch.
test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll
52	TEST is set but not used.
120	TEST is set but not used.

In D38948#929665, @dorit wrote:

IIUC, what IndVarSimplify does is call the rewriter and then fix the users of the induction; This is similar in effect to what we do: We don't need to call the PSCEV rewriter again, we already have the nice AddRec for the induction phi (isInductionPhi() had already obtained it); The vectorization of the induction phi proceeds unchanged; The only thing we are adding is the def-use wiring so that in the vectorized loop, any users of the cast instructions will be users of the vectorized phi. We never vectorize the casts, and we never actively remove them -- they will end up dead code in the vectorized loop because they will not be used.

Yes, IndVarSimplify wouldn't fix this issue, but I was thinking more of using the techniques there that use the SCEV expressions to find these cases instead of doing the pattern matching (see the inline comment).

Comment at: lib/Analysis/ScalarEvolution.cpp:4470
+/// (1) The value of the phi at iteration i is:
+/// (Ext ix (Trunc iy ( Start + i*Step ) to ix) to iy)

+/// (2) Under the runtime predicate, the above expression is equal to:

(1) might not hold in the future, so this might end up being a problem?

Do you mean this will not hold once additional forms of "casted-inductions" are supported by createAddRecFromPhiWithCasts()?

Yes, but even if it's not a correctness issue I think we should avoid pattern matching here if possible to avoid doing more work in the future.

Thanks,
Silviu

lib/Analysis/ScalarEvolution.cpp
4482	If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1? In that case, wouldn't it be possible to search for values in the loop with the same SCEV as an induction PHI, go through the def-use chain until you've reached the PHI and mark all instructions in between as (part of) a cast of that PHI? I think this wouldn't need any pattern matching and should be straightforward to do. Also I think ideally this should all be in a utility function (LoopUtils?). It seems strange to have this in Scalar Evolution. It's very specific to the vectorizer.
4491	Technically this would be part of a cast operation, and not a cast itself? (shl changes the value).

Yes, IndVarSimplify wouldn't fix this issue, but I was thinking more of using the techniques there that use the SCEV expressions to find these cases instead of doing the pattern matching (see the inline comment).

Oh, so your comment was referring to how we retrieve the IR sequence, not to how we vectorize the phi... Ok, I think I understand your point now.

Do you mean this will not hold once additional forms of "casted-inductions" are supported by createAddRecFromPhiWithCasts()?

Yes, but even if it's not a correctness issue I think we should avoid pattern matching here if possible to avoid doing more work in the future.

Yes, I agree it's preferable.
The current pattern matching is actually overkill: it is overly cautious and re-checks things we don't need to re-check. It's enough to walk the def-use chain starting from the value that defines the phi via the back-edge and, as you suggest, once we encounter a Value on the way whose SCEV is the same as the PHI SCEV, to mark instructions as part of a casting sequence.

I agree that it's nicer. I would hesitate about implementing here something much more generic than what our SCEV-rewriter is currently able to support. But it can certainly be made more generic than it currently is, so it will be easier to extend it even further, in the future, if/when the need arises. I will give it a try.
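
The walk proposed here can be sketched on a toy def-use chain. This is a simplified illustration, not the code from the patch: `ToyInst`, `ToyLoop`, and `collectCastSequence` are hypothetical names, each instruction has a single operand leading back toward the phi, and SCEVs are compared as plain strings that are assumed to already reflect the predicates.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct ToyInst {
  std::string Operand; // the operand that leads back toward the phi
  std::string Scev;    // SCEV text under the predicates (assumed given)
};
using ToyLoop = std::map<std::string, ToyInst>;

// Start at the value feeding the phi over the back-edge and walk toward the
// phi. From the first value whose SCEV equals the phi's SCEV onward, every
// instruction on the way is collected as part of the cast sequence.
inline std::vector<std::string>
collectCastSequence(const ToyLoop &Loop, const std::string &Phi,
                    const std::string &BackedgeValue) {
  std::vector<std::string> Casts;
  ToyLoop::const_iterator PhiIt = Loop.find(Phi);
  ToyLoop::const_iterator StartIt = Loop.find(BackedgeValue);
  if (PhiIt == Loop.end() || StartIt == Loop.end())
    return Casts;

  bool InCastSequence = false;
  std::string Cur = StartIt->second.Operand; // skip the increment itself
  for (std::size_t Steps = 0; Cur != Phi; ++Steps) {
    ToyLoop::const_iterator It = Loop.find(Cur);
    if (It == Loop.end() || Steps > Loop.size())
      return std::vector<std::string>(); // chain does not lead back to the phi
    if (It->second.Scev == PhiIt->second.Scev)
      InCastSequence = true;
    if (InCastSequence)
      Casts.push_back(Cur);
    Cur = It->second.Operand;
  }
  return Casts;
}
```

Because the walk starts at the back-edge value, the cast closest to the increment (the "last cast") comes first in the result.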

lib/Analysis/ScalarEvolution.cpp
4482	"If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1?"
Yes, that should be the case.
"In that case, wouldn't it be possible to search for values in the loop with the same SCEV as an induction PHI"
Is there a ScalarEvolution map that holds this information (all the Values that have the same SCEV)? There's the ExprValueMap which looks close, but I'm not sure... Or do you suggest to really obtain "from scratch" all the values defined in the loop and call getSCEV for each? Would it make more sense to start from the Value that defines the phi from the backedge (the way we start now) and walk the def-use chain from there (calling getSCEV on each value we find on the way, and from the point where we encounter the same SCEV as the induction phi, continue the def-use walk while marking instructions on the way as part of a cast)?
"Also I think ideally this should all be in a utility function (LoopUtils?). It seems strange to have this in Scalar Evolution. It's very specific to the vectorizer."
Ok, sure, will move it.
4491	Yes, sure. It's a "Cast-Sequence Member". I'll drop the "Cast:" from the comment.

Addressed Ayal's comments.
Have yet to address Silviu's comments.

Thanks very much Silviu and Ayal!

dorit marked 12 inline comments as done. Nov 20 2017, 1:28 AM

dorit added inline comments.

lib/Transforms/Vectorize/LoopVectorize.cpp
6999	"VecValuesToIgnore holds Instructions whose cost should be ignored only if widened, and for VF's where they are actually widened, IIUC their use in calculateRegisterUsage(). Should the redundant IV casts be added to ValuesToIgnore instead, being redundant in both scalar and vector types?"
No, because then we would also ignore the cost of the casts when we calculate the baseline cost (of the scalar loop), and we don't want to do that (that loop will not be guarded by the predicate, and the casts will remain there).
"In any case, having expectedCost() neglect the cost of VecValuesToIgnore may affect cases unrelated to this patch's casted Inductions (related to Reductions), so probably better done in a separate patch."
Yes. I thought it's a small enough fix to be included in this patch. I can submit a separate patch for this.
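
The distinction between the two ignore sets can be illustrated with a toy cost function. This is a sketch under stated assumptions, not the vectorizer's cost model: `toyExpectedCost` is a hypothetical helper, every instruction is given an invented unit cost, and the only behaviour modelled is "ValuesToIgnore is skipped for every VF, VecValuesToIgnore only when VF > 1".

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct ToyInst {
  std::string Name;
  unsigned Cost; // invented per-instruction cost, for illustration only
};
using IgnoreSet = std::set<std::string>;

// Sums the cost of the loop body for a given vectorization factor.
inline unsigned toyExpectedCost(const std::vector<ToyInst> &Body, unsigned VF,
                                const IgnoreSet &ValuesToIgnore,
                                const IgnoreSet &VecValuesToIgnore) {
  assert(VF >= 1 && "VF of 1 means the scalar loop");
  unsigned Total = 0;
  for (const ToyInst &I : Body) {
    if (ValuesToIgnore.count(I.Name))
      continue; // ignored for the scalar baseline and for vector loops
    if (VF > 1 && VecValuesToIgnore.count(I.Name))
      continue; // ignored only when the loop is actually vectorized
    Total += I.Cost;
  }
  return Total;
}
```

Putting the casts in VecValuesToIgnore keeps their cost in the scalar baseline (where they really remain) while dropping it from the vector loop; putting them in ValuesToIgnore would drop it from both.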

Hi Silviu,
I started to try out the approach you suggested, and I realized that our assumption doesn't hold... (see response to inlined comment).
Thanks,
Dorit

lib/Analysis/ScalarEvolution.cpp
4482	"If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1?"
"Yes, that should be the case"
Turns out that's not the case…. For example, for this def-use chain:
V0: %w_ix.011 = phi (0, %add)
V1: %sext = shl i64 %w_ix.011, 32
V2: %idxprom = ashr exact i64 %sext, 32
V3: %add = add i64 %idxprom, %step
The SCEV expr of the Phi is {0,+,%step}<%for.body>
The SCEV expr of V2 is (sext i32 (trunc i64 %w_ix.011 to i32) to i64) (this is what PSCEV.getSCEV() returns)....
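
The reason V2 shows up as a sext of a trunc is that a left shift by 32 followed by an arithmetic right shift by 32 on an i64 keeps the low 32 bits and sign-extends them. A small standalone check of that equivalence on 64-bit integers follows; the function names are made up for this note, and the code assumes the usual two's-complement conversions and arithmetic right shift (guaranteed from C++20, and what mainstream compilers do before that).

```cpp
#include <cassert>
#include <cstdint>

// shl i64 %x, 32 followed by ashr i64 %t, 32. The left shift is done on an
// unsigned value so that no signed overflow is involved.
inline int64_t shlThenAshr32(int64_t X) {
  uint64_t Shifted = static_cast<uint64_t>(X) << 32;
  return static_cast<int64_t>(Shifted) >> 32;
}

// sext i32 (trunc i64 %x to i32) to i64
inline int64_t sextOfTrunc32(int64_t X) {
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint64_t>(X)));
}

// Returns the casted value; the assert documents that both forms agree.
inline int64_t castedValue(int64_t X) {
  assert(shlThenAshr32(X) == sextOfTrunc32(X));
  return sextOfTrunc32(X);
}
```

Values that fit in 32 signed bits pass through unchanged, which is exactly the situation the runtime predicates are meant to guarantee.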

sbaranga added inline comments. Nov 20 2017, 1:15 PM

lib/Analysis/ScalarEvolution.cpp
4482	Right... I think the SCEV rewriter isn't replacing w_ix.011 with {0, +, step} (not taking into account the equals predicate). Ideally we would be making the following transformations:
(sext i32 (trunc i64 %w_ix.011 to i32) to i64) -> (equals predicate, replace %w_ix)
(sext i32 (trunc i64 {0, +, step} to i32) to i64) -> (fold trunc)
(sext i32 {0, +, trunc i64 step to i32} to i64) -> (no overflow predicate)
{0, +, sext i32 (trunc i64 step to i32)} -> (equals predicate, replace sext i32 (trunc i64 step to i32) with step)
{0, +, step}

Hi Silviu,

Right... I think the SCEV rewriter isn't replacing w_ix.011 with {0, +, step} (not taking into account the equals predicate).

a couple updates:

I think I mixed things up a bit in what I wrote above; the situation is as follows:

Ideally we would be making the following transformations:
(sext i32 (trunc i64 %w_ix.011 to i32) to i64) -> (equals predicate, replace %w_ix)
(sext i32 (trunc i64 {0, +, step} to i32) to i64) -> (fold trunc)
(sext i32 {0, +, trunc i64 step to i32} to i64) -> (no overflow predicate)
{0, +, sext i32 (trunc i64 step to i32)}

Up to here everything works as expected, and the expression returned for PSE.getSCEV(V2) is {0, +, sext i32 (trunc i64 step to i32)}.
But now, when we check if this SCEV is equal to {0, +, step}, we fail because my equality check was not looking at the predicates... When considering the equality predicates all is well.
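
The chain of rewrites above can be sanity-checked numerically. The sketch below runs the scalar update with the casts in place and compares the phi against the closed form {0, +, step}; it is a toy under stated assumptions (start value 0, small bounds so that the arithmetic stays exact, and hypothetical helper names `predicatesHold` and `castedPhiMatchesAddRec`), not the runtime check the vectorizer emits.

```cpp
#include <cassert>
#include <cstdint>

// sext i32 (trunc i64 X to i32) to i64 (modular narrowing assumed).
inline int64_t sextTrunc32(int64_t X) {
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint64_t>(X)));
}

// Toy version of the two runtime checks, for N iterations starting at 0:
//  * Equal predicate:   Step == sext(trunc(Step))
//  * no-wrap predicate: the narrow AddRec {0,+,trunc(Step)} stays inside i32.
inline bool predicatesHold(int64_t Step, int64_t N) {
  assert(N >= 0 && N <= (int64_t(1) << 20) && "toy bound");
  if (sextTrunc32(Step) != Step)
    return false;
  int64_t Last = N * Step; // |Step| <= 2^31 here, so the product is exact
  return Last >= INT32_MIN && Last <= INT32_MAX;
}

// Runs the loop with the casts and checks the phi equals I * Step each time.
inline bool castedPhiMatchesAddRec(int64_t Step, int64_t N) {
  assert(N >= 0 && N <= (int64_t(1) << 20) && "toy bound");
  assert(Step >= -(int64_t(1) << 40) && Step <= (int64_t(1) << 40) &&
         "toy bound");
  int64_t Phi = 0;
  for (int64_t I = 0; I < N; ++I) {
    if (Phi != I * Step)
      return false;
    Phi = sextTrunc32(Phi) + Step; // %add = add (sext (trunc %phi)), %step
  }
  return true;
}
```

When both predicates hold, the cast is a no-op on every iteration and the phi follows the AddRec; when either fails (a step that does not fit in i32, or a narrow overflow), the two sequences diverge.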

BTW, I didn't find a PSCEV utility that checks for equality of SCEV expressions taking equality predicates into account… did I miss anything? (here we have two AddRecs to compare, so a Start1,Start2 to compare and a Step1,Step2 to compare, so in what I wrote I'm looking for any EqualPredicates whose LHS=Step1 and RHS=Step2, or LHS=Step2 and RHS=Step1, and the same for the Start exprs… makes sense?? (just feels like a lot of work…))
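
The comparison described in this paragraph can be sketched as follows. This is a toy, not the `areAddRecsEqualWithPreds()` utility from the patch: SCEV expressions are modelled as plain strings assumed to be canonical, `ToyAddRec`, `EqualPreds`, and the function names are hypothetical, and only Equal predicates are considered.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// An AddRec {Start,+,Step} with its operands as SCEV text.
struct ToyAddRec {
  std::string Start;
  std::string Step;
};

// Each Equal predicate is stored as an (LHS, RHS) pair.
using EqualPreds = std::set<std::pair<std::string, std::string> >;

// Two expressions are equal if they are identical, or if an Equal predicate
// relates them in either direction.
inline bool equalUnderPreds(const std::string &A, const std::string &B,
                            const EqualPreds &Preds) {
  return A == B || Preds.count(std::make_pair(A, B)) != 0 ||
         Preds.count(std::make_pair(B, A)) != 0;
}

// Compare the Start and Step operands pairwise under the predicates.
inline bool areToyAddRecsEqualWithPreds(const ToyAddRec &AR1,
                                        const ToyAddRec &AR2,
                                        const EqualPreds &Preds) {
  return equalUnderPreds(AR1.Start, AR2.Start, Preds) &&
         equalUnderPreds(AR1.Step, AR2.Step, Preds);
}
```

With the predicate `%step == (sext i32 (trunc i64 %step to i32) to i64)` recorded, the AddRec of the phi and the AddRec of the cast compare equal; without it they do not.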

There was another bit of a hurdle in the unsigned version of this test (doit3 in the testcase):

The IR pattern is the following:

V0: %p.09 = phi (0, %add)
V1: %conv = and i32 %p.09, 255
V2: %add = add nsw i32 %conv, %step

And we have:
PSCEV.getSCEV(V0) = {0,+,%step}
PSCEV.getSCEV(V2) = {0,+,(sext i8 (trunc i32 %step to i8) to i32)}

And we are not able to deduce equality of the above because the Equal Predicate that we add in createAddRecFromPHIWithCasts() is:
%step == (zext i8 (trunc i32 %step to i8) to i32)

I guess that's a bug in the Predicate creation (?) in createAddRecFromPHIWithCastsImpl():
"AccumExtended = GetExtendedExpr(Accum)" creates the extended expression with zext, whereas probably the step part should be extended using sext because the overflow check that we add is IncrementNUSW…? Does that make sense?

(when I fix that, then the entire test passes).

Thanks,
Dorit

Hi Silviu,

The new version I uploaded has two main changes:

It fixes (what I think may be) a bug in createAddRecFromPHIWithCastsImpl(), where we add the equality predicate for the unsigned case: The IncrementNUSW overflow predicate that we add in this function allows the rewriter to rewrite this:

(zext i8 {0, + , (trunc i32 step to i8)} to i32)
into
{0, +, (sext i8 (trunc i32 step to i8) to i32)}
But the Equal predicate that we add is:
%step == (zext i8 (trunc i32 %step to i8) to i32).
So the fix changes the Equal predicate to:
%step == (sext i8 (trunc i32 %step to i8) to i32)
(even for the unsigned case).
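
A small numeric illustration of why the step needs sign extension even in the unsigned (mask) case: for a negative step such as -1, the sext form reproduces the step while the zext form does not, although the masked induction itself is well behaved as long as the phi stays within 0..255. The helper names below are hypothetical and the bounds are arbitrary toy limits.

```cpp
#include <cassert>
#include <cstdint>

// zext i8 (trunc i32 X to i8) to i32
inline int32_t zextTrunc8(int32_t X) {
  return static_cast<int32_t>(static_cast<uint8_t>(static_cast<uint32_t>(X)));
}

// sext i8 (trunc i32 X to i8) to i32 (modular narrowing assumed)
inline int32_t sextTrunc8(int32_t X) {
  return static_cast<int32_t>(static_cast<int8_t>(static_cast<uint32_t>(X)));
}

// Runs  %conv = and i32 %p, 255 ; %add = add i32 %conv, %step  for N
// iterations and reports whether the phi followed {Start,+,Step}.
inline bool maskedPhiMatchesAddRec(int32_t Start, int32_t Step, int32_t N) {
  assert(N >= 0 && N <= 1000 && "toy bound");
  assert(Start >= 0 && Start <= 255 && Step >= -128 && Step <= 127);
  int32_t Phi = Start;
  for (int32_t I = 0; I < N; ++I) {
    if (Phi != Start + I * Step)
      return false;
    Phi = (Phi & 255) + Step;
  }
  return true;
}
```

For example, counting down from 200 with step -1 follows the AddRec for 100 iterations, and `sextTrunc8(-1) == -1` holds, whereas `zextTrunc8(-1)` is 255, so a zext-based Equal predicate would reject this step.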

It changes the search for the IR cast-sequence in the spirit of what you proposed: It is moved to LoopUtils.cpp, and it relies on the SCEV of an instruction to be equal (***) to the SCEV of the phi.

I think it may not be entirely as general as you may have envisioned, but generalizing the implementation even further comes with some cost (complexity) which I am not sure that the current limited support justifies. Even the other SCEV patterns that I've seen (which I listed under the TODO of createAddRecFromPHIWithCastsImpl()) would be covered by the current implementation, so I wouldn't like to over-generalize at this point. In any case, this implementation is much more easily extendable than the previous pattern-based approach.
I hope this is close enough to what you had in mind…

(***) For the scev equality, I added the "areAddRecsEqualWithPreds()" utility, which considers the Equal predicates, because the rewriter did not rewrite this:
{0, +, (sext i8 (trunc i32 to i8) to i32)}
into {0, +, %step}.
We could instead extend the rewriter: in the "Sext/Zext" case it would have to check if the SCEV expr at hand conforms to the pattern (ext ix (trunc iy %step to ix) to iy),
and if so, to look for any equality predicates of the form:
(ext ix (trunc iy %step to ix) to iy) == %step.
(basically to call "areAddRecsEqualWithPreds()" there, instead of in the LoopUtils utility).
Do you think we should do that?

Many thanks,
Dorit

(uploaded a fix to LoopUtils:getCastsForInductionPHI())

Hi Dorit,

Thanks for doing the changes! I just have a few comments/questions (see inline).

Silviu

lib/Analysis/ScalarEvolution.cpp
4751	Nice catch, I think you can do a separate commit for this fix.
4815	Could you add a FIXME here? This shouldn't be required, and PSE should return the same expression for both.
4823	Any idea what the complexity of isKnownPredicate() is?
lib/Transforms/Utils/LoopUtils.cpp
889	I guess this is just an optimization for speed? Maybe it's enough to check that SE returns a SCEVUnknown and PSE returns an AddRec?

dorit mentioned this in D40641: [SCEV] Fix wrong Equal predicate created in getAddRecForPhiWithCasts. Nov 30 2017, 1:43 AM

dorit marked 3 inline comments as done. Nov 30 2017, 2:00 AM

dorit added inline comments.

lib/Analysis/ScalarEvolution.cpp
4751	Done - this is https://reviews.llvm.org/D40641.
4815	Yes, I'll do that; "FIXME: This is currently required because the rewriter currently does not rewrite this: {0, +, (sext ix (trunc iy to ix) to iy)} into {0, +, %step}, although there exists an Equal predicate "%step == (sext ix (trunc iy to ix) to iy)".
4823	I guess it can be potentially more complex than looking up Preds.implies... I can move it to be last check (or drop it...?).
lib/Transforms/Utils/LoopUtils.cpp
889	Yes, to make sure we only look at relevant phis... The check you suggest is exactly what the caller to this utility checks before calling this routine...

Hi Ayal,

lib/Transforms/Vectorize/LoopVectorize.cpp
6999	Yes. I thought it's a small enough fix to be included in this patch. I can submit a separate patch for this. This is https://reviews.llvm.org/D40883. thanks, Dorit

Dropped the parts that are uploaded for review separately (D40641, D40883), and hopefully addressed Silviu's last comments.

Ayal mentioned this in D40883: [LV] Ignore the cost of values that will not appear in the vectorized loop. Dec 9 2017, 1:57 PM

This looks good to me, but please wait for Silviu to approve as well before committing.

include/llvm/Analysis/ScalarEvolution.h
1027	clang-format?
lib/Transforms/Vectorize/LoopVectorize.cpp
2615	Early exit instead if (Casts.empty())?
5921	Can be folded into a single `return (Inst && InductionCastsToIgnore.count(Inst));`
6999	Great, thanks.

sbaranga added inline comments. Dec 10 2017, 11:03 AM

lib/Transforms/Utils/LoopUtils.cpp
889	In that case would we ever bail out here? If not, and this doesn't affect correctness, maybe we shouldn't do this check? The problem I have with havePredicatedSCEVRewriteForPHI is that it is looking into the SCEV internal structures while SCEV is doing the analysis lazily. So the answer that it's getting is that either we don't have a better value under some predicates or we haven't performed the analysis at all. So maybe there is an assumption here that createAddRecFromPHIWithCasts was called at some point. From an interface perspective, I think it's better to call createAddRecFromPHIWithCasts if we really require doing this check.
943	nit: formatting, there should be a space before after AR

Addressed Ayal's and Silviu's comments.

dorit marked 5 inline comments as done. Dec 12 2017, 12:46 PM

dorit added inline comments.

lib/Transforms/Utils/LoopUtils.cpp
889	Ok, I see your point. I dropped the (redundant/overly-cautious) call to havePredicatedSCEVRewriteForPHI.

LGTM!

Thanks so much for all your help with this work!

Committed as:
r320672 | dorit | 2017-12-14 09:56:31 +0200 (Thu, 14 Dec 2017) | 27 lines

[LV] Support efficient vectorization of an induction with redundant casts

(I forgot to include the Differential revision number in the commit...)

Just to formally close this review, as it wasn't closed by the commit.

This revision is now accepted and ready to land. Dec 14 2017, 11:33 AM

dorit closed this revision. Dec 14 2017, 10:54 PM

Ayal mentioned this in D115112: [LV] Remove dead IV casts using VPlan (NFC). Dec 12 2021, 11:53 PM

Revision Contents

Path	Size
include/llvm/Analysis/ScalarEvolution.h	9 lines
include/llvm/Transforms/Utils/LoopUtils.h	24 lines
lib/Analysis/ScalarEvolution.cpp	175 lines
lib/Transforms/Utils/LoopUtils.cpp	40 lines
lib/Transforms/Vectorize/LoopVectorize.cpp	95 lines
test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll	216 lines
Diff 121630

include/llvm/Analysis/ScalarEvolution.h

[... 569 lines hidden; context: public: ...]

   /// Checks if \p SymbolicPHI can be rewritten as an AddRecExpr under some
   /// Predicates. If successful return these <AddRecExpr, Predicates>;
   /// The function is intended to be called from PSCEV (the caller will decide
   /// whether to actually add the predicates and carry out the rewrites).
   Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
   createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI);

+  /// Check if \p PhiScev is an integer loop header phi for which PSCEV was
+  /// able to build the AddRecurrence \p AR under some predicates. If so, the
+  /// function returns in \p CastInsts the cast instructions that participate in
+  /// the induction update chain associated with this phi, and that can be
+  /// ignored under these predicates.
+  bool getCastsForInductionPHI(const SCEVUnknown *PhiScev,
+                               const SCEVAddRecExpr *AR,
+                               SmallVectorImpl<Instruction *> &CastInsts);
+
   /// Returns an expression for a GEP
   ///
   /// \p GEP The GEP. The indices contained in the GEP itself are ignored,
   /// instead we use IndexExprs.
   /// \p IndexExprs The expressions for the indices.
   const SCEV *getGEPExpr(GEPOperator *GEP,
                          const SmallVectorImpl<const SCEV *> &IndexExprs);
   const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);

[... 424 lines hidden; context: public: ...]

   /// adding additional predicates to \p Preds as required.
   const SCEVAddRecExpr *convertSCEVToAddRecWithPredicates(
       const SCEV *S, const Loop *L,
       SmallPtrSetImpl<const SCEVPredicate *> &Preds);

 private:
   /// A CallbackVH to arrange for ScalarEvolution to be notified whenever a
   /// Value is deleted.
   class SCEVCallbackVH final : public CallbackVH {
   [Inline comment, Ayal: clang-format?]
     ScalarEvolution *SE;

     void deleted() override;
     void allUsesReplacedWith(Value *New) override;

   public:
     SCEVCallbackVH(Value *V, ScalarEvolution *SE = nullptr);
   };

[... 909 lines hidden ...]

include/llvm/Transforms/Utils/LoopUtils.h

Show First 20 Lines • Show All 300 Lines • ▼ Show 20 Lines	public:
InductionKind getKind() const { return IK; }		InductionKind getKind() const { return IK; }
const SCEV *getStep() const { return Step; }		const SCEV *getStep() const { return Step; }
ConstantInt *getConstIntStepValue() const;		ConstantInt *getConstIntStepValue() const;

/// Returns true if \p Phi is an induction in the loop \p L. If \p Phi is an		/// Returns true if \p Phi is an induction in the loop \p L. If \p Phi is an
/// induction, the induction descriptor \p D will contain the data describing		/// induction, the induction descriptor \p D will contain the data describing
/// this induction. If by some other means the caller has a better SCEV		/// this induction. If by some other means the caller has a better SCEV
/// expression for \p Phi than the one returned by the ScalarEvolution		/// expression for \p Phi than the one returned by the ScalarEvolution
/// analysis, it can be passed through \p Expr.		/// analysis, it can be passed through \p Expr. If the def-use chain
static bool isInductionPHI(PHINode Phi, const Loop L, ScalarEvolution *SE,		/// associated with the phi includes casts (that we know we can ignore
InductionDescriptor &D,		/// under proper runtime checks), they are passed through \p CastsToIgnore.
const SCEV *Expr = nullptr);		static bool
		isInductionPHI(PHINode Phi, const Loop L, ScalarEvolution *SE,
		InductionDescriptor &D, const SCEV *Expr = nullptr,
		SmallVectorImpl<Instruction > CastsToIgnore = nullptr);

/// Returns true if \p Phi is a floating point induction in the loop \p L.		/// Returns true if \p Phi is a floating point induction in the loop \p L.
/// If \p Phi is an induction, the induction descriptor \p D will contain		/// If \p Phi is an induction, the induction descriptor \p D will contain
/// the data describing this induction.		/// the data describing this induction.
static bool isFPInductionPHI(PHINode Phi, const Loop L,		static bool isFPInductionPHI(PHINode Phi, const Loop L,
ScalarEvolution *SE, InductionDescriptor &D);		ScalarEvolution *SE, InductionDescriptor &D);

/// Returns true if \p Phi is a loop \p L induction, in the context associated		/// Returns true if \p Phi is a loop \p L induction, in the context associated
Show All 24 Lines	public:
}		}

/// Returns binary opcode of the induction operator.		/// Returns binary opcode of the induction operator.
Instruction::BinaryOps getInductionOpcode() const {		Instruction::BinaryOps getInductionOpcode() const {
return InductionBinOp ? InductionBinOp->getOpcode() :		return InductionBinOp ? InductionBinOp->getOpcode() :
Instruction::BinaryOpsEnd;		Instruction::BinaryOpsEnd;
}		}

		/// Returns a reference to the type cast instructions in the induction
		/// update chain, that are redundant when guarded with a runtime
		/// SCEV overflow check.
		const SmallVectorImpl<Instruction *> &getCastInsts() const {
		return RedundantCasts;
		}

private:		private:
/// Private constructor - used by \c isInductionPHI.		/// Private constructor - used by \c isInductionPHI.
InductionDescriptor(Value Start, InductionKind K, const SCEV Step,		InductionDescriptor(Value Start, InductionKind K, const SCEV Step,
BinaryOperator *InductionBinOp = nullptr);		BinaryOperator *InductionBinOp = nullptr,
		SmallVectorImpl<Instruction > Casts = nullptr);

/// Start value.		/// Start value.
TrackingVH<Value> StartValue;		TrackingVH<Value> StartValue;
/// Induction kind.		/// Induction kind.
InductionKind IK = IK_NoInduction;		InductionKind IK = IK_NoInduction;
/// Step value.		/// Step value.
const SCEV *Step = nullptr;		const SCEV *Step = nullptr;
// Instruction that advances induction variable.		// Instruction that advances induction variable.
BinaryOperator *InductionBinOp = nullptr;		BinaryOperator *InductionBinOp = nullptr;
		// Instructions used for type-casts of the induction variable,
		// that are redundant when guarded with a runtime SCEV overflow check.
		SmallVector<Instruction *, 2> RedundantCasts;
};		};

BasicBlock InsertPreheaderForLoop(Loop L, DominatorTree DT, LoopInfo LI,		BasicBlock InsertPreheaderForLoop(Loop L, DominatorTree DT, LoopInfo LI,
bool PreserveLCSSA);		bool PreserveLCSSA);

/// Ensure that all exit blocks of the loop are dedicated exits.		/// Ensure that all exit blocks of the loop are dedicated exits.
///		///
/// For any loop exit block with non-loop predecessors, we split the loop		/// For any loop exit block with non-loop predecessors, we split the loop

lib/Analysis/ScalarEvolution.cpp


static const Loop *isIntegerLoopHeaderPHI(const PHINode *PN, LoopInfo &LI) {
if (!PN->getType()->isIntegerTy())		if (!PN->getType()->isIntegerTy())
return nullptr;		return nullptr;
const Loop *L = LI.getLoopFor(PN->getParent());		const Loop *L = LI.getLoopFor(PN->getParent());
if (!L || L->getHeader() != PN->getParent())		if (!L || L->getHeader() != PN->getParent())
return nullptr;		return nullptr;
return L;		return L;
}		}

		/// Helper function for getCastsForInductionPHIImpl.
		/// Given a Value \p CastedPhi, and a PhiNode Value \p PN,
		/// look for the following sequence:
		/// %Ext = shl %PN, m
		/// %CastedPhi = ashr %Ext, m
		/// If found, return true, and insert the intermediate casts in \p CastInsts.
		/// Otherwise, return false.
		bool isAshrShlInductionCastSequence(Value *CastedPhi, Value *PN,
		const DataLayout &DL,
		SmallVectorImpl<Instruction *> &CastInsts) {

		// 1) look for ashr instruction
		auto *AshrInst = dyn_cast<Instruction>(CastedPhi);
		if (!AshrInst || AshrInst->getOpcode() != Instruction::AShr)
		return false;

		Value *Ext = AshrInst->getOperand(0);
		uint64_t ExtSrcBitWidth = DL.getTypeSizeInBits(Ext->getType());
		ConstantInt *AshrAmt = dyn_cast<ConstantInt>(AshrInst->getOperand(1));
		if (!AshrAmt || !AshrAmt->getValue().ult(ExtSrcBitWidth))
		return false;

		// 2) look for the shl instruction
		auto *ShlInst = dyn_cast<Instruction>(Ext);
		if (!ShlInst || ShlInst->getOpcode() != Instruction::Shl ||
		!ShlInst->hasOneUse() || ShlInst->getOperand(0) != PN)
		return false;

		auto *ShlAmt = dyn_cast<ConstantInt>(ShlInst->getOperand(1));
		if (!ShlAmt || ShlAmt->getZExtValue() != AshrAmt->getZExtValue())
		return false;

		// First insert the last instruction from the ExtTrunc cast sequence.
		// The first instruction in the CastInsts set may be used outside of the
		Ayal (Done): Perhaps a clearer way of stating this: "This last instruction (is the only one which) may be used outside of the ExtTrunc cast sequence, so it is placed in position 0 of CastInsts, for special handling later on." Worth asserting that CastInsts is empty before pushing back, to make sure that the above statement holds?
		// ExtTrunc cast sequence, and therefore needs special handling later on.
		CastInsts.push_back(AshrInst);
		CastInsts.push_back(ShlInst);
		return true;
		}
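
As a rough illustration of why the shl/ashr pair matched above acts as a sign-extending truncation, the standalone sketch below compares the two on plain 64-bit integers. It is not LLVM code: the helper names `shlThenAshr` and `sextTrunc` are hypothetical, and it assumes two's-complement conversions and an arithmetic right shift on signed values (guaranteed from C++20, and provided by mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mimics "%t = shl i64 %x, m; %r = ashr i64 %t, m".
// Precondition: M < 64.
int64_t shlThenAshr(int64_t X, unsigned M) {
  // shl: the top M bits are shifted out.
  uint64_t Shifted = static_cast<uint64_t>(X) << M;
  // ashr: the vacated top bits are refilled with copies of the new sign bit.
  return static_cast<int64_t>(Shifted) >> M;
}

// Hypothetical helper: sign-extend the low (64 - M) bits of X,
// i.e. "sext(trunc X)". Precondition: M < 64.
int64_t sextTrunc(int64_t X, unsigned M) {
  unsigned Bits = 64 - M;
  if (Bits == 64)
    return X;
  uint64_t Low = static_cast<uint64_t>(X) & ((uint64_t{1} << Bits) - 1);
  uint64_t SignBit = uint64_t{1} << (Bits - 1);
  // (Low ^ SignBit) - SignBit sign-extends a Bits-wide value.
  return static_cast<int64_t>((Low ^ SignBit) - SignBit);
}
```

With m = 32, both helpers leave values that fit in 32 signed bits unchanged and discard the upper half otherwise. This is the sense in which the pair is redundant once a runtime check rules out overflow.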

		/// Helper function for getCastsForInductionPHIImpl.
		/// Given a Value \p CastedPhi, and a PhiNode Value \p PN,
		/// look for the following sequence:
		/// %CastedPhi = and i64 %PN, 2^n-1
		/// If found, return true, and insert the intermediate cast in \p CastInsts.
		/// Otherwise, return false.
		bool isAndInductionCastSequence(Value *CastedPhi, Value *PN,
		const DataLayout &DL,
		SmallVectorImpl<Instruction *> &CastInsts) {

		auto *AndInst = dyn_cast<Instruction>(CastedPhi);
		if (!AndInst || AndInst->getOpcode() != Instruction::And ||
		AndInst->getOperand(0) != PN)
		return false;

		ConstantInt *Mask = dyn_cast<ConstantInt>(AndInst->getOperand(1));
		if (!Mask || !Mask->getValue().isMask())
		return false;

		CastInsts.push_back(AndInst);
		Ayal (Done): assert empty before push_back?
		return true;
		}
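
The `and` form can be sketched the same way. The snippet below is a standalone illustration, not LLVM code: `isLowBitMask` and `zextTrunc` are hypothetical names, and `isLowBitMask` assumes the mask test accepts exactly the constants of the form 2^n - 1 with n >= 1 (a non-empty run of ones starting at bit 0).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the mask test: true if V is a non-empty run of
// ones starting at the least significant bit, i.e. V == 2^n - 1 with n >= 1.
bool isLowBitMask(uint64_t V) { return V != 0 && (V & (V + 1)) == 0; }

// "and X, 2^N - 1" keeps the low N bits and clears the rest, which is the
// same as truncating to N bits and zero-extending back: "zext(trunc X)".
uint64_t zextTrunc(uint64_t X, unsigned N) {
  return N >= 64 ? X : X & ((uint64_t{1} << N) - 1);
}
```

For example, `and i64 %x, 255` corresponds to `zextTrunc(x, 8)`, while a constant such as 254 is not of the required form and would not be matched.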

		/// Look for the following IR sequence:
		/// %for.body:
		/// %x = phi i64 [ 0, %ph ], [ %add, %for.body ]
		/// %casted_phi = "ExtTrunc i64 %x"
		/// %add = add i64 %casted_phi, %step
		///
		/// where %x is given in \p PN, and where "ExtTrunc i64 %x" can take one of the
		/// following forms:
		/// a) %casted_phi = And i64 %x, 2^n-1
		/// e.g.:
		/// %for.body:
		/// %x = phi i64 [ 0, %ph ], [ %add, %for.body ]
		/// %casted_phi = And i64 %x, 255
		/// %add = add i64 %casted_phi, %step
		///
		/// b) %tmp = shl i64 %x, m
		/// %casted_phi = ashr exact i64 %tmp, m
		/// e.g.:
		/// %for.body:
		/// %x = phi i64 [ 0, %ph ], [ %add, %for.body ]
		/// %sext = shl i64 %x, 32
		/// %casted_phi = ashr exact i64 %sext, 32
		/// %add = add i64 %casted_phi, %step
		///
		/// If found, return true, and insert the intermediate casts in \p CastInsts.
		/// Otherwise, return false.
		bool getCastsForInductionPHIImpl(PHINode *PN, const Loop *L,
		SmallVectorImpl<Instruction *> &CastInsts) {

		// 1) Look for the add instruction that increments the induction via the
		// loop backedge.
		BasicBlock *Latch = L->getLoopLatch();
		if (!Latch)
		return false;
		Value *BEValueV = PN->getIncomingValueForBlock(Latch);
		auto *IndUpdate = dyn_cast<Instruction>(BEValueV);
		if (!IndUpdate || IndUpdate->getOpcode() != Instruction::Add)
		return false;
		Value *Op0 = IndUpdate->getOperand(0);
		Value *Op1 = IndUpdate->getOperand(1);
		Value *CastedPhi = nullptr;
		if (Op0 == PN || Op1 == PN)
		return false;
		if (L->isLoopInvariant(Op0))
		CastedPhi = Op1;
		else if (L->isLoopInvariant(Op1))
		CastedPhi = Op0;
		else
		return false;

		// 2) Look for the ExtTrunc Sequence
		const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();
		return (isAshrShlInductionCastSequence(CastedPhi, PN, DL, CastInsts) ||
		isAndInductionCastSequence(CastedPhi, PN, DL, CastInsts));
		}

		/// If PredicatedSCEVRewrites contains an entry that maps \p PhiScev to \p AR
		/// under a runtime predicate, then we know the following:
		/// (1) The value of the phi at iteration i is:
		/// (Ext ix (Trunc iy ( Start + i*Step ) to ix) to iy)
		/// (2) Under the runtime predicate, the above expression is equal to:
		/// Start + i*Step
		/// where Step is a loop invariant.
		sbaranga (Not Done): (1) might not hold in the future, so this might end up being a problem?
		///
		/// We want to find the cast instructions that are involved in the
		/// update-chain of this induction. A caller that adds the required runtime
		/// predicate can be free to drop these cast instructions, and compute
		/// the phi using (2) above instead of (1).
		///
		/// We look for the following sequence:
		/// x1 = phi (x0, x_next)
		/// x2 = ExtTrunc (x1)
		/// x_next = add (x2, Step)
		///
		/// Where ExtTrunc is the IR sequence that resulted in the SCEV "sext(trunc("
		sbaranga (Not Done): If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1? In that case, wouldn't it be possible to search for values in the loop with the same SCEV as an induction PHI, go through the def-use chain until you've reached the PHI and mark all instructions in between as (part of) a cast of that PHI? I think this wouldn't need any pattern matching and should be straightforward to do.
		Also I think ideally this should all be in a utility function (LoopUtils?). It seems strange to have this in Scalar Evolution. It's very specific to the vectorizer.
		dorit (Author, Not Done):
		> If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1?
		Yes, that should be the case.
		> In that case, wouldn't it be possible to search for values in the loop with the same SCEV as an induction PHI
		Is there a ScalarEvolution map that holds this information (all the Values that have the same SCEV)? There's the ExprValueMap which looks close, but I'm not sure... or do you suggest to really obtain "from scratch" all the values defined in the loop and call getSCEV for each? Would it make more sense to start from the Value that defines the phi from the backedge (the way we start now), and walk the def-use chain from there (calling getSCEV on each value we find on the way, and from the point where we encounter the same SCEV as the induction phi continue the def-use walk while marking instructions on the way as part of a cast)?
		> Also I think ideally this should all be in an utility function (LoopUtils?). It seems a strange to have this in Scalar Evolution. It's very specific to the vectorizer.
		Ok, sure, will move it.
		dorit (Author, Not Done):
		> > If I understand correctly the SCEV expression returned by PSE for x2 should be the same as for x1?
		> Yes, that should be the case
		Turns out that's not the case. For example, for this def-use chain:
		V0: %w_ix.011 = phi (0, %add)
		V1: %sext = shl i64 %w_ix.011, 32
		V2: %idxprom = ashr exact i64 %sext, 32
		V3: %add = add i64 %idxprom, %step
		The SCEV expr of the Phi is {0,+,%step}<%for.body>.
		The SCEV expr of V2 is (sext i32 (trunc i64 %w_ix.011 to i32) to i64) (this is what PSCEV.getSCEV() returns).
		sbaranga (Not Done): Right... I think the SCEV rewriter isn't replacing w_ix.011 with {0, +, step} (not taking into account the equals predicate). Ideally we would be making the following transformations:
		(sext i32 (trunc i64 %w_ix.011 to i32) to i64)
		-> (equals predicate, replace %w_ix)
		(sext i32 (trunc i64 {0, +, step} to i32) to i64)
		-> (fold trunc)
		(sext i32 {0, +, trunc i64 step to i32} to i64)
		-> (no overflow predicate)
		{0, +, sext i32 (trunc i64 step to i32)}
		-> (equals predicate, replace sext i32 (trunc i64 step to i32) with step)
		{0, +, step}
		/// or "zext(trunc" expression. This can be one of several patterns; We look
		/// for one of the following two patterns:
		/// ExtTrunc1:
		/// Cast: %x2 = and %x1, 2^n-1
		/// ExtTrunc2:
		/// Cast: %t = shl %x1, m
		/// Cast: %x2 = ashr %t, m
		///
		/// If we are able to find this sequence, we return the "Cast:" instructions
		sbaranga (Not Done): Technically this would be part of a cast operation, and not a cast itself? (shl changes the value).
		dorit (Author, Not Done): Yes, sure. It's a "Cast-Sequence Member". I'll drop the "Cast:" from the comment.
		/// from the pattern we found.
		/// (TODO: Check for more IR induction patterns that can result in the SCEV
		/// expression in (1) above.)
		bool ScalarEvolution::getCastsForInductionPHI(
		const SCEVUnknown *PhiScev, const SCEVAddRecExpr *AR,
		SmallVectorImpl<Instruction *> &CastInsts) {

		auto *PN = cast<PHINode>(PhiScev->getValue());
		const Loop *L = isIntegerLoopHeaderPHI(PN, LI);
		if (!L)
		return false;

		auto I = PredicatedSCEVRewrites.find({PhiScev, L});
		if (I == PredicatedSCEVRewrites.end())
		return false;
		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>> Rewrite =
		I->second;
		assert(isa<SCEVAddRecExpr>(Rewrite.first) && "Expected an AddRec");
		if (Rewrite.first != AR)
		return false;

		// PhiScev was found in PredicatedSCEVRewrites, and it is mapped to \p AR
		// under some runtime tests. Find the cast instructions that participate in
		// the def-use chain of PhiScev in the loop.
		//
		return getCastsForInductionPHIImpl(PN, L, CastInsts);
		}
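
To make points (1) and (2) in the comment above concrete, here is a small standalone simulation, offered only as a sketch. It is not part of the patch: `sextTrunc32` and `castsAreRedundant` are hypothetical names, the cast is fixed to a 64-to-32-bit sign-extending truncation, and out-of-range narrowing is assumed to wrap (guaranteed from C++20). It steps the phi update chain with the cast in place and checks each value against the closed form Start + i*Step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: "sext i32 (trunc i64 X to i32) to i64".
int64_t sextTrunc32(int64_t X) {
  return static_cast<int64_t>(static_cast<int32_t>(X));
}

// Simulates:  x1 = phi(Start, x_next); x2 = ExtTrunc(x1); x_next = x2 + Step
// and reports whether x1 equals Start + i*Step on each of the first Iters
// iterations, i.e. whether the cast could have been dropped.
bool castsAreRedundant(int64_t Start, int64_t Step, unsigned Iters) {
  int64_t X = Start; // value of the phi on iteration i
  for (unsigned I = 0; I < Iters; ++I) {
    int64_t ClosedForm = Start + static_cast<int64_t>(I) * Step;
    if (X != ClosedForm)
      return false;
    X = sextTrunc32(X) + Step; // update chain with the cast in place
  }
  return true;
}
```

The two forms agree as long as every value fits in 32 signed bits, and diverge shortly after the narrow value wraps. The runtime predicate plays the role of ruling out the second case before the vector loop is entered.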

// Analyze \p SymbolicPHI, a SCEV expression of a phi node, and check if the		// Analyze \p SymbolicPHI, a SCEV expression of a phi node, and check if the
// computation that updates the phi follows the following pattern:		// computation that updates the phi follows the following pattern:
// (SExt/ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum		// (SExt/ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum
// which correspond to a phi->trunc->sext/zext->add->phi update chain.		// which correspond to a phi->trunc->sext/zext->add->phi update chain.
// If so, try to see if it can be rewritten as an AddRecExpr under some		// If so, try to see if it can be rewritten as an AddRecExpr under some
// Predicates. If successful, return them as a pair. Also cache the results		// Predicates. If successful, return them as a pair. Also cache the results
// of the analysis.		// of the analysis.
//		//
ScalarEvolution::createAddRecFromPHIWithCastsImpl(const SCEVUnknown *SymbolicPHI) {
const SCEV *StartExtended = GetExtendedExpr(StartVal);		const SCEV *StartExtended = GetExtendedExpr(StartVal);
if (PredIsKnownFalse(StartVal, StartExtended)) {		if (PredIsKnownFalse(StartVal, StartExtended)) {
DEBUG(dbgs() << "P2 is compile-time false\n";);		DEBUG(dbgs() << "P2 is compile-time false\n";);
return None;		return None;
}		}

const SCEV *AccumExtended = GetExtendedExpr(Accum);		const SCEV *AccumExtended = GetExtendedExpr(Accum);
if (PredIsKnownFalse(Accum, AccumExtended)) {		if (PredIsKnownFalse(Accum, AccumExtended)) {
DEBUG(dbgs() << "P3 is compile-time false\n";);		DEBUG(dbgs() << "P3 is compile-time false\n";);
		sbaranga (Done): Nice catch, I think you can do a separate commit for this fix.
		dorit (Author, Not Done): Done - this is https://reviews.llvm.org/D40641.
return None;		return None;
}		}

auto AppendPredicate = [&](const SCEV *Expr,		auto AppendPredicate = [&](const SCEV *Expr,
const SCEV *ExtendedExpr) -> void {		const SCEV *ExtendedExpr) -> void {
if (Expr != ExtendedExpr &&		if (Expr != ExtendedExpr &&
!isKnownPredicate(ICmpInst::ICMP_EQ, Expr, ExtendedExpr)) {		!isKnownPredicate(ICmpInst::ICMP_EQ, Expr, ExtendedExpr)) {
const SCEVPredicate *Pred = getEqualPredicate(Expr, ExtendedExpr);		const SCEVPredicate *Pred = getEqualPredicate(Expr, ExtendedExpr);
ScalarEvolution::createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI) {
if (!Rewrite) {		if (!Rewrite) {
SmallVector<const SCEVPredicate *, 3> Predicates;		SmallVector<const SCEVPredicate *, 3> Predicates;
PredicatedSCEVRewrites[{SymbolicPHI, L}] = {SymbolicPHI, Predicates};		PredicatedSCEVRewrites[{SymbolicPHI, L}] = {SymbolicPHI, Predicates};
return None;		return None;
}		}

return Rewrite;		return Rewrite;
}		}

		sbaranga (Done): Could you add a FIXME here? This shouldn't be required, and PSE should return the same expression for both.
		dorit (Author, Not Done): Yes, I'll do that; "FIXME: This is currently required because the rewriter currently does not rewrite this: {0, +, (sext ix (trunc iy to ix) to iy)} into {0, +, %step}, although there exists an Equal predicate "%step == (sext ix (trunc iy to ix) to iy)".
/// A helper function for createAddRecFromPHI to handle simple cases.		/// A helper function for createAddRecFromPHI to handle simple cases.
///		///
/// This function tries to find an AddRec expression for the simplest (yet most		/// This function tries to find an AddRec expression for the simplest (yet most
/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).		/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).
/// If it fails, createAddRecFromPHI will use a more general, but slow,		/// If it fails, createAddRecFromPHI will use a more general, but slow,
/// technique for finding the AddRec expression.		/// technique for finding the AddRec expression.
const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,		const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,
Value *BEValueV,		Value *BEValueV,
		sbaranga (Done): Any idea what the complexity of isKnownPredicate() is?
		dorit (Author, Not Done): I guess it can be potentially more complex than looking up Preds.implies... I can move it to be the last check (or drop it...?).
Value *StartValueV) {		Value *StartValueV) {
const Loop *L = LI.getLoopFor(PN->getParent());		const Loop *L = LI.getLoopFor(PN->getParent());
assert(L && L->getHeader() == PN->getParent());		assert(L && L->getHeader() == PN->getParent());
assert(BEValueV && StartValueV);		assert(BEValueV && StartValueV);

auto BO = MatchBinaryOp(BEValueV, DT);		auto BO = MatchBinaryOp(BEValueV, DT);
if (!BO)		if (!BO)
return nullptr;		return nullptr;

lib/Transforms/Utils/LoopUtils.cpp

Value *RecurrenceDescriptor::createMinMaxOp(IRBuilder<> &Builder,
else		else
Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");		Cmp = Builder.CreateICmp(P, Left, Right, "rdx.minmax.cmp");

Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");		Value *Select = Builder.CreateSelect(Cmp, Left, Right, "rdx.minmax.select");
return Select;		return Select;
}		}

InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,		InductionDescriptor::InductionDescriptor(Value *Start, InductionKind K,
const SCEV *Step, BinaryOperator *BOp)		const SCEV *Step, BinaryOperator *BOp,
		SmallVectorImpl<Instruction *> *Casts)
: StartValue(Start), IK(K), Step(Step), InductionBinOp(BOp) {		: StartValue(Start), IK(K), Step(Step), InductionBinOp(BOp) {
assert(IK != IK_NoInduction && "Not an induction");		assert(IK != IK_NoInduction && "Not an induction");

// Start value type should match the induction kind and the value		// Start value type should match the induction kind and the value
// itself should not be null.		// itself should not be null.
assert(StartValue && "StartValue is null");		assert(StartValue && "StartValue is null");
assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&		assert((IK != IK_PtrInduction || StartValue->getType()->isPointerTy()) &&
"StartValue is not a pointer for pointer induction");		"StartValue is not a pointer for pointer induction");
assert((IK == IK_FpInduction || Step->getType()->isIntegerTy()) &&
"StepValue is not an integer");		"StepValue is not an integer");

assert((IK != IK_FpInduction || Step->getType()->isFloatingPointTy()) &&		assert((IK != IK_FpInduction || Step->getType()->isFloatingPointTy()) &&
"StepValue is not FP for FpInduction");		"StepValue is not FP for FpInduction");
assert((IK != IK_FpInduction || (InductionBinOp &&		assert((IK != IK_FpInduction || (InductionBinOp &&
(InductionBinOp->getOpcode() == Instruction::FAdd ||		(InductionBinOp->getOpcode() == Instruction::FAdd ||
InductionBinOp->getOpcode() == Instruction::FSub))) &&		InductionBinOp->getOpcode() == Instruction::FSub))) &&
"Binary opcode should be specified for FP induction");		"Binary opcode should be specified for FP induction");

		if (Casts) {
		for (auto &Inst : *Casts) {
		RedundantCasts.push_back(Inst);
		}
		}
}		}

int InductionDescriptor::getConsecutiveDirection() const {		int InductionDescriptor::getConsecutiveDirection() const {
ConstantInt *ConstStep = getConstIntStepValue();		ConstantInt *ConstStep = getConstIntStepValue();
if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))		if (ConstStep && (ConstStep->isOne() || ConstStep->isMinusOne()))
return ConstStep->getSExtValue();		return ConstStep->getSExtValue();
return 0;		return 0;
}		}
bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,
if (Assume && !AR)		if (Assume && !AR)
AR = PSE.getAsAddRec(Phi);		AR = PSE.getAsAddRec(Phi);

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");		DEBUG(dbgs() << "LV: PHI is not a poly recurrence.\n");
return false;		return false;
}		}

return isInductionPHI(Phi, TheLoop, PSE.getSE(), D, AR);		// Record any Cast instructions that participate in the induction update
		Ayal (Done): Set CastInsts to nullptr or the address of a SmallVector created earlier depending on the conditions below, or invoke isInductionPHI() once using the default nullptr for its last parameter, or else with such an address.
}		SmallVector<Instruction *, 2> *CastInsts = nullptr;
		const auto *SymbolicPhi = dyn_cast<SCEVUnknown>(PhiScev);
bool InductionDescriptor::isInductionPHI(PHINode *Phi, const Loop *TheLoop,		// If we started from an UnknownSCEV, and managed to build an addRecurrence
ScalarEvolution *SE,		// only after enabling Assume with PSCEV, this means we may have encountered
InductionDescriptor &D,		// cast instructions that required adding a runtime check in order to
const SCEV *Expr) {		// guarantee the correctness of the AddRecurrence representation of the
		// induction.
		if (PhiScev != AR && SymbolicPhi) {
		SmallVector<Instruction *, 2> Casts;
		sbaranga (Done): I guess this is just an optimization for speed? Maybe it's enough to check that SE returns a SCEVUnknown and PSE returns an AddRec?
		dorit (Author, Not Done): Yes, to make sure we only look at relevant phis... The check you suggest is exactly what the caller to this utility checks before calling this routine...
		sbaranga (Done): In that case would we ever bail out here? If not, and this doesn't affect correctness, maybe we shouldn't do this check?
		The problem I have with havePredicatedSCEVRewriteForPHI is that it is looking into the SCEV internal structures while SCEV is doing the analysis lazily. So the answer that it's getting is that either we don't have a better value under some predicates or we haven't performed the analysis at all. So maybe there is an assumption here that createAddRecFromPHIWithCasts was called at some point.
		From an interface perspective, I think it's better to call createAddRecFromPHIWithCasts if we really require doing this check.
		dorit (Author, Not Done): Ok, I see your point. I dropped the (redundant/overly-cautious) call to havePredicatedSCEVRewriteForPHI.
		if (PSE.getSE()->getCastsForInductionPHI(SymbolicPhi, AR, Casts))
		CastInsts = &Casts;
		}

		return isInductionPHI(Phi, TheLoop, PSE.getSE(), D, AR, CastInsts);
		}

		bool InductionDescriptor::isInductionPHI(
		PHINode *Phi, const Loop *TheLoop, ScalarEvolution *SE,
		InductionDescriptor &D, const SCEV *Expr,
		SmallVectorImpl<Instruction *> *CastsToIgnore) {
Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
// We only handle integer and pointer inductions variables.		// We only handle integer and pointer inductions variables.
if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())		if (!PhiTy->isIntegerTy() && !PhiTy->isPointerTy())
return false;		return false;

// Check that the PHI is consecutive.		// Check that the PHI is consecutive.
const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);		const SCEV *PhiScev = Expr ? Expr : SE->getSCEV(Phi);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PhiScev);
bool InductionDescriptor::isInductionPHI(
const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);
// Calculate the pointer stride and check if it is consecutive.		// Calculate the pointer stride and check if it is consecutive.
// The stride may be a constant or a loop invariant integer value.		// The stride may be a constant or a loop invariant integer value.
const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);		const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);
if (!ConstStep && !SE->isLoopInvariant(Step, TheLoop))		if (!ConstStep && !SE->isLoopInvariant(Step, TheLoop))
return false;		return false;

if (PhiTy->isIntegerTy()) {		if (PhiTy->isIntegerTy()) {
D = InductionDescriptor(StartValue, IK_IntInduction, Step);		D = InductionDescriptor(StartValue, IK_IntInduction, Step, nullptr,
		CastsToIgnore);
		Ayal (Done): Suggest to keep the 'nullptr' parameter as the last argument and continue to use its default, or document /* what-this-nullptr-is */.
return true;		return true;
}		}

assert(PhiTy->isPointerTy() && "The PHI must be a pointer");		assert(PhiTy->isPointerTy() && "The PHI must be a pointer");
// Pointer induction should be a constant.		// Pointer induction should be a constant.
if (!ConstStep)		if (!ConstStep)
return false;		return false;

ConstantInt *CV = ConstStep->getValue();		ConstantInt *CV = ConstStep->getValue();
Type *PointerElementType = PhiTy->getPointerElementType();		Type *PointerElementType = PhiTy->getPointerElementType();
		sbaranga (Done): nit: formatting, there should be a space before after AR
// The pointer stride cannot be determined if the pointer element type is not		// The pointer stride cannot be determined if the pointer element type is not
// sized.		// sized.
if (!PointerElementType->isSized())		if (!PointerElementType->isSized())
return false;		return false;

const DataLayout &DL = Phi->getModule()->getDataLayout();		const DataLayout &DL = Phi->getModule()->getDataLayout();
int64_t Size = static_cast<int64_t>(DL.getTypeAllocSize(PointerElementType));		int64_t Size = static_cast<int64_t>(DL.getTypeAllocSize(PointerElementType));
if (!Size)		if (!Size)

lib/Transforms/Vectorize/LoopVectorize.cpp


protected:

/// Returns true if an instruction \p I should be scalarized instead of		/// Returns true if an instruction \p I should be scalarized instead of
/// vectorized for the chosen vectorization factor.		/// vectorized for the chosen vectorization factor.
bool shouldScalarizeInstruction(Instruction *I) const;		bool shouldScalarizeInstruction(Instruction *I) const;

/// Returns true if we should generate a scalar version of \p IV.		/// Returns true if we should generate a scalar version of \p IV.
bool needsScalarInduction(Instruction *IV) const;		bool needsScalarInduction(Instruction *IV) const;

		/// If there is a cast involved in the induction variable \p ID, which should
		/// be ignored in the vectorized loop body, this function records the
		/// VectorLoopValue of the respective Phi also as the VectorLoopValue of the
		/// cast. We had already proved that the casted Phi is equal to the uncasted
		/// Phi in the vectorized loop (under a runtime guard), and therefore
		/// there is no need to vectorize the cast - the same value can be used in the
		/// vector loop for both the Phi and the cast.
		/// If \p VectorLoopValue is a scalarized value, \p Lane is also specified,
		/// Otherwise, \p VectorLoopValue is a widened/vectorized value.
		void recordVectorLoopValueForInductionCast (const InductionDescriptor &ID,
		unsigned Part,
		Value *VectorLoopValue,
		unsigned Lane = UINT_MAX);
		Ayal (Done): Placing 'Part' next to 'Lane' seems more logical.

/// Generate a shuffle sequence that will reverse the vector Vec.		/// Generate a shuffle sequence that will reverse the vector Vec.
virtual Value *reverseVector(Value *Vec);		virtual Value *reverseVector(Value *Vec);

/// Returns (and creates if needed) the original loop trip count.		/// Returns (and creates if needed) the original loop trip count.
Value *getOrCreateTripCount(Loop *NewLoop);		Value *getOrCreateTripCount(Loop *NewLoop);

/// Returns (and creates if needed) the trip count of the widened loop.		/// Returns (and creates if needed) the trip count of the widened loop.
Value *getOrCreateVectorTripCount(Loop *NewLoop);		Value *getOrCreateVectorTripCount(Loop *NewLoop);
public:
DenseMap<Instruction *, Instruction *> &getSinkAfter() { return SinkAfter; }		DenseMap<Instruction *, Instruction *> &getSinkAfter() { return SinkAfter; }

/// Returns the widest induction type.		/// Returns the widest induction type.
Type *getWidestInductionType() { return WidestIndTy; }		Type *getWidestInductionType() { return WidestIndTy; }

/// Returns True if V is an induction variable in this loop.		/// Returns True if V is an induction variable in this loop.
bool isInductionVariable(const Value *V);		bool isInductionVariable(const Value *V);

		/// Returns True if V is a casted induction variable in this loop, that had
		/// been proven to be redundant (possible under runtime guards) and can be
		/// ignored when creating the vectorized loop body.
		bool isCastedInductionVariable(Value *V);
		Ayal (Done): The name "isCastedInductionVariable()" may be confused with "isInductionVariable()" - the latter deals with a header Phi, whereas the former deals with a redundant Cast. (The Phi feeding an isCastedInductionVariable() answers true to isInductionVariable(), right?) Perhaps have "isInductionPhi()", and change "isInductionVariable()" to answer true for both Phi and Cast Instructions of an IV?

/// Returns True if PN is a reduction variable in this loop.		/// Returns True if PN is a reduction variable in this loop.
bool isReductionVariable(PHINode *PN) { return Reductions.count(PN); }		bool isReductionVariable(PHINode *PN) { return Reductions.count(PN); }

/// Returns True if Phi is a first-order recurrence in this loop.		/// Returns True if Phi is a first-order recurrence in this loop.
bool isFirstOrderRecurrence(const PHINode *Phi);		bool isFirstOrderRecurrence(const PHINode *Phi);

/// Return true if the block BB needs to be predicated in order for the loop		/// Return true if the block BB needs to be predicated in order for the loop
/// to be vectorized.		/// to be vectorized.
private:
/// Holds the reduction variables.		/// Holds the reduction variables.
ReductionList Reductions;		ReductionList Reductions;

/// Holds all of the induction variables that we found in the loop.		/// Holds all of the induction variables that we found in the loop.
/// Notice that inductions don't need to start at zero and that induction		/// Notice that inductions don't need to start at zero and that induction
/// variables can be pointers.		/// variables can be pointers.
InductionList Inductions;		InductionList Inductions;

		/// Holds all the casts that participate in the update chain of the induction
		/// variables, and that have been proven to be redundant (possibly under a
		/// runtime guard). These casts can be ignored when creating the vectorized
		/// loop body.
		SmallPtrSet<Instruction *, 4> InductionCastsToIgnore;

/// Holds the phi nodes that are first-order recurrences.		/// Holds the phi nodes that are first-order recurrences.
RecurrenceSet FirstOrderRecurrences;		RecurrenceSet FirstOrderRecurrences;

/// Holds instructions that need to sink past other instructions to handle		/// Holds instructions that need to sink past other instructions to handle
/// first-order recurrences.		/// first-order recurrences.
DenseMap<Instruction *, Instruction *> SinkAfter;		DenseMap<Instruction *, Instruction *> SinkAfter;

/// Holds the widest induction type encountered.		/// Holds the widest induction type encountered.
[753 lines not shown]	void InnerLoopVectorizer::createVectorIntOrFpInductionPHI(

// We may need to add the step a number of times, depending on the unroll		// We may need to add the step a number of times, depending on the unroll
// factor. The last of those goes into the PHI.		// factor. The last of those goes into the PHI.
PHINode *VecInd = PHINode::Create(SteppedStart->getType(), 2, "vec.ind",		PHINode *VecInd = PHINode::Create(SteppedStart->getType(), 2, "vec.ind",
&*LoopVectorBody->getFirstInsertionPt());		&*LoopVectorBody->getFirstInsertionPt());
Instruction *LastInduction = VecInd;		Instruction *LastInduction = VecInd;
for (unsigned Part = 0; Part < UF; ++Part) {		for (unsigned Part = 0; Part < UF; ++Part) {
VectorLoopValueMap.setVectorValue(EntryVal, Part, LastInduction);		VectorLoopValueMap.setVectorValue(EntryVal, Part, LastInduction);
		recordVectorLoopValueForInductionCast(II, Part, LastInduction);
if (isa<TruncInst>(EntryVal))		if (isa<TruncInst>(EntryVal))
addMetadata(LastInduction, EntryVal);		addMetadata(LastInduction, EntryVal);
LastInduction = cast<Instruction>(addFastMathFlag(		LastInduction = cast<Instruction>(addFastMathFlag(
Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add")));		Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add")));
}		}

// Move the last step to the end of the latch block. This ensures consistent		// Move the last step to the end of the latch block. This ensures consistent
// placement of all induction updates.		// placement of all induction updates.
[17 lines not shown]	if (shouldScalarizeInstruction(IV))
return true;		return true;
auto isScalarInst = [&](User *U) -> bool {		auto isScalarInst = [&](User *U) -> bool {
auto *I = cast<Instruction>(U);		auto *I = cast<Instruction>(U);
return (OrigLoop->contains(I) && shouldScalarizeInstruction(I));		return (OrigLoop->contains(I) && shouldScalarizeInstruction(I));
};		};
return llvm::any_of(IV->users(), isScalarInst);		return llvm::any_of(IV->users(), isScalarInst);
}		}

		void InnerLoopVectorizer::recordVectorLoopValueForInductionCast(
		const InductionDescriptor &ID, unsigned Part, Value *VectorLoopVal,
		unsigned Lane) {
		const SmallVectorImpl<Instruction *> &Casts = ID.getCastInsts();
		if (!Casts.empty()) {
		Ayal [Done]: Early exit instead if (Casts.empty())?
		// Only the first Cast instruction in the Casts vector is of interest.
		// The rest of the Casts (if exist) have no uses outside the
		// induction update chain itself.
		Instruction *CastInst = *Casts.begin();
		if (Lane < UINT_MAX)
		VectorLoopValueMap.setScalarValue(CastInst, {Part, Lane}, VectorLoopVal);
		else
		VectorLoopValueMap.setVectorValue(CastInst, Part, VectorLoopVal);
		}
		}
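
A minimal standalone sketch of the early-exit shape suggested in the inline comment above. It does not use the LLVM API: `Instr`, `Val`, `ValueMapModel` and `recordCastValue` are hypothetical stand-ins for `Instruction`, `Value`, `VectorLoopValueMap` and `recordVectorLoopValueForInductionCast`, chosen only to show the control flow. The assumption carried over from the patch is that only the first entry of the cast list needs a mapping, and that `UINT_MAX` as the lane means "vector value".

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for llvm::Instruction and llvm::Value.
struct Instr { int Id; };
using Val = int;

// Hypothetical, simplified model of VectorLoopValueMap.
struct ValueMapModel {
  // Keyed by (instruction, part).
  std::map<std::pair<const Instr *, unsigned>, Val> Vector;
  // Keyed by (instruction, (part, lane)).
  std::map<std::pair<const Instr *, std::pair<unsigned, unsigned>>, Val> Scalar;
};

// Map the first recorded cast to the value already chosen for the phi.
// Returns immediately when the induction has no redundant casts.
void recordCastValue(ValueMapModel &Map,
                     const std::vector<const Instr *> &Casts, unsigned Part,
                     Val PhiVal, unsigned Lane = UINT_MAX) {
  if (Casts.empty())
    return;

  // Only the first entry is used outside the induction update chain.
  const Instr *Cast = Casts.front();
  assert(Cast && "cast list must not contain null entries");
  if (Lane < UINT_MAX)
    Map.Scalar[std::make_pair(Cast, std::make_pair(Part, Lane))] = PhiVal;
  else
    Map.Vector[std::make_pair(Cast, Part)] = PhiVal;
}
```

Calling `recordCastValue` with an empty cast list leaves both maps untouched; with a lane argument the entry goes to the scalar map, otherwise to the vector map.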

void InnerLoopVectorizer::widenIntOrFpInduction(PHINode *IV, TruncInst *Trunc) {		void InnerLoopVectorizer::widenIntOrFpInduction(PHINode *IV, TruncInst *Trunc) {
assert((IV->getType()->isIntegerTy() || IV != OldInduction) &&		assert((IV->getType()->isIntegerTy() || IV != OldInduction) &&
"Primary induction variable must have an integer type");		"Primary induction variable must have an integer type");

auto II = Legal->getInductionVars()->find(IV);		auto II = Legal->getInductionVars()->find(IV);
assert(II != Legal->getInductionVars()->end() && "IV is not an induction");		assert(II != Legal->getInductionVars()->end() && "IV is not an induction");

auto ID = II->second;		auto ID = II->second;
[64 lines not shown]	void InnerLoopVectorizer::widenIntOrFpInduction(PHINode *IV, TruncInst *Trunc) {
// If we haven't yet vectorized the induction variable, splat the scalar		// If we haven't yet vectorized the induction variable, splat the scalar
// induction variable, and build the necessary step vectors.		// induction variable, and build the necessary step vectors.
if (!VectorizedIV) {		if (!VectorizedIV) {
Value *Broadcasted = getBroadcastInstrs(ScalarIV);		Value *Broadcasted = getBroadcastInstrs(ScalarIV);
for (unsigned Part = 0; Part < UF; ++Part) {		for (unsigned Part = 0; Part < UF; ++Part) {
Value *EntryPart =		Value *EntryPart =
getStepVector(Broadcasted, VF * Part, Step, ID.getInductionOpcode());		getStepVector(Broadcasted, VF * Part, Step, ID.getInductionOpcode());
VectorLoopValueMap.setVectorValue(EntryVal, Part, EntryPart);		VectorLoopValueMap.setVectorValue(EntryVal, Part, EntryPart);
		recordVectorLoopValueForInductionCast(ID, Part, EntryPart);
if (Trunc)		if (Trunc)
addMetadata(EntryPart, Trunc);		addMetadata(EntryPart, Trunc);
}		}
}		}

// If an induction variable is only used for counting loop iterations or		// If an induction variable is only used for counting loop iterations or
// calculating addresses, it doesn't need to be widened. Create scalar steps		// calculating addresses, it doesn't need to be widened. Create scalar steps
// that can be used by instructions we will later scalarize. Note that the		// that can be used by instructions we will later scalarize. Note that the
[91 lines not shown]	unsigned Lanes =
: VF;		: VF;
// Compute the scalar steps and save the results in VectorLoopValueMap.		// Compute the scalar steps and save the results in VectorLoopValueMap.
for (unsigned Part = 0; Part < UF; ++Part) {		for (unsigned Part = 0; Part < UF; ++Part) {
for (unsigned Lane = 0; Lane < Lanes; ++Lane) {		for (unsigned Lane = 0; Lane < Lanes; ++Lane) {
auto *StartIdx = getSignedIntOrFpConstant(ScalarIVTy, VF * Part + Lane);		auto *StartIdx = getSignedIntOrFpConstant(ScalarIVTy, VF * Part + Lane);
auto *Mul = addFastMathFlag(Builder.CreateBinOp(MulOp, StartIdx, Step));		auto *Mul = addFastMathFlag(Builder.CreateBinOp(MulOp, StartIdx, Step));
auto *Add = addFastMathFlag(Builder.CreateBinOp(AddOp, ScalarIV, Mul));		auto *Add = addFastMathFlag(Builder.CreateBinOp(AddOp, ScalarIV, Mul));
VectorLoopValueMap.setScalarValue(EntryVal, {Part, Lane}, Add);		VectorLoopValueMap.setScalarValue(EntryVal, {Part, Lane}, Add);
		recordVectorLoopValueForInductionCast(ID, Part, Add, Lane);
}		}
}		}
}		}

int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {		int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {
const ValueToValueMap &Strides = getSymbolicStrides() ? *getSymbolicStrides() :		const ValueToValueMap &Strides = getSymbolicStrides() ? *getSymbolicStrides() :
ValueToValueMap();		ValueToValueMap();

[2,436 lines not shown]	if (!AllowedExit.count(Inst))
}		}
return false;		return false;
}		}

void LoopVectorizationLegality::addInductionPhi(		void LoopVectorizationLegality::addInductionPhi(
PHINode *Phi, const InductionDescriptor &ID,		PHINode *Phi, const InductionDescriptor &ID,
SmallPtrSetImpl<Value *> &AllowedExit) {		SmallPtrSetImpl<Value *> &AllowedExit) {
Inductions[Phi] = ID;		Inductions[Phi] = ID;

		// In case this induction also comes with casts that we know we can ignore
		// in the vectorized loop body, record them here. Only one of these casts
		// (the first in Casts, the last in the actual IR sequence) is needed here,
		// because any other casts, if exist, are not used outside the cast sequence,
		// and will therefore not be queried.
		Ayal [Done]: In other words, all casts could be recorded here for ignoring, but suffices to record only the first.
		const SmallVectorImpl<Instruction *> &Casts = ID.getCastInsts();
		if (!Casts.empty())
		InductionCastsToIgnore.insert(*Casts.begin());

Type *PhiTy = Phi->getType();		Type *PhiTy = Phi->getType();
const DataLayout &DL = Phi->getModule()->getDataLayout();		const DataLayout &DL = Phi->getModule()->getDataLayout();

// Get the widest type.		// Get the widest type.
if (!PhiTy->isFloatingPointTy()) {		if (!PhiTy->isFloatingPointTy()) {
if (!WidestIndTy)		if (!WidestIndTy)
WidestIndTy = convertPointerToIntegerType(DL, PhiTy);		WidestIndTy = convertPointerToIntegerType(DL, PhiTy);
else		else
[622 lines not shown]	bool LoopVectorizationLegality::isInductionVariable(const Value *V) {
Value *In0 = const_cast<Value *>(V);		Value *In0 = const_cast<Value *>(V);
PHINode *PN = dyn_cast_or_null<PHINode>(In0);		PHINode *PN = dyn_cast_or_null<PHINode>(In0);
if (!PN)		if (!PN)
return false;		return false;

return Inductions.count(PN);		return Inductions.count(PN);
}		}

		bool LoopVectorizationLegality::isCastedInductionVariable(Value *V) {
		auto *Inst = dyn_cast<Instruction>(V);
		if (!Inst)
		return false;

		return InductionCastsToIgnore.count(Inst);
		Ayal [Done]: Can be folded into a single `return (Inst && InductionCastsToIgnore.count(Inst));`
		}
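
A small standalone sketch of the folded form suggested in the inline comment above. The types here are hypothetical stand-ins, not LLVM classes: `dynamic_cast` plays the role of `dyn_cast`, and a `std::set` plays the role of the `SmallPtrSet`. It only illustrates that a non-instruction value and an instruction outside the set both answer false through the single return.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for the llvm::Value hierarchy.
struct Value { virtual ~Value() = default; };
struct Instruction : Value {};
struct Argument : Value {}; // a Value that is not an Instruction

// Hypothetical model of the relevant part of LoopVectorizationLegality.
struct LegalityModel {
  std::set<const Instruction *> InductionCastsToIgnore;

  // Folded form: the cast check and the set lookup in one return.
  bool isCastedInductionVariable(const Value *V) const {
    const auto *Inst = dynamic_cast<const Instruction *>(V);
    return Inst && InductionCastsToIgnore.count(Inst);
  }
};

// Convenience check used only by this sketch.
inline bool modelSelfCheck() {
  Instruction Cast;
  LegalityModel L;
  L.InductionCastsToIgnore.insert(&Cast);
  assert(L.isCastedInductionVariable(&Cast));
  return true;
}
```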

bool LoopVectorizationLegality::isFirstOrderRecurrence(const PHINode *Phi) {		bool LoopVectorizationLegality::isFirstOrderRecurrence(const PHINode *Phi) {
return FirstOrderRecurrences.count(Phi);		return FirstOrderRecurrences.count(Phi);
}		}

bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) {		bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) {
return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);		return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);
}		}

[1,057 lines not shown]	for (Instruction &I : *BB) {
// Skip dbg intrinsics.		// Skip dbg intrinsics.
if (isa<DbgInfoIntrinsic>(I))		if (isa<DbgInfoIntrinsic>(I))
continue;		continue;

// Skip ignored values.		// Skip ignored values.
if (ValuesToIgnore.count(&I))		if (ValuesToIgnore.count(&I))
continue;		continue;

		if (VF > 1 && VecValuesToIgnore.count(&I))
		continue;

		Ayal [Not Done]: VecValuesToIgnore holds Instructions whose cost should be ignored only if widened, and for VF's where they are actually widened, IIUC their use in calculateRegisterUsage(). Should the redundant IV casts be added to ValuesToIgnore instead, being redundant in either scalar and vector types? In any case, having expectedCost() neglect the cost of VecValuesToIgnore may affect cases unrelated to this patch's casted Inductions (related to Reductions), so probably better done in a separate patch.
		dorit (author) [Not Done]:
		> VecValuesToIgnore holds Instructions whose cost should be ignored only if widened, and for VF's where they are actually widened, IIUC their use in calculateRegisterUsage(). Should the redundant IV casts be added to ValuesToIgnore instead, being redundant in either scalar and vector types?
		No, because then we would also ignore the cost of the casts when we calculate the baseline cost (of the scalar loop), and we don't want to do that (that loop will not be guarded by the predicate, and the casts will remain there).
		> In any case, having expectedCost() neglect the cost of VecValuesToIgnore may affect cases unrelated to this patch's casted Inductions (related to Reductions), so probably better done in a separate patch.
		Yes. I thought it's a small enough fix to be included in this patch. I can submit a separate patch for this.
		dorit (author) [Not Done]:
		> Yes. I thought it's a small enough fix to be included in this patch. I can submit a separate patch for this.
		This is https://reviews.llvm.org/D40883.
		thanks, Dorit
		Ayal [Not Done]: Great, thanks.
VectorizationCostTy C = getInstructionCost(&I, VF);		VectorizationCostTy C = getInstructionCost(&I, VF);

// Check if we should override the cost.		// Check if we should override the cost.
if (ForceTargetInstructionCost.getNumOccurrences() > 0)		if (ForceTargetInstructionCost.getNumOccurrences() > 0)
C.first = ForceTargetInstructionCost;		C.first = ForceTargetInstructionCost;

BlockCost.first += C.first;		BlockCost.first += C.first;
BlockCost.second |= C.second;		BlockCost.second |= C.second;
[20 lines not shown]
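
The exchange above about ValuesToIgnore versus VecValuesToIgnore can be illustrated with a toy cost loop. This is only a sketch under stated assumptions: `InstrCost` and `expectedCostModel` are hypothetical names, instructions are identified by strings, and the per-instruction costs are made up. It mirrors the two skip conditions in the hunk above, so that an entry in the vector-only set is still charged at VF = 1 (the scalar baseline, where the casts remain) and skipped for VF > 1.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical instruction record: a name and an invented unit cost.
struct InstrCost {
  std::string Name;
  unsigned Cost;
};

// Toy version of the block-cost loop: skip values ignored everywhere, and
// skip vector-only ignored values when the vectorization factor exceeds 1.
unsigned expectedCostModel(const std::vector<InstrCost> &Body,
                           const std::set<std::string> &ValuesToIgnore,
                           const std::set<std::string> &VecValuesToIgnore,
                           unsigned VF) {
  assert(VF >= 1 && "vectorization factor must be positive");
  unsigned Total = 0;
  for (const InstrCost &I : Body) {
    if (ValuesToIgnore.count(I.Name))
      continue;
    if (VF > 1 && VecValuesToIgnore.count(I.Name))
      continue;
    Total += I.Cost;
  }
  return Total;
}
```

With a body of `shl`, `ashr`, `add`, `store` at one unit each and the two shifts in the vector-only set, the model charges all four at VF = 1 and only `add` and `store` at VF = 8. Moving the shifts to the always-ignored set would also drop them from the VF = 1 baseline, which is the outcome the author argues against.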
/// \brief Gets Address Access SCEV after verifying that the access pattern		/// \brief Gets Address Access SCEV after verifying that the access pattern
/// is loop invariant except the induction variable dependence.		/// is loop invariant except the induction variable dependence.
///		///
/// This SCEV can be sent to the Target in order to estimate the address		/// This SCEV can be sent to the Target in order to estimate the address
/// calculation cost.		/// calculation cost.
static const SCEV *getAddressAccessSCEV(		static const SCEV *getAddressAccessSCEV(
Value *Ptr,		Value *Ptr,
LoopVectorizationLegality *Legal,		LoopVectorizationLegality *Legal,
ScalarEvolution *SE,		PredicatedScalarEvolution &PSE,
const Loop *TheLoop) {		const Loop *TheLoop) {

auto *Gep = dyn_cast<GetElementPtrInst>(Ptr);		auto *Gep = dyn_cast<GetElementPtrInst>(Ptr);
if (!Gep)		if (!Gep)
return nullptr;		return nullptr;

// We are looking for a gep with all loop invariant indices except for one		// We are looking for a gep with all loop invariant indices except for one
// which should be an induction variable.		// which should be an induction variable.
		auto SE = PSE.getSE();
unsigned NumOperands = Gep->getNumOperands();		unsigned NumOperands = Gep->getNumOperands();
for (unsigned i = 1; i < NumOperands; ++i) {		for (unsigned i = 1; i < NumOperands; ++i) {
Value *Opd = Gep->getOperand(i);		Value *Opd = Gep->getOperand(i);
if (!SE->isLoopInvariant(SE->getSCEV(Opd), TheLoop) &&		if (!SE->isLoopInvariant(SE->getSCEV(Opd), TheLoop) &&
!Legal->isInductionVariable(Opd))		!Legal->isInductionVariable(Opd) &&
		!Legal->isCastedInductionVariable(Opd))
return nullptr;		return nullptr;
}		}

// Now we know we have a GEP ptr, %inv, %ind, %inv. return the Ptr SCEV.		// Now we know we have a GEP ptr, %inv, %ind, %inv. return the Ptr SCEV.
return SE->getSCEV(Ptr);		return PSE.getSCEV(Ptr);
}		}

static bool isStrideMul(Instruction *I, LoopVectorizationLegality *Legal) {		static bool isStrideMul(Instruction *I, LoopVectorizationLegality *Legal) {
return Legal->hasStride(I->getOperand(0)) ||		return Legal->hasStride(I->getOperand(0)) ||
Legal->hasStride(I->getOperand(1));		Legal->hasStride(I->getOperand(1));
}		}

unsigned LoopVectorizationCostModel::getMemInstScalarizationCost(Instruction *I,		unsigned LoopVectorizationCostModel::getMemInstScalarizationCost(Instruction *I,
unsigned VF) {		unsigned VF) {
Type *ValTy = getMemInstValueType(I);		Type *ValTy = getMemInstValueType(I);
auto SE = PSE.getSE();		auto SE = PSE.getSE();

unsigned Alignment = getMemInstAlignment(I);		unsigned Alignment = getMemInstAlignment(I);
unsigned AS = getMemInstAddressSpace(I);		unsigned AS = getMemInstAddressSpace(I);
Value *Ptr = getPointerOperand(I);		Value *Ptr = getPointerOperand(I);
Type *PtrTy = ToVectorTy(Ptr->getType(), VF);		Type *PtrTy = ToVectorTy(Ptr->getType(), VF);

// Figure out whether the access is strided and get the stride value		// Figure out whether the access is strided and get the stride value
// if it's known in compile time		// if it's known in compile time
const SCEV *PtrSCEV = getAddressAccessSCEV(Ptr, Legal, SE, TheLoop);		const SCEV *PtrSCEV = getAddressAccessSCEV(Ptr, Legal, PSE, TheLoop);

// Get the cost of the scalar memory instruction and address computation.		// Get the cost of the scalar memory instruction and address computation.
unsigned Cost = VF * TTI.getAddressComputationCost(PtrTy, SE, PtrSCEV);		unsigned Cost = VF * TTI.getAddressComputationCost(PtrTy, SE, PtrSCEV);

Cost += VF *		Cost += VF *
TTI.getMemoryOpCost(I->getOpcode(), ValTy->getScalarType(), Alignment,		TTI.getMemoryOpCost(I->getOpcode(), ValTy->getScalarType(), Alignment,
AS, I);		AS, I);

[537 lines not shown]	void LoopVectorizationCostModel::collectValuesToIgnore() {

// Ignore type-promoting instructions we identified during reduction		// Ignore type-promoting instructions we identified during reduction
// detection.		// detection.
for (auto &Reduction : *Legal->getReductionVars()) {		for (auto &Reduction : *Legal->getReductionVars()) {
RecurrenceDescriptor &RedDes = Reduction.second;		RecurrenceDescriptor &RedDes = Reduction.second;
SmallPtrSetImpl<Instruction *> &Casts = RedDes.getCastInsts();		SmallPtrSetImpl<Instruction *> &Casts = RedDes.getCastInsts();
VecValuesToIgnore.insert(Casts.begin(), Casts.end());		VecValuesToIgnore.insert(Casts.begin(), Casts.end());
}		}
		// Ignore type-casting instructions we identified during induction
		// detection.
		for (auto &Induction : *Legal->getInductionVars()) {
		InductionDescriptor &IndDes = Induction.second;
		const SmallVectorImpl<Instruction *> &Casts = IndDes.getCastInsts();
		VecValuesToIgnore.insert(Casts.begin(), Casts.end());
		}
}		}

LoopVectorizationCostModel::VectorizationFactor		LoopVectorizationCostModel::VectorizationFactor
LoopVectorizationPlanner::plan(bool OptForSize, unsigned UserVF) {		LoopVectorizationPlanner::plan(bool OptForSize, unsigned UserVF) {
// Width 1 means no vectorize, cost 0 means uncomputed cost.		// Width 1 means no vectorize, cost 0 means uncomputed cost.
const LoopVectorizationCostModel::VectorizationFactor NoVectorization = {1U,		const LoopVectorizationCostModel::VectorizationFactor NoVectorization = {1U,
0U};		0U};
Optional<unsigned> MaybeMaxVF = CM.computeMaxVF(OptForSize);		Optional<unsigned> MaybeMaxVF = CM.computeMaxVF(OptForSize);
[86 lines not shown]	void LoopVectorizationPlanner::collectTriviallyDeadInstructions(
// all its users except the induction variable are dead.		// all its users except the induction variable are dead.
for (auto &Induction : *Legal->getInductionVars()) {		for (auto &Induction : *Legal->getInductionVars()) {
PHINode *Ind = Induction.first;		PHINode *Ind = Induction.first;
auto *IndUpdate = cast<Instruction>(Ind->getIncomingValueForBlock(Latch));		auto *IndUpdate = cast<Instruction>(Ind->getIncomingValueForBlock(Latch));
if (llvm::all_of(IndUpdate->users(), [&](User *U) -> bool {		if (llvm::all_of(IndUpdate->users(), [&](User *U) -> bool {
return U == Ind || DeadInstructions.count(cast<Instruction>(U));		return U == Ind || DeadInstructions.count(cast<Instruction>(U));
}))		}))
DeadInstructions.insert(IndUpdate);		DeadInstructions.insert(IndUpdate);

		// We record as "Dead" also the type-casting instructions we had identified
		// during induction analysis. We don't need any handling for them in the
		// vectorized loop because we have proven that, under a proper runtime
		// test guarding the vectorized loop, the value of the phi, and the casted
		// value of the phi, are the same. The last instruction in this casting chain
		// will get its scalar/vector/widened def from the scalar/vector/widened def
		// of the respective phi node. Any other casts in the induction def-use chain
		// have no other uses outside the phi update chain, and will be ignored.
		InductionDescriptor &IndDes = Induction.second;
		const SmallVectorImpl<Instruction *> &Casts = IndDes.getCastInsts();
		DeadInstructions.insert(Casts.begin(), Casts.end());
}		}
}		}

Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }		Value *InnerLoopUnroller::reverseVector(Value *Vec) { return Vec; }

Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }		Value *InnerLoopUnroller::getBroadcastInstrs(Value *V) { return V; }

Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step,		Value *InnerLoopUnroller::getStepVector(Value *Val, int StartIdx, Value *Step,
[1,146 lines not shown]

test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll

				; RUN: opt -S -loop-vectorize -force-vector-width=8 -force-vector-interleave=1 < %s | FileCheck %s -check-prefix=VF8
				; RUN: opt -S -loop-vectorize -force-vector-width=1 -force-vector-interleave=4 < %s | FileCheck %s -check-prefix=VF1

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				; Given a loop with an induction variable which is being
				; truncated/extended using casts that had been proven to
				; be redundant under a runtime test, we want to make sure
				; that these casts do not get vectorized/scalarized/widened.
				; This is the case for inductions whose SCEV expression is
				; of the form "ExtTrunc(%phi) + %step", where "ExtTrunc"
				; can be a result of the IR sequences we check below.
				;
				; See also pr30654.
				;

				; Case1: Check the following induction pattern:
				;
				; %p.09 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.body ]
				; %sext = shl i32 %p.09, 24
				; %conv = ashr exact i32 %sext, 24
				; %add = add nsw i32 %conv, %step
				;
				; This is the case in the following code:
				;
				; void doit3(int n, int step) {
				Ayal [Done]: doit3 >> doit1
				; int i;
				; char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;
				; The "ExtTrunc" IR sequence here is:
				; "%sext = shl i32 %p.09, 24"
				; "%conv = ashr exact i32 %sext, 24"
				; We check that it does not appear in the vector loop body, whether
				; we vectorize or scalarize the induction.

				; VF8-LABEL: @doit1
				; VF8: vector.body:
				; VF8-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl <8 x i32>
				; VF8-NOT: %{{.*}} = ashr exact <8 x i32> %[[TEST]]
				Ayal [Done]: If we first check NOT to have the shl which defines TEST, seems useless to then check that ashr doesn't use TEST. Better check instead that the induction phi in vector.body (the one being stored into a[i]) feeds and consumes its bump directly?
				; VF8: <8 x i32>
				; VF8-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl <8 x i32>
				; VF8-NOT: %{{.*}} = ashr exact <8 x i32> %[[TEST]]
				; VF8: middle.block:

				; VF1-LABEL: @doit1
				; VF1: vector.body:
				; VF1-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl i32
				Ayal [Done]: TEST is set but not used.
				; VF1-NOT: %{{.*}} = ashr exact i32 %[[TEST]]
				; VF1: middle.block:

				@a = common local_unnamed_addr global [250 x i32] zeroinitializer, align 16

				define void @doit1(i32 %n, i32 %step) {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.lr.ph, label %for.end

				for.body.lr.ph:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%p.09 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.body ]
				%sext = shl i32 %p.09, 24
				%conv = ashr exact i32 %sext, 24
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}


				; Case2: Another variant of the above pattern is where the induction variable
				; is used only for address computation (i.e. it is a GEP index) and therefore
				; the induction is not vectorized but rather only the step is widened.
				;
				; This is the case in the following code, where the induction variable 'w_ix'
				; is only used to access the array 'in':
				;
				; void doit3(int *in, int *out, size_t size, size_t step)
				Ayal [Done]: doit3 >> doit2
				; {
				; int w_ix = 0;
				; for (size_t offset = 0; offset < size; ++offset)
				; {
				; int w = in[w_ix];
				; out[offset] = w;
				; w_ix += step;
				; }
				; }
				;
				; The "ExtTrunc" IR sequence here is similar to the previous case:
				; "%sext = shl i64 %w_ix.012, 32
				; %idxprom = ashr exact i64 %sext, 32"
				; We check that it does not appear in the vector loop body, whether
				; we widen or scalarize the induction.

				; VF8-LABEL: @doit2
				; VF8: vector.body:
				; VF8-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl <8 x i64>
				; VF8-NOT: %{{.*}} = ashr exact <8 x i64> %[[TEST]]
				; VF8: <8 x i32>
				; VF8-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl <8 x i64>
				; VF8-NOT: %{{.*}} = ashr exact <8 x i64> %[[TEST]]
				; VF8: middle.block:

				; VF1-LABEL: @doit2
				Ayal [Done]: TEST is set but not used.
				; VF1: vector.body:
				; VF1-NOT: %[[TEST:[a-zA-Z0-9.]+]] = shl i64
				; VF1-NOT: %{{.*}} = ashr i64 %[[TEST]]
				; VF1: middle.block:
				;

				define void @doit2(i32* nocapture readonly %in, i32* nocapture %out, i64 %size, i64 %step) {
				entry:
				%cmp9 = icmp eq i64 %size, 0
				br i1 %cmp9, label %for.cond.cleanup, label %for.body.lr.ph

				for.body.lr.ph:
				br label %for.body

				for.cond.cleanup.loopexit:
				br label %for.cond.cleanup

				for.cond.cleanup:
				ret void

				for.body:
				%w_ix.011 = phi i64 [ 0, %for.body.lr.ph ], [ %add, %for.body ]
				%offset.010 = phi i64 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
				%sext = shl i64 %w_ix.011, 32
				%idxprom = ashr exact i64 %sext, 32
				%arrayidx = getelementptr inbounds i32, i32* %in, i64 %idxprom
				%0 = load i32, i32* %arrayidx, align 4
				%arrayidx1 = getelementptr inbounds i32, i32* %out, i64 %offset.010
				store i32 %0, i32* %arrayidx1, align 4
				%add = add i64 %idxprom, %step
				%inc = add nuw i64 %offset.010, 1
				%exitcond = icmp eq i64 %inc, %size
				br i1 %exitcond, label %for.cond.cleanup.loopexit, label %for.body
				}

				; Case3: Lastly, check also the following induction pattern:
				;
				; %p.09 = phi i32 [ %val0, %scalar.ph ], [ %add, %for.body ]
				; %conv = and i32 %p.09, 255
				; %add = add nsw i32 %conv, %step
				;
				; This is the case in the following code:
				;
				; int a[N];
				; void doit3(int n, int step) {
				; int i;
				; unsigned char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;
				; The "ExtTrunc" IR sequence here is:
				; "%conv = and i32 %p.09, 255".
				; We check that it does not appear in the vector loop body, whether
				; we vectorize or scalarize the induction.

				; VF8-LABEL: @doit3
				; VF8: vector.body:
				; VF8-NOT: %{{.*}} = and <8 x i32>
				; VF8: <8 x i32>
				; VF8-NOT: %{{.*}} = and <8 x i32>
				; VF8: middle.block:

				; VF1-LABEL: @doit3
				; VF1: vector.body:
				; VF1-NOT: %{{.*}} = and i32
				; VF1: middle.block:

				define void @doit3(i32 %n, i32 %step) {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.lr.ph, label %for.end

				for.body.lr.ph:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
				%p.09 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.body ]
				%conv = and i32 %p.09, 255
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
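
The "ExtTrunc" sequences exercised by the three test cases can be modelled in plain C++ to show when the cast is redundant. This is a sketch, not the PSCEV predicate itself: `extTruncSigned`, `extTruncUnsigned` and `castIsRedundant` are hypothetical helpers. The first models `shl 24` followed by `ashr 24` on an i32 (sign-extend the low 8 bits), the second models `and 255`, and the third replays the scalar loop and compares the casted phi value against the cast-free recurrence Start + I * Step.

```cpp
#include <cassert>
#include <cstdint>

// Model of "shl i32 %p, 24; ashr i32 %sext, 24": sign-extend the low 8 bits.
int32_t extTruncSigned(int32_t P) {
  uint32_t Low = static_cast<uint32_t>(P) & 0xFFu;
  return static_cast<int32_t>(Low ^ 0x80u) - 0x80;
}

// Model of "and i32 %p, 255": zero-extend the low 8 bits.
int32_t extTruncUnsigned(int32_t P) { return P & 255; }

// Replays N iterations of the loop "conv = Ext(p); p = conv + Step" and
// reports whether conv always equals the cast-free value Start + I * Step.
bool castIsRedundant(int32_t Start, int32_t Step, int N,
                     int32_t (*Ext)(int32_t)) {
  assert(N >= 0 && "iteration count must be non-negative");
  int32_t P = Start;
  for (int I = 0; I < N; ++I) {
    int32_t Conv = Ext(P);
    if (Conv != Start + I * Step)
      return false;
    P = Conv + Step;
  }
  return true;
}
```

As long as the recurrence stays inside the narrow type's range, the cast changes nothing and the phi behaves like an ordinary induction; once the range is exceeded the two diverge, which is why the vectorized loop must sit behind a runtime check.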

Revision Contents

Diff 121630

include/llvm/Analysis/ScalarEvolution.h

include/llvm/Transforms/Utils/LoopUtils.h

lib/Analysis/ScalarEvolution.cpp

lib/Transforms/Utils/LoopUtils.cpp

lib/Transforms/Vectorize/LoopVectorize.cpp

test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll
