This is an archive of the discontinued LLVM Phabricator instance.

Differential D30041

[PSCEV] Create AddRec for Phis in cases of possible integer overflow, using runtime checks
Closed, Public

Authored by dorit on Feb 16 2017, 5:25 AM.

Details

Reviewers

anemet
sanjoy
mssimpso
sbaranga

Commits

rGca4fd18ddcde: [PSCEV] Create AddRec for Phis in cases of possible integer overflow, using…
rL308299: [PSCEV] Create AddRec for Phis in cases of possible integer overflow,

Summary

Extend the SCEVPredicateRewriter to work a bit harder when it encounters an UnknownSCEV whose Value is a Phi node.
The goal is to build an AddRecurrence for Phi nodes whose update chain involves casts that can be ignored under the proper runtime overflow test.
This is the first step in addressing PR30654.
Next steps will improve upon it, as detailed in the comment in the body of the patch (under "TODO").

(BTW, if some of these steps seem critical, I am happy to include them with this first patch, provided the patch looks OK in principle; I don't want to build much upon a wrong direction.)
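To make the targeted pattern concrete, here is a minimal sketch in plain C++ (not LLVM code; the helper names `phiWithCasts` and `addRecValue` are made up for illustration). It models a 64-bit phi whose backedge value goes through a trunc-to-i32 / sext-to-i64 round trip before the step is added. As long as the narrow value never overflows, the phi takes the same values as the recurrence {Start,+,Step}, which is the property the runtime check is meant to guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Value of the phi after `Iter` trips around the loop, where each backedge
// computes: next = sext(trunc(phi to i32) to i64) + Step.
// Hypothetical helper for illustration; not part of ScalarEvolution.
int64_t phiWithCasts(int64_t Start, int64_t Step, unsigned Iter) {
  int64_t Phi = Start;
  for (unsigned I = 0; I < Iter; ++I) {
    // trunc i64 -> i32: wraps modulo 2^32 (guaranteed since C++20, and the
    // behavior of mainstream compilers before that).
    int32_t Narrow = static_cast<int32_t>(static_cast<uint32_t>(Phi));
    // sext i32 -> i64, then add the step.
    Phi = static_cast<int64_t>(Narrow) + Step;
  }
  return Phi;
}

// Value of the affine recurrence {Start,+,Step} at iteration `Iter`.
int64_t addRecValue(int64_t Start, int64_t Step, unsigned Iter) {
  return Start + static_cast<int64_t>(Iter) * Step;
}
```

With Start = 0 and Step = 3 the two functions agree on every iteration. Starting at 2147483647 with Step = 1 they diverge after the second backedge, because the truncated value wraps.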

Diff Detail

Repository: rL LLVM

Event Timeline

dorit created this revision. Feb 16 2017, 5:25 AM

Herald added a subscriber: mzolotukhin. Feb 16 2017, 5:25 AM

dorit added reviewers: sbaranga, sanjoy, anemet. Feb 16 2017, 5:27 AM

dorit added a reviewer: mssimpso.

Ayal added a subscriber: Ayal. Feb 20 2017, 1:48 AM

Ayal added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
10167 ↗	(On Diff #88719)	dyn_cast >> isa
10167 ↗	(On Diff #88719)	The attempt to createAddRecFromPHIWithCasts() involves introducing a new cast predicate, right? Worth guarding or at least commenting.
10329 ↗	(On Diff #88719)	exits >> latches
10349 ↗	(On Diff #88719)	Alternatively, you can bail out immediately and return nullptr when multiple distinct BackEdge/Start values are found. Then these checks should be asserts?
10370 ↗	(On Diff #88719)	This check seems redundant, as we're stopping on the first index found to be a phi or a simple casted phi, right? Simply break when found, and check if i == e afterwards, setting FoundIndex = i (if not).

delena added a subscriber: delena. Feb 20 2017, 1:53 AM

sbaranga added inline comments. Feb 20 2017, 10:35 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10185 ↗	(On Diff #88719)	I know I said that this should be a NUSW on the bugzilla ticket, but I'm not that sure anymore. Whatever the case, this needs a comment explaining the choice.
10324 ↗	(On Diff #88719)	I think it would be better to move this to Scalar Evolution itself, instead of having it in the rewriter. It would essentially be a lazy analysis, and would also take as an argument the loop and would return the analyzed expression and a set of predicates. That way we don't have to do the analysis again for every instantiation of a SCEVPredicateRewriter.
10396 ↗	(On Diff #88719)	Why is it correct to add the NSW flag here? I'm worried that it's somehow implied by the predicates that we're adding.
10415 ↗	(On Diff #88719)	Same as above (why can we add the NSW flag here?).

Hi Ayal,
I agree with all your comments, and will incorporate your suggestions into the next upload. I just want to clear up the caching and NoWrapFlags issues that Silviu had raised so I can include that in the next upload.
BTW, some of your comments refer to code that was copied over from getAddRecFromPHI, so I guess it would make sense to make the same changes there, where relevant. (I hope @sanjoy will have a chance to take a look to provide his input :-)).
Thanks!
Dorit

Hi Silviu,

About the NoWrap flags: I don't have very high confidence about this, I think it indeed needs to be corrected -- proposed fixes (and followup questions...) below.

I should mention that in getAddRecFromPHI (which this function is inspired by) the wrapping flags are determined by this piece of logic:

if (auto BO = MatchBinaryOp(BEValueV, DT)) {
  if (BO->Opcode == Instruction::Add && BO->LHS == PN) {
    if (BO->IsNUW)
      Flags = setFlags(Flags, SCEV::FlagNUW);
    if (BO->IsNSW)
      Flags = setFlags(Flags, SCEV::FlagNSW);
  }
} else if (GEPOperator *GEP = dyn_cast<GEPOperator>(BEValueV)) {
  // If the increment is an inbounds GEP, then we know the address
  // space cannot be wrapped around. We cannot make any guarantee
  // about signed or unsigned overflow because pointers are
  // unsigned but we may have a negative index from the base
  // pointer. We can guarantee that no unsigned wrap occurs if the
  // indices form a positive value.
  if (GEP->isInBounds() && GEP->getOperand(0) == PN) {
    Flags = setFlags(Flags, SCEV::FlagNW);

    const SCEV *Ptr = getSCEV(GEP->getPointerOperand());
    if (isKnownPositive(getMinusSCEV(getSCEV(GEP), Ptr)))
      Flags = setFlags(Flags, SCEV::FlagNUW);
  }

  // We cannot transfer nuw and nsw flags from subtraction
  // operations -- sub nuw X, Y is not the same as add nuw X, -Y
  // for instance.
}

I thought this is not needed/relevant in our scenario here because we determine the signedness of the flags according to whether we encountered a zext or sext; but I also realize now that I didn't think about the GEP scenario at all, I was thinking about integer inductions only. So in addition to the fixes below I will also add a check that we are dealing with integers, not pointers.

Many thanks,
Dorit

llvm/lib/Analysis/ScalarEvolution.cpp
10185 ↗	(On Diff #88719)	I actually don't see how we can tell whether to create an NUSW or NSSW assumption without further analysis, such as the analysis we do in getAddRecForPhiWithCasts…; So maybe in the general case we need to be conservative here and add both NSSW and NUSW to make sure that there will be no kind of overflow due to the truncation?
10324 ↗	(On Diff #88719)	So where/when would this analysis be triggered? And where would we cache the result of the ScalarEvolution analysis? could you please elaborate on what would be the flow of things you are suggesting / what exactly you mean by lazy here…? I agree of course we should avoid repeating the analysis over and over again; the caching of the analysis that I was going to add, along with guarding the analysis with "if(NewPreds)", would guarantee that if the analysis succeeded once (when NewPreds was passed) then any time we are passed "Preds" we will not repeat the analysis (because I was going to add a new kind of Predicate, and cache things in Preds). But indeed if the analysis fails, this caching will not prevent us from repeating it (and failing) every time we are passed NewPreds… So maybe what this means is that the caching should be done in a new data-structure to be added to PSCEV, separately of Preds, where we would cache both a failure -- simply associate the UnknownSCEV of the phi node with itself, and a success -- associate the UnknownSCEV with the respective AddRec (and of course this AddRec would itself have a Predicate that the analysis will have already added). This way we could check the results of the analysis regardless of whether we are passed Preds. If we do that, would your suggestion above still be relevant?
10396 ↗	(On Diff #88719)	Not sure about this, and looking at this again I can't justify SCEV::FlagNSW here. Probably SCEV::FlagAnyWrap is all we can do here (as without the predicate we know nothing because of the truncate). Right?
10415 ↗	(On Diff #88719)	Again, not sure about this. I thought we can put here what the predicate guarantees. So if we added an NSSW assumption we could set the NoWrapFlags to SCEV::FlagNSW (right?). Originally I only looked at the Sext pattern and that's why I put the NSW Flag. Then I extended the analysis to also consider the Zext pattern, but didn't go back to fix the flag. So if we added an NUSW predicate, then would it be correct to set the flags to SCEV::FlagNUW? (NUSW and NUW don't have the same semantics…). Maybe SCEV::FlagNW makes most sense in that case?

Silviu, would you please clarify your comment about moving the analysis to Scalar Evolution (please see my question above)? and why is it better than just caching things in the rewriter (possibly via a new data-structure in PSCEV or via Preds)? I would like to address your comments and upload a revised patch... thanks a lot,
Dorit

Hi Dorit,

Sorry for the delay. I'm on holiday until next week so communication will be slow until I get back.

Thanks,
Silviu

llvm/lib/Analysis/ScalarEvolution.cpp
10185 ↗	(On Diff #88719)	Adding just NUSW would work, but the problem would be that the predicate would fail at runtime often if the number is used as signed. Ideally we should find a solution that works in most cases.
10324 ↗	(On Diff #88719)	I was thinking of using the same mechanism that SCEV already has for caching SCEV expressions (and which it also uses to store SCEV predicates). Essentially, PSCEV would call SCEV from here and SCEV would check to see if it has already analyzed the node or not. If not, it would do this analysis and store the result (using the loop + the SCEV Unknown as keys for further lookups). As you've said, in case of failure we can just return the SCEVUnknown expression without any additional predicates. This would essentially be the same thing SCEV does for getSCEV().
10396 ↗	(On Diff #88719)	Sounds correct, we should drop the NSW flag.
10415 ↗	(On Diff #88719)	We can't add NSW/NUW on SCEV expressions if we infer them from SCEV predicates. The problem is doing so would essentially mean that the NUW/NSW are not predicated (which isn't true) and can technically lead us to false conclusions (we can even use the nsw/nuw flags to prove the original predicate, which is incorrect).

Thanks for responding while on holiday!

llvm/lib/Analysis/ScalarEvolution.cpp
10185 ↗	(On Diff #88719)	So Truncate may need to call SE::getAddRecForPhiWithCasts analysis directly, so that it will have the signedness knowledge/context; in fact the analysis results already include the predicate that needs to be added (NUSW/NSSW).
10324 ↗	(On Diff #88719)	So just making sure: Currently SCEV caches things in the ValueExprMap. Are you suggesting to add a new member to SCEV, to store the result of the analysis? (where the result of the analysis is: "the SCEVUnknown %x in loop L can be rewritten to the AddRec '(0,+,%step)' if the Predicate 'Flags=NSSW, AR=(0,+,trunc(%step))' is added"?)

sbaranga added inline comments. Mar 9 2017, 2:35 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10324 ↗	(On Diff #88719)	I think the result of the analysis should be "for loop L the loop-variant SCEVUnknown can be re-written to another SCEV, given a set of predicates", so the analysis gives you both the SCEV and a set of predicates. This should be general enough to use in more cases. I would be happy with either reusing the same ValueExprMap for storage, or adding another ValueExprMap (they would probably both work). In general I'm not too picky about this as long as it is sensible (although @sanjoy will probably have something to say).

dorit added inline comments. Mar 16 2017, 10:05 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10324 ↗	(On Diff #88719)	SCEV's ValueExprMap maps a Value to a SCEVExpr; and we want to map a <UnknownSCEV, Loop> pair to a <SCEVExpr, setOfPredicates> pair. Are we talking about the same ValueExprMap? Also, you commented earlier that SCEV uses the ValueExprMap to also cache SCEV predicates; Could you please point me to where? (I thought that predicates and predicate-based rewrites were stored only in PSCEV's Preds and RewriteMap…) ? Thanks

sbaranga added inline comments. Mar 16 2017, 10:29 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10324 ↗	(On Diff #88719)	Sorry, I should have looked at the code earlier. What I meant was adding a FoldingSet, like the ones used by ScalarEvolution to store SCEVs and Preds (see UniqueSCEVs and UniquePreds). In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution. It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value.

dorit added inline comments. Mar 17 2017, 5:57 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10324 ↗	(On Diff #88719)	Re "In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution": Isn't it more natural to hold the mapping from the UnknownSCEV+Loop to the Predicate+AddRecExpr in a map like this? DenseMap<std::pair<const Loop *, const SCEV *>, std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 2>>> Re "It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value": Wait, why would we be replacing uses here? We will be recording here just a tentative mapping, which will be valid only if the PSCEV caller decides to actually add these predicates and SCEV rewrites to its Preds and RewriteMap…? I'll upload a new patch along these lines next week (but if something in the above sounds wrong please shout!)

sbaranga added inline comments. Mar 21 2017, 7:20 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10324 ↗	(On Diff #88719)	Using DenseMap should be ok as well. Regarding replacing uses: we need to handle this case in order to properly cache the result of the analysis (because passes that use SCEV can replace uses). This should be fine however if we use SCEVs instead of Values. So this shouldn't be a problem with the DenseMap that you want to use.
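The cache being discussed can be pictured as follows. This is only a sketch under assumptions: `Loop`, `SCEV`, and `Predicate` are empty stand-in structs rather than the LLVM classes, `std::map` stands in for `DenseMap`, and the names `RewriteCache`, `recordSuccess`, `recordFailure`, and `lookup` are hypothetical. It shows the key/value shape proposed in the thread and the convention of recording a failed analysis as "rewrites to itself with no predicates".

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Stand-in types (assumptions for illustration, not the LLVM classes).
struct Loop {};
struct SCEV {};
struct Predicate {};

using Key = std::pair<const SCEV *, const Loop *>;
using Rewrite = std::pair<const SCEV *, std::vector<const Predicate *>>;

struct RewriteCache {
  std::map<Key, Rewrite> Entries;

  // Success: Unknown can be rewritten to AddRec if Preds hold at run time.
  void recordSuccess(const SCEV *Unknown, const Loop *L, const SCEV *AddRec,
                     std::vector<const Predicate *> Preds) {
    Entries[Key(Unknown, L)] = Rewrite(AddRec, std::move(Preds));
  }

  // Failure: map the expression to itself so the analysis is not repeated.
  void recordFailure(const SCEV *Unknown, const Loop *L) {
    Entries[Key(Unknown, L)] =
        Rewrite(Unknown, std::vector<const Predicate *>());
  }

  // Returns nullptr if this (expression, loop) pair was never analyzed.
  const Rewrite *lookup(const SCEV *Unknown, const Loop *L) const {
    auto It = Entries.find(Key(Unknown, L));
    return It == Entries.end() ? nullptr : &It->second;
  }
};
```

Because the key holds the expression rather than the underlying Value, a caller can distinguish "not analyzed yet" (no entry) from "analyzed and failed" (entry maps to itself) without re-running the analysis.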

Hi Silviu,

The new revision addresses your comments and implements two of the TODO items:

it caches the results of the analysis in a new map in ScalarEvolution (as per your suggestion; thanks!).
it provides a bit of context to visitTruncate so we'd know which overflow check to create (signed or unsigned).

About the cache: For now, I didn't define it to hold a set of predicates but just a single SCEVWrapPredicate per item; I was wondering if maybe we want a SCEVUnionPredicate instead of a Set of predicates? In any case, maybe better leave that generalization for when the need arises?

I have not yet implemented the code style comments by Ayal that relate to the code I copied from createAddRecFromPHI; These should be fixed in both functions, and I want to try to see if I can outline some common pieces between these functions to avoid duplication as much as possible.
I will go ahead and look at that next, and upload a new revision later on. Or in a separate patch if you prefer?
(only expected changes are NFC stuff around createAddRecFromPHI/ createAddRecFromPHIWithCasts; In terms of functionality - the patch is ready for review :-))

Sure, this is fine with me. It's indeed best to address style comments separately.

Great.

In that case, I have no further updates to this revision at this point.

Any further comments, anyone? Silviu, does this address all your comments?

Hi Dorit,

Sorry for the delayed review. I have some more comments.

Regarding testing: it would be nicer to add some LAA tests, since the LAA analysis results will print the added predicates and the SCEV expressions for the bounds.

Thanks,
Silviu

llvm/include/llvm/Analysis/ScalarEvolution.h
1665 ↗	(On Diff #92622)	The data in this map should be also transferred in ScalarEvolution::ScalarEvolution(ScalarEvolution &&Arg). We should also remove mappings in forgetLoop(). The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?
llvm/lib/Analysis/ScalarEvolution.cpp
4020 ↗	(On Diff #92622)	Can this be a static function?
4159 ↗	(On Diff #92622)	It would be nice to have a comment here saying that this works because we're going to add a SCEV predicate to prove that SymbolicPHI == Add->getOperand(i).
4196 ↗	(On Diff #92622)	I think the contract should be to return a SCEV expression with the same type as what getSCEV would return for the phi node. It's also probably better to not mention the vectorizer here (since more passes will end up running this code anyway).
10424 ↗	(On Diff #92622)	Now that I'm having a fresh look, I think we need to revisit the conditions for the transformation here. If we want to do trunc({x,+,y}) -> {trunc(x), +, trunc(y)}, is this not always true?

dorit updated this revision to Diff 95137. Apr 13 2017, 8:44 AM

Re "Regarding testing: it would be nicer to add some LAA tests, since the LAA analysis results will print the added predicates and the SCEV expressions for the bounds":

Until your patch in D17080 is committed my patch does not have an effect on loop-access-analysis PSE (only to loop-vectorizer's PSE)… So for now nothing is printed. I'm happy to add a loop-accesses analysis test as soon as your patch is committed.

Thanks very much for your comments,
Dorit

llvm/include/llvm/Analysis/ScalarEvolution.h
1665 ↗	(On Diff #92622)	Re "The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?": Changed to PredicatedSCEVRewrites (since we can rewrite the SCEVUnknowns into the SCEV on the RHS if we add the predicate). But if your suggestion is more intuitive to you I'll change to that.
llvm/lib/Analysis/ScalarEvolution.cpp
10424 ↗	(On Diff #92622)	Yes, I think you're right… This simplifies a few things… :-) so no need for a visitTruncateExpr case at all in this rewriter... the base class rewriter can handle the Truncate (without any predicates) (right?) Thanks!!
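The observation that `trunc({x,+,y})` equals `{trunc(x),+,trunc(y)}` follows from truncation being arithmetic modulo 2^n. The following sketch checks it by brute force for i16 to i8, using unsigned types so the wraparound is well defined in C++; the helper names are made up for illustration and none of this is LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Iteration `I` of the i16 recurrence {X,+,Y}, truncated to i8.
uint8_t truncOfAddRec(uint16_t X, uint16_t Y, unsigned I) {
  uint16_t Wide = static_cast<uint16_t>(X + I * Y); // wraps modulo 2^16
  return static_cast<uint8_t>(Wide);                // trunc i16 -> i8
}

// Iteration `I` of the i8 recurrence {trunc(X),+,trunc(Y)}.
uint8_t addRecOfTrunc(uint16_t X, uint16_t Y, unsigned I) {
  uint8_t NarrowX = static_cast<uint8_t>(X);
  uint8_t NarrowY = static_cast<uint8_t>(Y);
  return static_cast<uint8_t>(NarrowX + I * NarrowY); // wraps modulo 2^8
}

// Returns true if both forms agree on a sample of start and step values
// for the first `Iters` iterations.
bool truncDistributes(unsigned Iters) {
  for (unsigned X = 0; X <= 0xFFFF; X += 257)
    for (unsigned Y = 0; Y <= 0xFFFF; Y += 251)
      for (unsigned I = 0; I < Iters; ++I)
        if (truncOfAddRec(static_cast<uint16_t>(X), static_cast<uint16_t>(Y),
                          I) !=
            addRecOfTrunc(static_cast<uint16_t>(X), static_cast<uint16_t>(Y),
                          I))
          return false;
  return true;
}
```

No predicate is involved because the identity holds even when the wide recurrence wraps; this matches the conclusion that the base-class rewriter can handle the truncate on its own.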

dorit added inline comments. Apr 14 2017, 10:07 PM

llvm/include/llvm/Analysis/ScalarEvolution.h
1665 ↗	(On Diff #92622)	Re "We should also remove mappings in forgetLoop()": Just noticed there's also a forgetMemoizedResults(S). Looks like we should be removing the mapping for S there too, right?

Remove mappings from PredicatedSCEVRewrites also in forgetMemoizedResults().
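A rough illustration of that invalidation, with `std::map` standing in for the real container and with placeholder `Loop`/`SCEV` structs (all names and signatures here are assumptions, not the patch's code): entries computed for a loop are dropped when the loop is forgotten, and entries keyed on an expression are dropped when that expression's memoized results are forgotten.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Stand-in types (assumptions for illustration, not the LLVM classes).
struct Loop {};
struct SCEV {};

// (expression, loop) -> rewritten expression; predicates omitted for brevity.
using RewriteMap =
    std::map<std::pair<const SCEV *, const Loop *>, const SCEV *>;

// Drop every cached rewrite that was computed for loop L.
void forgetLoop(RewriteMap &Rewrites, const Loop *L) {
  for (auto It = Rewrites.begin(); It != Rewrites.end();) {
    if (It->first.second == L)
      It = Rewrites.erase(It); // erase returns the next valid iterator
    else
      ++It;
  }
}

// Drop every cached rewrite whose key is the expression S.
void forgetMemoizedResults(RewriteMap &Rewrites, const SCEV *S) {
  for (auto It = Rewrites.begin(); It != Rewrites.end();) {
    if (It->first.first == S)
      It = Rewrites.erase(It);
    else
      ++It;
  }
}
```

Taking the iterator returned by `erase` avoids relying on any iterator that may have been invalidated by the removal.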

ping :-)
thanks,
Dorit

nitpick: you should run a spell checker over the patch.

This generally looks ok to me but since this is a complex change @sanjoy should approve it before being committed.

-Silviu

llvm/lib/Analysis/ScalarEvolution.cpp
4220 ↗	(On Diff #95409)	I think we need a more formal explanation here on why this predicate guarantees that this is an AddRecExpr. We're trying to prove that if we have: SymbolicPHI = phi({Start, LoopHead}, {NextValue, LoopLatch}) NextValue = (Sext ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum then SymbolicPHI = {Start, +, InvariantAccum}. At iteration 0 both values are equal to Start, so it's enough to prove that SymbolicPHI + InvariantAccum == (Sext (Trunc (SymbolicPHI))) + InvariantAccum, which should be true from the SCEV predicate.
10424 ↗	(On Diff #92622)	Correct, we shouldn't need to do anything here.

Some comments inline.

llvm/include/llvm/Analysis/ScalarEvolution.h
1208 ↗	(On Diff #95409)	s/SumbolicPHI/SymbolicPHI/
1211 ↗	(On Diff #95409)	s/cahced/cached/
llvm/lib/Analysis/ScalarEvolution.cpp
4042 ↗	(On Diff #95409)	This is very minor, but we usually spell these as `SExt` and `ZExt`, in keeping with camel case.
4081 ↗	(On Diff #95409)	s/rewritew/rewrite/
4112 ↗	(On Diff #95409)	Usually for out parameters like this, the type is `const SCEVPredicate *&`. But I'd prefer just returning an `std::pair`.
4124 ↗	(On Diff #95409)	Can you do `find({SymbolicPHI, L})`?
4137 ↗	(On Diff #95409)	`Pair` isn't clear. Please either name it something more specific, or (IMO better) use `{SymbolicPHI, L}` instead of `Pair`.
4138 ↗	(On Diff #95409)	Can you do `{SymbolicPHI, nullptr}` instead of `make_pair`?
4202 ↗	(On Diff #95409)	Do you also need to check that the recurrence is affine?
4216 ↗	(On Diff #95409)	s/specificed/specified/
4221 ↗	(On Diff #95409)	I'm not sure that this is correct. Say we had a loop with 4 iterations, `StartVal` was `i16 257`, `Accum` was `i16 1`, `TruncTy` was `i8` and the PHI was being zero extended. In that case, the value of the PHI node on the second iteration (i.e. after taking the backedge once) would be `(trunc 257) + 1` = `2`. However, despite `{(trunc 257),+,1}` = `{1,+,1}` not unsigned-overflowing in 4 iterations, `{257,+,1}` would not produce the correct values for the PHI node.
10384 ↗	(On Diff #95409)	Doesn't `PredicatedSCEVRewrites.erase(I++)` invalidate `E`?
10682 ↗	(On Diff #95409)	Use `auto *` for pointers.
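Sanjoy's counterexample for the PHI at line 4221 is easy to replay numerically. The sketch below (plain C++, with the hypothetical helpers `phiZExtTrunc` and `wideAddRec`) simulates an i16 phi that is truncated to i8 and zero-extended back before `Accum` is added.

```cpp
#include <cassert>
#include <cstdint>

// Value of the i16 phi after `Iter` backedges, where each backedge computes
// next = zext(trunc(phi to i8) to i16) + Accum.
uint16_t phiZExtTrunc(uint16_t Start, uint16_t Accum, unsigned Iter) {
  uint16_t Phi = Start;
  for (unsigned I = 0; I < Iter; ++I) {
    uint8_t Narrow = static_cast<uint8_t>(Phi);  // trunc i16 -> i8
    Phi = static_cast<uint16_t>(Narrow + Accum); // zext i8 -> i16, then add
  }
  return Phi;
}

// Iteration `Iter` of the i16 recurrence {Start,+,Accum}.
uint16_t wideAddRec(uint16_t Start, uint16_t Accum, unsigned Iter) {
  return static_cast<uint16_t>(Start + Iter * Accum);
}
```

Starting from 257 with a step of 1, the phi takes the values 257, 2, 3, 4 while {257,+,1} takes 257, 258, 259, 260, even though the narrow recurrence {1,+,1} never wraps in i8 over those iterations. A no-wrap predicate on the narrow recurrence is therefore not sufficient by itself.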

This revision now requires changes to proceed. Apr 24 2017, 11:02 PM

Hi Sanjoy, I will upload a new fixed version soon; just have two quick followup questions below. Thanks a lot!
Dorit

llvm/lib/Analysis/ScalarEvolution.cpp
4202 ↗	(On Diff #95409)	Oh, probably so… Do you think the isAffine check is also required in createAddRecFromPHI()? createAddRecFromPHIWithCasts() is basically a subset of createAddRecFromPHI, modified to consider also the sext-trunc cast pattern (the intention is later to try to factor out common parts as much as possible). I copied this check from there without thinking much…
4221 ↗	(On Diff #95409)	I see. So I guess we should be returning {(sext (trunc Start)),+,(sext (trunc Accum))} as the newAR, right...?

sanjoy added inline comments. Apr 25 2017, 10:11 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4202 ↗	(On Diff #95409)	I was suggesting the `isAffine` since IIRC (but please check) you cannot create a no wrap predicate on a non-affine add recurrence.
4221 ↗	(On Diff #95409)	Not sure how that will work -- won't `{(sext (trunc Start)),+,(sext (trunc Accum))}` be `{1,+,1}` and thus be `1` instead of `257` in the first iteration? I haven't thought this through, but I suspect you'll have to check that the start value fits in the narrower type or something like that. In any case, I agree with Silviu here -- whatever you go with, please justify why that is correct with a proof here.
10676 ↗	(On Diff #95409)	I'm also not a big fan of the name `analyzeUnknownSCEVPHI` here -- `analyze` does not mean anything specific, and the `UnknownSCEV` part is redundant. How about `convertToAddRecWithPreds`?

sbaranga added inline comments. Apr 25 2017, 10:40 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4221 ↗	(On Diff #95409)	Maybe have two extra SCEVEqualPredicates to test that (sext (trunc Start)) == Start and (sext (trunc Accum)) == Accum and return {Start,+,Accum}? Before adding the extra predicates it would be worth testing if SCEV can prove these properties statically first (for example for cases where either Start or Accum are constants).

dorit added inline comments. Apr 26 2017, 4:16 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4221 ↗	(On Diff #95409)	Good idea! Sanjoy, Silviu: Are you ok with extending the SCEVEqualPredicate to a non constant on the RHS? (I think currently only SCEV == constant is supported).

sbaranga added inline comments. Apr 26 2017, 9:10 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4221 ↗	(On Diff #95409)	That's ok with me.

Main changes:

SCEVEqualPredicate can take any SCEV on LHS/RHS (instead of only SCEVUnknown/SCEVConstant )
We add three predicates instead of only one; so we now have pairs of <SCEV, SmallVector of predicates> (instead of <SCEV, single Predicate>)
Added a proof for why these predicates guarantee the correctness of the proposed rewrite
Addressed Style and Spelling comments

Thanks,
Dorit
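The combination of three predicates can be sketched as a runtime check. This is an illustration for the signed i64/i32 case only, with made-up names (`sextTrunc`, `predicatesHold`, `phiSExtTrunc`); it is not the code the patch emits, and the exact iteration range covered by the real no-wrap predicate is an assumption here. It tests that Start and Accum survive the sext(trunc(...)) round trip and that the narrow recurrence stays in range for the given number of backedges. When all three hold, the phi matches {Start,+,Accum}.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// sext(trunc(V to i32) to i64). The narrowing wraps modulo 2^32
// (guaranteed since C++20, and the behavior of mainstream compilers).
int64_t sextTrunc(int64_t V) {
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(V)));
}

// True if Start == sext(trunc(Start)), Accum == sext(trunc(Accum)), and the
// i32 recurrence {trunc(Start),+,trunc(Accum)} stays in range for
// `BackedgeCount` steps.
bool predicatesHold(int64_t Start, int64_t Accum, uint32_t BackedgeCount) {
  if (sextTrunc(Start) != Start || sextTrunc(Accum) != Accum)
    return false;
  // Both values fit in i32, so the last value can be computed exactly in
  // i64. The recurrence is monotone, so checking the end point is enough.
  int64_t Last = Start + static_cast<int64_t>(BackedgeCount) * Accum;
  return Last >= std::numeric_limits<int32_t>::min() &&
         Last <= std::numeric_limits<int32_t>::max();
}

// The phi with casts: next = sext(trunc(phi)) + Accum.
int64_t phiSExtTrunc(int64_t Start, int64_t Accum, uint32_t Iter) {
  int64_t Phi = Start;
  for (uint32_t I = 0; I < Iter; ++I)
    Phi = sextTrunc(Phi) + Accum;
  return Phi;
}
```

A start value of 2^32 + 1 is rejected by the first equality check, which is the 64-bit analogue of the `i16 257` counterexample raised earlier in the review.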

(discovered typo in the documentation)

Ping :)

sbaranga added inline comments. May 22 2017, 3:20 PM

llvm/lib/Analysis/ScalarEvolution.cpp
4355 ↗	(On Diff #98427)	I originally had in mind testing the SCEV expressions for equality (Expr == ExtendedExpr) and checking that both are loop invariant. I had a look at the implementation of isKnownPredicate, and I'm not convinced this would work as expected. Can we add a regression test for this? I think we need to check anyway that both expressions are loop invariant before adding the predicate.

We now require that Accum is loop invariant.
Added a simple Expr == ExtendedExpr check, and only if it fails we call isKnownPredicate().
Extended the testcase:

• Check that we have both the overflow runtime check and the equality runtime check in doit1 and doit2
• Added doit3: where step is not invariant (not very interesting - we do nothing)
• Added doit4: where we can figure out at compile time that step == sext(trunc(step)). Here we check that we only have the overflow runtime check (without the equality runtime check).

Thanks,
Dorit

dorit added inline comments. May 24 2017, 12:43 PM

llvm/lib/Analysis/ScalarEvolution.cpp
4355 ↗	(On Diff #98427)	Right. I was assuming Accum is invariant in the loop, forgot I was allowing it to not be invariant. Thanks.

ping?
thanks,
Dorit

Sorry for the delay, I missed the last update. I have a few minor suggestions, but otherwise I think it generally looks good.
I think Sanjoy still needs to approve this before it can go in.

Thanks
-Silviu

llvm/include/llvm/Analysis/ScalarEvolution.h
244 ↗	(On Diff #100147)	Ideally we would have something (not necessarily in SCEVEqualPredicate) to verify that these predicates can be checked. We should also be making the LHS->RHS substitution in the rewriter at some point (not necessarily in this change).
llvm/lib/Analysis/ScalarEvolution.cpp
4109 ↗	(On Diff #100147)	s/confirms/conforms
4295 ↗	(On Diff #100147)	This is a bit hard to follow with NewExpr, Expr, etc and the initial problem statement should be simplified. I guess we just want to prove that Expr(i) = Start + i * Accum, given that (1) Expr(0) = Start (2) Expr(i+1) = (Ext ix (Trunc iy (Expr(i)) to ix) to iy) + Accum
4319 ↗	(On Diff #100147)	No text should be required for iteration 2, proving the induction step should be enough.
4341 ↗	(On Diff #100147)	It would be better to just use Ext and Trunc as separate operators instead of SExtTrunc.
llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
4 ↗	(On Diff #100147)	Could you add some text for each of these saying what predicates get added?
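The simplified statement suggested for line 4295 lends itself to a short induction. The following is a sketch of how such an argument could go, not the proof text from the patch. It writes E for Expr, S for Start, A for Accum, and assumes the three predicates discussed in this review: the two equality predicates on Start and Accum, and a no-wrap predicate on the narrow recurrence.

```latex
\begin{align*}
&\textbf{Given: } E(0) = S, \qquad
  E(i+1) = \operatorname{Ext}(\operatorname{Trunc}(E(i))) + A. \\
&\textbf{Assume: }
  \text{(P1) } \operatorname{Ext}(\operatorname{Trunc}(S)) = S, \quad
  \text{(P2) } \{\operatorname{Trunc}(S),+,\operatorname{Trunc}(A)\}
    \text{ does not wrap}, \quad
  \text{(P3) } \operatorname{Ext}(\operatorname{Trunc}(A)) = A. \\
&\textbf{Claim: } E(i) = S + i \cdot A \text{ for every iteration } i. \\[4pt]
&\textbf{Base: } E(0) = S = S + 0 \cdot A. \\
&\textbf{Step: } \text{assume } E(i) = S + i \cdot A. \text{ Then} \\
&\quad \operatorname{Ext}(\operatorname{Trunc}(E(i)))
   = \operatorname{Ext}(\operatorname{Trunc}(S + i \cdot A))
   && \text{(induction hypothesis)} \\
&\quad\phantom{\operatorname{Ext}(\operatorname{Trunc}(E(i)))}
   = \operatorname{Ext}(\operatorname{Trunc}(S) + i \cdot \operatorname{Trunc}(A))
   && \text{(Trunc is arithmetic modulo } 2^{x}\text{)} \\
&\quad\phantom{\operatorname{Ext}(\operatorname{Trunc}(E(i)))}
   = \operatorname{Ext}(\operatorname{Trunc}(S)) + i \cdot \operatorname{Ext}(\operatorname{Trunc}(A))
   && \text{(P2: no wrap, so Ext distributes)} \\
&\quad\phantom{\operatorname{Ext}(\operatorname{Trunc}(E(i)))}
   = S + i \cdot A
   && \text{(P1, P3)} \\
&\quad \text{hence } E(i+1) = S + i \cdot A + A = S + (i+1) \cdot A.
\end{align*}
```

Only the induction step is needed beyond the base case, which is consistent with the remark that no separate text for iteration 2 is required.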

Thanks Silviu. I'll iron the comments following your suggestions.

llvm/include/llvm/Analysis/ScalarEvolution.h
244 ↗	(On Diff #100147)	Can you please clarify what check is missing? And where would be an appropriate place to add a TODO comment about the substitution?
llvm/lib/Analysis/ScalarEvolution.cpp
4319 ↗	(On Diff #100147)	I thought it's helpful to have an example step before the formal step. I'll drop it.

sbaranga added inline comments. Jun 14 2017, 9:03 AM

llvm/include/llvm/Analysis/ScalarEvolution.h
244 ↗	(On Diff #100147)	If we had a Loop * as a member of SCEVEqualPredicate, we could check that both LHS and RHS are invariant inside the constructor. However, we don't need the Loop for anything else so I'm not sure that would be the right solution. I guess since we already have an assert in AppendPredicate that should be enough for now.

Addressed Silviu's last comments on the documentation:

Added the predicates that are added for each of the loops in the testcase
In the proof:
- dropped all the introductory text and jump directly to the formal proof.
- expanded the short notation I was using (SExtTrunc) into the explicit notation (Ext ix (Trunc iy () to ix ) to iy)

Herald added a subscriber: hiraditya. Jun 18 2017, 12:34 AM

@sbaranga, @sanjoy, ping :-)
thanks.
dorit

ping^2

thanks,
dorit

It looks ok to me, thanks!

-Silviu

@sanjoy, is the patch ok with you?

thanks,
Dorit

Mostly minor stuff.

llvm/include/llvm/Analysis/ScalarEvolution.h
245 ↗	(On Diff #102963)	Now that the type system does not guarantee this, how about adding an assert that `LHS != RHS`?
1719 ↗	(On Diff #102963)	Why not have the key type be `std::pair<const SCEVUnknown *, const Loop *>`?
llvm/lib/Analysis/ScalarEvolution.cpp
4169 ↗	(On Diff #102963)	s/const SCEV *SymbolicPHI/const SCEVUnknown *SymbolicPHI/
4186 ↗	(On Diff #102963)	How about doing this check as the very first thing (to avoid doing this extra work when it would have failed anyway)?
4262 ↗	(On Diff #102963)	IMO a slightly more idiomatic pattern (which would obviate the need for the `*** Part1` comment) is to have a `createAddRecFromPHIWithCasts` and a `createAddRecFromPHIWithCastsImpl`, where the first function checks `PredicatedSCEVRewrites` for an existing solution, and delegates to `createAddRecFromPHIWithCastsImpl` if we don't have a cached solution.
4276 ↗	(On Diff #102963)	Is this semantically important? That is, if you remove this and instead ensure that we always populate `PredicatedSCEVRewrites[{SymbolicPHI, L}]` before leaving this function, will we enter an infinite loop somewhere?
4318 ↗	(On Diff #102963)	Should we be getting here in the `Add->getOperand(i) == SymbolicPHI` case? Shouldn't the regular add recurrence creating logic have triggered?
4399 ↗	(On Diff #102963)	You should be able to `cast<>` here.

This revision now requires changes to proceed. Jul 11 2017, 4:04 PM

Addressing Sanjoy's comments.

Hi Sanjoy,
Thanks! Comments addressed,
Dorit

llvm/include/llvm/Analysis/ScalarEvolution.h
245 ↗	(On Diff #102963)	Added in the constructor.
1719 ↗	(On Diff #102963)	Absolutely right (that was the intention, but somehow it made it only to the comment... )
llvm/lib/Analysis/ScalarEvolution.cpp
4276 ↗	(On Diff #102963)	I added a clarification on the motivation for this early initialization. There is nothing semantic behind it. It is just to avoid having to initialize it upon every exit from the function (keep things a bit more compact, avoid the slight code duplication, make sure we don't forget it…). Is that ok with you with the clarification?
4318 ↗	(On Diff #102963)	yes, I would indeed expect it would have been caught already; added a comment here, and an assert in isSimpleCastedPHI.

lgtm with one nit

llvm/lib/Analysis/ScalarEvolution.cpp
4276 ↗	(On Diff #102963)	Given that you now have the `createAddRecFromPHIWithCastsImpl` split out, a cleaner way to achieve the same property would be to insert the result into the cache in `createAddRecFromPHIWithCasts`.
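The split Sanjoy suggests could look roughly like the following. The types are simplified stand-ins (an `int` key instead of the `{SymbolicPHI, L}` pair, and `std::optional<int>` instead of the SCEV-plus-predicates result), and `Analyzer`, `analyze`, `analyzeImpl`, and `implCalls` are hypothetical names. The point is only that the wrapper owns the cache, so the Impl function can return from any path without touching it. The sketch assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <map>
#include <optional>

class Analyzer {
public:
  // Cached entry point: consults the cache first, otherwise delegates to
  // analyzeImpl and stores whatever it returns, including failure.
  std::optional<int> analyze(int Key) {
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;
    std::optional<int> Result = analyzeImpl(Key);
    Cache.emplace(Key, Result);
    return Result;
  }

  // Number of times the uncached analysis actually ran.
  unsigned implCalls() const { return ImplCalls; }

private:
  // Uncached analysis with several early exits; none of them needs to
  // remember to update the cache.
  std::optional<int> analyzeImpl(int Key) {
    ++ImplCalls;
    if (Key < 0)
      return std::nullopt; // placeholder: pattern not recognized
    if (Key % 2 != 0)
      return std::nullopt; // placeholder: some precondition failed
    return Key / 2;        // placeholder for the rewritten expression
  }

  std::map<int, std::optional<int>> Cache;
  unsigned ImplCalls = 0;
};
```

Repeated queries for the same key, whether the first attempt succeeded or failed, run the Impl function only once.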

This revision is now accepted and ready to land. Jul 15 2017, 12:33 PM

Hi Sanjoy,

Thanks!

I addressed your last comment + made another small change:

Re "Should we be getting here in the Add->getOperand(i) == SymbolicPHI case? Shouldn't the regular add recurrence creating logic have triggered?":

I realized that we may be getting here with Op == SymbolicPHI because we haven't yet processed the rest of the operands of the Add... (so createAddRecFromPHI may have failed because one of them is not invariant). So I changed the assert back to an if, with a detailed comment (in isSimpleCastedPHI).

Will wait a bit before I commit.

Many thanks Sanjoy and Silviu for all your help with this patch,

Dorit

Closed by commit rL308299: [PSCEV] Create AddRec for Phis in cases of possible integer overflow, (authored by dorit). Jul 18 2017, 4:57 AM

This revision was automatically updated to reflect the committed changes.

dneilson mentioned this in D37265: [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values. Sep 5 2017, 12:58 PM

dorit mentioned this in D38948: [LV] Support efficient vectorization of an induction with redundant casts. Oct 16 2017, 5:29 AM

dorit mentioned this in rL320672: [LV] Support efficient vectorization of an induction with redundant casts. Dec 13 2017, 11:57 PM
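
To make the runtime checks in this revision concrete, the following is a minimal numeric sketch, not LLVM code. It evaluates a `phi -> trunc i64 to i32 -> sext to i64 -> add` update chain directly and compares it with the plain recurrence `Start + n*Step`. The helper names (`sextTrunc32`, `phiWithCasts`, `addRec`, `predicatesHold`) are hypothetical, and the P1 check is a simplified stand-in for the wrap predicate: it only tests that every value fed through the truncation fits in 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using i64 = std::int64_t;
using i32 = std::int32_t;
using u32 = std::uint32_t;

// sext i32 (trunc i64 v to i32) to i64, written without relying on
// implementation-defined narrowing conversions.
i64 sextTrunc32(i64 v) {
  u32 low = static_cast<u32>(v); // modular reduction, well defined
  i64 wide = static_cast<i64>(low);
  i64 r = (low & 0x80000000u) ? wide - 4294967296LL : wide;
  assert(r >= std::numeric_limits<i32>::min() &&
         r <= std::numeric_limits<i32>::max());
  return r;
}

// Value of the phi after n trips around the backedge, computed as the IR
// does: cast the phi down and back up, then add the step.
i64 phiWithCasts(i64 start, i64 step, unsigned n) {
  i64 x = start;
  for (unsigned i = 0; i < n; ++i)
    x = sextTrunc32(x) + step;
  return x;
}

// The cast-free recurrence {start,+,step} evaluated at iteration n.
i64 addRec(i64 start, i64 step, unsigned n) {
  return start + static_cast<i64>(n) * step;
}

// Simplified versions of the three predicates described in the patch.
bool predicatesHold(i64 start, i64 step, unsigned n) {
  if (sextTrunc32(start) != start) // P2: Start survives trunc + sext
    return false;
  if (sextTrunc32(step) != step) // P3: Accum survives trunc + sext
    return false;
  for (unsigned i = 0; i < n; ++i) { // P1 (simplified): no 32-bit overflow
    i64 v = start + static_cast<i64>(i) * step;
    if (v < std::numeric_limits<i32>::min() ||
        v > std::numeric_limits<i32>::max())
      return false;
  }
  return true;
}
```

With `start = 10`, `step = 3`, `n = 5` the predicates hold and both computations give 25. With `start = 2147483646`, `step = 1`, `n = 3` the third value no longer fits in 32 bits, the simplified P1 fails, and the two computations diverge, which is the situation the runtime check is meant to exclude.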

Revision Contents

Path	Size
llvm/trunk/include/llvm/Analysis/ScalarEvolution.h	48 lines
llvm/trunk/lib/Analysis/ScalarEvolution.cpp	385 lines
llvm/trunk/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll	240 lines

Diff 107068

llvm/trunk/include/llvm/Analysis/ScalarEvolution.h

[231 lines not shown]	struct FoldingSetTrait<SCEVPredicate> : DefaultFoldingSetTrait<SCEVPredicate> {
}		}
static unsigned ComputeHash(const SCEVPredicate &X,		static unsigned ComputeHash(const SCEVPredicate &X,
FoldingSetNodeID &TempID) {		FoldingSetNodeID &TempID) {
return X.FastID.ComputeHash();		return X.FastID.ComputeHash();
}		}
};		};

/// This class represents an assumption that two SCEV expressions are equal,		/// This class represents an assumption that two SCEV expressions are equal,
/// and this can be checked at run-time. We assume that the left hand side is		/// and this can be checked at run-time.
/// a SCEVUnknown and the right hand side a constant.
class SCEVEqualPredicate final : public SCEVPredicate {		class SCEVEqualPredicate final : public SCEVPredicate {
/// We assume that LHS == RHS, where LHS is a SCEVUnknown and RHS a		/// We assume that LHS == RHS.
/// constant.		const SCEV *LHS;
const SCEVUnknown *LHS;		const SCEV *RHS;
const SCEVConstant *RHS;

public:		public:
SCEVEqualPredicate(const FoldingSetNodeIDRef ID, const SCEVUnknown *LHS,		SCEVEqualPredicate(const FoldingSetNodeIDRef ID, const SCEV *LHS,
const SCEVConstant *RHS);		const SCEV *RHS);

/// Implementation of the SCEVPredicate interface		/// Implementation of the SCEVPredicate interface
bool implies(const SCEVPredicate *N) const override;		bool implies(const SCEVPredicate *N) const override;
void print(raw_ostream &OS, unsigned Depth = 0) const override;		void print(raw_ostream &OS, unsigned Depth = 0) const override;
bool isAlwaysTrue() const override;		bool isAlwaysTrue() const override;
const SCEV *getExpr() const override;		const SCEV *getExpr() const override;

/// Returns the left hand side of the equality.		/// Returns the left hand side of the equality.
const SCEVUnknown *getLHS() const { return LHS; }		const SCEV *getLHS() const { return LHS; }

/// Returns the right hand side of the equality.		/// Returns the right hand side of the equality.
const SCEVConstant *getRHS() const { return RHS; }		const SCEV *getRHS() const { return RHS; }

/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const SCEVPredicate *P) {		static bool classof(const SCEVPredicate *P) {
return P->getKind() == P_Equal;		return P->getKind() == P_Equal;
}		}
};		};

/// This class represents an assumption made on an AddRec expression. Given an		/// This class represents an assumption made on an AddRec expression. Given an
[965 lines not shown]	const SCEV *getAddRecExpr(const SCEV *Start, const SCEV *Step, const Loop *L,
SCEV::NoWrapFlags Flags);		SCEV::NoWrapFlags Flags);
const SCEV *getAddRecExpr(SmallVectorImpl<const SCEV *> &Operands,		const SCEV *getAddRecExpr(SmallVectorImpl<const SCEV *> &Operands,
const Loop *L, SCEV::NoWrapFlags Flags);		const Loop *L, SCEV::NoWrapFlags Flags);
const SCEV *getAddRecExpr(const SmallVectorImpl<const SCEV *> &Operands,		const SCEV *getAddRecExpr(const SmallVectorImpl<const SCEV *> &Operands,
const Loop *L, SCEV::NoWrapFlags Flags) {		const Loop *L, SCEV::NoWrapFlags Flags) {
SmallVector<const SCEV *, 4> NewOp(Operands.begin(), Operands.end());		SmallVector<const SCEV *, 4> NewOp(Operands.begin(), Operands.end());
return getAddRecExpr(NewOp, L, Flags);		return getAddRecExpr(NewOp, L, Flags);
}		}

		/// Checks if \p SymbolicPHI can be rewritten as an AddRecExpr under some
		/// Predicates. If successful return these <AddRecExpr, Predicates>;
		/// The function is intended to be called from PSCEV (the caller will decide
		/// whether to actually add the predicates and carry out the rewrites).
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI);

/// Returns an expression for a GEP		/// Returns an expression for a GEP
///		///
/// \p GEP The GEP. The indices contained in the GEP itself are ignored,		/// \p GEP The GEP. The indices contained in the GEP itself are ignored,
/// instead we use IndexExprs.		/// instead we use IndexExprs.
/// \p IndexExprs The expressions for the indices.		/// \p IndexExprs The expressions for the indices.
const SCEV *getGEPExpr(GEPOperator *GEP,		const SCEV *getGEPExpr(GEPOperator *GEP,
const SmallVectorImpl<const SCEV *> &IndexExprs);		const SmallVectorImpl<const SCEV *> &IndexExprs);
const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);		const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);
[418 lines not shown]	void delinearize(const SCEV *Expr, SmallVectorImpl<const SCEV *> &Subscripts,
const SCEV *ElementSize);		const SCEV *ElementSize);

/// Return the DataLayout associated with the module this SCEV instance is		/// Return the DataLayout associated with the module this SCEV instance is
/// operating on.		/// operating on.
const DataLayout &getDataLayout() const {		const DataLayout &getDataLayout() const {
return F.getParent()->getDataLayout();		return F.getParent()->getDataLayout();
}		}

const SCEVPredicate *getEqualPredicate(const SCEVUnknown *LHS,		const SCEVPredicate *getEqualPredicate(const SCEV *LHS, const SCEV *RHS);
const SCEVConstant *RHS);

const SCEVPredicate *		const SCEVPredicate *
getWrapPredicate(const SCEVAddRecExpr *AR,		getWrapPredicate(const SCEVAddRecExpr *AR,
SCEVWrapPredicate::IncrementWrapFlags AddedFlags);		SCEVWrapPredicate::IncrementWrapFlags AddedFlags);

/// Re-writes the SCEV according to the Predicates in \p A.		/// Re-writes the SCEV according to the Predicates in \p A.
const SCEV *rewriteUsingPredicate(const SCEV *S, const Loop *L,		const SCEV *rewriteUsingPredicate(const SCEV *S, const Loop *L,
SCEVUnionPredicate &A);		SCEVUnionPredicate &A);
/// Tries to convert the \p S expression to an AddRec expression,		/// Tries to convert the \p S expression to an AddRec expression,
/// adding additional predicates to \p Preds as required.		/// adding additional predicates to \p Preds as required.
const SCEVAddRecExpr *convertSCEVToAddRecWithPredicates(		const SCEVAddRecExpr *convertSCEVToAddRecWithPredicates(
const SCEV *S, const Loop *L,		const SCEV *S, const Loop *L,
SmallPtrSetImpl<const SCEVPredicate *> &Preds);		SmallPtrSetImpl<const SCEVPredicate *> &Preds);

private:		private:
		/// Similar to createAddRecFromPHI, but with the additional flexibility of
		/// suggesting runtime overflow checks in case casts are encountered.
		/// If successful, the analysis records that for this loop, \p SymbolicPHI,
		/// which is the UnknownSCEV currently representing the PHI, can be rewritten
		/// into an AddRec, assuming some predicates; The function then returns the
		/// AddRec and the predicates as a pair, and caches this pair in
		/// PredicatedSCEVRewrites.
		/// If the analysis is not successful, a mapping from the \p SymbolicPHI to
		/// itself (with no predicates) is recorded, and a nullptr with an empty
		/// predicates vector is returned as a pair.
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		createAddRecFromPHIWithCastsImpl(const SCEVUnknown *SymbolicPHI);

/// Compute the backedge taken count knowing the interval difference, the		/// Compute the backedge taken count knowing the interval difference, the
/// stride and presence of the equality in the comparison.		/// stride and presence of the equality in the comparison.
const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,		const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,
bool Equality);		bool Equality);

/// Verify if an linear IV with positive stride can overflow when in a		/// Verify if an linear IV with positive stride can overflow when in a
/// less-than comparison, knowing the invariant term of the comparison,		/// less-than comparison, knowing the invariant term of the comparison,
/// the stride and the knowledge of NSW/NUW flags on the recurrence.		/// the stride and the knowledge of NSW/NUW flags on the recurrence.
[14 lines not shown]	private:
const SCEV *getOrCreateMulExpr(SmallVectorImpl<const SCEV *> &Ops,		const SCEV *getOrCreateMulExpr(SmallVectorImpl<const SCEV *> &Ops,
SCEV::NoWrapFlags Flags);		SCEV::NoWrapFlags Flags);

private:		private:
FoldingSet<SCEV> UniqueSCEVs;		FoldingSet<SCEV> UniqueSCEVs;
FoldingSet<SCEVPredicate> UniquePreds;		FoldingSet<SCEVPredicate> UniquePreds;
BumpPtrAllocator SCEVAllocator;		BumpPtrAllocator SCEVAllocator;

		/// Cache tentative mappings from UnknownSCEVs in a Loop, to a SCEV expression
		/// they can be rewritten into under certain predicates.
		DenseMap<std::pair<const SCEVUnknown *, const Loop *>,
		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		PredicatedSCEVRewrites;

/// The head of a linked list of all SCEVUnknown values that have been		/// The head of a linked list of all SCEVUnknown values that have been
/// allocated. This is used by releaseMemory to locate them all and call		/// allocated. This is used by releaseMemory to locate them all and call
/// their destructors.		/// their destructors.
SCEVUnknown *FirstUnknown;		SCEVUnknown *FirstUnknown;
};		};

/// Analysis pass that exposes the \c ScalarEvolution for a function.		/// Analysis pass that exposes the \c ScalarEvolution for a function.
class ScalarEvolutionAnalysis		class ScalarEvolutionAnalysis
[132 lines not shown]

llvm/trunk/lib/Analysis/ScalarEvolution.cpp

[4,167 lines not shown]	static Optional<BinaryOp> MatchBinaryOp(Value *V, DominatorTree &DT) {

default:		default:
break;		break;
}		}

return None;		return None;
}		}

		/// Helper function to createAddRecFromPHIWithCasts. We have a phi
		/// node whose symbolic (unknown) SCEV is \p SymbolicPHI, which is updated via
		/// the loop backedge by a SCEVAddExpr, possibly also with a few casts on the
		/// way. This function checks if \p Op, an operand of this SCEVAddExpr,
		/// follows one of the following patterns:
		/// Op == (SExt ix (Trunc iy (%SymbolicPHI) to ix) to iy)
		/// Op == (ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy)
		/// If the SCEV expression of \p Op conforms with one of the expected patterns
		/// we return the type of the truncation operation, and indicate whether the
		/// truncated type should be treated as signed/unsigned by setting
		/// \p Signed to true/false, respectively.
		static Type *isSimpleCastedPHI(const SCEV *Op, const SCEVUnknown *SymbolicPHI,
		bool &Signed, ScalarEvolution &SE) {

		// The case where Op == SymbolicPHI (that is, with no type conversions on
		// the way) is handled by the regular add recurrence creating logic and
		// would have already been triggered in createAddRecFromPHI. Reaching it here
		// means that createAddRecFromPHI had failed for this PHI before (e.g.,
		// because one of the other operands of the SCEVAddExpr updating this PHI is
		// not invariant).
		//
		// Here we look for the case where Op = (ext(trunc(SymbolicPHI))), and in
		// this case predicates that allow us to prove that Op == SymbolicPHI will
		// be added.
		if (Op == SymbolicPHI)
		return nullptr;

		unsigned SourceBits = SE.getTypeSizeInBits(SymbolicPHI->getType());
		unsigned NewBits = SE.getTypeSizeInBits(Op->getType());
		if (SourceBits != NewBits)
		return nullptr;

		const SCEVSignExtendExpr *SExt = dyn_cast<SCEVSignExtendExpr>(Op);
		const SCEVZeroExtendExpr *ZExt = dyn_cast<SCEVZeroExtendExpr>(Op);
		if (!SExt && !ZExt)
		return nullptr;
		const SCEVTruncateExpr *Trunc =
		SExt ? dyn_cast<SCEVTruncateExpr>(SExt->getOperand())
		: dyn_cast<SCEVTruncateExpr>(ZExt->getOperand());
		if (!Trunc)
		return nullptr;
		const SCEV *X = Trunc->getOperand();
		if (X != SymbolicPHI)
		return nullptr;
		Signed = SExt ? true : false;
		return Trunc->getType();
		}

		static const Loop *isIntegerLoopHeaderPHI(const PHINode *PN, LoopInfo &LI) {
		if (!PN->getType()->isIntegerTy())
		return nullptr;
		const Loop *L = LI.getLoopFor(PN->getParent());
		if (!L || L->getHeader() != PN->getParent())
		return nullptr;
		return L;
		}

		// Analyze \p SymbolicPHI, a SCEV expression of a phi node, and check if the
		// computation that updates the phi follows the following pattern:
		// (SExt/ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum
		// which correspond to a phi->trunc->sext/zext->add->phi update chain.
		// If so, try to see if it can be rewritten as an AddRecExpr under some
		// Predicates. If successful, return them as a pair. Also cache the results
		// of the analysis.
		//
		// Example usage scenario:
		// Say the Rewriter is called for the following SCEV:
		// 8 * ((sext i32 (trunc i64 %X to i32) to i64) + %Step)
		// where:
		// %X = phi i64 (%Start, %BEValue)
		// It will visitMul->visitAdd->visitSExt->visitTrunc->visitUnknown(%X),
		// and call this function with %SymbolicPHI = %X.
		//
		// The analysis will find that the value coming around the backedge has
		// the following SCEV:
		// BEValue = ((sext i32 (trunc i64 %X to i32) to i64) + %Step)
		// Upon concluding that this matches the desired pattern, the function
		// will return the pair {NewAddRec, SmallPredsVec} where:
		// NewAddRec = {%Start,+,%Step}
		// SmallPredsVec = {P1, P2, P3} as follows:
		// P1(WrapPred): AR: {trunc(%Start),+,(trunc %Step)}<nsw> Flags: <nssw>
		// P2(EqualPred): %Start == (sext i32 (trunc i64 %Start to i32) to i64)
		// P3(EqualPred): %Step == (sext i32 (trunc i64 %Step to i32) to i64)
		// The returned pair means that SymbolicPHI can be rewritten into NewAddRec
		// under the predicates {P1,P2,P3}.
		// This predicated rewrite will be cached in PredicatedSCEVRewrites:
		// PredicatedSCEVRewrites[{%X,L}] = {NewAddRec, {P1,P2,P3}}
		//
		// TODO's:
		//
		// 1) Extend the Induction descriptor to also support inductions that involve
		// casts: When needed (namely, when we are called in the context of the
		// vectorizer induction analysis), a Set of cast instructions will be
		// populated by this method, and provided back to isInductionPHI. This is
		// needed to allow the vectorizer to properly record them to be ignored by
		// the cost model and to avoid vectorizing them (otherwise these casts,
		// which are redundant under the runtime overflow checks, will be
		// vectorized, which can be costly).
		//
		// 2) Support additional induction/PHISCEV patterns: We also want to support
		// inductions where the sext-trunc / zext-trunc operations (partly) occur
		// after the induction update operation (the induction increment):
		//
		// (Trunc iy (SExt/ZExt ix (%SymbolicPHI + InvariantAccum) to iy) to ix)
		// which correspond to a phi->add->trunc->sext/zext->phi update chain.
		//
		// (Trunc iy ((SExt/ZExt ix (%SymbolicPhi) to iy) + InvariantAccum) to ix)
		// which correspond to a phi->trunc->add->sext/zext->phi update chain.
		//
		// 3) Outline common code with createAddRecFromPHI to avoid duplication.
		//
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		ScalarEvolution::createAddRecFromPHIWithCastsImpl(const SCEVUnknown *SymbolicPHI) {
		SmallVector<const SCEVPredicate *, 3> Predicates;

		// *** Part1: Analyze if we have a phi-with-cast pattern for which we can
		// return an AddRec expression under some predicate.

		auto *PN = cast<PHINode>(SymbolicPHI->getValue());
		const Loop *L = isIntegerLoopHeaderPHI(PN, LI);
		assert (L && "Expecting an integer loop header phi");

		// The loop may have multiple entrances or multiple exits; we can analyze
		// this phi as an addrec if it has a unique entry value and a unique
		// backedge value.
		Value *BEValueV = nullptr, *StartValueV = nullptr;
		for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
		Value *V = PN->getIncomingValue(i);
		if (L->contains(PN->getIncomingBlock(i))) {
		if (!BEValueV) {
		BEValueV = V;
		} else if (BEValueV != V) {
		BEValueV = nullptr;
		break;
		}
		} else if (!StartValueV) {
		StartValueV = V;
		} else if (StartValueV != V) {
		StartValueV = nullptr;
		break;
		}
		}
		if (!BEValueV || !StartValueV)
		return None;

		const SCEV *BEValue = getSCEV(BEValueV);

		// If the value coming around the backedge is an add with the symbolic
		// value we just inserted, possibly with casts that we can ignore under
		// an appropriate runtime guard, then we found a simple induction variable!
		const auto *Add = dyn_cast<SCEVAddExpr>(BEValue);
		if (!Add)
		return None;

		// If there is a single occurrence of the symbolic value, possibly
		// casted, replace it with a recurrence.
		unsigned FoundIndex = Add->getNumOperands();
		Type *TruncTy = nullptr;
		bool Signed;
		for (unsigned i = 0, e = Add->getNumOperands(); i != e; ++i)
		if ((TruncTy =
		isSimpleCastedPHI(Add->getOperand(i), SymbolicPHI, Signed, *this)))
		if (FoundIndex == e) {
		FoundIndex = i;
		break;
		}

		if (FoundIndex == Add->getNumOperands())
		return None;

		// Create an add with everything but the specified operand.
		SmallVector<const SCEV *, 8> Ops;
		for (unsigned i = 0, e = Add->getNumOperands(); i != e; ++i)
		if (i != FoundIndex)
		Ops.push_back(Add->getOperand(i));
		const SCEV *Accum = getAddExpr(Ops);

		// The runtime checks will not be valid if the step amount is
		// varying inside the loop.
		if (!isLoopInvariant(Accum, L))
		return None;


		// *** Part2: Create the predicates

		// Analysis was successful: we have a phi-with-cast pattern for which we
		// can return an AddRec expression under the following predicates:
		//
		// P1: A Wrap predicate that guarantees that Trunc(Start) + i*Trunc(Accum)
		// fits within the truncated type (does not overflow) for i = 0 to n-1.
		// P2: An Equal predicate that guarantees that
		// Start = (Ext ix (Trunc iy (Start) to ix) to iy)
		// P3: An Equal predicate that guarantees that
		// Accum = (Ext ix (Trunc iy (Accum) to ix) to iy)
		//
		// As we next prove, the above predicates guarantee that:
		// Start + i*Accum = (Ext ix (Trunc iy ( Start + i*Accum ) to ix) to iy)
		//
		//
		// More formally, we want to prove that:
		// Expr(i+1) = Start + (i+1) * Accum
		// = (Ext ix (Trunc iy (Expr(i)) to ix) to iy) + Accum
		//
		// Given that:
		// 1) Expr(0) = Start
		// 2) Expr(1) = Start + Accum
		// = (Ext ix (Trunc iy (Start) to ix) to iy) + Accum :: from P2
		// 3) Induction hypothesis (step i):
		// Expr(i) = (Ext ix (Trunc iy (Expr(i-1)) to ix) to iy) + Accum
		//
		// Proof:
		// Expr(i+1) =
		// = Start + (i+1)*Accum
		// = (Start + i*Accum) + Accum
		// = Expr(i) + Accum
		// = (Ext ix (Trunc iy (Expr(i-1)) to ix) to iy) + Accum + Accum
		// :: from step i
		//
		// = (Ext ix (Trunc iy (Start + (i-1)*Accum) to ix) to iy) + Accum + Accum
		//
		// = (Ext ix (Trunc iy (Start + (i-1)*Accum) to ix) to iy)
		// + (Ext ix (Trunc iy (Accum) to ix) to iy)
		// + Accum :: from P3
		//
		// = (Ext ix (Trunc iy ((Start + (i-1)*Accum) + Accum) to ix) to iy)
		// + Accum :: from P1: Ext(x)+Ext(y)=>Ext(x+y)
		//
		// = (Ext ix (Trunc iy (Start + i*Accum) to ix) to iy) + Accum
		// = (Ext ix (Trunc iy (Expr(i)) to ix) to iy) + Accum
		//
		// By induction, the same applies to all iterations 1<=i<n:
		//

		// Create a truncated addrec for which we will add a no overflow check (P1).
		const SCEV *StartVal = getSCEV(StartValueV);
		const SCEV *PHISCEV =
		getAddRecExpr(getTruncateExpr(StartVal, TruncTy),
		getTruncateExpr(Accum, TruncTy), L, SCEV::FlagAnyWrap);
		const auto *AR = cast<SCEVAddRecExpr>(PHISCEV);

		SCEVWrapPredicate::IncrementWrapFlags AddedFlags =
		Signed ? SCEVWrapPredicate::IncrementNSSW
		: SCEVWrapPredicate::IncrementNUSW;
		const SCEVPredicate *AddRecPred = getWrapPredicate(AR, AddedFlags);
		Predicates.push_back(AddRecPred);

		// Create the Equal Predicates P2,P3:
		auto AppendPredicate = [&](const SCEV *Expr) -> void {
		assert (isLoopInvariant(Expr, L) && "Expr is expected to be invariant");
		const SCEV *TruncatedExpr = getTruncateExpr(Expr, TruncTy);
		const SCEV *ExtendedExpr =
		Signed ? getSignExtendExpr(TruncatedExpr, Expr->getType())
		: getZeroExtendExpr(TruncatedExpr, Expr->getType());
		if (Expr != ExtendedExpr &&
		!isKnownPredicate(ICmpInst::ICMP_EQ, Expr, ExtendedExpr)) {
		const SCEVPredicate *Pred = getEqualPredicate(Expr, ExtendedExpr);
		DEBUG (dbgs() << "Added Predicate: " << *Pred);
		Predicates.push_back(Pred);
		}
		};

		AppendPredicate(StartVal);
		AppendPredicate(Accum);

		// *** Part3: Predicates are ready. Now go ahead and create the new addrec in
		// which the casts had been folded away. The caller can rewrite SymbolicPHI
		// into NewAR if it will also add the runtime overflow checks specified in
		// Predicates.
		auto *NewAR = getAddRecExpr(StartVal, Accum, L, SCEV::FlagAnyWrap);

		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>> PredRewrite =
		std::make_pair(NewAR, Predicates);
		// Remember the result of the analysis for this SCEV at this location.
		PredicatedSCEVRewrites[{SymbolicPHI, L}] = PredRewrite;
		return PredRewrite;
		}

		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		ScalarEvolution::createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI) {

		auto *PN = cast<PHINode>(SymbolicPHI->getValue());
		const Loop *L = isIntegerLoopHeaderPHI(PN, LI);
		if (!L)
		return None;

		// Check to see if we already analyzed this PHI.
		auto I = PredicatedSCEVRewrites.find({SymbolicPHI, L});
		if (I != PredicatedSCEVRewrites.end()) {
		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>> Rewrite =
		I->second;
		// Analysis was done before and failed to create an AddRec:
		if (Rewrite.first == SymbolicPHI)
		return None;
		// Analysis was done before and succeeded to create an AddRec under
		// a predicate:
		assert(isa<SCEVAddRecExpr>(Rewrite.first) && "Expected an AddRec");
		assert(!(Rewrite.second).empty() && "Expected to find Predicates");
		return Rewrite;
		}

		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		Rewrite = createAddRecFromPHIWithCastsImpl(SymbolicPHI);

		// Record in the cache that the analysis failed
		if (!Rewrite) {
		SmallVector<const SCEVPredicate *, 3> Predicates;
		PredicatedSCEVRewrites[{SymbolicPHI, L}] = {SymbolicPHI, Predicates};
		return None;
		}

		return Rewrite;
		}

/// A helper function for createAddRecFromPHI to handle simple cases.		/// A helper function for createAddRecFromPHI to handle simple cases.
///		///
/// This function tries to find an AddRec expression for the simplest (yet most		/// This function tries to find an AddRec expression for the simplest (yet most
/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).		/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).
/// If it fails, createAddRecFromPHI will use a more general, but slow,		/// If it fails, createAddRecFromPHI will use a more general, but slow,
/// technique for finding the AddRec expression.		/// technique for finding the AddRec expression.
const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,		const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,
Value *BEValueV,		Value *BEValueV,
[1,715 lines not shown]	auto RemoveLoopFromBackedgeMap =
BTCPos->second.clear();		BTCPos->second.clear();
Map.erase(BTCPos);		Map.erase(BTCPos);
}		}
};		};

RemoveLoopFromBackedgeMap(BackedgeTakenCounts);		RemoveLoopFromBackedgeMap(BackedgeTakenCounts);
RemoveLoopFromBackedgeMap(PredicatedBackedgeTakenCounts);		RemoveLoopFromBackedgeMap(PredicatedBackedgeTakenCounts);

		// Drop information about predicated SCEV rewrites for this loop.
		for (auto I = PredicatedSCEVRewrites.begin();
		I != PredicatedSCEVRewrites.end();) {
		std::pair<const SCEV *, const Loop *> Entry = I->first;
		if (Entry.second == L)
		PredicatedSCEVRewrites.erase(I++);
		else
		++I;
		}

// Drop information about expressions based on loop-header PHIs.		// Drop information about expressions based on loop-header PHIs.
SmallVector<Instruction *, 16> Worklist;		SmallVector<Instruction *, 16> Worklist;
PushLoopPHIs(L, Worklist);		PushLoopPHIs(L, Worklist);

SmallPtrSet<Instruction *, 8> Visited;		SmallPtrSet<Instruction *, 8> Visited;
while (!Worklist.empty()) {		while (!Worklist.empty()) {
Instruction *I = Worklist.pop_back_val();		Instruction *I = Worklist.pop_back_val();
if (!Visited.insert(I).second)		if (!Visited.insert(I).second)
[4,142 lines not shown]	: F(Arg.F), HasGuards(Arg.HasGuards), TLI(Arg.TLI), AC(Arg.AC), DT(Arg.DT),
LoopDispositions(std::move(Arg.LoopDispositions)),		LoopDispositions(std::move(Arg.LoopDispositions)),
LoopPropertiesCache(std::move(Arg.LoopPropertiesCache)),		LoopPropertiesCache(std::move(Arg.LoopPropertiesCache)),
BlockDispositions(std::move(Arg.BlockDispositions)),		BlockDispositions(std::move(Arg.BlockDispositions)),
UnsignedRanges(std::move(Arg.UnsignedRanges)),		UnsignedRanges(std::move(Arg.UnsignedRanges)),
SignedRanges(std::move(Arg.SignedRanges)),		SignedRanges(std::move(Arg.SignedRanges)),
UniqueSCEVs(std::move(Arg.UniqueSCEVs)),		UniqueSCEVs(std::move(Arg.UniqueSCEVs)),
UniquePreds(std::move(Arg.UniquePreds)),		UniquePreds(std::move(Arg.UniquePreds)),
SCEVAllocator(std::move(Arg.SCEVAllocator)),		SCEVAllocator(std::move(Arg.SCEVAllocator)),
		PredicatedSCEVRewrites(std::move(Arg.PredicatedSCEVRewrites)),
FirstUnknown(Arg.FirstUnknown) {		FirstUnknown(Arg.FirstUnknown) {
Arg.FirstUnknown = nullptr;		Arg.FirstUnknown = nullptr;
}		}

ScalarEvolution::~ScalarEvolution() {		ScalarEvolution::~ScalarEvolution() {
// Iterate through all the SCEVUnknown instances and call their		// Iterate through all the SCEVUnknown instances and call their
// destructors, so that they release their references to their values.		// destructors, so that they release their references to their values.
for (SCEVUnknown *U = FirstUnknown; U;) {		for (SCEVUnknown *U = FirstUnknown; U;) {
[384 lines not shown]	void ScalarEvolution::forgetMemoizedResults(const SCEV *S) {
LoopDispositions.erase(S);		LoopDispositions.erase(S);
BlockDispositions.erase(S);		BlockDispositions.erase(S);
UnsignedRanges.erase(S);		UnsignedRanges.erase(S);
SignedRanges.erase(S);		SignedRanges.erase(S);
ExprValueMap.erase(S);		ExprValueMap.erase(S);
HasRecMap.erase(S);		HasRecMap.erase(S);
MinTrailingZerosCache.erase(S);		MinTrailingZerosCache.erase(S);

		for (auto I = PredicatedSCEVRewrites.begin();
		I != PredicatedSCEVRewrites.end();) {
		std::pair<const SCEV *, const Loop *> Entry = I->first;
		if (Entry.first == S)
		PredicatedSCEVRewrites.erase(I++);
		else
		++I;
		}

auto RemoveSCEVFromBackedgeMap =		auto RemoveSCEVFromBackedgeMap =
[S, this](DenseMap<const Loop *, BackedgeTakenInfo> &Map) {		[S, this](DenseMap<const Loop *, BackedgeTakenInfo> &Map) {
for (auto I = Map.begin(), E = Map.end(); I != E;) {		for (auto I = Map.begin(), E = Map.end(); I != E;) {
BackedgeTakenInfo &BEInfo = I->second;		BackedgeTakenInfo &BEInfo = I->second;
if (BEInfo.hasOperand(S, this)) {		if (BEInfo.hasOperand(S, this)) {
BEInfo.clear();		BEInfo.clear();
Map.erase(I++);		Map.erase(I++);
} else		} else
[143 lines not shown]
void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {		void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequiredTransitive<AssumptionCacheTracker>();		AU.addRequiredTransitive<AssumptionCacheTracker>();
AU.addRequiredTransitive<LoopInfoWrapperPass>();		AU.addRequiredTransitive<LoopInfoWrapperPass>();
AU.addRequiredTransitive<DominatorTreeWrapperPass>();		AU.addRequiredTransitive<DominatorTreeWrapperPass>();
AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();		AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
}		}

const SCEVPredicate *		const SCEVPredicate *ScalarEvolution::getEqualPredicate(const SCEV *LHS,
ScalarEvolution::getEqualPredicate(const SCEVUnknown *LHS,		const SCEV *RHS) {
const SCEVConstant *RHS) {
FoldingSetNodeID ID;		FoldingSetNodeID ID;
		assert(LHS->getType() == RHS->getType() &&
		"Type mismatch between LHS and RHS");
// Unique this node based on the arguments		// Unique this node based on the arguments
ID.AddInteger(SCEVPredicate::P_Equal);		ID.AddInteger(SCEVPredicate::P_Equal);
ID.AddPointer(LHS);		ID.AddPointer(LHS);
ID.AddPointer(RHS);		ID.AddPointer(RHS);
void *IP = nullptr;		void *IP = nullptr;
if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))		if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))
return S;		return S;
SCEVEqualPredicate *Eq = new (SCEVAllocator)		SCEVEqualPredicate *Eq = new (SCEVAllocator)
[46 lines not shown]	public:
const SCEV *visitUnknown(const SCEVUnknown *Expr) {		const SCEV *visitUnknown(const SCEVUnknown *Expr) {
if (Pred) {		if (Pred) {
auto ExprPreds = Pred->getPredicatesForExpr(Expr);		auto ExprPreds = Pred->getPredicatesForExpr(Expr);
for (auto *Pred : ExprPreds)		for (auto *Pred : ExprPreds)
if (const auto *IPred = dyn_cast<SCEVEqualPredicate>(Pred))		if (const auto *IPred = dyn_cast<SCEVEqualPredicate>(Pred))
if (IPred->getLHS() == Expr)		if (IPred->getLHS() == Expr)
return IPred->getRHS();		return IPred->getRHS();
}		}
		return convertToAddRecWithPreds(Expr);
return Expr;
}		}

const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {		const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
const SCEV *Operand = visit(Expr->getOperand());		const SCEV *Operand = visit(Expr->getOperand());
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);
if (AR && AR->getLoop() == L && AR->isAffine()) {		if (AR && AR->getLoop() == L && AR->isAffine()) {
// This couldn't be folded because the operand didn't have the nuw		// This couldn't be folded because the operand didn't have the nuw
// flag. Add the nusw flag as an assumption that we could make.		// flag. Add the nusw flag as an assumption that we could make.
[19 lines not shown]	if (AR && AR->getLoop() == L && AR->isAffine()) {
return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),		return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),
SE.getSignExtendExpr(Step, Ty), L,		SE.getSignExtendExpr(Step, Ty), L,
AR->getNoWrapFlags());		AR->getNoWrapFlags());
}		}
return SE.getSignExtendExpr(Operand, Expr->getType());		return SE.getSignExtendExpr(Operand, Expr->getType());
}		}

private:		private:
bool addOverflowAssumption(const SCEVAddRecExpr *AR,		bool addOverflowAssumption(const SCEVPredicate *P) {
SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
auto *A = SE.getWrapPredicate(AR, AddedFlags);
if (!NewPreds) {		if (!NewPreds) {
// Check if we've already made this assumption.		// Check if we've already made this assumption.
return Pred && Pred->implies(A);		return Pred && Pred->implies(P);
}		}
NewPreds->insert(A);		NewPreds->insert(P);
return true;		return true;
}		}

		bool addOverflowAssumption(const SCEVAddRecExpr *AR,
		SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
		auto *A = SE.getWrapPredicate(AR, AddedFlags);
		return addOverflowAssumption(A);
		}

		// If \p Expr represents a PHINode, we try to see if it can be represented
		// as an AddRec, possibly under a predicate (PHISCEVPred). If it is possible
		// to add this predicate as a runtime overflow check, we return the AddRec.
		// If \p Expr does not meet these conditions (is not a PHI node, or we
		// couldn't create an AddRec for it, or couldn't add the predicate), we just
		// return \p Expr.
		const SCEV *convertToAddRecWithPreds(const SCEVUnknown *Expr) {
		if (!isa<PHINode>(Expr->getValue()))
		return Expr;
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		PredicatedRewrite = SE.createAddRecFromPHIWithCasts(Expr);
		if (!PredicatedRewrite)
		return Expr;
		for (auto *P : PredicatedRewrite->second){
		if (!addOverflowAssumption(P))
		return Expr;
		}
		return PredicatedRewrite->first;
		}

SmallPtrSetImpl<const SCEVPredicate *> *NewPreds;		SmallPtrSetImpl<const SCEVPredicate *> *NewPreds;
SCEVUnionPredicate *Pred;		SCEVUnionPredicate *Pred;
const Loop *L;		const Loop *L;
};		};
} // end anonymous namespace		} // end anonymous namespace

const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *S, const Loop *L,		const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *S, const Loop *L,
SCEVUnionPredicate &Preds) {		SCEVUnionPredicate &Preds) {
[20 unchanged lines not shown]
}		}

/// SCEV predicates		/// SCEV predicates
SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,		SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,
SCEVPredicateKind Kind)		SCEVPredicateKind Kind)
: FastID(ID), Kind(Kind) {}		: FastID(ID), Kind(Kind) {}

SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,		SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,
const SCEVUnknown *LHS,		const SCEV *LHS, const SCEV *RHS)
const SCEVConstant *RHS)		: SCEVPredicate(ID, P_Equal), LHS(LHS), RHS(RHS) {
: SCEVPredicate(ID, P_Equal), LHS(LHS), RHS(RHS) {}		assert(LHS->getType() == RHS->getType() && "LHS and RHS types don't match");
		assert(LHS != RHS && "LHS and RHS are the same SCEV");
		}

bool SCEVEqualPredicate::implies(const SCEVPredicate *N) const {		bool SCEVEqualPredicate::implies(const SCEVPredicate *N) const {
const auto *Op = dyn_cast<SCEVEqualPredicate>(N);		const auto *Op = dyn_cast<SCEVEqualPredicate>(N);

if (!Op)		if (!Op)
return false;		return false;

return Op->LHS == LHS && Op->RHS == RHS;		return Op->LHS == LHS && Op->RHS == RHS;
[250 unchanged lines not shown]

llvm/trunk/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll

				; RUN: opt -S -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 < %s 2>&1 | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				; Check that the vectorizer identifies the %p.09 phi,
				; as an induction variable, despite the potential overflow
				; due to the truncation from 32bit to 8bit.
				; SCEV will detect the pattern "sext(trunc(%p.09)) + %step"
				; and generate the required runtime checks under which
				; we can assume no overflow. We check here that we generate
				; exactly two runtime checks:
				; 1) an overflow check:
				; {0,+,(trunc i32 %step to i8)}<%for.body> Added Flags: <nssw>
				; 2) an equality check verifying that the step of the induction
				; is equal to sext(trunc(step)):
				; Equal predicate: %step == (sext i8 (trunc i32 %step to i8) to i32)
				;
				; See also pr30654.
				;
				; int a[N];
				; void doit1(int n, int step) {
				; int i;
				; char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;

				; CHECK-LABEL: @doit1
				; CHECK: vector.scevcheck
				; CHECK: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: %[[TEST:[0-9]+]] = or i1 {{.*}}, %mul.overflow
				; CHECK: %[[NTEST:[0-9]+]] = or i1 false, %[[TEST]]
				; CHECK: %ident.check = icmp ne i32 {{.*}}, %{{.*}}
				; CHECK: %{{.*}} = or i1 %[[NTEST]], %ident.check
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: vector.body:
				; CHECK: <4 x i32>

				@a = common local_unnamed_addr global [250 x i32] zeroinitializer, align 16

				; Function Attrs: norecurse nounwind uwtable
				define void @doit1(i32 %n, i32 %step) local_unnamed_addr {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.09 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%sext = shl i32 %p.09, 24
				%conv = ashr exact i32 %sext, 24
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}

				; Same as above, but for checking the SCEV "zext(trunc(%p.09)) + %step".
				; Here we expect the following two predicates to be added for runtime checking:
				; 1) {0,+,(trunc i32 %step to i8)}<%for.body> Added Flags: <nusw>
				; 2) Equal predicate: %step == (zext i8 (trunc i32 %step to i8) to i32)
				;
				; int a[N];
				; void doit2(int n, int step) {
				; int i;
				; unsigned char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;

				; CHECK-LABEL: @doit2
				; CHECK: vector.scevcheck
				; CHECK: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: %[[TEST:[0-9]+]] = or i1 {{.*}}, %mul.overflow
				; CHECK: %[[NTEST:[0-9]+]] = or i1 false, %[[TEST]]
				; CHECK: %ident.check = icmp ne i32 {{.*}}, %{{.*}}
				; CHECK: %{{.*}} = or i1 %[[NTEST]], %ident.check
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: vector.body:
				; CHECK: <4 x i32>

				; Function Attrs: norecurse nounwind uwtable
				define void @doit2(i32 %n, i32 %step) local_unnamed_addr {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.09 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%conv = and i32 %p.09, 255
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}

				; Here we check that the same phi scev analysis would fail
				; to create the runtime checks because the step is not invariant.
				; As a result vectorization will fail.
				;
				; int a[N];
				; void doit3(int n, int step) {
				; int i;
				; char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; step += 2;
				; }
				; }
				;

				; CHECK-LABEL: @doit3
				; CHECK-NOT: vector.scevcheck
				; CHECK-NOT: vector.body:
				; CHECK-LABEL: for.body:

				; Function Attrs: norecurse nounwind uwtable
				define void @doit3(i32 %n, i32 %step) local_unnamed_addr {
				entry:
				%cmp9 = icmp sgt i32 %n, 0
				br i1 %cmp9, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.012 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%step.addr.010 = phi i32 [ %add3, %for.body ], [ %step, %for.body.preheader ]
				%sext = shl i32 %p.012, 24
				%conv = ashr exact i32 %sext, 24
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step.addr.010
				%add3 = add nsw i32 %step.addr.010, 2
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}


				; Lastly, we also check the case where we can tell at compile time that
				; the step of the induction is equal to sext(trunc(step)), in which case
				; we don't have to check this equality at runtime (we only need the
				; runtime overflow check). Therefore only the following overflow predicate
				; will be added for runtime checking:
				; {0,+,%cstep}<%for.body> Added Flags: <nssw>
				;
				; a[N];
				; void doit4(int n, char cstep) {
				; int i;
				; char p = 0;
				; int istep = cstep;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + istep;
				; }
				; }

				; CHECK-LABEL: @doit4
				; CHECK: vector.scevcheck
				; CHECK: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: %{{.*}} = or i1 {{.*}}, %mul.overflow
				; CHECK-NOT: %ident.check = icmp ne i32 {{.*}}, %{{.*}}
				; CHECK-NOT: %{{.*}} = or i1 %{{.*}}, %ident.check
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: vector.body:
				; CHECK: <4 x i32>

				; Function Attrs: norecurse nounwind uwtable
				define void @doit4(i32 %n, i8 signext %cstep) local_unnamed_addr {
				entry:
				%conv = sext i8 %cstep to i32
				%cmp10 = icmp sgt i32 %n, 0
				br i1 %cmp10, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.011 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%sext = shl i32 %p.011, 24
				%conv2 = ashr exact i32 %sext, 24
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv2, i32* %arrayidx, align 4
				%add = add nsw i32 %conv2, %conv
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
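
For readers of the doit1 test, here is a minimal standalone C++ sketch (not part of the patch) of why the two runtime predicates are enough to treat the phi as the AddRec {0,+,step}. The helper names sextTrunc8, predicatesHold and matchesAddRec are hypothetical and are not LLVM APIs. The wrap check is a simplified closed-form model; it is not the code that SCEV expansion actually emits in vector.scevcheck.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models sext(trunc(X to i8) to i32).
int64_t sextTrunc8(int64_t X) {
  uint8_t Low = static_cast<uint8_t>(X); // keep only the low 8 bits
  return Low < 128 ? Low : static_cast<int64_t>(Low) - 256;
}

// Simplified model of the two predicates for a loop running N iterations:
//   1) Equal predicate: step == sext(trunc(step))
//   2) Wrap predicate:  {0,+,trunc(step)} stays within i8 on every iteration
bool predicatesHold(int64_t N, int64_t Step) {
  if (Step != sextTrunc8(Step))
    return false;
  if (N <= 0)
    return true; // the loop body never runs
  // The sequence i * Step is monotone, so checking the last value suffices.
  int64_t Last = (N - 1) * Step;
  return Last >= -128 && Last <= 127;
}

// Runs the doit1 recurrence and reports whether the value stored on every
// iteration equals the affine value 0 + I * Step.
bool matchesAddRec(int64_t N, int64_t Step) {
  int64_t P = 0; // models %p.09
  for (int64_t I = 0; I < N; ++I) {
    int64_t Conv = sextTrunc8(P); // models %conv
    if (Conv != I * Step)
      return false;
    P = Conv + Step; // models %add
  }
  return true;
}
```

In this model, whenever predicatesHold(N, Step) is true, matchesAddRec(N, Step) is true as well. That is the property that lets the vectorizer guard the vector loop with the two checks and fall back to the scalar loop otherwise.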