This is an archive of the discontinued LLVM Phabricator instance.

Differential D30041

[PSCEV] Create AddRec for Phis in cases of possible integer overflow, using runtime checks
ClosedPublic

Authored by dorit on Feb 16 2017, 5:25 AM.

Details

Reviewers

anemet
sanjoy
mssimpso
sbaranga

Commits

rGca4fd18ddcde: [PSCEV] Create AddRec for Phis in cases of possible integer overflow, using…
rL308299: [PSCEV] Create AddRec for Phis in cases of possible integer overflow,

Summary

Extend the SCEVPredicateRewriter to work a bit harder when it encounters an UnknownSCEV whose Value is a Phi node.
The goal is to build an AddRecurrence for Phi nodes whose update chain involves casts that can be ignored under the proper runtime overflow test.
This is the first step in addressing PR30654.
Next steps will improve upon it, as detailed in the comment in the body of the patch (under "TODO").

(BTW, if some of these steps seem critical, I am happy to include them in this first patch, provided the patch looks OK in principle; I don't want to build much upon a wrong direction.)

Diff Detail

Event Timeline

dorit created this revision. Feb 16 2017, 5:25 AM

Herald added a subscriber: mzolotukhin. Feb 16 2017, 5:25 AM

dorit added reviewers: sbaranga, sanjoy, anemet. Feb 16 2017, 5:27 AM

dorit added a reviewer: mssimpso.

Ayal added a subscriber: Ayal. Feb 20 2017, 1:48 AM

Ayal added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
10863–10864	dyn_cast >> isa
10863–10864	The attempt to createAddRecFromPHIWithCasts() involves introducing a new cast predicate, right? Worth guarding or at least commenting.
11005	exits >> latches
11025	Alternatively, you can bail out immediately and return nullptr when multiple distinct BackEdge/Start values are found. Then these checks should be asserts?
11046	This check seems redundant, as we're stopping on the first index found to be a phi or a simple casted phi, right? Simply break when found, and check if i == e afterwards, setting FoundIndex = i (if not).

delena added a subscriber: delena. Feb 20 2017, 1:53 AM

sbaranga added inline comments. Feb 20 2017, 10:35 AM

llvm/lib/Analysis/ScalarEvolution.cpp
10878	I know I said that this should be a NUSW on the bugzilla ticket, but I'm not that sure anymore. Whatever the case, this needs a comment explaining the choice.
11000	I think it would be better to move this to Scalar Evolution itself, instead of having it in the rewriter. It would essentially be a lazy analysis, and would also take as an argument the loop and would return the analyzed expression and a set of predicates. That way we don't have to do the analysis again for every instantiation of a SCEVPredicateRewriter.
11072	Why is it correct to add the NSW flag here? I'm worried that it's somehow implied by the predicates that we're adding.
11091	Same as above (why can we add the NSW flag here?).

Hi Ayal,
I agree with all your comments, and will incorporate your suggestions into the next upload. I just want to clear up the caching and NoWrapFlags issues that Silviu raised so I can include that in the next upload.
BTW, some of your comments refer to code that was copied over from getAddRecFromPHI, so I guess it would make sense to make the same changes there, where relevant. (I hope @sanjoy will have a chance to take a look and provide his input :-)).
Thanks!
Dorit

Hi Silviu,

About the NoWrap flags: I don't have very high confidence about this, I think it indeed needs to be corrected -- proposed fixes (and followup questions...) below.

I should mention that in getAddRecFromPHI (which this function is inspired by) the wrapping flags are determined by this piece of logic:

if (auto BO = MatchBinaryOp(BEValueV, DT)) {
  if (BO->Opcode == Instruction::Add && BO->LHS == PN) {
    if (BO->IsNUW)
      Flags = setFlags(Flags, SCEV::FlagNUW);
    if (BO->IsNSW)
      Flags = setFlags(Flags, SCEV::FlagNSW);
  }
} else if (GEPOperator *GEP = dyn_cast<GEPOperator>(BEValueV)) {
  // If the increment is an inbounds GEP, then we know the address
  // space cannot be wrapped around. We cannot make any guarantee
  // about signed or unsigned overflow because pointers are
  // unsigned but we may have a negative index from the base
  // pointer. We can guarantee that no unsigned wrap occurs if the
  // indices form a positive value.
  if (GEP->isInBounds() && GEP->getOperand(0) == PN) {
    Flags = setFlags(Flags, SCEV::FlagNW);

    const SCEV *Ptr = getSCEV(GEP->getPointerOperand());
    if (isKnownPositive(getMinusSCEV(getSCEV(GEP), Ptr)))
      Flags = setFlags(Flags, SCEV::FlagNUW);
  }

  // We cannot transfer nuw and nsw flags from subtraction
  // operations -- sub nuw X, Y is not the same as add nuw X, -Y
  // for instance.
}

I thought this was not needed/relevant in our scenario here because we determine the signedness of the flags according to whether we encountered a zext or a sext; but I also realize now that I didn't think about the GEP scenario at all, as I was thinking about integer inductions only. So in addition to the fixes below I will also add a check that we are dealing with integers, not pointers.

Many thanks,
Dorit

llvm/lib/Analysis/ScalarEvolution.cpp
10878	I actually don't see how we can tell whether to create an NUSW or NSSW assumption without further analysis, such as the analysis we do in getAddRecForPhiWithCasts…; So maybe in the general case we need to be conservative here and add both NSSW and NUSW to make sure that there will be no kind of overflow due to the truncation?
11000	So where/when would this analysis be triggered? And where would we cache the result of the ScalarEvolution analysis? could you please elaborate on what would be the flow of things you are suggesting / what exactly you mean by lazy here…? I agree of course we should avoid repeating the analysis over and over again; the caching of the analysis that I was going to add, along with guarding the analysis with "if(NewPreds)", would guarantee that if the analysis succeeded once (when NewPreds was passed) then any time we are passed "Preds" we will not repeat the analysis (because I was going to add a new kind of Predicate, and cache things in Preds). But indeed if the analysis fails, this caching will not prevent us from repeating it (and failing) every time we are passed NewPreds… So maybe what this means is that the caching should be done in a new data-structure to be added to PSCEV, separately of Preds, where we would cache both a failure -- simply associate the UnknownSCEV of the phi node with itself, and a success -- associate the UnknownSCEV with the respective AddRec (and of course this AddRec would itself have a Predicate that the analysis will have already added). This way we could check the results of the analysis regardless of whether we are passed Preds. If we do that, would your suggestion above still be relevant?
11072	Not sure about this, and looking at this again I can't justify SCEV::FlagNSW here. Probably SCEV::FlagAnyWrap is all we can do here (as without the predicate we know nothing because of the truncate). Right?
11091	Again, not sure about this. I thought we can put here what the predicate guarantees. So if we added an NSSW assumption we could set the NoWrapFlags to SCEV::FlagNSW (right?). Originally I only looked at the Sext pattern and that's why I put the NSW Flag. Then I extended the analysis to also consider the Zext pattern, but didn't go back to fix the flag. So if we added an NUSW predicate, would it be correct to set the flags to SCEV::FlagNUW? (NUSW and NUW don't have the same semantics…). Maybe SCEV::FlagNW makes most sense in that case?

Silviu, would you please clarify your comment about moving the analysis to Scalar Evolution (please see my question above)? and why is it better than just caching things in the rewriter (possibly via a new data-structure in PSCEV or via Preds)? I would like to address your comments and upload a revised patch... thanks a lot,
Dorit

Hi Dorit,

Sorry for the delay. I'm on holiday until next week so communication will be slow until I get back.

Thanks,
Silviu

llvm/lib/Analysis/ScalarEvolution.cpp
10878	Adding just NUSW would work, but the problem would be that the predicate would fail at runtime often if the number is used as signed. Ideally we should find a solution that works in most cases.
11000	I was thinking of using the same mechanism that SCEV already has for caching SCEV expressions (and which it also uses to store SCEV predicates). Essentially, PSCEV would call SCEV from here and SCEV would check to see if it has already analyzed the node or not. If not, it would do this analysis and store the result (using the loop + the SCEVUnknown as keys for further lookups). As you've said, in case of failure we can just return the SCEVUnknown expression without any additional predicates. This would essentially be the same thing SCEV does for getSCEV().
11072	Sounds correct, we should drop the NSW flag.
11091	We can't add NSW/NUW on SCEV expressions if we infer them from SCEV predicates. The problem is doing so would essentially mean that the NUW/NSW are not predicated (which isn't true) and can technically lead us to false conclusions (we can even use the nsw/nuw flags to prove the original predicate, which is incorrect).

Thanks for responding while on holiday!

llvm/lib/Analysis/ScalarEvolution.cpp
10878	So Truncate may need to call the SE::getAddRecForPhiWithCasts analysis directly, so that it will have the signedness knowledge/context; in fact the analysis results already include the predicate that needs to be added (NUSW/NSSW).
11000	So just making sure: Currently SCEV caches things in the ValueExprMap. Are you suggesting to add a new member to SCEV, to store the result of the analysis? (where the result of the analysis is: "the SCEVUnknown %x in loop L can be rewritten to the AddRec '(0,+,%step)' if the Predicate 'Flags=NSSW, AR=(0,+,trunc(%step))' is added"?)

sbaranga added inline comments. Mar 9 2017, 2:35 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11000	I think the result of the analysis should be "for loop L the loop-variant SCEVUnknown can be re-written to another SCEV, given a set of predicates", so the analysis gives you both the SCEV and a set of predicates. This should be general enough to use in more cases. I would be happy with either reusing the same ValueExprMap for storage, or adding another ValueExprMap (they would probably both work). In general I'm not too picky about this as long as it is sensible (although @sanjoy will probably have something to say).

dorit added inline comments. Mar 16 2017, 10:05 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11000	SCEV's ValueExprMap maps a Value to a SCEVExpr; and we want to map a <UnknownSCEV, Loop> pair to a <SCEVExpr, setOfPredicates> pair. Are we talking about the same ValueExprMap? Also, you commented earlier that SCEV uses the ValueExprMap to also cache SCEV predicates; Could you please point me to where? (I thought that predicates and predicate-based rewrites were stored only in PSCEV's Preds and RewriteMap…) ? Thanks

sbaranga added inline comments. Mar 16 2017, 10:29 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11000	Sorry, I should have looked at the code earlier. What I meant was adding a FoldingSet, like the ones used by ScalarEvolution to store SCEVs and Preds (see UniqueSCEVs and UniquePreds). In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution. It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value.

dorit added inline comments. Mar 17 2017, 5:57 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11000	"In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution." Isn't it more natural to hold the mapping from the UnknownSCEV+Loop to the Predicate+AddRecExpr in a map like this: DenseMap<std::pair<const Loop *, const SCEV *>, std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 2>>>? "It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value." Wait, why would we be replacing uses here? We will be recording here just a tentative mapping, which will be valid only if the PSCEV caller decides to actually add these predicates and SCEV rewrites to its Preds and RewriteMap. I'll upload a new patch along these lines next week (but if something in the above sounds wrong please shout!)

sbaranga added inline comments. Mar 21 2017, 7:20 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11000	Using DenseMap should be ok as well. Regarding replacing uses: we need to handle this case in order to properly cache the result of the analysis (because passes that use SCEV can replace uses). This should be fine however if we use SCEVs instead of Values. So this shouldn't be a problem with the DenseMap that you want to use.

Hi Silviu,

The new revision addresses your comments and implements two of the TODO items:

it caches the results of the analysis in a new map in ScalarEvolution (as per your suggestion; thanks!).
it provides a bit of context to visitTruncate so we'd know which overflow check to create (signed or unsigned).

About the cache: For now, I didn't define it to hold a set of predicates but just a single SCEVWrapPredicate per item; I was wondering if maybe we want a SCEVUnionPredicate instead of a Set of predicates? In any case, maybe better leave that generalization for when the need arises?

I have not yet implemented the code style comments by Ayal that relate to the code I copied from createAddRecFromPHI; These should be fixed in both functions, and I want to try to see if I can outline some common pieces between these functions to avoid duplication as much as possible.
I will go ahead and look at that next, and upload a new revision later on. Or in a separate patch if you prefer?
(only expected changes are NFC stuff around createAddRecFromPHI/ createAddRecFromPHIWithCasts; In terms of functionality - the patch is ready for review :-))

Sure, this is fine with me. It's indeed best to address style comments separately.

Great.

In that case, I have no further updates to this revision at this point.

Any further comments, anyone? Silviu, does this address all your comments?

Hi Dorit,

Sorry for the delayed review. I have some more comments.

Regarding testing: it would be nicer to add some LAA tests, since the LAA analysis results will print the added predicates and the SCEV expressions for the bounds.

Thanks,
Silviu

llvm/include/llvm/Analysis/ScalarEvolution.h
1722	The data in this map should be also transferred in ScalarEvolution::ScalarEvolution(ScalarEvolution &&Arg). We should also remove mappings in forgetLoop(). The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?
llvm/lib/Analysis/ScalarEvolution.cpp
4086	Can this be a static function?
4225	It would be nice to have a comment here saying that this works because we're going to add a SCEV predicate to prove that SymbolicPHI == Add->getOperand(i).
4262	I think the contract should be to return a SCEV expression with the same type as what getSCEV would return for the phi node. It's also probably better to not mention the vectorizer here (since more passes will end up running this code anyway).
10868	Now that I'm having a fresh look, I think we need to revisit the conditions for the transformation here. If we want to do trunc({x,+,y}) -> {trunc(x), +, trunc(y)}, is this not always true?

dorit updated this revision to Diff 95137. Apr 13 2017, 8:44 AM

Regarding testing: it would be nicer to add some LAA tests, since the LAA analysis results will print the added predicates and the SCEV expressions for the bounds.

Until your patch in D17080 is committed my patch does not have an effect on loop-access-analysis PSE (only to loop-vectorizer's PSE)… So for now nothing is printed. I'm happy to add a loop-accesses analysis test as soon as your patch is committed.

Thanks very much for your comments,
Dorit

llvm/include/llvm/Analysis/ScalarEvolution.h
1722	"The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?" Changed to PredicatedSCEVRewrites (since we can rewrite the SCEVUnknowns into the SCEV on the RHS if we add the predicate). But if your suggestion is more intuitive to you I'll change to that.
llvm/lib/Analysis/ScalarEvolution.cpp
10868	Yes, I think you're right… This simplifies a few things… :-) so no need for a visitTruncateExpr case at all in this rewriter... the base class rewriter can handle the Truncate (without any predicates) (right?) Thanks!!

dorit added inline comments. Apr 14 2017, 10:07 PM

llvm/include/llvm/Analysis/ScalarEvolution.h
1722	"We should also remove mappings in forgetLoop()." Just noticed there's also a forgetMemoizedResults(S). Looks like we should be removing the mapping for S there too, right?

Remove mappings from PredicatedSCEVRewrites also in forgetMemoizedResults().
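
The invalidation discussed above (dropping cached rewrites when a loop or an expression is forgotten) can be sketched on a standard container. This is only an illustration under stated assumptions: `ExprId`, `LoopId`, `RewriteMap`, `forgetLoopRewrites` and `forgetExprRewrites` are hypothetical names, and `std::map` stands in for whatever map type the patch actually uses.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-ins: integers play the role of SCEV and Loop pointers.
using ExprId = int;
using LoopId = int;
using RewriteKey = std::pair<ExprId, LoopId>;
using RewriteMap = std::map<RewriteKey, ExprId>;

// Drop every cached rewrite recorded for loop L.
void forgetLoopRewrites(RewriteMap &Rewrites, LoopId L) {
  for (auto I = Rewrites.begin(); I != Rewrites.end();) {
    if (I->first.second == L)
      I = Rewrites.erase(I); // std::map::erase returns the next iterator
    else
      ++I;
  }
}

// Drop every cached rewrite whose key expression or result is S.
void forgetExprRewrites(RewriteMap &Rewrites, ExprId S) {
  for (auto I = Rewrites.begin(); I != Rewrites.end();) {
    if (I->first.first == S || I->second == S)
      I = Rewrites.erase(I);
    else
      ++I;
  }
}
```

Re-reading `end()` on every iteration and continuing from the iterator that `erase` returns avoids relying on any stale iterator; whether the same idiom applies to the container used in the patch depends on that container's invalidation rules.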

ping :-)
thanks,
Dorit

nitpick: you should run a spell checker over the patch.

This generally looks ok to me but since this is a complex change @sanjoy should approve it before being committed.

-Silviu

llvm/lib/Analysis/ScalarEvolution.cpp
4277	I think we need a more formal explanation here on why this predicate guarantees that this is an AddRecExpr. We're trying to prove that if we have:
SymbolicPHI = phi({Start, LoopHead}, {NextValue, LoopLatch})
NextValue = (Sext ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum
then SymbolicPHI = {Start,+,InvariantAccum}. At iteration 0 both values are equal to Start, so it's enough to prove that SymbolicPHI + InvariantAccum == (Sext (Trunc (SymbolicPHI))) + InvariantAccum, which should be true from the SCEV predicate.
10868	Correct, we shouldn't need to do anything here.

Some comments inline.

llvm/include/llvm/Analysis/ScalarEvolution.h
1263	s/SumbolicPHI/SymbolicPHI/
1266	s/cahced/cached/
llvm/lib/Analysis/ScalarEvolution.cpp
4099	This is very minor, but we usually spell these as `SExt` and `ZExt`, in keeping with camel case.
4138	s/rewritew/rewrite/
4169	Usually for out parameters like this, the type is `const SCEVPredicate *&`. But I'd prefer just returning an `std::pair`.
4181	Can you do `find({SymbolicPHI, L})`?
4194	`Pair` isn't clear. Please either name it something more specific, or (IMO better) use `{SymbolicPHI, L}` instead of `Pair`.
4195	Can you do `{SymbolicPHI, nullptr}` instead of `make_pair`?
4259	Do you also need to check that the recurrence is affine?
4273	s/specificed/specified/
4278	I'm not sure that this is correct. Say we had a loop with 4 iterations, `StartVal` was `i16 257`, `Accum` was `i16 1`, `TruncTy` was `i8` and the PHI was being zero extended. In that case, the value of the PHI node on the second iteration (i.e. after taking the backedge once) would be `(trunc 257) + 1` = `2`. However, despite `{(trunc 257),+,1}` = `{1,+,1}` not unsigned-overflowing in 4 iterations, `{257,+,1}` would not produce the correct values for the PHI node.
10634	Doesn't `PredicatedSCEVRewrites.erase(I++)` invalidate `E`?
10921	Use `auto *` for pointers.
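
Sanjoy's counterexample at line 4278 can be replayed with fixed-width integers. The sketch below is not code from the patch; `castedPhiValues` and `wideAddRecValues` are hypothetical helpers that model an i16 PHI whose backedge value is truncated to i8, zero-extended back to i16, and incremented by `Accum`.

```cpp
#include <cstdint>
#include <vector>

// Values taken by a PHI whose backedge value is
//   zext(trunc(PHI to i8) to i16) + Accum.
std::vector<uint16_t> castedPhiValues(uint16_t Start, uint16_t Accum,
                                      unsigned Iters) {
  std::vector<uint16_t> Vals;
  uint16_t Phi = Start;
  for (unsigned I = 0; I < Iters; ++I) {
    Vals.push_back(Phi);
    uint8_t Narrow = static_cast<uint8_t>(Phi);     // trunc i16 -> i8
    uint16_t Wide = static_cast<uint16_t>(Narrow);  // zext i8 -> i16
    Phi = static_cast<uint16_t>(Wide + Accum);      // add in i16
  }
  return Vals;
}

// Values of the plain wide recurrence {Start,+,Accum}.
std::vector<uint16_t> wideAddRecValues(uint16_t Start, uint16_t Accum,
                                       unsigned Iters) {
  std::vector<uint16_t> Vals;
  for (unsigned I = 0; I < Iters; ++I)
    Vals.push_back(static_cast<uint16_t>(Start + I * Accum));
  return Vals;
}
```

With `Start = 257`, `Accum = 1` and four iterations, the casted PHI takes the values 257, 2, 3, 4 while `{257,+,1}` takes 257, 258, 259, 260. The narrow recurrence `{1,+,1}` never wraps, so a no-wrap check on it alone does not catch the mismatch; the start value also has to survive the truncate/extend round trip.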

This revision now requires changes to proceed. Apr 24 2017, 11:02 PM

Hi Sanjoy, I will upload a new fixed version soon; just have two quick followup questions below. Thanks a lot!
Dorit

llvm/lib/Analysis/ScalarEvolution.cpp
4259	Oh, probably so… Do you think the isAffine check is also required in createAddRecFromPHI()? createAddRecFromPHIWithCasts() is basically a subset of createAddRecFromPHI, modified to consider also the sext-trunc cast pattern (the intention is later to try to factor out common parts as much as possible). I copied this check from there without thinking much…
4278	I see. So I guess we should be returning {(sext (trunc Start)),+,(sext (trunc Accum))} as the new AR, right...?

sanjoy added inline comments. Apr 25 2017, 10:11 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4259	I was suggesting the `isAffine` since IIRC (but please check) you cannot create a no wrap predicate on a non-affine add recurrence.
4278	Not sure how that will work -- won't `{(sext (trunc Start)),+,(sext (trunc Accum))}` be `{1,+,1}` and thus be `1` instead of `257` in the first iteration? I haven't thought this through, but I suspect you'll have to check that the start value fits in the narrower type or something like that. In any case, I agree with Silviu here -- whatever you go with, please justify why that is correct with a proof here.
10915	I'm also not a big fan of the name `analyzeUnknownSCEVPHI` here -- `analyze` does not mean anything specific, and the `UnknownSCEV` part is redundant. How about `convertToAddRecWithPreds`?

sbaranga added inline comments. Apr 25 2017, 10:40 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4278	Maybe have two extra SCEVEqualPredicates to test that (sext (trunc Start)) == Start and (sext (trunc Accum)) == Accum, and return {Start,+,Accum}? Before adding the extra predicates it would be worth testing whether SCEV can prove these properties statically first (for example for cases where either Start or Accum are constants).

dorit added inline comments. Apr 26 2017, 4:16 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4278	Good idea! Sanjoy, Silviu: Are you ok with extending the SCEVEqualPredicate to a non constant on the RHS? (I think currently only SCEV == constant is supported).

sbaranga added inline comments. Apr 26 2017, 9:10 AM

llvm/lib/Analysis/ScalarEvolution.cpp
4278	That's ok with me.

Main changes:

SCEVEqualPredicate can take any SCEV on LHS/RHS (instead of only SCEVUnknown/SCEVConstant )
We add three predicates instead of only one; so we now have pairs of <SCEV, SmallVector of predicates> (instead of <SCEV, single Predicate>)
Added a proof for why these predicates guarantee the correctness of the proposed rewrite
Addressed Style and Spelling comments

Thanks,
Dorit

(discovered typo in the documentation)

Ping :)

sbaranga added inline comments. May 22 2017, 3:20 PM

llvm/lib/Analysis/ScalarEvolution.cpp
4355	I originally had in mind testing the SCEV expressions for equality (Expr == ExtendedExpr) and checking that both are loop invariant. I had a look at the implementation of isKnownPredicate, and I'm not convinced this would work as expected. Can we add a regression test for this? I think we need to check anyway that both expressions are loop invariant before adding the predicate.

We now require that Accum is loop invariant.
Added a simple Expr == ExtendedExpr check, and only if it fails we call isKnownPredicate().
Extended the testcase:

• Check that we have both the overflow runtime check and the equality runtime check in doit1 and doit2
• Added doit3: where step is not invariant (not very interesting - we do nothing)
• Added doit4: where we can figure out at compile time that step == sext(trunc(step)). Here we check that we only have the overflow runtime check (without the equality runtime check).
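
The compile-time case (a step known to equal its own truncate/extend round trip, so no equality runtime check is needed) comes down to asking whether a value fits in the narrower type. A minimal sketch with i16 as the wide type and i8 as the narrow type; `sextTruncIsIdentity` and `zextTruncIsIdentity` are hypothetical helpers, not functions from the patch or from ScalarEvolution.

```cpp
#include <cstdint>

// True when sext(trunc(V to i8) to i16) == V.
bool sextTruncIsIdentity(int16_t V) {
  uint8_t Low = static_cast<uint8_t>(V); // trunc: keep the low 8 bits
  // sext: reinterpret the low byte as a signed 8-bit value.
  int16_t Ext = (Low & 0x80) ? static_cast<int16_t>(Low - 256)
                             : static_cast<int16_t>(Low);
  return Ext == V;
}

// True when zext(trunc(V to i8) to i16) == V.
bool zextTruncIsIdentity(uint16_t V) {
  uint8_t Low = static_cast<uint8_t>(V);  // trunc
  return static_cast<uint16_t>(Low) == V; // zext and compare
}
```

A step such as 1 passes both checks and would need no equality predicate, whereas a start value of 257 fails both, which is the situation the runtime equality check guards against.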

Thanks,
Dorit

dorit added inline comments. May 24 2017, 12:43 PM

llvm/lib/Analysis/ScalarEvolution.cpp
4355	Right. I was assuming Accum is invariant in the loop, forgot I was allowing it to not be invariant. Thanks.

ping?
thanks,
Dorit

Sorry for the delay, I missed the last update. I have a few minor suggestions, but otherwise I think it generally looks good.
I think Sanjoy still needs to approve this before it can go in.

Thanks
-Silviu

llvm/include/llvm/Analysis/ScalarEvolution.h
244	Ideally we would have something (not necessarily in SCEVEqualPredicate) to verify that these predicates can be checked. We should also be making the LHS->RHS substitution in the rewriter at some point (not necessarily in this change).
llvm/lib/Analysis/ScalarEvolution.cpp
4093	s/confirms/conforms
4279	This is a bit hard to follow with NewExpr, Expr, etc., and the initial problem statement should be simplified. I guess we just want to prove that Expr(i) = Start + i * Accum, given that (1) Expr(0) = Start and (2) Expr(i+1) = (Ext ix (Trunc iy (Expr(i)) to ix) to iy) + Accum.
4303	No text should be required for iteration 2; proving the induction step should be enough.
4325	It would be better to just use Ext and Trunc as separate operators instead of SExtTrunc.
llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
5	Could you add some text for each of these saying what predicates get added?
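
The simplified statement from the comment at line 4279 can be written out as a short induction. This is a sketch rather than the proof text from the patch. It assumes three runtime predicates of the kind discussed in this review: $P_1$ and $P_2$ are the equality predicates on Start and Accum, and $P_3$ is the no-wrap predicate on the narrow recurrence. The symbols $T$, $E$ and $P_1$ to $P_3$ are introduced here for illustration only, with $T$ the truncation from iy to ix and $E$ the extension from ix back to iy.

```latex
\begin{align*}
&\text{Given:}   && \mathrm{Expr}(0) = \mathit{Start}, \quad
                    \mathrm{Expr}(i+1) = E(T(\mathrm{Expr}(i))) + \mathit{Accum} \\
&(P_1)           && E(T(\mathit{Start})) = \mathit{Start} \\
&(P_2)           && E(T(\mathit{Accum})) = \mathit{Accum} \\
&(P_3)           && E\bigl(T(\mathit{Start}) + i \cdot T(\mathit{Accum})\bigr)
                    = E(T(\mathit{Start})) + i \cdot E(T(\mathit{Accum}))
                    \quad \text{for every iteration } i \\
&\text{Claim:}   && \mathrm{Expr}(i) = \mathit{Start} + i \cdot \mathit{Accum} \\
&\text{Base:}    && \mathrm{Expr}(0) = \mathit{Start} \\
&\text{Step:}    && \mathrm{Expr}(i+1)
                    = E\bigl(T(\mathit{Start} + i \cdot \mathit{Accum})\bigr) + \mathit{Accum}
                    \quad \text{(induction hypothesis)} \\
&                && \phantom{\mathrm{Expr}(i+1)}
                    = E\bigl(T(\mathit{Start}) + i \cdot T(\mathit{Accum})\bigr) + \mathit{Accum}
                    \quad \text{(truncation commutes with modular } + \text{ and } \cdot\text{)} \\
&                && \phantom{\mathrm{Expr}(i+1)}
                    = E(T(\mathit{Start})) + i \cdot E(T(\mathit{Accum})) + \mathit{Accum}
                    \quad \text{(by } P_3\text{)} \\
&                && \phantom{\mathrm{Expr}(i+1)}
                    = \mathit{Start} + (i+1) \cdot \mathit{Accum}
                    \quad \text{(by } P_1, P_2\text{)}
\end{align*}
```

Without $P_1$ the base of the rewritten recurrence can already differ from the PHI's first backedge value, which is what the `i16 257` counterexample earlier in the review shows.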

Thanks Silviu. I'll iron the comments following your suggestions.

llvm/include/llvm/Analysis/ScalarEvolution.h
244	Can you please clarify what check is missing? And where would be an appropriate place to add a TODO comment about the substitution?
llvm/lib/Analysis/ScalarEvolution.cpp
4303	I thought it's helpful to have an example step before the formal step. I'll drop it.

sbaranga added inline comments. Jun 14 2017, 9:03 AM

llvm/include/llvm/Analysis/ScalarEvolution.h
244	If we had a Loop * as a member of SCEVEqualPredicate, we could check inside the constructor that both LHS and RHS are invariant. However, we don't need the Loop for anything else so I'm not sure that would be the right solution. I guess since we already have an assert in AppendPredicate that should be enough for now.

Addressed Silviu's last comments on the documentation:

Added the predicates that are added for each of the loops in the testcase
In the proof:
- dropped all the introductory text and jump directly to the formal proof.
- expanded the short notation I was using (SExtTrunc) into the explicit notation (Ext ix (Trunc iy () to ix) to iy)

Herald added a subscriber: hiraditya. Jun 18 2017, 12:34 AM

@sbaranga, @sanjoy, ping :-)
thanks.
dorit

ping^2

thanks,
dorit

It looks ok to me, thanks!

-Silviu

@sanjoy, is the patch ok with you?

thanks,
Dorit

Mostly minor stuff.

llvm/include/llvm/Analysis/ScalarEvolution.h
245	Now that the type system does not guarantee this, how about adding an assert that `LHS != RHS`?
1721	Why not have the key type be `std::pair<const SCEVUnknown *, const Loop *>`?
llvm/lib/Analysis/ScalarEvolution.cpp
4097	s/const SCEV *SymbolicPHI/const SCEVUnknown *SymbolicPHI/
4114	How about doing this check as the very first thing (to avoid doing this extra work when it would have failed anyway)?
4190	IMO a slightly more idiomatic pattern (which would obviate the need for the `*** Part1` comment) is to have a `createAddRecFromPHIWithCasts` and a `createAddRecFromPHIWithCastsImpl`, where the first function checks `PredicatedSCEVRewrites` for an existing solution, and delegates to `createAddRecFromPHIWithCastsImpl` if we don't have a cached solution.
4204	Is this semantically important? That is, if you remove this and instead ensure that we always populate `PredicatedSCEVRewrites[{SymbolicPHI, L}]` before leaving this function, will we enter an infinite loop somewhere?
4246	Should we be getting here in the `Add->getOperand(i) == SymbolicPHI` case? Shouldn't the regular add recurrence creating logic have triggered?
4327	You should be able to `cast<>` here.

This revision now requires changes to proceed. Jul 11 2017, 4:04 PM

Addressing Sanjoy's comments.

Hi Sanjoy,
Thanks! Comments addressed,
Dorit

llvm/include/llvm/Analysis/ScalarEvolution.h
245	Added in the constructor.
1721	Absolutely right (that was the intention, but somehow it made it only to the comment... )
llvm/lib/Analysis/ScalarEvolution.cpp
4204	I added a clarification on the motivation for this early initialization. There is nothing semantic behind it. It is just to avoid having to initialize it upon every exit from the function (keep things a bit more compact, avoid the slight code duplication, make sure we don't forget it…). Is that ok with you with the clarification?
4246	yes, I would indeed expect it would have been caught already; added a comment here, and an assert in isSimpleCastedPHI.

lgtm with one nit

llvm/lib/Analysis/ScalarEvolution.cpp
4204	Given that you now have the `createAddRecFromPHIWithCastsImpl` split out, a cleaner way to achieve the same property would be to insert the result into the cache in `createAddRecFromPHIWithCasts`.
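
The wrapper/Impl split suggested for line 4190, with the cache insertion moved into the wrapper as suggested for line 4204, might look roughly like the sketch below. It is not the committed code: `RewriteCache`, `Rewrite`, `PhiId`, `LoopId` and the toy success condition inside `createImpl` are all hypothetical, and `std::map` with `std::optional` stands in for the real cache and result types.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are LLVM types.
using PhiId = int;
using LoopId = int;
struct Rewrite {
  std::string AddRec;                  // e.g. "{Start,+,Accum}"
  std::vector<std::string> Predicates; // runtime checks the rewrite relies on
};

class RewriteCache {
  std::map<std::pair<PhiId, LoopId>, std::optional<Rewrite>> Cache;
  unsigned ImplCalls = 0;

  // The expensive analysis. Toy rule for illustration only: it "succeeds"
  // for even PHI ids and fails for odd ones.
  std::optional<Rewrite> createImpl(PhiId Phi, LoopId /*L*/) {
    ++ImplCalls;
    if (Phi % 2 != 0)
      return std::nullopt;
    return Rewrite{"{Start,+,Accum}", {"no-wrap", "Start fits", "Accum fits"}};
  }

public:
  // Wrapper: consult the cache first, otherwise run the analysis once and
  // record the outcome, whether it succeeded or failed.
  std::optional<Rewrite> create(PhiId Phi, LoopId L) {
    auto Key = std::make_pair(Phi, L);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;
    std::optional<Rewrite> Result = createImpl(Phi, L);
    Cache.emplace(Key, Result);
    return Result;
  }

  unsigned implCalls() const { return ImplCalls; }
};
```

Because the wrapper is the only place that writes to the cache, every exit path of the Impl function is covered without an early placeholder entry, and failed analyses are remembered as well as successful ones.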

This revision is now accepted and ready to land. Jul 15 2017, 12:33 PM

Hi Sanjoy,

Thanks!

I addressed your last comment + made another small change:

Should we be getting here in the Add->getOperand(i) == SymbolicPHI case? Shouldn't the regular add recurrence creating logic have triggered?

I realized that we may be getting here with Op == SymbolicPHI because we haven't yet processed the rest of the operands of the Add... (so createAddRecFromPHI may have failed because one of them is not invariant). So I changed the assert back to an if, with a detailed comment (in isSimpleCastedPHI).

Will wait a bit before I commit.

Many thanks Sanjoy and Silviu for all your help with this patch,

Dorit

Closed by commit rL308299: [PSCEV] Create AddRec for Phis in cases of possible integer overflow, (authored by dorit). Jul 18 2017, 4:57 AM

This revision was automatically updated to reflect the committed changes.

dneilson mentioned this in D37265: [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values. Sep 5 2017, 12:58 PM

dorit mentioned this in D38948: [LV] Support efficient vectorization of an induction with redundant casts. Oct 16 2017, 5:29 AM

dorit mentioned this in rL320672: [LV] Support efficient vectorization of an induction with redundant casts. Dec 13 2017, 11:57 PM

Revision Contents

Path	Size
llvm/include/llvm/Analysis/ScalarEvolution.h	43 lines
llvm/lib/Analysis/ScalarEvolution.cpp	362 lines
llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll	214 lines

Diff 98427

llvm/include/llvm/Analysis/ScalarEvolution.h

Show First 20 Lines • Show All 231 Lines • ▼ Show 20 Lines	struct FoldingSetTrait<SCEVPredicate> : DefaultFoldingSetTrait<SCEVPredicate> {
}		}
static unsigned ComputeHash(const SCEVPredicate &X,		static unsigned ComputeHash(const SCEVPredicate &X,
FoldingSetNodeID &TempID) {		FoldingSetNodeID &TempID) {
return X.FastID.ComputeHash();		return X.FastID.ComputeHash();
}		}
};		};

/// This class represents an assumption that two SCEV expressions are equal,		/// This class represents an assumption that two SCEV expressions are equal,
/// and this can be checked at run-time. We assume that the left hand side is		/// and this can be checked at run-time.
/// a SCEVUnknown and the right hand side a constant.
class SCEVEqualPredicate final : public SCEVPredicate {		class SCEVEqualPredicate final : public SCEVPredicate {
/// We assume that LHS == RHS, where LHS is a SCEVUnknown and RHS a		/// We assume that LHS == RHS.
/// constant.		const SCEV *LHS;
const SCEVUnknown *LHS;		const SCEV *RHS;
		sbaranga: Ideally we would have something (not necessarily in SCEVEqualPredicate) to verify that these predicates can be checked. We should also be making the LHS->RHS substitution in the rewriter at some point (not necessarily in this change).
		dorit: Can you please clarify what check is missing? And where would be an appropriate place to add a TODO comment about the substitution?
		sbaranga: If we had a Loop * as a member of SCEVEqualPredicate, we could check that both LHS and RHS are invariant inside the constructor. However, we don't need the Loop for anything else, so I'm not sure that would be the right solution. I guess since we already have an assert in AppendPredicate, that should be enough for now.
const SCEVConstant *RHS;

		sanjoy: Now that the type system does not guarantee this, how about adding an assert that `LHS != RHS`?
		dorit: Added in the constructor.
public:		public:
SCEVEqualPredicate(const FoldingSetNodeIDRef ID, const SCEVUnknown *LHS,		SCEVEqualPredicate(const FoldingSetNodeIDRef ID, const SCEV *LHS,
const SCEVConstant *RHS);		const SCEV *RHS);

/// Implementation of the SCEVPredicate interface		/// Implementation of the SCEVPredicate interface
bool implies(const SCEVPredicate *N) const override;		bool implies(const SCEVPredicate *N) const override;
void print(raw_ostream &OS, unsigned Depth = 0) const override;		void print(raw_ostream &OS, unsigned Depth = 0) const override;
bool isAlwaysTrue() const override;		bool isAlwaysTrue() const override;
const SCEV *getExpr() const override;		const SCEV *getExpr() const override;

/// Returns the left hand side of the equality.		/// Returns the left hand side of the equality.
const SCEVUnknown *getLHS() const { return LHS; }		const SCEV *getLHS() const { return LHS; }

/// Returns the right hand side of the equality.		/// Returns the right hand side of the equality.
const SCEVConstant *getRHS() const { return RHS; }		const SCEV *getRHS() const { return RHS; }

/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static inline bool classof(const SCEVPredicate *P) {		static inline bool classof(const SCEVPredicate *P) {
return P->getKind() == P_Equal;		return P->getKind() == P_Equal;
}		}
};		};

/// This class represents an assumption made on an AddRec expression. Given an		/// This class represents an assumption made on an AddRec expression. Given an
▲ Show 20 Lines • Show All 979 Lines • ▼ Show 20 Lines	const SCEV *getAddRecExpr(const SCEV *Start, const SCEV *Step, const Loop *L,
SCEV::NoWrapFlags Flags);		SCEV::NoWrapFlags Flags);
const SCEV *getAddRecExpr(SmallVectorImpl<const SCEV *> &Operands,		const SCEV *getAddRecExpr(SmallVectorImpl<const SCEV *> &Operands,
const Loop *L, SCEV::NoWrapFlags Flags);		const Loop *L, SCEV::NoWrapFlags Flags);
const SCEV *getAddRecExpr(const SmallVectorImpl<const SCEV *> &Operands,		const SCEV *getAddRecExpr(const SmallVectorImpl<const SCEV *> &Operands,
const Loop *L, SCEV::NoWrapFlags Flags) {		const Loop *L, SCEV::NoWrapFlags Flags) {
SmallVector<const SCEV *, 4> NewOp(Operands.begin(), Operands.end());		SmallVector<const SCEV *, 4> NewOp(Operands.begin(), Operands.end());
return getAddRecExpr(NewOp, L, Flags);		return getAddRecExpr(NewOp, L, Flags);
}		}

		/// Similar to createAddRecFromPHI, but with the additional flexibility of
		/// suggesting runtime overflow checks in case casts are encountered.
		/// If successful, the analysis records that for this loop, \p SymbolicPHI,
		/// which is the UnknownSCEV currently representing the PHI, can be rewritten
		/// into an AddRec, assuming some predicates; The function then returns the
		/// AddRec and the predicates as a pair, and caches this pair in
		/// PredicatedSCEVRewrites.
		sanjoy: s/SumbolicPHI/SymbolicPHI/
		/// If the analysis is not successful, a mapping from the \p SymbolicPHI to
		/// itself (with no predicates) is recorded, and a nullptr with an empty
		/// predicates vector is returned as a pair.
		sanjoy: s/cahced/cached/
		/// The function is intended to be called from PSCEV (the caller will decide
		/// whether to actually add the predicates and carry out the rewrites).
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI);

/// Returns an expression for a GEP		/// Returns an expression for a GEP
///		///
/// \p GEP The GEP. The indices contained in the GEP itself are ignored,		/// \p GEP The GEP. The indices contained in the GEP itself are ignored,
/// instead we use IndexExprs.		/// instead we use IndexExprs.
/// \p IndexExprs The expressions for the indices.		/// \p IndexExprs The expressions for the indices.
const SCEV *getGEPExpr(GEPOperator *GEP,		const SCEV *getGEPExpr(GEPOperator *GEP,
const SmallVectorImpl<const SCEV *> &IndexExprs);		const SmallVectorImpl<const SCEV *> &IndexExprs);
const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);		const SCEV *getSMaxExpr(const SCEV *LHS, const SCEV *RHS);
▲ Show 20 Lines • Show All 390 Lines • ▼ Show 20 Lines	void delinearize(const SCEV *Expr, SmallVectorImpl<const SCEV *> &Subscripts,
const SCEV *ElementSize);		const SCEV *ElementSize);

/// Return the DataLayout associated with the module this SCEV instance is		/// Return the DataLayout associated with the module this SCEV instance is
/// operating on.		/// operating on.
const DataLayout &getDataLayout() const {		const DataLayout &getDataLayout() const {
return F.getParent()->getDataLayout();		return F.getParent()->getDataLayout();
}		}

const SCEVPredicate *getEqualPredicate(const SCEVUnknown *LHS,		const SCEVPredicate *getEqualPredicate(const SCEV *LHS, const SCEV *RHS);
const SCEVConstant *RHS);

const SCEVPredicate *		const SCEVPredicate *
getWrapPredicate(const SCEVAddRecExpr *AR,		getWrapPredicate(const SCEVAddRecExpr *AR,
SCEVWrapPredicate::IncrementWrapFlags AddedFlags);		SCEVWrapPredicate::IncrementWrapFlags AddedFlags);

/// Re-writes the SCEV according to the Predicates in \p A.		/// Re-writes the SCEV according to the Predicates in \p A.
const SCEV *rewriteUsingPredicate(const SCEV *S, const Loop *L,		const SCEV *rewriteUsingPredicate(const SCEV *S, const Loop *L,
SCEVUnionPredicate &A);		SCEVUnionPredicate &A);
Show All 25 Lines	private:
const SCEV *getOrCreateAddExpr(SmallVectorImpl<const SCEV *> &Ops,		const SCEV *getOrCreateAddExpr(SmallVectorImpl<const SCEV *> &Ops,
SCEV::NoWrapFlags Flags);		SCEV::NoWrapFlags Flags);

private:		private:
FoldingSet<SCEV> UniqueSCEVs;		FoldingSet<SCEV> UniqueSCEVs;
FoldingSet<SCEVPredicate> UniquePreds;		FoldingSet<SCEVPredicate> UniquePreds;
BumpPtrAllocator SCEVAllocator;		BumpPtrAllocator SCEVAllocator;

		/// Cache tentative mappings from UnknownSCEVs in a Loop, to a SCEV expression
		/// they can be rewritten into under certain predicates.
		sanjoy: Why not have the key type be `std::pair<const SCEVUnknown *, const Loop *>`?
		dorit: Absolutely right (that was the intention, but somehow it made it only to the comment...)
		DenseMap<std::pair<const SCEV *, const Loop *>,
		sbaranga: The data in this map should also be transferred in ScalarEvolution::ScalarEvolution(ScalarEvolution &&Arg). We should also remove mappings in forgetLoop(). The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?
		dorit:
		> The point of the mapping would be to remove existing loop-variant SCEVUnknowns from the analysis result, so we should have a better name for this? Maybe PredicatedAnalyzableSCEVs?
		Changed to PredicatedSCEVRewrites (since we can rewrite the SCEVUnknowns into the SCEV on the RHS if we add the predicate). But if your suggestion is more intuitive to you I'll change to that.
		dorit:
		> We should also remove mappings in forgetLoop().
		Just noticed there's also a forgetMemoizedResults(S). Looks like we should be removing the mapping for S there too, right?
		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		PredicatedSCEVRewrites;

/// The head of a linked list of all SCEVUnknown values that have been		/// The head of a linked list of all SCEVUnknown values that have been
/// allocated. This is used by releaseMemory to locate them all and call		/// allocated. This is used by releaseMemory to locate them all and call
/// their destructors.		/// their destructors.
SCEVUnknown *FirstUnknown;		SCEVUnknown *FirstUnknown;
};		};

/// Analysis pass that exposes the \c ScalarEvolution for a function.		/// Analysis pass that exposes the \c ScalarEvolution for a function.
class ScalarEvolutionAnalysis		class ScalarEvolutionAnalysis
▲ Show 20 Lines • Show All 132 Lines • Show Last 20 Lines
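
The header comments above describe PredicatedSCEVRewrites as a cache in which a failed analysis is recorded by mapping the PHI to itself with no predicates, and the reviewers ask that entries be dropped in forgetLoop(). The sketch below illustrates only that convention. `RewriteCache`, `Expr`, `LoopId`, `Lookup` and the string-based predicates are hypothetical stand-ins invented for this example; they are not LLVM types and this is not the code from the patch.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
using Expr = std::string;                    // plays the role of const SCEV *
using LoopId = int;                          // plays the role of const Loop *
using Predicates = std::vector<std::string>; // plays the role of the predicate vector
using Rewrite = std::pair<Expr, Predicates>;

enum class Lookup { NotAnalyzed, KnownFailure, Hit };

class RewriteCache {
  std::map<std::pair<Expr, LoopId>, Rewrite> Cache;

public:
  // A failed analysis maps the PHI to itself with an empty predicate list.
  void recordFailure(const Expr &Phi, LoopId L) {
    Cache[std::make_pair(Phi, L)] = Rewrite(Phi, Predicates());
  }

  // A successful analysis stores the rewritten expression and its predicates.
  void recordSuccess(const Expr &Phi, LoopId L, const Rewrite &R) {
    assert(R.first != Phi && !R.second.empty() &&
           "expected a predicated rewrite");
    Cache[std::make_pair(Phi, L)] = R;
  }

  // Distinguishes "never analyzed" from "analyzed and failed" from a hit.
  Lookup lookup(const Expr &Phi, LoopId L, Rewrite &Out) const {
    auto It = Cache.find(std::make_pair(Phi, L));
    if (It == Cache.end())
      return Lookup::NotAnalyzed;
    if (It->second.first == Phi)
      return Lookup::KnownFailure;
    Out = It->second;
    return Lookup::Hit;
  }

  // Drops every entry that belongs to loop L.
  void forgetLoop(LoopId L) {
    for (auto It = Cache.begin(); It != Cache.end();)
      It = (It->first.second == L) ? Cache.erase(It) : std::next(It);
  }
};
```

Keying on the (expression, loop) pair, as suggested in the inline comments, keeps a cached failure for one loop from hiding a possible rewrite in another.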

llvm/lib/Analysis/ScalarEvolution.cpp


Show First 20 Lines • Show All 4,077 Lines • ▼ Show 20 Lines	static Optional<BinaryOp> MatchBinaryOp(Value *V, DominatorTree &DT) {

default:		default:
break;		break;
}		}

return None;		return None;
}		}

		/// Helper function to createAddRecFromPHIWithCasts. We have a phi
		sbaranga: Can this be a static function?
		/// node whose symbolic (unknown) SCEV is \p SymbolicPHI, which is updated via
		/// the loop backedge by a SCEVAddExpr, possibly also with a few casts on the
		/// way. This function checks if \p Op, an operand of this SCEVAddExpr,
		/// follows one of the following patterns:
		/// Op == (SExt ix (Trunc iy (%SymbolicPHI) to ix) to iy)
		/// Op == (ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy)
		/// If the SCEV expression of \p Op confirms with one of the expected patterns
		sbaranga: s/confirms/conforms
		/// we return the type of the truncation operation, and indicate whether the
		/// truncated type should be treated as signed/unsigned by setting
		/// \p Signed to true/false, respectively.
		static Type *isSimpleCastedPHI(const SCEV *Op, const SCEV *SymbolicPHI,
		sanjoy: s/const SCEV *SymbolicPHI/const SCEVUnknown *SymbolicPHI/
		bool &Signed, ScalarEvolution &SE) {
		const SCEVSignExtendExpr *SExt = dyn_cast<SCEVSignExtendExpr>(Op);
		sanjoy: This is very minor, but we usually spell these as `SExt` and `ZExt`, in keeping with camel case.
		const SCEVZeroExtendExpr *ZExt = dyn_cast<SCEVZeroExtendExpr>(Op);
		if (!SExt && !ZExt)
		return nullptr;
		const SCEVTruncateExpr *Trunc =
		SExt ? dyn_cast<SCEVTruncateExpr>(SExt->getOperand())
		: dyn_cast<SCEVTruncateExpr>(ZExt->getOperand());
		if (!Trunc)
		return nullptr;
		const SCEV *X = Trunc->getOperand();
		if (X != SymbolicPHI)
		return nullptr;
		unsigned SourceBits = SE.getTypeSizeInBits(X->getType());
		unsigned NewBits = SExt ? SE.getTypeSizeInBits(SExt->getType())
		: SE.getTypeSizeInBits(ZExt->getType());
		if (SourceBits != NewBits)
		sanjoy: How about doing this check as the very first thing (to avoid doing this extra work when it would have failed anyway)?
		return nullptr;
		Signed = SExt ? true : false;
		return Trunc->getType();
		}
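
The matching logic of isSimpleCastedPHI can be tried out on a toy expression tree. The sketch below assumes a hypothetical `Node` type with an explicit bit width; `Kind`, `Node` and `matchCastedPhi` are invented for illustration and are not the SCEV classes. It returns the truncated width (0 on failure) where the patch returns a `Type *`.

```cpp
#include <cassert>

// Hypothetical miniature expression tree; not the LLVM SCEV classes.
enum class Kind { Unknown, Trunc, SExt, ZExt };

struct Node {
  Kind K;
  unsigned Bits;       // bit width of this node's result type
  const Node *Operand; // nullptr for Kind::Unknown
};

// Returns the truncated bit width if Op is (sext/zext (trunc Phi)) and the
// extension restores Phi's original width; returns 0 otherwise. Signed is
// only written on success.
unsigned matchCastedPhi(const Node *Op, const Node *Phi, bool &Signed) {
  assert(Op && Phi && "null expression");
  if (Op->K != Kind::SExt && Op->K != Kind::ZExt)
    return 0;
  const Node *T = Op->Operand;
  if (!T || T->K != Kind::Trunc)
    return 0;
  if (T->Operand != Phi)
    return 0;
  if (Op->Bits != Phi->Bits)
    return 0;
  Signed = Op->K == Kind::SExt;
  return T->Bits;
}
```

For an i64 PHI wrapped as `sext(trunc to i32)` the sketch reports width 32 with `Signed` set, while an extension to a width other than the PHI's is rejected, mirroring the `SourceBits != NewBits` check in the patch.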

		// Analyze \p SymbolicPHI, a SCEV expression of a phi node, and check if the
		// computation that updates the phi follows the following pattern:
		// (SExt/ZExt ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum
		// which correspond to a phi->trunc->sext/zext->add->phi update chain.
		// If so, try to see if it can be rewritten as an AddRecExpr under some
		// Predicates. If successful, return them as a pair. Also cache the results
		// of the analysis.
		//
		// Example usage scenario:
		// Say the Rewriter is called for the following SCEV:
		// 8 * ((sext i32 (trunc i64 %X to i32) to i64) + %Step)
		// where:
		// %X = phi i64 (%Start, %BEValue)
		// It will visitMul->visitAdd->visitSExt->visitTrunc->visitUnknown(%X),
		// and call this function with %SymbolicPHI = %X.
		//
		// The analysis will find that the value coming around the backedge has
		// the following SCEV:
		// BEValue = ((sext i32 (trunc i64 %X to i32) to i64) + %Step)
		sanjoy: s/rewritew/rewrite/
		// Upon concluding that this matches the desired pattern, the function
		// will return the pair {NewAddRec, SmallPredsVec} where:
		// NewAddRec = {%Start,+,%Step}
		// SmallPredsVec = {P1, P2, P3} as follows:
		// P1(WrapPred): AR: {trunc(%Start),+,(trunc %Step)}<nsw> Flags: <nssw>
		// P2(EqualPred): %Start == (sext i32 (trunc i64 %Start to i32) to i64)
		// P3(EqualPred): %Step == (sext i32 (trunc i64 %Step to i32) to i64)
		// The returned pair means that SymbolicPHI can be rewritten into NewAddRec
		// under the predicates {P1,P2,P3}.
		// This predicated rewrite will be cached in PredicatedSCEVRewrites:
		// PredicatedSCEVRewrites[{%X,L}] = {NewAddRec, {P1,P2,P3}}
		//
		// TODO's:
		//
		// 1) Extend the Induction descriptor to also support inductions that involve
		// casts: When needed (namely, when we are called in the context of the
		// vectorizer induction analysis), a Set of cast instructions will be
		// populated by this method, and provided back to isInductionPHI. This is
		// needed to allow the vectorizer to properly record them to be ignored by
		// the cost model and to avoid vectorizing them (otherwise these casts,
		// which are redundant under the runtime overflow checks, will be
		// vectorized, which can be costly).
		//
		// 2) Support additional induction/PHISCEV patterns: We also want to support
		// inductions where the sext-trunc / zext-trunc operations (partly) occur
		// after the induction update operation (the induction increment):
		//
		// (Trunc iy (SExt/ZExt ix (%SymbolicPHI + InvariantAccum) to iy) to ix)
		// which correspond to a phi->add->trunc->sext/zext->phi update chain.
		//
		// (Trunc iy ((SExt/ZExt ix (%SymbolicPhi) to iy) + InvariantAccum) to ix)
		sanjoy: Usually for out parameters like this, the type is `const SCEVPredicate *&`. But I'd prefer just returning an `std::pair`.
		// which correspond to a phi->trunc->add->sext/zext->phi update chain.
		//
		// 3) Outline common code with createAddRecFromPHI to avoid duplication.
		//
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		ScalarEvolution::createAddRecFromPHIWithCasts(const SCEVUnknown *SymbolicPHI) {

		SmallVector<const SCEVPredicate *, 3> Predicates;

		// *** Part1: Analysis: Check if we have the expected pattern

		auto *PN = cast<PHINode>(SymbolicPHI->getValue());
		sanjoy: Can you do `find({SymbolicPHI, L})`?
		if (!PN->getType()->isIntegerTy())
		return None;

		const Loop *L = LI.getLoopFor(PN->getParent());
		if (!L || L->getHeader() != PN->getParent())
		return None;

		// Check to see if we already analyzed this PHI.
		auto I = PredicatedSCEVRewrites.find({SymbolicPHI, L});
		sanjoy: IMO a slightly more idiomatic pattern (which would obviate the need for the `*** Part1` comment) is to have a `createAddRecFromPHIWithCasts` and a `createAddRecFromPHIWithCastsImpl`, where the first function checks `PredicatedSCEVRewrites` for an existing solution, and delegates to `createAddRecFromPHIWithCastsImpl` if we don't have a cached solution.
		if (I != PredicatedSCEVRewrites.end()) {
		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>> Rewrite =
		I->second;
		// Analysis was done before and failed to create an AddRec:
		sanjoy: `Pair` isn't clear. Please either name it something more specific, or (IMO better) use `{SymbolicPHI, L}` instead of `Pair`.
		if (Rewrite.first == SymbolicPHI)
		sanjoy: Can you do `{SymbolicPHI, nullptr}` instead of `make_pair`?
		return None;
		// Analysis was done before and succeeded to create an AddRec under
		// a predicate:
		assert(isa<SCEVAddRecExpr>(Rewrite.first) && "Expected an AddRec");
		assert(!(Rewrite.second).empty() && "Expected to find Predicates");
		return Rewrite;
		}

		// In case the analysis that follows fails, we (speculatively) record in the
		sanjoy: Is this semantically important? That is, if you remove this and instead ensure that we always populate `PredicatedSCEVRewrites[{SymbolicPHI, L}]` before leaving this function, will we enter an infinite loop somewhere?
		dorit: I added a clarification on the motivation for this early initialization. There is nothing semantic behind it. It is just to avoid having to initialize it upon every exit from the function (keep things a bit more compact, avoid the slight code duplication, make sure we don't forget it…). Is that ok with you with the clarification?
		sanjoy: Given that you now have the `createAddRecFromPHIWithCastsImpl` split out, a cleaner way to achieve the same property would be to insert the result into the cache in `createAddRecFromPHIWithCasts`.
		// cache that it had failed.
		// If the analysis succeeds, we override this entry with the proposed rewrite.
		PredicatedSCEVRewrites[{SymbolicPHI, L}] = {SymbolicPHI, Predicates};

		// The loop may have multiple entrances or multiple exits; we can analyze
		// this phi as an addrec if it has a unique entry value and a unique
		// backedge value.
		Value *BEValueV = nullptr, *StartValueV = nullptr;
		for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
		Value *V = PN->getIncomingValue(i);
		if (L->contains(PN->getIncomingBlock(i))) {
		if (!BEValueV) {
		BEValueV = V;
		} else if (BEValueV != V) {
		BEValueV = nullptr;
		break;
		}
		} else if (!StartValueV) {
		StartValueV = V;
		} else if (StartValueV != V) {
		StartValueV = nullptr;
		sbaranga: It would be nice to have a comment here saying that this works because we're going to add a SCEV predicate to prove that SymbolicPHI == Add->getOperand(i).
		break;
		}
		}
		if (!BEValueV || !StartValueV)
		return None;

		const SCEV *BEValue = getSCEV(BEValueV);

		// If the value coming around the backedge is an add with the symbolic
		// value we just inserted, possibly with casts that we can ignore under
		// an appropriate runtime guard, then we found a simple induction variable!
		const auto *Add = dyn_cast<SCEVAddExpr>(BEValue);
		if (!Add)
		return None;

		// If there is a single occurrence of the symbolic value, possibly
		// casted, replace it with a recurrence.
		unsigned FoundIndex = Add->getNumOperands();
		Type *TruncTy = nullptr;
		bool Signed;
		for (unsigned i = 0, e = Add->getNumOperands(); i != e; ++i)
		sanjoy: Should we be getting here in the `Add->getOperand(i) == SymbolicPHI` case? Shouldn't the regular add recurrence creating logic have triggered?
		dorit: Yes, I would indeed expect it would have been caught already; added a comment here, and an assert in isSimpleCastedPHI.
		if (Add->getOperand(i) == SymbolicPHI ||
		// In this case we will add a SCEV predicate that allows us to prove
		// that Add->getOperand(i) == SymbolicPHI.
		(TruncTy =
		isSimpleCastedPHI(Add->getOperand(i), SymbolicPHI, Signed, *this)))
		if (FoundIndex == e) {
		FoundIndex = i;
		break;
		}

		if (FoundIndex == Add->getNumOperands())
		return None;

		sanjoy: Do you also need to check that the recurrence is affine?
		dorit: Oh, probably so… Do you think the isAffine check is also required in createAddRecFromPHI()? createAddRecFromPHIWithCasts() is basically a subset of createAddRecFromPHI, modified to consider also the sext-trunc cast pattern (the intention is later to try to factor out common parts as much as possible). I copied this check from there without thinking much…
		sanjoy: I was suggesting the `isAffine` since IIRC (but please check) you cannot create a no wrap predicate on a non-affine add recurrence.
		// Create an add with everything but the specified operand.
		SmallVector<const SCEV *, 8> Ops;
		for (unsigned i = 0, e = Add->getNumOperands(); i != e; ++i)
		sbaranga: I think the contract should be to return a SCEV expression with the same type as what getSCEV would return for the phi node. It's also probably better to not mention the vectorizer here (since more passes will end up running this code anyway).
		if (i != FoundIndex)
		Ops.push_back(Add->getOperand(i));
		const SCEV *Accum = getAddExpr(Ops);

		// This is not a valid addrec if the step amount is varying each
		// loop iteration, but is not itself an addrec in this loop.
		if (!isLoopInvariant(Accum, L) &&
		!(isa<SCEVAddRecExpr>(Accum) &&
		cast<SCEVAddRecExpr>(Accum)->getLoop() == L &&
		cast<SCEVAddRecExpr>(Accum)->isAffine()))
		return None;
		sanjoy: s/specificed/specified/


		// *** Part2: Create the predicates

		sbaranga: I think we need a more formal explanation here on why this predicate guarantees that this is an AddRecExpr. We're trying to prove that if we have:
		SymbolicPHI = phi({Start, LoopHead}, {NextValue, LoopLatch})
		NextValue = (Sext ix (Trunc iy (%SymbolicPHI) to ix) to iy) + InvariantAccum
		then SymbolicPHI = {Start, +, InvariantAccum}. At iteration 0 both values are equal to Start, so it's enough to prove that SymbolicPHI + Invariant == (Sext (trunc (SymbolicPHI))) + Invariant, which should be true from the SCEV predicate.
		// Analysis was successful: we have a phi-with-cast pattern for which we
		sanjoy: I'm not sure that this is correct. Say we had a loop with 4 iterations, `StartVal` was `i16 257`, `Accum` was `i16 1`, `TruncTy` was `i8` and the PHI was being zero extended. In that case, the value of the PHI node on the second iteration (i.e. after taking the backedge once) would be `(trunc 257) + 1` = `2`. However, despite `{(trunc 257),+,1}` = `{1,+,1}` not unsigned-overflowing in 4 iterations, `{257,+,1}` would not produce the correct values for the PHI node.
		dorit: I see. So I guess we should be returning a {(sext (trunc Start)),+,{(sext (trunc Accum))} as the newAR, right...?
		sanjoy: Not sure how that will work -- won't `{(sext (trunc Start)),+,{(sext (trunc Accum))}` be `{1,+,1}` and thus be `1` instead of `257` in the first iteration? I haven't thought this through, but I suspect you'll have to check that the start value fits in the narrower type or something like that. In any case, I agree with Silviu here -- whatever you go with, please justify why that is correct with a proof here.
		sbaranga: Maybe have two extra SCEVEqualPredicates to test that (sext (trunc Start)) == Start and {(sext (trunc Accum))} == Accum and return {Start, + Accum}? Before adding the extra predicates it would be worth testing if SCEV can prove these properties statically first (for example for cases where either Start or Accum are constants).
		dorit: Good idea! Sanjoy, Silviu: Are you ok with extending the SCEVEqualPredicate to a non constant on the RHS? (I think currently only SCEV == constant is supported).
		sbaranga: That's ok with me.
		// can return an AddRec expression under some predicates. Let's see which
		sbaranga: This is a bit hard to follow with NewExpr, Expr, etc., and the initial statement of the problem should be simplified. I guess we just want to prove that Expr(i) = Start + i * Accum, given that:
		(1) Expr(0) = Start
		(2) Expr(i+1) = (Ext ix (Trunc iy (Expr(i)) to ix) to iy) + Accum
		// predicates are required, and prove that they guarantee that the NewExpr
		// that we return is equal to the original Expr, where:
		// for i==0:
		// NewExpr == OrigExpr == Start
		// for i>0:
		// NewExpr = Start + i * Accum
		// OrigExpr = SExtTrunc(Start + (i-1)*Accum) + Accum
		// or in other words, prove that:
		// Start + (i-1)*Accum == SExtTrunc(Start + (i-1)*Accum)
		//
		// We use the abbreviation SExtTrunc(%w) to denote:
		// (sext ix (trunc iy %w to ix) to iy)
		//
		// We will be adding the following Predicates:
		// P1: A Wrap predicate that guarantees that Trunc(Start) + i*Trunc(Accum)
		// fits within the truncated type (does not overflow) for i = 0 to n-1.
		// P2: An Equal predicate that guarantees that Start == SExtTrunc(Start)
		// P3: An Equal predicate that guarantees that Accum == SExtTrunc(Accum)
		//
		// Now, let's start with the initial steps:
		// At iteration 0 we want to prove:
		// Start == Start :: Holds Trivially.
		//
		// At iteration 1 we want to prove:
		sbaranga: No text should be required for iteration 2, proving the induction step should be enough.
		dorit: I thought it's helpful to have an example step before the formal step. I'll drop it.
		// Start + Accum == SExtTrunc(Start) + Accum :: from P2
		//
		// Now to the induction step:
		// At iteration 2 we want to prove:
		// Start + 2*Accum == SExtTrunc(SExtTrunc(Start) + Accum) + Accum
		// or in other words:
		// Start + Accum == SExtTrunc(SExtTrunc(Start) + Accum)
		// == SExtTrunc(Start + Accum) :: from P2
		// == SExt(Trunc(Start) + Trunc(Accum)) :: Trunc(x+y)=>Trunc(x)+Trunc(y)
		// == SExtTrunc(Start) + SExtTrunc(Accum):: from P1:
		// :: SExt(x+y)=>SExt(x)+SExt(y)
		// == Start + Accum :: from P2,P3
		//
		// More formally:
		// Given that, by induction hypothesis:
		// Start + i*Accum == SExtTrunc(Start + (i-1)*Accum) + Accum :: (step I)
		// Prove that:
		// Start + (i+1)*Accum == SExtTrunc(Start + i*Accum) + Accum :: (step I+1)
		//
		// Start + (i+1)*Accum
		// == (Start + i*Accum) + Accum
		// == (SExtTrunc(Start + (i-1)*Accum) + Accum) + Accum :: from step I
		sbaranga: It would be better to just use Ext and Trunc as separate operators instead of SExtTrunc.
		// == SExtTrunc(Start + (i-1)*Accum) + SExtTrunc(Accum) + Accum :: from P3
		// == SExtTrunc((Start + (i-1)*Accum) + Accum) + Accum
		sanjoy: You should be able to `cast<>` here.
		// :: from P1: SExt(x)+SExt(y)=>SExt(x+y)
		// == SExtTrunc(Start + i*Accum) + Accum
		//
		// By induction, the same applies to all iterations 1<=i<n:
		//
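
The counterexample from the review thread and the role of P1 to P3 can be checked numerically. The sketch below models only the zero-extended case with iy = i16 and ix = i8, using plain fixed-width integers rather than SCEV; all function names are hypothetical, and `truncatedAddRecFits` is a simplified reading of P1 (the truncated sum stays in the unsigned i8 range), not LLVM's exact definition of the wrap flags.

```cpp
#include <cassert>
#include <cstdint>

// PHI value after Iter backedges for the zero-extended pattern:
//   X(0) = Start,  X(i+1) = zext(trunc(X(i))) + Accum   (i16 values, i8 trunc)
uint16_t phiValue(uint16_t Start, uint16_t Accum, unsigned Iter) {
  uint16_t X = Start;
  for (unsigned I = 0; I < Iter; ++I)
    X = static_cast<uint16_t>(static_cast<uint8_t>(X) + Accum);
  return X;
}

// Value of the proposed AddRec {Start,+,Accum} in i16 at iteration Iter.
uint16_t addRecValue(uint16_t Start, uint16_t Accum, unsigned Iter) {
  return static_cast<uint16_t>(Start + Iter * Accum);
}

// P2 / P3: the value survives the trunc-then-zext round trip.
bool roundTrips(uint16_t V) { return V == static_cast<uint8_t>(V); }

// P1, simplified: trunc(Start) + i * trunc(Accum) stays within the unsigned
// i8 range for i = 0 .. N-1.
bool truncatedAddRecFits(uint16_t Start, uint16_t Accum, unsigned N) {
  unsigned Sum = static_cast<uint8_t>(Start);
  for (unsigned I = 1; I < N; ++I) {
    Sum += static_cast<uint8_t>(Accum);
    if (Sum > 0xFF)
      return false;
  }
  return true;
}

bool predicatesHold(uint16_t Start, uint16_t Accum, unsigned N) {
  return truncatedAddRecFits(Start, Accum, N) && roundTrips(Start) &&
         roundTrips(Accum);
}

// Replays the counterexample from the review thread: Start = 257, Accum = 1.
void checkReviewCounterexample() {
  assert(phiValue(257, 1, 1) == 2);       // zext(trunc 257) + 1
  assert(addRecValue(257, 1, 1) == 258);  // {257,+,1} disagrees
  assert(truncatedAddRecFits(257, 1, 4)); // P1 alone is satisfied
  assert(!predicatesHold(257, 1, 4));     // P2 rejects the rewrite
}
```

In this model the wrap check by itself accepts Start = 257, which is why the equality predicates on Start and Accum were added: when all three hold, the PHI values and the AddRec values agree for every iteration checked.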

		// Create a truncated addrec for which we will add a no overflow check (P1).
		const SCEV *StartVal = getSCEV(StartValueV);
		const SCEV *PHISCEV =
		getAddRecExpr(getTruncateExpr(StartVal, TruncTy),
		getTruncateExpr(Accum, TruncTy), L, SCEV::FlagAnyWrap);
		const auto *AR = dyn_cast<SCEVAddRecExpr>(PHISCEV);
		if (!AR)
		return None;

		SCEVWrapPredicate::IncrementWrapFlags AddedFlags =
		Signed ? SCEVWrapPredicate::IncrementNSSW
		: SCEVWrapPredicate::IncrementNUSW;
		const SCEVPredicate *AddRecPred = getWrapPredicate(AR, AddedFlags);
		Predicates.push_back(AddRecPred);

		// Create the Equal Predicates P2,P3:
		auto AppendPredicate = [&](const SCEV *Expr) -> void {
		const SCEV *TruncatedExpr = getTruncateExpr(Expr, TruncTy);
		const SCEV *ExtendedExpr =
		Signed ? getSignExtendExpr(TruncatedExpr, Expr->getType())
		: getZeroExtendExpr(TruncatedExpr, Expr->getType());
		if (!isKnownPredicate(ICmpInst::ICMP_EQ, Expr, ExtendedExpr)) {
		[Inline comment, Not Done] sbaranga: I originally had in mind testing the SCEV expressions for equality (Expr == ExtendedExpr) and checking that both are loop invariant. I had a look at the implementation of isKnownPredicate, and I'm not convinced this would work as expected. Can we add a regression test for this? I think we need to check anyway that both expressions are loop invariant before adding the predicate.
		[Inline comment, Not Done] dorit (author): Right. I was assuming Accum is invariant in the loop, and forgot that I was allowing it not to be invariant. Thanks.
		const SCEVPredicate *Pred = getEqualPredicate(Expr, ExtendedExpr);
		Predicates.push_back(Pred);
		}
		};

		AppendPredicate(StartVal);
		AppendPredicate(Accum);


		// *** Part3: Predicates are ready. Now go ahead and create the new addrec in
		// which the casts had been folded away. The caller can rewrite SymbolicPHI
		// into NewAR if it will also add the runtime overflow checks specified in
		// Predicates.
		auto *NewAR = getAddRecExpr(StartVal, Accum, L, SCEV::FlagAnyWrap);

		std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>> PredRewrite =
		std::make_pair(NewAR, Predicates);
		// Remember the result of the analysis for this SCEV at this location.
		PredicatedSCEVRewrites[{SymbolicPHI, L}] = PredRewrite;
		return PredRewrite;
		}
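
The induction argument in the comment above can be spot-checked numerically. The following is a minimal sketch, not part of the patch: it assumes a 64-bit wide type and a 32-bit truncated type, models the PHI update chain as X(i+1) = SExtTrunc(X(i)) + Accum, and uses the hypothetical helper names `sextTrunc`, `predicatesHold`, and `rewriteMatches`.

```cpp
#include <cassert>
#include <cstdint>

// SExt(Trunc(X)): keep the low 32 bits of X and sign-extend them back to 64.
inline int64_t sextTrunc(int64_t X) {
  uint32_t Low = static_cast<uint32_t>(static_cast<uint64_t>(X));
  int64_t R = static_cast<int64_t>(Low);
  if (Low & 0x80000000u)
    R -= (static_cast<int64_t>(1) << 32);
  return R;
}

// Illustrative model of the three runtime checks for N backedges:
//   P1: Start + I*Accum stays inside the signed 32-bit range for I in [0, N]
//   P2: Start == SExtTrunc(Start)
//   P3: Accum == SExtTrunc(Accum)
inline bool predicatesHold(int64_t Start, int64_t Accum, unsigned N) {
  if (Start != sextTrunc(Start) || Accum != sextTrunc(Accum))
    return false;
  int64_t V = Start;
  for (unsigned I = 0; I <= N; ++I) {
    if (V < INT32_MIN || V > INT32_MAX)
      return false;
    V += Accum;
  }
  return true;
}

// Runs the casted recurrence X = SExtTrunc(X) + Accum and compares every
// value with the cast-free closed form Start + I*Accum.
inline bool rewriteMatches(int64_t Start, int64_t Accum, unsigned N) {
  // The sketch assumes small trip counts so 64-bit arithmetic cannot overflow.
  assert(N < (1u << 20) && "sketch assumes small trip counts");
  int64_t X = Start;
  int64_t Closed = Start;
  for (unsigned I = 1; I <= N; ++I) {
    X = sextTrunc(X) + Accum;
    Closed += Accum;
    if (X != Closed)
      return false;
  }
  return true;
}
```

Whenever `predicatesHold` returns true in this model, `rewriteMatches` is expected to return true as well. The reverse does not hold: the predicates are sufficient, not necessary.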

/// A helper function for createAddRecFromPHI to handle simple cases.		/// A helper function for createAddRecFromPHI to handle simple cases.
///		///
/// This function tries to find an AddRec expression for the simplest (yet most		/// This function tries to find an AddRec expression for the simplest (yet most
/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).		/// common) cases: PN = PHI(Start, OP(Self, LoopInvariant)).
/// If it fails, createAddRecFromPHI will use a more general, but slow,		/// If it fails, createAddRecFromPHI will use a more general, but slow,
/// technique for finding the AddRec expression.		/// technique for finding the AddRec expression.
const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,		const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,
Value *BEValueV,		Value *BEValueV,
[1,720 lines not shown]	auto RemoveLoopFromBackedgeMap =
BTCPos->second.clear();		BTCPos->second.clear();
Map.erase(BTCPos);		Map.erase(BTCPos);
}		}
};		};

RemoveLoopFromBackedgeMap(BackedgeTakenCounts);		RemoveLoopFromBackedgeMap(BackedgeTakenCounts);
RemoveLoopFromBackedgeMap(PredicatedBackedgeTakenCounts);		RemoveLoopFromBackedgeMap(PredicatedBackedgeTakenCounts);

		// Drop information about predicated SCEV rewrites for this loop.
		for (auto I = PredicatedSCEVRewrites.begin();
		I != PredicatedSCEVRewrites.end();) {
		std::pair<const SCEV *, const Loop *> Entry = I->first;
		if (Entry.second == L)
		PredicatedSCEVRewrites.erase(I++);
		else
		++I;
		}

// Drop information about expressions based on loop-header PHIs.		// Drop information about expressions based on loop-header PHIs.
SmallVector<Instruction *, 16> Worklist;		SmallVector<Instruction *, 16> Worklist;
PushLoopPHIs(L, Worklist);		PushLoopPHIs(L, Worklist);

SmallPtrSet<Instruction *, 8> Visited;		SmallPtrSet<Instruction *, 8> Visited;
while (!Worklist.empty()) {		while (!Worklist.empty()) {
Instruction *I = Worklist.pop_back_val();		Instruction *I = Worklist.pop_back_val();
if (!Visited.insert(I).second)		if (!Visited.insert(I).second)
[4,088 lines not shown]	: F(Arg.F), HasGuards(Arg.HasGuards), TLI(Arg.TLI), AC(Arg.AC), DT(Arg.DT),
LoopDispositions(std::move(Arg.LoopDispositions)),		LoopDispositions(std::move(Arg.LoopDispositions)),
LoopPropertiesCache(std::move(Arg.LoopPropertiesCache)),		LoopPropertiesCache(std::move(Arg.LoopPropertiesCache)),
BlockDispositions(std::move(Arg.BlockDispositions)),		BlockDispositions(std::move(Arg.BlockDispositions)),
UnsignedRanges(std::move(Arg.UnsignedRanges)),		UnsignedRanges(std::move(Arg.UnsignedRanges)),
SignedRanges(std::move(Arg.SignedRanges)),		SignedRanges(std::move(Arg.SignedRanges)),
UniqueSCEVs(std::move(Arg.UniqueSCEVs)),		UniqueSCEVs(std::move(Arg.UniqueSCEVs)),
UniquePreds(std::move(Arg.UniquePreds)),		UniquePreds(std::move(Arg.UniquePreds)),
SCEVAllocator(std::move(Arg.SCEVAllocator)),		SCEVAllocator(std::move(Arg.SCEVAllocator)),
		PredicatedSCEVRewrites(std::move(Arg.PredicatedSCEVRewrites)),
FirstUnknown(Arg.FirstUnknown) {		FirstUnknown(Arg.FirstUnknown) {
Arg.FirstUnknown = nullptr;		Arg.FirstUnknown = nullptr;
}		}

ScalarEvolution::~ScalarEvolution() {		ScalarEvolution::~ScalarEvolution() {
// Iterate through all the SCEVUnknown instances and call their		// Iterate through all the SCEVUnknown instances and call their
// destructors, so that they release their references to their values.		// destructors, so that they release their references to their values.
for (SCEVUnknown *U = FirstUnknown; U;) {		for (SCEVUnknown *U = FirstUnknown; U;) {
[384 lines not shown]	void ScalarEvolution::forgetMemoizedResults(const SCEV *S) {
LoopDispositions.erase(S);		LoopDispositions.erase(S);
BlockDispositions.erase(S);		BlockDispositions.erase(S);
UnsignedRanges.erase(S);		UnsignedRanges.erase(S);
SignedRanges.erase(S);		SignedRanges.erase(S);
ExprValueMap.erase(S);		ExprValueMap.erase(S);
HasRecMap.erase(S);		HasRecMap.erase(S);
MinTrailingZerosCache.erase(S);		MinTrailingZerosCache.erase(S);

		for (auto I = PredicatedSCEVRewrites.begin();
		I != PredicatedSCEVRewrites.end();) {
		std::pair<const SCEV *, const Loop *> Entry = I->first;
		if (Entry.first == S)
		PredicatedSCEVRewrites.erase(I++);
		else
		[Inline comment, Not Done] sanjoy: Doesn't `PredicatedSCEVRewrites.erase(I++)` invalidate `E`?
		++I;
		}
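
The loop above erases entries while iterating over the map. As a hedged illustration of that idiom (not the patch's code), the sketch below uses `std::map` as a stand-in for `DenseMap`, with integer IDs in place of the `SCEV` and `Loop` pointers; `CacheKey`, `RewriteMap`, `forgetExpr`, and the `std::string` payload are all hypothetical. It re-evaluates `end()` on every iteration and takes the next iterator from `erase`, so it never relies on a cached end iterator staying valid.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// (expression ID, loop ID) stands in for the <SCEV *, Loop *> key.
using CacheKey = std::pair<int, int>;
using RewriteMap = std::map<CacheKey, std::string>;

// Drops every cached rewrite whose key mentions ExprId and returns how many
// entries were removed.
inline std::size_t forgetExpr(RewriteMap &Rewrites, int ExprId) {
  const std::size_t Before = Rewrites.size();
  std::size_t Erased = 0;
  for (auto It = Rewrites.begin(); It != Rewrites.end();) {
    if (It->first.first == ExprId) {
      It = Rewrites.erase(It); // erase returns the iterator to the next entry
      ++Erased;
    } else {
      ++It;
    }
  }
  assert(Rewrites.size() + Erased == Before && "entry count mismatch");
  return Erased;
}
```

Whether the same guarantees hold for a particular `DenseMap` version should be checked against its documentation; the sketch only demonstrates the general pattern.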

auto RemoveSCEVFromBackedgeMap =		auto RemoveSCEVFromBackedgeMap =
[S, this](DenseMap<const Loop *, BackedgeTakenInfo> &Map) {		[S, this](DenseMap<const Loop *, BackedgeTakenInfo> &Map) {
for (auto I = Map.begin(), E = Map.end(); I != E;) {		for (auto I = Map.begin(), E = Map.end(); I != E;) {
BackedgeTakenInfo &BEInfo = I->second;		BackedgeTakenInfo &BEInfo = I->second;
if (BEInfo.hasOperand(S, this)) {		if (BEInfo.hasOperand(S, this)) {
BEInfo.clear();		BEInfo.clear();
Map.erase(I++);		Map.erase(I++);
} else		} else
[143 lines not shown]
void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {		void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequiredTransitive<AssumptionCacheTracker>();		AU.addRequiredTransitive<AssumptionCacheTracker>();
AU.addRequiredTransitive<LoopInfoWrapperPass>();		AU.addRequiredTransitive<LoopInfoWrapperPass>();
AU.addRequiredTransitive<DominatorTreeWrapperPass>();		AU.addRequiredTransitive<DominatorTreeWrapperPass>();
AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();		AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
}		}

const SCEVPredicate *		const SCEVPredicate *ScalarEvolution::getEqualPredicate(const SCEV *LHS,
ScalarEvolution::getEqualPredicate(const SCEVUnknown *LHS,		const SCEV *RHS) {
const SCEVConstant *RHS) {
FoldingSetNodeID ID;		FoldingSetNodeID ID;
		assert(LHS->getType() == RHS->getType() &&
		"Type mismatch between LHS and RHS");
// Unique this node based on the arguments		// Unique this node based on the arguments
ID.AddInteger(SCEVPredicate::P_Equal);		ID.AddInteger(SCEVPredicate::P_Equal);
ID.AddPointer(LHS);		ID.AddPointer(LHS);
ID.AddPointer(RHS);		ID.AddPointer(RHS);
void *IP = nullptr;		void *IP = nullptr;
if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))		if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))
return S;		return S;
SCEVEqualPredicate *Eq = new (SCEVAllocator)		SCEVEqualPredicate *Eq = new (SCEVAllocator)
[45 lines not shown]	public:

const SCEV *visitUnknown(const SCEVUnknown *Expr) {		const SCEV *visitUnknown(const SCEVUnknown *Expr) {
if (Pred) {		if (Pred) {
auto ExprPreds = Pred->getPredicatesForExpr(Expr);		auto ExprPreds = Pred->getPredicatesForExpr(Expr);
for (auto *Pred : ExprPreds)		for (auto *Pred : ExprPreds)
if (const auto *IPred = dyn_cast<SCEVEqualPredicate>(Pred))		if (const auto *IPred = dyn_cast<SCEVEqualPredicate>(Pred))
if (IPred->getLHS() == Expr)		if (IPred->getLHS() == Expr)
return IPred->getRHS();		return IPred->getRHS();
}		}
		return convertToAddRecWithPreds(Expr);
		[Inline comment, Not Done] Ayal: dyn_cast >> isa
		[Inline comment, Not Done] Ayal: The attempt to createAddRecFromPHIWithCasts() involves introducing a new cast predicate, right? Worth guarding or at least commenting.
return Expr;
}		}

const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {		const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
const SCEV *Operand = visit(Expr->getOperand());		const SCEV *Operand = visit(Expr->getOperand());
		[Inline comment, Not Done] sbaranga: Now that I'm having a fresh look, I think we need to revisit the conditions for the transformation here. If we want to do trunc({x,+,y}) -> {trunc(x), +, trunc(y)}, is this not always true?
		[Inline comment, Not Done] dorit (author): Yes, I think you're right. This simplifies a few things :-) so there is no need for a visitTruncateExpr case at all in this rewriter; the base class rewriter can handle the Truncate (without any predicates), right? Thanks!!
		[Inline comment, Not Done] sbaranga: Correct, we shouldn't need to do anything here.
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);
if (AR && AR->getLoop() == L && AR->isAffine()) {		if (AR && AR->getLoop() == L && AR->isAffine()) {
// This couldn't be folded because the operand didn't have the nuw		// This couldn't be folded because the operand didn't have the nuw
// flag. Add the nusw flag as an assumption that we could make.		// flag. Add the nusw flag as an assumption that we could make.
const SCEV *Step = AR->getStepRecurrence(SE);		const SCEV *Step = AR->getStepRecurrence(SE);
Type *Ty = Expr->getType();		Type *Ty = Expr->getType();
if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNUSW))		if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNUSW))
return SE.getAddRecExpr(SE.getZeroExtendExpr(AR->getStart(), Ty),		return SE.getAddRecExpr(SE.getZeroExtendExpr(AR->getStart(), Ty),
SE.getSignExtendExpr(Step, Ty), L,		SE.getSignExtendExpr(Step, Ty), L,
AR->getNoWrapFlags());		AR->getNoWrapFlags());
		[Inline comment, Not Done] sbaranga: I know I said that this should be a NUSW on the bugzilla ticket, but I'm not that sure anymore. Whatever the case, this needs a comment explaining the choice.
		[Inline comment, Not Done] dorit (author): I actually don't see how we can tell whether to create an NUSW or NSSW assumption without further analysis, such as the analysis we do in getAddRecForPhiWithCasts. So maybe in the general case we need to be conservative here and add both NSSW and NUSW to make sure that there will be no kind of overflow due to the truncation?
		[Inline comment, Not Done] sbaranga: Adding just NUSW would work, but the problem would be that the predicate would often fail at runtime if the number is used as signed. Ideally we should find a solution that works in most cases.
		[Inline comment, Not Done] dorit (author): So Truncate may need to call the SE::getAddRecForPhiWithCasts analysis directly, so that it will have the signedness knowledge/context; in fact the analysis results already include the predicate that needs to be added (NUSW/NSSW).
}		}
return SE.getZeroExtendExpr(Operand, Expr->getType());		return SE.getZeroExtendExpr(Operand, Expr->getType());
}		}
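
The inline exchange above concludes that trunc({x,+,y}) can be rewritten to {trunc(x),+,trunc(y)} without any predicate. A small sketch of why, assuming a 64-bit wide type, a 32-bit narrow type, and wrapping (unsigned) arithmetic in both; the function name `truncCommutesWithAddRec` is hypothetical and the check is a numeric spot-check, not a proof.

```cpp
#include <cassert>
#include <cstdint>

// Steps the wide recurrence {X,+,Y} and the narrow recurrence
// {trunc(X),+,trunc(Y)} side by side and checks that truncating the wide
// value always yields the narrow value. Unsigned arithmetic wraps, which
// matches the modular semantics of the addrec.
inline bool truncCommutesWithAddRec(uint64_t X, uint64_t Y, unsigned N) {
  uint64_t Wide = X;
  uint32_t Narrow = static_cast<uint32_t>(X);
  const uint32_t NarrowStep = static_cast<uint32_t>(Y);
  for (unsigned I = 0; I <= N; ++I) {
    if (static_cast<uint32_t>(Wide) != Narrow)
      return false;
    Wide += Y;
    Narrow += NarrowStep;
  }
  return true;
}
```

Truncation keeps the low bits, and the low bits of a sum depend only on the low bits of the operands, so the check passes even when either recurrence wraps.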

const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {		const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
const SCEV *Operand = visit(Expr->getOperand());		const SCEV *Operand = visit(Expr->getOperand());
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Operand);
if (AR && AR->getLoop() == L && AR->isAffine()) {		if (AR && AR->getLoop() == L && AR->isAffine()) {
// This couldn't be folded because the operand didn't have the nsw		// This couldn't be folded because the operand didn't have the nsw
// flag. Add the nssw flag as an assumption that we could make.		// flag. Add the nssw flag as an assumption that we could make.
const SCEV *Step = AR->getStepRecurrence(SE);		const SCEV *Step = AR->getStepRecurrence(SE);
Type *Ty = Expr->getType();		Type *Ty = Expr->getType();
if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNSSW))		if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNSSW))
return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),		return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),
SE.getSignExtendExpr(Step, Ty), L,		SE.getSignExtendExpr(Step, Ty), L,
AR->getNoWrapFlags());		AR->getNoWrapFlags());
}		}
return SE.getSignExtendExpr(Operand, Expr->getType());		return SE.getSignExtendExpr(Operand, Expr->getType());
}		}
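
The two visitors above add an `IncrementNUSW` and an `IncrementNSSW` assumption respectively, and the inline discussion raises the concern that an NUSW check may fail at runtime when the value is really used as signed. The sketch below illustrates that concern on an 8-bit type. It assumes the following reading of the flags, which should be verified against `ScalarEvolution.h`: an increment a + b is NUSW iff 0 <= Unsigned(a) + Signed(b) < 2^n, and NSSW iff the signed sum stays in the signed n-bit range. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// One increment A + B is NUSW iff 0 <= Unsigned(A) + Signed(B) < 2^8.
inline bool stepIsNUSW(uint8_t A, int8_t B) {
  const int Sum = static_cast<int>(A) + static_cast<int>(B);
  return Sum >= 0 && Sum < 256;
}

// One increment A + B is NSSW iff -2^7 <= Signed(A) + Signed(B) < 2^7.
inline bool stepIsNSSW(int8_t A, int8_t B) {
  const int Sum = static_cast<int>(A) + static_cast<int>(B);
  return Sum >= -128 && Sum <= 127;
}

// Checks the first N increments of the 8-bit recurrence {Start,+,Step}.
inline bool addRecIsNUSW(uint8_t Start, int8_t Step, unsigned N) {
  uint8_t Cur = Start;
  for (unsigned I = 0; I < N; ++I) {
    if (!stepIsNUSW(Cur, Step))
      return false;
    Cur = static_cast<uint8_t>(Cur + Step);
  }
  return true;
}

inline bool addRecIsNSSW(int8_t Start, int8_t Step, unsigned N) {
  int8_t Cur = Start;
  for (unsigned I = 0; I < N; ++I) {
    if (!stepIsNSSW(Cur, Step))
      return false;
    // In range: the check above just passed.
    Cur = static_cast<int8_t>(Cur + Step);
  }
  return true;
}
```

A counter that starts at -2 and steps by 1 passes the NSSW check for four iterations, while the same bit pattern (0xFE) fails the NUSW check on its second increment. Under this model, choosing the flag without knowing the signedness of the use can therefore reject loops that are in fact safe.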

private:		private:
bool addOverflowAssumption(const SCEVAddRecExpr *AR,		bool addOverflowAssumption(const SCEVPredicate *P) {
SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
auto *A = SE.getWrapPredicate(AR, AddedFlags);
if (!NewPreds) {		if (!NewPreds) {
// Check if we've already made this assumption.		// Check if we've already made this assumption.
return Pred && Pred->implies(A);		return Pred && Pred->implies(P);
}		}
NewPreds->insert(A);		NewPreds->insert(P);
return true;		return true;
}		}

		bool addOverflowAssumption(const SCEVAddRecExpr *AR,
		SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
		auto *A = SE.getWrapPredicate(AR, AddedFlags);
		return addOverflowAssumption(A);
		}

		// If \p Expr represents a PHINode, we try to see if it can be represented
		[Inline comment, Not Done] sanjoy: I'm also not a big fan of the name `analyzeUnknownSCEVPHI` here -- `analyze` does not mean anything specific, and the `UnknownSCEV` part is redundant. How about `convertToAddRecWithPreds`?
		// as an AddRec, possibly under a predicate (PHISCEVPred). If it is possible
		// to add this predicate as a runtime overflow check, we return the AddRec.
		// If \p Expr does not meet these conditions (is not a PHI node, or we
		// couldn't create an AddRec for it, or couldn't add the predicate), we just
		// return \p Expr.
		const SCEV *convertToAddRecWithPreds(const SCEVUnknown *Expr) {
		[Inline comment, Not Done] sanjoy: Use `auto *` for pointers.
		if (!isa<PHINode>(Expr->getValue()))
		return Expr;
		Optional<std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 3>>>
		PredicatedRewrite = SE.createAddRecFromPHIWithCasts(Expr);
		if (!PredicatedRewrite)
		return Expr;
		for (auto *P : PredicatedRewrite->second)
		if (!addOverflowAssumption(P))
		return Expr;
		return PredicatedRewrite->first;
		}

SmallPtrSetImpl<const SCEVPredicate *> *NewPreds;		SmallPtrSetImpl<const SCEVPredicate *> *NewPreds;
SCEVUnionPredicate *Pred;		SCEVUnionPredicate *Pred;
const Loop *L;		const Loop *L;
};		};
} // end anonymous namespace		} // end anonymous namespace

const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *S, const Loop *L,		const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *S, const Loop *L,
SCEVUnionPredicate &Preds) {		SCEVUnionPredicate &Preds) {
[20 lines not shown]
}		}

/// SCEV predicates		/// SCEV predicates
SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,		SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,
SCEVPredicateKind Kind)		SCEVPredicateKind Kind)
: FastID(ID), Kind(Kind) {}		: FastID(ID), Kind(Kind) {}

SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,		SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,
const SCEVUnknown *LHS,		const SCEV *LHS, const SCEV *RHS)
const SCEVConstant *RHS)		: SCEVPredicate(ID, P_Equal), LHS(LHS), RHS(RHS) {
: SCEVPredicate(ID, P_Equal), LHS(LHS), RHS(RHS) {}		assert(LHS->getType() == RHS->getType() && "LHS and RHS types don't match");
		}

bool SCEVEqualPredicate::implies(const SCEVPredicate *N) const {		bool SCEVEqualPredicate::implies(const SCEVPredicate *N) const {
const auto *Op = dyn_cast<SCEVEqualPredicate>(N);		const auto *Op = dyn_cast<SCEVEqualPredicate>(N);

if (!Op)		if (!Op)
return false;		return false;

return Op->LHS == LHS && Op->RHS == RHS;		return Op->LHS == LHS && Op->RHS == RHS;
[10 lines not shown]
SCEVWrapPredicate::SCEVWrapPredicate(const FoldingSetNodeIDRef ID,		SCEVWrapPredicate::SCEVWrapPredicate(const FoldingSetNodeIDRef ID,
const SCEVAddRecExpr *AR,		const SCEVAddRecExpr *AR,
IncrementWrapFlags Flags)		IncrementWrapFlags Flags)
: SCEVPredicate(ID, P_Wrap), AR(AR), Flags(Flags) {}		: SCEVPredicate(ID, P_Wrap), AR(AR), Flags(Flags) {}

const SCEV *SCEVWrapPredicate::getExpr() const { return AR; }		const SCEV *SCEVWrapPredicate::getExpr() const { return AR; }

bool SCEVWrapPredicate::implies(const SCEVPredicate *N) const {		bool SCEVWrapPredicate::implies(const SCEVPredicate *N) const {
const auto *Op = dyn_cast<SCEVWrapPredicate>(N);		const auto *Op = dyn_cast<SCEVWrapPredicate>(N);
		[Inline comment, Not Done] sbaranga: I think it would be better to move this to Scalar Evolution itself, instead of having it in the rewriter. It would essentially be a lazy analysis, and would also take as an argument the loop and would return the analyzed expression and a set of predicates. That way we don't have to do the analysis again for every instantiation of a SCEVPredicateRewriter.
		[Inline comment, Not Done] dorit (author): So where/when would this analysis be triggered? And where would we cache the result of the ScalarEvolution analysis? Could you please elaborate on the flow of things you are suggesting, and on what exactly you mean by lazy here? I agree of course that we should avoid repeating the analysis over and over again; the caching of the analysis that I was going to add, along with guarding the analysis with "if(NewPreds)", would guarantee that if the analysis succeeded once (when NewPreds was passed) then any time we are passed "Preds" we will not repeat the analysis (because I was going to add a new kind of Predicate, and cache things in Preds). But indeed if the analysis fails, this caching will not prevent us from repeating it (and failing) every time we are passed NewPreds. So maybe what this means is that the caching should be done in a new data structure to be added to PSCEV, separately from Preds, where we would cache both a failure (simply associate the UnknownSCEV of the phi node with itself) and a success (associate the UnknownSCEV with the respective AddRec; of course this AddRec would itself have a Predicate that the analysis will have already added). This way we could check the results of the analysis regardless of whether we are passed Preds. If we do that, would your suggestion above still be relevant?
		[Inline comment, Not Done] sbaranga: I was thinking of using the same mechanism that SCEV already has for caching SCEV expressions (and which we also use to store SCEV predicates). Essentially, PSCEV would call SCEV from here and SCEV would check to see if it has already analyzed the node or not. If not, it would do this analysis and store the result (using the loop + the SCEV Unknown as keys for further lookups). As you've said, in case of failure we can just return the SCEVUnknown expression without any additional predicates. This would essentially be the same thing SCEV does for getSCEV().
		[Inline comment, Not Done] dorit (author): So just making sure: currently SCEV caches things in the ValueExprMap. Are you suggesting to add a new member to SCEV, to store the result of the analysis? (Where the result of the analysis is: "the SCEVUnknown %x in loop L can be rewritten to the AddRec '(0,+,%step)' if the Predicate 'Flags=NSSW, AR=(0,+,trunc(%step)' is added"?)
		[Inline comment, Not Done] sbaranga: I think the result of the analysis should be "for loop L the loop-variant SCEVUnknown can be re-written to another SCEV, given a set of predicates", so the analysis gives you both the SCEV and a set of predicates. This should be general enough to use in more cases. I would be happy with either reusing the same ValueExprMap for storage, or adding another ValueExprMap (they would probably both work). In general I'm not too picky about this as long as it is sensible (although @sanjoy will probably have something to say).
		[Inline comment, Not Done] dorit (author): SCEV's ValueExprMap maps a Value to a SCEVExpr; and we want to map a <UnknownSCEV, Loop> pair to a <SCEVExpr, setOfPredicates> pair. Are we talking about the same ValueExprMap? Also, you commented earlier that SCEV uses the ValueExprMap to also cache SCEV predicates; could you please point me to where? (I thought that predicates and predicate-based rewrites were stored only in PSCEV's Preds and RewriteMap.) Thanks
		[Inline comment, Not Done] sbaranga: Sorry, I should have looked at the code earlier. What I meant was adding a FoldingSet, like the ones used by ScalarEvolution to store SCEVs and Preds (see UniqueSCEVs and UniquePreds). In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution. It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value.
		[Inline comment, Not Done] dorit (author): > In fact what I have in mind is just adding another FoldingSet next to the two existing folding sets in ScalarEvolution. Isn't it more natural to hold the mapping from the unknownSCEV+Loop to the Predicate+AddRecExpr in a map like this: DenseMap<std::pair<const Loop *, const SCEV *>, std::pair<const SCEV *, SmallVector<const SCEVPredicate *, 2>>>? > It would probably be easier to use a SCEVUnknown instead of a Value as a key, since it already has the callback to handle RAUW for the underlying Value. Wait, why would we be replacing uses here? We will be recording here just a tentative mapping, which will be valid only if the PSCEV caller decides to actually add these predicates and SCEV rewrites to its Preds and RewriteMap? I'll upload a new patch along these lines next week (but if something in the above sounds wrong please shout!)
		[Inline comment, Not Done] sbaranga: Using DenseMap should be ok as well. Regarding replacing uses: we need to handle this case in order to properly cache the result of the analysis (because passes that use SCEV can replace uses). This should be fine however if we use SCEVs instead of Values. So this shouldn't be a problem with the DenseMap that you want to use.

return Op && Op->AR == AR && setFlags(Flags, Op->Flags) == Flags;		return Op && Op->AR == AR && setFlags(Flags, Op->Flags) == Flags;
}		}
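
The caching scheme discussed in the inline thread above (a map from an <expression, loop> pair to a <rewritten expression, predicates> pair, where a failed analysis is cached as the expression mapped to itself with no predicates) can be sketched as follows. This is an illustration under stated assumptions, not the patch's implementation: `std::map` stands in for `DenseMap`, integer IDs stand in for `SCEV`, `Loop`, and `SCEVPredicate` pointers, and `Rewrite`, `PredicatedRewriteCache`, and `lookupOrAnalyze` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using ExprId = int; // stand-in for const SCEV *
using LoopId = int; // stand-in for const Loop *
using PredId = int; // stand-in for const SCEVPredicate *

struct Rewrite {
  ExprId Result;             // rewritten expression (or the input on failure)
  std::vector<PredId> Preds; // runtime checks the caller must add
};

class PredicatedRewriteCache {
public:
  // The analysis returns true and fills Out on success.
  using Analysis = std::function<bool(ExprId, LoopId, Rewrite &Out)>;

  // Returns the cached rewrite, running Analyze at most once per key.
  const Rewrite &lookupOrAnalyze(ExprId E, LoopId L, const Analysis &Analyze) {
    const std::pair<ExprId, LoopId> Key(E, L);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;

    Rewrite R{E, {}};
    if (!Analyze(E, L, R))
      R = Rewrite{E, {}}; // cache the failure: expression maps to itself
    auto Ins = Cache.emplace(Key, std::move(R));
    assert(Ins.second && "key was checked to be absent");
    return Ins.first->second;
  }

  std::size_t size() const { return Cache.size(); }

private:
  std::map<std::pair<ExprId, LoopId>, Rewrite> Cache;
};
```

Caching the failure case is what keeps a rewriter from re-running an analysis that is known not to succeed. The cached predicates remain tentative until the caller actually adds them.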

bool SCEVWrapPredicate::isAlwaysTrue() const {		bool SCEVWrapPredicate::isAlwaysTrue() const {
		[Inline comment, Not Done] Ayal: exits >> latches
SCEV::NoWrapFlags ScevFlags = AR->getNoWrapFlags();		SCEV::NoWrapFlags ScevFlags = AR->getNoWrapFlags();
IncrementWrapFlags IFlags = Flags;		IncrementWrapFlags IFlags = Flags;

if (ScalarEvolution::setFlags(ScevFlags, SCEV::FlagNSW) == ScevFlags)		if (ScalarEvolution::setFlags(ScevFlags, SCEV::FlagNSW) == ScevFlags)
IFlags = clearFlags(IFlags, IncrementNSSW);		IFlags = clearFlags(IFlags, IncrementNSSW);

return IFlags == IncrementAnyWrap;		return IFlags == IncrementAnyWrap;
}		}

void SCEVWrapPredicate::print(raw_ostream &OS, unsigned Depth) const {		void SCEVWrapPredicate::print(raw_ostream &OS, unsigned Depth) const {
OS.indent(Depth) << *getExpr() << " Added Flags: ";		OS.indent(Depth) << *getExpr() << " Added Flags: ";
if (SCEVWrapPredicate::IncrementNUSW & getFlags())		if (SCEVWrapPredicate::IncrementNUSW & getFlags())
OS << "<nusw>";		OS << "<nusw>";
if (SCEVWrapPredicate::IncrementNSSW & getFlags())		if (SCEVWrapPredicate::IncrementNSSW & getFlags())
OS << "<nssw>";		OS << "<nssw>";
OS << "\n";		OS << "\n";
}		}

SCEVWrapPredicate::IncrementWrapFlags		SCEVWrapPredicate::IncrementWrapFlags
SCEVWrapPredicate::getImpliedFlags(const SCEVAddRecExpr *AR,		SCEVWrapPredicate::getImpliedFlags(const SCEVAddRecExpr *AR,
		[Inline comment, Not Done] Ayal: Alternatively, you can bail out immediately and return nullptr when multiple distinct BackEdge/Start values are found. Then these checks should be asserts?
ScalarEvolution &SE) {		ScalarEvolution &SE) {
IncrementWrapFlags ImpliedFlags = IncrementAnyWrap;		IncrementWrapFlags ImpliedFlags = IncrementAnyWrap;
SCEV::NoWrapFlags StaticFlags = AR->getNoWrapFlags();		SCEV::NoWrapFlags StaticFlags = AR->getNoWrapFlags();

// We can safely transfer the NSW flag as NSSW.		// We can safely transfer the NSW flag as NSSW.
if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNSW) == StaticFlags)		if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNSW) == StaticFlags)
ImpliedFlags = IncrementNSSW;		ImpliedFlags = IncrementNSSW;

if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNUW) == StaticFlags) {		if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNUW) == StaticFlags) {
// If the increment is positive, the SCEV NUW flag will also imply the		// If the increment is positive, the SCEV NUW flag will also imply the
// WrapPredicate NUSW flag.		// WrapPredicate NUSW flag.
if (const auto *Step = dyn_cast<SCEVConstant>(AR->getStepRecurrence(SE)))		if (const auto *Step = dyn_cast<SCEVConstant>(AR->getStepRecurrence(SE)))
if (Step->getValue()->getValue().isNonNegative())		if (Step->getValue()->getValue().isNonNegative())
ImpliedFlags = setFlags(ImpliedFlags, IncrementNUSW);		ImpliedFlags = setFlags(ImpliedFlags, IncrementNUSW);
}		}

return ImpliedFlags;		return ImpliedFlags;
}		}

/// Union predicates don't get cached so create a dummy set ID for it.		/// Union predicates don't get cached so create a dummy set ID for it.
SCEVUnionPredicate::SCEVUnionPredicate()		SCEVUnionPredicate::SCEVUnionPredicate()
		[Inline comment, Not Done] Ayal: This check seems redundant, as we're stopping on the first index found to be a phi or a simple casted phi, right? Simply break when found, and check if i == e afterwards, setting FoundIndex = i (if not).
: SCEVPredicate(FoldingSetNodeIDRef(nullptr, 0), P_Union) {}		: SCEVPredicate(FoldingSetNodeIDRef(nullptr, 0), P_Union) {}

bool SCEVUnionPredicate::isAlwaysTrue() const {		bool SCEVUnionPredicate::isAlwaysTrue() const {
return all_of(Preds,		return all_of(Preds,
[](const SCEVPredicate *I) { return I->isAlwaysTrue(); });		[](const SCEVPredicate *I) { return I->isAlwaysTrue(); });
}		}

ArrayRef<const SCEVPredicate *>		ArrayRef<const SCEVPredicate *>
[9 lines not shown]	if (const auto *Set = dyn_cast<SCEVUnionPredicate>(N))
return all_of(Set->Preds,		return all_of(Set->Preds,
[this](const SCEVPredicate *I) { return this->implies(I); });		[this](const SCEVPredicate *I) { return this->implies(I); });

auto ScevPredsIt = SCEVToPreds.find(N->getExpr());		auto ScevPredsIt = SCEVToPreds.find(N->getExpr());
if (ScevPredsIt == SCEVToPreds.end())		if (ScevPredsIt == SCEVToPreds.end())
return false;		return false;
auto &SCEVPreds = ScevPredsIt->second;		auto &SCEVPreds = ScevPredsIt->second;

return any_of(SCEVPreds,		return any_of(SCEVPreds,
		[Inline comment, Not Done] sbaranga: Why is it correct to add the NSW flag here? I'm worried that it's somehow implied by the predicates that we're adding.
		[Inline comment, Not Done] dorit (author): Not sure about this, and looking at this again I can't justify SCEV::FlagNSW here. Probably SCEV::FlagAnyWrap is all we can do here (as without the predicate we know nothing because of the truncate). Right?
		[Inline comment, Not Done] sbaranga: Sounds correct, we should drop the NSW flag.
                [N](const SCEVPredicate *I) { return I->implies(N); });
}

const SCEV *SCEVUnionPredicate::getExpr() const { return nullptr; }

void SCEVUnionPredicate::print(raw_ostream &OS, unsigned Depth) const {
  for (auto Pred : Preds)
    Pred->print(OS, Depth);
}

void SCEVUnionPredicate::add(const SCEVPredicate *N) {
  if (const auto *Set = dyn_cast<SCEVUnionPredicate>(N)) {
    for (auto Pred : Set->Preds)
      add(Pred);
    return;
  }

  if (implies(N))
    return;
sbaranga: Same as above (why can we add the NSW flag here?).
dorit (author): Again, not sure about this. I thought we could put here what the predicate guarantees, so if we added an NSSW assumption we could set the NoWrapFlags to SCEV::FlagNSW (right?). Originally I only looked at the sext pattern, which is why I used the NSW flag. I then extended the analysis to also consider the zext pattern, but didn't go back to fix the flag. So if we added an NUSW predicate, would it be correct to set the flags to SCEV::FlagNUW? (NUSW and NUW don't have the same semantics.) Maybe SCEV::FlagNW makes the most sense in that case?
sbaranga: We can't add NSW/NUW to SCEV expressions if we infer them from SCEV predicates. Doing so would essentially mean that the NUW/NSW flags are not predicated (which isn't true), and it can technically lead us to false conclusions (we could even use the nsw/nuw flags to prove the original predicate, which is incorrect).

  const SCEV *Key = N->getExpr();
  assert(Key && "Only SCEVUnionPredicate doesn't have an "
                " associated expression!");

  SCEVToPreds[Key].push_back(N);
  Preds.push_back(N);
}
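
The inline discussion above notes that NUSW and NUW do not have the same semantics. A minimal sketch of the difference on 8-bit values follows, assuming the definition documented for `IncrementNUSW` in ScalarEvolution.h (the left operand is read as unsigned, the increment as signed, and the sum must stay in `[0, 2^n)`). The helper names `addIsNUW`, `addIsNUSW` and `addIsNSW` are hypothetical and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

// NUW: a + b does not wrap when both operands are read as unsigned 8-bit.
bool addIsNUW(uint8_t a, uint8_t b) {
  int wide = int(a) + int(b);
  return wide <= 255;
}

// NUSW (assumed semantics): 0 <= Unsigned(a) + Signed(b) < 2^8.
bool addIsNUSW(uint8_t a, uint8_t b) {
  int wide = int(a) + int(static_cast<int8_t>(b));
  return wide >= 0 && wide <= 255;
}

// NSW: a + b does not wrap when both operands are read as signed 8-bit.
bool addIsNSW(uint8_t a, uint8_t b) {
  int wide = int(static_cast<int8_t>(a)) + int(static_cast<int8_t>(b));
  return wide >= -128 && wide <= 127;
}
```

With `a = 10` and `b = 0xFF`, the increment is -1 when read as signed, so the addition satisfies NUSW (10 - 1 = 9) but not NUW (10 + 255 = 265). This is one way to see why an NUSW predicate cannot simply be translated into `SCEV::FlagNUW`, independently of the separate point that flags inferred from predicates must not be attached to the expression at all.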

llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll

				; RUN: opt -S -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 < %s 2>&1 | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				; Check that the vectorizer identifies the %p.09 phi,
				sbaranga (inline comment): Could you add some text for each of these saying what predicates get added?
				; as an induction variable, despite the potential overflow
				; due to the truncation from 32bit to 8bit.
				; SCEV will detect the pattern "sext(trunc(%p.09)) + %step"
				; and generate the required runtime overflow check under which
				; we can assume no overflow. See pr30654.
				;
				; int a[N];
				; void doit1(int n, int step) {
				; int i;
				; char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;

				; CHECK-LABEL: @doit1
				; CHECK: vector.scevcheck
				; CHECK: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: vector.body:
				; CHECK: <4 x i32>

				@a = common local_unnamed_addr global [250 x i32] zeroinitializer, align 16

				; Function Attrs: norecurse nounwind uwtable
				define void @doit1(i32 %n, i32 %step) local_unnamed_addr {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.09 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%sext = shl i32 %p.09, 24
				%conv = ashr exact i32 %sext, 24
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}

				; Same as above, but for checking the SCEV "zext(trunc(%p.09)) + %step":
				;
				; int a[N];
				; void doit2(int n, int step) {
				; int i;
				; unsigned char p = 0;
				; for (i = 0; i < n; i++) {
				; a[i] = p;
				; p = p + step;
				; }
				; }
				;

				; CHECK-LABEL: @doit2
				; CHECK: vector.scevcheck
				; CHECK: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK-NOT: %mul = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 {{.*}}, i8 {{.*}})
				; CHECK: vector.body:
				; CHECK: <4 x i32>

				; Function Attrs: norecurse nounwind uwtable
				define void @doit2(i32 %n, i32 %step) local_unnamed_addr {
				entry:
				%cmp7 = icmp sgt i32 %n, 0
				br i1 %cmp7, label %for.body.preheader, label %for.end

				for.body.preheader:
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.body:
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
				%p.09 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
				%conv = and i32 %p.09, 255
				%arrayidx = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 %indvars.iv
				store i32 %conv, i32* %arrayidx, align 4
				%add = add nsw i32 %conv, %step
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
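
The comments in the test describe treating `%p.09` as an induction variable under a runtime overflow check. The sketch below models that idea for the `doit1` loop in plain C++. It is a brute-force illustration and not the algorithm in the patch: `matchesClosedForm` replays the `sext(trunc(p)) + step` update and compares each stored value with the AddRec-style closed form `i * step`, while `noSigned8Wrap` is a hypothetical stand-in for the runtime check. Both function names are assumptions made for this example.

```cpp
#include <cstdint>

// Replays the doit1 loop: conv = sext(trunc_to_i8(p)); p = conv + step.
// Returns true if every stored value equals the closed form i * step.
bool matchesClosedForm(int n, int step) {
  int32_t p = 0;
  for (int i = 0; i < n; ++i) {
    // Truncate to 8 bits, then sign-extend back to 32 bits.
    int32_t conv = static_cast<int8_t>(p);
    if (conv != int64_t(i) * step)
      return false;
    p = conv + step;
  }
  return true;
}

// Hypothetical stand-in for the runtime check: the values i * step are
// monotone in i, so it is enough that the last one fits in a signed i8.
bool noSigned8Wrap(int n, int step) {
  if (n <= 0)
    return true;
  int64_t last = int64_t(n - 1) * step;
  return last >= -128 && last <= 127;
}
```

For example, with `step = 1` the closed form holds for `n = 100` but fails for `n = 200`, because the 8-bit value wraps at iteration 128. Whenever `noSigned8Wrap` returns true, `matchesClosedForm` returns true as well, which is the sense in which the recurrence is valid only under the predicate.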