This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
ClosedPublic

Authored by sbaranga on Feb 12 2016, 7:22 AM.

Details

Summary

When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Diff Detail

Event Timeline

sbaranga updated this revision to Diff 47800.Feb 12 2016, 7:22 AM
sbaranga retitled this revision from to [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV.
sbaranga updated this object.
sbaranga added a subscriber: llvm-commits.
sanjoy requested changes to this revision.Feb 19 2016, 6:04 PM
sanjoy edited edge metadata.

Given that we're now computing predicated trip counts (the bread and butter of SCEV), I think this is good time to start doing some whitebox testing. What I mean by that is to add a way to get SCEV to produce a predicated trip count and dump both the predicate and the trip count, and do llvm-lit/FileCheck type testing on the output. Unfortunately, this will be more work for you, but I think this the right thing to do for the system. We don't want to end up in a situation where adding a test for a predicated trip count related bug involves tricking the loop vectorizer to vectorize a certain loop.

In fact, I think there is already some scope for whitebox testing things like PredicatedScalarEvolution::getAsAddRec, maybe that's a good place to start?

Again, I know this is more work for you; but having the infrastructure to easily add tests is important.

include/llvm/Analysis/ScalarEvolution.h
723

"count of the" is repeated

898–900

Describe UseAssumptions here.

945

I don't think Force is a good name -- how about CreateAssumptions or CreatePredicates? Same comment below.

1343

Add a comment for this (fine to extend the one on getBackedgeTakenCount)

lib/Analysis/ScalarEvolution.cpp
5493

I don't think you need the parens around ENT->Pred.

5949

I don't think this is the right layering -- for instance, this forces SCEV clients that don't want speculative / predicated trip counts to pay to cost of computing them.

I'd say SCEV users that care about predicated trip counts should do this retry themselves, i.e. something like

const SCEV *TC = getBackedgeTakenCount(L);
if (TC is SCEVCouldNotCompute) {
  SE->forgetLoop(L);
  TC = getPredicatedBackedgeTakenCount(L);
}
6984

I don't think you need to (!A) && B, you can !A && B instead.

This revision now requires changes to proceed.Feb 19 2016, 6:04 PM

Given that we're now computing predicated trip counts (the bread and butter of SCEV), I think this is good time to start doing some whitebox testing. What I mean by that is to add a way to get SCEV to produce a predicated trip count and dump both the predicate and the trip count, and do llvm-lit/FileCheck type testing on the output. Unfortunately, this will be more work for you, but I think this the right thing to do for the system. We don't want to end up in a situation where adding a test for a predicated trip count related bug involves tricking the loop vectorizer to vectorize a certain loop.

In fact, I think there is already some scope for whitebox testing things like PredicatedScalarEvolution::getAsAddRec, maybe that's a good place to start?

Again, I know this is more work for you; but having the infrastructure to easily add tests is important.

Thanks, Sanjoy! I agree, we should do as much testing as possible for this. I'll add the tests.

lib/Analysis/ScalarEvolution.cpp
5949

Good point. I'll look into how we could compute this more lazily.

Why do you think we would need a SE->forgetLoop() here? FWIW I'm generally trying to avoid having to invalidate the analysis for a given loop.

sanjoy added inline comments.Feb 22 2016, 8:01 AM
lib/Analysis/ScalarEvolution.cpp
5949

Why do you think we would need a SE->forgetLoop() here?

To forget the cached CouldNotCompute trip count. But you're right -- we're probably better off not invalidating the whole analysis -- we can just have the second call to getPredicatedBackedgeTakenCount overwrite CouldNotCompute.

sbaranga added inline comments.Feb 22 2016, 9:20 AM
lib/Analysis/ScalarEvolution.cpp
5949

Thanks! We'll take this approach then.

sbaranga updated this revision to Diff 49955.Mar 7 2016, 6:37 AM
sbaranga edited edge metadata.

This update addresses the comments from Sanjoy from the previous
review round.

Renamed GuardedBackedgeTakenCount -> PredicatedBackedgeTakenCount, for
consistency with existing code.

Modified ScalarEvolution's print method to also print the predicated
backedge taken count. Used this feature to add regression tests that
directly check SCEV's output.

Cloned the backedge taken info map to store information for the predicated
backedge taken count, and used this to lazily compute the predicated
backedge taken count. This will allow users of SCEV that don't need
the predicated backedge taken count to not pay the cost of computing it.
This approach is also much less bug-prone then the previous one, as
the new information (the predicated backedge taken count) is is another
data structure that must be explicitly accessed.

Hi Sanjoy,

I've added the test/Analysis/ScalarEvolution/predicated-trip-count.ll test which goes through some use cases of this feature for whitebox testing.
Was this what you had in mind, or do you think we should also be doing something else?

Thanks,
Silviu

test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll
54

I had to change this test because it started to figure out what the backedge taken count was.

This was testing the vectorization remark when we cannot find the backedge taken count. changed the test so that it will continue not be able to get the backedge taken count.

What is interesting about this is the way we've managed to get the (exact) backedge taken count. The initial analysis was only able to get the maximum backedge taken count but not the exact one. However, we can use this to get a better SCEV for cmp3.

On a following invocation of computeBackedgeTakenCount this information is used to get an exact backedge taken count.

Hi Silviu,

Was this what you had in mind, or do you think we should also be doing something else?

Yes, this is what I had in mind, so thanks! I'm still not done
reviewing the change, but I've added some minor comments inline. I'll
try to finish the review by Monday (no need to address the inline
comments before then).

Btw, it would be great to have (in a later change) similar llvm-lit
tests for PredicatedScalarEvolution::getAsAddRec as well; but I'll
leave it to you to make a call on that.

include/llvm/Analysis/ScalarEvolution.h
873–876

This is minor, but the IsGuarded name is somewhat misleading, since the loop may not already be guarded. Why not call this AddPredicates or AllowPredicates?

lib/Analysis/ScalarEvolution.cpp
5272

Use auto here.

5278

Add a /* IsGuarded = */ before the true.

5826

Please add a comment as /* IsGuarded = */ true.

6986

No need to fix this in this change, but a nicer API could be to have convertSCEVToAddRecWithPredicates return an add recurrence or null if it failed (and get rid of the dyn_cast).

8600

I think you can just fall through the logic below (under // Avoid weird loops). Also, the comment on the api change to convertSCEVToAddRecWithPredicates also applies here.

sanjoy added inline comments.Mar 10 2016, 6:27 PM
test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll
54

Interesting. Do you mean "However, we can use this to get a better
SCEV for %0"? I can see how SimplifyICmpOperands would be able to
use a tighter range on %0 to simplify the sle into an slt.

It would be great if SCEV could directly compute the backedge count
here though. The problem is that we didn't have a way for
computeExitLimitFromCond to say "BECount is Infinite if L is
INT_MAX else is L + 1 (say)" so computeExitLimitFromCond would
have to give up in the face of the dreaded COULDNOTCOMPUTE. But
with your work, that is changing (ability to represent predicated BE
counts); and perhaps one day SCEV will be able to directly compute the
backedge taken counts of loop like these. :)

sanjoy requested changes to this revision.Mar 11 2016, 4:56 PM
sanjoy edited edge metadata.
sanjoy added inline comments.
include/llvm/Analysis/ScalarEvolution.h
565

Now that you're embedding SCEVUnionPredicate in every
ExitNotTakenInfo, I'm somewhat worried about the `SmallVector<const
SCEVPredicate *, 16> Preds;` :)

Firstly, what is the typical number of predicates we see? 16 seems
higher that what I would expect.

Secondly, a better (in terms of memory consumption) solution would be
to have a std::unique_ptr<SCEVUnionPredicate> here, where a null
value means "always true" so that in the non-predicated case we only
use up one word.

The most compact solution (I think we should go with this) is to:

  • Extract out a struct ExitNotTakenExtras that contains a SmallVector of ExitNotTaken s and a SCEVUnionPredicate
  • Change PointerIntPair<ExitNotTakenInfo*, 1> NextExit to PointerIntPair<ExitNotTakenExtras *, 1> NextExit where it is usually null, but if it is non-null you either have multiple exits or have some SCEV predicates.

Normally I wouldn't worry so much about space, but ExitNotTakenInfo
has clearly been optimized for space usage, so we should try not to
break that.

702

Add a comment here that Max is valid only if the predicates on each of the ExitNotTakenInfo instances is true.

708–709

I think the right API here is to promote the std::pair into a full struct that carries the block, backedge count and a predicate; and not have a separate vector for ExitPreds. You could also use std::tuple here (since the types are all different, there isn't much scope for confusion).

725

Can you please make this comment a little more clear? Is it that the caller needs to know in advance, by some other means, that the backedge taken count is predicated and pass in a non-null Predicates? If so, what is that other means?

The always correct (independent of any assumptions that should be checked at run-time) is misleading too -- you do sometimes return exact BE counts are dependent on a run-time assumption.

759

Here and elsewhere, can you please add a newline after field declarations (i.e. after the declaration for BackedgeTakenCounts)?

871

I'd remove the (the exact backedge taken count will be known) bit -- it can otherwise mislead people into thinking that this will return a precise be count, no matter what.

887–888

Nit: the parameter name is IsGuarded and not Guarded. But, as I said earlier, it is probably better to rename this to AddPredicates, AllowPredicates or CreatePredicates as you have below.

1348

Nit: indentation.

lib/Analysis/ScalarEvolution.cpp
5364

Usually I see lambdas formatted with no space between the ] and the (. I'd say just run the diff through clang-format before checkin, and whatever it does is fine. :)

5372

What you have here is fine, but I'd have tried:

for (auto &Map : {BackedgeTakenCounts, PredicatedBackedgeTakenCounts}) {
  // Remove L from Map
}

(not 100% sure if the above will work).

5585–5588

Add an assert here that if Guarded is false, then the predicate in EL is trivially true (i.e. we didn't add a predicate when we were not supposed to).

5598

This does not look right to me -- won't the EL be freed after each iteration, leaving a dangling pointer in ExitCountPreds?

This revision now requires changes to proceed.Mar 11 2016, 4:56 PM

Hi Sanjoy,

Thanks for the comments! Some replies inline.

Regarding testing getAsAddRec, we'll most likely have to do it in a pass like LoopAccessAnalysis, since the call takes a Loop parameter. We already do something similar in test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll, although we just check the resulting predicates there (it would be nice to also get the initial and final expressions).

Thanks,
Silviu

include/llvm/Analysis/ScalarEvolution.h
565

Ok, the most common size of the SCEVUnionPredicate there should be 0, so going with your solution makes sense to me.

On the other hand, it does seem a bit strange to optimize this for memory consumption. I wouldn't expect it to make that much of a difference.

lib/Analysis/ScalarEvolution.cpp
5598

Of course, thanks for catching this!

test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll
54

Yes, that is basically what is happening (we can now use the range information on %0).

You're correct, with these changes we will have a way of doing the sle/ule to slt/ult conversions even without the appropriate range information. In fact that would be something really nice to have (I've even seen people hitting this issue on llvm-dev, so maybe this is not just a corner case).

sanjoy added a subscriber: atrick.Mar 14 2016, 11:21 AM
sanjoy added inline comments.
include/llvm/Analysis/ScalarEvolution.h
565

My guess is that there are enough ExitNotTakenInfo instances floating around in a typical compilation that its size becomes important. But maybe @atrick has a more cogent answer?

atrick added inline comments.Mar 14 2016, 11:40 AM
include/llvm/Analysis/ScalarEvolution.h
565

I don't have a better answer but agree that SCEV instances need to be size conscious, just as other IR data types do. I would avoid allocating and initializing 16 words that are unlikely to be used.

sbaranga updated this revision to Diff 51037.Mar 18 2016, 10:06 AM
sbaranga edited edge metadata.

Reworked the ExitNotTakenInfo allocation scheme to avoid allocating
un-needed SCEVUnionPredicates.

The new allocation scheme follows Sanjoy's idea:
we have a new optional structure that can hold optional (almost always not
needed) data. The optional data contains a SCEVPredicate and a vector
of ExitNotTakenInfo structs. The first elemet has the optional info if it
has a SCEVPredicate or there are more than one loop exits. The other
ExitNotTakenInfo structs will contain the extra info only if they have
a SCEVPredicate.

Note that we could pottentially modify this scheme such that the first
loop exit has a union of all the SCEV predicates (which doesn't have
the per-exit information, but uses less memory - in the cases where
we do need SCEV predicates).

Since the new structure is not trivial to traverse, we also have a new
iterator for this (which keeps the traversals simple to write).

Also followed up on the reset of the review comments from the last round.

sbaranga updated this revision to Diff 51184.Mar 21 2016, 10:10 AM
sbaranga edited edge metadata.

Minor cleanup: use range-based for loops when traversing the ExitNotTakenInfo structs, since we now have iterators.

sbaranga added inline comments.Mar 21 2016, 10:32 AM
lib/Analysis/ScalarEvolution.cpp
5372

Sanjoy, I've left this as is for now. Let me know if you prefer the form above, and I'll try to change the code to use it.

sanjoy requested changes to this revision.Mar 22 2016, 2:07 PM
sanjoy edited edge metadata.

Hi Silviu,

This is looking close to ready to land; I mostly have minor comments except on the ExitNotTakenExtras structure, which is more complex than I had had in mind.

include/llvm/Analysis/ScalarEvolution.h
557

Leave a line between declarations.

575

This is pretty obvious by name, but I'd add a doxygen comment for consistency.

580

I'd remove the braces. Also a more LLVM-idiomatic way of writing this would be

if (auto *Info = ExtraInfo.getPointer())
  return &Info->Pred;
return nullptr;
591

I'd mildly prefer return !getPred() || getPred()->isAlwaysTrue();, but if you like this better, that's fine too.

595

This is great!

654

The following does not need to be done for this change, but this is
just to note some cleanup work one of us should consider taking up in
the future:

Now that we're being so nice and well factored about this, I think it
will be even nicer if we have a good containment relationship.
Specifically, it will be nice to have a top level ExitTakenInfoSet
struct that contains one inlined ExitTakenInfo object, and a
PointerIntPair of ExitTakenInfoSetExtras and the isComplete bit.

665

I think the code would be cleaner if we changed Pred to a std::vector or SmallVector of predicates, and have the invariant be:

  • ExtraInfo in the "root" ExitNotTaken instance is nullptr means that there is one unpredicated exit count
  • ExtraInfo in the "root" ExitNotTaken instance is not nullptr means that i'th exit count is in i == 0 ? Root : Root->Extra.Exits[i - 1] (exactly as it is today), and the predicate for the i'th exit count is Root->Extra ? Root->Extra.Exits[i] : AlwaysTrue.

This is less efficient than the layout you have here, but is simpler; and we still have the nice property that unpredicated exit counts for single exit loops can be represented compactly.

Does the above sound feasible? If it does, then lets please go ahead with that; else let me know and I'll review the BackedgeTakenInfo constructor as is.

666

Again, please leave blank lines between decls.

894

Please consistently use one of AllowPredicates or CreatePredicates. Also there is some inconsistency around when CreatePredicates has a default value and when it does not -- is there a reason for the difference?

lib/Analysis/ScalarEvolution.cpp
5372

sgtm

5519–5545

Can you please do one of the following:

  • Comment which numeric field is which
  • Cache the three fields in three local variables, with more mnemonic names

Honestly, reading the code, a struct with named fields would have been better; but we can fix that later.

5551

Are you unconditionally assuming !std::get<2>(ExitCounts[0]).isAlwaysTrue() here (given that you're unconditionally adding 1)? If so, please add a brief comment on why that is okay.

This revision now requires changes to proceed.Mar 22 2016, 2:07 PM
sbaranga added inline comments.Mar 23 2016, 9:07 AM
include/llvm/Analysis/ScalarEvolution.h
665

The problem with this is we can't get the predicate of an ExitNotTaken without knowing the root, so the information there is not self contained anymore.

So things like the isPredicateAlwaysTrue() method would need to change (because we don't know the root), probably by taking the ExitNotTaken root as the argument?

Would there be a way around this? If not, I would prefer to keep the current solution.

lib/Analysis/ScalarEvolution.cpp
5551

No, we know the first ExitNotTakenInfo has an ExtraInfo struct because we have more than 1 exit (so there's no need to check the predicate). The other entries only have an ExtraInfo if they have a not always true SCEVPredicate.

Perhaps ExtraInfoSize instead of PredsSize would have been better here.

sbaranga updated this revision to Diff 51564.Mar 24 2016, 9:27 AM
sbaranga edited edge metadata.

Replaced the <SCEV*, BasicBlock*, SCEVUnionPredicate> tuple with a new "EdgeInfo" struct.
We now consistently use AllowPredicates eveywhere (defaults to false).

I've left ExitNotTakenInfo as is for now, until we decide how (and if) we should change it.

sanjoy requested changes to this revision.Mar 27 2016, 3:43 PM
sanjoy edited edge metadata.

Mostly nits, except that there may be a correctness issue in SCEVExpander::generateOverflowCheck.

include/llvm/Analysis/ScalarEvolution.h
665

SGTM, lets go with what we have now. I still think the data layout here can be simplified, but we (I'm happy to take care of that) can do that later once this change is in.

887–888

Nit (here and below) "this call will try to"

lib/Analysis/ScalarEvolution.cpp
5557

Can you move this auto &Exits out of the loop?

5562

Can this just be Exits.emplace_back(ExitCounts[i].ExitBlock, ExitCounts[i].taken, Ptr)?

5565

Why not move this to the place statement where you assign to Ptr:

if (!ExitCounts[i].Pred.isAlwaysTrue()) {
  Ptr = &ENT[PredPos++];
  Ptr->Pred = ...
}
lib/Analysis/ScalarEvolutionExpander.cpp
2007–2009

This part is a significant semantic change that I unfortunately hadn't
(but really should have) noticed in earlier revisions. While I don't
yet have a specific example given the current state of the code, I
think it is possible to end up with a circular logic fallacy here.

Earlier, the invariant was: given a no-overflow predicate, we would
write a loop entry predicate EP (a function of the backedge count)
that, when true, would imply no overflow. Formally, EP => NoOverflow.
However, now we're doing something different. Now we
have: EP => (Backedge taken count is BE => NoOverflow) (this part is
the same as earlier), but Backedge taken count is BE is not
axiomatically true -- it is true under the NoOverflow
condition. In other words, the set of logical equations we have are

EP => (Backedge taken count is BE => NoOverflow)
NoOverflow => Backedge taken count is BE

Now we check EP at runtime, so it is known to be true. Given
that, there are two solutions to the above: {`Backedge taken count is
BE` is true, NoOverflow is true}; or {`Backedge taken count is
BE` is false, NoOverflow is false}. The latter solution is
problematic.

One problematic case that won't happen today, but illustrates what I'm
talking about:

for (u32 i = 0; ; i++)
  store volatile (GEP a, i), 0

The above loop can run forever (assuming u32 's overflow is well
defined), but let's say we decide to predicate the increment to an
NUSW increment. Given that, we know the loop won't run more than -1
times (since otherwise we will have a side effect that uses poison).
Howver, the predicate that you'll compute in SCEVExpander in such a
case is -1 == -1, which is trivially true, and now the loop is
miscompiled (instead of running forever, it'll just run -1 times).

Now, I'll note that I've so far not have been able to come with a
problematic case that will break in the current version of the patch,
so it is possible that there is some deeper invariant here that is
obvious.

This revision now requires changes to proceed.Mar 27 2016, 3:43 PM
sbaranga added inline comments.Mar 28 2016, 4:15 AM
lib/Analysis/ScalarEvolutionExpander.cpp
2007–2009

Thanks! That is a very good point and we should be really careful here. I think this change is still correct:

Let's say that we return a symbolic answer X, and the correct answer is Y. The problem is when X != Y and we pass the predicate check.

Because the answer is wrong, some overflow must happen. Since we are passing the NoOverflow check, we need to have X < Y (otherwise we wouldn't be passing the check).

Note that for the first X iterations our predicate holds. I think what makes this correct is the fact that we're using the predicates in HowManyGreaterThans/HowManyLessThans, which means that we need to stop at iteration X.

sanjoy added inline comments.Mar 28 2016, 5:42 PM
lib/Analysis/ScalarEvolutionExpander.cpp
2007–2009

So, to rephrase what I understood, by "with the predicate (IsNoWrap
AR) the trip count of the loop is N", PSE really means "if the add
recurrence AR does not overflow in the first N iterations, then the
loop's count is N". In particular, for the loop

for (unsigned i; ; i++)
  side_effect_use(i);

predicating i++ as NUSW does *not* let you conclude that the trip
count of the loop is UINT_MAX, since just because i++ does not
overflow the in first UINT_MAX iterations does not guarantee that
the loop will exit or have UB in the `UINT_MAX+1 th iteration.

This is a subtle invariant, so at the very least this needs to be
documented explicitly; and the places where we add predicates
HowManyLessThans etc. need to be audited carefully (it sounds like
you already have done most of that?).

Finally, if possible, it will be great if we can structure the code in
a way that will make (accidentally) breaking this invariant difficult;
though I don't have anything concrete in mind right now.

sbaranga added inline comments.Mar 29 2016, 5:31 AM
lib/Analysis/ScalarEvolutionExpander.cpp
2007–2009

Yes, this is a very good description of the logic!

Indeed, this is subtle so it needs documenting. I did audit the places where we've added the predicates, but I'd like to have another look.

sbaranga updated this revision to Diff 52191.Mar 31 2016, 5:42 AM
sbaranga edited edge metadata.

Updated the definition of the SCEVWrapPredicate so that its semantics are now decoupled from the backedge taken count.
This should address the other review comments as well.

sbaranga marked 4 inline comments as done.Mar 31 2016, 5:59 AM
sbaranga added inline comments.
include/llvm/Analysis/ScalarEvolution.h
654

Makes sense to me.

665

Thanks! Greatly appreciated :).

lib/Analysis/ScalarEvolutionExpander.cpp
2007–2010

I've re-audited the

Please see the updated definition of the SCEVWrapPredicate which should make this point clear.

Regarding the code restructuring: I don't see a good solution for this either.

sbaranga added inline comments.Apr 1 2016, 9:05 AM
lib/Analysis/ScalarEvolutionExpander.cpp
2007–2010

I meant to write that I've re-audited the places where we were using the predicates, and I think everything is ok.

sanjoy accepted this revision.Apr 4 2016, 11:47 PM
sanjoy edited edge metadata.

Hi Silviu,

This lgtm now! Thanks for bearing with the long review process. I have a minor "messaging" comment inline, but otherwise things look great!

include/llvm/Analysis/ScalarEvolution.h
278–279

What I have in mind is slightly different (same content, said
differently):

  1. If a SCEVPredicate says AR is nssw then AR *is* nssw for the entire iteration space of the loop. No exceptions.
  2. The predicated BE taken computation logic now needs to be aware of a possible circular logic issue -- if it adds a predicate forcing an AR to be nssw then AR is nssw only for the trip count it itself computes. This means it cannot use the no-overflow property in certain ways the non-predicated BE taken computation can.

If (1) is ever incorrect (i.e. the predicate fails to hold) then we've
already "lost" (i.e. miscompiled) because
getPredicatedBackedgeTakenCount gave us the wrong answer.

In other words, I think you're downplaying the fact that the "burden"
is really on getPredicatedBackedgeTakenCount, and not on the general
notion of predicate itself. For instance, there is no complication at
all if we don't also use predicated backedge taken counts (i.e. we use
normal backedge taken counts) with SCEVPredicate.

This revision is now accepted and ready to land.Apr 4 2016, 11:47 PM

Hi Silviu,

This lgtm now! Thanks for bearing with the long review process. I have a minor "messaging" comment inline, but otherwise things look great!

Thanks a lot for reviewing! Some comments inline, but it looks like this shouldn't be too difficult to resolve.

-Silviu

include/llvm/Analysis/ScalarEvolution.h
278–279

Yes, this was what I also had in mind initially.

However, I found it more easy to reason about things this way. Here is the reason:

The new definition also happens to imply the old one for all code that also checks the predicated backedge taken count, and it only makes a difference when computing this predicated backedge taken count.

It removes the circular logic issue for our current use case. That is, we don't need to know that the predicated backedge taken count is correct in order to check the predicate, so the reasoning would look like:

wrap predicate (+ possibly some other predicates) => predicated backedge taken count is correct
predicated backedge taken count is correct + wrap predicate => Wrap predicate is true throughout the loop

which looked like a slightly more elegant way of reasoning about this.

The burden is still on the getPredicatedBackedgeTakenCount to correctly use the predicates and not get itself into circular logic.

Also, we can still use the backedge taken count here since if we can compute it, the predicated backedge taken count would be equal to it and have no predicates.

Given this, what do you think? If you still like to go with your version, we can do that.

sanjoy added inline comments.Apr 5 2016, 12:56 PM
include/llvm/Analysis/ScalarEvolution.h
276

Optional suggestion: I'd change this line to "Note that this predicate does not imply ... taken count". The "that X is interpreted as a SCEV expression" sounds confusing on the first reading (since we've already established that X is a SCEV expression returned by getPredicatedBackedgeTakenCount.

278

Optional suggestion: I'd remove the "Wrap" from "nusw Wrap", since we already have "w" for "wrap" in "nusw".

278–279

SGTM, what you have is fine. I added some minor wording suggestions -- all of which are optional to apply.

sbaranga updated this revision to Diff 52784.Apr 6 2016, 6:22 AM
sbaranga edited edge metadata.

Improve description for SCEVWrapPredicate.

sbaranga closed this revision.Apr 6 2016, 6:23 AM

Thanks! Committed in r265535.

-Silviu

Reverted because this caused the following failure: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/7327

As far as I know this is isolated to windows. It's not entirely clear why this happening..

Re-committed in r265786 with a fix. Essentially, we can't use a pointer allocated by new with PointerIntPair, since new doesn't guarantee alignment, which is required by PointerIntPair. So I've broken this up into a pointer and a bool.

We could use a special allocator (SpecificBumpPtrAllocator?) to guarantee alignment in order to work around this, but it doesn't make much sense since the allocation scheme for ExitNotTakenInfo/Extras will probably change soon anyway.