This is an archive of the discontinued LLVM Phabricator instance.

[SCEV][LoopVectorize] Allow ScalarEvolution to make assumptions about overflows
Abandoned · Public

Authored by sbaranga on Jun 1 2015, 9:15 AM.


Details

Reviewers

anemet
mzolotukhin
aschwaighofer
atrick
sanjoy

Summary

Add a new pass - AssumingScalarEvolution - that extends
SCEV, but can make assumptions and generate code that
can check these assumptions. This pass uses the normal
ScalarEvolution pass for anything outside the current
analyzed loop, but uses its own data structures to handle
the SCEVs within the loop. For now, the pass assumes
that chrec expressions will not overflow.

The AssumingScalarEvolution pass can add checks for the
assumptions it has made, so that a loop can be versioned.

We use this pass in the Loop Vectorize pass to add runtime
overflow checks for expressions which can in theory overflow
and would otherwise prevent the vectorization of a loop. Also
note that the runtime checks will almost always pass, since
an overflow is usually a sign of a coding error on the part
of the user.

The main reason for inheriting from ScalarEvolution:
since ScalarEvolution maintains its own cache for various
results, making any assumption would likely require invalidating
the entire cache.

This way we can use the new pass as an almost drop-in
replacement for ScalarEvolution. The users that are doing
a transformation would have to know about it in order to add
the checks.

However, there are probably a lot more ways in which this could
be implemented.

Any comments are much appreciated!

Thanks,
Silviu

Diff Detail

Event Timeline

sbaranga updated this revision to Diff 26900. Jun 1 2015, 9:15 AM

sbaranga retitled this revision from to [SCEV][LoopVectorize] Allow ScalarEvolution to make assumptions about overflows.

sbaranga updated this object.

sbaranga edited the test plan for this revision.

sbaranga added a subscriber: Unknown Object (MLST).

jmolloy added a subscriber: jmolloy. Jun 1 2015, 9:48 AM

Thanks for working on this! cc'ing Andy in case he has an opinion regarding the general approach.

include/llvm/Analysis/ScalarEvolution.h
786	Are these two methods special in theory, in this context, or in the future would overrides for more methods be useful?
1262	Don't use 'chrec' here without defining it.
1313	You should clear the cache here?
lib/Analysis/ScalarEvolution.cpp
9178	Given that the SCEVs are uniqued, you could at least eliminate those that are structurally identical easily.
lib/Transforms/Vectorize/LoopVectorize.cpp
2904	Line too long?

Overall, it's great to see something like this working in practice. In the past, several alternatives have been discussed. IIRC the main objection to this approach is that SCEV could make assumptions that are not necessary for the optimization leading to overhead for unnecessary checks.

Given this is a major design decision, it would be good to have feedback from anyone who has done a lot of work in the area (thanks to Hal for the review). What about Sanjoy, Arnold (especially for the vectorizer change), MichaelZ, AdamN ??

I don't see a major problem with this approach. Just a few comments on the implementation...

I'm afraid that iterating over a DenseMap will produce SCEV expressions and IR checks in a nondeterministic order. We usually fix this with a MapVector.

+ AssumptionMapTy::iterator getLoopAssumptionBegin(const Loop *L) {

I don't see AnalyzedLoop being initialized in the ctor:

+ AssumingScalarEvolution::AssumingScalarEvolution() :
+ ScalarEvolution(false, ID), SE(nullptr) {

See Instruction::getModule()

+ Module *M = Loc->getParent()->getParent()->getParent();

On most targets, I think it's more efficient to check the high bits for overflow. e.g.

if ((a + 0x800000) & 0xffffff000000) overflow

Instead of:

+ if (Signed) {
+ APInt CmpMinValue = APInt::getSignedMinValue(DstBits).sext(SrcBits);
+ ConstantInt *CTMin = ConstantInt::get(M->getContext(), CmpMinValue);
+ Value *MinCheck = OFBuilder.CreateICmp(ICmpInst::ICMP_SLT,
+ TripCount,
+ CTMin);
+ TripCountCheck = OFBuilder.CreateOr(TripCountCheck, MinCheck);
+ }
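
The high-bits test suggested above can be compared with a two-comparison range check in a small standalone sketch. The helper names and the use of plain 64-bit integers (instead of APInt and IRBuilder calls) are assumptions made for illustration only. In the quoted example, 0x800000 is 2^23 and the mask 0xffffff000000 selects bits 24 to 47, which corresponds to a 24-bit destination inside a 48-bit source.

```cpp
#include <cstdint>

// Hypothetical helper: range check written as two signed comparisons,
// in the spirit of the compare-against-min/max code in the patch.
// Precondition: 1 <= dstBits < 64.
bool fitsSignedByCompare(int64_t v, unsigned dstBits) {
  const int64_t hi = (int64_t{1} << (dstBits - 1)) - 1;
  const int64_t lo = -hi - 1;
  return v >= lo && v <= hi;
}

// Hypothetical helper: the same check done by biasing the value and testing
// the high bits. Adding 2^(dstBits-1) maps the valid signed range onto
// [0, 2^dstBits), so any set bit at or above position dstBits means the
// value does not fit.
// Precondition: 1 <= dstBits < 64.
bool fitsSignedByHighBits(int64_t v, unsigned dstBits) {
  const uint64_t bias = uint64_t{1} << (dstBits - 1);
  const uint64_t highMask = ~((uint64_t{1} << dstBits) - 1);
  return ((static_cast<uint64_t>(v) + bias) & highMask) == 0;
}
```

Both helpers should agree for every input; whether the masked form is actually cheaper depends on the target and on how the optimizer canonicalizes the comparisons.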

Adding reviewers...

I'll make it a point to do a more detailed review this week, but I
have a high level question:

Instead of a separate AssumingScalarEvolution, why not have an
interface like:

NoOverflowResult ScalarEvolution::proveNoOverflow(SCEVAddRecExpr *AddRec,
    FlagType, bool CanAddAssumptions);

NoOverflowResult can be Yes, No or (if CanAddAssumptions is true) a
predicate that, if satisfied, will guarantee AddRec does not overflow
in the FlagType sense. Then the client can choose if it is profitable
to actually emit the predicate if it is cheap enough relative to expected
payoff.

A much more general interface could be

Result ScalarEvolution::provePredicate(CmpInst::Pred, SCEV *LHS, SCEV *RHS)

and have it return one of "Yes | No | Guarded (Predicate)"

Then not only can you ask the "sext{addrec} == addrec{sext}" question
to check for no-wrap, but also push more general logic into
ScalarEvolution as needed in the future (i.e. is this add-rec equal to this
other add-rec at any iteration? if not, can I make it so by adding runtime
checks?).

Sanjoy
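
A minimal sketch of the "Yes | No | Guarded (Predicate)" result shape described above, written without any LLVM types. Every name here (Predicate, ProofResult, proveCounterNoWrap) is hypothetical and only illustrates how a client could branch on the three outcomes; it is not the interface that was proposed or implemented.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a run-time checkable condition.
struct Predicate {
  std::string Text;
};

// Hypothetical three-state answer: Yes, No, or Guarded(Predicate).
class ProofResult {
public:
  enum class Kind { Yes, No, Guarded };

  static ProofResult yes() { return ProofResult(Kind::Yes, std::nullopt); }
  static ProofResult no() { return ProofResult(Kind::No, std::nullopt); }
  static ProofResult guarded(Predicate P) {
    return ProofResult(Kind::Guarded, std::move(P));
  }

  Kind kind() const { return TheKind; }
  // Only populated for Kind::Guarded.
  const std::optional<Predicate> &guard() const { return Guard; }

private:
  ProofResult(Kind K, std::optional<Predicate> G)
      : TheKind(K), Guard(std::move(G)) {}
  Kind TheKind;
  std::optional<Predicate> Guard;
};

// Toy query: can a uint16_t counter running from 0 up to a bound avoid
// wrapping? knownBound models a bound that is a compile-time constant.
// "No" here means "could not be established", as in the text above.
ProofResult proveCounterNoWrap(std::optional<uint32_t> knownBound,
                               bool canAddAssumptions) {
  if (knownBound)
    return *knownBound <= UINT16_MAX ? ProofResult::yes() : ProofResult::no();
  if (canAddAssumptions)
    return ProofResult::guarded(Predicate{"bound <= 65535"});
  return ProofResult::no();
}
```

With this shape the client, not the analysis, decides whether emitting the guard is worth its cost.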

Hi Hal,

Thanks for having a look! I've only replied to comments related to the design for now. I think we can get the others after we've converged on some conclusion.

Thanks,
Silviu

include/llvm/Analysis/ScalarEvolution.h
786	I think they are special because these expressions cannot fold sext/zext expressions. As far as I can see all the others can, so these seem to be the only expressions that are stopping us from getting AddRecExprs as results. For example, if we have (add (sext {x, +, 1}) y), getting rid of the sext gets us {x + y, +, 1}. However, there are other methods which might be useful to override that are not related to expression folding. For example, SimplifyICmpOperands, when given a ULE predicate, can only canonicalize the comparison if it knows that it can add one to the right operand without overflowing (the operand is loop invariant and could possibly be checked). This issue appears mostly when trying to get the number of backedges taken.
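
To make the folding argument concrete, the sketch below evaluates sext({x,+,1}) and {sext(x),+,1} numerically for an 8-bit recurrence. The function names are hypothetical, and the 8-bit and 32-bit widths are chosen only to keep the numbers small; the point is that the two forms agree exactly while the narrow recurrence does not wrap.

```cpp
#include <cstdint>

// sext({x,+,1}) at iteration k: the recurrence is evaluated in 8 bits (and may
// wrap), then sign-extended. The narrowing conversion is modular on
// mainstream compilers and guaranteed to be so since C++20.
int32_t sextOfNarrowAddRec(int8_t x, uint32_t k) {
  const uint8_t narrow = static_cast<uint8_t>(static_cast<uint8_t>(x) + k);
  return static_cast<int32_t>(static_cast<int8_t>(narrow));
}

// {sext(x),+,1} at iteration k: the recurrence is evaluated in 32 bits.
int32_t wideAddRec(int8_t x, uint32_t k) {
  return static_cast<int32_t>(x) + static_cast<int32_t>(k);
}

// Hypothetical run-time condition under which the two forms agree for every
// iteration in [0, lastIter]: the 8-bit recurrence never passes INT8_MAX.
bool narrowAddRecCannotWrap(int8_t x, uint32_t lastIter) {
  return static_cast<int64_t>(x) + lastIter <= INT8_MAX;
}
```

For x = 100 the forms agree up to iteration 27 and diverge at iteration 28, which is exactly where the condition stops holding.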

Hi Andy,

Thanks for looking at this!

In D10161#192391, @atrick wrote:

Overall, it's great to see something like this working in practice. In the past, several alternatives have been discussed. IIRC the main objection to this approach is that SCEV could make assumptions that are not necessary for the optimization leading to overhead for unnecessary checks.

Would that be on the list? If so, it would be an interesting read.
Yes, in theory this can produce unnecessary checks. I think it works ok for the vectorizer (since any case where we wouldn't make this assumption would result in not vectorizing). It looks like a trade-off between not having all the users deal explicitly with this and the number of memchecks.

On most targets, I think it's more efficient to check the high bits for overflow. e.g.
if ((a + 0x800000) & 0xffffff000000) overflow
Instead of:

+ if (Signed) {
+ APInt CmpMinValue = APInt::getSignedMinValue(DstBits).sext(SrcBits);
+ ConstantInt *CTMin = ConstantInt::get(M->getContext(), CmpMinValue);
+ Value *MinCheck = OFBuilder.CreateICmp(ICmpInst::ICMP_SLT,
+ TripCount,
+ CTMin);
+ TripCountCheck = OFBuilder.CreateOr(TripCountCheck, MinCheck);
+ }

Nice. I wonder if the optimizers handle well the cases where we would test some consecutive values with this method, or if we don't end up generating something similar anyway.

Thanks,
Silviu

Hi Sanjoy,

In D10161#192468, @sanjoy wrote:
I'll make it a point to do a more detailed review this week, but I
have a high level question:

Instead of a separate AssumingScalarEvolution, why not have an
interface like:
NoOverflowResult ScalarEvolution::proveNoOverflow(SCEVAddRecExpr *AddRec,
    FlagType, bool CanAddAssumptions);

I actually started from something similar, but ended up moving to the current form as it seemed cleaner and the loop vectorizer doesn't need the flexibility (if something doesn't come out as a SCEVAddRecExpr it will fail).

I think the users actually want to get a SCEVAddRecExpr (or a SCEVConstant if possible) from a normal SCEV. It is possible to parse the SCEV, figure out if there are any assumptions that can be made, and add them somewhere.

It would also be incorrect in this case to set the nuw/nsw bits (which didn't seem ideal).

Another problem is that ScalarEvolution's expression cache really gets in the way.

NoOverflowResult can be Yes, No or (if CanAddAssumptions is true) a
predicate that, if satisfied, will guarantee AddRec does not overflow
in the FlagType sense. Then the client can choose if it is profitable
to actually emit the predicate if it is cheap enough relative to expected
payoff.

A much more general interface could be

Result ScalarEvolution::provePredicate(CmpInst::Pred, SCEV *LHS, SCEV *RHS)

and have it return one of "Yes | No | Guarded (Predicate)"

Then not only can you ask the "sext{addrec} == addrec{sext}" question
to check for no-wrap, but also push more general logic into
ScalarEvolution as needed in the future (i.e. is this add-rec equal to this
other add-rec at any iteration? if not, can I make it so by adding runtime
checks?).

Sanjoy

That would also be one way to do it. We would probably need some utility functions to drive that interface. The initial solution would be at the exact opposite end of the spectrum (easy to use, but not flexible).

Cheers,
Silviu

Hi Silviu,

Can you please discuss the use-cases? We all ran into SCEV not always being the right vehicle to prove no-overflow but this proposes a pretty big change, so I want to make sure we can't take more targeted/distributed solutions to the problem. (My general feeling is similar to what Andy and Sanjoy have already expressed.)

I feel that in some cases, we can prove no-overflow at *compile* time by further analyzing the IR (like what I am proposing in D10472). Essentially this is relying on C/C++ signed overflow being undefined.

In other cases we may need to prove no-overflow of smaller types so that we can up-level the sign/zero-extensions. Is this perhaps something that's better done in indvars? The idea is (maybe flawed) that you can eliminate an extension in the loop by using an overflow check outside the loop.

Anyhow, you collected some testcases so categorizing the issues would probably help the discussion.

I am also in favor of allowing a finer level of control along the lines of Sanjoy's comments. Your approach may work for the vectorizer but in the case of general dependence analysis, we may not need to prove no-overflow of all pointers. For example, if a pointer can't alias with any other accesses in the loop, we don't care that we can't get a true affine form for it.

Thanks,
Adam

Hi Adam,

In D10161#193173, @anemet wrote:

Can you please discuss the use-cases? We all ran into SCEV not always being the right vehicle to prove no-overflow but this proposes a pretty big change, so I want to make sure we can't take more targeted/distributed solutions to the problem. (My general feeling is similar to what Andy and Sanjoy have already expressed.)

I think the biggest problem would be where overflow can happen depending on the input data. I would prefer to add some no-overflow proving code for cases where it is possible to prove, rather than implementing something like this, but it looks like for certain cases there is no workaround. I've listed some examples below.

I feel that in some cases, we can prove no-overflow at *compile* time by further analyzing the IR (like what I am proposing in D10472). Essentially this is relying on C/C++ signed overflow being undefined.

Ideally we would figure this out at compile time, but I think it would be impossible (or impractical) to cover all the cases where we could prove this, and there are cases where we cannot prove no-overflow (and the overflow condition would depend on input data).

In other cases we may need to prove no-overflow of smaller types so that we can up-level the sign/zero-extensions. Is this perhaps something that's better done in indvars? The idea is (maybe flawed) that you can eliminate an extension in the loop by using an overflow check outside the loop.

Anyhow, you collected some testcases so categorizing the issues would probably help the discussion.

I am also in favor of allowing a finer level of control along the lines of Sanjoy's comments. Your approach may work for the vectorizer but in the case of general dependence analysis, we may not need to prove no-overflow of all pointers. For example, if a pointer can't alias with any other accesses in the loop, we don't care that we can't get a true affine form for it.

Yes, good point! I need to think a bit about the interface (and perhaps do some experimenting), but it should definitely be possible to have something similar to what Sanjoy suggested (and I like the idea).

The test cases I have come from C/C++ where unsigned integers are being used as induction variables and for some reason they would get extended at some point. There are cases where we wouldn't need the extend, but those are outside current scope. The problem with these cases is that the behaviour of unsigned overflow is defined (at least in C/C++), and there is no way of statically reasoning about these cases (we can overflow, and it can for example cause infinite loops). For example:

void test(uint32_t n) {
  for (uint16_t i = 0; i < n; ++i) {
    <do something>
  }
}

Here i can overflow, and for values of n larger than 2^16-1 we'll get an infinite loop.
In fact there is no way to progress here besides versioning the loop as far as I can see.
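
A hand-written sketch of what versioning this loop could look like at the source level. The function name and the counting body are assumptions standing in for <do something>; a real transform would operate on IR and would emit the check from the collected assumptions rather than from a hard-coded bound.

```cpp
#include <cstdint>

// Stand-in for <do something>: counts how many times the body ran.
static void body(uint64_t &counter) { ++counter; }

uint64_t runVersioned(uint32_t n) {
  uint64_t iterations = 0;
  if (n <= UINT16_MAX) {
    // Checked version: i < n <= 65535 holds in every iteration, so ++i cannot
    // wrap and i can be treated as the simple recurrence {0,+,1}.
    for (uint32_t i = 0; i < n; ++i)
      body(iterations);
  } else {
    // Fallback version: the original loop, left untouched.
    for (uint16_t i = 0; i < n; ++i)
      body(iterations);
  }
  return iterations;
}
```

Calling runVersioned(4) takes the checked path and runs the body four times; for n above 65535 the fallback keeps the original behaviour described above.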

A related example:

void test(uint32_t n) {
  for (uint32_t i = 0; i <= n; ++i) {
    <do something>
  }
}

Here we have no extend operations, but for n == 2^32-1 this will be an infinite loop and i will overflow. SCEV gives up on computing the backedge count because there is no correct result that it can give.
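
The idea of a backedge count that is only valid under a guard can be sketched for this loop as follows. Both function names are hypothetical, and std::optional is used here merely to model "no correct result" when the guard fails.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical guarded result: the backedge-taken count of
//   for (uint32_t i = 0; i <= n; ++i)
// is n, but only under the assumption that n + 1 does not wrap.
// Returns std::nullopt when the guard fails (n == UINT32_MAX), in which case
// no finite count exists.
std::optional<uint32_t> guardedBackedgeCount(uint32_t n) {
  if (n == UINT32_MAX)
    return std::nullopt;
  return n;
}

// Reference: count backedges by actually running the loop. Only call this
// with n != UINT32_MAX, otherwise the loop does not terminate.
uint32_t countBackedgesByRunning(uint32_t n) {
  uint32_t backedges = 0;
  for (uint32_t i = 0; i <= n; ++i)
    if (i != n) // every iteration except the last one is followed by another
      ++backedges;
  return backedges;
}
```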

This can affect memory accesses as well:

void test(uint32_t n, uint16_t ind, uint32_t offset, char *a, char *b) {
  for (uint16_t i = 0; i < n; ++i) {
    ind++;
    a[ind + offset] = b[ind + offset] * 3;
  }
}

ind + offset does not evaluate here as a chrec. ind is a chrec, but then we apply zext to it and add it to the offset. The absence of nsw/nuw means that we actually get add(offset, zext({ind, +, 1})) for ind + offset. This causes a number of problems (accesses to a are not consecutive, etc.). And that's correct: they can be non-consecutive for some values of n, but that is unlikely to ever happen at execution.
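
The non-consecutive case can be reproduced with a small sketch that records the indices the loop would touch. The helper names are hypothetical, and n is narrowed to uint16_t here (an assumption, unlike the example above) so that the helper always terminates.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Indices touched by the loop: ind is incremented in 16 bits (and may wrap),
// then zero-extended and added to offset, i.e. add(offset, zext({ind,+,1})).
std::vector<uint32_t> accessIndices(uint16_t ind, uint32_t offset, uint16_t n) {
  std::vector<uint32_t> indices;
  for (uint16_t i = 0; i < n; ++i) {
    ind++;
    indices.push_back(static_cast<uint32_t>(ind) + offset);
  }
  return indices;
}

// True when every index is exactly one past the previous one, which is what
// the affine form {ind + offset, +, 1} would guarantee.
bool isConsecutive(const std::vector<uint32_t> &indices) {
  for (std::size_t i = 1; i < indices.size(); ++i)
    if (indices[i] != indices[i - 1] + 1)
      return false;
  return true;
}
```

Starting from ind = 0 the accesses are consecutive; starting from ind = 65533 the 16-bit increment wraps after two iterations and the index sequence jumps back down.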

In fact, since whether or not we get sign extends here depends on the pointer size, it's possible to run into these issues when porting code from a 32-bit target to a 64-bit one.

I think there are valid reasons to use unsigned integers (e.g. for the extra range), so it may be easy to hit these cases. But it looks like any combination of unsigned integers and zext will disable most loop optimizations? Having unsigned integers in the exit condition will probably cause issues on its own.

Thanks,
Silviu

Re-designed to allow the users to make only the assumptions that they need.

The new implementation does not need the AssumingScalarEvolution pass.

Added "SCEVPredicate" as a class in SCEV as an interface for an assumption that
can be made and checked at run-time. Currently we only have implementations for
the SCEVAddRec overflow assumptions, and a SCEVPredicate that can hold
multiple other predicates (SCEVPredicateSet).

We use a SCEV rewriter to take a SCEV and transform it according to the
predicates in a predicate set. Combined with the SCEV AddRec overflow
predicate, this allows us to fold sext/zext expressions into AddRec
expressions and will produce folded SCEVs equivalent with what
the AssumingScalarEvolution pass was producing.

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also
theoretically do this).

Added a getGuardedBackedgeCount method to ScalarEvolution that can return
the backedge count as a SCEV guarded by a predicate. Now LoopAccessAnalysis
and LoopVectorize are using this method.

Modified LoopAccessAnalysis and LoopVectorize to use the SCEV rewriter.
These passes were already using something similar for the symbolic stride.

We now also allow LoopDistribute to generate the SCEV run-time checks.
This allows us to remove special handling in the LoopAccessAnalysis for the
case where a client wasn't aware of the new run-time checks.
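
The relationship between a single assumption and a set that aggregates assumptions can be sketched without LLVM types as below. Assumption, AssumptionSet and assumptionsForBound are hypothetical names and do not reflect the actual SCEVPredicate or SCEVPredicateSet classes from the patch; the sketch only shows de-duplication, a deterministic (insertion) order, and the "always true" case of an empty set.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one assumption that can be checked at run time.
struct Assumption {
  std::string Text;            // identity used for de-duplication
  std::function<bool()> Holds; // the run-time check
};

// Hypothetical stand-in for a predicate that aggregates other predicates.
class AssumptionSet {
public:
  // Adds an assumption unless an identical one is already present.
  void add(Assumption A) {
    const bool seen =
        std::any_of(Items.begin(), Items.end(),
                    [&](const Assumption &I) { return I.Text == A.Text; });
    if (!seen)
      Items.push_back(std::move(A));
  }
  // An empty set needs no run-time check at all.
  bool isAlwaysTrue() const { return Items.empty(); }
  // Conjunction of all checks, evaluated in insertion order.
  bool holds() const {
    return std::all_of(Items.begin(), Items.end(),
                       [](const Assumption &I) { return I.Holds(); });
  }
  std::size_t size() const { return Items.size(); }

private:
  std::vector<Assumption> Items; // a vector keeps the order deterministic
};

// Usage sketch: the same assumption is requested twice but stored once.
AssumptionSet assumptionsForBound(unsigned bound) {
  AssumptionSet S;
  S.add({"bound <= 65535", [bound] { return bound <= 65535u; }});
  S.add({"bound <= 65535", [bound] { return bound <= 65535u; }});
  return S;
}
```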

Hi Silviu,

I have some comments, mostly nitpicks. I'll continue the review as I go, but here are the first ones.

cheers,
--renato

include/llvm/Analysis/LoopAccessAnalysis.h
656	this comment seems a bit terse. UseAssumptions is an argument to some methods in SCEV, nothing to do with LoopInfo.
include/llvm/Analysis/ScalarEvolution.h
191	nitpick: maybe isAlwaysTrue / isNeverTrue or isAlwaysTrue / isAlwaysFalse would be more descriptive?
247	No description?
248	Why the duplication? Will you have one for each further type?
275	Why here, and not in SCEVPredicate?

In D10161#206332, @rengolin wrote:

Hi Silviu,

I have some comments, mostly nitpicks. I'll continue the review as I go, but here are the first ones.

cheers,
--renato

Thanks! I'll make sure to fix these in the next upload. Some comments from the last review should also apply, so I'll do that as well.

Cheers,
Silviu

I think this change is orthogonal to improving SCEV for cases where we can prove no wrapping by looking at the IR. At the high level this change looks good to me (I have always wanted to have the ability for SCEV to gather assumptions needed to prove no wrapping) - and the examples demonstrate that there is a need. Silviu has addressed the issue of only generating the checks when needed.

I'll leave it to Sanjoy and Adam to do a detailed review.

rengolin added inline comments. Jul 20 2015, 1:26 AM

include/llvm/Analysis/ScalarEvolutionExpressions.h
752	Do you need to keep this public? In addOverflowAssumption, you check for MakeAssumptions to add or not the SCEV to it, and if it remains public, anyone will be able to bypass this check.

rengolin added inline comments. Jul 20 2015, 6:53 AM

lib/Analysis/LoopAccessAnalysis.cpp
122	Why are you using the llvm namespace here? Shouldn't these functions be static?
127	replaceSymbolicStrideSCEV is defined on the header, as namespace llvm, static, and leads to confusions like these. If you need it in more than one place, it shouldn't be static and it should be implemented in its own cpp file. If these are local only, they should be static or in anonymous namespaces.

hfinkel added inline comments. Jul 25 2015, 5:52 PM

include/llvm/Analysis/ScalarEvolutionExpressions.h
869	Why not make this a member function of SCEV (same for rewriteSCEVWithAssumptions below)?
lib/Analysis/ScalarEvolution.cpp
8303	Don't remove blank line.

Thanks for the reviews! I'll have a new version out soon to fix all of the highlighted issues.

-Silviu

include/llvm/Analysis/ScalarEvolution.h
248	I was thinking of having a SmallVector for storage of each predicate, plus all the references in Preds. So adding further types would require adding additional SmallVectors (we will probably have more types, I can think of at least one more for checking the range of an expression).
include/llvm/Analysis/ScalarEvolutionExpressions.h
869	Makes sense.

Rebased the patch on trunk (which is the largest part of this update).

Moved the SCEV rewriter implementation from the ScalarEvolutionExpressions header
to the ScalarEvolution .cpp file. ScalarEvolution now has methods to perform
the SCEV rewrite.

Renamed the isAlways/isNever methods from SCEVPredicate to isAlwaysTrue/isAlwaysFalse.

Herald added a subscriber: sanjoy. Aug 18 2015, 9:01 AM

Hi Silviu,

Sorry about the delay! Also my comments below are somewhat disorganized because I am still trying to wrap my head around this large patch. I *think* that the overall direction is good but I have a hard time seeing how the different parts connect at this point. I also have some ideas below of how you could probably restructure this patch to help the review.

As a high-level comment, why doesn't the strided assumption become another type of assumption after these changes?

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also theoretically do this).

Why is this needed? It's not explained.

As another high-level comment, I am also wondering if this 1000+ line patch can be split up to ease review. The steps I am thinking:

1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
3. Add overflow assumptions
4. Add the guarded back-edge count extension along with analysis tests
5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

This is just a quick idea, feel free to structure the patches differently but I would be more comfortable with a more incremental approach.

Thanks,
Adam

include/llvm/Analysis/LoopAccessAnalysis.h
295–296	Can you please make this a bit more precise? E.g. with these assumptions satisfied, the memory accesses become simple AddRecs, which allows them to be analyzed. Also it should either be called Preds or PredSet.
576	What's the deal with unique_ptr above and this here? Can you explain the ownership situation? Also it's probably less confusing if we use the same variable name for PredicateSets.
lib/Analysis/LoopAccessAnalysis.cpp
141–148	Hmm, why don't you only write Preds afterwards, once we are sure that we succeeded? Isn't this the same as: if (R.Ret && !R.P.isNever()) { R.P.add(&Preds); <-- write Preds here. return R.Ret; } If not, it needs a comment.
158–161	We should not recompute this for every insert. Also we need to bound the number of predicates so getting the guarded backedge count at a more central place is probably a better idea.
591–606	This does not look tight enough. Can't we fail in hasComputable... for reasons other than the SCEV not being an AddRec? Also the return value of convertSCEV... should be checked rather than just repeating hasComputable... blindly.
lib/Transforms/Vectorize/LoopVectorize.cpp
312–314	There is no def for this function.

Hi Adam,

In D10161#232197, @anemet wrote:

Hi Silviu,

Sorry about the delay! Also my comments below are somewhat disorganized because I am still trying to wrap my head around this large patch. I *think* that the overall direction is good but I have a hard time seeing how the different parts connect at this point. I also have some ideas below of how you could probably restructure this patch to help the review.

Thanks! Restructuring the patch sounds like a good idea.

As a high-level comment, why doesn't the strided assumption become another type of assumption after these changes?

That's a good idea, we should be able to do that. I don't see any problem with making a predicate for it in one of the patches.

Added a SCEVPredicate in ExitLimit, to allow getting a backedge count
guarded by a SCEV predicate and patched the ExitLimit logic to handle the
predicate. Added code to generate SCEV overflow predicates when
computing the loop count from an icmp condition (more code paths could also theoretically do this).

Why is this needed? It's not explained.

The exit limit is what's being cached. That result is used to generate the (exact) backedge count, and other things. In order to keep caching mostly the same, we need to add the predicate to the ExitLimit. So this is all part of getting getBackedgeCount to work with predicates.

As another high-level comment, I am also wondering if this 1000+ line patch can be split up to ease review. The steps I am thinking:

1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
3. Add overflow assumptions
4. Add the guarded back-edge count extension along with analysis tests
5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

Yes, I've also felt the need to split this up. At least all the places where we're making assumptions should probably be reviewed separately. My plans were slightly different though (more focused towards being able to get the backedge count). I'll have to think about the exact details, but it will probably end up being similar to what you've listed above.

This is just a quick idea, feel free to structure the patches differently but I would be more comfortable with a more incremental approach.

Thanks,
Adam

include/llvm/Analysis/LoopAccessAnalysis.h
576	This should be removed as it doesn't do anything. The unique_ptr above should contain the predicates; LoopAccessInfo owns it and all other classes/structs have a reference to it.
lib/Analysis/LoopAccessAnalysis.cpp
141–148	R.P.add(&Preds) adds Preds to R.P, so that wouldn't write Preds. We use the temporary because we can only be sure if we've succeeded or not after we see the final set of predicates.
158–161	Caching expressions would make sense to me. We would have to recompute the expressions every time we commit to a new predicate though.
591–606	Yes, you're correct. A better solution seems to only make the assumption after we've seen that hasComputableBounds.. now returns true. This change probably needs a closer look.
lib/Transforms/Vectorize/LoopVectorize.cpp
312–314	Must be an artefact from the rebase. It's not being used anywhere either, so it just needs to be removed.

In D10161#233173, @sbaranga wrote:

1. Somehow generalize/refactor the stride-checks to make them more similar to the overflow assumptions
2. Add the minimal predicate set logic and drive it from LAA or somewhere so that you can add tests for simple cases. Perhaps only make strided assumptions work with predicate sets first.
3. Add overflow assumptions
4. Add the guarded back-edge count extension along with analysis tests
5. Add the ExitLimit mods with some more tests. This may have to come before 4 but I don't really understand this change at this point.

Yes, I've also felt the need to split this up. At least all the places where we're making assumptions should probably be reviewed separately. My plans were slightly different though (more focused towards being able to get the backedge count). I'll have to think about the exact details, but it will probably end up being similar to what you've listed above.

Great! I understand that your actual goal is to get the backedge count into a suitable form for analysis by using predicates. The tricky part is that this needs multiple things: the whole predicated SCEV infrastructure *plus* a good way to expose this otherwise implicit step to clients so that they have a way to reason about the cost of adding these predicates.

That's why, if possible, I'd like to get the first set in where the clients are in full control (based on potentially wrapping pointers for example) and then start bringing in the whole backedge count complexity.

That said, it's entirely possible that I am oversimplifying things.

Thanks,
Adam

lib/Analysis/LoopAccessAnalysis.cpp
141–148	I see. It still feels like a common multi-step pattern that clients can get wrong. I wonder if we can make the API better here. We can discuss this after you split this into multiple patches.
158–161	I don't follow.

Is this still relevant, or has this been superseded by D13595 ?

In D10161#274408, @sanjoy wrote:

Is this still relevant, or has this been superseded by D13595 ?

It is only relevant as a reference for future work. I'll be splitting out patches from here in the future, although we've already diverged with D13595 in some points. I'll mark it as abandoned.

Revision Contents

Path                                                          Size
include/llvm/Analysis/LoopAccessAnalysis.h                    43 lines
include/llvm/Analysis/ScalarEvolution.h                       194 lines
include/llvm/Analysis/ScalarEvolutionExpressions.h            14 lines
include/llvm/Transforms/Utils/LoopVersioning.h                6 lines
lib/Analysis/LoopAccessAnalysis.cpp                           132 lines
lib/Analysis/ScalarEvolution.cpp                              687 lines
lib/Transforms/Scalar/LoopDistribute.cpp                      8 lines
lib/Transforms/Utils/LoopVersioning.cpp                       28 lines
lib/Transforms/Vectorize/LoopVectorize.cpp                    130 lines
test/Transforms/LoopDistribute/distribute-with-overflows.ll   102 lines
test/Transforms/LoopVectorize/safegep.ll                      2 lines
test/Transforms/LoopVectorize/scev-overflow-check.ll          125 lines
test/Transforms/LoopVectorize/version-mem-access.ll           11 lines

Diff 32418

include/llvm/Analysis/LoopAccessAnalysis.h

Show All 27 Lines
namespace llvm {		namespace llvm {

class Value;		class Value;
class DataLayout;		class DataLayout;
class AliasAnalysis;		class AliasAnalysis;
class ScalarEvolution;		class ScalarEvolution;
class Loop;		class Loop;
class SCEV;		class SCEV;
		class SCEVPredicateSet;

/// Optimization analysis message produced during vectorization. Messages inform		/// Optimization analysis message produced during vectorization. Messages inform
/// the user why vectorization did not occur.		/// the user why vectorization did not occur.
class LoopAccessReport {		class LoopAccessReport {
std::string Message;		std::string Message;
const Instruction *Instr;		const Instruction *Instr;

protected:		protected:
▲ Show 20 Lines • Show All 128 Lines • ▼ Show 20 Lines	struct Dependence {
bool isPossiblyBackward() const;		bool isPossiblyBackward() const;

/// \brief Print the dependence. \p Instr is used to map the instruction		/// \brief Print the dependence. \p Instr is used to map the instruction
/// indices to instructions.		/// indices to instructions.
void print(raw_ostream &OS, unsigned Depth,		void print(raw_ostream &OS, unsigned Depth,
const SmallVectorImpl<Instruction *> &Instrs) const;		const SmallVectorImpl<Instruction *> &Instrs) const;
};		};

MemoryDepChecker(ScalarEvolution Se, const Loop L)		MemoryDepChecker(ScalarEvolution Se, const Loop L, SCEVPredicateSet &Pred)
: SE(Se), InnermostLoop(L), AccessIdx(0),		: SE(Se), InnermostLoop(L), AccessIdx(0),
ShouldRetryWithRuntimeCheck(false), SafeForVectorization(true),		ShouldRetryWithRuntimeCheck(false), SafeForVectorization(true),
RecordInterestingDependences(true) {}		RecordInterestingDependences(true), Pred(Pred) {}

/// \brief Register the location (instructions are given increasing numbers)		/// \brief Register the location (instructions are given increasing numbers)
/// of a write access.		/// of a write access.
void addAccess(StoreInst *SI) {		void addAccess(StoreInst *SI) {
Value *Ptr = SI->getPointerOperand();		Value *Ptr = SI->getPointerOperand();
Accesses[MemAccessInfo(Ptr, true)].push_back(AccessIdx);		Accesses[MemAccessInfo(Ptr, true)].push_back(AccessIdx);
InstMap.push_back(SI);		InstMap.push_back(SI);
++AccessIdx;		++AccessIdx;
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	private:
/// Otherwise, this function returns true signaling a possible dependence.		/// Otherwise, this function returns true signaling a possible dependence.
Dependence::DepType isDependent(const MemAccessInfo &A, unsigned AIdx,		Dependence::DepType isDependent(const MemAccessInfo &A, unsigned AIdx,
const MemAccessInfo &B, unsigned BIdx,		const MemAccessInfo &B, unsigned BIdx,
const ValueToValueMap &Strides);		const ValueToValueMap &Strides);

/// \brief Check whether the data dependence could prevent store-load		/// \brief Check whether the data dependence could prevent store-load
/// forwarding.		/// forwarding.
bool couldPreventStoreLoadForward(unsigned Distance, unsigned TypeByteSize);		bool couldPreventStoreLoadForward(unsigned Distance, unsigned TypeByteSize);

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;
anemet (inline comment): Can you please make this a bit more precise? E.g. "With these assumptions satisfied, the memory accesses become simple AddRecs, which allows them to be analyzed." Also, it should be called either Preds or PredSet.
};		};

/// \brief Holds information about the memory runtime legality checks to verify		/// \brief Holds information about the memory runtime legality checks to verify
/// that a group of pointers do not overlap.		/// that a group of pointers do not overlap.
class RuntimePointerChecking {		class RuntimePointerChecking {
public:		public:
struct PointerInfo {		struct PointerInfo {
/// Holds the pointer value that we need to check.		/// Holds the pointer value that we need to check.
(26 lines not shown)	public:
void reset() {		void reset() {
Need = false;		Need = false;
Pointers.clear();		Pointers.clear();
Checks.clear();		Checks.clear();
}		}

/// Insert a pointer and calculate the start and end SCEVs.		/// Insert a pointer and calculate the start and end SCEVs.
void insert(Loop *Lp, Value *Ptr, bool WritePtr, unsigned DepSetId,		void insert(Loop *Lp, Value *Ptr, bool WritePtr, unsigned DepSetId,
unsigned ASId, const ValueToValueMap &Strides);		unsigned ASId, const ValueToValueMap &Strides,
		SCEVPredicateSet &Pred);

/// \brief No run-time memory checking is necessary.		/// \brief No run-time memory checking is necessary.
bool empty() const { return Pointers.empty(); }		bool empty() const { return Pointers.empty(); }

/// A grouping of pointers. A single memcheck is required between		/// A grouping of pointers. A single memcheck is required between
/// two groups.		/// two groups.
struct CheckingPtrGroup {		struct CheckingPtrGroup {
/// \brief Create a new pointer checking group containing a single		/// \brief Create a new pointer checking group containing a single
(189 lines not shown)	public:

/// \brief Checks existence of store to invariant address inside loop.		/// \brief Checks existence of store to invariant address inside loop.
/// If the loop has any store to invariant address, then it returns true,		/// If the loop has any store to invariant address, then it returns true,
/// else returns false.		/// else returns false.
bool hasStoreToLoopInvariantAddress() const {		bool hasStoreToLoopInvariantAddress() const {
return StoreToLoopInvariantAddress;		return StoreToLoopInvariantAddress;
}		}

		/// The SCEV predicate containing all the SCEV-related assumptions.
		std::unique_ptr<SCEVPredicateSet> Pred;

private:		private:
/// \brief Analyze the loop. Substitute symbolic strides using Strides.		/// \brief Analyze the loop. Substitute symbolic strides using Strides.
void analyzeLoop(const ValueToValueMap &Strides);		void analyzeLoop(const ValueToValueMap &Strides);

/// \brief Check if the structure of the loop allows it to be analyzed by this		/// \brief Check if the structure of the loop allows it to be analyzed by this
/// pass.		/// pass.
bool canAnalyzeLoop();		bool canAnalyzeLoop();

(10 lines not shown)	private:
Loop *TheLoop;		Loop *TheLoop;
ScalarEvolution *SE;		ScalarEvolution *SE;
const DataLayout &DL;		const DataLayout &DL;
const TargetLibraryInfo *TLI;		const TargetLibraryInfo *TLI;
AliasAnalysis *AA;		AliasAnalysis *AA;
DominatorTree *DT;		DominatorTree *DT;
LoopInfo *LI;		LoopInfo *LI;

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet ScPredicates;
anemet (inline comment): What's the deal with the unique_ptr above and this here? Can you explain the ownership situation? Also, it's probably less confusing if we use the same variable name for PredicateSets.
sbaranga (author, inline comment): This should be removed as it doesn't do anything. The unique_ptr above should contain the predicates; LoopAccessInfo owns it, and all other classes/structs hold a reference to it.
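The ownership arrangement described in the reply above can be sketched in isolation. This is only an illustration under assumptions: `PredicateSet`, `DepChecker` and `AccessInfo` are hypothetical stand-ins for SCEVPredicateSet, MemoryDepChecker and LoopAccessInfo, not code from the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for SCEVPredicateSet: a plain list of assumptions.
struct PredicateSet {
  std::vector<std::string> Assumptions;
};

// Hypothetical helper (think MemoryDepChecker). It records assumptions but
// does not own the set, so it must not outlive the owner below.
class DepChecker {
  PredicateSet &Preds; // non-owning reference
public:
  explicit DepChecker(PredicateSet &P) : Preds(P) {}
  void assume(const std::string &A) { Preds.Assumptions.push_back(A); }
};

// Hypothetical owner (think LoopAccessInfo). The unique_ptr is the single
// owner; every helper only holds a reference to the pointee.
class AccessInfo {
  std::unique_ptr<PredicateSet> Preds; // declared first, so initialized first
  DepChecker Checker;
public:
  AccessInfo() : Preds(std::make_unique<PredicateSet>()), Checker(*Preds) {}
  DepChecker &checker() { return Checker; }
  const PredicateSet &predicates() const { return *Preds; }
};
```

With this layout there is exactly one predicate set per analysis result, which is why a second by-value member such as ScPredicates would be redundant.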

unsigned NumLoads;		unsigned NumLoads;
unsigned NumStores;		unsigned NumStores;

unsigned MaxSafeDepDistBytes;		unsigned MaxSafeDepDistBytes;

/// \brief Cache the result of analyzeLoop.		/// \brief Cache the result of analyzeLoop.
bool CanVecMem;		bool CanVecMem;

(13 lines not shown)
///		///
/// If \p OrigPtr is not null, use it to look up the stride value instead of \p		/// If \p OrigPtr is not null, use it to look up the stride value instead of \p
/// Ptr. \p PtrToStride provides the mapping between the pointer value and its		/// Ptr. \p PtrToStride provides the mapping between the pointer value and its
/// stride as collected by LoopVectorizationLegality::collectStridedAccess.		/// stride as collected by LoopVectorizationLegality::collectStridedAccess.
const SCEV *replaceSymbolicStrideSCEV(ScalarEvolution *SE,		const SCEV *replaceSymbolicStrideSCEV(ScalarEvolution *SE,
const ValueToValueMap &PtrToStride,		const ValueToValueMap &PtrToStride,
Value *Ptr, Value *OrigPtr = nullptr);		Value *Ptr, Value *OrigPtr = nullptr);

		/// \brief Return the SCEV of a value with all assumptions applied. This
		/// will replace symbolic strides according to \p PtrToStride and apply any
		/// existing SCEV assumptions contained in \p Preds.
		const SCEV *rewriteSCEV(ScalarEvolution *SE, const ValueToValueMap &PtrToStride,
		Value *Ptr, Value *OrigPtr, const Loop *L,
		SCEVPredicateSet &Preds);

		/// \brief Try to add a minimal set of assumptions that will cause the
		/// re-written SCEV of \p Ptr to be an AddRecExpr. If successful we will
		/// return a modified AddRecExpr and add any assumptions made to \p Preds.
		/// Otherwise, we will make no new assumption and return the same result as
		/// rewriteSCEV.
		const SCEV *convertSCEVToAddRec(ScalarEvolution *SE,
		const ValueToValueMap &PtrToStride, Value *Ptr,
		Value *OrigPtr, const Loop *L,
		SCEVPredicateSet &Preds);

/// \brief Check the stride of the pointer and ensure that it does not wrap in		/// \brief Check the stride of the pointer and ensure that it does not wrap in
/// the address space.		/// the address space. If \p MakeAssumptions is true, we will try to add
		/// SCEV assumptions as necessary to \p Pred in order to return true. If we
		/// cannot return true, \p Pred will remain unchanged.
int isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,		int isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,
const ValueToValueMap &StridesMap);		const ValueToValueMap &StridesMap, SCEVPredicateSet &Pred,
		bool MakeAssumptions);

/// \brief This analysis provides dependence information for the memory accesses		/// \brief This analysis provides dependence information for the memory accesses
/// of a loop.		/// of a loop.
///		///
/// It runs the analysis for a loop on demand. This can be initiated by		/// It runs the analysis for a loop on demand. This can be initiated by
/// querying the loop access info via LAA::getInfo. getInfo return a		/// querying the loop access info via LAA::getInfo. getInfo return a
/// LoopAccessInfo object. See this class for the specifics of what information		/// LoopAccessInfo object. See this class for the specifics of what information
/// is provided.		/// is provided.
class LoopAccessAnalysis : public FunctionPass {		class LoopAccessAnalysis : public FunctionPass {
public:		public:
static char ID;		static char ID;

LoopAccessAnalysis() : FunctionPass(ID) {		LoopAccessAnalysis() : FunctionPass(ID) {
initializeLoopAccessAnalysisPass(*PassRegistry::getPassRegistry());		initializeLoopAccessAnalysisPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;

void getAnalysisUsage(AnalysisUsage &AU) const override;		void getAnalysisUsage(AnalysisUsage &AU) const override;

/// \brief Query the result of the loop access information for the loop \p L.		/// \brief Query the result of the loop access information for the loop \p L.
///		///
/// If the client speculates (and then issues run-time checks) for the values		/// If the client speculates (and then issues run-time checks) for the values
/// of symbolic strides, \p Strides provides the mapping (see		/// of symbolic strides, \p Strides provides the mapping (see
/// replaceSymbolicStrideSCEV). If there is no cached result available run		/// replaceSymbolicStrideSCEV). If there is no cached result available run
/// the analysis.		/// the analysis.
rengolin (inline comment): This comment seems a bit terse. UseAssumptions is an argument to some methods in SCEV, nothing to do with LoopInfo.
const LoopAccessInfo &getInfo(Loop *L, const ValueToValueMap &Strides);		const LoopAccessInfo &getInfo(Loop *L, const ValueToValueMap &Strides);

void releaseMemory() override {		void releaseMemory() override {
// Invalidate the cache when the pass is freed.		// Invalidate the cache when the pass is freed.
LoopAccessInfoMap.clear();		LoopAccessInfoMap.clear();
}		}

/// \brief Print the result of the analysis when invoked with -analyze.		/// \brief Print the result of the analysis when invoked with -analyze.
(16 lines not shown)

include/llvm/Analysis/ScalarEvolution.h

(45 lines not shown)	namespace llvm {
class TargetLibraryInfo;		class TargetLibraryInfo;
class LLVMContext;		class LLVMContext;
class Loop;		class Loop;
class LoopInfo;		class LoopInfo;
class Operator;		class Operator;
class SCEVUnknown;		class SCEVUnknown;
class SCEVAddRecExpr;		class SCEVAddRecExpr;
class SCEV;		class SCEV;
		class SCEVExpander;

template<> struct FoldingSetTrait<SCEV>;		template<> struct FoldingSetTrait<SCEV>;

/// SCEV - This class represents an analyzed expression in the program. These		/// SCEV - This class represents an analyzed expression in the program. These
/// are opaque objects that the client is not allowed to do much with		/// are opaque objects that the client is not allowed to do much with
/// directly.		/// directly.
///		///
class SCEV : public FoldingSetNode {		class SCEV : public FoldingSetNode {
friend struct FoldingSetTrait<SCEV>;		friend struct FoldingSetTrait<SCEV>;
(104 lines not shown)	#endif
/// marker.		/// marker.
struct SCEVCouldNotCompute : public SCEV {		struct SCEVCouldNotCompute : public SCEV {
SCEVCouldNotCompute();		SCEVCouldNotCompute();

/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const SCEV *S);		static bool classof(const SCEV *S);
};		};

		enum SCEVPredicateTypes { pAddRecOverflow, pSet };

		//===--------------------------------------------------------------------===//
		/// SCEVPredicate - This class represents an assumption made using SCEV
		/// expressions which can be checked at run-time.
		///
		class SCEVPredicate {
		protected:
		unsigned short SCEVPredicateType;

		public:
		SCEVPredicate(unsigned short Type);
		unsigned short getType() const { return SCEVPredicateType; }
		/// Returns true if the predicate is always true. This means that no
		/// assumptions were made and nothing needs to be checked at run-time.
		virtual bool isAlwaysTrue() const = 0;
rengolin (inline comment): Nitpick: maybe isAlwaysTrue / isNeverTrue or isAlwaysTrue / isAlwaysFalse would be more descriptive?
		/// Return true if we consider this to be always false or if we've
		/// given up on this set of assumptions (for example because of the
		/// high cost of checking at run-time).
		virtual bool isAlwaysFalse() const = 0;
		/// Returns true if this predicate implies \p N.
		virtual bool contains(const SCEVPredicate *N) const = 0;
		/// Prints a textual representation of this predicate.
		virtual void print(raw_ostream &OS, unsigned Depth) const = 0;

		/// Generates a run-time check for this predicate.
		virtual Value *generateCheck(Instruction *Loc, ScalarEvolution *SE,
		const DataLayout *DL, SCEVExpander &Exp) = 0;
		};

		//===--------------------------------------------------------------------===//
		/// SCEVAddRecOverflowPredicate - This class represents an assumption
		/// made on an AddRec expression. Given an affine AddRec expression
		/// (a,+,b), we assume that it has nsw or nuw flags.
		class SCEVAddRecOverflowPredicate : public SCEVPredicate {
		const SCEV *AR;
		SCEV::NoWrapFlags Flags;

		public:
		SCEVAddRecOverflowPredicate()
		: SCEVPredicate(pAddRecOverflow), AR(nullptr),
		Flags(SCEV::FlagAnyWrap) {}
		SCEVAddRecOverflowPredicate(const SCEV *AR, SCEV::NoWrapFlags Flags);

		/// Returns the set of assumed no-overflow flags.
		SCEV::NoWrapFlags getFlags() const { return Flags; }
		/// Add an assumption of no overflow for \p AddedFlags.
		void addFlags(SCEV::NoWrapFlags AddedFlags);
		/// Returns the AddRec expression that we've made assumptions for.
		const SCEV *getExpr() const { return AR; }

		/// Implementation of the SCEVPredicate interface
		bool contains(const SCEVPredicate *N) const override;
		void print(raw_ostream &OS, unsigned Depth) const override;
		bool isAlwaysTrue() const override;
		bool isAlwaysFalse() const override;
		Value *generateCheck(Instruction *Loc, ScalarEvolution *SE,
		const DataLayout *DL, SCEVExpander &Exp) override;

		/// Methods for support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const SCEVPredicate *P) {
		return P->getType() == pAddRecOverflow;
		}
		};
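To make the assumption behind SCEVAddRecOverflowPredicate concrete, here is a minimal standalone sketch of a no-unsigned-wrap test for an affine recurrence {Start,+,Step} on 32-bit values, evaluated in 64-bit arithmetic. The function and enum names are hypothetical and the code works on plain integers; the patch itself is expected to emit an equivalent check as IR through generateCheck, whose implementation is not shown in this excerpt.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if {Start,+,Step} never wraps as an unsigned 32-bit value
// for iterations 0..BackedgeCount. Step is non-negative, so the recurrence
// is monotonic and only the last iteration needs checking. The 64-bit sum
// cannot itself overflow: (2^32-1)^2 + (2^32-1) < 2^64.
bool addRecHasNoUnsignedWrap(std::uint32_t Start, std::uint32_t Step,
                             std::uint32_t BackedgeCount) {
  std::uint64_t Last = static_cast<std::uint64_t>(Start) +
                       static_cast<std::uint64_t>(Step) * BackedgeCount;
  return Last <= UINT32_MAX;
}

// Hypothetical loop-versioning decision: run the transformed loop only
// when the assumption holds, otherwise fall back to the original loop.
enum class LoopVersion { Vectorized, Scalar };

LoopVersion pickVersion(std::uint32_t Start, std::uint32_t Step,
                        std::uint32_t BackedgeCount) {
  return addRecHasNoUnsignedWrap(Start, Step, BackedgeCount)
             ? LoopVersion::Vectorized
             : LoopVersion::Scalar;
}
```

As the summary notes, such a check is expected to pass almost always, so the fallback path mainly exists for correctness.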

		//===--------------------------------------------------------------------===//
		/// SCEVPredicateSet - This class represents a composition of other
		/// SCEV predicates, and is the class that most clients will interact with.
		///
		class SCEVPredicateSet : public SCEVPredicate {
		private:
		/// Flag used to track if this predicate set is invalid.
rengolin (inline comment): No description?
		bool Never;
rengolin (inline comment): Why the duplication? Will you have one for each further type?
sbaranga (author, inline comment): I was thinking of having a SmallVector for storage of each predicate type, plus all the references in Preds. So adding further types would require adding additional SmallVectors (we will probably have more types; I can think of at least one more, for checking the range of an expression).
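The storage scheme described in this reply (one by-value vector per predicate type, plus a vector of base-class pointers into it) can be sketched as follows. This is an illustration with hypothetical names that uses std::vector in place of SmallVector. It shows why copies must rebuild the pointer vector, and why, in this sketch at least, growing the storage also requires refreshing the pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Predicate {
  virtual ~Predicate() = default;
  virtual int kind() const = 0;
};

struct OverflowPredicate : Predicate {
  int Flags;
  explicit OverflowPredicate(int F) : Flags(F) {}
  int kind() const override { return 0; }
};

class PredicateSet {
  std::vector<OverflowPredicate> Overflows; // by-value storage, one per type
  std::vector<Predicate *> Preds;           // views into the storage above

  // Re-point every view at this object's own storage.
  void rebuildViews() {
    Preds.clear();
    for (OverflowPredicate &P : Overflows)
      Preds.push_back(&P);
  }

public:
  PredicateSet() = default;
  PredicateSet(const PredicateSet &O) : Overflows(O.Overflows) {
    rebuildViews(); // copied pointers would still refer to O's storage
  }
  PredicateSet &operator=(const PredicateSet &O) {
    if (this != &O) {
      Overflows = O.Overflows;
      rebuildViews();
    }
    return *this;
  }
  void add(const OverflowPredicate &P) {
    Overflows.push_back(P); // may reallocate and invalidate old pointers
    rebuildViews();
  }
  std::size_t size() const { return Preds.size(); }
  const Predicate *get(std::size_t I) const { return Preds[I]; }
  bool viewsPointIntoOwnStorage() const {
    for (std::size_t I = 0; I < Preds.size(); ++I)
      if (Preds[I] != &Overflows[I])
        return false;
    return true;
  }
};
```

Each additional predicate type would need its own storage vector and its own loop in rebuildViews, which is the duplication the reviewer asks about.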
		/// Storage for different predicates that make up this Predicate Set.
		SmallVector<SCEVAddRecOverflowPredicate, 16> AddRecOverflows;
		/// Vector with references to all predicates in this set.
		SmallVector<SCEVPredicate *, 16> Preds;

		public:
		SCEVPredicateSet();
		/// The copy constructor.
		SCEVPredicateSet(const SCEVPredicateSet &Old);
		/// Adds a predicate to this predicate set.
		void add(const SCEVPredicate *N);

		/// Generates a run-time check for all the contained predicates.
		/// This is a wrapper around generateCheck, and provides an interface
		/// similar to other run-time checks used for versioning.
		std::pair<Instruction *, Instruction *>
		generateGuardCond(Instruction *Loc, ScalarEvolution *SE);

		/// Implementation of the SCEVPredicate interface
		bool isAlwaysTrue() const override;
		bool isAlwaysFalse() const override;
		bool contains(const SCEVPredicate *N) const override;
		void print(raw_ostream &OS, unsigned Depth) const;
		Value *generateCheck(Instruction *Loc, ScalarEvolution *SE,
		const DataLayout *DL, SCEVExpander &Exp) override;

		/// Methods for support type inquiry through isa, cast, and dyn_cast:
rengolin (inline comment): Why here, and not in SCEVPredicate?
		static inline bool classof(const SCEVPredicate *P) {
		return P->getType() == pSet;
		}

		/// The copy operator.
		const SCEVPredicateSet &operator=(const SCEVPredicateSet &RHS) {
		Never = RHS.Never;
		AddRecOverflows = RHS.AddRecOverflows;
		Preds.clear();
		for (unsigned II = 0; II < AddRecOverflows.size(); ++II) {
		Preds.push_back(&AddRecOverflows[II]);
		}
		assert(Preds.size() == RHS.Preds.size() && "Wrong Preds size after copy");
		return *this;
		}
		};

		/// Associates a SCEV predicate to a SCEV.
		struct AssumptionResult {
		const SCEV *Start;
		const SCEV *Res;
		SCEVPredicateSet Pred;
		AssumptionResult(const SCEV *Start) : Start(Start), Res(nullptr) {}
		};

/// The main scalar evolution driver. Because client code (intentionally)		/// The main scalar evolution driver. Because client code (intentionally)
/// can't do much with the SCEV objects directly, they must ask this class		/// can't do much with the SCEV objects directly, they must ask this class
/// for services.		/// for services.
class ScalarEvolution {		class ScalarEvolution {
public:		public:
/// LoopDisposition - An enum describing the relationship between a		/// LoopDisposition - An enum describing the relationship between a
/// SCEV and a loop.		/// SCEV and a loop.
enum LoopDisposition {		enum LoopDisposition {
(82 lines not shown)	private:
/// ExitLimit - Information about the number of loop iterations for which a		/// ExitLimit - Information about the number of loop iterations for which a
/// loop exit's branch condition evaluates to the not-taken path. This is a		/// loop exit's branch condition evaluates to the not-taken path. This is a
/// temporary pair of exact and max expressions that are eventually		/// temporary pair of exact and max expressions that are eventually
/// summarized in ExitNotTakenInfo and BackedgeTakenInfo.		/// summarized in ExitNotTakenInfo and BackedgeTakenInfo.
struct ExitLimit {		struct ExitLimit {
const SCEV *Exact;		const SCEV *Exact;
const SCEV *Max;		const SCEV *Max;

		/// A predicate set guard for this ExitLimit. The result is only
		/// valid if this predicate evaluates to 'true' at run-time.
		SCEVPredicateSet Pred;

/*implicit*/ ExitLimit(const SCEV *E) : Exact(E), Max(E) {}		/*implicit*/ ExitLimit(const SCEV *E) : Exact(E), Max(E) {}

ExitLimit(const SCEV *E, const SCEV *M) : Exact(E), Max(M) {}		ExitLimit(const SCEV *E, const SCEV *M, SCEVPredicateSet &P)
		: Exact(E), Max(M), Pred(P) {}

/// hasAnyInfo - Test whether this ExitLimit contains any computed		/// hasAnyInfo - Test whether this ExitLimit contains any computed
/// information, or whether it's all SCEVCouldNotCompute values.		/// information, or whether it's all SCEVCouldNotCompute values.
bool hasAnyInfo() const {		bool hasAnyInfo() const {
return !isa<SCEVCouldNotCompute>(Exact) ||		return !isa<SCEVCouldNotCompute>(Exact) ||
!isa<SCEVCouldNotCompute>(Max);		!isa<SCEVCouldNotCompute>(Max);
}		}
};		};

/// ExitNotTakenInfo - Information about the number of times a particular		/// ExitNotTakenInfo - Information about the number of times a particular
/// loop exit may be reached before exiting the loop.		/// loop exit may be reached before exiting the loop.
struct ExitNotTakenInfo {		struct ExitNotTakenInfo {
AssertingVH<BasicBlock> ExitingBlock;		AssertingVH<BasicBlock> ExitingBlock;
const SCEV *ExactNotTaken;		const SCEV *ExactNotTaken;
PointerIntPair<ExitNotTakenInfo*, 1> NextExit;		PointerIntPair<ExitNotTakenInfo *, 1> NextExit;
		SCEVPredicateSet Pred;

ExitNotTakenInfo() : ExitingBlock(nullptr), ExactNotTaken(nullptr) {}		ExitNotTakenInfo() : ExitingBlock(nullptr), ExactNotTaken(nullptr) {}

/// isCompleteList - Return true if all loop exits are computable.		/// isCompleteList - Return true if all loop exits are computable.
bool isCompleteList() const {		bool isCompleteList() const {
return NextExit.getInt() == 0;		return NextExit.getInt() == 0;
}		}

(19 lines not shown)	class BackedgeTakenInfo {
/// count of the loop that is known, or a SCEVCouldNotCompute.		/// count of the loop that is known, or a SCEVCouldNotCompute.
const SCEV *Max;		const SCEV *Max;

public:		public:
BackedgeTakenInfo() : Max(nullptr) {}		BackedgeTakenInfo() : Max(nullptr) {}

/// Initialize BackedgeTakenInfo from a list of exact exit counts.		/// Initialize BackedgeTakenInfo from a list of exact exit counts.
BackedgeTakenInfo(		BackedgeTakenInfo(
SmallVectorImpl< std::pair<BasicBlock *, const SCEV *> > &ExitCounts,		SmallVectorImpl<std::pair<BasicBlock *, const SCEV *>> &ExitCounts,
bool Complete, const SCEV *MaxCount);		SmallVectorImpl<SCEVPredicateSet *> &ExitPreds, bool Complete,
		const SCEV *MaxCount);

/// hasAnyInfo - Test whether this BackedgeTakenInfo contains any		/// hasAnyInfo - Test whether this BackedgeTakenInfo contains any
/// computed information, or whether it's all SCEVCouldNotCompute		/// computed information, or whether it's all SCEVCouldNotCompute
/// values.		/// values.
bool hasAnyInfo() const {		bool hasAnyInfo() const {
return ExitNotTaken.ExitingBlock || !isa<SCEVCouldNotCompute>(Max);		return ExitNotTaken.ExitingBlock || !isa<SCEVCouldNotCompute>(Max);
}		}

/// getExact - Return an expression indicating the exact backedge-taken		/// getExact - Return an expression indicating the exact backedge-taken
/// count of the loop if it is known, or SCEVCouldNotCompute		/// count of the loop if it is known and always correct (independent
/// otherwise. This is the number of times the loop header can be		/// of any assumptions that should be checked at run-time), or
/// guaranteed to execute, minus one.		/// SCEVCouldNotCompute otherwise. This is the number of times the
		/// loop header can be guaranteed to execute, minus one.
const SCEV *getExact(ScalarEvolution *SE) const;		const SCEV *getExact(ScalarEvolution *SE) const;

		/// getGuardedExact - Return an expression indicating the exact
		/// backedge-taken count of the loop if it is known, or
		/// SCEVCouldNotCompute otherwise. Returns the SCEV predicates that
		/// need to be checked at run-time in order for this answer to be valid
		/// in \p Predicates. This is the number of times the loop header can be
		/// guaranteed to execute, minus one.
		const SCEV *getGuardedExact(ScalarEvolution *SE,
		SCEVPredicateSet &Predicates) const;
/// getExact - Return the number of times this loop exit may fall through		/// getExact - Return the number of times this loop exit may fall through
/// to the back edge, or SCEVCouldNotCompute. The loop is guaranteed not		/// to the back edge, or SCEVCouldNotCompute. The loop is guaranteed not
/// to exit via this block before this number of iterations, but may exit		/// to exit via this block before this number of iterations, but may exit
/// via another block.		/// via another block.
const SCEV *getExact(BasicBlock *ExitingBlock, ScalarEvolution *SE) const;		const SCEV *getExact(BasicBlock *ExitingBlock, ScalarEvolution *SE) const;

/// getMax - Get the max backedge taken count for the loop.		/// getMax - Get the max backedge taken count for the loop.
const SCEV *getMax(ScalarEvolution *SE) const;		const SCEV *getMax(ScalarEvolution *SE) const;
(108 lines not shown)	ExitLimit ComputeExitLimitFromCond(const Loop *L,
Value *ExitCond,		Value *ExitCond,
BasicBlock *TBB,		BasicBlock *TBB,
BasicBlock *FBB,		BasicBlock *FBB,
bool IsSubExpr);		bool IsSubExpr);

/// ComputeExitLimitFromICmp - Compute the number of times the backedge of		/// ComputeExitLimitFromICmp - Compute the number of times the backedge of
/// the specified loop will execute if its exit condition were a conditional		/// the specified loop will execute if its exit condition were a conditional
/// branch of the ICmpInst ExitCond, TBB, and FBB.		/// branch of the ICmpInst ExitCond, TBB, and FBB.
ExitLimit ComputeExitLimitFromICmp(const Loop *L,		ExitLimit ComputeExitLimitFromICmp(const Loop *L, ICmpInst *ExitCond,
ICmpInst *ExitCond,		BasicBlock *TBB, BasicBlock *FBB,
BasicBlock *TBB,		bool IsSubExpr,
BasicBlock *FBB,		bool UseAssumptions = false);
bool IsSubExpr);

/// ComputeExitLimitFromSingleExitSwitch - Compute the number of times the		/// ComputeExitLimitFromSingleExitSwitch - Compute the number of times the
/// backedge of the specified loop will execute if its exit condition were a		/// backedge of the specified loop will execute if its exit condition were a
/// switch with a single exiting case to ExitingBB.		/// switch with a single exiting case to ExitingBB.
ExitLimit		ExitLimit
ComputeExitLimitFromSingleExitSwitch(const Loop *L, SwitchInst *Switch,		ComputeExitLimitFromSingleExitSwitch(const Loop *L, SwitchInst *Switch,
BasicBlock *ExitingBB, bool IsSubExpr);		BasicBlock *ExitingBB, bool IsSubExpr);

(12 lines not shown)	private:
/// evaluate the exit count of the loop, return CouldNotCompute.		/// evaluate the exit count of the loop, return CouldNotCompute.
const SCEV *ComputeExitCountExhaustively(const Loop *L,		const SCEV *ComputeExitCountExhaustively(const Loop *L,
Value *Cond,		Value *Cond,
bool ExitWhen);		bool ExitWhen);

/// HowFarToZero - Return the number of times an exit condition comparing		/// HowFarToZero - Return the number of times an exit condition comparing
/// the specified value to zero will execute. If not computable, return		/// the specified value to zero will execute. If not computable, return
/// CouldNotCompute.		/// CouldNotCompute.
ExitLimit HowFarToZero(const SCEV *V, const Loop *L, bool IsSubExpr);		ExitLimit HowFarToZero(const SCEV *V, const Loop *L, bool IsSubExpr,
		bool UseAssumptions = false);

/// HowFarToNonZero - Return the number of times an exit condition checking		/// HowFarToNonZero - Return the number of times an exit condition checking
/// the specified value for nonzero will execute. If not computable, return		/// the specified value for nonzero will execute. If not computable, return
/// CouldNotCompute.		/// CouldNotCompute.
ExitLimit HowFarToNonZero(const SCEV *V, const Loop *L);		ExitLimit HowFarToNonZero(const SCEV *V, const Loop *L,
		bool UseAssumptions = false);

/// HowManyLessThans - Return the number of times an exit condition		/// HowManyLessThans - Return the number of times an exit condition
/// containing the specified less-than comparison will execute. If not		/// containing the specified less-than comparison will execute. If not
/// computable, return CouldNotCompute. isSigned specifies whether the		/// computable, return CouldNotCompute. isSigned specifies whether the
/// less-than is signed.		/// less-than is signed.
ExitLimit HowManyLessThans(const SCEV *LHS, const SCEV *RHS,		ExitLimit HowManyLessThans(const SCEV *LHS, const SCEV *RHS, const Loop *L,
const Loop *L, bool isSigned, bool IsSubExpr);		bool isSigned, bool IsSubExpr,
		bool UseAssumptions = false);
ExitLimit HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,		ExitLimit HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool isSigned, bool IsSubExpr);		const Loop *L, bool isSigned, bool IsSubExpr,
		bool UseAssumptions = false);

/// getPredecessorWithUniqueSuccessorForBB - Return a predecessor of BB		/// getPredecessorWithUniqueSuccessorForBB - Return a predecessor of BB
/// (which may not be an immediate predecessor) which has exactly one		/// (which may not be an immediate predecessor) which has exactly one
/// successor from which BB is reachable, or null if no such block is		/// successor from which BB is reachable, or null if no such block is
/// found.		/// found.
std::pair<BasicBlock *, BasicBlock *>		std::pair<BasicBlock *, BasicBlock *>
getPredecessorWithUniqueSuccessorForBB(BasicBlock *BB);		getPredecessorWithUniqueSuccessorForBB(BasicBlock *BB);

(111 lines not shown)	public:
const SCEV *getSCEV(Value *V);		const SCEV *getSCEV(Value *V);

const SCEV *getConstant(ConstantInt *V);		const SCEV *getConstant(ConstantInt *V);
const SCEV *getConstant(const APInt& Val);		const SCEV *getConstant(const APInt& Val);
const SCEV *getConstant(Type *Ty, uint64_t V, bool isSigned = false);		const SCEV *getConstant(Type *Ty, uint64_t V, bool isSigned = false);
const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty);		const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty);
const SCEV *getZeroExtendExpr(const SCEV *Op, Type *Ty);		const SCEV *getZeroExtendExpr(const SCEV *Op, Type *Ty);
const SCEV *getSignExtendExpr(const SCEV *Op, Type *Ty);		const SCEV *getSignExtendExpr(const SCEV *Op, Type *Ty);
const SCEV *getAnyExtendExpr(const SCEV *Op, Type *Ty);		const SCEV *getAnyExtendExpr(const SCEV *Op, Type *Ty);
hfinkel (inline comment): Are these two methods special in theory, in this context, or in the future would overrides for more methods be useful?
sbaranga (author, inline comment): I think they are special because these expressions cannot fold sext/zext expressions. As far as I can see all the others can, so these seem to be the only expressions that are stopping us from getting AddRecExprs as results. For example, if we have (add (sext({x,+,1})), y), getting rid of the sext gets us {x + y,+,1}. However, there are other methods which might be useful to override, but which are not related to expression folding. For example, SimplifyICmpOperands, when given a ULE predicate, can only canonicalize the comparison if it knows that it can add one to the right operand without overflowing (the operand is loop invariant and could possibly be checked). This issue appears mostly when trying to get the number of backedges taken.
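The folding described in this reply is only valid while the narrow recurrence does not overflow. The following standalone sketch, with hypothetical function names and plain integers instead of SCEV nodes, compares the two forms for a 32-bit recurrence {Start,+,1} sign-extended to 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Value of sext({Start,+,1}) + Y at iteration I: the recurrence is computed
// in 32 bits with wrap-around, then sign-extended. The unsigned-to-signed
// conversion is modular (guaranteed since C++20, and the behaviour of
// mainstream compilers before that).
std::int64_t extendThenAdd(std::int32_t Start, std::uint32_t I,
                           std::int64_t Y) {
  std::uint32_t Wrapped = static_cast<std::uint32_t>(Start) + I;
  return static_cast<std::int64_t>(static_cast<std::int32_t>(Wrapped)) + Y;
}

// Value of the folded recurrence {sext(Start) + Y,+,1} at iteration I,
// computed entirely in 64 bits.
std::int64_t foldedAddRec(std::int32_t Start, std::uint32_t I,
                          std::int64_t Y) {
  return static_cast<std::int64_t>(Start) + Y + static_cast<std::int64_t>(I);
}

// The run-time assumption: the 32-bit recurrence has no signed wrap up to
// and including iteration LastIter.
bool noSignedWrap(std::int32_t Start, std::uint32_t LastIter) {
  return static_cast<std::int64_t>(Start) + LastIter <= INT32_MAX;
}
```

When noSignedWrap holds, both functions agree for every iteration up to LastIter; once the narrow counter wraps, they diverge, which is why the rewritten expression has to be guarded by a predicate.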
const SCEV *getAddExpr(SmallVectorImpl<const SCEV *> &Ops,		const SCEV *getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);
const SCEV *getAddExpr(const SCEV *LHS, const SCEV *RHS,		const SCEV *getAddExpr(const SCEV *LHS, const SCEV *RHS,
SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap) {		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap) {
SmallVector<const SCEV *, 2> Ops;		SmallVector<const SCEV *, 2> Ops;
Ops.push_back(LHS);		Ops.push_back(LHS);
Ops.push_back(RHS);		Ops.push_back(RHS);
return getAddExpr(Ops, Flags);		return getAddExpr(Ops, Flags);
(198 lines not shown)	public:
/// when the header is branched to from outside the loop.		/// when the header is branched to from outside the loop.
///		///
/// Note that it is not valid to call this method on a loop without a		/// Note that it is not valid to call this method on a loop without a
/// loop-invariant backedge-taken count (see		/// loop-invariant backedge-taken count (see
/// hasLoopInvariantBackedgeTakenCount).		/// hasLoopInvariantBackedgeTakenCount).
///		///
const SCEV *getBackedgeTakenCount(const Loop *L);		const SCEV *getBackedgeTakenCount(const Loop *L);

		const SCEV *getGuardedBackedgeTakenCount(const Loop *L,
		SCEVPredicateSet &Predicates);

/// getMaxBackedgeTakenCount - Similar to getBackedgeTakenCount, except		/// getMaxBackedgeTakenCount - Similar to getBackedgeTakenCount, except
/// return the least SCEV value that is known never to be less than the		/// return the least SCEV value that is known never to be less than the
/// actual backedge taken count.		/// actual backedge taken count.
const SCEV *getMaxBackedgeTakenCount(const Loop *L);		const SCEV *getMaxBackedgeTakenCount(const Loop *L);

/// hasLoopInvariantBackedgeTakenCount - Return true if the specified loop		/// hasLoopInvariantBackedgeTakenCount - Return true if the specified loop
/// has an analyzable loop-invariant backedge-taken count.		/// has an analyzable loop-invariant backedge-taken count.
bool hasLoopInvariantBackedgeTakenCount(const Loop *L);		bool hasLoopInvariantBackedgeTakenCount(const Loop *L);
▲ Show 20 Lines • Show All 201 Lines • ▼ Show 20 Lines	public:
/// The subscript of the outermost dimension is the Quotient: [j+k].		/// The subscript of the outermost dimension is the Quotient: [j+k].
///		///
/// Overall, we have: A[][n][m], and the access function: A[j+k][2i][5i].		/// Overall, we have: A[][n][m], and the access function: A[j+k][2i][5i].
void delinearize(const SCEV *Expr,		void delinearize(const SCEV *Expr,
SmallVectorImpl<const SCEV *> &Subscripts,		SmallVectorImpl<const SCEV *> &Subscripts,
SmallVectorImpl<const SCEV *> &Sizes,		SmallVectorImpl<const SCEV *> &Sizes,
const SCEV *ElementSize);		const SCEV *ElementSize);

		/// Re-writes the SCEV according to the Predicates in \p Preds, by
		/// applying overflow assumptions and sinking sext/zext expressions.
		const SCEV *rewriteUsingPredicate(const SCEV *Scev, const Loop *L,
		SCEVPredicateSet &A);

		/// Tries to convert a SCEV into an AddRecExpr by making overflow
		/// assumptions and sinking SCEV nodes. If unsuccessful, we will return
		/// a nullptr in the Ret field. If succesful, the predicate set of the
		/// answer must be checked at run-time in order for the answer to be
		/// valid.
		AssumptionResult getAddRecWithRTChecks(const SCEV *S, const Loop *L);
private:		private:
/// Compute the backedge taken count knowing the interval difference, the		/// Compute the backedge taken count knowing the interval difference, the
/// stride and presence of the equality in the comparison.		/// stride and presence of the equality in the comparison.
const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,		const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,
bool Equality);		bool Equality);

/// Verify if an linear IV with positive stride can overflow when in a		/// Verify if an linear IV with positive stride can overflow when in a
/// less-than comparison, knowing the invariant term of the comparison,		/// less-than comparison, knowing the invariant term of the comparison,
Show All 14 Lines	private:
/// FirstUnknown - The head of a linked list of all SCEVUnknown		/// FirstUnknown - The head of a linked list of all SCEVUnknown
/// values that have been allocated. This is used by releaseMemory		/// values that have been allocated. This is used by releaseMemory
/// to locate them all and call their destructors.		/// to locate them all and call their destructors.
SCEVUnknown *FirstUnknown;		SCEVUnknown *FirstUnknown;
};		};

/// \brief Analysis pass that exposes the \c ScalarEvolution for a function.		/// \brief Analysis pass that exposes the \c ScalarEvolution for a function.
class ScalarEvolutionAnalysis {		class ScalarEvolutionAnalysis {
static char PassID;		static char PassID;
		hfinkel: Don't use 'chrec' here without defining it.

public:		public:
typedef ScalarEvolution Result;		typedef ScalarEvolution Result;

/// \brief Opaque, unique identifier for this analysis pass.		/// \brief Opaque, unique identifier for this analysis pass.
static void *ID() { return (void *)&PassID; }		static void *ID() { return (void *)&PassID; }

/// \brief Provide a name for the analysis for debugging and logging.		/// \brief Provide a name for the analysis for debugging and logging.
Show All 27 Lines	public:
bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;
void releaseMemory() override;		void releaseMemory() override;
void getAnalysisUsage(AnalysisUsage &AU) const override;		void getAnalysisUsage(AnalysisUsage &AU) const override;
void print(raw_ostream &OS, const Module * = nullptr) const override;		void print(raw_ostream &OS, const Module * = nullptr) const override;
void verifyAnalysis() const override;		void verifyAnalysis() const override;
};		};
}		}

#endif		#endif
		hfinkel: You should clear the cache here?

include/llvm/Analysis/ScalarEvolutionExpressions.h

Show All 17 Lines
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"

namespace llvm {		namespace llvm {
class ConstantInt;		class ConstantInt;
class ConstantRange;		class ConstantRange;
class DominatorTree;		class DominatorTree;
		class SCEVExpander;

enum SCEVTypes {		enum SCEVTypes {
// These should be ordered in terms of increasing complexity to make the		// These should be ordered in terms of increasing complexity to make the
// folders simpler.		// folders simpler.
scConstant, scTruncate, scZeroExtend, scSignExtend, scAddExpr, scMulExpr,		scConstant, scTruncate, scZeroExtend, scSignExtend, scAddExpr, scMulExpr,
scUDivExpr, scAddRecExpr, scUMaxExpr, scSMaxExpr,		scUDivExpr, scAddRecExpr, scUMaxExpr, scSMaxExpr,
scUnknown, scCouldNotCompute		scUnknown, scCouldNotCompute
};		};
▲ Show 20 Lines • Show All 706 Lines • ▼ Show 20 Lines	const SCEV *visitCouldNotCompute(const SCEVCouldNotCompute *Expr) {
return Expr;		return Expr;
}		}

private:		private:
ScalarEvolution &SE;		ScalarEvolution &SE;
LoopToScevMapT &Map;		LoopToScevMapT &Map;
};		};

/// Applies the Map (Loop -> SCEV) to the given Scev.		/// Applies the Map (Loop -> SCEV) to the given Scev.
static inline const SCEV *apply(const SCEV *Scev, LoopToScevMapT &Map,		static inline const SCEV *apply(const SCEV *Scev, LoopToScevMapT &Map,
ScalarEvolution &SE) {		ScalarEvolution &SE) {
return SCEVApplyRewriter::rewrite(Scev, Map, SE);		return SCEVApplyRewriter::rewrite(Scev, Map, SE);
		rengolin: Do you need to keep this public? In addOverflowAssumption, you check MakeAssumptions to decide whether or not to add the SCEV to it, and if this remains public, anyone will be able to bypass that check.
}		}

}		}

#endif		#endif
		hfinkel: Why not make this a member function of SCEV (same for rewriteSCEVWithAssumptions below)?
		sbaranga (Author): Makes sense.

include/llvm/Transforms/Utils/LoopVersioning.h

Show All 17 Lines

#include "llvm/Transforms/Utils/ValueMapper.h"		#include "llvm/Transforms/Utils/ValueMapper.h"

namespace llvm {		namespace llvm {

class Loop;		class Loop;
class LoopAccessInfo;		class LoopAccessInfo;
class LoopInfo;		class LoopInfo;
		class ScalarEvolution;

/// \brief This class emits a version of the loop where run-time checks ensure		/// \brief This class emits a version of the loop where run-time checks ensure
/// that may-alias pointers can't overlap.		/// that may-alias pointers can't overlap.
///		///
/// It currently only supports single-exit loops and assumes that the loop		/// It currently only supports single-exit loops and assumes that the loop
/// already has a preheader.		/// already has a preheader.
class LoopVersioning {		class LoopVersioning {
public:		public:
/// \brief Expects MemCheck, LoopAccessInfo, Loop, LoopInfo, DominatorTree		/// \brief Expects MemCheck, LoopAccessInfo, Loop, LoopInfo, DominatorTree
/// as input. It uses runtime check provided by user.		/// as input. It uses runtime check provided by user.
LoopVersioning(SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks,		LoopVersioning(SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks,
const LoopAccessInfo &LAI, Loop *L, LoopInfo *LI,		const LoopAccessInfo &LAI, Loop *L, LoopInfo *LI,
DominatorTree *DT);		DominatorTree *DT, ScalarEvolution *SE);

/// \brief Expects LoopAccessInfo, Loop, LoopInfo, DominatorTree as input.		/// \brief Expects LoopAccessInfo, Loop, LoopInfo, DominatorTree as input.
/// It uses default runtime check provided by LoopAccessInfo.		/// It uses default runtime check provided by LoopAccessInfo.
LoopVersioning(const LoopAccessInfo &LAInfo, Loop *L, LoopInfo *LI,		LoopVersioning(const LoopAccessInfo &LAInfo, Loop *L, LoopInfo *LI,
DominatorTree *DT);		DominatorTree *DT, ScalarEvolution *SE);

/// \brief Performs the CFG manipulation part of versioning the loop including		/// \brief Performs the CFG manipulation part of versioning the loop including
/// the DominatorTree and LoopInfo updates.		/// the DominatorTree and LoopInfo updates.
///		///
/// The loop that was used to construct the class will be the "versioned" loop		/// The loop that was used to construct the class will be the "versioned" loop
/// i.e. the loop that will receive control if all the memchecks pass.		/// i.e. the loop that will receive control if all the memchecks pass.
///		///
/// This allows the loop transform pass to operate on the same loop regardless		/// This allows the loop transform pass to operate on the same loop regardless
Show All 36 Lines	private:

/// \brief The set of checks that we are versioning for.		/// \brief The set of checks that we are versioning for.
SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks;		SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks;

/// \brief Analyses used.		/// \brief Analyses used.
const LoopAccessInfo &LAI;		const LoopAccessInfo &LAI;
LoopInfo *LI;		LoopInfo *LI;
DominatorTree *DT;		DominatorTree *DT;
		ScalarEvolution *SE;
};		};
}		}

#endif		#endif

lib/Analysis/LoopAccessAnalysis.cpp

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	DEBUG(dbgs() << "LAA: Replacing SCEV: " << *OrigSCEV << " by: " << *ByOne
<< "\n");		<< "\n");
return ByOne;		return ByOne;
}		}

// Otherwise, just return the SCEV of the original pointer.		// Otherwise, just return the SCEV of the original pointer.
return SE->getSCEV(Ptr);		return SE->getSCEV(Ptr);
}		}

		const SCEV *llvm::rewriteSCEV(ScalarEvolution *SE,
		rengolin: Why are you using the llvm namespace here? Shouldn't these functions be static?
		const ValueToValueMap &PtrToStride, Value *Ptr,
		Value *OrigPtr, const Loop *L,
		SCEVPredicateSet &Preds) {

		const SCEV *Ret = replaceSymbolicStrideSCEV(SE, PtrToStride, Ptr, OrigPtr);
		rengolin: replaceSymbolicStrideSCEV is defined in the header, in namespace llvm, as static, which leads to confusion like this. If you need it in more than one place, it shouldn't be static and it should be implemented in its own cpp file. If these are local only, they should be static or in anonymous namespaces.
		Ret = SE->rewriteUsingPredicate(Ret, L, Preds);
		return Ret;
		}

		const SCEV *llvm::convertSCEVToAddRec(ScalarEvolution *SE,
		const ValueToValueMap &PtrToStride,
		Value *Ptr, Value *OrigPtr, const Loop *L,
		SCEVPredicateSet &Preds) {

		const SCEV *Ret = rewriteSCEV(SE, PtrToStride, Ptr, OrigPtr, L, Preds);
		if (dyn_cast<const SCEVAddRecExpr>(Ret))
		return Ret;

		AssumptionResult R = SE->getAddRecWithRTChecks(Ret, L);
		R.Pred.add(&Preds);
		// Only commit to the new predicates if we've succeeded.
		if (R.Res && !R.Pred.isAlwaysFalse()) {
		Preds = R.Pred;
		return R.Res;
		}
		return Ret;
		anemet: Hmm, why don't you write Preds only after ensuring that we succeeded? Isn't this the same as: if (R.Ret && !R.P.isNever()) { R.P.add(&Preds); <-- write Preds here. return R.Ret; } If not, it needs a comment.
		sbaranga (Author): R.P.add(&Preds) adds Preds to R.P, so that wouldn't write Preds. We use the temporary because we can only be sure whether we've succeeded after we see the final set of predicates.
		anemet: I see. It still feels like a common multi-step pattern that clients can get wrong. I wonder if we can make the API better here. We can discuss this after you split this into multiple patches.
		}
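
		The merge-then-commit pattern discussed in the thread above can be isolated in a small standalone sketch. This is a hypothetical model, not the patch's API: PredicateSet and AssumptionResult here are simplified stand-ins, predicates are plain strings, and a set counts as "always false" when it contains both a predicate P and its negation "!P". The helper commitIfSatisfiable is one possible way to wrap the multi-step sequence so that clients cannot get it wrong.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for SCEVPredicateSet: a set of named checks.
struct PredicateSet {
  std::set<std::string> Checks;

  // Merge Other into this set.
  void add(const PredicateSet &Other) {
    Checks.insert(Other.Checks.begin(), Other.Checks.end());
  }

  // The set can never hold if it contains both P and "!P".
  bool isAlwaysFalse() const {
    for (const std::string &C : Checks)
      if (Checks.count("!" + C))
        return true;
    return false;
  }
};

// Hypothetical stand-in for AssumptionResult.
struct AssumptionResult {
  bool Succeeded;
  PredicateSet Pred;
};

// Merge the already committed predicates into the temporary result first;
// only overwrite Committed if the combined set is still satisfiable.
bool commitIfSatisfiable(PredicateSet &Committed, AssumptionResult R) {
  R.Pred.add(Committed);
  if (!R.Succeeded || R.Pred.isAlwaysFalse())
    return false; // Committed is left untouched.
  Committed = R.Pred;
  return true;
}
```

		The temporary is needed because a contradiction may only become visible once the new predicates are combined with the existing ones, so success cannot be decided before the merge.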

void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, bool WritePtr,		void RuntimePointerChecking::insert(Loop *Lp, Value *Ptr, bool WritePtr,
unsigned DepSetId, unsigned ASId,		unsigned DepSetId, unsigned ASId,
const ValueToValueMap &Strides) {		const ValueToValueMap &Strides,
		SCEVPredicateSet &Pred) {
// Get the stride replaced scev.		// Get the stride replaced scev.
const SCEV *Sc = replaceSymbolicStrideSCEV(SE, Strides, Ptr);		const SCEV *Sc = rewriteSCEV(SE, Strides, Ptr, nullptr, Lp, Pred);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Sc);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Sc);
assert(AR && "Invalid addrec expression");		assert(AR && "Invalid addrec expression");
const SCEV *Ex = SE->getBackedgeTakenCount(Lp);

		const SCEV *Ex = SE->getGuardedBackedgeTakenCount(Lp, Pred);
const SCEV *ScStart = AR->getStart();		const SCEV *ScStart = AR->getStart();
		anemet: We should not recompute this for every insert. Also, we need to bound the number of predicates, so getting the guarded backedge count at a more central place is probably a better idea.
		sbaranga (Author): Caching expressions would make sense to me. We would have to recompute the expressions every time we commit to a new predicate, though.
		anemet: I don't follow.
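		One possible reading of the caching exchange above, sketched as standalone code: the guarded count is computed once and reused across inserts, but is recomputed when the set of committed predicates has changed since it was cached. Everything here is an assumption for illustration. GuardedCountCache is hypothetical, the count is modeled as a plain long, and the number of committed predicates stands in for "the predicate set changed".

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical cache for a value that depends on the committed predicates.
class GuardedCountCache {
public:
  explicit GuardedCountCache(std::function<long(std::size_t)> Fn)
      : Compute(std::move(Fn)) {}

  // Return the cached value, recomputing only if the number of committed
  // predicates differs from the one the cached value was computed for.
  long get(std::size_t NumCommittedPreds) {
    if (!Valid || CachedFor != NumCommittedPreds) {
      Value = Compute(NumCommittedPreds);
      CachedFor = NumCommittedPreds;
      Valid = true;
      ++Recomputations;
    }
    return Value;
  }

  unsigned recomputations() const { return Recomputations; }

private:
  std::function<long(std::size_t)> Compute;
  bool Valid = false;
  std::size_t CachedFor = 0;
  long Value = 0;
  unsigned Recomputations = 0;
};
```

		Under this reading, repeated inserts against an unchanged predicate set cost one computation, and a commit of a new predicate triggers exactly one more.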
const SCEV *ScEnd = AR->evaluateAtIteration(Ex, *SE);		const SCEV *ScEnd = AR->evaluateAtIteration(Ex, *SE);
const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);

// For expressions with negative step, the upper bound is ScStart and the		// For expressions with negative step, the upper bound is ScStart and the
// lower bound is ScEnd.		// lower bound is ScEnd.
if (const SCEVConstant *CStep = dyn_cast<const SCEVConstant>(Step)) {		if (const SCEVConstant *CStep = dyn_cast<const SCEVConstant>(Step)) {
if (CStep->getValue()->isNegative())		if (CStep->getValue()->isNegative())
std::swap(ScStart, ScEnd);		std::swap(ScStart, ScEnd);
▲ Show 20 Lines • Show All 272 Lines • ▼ Show 20 Lines
/// dependence checking.		/// dependence checking.
class AccessAnalysis {		class AccessAnalysis {
public:		public:
/// \brief Read or write access location.		/// \brief Read or write access location.
typedef PointerIntPair<Value *, 1, bool> MemAccessInfo;		typedef PointerIntPair<Value *, 1, bool> MemAccessInfo;
typedef SmallPtrSet<MemAccessInfo, 8> MemAccessInfoSet;		typedef SmallPtrSet<MemAccessInfo, 8> MemAccessInfoSet;

AccessAnalysis(const DataLayout &Dl, AliasAnalysis *AA, LoopInfo *LI,		AccessAnalysis(const DataLayout &Dl, AliasAnalysis *AA, LoopInfo *LI,
MemoryDepChecker::DepCandidates &DA)		MemoryDepChecker::DepCandidates &DA, SCEVPredicateSet &Pred)
: DL(Dl), AST(*AA), LI(LI), DepCands(DA),		: DL(Dl), AST(*AA), LI(LI), DepCands(DA), IsRTCheckAnalysisNeeded(false),
IsRTCheckAnalysisNeeded(false) {}		Pred(Pred) {}

/// \brief Register a load and whether it is only read from.		/// \brief Register a load and whether it is only read from.
void addLoad(MemoryLocation &Loc, bool IsReadOnly) {		void addLoad(MemoryLocation &Loc, bool IsReadOnly) {
Value *Ptr = const_cast<Value*>(Loc.Ptr);		Value *Ptr = const_cast<Value*>(Loc.Ptr);
AST.add(Ptr, MemoryLocation::UnknownSize, Loc.AATags);		AST.add(Ptr, MemoryLocation::UnknownSize, Loc.AATags);
Accesses.insert(MemAccessInfo(Ptr, false));		Accesses.insert(MemAccessInfo(Ptr, false));
if (IsReadOnly)		if (IsReadOnly)
ReadOnlyPtr.insert(Ptr);		ReadOnlyPtr.insert(Ptr);
▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	private:
/// \brief Initial processing of memory accesses determined that we may need		/// \brief Initial processing of memory accesses determined that we may need
/// to add memchecks. Perform the analysis to determine the necessary checks.		/// to add memchecks. Perform the analysis to determine the necessary checks.
///		///
/// Note that, this is different from isDependencyCheckNeeded. When we retry		/// Note that, this is different from isDependencyCheckNeeded. When we retry
/// memcheck analysis without dependency checking		/// memcheck analysis without dependency checking
/// (i.e. ShouldRetryWithRuntimeCheck), isDependencyCheckNeeded is cleared		/// (i.e. ShouldRetryWithRuntimeCheck), isDependencyCheckNeeded is cleared
/// while this remains set if we have potentially dependent accesses.		/// while this remains set if we have potentially dependent accesses.
bool IsRTCheckAnalysisNeeded;		bool IsRTCheckAnalysisNeeded;

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;
};		};

} // end anonymous namespace		} // end anonymous namespace

/// \brief Check whether a pointer can participate in a runtime bounds check.		/// \brief Check whether a pointer can participate in a runtime bounds check.
static bool hasComputableBounds(ScalarEvolution *SE,		static bool hasComputableBounds(ScalarEvolution *SE,
const ValueToValueMap &Strides, Value *Ptr) {		const ValueToValueMap &Strides, Value *Ptr,
const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, Strides, Ptr);		Loop *L, SCEVPredicateSet &Pred) {
		const SCEV *PtrScev = rewriteSCEV(SE, Strides, Ptr, nullptr, L, Pred);
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
if (!AR)		if (!AR)
return false;		return false;

return AR->isAffine();		return AR->isAffine();
}		}

bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,		bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
Show All 26 Lines	for (auto A : AS) {
bool IsWrite = Accesses.count(MemAccessInfo(Ptr, true));		bool IsWrite = Accesses.count(MemAccessInfo(Ptr, true));
MemAccessInfo Access(Ptr, IsWrite);		MemAccessInfo Access(Ptr, IsWrite);

if (IsWrite)		if (IsWrite)
++NumWritePtrChecks;		++NumWritePtrChecks;
else		else
++NumReadPtrChecks;		++NumReadPtrChecks;

if (hasComputableBounds(SE, StridesMap, Ptr) &&		bool Bounded = hasComputableBounds(SE, StridesMap, Ptr, TheLoop, Pred);
		if (!Bounded) {
		convertSCEVToAddRec(SE, StridesMap, Ptr, nullptr, TheLoop, Pred);
		Bounded = hasComputableBounds(SE, StridesMap, Ptr, TheLoop, Pred);
		}

		bool ValidStride = true;
		if (ShouldCheckStride) {
		ValidStride =
		(isStridedPtr(SE, Ptr, TheLoop, StridesMap, Pred, true) == 1);
		}

// When we run after a failing dependency check we have to make sure		// When we run after a failing dependency check we have to make sure
// we don't have wrapping pointers.		// we don't have wrapping pointers.
(!ShouldCheckStride ||		if (Bounded && ValidStride) {
isStridedPtr(SE, Ptr, TheLoop, StridesMap) == 1)) {
// The id of the dependence set.		// The id of the dependence set.
		anemet: This does not look tight enough. Can't we fail in hasComputable... for reasons other than the SCEV not being an AddRec? Also, the return value of convertSCEV... should be checked rather than just repeating hasComputable... blindly.
		sbaranga (Author): Yes, you're correct. A better solution seems to be to make the assumption only after we've seen that hasComputableBounds now returns true. This change probably needs a closer look.
unsigned DepId;		unsigned DepId;

if (IsDepCheckNeeded) {		if (IsDepCheckNeeded) {
Value *Leader = DepCands.getLeaderValue(Access).getPointer();		Value *Leader = DepCands.getLeaderValue(Access).getPointer();
unsigned &LeaderId = DepSetId[Leader];		unsigned &LeaderId = DepSetId[Leader];
if (!LeaderId)		if (!LeaderId)
LeaderId = RunningDepId++;		LeaderId = RunningDepId++;
DepId = LeaderId;		DepId = LeaderId;
} else		} else
// Each access has its own dependence set.		// Each access has its own dependence set.
DepId = RunningDepId++;		DepId = RunningDepId++;

RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap);		RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap, Pred);

DEBUG(dbgs() << "LAA: Found a runtime check ptr:" << *Ptr << '\n');		DEBUG(dbgs() << "LAA: Found a runtime check ptr:" << *Ptr << '\n');
} else {		} else {
DEBUG(dbgs() << "LAA: Can't find bounds for ptr:" << *Ptr << '\n');		DEBUG(dbgs() << "LAA: Can't find bounds for ptr:" << *Ptr << '\n');
CanDoRT = false;		CanDoRT = false;
}		}
}		}

▲ Show 20 Lines • Show All 214 Lines • ▼ Show 20 Lines	if (OBO->hasNoSignedWrap() &&
return OpAR->getLoop() == L && OpAR->getNoWrapFlags(SCEV::FlagNSW);		return OpAR->getLoop() == L && OpAR->getNoWrapFlags(SCEV::FlagNSW);
}		}

return false;		return false;
}		}

/// \brief Check whether the access through \p Ptr has a constant stride.		/// \brief Check whether the access through \p Ptr has a constant stride.
int llvm::isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,		int llvm::isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,
const ValueToValueMap &StridesMap) {		const ValueToValueMap &StridesMap,
		SCEVPredicateSet &Pred, bool MakeAssumptions) {
Type *Ty = Ptr->getType();		Type *Ty = Ptr->getType();
assert(Ty->isPointerTy() && "Unexpected non-ptr");		assert(Ty->isPointerTy() && "Unexpected non-ptr");

// Make sure that the pointer does not point to aggregate types.		// Make sure that the pointer does not point to aggregate types.
auto *PtrTy = cast<PointerType>(Ty);		auto *PtrTy = cast<PointerType>(Ty);
if (PtrTy->getElementType()->isAggregateType()) {		if (PtrTy->getElementType()->isAggregateType()) {
DEBUG(dbgs() << "LAA: Bad stride - Not a pointer to a scalar type"		DEBUG(dbgs() << "LAA: Bad stride - Not a pointer to a scalar type"
<< *Ptr << "\n");		<< *Ptr << "\n");
return 0;		return 0;
}		}

const SCEV *PtrScev = replaceSymbolicStrideSCEV(SE, StridesMap, Ptr);		const SCEV *PtrScev = rewriteSCEV(SE, StridesMap, Ptr, nullptr, Lp, Pred);

const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LAA: Bad stride - Not an AddRecExpr pointer "		// It's not an AddRecExpr. Try to force an AddRecExpr by making
<< *Ptr << " SCEV: " << *PtrScev << "\n");		// assumptions which can be checked at run-time.
		const SCEV *Retry =
		convertSCEVToAddRec(SE, StridesMap, Ptr, nullptr, Lp, Pred);
		AR = dyn_cast<const SCEVAddRecExpr>(Retry);
		if (!AR) {
		DEBUG(dbgs() << "LAA: Bad stride - Not an AddRecExpr pointer " << *Ptr
		<< " SCEV: " << *PtrScev << "\n");
return 0;		return 0;
}		}
		}

// The accesss function must stride over the innermost loop.		// The access function must stride over the innermost loop.
if (Lp != AR->getLoop()) {		if (Lp != AR->getLoop()) {
DEBUG(dbgs() << "LAA: Bad stride - Not striding over innermost loop " <<		DEBUG(dbgs() << "LAA: Bad stride - Not striding over innermost loop " <<
*Ptr << " SCEV: " << *PtrScev << "\n");		*Ptr << " SCEV: " << *PtrScev << "\n");
}		}

// The address calculation must not wrap. Otherwise, a dependence could be		// The address calculation must not wrap. Otherwise, a dependence could be
// inverted.		// inverted.
// An inbounds getelementptr that is a AddRec with a unit stride		// An inbounds getelementptr that is a AddRec with a unit stride
▲ Show 20 Lines • Show All 185 Lines • ▼ Show 20 Lines	MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
if (!AIsWrite && !BIsWrite)		if (!AIsWrite && !BIsWrite)
return Dependence::NoDep;		return Dependence::NoDep;

// We cannot check pointers in different address spaces.		// We cannot check pointers in different address spaces.
if (APtr->getType()->getPointerAddressSpace() !=		if (APtr->getType()->getPointerAddressSpace() !=
BPtr->getType()->getPointerAddressSpace())		BPtr->getType()->getPointerAddressSpace())
return Dependence::Unknown;		return Dependence::Unknown;

const SCEV *AScev = replaceSymbolicStrideSCEV(SE, Strides, APtr);		const SCEV *AScev =
const SCEV *BScev = replaceSymbolicStrideSCEV(SE, Strides, BPtr);		rewriteSCEV(SE, Strides, APtr, nullptr, InnermostLoop, Pred);
		const SCEV *BScev =
int StrideAPtr = isStridedPtr(SE, APtr, InnermostLoop, Strides);		rewriteSCEV(SE, Strides, BPtr, nullptr, InnermostLoop, Pred);
int StrideBPtr = isStridedPtr(SE, BPtr, InnermostLoop, Strides);
		// Make assumptions here, otherwise we're guaranteed to end up with
		// an unknown dependence.
		int StrideAPtr = isStridedPtr(SE, APtr, InnermostLoop, Strides, Pred, true);
		int StrideBPtr = isStridedPtr(SE, BPtr, InnermostLoop, Strides, Pred, true);

const SCEV *Src = AScev;		const SCEV *Src = AScev;
const SCEV *Sink = BScev;		const SCEV *Sink = BScev;

// If the induction step is negative we have to invert source and sink of the		// If the induction step is negative we have to invert source and sink of the
// dependence.		// dependence.
if (StrideAPtr < 0) {		if (StrideAPtr < 0) {
//Src = BScev;		//Src = BScev;
Show All 10 Lines	MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
DEBUG(dbgs() << "LAA: Src Scev: " << *Src << "Sink Scev: " << *Sink		DEBUG(dbgs() << "LAA: Src Scev: " << *Src << "Sink Scev: " << *Sink
<< "(Induction step: " << StrideAPtr << ")\n");		<< "(Induction step: " << StrideAPtr << ")\n");
DEBUG(dbgs() << "LAA: Distance for " << *InstMap[AIdx] << " to "		DEBUG(dbgs() << "LAA: Distance for " << *InstMap[AIdx] << " to "
<< *InstMap[BIdx] << ": " << *Dist << "\n");		<< *InstMap[BIdx] << ": " << *Dist << "\n");

// Need accesses with constant stride. We don't want to vectorize		// Need accesses with constant stride. We don't want to vectorize
// "A[B[i]] += ..." and similar code or pointer arithmetic that could wrap in		// "A[B[i]] += ..." and similar code or pointer arithmetic that could wrap in
// the address space.		// the address space.
if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr){		if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr) {
DEBUG(dbgs() << "Pointer access with non-constant stride\n");		DEBUG(dbgs() << "Pointer access with non-constant stride\n");
return Dependence::Unknown;		return Dependence::Unknown;
}		}

const SCEVConstant *C = dyn_cast<SCEVConstant>(Dist);		const SCEVConstant *C = dyn_cast<SCEVConstant>(Dist);
if (!C) {		if (!C) {
DEBUG(dbgs() << "LAA: Dependence because of non-constant distance\n");		DEBUG(dbgs() << "LAA: Dependence because of non-constant distance\n");
ShouldRetryWithRuntimeCheck = true;		ShouldRetryWithRuntimeCheck = true;
▲ Show 20 Lines • Show All 251 Lines • ▼ Show 20 Lines	if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
DEBUG(dbgs() << "LAA: loop control flow is not understood by analyzer\n");		DEBUG(dbgs() << "LAA: loop control flow is not understood by analyzer\n");
emitAnalysis(		emitAnalysis(
LoopAccessReport() <<		LoopAccessReport() <<
"loop control flow is not understood by analyzer");		"loop control flow is not understood by analyzer");
return false;		return false;
}		}

// ScalarEvolution needs to be able to find the exit count.		// ScalarEvolution needs to be able to find the exit count.
const SCEV *ExitCount = SE->getBackedgeTakenCount(TheLoop);		const SCEV *ExitCount = SE->getGuardedBackedgeTakenCount(TheLoop, *Pred);

if (ExitCount == SE->getCouldNotCompute()) {		if (ExitCount == SE->getCouldNotCompute()) {
emitAnalysis(LoopAccessReport() <<		emitAnalysis(LoopAccessReport()
"could not determine number of loop iterations");		<< "could not determine number of loop iterations");
DEBUG(dbgs() << "LAA: SCEV could not compute the loop exit count.\n");		DEBUG(dbgs() << "LAA: SCEV could not compute the loop exit count.\n");
return false;		return false;
}		}

return true;		return true;
}		}

void LoopAccessInfo::analyzeLoop(const ValueToValueMap &Strides) {		void LoopAccessInfo::analyzeLoop(const ValueToValueMap &Strides) {
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	void LoopAccessInfo::analyzeLoop(const ValueToValueMap &Strides) {
if (!Stores.size()) {		if (!Stores.size()) {
DEBUG(dbgs() << "LAA: Found a read-only loop!\n");		DEBUG(dbgs() << "LAA: Found a read-only loop!\n");
CanVecMem = true;		CanVecMem = true;
return;		return;
}		}

MemoryDepChecker::DepCandidates DependentAccesses;		MemoryDepChecker::DepCandidates DependentAccesses;
AccessAnalysis Accesses(TheLoop->getHeader()->getModule()->getDataLayout(),		AccessAnalysis Accesses(TheLoop->getHeader()->getModule()->getDataLayout(),
AA, LI, DependentAccesses);		AA, LI, DependentAccesses, *Pred);

// Holds the analyzed pointers. We don't want to call GetUnderlyingObjects		// Holds the analyzed pointers. We don't want to call GetUnderlyingObjects
// multiple times on the same object. If the ptr is accessed twice, once		// multiple times on the same object. If the ptr is accessed twice, once
// for read and once for write, it will only appear once (on the write		// for read and once for write, it will only appear once (on the write
// list). This is okay, since we are going to check for conflicts between		// list). This is okay, since we are going to check for conflicts between
// writes and between reads and writes, but not between reads and reads.		// writes and between reads and writes, but not between reads and reads.
ValueSet Seen;		ValueSet Seen;

Show All 34 Lines	for (I = Loads.begin(), IE = Loads.end(); I != IE; ++I) {
// read list. If we did see it before, then it is already in		// read list. If we did see it before, then it is already in
// the read-write list. This allows us to vectorize expressions		// the read-write list. This allows us to vectorize expressions
// such as A[i] += x; Because the address of A[i] is a read-write		// such as A[i] += x; Because the address of A[i] is a read-write
// pointer. This only works if the index of A[i] is consecutive.		// pointer. This only works if the index of A[i] is consecutive.
// If the address of i is unknown (for example A[B[i]]) then we may		// If the address of i is unknown (for example A[B[i]]) then we may
// read a few words, modify, and write a few words, and some of the		// read a few words, modify, and write a few words, and some of the
// words may be written to the same address.		// words may be written to the same address.
bool IsReadOnlyPtr = false;		bool IsReadOnlyPtr = false;
if (Seen.insert(Ptr).second || !isStridedPtr(SE, Ptr, TheLoop, Strides)) {		if (Seen.insert(Ptr).second ||
		!isStridedPtr(SE, Ptr, TheLoop, Strides, *Pred, false)) {
++NumReads;		++NumReads;
IsReadOnlyPtr = true;		IsReadOnlyPtr = true;
}		}

MemoryLocation Loc = MemoryLocation::get(LD);		MemoryLocation Loc = MemoryLocation::get(LD);
// The TBAA metadata could have a control dependency on the predication		// The TBAA metadata could have a control dependency on the predication
// condition, so we cannot rely on it when determining whether or not we		// condition, so we cannot rely on it when determining whether or not we
// need runtime pointer checks.		// need runtime pointer checks.
▲ Show 20 Lines • Show All 226 Lines • ▼ Show 20 Lines	LoopAccessInfo::addRuntimeChecks(Instruction *Loc) const {
return addRuntimeChecks(Loc, PtrRtChecking.getChecks());		return addRuntimeChecks(Loc, PtrRtChecking.getChecks());
}		}

LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,		LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,
const DataLayout &DL,		const DataLayout &DL,
const TargetLibraryInfo *TLI, AliasAnalysis *AA,		const TargetLibraryInfo *TLI, AliasAnalysis *AA,
DominatorTree *DT, LoopInfo *LI,		DominatorTree *DT, LoopInfo *LI,
const ValueToValueMap &Strides)		const ValueToValueMap &Strides)
: PtrRtChecking(SE), DepChecker(SE, L), TheLoop(L), SE(SE), DL(DL),		: Pred(new SCEVPredicateSet), PtrRtChecking(SE), DepChecker(SE, L, *Pred),
TLI(TLI), AA(AA), DT(DT), LI(LI), NumLoads(0), NumStores(0),		TheLoop(L), SE(SE), DL(DL), TLI(TLI),
		AA(AA), DT(DT), LI(LI), NumLoads(0), NumStores(0),
MaxSafeDepDistBytes(-1U), CanVecMem(false),		MaxSafeDepDistBytes(-1U), CanVecMem(false),
StoreToLoopInvariantAddress(false) {		StoreToLoopInvariantAddress(false) {
if (canAnalyzeLoop())		if (canAnalyzeLoop())
analyzeLoop(Strides);		analyzeLoop(Strides);
}		}

void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {		void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {
if (CanVecMem) {		if (CanVecMem) {
Show All 17 Lines	void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {

// List the pairs of accesses that need run-time checks to prove independence.		// List the pairs of accesses that need run-time checks to prove independence.
PtrRtChecking.print(OS, Depth);		PtrRtChecking.print(OS, Depth);
OS << "\n";		OS << "\n";

OS.indent(Depth) << "Store to invariant address was "		OS.indent(Depth) << "Store to invariant address was "
<< (StoreToLoopInvariantAddress ? "" : "not ")		<< (StoreToLoopInvariantAddress ? "" : "not ")
<< "found in loop.\n";		<< "found in loop.\n";

		OS.indent(Depth) << "SCEV assumptions:\n";
		Pred->print(OS, Depth);
}		}

const LoopAccessInfo &		const LoopAccessInfo &
LoopAccessAnalysis::getInfo(Loop *L, const ValueToValueMap &Strides) {		LoopAccessAnalysis::getInfo(Loop *L, const ValueToValueMap &Strides) {
auto &LAI = LoopAccessInfoMap[L];		DenseMap<Loop *, std::unique_ptr<LoopAccessInfo>> &Map = LoopAccessInfoMap;

		auto &LAI = Map[L];

#ifndef NDEBUG		#ifndef NDEBUG
assert((!LAI || LAI->NumSymbolicStrides == Strides.size()) &&		assert((!LAI || LAI->NumSymbolicStrides == Strides.size()) &&
"Symbolic strides changed for loop");		"Symbolic strides changed for loop");
#endif		#endif

if (!LAI) {		if (!LAI) {
const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();
▲ Show 20 Lines • Show All 58 Lines • Show Last 20 Lines

lib/Analysis/ScalarEvolution.cpp

Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
		#include "llvm/Analysis/ScalarEvolutionExpander.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"		#include "llvm/Analysis/ScalarEvolutionExpressions.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/ConstantRange.h"		#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/GetElementPtrTypeIterator.h"		#include "llvm/IR/GetElementPtrTypeIterator.h"
#include "llvm/IR/GlobalAlias.h"		#include "llvm/IR/GlobalAlias.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"		#include "llvm/IR/Metadata.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
Show All 13 Lines

static cl::opt<unsigned>		static cl::opt<unsigned>
MaxBruteForceIterations("scalar-evolution-max-iterations", cl::ReallyHidden,		MaxBruteForceIterations("scalar-evolution-max-iterations", cl::ReallyHidden,
cl::desc("Maximum number of iterations SCEV will "		cl::desc("Maximum number of iterations SCEV will "
"symbolically execute a constant "		"symbolically execute a constant "
"derived loop"),		"derived loop"),
cl::init(100));		cl::init(100));

		static cl::opt<unsigned>
		OverflowCheckThreshold("force-max-overflow-checks", cl::init(16),
		cl::Hidden,
		cl::desc("Don't create SCEV predicates with more than "
		"this number of assumptions."));


// FIXME: Enable this with XDEBUG when the test suite is clean.		// FIXME: Enable this with XDEBUG when the test suite is clean.
static cl::opt<bool>		static cl::opt<bool>
VerifySCEV("verify-scev",		VerifySCEV("verify-scev",
cl::desc("Verify ScalarEvolution's backedge taken counts (slow)"));		cl::desc("Verify ScalarEvolution's backedge taken counts (slow)"));

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SCEV class definitions		// SCEV class definitions
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
▲ Show 20 Lines • Show All 4,571 Lines • ▼ Show 20 Lines
/// Note that it is not valid to call this method on a loop without a		/// Note that it is not valid to call this method on a loop without a
/// loop-invariant backedge-taken count (see		/// loop-invariant backedge-taken count (see
/// hasLoopInvariantBackedgeTakenCount).		/// hasLoopInvariantBackedgeTakenCount).
///		///
const SCEV *ScalarEvolution::getBackedgeTakenCount(const Loop *L) {		const SCEV *ScalarEvolution::getBackedgeTakenCount(const Loop *L) {
return getBackedgeTakenInfo(L).getExact(this);		return getBackedgeTakenInfo(L).getExact(this);
}		}

		const SCEV *
		ScalarEvolution::getGuardedBackedgeTakenCount(const Loop *L,
		SCEVPredicateSet &Predicates) {
		return getBackedgeTakenInfo(L).getGuardedExact(this, Predicates);
		}

/// getMaxBackedgeTakenCount - Similar to getBackedgeTakenCount, except		/// getMaxBackedgeTakenCount - Similar to getBackedgeTakenCount, except
/// return the least SCEV value that is known never to be less than the		/// return the least SCEV value that is known never to be less than the
/// actual backedge taken count.		/// actual backedge taken count.
const SCEV *ScalarEvolution::getMaxBackedgeTakenCount(const Loop *L) {		const SCEV *ScalarEvolution::getMaxBackedgeTakenCount(const Loop *L) {
return getBackedgeTakenInfo(L).getMax(this);		return getBackedgeTakenInfo(L).getMax(this);
}		}

/// PushLoopPHIs - Push PHI nodes in the header of the given loop		/// PushLoopPHIs - Push PHI nodes in the header of the given loop
▲ Show 20 Lines • Show All 171 Lines • ▼ Show 20 Lines	ScalarEvolution::BackedgeTakenInfo::getExact(ScalarEvolution *SE) const {
assert(ExitNotTaken.ExactNotTaken && "uninitialized not-taken info");		assert(ExitNotTaken.ExactNotTaken && "uninitialized not-taken info");

const SCEV *BECount = nullptr;		const SCEV *BECount = nullptr;
for (const ExitNotTakenInfo *ENT = &ExitNotTaken;		for (const ExitNotTakenInfo *ENT = &ExitNotTaken;
ENT != nullptr; ENT = ENT->getNextExit()) {		ENT != nullptr; ENT = ENT->getNextExit()) {

assert(ENT->ExactNotTaken != SE->getCouldNotCompute() && "bad exit SCEV");		assert(ENT->ExactNotTaken != SE->getCouldNotCompute() && "bad exit SCEV");

		if (ENT->Pred.isAlwaysFalse() || !ENT->Pred.isAlwaysTrue())
		return SE->getCouldNotCompute();

if (!BECount)		if (!BECount)
BECount = ENT->ExactNotTaken;		BECount = ENT->ExactNotTaken;
else if (BECount != ENT->ExactNotTaken)		else if (BECount != ENT->ExactNotTaken)
return SE->getCouldNotCompute();		return SE->getCouldNotCompute();
}		}
assert(BECount && "Invalid not taken count for loop exit");		assert(BECount && "Invalid not taken count for loop exit");
return BECount;		return BECount;
}		}

		const SCEV *ScalarEvolution::BackedgeTakenInfo::getGuardedExact(
		ScalarEvolution *SE, SCEVPredicateSet &Predicates) const {
		// If any exits were not computable, the loop is not computable.
		if (!ExitNotTaken.isCompleteList())
		return SE->getCouldNotCompute();

		// We need exactly one computable exit.
		if (!ExitNotTaken.ExitingBlock)
		return SE->getCouldNotCompute();
		assert(ExitNotTaken.ExactNotTaken && "uninitialized not-taken info");

		const SCEV *BECount = nullptr;
		SCEVPredicateSet Pred;
		for (const ExitNotTakenInfo *ENT = &ExitNotTaken; ENT != nullptr;
		ENT = ENT->getNextExit()) {

		assert(ENT->ExactNotTaken != SE->getCouldNotCompute() && "bad exit SCEV");

		if (!BECount) {
		BECount = ENT->ExactNotTaken;
		} else if (BECount != ENT->ExactNotTaken) {
		return SE->getCouldNotCompute();
		}
		Predicates.add(&(ENT->Pred));
		}
		assert(BECount && "Invalid not taken count for loop exit");

		if (Predicates.isAlwaysFalse())
		return SE->getCouldNotCompute();

		return BECount;
		}
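
To illustrate how the guarded query above differs from the unguarded getExact, here is a minimal standalone sketch. It does not use the LLVM API: ToyPredicateSet, ToyExit and CouldNotCompute are hypothetical stand-ins for SCEVPredicateSet, ExitNotTakenInfo and getCouldNotCompute(), and the real classes in this patch may behave differently.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical sentinel standing in for SE->getCouldNotCompute().
const long CouldNotCompute = -1;

// Toy model of a predicate set: a bag of named assumptions.
struct ToyPredicateSet {
  std::set<std::string> Assumptions;
  bool AlwaysFalse = false;

  void add(const ToyPredicateSet &Other) {
    Assumptions.insert(Other.Assumptions.begin(), Other.Assumptions.end());
    AlwaysFalse = AlwaysFalse || Other.AlwaysFalse;
  }
  bool isAlwaysTrue() const { return !AlwaysFalse && Assumptions.empty(); }
  bool isAlwaysFalse() const { return AlwaysFalse; }
};

// Toy model of one computable loop exit.
struct ToyExit {
  long ExactNotTaken;
  ToyPredicateSet Pred;
};

// Unguarded query: any exit that depends on an assumption makes the
// whole count unknown.
long getExact(const std::vector<ToyExit> &Exits) {
  long BECount = CouldNotCompute;
  for (const ToyExit &E : Exits) {
    assert(E.ExactNotTaken != CouldNotCompute && "bad exit count");
    if (!E.Pred.isAlwaysTrue())
      return CouldNotCompute;
    if (BECount == CouldNotCompute)
      BECount = E.ExactNotTaken;
    else if (BECount != E.ExactNotTaken)
      return CouldNotCompute;
  }
  return BECount;
}

// Guarded query: the count is returned together with the assumptions
// the caller must check at run time.
long getGuardedExact(const std::vector<ToyExit> &Exits,
                     ToyPredicateSet &Predicates) {
  long BECount = CouldNotCompute;
  for (const ToyExit &E : Exits) {
    if (BECount == CouldNotCompute)
      BECount = E.ExactNotTaken;
    else if (BECount != E.ExactNotTaken)
      return CouldNotCompute;
    Predicates.add(E.Pred);
  }
  if (Predicates.isAlwaysFalse())
    return CouldNotCompute;
  return BECount;
}
```

In this model, a single exit with count 10 and one "does not wrap" assumption yields CouldNotCompute from getExact, while getGuardedExact yields 10 and hands the assumption back to the caller.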

/// getExact - Get the exact not taken count for this loop exit.		/// getExact - Get the exact not taken count for this loop exit.
const SCEV *		const SCEV *
ScalarEvolution::BackedgeTakenInfo::getExact(BasicBlock *ExitingBlock,		ScalarEvolution::BackedgeTakenInfo::getExact(BasicBlock *ExitingBlock,
ScalarEvolution *SE) const {		ScalarEvolution *SE) const {
for (const ExitNotTakenInfo *ENT = &ExitNotTaken;		for (const ExitNotTakenInfo *ENT = &ExitNotTaken; ENT != nullptr;
ENT != nullptr; ENT = ENT->getNextExit()) {		ENT = ENT->getNextExit()) {

if (ENT->ExitingBlock == ExitingBlock)		if (ENT->ExitingBlock == ExitingBlock && ENT->Pred.isAlwaysTrue())
return ENT->ExactNotTaken;		return ENT->ExactNotTaken;
}		}
return SE->getCouldNotCompute();		return SE->getCouldNotCompute();
}		}

/// getMax - Get the max backedge taken count for the loop.		/// getMax - Get the max backedge taken count for the loop.
const SCEV *		const SCEV *
ScalarEvolution::BackedgeTakenInfo::getMax(ScalarEvolution *SE) const {		ScalarEvolution::BackedgeTakenInfo::getMax(ScalarEvolution *SE) const {
		for (const ExitNotTakenInfo *ENT = &ExitNotTaken; ENT != nullptr;
		ENT = ENT->getNextExit()) {
		if (!ENT->Pred.isAlwaysTrue())
		return SE->getCouldNotCompute();
		}
return Max ? Max : SE->getCouldNotCompute();		return Max ? Max : SE->getCouldNotCompute();
}		}

bool ScalarEvolution::BackedgeTakenInfo::hasOperand(const SCEV *S,		bool ScalarEvolution::BackedgeTakenInfo::hasOperand(const SCEV *S,
ScalarEvolution *SE) const {		ScalarEvolution *SE) const {
if (Max && Max != SE->getCouldNotCompute() && SE->hasOperand(Max, S))		if (Max && Max != SE->getCouldNotCompute() && SE->hasOperand(Max, S))
return true;		return true;

Show All 9 Lines	for (const ExitNotTakenInfo *ENT = &ExitNotTaken;
}		}
}		}
return false;		return false;
}		}

/// Allocate memory for BackedgeTakenInfo and copy the not-taken count of each		/// Allocate memory for BackedgeTakenInfo and copy the not-taken count of each
/// computable exit into a persistent ExitNotTakenInfo array.		/// computable exit into a persistent ExitNotTakenInfo array.
ScalarEvolution::BackedgeTakenInfo::BackedgeTakenInfo(		ScalarEvolution::BackedgeTakenInfo::BackedgeTakenInfo(
SmallVectorImpl< std::pair<BasicBlock *, const SCEV *> > &ExitCounts,		SmallVectorImpl<std::pair<BasicBlock *, const SCEV *>> &ExitCounts,
bool Complete, const SCEV *MaxCount) : Max(MaxCount) {		SmallVectorImpl<SCEVPredicateSet *> &ExitPreds, bool Complete,
		const SCEV *MaxCount)
		: Max(MaxCount) {

if (!Complete)		if (!Complete)
ExitNotTaken.setIncomplete();		ExitNotTaken.setIncomplete();

unsigned NumExits = ExitCounts.size();		unsigned NumExits = ExitCounts.size();
if (NumExits == 0) return;		if (NumExits == 0) return;

ExitNotTaken.ExitingBlock = ExitCounts[0].first;		ExitNotTaken.ExitingBlock = ExitCounts[0].first;
ExitNotTaken.ExactNotTaken = ExitCounts[0].second;		ExitNotTaken.ExactNotTaken = ExitCounts[0].second;
if (NumExits == 1) return;		ExitNotTaken.Pred = *ExitPreds[0];

		if (NumExits == 1)
		return;

// Handle the rare case of multiple computable exits.		// Handle the rare case of multiple computable exits.
ExitNotTakenInfo *ENT = new ExitNotTakenInfo[NumExits-1];		ExitNotTakenInfo *ENT = new ExitNotTakenInfo[NumExits-1];

ExitNotTakenInfo *PrevENT = &ExitNotTaken;		ExitNotTakenInfo *PrevENT = &ExitNotTaken;
for (unsigned i = 1; i < NumExits; ++i, PrevENT = ENT, ++ENT) {		for (unsigned i = 1; i < NumExits; ++i, PrevENT = ENT, ++ENT) {
PrevENT->setNextExit(ENT);		PrevENT->setNextExit(ENT);
ENT->ExitingBlock = ExitCounts[i].first;		ENT->ExitingBlock = ExitCounts[i].first;
ENT->ExactNotTaken = ExitCounts[i].second;		ENT->ExactNotTaken = ExitCounts[i].second;
		ENT->Pred = *ExitPreds[i];
}		}
}		}

/// clear - Invalidate this result and free the ExitNotTakenInfo array.		/// clear - Invalidate this result and free the ExitNotTakenInfo array.
void ScalarEvolution::BackedgeTakenInfo::clear() {		void ScalarEvolution::BackedgeTakenInfo::clear() {
ExitNotTaken.ExitingBlock = nullptr;		ExitNotTaken.ExitingBlock = nullptr;
ExitNotTaken.ExactNotTaken = nullptr;		ExitNotTaken.ExactNotTaken = nullptr;
delete[] ExitNotTaken.getNextExit();		delete[] ExitNotTaken.getNextExit();
}		}

/// ComputeBackedgeTakenCount - Compute the number of times the backedge		/// ComputeBackedgeTakenCount - Compute the number of times the backedge
/// of the specified loop will execute.		/// of the specified loop will execute.
ScalarEvolution::BackedgeTakenInfo		ScalarEvolution::BackedgeTakenInfo
ScalarEvolution::ComputeBackedgeTakenCount(const Loop *L) {		ScalarEvolution::ComputeBackedgeTakenCount(const Loop *L) {
SmallVector<BasicBlock *, 8> ExitingBlocks;		SmallVector<BasicBlock *, 8> ExitingBlocks;
L->getExitingBlocks(ExitingBlocks);		L->getExitingBlocks(ExitingBlocks);

SmallVector<std::pair<BasicBlock *, const SCEV *>, 4> ExitCounts;		SmallVector<std::pair<BasicBlock *, const SCEV *>, 4> ExitCounts;
		SmallVector<SCEVPredicateSet *, 4> ExitCountPreds;
bool CouldComputeBECount = true;		bool CouldComputeBECount = true;
BasicBlock *Latch = L->getLoopLatch(); // may be NULL.		BasicBlock *Latch = L->getLoopLatch(); // may be NULL.
const SCEV *MustExitMaxBECount = nullptr;		const SCEV *MustExitMaxBECount = nullptr;
const SCEV *MayExitMaxBECount = nullptr;		const SCEV *MayExitMaxBECount = nullptr;

// Compute the ExitLimit for each loop exit. Use this to populate ExitCounts		// Compute the ExitLimit for each loop exit. Use this to populate ExitCounts
// and compute maxBECount.		// and compute maxBECount.
		// Do a union of all the predicates here.
for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {		for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {
BasicBlock *ExitBB = ExitingBlocks[i];		BasicBlock *ExitBB = ExitingBlocks[i];
ExitLimit EL = ComputeExitLimit(L, ExitBB);		ExitLimit EL = ComputeExitLimit(L, ExitBB);

// 1. For each exit that can be computed, add an entry to ExitCounts.		// 1. For each exit that can be computed, add an entry to ExitCounts.
// CouldComputeBECount is true only if all exits can be computed.		// CouldComputeBECount is true only if all exits can be computed.
if (EL.Exact == getCouldNotCompute())		if (EL.Exact == getCouldNotCompute())
// We couldn't compute an exact value for this exit, so		// We couldn't compute an exact value for this exit, so
// we won't be able to compute an exact value for the loop.		// we won't be able to compute an exact value for the loop.
CouldComputeBECount = false;		CouldComputeBECount = false;
else		else {
ExitCounts.push_back(std::make_pair(ExitBB, EL.Exact));		ExitCounts.push_back(std::make_pair(ExitBB, EL.Exact));
		ExitCountPreds.push_back(&EL.Pred);
		}

// 2. Derive the loop's MaxBECount from each exit's max number of		// 2. Derive the loop's MaxBECount from each exit's max number of
// non-exiting iterations. Partition the loop exits into two kinds:		// non-exiting iterations. Partition the loop exits into two kinds:
// LoopMustExits and LoopMayExits.		// LoopMustExits and LoopMayExits.
//		//
// If the exit dominates the loop latch, it is a LoopMustExit otherwise it		// If the exit dominates the loop latch, it is a LoopMustExit otherwise it
// is a LoopMayExit. If any computable LoopMustExit is found, then		// is a LoopMayExit. If any computable LoopMustExit is found, then
// MaxBECount is the minimum EL.Max of computable LoopMustExits. Otherwise,		// MaxBECount is the minimum EL.Max of computable LoopMustExits. Otherwise,
Show All 11 Lines	if (EL.Max != getCouldNotCompute() && Latch &&
if (!MayExitMaxBECount || EL.Max == getCouldNotCompute())		if (!MayExitMaxBECount || EL.Max == getCouldNotCompute())
MayExitMaxBECount = EL.Max;		MayExitMaxBECount = EL.Max;
else {		else {
MayExitMaxBECount =		MayExitMaxBECount =
getUMaxFromMismatchedTypes(MayExitMaxBECount, EL.Max);		getUMaxFromMismatchedTypes(MayExitMaxBECount, EL.Max);
}		}
}		}
}		}
const SCEV *MaxBECount = MustExitMaxBECount ? MustExitMaxBECount :		const SCEV *MaxBECount =
(MayExitMaxBECount ? MayExitMaxBECount : getCouldNotCompute());		MustExitMaxBECount
return BackedgeTakenInfo(ExitCounts, CouldComputeBECount, MaxBECount);		? MustExitMaxBECount
		: (MayExitMaxBECount ? MayExitMaxBECount : getCouldNotCompute());
		return BackedgeTakenInfo(ExitCounts, ExitCountPreds, CouldComputeBECount,
		MaxBECount);
}		}

/// ComputeExitLimit - Compute the number of times the backedge of the specified		/// ComputeExitLimit - Compute the number of times the backedge of the specified
/// loop will execute if it exits via the specified block.		/// loop will execute if it exits via the specified block.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::ComputeExitLimit(const Loop *L, BasicBlock *ExitingBlock) {		ScalarEvolution::ComputeExitLimit(const Loop *L, BasicBlock *ExitingBlock) {

// Okay, we've chosen an exiting block. See what condition causes us to		// Okay, we've chosen an exiting block. See what condition causes us to
▲ Show 20 Lines • Show All 116 Lines • ▼ Show 20 Lines	if (BO->getOpcode() == Instruction::And) {
// For now, be conservative.		// For now, be conservative.
assert(L->contains(FBB) && "Loop block has no successor in loop!");		assert(L->contains(FBB) && "Loop block has no successor in loop!");
if (EL0.Max == EL1.Max)		if (EL0.Max == EL1.Max)
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
if (EL0.Exact == EL1.Exact)		if (EL0.Exact == EL1.Exact)
BECount = EL0.Exact;		BECount = EL0.Exact;
}		}

return ExitLimit(BECount, MaxBECount);		SCEVPredicateSet NP;
		NP.add(&EL0.Pred);
		NP.add(&EL1.Pred);
		return ExitLimit(BECount, MaxBECount, NP);
}		}
if (BO->getOpcode() == Instruction::Or) {		if (BO->getOpcode() == Instruction::Or) {
// Recurse on the operands of the or.		// Recurse on the operands of the or.
bool EitherMayExit = L->contains(FBB);		bool EitherMayExit = L->contains(FBB);
ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,		ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,
ControlsExit && !EitherMayExit);		ControlsExit && !EitherMayExit);
ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,		ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,
ControlsExit && !EitherMayExit);		ControlsExit && !EitherMayExit);
Show All 18 Lines	if (BO->getOpcode() == Instruction::Or) {
// For now, be conservative.		// For now, be conservative.
assert(L->contains(TBB) && "Loop block has no successor in loop!");		assert(L->contains(TBB) && "Loop block has no successor in loop!");
if (EL0.Max == EL1.Max)		if (EL0.Max == EL1.Max)
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
if (EL0.Exact == EL1.Exact)		if (EL0.Exact == EL1.Exact)
BECount = EL0.Exact;		BECount = EL0.Exact;
}		}

return ExitLimit(BECount, MaxBECount);		SCEVPredicateSet NP;
		NP.add(&EL0.Pred);
		NP.add(&EL1.Pred);
		return ExitLimit(BECount, MaxBECount, NP);
}		}
}		}

// With an icmp, it may be feasible to compute an exact backedge-taken count.		// With an icmp, it may be feasible to compute an exact backedge-taken count.
// Proceed to the next level to examine the icmp.		// Proceed to the next level to examine the icmp.
if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond))		if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond))
return ComputeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, ControlsExit);		return ComputeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, ControlsExit);

Show All 12 Lines	ScalarEvolution::ComputeExitLimitFromCond(const Loop *L,

// If it's not an integer or pointer comparison then compute it the hard way.		// If it's not an integer or pointer comparison then compute it the hard way.
return ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));		return ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));
}		}

/// ComputeExitLimitFromICmp - Compute the number of times the		/// ComputeExitLimitFromICmp - Compute the number of times the
/// backedge of the specified loop will execute if its exit condition		/// backedge of the specified loop will execute if its exit condition
/// were a conditional branch of the ICmpInst ExitCond, TBB, and FBB.		/// were a conditional branch of the ICmpInst ExitCond, TBB, and FBB.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit ScalarEvolution::ComputeExitLimitFromICmp(
ScalarEvolution::ComputeExitLimitFromICmp(const Loop *L,		const Loop *L, ICmpInst *ExitCond, BasicBlock *TBB, BasicBlock *FBB,
ICmpInst *ExitCond,		bool ControlsExit, bool UseAssumptions) {
BasicBlock *TBB,
BasicBlock *FBB,
bool ControlsExit) {

// If the condition was exit on true, convert the condition to exit on false		// If the condition was exit on true, convert the condition to exit on false
ICmpInst::Predicate Cond;		ICmpInst::Predicate Cond;
if (!L->contains(FBB))		if (!L->contains(FBB))
Cond = ExitCond->getPredicate();		Cond = ExitCond->getPredicate();
else		else
Cond = ExitCond->getInversePredicate();		Cond = ExitCond->getInversePredicate();

Show All 33 Lines	if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(LHS))
ConstantRange CompRange(		ConstantRange CompRange(
ICmpInst::makeConstantRange(Cond, RHSC->getValue()->getValue()));		ICmpInst::makeConstantRange(Cond, RHSC->getValue()->getValue()));

const SCEV *Ret = AddRec->getNumIterationsInRange(CompRange, *this);		const SCEV *Ret = AddRec->getNumIterationsInRange(CompRange, *this);
if (!isa<SCEVCouldNotCompute>(Ret)) return Ret;		if (!isa<SCEVCouldNotCompute>(Ret)) return Ret;
}		}

switch (Cond) {		switch (Cond) {
case ICmpInst::ICMP_NE: { // while (X != Y)		case ICmpInst::ICMP_NE: { // while (X != Y)
// Convert to: while (X-Y != 0)		// Convert to: while (X-Y != 0)
ExitLimit EL = HowFarToZero(getMinusSCEV(LHS, RHS), L, ControlsExit);		ExitLimit EL =
if (EL.hasAnyInfo()) return EL;		HowFarToZero(getMinusSCEV(LHS, RHS), L, ControlsExit, UseAssumptions);
		if (EL.hasAnyInfo())
		return EL;
break;		break;
}		}
case ICmpInst::ICMP_EQ: { // while (X == Y)		case ICmpInst::ICMP_EQ: { // while (X == Y)
// Convert to: while (X-Y == 0)		// Convert to: while (X-Y == 0)
ExitLimit EL = HowFarToNonZero(getMinusSCEV(LHS, RHS), L);		ExitLimit EL = HowFarToNonZero(getMinusSCEV(LHS, RHS), L, UseAssumptions);
if (EL.hasAnyInfo()) return EL;		if (EL.hasAnyInfo())
		return EL;
break;		break;
}		}
case ICmpInst::ICMP_SLT:		case ICmpInst::ICMP_SLT:
case ICmpInst::ICMP_ULT: { // while (X < Y)		case ICmpInst::ICMP_ULT: { // while (X < Y)
bool IsSigned = Cond == ICmpInst::ICMP_SLT;		bool IsSigned = Cond == ICmpInst::ICMP_SLT;
ExitLimit EL = HowManyLessThans(LHS, RHS, L, IsSigned, ControlsExit);		ExitLimit EL =
if (EL.hasAnyInfo()) return EL;		HowManyLessThans(LHS, RHS, L, IsSigned, ControlsExit, UseAssumptions);
		if (EL.hasAnyInfo())
		return EL;
break;		break;
}		}
case ICmpInst::ICMP_SGT:		case ICmpInst::ICMP_SGT:
case ICmpInst::ICMP_UGT: { // while (X > Y)		case ICmpInst::ICMP_UGT: { // while (X > Y)
bool IsSigned = Cond == ICmpInst::ICMP_SGT;		bool IsSigned = Cond == ICmpInst::ICMP_SGT;
ExitLimit EL = HowManyGreaterThans(LHS, RHS, L, IsSigned, ControlsExit);		ExitLimit EL = HowManyGreaterThans(LHS, RHS, L, IsSigned, ControlsExit,
if (EL.hasAnyInfo()) return EL;		UseAssumptions);
		if (EL.hasAnyInfo())
		return EL;
break;		break;
}		}
default:		default:
#if 0		#if 0
dbgs() << "ComputeBackedgeTakenCount ";		dbgs() << "ComputeBackedgeTakenCount ";
if (ExitCond->getOperand(0)->getType()->isUnsigned())		if (ExitCond->getOperand(0)->getType()->isUnsigned())
dbgs() << "[unsigned] ";		dbgs() << "[unsigned] ";
dbgs() << *LHS << " "		dbgs() << *LHS << " "
<< Instruction::getOpcodeName(Instruction::ICmp)		<< Instruction::getOpcodeName(Instruction::ICmp)
<< " " << *RHS << "\n";		<< " " << *RHS << "\n";
#endif		#endif
break;		break;
}		}
return ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));		ExitLimit EL = ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));

		if (EL.hasAnyInfo() || UseAssumptions)
		return EL;

		// We could not prove what the exit limit is without making
		// assumptions. Try to compute it using assumptions.
		return ComputeExitLimitFromICmp(L, ExitCond, TBB, FBB, ControlsExit, true);
}		}
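
The retry at the end of the function above follows a two-phase pattern: first try to compute the exit limit with no assumptions, and only if that yields nothing, try again while allowing assumptions. Below is a minimal standalone sketch of that pattern. Everything in it (ToyExitLimit, computeLimit, computeWithFallback, and the IVMayWrap flag) is hypothetical and only models the control flow, not the actual SCEV logic.

```cpp
#include <string>
#include <vector>

// Hypothetical sentinel standing in for getCouldNotCompute().
const long CouldNotCompute = -1;

// Toy model of an exit limit plus the assumptions it relies on.
struct ToyExitLimit {
  long Exact;
  std::vector<std::string> Assumptions;
  bool hasAnyInfo() const { return Exact != CouldNotCompute; }
};

// One analysis attempt. If the induction variable may wrap, the bound is
// only usable when we are allowed to assume that it does not.
ToyExitLimit computeLimit(bool IVMayWrap, long Bound, bool UseAssumptions) {
  ToyExitLimit EL;
  EL.Exact = CouldNotCompute;
  if (!IVMayWrap) {
    EL.Exact = Bound; // Provable without any assumption.
    return EL;
  }
  if (!UseAssumptions)
    return EL; // Nothing can be proven.
  EL.Exact = Bound;
  EL.Assumptions.push_back("induction variable does not wrap");
  return EL;
}

// Prefer an assumption-free answer; fall back to an assumed one.
ToyExitLimit computeWithFallback(bool IVMayWrap, long Bound) {
  ToyExitLimit EL = computeLimit(IVMayWrap, Bound, /*UseAssumptions=*/false);
  if (EL.hasAnyInfo())
    return EL;
  return computeLimit(IVMayWrap, Bound, /*UseAssumptions=*/true);
}
```

The ordering matters in this sketch: a result that needs no run-time check is always preferred, so assumptions are recorded only for loops that could not be analyzed otherwise.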

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::ComputeExitLimitFromSingleExitSwitch(const Loop *L,		ScalarEvolution::ComputeExitLimitFromSingleExitSwitch(const Loop *L,
SwitchInst *Switch,		SwitchInst *Switch,
BasicBlock *ExitingBlock,		BasicBlock *ExitingBlock,
bool ControlsExit) {		bool ControlsExit) {
assert(!L->contains(ExitingBlock) && "Not an exiting block!");		assert(!L->contains(ExitingBlock) && "Not an exiting block!");
▲ Show 20 Lines • Show All 872 Lines • ▼ Show 20 Lines

/// HowFarToZero - Return the number of times a backedge comparing the specified		/// HowFarToZero - Return the number of times a backedge comparing the specified
/// value to zero will execute. If not computable, return CouldNotCompute.		/// value to zero will execute. If not computable, return CouldNotCompute.
///		///
/// This is only used for loops with a "x != y" exit test. The exit condition is		/// This is only used for loops with a "x != y" exit test. The exit condition is
/// now expressed as a single expression, V = x-y. So the exit test is		/// now expressed as a single expression, V = x-y. So the exit test is
/// effectively V != 0. We know and take advantage of the fact that this		/// effectively V != 0. We know and take advantage of the fact that this
/// expression only being used in a comparison by zero context.		/// expression only being used in a comparison by zero context.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit ScalarEvolution::HowFarToZero(const SCEV *V,
ScalarEvolution::HowFarToZero(const SCEV *V, const Loop *L, bool ControlsExit) {		const Loop *L,
		bool ControlsExit,
		bool UseAssumptions) {
		SCEVPredicateSet P;

// If the value is a constant		// If the value is a constant
if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {		if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {
// If the value is already zero, the branch will execute zero times.		// If the value is already zero, the branch will execute zero times.
if (C->getValue()->isZero()) return C;		if (C->getValue()->isZero()) return C;
return getCouldNotCompute(); // Otherwise it will loop infinitely.		return getCouldNotCompute(); // Otherwise it will loop infinitely.
}		}

const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(V);		const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(V);
		if ((!AddRec) && UseAssumptions) {
		// Try to make this a chrec using runtime assumptions.
		//AssumptionResult R = removeOverflowsWithAssumptions(V, L, this);
		AssumptionResult R = getAddRecWithRTChecks(V, L);
		if (!R.Res)
		return getCouldNotCompute();
		if (R.Pred.isAlwaysFalse())
		return getCouldNotCompute();
		AddRec = dyn_cast<SCEVAddRecExpr>(R.Res);
		if (!AddRec)
		return getCouldNotCompute();
		P.add(&R.Pred);
		}

if (!AddRec || AddRec->getLoop() != L)		if (!AddRec || AddRec->getLoop() != L)
return getCouldNotCompute();		return getCouldNotCompute();

// If this is a quadratic (3-term) AddRec {L,+,M,+,N}, find the roots of		// If this is a quadratic (3-term) AddRec {L,+,M,+,N}, find the roots of
// the quadratic equation to solve it.		// the quadratic equation to solve it.
if (AddRec->isQuadratic() && AddRec->getType()->isIntegerTy()) {		if (AddRec->isQuadratic() && AddRec->getType()->isIntegerTy()) {
std::pair<const SCEV *,const SCEV *> Roots =		std::pair<const SCEV *,const SCEV *> Roots =
SolveQuadraticEquation(AddRec, *this);		SolveQuadraticEquation(AddRec, *this);
Show All 12 Lines	#endif
if (!CB->getZExtValue())		if (!CB->getZExtValue())
std::swap(R1, R2); // R1 is the minimum root now.		std::swap(R1, R2); // R1 is the minimum root now.

// We can only use this value if the chrec ends up with an exact zero		// We can only use this value if the chrec ends up with an exact zero
// value at this index. When solving for "X*X != 5", for example, we		// value at this index. When solving for "X*X != 5", for example, we
// should not accept a root of 2.		// should not accept a root of 2.
const SCEV *Val = AddRec->evaluateAtIteration(R1, *this);		const SCEV *Val = AddRec->evaluateAtIteration(R1, *this);
if (Val->isZero())		if (Val->isZero())
return R1; // We found a quadratic root!		return ExitLimit(R1, R1, P); // We found a quadratic root!
}		}
}		}
return getCouldNotCompute();		return getCouldNotCompute();
}		}

// Otherwise we can only handle this if it is affine.		// Otherwise we can only handle this if it is affine.
if (!AddRec->isAffine())		if (!AddRec->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();
Show All 38 Lines	if (StepC->getValue()->equalsInt(1) || StepC->getValue()->isAllOnesValue()) {
ConstantRange CR = getUnsignedRange(Start);		ConstantRange CR = getUnsignedRange(Start);
const SCEV *MaxBECount;		const SCEV *MaxBECount;
if (!CountDown && CR.getUnsignedMin().isMinValue())		if (!CountDown && CR.getUnsignedMin().isMinValue())
// When counting up, the worst starting value is 1, not 0.		// When counting up, the worst starting value is 1, not 0.
MaxBECount = CR.getUnsignedMax().isMinValue()		MaxBECount = CR.getUnsignedMax().isMinValue()
? getConstant(APInt::getMinValue(CR.getBitWidth()))		? getConstant(APInt::getMinValue(CR.getBitWidth()))
: getConstant(APInt::getMaxValue(CR.getBitWidth()));		: getConstant(APInt::getMaxValue(CR.getBitWidth()));
else		else
MaxBECount = getConstant(CountDown ? CR.getUnsignedMax()		MaxBECount =
: -CR.getUnsignedMin());		getConstant(CountDown ? CR.getUnsignedMax() : -CR.getUnsignedMin());
return ExitLimit(Distance, MaxBECount);		return ExitLimit(Distance, MaxBECount, P);
}		}

// As a special case, handle the instance where Step is a positive power of		// As a special case, handle the instance where Step is a positive power of
// two. In this case, determining whether Step divides Distance evenly can be		// two. In this case, determining whether Step divides Distance evenly can be
// done by counting and comparing the number of trailing zeros of Step and		// done by counting and comparing the number of trailing zeros of Step and
// Distance.		// Distance.
if (!CountDown) {		if (!CountDown) {
const APInt &StepV = StepC->getValue()->getValue();		const APInt &StepV = StepC->getValue()->getValue();
// StepV.isPowerOf2() returns true if StepV is an positive power of two. It		// StepV.isPowerOf2() returns true if StepV is an positive power of two. It
// also returns true if StepV is maximally negative (eg, INT_MIN), but that		// also returns true if StepV is maximally negative (eg, INT_MIN), but that
// case is not handled as this code is guarded by !CountDown.		// case is not handled as this code is guarded by !CountDown.
if (StepV.isPowerOf2() &&		if (StepV.isPowerOf2() &&
GetMinTrailingZeros(Distance) >= StepV.countTrailingZeros())		GetMinTrailingZeros(Distance) >= StepV.countTrailingZeros()) {
return getUDivExactExpr(Distance, Step);		const SCEV *E = getUDivExactExpr(Distance, Step);
		return ExitLimit(E, E, P);
		}
}		}

// If the condition controls loop exit (the loop exits only if the expression		// If the condition controls loop exit (the loop exits only if the expression
// is true) and the addition is no-wrap we can use unsigned divide to		// is true) and the addition is no-wrap we can use unsigned divide to
// compute the backedge count. In this case, the step may not divide the		// compute the backedge count. In this case, the step may not divide the
// distance, but we don't care because if the condition is "missed" the loop		// distance, but we don't care because if the condition is "missed" the loop
// will have undefined behavior due to wrapping.		// will have undefined behavior due to wrapping.
if (ControlsExit && AddRec->getNoWrapFlags(SCEV::FlagNW)) {		if (ControlsExit && AddRec->getNoWrapFlags(SCEV::FlagNW)) {
const SCEV *Exact =		const SCEV *Exact =
getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);		getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);
return ExitLimit(Exact, Exact);		return ExitLimit(Exact, Exact, P);
}		}

// Then, try to solve the above equation provided that Start is constant.		// Then, try to solve the above equation provided that Start is constant.
if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start))		if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start)) {
return SolveLinEquationWithOverflow(StepC->getValue()->getValue(),		const SCEV *E = SolveLinEquationWithOverflow(
-StartC->getValue()->getValue(),		StepC->getValue()->getValue(), -StartC->getValue()->getValue(), *this);
*this);		return ExitLimit(E, E, P);
		}
return getCouldNotCompute();		return getCouldNotCompute();
}		}

/// HowFarToNonZero - Return the number of times a backedge checking the		/// HowFarToNonZero - Return the number of times a backedge checking the
/// specified value for nonzero will execute. If not computable, return		/// specified value for nonzero will execute. If not computable, return
/// CouldNotCompute		/// CouldNotCompute
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowFarToNonZero(const SCEV *V, const Loop *L) {		ScalarEvolution::HowFarToNonZero(const SCEV *V, const Loop *L,
		bool UseAssumptions) {
// Loops that look like: while (X == 0) are very strange indeed. We don't		// Loops that look like: while (X == 0) are very strange indeed. We don't
// handle them yet except for the trivial case. This could be expanded in the		// handle them yet except for the trivial case. This could be expanded in the
// future as needed.		// future as needed.

// If the value is a constant, check to see if it is known to be non-zero		// If the value is a constant, check to see if it is known to be non-zero
// already. If so, the backedge will execute zero times.		// already. If so, the backedge will execute zero times.
if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {		if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {
if (!C->getValue()->isNullValue())		if (!C->getValue()->isNullValue())
▲ Show 20 Lines • Show All 1,179 Lines • ▼ Show 20 Lines
/// CouldNotCompute.		/// CouldNotCompute.
///		///
/// @param ControlsExit is true when the LHS < RHS condition directly controls		/// @param ControlsExit is true when the LHS < RHS condition directly controls
/// the branch (loops exits only if condition is true). In this case, we can use		/// the branch (loops exits only if condition is true). In this case, we can use
/// NoWrapFlags to skip overflow checks.		/// NoWrapFlags to skip overflow checks.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowManyLessThans(const SCEV *LHS, const SCEV *RHS,		ScalarEvolution::HowManyLessThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool IsSigned,		const Loop *L, bool IsSigned,
bool ControlsExit) {		bool ControlsExit, bool UseAssumptions) {
		SCEVPredicateSet P;

// We handle only IV < Invariant		// We handle only IV < Invariant
if (!isLoopInvariant(RHS, L))		if (!isLoopInvariant(RHS, L))
return getCouldNotCompute();		return getCouldNotCompute();

const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);		const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);

		if (!IV && UseAssumptions) {
		// Try to make this a chrec using runtime assumptions.
		AssumptionResult R = getAddRecWithRTChecks(LHS, L);
		if (!R.Res)
		return getCouldNotCompute();
		if (R.Pred.isAlwaysFalse())
		return getCouldNotCompute();
		IV = dyn_cast<SCEVAddRecExpr>(R.Res);
		if (!IV)
		return getCouldNotCompute();
		P.add(&R.Pred);
		}

// Avoid weird loops		// Avoid weird loops
if (!IV || IV->getLoop() != L || !IV->isAffine())		if (!IV || IV->getLoop() != L || !IV->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();

		// FIXME: we can assume NoWrap here if necessary and check at runtime.
bool NoWrap = ControlsExit &&		bool NoWrap = ControlsExit &&
IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);		IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);

const SCEV *Stride = IV->getStepRecurrence(*this);		const SCEV *Stride = IV->getStepRecurrence(*this);

// Avoid negative or zero stride values		// Avoid negative or zero stride values
if (!isKnownPositive(Stride))		if (!isKnownPositive(Stride))
return getCouldNotCompute();		return getCouldNotCompute();
▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else		else
MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),		MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),
getConstant(MinStride), false);		getConstant(MinStride), false);

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount);		return ExitLimit(BECount, MaxBECount, P);
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,		ScalarEvolution::HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool IsSigned,		const Loop *L, bool IsSigned,
bool ControlsExit) {		bool ControlsExit, bool UseAssumptions) {
		SCEVPredicateSet P;

// We handle only IV > Invariant		// We handle only IV > Invariant
if (!isLoopInvariant(RHS, L))		if (!isLoopInvariant(RHS, L))
return getCouldNotCompute();		return getCouldNotCompute();

const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);		const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);
		if (!IV && UseAssumptions) {
		// Try to make this a chrec using runtime assumptions.
		AssumptionResult R = getAddRecWithRTChecks(LHS, L);
		if (!R.Res)
		return getCouldNotCompute();
		if (R.Pred.isAlwaysFalse())
		return getCouldNotCompute();
		IV = dyn_cast<SCEVAddRecExpr>(R.Res);
		if (!IV)
		return getCouldNotCompute();
		P.add(&R.Pred);
		}

// Avoid weird loops		// Avoid weird loops
if (!IV || IV->getLoop() != L || !IV->isAffine())		if (!IV || IV->getLoop() != L || !IV->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();

bool NoWrap = ControlsExit &&		bool NoWrap = ControlsExit &&
IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);		IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);

▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else		else
MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),		MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),
getConstant(MinStride), false);		getConstant(MinStride), false);

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount);		return ExitLimit(BECount, MaxBECount, P);
}		}

/// getNumIterationsInRange - Return the number of iterations of this loop that		/// getNumIterationsInRange - Return the number of iterations of this loop that
/// produce values in the specified constant range. Another way of looking at		/// produce values in the specified constant range. Another way of looking at
/// this is that it returns the first iteration number where the value is not in		/// this is that it returns the first iteration number where the value is not in
/// the condition, thus computing the exit count. If the iteration count can't		/// the condition, thus computing the exit count. If the iteration count can't
/// be computed, an instance of SCEVCouldNotCompute is returned.		/// be computed, an instance of SCEVCouldNotCompute is returned.
const SCEV *SCEVAddRecExpr::getNumIterationsInRange(ConstantRange Range,		const SCEV *SCEVAddRecExpr::getNumIterationsInRange(ConstantRange Range,
▲ Show 20 Lines • Show All 598 Lines • ▼ Show 20 Lines
}		}

ScalarEvolution::SCEVCallbackVH::SCEVCallbackVH(Value *V, ScalarEvolution *se)		ScalarEvolution::SCEVCallbackVH::SCEVCallbackVH(Value *V, ScalarEvolution *se)
: CallbackVH(V), SE(se) {}		: CallbackVH(V), SE(se) {}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// ScalarEvolution Class Implementation		// ScalarEvolution Class Implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

hfinkel: Don't remove blank line.
ScalarEvolution::ScalarEvolution(Function &F, TargetLibraryInfo &TLI,		ScalarEvolution::ScalarEvolution(Function &F, TargetLibraryInfo &TLI,
AssumptionCache &AC, DominatorTree &DT,		AssumptionCache &AC, DominatorTree &DT,
LoopInfo &LI)		LoopInfo &LI)
: F(F), TLI(TLI), AC(AC), DT(DT), LI(LI),		: F(F), TLI(TLI), AC(AC), DT(DT), LI(LI),
CouldNotCompute(new SCEVCouldNotCompute()),		CouldNotCompute(new SCEVCouldNotCompute()),
WalkingBEDominatingConds(false), ValuesAtScopes(64), LoopDispositions(64),		WalkingBEDominatingConds(false), ValuesAtScopes(64), LoopDispositions(64),
BlockDispositions(64), FirstUnknown(nullptr) {}		BlockDispositions(64), FirstUnknown(nullptr) {}

▲ Show 20 Lines • Show All 508 Lines • ▼ Show 20 Lines

void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {		void ScalarEvolutionWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequiredTransitive<AssumptionCacheTracker>();		AU.addRequiredTransitive<AssumptionCacheTracker>();
AU.addRequiredTransitive<LoopInfoWrapperPass>();		AU.addRequiredTransitive<LoopInfoWrapperPass>();
AU.addRequiredTransitive<DominatorTreeWrapperPass>();		AU.addRequiredTransitive<DominatorTreeWrapperPass>();
AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();		AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
}		}

		static Value *generateOverflowCheck(const SCEVAddRecExpr *AR, Instruction *Loc,
		bool Signed, ScalarEvolution *SE,
		const DataLayout *DL, SCEVExpander &Exp) {
		Module *M = Loc->getParent()->getParent()->getParent();
		IRBuilder<> OFBuilder(Loc);
		Value *AddF, *MulF;
		if (Signed) {
		AddF = Intrinsic::getDeclaration(M, Intrinsic::sadd_with_overflow,
		AR->getType());
		MulF = Intrinsic::getDeclaration(M, Intrinsic::smul_with_overflow,
		AR->getType());
		} else {
		AddF = Intrinsic::getDeclaration(M, Intrinsic::uadd_with_overflow,
		AR->getType());
		MulF = Intrinsic::getDeclaration(M, Intrinsic::umul_with_overflow,
		AR->getType());
		}
		Value *Start;
		Value *Stride;

		SCEVPredicateSet MP;
		const SCEV *ExitCount = SE->getGuardedBackedgeTakenCount(AR->getLoop(), MP);

		unsigned DstBits = AR->getType()->getPrimitiveSizeInBits();
		unsigned SrcBits = ExitCount->getType()->getPrimitiveSizeInBits();

		if (SrcBits < DstBits) {
		// We need to extend
		if (Signed)
		ExitCount = SE->getNoopOrSignExtend(ExitCount, AR->getType());
		else
		ExitCount = SE->getNoopOrZeroExtend(ExitCount, AR->getType());
		}

		assert(ExitCount != SE->getCouldNotCompute() && "Invalid loop count");
		Value *TripCount = Exp.expandCodeFor(ExitCount, ExitCount->getType(), Loc);
		Value *TripCountCheck = nullptr;

		// We might need to truncate TripCount
		// If this is the case, we need to make sure that this is legal.
		if (SrcBits > DstBits) {
		APInt CmpMaxValue = Signed ? APInt::getSignedMaxValue(DstBits).sext(SrcBits)
		: APInt::getMaxValue(DstBits).zext(SrcBits);
		// The min value only makes sense for signed checks.

		ConstantInt *CTMax = ConstantInt::get(M->getContext(), CmpMaxValue);
		CmpInst::Predicate P = Signed ? ICmpInst::ICMP_SGT : ICmpInst::ICMP_UGT;
		TripCountCheck = OFBuilder.CreateICmp(P, TripCount, CTMax);

		if (Signed) {
		APInt CmpMinValue = APInt::getSignedMinValue(DstBits).sext(SrcBits);
		ConstantInt *CTMin = ConstantInt::get(M->getContext(), CmpMinValue);
		Value *MinCheck =
		OFBuilder.CreateICmp(ICmpInst::ICMP_SLT, TripCount, CTMin);
		TripCountCheck = OFBuilder.CreateOr(TripCountCheck, MinCheck);
		}

		TripCount = OFBuilder.CreateTrunc(TripCount, AR->getType());
		}

		// We need to truncate or extend TripCount to the type used by the SCEV
		// Extension is not a problem.
		Start = Exp.expandCodeFor(AR->getStart(), AR->getStart()->getType(), Loc);

		// This is an affine expression
		Stride =
		Exp.expandCodeFor(AR->getOperand(1), AR->getOperand(1)->getType(), Loc);

		CallInst *Mul = OFBuilder.CreateCall(MulF, {Stride, TripCount}, "mul");
		Value *MulV = OFBuilder.CreateExtractValue(Mul, 0, "mul.result");
		Value *OfMul = OFBuilder.CreateExtractValue(Mul, 1, "mul.overflow");
		CallInst *Add = OFBuilder.CreateCall(AddF, {MulV, Start}, "uadd");
		Value *OfAdd = OFBuilder.CreateExtractValue(Add, 1, "add.overflow");
		Value *Overflow = OFBuilder.CreateOr(OfMul, OfAdd, "overflow");

		if (TripCountCheck) {
		Overflow = OFBuilder.CreateOr(Overflow, TripCountCheck);
		}

		return Overflow;
		}
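For readers following the patch: the value returned by generateOverflowCheck can be read as asking whether Start + Stride * TripCount wraps in the type of the AddRec. The sketch below models only the unsigned 32-bit case in plain C++, with the GCC/Clang overflow builtins standing in for the llvm.umul.with.overflow and llvm.uadd.with.overflow intrinsics. The function name and the fixed width are assumptions made for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the unsigned check: returns true when
// Start + Stride * TripCount does not fit in 32 bits.
// Requires GCC or Clang for the __builtin_*_overflow builtins.
bool addRecMayOverflowU32(uint32_t Start, uint32_t Stride,
                          uint32_t TripCount) {
  uint32_t Mul = 0, Sum = 0;
  // Mirrors the "mul" call: the product and its overflow bit.
  bool OfMul = __builtin_mul_overflow(Stride, TripCount, &Mul);
  // Mirrors the "add" call: only the overflow bit is consumed.
  bool OfAdd = __builtin_add_overflow(Mul, Start, &Sum);
  return OfMul || OfAdd;
}
```

As in the patch, the two overflow bits are OR-ed together, so a wrap in either the multiply or the add makes the guard fail and sends execution to the unmodified loop.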

		static Instruction *getFirstInst(Instruction *FirstInst, Value *V,
		Instruction *Loc) {
		if (FirstInst)
		return FirstInst;
		if (Instruction *I = dyn_cast<Instruction>(V))
		return I->getParent() == Loc->getParent() ? I : nullptr;
		return nullptr;
		}

		/// Removes overflows and records the assumptions that were made.
		struct SCEVOverflowRewriter
		: public SCEVVisitor<SCEVOverflowRewriter, const SCEV *> {
		public:
		SCEVPredicateSet &P;

		static const SCEV *rewrite(const SCEV *Scev, const Loop *L,
		ScalarEvolution &SE, SCEVPredicateSet &A,
		bool Assume) {
		SCEVOverflowRewriter Rewriter(L, SE, A, Assume);
		return Rewriter.visit(Scev);
		}

		bool addOverflowAssumption(const SCEV *S, SCEV::NoWrapFlags AddedFlags) {
		SCEVAddRecOverflowPredicate A(S, AddedFlags);
		if (!MakeAssumptions) {
		// Check if we've already made this assumption.
		if (P.contains(&A))
		return true;
		return false;
		}
		P.add(&A);
		return true;
		}

		SCEVOverflowRewriter(const Loop *L, ScalarEvolution &S, SCEVPredicateSet &P,
		bool MakeAssumptions)
		: P(P), SE(S), L(L), MakeAssumptions(MakeAssumptions) {}

		const SCEV *visitConstant(const SCEVConstant *Constant) { return Constant; }

		const SCEV *visitTruncateExpr(const SCEVTruncateExpr *Expr) {
		const SCEV *Operand = visit(Expr->getOperand());
		return SE.getTruncateExpr(Operand, Expr->getType());
		}

		// We should only need to add assumptions when encountering the
		// sext/zext expressions, as other expressions will fold into
		// the AddRecExprs.
		const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
		const SCEV *Operand = visit(Expr->getOperand());
		const SCEVAddRecExpr *AR = dyn_cast<const SCEVAddRecExpr>(Operand);
		if (AR && AR->getLoop() == L && AR->isAffine()) {
		// This couldn't be folded because the operand didn't have the nuw
		// flag. Add the nuw flag as an assumption that we could make.
		const SCEV *Step = AR->getStepRecurrence(SE);
		Type *Ty = Expr->getType();
		// We would also like to add the NUW flag here. We add the NUW
		// flag to the assumption set but cannot set it on the expression
		// because it would pollute ScalarEvolution's cache.
		if (addOverflowAssumption(AR, SCEV::FlagNUW))
		return SE.getAddRecExpr(SE.getZeroExtendExpr(AR->getStart(), Ty),
		SE.getZeroExtendExpr(Step, Ty), L,
		AR->getNoWrapFlags());
		}
		return SE.getZeroExtendExpr(Operand, Expr->getType());
		}

		const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
		const SCEV *Operand = visit(Expr->getOperand());
		const SCEVAddRecExpr *AR = dyn_cast<const SCEVAddRecExpr>(Operand);
		if (AR && AR->getLoop() == L && AR->isAffine()) {
		// This couldn't be folded because the operand didn't have the nsw
		// flag. Add the nsw flag as an assumption that we could make.
		const SCEV *Step = AR->getStepRecurrence(SE);
		Type *Ty = Expr->getType();
		// We would also like to add the NUW flag here. We add the NUW
		// flag to the assumption set but cannot set it on the expression
		// because it would pollute ScalarEvolution's cache.
		if (addOverflowAssumption(AR, SCEV::FlagNSW))
		return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),
		SE.getSignExtendExpr(Step, Ty), L,
		AR->getNoWrapFlags());
		}
		return SE.getSignExtendExpr(Operand, Expr->getType());
		}

		const SCEV *visitAddExpr(const SCEVAddExpr *Expr) {
		SmallVector<const SCEV *, 2> Operands;
		for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
		Operands.push_back(visit(Expr->getOperand(i)));
		return SE.getAddExpr(Operands);
		}

		const SCEV *visitMulExpr(const SCEVMulExpr *Expr) {
		SmallVector<const SCEV *, 2> Operands;
		for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
		Operands.push_back(visit(Expr->getOperand(i)));
		return SE.getMulExpr(Operands);
		}

		const SCEV *visitUDivExpr(const SCEVUDivExpr *Expr) {
		return SE.getUDivExpr(visit(Expr->getLHS()), visit(Expr->getRHS()));
		}

		const SCEV *visitAddRecExpr(const SCEVAddRecExpr *Expr) {
		SmallVector<const SCEV *, 2> Operands;
		for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
		Operands.push_back(visit(Expr->getOperand(i)));

		const Loop *L = Expr->getLoop();
		return SE.getAddRecExpr(Operands, L, Expr->getNoWrapFlags());
		}

		const SCEV *visitSMaxExpr(const SCEVSMaxExpr *Expr) {
		SmallVector<const SCEV *, 2> Operands;
		for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
		Operands.push_back(visit(Expr->getOperand(i)));
		return SE.getSMaxExpr(Operands);
		}

		const SCEV *visitUMaxExpr(const SCEVUMaxExpr *Expr) {
		SmallVector<const SCEV *, 2> Operands;
		for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
		Operands.push_back(visit(Expr->getOperand(i)));
		return SE.getUMaxExpr(Operands);
		}

		const SCEV *visitUnknown(const SCEVUnknown *Expr) { return Expr; }

		hfinkel: Given that the SCEVs are uniqued, you could at least eliminate those that are structurally identical easily.
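A rough sketch of the suggestion in the comment above, assuming only that uniqued expressions compare equal by pointer. The `Expr` type and the `dedupByPointer` helper are hypothetical stand-ins rather than LLVM classes; the point is that dropping structurally identical entries reduces to a pointer-set lookup.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a uniqued SCEV node: two structurally identical
// expressions are assumed to share one object, hence one address.
struct Expr {
  int Id;
};

// Keeps the first occurrence of each pointer and preserves input order.
std::vector<const Expr *> dedupByPointer(const std::vector<const Expr *> &In) {
  std::unordered_set<const Expr *> Seen;
  std::vector<const Expr *> Out;
  for (const Expr *E : In)
    if (Seen.insert(E).second) // second is true only for a new pointer
      Out.push_back(E);
  return Out;
}
```

No structural comparison is needed under that assumption, which is what makes the elimination cheap.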
		const SCEV *visitCouldNotCompute(const SCEVCouldNotCompute *Expr) {
		return Expr;
		}

		private:
		ScalarEvolution &SE;
		const Loop *L;
		bool MakeAssumptions;
		};

		const SCEV *
		ScalarEvolution::rewriteUsingPredicate(const SCEV *Scev, const Loop *L,
		SCEVPredicateSet &Pred) {
		return SCEVOverflowRewriter::rewrite(Scev, L, *this, Pred, false);
		}

		AssumptionResult
		ScalarEvolution::getAddRecWithRTChecks(const SCEV *S, const Loop *L) {
		AssumptionResult Result(S);
		const SCEV *Ret = SCEVOverflowRewriter::rewrite(S, L, *this,
		Result.Pred, true);
		if (dyn_cast<const SCEVConstant>(Ret) ||
		dyn_cast<const SCEVAddRecExpr>(Ret)) {
		Result.Res = Ret;
		}
		return Result;
		}

		//// SCEV predicates
		SCEVPredicate::SCEVPredicate(unsigned short Type) : SCEVPredicateType(Type) {}

		SCEVAddRecOverflowPredicate::SCEVAddRecOverflowPredicate(
		const SCEV *AR, SCEV::NoWrapFlags Flags)
		: SCEVPredicate(pAddRecOverflow), AR(AR), Flags(Flags) {
		assert(dyn_cast<const SCEVAddRecExpr>(AR) &&
		"Can only create a"
		"SCEVAddRecOverflowPredicate using a SCEVAddRecExpr");
		}

		bool SCEVAddRecOverflowPredicate::contains(const SCEVPredicate *N) const {
		const SCEVAddRecOverflowPredicate *OP =
		dyn_cast<const SCEVAddRecOverflowPredicate>(N);

		if (!OP)
		return false;

		if (OP->getExpr() != AR)
		return false;

		if ((OP->getFlags() & getFlags()) != OP->getFlags())
		return false;

		return true;
		}
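The flag comparison in `contains` is a bitmask subset test: the other predicate is covered only if every flag it asks for is already set here. A minimal sketch of that test follows. The enum values are hypothetical and chosen for readability; the real `SCEV::NoWrapFlags` encoding may differ.

```cpp
#include <cassert>

// Hypothetical flag values, not the actual SCEV::NoWrapFlags encoding.
enum WrapFlags : unsigned {
  FlagAnyWrap = 0,
  FlagNUW = 1u << 0,
  FlagNSW = 1u << 1
};

// True when every flag requested in Other is already present in Have.
bool coversFlags(unsigned Have, unsigned Other) {
  return (Other & Have) == Other;
}
```

An empty request (`FlagAnyWrap`) is always covered, while a request for both NUW and NSW is not covered by a predicate that only assumes NUW.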

		bool SCEVAddRecOverflowPredicate::isAlwaysTrue() const {
		const SCEVAddRecExpr *A = static_cast<const SCEVAddRecExpr *>(AR);
		return ScalarEvolution::clearFlags(Flags, A->getNoWrapFlags()) ==
		SCEV::FlagAnyWrap;
		}

		bool SCEVAddRecOverflowPredicate::isAlwaysFalse() const { return false; }

		Value *SCEVAddRecOverflowPredicate::generateCheck(Instruction *Loc,
		ScalarEvolution *SE,
		const DataLayout *DL,
		SCEVExpander &Exp) {
		IRBuilder<> OFBuilder(Loc);
		const SCEVAddRecExpr *A = static_cast<const SCEVAddRecExpr *>(AR);
		Value *OverflowRuntimeCheck = nullptr;

		if (Flags & SCEV::FlagNUW) {
		// Add a check for NUW
		Value *Overflow = generateOverflowCheck(A, Loc, false, SE, DL, Exp);

		if (!OverflowRuntimeCheck)
		OverflowRuntimeCheck = Overflow;
		else
		OverflowRuntimeCheck = OFBuilder.CreateOr(OverflowRuntimeCheck, Overflow);
		}
		if (Flags & SCEV::FlagNSW) {
		// Add a check for NSW
		Value *Overflow = generateOverflowCheck(A, Loc, true, SE, DL, Exp);

		if (!OverflowRuntimeCheck)
		OverflowRuntimeCheck = Overflow;
		else
		OverflowRuntimeCheck = OFBuilder.CreateOr(OverflowRuntimeCheck, Overflow);
		}
		return OverflowRuntimeCheck;
		}

		void SCEVAddRecOverflowPredicate::print(raw_ostream &OS, unsigned Depth) const {
		OS.indent(Depth) << *getExpr() << " Added Flags: ";
		if (SCEV::FlagNUW & getFlags())
		OS << "<nuw>";
		if (SCEV::FlagNSW & getFlags())
		OS << "<nsw>";
		OS << "\n";
		}

		SCEVPredicateSet::SCEVPredicateSet() : SCEVPredicate(pSet), Never(false) {}

		SCEVPredicateSet::SCEVPredicateSet(const SCEVPredicateSet &Old)
		: SCEVPredicateSet() {
		this->Never = Old.Never;
		if (Never)
		return;
		AddRecOverflows = Old.AddRecOverflows;
		for (unsigned II = 0; II < AddRecOverflows.size(); ++II) {
		Preds.push_back(&AddRecOverflows[II]);
		}
		}

		bool SCEVPredicateSet::isAlwaysTrue() const {
		if (Never)
		return false;

		for (auto II = Preds.begin(), EE = Preds.end(); II != EE; ++II) {
		const SCEVPredicate *OA = *II;
		if (!OA->isAlwaysTrue())
		return false;
		}

		return true;
		}

		bool SCEVPredicateSet::isAlwaysFalse() const { return Never; }

		std::pair<Instruction *, Instruction *>
		SCEVPredicateSet::generateGuardCond(Instruction *Loc, ScalarEvolution *SE) {
		Instruction *tnullptr = nullptr;

		assert(!Never && "Cannot generate a runtime check on "
		"a predicate with the Never flag set");

		if (isAlwaysTrue())
		return std::pair<Instruction *, Instruction *>(tnullptr, tnullptr);

		IRBuilder<> OFBuilder(Loc);
		Instruction *FirstInst = nullptr;
		Module *M = Loc->getParent()->getParent()->getParent();
		const DataLayout &DL = M->getDataLayout();
		SCEVExpander Exp(*SE, DL, "start");

		Value *Check = generateCheck(Loc, SE, &DL, Exp);

		if (!Check)
		return std::make_pair(nullptr, nullptr);

		Instruction *CheckInst =
		BinaryOperator::CreateOr(Check, ConstantInt::getFalse(M->getContext()));
		OFBuilder.Insert(CheckInst, "scev.check");

		FirstInst = getFirstInst(FirstInst, CheckInst, Loc);
		return std::make_pair(FirstInst, CheckInst);
		}

		Value *SCEVPredicateSet::generateCheck(Instruction *Loc, ScalarEvolution *SE,
		const DataLayout *DL,
		SCEVExpander &Exp) {

		IRBuilder<> OFBuilder(Loc);
		Value *AllCheck = nullptr;

		// Loop over all checks in this set.
		for (auto II = Preds.begin(), EE = Preds.end(); II != EE; ++II) {
		SCEVPredicate *OA = *II;

		if (OA->isAlwaysTrue())
		continue;

		Value *CheckResult = OA->generateCheck(Loc, SE, DL, Exp);

		if (!AllCheck)
		AllCheck = CheckResult;
		else
		AllCheck = OFBuilder.CreateOr(AllCheck, CheckResult);
		}

		return AllCheck;
		}

		bool SCEVPredicateSet::contains(const SCEVPredicate *N) const {
		if (Never)
		return false;

		if (const SCEVPredicateSet *Set = dyn_cast<const SCEVPredicateSet>(N)) {
		for (auto II = Set->Preds.begin(), EE = Set->Preds.end(); II != EE; ++II) {
		if (!contains(*II))
		return false;
		}
		return true;
		}
		for (auto II = Preds.begin(), EE = Preds.end(); II != EE; ++II) {
		if ((*II)->contains(N))
		return true;
		}
		return false;
		}

		void SCEVPredicateSet::print(raw_ostream &OS, unsigned Depth) const {
		for (auto II = Preds.begin(), EE = Preds.end(); II != EE; ++II)
		(*II)->print(OS, Depth);
		}

		void SCEVPredicateSet::add(const SCEVPredicate *N) {
		if (Preds.size() > OverflowCheckThreshold || N->isAlwaysFalse()) {
		Never = true;
		return;
		}

		if (const SCEVAddRecOverflowPredicate *OP =
		dyn_cast<const SCEVAddRecOverflowPredicate>(N)) {

		const SCEVAddRecExpr *AR =
		static_cast<const SCEVAddRecExpr *>(OP->getExpr());
		for (unsigned II = 0, EE = AddRecOverflows.size(); II < EE; ++II) {
		if (AddRecOverflows[II].getExpr() == AR) {
		AddRecOverflows[II].addFlags(OP->getFlags());
		return;
		}
		}
		AddRecOverflows.push_back(*OP);
		Preds.push_back(&AddRecOverflows.back());

		} else if (const SCEVPredicateSet *Set =
		dyn_cast<const SCEVPredicateSet>(N)) {
		for (auto II = Set->Preds.begin(), EE = Set->Preds.end(); II != EE; ++II) {
		add(*II);
		}
		} else
		llvm_unreachable("Unknown SCEV predicate type!");
		}
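The overflow-predicate branch of `SCEVPredicateSet::add` keeps one entry per expression and ORs new flags into it. The sketch below shows just that merge step with hypothetical types (`OverflowPred`, `addPred`). It leaves out the threshold, the `Never` state, and nested sets, so it is a simplification of the patch, not a drop-in replacement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified predicate: an expression identity plus flag bits.
struct OverflowPred {
  const void *Expr;
  unsigned Flags;
};

// Adds a predicate; if the expression is already present, its flags are
// OR-ed into the existing entry instead of appending a duplicate.
void addPred(std::vector<OverflowPred> &Set, const void *Expr,
             unsigned Flags) {
  for (OverflowPred &P : Set) {
    if (P.Expr == Expr) {
      P.Flags |= Flags;
      return;
    }
  }
  Set.push_back({Expr, Flags});
}
```

Merging this way means each expression contributes at most one runtime check, however many times an assumption about it is recorded.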

		void SCEVAddRecOverflowPredicate::addFlags(SCEV::NoWrapFlags AddedFlags) {
		Flags = ScalarEvolution::setFlags(Flags, AddedFlags);
		}

lib/Transforms/Scalar/LoopDistribute.cpp

Show First 20 Lines • Show All 589 Lines • ▼ Show 20 Lines	public:
LoopDistribute() : FunctionPass(ID) {		LoopDistribute() : FunctionPass(ID) {
initializeLoopDistributePass(*PassRegistry::getPassRegistry());		initializeLoopDistributePass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override {		bool runOnFunction(Function &F) override {
LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();		LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
LAA = &getAnalysis<LoopAccessAnalysis>();		LAA = &getAnalysis<LoopAccessAnalysis>();
DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();		DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
		SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();

// Build up a worklist of inner-loops to vectorize. This is necessary as the		// Build up a worklist of inner-loops to vectorize. This is necessary as the
// act of distributing a loop creates new loops and can invalidate iterators		// act of distributing a loop creates new loops and can invalidate iterators
// across the loops.		// across the loops.
SmallVector<Loop *, 8> Worklist;		SmallVector<Loop *, 8> Worklist;

for (Loop *TopLevelLoop : *LI)		for (Loop *TopLevelLoop : *LI)
for (Loop *L : depth_first(TopLevelLoop))		for (Loop *L : depth_first(TopLevelLoop))
// We only handle inner-most loops.		// We only handle inner-most loops.
if (L->empty())		if (L->empty())
Worklist.push_back(L);		Worklist.push_back(L);

// Now walk the identified inner loops.		// Now walk the identified inner loops.
bool Changed = false;		bool Changed = false;
for (Loop *L : Worklist)		for (Loop *L : Worklist)
Changed |= processLoop(L);		Changed |= processLoop(L);

// Process each loop nest in the function.		// Process each loop nest in the function.
return Changed;		return Changed;
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
		AU.addRequired<ScalarEvolutionWrapperPass>();
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();
AU.addPreserved<LoopInfoWrapperPass>();		AU.addPreserved<LoopInfoWrapperPass>();
AU.addRequired<LoopAccessAnalysis>();		AU.addRequired<LoopAccessAnalysis>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();		AU.addPreserved<DominatorTreeWrapperPass>();
}		}

static char ID;		static char ID;
▲ Show 20 Lines • Show All 157 Lines • ▼ Show 20 Lines	bool processLoop(Loop *L) {

// If we need run-time checks to disambiguate pointers are run-time, version		// If we need run-time checks to disambiguate pointers are run-time, version
// the loop now.		// the loop now.
auto PtrToPartition = Partitions.computePartitionSetForPointers(LAI);		auto PtrToPartition = Partitions.computePartitionSetForPointers(LAI);
const auto *RtPtrChecking = LAI.getRuntimePointerChecking();		const auto *RtPtrChecking = LAI.getRuntimePointerChecking();
const auto &AllChecks = RtPtrChecking->getChecks();		const auto &AllChecks = RtPtrChecking->getChecks();
auto Checks = includeOnlyCrossPartitionChecks(AllChecks, PtrToPartition,		auto Checks = includeOnlyCrossPartitionChecks(AllChecks, PtrToPartition,
RtPtrChecking);		RtPtrChecking);
if (!Checks.empty()) {		if ((!LAI.Pred->isAlwaysTrue()) || !Checks.empty()) {
DEBUG(dbgs() << "\nPointers:\n");		DEBUG(dbgs() << "\nPointers:\n");
DEBUG(LAI.getRuntimePointerChecking()->printChecks(dbgs(), Checks));		DEBUG(LAI.getRuntimePointerChecking()->printChecks(dbgs(), Checks));
LoopVersioning LVer(std::move(Checks), LAI, L, LI, DT);		LoopVersioning LVer(std::move(Checks), LAI, L, LI, DT, SE);
LVer.versionLoop();		LVer.versionLoop();
LVer.addPHINodes(DefsUsedOutside);		LVer.addPHINodes(DefsUsedOutside);
}		}

// Create identical copies of the original loop for each partition and hook		// Create identical copies of the original loop for each partition and hook
// them up sequentially.		// them up sequentially.
Partitions.cloneLoops(this);		Partitions.cloneLoops(this);

Show All 11 Lines	bool processLoop(Loop *L) {
++NumLoopsDistributed;		++NumLoopsDistributed;
return true;		return true;
}		}

// Analyses used.		// Analyses used.
LoopInfo *LI;		LoopInfo *LI;
LoopAccessAnalysis *LAA;		LoopAccessAnalysis *LAA;
DominatorTree *DT;		DominatorTree *DT;
		ScalarEvolution *SE;
};		};
} // anonymous namespace		} // anonymous namespace

char LoopDistribute::ID;		char LoopDistribute::ID;
static const char ldist_name[] = "Loop Distribition";		static const char ldist_name[] = "Loop Distribition";

INITIALIZE_PASS_BEGIN(LoopDistribute, LDIST_NAME, ldist_name, false, false)		INITIALIZE_PASS_BEGIN(LoopDistribute, LDIST_NAME, ldist_name, false, false)
INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(LoopAccessAnalysis)		INITIALIZE_PASS_DEPENDENCY(LoopAccessAnalysis)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)		INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
INITIALIZE_PASS_END(LoopDistribute, LDIST_NAME, ldist_name, false, false)		INITIALIZE_PASS_END(LoopDistribute, LDIST_NAME, ldist_name, false, false)

namespace llvm {		namespace llvm {
FunctionPass *createLoopDistributePass() { return new LoopDistribute(); }		FunctionPass *createLoopDistributePass() { return new LoopDistribute(); }
}		}

lib/Transforms/Utils/LoopVersioning.cpp

	#include "llvm/Transforms/Utils/BasicBlockUtils.h"			#include "llvm/Transforms/Utils/BasicBlockUtils.h"
	#include "llvm/Transforms/Utils/Cloning.h"			#include "llvm/Transforms/Utils/Cloning.h"
	#include "llvm/Transforms/Utils/LoopVersioning.h"			#include "llvm/Transforms/Utils/LoopVersioning.h"

	using namespace llvm;			using namespace llvm;

	LoopVersioning::LoopVersioning(			LoopVersioning::LoopVersioning(
	SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks,			SmallVector<RuntimePointerChecking::PointerCheck, 4> Checks,
	const LoopAccessInfo &LAI, Loop *L, LoopInfo *LI, DominatorTree *DT)			const LoopAccessInfo &LAI, Loop *L, LoopInfo *LI, DominatorTree *DT,
				ScalarEvolution *SE)
	: VersionedLoop(L), NonVersionedLoop(nullptr), Checks(std::move(Checks)),			: VersionedLoop(L), NonVersionedLoop(nullptr), Checks(std::move(Checks)),
	LAI(LAI), LI(LI), DT(DT) {			LAI(LAI), LI(LI), DT(DT), SE(SE) {
	assert(L->getExitBlock() && "No single exit block");			assert(L->getExitBlock() && "No single exit block");
	assert(L->getLoopPreheader() && "No preheader");			assert(L->getLoopPreheader() && "No preheader");
	}			}

	LoopVersioning::LoopVersioning(const LoopAccessInfo &LAInfo, Loop *L,			LoopVersioning::LoopVersioning(const LoopAccessInfo &LAInfo, Loop *L,
	LoopInfo *LI, DominatorTree *DT)			LoopInfo *LI, DominatorTree *DT,
				ScalarEvolution *SE)
	: VersionedLoop(L), NonVersionedLoop(nullptr),			: VersionedLoop(L), NonVersionedLoop(nullptr),
	Checks(LAInfo.getRuntimePointerChecking()->getChecks()), LAI(LAInfo),			Checks(LAInfo.getRuntimePointerChecking()->getChecks()), LAI(LAInfo),
	LI(LI), DT(DT) {			LI(LI), DT(DT), SE(SE) {
	assert(L->getExitBlock() && "No single exit block");			assert(L->getExitBlock() && "No single exit block");
	assert(L->getLoopPreheader() && "No preheader");			assert(L->getLoopPreheader() && "No preheader");
	}			}

	void LoopVersioning::versionLoop() {			void LoopVersioning::versionLoop() {
	Instruction *FirstCheckInst;			Instruction *FirstCheckInst;
	Instruction *MemRuntimeCheck;			Instruction *MemRuntimeCheck;
				Instruction *OverflowRuntimeCheck;
				Instruction *RuntimeCheck = nullptr;

	// Add the memcheck in the original preheader (this is empty initially).			// Add the memcheck in the original preheader (this is empty initially).
	BasicBlock *MemCheckBB = VersionedLoop->getLoopPreheader();			BasicBlock *MemCheckBB = VersionedLoop->getLoopPreheader();
	std::tie(FirstCheckInst, MemRuntimeCheck) =			std::tie(FirstCheckInst, MemRuntimeCheck) =
	LAI.addRuntimeChecks(MemCheckBB->getTerminator(), Checks);			LAI.addRuntimeChecks(MemCheckBB->getTerminator(), Checks);
	assert(MemRuntimeCheck && "called even though needsAnyChecking = false");			assert(MemRuntimeCheck && "called even though needsAnyChecking = false");
				std::tie(FirstCheckInst, OverflowRuntimeCheck) =
				LAI.Pred->generateGuardCond(MemCheckBB->getTerminator(), SE);

				if (MemRuntimeCheck && OverflowRuntimeCheck) {
				RuntimeCheck = BinaryOperator::Create(Instruction::Or, MemRuntimeCheck,
				OverflowRuntimeCheck, "ldist.safe");
				RuntimeCheck->insertBefore(MemCheckBB->getTerminator());
				} else
				RuntimeCheck = MemRuntimeCheck ? MemRuntimeCheck : OverflowRuntimeCheck;

				assert(RuntimeCheck && "called even though we don't need "
				"any runtime checks");

	// Rename the block to make the IR more readable.			// Rename the block to make the IR more readable.
	MemCheckBB->setName(VersionedLoop->getHeader()->getName() + ".lver.memcheck");			MemCheckBB->setName(VersionedLoop->getHeader()->getName() + ".lver.memcheck");

	// Create empty preheader for the loop (and after cloning for the			// Create empty preheader for the loop (and after cloning for the
	// non-versioned loop).			// non-versioned loop).
	BasicBlock *PH = SplitBlock(MemCheckBB, MemCheckBB->getTerminator(), DT, LI);			BasicBlock *PH = SplitBlock(MemCheckBB, MemCheckBB->getTerminator(), DT, LI);
	PH->setName(VersionedLoop->getHeader()->getName() + ".ph");			PH->setName(VersionedLoop->getHeader()->getName() + ".ph");

	// Clone the loop including the preheader.			// Clone the loop including the preheader.
	//			//
	// FIXME: This does not currently preserve SimplifyLoop because the exit			// FIXME: This does not currently preserve SimplifyLoop because the exit
	// block is a join between the two loops.			// block is a join between the two loops.
	SmallVector<BasicBlock *, 8> NonVersionedLoopBlocks;			SmallVector<BasicBlock *, 8> NonVersionedLoopBlocks;
	NonVersionedLoop =			NonVersionedLoop =
	cloneLoopWithPreheader(PH, MemCheckBB, VersionedLoop, VMap, ".lver.orig",			cloneLoopWithPreheader(PH, MemCheckBB, VersionedLoop, VMap, ".lver.orig",
	LI, DT, NonVersionedLoopBlocks);			LI, DT, NonVersionedLoopBlocks);
	remapInstructionsInBlocks(NonVersionedLoopBlocks, VMap);			remapInstructionsInBlocks(NonVersionedLoopBlocks, VMap);

	// Insert the conditional branch based on the result of the memchecks.			// Insert the conditional branch based on the result of the memchecks.
	Instruction *OrigTerm = MemCheckBB->getTerminator();			Instruction *OrigTerm = MemCheckBB->getTerminator();
	BranchInst::Create(NonVersionedLoop->getLoopPreheader(),			BranchInst::Create(NonVersionedLoop->getLoopPreheader(),
	VersionedLoop->getLoopPreheader(), MemRuntimeCheck,			VersionedLoop->getLoopPreheader(), RuntimeCheck, OrigTerm);
	OrigTerm);
	OrigTerm->eraseFromParent();			OrigTerm->eraseFromParent();

	// The loops merge in the original exit block. This is now dominated by the			// The loops merge in the original exit block. This is now dominated by the
	// memchecking block.			// memchecking block.
	DT->changeImmediateDominator(VersionedLoop->getExitBlock(), MemCheckBB);			DT->changeImmediateDominator(VersionedLoop->getExitBlock(), MemCheckBB);
	}			}
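
The way versionLoop() merges the memory check with the overflow check can be sketched outside LLVM. This is only an illustration, assuming (as the branch in the patch suggests) that a check evaluating to true means an assumption failed and the non-versioned loop must run. `Check` and `combineChecks` are hypothetical names, not part of the patch, and the sketch needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an emitted check instruction: absent (nullopt)
// when no check was generated, otherwise true when the assumption FAILS.
using Check = std::optional<bool>;

// Same shape as the patch: OR the two checks when both exist, otherwise
// forward whichever one is present (possibly none).
Check combineChecks(Check MemCheck, Check OverflowCheck) {
  if (MemCheck && OverflowCheck)
    return *MemCheck || *OverflowCheck;
  return MemCheck ? MemCheck : OverflowCheck;
}
```

A caller would branch to the original loop when the combined value is true, and would treat an empty result as "no versioning needed", which is what the final assert in the patch guards against.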

	void LoopVersioning::addPHINodes(			void LoopVersioning::addPHINodes(
	Show All 29 Lines

lib/Transforms/Vectorize/LoopVectorize.cpp

/// aspects. The InnerLoopVectorizer relies on the		/// aspects. The InnerLoopVectorizer relies on the
/// LoopVectorizationLegality class to provide information about the induction		/// LoopVectorizationLegality class to provide information about the induction
/// and reduction variables that were found to a given vectorization factor.		/// and reduction variables that were found to a given vectorization factor.
class InnerLoopVectorizer {		class InnerLoopVectorizer {
public:		public:
InnerLoopVectorizer(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,		InnerLoopVectorizer(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
DominatorTree *DT, const TargetLibraryInfo *TLI,		DominatorTree *DT, const TargetLibraryInfo *TLI,
const TargetTransformInfo *TTI, unsigned VecWidth,		const TargetTransformInfo *TTI, unsigned VecWidth,
unsigned UnrollFactor)		unsigned UnrollFactor, SCEVPredicateSet &Pred)
: OrigLoop(OrigLoop), SE(SE), LI(LI), DT(DT), TLI(TLI), TTI(TTI),		: OrigLoop(OrigLoop), SE(SE), LI(LI), DT(DT), TLI(TLI), TTI(TTI),
VF(VecWidth), UF(UnrollFactor), Builder(SE->getContext()),		VF(VecWidth), UF(UnrollFactor), Builder(SE->getContext()),
Induction(nullptr), OldInduction(nullptr), WidenMap(UnrollFactor),		Induction(nullptr), OldInduction(nullptr), WidenMap(UnrollFactor),
Legal(nullptr), AddedSafetyChecks(false) {}		Legal(nullptr), AddedSafetyChecks(false), Pred(Pred) {}

// Perform the actual loop widening (vectorization).		// Perform the actual loop widening (vectorization).
void vectorize(LoopVectorizationLegality *L) {		void vectorize(LoopVectorizationLegality *L) {
Legal = L;		Legal = L;
// Create a new empty loop. Unlink the old loop and connect the new one.		// Create a new empty loop. Unlink the old loop and connect the new one.
createEmptyLoop();		createEmptyLoop();
// Widen each instruction in the old loop to a new one in the new loop.		// Widen each instruction in the old loop to a new one in the new loop.
// Use the Legality module to find the induction and reduction variables.		// Use the Legality module to find the induction and reduction variables.
Show All 23 Lines	typedef DenseMap<std::pair<BasicBlock*, BasicBlock*>,
VectorParts> EdgeMaskCache;		VectorParts> EdgeMaskCache;

/// \brief Add checks for strides that were assumed to be 1.		/// \brief Add checks for strides that were assumed to be 1.
///		///
/// Returns the last check instruction and the first check instruction in the		/// Returns the last check instruction and the first check instruction in the
/// pair as (first, last).		/// pair as (first, last).
std::pair<Instruction *, Instruction *> addStrideCheck(Instruction *Loc);		std::pair<Instruction *, Instruction *> addStrideCheck(Instruction *Loc);

		// Adds code to check the overflow assumptions made by SCEV
		std::pair<Instruction *, Instruction *>
		addRuntimeOverflowChecks(Instruction *Loc);
		anemet: There is no def for this function.
		sbaranga (author): Must be an artefact from the rebase. It's not being used anywhere either, so it just needs to be removed.

/// Create an empty loop, based on the loop ranges of the old loop.		/// Create an empty loop, based on the loop ranges of the old loop.
void createEmptyLoop();		void createEmptyLoop();
/// Copy and widen the instructions from the old loop.		/// Copy and widen the instructions from the old loop.
virtual void vectorizeLoop();		virtual void vectorizeLoop();

/// \brief The Loop exit block may have single value PHI nodes where the		/// \brief The Loop exit block may have single value PHI nodes where the
/// incoming value is 'Undef'. While vectorizing we only handled real values		/// incoming value is 'Undef'. While vectorizing we only handled real values
/// that were defined inside the loop. Here we fix the 'undef case'.		/// that were defined inside the loop. Here we fix the 'undef case'.
▲ Show 20 Lines • Show All 148 Lines • ▼ Show 20 Lines	protected:
/// Maps scalars to widened vectors.		/// Maps scalars to widened vectors.
ValueMap WidenMap;		ValueMap WidenMap;
EdgeMaskCache MaskCache;		EdgeMaskCache MaskCache;

LoopVectorizationLegality *Legal;		LoopVectorizationLegality *Legal;

// Record whether runtime check is added.		// Record whether runtime check is added.
bool AddedSafetyChecks;		bool AddedSafetyChecks;

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;
};		};

class InnerLoopUnroller : public InnerLoopVectorizer {		class InnerLoopUnroller : public InnerLoopVectorizer {
public:		public:
InnerLoopUnroller(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,		InnerLoopUnroller(Loop *OrigLoop, ScalarEvolution *SE, LoopInfo *LI,
DominatorTree *DT, const TargetLibraryInfo *TLI,		DominatorTree *DT, const TargetLibraryInfo *TLI,
const TargetTransformInfo *TTI, unsigned UnrollFactor)		const TargetTransformInfo *TTI, unsigned UnrollFactor,
: InnerLoopVectorizer(OrigLoop, SE, LI, DT, TLI, TTI, 1, UnrollFactor) {}		SCEVPredicateSet &Pred)
		: InnerLoopVectorizer(OrigLoop, SE, LI, DT, TLI, TTI, 1, UnrollFactor,
		Pred) {}

private:		private:
void scalarizeInstruction(Instruction *Instr,		void scalarizeInstruction(Instruction *Instr,
bool IfPredicateStore = false) override;		bool IfPredicateStore = false) override;
void vectorizeMemoryInstruction(Instruction *Instr) override;		void vectorizeMemoryInstruction(Instruction *Instr) override;
Value *getBroadcastInstrs(Value *V) override;		Value *getBroadcastInstrs(Value *V) override;
Value *getStepVector(Value *Val, int StartIdx, Value *Step) override;		Value *getStepVector(Value *Val, int StartIdx, Value *Step) override;
Value *reverseVector(Value *Vec) override;		Value *reverseVector(Value *Vec) override;
▲ Show 20 Lines • Show All 203 Lines • ▼ Show 20 Lines
/// Use this class to analyze interleaved accesses only when we can vectorize		/// Use this class to analyze interleaved accesses only when we can vectorize
/// a loop. Otherwise it's meaningless to do analysis as the vectorization		/// a loop. Otherwise it's meaningless to do analysis as the vectorization
/// on interleaved accesses is unsafe.		/// on interleaved accesses is unsafe.
///		///
/// The analysis collects interleave groups and records the relationships		/// The analysis collects interleave groups and records the relationships
/// between the member and the group in a map.		/// between the member and the group in a map.
class InterleavedAccessInfo {		class InterleavedAccessInfo {
public:		public:
InterleavedAccessInfo(ScalarEvolution *SE, Loop *L, DominatorTree *DT)		InterleavedAccessInfo(ScalarEvolution *SE, Loop *L, DominatorTree *DT,
: SE(SE), TheLoop(L), DT(DT) {}		SCEVPredicateSet &Pred)
		: SE(SE), TheLoop(L), DT(DT), Pred(Pred) {}

~InterleavedAccessInfo() {		~InterleavedAccessInfo() {
SmallSet<InterleaveGroup *, 4> DelSet;		SmallSet<InterleaveGroup *, 4> DelSet;
// Avoid releasing a pointer twice.		// Avoid releasing a pointer twice.
for (auto &I : InterleaveGroupMap)		for (auto &I : InterleaveGroupMap)
DelSet.insert(I.second);		DelSet.insert(I.second);
for (auto *Ptr : DelSet)		for (auto *Ptr : DelSet)
delete Ptr;		delete Ptr;
Show All 16 Lines	if (InterleaveGroupMap.count(Instr))
return InterleaveGroupMap.find(Instr)->second;		return InterleaveGroupMap.find(Instr)->second;
return nullptr;		return nullptr;
}		}

private:		private:
ScalarEvolution *SE;		ScalarEvolution *SE;
Loop *TheLoop;		Loop *TheLoop;
DominatorTree *DT;		DominatorTree *DT;
		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;

/// Holds the relationships between the members and the interleave group.		/// Holds the relationships between the members and the interleave group.
DenseMap<Instruction *, InterleaveGroup *> InterleaveGroupMap;		DenseMap<Instruction *, InterleaveGroup *> InterleaveGroupMap;

/// \brief The descriptor for a strided memory access.		/// \brief The descriptor for a strided memory access.
struct StrideDescriptor {		struct StrideDescriptor {
StrideDescriptor(int Stride, const SCEV *Scev, unsigned Size,		StrideDescriptor(int Stride, const SCEV *Scev, unsigned Size,
unsigned Align)		unsigned Align)
▲ Show 20 Lines • Show All 330 Lines • ▼ Show 20 Lines
/// induction variable and the different reduction variables.		/// induction variable and the different reduction variables.
class LoopVectorizationLegality {		class LoopVectorizationLegality {
public:		public:
LoopVectorizationLegality(Loop *L, ScalarEvolution *SE, DominatorTree *DT,		LoopVectorizationLegality(Loop *L, ScalarEvolution *SE, DominatorTree *DT,
TargetLibraryInfo *TLI, AliasAnalysis *AA,		TargetLibraryInfo *TLI, AliasAnalysis *AA,
Function *F, const TargetTransformInfo *TTI,		Function *F, const TargetTransformInfo *TTI,
LoopAccessAnalysis *LAA,		LoopAccessAnalysis *LAA,
LoopVectorizationRequirements *R,		LoopVectorizationRequirements *R,
const LoopVectorizeHints *H)		const LoopVectorizeHints *H,
		SCEVPredicateSet &Pred)
: NumPredStores(0), TheLoop(L), SE(SE), TLI(TLI), TheFunction(F),		: NumPredStores(0), TheLoop(L), SE(SE), TLI(TLI), TheFunction(F),
TTI(TTI), DT(DT), LAA(LAA), LAI(nullptr), InterleaveInfo(SE, L, DT),		TTI(TTI), DT(DT), LAA(LAA), LAI(nullptr),
Induction(nullptr), WidestIndTy(nullptr), HasFunNoNaNAttr(false),		InterleaveInfo(SE, L, DT, Pred), Induction(nullptr),
Requirements(R), Hints(H) {}		WidestIndTy(nullptr), HasFunNoNaNAttr(false), Requirements(R),
		Hints(H), Pred(Pred) {}

/// This enum represents the kinds of inductions that we support.		/// This enum represents the kinds of inductions that we support.
enum InductionKind {		enum InductionKind {
IK_NoInduction, ///< Not an induction variable.		IK_NoInduction, ///< Not an induction variable.
IK_IntInduction, ///< Integer induction variable. Step = C.		IK_IntInduction, ///< Integer induction variable. Step = C.
IK_PtrInduction ///< Pointer induction var. Step = C / sizeof(elem).		IK_PtrInduction ///< Pointer induction var. Step = C / sizeof(elem).
};		};

▲ Show 20 Lines • Show All 260 Lines • ▼ Show 20 Lines	private:
/// Used to emit an analysis of any legality issues.		/// Used to emit an analysis of any legality issues.
const LoopVectorizeHints *Hints;		const LoopVectorizeHints *Hints;

ValueToValueMap Strides;		ValueToValueMap Strides;
SmallPtrSet<Value *, 8> StrideSet;		SmallPtrSet<Value *, 8> StrideSet;

/// While vectorizing these instructions we have to generate a		/// While vectorizing these instructions we have to generate a
/// call to the appropriate masked intrinsic		/// call to the appropriate masked intrinsic
SmallPtrSet<const Instruction*, 8> MaskedOp;		SmallPtrSet<const Instruction *, 8> MaskedOp;

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;
};		};

/// LoopVectorizationCostModel - estimates the expected speedups due to		/// LoopVectorizationCostModel - estimates the expected speedups due to
/// vectorization.		/// vectorization.
/// In many cases vectorization is not profitable. This can happen because of		/// In many cases vectorization is not profitable. This can happen because of
/// a number of reasons. In this class we mainly attempt to predict the		/// a number of reasons. In this class we mainly attempt to predict the
/// expected speedup/slowdowns due to the supported instruction set. We use the		/// expected speedup/slowdowns due to the supported instruction set. We use the
/// TargetTransformInfo to query the different backends for the cost of		/// TargetTransformInfo to query the different backends for the cost of
/// different operations.		/// different operations.
class LoopVectorizationCostModel {		class LoopVectorizationCostModel {
public:		public:
LoopVectorizationCostModel(Loop *L, ScalarEvolution *SE, LoopInfo *LI,		LoopVectorizationCostModel(Loop *L, ScalarEvolution *SE, LoopInfo *LI,
LoopVectorizationLegality *Legal,		LoopVectorizationLegality *Legal,
const TargetTransformInfo &TTI,		const TargetTransformInfo &TTI,
const TargetLibraryInfo *TLI, AssumptionCache *AC,		const TargetLibraryInfo *TLI, AssumptionCache *AC,
const Function *F, const LoopVectorizeHints *Hints)		const Function *F, const LoopVectorizeHints *Hints,
		SCEVPredicateSet &Pred)
: TheLoop(L), SE(SE), LI(LI), Legal(Legal), TTI(TTI), TLI(TLI),		: TheLoop(L), SE(SE), LI(LI), Legal(Legal), TTI(TTI), TLI(TLI),
TheFunction(F), Hints(Hints) {		TheFunction(F), Hints(Hints), Pred(Pred) {
CodeMetrics::collectEphemeralValues(L, AC, EphValues);		CodeMetrics::collectEphemeralValues(L, AC, EphValues);
}		}

/// Information about vectorization costs		/// Information about vectorization costs
struct VectorizationFactor {		struct VectorizationFactor {
unsigned Width; // Vector width with best cost		unsigned Width; // Vector width with best cost
unsigned Cost; // Cost of the loop with that width		unsigned Cost; // Cost of the loop with that width
};		};
▲ Show 20 Lines • Show All 72 Lines • ▼ Show 20 Lines	private:
LoopVectorizationLegality *Legal;		LoopVectorizationLegality *Legal;
/// Vector target information.		/// Vector target information.
const TargetTransformInfo &TTI;		const TargetTransformInfo &TTI;
/// Target Library Info.		/// Target Library Info.
const TargetLibraryInfo *TLI;		const TargetLibraryInfo *TLI;
const Function *TheFunction;		const Function *TheFunction;
// Loop Vectorize Hint.		// Loop Vectorize Hint.
const LoopVectorizeHints *Hints;		const LoopVectorizeHints *Hints;

		/// The SCEV predicate containing all the SCEV-related assumptions.
		SCEVPredicateSet &Pred;
};		};

/// \brief This holds vectorization requirements that must be verified late in		/// \brief This holds vectorization requirements that must be verified late in
/// the process. The requirements are set by legalize and costmodel. Once		/// the process. The requirements are set by legalize and costmodel. Once
/// vectorization has been determined to be possible and profitable the		/// vectorization has been determined to be possible and profitable the
/// requirements can be verified by looking for metadata or compiler options.		/// requirements can be verified by looking for metadata or compiler options.
/// For example, some loops require FP commutativity which is only allowed if		/// For example, some loops require FP commutativity which is only allowed if
/// vectorization is explicitly specified or if the fast-math compiler option		/// vectorization is explicitly specified or if the fast-math compiler option
▲ Show 20 Lines • Show All 218 Lines • ▼ Show 20 Lines	if (TC > 0u && TC < TinyTripCountVectorThreshold) {
else {		else {
DEBUG(dbgs() << "\n");		DEBUG(dbgs() << "\n");
emitAnalysisDiag(F, L, Hints, VectorizationReport()		emitAnalysisDiag(F, L, Hints, VectorizationReport()
<< "vectorization is not beneficial "		<< "vectorization is not beneficial "
"and is not explicitly forced");		"and is not explicitly forced");
return false;		return false;
}		}
}		}
		SCEVPredicateSet Pred;

// Check if it is legal to vectorize the loop.		// Check if it is legal to vectorize the loop.
LoopVectorizationRequirements Requirements;		LoopVectorizationRequirements Requirements;
LoopVectorizationLegality LVL(L, SE, DT, TLI, AA, F, TTI, LAA,		LoopVectorizationLegality LVL(L, SE, DT, TLI, AA, F, TTI, LAA,
&Requirements, &Hints);		&Requirements, &Hints, Pred);
if (!LVL.canVectorize()) {		if (!LVL.canVectorize()) {
DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");		DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");
emitMissedWarning(F, L, Hints);		emitMissedWarning(F, L, Hints);
return false;		return false;
}		}

// Use the cost model.		// Use the cost model.
LoopVectorizationCostModel CM(L, SE, LI, &LVL, *TTI, TLI, AC, F, &Hints);		LoopVectorizationCostModel CM(L, SE, LI, &LVL, *TTI, TLI, AC, F, &Hints,
		Pred);

// Check the function attributes to find out if this function should be		// Check the function attributes to find out if this function should be
// optimized for size.		// optimized for size.
bool OptForSize = Hints.getForce() != LoopVectorizeHints::FK_Enabled &&		bool OptForSize = Hints.getForce() != LoopVectorizeHints::FK_Enabled &&
F->optForSize();		F->optForSize();

// Compute the weighted frequency of this loop being executed and see if it		// Compute the weighted frequency of this loop being executed and see if it
// is less than 20% of the function entry baseline frequency. Note that we		// is less than 20% of the function entry baseline frequency. Note that we
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	if (!VectorizeLoop && !InterleaveLoop) {
<< DebugLocStr << '\n');		<< DebugLocStr << '\n');
DEBUG(dbgs() << "LV: Interleave Count is " << IC << '\n');		DEBUG(dbgs() << "LV: Interleave Count is " << IC << '\n');
}		}

if (!VectorizeLoop) {		if (!VectorizeLoop) {
assert(IC > 1 && "interleave count should not be 1 or 0");		assert(IC > 1 && "interleave count should not be 1 or 0");
// If we decided that it is not legal to vectorize the loop then		// If we decided that it is not legal to vectorize the loop then
// interleave it.		// interleave it.
InnerLoopUnroller Unroller(L, SE, LI, DT, TLI, TTI, IC);		InnerLoopUnroller Unroller(L, SE, LI, DT, TLI, TTI, IC, Pred);
Unroller.vectorize(&LVL);		Unroller.vectorize(&LVL);

emitOptimizationRemark(F->getContext(), DEBUG_TYPE, *F, L->getStartLoc(),		emitOptimizationRemark(F->getContext(), DEBUG_TYPE, *F, L->getStartLoc(),
Twine("interleaved loop (interleaved count: ") +		Twine("interleaved loop (interleaved count: ") +
Twine(IC) + ")");		Twine(IC) + ")");
} else {		} else {
// If we decided that it is legal to vectorize the loop then do it.		// If we decided that it is legal to vectorize the loop then do it.
InnerLoopVectorizer LB(L, SE, LI, DT, TLI, TTI, VF.Width, IC);		InnerLoopVectorizer LB(L, SE, LI, DT, TLI, TTI, VF.Width, IC, Pred);
LB.vectorize(&LVL);		LB.vectorize(&LVL);
++LoopsVectorized;		++LoopsVectorized;

// Add metadata to disable runtime unrolling scalar loop when there's no		// Add metadata to disable runtime unrolling scalar loop when there's no
// runtime check about strides and memory. Because at this situation,		// runtime check about strides and memory. Because at this situation,
// scalar loop is rarely used not worthy to be unrolled.		// scalar loop is rarely used not worthy to be unrolled.
if (!LB.IsSafetyChecksAdded())		if (!LB.IsSafetyChecksAdded())
AddRuntimeUnrollDisableMetaData(L);		AddRuntimeUnrollDisableMetaData(L);
▲ Show 20 Lines • Show All 129 Lines • ▼ Show 20 Lines	int LoopVectorizationLegality::isConsecutivePtr(Value *Ptr) {
for (unsigned i = 0; i != NumOperands; ++i)		for (unsigned i = 0; i != NumOperands; ++i)
if (i != InductionOperand &&		if (i != InductionOperand &&
!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))		!SE->isLoopInvariant(SE->getSCEV(Gep->getOperand(i)), TheLoop))
return 0;		return 0;

// We can emit wide load/stores only if the last non-zero index is the		// We can emit wide load/stores only if the last non-zero index is the
// induction variable.		// induction variable.
const SCEV *Last = nullptr;		const SCEV *Last = nullptr;
if (!Strides.count(Gep))		if (!Strides.count(Gep)) {
Last = SE->getSCEV(Gep->getOperand(InductionOperand));		Last = SE->getSCEV(Gep->getOperand(InductionOperand));
else {		Last = SE->rewriteUsingPredicate(Last, TheLoop, Pred);
		} else {
// Because of the multiplication by a stride we can have a s/zext cast.		// Because of the multiplication by a stride we can have a s/zext cast.
// We are going to replace this stride by 1 so the cast is safe to ignore.		// We are going to replace this stride by 1 so the cast is safe to ignore.
//		//
// %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]		// %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
// %0 = trunc i64 %indvars.iv to i32		// %0 = trunc i64 %indvars.iv to i32
// %mul = mul i32 %0, %Stride1		// %mul = mul i32 %0, %Stride1
// %idxprom = zext i32 %mul to i64 << Safe cast.		// %idxprom = zext i32 %mul to i64 << Safe cast.
// %arrayidx = getelementptr inbounds i32* %B, i64 %idxprom		// %arrayidx = getelementptr inbounds i32* %B, i64 %idxprom
//		//
Last = replaceSymbolicStrideSCEV(SE, Strides,		Last = rewriteSCEV(SE, Strides, Gep->getOperand(InductionOperand), Gep,
Gep->getOperand(InductionOperand), Gep);		TheLoop, Pred);
if (const SCEVCastExpr *C = dyn_cast<SCEVCastExpr>(Last))		if (const SCEVCastExpr *C = dyn_cast<SCEVCastExpr>(Last))
Last =		Last =
(C->getSCEVType() == scSignExtend || C->getSCEVType() == scZeroExtend)		(C->getSCEVType() == scSignExtend || C->getSCEVType() == scZeroExtend)
? C->getOperand()		? C->getOperand()
: Last;		: Last;
}		}

		SCEVPredicateSet P = Pred;
		if (!dyn_cast<SCEVAddRecExpr>(Last)) {
		// Attempt to add new SCEV assumptions to Last in order to
		// get an AddRecExpr.
		AssumptionResult R = SE->getAddRecWithRTChecks(Last, TheLoop);
		R.Pred.add(&Pred);

		if (R.Res && !R.Pred.isAlwaysFalse()) {
		Last = R.Res;
		P = R.Pred;
		}
		}

if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Last)) {		if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Last)) {
const SCEV *Step = AR->getStepRecurrence(*SE);		const SCEV *Step = AR->getStepRecurrence(*SE);

// The memory is consecutive because the last index is consecutive		// The memory is consecutive because the last index is consecutive
// and all other indices are loop invariant.		// and all other indices are loop invariant.
if (Step->isOne())		if (Step->isOne()) {
		Pred = P;
return 1;		return 1;
if (Step->isAllOnesValue())		}
		if (Step->isAllOnesValue()) {
		Pred = P;
return -1;		return -1;
}		}
		}

return 0;		return 0;
}		}
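
The new isConsecutivePtr() collects extra assumptions in a local copy (P) and only writes them back to Pred when they actually yield a unit stride. A minimal sketch of that commit-on-success pattern follows; `PredicateSet` is a hypothetical stand-in for SCEVPredicateSet (modelled as a set of strings), and `classifyStride` is an illustrative name rather than anything in the patch.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for SCEVPredicateSet: a set of named assumptions.
using PredicateSet = std::set<std::string>;

// Returns 1, -1 or 0 in the style of isConsecutivePtr. The extra assumptions
// are merged into a tentative copy and committed to Pred only when the step
// turns out to be +1 or -1; otherwise Pred is left untouched.
int classifyStride(long Step, const PredicateSet &Extra, PredicateSet &Pred) {
  PredicateSet Tentative = Pred;
  Tentative.insert(Extra.begin(), Extra.end());
  if (Step == 1) {
    Pred = Tentative;
    return 1;
  }
  if (Step == -1) {
    Pred = Tentative;
    return -1;
  }
  return 0;
}
```

Keeping the tentative copy separate means that a failed query does not burden the loop with runtime checks that bought nothing.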

bool LoopVectorizationLegality::isUniform(Value *V) {		bool LoopVectorizationLegality::isUniform(Value *V) {
return LAI->isUniform(V);		return LAI->isUniform(V);
}		}

▲ Show 20 Lines • Show All 635 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::createEmptyLoop() {
// Some loops have a single integer induction variable, while other loops		// Some loops have a single integer induction variable, while other loops
// don't. One example is c++ iterators that often have multiple pointer		// don't. One example is c++ iterators that often have multiple pointer
// induction variables. In the code below we also support a case where we		// induction variables. In the code below we also support a case where we
// don't have a single induction variable.		// don't have a single induction variable.
OldInduction = Legal->getInduction();		OldInduction = Legal->getInduction();
Type *IdxTy = Legal->getWidestInductionType();		Type *IdxTy = Legal->getWidestInductionType();

// Find the loop boundaries.		// Find the loop boundaries.
const SCEV *ExitCount = SE->getBackedgeTakenCount(OrigLoop);		const SCEV *ExitCount = SE->getGuardedBackedgeTakenCount(OrigLoop, Pred);
assert(ExitCount != SE->getCouldNotCompute() && "Invalid loop count");		assert(ExitCount != SE->getCouldNotCompute() && "Invalid loop count");

// The exit count might have the type of i64 while the phi is i32. This can		// The exit count might have the type of i64 while the phi is i32. This can
// happen if we have an induction variable that is sign extended before the		// happen if we have an induction variable that is sign extended before the
// compare. The only way that we get a backedge taken count is that the		// compare. The only way that we get a backedge taken count is that the
// induction variable was signed and as such will not overflow. In such a case		// induction variable was signed and as such will not overflow. In such a case
// truncation is legal.		// truncation is legal.
if (ExitCount->getType()->getPrimitiveSizeInBits() >		if (ExitCount->getType()->getPrimitiveSizeInBits() >
▲ Show 20 Lines • Show All 174 Lines • ▼ Show 20 Lines	if (MemRuntimeCheck) {
// for the "few elements case".		// for the "few elements case".
ReplaceInstWithInst(		ReplaceInstWithInst(
VectorPH->getTerminator(),		VectorPH->getTerminator(),
BranchInst::Create(MiddleBlock, NewVectorPH, MemRuntimeCheck));		BranchInst::Create(MiddleBlock, NewVectorPH, MemRuntimeCheck));

VectorPH = NewVectorPH;		VectorPH = NewVectorPH;
}		}

		// Generate runtime checks for any SCEV assumptions that we've made.
		Instruction *OFCheck;
		std::tie(FirstCheckInst, OFCheck) =
		Pred.generateGuardCond(VectorPH->getTerminator(), SE);
		if (OFCheck) {
		AddedSafetyChecks = true;
		// Create a new block containing the scev check.
		VectorPH->setName("vector.scevcheck");
		NewVectorPH =
		VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");
		hfinkel: Line too long?

		if (ParentLoop)
		ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);
		LoopBypassBlocks.push_back(VectorPH);

		// Replace the branch into the scev check block with a conditional branch
		// for the "few elements case".
		ReplaceInstWithInst(VectorPH->getTerminator(),
		BranchInst::Create(MiddleBlock, NewVectorPH, OFCheck));

		VectorPH = NewVectorPH;
		}
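
For intuition about what a "no overflow" assumption on a chrec amounts to at run time, here is a small sketch for the unsigned 32-bit case. It does not reproduce what generateGuardCond() emits (the diff shown here does not include that code); `addRecWrapsU32` is a hypothetical helper, and the recurrence {Start,+,Step} is assumed to be unsigned so that checking only the final value is enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if the recurrence {Start,+,Step},
// evaluated in unsigned 32-bit arithmetic, wraps within BackedgeTakenCount
// increments. The computation is done in 64 bits, where it cannot overflow:
// (2^32-1) + (2^32-1)*(2^32-1) is still below 2^64.
bool addRecWrapsU32(uint32_t Start, uint32_t Step,
                    uint32_t BackedgeTakenCount) {
  uint64_t Last = uint64_t(Start) + uint64_t(Step) * uint64_t(BackedgeTakenCount);
  return Last > UINT32_MAX;
}
```

A versioned loop would take the scalar fallback when such a predicate is true and the vector path otherwise; as the summary notes, in well-formed programs the fallback is expected to be rare.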

// We are going to resume the execution of the scalar loop.		// We are going to resume the execution of the scalar loop.
// Go over all of the induction variables that we found and fix the		// Go over all of the induction variables that we found and fix the
// PHIs that are left in the scalar version of the loop.		// PHIs that are left in the scalar version of the loop.
// The starting values of PHI nodes depend on the counter of the last		// The starting values of PHI nodes depend on the counter of the last
// iteration in the vectorized loop.		// iteration in the vectorized loop.
// If we come from a bypass edge then we need to start from the original		// If we come from a bypass edge then we need to start from the original
// start value.		// start value.

[1,141 lines not shown]	bool LoopVectorizationLegality::canVectorize() {
// Check if we can if-convert non-single-bb loops.		// Check if we can if-convert non-single-bb loops.
unsigned NumBlocks = TheLoop->getNumBlocks();		unsigned NumBlocks = TheLoop->getNumBlocks();
if (NumBlocks != 1 && !canVectorizeWithIfConvert()) {		if (NumBlocks != 1 && !canVectorizeWithIfConvert()) {
DEBUG(dbgs() << "LV: Can't if-convert the loop.\n");		DEBUG(dbgs() << "LV: Can't if-convert the loop.\n");
return false;		return false;
}		}

// ScalarEvolution needs to be able to find the exit count.		// ScalarEvolution needs to be able to find the exit count.
const SCEV *ExitCount = SE->getBackedgeTakenCount(TheLoop);		const SCEV *ExitCount = SE->getGuardedBackedgeTakenCount(TheLoop, Pred);
if (ExitCount == SE->getCouldNotCompute()) {		if (ExitCount == SE->getCouldNotCompute()) {
emitAnalysis(VectorizationReport() <<		emitAnalysis(VectorizationReport()
"could not determine number of loop iterations");		<< "could not determine number of loop iterations");
DEBUG(dbgs() << "LV: SCEV could not compute the loop exit count.\n");		DEBUG(dbgs() << "LV: SCEV could not compute the loop exit count.\n");
return false;		return false;
}		}

// Check if we can vectorize the instructions and CFG in this loop.		// Check if we can vectorize the instructions and CFG in this loop.
if (!canVectorizeInstrs()) {		if (!canVectorizeInstrs()) {
DEBUG(dbgs() << "LV: Can't vectorize the instructions or CFG\n");		DEBUG(dbgs() << "LV: Can't vectorize the instructions or CFG\n");
return false;		return false;
[319 lines not shown]	if (LAI->hasStoreToLoopInvariantAddress()) {
emitAnalysis(		emitAnalysis(
VectorizationReport()		VectorizationReport()
<< "write to a loop invariant address could not be vectorized");		<< "write to a loop invariant address could not be vectorized");
DEBUG(dbgs() << "LV: We don't allow storing to uniform addresses\n");		DEBUG(dbgs() << "LV: We don't allow storing to uniform addresses\n");
return false;		return false;
}		}

Requirements->addRuntimePointerChecks(LAI->getNumRuntimePointerChecks());		Requirements->addRuntimePointerChecks(LAI->getNumRuntimePointerChecks());
		Pred.add(&*LAI->Pred);

		if (Pred.isAlwaysFalse()) {
		emitAnalysis(VectorizationReport()
		<< "Too many SCEV assumptions need to be made and checked "
		<< "at runtime");
		DEBUG(dbgs() << "LV: Too many SCEV checks needed.\n");
		return false;
		}

return true;		return true;
}		}

LoopVectorizationLegality::InductionKind		LoopVectorizationLegality::InductionKind
LoopVectorizationLegality::isInductionVariable(PHINode *Phi,		LoopVectorizationLegality::isInductionVariable(PHINode *Phi,
ConstantInt *&StepValue) {		ConstantInt *&StepValue) {
if (!isInductionPHI(Phi, SE, StepValue))		if (!isInductionPHI(Phi, SE, StepValue))
[112 lines not shown]	if (AccessList.empty())
return;		return;

auto &DL = TheLoop->getHeader()->getModule()->getDataLayout();		auto &DL = TheLoop->getHeader()->getModule()->getDataLayout();
for (auto I : AccessList) {		for (auto I : AccessList) {
LoadInst *LI = dyn_cast<LoadInst>(I);		LoadInst *LI = dyn_cast<LoadInst>(I);
StoreInst *SI = dyn_cast<StoreInst>(I);		StoreInst *SI = dyn_cast<StoreInst>(I);

Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();		Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
int Stride = isStridedPtr(SE, Ptr, TheLoop, Strides);		int Stride = isStridedPtr(SE, Ptr, TheLoop, Strides, Pred, false);

// The factor of the corresponding interleave group.		// The factor of the corresponding interleave group.
unsigned Factor = std::abs(Stride);		unsigned Factor = std::abs(Stride);

// Ignore the access if the factor is too small or too large.		// Ignore the access if the factor is too small or too large.
if (Factor < 2 \|\| Factor > MaxInterleaveGroupFactor)		if (Factor < 2 \|\| Factor > MaxInterleaveGroupFactor)
continue;		continue;

[1,074 lines not shown]
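
The LoopVectorize.cpp hunk above adds a `vector.scevcheck` block after the memory-check block. Each such block ends in a conditional branch to `MiddleBlock` that skips the vector loop when its check fires. A minimal standalone sketch of that chained-bypass control flow follows. It does not use any LLVM API; `BypassCheck`, `selectLoopVersion`, `demo` and the `"memcheck"` label are hypothetical names chosen for illustration, and only `vector.scevcheck` is taken from the patch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one bypass block: a label plus a predicate that
// returns true when the vector loop must be skipped.
struct BypassCheck {
  std::string name;
  std::function<bool()> mustBypass;
};

// Walk the checks in order, the way control falls through the bypass
// blocks. Returns the label of the first check that fires, or
// "vector.body" when every check passes and the vector loop runs.
std::string selectLoopVersion(const std::vector<BypassCheck> &checks) {
  for (const BypassCheck &c : checks)
    if (c.mustBypass())
      return c.name; // modelled branch to MiddleBlock (scalar loop)
  return "vector.body";
}

// Usage example (illustrative only).
void demo() {
  std::vector<BypassCheck> allPass = {
      {"memcheck", [] { return false; }},
      {"vector.scevcheck", [] { return false; }}};
  assert(selectLoopVersion(allPass) == "vector.body");

  std::vector<BypassCheck> overflowAssumed = {
      {"memcheck", [] { return false; }},
      {"vector.scevcheck", [] { return true; }}};
  assert(selectLoopVersion(overflowAssumed) == "vector.scevcheck");
}
```

Ordering the SCEV check after the memory check mirrors the order in which the hunk creates the blocks; whether that order matters for correctness is not stated in the patch.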

test/Transforms/LoopDistribute/distribute-with-overflows.ll

This file was added.

				; RUN: opt -mtriple=aarch64--linux-gnueabi -basicaa -loop-distribute -verify-loop-info -verify-dom-info -S \
				; RUN: < %s \| FileCheck %s

				; RUN: opt -basicaa -loop-distribute -loop-vectorize -force-vector-width=4 \
				; RUN: -verify-loop-info -verify-dom-info -S < %s \| \
				; RUN: FileCheck --check-prefix=VECTORIZE %s

				; The memcheck version of basic.ll with overflows - the induction variables can
				; overflow. We should distribute and vectorize the second part of this loop with
				; 5 memchecks (A+1 x {C, D, E} + C x {A, B})
				;
				; for (i = 0; i < n; i++) {
				; A[i + 1] = A[i] * B[i];
				; -------------------------------
				; C[i] = D[i] * E[i];
				; }

				@B = common global i32* null, align 8
				@A = common global i32* null, align 8
				@C = common global i32* null, align 8
				@D = common global i32* null, align 8
				@E = common global i32* null, align 8

				define void @f(i64 %n) {
				entry:
				%a = load i32, i32* @A, align 8
				%b = load i32, i32* @B, align 8
				%c = load i32, i32* @C, align 8
				%d = load i32, i32* @D, align 8
				%e = load i32, i32* @E, align 8
				br label %for.body

				; We have two compares for each array overlap check which is a total of 10
				; compares.
				;
				; CHECK: for.body.lver.memcheck:

				; CHECK: icmp ugt i64 %{{[a-zA-Z0-9]+}}, 4294967295

				; CHECK: %ldist.safe = or i1 %memcheck.conflict, %scev.check
				; CHECK: br i1 %ldist.safe, label %for.body.ph.lver.orig, label %for.body.ph.ldist1


				; The non-distributed loop that the memchecks fall back on.

				; CHECK: for.body.ph.lver.orig:
				; CHECK: br label %for.body.lver.orig
				; CHECK: for.body.lver.orig:
				; CHECK: br i1 %exitcond.lver.orig, label %for.end, label %for.body.lver.orig

				; Verify the two distributed loops.

				; CHECK: for.body.ph.ldist1:
				; CHECK: br label %for.body.ldist1
				; CHECK: for.body.ldist1:
				; CHECK: %mulA.ldist1 = mul i32 %loadB.ldist1, %loadA.ldist1
				; CHECK: br i1 %exitcond.ldist1, label %for.body.ph, label %for.body.ldist1

				; CHECK: for.body.ph:
				; CHECK: br label %for.body
				; CHECK: for.body:
				; CHECK: %mulC = mul i32 %loadD, %loadE
				; CHECK: for.end:


				; VECTORIZE: mul <4 x i32>

				for.body: ; preds = %for.body, %entry
				%ind = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%ind_ext = zext i32 %ind to i64

				%arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind_ext
				%loadA = load i32, i32* %arrayidxA, align 4

				%arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind_ext
				%loadB = load i32, i32* %arrayidxB, align 4

				%mulA = mul i32 %loadB, %loadA

				%add = add i32 %ind, 1
				%add_ext = zext i32 %add to i64

				%arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add_ext
				store i32 %mulA, i32* %arrayidxA_plus_4, align 4

				%arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind_ext
				%loadD = load i32, i32* %arrayidxD, align 4

				%arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind_ext
				%loadE = load i32, i32* %arrayidxE, align 4

				%mulC = mul i32 %loadD, %loadE

				%arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind_ext
				store i32 %mulC, i32* %arrayidxC, align 4

				%exitcond = icmp eq i64 %add_ext, %n
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}
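
The CHECK lines in this test expect the memory-conflict result and the SCEV check to be OR-ed into `%ldist.safe`, with a true value branching to the original (non-distributed) loop. A small sketch of that selection logic, assuming nothing beyond what the CHECK lines show: `LoopVersion`, `chooseVersion`, `exceedsU32` and `demo` are hypothetical names, and the operand of the `icmp ugt i64 ..., 4294967295` compare is not visible in the test, so `x` is only a stand-in.

```cpp
#include <cassert>
#include <cstdint>

enum class LoopVersion { Original, Distributed };

// Mirrors `icmp ugt i64 %x, 4294967295`: true when x does not fit in
// 32 unsigned bits. Which value the pass feeds in here is not shown in
// the test, so `x` is a placeholder.
bool exceedsU32(std::uint64_t x) { return x > 4294967295ull; }

// Mirrors `%ldist.safe = or i1 %memcheck.conflict, %scev.check` followed
// by the conditional branch: despite the name, a true value selects the
// original loop.
LoopVersion chooseVersion(bool memcheckConflict, bool scevCheck) {
  return (memcheckConflict || scevCheck) ? LoopVersion::Original
                                         : LoopVersion::Distributed;
}

// Usage example (illustrative only).
void demo() {
  assert(chooseVersion(false, false) == LoopVersion::Distributed);
  assert(chooseVersion(false, exceedsU32(4294967296ull)) ==
         LoopVersion::Original);
  assert(chooseVersion(true, false) == LoopVersion::Original);
}
```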

test/Transforms/LoopVectorize/safegep.ll

	; RUN: opt -S -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 < %s \| FileCheck %s			; RUN: opt -S -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 < %s \| FileCheck %s
	target datalayout = "e-p:32:32:32-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f16:16:16-f32:32:32-f64:32:64-f128:128:128-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"			target datalayout = "e-p:32:32:32-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f16:16:16-f32:32:32-f64:32:64-f128:128:128-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"


	; We can vectorize this code because if the address computation would wrap then			; We can vectorize this code because if the address computation would wrap then
	; a load from 0 would take place which is undefined behaviour in address space 0			; a load from 0 would take place which is undefined behaviour in address space 0
	; according to LLVM IR semantics.			; according to LLVM IR semantics.

	; PR16592			; PR16592

	; CHECK-LABEL: @safe(			; CHECK-LABEL: @safe(
				; CHECK-NOT: vector.overflowcheck
	; CHECK: <4 x float>			; CHECK: <4 x float>


	define void @safe(float* %A, float* %B, float %K) {			define void @safe(float* %A, float* %B, float %K) {
	entry:			entry:
	br label %"<bb 3>"			br label %"<bb 3>"

	"<bb 3>":			"<bb 3>":
	%i_15 = phi i32 [ 0, %entry ], [ %i_19, %"<bb 3>" ]			%i_15 = phi i32 [ 0, %entry ], [ %i_19, %"<bb 3>" ]
	%pp3 = getelementptr float, float* %A, i32 %i_15			%pp3 = getelementptr float, float* %A, i32 %i_15
	%D.1396_10 = load float, float* %pp3, align 4			%D.1396_10 = load float, float* %pp3, align 4
	[40 lines not shown]

test/Transforms/LoopVectorize/scev-overflow-check.ll

This file was added.

				; RUN: opt -mtriple=aarch64--linux-gnueabi -loop-vectorize < %s -S \| FileCheck %s

				; CHECK-LABEL: test0
				define void @test0(i32* %A,
				i32* %B,
				i32* %C, i32 %N) {
				entry:
				%cmp13 = icmp eq i32 %N, 0
				br i1 %cmp13, label %for.end, label %for.body.preheader

				; If N is greater than 65535, this would loop forever.
				; CHECK: icmp ugt i32 %N, 65535

				for.body.preheader:
				br label %for.body

				for.body:
				%indvars.iv = phi i16 [ %indvars.next, %for.body ], [ 0, %for.body.preheader ]
				%indvars.next = add i16 %indvars.iv, 1
				%indvars.ext = zext i16 %indvars.iv to i32

				%arrayidx = getelementptr inbounds i32, i32* %B, i32 %indvars.ext
				%0 = load i32, i32* %arrayidx, align 4
				%arrayidx3 = getelementptr inbounds i32, i32* %C, i32 %indvars.ext
				%1 = load i32, i32* %arrayidx3, align 4

				%mul4 = mul i32 %1, %0

				%arrayidx7 = getelementptr inbounds i32, i32* %A, i32 %indvars.ext
				store i32 %mul4, i32* %arrayidx7, align 4

				%exitcond = icmp eq i32 %indvars.ext, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
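
The comment in `test0` says the loop never exits when `N` exceeds 65535, because the `i16` induction variable wraps before its zero-extended value can equal `N`. The following standalone sketch simulates that loop in plain C++ to show why `icmp ugt i32 %N, 65535` is the right guard. `ivMayWrap`, `test0Terminates` and `demo` are hypothetical names, and the simulation is a model of the IR, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Runtime predicate corresponding to `icmp ugt i32 %N, 65535`.
bool ivMayWrap(std::uint32_t n) { return n > 65535u; }

// Simulates test0's control flow. Returns true if the exit condition
// `zext(iv) == n` is ever met, false if the 16-bit counter cycles back
// to its starting value first (so the loop would run forever).
bool test0Terminates(std::uint32_t n) {
  if (n == 0)
    return true; // the entry block branches straight to for.end
  std::uint16_t iv = 0;
  // 65537 checks cover every 16-bit value plus the wrap back to 0.
  for (std::uint32_t steps = 0; steps <= 65536u; ++steps) {
    const std::uint32_t ext = iv; // zext i16 -> i32
    if (ext == n)
      return true; // %exitcond
    iv = static_cast<std::uint16_t>(iv + 1); // add i16 wraps modulo 2^16
  }
  return false;
}

// Usage example (illustrative only).
void demo() {
  assert(test0Terminates(65535u) && !ivMayWrap(65535u));
  assert(!test0Terminates(65536u) && ivMayWrap(65536u));
}
```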

				; CHECK-LABEL: test1
				define void @test1(i32* %A,
				i32* %B,
				i32* %C, i32 %N, i32 %Offset) {
				entry:
				%cmp13 = icmp eq i32 %N, 0
				br i1 %cmp13, label %for.end, label %for.body.preheader

				; Because of the GEPs, we need to check that Offset + N does not overflow.
				; CHECK: [[MUL0:%[a-zA-Z_0-9.]+]] = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 1, i32 %N)
				; CHECK: [[MUL1:%[a-zA-Z_0-9.]+]] = extractvalue { i32, i1 } [[MUL0]], 0
				; CHECK: call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[MUL1]], i32 %Offset)

				for.body.preheader:
				br label %for.body

				for.body:
				%indvars.iv = phi i16 [ %indvars.next, %for.body ], [ 0, %for.body.preheader ]
				%indvars.next = add i16 %indvars.iv, 1

				%indvars.ext = zext i16 %indvars.iv to i32
				%indvars.access = add i32 %Offset, %indvars.ext

				%arrayidx = getelementptr inbounds i32, i32* %B, i32 %indvars.access
				%0 = load i32, i32* %arrayidx, align 4
				%arrayidx3 = getelementptr inbounds i32, i32* %C, i32 %indvars.access
				%1 = load i32, i32* %arrayidx3, align 4

				%mul4 = mul i32 %1, %0

				%arrayidx7 = getelementptr inbounds i32, i32* %A, i32 %indvars.access
				store i32 %mul4, i32* %arrayidx7, align 4

				%exitcond = icmp eq i32 %indvars.ext, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
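
`test1` expects a signed multiply-with-overflow of the step (1) by `N`, followed by a signed add-with-overflow of `%Offset`. A portable sketch of that check, computing in 64 bits instead of calling the intrinsics: `Checked`, `narrow`, `smulWithOverflow`, `saddWithOverflow`, `accessRangeOverflows` and `demo` are hypothetical names. How the generated code combines the two overflow bits is not visible in the CHECK lines, so OR-ing them is an assumption. Unlike the LLVM intrinsics, this sketch returns 0 rather than the wrapped value when overflow occurs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

struct Checked {
  std::int32_t value; // only meaningful when overflow is false
  bool overflow;
};

// Narrow a 64-bit result to i32 and report whether it fit.
static Checked narrow(std::int64_t wide) {
  const bool ov = wide < std::numeric_limits<std::int32_t>::min() ||
                  wide > std::numeric_limits<std::int32_t>::max();
  return {static_cast<std::int32_t>(ov ? 0 : wide), ov};
}

Checked smulWithOverflow(std::int32_t a, std::int32_t b) {
  return narrow(static_cast<std::int64_t>(a) * b);
}

Checked saddWithOverflow(std::int32_t a, std::int32_t b) {
  return narrow(static_cast<std::int64_t>(a) + b);
}

// True when step * N + Offset cannot be represented as a signed i32.
// Combining the two flags with OR is an assumption of this sketch.
bool accessRangeOverflows(std::int32_t n, std::int32_t offset) {
  const Checked mul = smulWithOverflow(1, n);
  const Checked add = saddWithOverflow(mul.value, offset);
  return mul.overflow || add.overflow;
}

// Usage example (illustrative only).
void demo() {
  assert(!accessRangeOverflows(100, 7));
  assert(accessRangeOverflows(std::numeric_limits<std::int32_t>::max(), 1));
  assert(accessRangeOverflows(std::numeric_limits<std::int32_t>::min(), -1));
}
```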

				; CHECK-LABEL: test2
				define void @test2(i32* %A,
				i32* %B,
				i32* %C, i32 %N, i32 %Offset) {
				entry:
				%cmp13 = icmp eq i32 %N, 0
				br i1 %cmp13, label %for.end, label %for.body.preheader

				; CHECK: icmp sgt i32 %N, 32767
				; CHECK: icmp slt i32 %N, -32768

				for.body.preheader:
				br label %for.body

				for.body:
				%indvars.iv = phi i16 [ %indvars.next, %for.body ], [ 0, %for.body.preheader ]
				%indvars.next = add i16 %indvars.iv, 1

				%indvars.ext = sext i16 %indvars.iv to i32
				%indvars.access = add i32 %Offset, %indvars.ext

				%arrayidx = getelementptr inbounds i32, i32* %B, i32 %indvars.access
				%0 = load i32, i32* %arrayidx, align 4
				%arrayidx3 = getelementptr inbounds i32, i32* %C, i32 %indvars.access
				%1 = load i32, i32* %arrayidx3, align 4

				%mul4 = add i32 %1, %0

				%arrayidx7 = getelementptr inbounds i32, i32* %A, i32 %indvars.access
				store i32 %mul4, i32* %arrayidx7, align 4

				%exitcond = icmp eq i32 %indvars.ext, %N
				br i1 %exitcond, label %for.end.loopexit, label %for.body

				for.end.loopexit:
				br label %for.end

				for.end:
				ret void
				}
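
`test2` differs from `test0` only in using `sext` instead of `zext`, and the expected checks become a signed range test on `N` (`icmp sgt ... 32767` and `icmp slt ... -32768`). The sketch below shows that every value a sign-extended `i16` can take lies inside that range, so an `N` outside it can never match the exit compare. `sext16`, `outsideI16Range` and `demo` are hypothetical names, and combining the two compares with OR is an assumption since the CHECK lines do not show it.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a raw 16-bit pattern to 32 bits without relying on
// implementation-defined narrowing conversions.
std::int32_t sext16(std::uint16_t raw) {
  return raw < 32768u ? static_cast<std::int32_t>(raw)
                      : static_cast<std::int32_t>(raw) - 65536;
}

// Mirrors the two compares expected in test2, combined with OR.
bool outsideI16Range(std::int32_t n) { return n > 32767 || n < -32768; }

// Usage example (illustrative only): no sign-extended i16 value is ever
// outside the range, so such an N is unreachable for the loop counter.
void demo() {
  for (std::uint32_t raw = 0; raw <= 65535u; ++raw)
    assert(!outsideI16Range(sext16(static_cast<std::uint16_t>(raw))));
  assert(outsideI16Range(32768));
  assert(outsideI16Range(-32769));
}
```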

test/Transforms/LoopVectorize/version-mem-access.ll

	[10 lines not shown]
	; CHECK-LABEL: test			; CHECK-LABEL: test
	define void @test(i32* %A, i64 %AStride,			define void @test(i32* %A, i64 %AStride,
	i32* %B, i32 %BStride,			i32* %B, i32 %BStride,
	i32* %C, i64 %CStride, i32 %N) {			i32* %C, i64 %CStride, i32 %N) {
	entry:			entry:
	%cmp13 = icmp eq i32 %N, 0			%cmp13 = icmp eq i32 %N, 0
	br i1 %cmp13, label %for.end, label %for.body.preheader			br i1 %cmp13, label %for.end, label %for.body.preheader

				; We don't need to check the symbolic stride for B, we can assume instead
				; that {0,+,BStride} will not overflow.

	; CHECK-DAG: icmp ne i64 %AStride, 1			; CHECK-DAG: icmp ne i64 %AStride, 1
	; CHECK-DAG: icmp ne i32 %BStride, 1
	; CHECK-DAG: icmp ne i64 %CStride, 1			; CHECK-DAG: icmp ne i64 %CStride, 1
	; CHECK: or			; CHECK: or
	; CHECK: or
	; CHECK: br			; CHECK: br

	; CHECK: vector.body			; CHECK: vector.body
	; CHECK: load <2 x i32>			; CHECK: load <2 x i32>

	for.body.preheader:			for.body.preheader:
	br label %for.body			br label %for.body

	[19 lines not shown]
	for.end.loopexit:			for.end.loopexit:
	br label %for.end			br label %for.end

	for.end:			for.end:
	ret void			ret void
	}			}

	; We used to crash on this function because we removed the fptosi cast when			; We used to crash on this function because we removed the fptosi cast when
	; replacing the symbolic stride '%conv'.			; replacing the symbolic stride '%conv' (PR18480). However, replacing the
	; PR18480			; symbolic stride is no longer required since we can do an overflow check.

	; CHECK-LABEL: fn1			; CHECK-LABEL: fn1
	; CHECK: load <2 x double>			; CHECK: store <2 x double>

	define void @fn1(double* noalias %x, double* noalias %c, double %a) {			define void @fn1(double* noalias %x, double* noalias %c, double %a) {
	entry:			entry:
	%conv = fptosi double %a to i32			%conv = fptosi double %a to i32
	%cmp8 = icmp sgt i32 %conv, 0			%cmp8 = icmp sgt i32 %conv, 0
	br i1 %cmp8, label %for.body.preheader, label %for.end			br i1 %cmp8, label %for.body.preheader, label %for.end

	for.body.preheader:			for.body.preheader:
	[22 lines not shown]

Revision Contents

Diff 32418

include/llvm/Analysis/LoopAccessAnalysis.h

include/llvm/Analysis/ScalarEvolution.h

include/llvm/Analysis/ScalarEvolutionExpressions.h

include/llvm/Transforms/Utils/LoopVersioning.h

lib/Analysis/LoopAccessAnalysis.cpp

lib/Analysis/ScalarEvolution.cpp

lib/Transforms/Scalar/LoopDistribute.cpp

lib/Transforms/Utils/LoopVersioning.cpp

lib/Transforms/Vectorize/LoopVectorize.cpp

test/Transforms/LoopDistribute/distribute-with-overflows.ll

test/Transforms/LoopVectorize/safegep.ll

test/Transforms/LoopVectorize/scev-overflow-check.ll

test/Transforms/LoopVectorize/version-mem-access.ll