This is an archive of the discontinued LLVM Phabricator instance.

Differential D15412

[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection
Closed, Public

Authored by sbaranga on Dec 10 2015, 3:56 AM.

Details

Reviewers

anemet
mzolotukhin
sanjoy

Commits

rGea63a7f512dc: [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory…
rGa35fadc7c49f: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided…
rL260112: [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memory
rL260085: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided…

Summary

This change adds no wrap SCEV predicates with:

support for runtime checking
support for expression rewriting:
sext({x,+,y}) -> {sext(x),+,sext(y)}
zext({x,+,y}) -> {zext(x),+,sext(y)}

Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.

We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
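
As an illustration of why the increment is sign extended even for zext, here is a minimal standalone sketch in plain C++ with fixed-width integers. It is not LLVM code, and the helper names (narrowAt, wideAt, zextMatches) are made up for this example. It models an 8-bit recurrence and compares its zero-extended values against a 16-bit recurrence with a chosen step.

```cpp
#include <cassert>
#include <cstdint>

// Value of the 8-bit recurrence {start,+,step} at iteration i (wraps mod 2^8).
uint8_t narrowAt(uint8_t start, uint8_t step, unsigned i) {
  return static_cast<uint8_t>(start + i * step);
}

// Value of a 16-bit recurrence {start,+,step} at iteration i (wraps mod 2^16).
uint16_t wideAt(uint16_t start, uint16_t step, unsigned i) {
  return static_cast<uint16_t>(start + i * step);
}

// True if zext({start,+,step}) equals {zext(start),+,wideStep}
// for every iteration i in [0, n].
bool zextMatches(uint8_t start, uint8_t step, uint16_t wideStep, unsigned n) {
  for (unsigned i = 0; i <= n; ++i) {
    uint16_t extended = narrowAt(start, step, i); // zero extension
    if (extended != wideAt(start, wideStep, i))
      return false;
  }
  return true;
}
```

With start = 10 and an 8-bit step of 0xFF (that is, -1), the sign-extended step 0xFFFF tracks the narrow recurrence as it counts down from 10 to 0, while the zero-extended step 0x00FF already diverges at the first iteration (9 versus 265).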

Diff Detail

Repository: rL LLVM

Event Timeline

sbaranga updated this revision to Diff 42410. Dec 10 2015, 3:56 AM

sbaranga retitled this revision from to [SCEV][LAA] Add no overflow SCEV predicates and use use them to improve strided pointer detection.

sbaranga updated this object.

sbaranga added reviewers: anemet, mzolotukhin, sanjoy.

sbaranga added subscribers: hfinkel, jmolloy, rengolin, llvm-commits.

Herald added a subscriber: sanjoy. Dec 10 2015, 3:56 AM

(Reviewed the SCEV bits)

include/llvm/Analysis/ScalarEvolution.h
274 ↗	(On Diff #42410)	Nit: notation is `{a,+,b}`.
287 ↗	(On Diff #42410)	Nit: I'd return an `SCEVAddRecExpr *`.
lib/Analysis/ScalarEvolution.cpp
9605 ↗	(On Diff #42410)	Minor nit: I'd use `auto *OF` here.
9675 ↗	(On Diff #42410)	Please either directly pass in a `SCEVAddRecExpr` or use a `cast<>` instead of a `static_cast<>`.
9743 ↗	(On Diff #42410)	Minor: I'd just put this definition in the header.
9746 ↗	(On Diff #42410)	I don't think you need to `dyn_cast<>` to `const T`. If you're casting from a `const` pointer, you'll automatically get a `const` pointer.
9871 ↗	(On Diff #42410)	Please use `cast<SCEVAddRecExpr>`
lib/Analysis/ScalarEvolutionExpander.cpp
1977 ↗	(On Diff #42410)	There is an `Instruction::getModule`
1999 ↗	(On Diff #42410)	I think you need to always zero extend the exit count. For instance, if you have a loop with an exit count of `i8 255` then `i16 {0x7fff,+,1}` will sign overflow, even though `0x7fff + (1 * (-1))` will not sign overflow in either the `*` or the `+`.
2012 ↗	(On Diff #42410)	Please assert that `AR` is affine. Also, it would be easier to read if you extracted out `const SCEV *Step = AR->getOperand(1)` and used `Step` here instead.
2034 ↗	(On Diff #42410)	I don't think this is needed -- the trip count is interpreted as an unsigned value, so you should just need to check that it is `ule iDstBits -1`.
2057 ↗	(On Diff #42410)	Minor nit: please call this `"sadd"` or `"uadd"` depending on `AddF` (or just call it `"add"`).

This revision now requires changes to proceed. Dec 10 2015, 4:50 PM

sanjoy added inline comments. Dec 10 2015, 11:37 PM

lib/Analysis/ScalarEvolutionExpander.cpp
2059 ↗	(On Diff #42410)	Actually, now that I think of it, I don't think this scheme will work for `nsw`. E.g. if `AR` is `i8 {0,+,1}` and `TripCount` is `i8 255` == `i8 -1` then neither `(i8 1) * (i8 255)` == `i8 -1` nor `0 + (-1)` will sign overflow, while the add recurrence clearly does sign overflow. I think the most obvious way to check for no `nsw` is to check `sext(Start + TripCount * Step)` == `sext(Start) + zext(TripCount) * sext(Step)`, where you extend to 2 * Bitwidth(Start). There could also potentially be a more efficient scheme, but I don't have one with me right now.
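
The widened check described in this comment can be sketched in plain C++ for an 8-bit recurrence extended to 16 bits. This is only an illustration under that assumption, not the expander code, and `noSignedWrap8` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Checks sext(Start + TripCount * Step) == sext(Start) + zext(TripCount) * sext(Step)
// for 8-bit operands, evaluating the right-hand side in 16 bits (2 * bitwidth).
bool noSignedWrap8(int8_t start, int8_t step, uint8_t tripCount) {
  // Narrow side: compute in 8 bits with wraparound, then sign extend.
  uint8_t narrow = static_cast<uint8_t>(
      static_cast<uint8_t>(start) +
      static_cast<uint8_t>(tripCount * static_cast<uint8_t>(step)));
  int16_t lhs = static_cast<int8_t>(narrow);

  // Wide side: extend the operands first, then compute in 16 bits.
  // The trip count is zero extended because it is an unsigned quantity.
  // The result always fits: |255 * 128| + 128 <= 32768.
  int16_t rhs = static_cast<int16_t>(
      static_cast<int16_t>(start) +
      static_cast<int16_t>(tripCount) * static_cast<int16_t>(step));

  return lhs == rhs;
}
```

For the case from the comment, `noSignedWrap8(0, 1, 255)` compares -1 against 255 and reports a wrap, which the per-operation overflow checks would have missed.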

sbaranga added inline comments. Dec 11 2015, 6:16 AM

lib/Analysis/ScalarEvolutionExpander.cpp
2059 ↗	(On Diff #42410)	Thanks for the catch! This is indeed a problem. I'll think about some more alternatives to this.

sbaranga added inline comments. Dec 14 2015, 4:05 AM

include/llvm/Analysis/ScalarEvolution.h
287 ↗	(On Diff #42410)	This is part of the SCEVPredicate interface (and is missing an override), so it has to return a const SCEV *.
lib/Analysis/ScalarEvolutionExpander.cpp
2059 ↗	(On Diff #42410)	We might be able to check this without extending by writing TripCount as MAX_INT + x if it is larger than MAX_INT, and doing the check twice? I don't know if this will be more efficient. This also got me thinking about negative increments, and I think that there is an issue with the nuw flag and negative increments: they will always wrap. We actually want a property that will allow us to sign extend the increment for both the nuw and nsw cases:
zext({a,+,b}) -> {zext(a),+,sext(b)}
sext({a,+,b}) -> {sext(a),+,sext(b)}
We should define separate flags for this property, as the unsigned case doesn't have the same semantics as the nuw flag. I think we should be able to emit checks for this.

Renamed SCEVAddRecOverflowPredicate to SCEVWrapPredicate.

SCEVWrapPredicate now has its own flags for NUW/NSW, with
a slightly different meaning of the NUW flag from SCEV.
These flags allow us to do the transformation:

zext({a,+,b}) -> {zext(a),+,sext(b)}
sext({a,+,b}) -> {sext(a),+,sext(b)}

Updated the predicate checking code to use Sanjoy's
method of checking:

we extend to a large enough type to remove all possible overflows
we check that the final value (without any overflow) is what we would get if we computed the expression and then did the extend.

This should also address the remaining review comments.

lib/Analysis/ScalarEvolution.cpp
9743 ↗	(On Diff #42410)	This is a bit difficult. AR is a SCEVAddRecExpr and not a SCEV, and therefore moving this to the header would require casting.

Further cleanups: this removes all the remaining occurrences of "AddRecOverflow" and replaces them with "Wrap".

Add NoWrapFlags manipulation functions to SCEVWrapPredicate (similar to ScalarEvolution).

Add a getImpliedFlags static method to SCEVWrapPredicate, which deduces the statically
implied flags: the NSW flag can be taken directly from AddRecExprs. The NUW flag can be
transferred only if the increment of the AddRecExpr is constant and positive.
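
The rule described for getImpliedFlags can be sketched with toy types. Everything below (ToyWrapFlags, ToyAddRec, impliedFlags) is a hypothetical stand-in and not the LLVM API; it only follows the wording of this update note, including the "constant and positive" condition for the unsigned flag.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for the predicate flags; these are not the LLVM enums.
enum ToyWrapFlags : unsigned {
  ToyAnyWrap = 0,
  ToyNUW = 1u << 0,
  ToyNSW = 1u << 1
};

// Toy model of an affine AddRec: its own wrap flags plus an optional
// constant step (StepIsConstant == false means the step is symbolic).
struct ToyAddRec {
  bool HasNSW;
  bool HasNUW;
  bool StepIsConstant;
  int64_t Step;
};

// Deduce the flags that hold statically, without any run-time check.
unsigned impliedFlags(const ToyAddRec &AR) {
  unsigned Flags = ToyAnyWrap;
  // The signed flag carries over directly.
  if (AR.HasNSW)
    Flags |= ToyNSW;
  // The unsigned flag carries over only for a constant, positive step,
  // because the predicate treats the step as a signed quantity.
  if (AR.HasNUW && AR.StepIsConstant && AR.Step > 0)
    Flags |= ToyNUW;
  return Flags;
}
```

A recurrence marked nuw with a symbolic or negative step yields no implied unsigned flag in this model, so a predicate would still be needed for it.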

sanjoy requested changes to this revision. Dec 16 2015, 11:11 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

include/llvm/Analysis/ScalarEvolution.h
278 ↗	(On Diff #42848)	I'm still reading through the patch; but if you must introduce a new concept to replace nuw, it should be called something else to avoid confusion.
280 ↗	(On Diff #42848)	Do you mean `zext(a + b) != ...`?
282 ↗	(On Diff #42848)	I'd be a lot more explicit here. What (I think) you're saying is that 0 <= F(a) + F(b) < 2^n where the inequalities and arithmetic above are normal inequalities and arithmetic over integers, and `F` maps an `n` bit tuple to the integer it represents in twos complement. Is that correct?
289 ↗	(On Diff #42848)	Very minor, but why not `NoWrapMask = FlagNUW \| FlagNSW`?
295 ↗	(On Diff #42848)	Minor, but what about doing some sanity checking on `Mask` here?
301 ↗	(On Diff #42848)	Same here and below on `clearFlags`: some minor sanity checking here would be nice.
320 ↗	(On Diff #42848)	`explicit`?
1453 ↗	(On Diff #42848)	Will clients ever try to RAUW / delete `Value`s from under this map? If not, this should be changed to use an `AssertingVH`; otherwise it needs to use a `CallbackVH` like SCEV.
1457 ↗	(On Diff #42848)	Minor: can you please check if it is possible to make this a const reference?
lib/Analysis/ScalarEvolution.cpp
9617 ↗	(On Diff #42848)	Please add a comment on what `Assume` means, and use a more descriptive name if possible.
9741 ↗	(On Diff #42848)	Why not check if `Op->Flags` is a superset of `Flags`?
9744 ↗	(On Diff #42848)	Might also want to return true if `Flags == NSW && AR->hasNSW()`.
9769 ↗	(On Diff #42848)	This doesn't look correct? You probably want something like `Step->getValue()->getValue().isNonNegative()`.
9895 ↗	(On Diff #42848)	Please use `cast<>`
9679 ↗	(On Diff #42730)	Why not just `return P.implies(A);`?
lib/Analysis/ScalarEvolutionExpander.cpp
2007 ↗	(On Diff #42848)	Please use `cast<>`
2009 ↗	(On Diff #42848)	I think you can use `ConstantInt::getFalse(IP->getContext())` here.

This revision now requires changes to proceed. Dec 16 2015, 11:11 PM

Hi Sanjoy,

Sorry for the delayed reply (holidays..). I should have a new version of this up for review soon.

Thanks,
Silviu

include/llvm/Analysis/ScalarEvolution.h
282 ↗	(On Diff #42848)	I think that is almost correct. You would technically need a different function for a, since we regard it as unsigned. So that means 0 <= G(a) + F(b) < 2^n, where F maps to twos complement and G is simple base 2 conversion. Some 16-bit examples:
For a = 0xffff and b = 0xfffe we get G(0xffff) = 2^16 - 1 and F(0xfffe) = -2, so 0 <= 2^16 - 3 < 2^16, which is satisfied.
For a = 0x0 and b = 0xffff we get 0 <= -1 < 2^16, which is false.
For a = 0xffff and b = 0x0010 we get 0 <= 2^16 - 1 + 16 < 2^16, which is false.
Does this make sense to you?
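
The condition 0 <= G(a) + F(b) < 2^n can be checked directly for the 16-bit examples in this comment. The sketch below is plain C++ and not part of the patch; `incrementNUSW16` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if 0 <= Unsigned(a) + Signed(b) < 2^16, where a is read as an
// unsigned 16-bit value and b as a 16-bit two's complement value.
bool incrementNUSW16(uint16_t a, uint16_t b) {
  const int64_t unsignedA = a;                      // G(a): base 2 value
  const int64_t signedB = static_cast<int16_t>(b);  // F(b): two's complement
  const int64_t sum = unsignedA + signedB;          // exact integer sum
  return 0 <= sum && sum < (int64_t{1} << 16);
}
```

The sum is computed in 64 bits so that it cannot wrap, which makes the comparison against the bounds an ordinary integer comparison.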
1453 ↗	(On Diff #42848)	I think AssertingVH would make the most sense here.

Rebase the change and address the current issues raised by Sanjoy in the last review round.

sbaranga marked 16 inline comments as done. Jan 13 2016, 5:03 AM

sbaranga added inline comments.

include/llvm/Analysis/ScalarEvolution.h
288 ↗	(On Diff #44738)	There's no reason why we couldn't do that (apart from consistency with SCEV::NoWrapFlags). I have no preference for this.

sanjoy requested changes to this revision. Jan 17 2016, 12:47 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

include/llvm/Analysis/ScalarEvolution.h
286 ↗	(On Diff #44738)	Do you mean `G(a) + F(b)` here? Might be helpful to name these as `Signed` and `Unsigned` instead of `F` and `G`. Might be useful to explicitly note that `SCEVWrapPredicate:: IncrementNUW` is not commutative, and you're interpreting `{a,+,b}` as `(((a + b) + b) + b) ...` for the purposes of `SCEVWrapPredicate`.
291 ↗	(On Diff #44738)	Please don't call this nuw (here, and elsewhere in comments) -- I don't think having two definitions of nuw float around in the codebase is a good idea. Pretty much any name other than nuw is fine.
300 ↗	(On Diff #44738)	What you have here is fine, but I'd have done `assert((Flags & IncrementNoWrapMask) == Flags)` instead.
lib/Analysis/ScalarEvolution.cpp
9635 ↗	(On Diff #44738)	As mentioned earlier, please don't call this "nuw".
9733 ↗	(On Diff #44738)	Very minor, but why not `return Op && Op->AR == AR && setFlags(Flags, Op->Flags) == Flags;` instead of the early exit above?
9741 ↗	(On Diff #44738)	Minor and optional, but why not have an early return here as `return IFlags == IncrementNSW;`?
9749 ↗	(On Diff #44738)	As mentioned earlier, please don't call this `<nuw>`.
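
The flag-superset test suggested for `implies` can be sketched with plain unsigned flags. The names below (ToyNUSW, ToyNSSW, ToyMask, toySetFlags, flagsImply) are assumptions for illustration and are not the committed code.

```cpp
#include <cassert>

// Toy flag values mirroring a two-bit no-wrap mask; not the LLVM enum.
enum ToyIncrementFlags : unsigned {
  ToyAnyWrap = 0,
  ToyNUSW = 1u << 0,
  ToyNSSW = 1u << 1,
  ToyMask = (1u << 2) - 1
};

// Returns Flags with OnFlags added, after validating both arguments.
// Note the parentheses: == binds tighter than &.
unsigned toySetFlags(unsigned Flags, unsigned OnFlags) {
  assert((Flags & ToyMask) == Flags && "Invalid flags value!");
  assert((OnFlags & ToyMask) == OnFlags && "Invalid flags value!");
  return Flags | OnFlags;
}

// Mine implies Other when adding Other's flags to Mine changes nothing,
// that is, when Mine is a superset of Other.
bool flagsImply(unsigned Mine, unsigned Other) {
  return toySetFlags(Mine, Other) == Mine;
}
```

Under this model a predicate carrying both flags implies one carrying only the signed flag, but not the other way around.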

This revision now requires changes to proceed. Jan 17 2016, 12:47 PM

sbaranga added inline comments. Jan 18 2016, 7:33 AM

include/llvm/Analysis/ScalarEvolution.h
286 ↗	(On Diff #44738)	Good catch, it should have been G(a) + F(b). It is indeed not commutative. I'll update the text to reflect this.
300 ↗	(On Diff #44738)	That seems a better way to do it, thanks.
lib/Analysis/ScalarEvolution.cpp
9741 ↗	(On Diff #44738)	This is a corner case, but that wouldn't work if Flags == IncrementAnyWrap (in which case the predicate would always be true).

Rebased and addressed the comments from the last review.

Renamed the new NUW/NSW flags to NUSW and NSSW (No Unsigned/Signed With
Signed increment Wrap) and update the documentation for these flags
according to the last review round.

Update some missed comments to reflect the previous renaming (NUW -> NUSW and NSW -> NSSW).

After further thinking, we should support RAUW/delete for values in FlagsMap.

This change replaces the DenseMap with a ValueMap (with the default configuration)
which will move the flags to the new value in case of RAUW or remove the entry
in case of delete.

We now also need to explicitly specify the copy constructor for
PredicatedScalarEvolution since ValueMap deletes the default copy constructor.

SCEV bits lgtm with minor nits inline.

include/llvm/Analysis/ScalarEvolution.h
297 ↗	(On Diff #45696)	Do you mean "is already non-commutative"?
303 ↗	(On Diff #45696)	This is much better, thanks! Might be helpful to clarify that NSSW is basically nsw, but NUSW is a hybrid specific to `SCEVWrapPredicate`.
lib/Analysis/ScalarEvolutionExpander.cpp
2011 ↗	(On Diff #45696)	I'd suggest not generating the redundant or instruction, since it is easy to avoid here.

Hi Silviu,

Just started to look at this (sorry about the delay) and on first impression, I am surprised by the lack of tests. Can't we cover this with LAA tests?

Adam

In D15412#336209, @anemet wrote:

Hi Silviu,

Just started to look at this (sorry about the delay) and on first impression, I am surprised by the lack of tests. Can't we cover this with LAA tests?

Adam

Hi Adam,

I've been having some trouble testing this with just LAA (we were bailing out when checking if we can do runtime checks).
However, I can now trigger this by adding noalias on the input pointers, so that problem has been solved and I can trigger all code paths with LAA.

I'll also need your change for loop versioning testing (http://reviews.llvm.org/D16612) in order to test the runtime checks, so I'll wait for that to go in before updating this.

Thanks,
Silviu

Add test cases where LAA can use the SCEV predicates to both transform non-AddRecExprs to AddRecExprs and directly add no overflow flags for pointers.
The test cases check both the result of LAA and that LoopVectorize adds run-time checks for the SCEV predicates.

anemet added inline comments. Feb 2 2016, 9:51 PM

lib/Analysis/LoopAccessAnalysis.cpp
812 ↗	(On Diff #46635)	Do you want to try to coerce the expression into affine/non-wrapping here too? This may be a more common case for C/C++ where we are always inbounds for GEP. Just asking, it does not have to and probably shouldn't be part of this patch.
913–916 ↗	(On Diff #46635)	No {}. Also, any particular reason you have a dbg message above but not here?
test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
7 ↗	(On Diff #46635)	Isn't this initialized to 0 in the IR below?
26–27 ↗	(On Diff #46635)	We should include the SCEV for %mul_ext here in a comment between the two. Probably both with and without the added flag on %mul. This is more critical later when more things are going on. This way we'd have the whole reasoning in front of us.
75–78 ↗	(On Diff #46635)	Don't you mean 2 * index as the expression for this part? The GEP is covered later.

Address comments from Adam.

sbaranga added inline comments. Feb 3 2016, 5:18 AM

lib/Analysis/LoopAccessAnalysis.cpp
812 ↗	(On Diff #46770)	I think that's already covered by the current change, because isStridedPtr will add the predicates if needed (and I think this is only called from isStridedPtr). This should be better because isStridedPtr might return true even if isNoWrapAddRec returns false (for example in the case of unit strided pointers), and we try to only add predicates when needed. If we ever end up calling isNoWrapAddRec from somewhere else than isStridedPtr, it would be a good idea to allow coercing here as well.
913–916 ↗	(On Diff #46770)	Thanks, I intended to add the debug message here as well.
test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
27–28 ↗	(On Diff #46770)	Added text with the SCEV expressions for %mul_ext before and after the predicates for all tests.

anemet added inline comments. Feb 4 2016, 6:17 PM

lib/Analysis/LoopAccessAnalysis.cpp
812 ↗	(On Diff #46770)	OK, that's possible, quite a few things are happening here implicitly. But then please add a test for it. I.e. gep with inbounds but the index not marked with nowrap.
test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
99 ↗	(On Diff #46770)	Long line, a few more further down.

Added a test case with an inbounds GEP, with the GEP index
having a non AddRec SCEV expression. This covers more
accurately the case where the input comes from C/C++.

Fixed the long lines in the test.

I've also noticed that the debug messages in isStridedPtr
would get confusing because they would use the SCEV expression
of the pointer before being coerced into an AddRecExpr. This
also changes the debug messages to use the updated expressions.

Herald added a subscriber: mzolotukhin. Feb 5 2016, 4:12 AM

sbaranga marked an inline comment as done. Feb 5 2016, 4:28 AM

sbaranga added inline comments.

lib/Analysis/LoopAccessAnalysis.cpp
812 ↗	(On Diff #47009)	I've added a test for this. FWIW I've looked in detail at what's going on here and I believe there is a chance to improve things: in the added test, OBO->hasNoSignedWrap() returns false, and we therefore bail out. I think we can deduce the NSW flag from the added SCEV predicates. This would reduce the number of required SCEV predicates by 1.

LGTM too.

test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
2 ↗	(On Diff #47009)	Are you planning to change these to use the new loop-versioning pass in a follow-on patch?
249 ↗	(On Diff #47009)	Not AddRecExpr or rather not nsw/nuw?

sbaranga marked an inline comment as done. Feb 8 2016, 2:06 AM

sbaranga added inline comments.

test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll
2 ↗	(On Diff #47009)	Sure, I'll do that in a follow-up change.

sbaranga retitled this revision from [SCEV][LAA] Add no overflow SCEV predicates and use use them to improve strided pointer detection to [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection. Feb 8 2016, 2:29 AM

sbaranga updated this object.

sbaranga edited edge metadata.

Fix a comment in one of the tests.

Closed by commit rL260085: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided… (authored by sbaranga). Feb 8 2016, 2:50 AM

This revision was automatically updated to reflect the committed changes.

Committed in r260085. Thanks!

-Silviu

Revision Contents

Path	Size
llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h	4 lines
llvm/trunk/include/llvm/Analysis/ScalarEvolution.h	124 lines
llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h	9 lines
llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp	61 lines
llvm/trunk/lib/Analysis/ScalarEvolution.cpp	197 lines
llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp	68 lines
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp	2 lines
llvm/trunk/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll	292 lines
llvm/trunk/test/Transforms/LoopVectorize/same-base-access.ll	10 lines

Diff 47171

llvm/trunk/include/llvm/Analysis/LoopAccessAnalysis.h

[650 lines not shown; hunk context: const SCEV *replaceSymbolicStrideSCEV(PredicatedScalarEvolution &PSE,]
const ValueToValueMap &PtrToStride,		const ValueToValueMap &PtrToStride,
Value *Ptr, Value *OrigPtr = nullptr);		Value *Ptr, Value *OrigPtr = nullptr);

/// \brief Check the stride of the pointer and ensure that it does not wrap in		/// \brief Check the stride of the pointer and ensure that it does not wrap in
/// the address space, assuming \p Preds is true.		/// the address space, assuming \p Preds is true.
///		///
/// If necessary this method will version the stride of the pointer according		/// If necessary this method will version the stride of the pointer according
/// to \p PtrToStride and therefore add a new predicate to \p Preds.		/// to \p PtrToStride and therefore add a new predicate to \p Preds.
		/// The \p Assume parameter indicates if we are allowed to make additional
		/// run-time assumptions.
int isStridedPtr(PredicatedScalarEvolution &PSE, Value *Ptr, const Loop *Lp,		int isStridedPtr(PredicatedScalarEvolution &PSE, Value *Ptr, const Loop *Lp,
const ValueToValueMap &StridesMap);		const ValueToValueMap &StridesMap, bool Assume = false);

/// \brief Returns true if the memory operations \p A and \p B are consecutive.		/// \brief Returns true if the memory operations \p A and \p B are consecutive.
/// This is a simple API that does not depend on the analysis pass.		/// This is a simple API that does not depend on the analysis pass.
bool isConsecutiveAccess(Value *A, Value *B, const DataLayout &DL,		bool isConsecutiveAccess(Value *A, Value *B, const DataLayout &DL,
ScalarEvolution &SE, bool CheckType = true);		ScalarEvolution &SE, bool CheckType = true);

/// \brief This analysis provides dependence information for the memory accesses		/// \brief This analysis provides dependence information for the memory accesses
/// of a loop.		/// of a loop.
[58 lines not shown]

llvm/trunk/include/llvm/Analysis/ScalarEvolution.h

[25 lines not shown]
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/ConstantRange.h"		#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/PassManager.h"		#include "llvm/IR/PassManager.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
		#include "llvm/IR/ValueMap.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/Allocator.h"		#include "llvm/Support/Allocator.h"
#include "llvm/Support/DataTypes.h"		#include "llvm/Support/DataTypes.h"
#include <map>		#include <map>

namespace llvm {		namespace llvm {
class APInt;		class APInt;
class AssumptionCache;		class AssumptionCache;
[132 lines not shown; hunk context: namespace llvm {]
class SCEVPredicate : public FoldingSetNode {		class SCEVPredicate : public FoldingSetNode {
friend struct FoldingSetTrait<SCEVPredicate>;		friend struct FoldingSetTrait<SCEVPredicate>;

/// A reference to an Interned FoldingSetNodeID for this node. The		/// A reference to an Interned FoldingSetNodeID for this node. The
/// ScalarEvolution's BumpPtrAllocator holds the data.		/// ScalarEvolution's BumpPtrAllocator holds the data.
FoldingSetNodeIDRef FastID;		FoldingSetNodeIDRef FastID;

public:		public:
enum SCEVPredicateKind { P_Union, P_Equal };		enum SCEVPredicateKind { P_Union, P_Equal, P_Wrap };

protected:		protected:
SCEVPredicateKind Kind;		SCEVPredicateKind Kind;
~SCEVPredicate() = default;		~SCEVPredicate() = default;
SCEVPredicate(const SCEVPredicate&) = default;		SCEVPredicate(const SCEVPredicate&) = default;
SCEVPredicate &operator=(const SCEVPredicate&) = default;		SCEVPredicate &operator=(const SCEVPredicate&) = default;

public:		public:
[73 lines not shown; hunk context: public:]
const SCEVConstant *getRHS() const { return RHS; }		const SCEVConstant *getRHS() const { return RHS; }

/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static inline bool classof(const SCEVPredicate *P) {		static inline bool classof(const SCEVPredicate *P) {
return P->getKind() == P_Equal;		return P->getKind() == P_Equal;
}		}
};		};

		/// SCEVWrapPredicate - This class represents an assumption
		/// made on an AddRec expression. Given an affine AddRec expression
		/// {a,+,b}, we assume that it has the nssw or nusw flags (defined
		/// below).
		class SCEVWrapPredicate final : public SCEVPredicate {
		public:
		/// Similar to SCEV::NoWrapFlags, but with slightly different semantics
		/// for FlagNUSW. The increment is considered to be signed, and a + b
		/// (where b is the increment) is considered to wrap if:
		/// zext(a + b) != zext(a) + sext(b)
		///
		/// If Signed is a function that takes an n-bit tuple and maps to the
		/// integer domain as the tuples value interpreted as twos complement,
		/// and Unsigned a function that takes an n-bit tuple and maps to the
		/// integer domain as as the base two value of input tuple, then a + b
		/// has IncrementNUSW iff:
		///
		/// 0 <= Unsigned(a) + Signed(b) < 2^n
		///
		/// The IncrementNSSW flag has identical semantics with SCEV::FlagNSW.
		///
		/// Note that the IncrementNUSW flag is not commutative: if base + inc
		/// has IncrementNUSW, then inc + base doesn't neccessarily have this
		/// property. The reason for this is that this is used for sign/zero
		/// extending affine AddRec SCEV expressions when a SCEVWrapPredicate is
		/// assumed. A {base,+,inc} expression is already non-commutative with
		/// regards to base and inc, since it is interpreted as:
		/// (((base + inc) + inc) + inc) ...
		enum IncrementWrapFlags {
		IncrementAnyWrap = 0, // No guarantee.
		IncrementNUSW = (1 << 0), // No unsigned with signed increment wrap.
		IncrementNSSW = (1 << 1), // No signed with signed increment wrap
		// (equivalent with SCEV::NSW)
		IncrementNoWrapMask = (1 << 2) - 1
		};

		/// Convenient IncrementWrapFlags manipulation methods.
		static SCEVWrapPredicate::IncrementWrapFlags LLVM_ATTRIBUTE_UNUSED_RESULT
		clearFlags(SCEVWrapPredicate::IncrementWrapFlags Flags,
		SCEVWrapPredicate::IncrementWrapFlags OffFlags) {
		assert((Flags & IncrementNoWrapMask) == Flags && "Invalid flags value!");
		assert((OffFlags & IncrementNoWrapMask) == OffFlags &&
		"Invalid flags value!");
		return (SCEVWrapPredicate::IncrementWrapFlags)(Flags & ~OffFlags);
		}

		static SCEVWrapPredicate::IncrementWrapFlags LLVM_ATTRIBUTE_UNUSED_RESULT
		maskFlags(SCEVWrapPredicate::IncrementWrapFlags Flags, int Mask) {
		assert((Flags & IncrementNoWrapMask) == Flags && "Invalid flags value!");
		assert((Mask & IncrementNoWrapMask) == Mask && "Invalid mask value!");

		return (SCEVWrapPredicate::IncrementWrapFlags)(Flags & Mask);
		}

		static SCEVWrapPredicate::IncrementWrapFlags LLVM_ATTRIBUTE_UNUSED_RESULT
		setFlags(SCEVWrapPredicate::IncrementWrapFlags Flags,
		SCEVWrapPredicate::IncrementWrapFlags OnFlags) {
		assert((Flags & IncrementNoWrapMask) == Flags && "Invalid flags value!");
		assert((OnFlags & IncrementNoWrapMask) == OnFlags &&
		"Invalid flags value!");

		return (SCEVWrapPredicate::IncrementWrapFlags)(Flags \| OnFlags);
		}

		/// \brief Returns the set of SCEVWrapPredicate no wrap flags implied
		/// by a SCEVAddRecExpr.
		static SCEVWrapPredicate::IncrementWrapFlags
		getImpliedFlags(const SCEVAddRecExpr *AR, ScalarEvolution &SE);

		private:
		const SCEVAddRecExpr *AR;
		IncrementWrapFlags Flags;

		public:
		explicit SCEVWrapPredicate(const FoldingSetNodeIDRef ID,
		const SCEVAddRecExpr *AR,
		IncrementWrapFlags Flags);

		/// \brief Returns the set assumed no overflow flags.
		IncrementWrapFlags getFlags() const { return Flags; }
		/// Implementation of the SCEVPredicate interface
		const SCEV *getExpr() const override;
		bool implies(const SCEVPredicate *N) const override;
		void print(raw_ostream &OS, unsigned Depth = 0) const override;
		bool isAlwaysTrue() const override;

		/// Methods for support type inquiry through isa, cast, and dyn_cast:
		static inline bool classof(const SCEVPredicate *P) {
		return P->getKind() == P_Wrap;
		}
		};

/// SCEVUnionPredicate - This class represents a composition of other		/// SCEVUnionPredicate - This class represents a composition of other
/// SCEV predicates, and is the class that most clients will interact with.		/// SCEV predicates, and is the class that most clients will interact with.
/// This is equivalent to a logical "AND" of all the predicates in the union.		/// This is equivalent to a logical "AND" of all the predicates in the union.
class SCEVUnionPredicate final : public SCEVPredicate {		class SCEVUnionPredicate final : public SCEVPredicate {
private:		private:
typedef DenseMap<const SCEV *, SmallVector<const SCEVPredicate *, 4>>		typedef DenseMap<const SCEV *, SmallVector<const SCEVPredicate *, 4>>
PredicateMap;		PredicateMap;

[995 lines not shown; hunk context: public:]
/// operating on.		/// operating on.
const DataLayout &getDataLayout() const {		const DataLayout &getDataLayout() const {
return F.getParent()->getDataLayout();		return F.getParent()->getDataLayout();
}		}

const SCEVPredicate *getEqualPredicate(const SCEVUnknown *LHS,		const SCEVPredicate *getEqualPredicate(const SCEVUnknown *LHS,
const SCEVConstant *RHS);		const SCEVConstant *RHS);

		const SCEVPredicate *
		getWrapPredicate(const SCEVAddRecExpr *AR,
		SCEVWrapPredicate::IncrementWrapFlags AddedFlags);

/// Re-writes the SCEV according to the Predicates in \p Preds.		/// Re-writes the SCEV according to the Predicates in \p Preds.
const SCEV *rewriteUsingPredicate(const SCEV *Scev, SCEVUnionPredicate &A);		const SCEV *rewriteUsingPredicate(const SCEV *Scev, const Loop *L,
		SCEVUnionPredicate &A);
		/// Tries to convert the \p Scev expression to an AddRec expression,
		/// adding additional predicates to \p Preds as required.
		const SCEV *convertSCEVToAddRecWithPredicates(const SCEV *Scev,
		const Loop *L,
		SCEVUnionPredicate &Preds);

private:		private:
/// Compute the backedge taken count knowing the interval difference, the		/// Compute the backedge taken count knowing the interval difference, the
/// stride and presence of the equality in the comparison.		/// stride and presence of the equality in the comparison.
const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,		const SCEV *computeBECount(const SCEV *Delta, const SCEV *Stride,
bool Equality);		bool Equality);

/// Verify if an linear IV with positive stride can overflow when in a		/// Verify if an linear IV with positive stride can overflow when in a
[74 lines not shown]
/// expression for a single Value is consistent across two different		/// expression for a single Value is consistent across two different
/// getSCEV calls. This means that, for example, once we've obtained		/// getSCEV calls. This means that, for example, once we've obtained
/// an AddRec expression for a certain value through expression		/// an AddRec expression for a certain value through expression
/// rewriting, we will continue to get an AddRec expression for that		/// rewriting, we will continue to get an AddRec expression for that
/// Value.		/// Value.
/// - lowers the number of expression rewrites.		/// - lowers the number of expression rewrites.
class PredicatedScalarEvolution {		class PredicatedScalarEvolution {
public:		public:
PredicatedScalarEvolution(ScalarEvolution &SE);		PredicatedScalarEvolution(ScalarEvolution &SE, Loop &L);
const SCEVUnionPredicate &getUnionPredicate() const;		const SCEVUnionPredicate &getUnionPredicate() const;
/// \brief Returns the SCEV expression of V, in the context of the current		/// \brief Returns the SCEV expression of V, in the context of the current
/// SCEV predicate.		/// SCEV predicate.
/// The order of transformations applied on the expression of V returned		/// The order of transformations applied on the expression of V returned
/// by ScalarEvolution is guaranteed to be preserved, even when adding new		/// by ScalarEvolution is guaranteed to be preserved, even when adding new
/// predicates.		/// predicates.
const SCEV *getSCEV(Value *V);		const SCEV *getSCEV(Value *V);
/// \brief Adds a new predicate.		/// \brief Adds a new predicate.
void addPredicate(const SCEVPredicate &Pred);		void addPredicate(const SCEVPredicate &Pred);
		/// \brief Attempts to produce an AddRecExpr for V by adding additional
		/// SCEV predicates.
		const SCEV *getAsAddRec(Value *V);
		/// \brief Proves that V doesn't overflow by adding SCEV predicate.
		void setNoOverflow(Value *V, SCEVWrapPredicate::IncrementWrapFlags Flags);
		/// \brief Returns true if we've proved that V doesn't wrap by means of a
		/// SCEV predicate.
		bool hasNoOverflow(Value *V, SCEVWrapPredicate::IncrementWrapFlags Flags);
/// \brief Returns the ScalarEvolution analysis used.		/// \brief Returns the ScalarEvolution analysis used.
ScalarEvolution *getSE() const { return &SE; }		ScalarEvolution *getSE() const { return &SE; }
		/// We need to explicitly define the copy constructor because of FlagsMap.
		PredicatedScalarEvolution(const PredicatedScalarEvolution&);
private:		private:
/// \brief Increments the version number of the predicate.		/// \brief Increments the version number of the predicate.
/// This needs to be called every time the SCEV predicate changes.		/// This needs to be called every time the SCEV predicate changes.
void updateGeneration();		void updateGeneration();
/// Holds a SCEV and the version number of the SCEV predicate used to		/// Holds a SCEV and the version number of the SCEV predicate used to
/// perform the rewrite of the expression.		/// perform the rewrite of the expression.
typedef std::pair<unsigned, const SCEV *> RewriteEntry;		typedef std::pair<unsigned, const SCEV *> RewriteEntry;
/// Maps a SCEV to the rewrite result of that SCEV at a certain version		/// Maps a SCEV to the rewrite result of that SCEV at a certain version
/// number. If this number doesn't match the current Generation, we will		/// number. If this number doesn't match the current Generation, we will
/// need to do a rewrite. To preserve the transformation order of previous		/// need to do a rewrite. To preserve the transformation order of previous
/// rewrites, we will rewrite the previous result instead of the original		/// rewrites, we will rewrite the previous result instead of the original
/// SCEV.		/// SCEV.
DenseMap<const SCEV *, RewriteEntry> RewriteMap;		DenseMap<const SCEV *, RewriteEntry> RewriteMap;
		/// Records what NoWrap flags we've added to a Value *.
		ValueMap<Value *, SCEVWrapPredicate::IncrementWrapFlags> FlagsMap;
/// The ScalarEvolution analysis.		/// The ScalarEvolution analysis.
ScalarEvolution &SE;		ScalarEvolution &SE;
		/// The analyzed Loop.
		const Loop &L;
/// The SCEVPredicate that forms our context. We will rewrite all		/// The SCEVPredicate that forms our context. We will rewrite all
/// expressions assuming that this predicate is true.		/// expressions assuming that this predicate is true.
SCEVUnionPredicate Preds;		SCEVUnionPredicate Preds;
/// Marks the version of the SCEV predicate used. When rewriting a SCEV		/// Marks the version of the SCEV predicate used. When rewriting a SCEV
/// expression we mark it with the version of the predicate. We use this to		/// expression we mark it with the version of the predicate. We use this to
/// figure out if the predicate has changed from the last rewrite of the		/// figure out if the predicate has changed from the last rewrite of the
/// SCEV. If so, we need to perform a new rewrite.		/// SCEV. If so, we need to perform a new rewrite.
unsigned Generation;		unsigned Generation;
};		};
}		}

#endif		#endif
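
The `Generation` counter in `PredicatedScalarEvolution` acts as a version stamp on cached rewrites: an entry is reused only when its stamp matches the current generation, and a stale entry is rewritten starting from its previous result rather than from the original expression. The following is a minimal standalone sketch of that idea and not the LLVM implementation. `GenerationCache`, its string keys, and the `Rewriter` callback are hypothetical stand-ins for `RewriteMap`, `const SCEV *`, and `rewriteUsingPredicate`; the counter wrap-around handling done by `updateGeneration` is left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy model of a generation-stamped rewrite cache.
class GenerationCache {
public:
  using Rewriter = std::function<std::string(const std::string &)>;

  explicit GenerationCache(Rewriter R) : Rewrite(std::move(R)) {}

  // Returns the rewritten form of Key under the current generation.
  const std::string &get(const std::string &Key) {
    auto It = Map.find(Key);
    // Fresh entry: the stamp matches, so no rewrite is needed.
    if (It != Map.end() && It->second.first == Generation)
      return It->second.second;

    // Stale entry: rewrite the previous result to preserve the order of
    // earlier transformations. Missing entry: rewrite the key itself.
    const std::string Input = (It != Map.end()) ? It->second.second : Key;
    ++RewriteCount;
    auto &Entry = Map[Key];
    Entry = {Generation, Rewrite(Input)};
    return Entry.second;
  }

  // Must be called whenever the predicate set changes.
  void predicateChanged() { ++Generation; }

  unsigned rewrites() const { return RewriteCount; }

private:
  Rewriter Rewrite;
  std::map<std::string, std::pair<unsigned, std::string>> Map;
  unsigned Generation = 0;
  unsigned RewriteCount = 0;
};
```

With a rewriter that appends a tick mark, two lookups of the same key in one generation cost a single rewrite, and a lookup after `predicateChanged()` rewrites the cached result once more.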

llvm/trunk/include/llvm/Analysis/ScalarEvolutionExpander.h

[156 lines not shown]
/// predicate is false and 1 otherwise.		/// predicate is false and 1 otherwise.
Value *expandCodeForPredicate(const SCEVPredicate *Pred, Instruction *Loc);		Value *expandCodeForPredicate(const SCEVPredicate *Pred, Instruction *Loc);

/// \brief A specialized variant of expandCodeForPredicate, handling the		/// \brief A specialized variant of expandCodeForPredicate, handling the
/// case when we are expanding code for a SCEVEqualPredicate.		/// case when we are expanding code for a SCEVEqualPredicate.
Value *expandEqualPredicate(const SCEVEqualPredicate *Pred,		Value *expandEqualPredicate(const SCEVEqualPredicate *Pred,
Instruction *Loc);		Instruction *Loc);

		/// \brief Generates code that evaluates if the \p AR expression will
		/// overflow.
		Value *generateOverflowCheck(const SCEVAddRecExpr *AR, Instruction *Loc,
		bool Signed);

		/// \brief A specialized variant of expandCodeForPredicate, handling the
		/// case when we are expanding code for a SCEVWrapPredicate.
		Value *expandWrapPredicate(const SCEVWrapPredicate *P, Instruction *Loc);

/// \brief A specialized variant of expandCodeForPredicate, handling the		/// \brief A specialized variant of expandCodeForPredicate, handling the
/// case when we are expanding code for a SCEVUnionPredicate.		/// case when we are expanding code for a SCEVUnionPredicate.
Value *expandUnionPredicate(const SCEVUnionPredicate *Pred,		Value *expandUnionPredicate(const SCEVUnionPredicate *Pred,
Instruction *Loc);		Instruction *Loc);

/// \brief Set the current IV increment loop and position.		/// \brief Set the current IV increment loop and position.
void setIVIncInsertPos(const Loop *L, Instruction *Pos) {		void setIVIncInsertPos(const Loop *L, Instruction *Pos) {
assert(!CanonicalMode &&		assert(!CanonicalMode &&
[140 lines not shown]

llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp

[767 lines not shown; enclosing context: static bool isInBoundsGep(Value *Ptr) {]
if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Ptr))		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Ptr))
return GEP->isInBounds();		return GEP->isInBounds();
return false;		return false;
}		}

/// \brief Return true if an AddRec pointer \p Ptr is unsigned non-wrapping,		/// \brief Return true if an AddRec pointer \p Ptr is unsigned non-wrapping,
/// i.e. monotonically increasing/decreasing.		/// i.e. monotonically increasing/decreasing.
static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,		static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,
ScalarEvolution *SE, const Loop *L) {		PredicatedScalarEvolution &PSE, const Loop *L) {
// FIXME: This should probably only return true for NUW.		// FIXME: This should probably only return true for NUW.
if (AR->getNoWrapFlags(SCEV::NoWrapMask))		if (AR->getNoWrapFlags(SCEV::NoWrapMask))
return true;		return true;

// Scalar evolution does not propagate the non-wrapping flags to values that		// Scalar evolution does not propagate the non-wrapping flags to values that
// are derived from a non-wrapping induction variable because non-wrapping		// are derived from a non-wrapping induction variable because non-wrapping
// could be flow-sensitive.		// could be flow-sensitive.
//		//
[19 lines not shown; enclosing context: static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,]

// The index in GEP is signed. It is non-wrapping if it's derived from a NSW		// The index in GEP is signed. It is non-wrapping if it's derived from a NSW
// AddRec using a NSW operation.		// AddRec using a NSW operation.
if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(NonConstIndex))		if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(NonConstIndex))
if (OBO->hasNoSignedWrap() &&		if (OBO->hasNoSignedWrap() &&
// Assume a constant for the other operand so that the AddRec can be		// Assume a constant for the other operand so that the AddRec can be
// easily found.		// easily found.
isa<ConstantInt>(OBO->getOperand(1))) {		isa<ConstantInt>(OBO->getOperand(1))) {
auto *OpScev = SE->getSCEV(OBO->getOperand(0));		auto *OpScev = PSE.getSCEV(OBO->getOperand(0));

if (auto *OpAR = dyn_cast<SCEVAddRecExpr>(OpScev))		if (auto *OpAR = dyn_cast<SCEVAddRecExpr>(OpScev))
return OpAR->getLoop() == L && OpAR->getNoWrapFlags(SCEV::FlagNSW);		return OpAR->getLoop() == L && OpAR->getNoWrapFlags(SCEV::FlagNSW);
}		}

return false;		return false;
}		}

/// \brief Check whether the access through \p Ptr has a constant stride.		/// \brief Check whether the access through \p Ptr has a constant stride.
int llvm::isStridedPtr(PredicatedScalarEvolution &PSE, Value *Ptr,		int llvm::isStridedPtr(PredicatedScalarEvolution &PSE, Value *Ptr,
const Loop *Lp, const ValueToValueMap &StridesMap) {		const Loop *Lp, const ValueToValueMap &StridesMap,
		bool Assume) {
Type *Ty = Ptr->getType();		Type *Ty = Ptr->getType();
assert(Ty->isPointerTy() && "Unexpected non-ptr");		assert(Ty->isPointerTy() && "Unexpected non-ptr");

// Make sure that the pointer does not point to aggregate types.		// Make sure that the pointer does not point to aggregate types.
auto *PtrTy = cast<PointerType>(Ty);		auto *PtrTy = cast<PointerType>(Ty);
if (PtrTy->getElementType()->isAggregateType()) {		if (PtrTy->getElementType()->isAggregateType()) {
DEBUG(dbgs() << "LAA: Bad stride - Not a pointer to a scalar type"		DEBUG(dbgs() << "LAA: Bad stride - Not a pointer to a scalar type" << *Ptr
<< *Ptr << "\n");		<< "\n");
return 0;		return 0;
}		}

const SCEV *PtrScev = replaceSymbolicStrideSCEV(PSE, StridesMap, Ptr);		const SCEV *PtrScev = replaceSymbolicStrideSCEV(PSE, StridesMap, Ptr);

const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
		if (Assume && !AR)
		AR = dyn_cast<SCEVAddRecExpr>(PSE.getAsAddRec(Ptr));

if (!AR) {		if (!AR) {
DEBUG(dbgs() << "LAA: Bad stride - Not an AddRecExpr pointer "		DEBUG(dbgs() << "LAA: Bad stride - Not an AddRecExpr pointer " << *Ptr
<< *Ptr << " SCEV: " << *PtrScev << "\n");		<< " SCEV: " << *PtrScev << "\n");
return 0;		return 0;
}		}

// The access function must stride over the innermost loop.		// The access function must stride over the innermost loop.
if (Lp != AR->getLoop()) {		if (Lp != AR->getLoop()) {
DEBUG(dbgs() << "LAA: Bad stride - Not striding over innermost loop " <<		DEBUG(dbgs() << "LAA: Bad stride - Not striding over innermost loop " <<
*Ptr << " SCEV: " << *PtrScev << "\n");		*Ptr << " SCEV: " << *AR << "\n");
return 0;		return 0;
}		}

// The address calculation must not wrap. Otherwise, a dependence could be		// The address calculation must not wrap. Otherwise, a dependence could be
// inverted.		// inverted.
// An inbounds getelementptr that is an AddRec with a unit stride		// An inbounds getelementptr that is an AddRec with a unit stride
// cannot wrap per definition. The unit stride requirement is checked later.		// cannot wrap per definition. The unit stride requirement is checked later.
// A getelementptr without an inbounds attribute and unit stride would have		// A getelementptr without an inbounds attribute and unit stride would have
// to access the pointer value "0" which is undefined behavior in address		// to access the pointer value "0" which is undefined behavior in address
// space 0, therefore we can also vectorize this case.		// space 0, therefore we can also vectorize this case.
bool IsInBoundsGEP = isInBoundsGep(Ptr);		bool IsInBoundsGEP = isInBoundsGep(Ptr);
bool IsNoWrapAddRec = isNoWrapAddRec(Ptr, AR, PSE.getSE(), Lp);		bool IsNoWrapAddRec =
		PSE.hasNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW) ||
		isNoWrapAddRec(Ptr, AR, PSE, Lp);
bool IsInAddressSpaceZero = PtrTy->getAddressSpace() == 0;		bool IsInAddressSpaceZero = PtrTy->getAddressSpace() == 0;
if (!IsNoWrapAddRec && !IsInBoundsGEP && !IsInAddressSpaceZero) {		if (!IsNoWrapAddRec && !IsInBoundsGEP && !IsInAddressSpaceZero) {
		if (Assume) {
		PSE.setNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW);
		IsNoWrapAddRec = true;
		DEBUG(dbgs() << "LAA: Pointer may wrap in the address space:\n"
		<< "LAA: Pointer: " << *Ptr << "\n"
		<< "LAA: SCEV: " << *AR << "\n"
		<< "LAA: Added an overflow assumption\n");
		} else {
DEBUG(dbgs() << "LAA: Bad stride - Pointer may wrap in the address space "		DEBUG(dbgs() << "LAA: Bad stride - Pointer may wrap in the address space "
<< *Ptr << " SCEV: " << *PtrScev << "\n");		<< *Ptr << " SCEV: " << *AR << "\n");
return 0;		return 0;
}		}
		}

// Check the step is constant.		// Check the step is constant.
const SCEV *Step = AR->getStepRecurrence(*PSE.getSE());		const SCEV *Step = AR->getStepRecurrence(*PSE.getSE());

// Calculate the pointer stride and check if it is constant.		// Calculate the pointer stride and check if it is constant.
const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);		const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);
if (!C) {		if (!C) {
DEBUG(dbgs() << "LAA: Bad stride - Not a constant strided " << *Ptr <<		DEBUG(dbgs() << "LAA: Bad stride - Not a constant strided " << *Ptr <<
" SCEV: " << *PtrScev << "\n");		" SCEV: " << *AR << "\n");
return 0;		return 0;
}		}

auto &DL = Lp->getHeader()->getModule()->getDataLayout();		auto &DL = Lp->getHeader()->getModule()->getDataLayout();
int64_t Size = DL.getTypeAllocSize(PtrTy->getElementType());		int64_t Size = DL.getTypeAllocSize(PtrTy->getElementType());
const APInt &APStepVal = C->getAPInt();		const APInt &APStepVal = C->getAPInt();

// Huge step value - give up.		// Huge step value - give up.
if (APStepVal.getBitWidth() > 64)		if (APStepVal.getBitWidth() > 64)
return 0;		return 0;

int64_t StepVal = APStepVal.getSExtValue();		int64_t StepVal = APStepVal.getSExtValue();

// Strided access.		// Strided access.
int64_t Stride = StepVal / Size;		int64_t Stride = StepVal / Size;
int64_t Rem = StepVal % Size;		int64_t Rem = StepVal % Size;
if (Rem)		if (Rem)
return 0;		return 0;

// If the SCEV could wrap but we have an inbounds gep with a unit stride we		// If the SCEV could wrap but we have an inbounds gep with a unit stride we
// know we can't "wrap around the address space". In case of address space		// know we can't "wrap around the address space". In case of address space
// zero we know that this won't happen without triggering undefined behavior.		// zero we know that this won't happen without triggering undefined behavior.
if (!IsNoWrapAddRec && (IsInBoundsGEP || IsInAddressSpaceZero) &&		if (!IsNoWrapAddRec && (IsInBoundsGEP || IsInAddressSpaceZero) &&
Stride != 1 && Stride != -1)		Stride != 1 && Stride != -1) {
		if (Assume) {
		// We can avoid this case by adding a run-time check.
		DEBUG(dbgs() << "LAA: Non unit strided pointer which is not either "
		<< "inbounds or in address space 0 may wrap:\n"
		<< "LAA: Pointer: " << *Ptr << "\n"
		<< "LAA: SCEV: " << *AR << "\n"
		<< "LAA: Added an overflow assumption\n");
		PSE.setNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW);
		} else
return 0;		return 0;
		}

return Stride;		return Stride;
}		}
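
The tail of `isStridedPtr` converts the constant byte step of the AddRec into a stride counted in elements, and gives up when the step is not a whole multiple of the element size. A minimal standalone sketch of just that arithmetic follows. `strideInElements` and `isUnitStride` are hypothetical helper names that do not appear in the patch, and the guard against a non-positive element size is an added assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the stride in elements, or 0 when the byte
// step is not a whole multiple of the element size (the "give up" result).
int64_t strideInElements(int64_t StepBytes, int64_t ElemSizeBytes) {
  if (ElemSizeBytes <= 0) // Assumption: reject degenerate sizes.
    return 0;
  if (StepBytes % ElemSizeBytes != 0)
    return 0;
  return StepBytes / ElemSizeBytes;
}

// Hypothetical helper: unit strides (forward or backward) are the ones the
// inbounds / address-space-0 argument can cover without a run-time check.
bool isUnitStride(int64_t Stride) { return Stride == 1 || Stride == -1; }
```

For a 4-byte element, a step of 8 bytes yields a stride of 2, a step of -4 yields -1, and a step of 6 is rejected with 0.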

/// Take the pointer operand from the Load/Store instruction.		/// Take the pointer operand from the Load/Store instruction.
/// Returns NULL if this is not a valid Load/Store instruction.		/// Returns NULL if this is not a valid Load/Store instruction.
static Value *getPointerOperand(Value *I) {		static Value *getPointerOperand(Value *I) {
if (LoadInst *LI = dyn_cast<LoadInst>(I))		if (LoadInst *LI = dyn_cast<LoadInst>(I))
[210 lines not shown; enclosing context: MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,]
// We cannot check pointers in different address spaces.		// We cannot check pointers in different address spaces.
if (APtr->getType()->getPointerAddressSpace() !=		if (APtr->getType()->getPointerAddressSpace() !=
BPtr->getType()->getPointerAddressSpace())		BPtr->getType()->getPointerAddressSpace())
return Dependence::Unknown;		return Dependence::Unknown;

const SCEV *AScev = replaceSymbolicStrideSCEV(PSE, Strides, APtr);		const SCEV *AScev = replaceSymbolicStrideSCEV(PSE, Strides, APtr);
const SCEV *BScev = replaceSymbolicStrideSCEV(PSE, Strides, BPtr);		const SCEV *BScev = replaceSymbolicStrideSCEV(PSE, Strides, BPtr);

int StrideAPtr = isStridedPtr(PSE, APtr, InnermostLoop, Strides);		int StrideAPtr = isStridedPtr(PSE, APtr, InnermostLoop, Strides, true);
int StrideBPtr = isStridedPtr(PSE, BPtr, InnermostLoop, Strides);		int StrideBPtr = isStridedPtr(PSE, BPtr, InnermostLoop, Strides, true);

const SCEV *Src = AScev;		const SCEV *Src = AScev;
const SCEV *Sink = BScev;		const SCEV *Sink = BScev;

// If the induction step is negative we have to invert source and sink of the		// If the induction step is negative we have to invert source and sink of the
// dependence.		// dependence.
if (StrideAPtr < 0) {		if (StrideAPtr < 0) {
//Src = BScev;		//Src = BScev;
[683 lines not shown; enclosing context: LoopAccessInfo::addRuntimeChecks(Instruction *Loc) const {]
return addRuntimeChecks(Loc, PtrRtChecking.getChecks());		return addRuntimeChecks(Loc, PtrRtChecking.getChecks());
}		}

LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,		LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,
const DataLayout &DL,		const DataLayout &DL,
const TargetLibraryInfo *TLI, AliasAnalysis *AA,		const TargetLibraryInfo *TLI, AliasAnalysis *AA,
DominatorTree *DT, LoopInfo *LI,		DominatorTree *DT, LoopInfo *LI,
const ValueToValueMap &Strides)		const ValueToValueMap &Strides)
: PSE(*SE), PtrRtChecking(SE), DepChecker(PSE, L), TheLoop(L), DL(DL),		: PSE(*SE, *L), PtrRtChecking(SE), DepChecker(PSE, L), TheLoop(L), DL(DL),
TLI(TLI), AA(AA), DT(DT), LI(LI), NumLoads(0), NumStores(0),		TLI(TLI), AA(AA), DT(DT), LI(LI), NumLoads(0), NumStores(0),
MaxSafeDepDistBytes(-1U), CanVecMem(false),		MaxSafeDepDistBytes(-1U), CanVecMem(false),
StoreToLoopInvariantAddress(false) {		StoreToLoopInvariantAddress(false) {
if (canAnalyzeLoop())		if (canAnalyzeLoop())
analyzeLoop(Strides);		analyzeLoop(Strides);
}		}

void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {		void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {
[100 lines not shown]

llvm/trunk/lib/Analysis/ScalarEvolution.cpp


[9,621 lines not shown; enclosing context: ScalarEvolution::getEqualPredicate(const SCEVUnknown *LHS,]
if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))		if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))
return S;		return S;
SCEVEqualPredicate *Eq = new (SCEVAllocator)		SCEVEqualPredicate *Eq = new (SCEVAllocator)
SCEVEqualPredicate(ID.Intern(SCEVAllocator), LHS, RHS);		SCEVEqualPredicate(ID.Intern(SCEVAllocator), LHS, RHS);
UniquePreds.InsertNode(Eq, IP);		UniquePreds.InsertNode(Eq, IP);
return Eq;		return Eq;
}		}

		const SCEVPredicate *ScalarEvolution::getWrapPredicate(
		const SCEVAddRecExpr *AR,
		SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
		FoldingSetNodeID ID;
		// Unique this node based on the arguments
		ID.AddInteger(SCEVPredicate::P_Wrap);
		ID.AddPointer(AR);
		ID.AddInteger(AddedFlags);
		void *IP = nullptr;
		if (const auto *S = UniquePreds.FindNodeOrInsertPos(ID, IP))
		return S;
		auto *OF = new (SCEVAllocator)
		SCEVWrapPredicate(ID.Intern(SCEVAllocator), AR, AddedFlags);
		UniquePreds.InsertNode(OF, IP);
		return OF;
		}

namespace {		namespace {

class SCEVPredicateRewriter : public SCEVRewriteVisitor<SCEVPredicateRewriter> {		class SCEVPredicateRewriter : public SCEVRewriteVisitor<SCEVPredicateRewriter> {
public:		public:
static const SCEV *rewrite(const SCEV *Scev, ScalarEvolution &SE,		// Rewrites Scev in the context of a loop L and the predicate A.
SCEVUnionPredicate &A) {		// If Assume is true, rewrite is free to add further predicates to A
SCEVPredicateRewriter Rewriter(SE, A);		// such that the result will be an AddRecExpr.
		static const SCEV *rewrite(const SCEV *Scev, const Loop *L,
		ScalarEvolution &SE, SCEVUnionPredicate &A,
		bool Assume) {
		SCEVPredicateRewriter Rewriter(L, SE, A, Assume);
return Rewriter.visit(Scev);		return Rewriter.visit(Scev);
}		}

SCEVPredicateRewriter(ScalarEvolution &SE, SCEVUnionPredicate &P)		SCEVPredicateRewriter(const Loop *L, ScalarEvolution &SE,
: SCEVRewriteVisitor(SE), P(P) {}		SCEVUnionPredicate &P, bool Assume)
		: SCEVRewriteVisitor(SE), P(P), L(L), Assume(Assume) {}

const SCEV *visitUnknown(const SCEVUnknown *Expr) {		const SCEV *visitUnknown(const SCEVUnknown *Expr) {
auto ExprPreds = P.getPredicatesForExpr(Expr);		auto ExprPreds = P.getPredicatesForExpr(Expr);
for (auto *Pred : ExprPreds)		for (auto *Pred : ExprPreds)
if (const auto *IPred = dyn_cast<const SCEVEqualPredicate>(Pred))		if (const auto *IPred = dyn_cast<const SCEVEqualPredicate>(Pred))
if (IPred->getLHS() == Expr)		if (IPred->getLHS() == Expr)
return IPred->getRHS();		return IPred->getRHS();

return Expr;		return Expr;
}		}

		const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
		const SCEV *Operand = visit(Expr->getOperand());
		const SCEVAddRecExpr *AR = dyn_cast<const SCEVAddRecExpr>(Operand);
		if (AR && AR->getLoop() == L && AR->isAffine()) {
		// This couldn't be folded because the operand didn't have the nuw
		// flag. Add the nusw flag as an assumption that we could make.
		const SCEV *Step = AR->getStepRecurrence(SE);
		Type *Ty = Expr->getType();
		if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNUSW))
		return SE.getAddRecExpr(SE.getZeroExtendExpr(AR->getStart(), Ty),
		SE.getSignExtendExpr(Step, Ty), L,
		AR->getNoWrapFlags());
		}
		return SE.getZeroExtendExpr(Operand, Expr->getType());
		}

		const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
		const SCEV *Operand = visit(Expr->getOperand());
		const SCEVAddRecExpr *AR = dyn_cast<const SCEVAddRecExpr>(Operand);
		if (AR && AR->getLoop() == L && AR->isAffine()) {
		// This couldn't be folded because the operand didn't have the nsw
		// flag. Add the nssw flag as an assumption that we could make.
		const SCEV *Step = AR->getStepRecurrence(SE);
		Type *Ty = Expr->getType();
		if (addOverflowAssumption(AR, SCEVWrapPredicate::IncrementNSSW))
		return SE.getAddRecExpr(SE.getSignExtendExpr(AR->getStart(), Ty),
		SE.getSignExtendExpr(Step, Ty), L,
		AR->getNoWrapFlags());
		}
		return SE.getSignExtendExpr(Operand, Expr->getType());
		}

private:		private:
		bool addOverflowAssumption(const SCEVAddRecExpr *AR,
		SCEVWrapPredicate::IncrementWrapFlags AddedFlags) {
		auto *A = SE.getWrapPredicate(AR, AddedFlags);
		if (!Assume) {
		// Check if we've already made this assumption.
		if (P.implies(A))
		return true;
		return false;
		}
		P.add(A);
		return true;
		}

SCEVUnionPredicate &P;		SCEVUnionPredicate &P;
		const Loop *L;
		bool Assume;
};		};
} // end anonymous namespace		} // end anonymous namespace

const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *Scev,		const SCEV *ScalarEvolution::rewriteUsingPredicate(const SCEV *Scev,
		const Loop *L,
SCEVUnionPredicate &Preds) {		SCEVUnionPredicate &Preds) {
return SCEVPredicateRewriter::rewrite(Scev, *this, Preds);		return SCEVPredicateRewriter::rewrite(Scev, L, *this, Preds, false);
		}

		const SCEV *ScalarEvolution::convertSCEVToAddRecWithPredicates(
		const SCEV *Scev, const Loop *L, SCEVUnionPredicate &Preds) {
		return SCEVPredicateRewriter::rewrite(Scev, L, *this, Preds, true);
}		}
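
The zext rewrite in `visitZeroExtendExpr` sign-extends the step even though the start is zero-extended, so that a small negative step keeps its meaning in the wider type. A minimal standalone sketch with 8-bit values widened to 16 bits illustrates why. All function names here are hypothetical and the bit widths are chosen only for illustration; this is arithmetic on plain integers, not SCEV code.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit pattern to 16 bits without relying on signed casts.
uint16_t sext8to16(uint8_t V) {
  return (V & 0x80u) ? static_cast<uint16_t>(0xFF00u | V)
                     : static_cast<uint16_t>(V);
}

// zext({Start,+,Step}) : evaluate in 8 bits (wrapping), then zero-extend.
uint16_t zextOfNarrowAddRec(uint8_t Start, uint8_t Step, unsigned Iter) {
  uint8_t Narrow = static_cast<uint8_t>(Start + Iter * Step);
  return Narrow;
}

// {zext(Start),+,sext(Step)} : evaluate directly in 16 bits.
uint16_t wideAddRecSextStep(uint8_t Start, uint8_t Step, unsigned Iter) {
  return static_cast<uint16_t>(Start + Iter * sext8to16(Step));
}

// {zext(Start),+,zext(Step)} : the variant the patch deliberately avoids.
uint16_t wideAddRecZextStep(uint8_t Start, uint8_t Step, unsigned Iter) {
  return static_cast<uint16_t>(Start + Iter * static_cast<uint16_t>(Step));
}
```

With a start of 10 and a step of 0xFF (that is, -1), iteration 3 gives 7 for both the narrow form and the sign-extended-step form, while the zero-extended step gives 775. With a start of 1 the narrow sequence wraps below zero at iteration 2 and the two forms diverge, which is the situation the `nusw` predicate has to exclude at run time.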

/// SCEV predicates		/// SCEV predicates
SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,		SCEVPredicate::SCEVPredicate(const FoldingSetNodeIDRef ID,
SCEVPredicateKind Kind)		SCEVPredicateKind Kind)
: FastID(ID), Kind(Kind) {}		: FastID(ID), Kind(Kind) {}

SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,		SCEVEqualPredicate::SCEVEqualPredicate(const FoldingSetNodeIDRef ID,
[13 lines not shown]
bool SCEVEqualPredicate::isAlwaysTrue() const { return false; }		bool SCEVEqualPredicate::isAlwaysTrue() const { return false; }

const SCEV *SCEVEqualPredicate::getExpr() const { return LHS; }		const SCEV *SCEVEqualPredicate::getExpr() const { return LHS; }

void SCEVEqualPredicate::print(raw_ostream &OS, unsigned Depth) const {		void SCEVEqualPredicate::print(raw_ostream &OS, unsigned Depth) const {
OS.indent(Depth) << "Equal predicate: " << *LHS << " == " << *RHS << "\n";		OS.indent(Depth) << "Equal predicate: " << *LHS << " == " << *RHS << "\n";
}		}

		SCEVWrapPredicate::SCEVWrapPredicate(const FoldingSetNodeIDRef ID,
		const SCEVAddRecExpr *AR,
		IncrementWrapFlags Flags)
		: SCEVPredicate(ID, P_Wrap), AR(AR), Flags(Flags) {}

		const SCEV *SCEVWrapPredicate::getExpr() const { return AR; }

		bool SCEVWrapPredicate::implies(const SCEVPredicate *N) const {
		const auto *Op = dyn_cast<SCEVWrapPredicate>(N);

		return Op && Op->AR == AR && setFlags(Flags, Op->Flags) == Flags;
		}

		bool SCEVWrapPredicate::isAlwaysTrue() const {
		SCEV::NoWrapFlags ScevFlags = AR->getNoWrapFlags();
		IncrementWrapFlags IFlags = Flags;

		if (ScalarEvolution::setFlags(ScevFlags, SCEV::FlagNSW) == ScevFlags)
		IFlags = clearFlags(IFlags, IncrementNSSW);

		return IFlags == IncrementAnyWrap;
		}

		void SCEVWrapPredicate::print(raw_ostream &OS, unsigned Depth) const {
		OS.indent(Depth) << *getExpr() << " Added Flags: ";
		if (SCEVWrapPredicate::IncrementNUSW & getFlags())
		OS << "<nusw>";
		if (SCEVWrapPredicate::IncrementNSSW & getFlags())
		OS << "<nssw>";
		OS << "\n";
		}

		SCEVWrapPredicate::IncrementWrapFlags
		SCEVWrapPredicate::getImpliedFlags(const SCEVAddRecExpr *AR,
		ScalarEvolution &SE) {
		IncrementWrapFlags ImpliedFlags = IncrementAnyWrap;
		SCEV::NoWrapFlags StaticFlags = AR->getNoWrapFlags();

		// We can safely transfer the NSW flag as NSSW.
		if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNSW) == StaticFlags)
		ImpliedFlags = IncrementNSSW;

		if (ScalarEvolution::setFlags(StaticFlags, SCEV::FlagNUW) == StaticFlags) {
		// If the increment is positive, the SCEV NUW flag will also imply the
		// WrapPredicate NUSW flag.
		if (const auto *Step = dyn_cast<SCEVConstant>(AR->getStepRecurrence(SE)))
		if (Step->getValue()->getValue().isNonNegative())
		ImpliedFlags = setFlags(ImpliedFlags, IncrementNUSW);
		}

		return ImpliedFlags;
		}
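
The wrap-flag bookkeeping above reduces to set operations on a small bitmask: `setFlags` is a union, `clearFlags` is a set difference, `implies` is a subset test, and `hasNoOverflow` succeeds when nothing is left after removing the statically implied and the previously recorded flags. A minimal standalone sketch of those operations follows. The enumerator names and values are assumptions made for illustration, since the enum definition is not part of the lines shown here.

```cpp
#include <cassert>

// Assumed encoding: one bit per flag, zero meaning "no guarantee".
enum WrapFlags : unsigned { AnyWrap = 0, NUSW = 1u << 0, NSSW = 1u << 1 };

// Union of two flag sets.
constexpr WrapFlags setFlags(WrapFlags A, WrapFlags B) {
  return static_cast<WrapFlags>(A | B);
}

// Flags in A that are not in B.
constexpr WrapFlags clearFlags(WrapFlags A, WrapFlags B) {
  return static_cast<WrapFlags>(A & ~B);
}

// Subset test: adding Want to Have changes nothing.
constexpr bool covers(WrapFlags Have, WrapFlags Want) {
  return setFlags(Have, Want) == Have;
}

// Mirrors the shape of hasNoOverflow: drop what is implied statically and
// what was recorded earlier, then check that nothing remains to be proved.
constexpr bool nothingLeftToProve(WrapFlags Requested, WrapFlags Implied,
                                  WrapFlags Recorded) {
  return clearFlags(clearFlags(Requested, Implied), Recorded) == AnyWrap;
}
```

A request for `NUSW` is satisfied either by a static implication or by an earlier recorded assumption; a request for both flags with only `NSSW` implied still leaves `NUSW` to prove.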

/// Union predicates don't get cached so create a dummy set ID for it.		/// Union predicates don't get cached so create a dummy set ID for it.
SCEVUnionPredicate::SCEVUnionPredicate()		SCEVUnionPredicate::SCEVUnionPredicate()
: SCEVPredicate(FoldingSetNodeIDRef(nullptr, 0), P_Union) {}		: SCEVPredicate(FoldingSetNodeIDRef(nullptr, 0), P_Union) {}

bool SCEVUnionPredicate::isAlwaysTrue() const {		bool SCEVUnionPredicate::isAlwaysTrue() const {
return all_of(Preds,		return all_of(Preds,
[](const SCEVPredicate *I) { return I->isAlwaysTrue(); });		[](const SCEVPredicate *I) { return I->isAlwaysTrue(); });
}		}
[40 lines not shown; enclosing context: void SCEVUnionPredicate::add(const SCEVPredicate *N) {]
const SCEV *Key = N->getExpr();		const SCEV *Key = N->getExpr();
assert(Key && "Only SCEVUnionPredicate doesn't have an "		assert(Key && "Only SCEVUnionPredicate doesn't have an "
" associated expression!");		" associated expression!");

SCEVToPreds[Key].push_back(N);		SCEVToPreds[Key].push_back(N);
Preds.push_back(N);		Preds.push_back(N);
}		}

PredicatedScalarEvolution::PredicatedScalarEvolution(ScalarEvolution &SE)		PredicatedScalarEvolution::PredicatedScalarEvolution(ScalarEvolution &SE,
: SE(SE), Generation(0) {}		Loop &L)
		: SE(SE), L(L), Generation(0) {}

const SCEV *PredicatedScalarEvolution::getSCEV(Value *V) {		const SCEV *PredicatedScalarEvolution::getSCEV(Value *V) {
const SCEV *Expr = SE.getSCEV(V);		const SCEV *Expr = SE.getSCEV(V);
RewriteEntry &Entry = RewriteMap[Expr];		RewriteEntry &Entry = RewriteMap[Expr];

// If we already have an entry and the version matches, return it.		// If we already have an entry and the version matches, return it.
if (Entry.second && Generation == Entry.first)		if (Entry.second && Generation == Entry.first)
return Entry.second;		return Entry.second;

// We found an entry but it's stale. Rewrite the stale entry		// We found an entry but it's stale. Rewrite the stale entry
// according to the current predicate.		// according to the current predicate.
if (Entry.second)		if (Entry.second)
Expr = Entry.second;		Expr = Entry.second;

const SCEV *NewSCEV = SE.rewriteUsingPredicate(Expr, Preds);		const SCEV *NewSCEV = SE.rewriteUsingPredicate(Expr, &L, Preds);
Entry = {Generation, NewSCEV};		Entry = {Generation, NewSCEV};

return NewSCEV;		return NewSCEV;
}		}

void PredicatedScalarEvolution::addPredicate(const SCEVPredicate &Pred) {		void PredicatedScalarEvolution::addPredicate(const SCEVPredicate &Pred) {
if (Preds.implies(&Pred))		if (Preds.implies(&Pred))
return;		return;
Preds.add(&Pred);		Preds.add(&Pred);
updateGeneration();		updateGeneration();
}		}

const SCEVUnionPredicate &PredicatedScalarEvolution::getUnionPredicate() const {		const SCEVUnionPredicate &PredicatedScalarEvolution::getUnionPredicate() const {
return Preds;		return Preds;
}		}

void PredicatedScalarEvolution::updateGeneration() {		void PredicatedScalarEvolution::updateGeneration() {
// If the generation number wrapped recompute everything.		// If the generation number wrapped recompute everything.
if (++Generation == 0) {		if (++Generation == 0) {
for (auto &II : RewriteMap) {		for (auto &II : RewriteMap) {
const SCEV *Rewritten = II.second.second;		const SCEV *Rewritten = II.second.second;
II.second = {Generation, SE.rewriteUsingPredicate(Rewritten, Preds)};		II.second = {Generation, SE.rewriteUsingPredicate(Rewritten, &L, Preds)};
		}
}		}
}		}

		void PredicatedScalarEvolution::setNoOverflow(
		Value *V, SCEVWrapPredicate::IncrementWrapFlags Flags) {
		const SCEV *Expr = getSCEV(V);
		const auto *AR = cast<SCEVAddRecExpr>(Expr);

		auto ImpliedFlags = SCEVWrapPredicate::getImpliedFlags(AR, SE);

		// Clear the statically implied flags.
		Flags = SCEVWrapPredicate::clearFlags(Flags, ImpliedFlags);
		addPredicate(*SE.getWrapPredicate(AR, Flags));

		auto II = FlagsMap.insert({V, Flags});
		if (!II.second)
		II.first->second = SCEVWrapPredicate::setFlags(Flags, II.first->second);
		}

		bool PredicatedScalarEvolution::hasNoOverflow(
		Value *V, SCEVWrapPredicate::IncrementWrapFlags Flags) {
		const SCEV *Expr = getSCEV(V);
		const auto *AR = cast<SCEVAddRecExpr>(Expr);

		Flags = SCEVWrapPredicate::clearFlags(
		Flags, SCEVWrapPredicate::getImpliedFlags(AR, SE));

		auto II = FlagsMap.find(V);

		if (II != FlagsMap.end())
		Flags = SCEVWrapPredicate::clearFlags(Flags, II->second);

		return Flags == SCEVWrapPredicate::IncrementAnyWrap;
		}

		const SCEV *PredicatedScalarEvolution::getAsAddRec(Value *V) {
		const SCEV *Expr = this->getSCEV(V);
		const SCEV *New = SE.convertSCEVToAddRecWithPredicates(Expr, &L, Preds);
		updateGeneration();
		RewriteMap[SE.getSCEV(V)] = {Generation, New};
		return New;
		}

		PredicatedScalarEvolution::
		PredicatedScalarEvolution(const PredicatedScalarEvolution &Init) :
		RewriteMap(Init.RewriteMap), SE(Init.SE), L(Init.L), Preds(Init.Preds) {
		for (auto I = Init.FlagsMap.begin(), E = Init.FlagsMap.end(); I != E; ++I)
		FlagsMap.insert(*I);
}		}

llvm/trunk/lib/Analysis/ScalarEvolutionExpander.cpp

	[1,965 lines not shown]
	Value *SCEVExpander::expandCodeForPredicate(const SCEVPredicate *Pred,			Value *SCEVExpander::expandCodeForPredicate(const SCEVPredicate *Pred,
	Instruction *IP) {			Instruction *IP) {
	assert(IP);			assert(IP);
	switch (Pred->getKind()) {			switch (Pred->getKind()) {
	case SCEVPredicate::P_Union:			case SCEVPredicate::P_Union:
	return expandUnionPredicate(cast<SCEVUnionPredicate>(Pred), IP);			return expandUnionPredicate(cast<SCEVUnionPredicate>(Pred), IP);
	case SCEVPredicate::P_Equal:			case SCEVPredicate::P_Equal:
	return expandEqualPredicate(cast<SCEVEqualPredicate>(Pred), IP);			return expandEqualPredicate(cast<SCEVEqualPredicate>(Pred), IP);
				case SCEVPredicate::P_Wrap: {
				auto *AddRecPred = cast<SCEVWrapPredicate>(Pred);
				return expandWrapPredicate(AddRecPred, IP);
				}
	}			}
	llvm_unreachable("Unknown SCEV predicate type");			llvm_unreachable("Unknown SCEV predicate type");
	}			}

	Value *SCEVExpander::expandEqualPredicate(const SCEVEqualPredicate *Pred,			Value *SCEVExpander::expandEqualPredicate(const SCEVEqualPredicate *Pred,
	Instruction *IP) {			Instruction *IP) {
	Value *Expr0 = expandCodeFor(Pred->getLHS(), Pred->getLHS()->getType(), IP);			Value *Expr0 = expandCodeFor(Pred->getLHS(), Pred->getLHS()->getType(), IP);
	Value *Expr1 = expandCodeFor(Pred->getRHS(), Pred->getRHS()->getType(), IP);			Value *Expr1 = expandCodeFor(Pred->getRHS(), Pred->getRHS()->getType(), IP);

	Builder.SetInsertPoint(IP);			Builder.SetInsertPoint(IP);
	auto *I = Builder.CreateICmpNE(Expr0, Expr1, "ident.check");			auto *I = Builder.CreateICmpNE(Expr0, Expr1, "ident.check");
	return I;			return I;
	}			}

				Value *SCEVExpander::generateOverflowCheck(const SCEVAddRecExpr *AR,
				Instruction *Loc, bool Signed) {
				assert(AR->isAffine() && "Cannot generate RT check for "
				"non-affine expression");

				const SCEV *ExitCount = SE.getBackedgeTakenCount(AR->getLoop());
				const SCEV *Step = AR->getStepRecurrence(SE);
				const SCEV *Start = AR->getStart();

				unsigned DstBits = SE.getTypeSizeInBits(AR->getType());
				unsigned SrcBits = SE.getTypeSizeInBits(ExitCount->getType());
				unsigned MaxBits = 2 * std::max(DstBits, SrcBits);

				auto *TripCount = SE.getTruncateOrZeroExtend(ExitCount, AR->getType());
				IntegerType *MaxTy = IntegerType::get(Loc->getContext(), MaxBits);

				assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");

				const auto *ExtendedTripCount = SE.getZeroExtendExpr(ExitCount, MaxTy);
				const auto *ExtendedStep = SE.getSignExtendExpr(Step, MaxTy);
				const auto *ExtendedStart = Signed ? SE.getSignExtendExpr(Start, MaxTy)
				: SE.getZeroExtendExpr(Start, MaxTy);

				const SCEV *End = SE.getAddExpr(Start, SE.getMulExpr(TripCount, Step));
				const SCEV *RHS = Signed ? SE.getSignExtendExpr(End, MaxTy)
				: SE.getZeroExtendExpr(End, MaxTy);

				const SCEV *LHS = SE.getAddExpr(
				ExtendedStart, SE.getMulExpr(ExtendedTripCount, ExtendedStep));

				// Do all SCEV expansions now.
				Value *LHSVal = expandCodeFor(LHS, MaxTy, Loc);
				Value *RHSVal = expandCodeFor(RHS, MaxTy, Loc);

				Builder.SetInsertPoint(Loc);

				return Builder.CreateICmp(ICmpInst::ICMP_NE, RHSVal, LHSVal);
				}
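
The comparison built by generateOverflowCheck can be illustrated outside of LLVM with plain integers. The following is a minimal sketch, assuming a 32-bit recurrence and a 32-bit trip count, so the doubled width is 64 bits; the function name `mayWrapUnsigned` is hypothetical and not part of the patch. Like the code above, it compares only the final value of the recurrence, once computed in the narrow type and then zero-extended, and once computed entirely in the wide type with a sign-extended step.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM: returns true when the unsigned
// 32-bit recurrence {Start,+,Step} would wrap after TripCount steps.
bool mayWrapUnsigned(uint32_t Start, int32_t Step, uint32_t TripCount) {
  // Narrow computation: wraps modulo 2^32, then gets zero-extended.
  uint32_t NarrowEnd = Start + TripCount * static_cast<uint32_t>(Step);
  uint64_t RHS = NarrowEnd;

  // Wide computation: zext(Start) + zext(TripCount) * sext(Step).
  uint64_t WideStep = static_cast<uint64_t>(static_cast<int64_t>(Step));
  uint64_t LHS =
      static_cast<uint64_t>(Start) + static_cast<uint64_t>(TripCount) * WideStep;

  // The two results differ exactly when the narrow computation wrapped.
  return LHS != RHS;
}
```

For example, `mayWrapUnsigned(0, 2, 100)` is false, while a trip count of 2^31 with the same start and step makes the narrow end value wrap to zero and the check fire.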

				Value *SCEVExpander::expandWrapPredicate(const SCEVWrapPredicate *Pred,
				Instruction *IP) {
				const auto *A = cast<SCEVAddRecExpr>(Pred->getExpr());
				Value *NSSWCheck = nullptr, *NUSWCheck = nullptr;

				// Add a check for NUSW
				if (Pred->getFlags() & SCEVWrapPredicate::IncrementNUSW)
				NUSWCheck = generateOverflowCheck(A, IP, false);

				// Add a check for NSSW
				if (Pred->getFlags() & SCEVWrapPredicate::IncrementNSSW)
				NSSWCheck = generateOverflowCheck(A, IP, true);

				if (NUSWCheck && NSSWCheck)
				return Builder.CreateOr(NUSWCheck, NSSWCheck);

				if (NUSWCheck)
				return NUSWCheck;

				if (NSSWCheck)
				return NSSWCheck;

				return ConstantInt::getFalse(IP->getContext());
				}

	Value *SCEVExpander::expandUnionPredicate(const SCEVUnionPredicate *Union,			Value *SCEVExpander::expandUnionPredicate(const SCEVUnionPredicate *Union,
	Instruction *IP) {			Instruction *IP) {
	auto *BoolType = IntegerType::get(IP->getContext(), 1);			auto *BoolType = IntegerType::get(IP->getContext(), 1);
	Value *Check = ConstantInt::getNullValue(BoolType);			Value *Check = ConstantInt::getNullValue(BoolType);

	// Loop over all checks in this set.			// Loop over all checks in this set.
	for (auto Pred : Union->getPredicates()) {			for (auto Pred : Union->getPredicates()) {
	auto *NextCheck = expandCodeForPredicate(Pred, IP);			auto *NextCheck = expandCodeForPredicate(Pred, IP);

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp

if (TC > 0u && TC < TinyTripCountVectorThreshold) {		if (TC > 0u && TC < TinyTripCountVectorThreshold) {
DEBUG(dbgs() << "\n");		DEBUG(dbgs() << "\n");
emitAnalysisDiag(F, L, Hints, VectorizationReport()		emitAnalysisDiag(F, L, Hints, VectorizationReport()
<< "vectorization is not beneficial "		<< "vectorization is not beneficial "
"and is not explicitly forced");		"and is not explicitly forced");
return false;		return false;
}		}
}		}

PredicatedScalarEvolution PSE(*SE);		PredicatedScalarEvolution PSE(*SE, *L);

// Check if it is legal to vectorize the loop.		// Check if it is legal to vectorize the loop.
LoopVectorizationRequirements Requirements;		LoopVectorizationRequirements Requirements;
LoopVectorizationLegality LVL(L, PSE, DT, TLI, AA, F, TTI, LAA,		LoopVectorizationLegality LVL(L, PSE, DT, TLI, AA, F, TTI, LAA,
&Requirements, &Hints);		&Requirements, &Hints);
if (!LVL.canVectorize()) {		if (!LVL.canVectorize()) {
DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");		DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");
emitMissedWarning(F, L, Hints);		emitMissedWarning(F, L, Hints);

llvm/trunk/test/Analysis/LoopAccessAnalysis/wrapping-pointer-versioning.ll

				; RUN: opt -basicaa -loop-accesses -analyze < %s | FileCheck %s -check-prefix=LAA
				; RUN: opt -loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s -check-prefix=LV

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

				; For this loop:
				; unsigned index = 0;
				; for (int i = 0; i < n; i++) {
				; A[2 * index] = A[2 * index] + B[i];
				; index++;
				; }
				;
				; SCEV is unable to prove that A[2 * i] does not overflow.
				;
				; Analyzing the IR does not help us because the GEPs are not
				; affine AddRecExprs. However, we can turn them into AddRecExprs
				; using SCEV Predicates.
				;
				; Once we have an affine expression we need to add an additional NUSW
				; to check that the pointers don't wrap since the GEPs are not
				; inbounds.

				; LAA-LABEL: f1
				; LAA: Memory dependences are safe{{$}}
				; LAA: SCEV assumptions:
				; LAA-NEXT: {0,+,2}<%for.body> Added Flags: <nusw>
				; LAA-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>

				; The expression for %mul_ext as analyzed by SCEV is
				; (zext i32 {0,+,2}<%for.body> to i64)
				; We have added the nusw flag to turn this expression into the SCEV expression:
				; i64 {0,+,2}<%for.body>

				; LV-LABEL: f1
				; LV-LABEL: vector.scevcheck
				; LV: [[PredCheck0:%[^ ]*]] = icmp ne i128
				; LV: [[Or0:%[^ ]*]] = or i1 false, [[PredCheck0]]
				; LV: [[PredCheck1:%[^ ]*]] = icmp ne i128
				; LV: [[FinalCheck:%[^ ]*]] = or i1 [[Or0]], [[PredCheck1]]
				; LV: br i1 [[FinalCheck]], label %scalar.ph, label %vector.ph
				define void @f1(i16* noalias %a,
				i16* noalias %b, i64 %N) {
				entry:
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]
				%ind1 = phi i32 [ 0, %entry ], [ %inc1, %for.body ]

				%mul = mul i32 %ind1, 2
				%mul_ext = zext i32 %mul to i64

				%arrayidxA = getelementptr i16, i16* %a, i64 %mul_ext
				%loadA = load i16, i16* %arrayidxA, align 2

				%arrayidxB = getelementptr i16, i16* %b, i64 %ind
				%loadB = load i16, i16* %arrayidxB, align 2

				%add = mul i16 %loadA, %loadB

				store i16 %add, i16* %arrayidxA, align 2

				%inc = add nuw nsw i64 %ind, 1
				%inc1 = add i32 %ind1, 1

				%exitcond = icmp eq i64 %inc, %N
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}
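
The reason f1 needs the nusw assumption can be reproduced with ordinary fixed-width arithmetic. The sketch below is illustrative only and its function names are hypothetical: it evaluates the index as the IR does (multiply in 32 bits, then zero-extend) and as the rewritten i64 recurrence {0,+,2} does, and reports whether the two agree for a given iteration.

```cpp
#include <cstdint>

// The index expression from f1: 2 * index in 32 bits, then zero-extended.
uint64_t zextOfNarrowMul(uint32_t Index) {
  uint32_t Mul = Index * 2u;         // may wrap modulo 2^32
  return static_cast<uint64_t>(Mul); // corresponds to zext i32 %mul to i64
}

// The rewritten i64 recurrence {0,+,2} evaluated at the same iteration.
uint64_t wideAddRec(uint32_t Index) {
  return 2u * static_cast<uint64_t>(Index);
}

// The nusw assumption holds for an iteration when both forms agree.
bool nuswHolds(uint32_t Index) {
  return zextOfNarrowMul(Index) == wideAddRec(Index);
}
```

The two forms agree for every index below 2^31 and diverge from 2^31 onwards, which is the case the runtime check in vector.scevcheck is meant to exclude.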

				; For this loop:
				; unsigned index = n;
				; for (int i = 0; i < n; i++) {
				; A[2 * index] = A[2 * index] + B[i];
				; index--;
				; }
				;
				; the SCEV expression for 2 * index is not an AddRecExpr
				; (and implicitly not affine). However, we are able to make assumptions
				; that will turn the expression into an affine one and continue the
				; analysis.
				;
				; Once we have an affine expression we need to add an additional NUSW
				; to check that the pointers don't wrap since the GEPs are not
				; inbounds.
				;
				; This loop has a negative stride for A, and the nusw flag is required in
				; order to properly extend the increment from i32 -4 to i64 -4.

				; LAA-LABEL: f2
				; LAA: Memory dependences are safe{{$}}
				; LAA: SCEV assumptions:
				; LAA-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nusw>
				; LAA-NEXT: {((2 * (zext i32 (2 * (trunc i64 %N to i32)) to i64)) + %a),+,-4}<%for.body> Added Flags: <nusw>

				; The expression for %mul_ext as analyzed by SCEV is
				; (zext i32 {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> to i64)
				; We have added the nusw flag to turn this expression into the following SCEV:
				; i64 {zext i32 (2 * (trunc i64 %N to i32)) to i64,+,-2}<%for.body>

				; LV-LABEL: f2
				; LV-LABEL: vector.scevcheck
				; LV: [[PredCheck0:%[^ ]*]] = icmp ne i128
				; LV: [[Or0:%[^ ]*]] = or i1 false, [[PredCheck0]]
				; LV: [[PredCheck1:%[^ ]*]] = icmp ne i128
				; LV: [[FinalCheck:%[^ ]*]] = or i1 [[Or0]], [[PredCheck1]]
				; LV: br i1 [[FinalCheck]], label %scalar.ph, label %vector.ph
				define void @f2(i16* noalias %a,
				i16* noalias %b, i64 %N) {
				entry:
				%TruncN = trunc i64 %N to i32
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]
				%ind1 = phi i32 [ %TruncN, %entry ], [ %dec, %for.body ]

				%mul = mul i32 %ind1, 2
				%mul_ext = zext i32 %mul to i64

				%arrayidxA = getelementptr i16, i16* %a, i64 %mul_ext
				%loadA = load i16, i16* %arrayidxA, align 2

				%arrayidxB = getelementptr i16, i16* %b, i64 %ind
				%loadB = load i16, i16* %arrayidxB, align 2

				%add = mul i16 %loadA, %loadB

				store i16 %add, i16* %arrayidxA, align 2

				%inc = add nuw nsw i64 %ind, 1
				%dec = sub i32 %ind1, 1

				%exitcond = icmp eq i64 %inc, %N
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}
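
Why the step has to be sign extended even under a zero extend can be checked with a small stand-alone sketch. It is not taken from the patch and the function names are hypothetical; it uses the index step -2 rather than the byte stride -4. It compares the zero-extended 32-bit recurrence against a 64-bit recurrence whose step is either sign- or zero-extended.

```cpp
#include <cstdint>

// Value of the i64 recurrence {zext(Start),+,ext(Step)} after Iter steps,
// where the i32 step is either sign-extended or zero-extended.
uint64_t wideValue(uint32_t Start, int32_t Step, uint64_t Iter,
                   bool SignExtendStep) {
  uint64_t WideStep =
      SignExtendStep ? static_cast<uint64_t>(static_cast<int64_t>(Step))
                     : static_cast<uint64_t>(static_cast<uint32_t>(Step));
  return static_cast<uint64_t>(Start) + Iter * WideStep;
}

// zext of the i32 recurrence {Start,+,Step} after Iter steps.
uint64_t zextOfNarrow(uint32_t Start, int32_t Step, uint32_t Iter) {
  return static_cast<uint64_t>(Start + Iter * static_cast<uint32_t>(Step));
}
```

With a start of 10 and a step of -2, three iterations give 4 in the narrow recurrence; the wide recurrence matches only when the step is sign extended, since a zero-extended -2 becomes the large positive value 0xFFFFFFFE.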

				; We replicate the tests above, but this time sign extend 2 * index instead
				; of zero extending it.

				; LAA-LABEL: f3
				; LAA: Memory dependences are safe{{$}}
				; LAA: SCEV assumptions:
				; LAA-NEXT: {0,+,2}<%for.body> Added Flags: <nssw>
				; LAA-NEXT: {%a,+,4}<%for.body> Added Flags: <nusw>

				; The expression for %mul_ext as analyzed by SCEV is
				; i64 (sext i32 {0,+,2}<%for.body> to i64)
				; We have added the nssw flag to turn this expression into the following SCEV:
				; i64 {0,+,2}<%for.body>

				; LV-LABEL: f3
				; LV-LABEL: vector.scevcheck
				; LV: [[PredCheck0:%[^ ]*]] = icmp ne i128
				; LV: [[Or0:%[^ ]*]] = or i1 false, [[PredCheck0]]
				; LV: [[PredCheck1:%[^ ]*]] = icmp ne i128
				; LV: [[FinalCheck:%[^ ]*]] = or i1 [[Or0]], [[PredCheck1]]
				; LV: br i1 [[FinalCheck]], label %scalar.ph, label %vector.ph
				define void @f3(i16* noalias %a,
				i16* noalias %b, i64 %N) {
				entry:
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]
				%ind1 = phi i32 [ 0, %entry ], [ %inc1, %for.body ]

				%mul = mul i32 %ind1, 2
				%mul_ext = sext i32 %mul to i64

				%arrayidxA = getelementptr i16, i16* %a, i64 %mul_ext
				%loadA = load i16, i16* %arrayidxA, align 2

				%arrayidxB = getelementptr i16, i16* %b, i64 %ind
				%loadB = load i16, i16* %arrayidxB, align 2

				%add = mul i16 %loadA, %loadB

				store i16 %add, i16* %arrayidxA, align 2

				%inc = add nuw nsw i64 %ind, 1
				%inc1 = add i32 %ind1, 1

				%exitcond = icmp eq i64 %inc, %N
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}

				; LAA-LABEL: f4
				; LAA: Memory dependences are safe{{$}}
				; LAA: SCEV assumptions:
				; LAA-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nssw>
				; LAA-NEXT: {((2 * (sext i32 (2 * (trunc i64 %N to i32)) to i64)) + %a),+,-4}<%for.body> Added Flags: <nusw>

				; The expression for %mul_ext as analyzed by SCEV is
				; i64 (sext i32 {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> to i64)
				; We have added the nssw flag to turn this expression into the following SCEV:
				; i64 {sext i32 (2 * (trunc i64 %N to i32)) to i64,+,-2}<%for.body>

				; LV-LABEL: f4
				; LV-LABEL: vector.scevcheck
				; LV: [[PredCheck0:%[^ ]*]] = icmp ne i128
				; LV: [[Or0:%[^ ]*]] = or i1 false, [[PredCheck0]]
				; LV: [[PredCheck1:%[^ ]*]] = icmp ne i128
				; LV: [[FinalCheck:%[^ ]*]] = or i1 [[Or0]], [[PredCheck1]]
				; LV: br i1 [[FinalCheck]], label %scalar.ph, label %vector.ph
				define void @f4(i16* noalias %a,
				i16* noalias %b, i64 %N) {
				entry:
				%TruncN = trunc i64 %N to i32
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]
				%ind1 = phi i32 [ %TruncN, %entry ], [ %dec, %for.body ]

				%mul = mul i32 %ind1, 2
				%mul_ext = sext i32 %mul to i64

				%arrayidxA = getelementptr i16, i16* %a, i64 %mul_ext
				%loadA = load i16, i16* %arrayidxA, align 2

				%arrayidxB = getelementptr i16, i16* %b, i64 %ind
				%loadB = load i16, i16* %arrayidxB, align 2

				%add = mul i16 %loadA, %loadB

				store i16 %add, i16* %arrayidxA, align 2

				%inc = add nuw nsw i64 %ind, 1
				%dec = sub i32 %ind1, 1

				%exitcond = icmp eq i64 %inc, %N
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}

				; The following function is similar to the one above, but has the GEP
				; to pointer %A inbounds. The index %mul doesn't have the nsw flag.
				; This means that the SCEV expression for %mul can wrap and we need
				; a SCEV predicate to continue analysis.
				;
				; We can still analyze this by adding the required no wrap SCEV predicates.

				; LAA-LABEL: f5
				; LAA: Memory dependences are safe{{$}}
				; LAA: SCEV assumptions:
				; LAA-NEXT: {(2 * (trunc i64 %N to i32)),+,-2}<%for.body> Added Flags: <nssw>
				; LAA-NEXT: {((2 * (sext i32 (2 * (trunc i64 %N to i32)) to i64)) + %a),+,-4}<%for.body> Added Flags: <nusw>

				; LV-LABEL: f5
				; LV-LABEL: vector.scevcheck
				define void @f5(i16* noalias %a,
				i16* noalias %b, i64 %N) {
				entry:
				%TruncN = trunc i64 %N to i32
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]
				%ind1 = phi i32 [ %TruncN, %entry ], [ %dec, %for.body ]

				%mul = mul i32 %ind1, 2

				%arrayidxA = getelementptr inbounds i16, i16* %a, i32 %mul
				%loadA = load i16, i16* %arrayidxA, align 2

				%arrayidxB = getelementptr inbounds i16, i16* %b, i64 %ind
				%loadB = load i16, i16* %arrayidxB, align 2

				%add = mul i16 %loadA, %loadB

				store i16 %add, i16* %arrayidxA, align 2

				%inc = add nuw nsw i64 %ind, 1
				%dec = sub i32 %ind1, 1

				%exitcond = icmp eq i64 %inc, %N
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}

llvm/trunk/test/Transforms/LoopVectorize/same-base-access.ll

; <label>:25 ; preds = %8		; <label>:25 ; preds = %8
store i32 %27, i32* %k, align 4		store i32 %27, i32* %k, align 4
br label %4		br label %4

; <label>:28 ; preds = %4		; <label>:28 ; preds = %4
ret i32 0		ret i32 0
}		}


		; A[i*7] is scalarized, and the different scalars can in theory wrap
; We don't vectorize this function because A[i*7] is scalarized, and the		; around and overwrite other scalar elements. However we can still
; different scalars can in theory wrap around and overwrite other scalar		; vectorize because we can version the loop to avoid this case.
; elements. At the moment we only allow read/write access to arrays
; that are consecutive.
;		;
; void foo(int *a) {		; void foo(int *a) {
; for (int i=0; i<256; ++i) {		; for (int i=0; i<256; ++i) {
; int x = a[i*7];		; int x = a[i*7];
; if (x>3)		; if (x>3)
; x = x*x+x*4;		; x = x*x+x*4;
; a[i*7] = x+3;		; a[i*7] = x+3;
; }		; }
; }		; }

; CHECK-LABEL: @func2(		; CHECK-LABEL: @func2(
; CHECK-NOT: <4 x i32>		; CHECK: <4 x i32>
; CHECK: ret		; CHECK: ret
define i32 @func2(i32* nocapture %a) nounwind uwtable ssp {		define i32 @func2(i32* nocapture %a) nounwind uwtable ssp {
br label %1		br label %1

; <label>:1 ; preds = %7, %0		; <label>:1 ; preds = %7, %0
%indvars.iv = phi i64 [ 0, %0 ], [ %indvars.iv.next, %7 ]		%indvars.iv = phi i64 [ 0, %0 ], [ %indvars.iv.next, %7 ]
%2 = mul nsw i64 %indvars.iv, 7		%2 = mul nsw i64 %indvars.iv, 7
%3 = getelementptr inbounds i32, i32* %a, i64 %2		%3 = getelementptr inbounds i32, i32* %a, i64 %2

Revision Contents

Diff 47171
