This is an archive of the discontinued LLVM Phabricator instance.

Differential D11860

[SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
ClosedPublic

Authored by broune on Aug 7 2015, 9:34 PM.

Details

Reviewers

atrick
sanjoy

Commits

rG9791ed4705f6: [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
rL245118: [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl

Summary

http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change lets LSR generate pointer induction variables for loops like these, where the index is 32-bit and the pointer is 64-bit:

for (int i = 0; i < numIterations; ++i)
  sum += ptr[i - offset];

for (int i = 0; i < numIterations; ++i)
  sum += ptr[i * stride];

for (int i = 0; i < numIterations; ++i)
  sum += ptr[3 * (i << 7)];
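
One way to see why the flags matter for such loops: widening a narrow index expression commutes with the arithmetic only when the narrow operation cannot signed-wrap. The sketch below checks this exhaustively for subtraction, scaled down to 8-bit values widened to 16 bits. It is an illustration under stated assumptions, not code from the patch; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if a - b falls outside the signed 8-bit range,
// i.e. the narrow subtraction would signed-wrap.
bool subWrapsSigned8(int8_t a, int8_t b) {
  int exact = int(a) - int(b);
  return exact < INT8_MIN || exact > INT8_MAX;
}

// Subtract in the narrow type with two's-complement wrap-around, then
// sign-extend the result (a narrow index feeding a wide address).
// Assumes the usual modular uint8_t -> int8_t conversion (guaranteed
// since C++20, implementation-defined before that).
int16_t extendAfterSub(int8_t a, int8_t b) {
  uint8_t wrapped = static_cast<uint8_t>(uint8_t(a) - uint8_t(b));
  return static_cast<int16_t>(static_cast<int8_t>(wrapped));
}

// Sign-extend both operands first, then subtract in the wide type.
int16_t subAfterExtend(int8_t a, int8_t b) {
  return static_cast<int16_t>(int16_t(a) - int16_t(b));
}

// Exhaustively checks that the two orders agree exactly when the narrow
// subtraction does not signed-wrap.
bool extensionCommutesExactlyWhenNoWrap() {
  for (int a = INT8_MIN; a <= INT8_MAX; ++a) {
    for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
      int8_t x = static_cast<int8_t>(a), y = static_cast<int8_t>(b);
      bool same = extendAfterSub(x, y) == subAfterExtend(x, y);
      if (same == subWrapsSigned8(x, y))
        return false;
    }
  }
  return true;
}
```

Calling `extensionCommutesExactlyWhenNoWrap()` walks all 65,536 operand pairs. A no-wrap guarantee on the narrow operation is what makes it sound to rewrite the extended index as arithmetic on extended operands.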

Event Timeline

broune updated this revision to Diff 31572. Aug 7 2015, 9:34 PM

broune retitled this revision to [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl.

broune updated this object.

broune added reviewers: sanjoy, atrick.

broune added subscribers: eliben, jingyue, meheff and 3 others.

Comments inline.

lib/Analysis/ScalarEvolution.cpp
3378	Any non-zero value except 1. :)
3380	I'm being pedantic here, but the reasoning here may be easier to follow if you write `-Foo` as `-1 * Foo` (assuming that's what you mean).
3396	I did not quite follow the logic here -- what is the "relevant loop" here? An example will be very useful here. Also, why don't you need `cast<SCEVAddRecExpr>(RHS)->getLoop() == RelevantLoop`?
4124	Why is this needed? Aren't we only ever going to pass in `mul`, `sub` and `shl`?
4237–4245	Why not hoist `LHS = getSCEV(U->getOperand(0))` as well?
4266	I'll be in favor of using `U` here instead of `Op` to be consistent, unless that does not work for some reason.
4418	Unfortunately, this is not quite right. There are some inconsistencies in the langref that to my knowledge have not yet been fixed: see http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html and http://reviews.llvm.org/D8890

This revision now requires changes to proceed. Aug 11 2015, 2:55 PM

Address Sanjoy's comments.

Herald added a subscriber: sanjoy. Aug 11 2015, 7:25 PM

broune added inline comments. Aug 11 2015, 7:26 PM

lib/Analysis/ScalarEvolution.cpp
3378	It's true that getNegativeSCEV happens to represent -RHS as (-1) * RHS which as an unsigned operation would look like MAX * RHS which indeed doesn't wrap for RHS==1. My thinking is that the mathematical operation of negation does wrap for RHS==1, as you end up with MAX instead of -1. So as I think of it, it would be incorrect to pass the NUW flag to getNegativeSCEV even if you knew that RHS == 1, as the mathematical operation is not NUW, but it would be correct for getNegativeSCEV itself to recognize if RHS==1 and apply the NUW flag in that case (ignoring that it could fold to a constant in that case).
3380	-Foo is the mathematical operation being done, while (-1) * Foo is the representation that getMinusSCEV happens to choose for that operation (at least before simplification). In the way that I'm thinking of this, the flags that are passed in to getAddExpr and getNegativeSCEV concern the mathematical operation, not the representation, so the comment follows the math. I could change this if you think that the passed-in flags should concern the representation, instead?
3396	I added a comment with an example. The difficulty that this comes out of is that flags and many SCEV operations do not have a loop/scope attached to them, only recurrences do. So we have to be careful with applying flags to subexpressions that do not involve a recurrence. In this case, the relevant loop would be the one from the recurrence for the purposes of this check. I also realized that this check is unnecessary if NSW was proven by looking only at RHS on its own, with no use of LHS or the Flags parameter, so I added a check for that case.
4124	This function can be passed a `ConstantExpr` version of those, which it cannot handle. I preferred to check for that once here instead of doing it in each caller.
4418	It seems that there is agreement for shl nsw to be equivalent to mul nsw, in which case this is correct, except that no one has bothered to update the LangRef yet, so it's not official. Also, some updates in InstCombine (and possibly elsewhere) would be required to make the change and those also haven't been done yet. Is that right? I added a comment to explain this, along with a check to avoid applying flags for left shift by BitWidth - 1 until the situation is resolved.
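
The disagreement in the 3378 thread can be made concrete by comparing the two readings of unsigned negation on 8-bit values: the subtraction `0 - x` and the multiplication `MAX * x` that `getNegativeSCEV` builds. The sketch below is only an illustration with hypothetical helper names. It computes exact results in a wider type and reports where the two operations disagree about wrapping.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Unsigned 8-bit "0 - x" wraps whenever the exact result is negative,
// i.e. for every non-zero x.
bool zeroMinusWrapsUnsigned8(uint8_t x) { return 0 - int(x) < 0; }

// Unsigned 8-bit "MAX * x" (MAX == 255, the bit pattern of -1) wraps
// whenever the exact product exceeds 255, i.e. for every x >= 2.
bool allOnesTimesWrapsUnsigned8(uint8_t x) { return 255 * int(x) > 255; }

// Collects every 8-bit value on which the two operations disagree about
// unsigned wrapping.
std::vector<int> valuesWhereTheyDiffer() {
  std::vector<int> out;
  for (int x = 0; x <= 255; ++x) {
    uint8_t v = static_cast<uint8_t>(x);
    if (zeroMinusWrapsUnsigned8(v) != allOnesTimesWrapsUnsigned8(v))
      out.push_back(x);
  }
  return out;
}
```

The only value on which the two differ is 1: the subtraction wraps there and the multiplication does not. That matches both remarks in the thread, "any non-zero value except 1" for the multiplication and "negation does wrap for RHS==1" for the mathematical operation.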

sanjoy added inline comments. Aug 13 2015, 12:54 AM

lib/Analysis/ScalarEvolution.cpp
3381	My subjective opinion is that since there is no direct representation of `-X` in LLVM IR, we ultimately have to choose a specific representation to reason about. The specific representation is not obvious either since both `sub 0 X` and `mul -1 X` are equally valid, so it is nice to be explicit about these things. Moreover, here things are doubly confusing since `sub 0 X` and `mul -1 X` have identical behavior w.r.t. `nsw` so there is no easy way for the reader to "check" if (s)he understood the intent correctly. However, objectively, I don't think there is any problem with directly proving things about a mathematical negation operation, so I'd suggest just putting in a sentence either here or above on what exactly you mean by `-RHS`.
3408	But what if you have a loop nested within another, and LHS is an addrec on the inner loop while RHS is an addrec on the outer loop? Then `LHS - RHS` would be an addrec on the inner loop (so the `nsw` is guaranteed to apply only on the inner loop), but `RHS` has a larger scope than the inner loop while still being an add rec. IOW, something like this:

  define void @f(i32 %outer_l, i32 %inner_l) {
  entry:
    br label %outer
  outer:
    %o_idx = phi i32 [ 0, %entry ], [ %o_idx.inc, %outer.be ]
    %o_idx.inc = add i32 %o_idx, 1
    %cond = icmp eq i32 %o_idx, 42
    br i1 %cond, label %inner, label %outer.be
  inner:
    %i_idx = phi i32 [ 0, %outer ], [ %i_idx.inc, %inner ]
    %i_idx.inc = add i32 %i_idx, 1
    %v = sub nsw i32 %i_idx, %o_idx
    %cond2 = icmp eq i32 %i_idx, %inner_l
    br i1 %cond2, label %outer.be, label %inner
  outer.be:
    %cond3 = icmp eq i32 %o_idx, %outer_l
    br i1 %cond3, label %exit, label %outer
  exit:
    ret void
  }

where `%v` is an add rec for the `%inner` loop while `%o_idx` is an add rec on the outer loop.
4125	Then I'd suggest `if (isa<ConstantExpr>(V)) return FlagAnyWrap;`, that makes the code more obvious.
4418	SGTM.
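
The reason shifts by BitWidth - 1 are singled out in the 4418 thread can be illustrated on 8-bit values. The sketch below assumes the LangRef reading under discussion, in which `shl nsw` yields poison when any shifted-out bit disagrees with the resulting sign bit, and compares it with `mul nsw` by the constant `1 << s` interpreted as a signed value. The helper names are hypothetical and the poison rule is encoded as stated in this paragraph, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed rule for "shl nsw x, s" on i8: poison if any of the s bits
// shifted out differs from the sign bit of the shifted result.
bool shlNswPoison8(int8_t x, unsigned s) {
  uint8_t ux = static_cast<uint8_t>(x);
  uint8_t result = static_cast<uint8_t>(ux << s);
  unsigned sign = result >> 7;
  for (unsigned i = 0; i < s; ++i) {
    unsigned shiftedOutBit = (ux >> (7 - i)) & 1u;
    if (shiftedOutBit != sign)
      return true;
  }
  return false;
}

// "mul nsw x, (1 << s)" on i8: poison if the exact product does not fit.
// For s == 7 the constant has only the sign bit set, so it is -128.
// Assumes the usual modular uint8_t -> int8_t conversion.
bool mulNswPoison8(int8_t x, unsigned s) {
  int8_t factor = static_cast<int8_t>(static_cast<uint8_t>(1u << s));
  int exact = int(x) * int(factor);
  return exact < INT8_MIN || exact > INT8_MAX;
}

// All 8-bit values x on which the two rules disagree for shift amount s.
std::vector<int> disagreementsAtShift(unsigned s) {
  std::vector<int> out;
  for (int x = INT8_MIN; x <= INT8_MAX; ++x) {
    int8_t v = static_cast<int8_t>(x);
    if (shlNswPoison8(v, s) != mulNswPoison8(v, s))
      out.push_back(x);
  }
  return out;
}
```

Under these assumptions the two rules agree for every shift amount from 0 to 6 and differ only at shift amount 7, for x = -1 and x = 1. That is consistent with the patch withholding flags only when the shift amount is BitWidth - 1.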

broune marked 4 inline comments as done. Aug 13 2015, 9:01 PM

broune added inline comments.

lib/Analysis/ScalarEvolution.cpp
3381	I changed it to (-1)*RHS.
3408	Good point, I added a test case that is based on this. It could also be an addrec on the outer loop, which is what I was thinking of it as, but your example shows that that interpretation is not sufficient. There should be a workable scheme where whoever proves NSW or NUW and passes in those flags can also explicitly pass in the loop that the flags are proven within, though I'll leave that for another time. I now only put NSW on the negation if it can be proven without using the passed-in NSW.
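
The arithmetic behind the NSW transfer in `getMinusSCEV` can be checked exhaustively on 8-bit values. The sketch below uses hypothetical helper names and deliberately ignores the loop-scope issue raised in the 3408 thread. It verifies the two claims from the code comment: `(-1)*M` signed-wraps even though `-1 - M` does not, and a non-negative LHS with a non-wrapping `LHS - RHS` rules out `RHS == M`.

```cpp
#include <cassert>
#include <cstdint>

bool fitsSigned8(int v) { return v >= INT8_MIN && v <= INT8_MAX; }

// (-1) * rhs signed-wraps exactly when rhs is the minimum signed value M.
bool negWrapsSigned8(int8_t rhs) { return !fitsSigned8(-int(rhs)); }

// lhs - rhs signed-wraps when the exact difference does not fit.
bool subWrapsSigned8(int8_t lhs, int8_t rhs) {
  return !fitsSigned8(int(lhs) - int(rhs));
}

// Claim: if lhs >= 0, then lhs - M always signed-wraps, so a non-wrapping
// lhs - rhs with lhs >= 0 implies rhs != M.
bool nonNegativeLhsRulesOutMin() {
  for (int lhs = 0; lhs <= INT8_MAX; ++lhs)
    if (!subWrapsSigned8(static_cast<int8_t>(lhs), INT8_MIN))
      return false;
  return true;
}

// Claim: whenever lhs - rhs does not wrap and (rhs != M or lhs >= 0),
// neither the negation nor the addition lhs + (-rhs) wraps.
bool transferIsSound() {
  for (int l = INT8_MIN; l <= INT8_MAX; ++l) {
    for (int r = INT8_MIN; r <= INT8_MAX; ++r) {
      int8_t lhs = static_cast<int8_t>(l), rhs = static_cast<int8_t>(r);
      if (subWrapsSigned8(lhs, rhs))
        continue;
      if (!(r != INT8_MIN || l >= 0))
        continue;
      if (negWrapsSigned8(rhs) || !fitsSigned8(l + (-r)))
        return false;
    }
  }
  return true;
}
```

Both checks return true for 8-bit operands. The pair `lhs = -1`, `rhs = M` is the counterexample that makes the extra condition necessary: the subtraction fits, yet the negation of `M` does not.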

broune updated this revision to Diff 32126. Aug 13 2015, 9:02 PM

broune edited edge metadata.

broune marked 2 inline comments as done.

lgtm

test/Analysis/ScalarEvolution/min-max-exprs.ll
36	I'm surprised this changed in the last version -- I think you only made SCEV more conservative in this last revision?

This revision is now accepted and ready to land. Aug 14 2015, 12:05 AM

Thank you to Sanjoy for the review!

test/Analysis/ScalarEvolution/min-max-exprs.ll
36	The NSW on the negation is now applied (if it can be proven) even if the subtraction does not have a NSW flag.

broune closed this revision. Aug 14 2015, 3:46 PM

broune marked an inline comment as done.

Revision Contents

Path	Size
include/llvm/Analysis/ScalarEvolution.h	3 lines
lib/Analysis/ScalarEvolution.cpp	113 lines
test/Analysis/Delinearization/a.ll	2 lines
test/Analysis/ScalarEvolution/flags-from-poison.ll	234 lines
test/Analysis/ScalarEvolution/min-max-exprs.ll	2 lines
test/Transforms/LoopStrengthReduce/sext-ind-var.ll	104 lines

Diff 32126

include/llvm/Analysis/ScalarEvolution.h

[... 706 lines not shown ...]	public:

/// getOffsetOfExpr - Return an expression for offsetof on the given field		/// getOffsetOfExpr - Return an expression for offsetof on the given field
/// with type IntTy		/// with type IntTy
///		///
const SCEV *getOffsetOfExpr(Type *IntTy, StructType *STy, unsigned FieldNo);		const SCEV *getOffsetOfExpr(Type *IntTy, StructType *STy, unsigned FieldNo);

/// getNegativeSCEV - Return the SCEV object corresponding to -V.		/// getNegativeSCEV - Return the SCEV object corresponding to -V.
///		///
const SCEV *getNegativeSCEV(const SCEV *V);		const SCEV *getNegativeSCEV(const SCEV *V,
		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);

/// getNotSCEV - Return the SCEV object corresponding to ~V.		/// getNotSCEV - Return the SCEV object corresponding to ~V.
///		///
const SCEV *getNotSCEV(const SCEV *V);		const SCEV *getNotSCEV(const SCEV *V);

/// getMinusSCEV - Return LHS-RHS. Minus is represented in SCEV as A+B*-1.		/// getMinusSCEV - Return LHS-RHS. Minus is represented in SCEV as A+B*-1.
const SCEV *getMinusSCEV(const SCEV *LHS, const SCEV *RHS,		const SCEV *getMinusSCEV(const SCEV *LHS, const SCEV *RHS,
SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap);
[... 378 lines not shown ...]

lib/Analysis/ScalarEvolution.cpp

[... 3,333 lines not shown ...]	if (checkValidity(S))
return S;		return S;
ValueExprMap.erase(I);		ValueExprMap.erase(I);
}		}
return nullptr;		return nullptr;
}		}

/// getNegativeSCEV - Return a SCEV corresponding to -V = -1*V		/// getNegativeSCEV - Return a SCEV corresponding to -V = -1*V
///		///
const SCEV *ScalarEvolution::getNegativeSCEV(const SCEV *V) {		const SCEV *ScalarEvolution::getNegativeSCEV(const SCEV *V,
		SCEV::NoWrapFlags Flags) {
if (const SCEVConstant *VC = dyn_cast<SCEVConstant>(V))		if (const SCEVConstant *VC = dyn_cast<SCEVConstant>(V))
return getConstant(		return getConstant(
cast<ConstantInt>(ConstantExpr::getNeg(VC->getValue())));		cast<ConstantInt>(ConstantExpr::getNeg(VC->getValue())));

Type *Ty = V->getType();		Type *Ty = V->getType();
Ty = getEffectiveSCEVType(Ty);		Ty = getEffectiveSCEVType(Ty);
return getMulExpr(V,		return getMulExpr(
getConstant(cast<ConstantInt>(Constant::getAllOnesValue(Ty))));		V, getConstant(cast<ConstantInt>(Constant::getAllOnesValue(Ty))), Flags);
}		}

/// getNotSCEV - Return a SCEV corresponding to ~V = -1-V		/// getNotSCEV - Return a SCEV corresponding to ~V = -1-V
const SCEV *ScalarEvolution::getNotSCEV(const SCEV *V) {		const SCEV *ScalarEvolution::getNotSCEV(const SCEV *V) {
if (const SCEVConstant *VC = dyn_cast<SCEVConstant>(V))		if (const SCEVConstant *VC = dyn_cast<SCEVConstant>(V))
return getConstant(		return getConstant(
cast<ConstantInt>(ConstantExpr::getNot(VC->getValue())));		cast<ConstantInt>(ConstantExpr::getNot(VC->getValue())));

Type *Ty = V->getType();		Type *Ty = V->getType();
Ty = getEffectiveSCEVType(Ty);		Ty = getEffectiveSCEVType(Ty);
const SCEV *AllOnes =		const SCEV *AllOnes =
getConstant(cast<ConstantInt>(Constant::getAllOnesValue(Ty)));		getConstant(cast<ConstantInt>(Constant::getAllOnesValue(Ty)));
return getMinusSCEV(AllOnes, V);		return getMinusSCEV(AllOnes, V);
}		}

/// getMinusSCEV - Return LHS-RHS. Minus is represented in SCEV as A+B*-1.		/// getMinusSCEV - Return LHS-RHS. Minus is represented in SCEV as A+B*-1.
const SCEV *ScalarEvolution::getMinusSCEV(const SCEV *LHS, const SCEV *RHS,		const SCEV *ScalarEvolution::getMinusSCEV(const SCEV *LHS, const SCEV *RHS,
SCEV::NoWrapFlags Flags) {		SCEV::NoWrapFlags Flags) {
assert(!maskFlags(Flags, SCEV::FlagNUW) && "subtraction does not have NUW");

// Fast path: X - X --> 0.		// Fast path: X - X --> 0.
if (LHS == RHS)		if (LHS == RHS)
return getConstant(LHS->getType(), 0);		return getConstant(LHS->getType(), 0);

// X - Y --> X + -Y.		// We represent LHS - RHS as LHS + (-1)*RHS. This transformation
// X -(nsw || nuw) Y --> X + -Y.		// makes it so that we cannot make much use of NUW.
return getAddExpr(LHS, getNegativeSCEV(RHS));		auto AddFlags = SCEV::FlagAnyWrap;
		const bool RHSIsNotMinSigned =
		!getSignedRange(RHS).getSignedMin().isMinSignedValue();
		if (maskFlags(Flags, SCEV::FlagNSW) == SCEV::FlagNSW) {
		// Let M be the minimum representable signed value. Then (-1)*RHS
		// signed-wraps if and only if RHS is M. That can happen even for
		// a NSW subtraction because e.g. (-1)*M signed-wraps even though
		// -1 - M does not. So to transfer NSW from LHS - RHS to LHS +
		// (-1)*RHS, we need to prove that RHS != M.
		//
		// If LHS is non-negative and we know that LHS - RHS does not
		// signed-wrap, then RHS cannot be M. So we can rule out signed-wrap
		// either by proving that RHS > M or that LHS >= 0.
		if (RHSIsNotMinSigned || isKnownNonNegative(LHS)) {
		AddFlags = SCEV::FlagNSW;
		}
		}

		// FIXME: Find a correct way to transfer NSW to (-1)*M when LHS -
		// RHS is NSW and LHS >= 0.
		//
		// The difficulty here is that the NSW flag may have been proven
		// relative to a loop that is to be found in a recurrence in LHS and
		// not in RHS. Applying NSW to (-1)*M may then let the NSW have a
		// larger scope than intended.
		auto NegFlags = RHSIsNotMinSigned ? SCEV::FlagNSW : SCEV::FlagAnyWrap;

		return getAddExpr(LHS, getNegativeSCEV(RHS, NegFlags), AddFlags);
}		}

/// getTruncateOrZeroExtend - Return a SCEV corresponding to a conversion of the		/// getTruncateOrZeroExtend - Return a SCEV corresponding to a conversion of the
/// input value to the specified type. If the type must be extended, it is zero		/// input value to the specified type. If the type must be extended, it is zero
/// extended.		/// extended.
const SCEV *		const SCEV *
ScalarEvolution::getTruncateOrZeroExtend(const SCEV *V, Type *Ty) {		ScalarEvolution::getTruncateOrZeroExtend(const SCEV *V, Type *Ty) {
Type *SrcTy = V->getType();		Type *SrcTy = V->getType();
assert((SrcTy->isIntegerTy() || SrcTy->isPointerTy()) &&		assert((SrcTy->isIntegerTy() || SrcTy->isPointerTy()) &&
(Ty->isIntegerTy() || Ty->isPointerTy()) &&		(Ty->isIntegerTy() || Ty->isPointerTy()) &&
"Cannot truncate or zero extend with non-integer arguments!");		"Cannot truncate or zero extend with non-integer arguments!");
if (getTypeSizeInBits(SrcTy) == getTypeSizeInBits(Ty))		if (getTypeSizeInBits(SrcTy) == getTypeSizeInBits(Ty))
return V; // No conversion		return V; // No conversion
[... 698 lines not shown ...]	if (const SCEVUnknown *U = dyn_cast<SCEVUnknown>(S)) {

return setRange(U, SignHint, ConservativeResult);		return setRange(U, SignHint, ConservativeResult);
}		}

return setRange(S, SignHint, ConservativeResult);		return setRange(S, SignHint, ConservativeResult);
}		}

SCEV::NoWrapFlags ScalarEvolution::getNoWrapFlagsFromUB(const Value *V) {		SCEV::NoWrapFlags ScalarEvolution::getNoWrapFlagsFromUB(const Value *V) {
		if (isa<ConstantExpr>(V)) return SCEV::FlagAnyWrap;
const BinaryOperator *BinOp = cast<BinaryOperator>(V);		const BinaryOperator *BinOp = cast<BinaryOperator>(V);

// Return early if there are no flags to propagate to the SCEV.		// Return early if there are no flags to propagate to the SCEV.
SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;
if (BinOp->hasNoUnsignedWrap())		if (BinOp->hasNoUnsignedWrap())
Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW);		Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW);
if (BinOp->hasNoSignedWrap())		if (BinOp->hasNoSignedWrap())
Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW);		Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW);
if (Flags == SCEV::FlagAnyWrap) {		if (Flags == SCEV::FlagAnyWrap) {
return SCEV::FlagAnyWrap;		return SCEV::FlagAnyWrap;
[... 73 lines not shown ...]	const SCEV *ScalarEvolution::createSCEV(Value *V) {
switch (Opcode) {		switch (Opcode) {
case Instruction::Add: {		case Instruction::Add: {
// The simple thing to do would be to just call getSCEV on both operands		// The simple thing to do would be to just call getSCEV on both operands
// and call getAddExpr with the result. However if we're looking at a		// and call getAddExpr with the result. However if we're looking at a
// bunch of things all added together, this can be quite inefficient,		// bunch of things all added together, this can be quite inefficient,
// because it leads to N-1 getAddExpr calls for N ultimate operands.		// because it leads to N-1 getAddExpr calls for N ultimate operands.
// Instead, gather up all the operands and make a single getAddExpr call.		// Instead, gather up all the operands and make a single getAddExpr call.
// LLVM IR canonical form means we need only traverse the left operands.		// LLVM IR canonical form means we need only traverse the left operands.
//
// FIXME: Expand this handling of NSW and NUW to other instructions, like
// sub and mul.
SmallVector<const SCEV *, 4> AddOps;		SmallVector<const SCEV *, 4> AddOps;
for (Value *Op = U;; Op = U->getOperand(0)) {		for (Value *Op = U;; Op = U->getOperand(0)) {
U = dyn_cast<Operator>(Op);		U = dyn_cast<Operator>(Op);
unsigned Opcode = U ? U->getOpcode() : 0;		unsigned Opcode = U ? U->getOpcode() : 0;
if (!U || (Opcode != Instruction::Add && Opcode != Instruction::Sub)) {		if (!U || (Opcode != Instruction::Add && Opcode != Instruction::Sub)) {
assert(Op != V && "V should be an add");		assert(Op != V && "V should be an add");
AddOps.push_back(getSCEV(Op));		AddOps.push_back(getSCEV(Op));
break;		break;
}		}

if (auto *OpSCEV = getExistingSCEV(Op)) {		if (auto *OpSCEV = getExistingSCEV(U)) {
AddOps.push_back(OpSCEV);		AddOps.push_back(OpSCEV);
break;		break;
}		}

// If a NUW or NSW flag can be applied to the SCEV for this		// If a NUW or NSW flag can be applied to the SCEV for this
// addition, then compute the SCEV for this addition by itself		// addition, then compute the SCEV for this addition by itself
// with a separate call to getAddExpr. We need to do that		// with a separate call to getAddExpr. We need to do that
// instead of pushing the operands of the addition onto AddOps,		// instead of pushing the operands of the addition onto AddOps,
// since the flags are only known to apply to this particular		// since the flags are only known to apply to this particular
// addition - they may not apply to other additions that can be		// addition - they may not apply to other additions that can be
// formed with operands from AddOps.		// formed with operands from AddOps.
//		const SCEV *RHS = getSCEV(U->getOperand(1));
// FIXME: Expand this to sub instructions.
if (Opcode == Instruction::Add && isa<BinaryOperator>(U)) {
SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(U);		SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(U);
if (Flags != SCEV::FlagAnyWrap) {		if (Flags != SCEV::FlagAnyWrap) {
AddOps.push_back(getAddExpr(getSCEV(U->getOperand(0)),		const SCEV *LHS = getSCEV(U->getOperand(0));
getSCEV(U->getOperand(1)), Flags));		if (Opcode == Instruction::Sub)
		AddOps.push_back(getMinusSCEV(LHS, RHS, Flags));
		else
		AddOps.push_back(getAddExpr(LHS, RHS, Flags));
break;		break;
}		}
}

const SCEV *Op1 = getSCEV(U->getOperand(1));
if (Opcode == Instruction::Sub)		if (Opcode == Instruction::Sub)
AddOps.push_back(getNegativeSCEV(Op1));		AddOps.push_back(getNegativeSCEV(RHS));
else		else
AddOps.push_back(Op1);		AddOps.push_back(RHS);
}		}
return getAddExpr(AddOps);		return getAddExpr(AddOps);
}		}

case Instruction::Mul: {		case Instruction::Mul: {
// FIXME: Transfer NSW/NUW as in AddExpr.
SmallVector<const SCEV *, 4> MulOps;		SmallVector<const SCEV *, 4> MulOps;
MulOps.push_back(getSCEV(U->getOperand(1)));		for (Value *Op = U;; Op = U->getOperand(0)) {
for (Value *Op = U->getOperand(0);		U = dyn_cast<Operator>(Op);
Op->getValueID() == Instruction::Mul + Value::InstructionVal;		if (!U || U->getOpcode() != Instruction::Mul) {
Op = U->getOperand(0)) {		assert(Op != V && "V should be a mul");
U = cast<Operator>(Op);		MulOps.push_back(getSCEV(Op));
		break;
		}

		if (auto *OpSCEV = getExistingSCEV(U)) {
		MulOps.push_back(OpSCEV);
		break;
		}

		SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(U);
		if (Flags != SCEV::FlagAnyWrap) {
		MulOps.push_back(getMulExpr(getSCEV(U->getOperand(0)),
		getSCEV(U->getOperand(1)), Flags));
		break;
		}

MulOps.push_back(getSCEV(U->getOperand(1)));		MulOps.push_back(getSCEV(U->getOperand(1)));
}		}
MulOps.push_back(getSCEV(U->getOperand(0)));
return getMulExpr(MulOps);		return getMulExpr(MulOps);
}		}
case Instruction::UDiv:		case Instruction::UDiv:
return getUDivExpr(getSCEV(U->getOperand(0)),		return getUDivExpr(getSCEV(U->getOperand(0)),
getSCEV(U->getOperand(1)));		getSCEV(U->getOperand(1)));
case Instruction::Sub:		case Instruction::Sub:
return getMinusSCEV(getSCEV(U->getOperand(0)),		return getMinusSCEV(getSCEV(U->getOperand(0)), getSCEV(U->getOperand(1)),
getSCEV(U->getOperand(1)));		getNoWrapFlagsFromUB(U));
case Instruction::And:		case Instruction::And:
// For an expression like x&255 that merely masks off the high bits,		// For an expression like x&255 that merely masks off the high bits,
// use zext(trunc(x)) as the SCEV expression.		// use zext(trunc(x)) as the SCEV expression.
if (ConstantInt *CI = dyn_cast<ConstantInt>(U->getOperand(1))) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(U->getOperand(1))) {
if (CI->isNullValue())		if (CI->isNullValue())
return getSCEV(U->getOperand(1));		return getSCEV(U->getOperand(1));
if (CI->isAllOnesValue())		if (CI->isAllOnesValue())
return getSCEV(U->getOperand(0));		return getSCEV(U->getOperand(0));
[... 103 lines not shown ...]	if (ConstantInt *SA = dyn_cast<ConstantInt>(U->getOperand(1))) {

// If the shift count is not less than the bitwidth, the result of		// If the shift count is not less than the bitwidth, the result of
// the shift is undefined. Don't try to analyze it, because the		// the shift is undefined. Don't try to analyze it, because the
// resolution chosen here may differ from the resolution chosen in		// resolution chosen here may differ from the resolution chosen in
// other parts of the compiler.		// other parts of the compiler.
if (SA->getValue().uge(BitWidth))		if (SA->getValue().uge(BitWidth))
break;		break;

		// It is currently not resolved how to interpret NSW for left
		// shift by BitWidth - 1, so we avoid applying flags in that
		// case. Remove this check (or this comment) once the situation
		// is resolved. See
		// http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html
		// and http://reviews.llvm.org/D8890 .
		auto Flags = SCEV::FlagAnyWrap;
		if (SA->getValue().ult(BitWidth - 1)) Flags = getNoWrapFlagsFromUB(U);

Constant *X = ConstantInt::get(getContext(),		Constant *X = ConstantInt::get(getContext(),
APInt::getOneBitSet(BitWidth, SA->getZExtValue()));		APInt::getOneBitSet(BitWidth, SA->getZExtValue()));
return getMulExpr(getSCEV(U->getOperand(0)), getSCEV(X));		return getMulExpr(getSCEV(U->getOperand(0)), getSCEV(X), Flags);
}		}
break;		break;

case Instruction::LShr:		case Instruction::LShr:
// Turn logical shift right of a constant into a unsigned divide.		// Turn logical shift right of a constant into a unsigned divide.
if (ConstantInt *SA = dyn_cast<ConstantInt>(U->getOperand(1))) {		if (ConstantInt *SA = dyn_cast<ConstantInt>(U->getOperand(1))) {
uint32_t BitWidth = cast<IntegerType>(U->getType())->getBitWidth();		uint32_t BitWidth = cast<IntegerType>(U->getType())->getBitWidth();
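
The BitWidth - 1 case discussed in the inline comments above can be illustrated outside LLVM. The following is a minimal sketch, assuming the LangRef reading under which shl nsw is poison when a shifted-out bit disagrees with the resulting sign bit (equivalently, when x * 2^amt does not fit in i32). The helper names are hypothetical and are not part of ScalarEvolution. The sketch shows that for a shift by 31 the one-bit constant reads as INT32_MIN when treated as a signed multiplier, so "shl nsw" and "mul nsw" disagree about which inputs overflow.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

constexpr bool fitsInInt32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// Assumed reading of "shl nsw i32 x, amt" (0 <= amt < 32): poison exactly
// when the mathematical value x * 2^amt is not representable in i32.
constexpr bool shlNswIsPoison(int32_t x, unsigned amt) {
  return !fitsInInt32(static_cast<int64_t>(x) * (int64_t{1} << amt));
}

// The i32 constant with only bit `amt` set, read as a signed value.
// For amt == 31 this is INT32_MIN, not +2^31.
constexpr int64_t oneBitAsSigned(unsigned amt) {
  return amt == 31 ? static_cast<int64_t>(INT32_MIN) : (int64_t{1} << amt);
}

// "mul nsw i32 x, (1 << amt)": poison when the signed product overflows.
constexpr bool mulNswByOneBitIsPoison(int32_t x, unsigned amt) {
  return !fitsInInt32(static_cast<int64_t>(x) * oneBitAsSigned(amt));
}
```

Under these assumptions the two predicates agree for every shift amount below 31 and differ at 31 (for example at x = 1 and x = -1), which is consistent with the patch only taking flags when the shift amount is ult BitWidth - 1.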


test/Analysis/Delinearization/a.ll

	; RUN: opt < %s -analyze -delinearize \| FileCheck %s			; RUN: opt < %s -analyze -delinearize \| FileCheck %s
	;			;
	; void foo(long n, long m, long o, int A[n][m][o]) {			; void foo(long n, long m, long o, int A[n][m][o]) {
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; A[2i+3][3j-4][5*k+7] = 1;			; A[2i+3][3j-4][5*k+7] = 1;
	; }			; }

	; AddRec: {{{(28 + (4 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,(12 * %o)}<%for.j>,+,20}<%for.k>			; AddRec: {{{(28 + (4 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,(12 * %o)}<%for.j>,+,20}<%for.k>
	; CHECK: Base offset: %A			; CHECK: Base offset: %A
	; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 4 bytes.			; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 4 bytes.
	; CHECK: ArrayRef[{3,+,2}<%for.i>][{-4,+,3}<%for.j>][{7,+,5}<%for.k>]			; CHECK: ArrayRef[{3,+,2}<%for.i>][{-4,+,3}<%for.j>][{7,+,5}<nw><%for.k>]

	define void @foo(i64 %n, i64 %m, i64 %o, i32* nocapture %A) #0 {			define void @foo(i64 %n, i64 %m, i64 %o, i32* nocapture %A) #0 {
	entry:			entry:
	%cmp32 = icmp sgt i64 %n, 0			%cmp32 = icmp sgt i64 %n, 0
	br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end17			br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end17

	for.cond1.preheader.lr.ph: ; preds = %entry			for.cond1.preheader.lr.ph: ; preds = %entry
	%cmp230 = icmp sgt i64 %m, 0			%cmp230 = icmp sgt i64 %m, 0
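
The AddRec in the test header above can be cross-checked against the source subscripts by hand. The sketch below is illustrative only (the function names are hypothetical): it computes the byte offset of A[2i+3][3j-4][5*k+7] for 4-byte elements directly from the subscripts, and again from the start value and per-loop steps printed in the AddRec comment.

```cpp
#include <cstdint>

// Byte offset of A[2i+3][3j-4][5k+7] in "int A[n][m][o]" (4-byte elements),
// computed straight from the subscripts.
constexpr int64_t offsetFromSubscripts(int64_t m, int64_t o,
                                       int64_t i, int64_t j, int64_t k) {
  return 4 * (((2 * i + 3) * m + (3 * j - 4)) * o + (5 * k + 7));
}

// The same offset read off the AddRec (without the %A base pointer):
// start 28 + 4 * (-4 + 3 * m) * o, step 8*m*o per i, 12*o per j, 20 per k.
constexpr int64_t offsetFromAddRec(int64_t m, int64_t o,
                                   int64_t i, int64_t j, int64_t k) {
  return (28 + 4 * (-4 + 3 * m) * o) + (8 * m * o) * i + (12 * o) * j + 20 * k;
}
```

Expanding the first expression gives 8*i*m*o + 12*m*o + 12*j*o - 16*o + 20*k + 28, which matches the second term by term.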

test/Analysis/ScalarEvolution/flags-from-poison.ll

		; CHECK: --> {(1 + %offset),+,1}<nsw>

%ptr = getelementptr inbounds float, float* %input, i32 %index32		%ptr = getelementptr inbounds float, float* %input, i32 %index32
%nexti = add nsw i32 %i, 1		%nexti = add nsw i32 %i, 1
store float 1.0, float* %ptr, align 4		store float 1.0, float* %ptr, align 4
br label %loop2		br label %loop2
exit:		exit:
ret void		ret void
}		}

		; Example where a mul should get the nsw flag, so that a sext can be
		; distributed over the mul.
		define void @test-mul-nsw(float* %input, i32 %stride, i32 %numIterations) {
		; CHECK-LABEL: @test-mul-nsw
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {0,+,%stride}<nsw>
		%index32 = mul nsw i32 %i, %stride

		; CHECK: %index64 =
		; CHECK: --> {0,+,(sext i32 %stride to i64)}<nsw>
		%index64 = sext i32 %index32 to i64

		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%nexti = add nsw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop
		exit:
		ret void
		}
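
Why the nsw flag matters for distributing the sext in the test above can be shown with plain integers. This is a minimal sketch with hypothetical helper names, not code from the patch: when the 32-bit multiply does not overflow in the signed sense, extending the product equals multiplying the extended operands, and when it does overflow the two differ.

```cpp
#include <cstdint>

// Wrapping 32-bit multiply: what "mul i32" computes when no flags are set.
constexpr int32_t mulWrap32(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) *
                              static_cast<uint32_t>(b));
}

// sext(a * b): multiply in 32 bits, then sign-extend to 64 bits.
constexpr int64_t sextOfMul(int32_t a, int32_t b) {
  return static_cast<int64_t>(mulWrap32(a, b));
}

// sext(a) * sext(b): sign-extend first, then multiply in 64 bits.
constexpr int64_t mulOfSext(int32_t a, int32_t b) {
  return static_cast<int64_t>(a) * static_cast<int64_t>(b);
}

// True when the multiply would satisfy nsw, i.e. the exact product fits in i32.
constexpr bool mulNswHolds(int32_t a, int32_t b) {
  return mulOfSext(a, b) >= INT32_MIN && mulOfSext(a, b) <= INT32_MAX;
}
```

For example, 1000 * 2000 satisfies nsw and both forms give 2000000, whereas 65536 * 65536 wraps to 0 in 32 bits but is 2^32 after extension.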

		; Example where a mul should get the nuw flag.
		define void @test-mul-nuw(float* %input, i32 %stride, i32 %numIterations) {
		; CHECK-LABEL: @test-mul-nuw
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {0,+,%stride}<nuw>
		%index32 = mul nuw i32 %i, %stride

		%ptr = getelementptr inbounds float, float* %input, i32 %index32
		%nexti = add nuw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		exit:
		ret void
		}

		; Example where a shl should get the nsw flag, so that a sext can be
		; distributed over the shl.
		define void @test-shl-nsw(float* %input, i32 %start, i32 %numIterations) {
		; CHECK-LABEL: @test-shl-nsw
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ %start, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {(256 * %start),+,256}<nsw>
		%index32 = shl nsw i32 %i, 8

		; CHECK: %index64 =
		; CHECK: --> {(sext i32 (256 * %start) to i64),+,256}<nsw>
		%index64 = sext i32 %index32 to i64

		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%nexti = add nsw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop
		exit:
		ret void
		}

		; Example where a shl should get the nuw flag.
		define void @test-shl-nuw(float* %input, i32 %numIterations) {
		; CHECK-LABEL: @test-shl-nuw
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {0,+,512}<nuw>
		%index32 = shl nuw i32 %i, 9

		%ptr = getelementptr inbounds float, float* %input, i32 %index32
		%nexti = add nuw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		exit:
		ret void
		}

		; Example where a sub should not get the nsw flag, because of how
		; scalar evolution represents A - B as A + (-B) and -B can wrap even
		; in cases where A - B does not.
		define void @test-sub-no-nsw(float* %input, i32 %start, i32 %sub, i32 %numIterations) {
		; CHECK-LABEL: @test-sub-no-nsw
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ %start, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {((-1 * %sub) + %start),+,1}<nw>
		%index32 = sub nsw i32 %i, %sub
		%index64 = sext i32 %index32 to i64

		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%nexti = add nsw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop
		exit:
		ret void
		}
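
The comment on test-sub-no-nsw says that -B can wrap even when A - B does not. A concrete instance, sketched with hypothetical helpers and exact 64-bit arithmetic: A = -1 and B = INT32_MIN give A - B = INT32_MAX, which fits, while -B = 2^31 does not.

```cpp
#include <cstdint>

constexpr bool fitsInInt32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

// True when "sub nsw i32 a, b" does not overflow.
constexpr bool subNswHolds(int32_t a, int32_t b) {
  return fitsInInt32(static_cast<int64_t>(a) - static_cast<int64_t>(b));
}

// True when the negation (-1 * b) does not overflow in i32.
constexpr bool negNswHolds(int32_t b) {
  return fitsInInt32(-static_cast<int64_t>(b));
}
```

Because SCEV models the subtraction as A + (-1 * B), a flag that is valid for the subtraction as a whole cannot simply be copied onto the negation.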

		; Example where a sub should get the nsw flag as the RHS cannot be the
		; minimal signed value.
		define void @test-sub-nsw(float* %input, i32 %start, i32 %sub, i32 %numIterations) {
		; CHECK-LABEL: @test-sub-nsw
		entry:
		%halfsub = ashr i32 %sub, 1
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ %start, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {((-1 * %halfsub)<nsw> + %start),+,1}<nsw>
		%index32 = sub nsw i32 %i, %halfsub
		%index64 = sext i32 %index32 to i64

		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%nexti = add nsw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop
		exit:
		ret void
		}

		; Example where a sub should get the nsw flag, since the LHS is non-negative,
		; which implies that the RHS cannot be the minimal signed value.
		define void @test-sub-nsw-lhs-non-negative(float* %input, i32 %sub, i32 %numIterations) {
		; CHECK-LABEL: @test-sub-nsw-lhs-non-negative
		entry:
		br label %loop
		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

		; CHECK: %index32 =
		; CHECK: --> {(-1 * %sub),+,1}<nsw>
		%index32 = sub nsw i32 %i, %sub

		; CHECK: %index64 =
		; CHECK: --> {(sext i32 (-1 * %sub) to i64),+,1}<nsw>
		%index64 = sext i32 %index32 to i64

		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%nexti = add nsw i32 %i, 1
		%f = load float, float* %ptr, align 4
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop
		exit:
		ret void
		}
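
The reasoning behind test-sub-nsw-lhs-non-negative (a non-negative LHS plus a non-overflowing subtraction implies the RHS is not the minimal signed value) can be checked exhaustively at a small bit width. The sketch below does this for 8-bit values as a stand-in for i32. The function names are hypothetical and the check is an illustration, not a proof about the SCEV implementation.

```cpp
#include <cstdint>

// For every a >= 0 and every b in the i8 range: if a - b fits in i8,
// then -b fits in i8 as well. Returns false if a counterexample is found.
bool nonNegativeLhsMakesNegationSafe() {
  for (int a = 0; a <= INT8_MAX; ++a) {
    for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
      bool subFits = (a - b) >= INT8_MIN && (a - b) <= INT8_MAX;
      bool negFits = (-b) <= INT8_MAX;  // only b == INT8_MIN fails this
      if (subFits && !negFits)
        return false;
    }
  }
  return true;
}

// With a negative LHS the implication breaks, e.g. a = -1, b = INT8_MIN.
bool negativeLhsCounterexampleExists() {
  for (int a = INT8_MIN; a < 0; ++a) {
    for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
      bool subFits = (a - b) >= INT8_MIN && (a - b) <= INT8_MAX;
      bool negFits = (-b) <= INT8_MAX;
      if (subFits && !negFits)
        return true;
    }
  }
  return false;
}
```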

		; Two adds with a sub in the middle and the sub should have nsw. There is
		; a special case for sequential adds/subs and this test covers that. We have to
		; put the final add first in the program since otherwise the special case
		; is not triggered, hence the strange basic block ordering.
		define void @test-sub-with-add(float* %input, i32 %offset, i32 %numIterations) {
		; CHECK-LABEL: @test-sub-with-add
		entry:
		br label %loop
		loop2:
		; CHECK: %seq =
		; CHECK: --> {(2 + (-1 * %offset)),+,1}<nw>
		%seq = add nsw nuw i32 %index32, 1
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		loop:
		%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]

		%j = add nsw i32 %i, 1
		; CHECK: %index32 =
		; CHECK: --> {(1 + (-1 * %offset)),+,1}<nsw>
		%index32 = sub nsw i32 %j, %offset

		%ptr = getelementptr inbounds float, float* %input, i32 %index32
		%nexti = add nsw i32 %i, 1
		store float 1.0, float* %ptr, align 4
		br label %loop2
		exit:
		ret void
		}


		; Subtraction of two recurrences. The addition in the SCEV that this
		; maps to is NSW, but the negation of the RHS does not since that
		; recurrence could be the most negative representable value.
		define void @subrecurrences(i32 %outer_l, i32 %inner_l, i32 %val) {
		; CHECK-LABEL: @subrecurrences
		entry:
		br label %outer

		outer:
		%o_idx = phi i32 [ 0, %entry ], [ %o_idx.inc, %outer.be ]
		%o_idx.inc = add nsw i32 %o_idx, 1
		%cond = icmp eq i32 %o_idx, %val
		br i1 %cond, label %inner, label %outer.be

		inner:
		%i_idx = phi i32 [ 0, %outer ], [ %i_idx.inc, %inner ]
		%i_idx.inc = add nsw i32 %i_idx, 1
		; CHECK: %v =
		; CHECK-NEXT: --> {{[{][{]}}-1,+,-1}<nw><%outer>,+,1}<nsw><%inner>
		%v = sub nsw i32 %i_idx, %o_idx.inc
		%forub = udiv i32 1, %v
		%cond2 = icmp eq i32 %i_idx, %inner_l
		br i1 %cond2, label %outer.be, label %inner

		outer.be:
		%cond3 = icmp eq i32 %o_idx, %outer_l
		br i1 %cond3, label %exit, label %outer

		exit:
		ret void
		}

test/Analysis/ScalarEvolution/min-max-exprs.ll

	bb2: ; preds = %bb1			bb2: ; preds = %bb1
	%tmp3 = add nuw nsw i32 %i.0, 3			%tmp3 = add nuw nsw i32 %i.0, 3
	%tmp4 = icmp slt i32 %tmp3, %N			%tmp4 = icmp slt i32 %tmp3, %N
	%tmp5 = sext i32 %tmp3 to i64			%tmp5 = sext i32 %tmp3 to i64
	%tmp6 = sext i32 %N to i64			%tmp6 = sext i32 %N to i64
	%tmp9 = select i1 %tmp4, i64 %tmp5, i64 %tmp6			%tmp9 = select i1 %tmp4, i64 %tmp5, i64 %tmp6
	; min(N, i+3)			; min(N, i+3)
	; CHECK: select i1 %tmp4, i64 %tmp5, i64 %tmp6			; CHECK: select i1 %tmp4, i64 %tmp5, i64 %tmp6
	; CHECK-NEXT: --> (-1 + (-1 * ((-1 + (-1 * (sext i32 {3,+,1}<nw><%bb1> to i64))) smax (-1 + (-1 * (sext i32 %N to i64))))))			; CHECK-NEXT: --> (-1 + (-1 * ((-1 + (-1 * (sext i32 {3,+,1}<nw><%bb1> to i64))<nsw>) smax (-1 + (-1 * (sext i32 %N to i64))<nsw>)))<nsw>)
				sanjoy: I'm surprised this changed in the last version -- I think you only made SCEV more conservative in this last revision?
				broune (author): The NSW on the negation is now applied (if it can be proven) even if the subtraction does not have a NSW flag.
	%tmp11 = getelementptr inbounds i32, i32* %A, i64 %tmp9			%tmp11 = getelementptr inbounds i32, i32* %A, i64 %tmp9
	%tmp12 = load i32, i32* %tmp11, align 4			%tmp12 = load i32, i32* %tmp11, align 4
	%tmp13 = shl nsw i32 %tmp12, 1			%tmp13 = shl nsw i32 %tmp12, 1
	%tmp14 = icmp sge i32 3, %i.0			%tmp14 = icmp sge i32 3, %i.0
	%tmp17 = add nsw i64 %i.0.1, -3			%tmp17 = add nsw i64 %i.0.1, -3
	%tmp19 = select i1 %tmp14, i64 0, i64 %tmp17			%tmp19 = select i1 %tmp14, i64 0, i64 %tmp17
	; max(0, i - 3)			; max(0, i - 3)
	; CHECK: select i1 %tmp14, i64 0, i64 %tmp17			; CHECK: select i1 %tmp14, i64 0, i64 %tmp17
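
The CHECK-NEXT line for min(N, i+3) above encodes a signed minimum as -1 + -1 * smax(-1 + -1 * a, -1 + -1 * b), where -1 + -1 * x is the bitwise complement of x. A small sketch of that identity follows, using hypothetical helper names. It assumes, as in the test, that both operands are sign-extended from i32, so the 64-bit negations cannot overflow, which is consistent with the <nsw> flags that now appear on them.

```cpp
#include <cstdint>

// -1 + (-1 * x), i.e. the bitwise complement of x written arithmetically.
// Safe for any x that was sign-extended from i32.
constexpr int64_t notViaArith(int64_t x) {
  return -1 + (-1 * x);
}

constexpr int64_t smax(int64_t a, int64_t b) {
  return a > b ? a : b;
}

// Signed minimum expressed through smax, mirroring the printed SCEV shape.
constexpr int64_t sminViaSmax(int32_t a, int32_t b) {
  return notViaArith(smax(notViaArith(static_cast<int64_t>(a)),
                          notViaArith(static_cast<int64_t>(b))));
}
```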

test/Transforms/LoopStrengthReduce/sext-ind-var.ll

; RUN: opt -loop-reduce -S < %s \| FileCheck %s		; RUN: opt -loop-reduce -S < %s \| FileCheck %s

target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"		target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-unknown-unknown"		target triple = "nvptx64-unknown-unknown"

; LSR used not to be able to generate a float* induction variable in		; LSR used not to be able to generate a float* induction variable in
; these cases due to scalar evolution not propagating nsw from an		; these cases due to scalar evolution not propagating nsw from an
; instruction to the SCEV, preventing distributing sext into the		; instruction to the SCEV, preventing distributing sext into the
; corresponding addrec.		; corresponding addrec.

		; Test this pattern:
		;
		; for (int i = 0; i < numIterations; ++i)
		; sum += ptr[i + offset];
		;
define float @testadd(float* %input, i32 %offset, i32 %numIterations) {		define float @testadd(float* %input, i32 %offset, i32 %numIterations) {
; CHECK-LABEL: @testadd		; CHECK-LABEL: @testadd
; CHECK: sext i32 %offset to i64		; CHECK: sext i32 %offset to i64
; CHECK: loop:		; CHECK: loop:
; CHECK-DAG: phi float*		; CHECK-DAG: phi float*
; CHECK-DAG: phi i32		; CHECK-DAG: phi i32
; CHECK-NOT: sext		; CHECK-NOT: sext

		loop:
%nextsum = fadd float %sum, %addend		%nextsum = fadd float %sum, %addend
%nexti = add nuw nsw i32 %i, 1		%nexti = add nuw nsw i32 %i, 1
%exitcond = icmp eq i32 %nexti, %numIterations		%exitcond = icmp eq i32 %nexti, %numIterations
br i1 %exitcond, label %exit, label %loop		br i1 %exitcond, label %exit, label %loop

exit:		exit:
ret float %nextsum		ret float %nextsum
}		}

		; Test this pattern:
		;
		; for (int i = 0; i < numIterations; ++i)
		; sum += ptr[i - offset];
		;
		define float @testsub(float* %input, i32 %offset, i32 %numIterations) {
		; CHECK-LABEL: @testsub
		; CHECK: sub i32 0, %offset
		; CHECK: sext i32
		; CHECK: loop:
		; CHECK-DAG: phi float*
		; CHECK-DAG: phi i32
		; CHECK-NOT: sext

		entry:
		br label %loop

		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]
		%sum = phi float [ %nextsum, %loop ], [ 0.000000e+00, %entry ]
		%index32 = sub nuw nsw i32 %i, %offset
		%index64 = sext i32 %index32 to i64
		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%addend = load float, float* %ptr, align 4
		%nextsum = fadd float %sum, %addend
		%nexti = add nuw nsw i32 %i, 1
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		exit:
		ret float %nextsum
		}
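
At the source level, the rewrite that the testsub CHECK lines look for (a negate and sext before the loop, a float* phi inside it, and no sext in the loop body) corresponds roughly to the second function below. This is a hand-written sketch of the idea, not LSR output. The function names are hypothetical, and it assumes offset != INT32_MIN, that i - offset never overflows (which is what nsw asserts in the IR), and that every accessed element lies inside one array.

```cpp
#include <cstdint>

// Index form: a 32-bit index is recomputed and sign-extended every iteration.
float sumIndexed(const float *ptr, int32_t offset, int32_t numIterations) {
  float sum = 0.0f;
  for (int32_t i = 0; i < numIterations; ++i)
    sum += ptr[static_cast<int64_t>(i - offset)];
  return sum;
}

// Pointer-induction form: negate and sign-extend the offset once, then
// advance a float* by one element per iteration.
float sumPointerIV(const float *ptr, int32_t offset, int32_t numIterations) {
  const float *p = ptr + static_cast<int64_t>(-offset);
  float sum = 0.0f;
  for (int32_t i = 0; i < numIterations; ++i, ++p)
    sum += *p;
  return sum;
}
```

Both functions add the same elements in the same order, so under the stated assumptions they return identical results.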

		; Test this pattern:
		;
		; for (int i = 0; i < numIterations; ++i)
		; sum += ptr[i * stride];
		;
		define float @testmul(float* %input, i32 %stride, i32 %numIterations) {
		; CHECK-LABEL: @testmul
		; CHECK: sext i32 %stride to i64
		; CHECK: loop:
		; CHECK-DAG: phi float*
		; CHECK-DAG: phi i32
		; CHECK-NOT: sext

		entry:
		br label %loop

		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]
		%sum = phi float [ %nextsum, %loop ], [ 0.000000e+00, %entry ]
		%index32 = mul nuw nsw i32 %i, %stride
		%index64 = sext i32 %index32 to i64
		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%addend = load float, float* %ptr, align 4
		%nextsum = fadd float %sum, %addend
		%nexti = add nuw nsw i32 %i, 1
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		exit:
		ret float %nextsum
		}

		; Test this pattern:
		;
		; for (int i = 0; i < numIterations; ++i)
		; sum += ptr[3 * (i << 7)];
		;
		; The multiplication by 3 is to make the address calculation expensive
		; enough to force the introduction of a pointer induction variable.
		define float @testshl(float* %input, i32 %numIterations) {
		; CHECK-LABEL: @testshl
		; CHECK: loop:
		; CHECK-DAG: phi float*
		; CHECK-DAG: phi i32
		; CHECK-NOT: sext

		entry:
		br label %loop

		loop:
		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]
		%sum = phi float [ %nextsum, %loop ], [ 0.000000e+00, %entry ]
		%index32 = shl nuw nsw i32 %i, 7
		%index32mul = mul nuw nsw i32 %index32, 3
		%index64 = sext i32 %index32mul to i64
		%ptr = getelementptr inbounds float, float* %input, i64 %index64
		%addend = load float, float* %ptr, align 4
		%nextsum = fadd float %sum, %addend
		%nexti = add nuw nsw i32 %i, 1
		%exitcond = icmp eq i32 %nexti, %numIterations
		br i1 %exitcond, label %exit, label %loop

		exit:
		ret float %nextsum
		}


Revision Contents

Diff 32126
