This is an archive of the discontinued LLVM Phabricator instance.

Differential D33316

[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
ClosedPublic

Authored by mkazantsev on May 18 2017, 5:32 AM.

Details

Reviewers

sanjoy
reames
anna
apilipenko
skatkov

Summary

When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrence is the bottom-most in terms of domination. This assumption
may be broken by an expression which is treated as invariant but depends on a complex
Phi for which a SCEVUnknown was created. If such a Phi is a loop Phi, and its loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.

Another reason why it might be invalid to fold a SCEVUnknown into the Phi's start value is that, unlike
other SCEVs, SCEVUnknowns are sometimes position-bound. For example, here:

for (...) { // loop
  phi = {A,+,B}
}
X = load ...

Folding phi + X into {A+X,+,B}<loop> makes no sense, because X does not exist and cannot
exist while we are iterating in the loop (the memory it loads from may not even be allocated or filled at that point).
Such folding is only valid if X is defined before the loop. In that case the recurrence {A+X,+,B}<loop>
can exist.

This patch prohibits folding a SCEVUnknown (or an expression that uses one) into the start value of an AddRecExpr
if the instruction behind the SCEVUnknown is dominated by the loop. Merging unknown values that dominate the loop is still valid. Some tests that
relied on such SCEVUnknowns being folded into AddRecs are changed so that they no longer
expect such behavior.
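
As an illustration of the rule in the summary, here is a minimal sketch on a toy dominator tree. It is not the patch's code: `ToyDomTree` and `availableAtLoopEntry` are hypothetical names, blocks are plain integers, and each SCEVUnknown is reduced to the block that defines it. It only mirrors the idea that a value whose defining block is dominated by the loop header cannot be folded into the recurrence's start.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree (an assumption for illustration, not llvm::DominatorTree).
// Idom[B] is the immediate dominator of block B; the entry block is index 0
// and is recorded as its own immediate dominator.
struct ToyDomTree {
  std::vector<int> Idom;

  // Returns true if block A dominates block B. Every block dominates itself.
  bool dominates(int A, int B) const {
    assert(A >= 0 && A < static_cast<int>(Idom.size()) && "bad block A");
    assert(B >= 0 && B < static_cast<int>(Idom.size()) && "bad block B");
    while (true) {
      if (A == B)
        return true;
      if (B == 0)
        return false; // Walked up to the entry block without meeting A.
      B = Idom[B];
    }
  }
};

// Hypothetical stand-in for the check described in the summary: an expression
// is usable at the loop's entry only if none of the unknown values it depends
// on is defined in a block dominated by the loop header.
bool availableAtLoopEntry(const ToyDomTree &DT,
                          const std::vector<int> &UnknownDefBlocks,
                          int Header) {
  for (int Def : UnknownDefBlocks)
    if (DT.dominates(Header, Def))
      return false;
  return true;
}
```

With blocks 0 (entry), 1 (loop header) and 2 (loop exit), a value defined in block 0 counts as available at loop entry, while a value defined in block 2, like the load X in the example above, does not.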

Diff Detail

Event Timeline

mkazantsev created this revision. May 18 2017, 5:32 AM

Herald added a subscriber: mzolotukhin. May 18 2017, 5:32 AM

Comments inline.

lib/Analysis/ScalarEvolution.cpp
2183	I'd not call this complex rec, since I don't think this is specific to PHI nodes (see below). Instead, I'd just call it `SCEVDominatesLoop` or something like that.
2184	Can you use `SCEVTraversal` here? That'll also prevent exponential complexity.
2210	I'm not sure why this can't happen with non-PHIs. IMO the condition should be "the expression the SCEVUnknown corresponds to must dominate L->getHeader()".
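
The exponential-complexity concern arises because expressions form a DAG in which operands can be shared, so a naive recursive walk may revisit the same node many times. The sketch below is a toy model, not LLVM's implementation: `ToyExpr`, `CountUnknowns` and `visitAll` are hypothetical names. It shows a worklist traversal with a visited set, driven by a visitor that exposes `follow()` and `isDone()`, the same two hooks the visitor in this patch provides.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy expression node (an assumption for illustration). Operands may be
// shared between several parents, so expressions form a DAG, not a tree.
struct ToyExpr {
  bool IsUnknown = false;
  std::vector<const ToyExpr *> Ops;
};

// Example visitor: counts how many nodes were inspected and how many of them
// are "unknown" leaves. follow() returns true if operands should be visited.
struct CountUnknowns {
  int Visited = 0;
  int Unknowns = 0;

  bool follow(const ToyExpr *E) {
    ++Visited;
    if (E->IsUnknown) {
      ++Unknowns;
      return false; // Leaves have nothing to descend into.
    }
    return true;
  }

  // Returning true here would stop the traversal early.
  bool isDone() const { return false; }
};

// Worklist traversal that calls V.follow() at most once per distinct node.
template <typename Visitor>
void visitAll(const ToyExpr *Root, Visitor &V) {
  assert(Root && "expected a non-null root");
  std::unordered_set<const ToyExpr *> Seen;
  std::vector<const ToyExpr *> Worklist;

  auto Push = [&](const ToyExpr *E) {
    // Only nodes seen for the first time are offered to the visitor.
    if (Seen.insert(E).second && V.follow(E))
      Worklist.push_back(E);
  };

  Push(Root);
  while (!Worklist.empty() && !V.isDone()) {
    const ToyExpr *E = Worklist.back();
    Worklist.pop_back();
    for (const ToyExpr *Op : E->Ops)
      Push(Op);
  }
}
```

On a chain of 21 nodes in which every node uses the previous one as both of its operands, a naive recursive walk would take on the order of a million steps, whereas this traversal calls `follow()` 21 times.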

This revision now requires changes to proceed. May 18 2017, 10:59 AM

mkazantsev added inline comments. May 18 2017, 11:44 PM

lib/Analysis/ScalarEvolution.cpp
2210	Good point, the same situation could happen if we applied a non-SCEVable operation to our Phi.

Need to address the comments. Also need to fix the typo in commit message (lost ---> most).

Addressed the comments, and also made the analysis more accurate. We only react to a SCEVUnknown if there is another loop between it and our loop, and this loop's header has Phis.

sanjoy added inline comments. May 19 2017, 12:19 PM

lib/Analysis/ScalarEvolution.cpp
2197	I believe this should be more general, and we should disallow forming `{%x,+,%y}<%loop>` unless `%x` and `%y` both dominate `%loop`[0]. Whether there is a PHI involved or not should not matter. In particular, `{%x,+,%y}` is an expression that evaluates to `%x` on the 0th iteration of the loop. This is meaningless unless `%x` dominates the loop header. However, if you fix this for good, you may see some fall out of LSR, but we should just go ahead and fix those. [0]: or, if they're complex SCEV expressions, all of the contained SCEVUnknown and SCEVAddRec expressions must dominate `%loop`.
4428	Why do you need this? Even if you do, please just check it in separately without further review.

sanjoy requested changes to this revision. May 19 2017, 5:04 PM

This revision now requires changes to proceed. May 19 2017, 5:04 PM

mkazantsev added inline comments. May 21 2017, 9:27 PM

lib/Analysis/ScalarEvolution.cpp
2197	Consider two situations:

Case 1:
for (...) {
  %phi = {a,+,b}
}
%x = ...
%y = %phi + %x

Case 2:
%x = ...
for (...) {
  %phi = {a,+,b}
}
%y = %phi + %x

If %x doesn't depend on the loop in any way, I don't see a reason why SCEV should not evaluate %y equally in both cases. In case 2, it is absolutely OK to imagine a recurrence {a+x,+,b} (it could be explicitly calculated in the loop by creating a Phi), but in case 1 such a transformation is also valid because SCEV should be place-independent, and if nothing prohibits us from calculating %x before the loop (even if the actual instruction stands after it), we should be able to deal with it just as if it was calculated before the loop, if it makes sense. Is it not correct?

The only problem SCEVUnknown creates is that it can actually be a varying value in some loop which does not dominate L. In this case our logic of picking the bottom-most loop in getAddExpr and getMulExpr becomes incorrect. We are just picking the wrong loop, because we can only pick a loop of an AddRecExpr. If there are no loops with Phis between our SCEVUnknown and L, all varying values that might be used by the SCEVUnknown actually dominate the loop L (and this is actually what we want in [0]). It means that our picking of the loop in AddExpr is correct. We could just prohibit all SCEVUnknown below L, but I think this would be too strong a reduction of the scope.
4428	Yeah, this wasn't intended to be here. Will be removed.

sanjoy added inline comments. May 21 2017, 10:50 PM

lib/Analysis/ScalarEvolution.cpp
2197	I don't think the two situations are the same. Firstly, I agree that in case 2, `%y` should be `{%x+a,+,b}`. In case 1, however, I don't think the same should hold. The definition of a `{a,+,b}` is that it is equal to `a` on the 0th iteration and that makes little sense if `a` does not dominate the loop. For instance, if we allow such SCEV expressions, we won't always be able to compute a "value at exit" (like we do in `rewriteLoopExitValues`) that is available on a loop exit. However, repeating what I said before, I suspect we may have to compromise on this due to LSR (i.e. LSR may be relying on us also folding case 1), but we should try and see.
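
The definition appealed to here can be made concrete with a small sketch. It is a toy model under stated assumptions: `ToyAddRec`, `evaluateAtIteration` and `foldInvariantIntoStart` are hypothetical names, values are plain 64-bit integers, and wrapping and bit widths (which real SCEV tracks) are ignored. It shows that `{a,+,b}` equals `a` on iteration 0 and that folding `x + {a,+,b}` into `{a+x,+,b}` is arithmetically sound only if `x` already has a value when iteration 0 starts.

```cpp
#include <cassert>
#include <cstdint>

// Toy affine recurrence {Start,+,Step} (an assumption for illustration).
struct ToyAddRec {
  std::int64_t Start;
  std::int64_t Step;
};

// Value of the recurrence on iteration N: Start on iteration 0, then Step is
// added once per completed iteration.
std::int64_t evaluateAtIteration(const ToyAddRec &AR, std::int64_t N) {
  assert(N >= 0 && "iteration numbers start at 0");
  return AR.Start + N * AR.Step;
}

// Folds a loop-invariant value X into the start: X + {A,+,B} == {A+X,+,B}.
// The caller must already know X before the loop runs, which is exactly the
// availability requirement discussed in this thread.
ToyAddRec foldInvariantIntoStart(const ToyAddRec &AR, std::int64_t X) {
  return ToyAddRec{AR.Start + X, AR.Step};
}
```

The folded recurrence needs `Start + X` on iteration 0, so in case 1, where `%x` is only defined after the loop, there is no point inside the loop at which the folded start value could be materialized.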

mkazantsev added inline comments. May 23 2017, 1:26 AM

lib/Analysis/ScalarEvolution.cpp
2197	Doing this breaks 4 tests (CHECKs fail), but at least in some of them we can get rid of the SCEVUnknown by making some minor improvements. I will go through them carefully and make these changes, and then we'll see if we lost something important.

mkazantsev updated this revision to Diff 99884. May 23 2017, 5:02 AM

mkazantsev edited edge metadata.

mkazantsev retitled this revision from [SCEV] Do not fold expressions with SCEVUnknown Phis into AddRecExpr's to [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start.

mkazantsev edited the summary of this revision. (Show Details)

mkazantsev marked 2 inline comments as done.

lgtm with a minor nit

lib/Analysis/ScalarEvolution.cpp
2182	Instead of `hasDominatedSCEVUnknown`, I'd call this `isAvailableAtLoopEntry`, negate the return value and push the `isLoopInvariant` into this function. This seems more semantic -- we're interested in knowing if a SCEV expression can possibly be computed at `L`'s preheader. You can then shift the comment about invariants (paraphrased as needed) into this function.

This revision is now accepted and ready to land. May 23 2017, 5:04 PM

No functional changes, just renames of methods to make their semantics clearer.

Landed as https://reviews.llvm.org/rL303730

Revision Contents

Path                                                                 Size
include/llvm/Analysis/ScalarEvolution.h                              6 lines
lib/Analysis/ScalarEvolution.cpp                                     61 lines
test/Analysis/IVUsers/quadradic-exit-value.ll                        36 lines
test/Analysis/ScalarEvolution/different-loops-recs.ll                64 lines
test/Transforms/LoopStrengthReduce/X86/incorrect-offset-scaling.ll   12 lines
test/Transforms/LoopStrengthReduce/lsr-expand-quadratic.ll           17 lines
test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll              4 lines

Diff 100049

include/llvm/Analysis/ScalarEvolution.h

[1,527 lines not shown] public:
   /// Return the "disposition" of the given SCEV with respect to the given
   /// loop.
   LoopDisposition getLoopDisposition(const SCEV *S, const Loop *L);

   /// Return true if the value of the given SCEV is unchanging in the
   /// specified loop.
   bool isLoopInvariant(const SCEV *S, const Loop *L);

+  /// Determine if the SCEV can be evaluated at loop's entry. It is true if it
+  /// doesn't depend on a SCEVUnknown of an instruction which is dominated by
+  /// the header of loop L.
+  bool isAvailableAtLoopEntry(const SCEV *S, const Loop *L, DominatorTree &DT,
+                              LoopInfo &LI);
+
   /// Return true if the given SCEV changes value in a known way in the
   /// specified loop. This property being true implies that the value is
   /// variant in the loop AND that we can emit an expression to compute the
   /// value of the expression at any particular loop iteration.
   bool hasComputableLoopEvolution(const SCEV *S, const Loop *L);

   /// Return the "disposition" of the given SCEV with respect to the given
   /// block.
[296 lines not shown]

lib/Analysis/ScalarEvolution.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

[2,172 lines not shown] if (!(SignOrUnsignWrap & SCEV::FlagNUW)) {
       if (NUWRegion.contains(SE->getUnsignedRange(Ops[1])))
         Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW);
     }
   }

   return Flags;
 }

+bool ScalarEvolution::isAvailableAtLoopEntry(const SCEV *S, const Loop *L,
+                                             DominatorTree &DT, LoopInfo &LI) {
+  if (!isLoopInvariant(S, L))
+    return false;
+  // If a value depends on a SCEVUnknown which is defined after the loop, we
+  // conservatively assume that we cannot calculate it at the loop's entry.
+  struct FindDominatedSCEVUnknown {
+    bool Found = false;
+    const Loop *L;
+    DominatorTree &DT;
+    LoopInfo &LI;
+
+    FindDominatedSCEVUnknown(const Loop *L, DominatorTree &DT, LoopInfo &LI)
+        : L(L), DT(DT), LI(LI) {}
+
+    bool checkSCEVUnknown(const SCEVUnknown *SU) {
+      if (auto *I = dyn_cast<Instruction>(SU->getValue())) {
+        if (DT.dominates(L->getHeader(), I->getParent()))
+          Found = true;
+        else
+          assert(DT.dominates(I->getParent(), L->getHeader()) &&
+                 "No dominance relationship between SCEV and loop?");
+      }
+      return false;
+    }
+
+    bool follow(const SCEV *S) {
+      switch (static_cast<SCEVTypes>(S->getSCEVType())) {
+      case scConstant:
+        return false;
+      case scAddRecExpr:
+      case scTruncate:
+      case scZeroExtend:
+      case scSignExtend:
+      case scAddExpr:
+      case scMulExpr:
+      case scUMaxExpr:
+      case scSMaxExpr:
+      case scUDivExpr:
+        return true;
+      case scUnknown:
+        return checkSCEVUnknown(cast<SCEVUnknown>(S));
+      case scCouldNotCompute:
+        llvm_unreachable("Attempt to use a SCEVCouldNotCompute object!");
+      }
+      return false;
+    }
+
+    bool isDone() { return Found; }
+  };
+
+  FindDominatedSCEVUnknown FSU(L, DT, LI);
+  SCEVTraversal<FindDominatedSCEVUnknown> ST(FSU);
+  ST.visitAll(S);
+  return !FSU.Found;
+}

 /// Get a canonical add expression, or something simpler if possible.
 const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
                                         SCEV::NoWrapFlags Flags,
                                         unsigned Depth) {
   assert(!(Flags & ~(SCEV::FlagNUW | SCEV::FlagNSW)) &&
          "only nuw or nsw allowed");
   assert(!Ops.empty() && "Cannot get empty add!");
   if (Ops.size() == 1) return Ops[0];
[265 lines not shown] #endif
   // Scan over all recurrences, trying to fold loop invariants into them.
   for (; Idx < Ops.size() && isa<SCEVAddRecExpr>(Ops[Idx]); ++Idx) {
     // Scan all of the other operands to this add and add them to the vector if
     // they are loop invariant w.r.t. the recurrence.
     SmallVector<const SCEV *, 8> LIOps;
     const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
     const Loop *AddRecLoop = AddRec->getLoop();
     for (unsigned i = 0, e = Ops.size(); i != e; ++i)
-      if (isLoopInvariant(Ops[i], AddRecLoop)) {
+      if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
         LIOps.push_back(Ops[i]);
         Ops.erase(Ops.begin()+i);
         --i; --e;
       }

     // If we found some loop invariants, fold them into the recurrence.
     if (!LIOps.empty()) {
       // NLI + LI + {Start,+,Step} --> NLI + {LI+Start,+,Step}
[258 lines not shown] #endif
   // Scan over all recurrences, trying to fold loop invariants into them.
   for (; Idx < Ops.size() && isa<SCEVAddRecExpr>(Ops[Idx]); ++Idx) {
     // Scan all of the other operands to this mul and add them to the vector if
     // they are loop invariant w.r.t. the recurrence.
     SmallVector<const SCEV *, 8> LIOps;
     const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
     const Loop *AddRecLoop = AddRec->getLoop();
     for (unsigned i = 0, e = Ops.size(); i != e; ++i)
-      if (isLoopInvariant(Ops[i], AddRecLoop)) {
+      if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
         LIOps.push_back(Ops[i]);
         Ops.erase(Ops.begin()+i);
         --i; --e;
       }

     // If we found some loop invariants, fold them into the recurrence.
     if (!LIOps.empty()) {
       // NLI * LI * {Start,+,Step} --> NLI * {LI*Start,+,LI*Step}
[1,617 lines not shown] bool follow(const SCEV *S) {
       case scUDivExpr:
       case scCouldNotCompute:
         // We do not try to smart about these at all.
         return setUnavailable();
       }
       llvm_unreachable("switch should be fully covered!");
     }

     bool isDone() { return TraversalDone; }
   };

   CheckAvailable CA(L, BB, DT);
   SCEVTraversal<CheckAvailable> ST(CA);

   ST.visitAll(S);
   return CA.Available;
 }
[6,583 lines not shown]

test/Analysis/IVUsers/quadradic-exit-value.ll

[24 lines not shown] foo.loop:
   %c = icmp eq i64 %indvar.next, %n
   br i1 %c, label %exit, label %foo.loop

 exit:
   %r = mul i64 %indvar.next, %indvar.next
   ret i64 %r
 }

+; PR15470: LSR miscompile. The test1 function should return '1'.
+; It is valid to fold SCEVUnknown into the recurrence because it
+; was defined before the loop.
+;
+; SCEV does not know how to denormalize chained recurrences, so make
+; sure they aren't marked as post-inc users.
+;
+; CHECK-LABEL: IV Users for loop %test1.loop
+; CHECK-NO-LCSSA: %sext.us = {0,+,(16777216 + (-16777216 * %sub.us))<nuw><nsw>,+,33554432}<%test1.loop> (post-inc with loop %test1.loop) in %f = ashr i32 %sext.us, 24
+define i32 @test1(i1 %cond) {
+entry:
+  %sub.us = select i1 %cond, i32 0, i32 0
+  br label %test1.loop
+
+test1.loop:
+  %inc1115.us = phi i32 [ 0, %entry ], [ %inc11.us, %test1.loop ]
+  %inc11.us = add nsw i32 %inc1115.us, 1
+  %cmp.us = icmp slt i32 %inc11.us, 2
+  br i1 %cmp.us, label %test1.loop, label %for.end
+
+for.end:
+  %tobool.us = icmp eq i32 %inc1115.us, 0
+  %mul.us = shl i32 %inc1115.us, 24
+  %sub.cond.us = sub nsw i32 %inc1115.us, %sub.us
+  %sext.us = mul i32 %mul.us, %sub.cond.us
+  %f = ashr i32 %sext.us, 24
+  br label %exit
+
+exit:
+  ret i32 %f
+}
+
 ; PR15470: LSR miscompile. The test2 function should return '1'.
+; It is illegal to fold SCEVUnknown (sext.us) into the recurrence
+; because it is defined after the loop where this recurrence belongs.
 ;
 ; SCEV does not know how to denormalize chained recurrences, so make
 ; sure they aren't marked as post-inc users.
 ;
 ; CHECK-LABEL: IV Users for loop %test2.loop
-; CHECK-NO-LCSSA: %sext.us = {0,+,(16777216 + (-16777216 * %sub.us))<nuw><nsw>,+,33554432}<%test2.loop> (post-inc with loop %test2.loop) in %f = ashr i32 %sext.us, 24
+; CHECK-NO-LCSSA: %sub.cond.us = ((-1 * %sub.us)<nsw> + {0,+,1}<nuw><nsw><%test2.loop>) (post-inc with loop %test2.loop) in %sext.us = mul i32 %mul.us, %sub.cond.us
 define i32 @test2() {
 entry:
   br label %test2.loop

 test2.loop:
   %inc1115.us = phi i32 [ 0, %entry ], [ %inc11.us, %test2.loop ]
   %inc11.us = add nsw i32 %inc1115.us, 1
   %cmp.us = icmp slt i32 %inc11.us, 2
[14 lines not shown]

test/Analysis/ScalarEvolution/different-loops-recs.ll

[214 lines not shown] exit:
   %s4 = add i32 %phi2, %is2
   %s5 = add i32 %is1, %is2
   %s6 = add i32 %is2, %is1
   ret void
 }

 ; Mix of previous use cases that demonstrates %s3 can be incorrectly treated as
 ; a recurrence of loop1 because of operands order if we pick recurrencies in an
-; incorrect order.
+; incorrect order. It also shows that we cannot safely fold v1 (SCEVUnknown)
+; because we cannot prove for sure that it doesn't use Phis of loop 2.

 define void @test_03(i32 %a, i32 %b, i32 %c, i32* %p) {

 ; CHECK-LABEL: Classifying expressions for: @test_03
 ; CHECK: %v1 = load i32, i32* %p
 ; CHECK-NEXT: --> %v1
 ; CHECK: %s1 = add i32 %phi1, %v1
-; CHECK-NEXT: --> {(%a + %v1),+,1}<%loop1>
+; CHECK-NEXT: --> ({%a,+,1}<%loop1> + %v1)
 ; CHECK: %s2 = add i32 %s1, %b
-; CHECK-NEXT: --> {(%a + %b + %v1),+,1}<%loop1>
+; CHECK-NEXT: --> ({(%a + %b),+,1}<%loop1> + %v1)
 ; CHECK: %s3 = add i32 %s2, %phi2
 ; CHECK-NEXT: --> ({{{{}}((2 * %a) + %b),+,1}<%loop1>,+,2}<%loop2> + %v1)

 entry:
   br label %loop1

 loop1:
   %phi1 = phi i32 [ %a, %entry ], [ %phi1.inc, %loop1 ]
[205 lines not shown] exit:
   %s1 = add i32 %phi1, %phi2
   %s2 = add i32 %phi2, %phi1
   %s3 = add i32 %phi1, %phi3
   %s4 = add i32 %phi3, %phi1
   %s5 = add i32 %phi2, %phi3
   %s6 = add i32 %phi3, %phi2
   ret void
 }

+; Make sure that a complicated Phi does not get folded with rec's start value
+; of a loop which is above.
+define void @test_08() {
+
+; CHECK-LABEL: Classifying expressions for: @test_08
+; CHECK: %tmp11 = add i64 %iv.2.2, %iv.2.1
+; CHECK-NEXT: --> ({0,+,-1}<nsw><%loop_2> + %iv.2.1)
+; CHECK: %tmp12 = trunc i64 %tmp11 to i32
+; CHECK-NEXT: --> (trunc i64 ({0,+,-1}<nsw><%loop_2> + %iv.2.1) to i32)
+; CHECK: %tmp14 = mul i32 %tmp12, %tmp7
+; CHECK-NEXT: --> ((trunc i64 ({0,+,-1}<nsw><%loop_2> + %iv.2.1) to i32) * {-1,+,-1}<%loop_1>)
+; CHECK: %tmp16 = mul i64 %iv.2.1, %iv.1.1
+; CHECK-NEXT: --> ({2,+,1}<nuw><nsw><%loop_1> * %iv.2.1)
+
+entry:
+  br label %loop_1
+
+loop_1:
+  %iv.1.1 = phi i64 [ 2, %entry ], [ %iv.1.1.next, %loop_1_back_branch ]
+  %iv.1.2 = phi i32 [ -1, %entry ], [ %iv.1.2.next, %loop_1_back_branch ]
+  br label %loop_1_exit
+
+dead:
+  br label %loop_1_exit
+
+loop_1_exit:
+  %tmp5 = icmp sgt i64 %iv.1.1, 2
+  br i1 %tmp5, label %loop_2_preheader, label %loop_1_back_branch
+
+loop_1_back_branch:
+  %iv.1.1.next = add nuw nsw i64 %iv.1.1, 1
+  %iv.1.2.next = add nsw i32 %iv.1.2, 1
+  br label %loop_1
+
+loop_2_preheader:
+  %tmp6 = sub i64 1, %iv.1.1
+  %tmp7 = trunc i64 %tmp6 to i32
+  br label %loop_2
+
+loop_2:
+  %iv.2.1 = phi i64 [ 0, %loop_2_preheader ], [ %tmp16, %loop_2 ]
+  %iv.2.2 = phi i64 [ 0, %loop_2_preheader ], [ %iv.2.2.next, %loop_2 ]
+  %iv.2.3 = phi i64 [ 2, %loop_2_preheader ], [ %iv.2.3.next, %loop_2 ]
+  %tmp11 = add i64 %iv.2.2, %iv.2.1
+  %tmp12 = trunc i64 %tmp11 to i32
+  %tmp14 = mul i32 %tmp12, %tmp7
+  %tmp16 = mul i64 %iv.2.1, %iv.1.1
+  %iv.2.3.next = add nuw nsw i64 %iv.2.3, 1
+  %iv.2.2.next = add nsw i64 %iv.2.2, -1
+  %tmp17 = icmp slt i64 %iv.2.3.next, %iv.1.1
+  br i1 %tmp17, label %loop_2, label %exit
+
+exit:
+  %tmp10 = add i32 %iv.1.2, 3
+  ret void
+}

test/Transforms/LoopStrengthReduce/X86/incorrect-offset-scaling.ll

[19 lines not shown]

 L2: ; preds = %idxend.8
   %r1 = add i64 %r13, 1
   br i1 undef, label %L, label %L1

 if6: ; preds = %idxend.8
   %r2 = add i64 %0, -1
   %r3 = load i64, i64* %1, align 8
-; CHECK-NOT: %r2
+; CHECK: %r2 = add i64 %0, -1
 ; CHECK: %r3 = load i64
   br label %ib

 idxend.8: ; preds = %L1
   br i1 undef, label %if6, label %L2

 ib: ; preds = %if6
   %r4 = mul i64 %r3, %r0
   %r5 = add i64 %r2, %r4
   %r6 = icmp ult i64 %r5, undef
-; CHECK: [[MUL1:%[0-9]+]] = mul i64 %lsr.iv, %r3
-; CHECK: [[ADD1:%[0-9]+]] = add i64 [[MUL1]], -1
-; CHECK: add i64 %{{.}}, [[ADD1]]
-; CHECK: %r6
+; CHECK: %r4 = mul i64 %r3, %lsr.iv
+; CHECK: %r5 = add i64 %r2, %r4
+; CHECK: %r6 = icmp ult i64 %r5, undef
+; CHECK: %r7 = getelementptr i64, i64* undef, i64 %r5
   %r7 = getelementptr i64, i64* undef, i64 %r5
   store i64 1, i64* %r7, align 8
-; CHECK: [[MUL2:%[0-9]+]] = mul i64 %lsr.iv, %r3
-; CHECK: [[ADD2:%[0-9]+]] = add i64 [[MUL2]], -1
   br label %L
 }

test/Transforms/LoopStrengthReduce/lsr-expand-quadratic.ll

 ; RUN: opt -loop-reduce -S < %s | FileCheck %s

 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx"

 ; PR15470: LSR miscompile. The test2 function should return '1'.
 ;
 ; SCEV expander cannot expand quadratic recurrences outside of the
 ; loop. This recurrence depends on %sub.us, so can't be expanded.
+; We cannot fold SCEVUnknown (sub.us) with recurrences since it is
+; declared after the loop.
 ;
 ; CHECK-LABEL: @test2
 ; CHECK-LABEL: test2.loop:
-; CHECK: %lsr.iv = phi i32 [ %lsr.iv.next, %test2.loop ], [ -16777216, %entry ]
-; CHECK: %lsr.iv.next = add nsw i32 %lsr.iv, 16777216
+; CHECK: %lsr.iv1 = phi i32 [ %lsr.iv.next2, %test2.loop ], [ -16777216, %entry ]
+; CHECK: %lsr.iv = phi i32 [ %lsr.iv.next, %test2.loop ], [ -1, %entry ]
+; CHECK: %lsr.iv.next = add nsw i32 %lsr.iv, 1
+; CHECK: %lsr.iv.next2 = add nsw i32 %lsr.iv1, 16777216
 ;
 ; CHECK-LABEL: for.end:
-; CHECK: %sub.cond.us = sub nsw i32 %inc1115.us, %sub.us
-; CHECK: %sext.us = mul i32 %lsr.iv.next, %sub.cond.us
+; CHECK: %tobool.us = icmp eq i32 %lsr.iv.next2, 0
+; CHECK: %sub.us = select i1 %tobool.us, i32 0, i32 0
+; CHECK: %1 = sub i32 0, %sub.us
+; CHECK: %2 = add i32 %1, %lsr.iv.next
+; CHECK: %sext.us = mul i32 %lsr.iv.next2, %2
 ; CHECK: %f = ashr i32 %sext.us, 24
 ; CHECK: ret i32 %f
 define i32 @test2() {
 entry:
   br label %test2.loop

 test2.loop:
   %inc1115.us = phi i32 [ 0, %entry ], [ %inc11.us, %test2.loop ]
   %inc11.us = add nsw i32 %inc1115.us, 1
[15 lines not shown]

test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll

	Show All 19 Lines
	%struct.Vector2 = type { i16*, [64 x i16], i32 }			%struct.Vector2 = type { i16*, [64 x i16], i32 }

	@.str = private unnamed_addr constant [37 x i8] c"0123456789abcdefghijklmnopqrstuvwxyz\00"			@.str = private unnamed_addr constant [37 x i8] c"0123456789abcdefghijklmnopqrstuvwxyz\00"

	define void @_Z15IntegerToStringjjR7Vector2(i32 %i, i32 %radix, %struct.Vector2* nocapture %result) nounwind noinline {			define void @_Z15IntegerToStringjjR7Vector2(i32 %i, i32 %radix, %struct.Vector2* nocapture %result) nounwind noinline {
	entry:			entry:
	%buffer = alloca [33 x i16], align 16			%buffer = alloca [33 x i16], align 16
	%add.ptr = getelementptr inbounds [33 x i16], [33 x i16]* %buffer, i64 0, i64 33			%add.ptr = getelementptr inbounds [33 x i16], [33 x i16]* %buffer, i64 0, i64 33
				%sub.ptr.lhs.cast = ptrtoint i16* %add.ptr to i64
				%sub.ptr.rhs.cast = ptrtoint i16* %add.ptr to i64
	br label %do.body			br label %do.body

	do.body: ; preds = %do.body, %entry			do.body: ; preds = %do.body, %entry
	%0 = phi i64 [ %indvar.next44, %do.body ], [ 0, %entry ]			%0 = phi i64 [ %indvar.next44, %do.body ], [ 0, %entry ]
	%i.addr.0 = phi i32 [ %div, %do.body ], [ %i, %entry ]			%i.addr.0 = phi i32 [ %div, %do.body ], [ %i, %entry ]
	%tmp51 = sub i64 32, %0			%tmp51 = sub i64 32, %0
	%incdec.ptr = getelementptr [33 x i16], [33 x i16]* %buffer, i64 0, i64 %tmp51			%incdec.ptr = getelementptr [33 x i16], [33 x i16]* %buffer, i64 0, i64 %tmp51
	%rem = urem i32 %i.addr.0, 10			%rem = urem i32 %i.addr.0, 10
	%div = udiv i32 %i.addr.0, 10			%div = udiv i32 %i.addr.0, 10
	%idxprom = zext i32 %rem to i64			%idxprom = zext i32 %rem to i64
	%arrayidx = getelementptr inbounds [37 x i8], [37 x i8]* @.str, i64 0, i64 %idxprom			%arrayidx = getelementptr inbounds [37 x i8], [37 x i8]* @.str, i64 0, i64 %idxprom
	%tmp5 = load i8, i8* %arrayidx, align 1			%tmp5 = load i8, i8* %arrayidx, align 1
	%conv = sext i8 %tmp5 to i16			%conv = sext i8 %tmp5 to i16
	store i16 %conv, i16* %incdec.ptr, align 2			store i16 %conv, i16* %incdec.ptr, align 2
	%1 = icmp ugt i32 %i.addr.0, 9			%1 = icmp ugt i32 %i.addr.0, 9
	%indvar.next44 = add i64 %0, 1			%indvar.next44 = add i64 %0, 1
	br i1 %1, label %do.body, label %do.end			br i1 %1, label %do.body, label %do.end

	do.end: ; preds = %do.body			do.end: ; preds = %do.body
	%xap.0 = inttoptr i64 %0 to i1*			%xap.0 = inttoptr i64 %0 to i1*
	%cap.0 = ptrtoint i1* %xap.0 to i64			%cap.0 = ptrtoint i1* %xap.0 to i64
	%sub.ptr.lhs.cast = ptrtoint i16* %add.ptr to i64
	%sub.ptr.rhs.cast = ptrtoint i16* %incdec.ptr to i64
	%sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast			%sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
	%sub.ptr.div39 = lshr exact i64 %sub.ptr.sub, 1			%sub.ptr.div39 = lshr exact i64 %sub.ptr.sub, 1
	%conv11 = trunc i64 %sub.ptr.div39 to i32			%conv11 = trunc i64 %sub.ptr.div39 to i32
	%mLength = getelementptr inbounds %struct.Vector2, %struct.Vector2* %result, i64 0, i32 2			%mLength = getelementptr inbounds %struct.Vector2, %struct.Vector2* %result, i64 0, i32 2
	%idx.ext21 = bitcast i64 %sub.ptr.div39 to i64			%idx.ext21 = bitcast i64 %sub.ptr.div39 to i64
	%incdec.ptr.sum = add i64 %idx.ext21, -1			%incdec.ptr.sum = add i64 %idx.ext21, -1
	%cp.0.sum = sub i64 %incdec.ptr.sum, %0			%cp.0.sum = sub i64 %incdec.ptr.sum, %0
	%add.ptr22 = getelementptr [33 x i16], [33 x i16]* %buffer, i64 1, i64 %cp.0.sum			%add.ptr22 = getelementptr [33 x i16], [33 x i16]* %buffer, i64 1, i64 %cp.0.sum
	Show All 33 Lines

Diff 100049