This is an archive of the discontinued LLVM Phabricator instance.

lib/Analysis/ScalarEvolution.cpp
6980	What is it about this check which is a problem? Or put another way, why is this not okay but the call to isImpliedCond on line 6956 is fine? The problem is recursion through isImpliedCond->getSCEV->..., right?

Prazek added inline comments. Sep 8 2015, 8:11 PM

lib/Analysis/ScalarEvolution.cpp
6980	The problem is that the check wasn't covering the assume loop, which caused a big hang. The stack trace looked more like this: (isImpliedCond -> getZeroExtendExpr -> isLoopBackedgeGuardedByCond), repeated many times.

nlewycky added inline comments. Sep 8 2015, 8:12 PM

lib/Analysis/ScalarEvolution.cpp
6980	Does the code from 6953 to 6959 need to move below this check too? Is there another bug here?

I don't think this is an infinite loop (Piotr, can you please verify
this?), it is probably an O(n!) recursion where n == number of the
assumptions.

ScalarEvolution::isImpliedCond already guards for infinite loops via
MarkPendingLoopPredicate. However, if you have

assume_0()
assume_1()
assume_2()
assume_3()

then the recursive call to isImpliedCond(assume_2, X) may end up
calling isImpliedCond(assume_1, Y) and that may end up calling
isImpliedCond(assume_0, Y) and that may end up calling
isImpliedCond(assume_3, Y). This way, even though we're protected
against full on infinite recursion, we'll still explore all 4! = 24
possibilities.

I think the check with LoopContinuePredicate is fine since it only
calls isImpliedCond if there is exactly one latch in the loop. This
means that the above n! possibility is really only a 1! = 1
possibility for LoopContinuePredicate.

But this is from memory, so please double check.
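To make the counting argument above concrete, here is a minimal toy model in C++. It is not LLVM code: `PendingModel`, `query`, and `countOrderings` are hypothetical names, and the model assumes the worst case in which every assumption recursively queries every other assumption that is not already pending. The pending set stops infinite recursion, but each complete ordering of the assumptions is still explored once.

```cpp
#include <cstdint>
#include <vector>

// Toy model (an assumption, not LLVM's implementation): query(I) marks
// assumption I as pending, then recurses into every assumption that is not
// pending. This mirrors a "pending" guard that only blocks re-entry for the
// item currently on the stack.
struct PendingModel {
  std::vector<bool> Pending;
  std::uint64_t Orderings = 0; // number of full-depth recursion paths

  explicit PendingModel(unsigned N) : Pending(N, false) {}

  void query(unsigned I) {
    Pending[I] = true;
    bool Recursed = false;
    for (unsigned J = 0; J < Pending.size(); ++J) {
      if (!Pending[J]) {
        query(J);
        Recursed = true;
      }
    }
    // Every assumption is pending: one complete ordering has been explored.
    if (!Recursed)
      ++Orderings;
    Pending[I] = false;
  }
};

// Try every assumption as a starting point and count complete orderings.
std::uint64_t countOrderings(unsigned N) {
  PendingModel M(N);
  for (unsigned I = 0; I < N; ++I)
    M.query(I);
  return M.Orderings;
}
```

In this model `countOrderings(4)` is 24, matching the 4! figure in the comment. Whether the real recursion reaches this worst case depends on which calls actually recurse and which results are cached.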

I checked it; it is of course not infinite. I am not sure about n! in the case of assumes. I think having 10 or 20 assumes will not make it slow. @rsmith was helping me with it, and he thinks that for assumes it is O(n^2), because results are memoized.

So in this case, 1000 assumes is enough to make a difference.

lib/Analysis/ScalarEvolution.cpp
6980	I don't think so. I think the results are memoized, so calling isImpliedCond in 6956 will not cause lag.

I used "hanging out" to mean "being very slow"; I am not sure if it is the right word for it.

-cfe-commits
+llvm-commits

In D12719#242207, @Prazek wrote:

I checked it; it is of course not infinite. I am not sure about n! in the case of assumes. I think having 10 or 20 assumes will not make it slow. @rsmith was helping me with it, and he thinks that for assumes it is O(n^2), because results are memoized.

I think @rsmith is right -- in this case the complexity is O(n^2). I thought I had an example where it was O(n!), but I cannot come up with anything concrete right now.

Do you mind also changing O(n!) time complexity to O(n^2) time complexity in the comment?

I am not sure that would be correct. The comment was about the loop that follows the assumes loop, and I am not sure whether that one has a different complexity.

In D12719#242218, @sanjoy wrote:

I think @rsmith is right -- in this case the complexity is O(n^2). I thought I had an example where it was O(n!), but I cannot come up with anything concrete right now.

Our initial analysis of the 'assume' problem appeared to be O(n!), but we don't have a reduced testcase for that. The cycle there was isLoopBackedgeGuardedByCond -> isImpliedCond -> getZeroExtendExpr -> isLoopBackedgeGuardedByCond. In the testcase in this patch, getZeroExtendExpr memoizes its result, but there are paths through it that do not appear to do so (in particular, the isLoopBackedgeGuardedByCond test in the isKnownPositive / isKnownNegative cases can lead to a return with no memoization of the getZeroExtendExpr result).

I could easily believe there are testcases for both loops that lead to O(n!) performance. (If not, we are emitting /vastly/ too many assumes...)
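As a rough illustration of why memoization changes the bound discussed above, here is a toy C++ model. It is a sketch under stated assumptions, not LLVM code: `MemoModel`, `query`, and `countEvaluations` are hypothetical names, and a query is modeled as a pair of indices whose result is cached before any further recursion. Under that assumption, the work is bounded by the number of distinct pairs, which is n^2.

```cpp
#include <cstdint>
#include <vector>

// Toy model (an assumption, not LLVM's implementation): each (I, J) query is
// evaluated at most once because it is recorded before recursing further.
struct MemoModel {
  unsigned N;
  std::vector<bool> Seen; // Seen[I * N + J] is true once (I, J) was evaluated
  std::uint64_t Evaluations = 0;

  explicit MemoModel(unsigned N) : N(N), Seen(N * N, false) {}

  void query(unsigned I, unsigned J) {
    if (Seen[I * N + J])
      return; // cached: no new work
    Seen[I * N + J] = true;
    ++Evaluations;
    // Worst case: the evaluation recurses into every query starting at J.
    for (unsigned K = 0; K < N; ++K)
      query(J, K);
  }
};

// Issue every possible query and count how many were actually evaluated.
std::uint64_t countEvaluations(unsigned N) {
  MemoModel M(N);
  for (unsigned I = 0; I < N; ++I)
    for (unsigned J = 0; J < N; ++J)
      M.query(I, J);
  return M.Evaluations;
}
```

Here `countEvaluations(10)` is 100. The bound only holds if every path records its result; as noted above, some paths through getZeroExtendExpr appear to return without memoizing, and those would fall outside this model.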

LGTM someone?

(accepted by Nick Lewycky in mail)
LGTM.

This revision is now accepted and ready to land. Sep 9 2015, 1:29 PM

Prazek closed this revision. Sep 9 2015, 1:49 PM

Prazek mentioned this in rL247199: Generating assumption loads of vptr after ctor call (fixed). Sep 9 2015, 3:21 PM

Prazek mentioned this in rL247646: Generating assumption loads of vptr after ctor call (fixed). Sep 14 2015, 5:38 PM

Revision Contents

Path	Size
lib/Analysis/ScalarEvolution.cpp	26 lines
test/Analysis/ScalarEvolution/avoid-assume-hang.ll	139 lines

Diff 34291

lib/Analysis/ScalarEvolution.cpp

[6,952 preceding lines not shown; context: ScalarEvolution::isLoopBackedgeGuardedByCond(const Loop *L,]

   BranchInst *LoopContinuePredicate =
     dyn_cast<BranchInst>(Latch->getTerminator());
   if (LoopContinuePredicate && LoopContinuePredicate->isConditional() &&
       isImpliedCond(Pred, LHS, RHS,
                     LoopContinuePredicate->getCondition(),
                     LoopContinuePredicate->getSuccessor(0) != L->getHeader()))
     return true;
 
-  // Check conditions due to any @llvm.assume intrinsics.
-  for (auto &AssumeVH : AC.assumptions()) {
-    if (!AssumeVH)
-      continue;
-    auto *CI = cast<CallInst>(AssumeVH);
-    if (!DT.dominates(CI, Latch->getTerminator()))
-      continue;
-
-    if (isImpliedCond(Pred, LHS, RHS, CI->getArgOperand(0), false))
-      return true;
-  }
-
   struct ClearWalkingBEDominatingCondsOnExit {
     ScalarEvolution &SE;
 
     explicit ClearWalkingBEDominatingCondsOnExit(ScalarEvolution &SE)
         : SE(SE){}
 
     ~ClearWalkingBEDominatingCondsOnExit() {
       SE.WalkingBEDominatingConds = false;
     }
   };
 
-  // We don't want more than one activation of the following loop on the stack
+  // We don't want more than one activation of the following loops on the stack
   // -- that can lead to O(n!) time complexity.
   if (WalkingBEDominatingConds)
     return false;
 
   WalkingBEDominatingConds = true;
   ClearWalkingBEDominatingCondsOnExit ClearOnExit(*this);
 
+  // Check conditions due to any @llvm.assume intrinsics.
+  for (auto &AssumeVH : AC.assumptions()) {
+    if (!AssumeVH)
+      continue;
+    auto *CI = cast<CallInst>(AssumeVH);
+    if (!DT.dominates(CI, Latch->getTerminator()))
+      continue;
+
+    if (isImpliedCond(Pred, LHS, RHS, CI->getArgOperand(0), false))
+      return true;
+  }
+
   // If the loop is not reachable from the entry block, we risk running into an
   // infinite loop as we walk up into the dom tree. These loops do not matter
   // anyway, so we just return a conservative answer when we see them.
   if (!DT.isReachableFromEntry(L->getHeader()))
     return false;
 
   for (DomTreeNode DTN = DT[Latch], HeaderDTN = DT[L->getHeader()];
        DTN != HeaderDTN; DTN = DTN->getIDom()) {

[1,861 following lines not shown]
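The guard used in this patch is a re-entrancy flag that is cleared by a destructor on scope exit. A stripped-down sketch of that pattern, outside of LLVM, might look like the following. The `Analysis` class, `guardedQuery`, and the `ExpensiveWalks` counter are hypothetical and exist only for illustration; only the flag-plus-RAII shape is taken from the diff.

```cpp
// Hypothetical stand-in for an analysis with an expensive, re-entrant walk.
struct Analysis {
  bool Walking = false;        // true while an expensive walk is on the stack
  unsigned ExpensiveWalks = 0; // how many walks actually ran

  // Clears the flag on every exit path, including early returns.
  struct ClearWalkingOnExit {
    Analysis &A;
    explicit ClearWalkingOnExit(Analysis &A) : A(A) {}
    ~ClearWalkingOnExit() { A.Walking = false; }
  };

  // Returns false (a conservative answer) if a walk is already active.
  bool guardedQuery(unsigned Depth) {
    if (Walking)
      return false;

    Walking = true;
    ClearWalkingOnExit ClearOnExit(*this);

    ++ExpensiveWalks;
    if (Depth > 0)
      guardedQuery(Depth - 1); // re-entrant call bails out immediately
    return true;
  }
};
```

With this shape, a call such as `guardedQuery(5)` runs the expensive part once regardless of how deep the recursion would otherwise go. Any code placed before the `if (Walking)` check is not covered by the guard, which is why the position of the assume loop relative to the check matters in the diff.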

test/Analysis/ScalarEvolution/avoid-assume-hang.ll

This file was added.

				; RUN: opt %s -always-inline | opt -analyze -scalar-evolution
				; There was optimization bug in ScalarEvolution, that causes too long
				; compute time and stack overflow crash.

				declare void @body(i32)
				declare void @llvm.assume(i1)

				define available_externally void @assume1(i64 %i.ext, i64 %a) alwaysinline {
				%cmp0 = icmp ne i64 %i.ext, %a
				call void @llvm.assume(i1 %cmp0)

				%a1 = add i64 %a, 1
				%cmp1 = icmp ne i64 %i.ext, %a1
				call void @llvm.assume(i1 %cmp1)

				%a2 = add i64 %a1, 1
				%cmp2 = icmp ne i64 %i.ext, %a2
				call void @llvm.assume(i1 %cmp2)

				%a3 = add i64 %a2, 1
				%cmp3 = icmp ne i64 %i.ext, %a3
				call void @llvm.assume(i1 %cmp3)

				%a4 = add i64 %a3, 1
				%cmp4 = icmp ne i64 %i.ext, %a4
				call void @llvm.assume(i1 %cmp4)

				ret void
				}

				define available_externally void @assume2(i64 %i.ext, i64 %a) alwaysinline {
				call void @assume1(i64 %i.ext, i64 %a)

				%a1 = add i64 %a, 5
				%cmp1 = icmp ne i64 %i.ext, %a1
				call void @assume1(i64 %i.ext, i64 %a1)

				%a2 = add i64 %a1, 5
				%cmp2 = icmp ne i64 %i.ext, %a2
				call void @assume1(i64 %i.ext, i64 %a2)

				%a3 = add i64 %a2, 5
				%cmp3 = icmp ne i64 %i.ext, %a3
				call void @assume1(i64 %i.ext, i64 %a3)

				%a4 = add i64 %a3, 5
				%cmp4 = icmp ne i64 %i.ext, %a4
				call void @assume1(i64 %i.ext, i64 %a4)

				ret void
				}

				define available_externally void @assume3(i64 %i.ext, i64 %a) alwaysinline {
				call void @assume2(i64 %i.ext, i64 %a)

				%a1 = add i64 %a, 25
				%cmp1 = icmp ne i64 %i.ext, %a1
				call void @assume2(i64 %i.ext, i64 %a1)

				%a2 = add i64 %a1, 25
				%cmp2 = icmp ne i64 %i.ext, %a2
				call void @assume2(i64 %i.ext, i64 %a2)

				%a3 = add i64 %a2, 25
				%cmp3 = icmp ne i64 %i.ext, %a3
				call void @assume2(i64 %i.ext, i64 %a3)

				%a4 = add i64 %a3, 25
				%cmp4 = icmp ne i64 %i.ext, %a4
				call void @assume2(i64 %i.ext, i64 %a4)

				ret void
				}

				define available_externally void @assume4(i64 %i.ext, i64 %a) alwaysinline {
				call void @assume3(i64 %i.ext, i64 %a)

				%a1 = add i64 %a, 125
				%cmp1 = icmp ne i64 %i.ext, %a1
				call void @assume3(i64 %i.ext, i64 %a1)

				%a2 = add i64 %a1, 125
				%cmp2 = icmp ne i64 %i.ext, %a2
				call void @assume3(i64 %i.ext, i64 %a2)

				%a3 = add i64 %a2, 125
				%cmp3 = icmp ne i64 %i.ext, %a3
				call void @assume3(i64 %i.ext, i64 %a3)

				%a4 = add i64 %a3, 125
				%cmp4 = icmp ne i64 %i.ext, %a4
				call void @assume3(i64 %i.ext, i64 %a4)

				ret void
				}

				define available_externally void @assume5(i64 %i.ext, i64 %a) alwaysinline {
				call void @assume4(i64 %i.ext, i64 %a)

				%a1 = add i64 %a, 625
				%cmp1 = icmp ne i64 %i.ext, %a1
				call void @assume4(i64 %i.ext, i64 %a1)

				%a2 = add i64 %a1, 625
				%cmp2 = icmp ne i64 %i.ext, %a2
				call void @assume4(i64 %i.ext, i64 %a2)

				%a3 = add i64 %a2, 625
				%cmp3 = icmp ne i64 %i.ext, %a3
				call void @assume4(i64 %i.ext, i64 %a3)

				%a4 = add i64 %a3, 625
				%cmp4 = icmp ne i64 %i.ext, %a4
				call void @assume4(i64 %i.ext, i64 %a4)

				ret void
				}

				define void @fn(i32 %init) {
				entry:
				br label %loop

				loop:
				%i = phi i32 [%init, %entry], [%next, %loop]
				call void @body(i32 %i)

				%i.ext = zext i32 %i to i64

				call void @assume5(i64 %i.ext, i64 500000000)

				%i.next = add i64 %i.ext, 1
				%next = trunc i64 %i.next to i32
				%done = icmp eq i32 %i, 500000000

				br i1 %done, label %exit, label %loop

				exit:
				ret void
				}
				No newline at end of file

ScalarEvolution assume hanging bugfix (Closed, Public)
