This is an archive of the discontinued LLVM Phabricator instance.

Differential D44677

[SCEV] Make computeExitLimit more simple and more powerful
Closed, Public

Authored by mkazantsev on Mar 20 2018, 3:51 AM.

Details

Reviewers

sanjoy
apilipenko
anna
skatkov
reames
mkazantsev
efriedma

Commits

rGc01e47b43f46: [SCEV] Make computeExitLimit more simple and more powerful
rL329047: [SCEV] Make computeExitLimit more simple and more powerful

Summary

The current implementation of computeExitLimit contains a large piece of code
whose only purpose is to prove that the latch will be executed after the
exiting block executes. What it actually checks is a subset of the situations
in which the exiting block dominates the latch.

This patch replaces all these special-case checks with a single dominance
check against the loop's latch, which is the only condition needed to take
the exiting block into consideration. This change makes it possible to
compute the exact backedge-taken count for simple loops like

for (int i = 0; i < 100; i++) {
  if (cond) {...} else {...}
  if (i > 50) break;
  . . .
}

Diff Detail

Event Timeline

mkazantsev created this revision. Mar 20 2018, 3:51 AM

mkazantsev added a parent revision: D44676: [SCEV] Make exact taken count calculation more optimistic.

One internal fuzzer test failed, investigating.

It seems that this patch is OK; it just revealed an existing bug in the IndVar simplifier, which fails to forget loops at some point.

The fix for the bug triggered by this patch is https://reviews.llvm.org/D44818

fhahn mentioned this in D44676: [SCEV] Make exact taken count calculation more optimistic. Mar 23 2018, 3:11 AM

javed.absar added a subscriber: javed.absar. Mar 26 2018, 3:32 AM

I think it's safe to go with. Sanjoy is on vacation for a few weeks and we have an agreement that he will review it after he returns. If no one has objections, I will have it merged in 24 hours.

This revision is now accepted and ready to land. Apr 1 2018, 10:42 PM

LGTM with one minor comment.

lib/Analysis/ScalarEvolution.cpp
6921	assert(Exit && "Exiting block must have at least one exit");

mkazantsev added inline comments. Apr 2 2018, 9:46 PM

lib/Analysis/ScalarEvolution.cpp
6921	Fair enough, will add before committing.

Closed by commit rL329047: [SCEV] Make computeExitLimit more simple and more powerful (authored by mkazantsev). Apr 2 2018, 11:00 PM

This revision was automatically updated to reflect the committed changes.

Given how many places we've had to add SE->forgetTopmostLoop after this pass, I'm wondering if this change violated some poorly specified SCEV invariant. In particular, previously, had we called computeExitLimit on an exiting block in a loop inner to L, we'd always conclude that said exiting block was not "always executed the same number of times as the loop", because we'd bail out of the backwards climb at the header of the inner loop (which cannot have a getUniquePredecessor). What do you think?

lib/Analysis/ScalarEvolution.cpp
6895	"far from trivial" is too vague -- can you please rewrite this bit to be more specific?

computeExitLimit's logic shouldn't care if the exiting block is inside a nested loop, as long as the condition is invariant relative to the inner loop. (This should work correctly, as far as I can tell; if it doesn't, it would be easy to fix.) So the question is if there's some invariant related to the caching?

The comment for forgetLoop says "This method should be called by the client when it has changed a loop in a way that may effect ScalarEvolution's ability to compute a trip count". We could try to restrict this, but I'm not sure what restriction would actually be useful. I can't imagine any formulation which doesn't require calling forgetLoop when a branch that exits that loop is added/removed/modified.

computeExitLimit's logic shouldn't care if the exiting block is inside a nested loop, as long as the condition is invariant relative to the inner loop. (This should work correctly, as far as I can tell; if it doesn't, it would be easy to fix.) So the question is if there's some invariant related to the caching?

Agreed -- I too was thinking of a caching invariant. For instance, maybe there is some invariant which lets us avoid caching blocks from child loops in parent's BackedgeTakenInfo instances and effectively treats child loops as black boxes.

The comment for forgetLoop says "This method should be called by the client when it has changed a loop in a way that may effect ScalarEvolution's ability to compute a trip count". We could try to restrict this, but I'm not sure what restriction would actually be useful. I can't imagine any formulation which doesn't require calling forgetLoop when a branch that exits that loop is added/removed/modified.

The kind of invariant I had in mind was as long as you don't change the *trip count* of a loop, its parents do not need to be invalidated. So unrolling or rotating a loop needs to invalidate its parent (but not its grandparent).

In D44677#1080078, @sanjoy wrote:

computeExitLimit's logic shouldn't care if the exiting block is inside a nested loop, as long as the condition is invariant relative to the inner loop. (This should work correctly, as far as I can tell; if it doesn't, it would be easy to fix.) So the question is if there's some invariant related to the caching?

Agreed -- I too was thinking of a caching invariant. For instance, maybe there is some invariant which lets us avoid caching blocks from child loops in parent's BackedgeTakenInfo instances and effectively treats child loops as black boxes.

The comment for forgetLoop says "This method should be called by the client when it has changed a loop in a way that may effect ScalarEvolution's ability to compute a trip count". We could try to restrict this, but I'm not sure what restriction would actually be useful. I can't imagine any formulation which doesn't require calling forgetLoop when a branch that exits that loop is added/removed/modified.

The kind of invariant I had in mind was as long as you don't change the *trip count* of a loop, its parents do not need to be invalidated. So unrolling or rotating a loop needs to invalidate its parent (but not its grandparent).

I think the invariant here has to be that, so long as you don't change the trip count, you don't need to invalidate it in SE. "When it has changed a loop in a way that may effect ScalarEvolution's ability to compute a trip count" is an impossible contract to keep, except in the most conservative sense. How could the client know exactly how powerful SE's analysis capabilities are such that it would know whether it has done something that would affect SE's ability to compute the trip count? How should the client know which IR values SE is relying on to compute the trip count?

I am not sure that we actually broke an invariant that was preserved before. Note that all the recent failures we have seen were on the assert from patch D44676. We may well have been breaking this invariant previously, but without this assert we did not know about it.

As for the invariant itself, SCEV assumes that the CFG did not change, because it keeps a mapping from blocks to the exit counts for those blocks. Whenever we change the CFG and, for example, change some dominance relationship between blocks, we must invalidate the cache.

Revision Contents

Path	Size
lib/Analysis/ScalarEvolution.cpp	75 lines
test/Analysis/ScalarEvolution/exact_iter_count.ll	34 lines

Diff 139098

lib/Analysis/ScalarEvolution.cpp

Context: ScalarEvolution::computeBackedgeTakenCount(const Loop *L,

   bool MaxOrZero = (MustExitMaxOrZero && ExitingBlocks.size() == 1);
   return BackedgeTakenInfo(std::move(ExitCounts), CouldComputeBECount,
                            MaxBECount, MaxOrZero);
 }

 ScalarEvolution::ExitLimit
 ScalarEvolution::computeExitLimit(const Loop *L, BasicBlock *ExitingBlock,
                                   bool AllowPredicates) {
-  // Okay, we've chosen an exiting block. See what condition causes us to exit
-  // at this block and remember the exit block and whether all other targets
-  // lead to the loop header.
-  bool MustExecuteLoopHeader = true;
-  BasicBlock *Exit = nullptr;
-  for (auto *SBB : successors(ExitingBlock))
-    if (!L->contains(SBB)) {
-      if (Exit) // Multiple exit successors.
-        return getCouldNotCompute();
-      Exit = SBB;
-    } else if (SBB != L->getHeader()) {
-      MustExecuteLoopHeader = false;
-    }
-
-  // At this point, we know we have a conditional branch that determines whether
-  // the loop is exited. However, we don't know if the branch is executed each
-  // time through the loop. If not, then the execution count of the branch will
-  // not be equal to the trip count of the loop.
-  //
-  // Currently we check for this by checking to see if the Exit branch goes to
-  // the loop header. If so, we know it will always execute the same number of
-  // times as the loop. We also handle the case where the exit block is the
-  // loop header. This is common for un-rotated loops.
-  //
-  // If both of those tests fail, walk up the unique predecessor chain to the
-  // header, stopping if there is an edge that doesn't exit the loop. If the
-  // header is reached, the execution count of the branch will be equal to the
-  // trip count of the loop.
-  //
-  // More extensive analysis could be done to handle more cases here.
-  //
-  if (!MustExecuteLoopHeader && ExitingBlock != L->getHeader()) {
-    // The simple checks failed, try climbing the unique predecessor chain
-    // up to the header.
-    bool Ok = false;
-    for (BasicBlock *BB = ExitingBlock; BB; ) {
-      BasicBlock *Pred = BB->getUniquePredecessor();
-      if (!Pred)
-        return getCouldNotCompute();
-      TerminatorInst *PredTerm = Pred->getTerminator();
-      for (const BasicBlock *PredSucc : PredTerm->successors()) {
-        if (PredSucc == BB)
-          continue;
-        // If the predecessor has a successor that isn't BB and isn't
-        // outside the loop, assume the worst.
-        if (L->contains(PredSucc))
-          return getCouldNotCompute();
-      }
-      if (Pred == L->getHeader()) {
-        Ok = true;
-        break;
-      }
-      BB = Pred;
-    }
-    if (!Ok)
-      return getCouldNotCompute();
-  }
+  assert(L->contains(ExitingBlock) && "Exit count for non-loop block?");
+  // If our exiting block does not dominate the latch, then its connection with
+  // loop's exit limit may be far from trivial.
     sanjoy: "far from trivial" is too vague -- can you please rewrite this bit to be more specific?
+  const BasicBlock *Latch = L->getLoopLatch();
+  if (!Latch || !DT.dominates(ExitingBlock, Latch))
+    return getCouldNotCompute();

   bool IsOnlyExit = (L->getExitingBlock() != nullptr);
   TerminatorInst *Term = ExitingBlock->getTerminator();
   if (BranchInst *BI = dyn_cast<BranchInst>(Term)) {
     assert(BI->isConditional() && "If unconditional, it can't be in loop!");
     bool ExitIfTrue = !L->contains(BI->getSuccessor(0));
     assert(ExitIfTrue == L->contains(BI->getSuccessor(1)) &&
            "It should have one successor in loop and one exit block!");
     // Proceed to the next level to examine the exit condition expression.
     return computeExitLimitFromCond(
         L, BI->getCondition(), ExitIfTrue,
         /*ControlsExit=*/IsOnlyExit, AllowPredicates);
   }

-  if (SwitchInst *SI = dyn_cast<SwitchInst>(Term))
+  if (SwitchInst *SI = dyn_cast<SwitchInst>(Term)) {
+    // For switch, make sure that there is a single exit from the loop.
+    BasicBlock *Exit = nullptr;
+    for (auto *SBB : successors(ExitingBlock))
+      if (!L->contains(SBB)) {
+        if (Exit) // Multiple exit successors.
+          return getCouldNotCompute();
+        Exit = SBB;
+      }
     efriedma: assert(Exit && "Exiting block must have at least one exit");
     mkazantsev: Fair enough, will add before committing.
+
     return computeExitLimitFromSingleExitSwitch(L, SI, Exit,
                                                 /*ControlsExit=*/IsOnlyExit);
+  }

   return getCouldNotCompute();
 }

 ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromCond(
     const Loop *L, Value *ExitCond, bool ExitIfTrue,
     bool ControlsExit, bool AllowPredicates) {
   ScalarEvolution::ExitLimitCacheTy Cache(L, ExitIfTrue, AllowPredicates);

test/Analysis/ScalarEvolution/exact_iter_count.ll

(19 earlier lines not shown)

 backedge:
   br i1 %loop.cond, label %loop, label %exit

 exit:
   ret void

 side.exit:
   ret void
 }
+
+define void @test_02(i1 %c) {
+
+; CHECK-LABEL: Determining loop execution counts for: @test_02
+; CHECK-NEXT: Loop %loop: <multiple exits> backedge-taken count is 50
+
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i32 [ 0, %entry ], [ %iv.next, %backedge ]
+  br i1 %c, label %if.true, label %if.false
+
+if.true:
+  br label %merge
+
+if.false:
+  br label %merge
+
+merge:
+  %side.cond = icmp slt i32 %iv, 50
+  br i1 %side.cond, label %backedge, label %side.exit
+
+backedge:
+  %iv.next = add i32 %iv, 1
+  %loop.cond = icmp slt i32 %iv, 100
+  br i1 %loop.cond, label %loop, label %exit
+
+exit:
+  ret void
+
+side.exit:
+  ret void
+}