This is an archive of the discontinued LLVM Phabricator instance.

Enable unrolling of multi-exit loops
ClosedPublic

Authored by meheff on Sep 30 2014, 2:35 PM.


Details

Reviewers

atrick
jingyue

Summary

This patch de-pessimizes the calculation of loop trip counts in ScalarEvolution in the presence of multiple exits. Previously, all loop exits had to have identical counts for a loop trip count to be considered computable. This pessimization was implemented by calling getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock) inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME in the comments of that function). The pessimization was added to fix a corner case involving undefined behavior (pr/16130). This patch handles the undefined behavior case more precisely, allowing the pessimization to be removed.
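As a rough illustration of the difference between the two queries, here is a small standalone model. It is not LLVM code: `Count`, `strictBackedgeCount`, and `exitCountFor` are hypothetical names, and an empty optional stands in for SCEVCouldNotCompute.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a per-exit count; nullopt means "could not compute".
using Count = std::optional<unsigned>;

// Models the pessimistic query: a count is reported only when every exit
// is computable and all exits agree on the same value.
Count strictBackedgeCount(const std::vector<Count> &ExitCounts) {
  if (ExitCounts.empty())
    return std::nullopt;
  Count First = ExitCounts.front();
  for (const Count &C : ExitCounts)
    if (!C || C != First)
      return std::nullopt;
  return First;
}

// Models the per-exit query: only the requested exiting block matters.
Count exitCountFor(const std::vector<Count> &ExitCounts, std::size_t ExitIdx) {
  assert(ExitIdx < ExitCounts.size() && "no such exit");
  return ExitCounts[ExitIdx];
}
```

In this model, a loop whose first exit has a known count and whose second exit is unknown yields nothing from the strict query, while the per-exit query still returns the first exit's count.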

ControlsExit replaces IsSubExpr to more precisely track the case where undefined behavior is expected to occur. Because undefined behavior is tracked more precisely we can remove MustExit from ExitLimit. MustExit was used to track the case where the limit was computed potentially assuming undefined behavior even if undefined behavior didn't necessarily occur.
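The relationship between the old and new flags can be sketched in isolation. This is an illustrative model rather than code from the patch: the helper names are made up, and only the two boolean expressions are taken from the diff of ComputeExitLimitFromCond.

```cpp
// Old flag: true means the sub-condition does NOT directly control the exit.
bool oldIsSubExprForOperand(bool IsSubExpr, bool EitherMayExit) {
  return IsSubExpr || EitherMayExit;
}

// New flag: true means the sub-condition directly controls the exit.
bool newControlsExitForOperand(bool ControlsExit, bool EitherMayExit) {
  return ControlsExit && !EitherMayExit;
}
```

By De Morgan's law the two helpers are exact negations of each other when ControlsExit == !IsSubExpr. The behavioral change therefore appears to come from the top-level caller, which in the diff passes IsOnlyExit where it previously passed a constant false for IsSubExpr.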

Diff Detail

Event Timeline

meheff updated this revision to Diff 14248. Sep 30 2014, 2:35 PM

meheff retitled this revision from to Enable unrolling of multi-exit loops.

meheff updated this object.

meheff edited the test plan for this revision.

meheff added reviewers: atrick, jingyue.

meheff added a subscriber: Unknown Object (MLST).

The review diff tool doesn't do a very good job of presenting it, but a big chunk of the patch is moving the class SCEVDivision and a few static functions earlier in the file so they can be used in ScalarEvolution::HowFarToZero.

LGTM. You may want to mention your change to MustExit and ControlsExit in the description. Thanks for working on this!

lib/Analysis/ScalarEvolution.cpp
4721–4722	Since MustExit is gone, comments need to be updated.
6033–6034	Can we use getUDivExactExpr when R is zero?
7103–7104	Looks like the diff tool is not doing a good job :) Are the remaining differences in this file simply copy and paste?

meheff updated this object. Oct 2 2014, 10:15 AM

meheff edited edge metadata.

meheff added inline comments.

lib/Analysis/ScalarEvolution.cpp
4721–4722	Done.
6033–6034	Done.
7103–7104	Yeah, there is a big cut and paste. Evidently there is some very similar code elsewhere in the file which is confusing the diff.
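The idea behind "use an exact division when R is zero" can be shown on plain integers. The sketch below is an assumption-laden analogy, not the SCEV API: `exactQuotient` is a hypothetical helper that reports a quotient only when the remainder is zero.

```cpp
#include <cstdint>
#include <optional>

// Returns Num / Den only when the division is exact (remainder zero).
// Division by zero and the single overflowing case are rejected.
std::optional<int64_t> exactQuotient(int64_t Num, int64_t Den) {
  if (Den == 0)
    return std::nullopt;
  if (Num == INT64_MIN && Den == -1)
    return std::nullopt; // the quotient would not fit in int64_t
  if (Num % Den != 0)
    return std::nullopt; // nonzero remainder: not an exact division
  return Num / Den;
}
```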

meheff updated this revision to Diff 14335. Oct 2 2014, 10:25 AM

Hi Andrew,

Do you mind having a look?

Thanks!
Mark

Sorry for the delay reviewing--I had to be careful with this one, and the diff is unclear. It does look like a great improvement, and it's about time to move on to full-fledged support for multi-exit loops.

With this change, ComputeExitLimit design loses some information. We cannot as precisely determine the trip count of a particular exit given knowledge of NSW. But I see now that information was probably not very useful. The idea of conditionally considering NW/NSW based on the loop structure makes the code simpler and the API safer. As a result, the loop unroller can actually be more aggressive without adding any complexity. So I think this is a great change.

One thing I'm concerned about though is that I think you're using SCEVDivision for the first time in the standard LLVM pipeline (outside of the delinearizer). SCEVDivision looks like it recurses over the expression operands. Every time someone does this, we eventually have to track down some case that results in pathological compile time. Can you prove that this recursion will never visit the same expression twice? If not, I think you need to add a visited set as we do elsewhere, or find another way to check for exact division.
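The compile-time concern can be reproduced on a toy expression DAG. The sketch below does not use any LLVM types: `Expr`, `visitNaive`, `visitOnce`, and `makeDiamondChain` are all hypothetical, and the point is only that a recursion without a visited set revisits shared operands.

```cpp
#include <unordered_set>
#include <vector>

// Toy expression node. Operands may be shared, so the graph is a DAG.
struct Expr {
  std::vector<const Expr *> Ops;
};

// Recursion without memoization: shared subexpressions are walked again
// for every path that reaches them.
unsigned long visitNaive(const Expr *E) {
  unsigned long N = 1;
  for (const Expr *Op : E->Ops)
    N += visitNaive(Op);
  return N;
}

// Same walk with a visited set: each distinct node is processed once.
unsigned long visitOnce(const Expr *E,
                        std::unordered_set<const Expr *> &Visited) {
  if (!Visited.insert(E).second)
    return 0; // already seen
  unsigned long N = 1;
  for (const Expr *Op : E->Ops)
    N += visitOnce(Op, Visited);
  return N;
}

// Builds a chain in which each node uses the previous node as both operands.
std::vector<Expr> makeDiamondChain(unsigned Depth) {
  std::vector<Expr> Nodes(Depth + 1);
  for (unsigned I = 1; I <= Depth; ++I)
    Nodes[I].Ops = {&Nodes[I - 1], &Nodes[I - 1]};
  return Nodes;
}
```

On a chain of depth 10 the naive walk makes 2^11 - 1 visits while the memoized walk makes 11, which is the kind of blowup a visited set is meant to prevent.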

This revision now requires changes to proceed. Oct 7 2014, 11:03 PM

Thanks for the review.

In D5550#10, @atrick wrote:

Sorry for the delay reviewing--I had to be careful with this one, and the diff is unclear. It does look like a great improvement, and it's about time to move on to full-fledged support for multi-exit loops.

No problem on the latency. It took me a long time to reason through all the code and the change ;-)

With this change, ComputeExitLimit design loses some information. We cannot as precisely determine the trip count of a particular exit given knowledge of NSW. But I see now that information was probably not very useful. The idea of conditionally considering NW/NSW based on the loop structure makes the code simpler and the API safer. As a result, the loop unroller can actually be more aggressive without adding any complexity. So I think this is a great change.

One interesting thing about the NSW case where the condition is "missed": InstCombine catches some of these and transforms them to infinite loops. For example, the following gets transformed to an infinite loop:

;; run with: opt -instcombine
declare void @bar(...) #1
define void @test() {
entry:
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  call void (...)* @bar()
  %i.next = add nsw i32 %i, 2
  %t = icmp ne i32 %i.next, 5
  br i1 %t, label %loop, label %exit

exit:
  ret void
}

InstCombine does this by looking at known bit values of the two sides of the icmp ne. However, if loop unrolling runs before instcombine the loop will be unrolled completely with a trip count of 2. If you change the increment value from 2 to 3, then instcombine can't reason about the bits and won't transform it to an infinite loop. Given this inconsistency (not surprising given that we're talking about undefined behavior), I'd also be happy just returning couldNotCompute in cases where the equality condition misses with nsw. This would simplify the code further by eliminating the ControlsExit parameter in the various functions. I wrote the code as is to match existing behavior. Let me know if you think this further simplification and change in behavior is a good idea.
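The known-bits argument for the stride-2 case can be modeled with a single bit. This is a deliberately simplified sketch: `lowBitProvesNeverEqual` is a hypothetical helper, and a real known-bits analysis tracks far more than the lowest bit.

```cpp
#include <cstdint>

// Returns true when "Start + k * Step == Target" is impossible for every k,
// judging only by the lowest bit. An even step never changes the low bit of
// the induction variable (even under wraparound), so a parity mismatch with
// the target means the equality can never hold.
bool lowBitProvesNeverEqual(uint32_t Start, uint32_t Step, uint32_t Target) {
  bool LowBitKnown = (Step & 1u) == 0;
  return LowBitKnown && ((Start & 1u) != (Target & 1u));
}
```

With Start = 0, Step = 2, Target = 5 the helper returns true, matching the example that becomes an infinite loop. With Step = 3 the low bit alternates, so this one-bit reasoning proves nothing, which is consistent with the inconsistency described above.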

One thing I'm concerned about though is that I think you're using SCEVDivision for the first time in the standard LLVM pipeline (outside of the delinearizer). SCEVDivision looks like it recurses over the expression operands. Every time someone does this, we eventually have to track down some case that results in pathological compile time. Can you prove that this recursion will never visit the same expression twice? If not, I think you need to add a visited set as we do elsewhere, or find another way to check for exact division.

I'll dive in and get back shortly on this...

In D5550#10, @atrick wrote:

One thing I'm concerned about though is that I think you're using SCEVDivision for the first time in the standard LLVM pipeline (outside of the delinearizer). SCEVDivision looks like it recurses over the expression operands. Every time someone does this, we eventually have to track down some case that results in pathological compile time. Can you prove that this recursion will never visit the same expression twice? If not, I think you need to add a visited set as we do elsewhere, or find another way to check for exact division.

Is using SCEVDivision fundamentally different than getUDivExpr which is used now? getUDivExpr recurses over expression operands just like SCEVDivision.

Mark

I thought getUDivExpr was less general. Looking at it now, it seems to recurse in roughly the same situations as SCEVDivision, with no memoization. So I think it's ok to bring in SCEVDivision. I can't argue that it will create a new problem.

This revision is now accepted and ready to land. Oct 8 2014, 4:31 PM

Just fyi, I ran SPEC to see if there was any performance change. Performance difference was in the noise.

Committed r219517.

Andrew, btw, you can probably close "rdar:14038809 [SCEV]: Optimize trip count computation for multi-exit loops.".

Mark

meheff closed this revision. Oct 10 2014, 10:54 AM

Revision Contents

Path	Size
include/llvm/Analysis/ScalarEvolution.h	15 lines
lib/Analysis/ScalarEvolution.cpp	884 lines
test/Analysis/ScalarEvolution/trip-count-pow2.ll	4 lines
test/Transforms/LoopUnroll/scevunroll.ll	15 lines

Diff 14335

include/llvm/Analysis/ScalarEvolution.h

Show First 20 Lines • Show All 255 Lines • ▼ Show 20 Lines	private:

/// Mark predicate values currently being processed by isImpliedCond.		/// Mark predicate values currently being processed by isImpliedCond.
DenseSet<Value*> PendingLoopPredicates;		DenseSet<Value*> PendingLoopPredicates;

/// ExitLimit - Information about the number of loop iterations for which a		/// ExitLimit - Information about the number of loop iterations for which a
/// loop exit's branch condition evaluates to the not-taken path. This is a		/// loop exit's branch condition evaluates to the not-taken path. This is a
/// temporary pair of exact and max expressions that are eventually		/// temporary pair of exact and max expressions that are eventually
/// summarized in ExitNotTakenInfo and BackedgeTakenInfo.		/// summarized in ExitNotTakenInfo and BackedgeTakenInfo.
///
/// If MustExit is true, then the exit must be taken when the BECount
/// reaches Exact (and before surpassing Max). If MustExit is false, then
/// BECount may exceed Exact or Max if the loop exits via another branch. In
/// either case, the loop may exit early via another branch.
///
/// MustExit is true for most cases. However, an exit guarded by an
/// (in)equality on a nonunit stride may be skipped.
struct ExitLimit {		struct ExitLimit {
const SCEV *Exact;		const SCEV *Exact;
const SCEV *Max;		const SCEV *Max;
bool MustExit;

/*implicit*/ ExitLimit(const SCEV *E)		/*implicit*/ ExitLimit(const SCEV *E) : Exact(E), Max(E) {}
: Exact(E), Max(E), MustExit(true) {}

ExitLimit(const SCEV *E, const SCEV *M, bool MustExit)		ExitLimit(const SCEV *E, const SCEV *M) : Exact(E), Max(M) {}
: Exact(E), Max(M), MustExit(MustExit) {}

/// hasAnyInfo - Test whether this ExitLimit contains any computed		/// hasAnyInfo - Test whether this ExitLimit contains any computed
/// information, or whether it's all SCEVCouldNotCompute values.		/// information, or whether it's all SCEVCouldNotCompute values.
bool hasAnyInfo() const {		bool hasAnyInfo() const {
return !isa<SCEVCouldNotCompute>(Exact) ||		return !isa<SCEVCouldNotCompute>(Exact) ||
!isa<SCEVCouldNotCompute>(Max);		!isa<SCEVCouldNotCompute>(Max);
}		}
};		};

lib/Analysis/ScalarEvolution.cpp


Show First 20 Lines • Show All 667 Lines • ▼ Show 20 Lines	for (unsigned j = i+1; j != e && Ops[j]->getSCEVType() == Complexity; ++j) {
std::swap(Ops[i+1], Ops[j]);		std::swap(Ops[i+1], Ops[j]);
++i; // no need to rescan it.		++i; // no need to rescan it.
if (i == e-2) return; // Done!		if (i == e-2) return; // Done!
}		}
}		}
}		}
}		}

		static const APInt srem(const SCEVConstant *C1, const SCEVConstant *C2) {
		APInt A = C1->getValue()->getValue();
		APInt B = C2->getValue()->getValue();
		uint32_t ABW = A.getBitWidth();
		uint32_t BBW = B.getBitWidth();

		if (ABW > BBW)
		B = B.sext(ABW);
		else if (ABW < BBW)
		A = A.sext(BBW);

		return APIntOps::srem(A, B);
		}

		static const APInt sdiv(const SCEVConstant *C1, const SCEVConstant *C2) {
		APInt A = C1->getValue()->getValue();
		APInt B = C2->getValue()->getValue();
		uint32_t ABW = A.getBitWidth();
		uint32_t BBW = B.getBitWidth();

		if (ABW > BBW)
		B = B.sext(ABW);
		else if (ABW < BBW)
		A = A.sext(BBW);

		return APIntOps::sdiv(A, B);
		}

		namespace {
		struct FindSCEVSize {
		int Size;
		FindSCEVSize() : Size(0) {}

		bool follow(const SCEV *S) {
		++Size;
		// Keep looking at all operands of S.
		return true;
		}
		bool isDone() const {
		return false;
		}
		};
		}

		// Returns the size of the SCEV S.
		static inline int sizeOfSCEV(const SCEV *S) {
		FindSCEVSize F;
		SCEVTraversal<FindSCEVSize> ST(F);
		ST.visitAll(S);
		return F.Size;
		}

		namespace {

		struct SCEVDivision : public SCEVVisitor<SCEVDivision, void> {
		public:
		// Computes the Quotient and Remainder of the division of Numerator by
		// Denominator.
		static void divide(ScalarEvolution &SE, const SCEV *Numerator,
		const SCEV *Denominator, const SCEV **Quotient,
		const SCEV **Remainder) {
		assert(Numerator && Denominator && "Uninitialized SCEV");

		SCEVDivision D(SE, Numerator, Denominator);

		// Check for the trivial case here to avoid having to check for it in the
		// rest of the code.
		if (Numerator == Denominator) {
		*Quotient = D.One;
		*Remainder = D.Zero;
		return;
		}

		if (Numerator->isZero()) {
		*Quotient = D.Zero;
		*Remainder = D.Zero;
		return;
		}

		// Split the Denominator when it is a product.
		if (const SCEVMulExpr *T = dyn_cast<const SCEVMulExpr>(Denominator)) {
		const SCEV *Q, *R;
		*Quotient = Numerator;
		for (const SCEV *Op : T->operands()) {
		divide(SE, *Quotient, Op, &Q, &R);
		*Quotient = Q;

		// Bail out when the Numerator is not divisible by one of the terms of
		// the Denominator.
		if (!R->isZero()) {
		*Quotient = D.Zero;
		*Remainder = Numerator;
		return;
		}
		}
		*Remainder = D.Zero;
		return;
		}

		D.visit(Numerator);
		*Quotient = D.Quotient;
		*Remainder = D.Remainder;
		}

		SCEVDivision(ScalarEvolution &S, const SCEV *Numerator, const SCEV *Denominator)
		: SE(S), Denominator(Denominator) {
		Zero = SE.getConstant(Denominator->getType(), 0);
		One = SE.getConstant(Denominator->getType(), 1);

		// By default, we don't know how to divide Expr by Denominator.
		// Providing the default here simplifies the rest of the code.
		Quotient = Zero;
		Remainder = Numerator;
		}

		// Except in the trivial case described above, we do not know how to divide
		// Expr by Denominator for the following functions with empty implementation.
		void visitTruncateExpr(const SCEVTruncateExpr *Numerator) {}
		void visitZeroExtendExpr(const SCEVZeroExtendExpr *Numerator) {}
		void visitSignExtendExpr(const SCEVSignExtendExpr *Numerator) {}
		void visitUDivExpr(const SCEVUDivExpr *Numerator) {}
		void visitSMaxExpr(const SCEVSMaxExpr *Numerator) {}
		void visitUMaxExpr(const SCEVUMaxExpr *Numerator) {}
		void visitUnknown(const SCEVUnknown *Numerator) {}
		void visitCouldNotCompute(const SCEVCouldNotCompute *Numerator) {}

		void visitConstant(const SCEVConstant *Numerator) {
		if (const SCEVConstant *D = dyn_cast<SCEVConstant>(Denominator)) {
		Quotient = SE.getConstant(sdiv(Numerator, D));
		Remainder = SE.getConstant(srem(Numerator, D));
		return;
		}
		}

		void visitAddRecExpr(const SCEVAddRecExpr *Numerator) {
		const SCEV *StartQ, *StartR, *StepQ, *StepR;
		assert(Numerator->isAffine() && "Numerator should be affine");
		divide(SE, Numerator->getStart(), Denominator, &StartQ, &StartR);
		divide(SE, Numerator->getStepRecurrence(SE), Denominator, &StepQ, &StepR);
		Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
		Numerator->getNoWrapFlags());
		Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
		Numerator->getNoWrapFlags());
		}

		void visitAddExpr(const SCEVAddExpr *Numerator) {
		SmallVector<const SCEV *, 2> Qs, Rs;
		Type *Ty = Denominator->getType();

		for (const SCEV *Op : Numerator->operands()) {
		const SCEV *Q, *R;
		divide(SE, Op, Denominator, &Q, &R);

		// Bail out if types do not match.
		if (Ty != Q->getType() || Ty != R->getType()) {
		Quotient = Zero;
		Remainder = Numerator;
		return;
		}

		Qs.push_back(Q);
		Rs.push_back(R);
		}

		if (Qs.size() == 1) {
		Quotient = Qs[0];
		Remainder = Rs[0];
		return;
		}

		Quotient = SE.getAddExpr(Qs);
		Remainder = SE.getAddExpr(Rs);
		}

		void visitMulExpr(const SCEVMulExpr *Numerator) {
		SmallVector<const SCEV *, 2> Qs;
		Type *Ty = Denominator->getType();

		bool FoundDenominatorTerm = false;
		for (const SCEV *Op : Numerator->operands()) {
		// Bail out if types do not match.
		if (Ty != Op->getType()) {
		Quotient = Zero;
		Remainder = Numerator;
		return;
		}

		if (FoundDenominatorTerm) {
		Qs.push_back(Op);
		continue;
		}

		// Check whether Denominator divides one of the product operands.
		const SCEV *Q, *R;
		divide(SE, Op, Denominator, &Q, &R);
		if (!R->isZero()) {
		Qs.push_back(Op);
		continue;
		}

		// Bail out if types do not match.
		if (Ty != Q->getType()) {
		Quotient = Zero;
		Remainder = Numerator;
		return;
		}

		FoundDenominatorTerm = true;
		Qs.push_back(Q);
		}

		if (FoundDenominatorTerm) {
		Remainder = Zero;
		if (Qs.size() == 1)
		Quotient = Qs[0];
		else
		Quotient = SE.getMulExpr(Qs);
		return;
		}

		if (!isa<SCEVUnknown>(Denominator)) {
		Quotient = Zero;
		Remainder = Numerator;
		return;
		}

		// The Remainder is obtained by replacing Denominator by 0 in Numerator.
		ValueToValueMap RewriteMap;
		RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
		cast<SCEVConstant>(Zero)->getValue();
		Remainder = SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);

		if (Remainder->isZero()) {
		// The Quotient is obtained by replacing Denominator by 1 in Numerator.
		RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
		cast<SCEVConstant>(One)->getValue();
		Quotient =
		SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);
		return;
		}

		// Quotient is (Numerator - Remainder) divided by Denominator.
		const SCEV *Q, *R;
		const SCEV *Diff = SE.getMinusSCEV(Numerator, Remainder);
		if (sizeOfSCEV(Diff) > sizeOfSCEV(Numerator)) {
		// This SCEV does not seem to simplify: fail the division here.
		Quotient = Zero;
		Remainder = Numerator;
		return;
		}
		divide(SE, Diff, Denominator, &Q, &R);
		assert(R == Zero &&
		"(Numerator - Remainder) should evenly divide Denominator");
		Quotient = Q;
		}

		private:
		ScalarEvolution &SE;
		const SCEV *Denominator, *Quotient, *Remainder, *Zero, *One;
		};
		}



//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Simple SCEV method implementations		// Simple SCEV method implementations
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// BinomialCoefficient - Compute BC(It, K). The result has width W.		/// BinomialCoefficient - Compute BC(It, K). The result has width W.
/// Assume, K > 0.		/// Assume, K > 0.
▲ Show 20 Lines • Show All 3,389 Lines • ▼ Show 20 Lines
/// constant. Will also return 0 if the maximum trip count is very large (>=		/// constant. Will also return 0 if the maximum trip count is very large (>=
/// 2^32).		/// 2^32).
///		///
/// This "trip count" assumes that control exits via ExitingBlock. More		/// This "trip count" assumes that control exits via ExitingBlock. More
/// precisely, it is the number of times that control may reach ExitingBlock		/// precisely, it is the number of times that control may reach ExitingBlock
/// before taking the branch. For loops with multiple exits, it may not be the		/// before taking the branch. For loops with multiple exits, it may not be the
/// number times that the loop header executes because the loop may exit		/// number times that the loop header executes because the loop may exit
/// prematurely via another branch.		/// prematurely via another branch.
///		unsigned ScalarEvolution::getSmallConstantTripCount(Loop *L,
/// FIXME: We conservatively call getBackedgeTakenCount(L) instead of		BasicBlock *ExitingBlock) {
/// getExitCount(L, ExitingBlock) to compute a safe trip count considering all
/// loop exits. getExitCount() may return an exact count for this branch
/// assuming no-signed-wrap. The number of well-defined iterations may actually
/// be higher than this trip count if this exit test is skipped and the loop
/// exits via a different branch. Ideally, getExitCount() would know whether it
/// depends on a NSW assumption, and we would only fall back to a conservative
/// trip count in that case.
unsigned ScalarEvolution::
getSmallConstantTripCount(Loop *L, BasicBlock * /*ExitingBlock*/) {
const SCEVConstant *ExitCount =		const SCEVConstant *ExitCount =
dyn_cast<SCEVConstant>(getBackedgeTakenCount(L));		dyn_cast<SCEVConstant>(getExitCount(L, ExitingBlock));
if (!ExitCount)		if (!ExitCount)
return 0;		return 0;

ConstantInt *ExitConst = ExitCount->getValue();		ConstantInt *ExitConst = ExitCount->getValue();

// Guard against huge trip counts.		// Guard against huge trip counts.
if (ExitConst->getValue().getActiveBits() > 32)		if (ExitConst->getValue().getActiveBits() > 32)
return 0;		return 0;
Show All 9 Lines
///		///
/// Returns 1 if the trip count is unknown or not guaranteed to be the		/// Returns 1 if the trip count is unknown or not guaranteed to be the
/// multiple of a constant (which is also the case if the trip count is simply		/// multiple of a constant (which is also the case if the trip count is simply
/// constant, use getSmallConstantTripCount for that case), Will also return 1		/// constant, use getSmallConstantTripCount for that case), Will also return 1
/// if the trip count is very large (>= 2^32).		/// if the trip count is very large (>= 2^32).
///		///
/// As explained in the comments for getSmallConstantTripCount, this assumes		/// As explained in the comments for getSmallConstantTripCount, this assumes
/// that control exits the loop via ExitingBlock.		/// that control exits the loop via ExitingBlock.
unsigned ScalarEvolution::		unsigned
getSmallConstantTripMultiple(Loop *L, BasicBlock * /*ExitingBlock*/) {		ScalarEvolution::getSmallConstantTripMultiple(Loop *L,
const SCEV *ExitCount = getBackedgeTakenCount(L);		BasicBlock *ExitingBlock) {
		const SCEV *ExitCount = getExitCount(L, ExitingBlock);
if (ExitCount == getCouldNotCompute())		if (ExitCount == getCouldNotCompute())
return 1;		return 1;

// Get the trip count from the BE count by adding 1.		// Get the trip count from the BE count by adding 1.
const SCEV *TCMul = getAddExpr(ExitCount,		const SCEV *TCMul = getAddExpr(ExitCount,
getConstant(ExitCount->getType(), 1));		getConstant(ExitCount->getType(), 1));
// FIXME: SCEV distributes multiplication as V1*C1 + V2*C1. We could attempt		// FIXME: SCEV distributes multiplication as V1*C1 + V2*C1. We could attempt
// to factor simple cases.		// to factor simple cases.
▲ Show 20 Lines • Show All 329 Lines • ▼ Show 20 Lines	if (EL.Exact == getCouldNotCompute())
// we won't be able to compute an exact value for the loop.		// we won't be able to compute an exact value for the loop.
CouldComputeBECount = false;		CouldComputeBECount = false;
else		else
ExitCounts.push_back(std::make_pair(ExitBB, EL.Exact));		ExitCounts.push_back(std::make_pair(ExitBB, EL.Exact));

// 2. Derive the loop's MaxBECount from each exit's max number of		// 2. Derive the loop's MaxBECount from each exit's max number of
// non-exiting iterations. Partition the loop exits into two kinds:		// non-exiting iterations. Partition the loop exits into two kinds:
// LoopMustExits and LoopMayExits.		// LoopMustExits and LoopMayExits.
//		//
// A LoopMustExit meets two requirements:		// If the exit dominates the loop latch, it is a LoopMustExit otherwise it
		jingyue: Since MustExit is gone, comments need to be updated.
		meheff: Done.
//		// is a LoopMayExit. If any computable LoopMustExit is found, then
// (a) Its ExitLimit.MustExit flag must be set which indicates that the exit		// MaxBECount is the minimum EL.Max of computable LoopMustExits. Otherwise,
// test condition cannot be skipped (the tested variable has unit stride or		// MaxBECount is conservatively the maximum EL.Max, where CouldNotCompute is
// the test is less-than or greater-than, rather than a strict inequality).		// considered greater than any computable EL.Max.
//		if (EL.Max != getCouldNotCompute() && Latch &&
// (b) It must dominate the loop latch, hence must be tested on every loop
// iteration.
//
// If any computable LoopMustExit is found, then MaxBECount is the minimum
// EL.Max of computable LoopMustExits. Otherwise, MaxBECount is
// conservatively the maximum EL.Max, where CouldNotCompute is considered
// greater than any computable EL.Max.
if (EL.MustExit && EL.Max != getCouldNotCompute() && Latch &&
DT->dominates(ExitBB, Latch)) {		DT->dominates(ExitBB, Latch)) {
if (!MustExitMaxBECount)		if (!MustExitMaxBECount)
MustExitMaxBECount = EL.Max;		MustExitMaxBECount = EL.Max;
else {		else {
MustExitMaxBECount =		MustExitMaxBECount =
getUMinFromMismatchedTypes(MustExitMaxBECount, EL.Max);		getUMinFromMismatchedTypes(MustExitMaxBECount, EL.Max);
}		}
} else if (MayExitMaxBECount != getCouldNotCompute()) {		} else if (MayExitMaxBECount != getCouldNotCompute()) {
▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines	for (BasicBlock *BB = ExitingBlock; BB; ) {
break;		break;
}		}
BB = Pred;		BB = Pred;
}		}
if (!Ok)		if (!Ok)
return getCouldNotCompute();		return getCouldNotCompute();
}		}

		bool IsOnlyExit = (L->getExitingBlock() != nullptr);
TerminatorInst *Term = ExitingBlock->getTerminator();		TerminatorInst *Term = ExitingBlock->getTerminator();
if (BranchInst *BI = dyn_cast<BranchInst>(Term)) {		if (BranchInst *BI = dyn_cast<BranchInst>(Term)) {
assert(BI->isConditional() && "If unconditional, it can't be in loop!");		assert(BI->isConditional() && "If unconditional, it can't be in loop!");
// Proceed to the next level to examine the exit condition expression.		// Proceed to the next level to examine the exit condition expression.
return ComputeExitLimitFromCond(L, BI->getCondition(), BI->getSuccessor(0),		return ComputeExitLimitFromCond(L, BI->getCondition(), BI->getSuccessor(0),
BI->getSuccessor(1),		BI->getSuccessor(1),
/*IsSubExpr=*/false);		/*ControlsExit=*/IsOnlyExit);
}		}

if (SwitchInst *SI = dyn_cast<SwitchInst>(Term))		if (SwitchInst *SI = dyn_cast<SwitchInst>(Term))
return ComputeExitLimitFromSingleExitSwitch(L, SI, Exit,		return ComputeExitLimitFromSingleExitSwitch(L, SI, Exit,
/*IsSubExpr=*/false);		/*ControlsExit=*/IsOnlyExit);

return getCouldNotCompute();		return getCouldNotCompute();
}		}

/// ComputeExitLimitFromCond - Compute the number of times the		/// ComputeExitLimitFromCond - Compute the number of times the
/// backedge of the specified loop will execute if its exit condition		/// backedge of the specified loop will execute if its exit condition
/// were a conditional branch of ExitCond, TBB, and FBB.		/// were a conditional branch of ExitCond, TBB, and FBB.
///		///
/// @param IsSubExpr is true if ExitCond does not directly control the exit		/// @param ControlsExit is true if ExitCond directly controls the exit
/// branch. In this case, we cannot assume that the loop only exits when the		/// branch. In this case, we can assume that the loop exits only if the
/// condition is true and cannot infer that failing to meet the condition prior		/// condition is true and can infer that failing to meet the condition prior to
/// to integer wraparound results in undefined behavior.		/// integer wraparound results in undefined behavior.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::ComputeExitLimitFromCond(const Loop *L,		ScalarEvolution::ComputeExitLimitFromCond(const Loop *L,
Value *ExitCond,		Value *ExitCond,
BasicBlock *TBB,		BasicBlock *TBB,
BasicBlock *FBB,		BasicBlock *FBB,
bool IsSubExpr) {		bool ControlsExit) {
// Check if the controlling expression for this loop is an And or Or.		// Check if the controlling expression for this loop is an And or Or.
if (BinaryOperator *BO = dyn_cast<BinaryOperator>(ExitCond)) {		if (BinaryOperator *BO = dyn_cast<BinaryOperator>(ExitCond)) {
if (BO->getOpcode() == Instruction::And) {		if (BO->getOpcode() == Instruction::And) {
// Recurse on the operands of the and.		// Recurse on the operands of the and.
bool EitherMayExit = L->contains(TBB);		bool EitherMayExit = L->contains(TBB);
ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,		ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,
IsSubExpr || EitherMayExit);		ControlsExit && !EitherMayExit);
ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,		ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,
IsSubExpr || EitherMayExit);		ControlsExit && !EitherMayExit);
const SCEV *BECount = getCouldNotCompute();		const SCEV *BECount = getCouldNotCompute();
const SCEV *MaxBECount = getCouldNotCompute();		const SCEV *MaxBECount = getCouldNotCompute();
bool MustExit = false;
if (EitherMayExit) {		if (EitherMayExit) {
// Both conditions must be true for the loop to continue executing.		// Both conditions must be true for the loop to continue executing.
// Choose the less conservative count.		// Choose the less conservative count.
if (EL0.Exact == getCouldNotCompute() ||		if (EL0.Exact == getCouldNotCompute() ||
EL1.Exact == getCouldNotCompute())		EL1.Exact == getCouldNotCompute())
BECount = getCouldNotCompute();		BECount = getCouldNotCompute();
else		else
BECount = getUMinFromMismatchedTypes(EL0.Exact, EL1.Exact);		BECount = getUMinFromMismatchedTypes(EL0.Exact, EL1.Exact);
if (EL0.Max == getCouldNotCompute())		if (EL0.Max == getCouldNotCompute())
MaxBECount = EL1.Max;		MaxBECount = EL1.Max;
else if (EL1.Max == getCouldNotCompute())		else if (EL1.Max == getCouldNotCompute())
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
else		else
MaxBECount = getUMinFromMismatchedTypes(EL0.Max, EL1.Max);		MaxBECount = getUMinFromMismatchedTypes(EL0.Max, EL1.Max);
MustExit = EL0.MustExit || EL1.MustExit;
} else {		} else {
// Both conditions must be true at the same time for the loop to exit.		// Both conditions must be true at the same time for the loop to exit.
// For now, be conservative.		// For now, be conservative.
assert(L->contains(FBB) && "Loop block has no successor in loop!");		assert(L->contains(FBB) && "Loop block has no successor in loop!");
if (EL0.Max == EL1.Max)		if (EL0.Max == EL1.Max)
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
if (EL0.Exact == EL1.Exact)		if (EL0.Exact == EL1.Exact)
BECount = EL0.Exact;		BECount = EL0.Exact;
MustExit = EL0.MustExit && EL1.MustExit;
}		}

return ExitLimit(BECount, MaxBECount, MustExit);		return ExitLimit(BECount, MaxBECount);
}		}
if (BO->getOpcode() == Instruction::Or) {		if (BO->getOpcode() == Instruction::Or) {
// Recurse on the operands of the or.		// Recurse on the operands of the or.
bool EitherMayExit = L->contains(FBB);		bool EitherMayExit = L->contains(FBB);
ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,		ExitLimit EL0 = ComputeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,
IsSubExpr || EitherMayExit);		ControlsExit && !EitherMayExit);
ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,		ExitLimit EL1 = ComputeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,
IsSubExpr || EitherMayExit);		ControlsExit && !EitherMayExit);
const SCEV *BECount = getCouldNotCompute();		const SCEV *BECount = getCouldNotCompute();
const SCEV *MaxBECount = getCouldNotCompute();		const SCEV *MaxBECount = getCouldNotCompute();
bool MustExit = false;
if (EitherMayExit) {		if (EitherMayExit) {
// Both conditions must be false for the loop to continue executing.		// Both conditions must be false for the loop to continue executing.
// Choose the less conservative count.		// Choose the less conservative count.
if (EL0.Exact == getCouldNotCompute() ||		if (EL0.Exact == getCouldNotCompute() ||
EL1.Exact == getCouldNotCompute())		EL1.Exact == getCouldNotCompute())
BECount = getCouldNotCompute();		BECount = getCouldNotCompute();
else		else
BECount = getUMinFromMismatchedTypes(EL0.Exact, EL1.Exact);		BECount = getUMinFromMismatchedTypes(EL0.Exact, EL1.Exact);
if (EL0.Max == getCouldNotCompute())		if (EL0.Max == getCouldNotCompute())
MaxBECount = EL1.Max;		MaxBECount = EL1.Max;
else if (EL1.Max == getCouldNotCompute())		else if (EL1.Max == getCouldNotCompute())
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
else		else
MaxBECount = getUMinFromMismatchedTypes(EL0.Max, EL1.Max);		MaxBECount = getUMinFromMismatchedTypes(EL0.Max, EL1.Max);
MustExit = EL0.MustExit \|\| EL1.MustExit;
} else {		} else {
// Both conditions must be false at the same time for the loop to exit.		// Both conditions must be false at the same time for the loop to exit.
// For now, be conservative.		// For now, be conservative.
assert(L->contains(TBB) && "Loop block has no successor in loop!");		assert(L->contains(TBB) && "Loop block has no successor in loop!");
if (EL0.Max == EL1.Max)		if (EL0.Max == EL1.Max)
MaxBECount = EL0.Max;		MaxBECount = EL0.Max;
if (EL0.Exact == EL1.Exact)		if (EL0.Exact == EL1.Exact)
BECount = EL0.Exact;		BECount = EL0.Exact;
MustExit = EL0.MustExit && EL1.MustExit;
}		}

return ExitLimit(BECount, MaxBECount, MustExit);		return ExitLimit(BECount, MaxBECount);
}		}
}		}

// With an icmp, it may be feasible to compute an exact backedge-taken count.		// With an icmp, it may be feasible to compute an exact backedge-taken count.
// Proceed to the next level to examine the icmp.		// Proceed to the next level to examine the icmp.
if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond))		if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond))
return ComputeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, IsSubExpr);		return ComputeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, ControlsExit);

// Check for a constant condition. These are normally stripped out by		// Check for a constant condition. These are normally stripped out by
// SimplifyCFG, but ScalarEvolution may be used by a pass which wishes to		// SimplifyCFG, but ScalarEvolution may be used by a pass which wishes to
// preserve the CFG and is temporarily leaving constant conditions		// preserve the CFG and is temporarily leaving constant conditions
// in place.		// in place.
if (ConstantInt *CI = dyn_cast<ConstantInt>(ExitCond)) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(ExitCond)) {
if (L->contains(FBB) == !CI->getZExtValue())		if (L->contains(FBB) == !CI->getZExtValue())
// The backedge is always taken.		// The backedge is always taken.
[... 10 lines not shown ...]
/// ComputeExitLimitFromICmp - Compute the number of times the		/// ComputeExitLimitFromICmp - Compute the number of times the
/// backedge of the specified loop will execute if its exit condition		/// backedge of the specified loop will execute if its exit condition
/// were a conditional branch of the ICmpInst ExitCond, TBB, and FBB.		/// were a conditional branch of the ICmpInst ExitCond, TBB, and FBB.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::ComputeExitLimitFromICmp(const Loop *L,		ScalarEvolution::ComputeExitLimitFromICmp(const Loop *L,
ICmpInst *ExitCond,		ICmpInst *ExitCond,
BasicBlock *TBB,		BasicBlock *TBB,
BasicBlock *FBB,		BasicBlock *FBB,
bool IsSubExpr) {		bool ControlsExit) {

// If the condition was exit on true, convert the condition to exit on false		// If the condition was exit on true, convert the condition to exit on false
ICmpInst::Predicate Cond;		ICmpInst::Predicate Cond;
if (!L->contains(FBB))		if (!L->contains(FBB))
Cond = ExitCond->getPredicate();		Cond = ExitCond->getPredicate();
else		else
Cond = ExitCond->getInversePredicate();		Cond = ExitCond->getInversePredicate();

[... 35 lines not shown ...]	if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(LHS))

const SCEV *Ret = AddRec->getNumIterationsInRange(CompRange, *this);		const SCEV *Ret = AddRec->getNumIterationsInRange(CompRange, *this);
if (!isa<SCEVCouldNotCompute>(Ret)) return Ret;		if (!isa<SCEVCouldNotCompute>(Ret)) return Ret;
}		}

switch (Cond) {		switch (Cond) {
case ICmpInst::ICMP_NE: { // while (X != Y)		case ICmpInst::ICMP_NE: { // while (X != Y)
// Convert to: while (X-Y != 0)		// Convert to: while (X-Y != 0)
ExitLimit EL = HowFarToZero(getMinusSCEV(LHS, RHS), L, IsSubExpr);		ExitLimit EL = HowFarToZero(getMinusSCEV(LHS, RHS), L, ControlsExit);
if (EL.hasAnyInfo()) return EL;		if (EL.hasAnyInfo()) return EL;
break;		break;
}		}
case ICmpInst::ICMP_EQ: { // while (X == Y)		case ICmpInst::ICMP_EQ: { // while (X == Y)
// Convert to: while (X-Y == 0)		// Convert to: while (X-Y == 0)
ExitLimit EL = HowFarToNonZero(getMinusSCEV(LHS, RHS), L);		ExitLimit EL = HowFarToNonZero(getMinusSCEV(LHS, RHS), L);
if (EL.hasAnyInfo()) return EL;		if (EL.hasAnyInfo()) return EL;
break;		break;
}		}
case ICmpInst::ICMP_SLT:		case ICmpInst::ICMP_SLT:
case ICmpInst::ICMP_ULT: { // while (X < Y)		case ICmpInst::ICMP_ULT: { // while (X < Y)
bool IsSigned = Cond == ICmpInst::ICMP_SLT;		bool IsSigned = Cond == ICmpInst::ICMP_SLT;
ExitLimit EL = HowManyLessThans(LHS, RHS, L, IsSigned, IsSubExpr);		ExitLimit EL = HowManyLessThans(LHS, RHS, L, IsSigned, ControlsExit);
if (EL.hasAnyInfo()) return EL;		if (EL.hasAnyInfo()) return EL;
break;		break;
}		}
case ICmpInst::ICMP_SGT:		case ICmpInst::ICMP_SGT:
case ICmpInst::ICMP_UGT: { // while (X > Y)		case ICmpInst::ICMP_UGT: { // while (X > Y)
bool IsSigned = Cond == ICmpInst::ICMP_SGT;		bool IsSigned = Cond == ICmpInst::ICMP_SGT;
ExitLimit EL = HowManyGreaterThans(LHS, RHS, L, IsSigned, IsSubExpr);		ExitLimit EL = HowManyGreaterThans(LHS, RHS, L, IsSigned, ControlsExit);
if (EL.hasAnyInfo()) return EL;		if (EL.hasAnyInfo()) return EL;
break;		break;
}		}
default:		default:
#if 0		#if 0
dbgs() << "ComputeBackedgeTakenCount ";		dbgs() << "ComputeBackedgeTakenCount ";
if (ExitCond->getOperand(0)->getType()->isUnsigned())		if (ExitCond->getOperand(0)->getType()->isUnsigned())
dbgs() << "[unsigned] ";		dbgs() << "[unsigned] ";
dbgs() << *LHS << " "		dbgs() << *LHS << " "
<< Instruction::getOpcodeName(Instruction::ICmp)		<< Instruction::getOpcodeName(Instruction::ICmp)
<< " " << *RHS << "\n";		<< " " << *RHS << "\n";
#endif		#endif
break;		break;
}		}
return ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));		return ComputeExitCountExhaustively(L, ExitCond, !L->contains(TBB));
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::ComputeExitLimitFromSingleExitSwitch(const Loop *L,		ScalarEvolution::ComputeExitLimitFromSingleExitSwitch(const Loop *L,
SwitchInst *Switch,		SwitchInst *Switch,
BasicBlock *ExitingBlock,		BasicBlock *ExitingBlock,
bool IsSubExpr) {		bool ControlsExit) {
assert(!L->contains(ExitingBlock) && "Not an exiting block!");		assert(!L->contains(ExitingBlock) && "Not an exiting block!");

// Give up if the exit is the default dest of a switch.		// Give up if the exit is the default dest of a switch.
if (Switch->getDefaultDest() == ExitingBlock)		if (Switch->getDefaultDest() == ExitingBlock)
return getCouldNotCompute();		return getCouldNotCompute();

assert(L->contains(Switch->getDefaultDest()) &&		assert(L->contains(Switch->getDefaultDest()) &&
"Default case must not exit the loop!");		"Default case must not exit the loop!");
const SCEV *LHS = getSCEVAtScope(Switch->getCondition(), L);		const SCEV *LHS = getSCEVAtScope(Switch->getCondition(), L);
const SCEV *RHS = getConstant(Switch->findCaseDest(ExitingBlock));		const SCEV *RHS = getConstant(Switch->findCaseDest(ExitingBlock));

// while (X != Y) --> while (X-Y != 0)		// while (X != Y) --> while (X-Y != 0)
ExitLimit EL = HowFarToZero(getMinusSCEV(LHS, RHS), L, IsSubExpr);		ExitLimit EL = HowFarToZero(getMinusSCEV(LHS, RHS), L, ControlsExit);
if (EL.hasAnyInfo())		if (EL.hasAnyInfo())
return EL;		return EL;

return getCouldNotCompute();		return getCouldNotCompute();
}		}

static ConstantInt *		static ConstantInt *
EvaluateConstantChrecAtConstant(const SCEVAddRecExpr *AddRec, ConstantInt *C,		EvaluateConstantChrecAtConstant(const SCEVAddRecExpr *AddRec, ConstantInt *C,
[... 856 lines not shown ...]
/// HowFarToZero - Return the number of times a backedge comparing the specified		/// HowFarToZero - Return the number of times a backedge comparing the specified
/// value to zero will execute. If not computable, return CouldNotCompute.		/// value to zero will execute. If not computable, return CouldNotCompute.
///		///
/// This is only used for loops with a "x != y" exit test. The exit condition is		/// This is only used for loops with a "x != y" exit test. The exit condition is
/// now expressed as a single expression, V = x-y. So the exit test is		/// now expressed as a single expression, V = x-y. So the exit test is
/// effectively V != 0. We know and take advantage of the fact that this		/// effectively V != 0. We know and take advantage of the fact that this
/// expression only being used in a comparison by zero context.		/// expression only being used in a comparison by zero context.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowFarToZero(const SCEV *V, const Loop *L, bool IsSubExpr) {		ScalarEvolution::HowFarToZero(const SCEV *V, const Loop *L, bool ControlsExit) {
// If the value is a constant		// If the value is a constant
if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {		if (const SCEVConstant *C = dyn_cast<SCEVConstant>(V)) {
// If the value is already zero, the branch will execute zero times.		// If the value is already zero, the branch will execute zero times.
if (C->getValue()->isZero()) return C;		if (C->getValue()->isZero()) return C;
return getCouldNotCompute(); // Otherwise it will loop infinitely.		return getCouldNotCompute(); // Otherwise it will loop infinitely.
}		}

const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(V);		const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(V);
[... 77 lines not shown ...]	if (StepC->getValue()->equalsInt(1) || StepC->getValue()->isAllOnesValue()) {
if (!CountDown && CR.getUnsignedMin().isMinValue())		if (!CountDown && CR.getUnsignedMin().isMinValue())
// When counting up, the worst starting value is 1, not 0.		// When counting up, the worst starting value is 1, not 0.
MaxBECount = CR.getUnsignedMax().isMinValue()		MaxBECount = CR.getUnsignedMax().isMinValue()
? getConstant(APInt::getMinValue(CR.getBitWidth()))		? getConstant(APInt::getMinValue(CR.getBitWidth()))
: getConstant(APInt::getMaxValue(CR.getBitWidth()));		: getConstant(APInt::getMaxValue(CR.getBitWidth()));
else		else
MaxBECount = getConstant(CountDown ? CR.getUnsignedMax()		MaxBECount = getConstant(CountDown ? CR.getUnsignedMax()
: -CR.getUnsignedMin());		: -CR.getUnsignedMin());
return ExitLimit(Distance, MaxBECount, /*MustExit=*/true);		return ExitLimit(Distance, MaxBECount);
		}

		// If the step exactly divides the distance then unsigned divide computes the
		// backedge count.
		const SCEV *Q, *R;
		ScalarEvolution &SE = *const_cast<ScalarEvolution *>(this);
		SCEVDivision::divide(SE, Distance, Step, &Q, &R);
		if (R->isZero()) {
		const SCEV *Exact =
		getUDivExactExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);
		return ExitLimit(Exact, Exact);
		jingyue: Can we use getUDivExactExpr when R is zero?
		meheff: Done.
}		}

// If the recurrence is known not to wraparound, unsigned divide computes the		// If the condition controls loop exit (the loop exits only if the expression
// back edge count. (Ideally we would have an "isexact" bit for udiv). We know		// is true) and the addition is no-wrap we can use unsigned divide to
// that the value will either become zero (and thus the loop terminates), that		// compute the backedge count. In this case, the step may not divide the
// the loop will terminate through some other exit condition first, or that		// distance, but we don't care because if the condition is "missed" the loop
// the loop has undefined behavior. This means we can't "miss" the exit		// will have undefined behavior due to wrapping.
// value, even with nonunit stride, and exit later via the same branch. Note		if (ControlsExit && AddRec->getNoWrapFlags(SCEV::FlagNW)) {
// that we can skip this exit if loop later exits via a different
// branch. Hence MustExit=false.
//
// This is only valid for expressions that directly compute the loop exit. It
// is invalid for subexpressions in which the loop may exit through this
// branch even if this subexpression is false. In that case, the trip count
// computed by this udiv could be smaller than the number of well-defined
// iterations.
if (!IsSubExpr && AddRec->getNoWrapFlags(SCEV::FlagNW)) {
const SCEV *Exact =		const SCEV *Exact =
getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);		getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);
return ExitLimit(Exact, Exact, /*MustExit=*/false);		return ExitLimit(Exact, Exact);
}		}

// If Step is a power of two that evenly divides Start we know that the loop
// will always terminate. Start may not be a constant so we just have the
// number of trailing zeros available. This is safe even in presence of
// overflow as the recurrence will overflow to exactly 0.
const APInt &StepV = StepC->getValue()->getValue();
if (StepV.isPowerOf2() &&
GetMinTrailingZeros(getNegativeSCEV(Start)) >= StepV.countTrailingZeros())
return getUDivExactExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);

// Then, try to solve the above equation provided that Start is constant.		// Then, try to solve the above equation provided that Start is constant.
if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start))		if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start))
return SolveLinEquationWithOverflow(StepC->getValue()->getValue(),		return SolveLinEquationWithOverflow(StepC->getValue()->getValue(),
-StartC->getValue()->getValue(),		-StartC->getValue()->getValue(),
*this);		*this);
return getCouldNotCompute();		return getCouldNotCompute();
}		}

[... 803 lines not shown ...]	Delta = Equality ? getAddExpr(Delta, Step)
: getAddExpr(Delta, getMinusSCEV(Step, One));		: getAddExpr(Delta, getMinusSCEV(Step, One));
return getUDivExpr(Delta, Step);		return getUDivExpr(Delta, Step);
}		}

/// HowManyLessThans - Return the number of times a backedge containing the		/// HowManyLessThans - Return the number of times a backedge containing the
/// specified less-than comparison will execute. If not computable, return		/// specified less-than comparison will execute. If not computable, return
/// CouldNotCompute.		/// CouldNotCompute.
///		///
/// @param IsSubExpr is true when the LHS < RHS condition does not directly		/// @param ControlsExit is true when the LHS < RHS condition directly controls
/// control the branch. In this case, we can only compute an iteration count for		/// the branch (loops exits only if condition is true). In this case, we can use
/// a subexpression that cannot overflow before evaluating true.		/// NoWrapFlags to skip overflow checks.
ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowManyLessThans(const SCEV *LHS, const SCEV *RHS,		ScalarEvolution::HowManyLessThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool IsSigned,		const Loop *L, bool IsSigned,
bool IsSubExpr) {		bool ControlsExit) {
// We handle only IV < Invariant		// We handle only IV < Invariant
if (!isLoopInvariant(RHS, L))		if (!isLoopInvariant(RHS, L))
return getCouldNotCompute();		return getCouldNotCompute();

const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);		const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);

// Avoid weird loops		// Avoid weird loops
if (!IV || IV->getLoop() != L || !IV->isAffine())		if (!IV || IV->getLoop() != L || !IV->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();

bool NoWrap = !IsSubExpr &&		bool NoWrap = ControlsExit &&
IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);		IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);

const SCEV *Stride = IV->getStepRecurrence(*this);		const SCEV *Stride = IV->getStepRecurrence(*this);

// Avoid negative or zero stride values		// Avoid negative or zero stride values
if (!isKnownPositive(Stride))		if (!isKnownPositive(Stride))
return getCouldNotCompute();		return getCouldNotCompute();

[... 36 lines not shown ...]	if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else		else
MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),		MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),
getConstant(MinStride), false);		getConstant(MinStride), false);

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount, /*MustExit=*/true);		return ExitLimit(BECount, MaxBECount);
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,		ScalarEvolution::HowManyGreaterThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool IsSigned,		const Loop *L, bool IsSigned,
bool IsSubExpr) {		bool ControlsExit) {
// We handle only IV > Invariant		// We handle only IV > Invariant
if (!isLoopInvariant(RHS, L))		if (!isLoopInvariant(RHS, L))
return getCouldNotCompute();		return getCouldNotCompute();

const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);		const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);

// Avoid weird loops		// Avoid weird loops
if (!IV || IV->getLoop() != L || !IV->isAffine())		if (!IV || IV->getLoop() != L || !IV->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();

bool NoWrap = !IsSubExpr &&		bool NoWrap = ControlsExit &&
IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);		IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);

const SCEV *Stride = getNegativeSCEV(IV->getStepRecurrence(*this));		const SCEV *Stride = getNegativeSCEV(IV->getStepRecurrence(*this));

// Avoid negative or zero stride values		// Avoid negative or zero stride values
if (!isKnownPositive(Stride))		if (!isKnownPositive(Stride))
return getCouldNotCompute();		return getCouldNotCompute();

[... 38 lines not shown ...]	if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else		else
MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),		MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),
getConstant(MinStride), false);		getConstant(MinStride), false);

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount, /*MustExit=*/true);		return ExitLimit(BECount, MaxBECount);
}		}

/// getNumIterationsInRange - Return the number of iterations of this loop that		/// getNumIterationsInRange - Return the number of iterations of this loop that
/// produce values in the specified constant range. Another way of looking at		/// produce values in the specified constant range. Another way of looking at
/// this is that it returns the first iteration number where the value is not in		/// this is that it returns the first iteration number where the value is not in
/// the condition, thus computing the exit count. If the iteration count can't		/// the condition, thus computing the exit count. If the iteration count can't
/// be computed, an instance of SCEVCouldNotCompute is returned.		/// be computed, an instance of SCEVCouldNotCompute is returned.
const SCEV *SCEVAddRecExpr::getNumIterationsInRange(ConstantRange Range,		const SCEV *SCEVAddRecExpr::getNumIterationsInRange(ConstantRange Range,
[... 77 lines not shown ...]	std::pair<const SCEV *,const SCEV *> Roots =
SolveQuadraticEquation(cast<SCEVAddRecExpr>(NewAddRec), SE);		SolveQuadraticEquation(cast<SCEVAddRecExpr>(NewAddRec), SE);
const SCEVConstant *R1 = dyn_cast<SCEVConstant>(Roots.first);		const SCEVConstant *R1 = dyn_cast<SCEVConstant>(Roots.first);
const SCEVConstant *R2 = dyn_cast<SCEVConstant>(Roots.second);		const SCEVConstant *R2 = dyn_cast<SCEVConstant>(Roots.second);
if (R1) {		if (R1) {
// Pick the smallest positive root value.		// Pick the smallest positive root value.
if (ConstantInt *CB =		if (ConstantInt *CB =
dyn_cast<ConstantInt>(ConstantExpr::getICmp(ICmpInst::ICMP_ULT,		dyn_cast<ConstantInt>(ConstantExpr::getICmp(ICmpInst::ICMP_ULT,
R1->getValue(), R2->getValue()))) {		R1->getValue(), R2->getValue()))) {
if (CB->getZExtValue() == false)		if (CB->getZExtValue() == false)
std::swap(R1, R2); // R1 is the minimum root now.		std::swap(R1, R2); // R1 is the minimum root now.
		jingyue: Looks like the diff tool is not doing a good job :) Are the remaining differences in this file simply copy and paste?
		meheff: Yeah, there is a big cut and paste. Evidently there is some very similar code elsewhere in the file which is confusing the diff.

// Make sure the root is not off by one. The returned iteration should		// Make sure the root is not off by one. The returned iteration should
// not be in the range, but the previous one should be. When solving		// not be in the range, but the previous one should be. When solving
// for "X*X < 5", for example, we should not return a root of 2.		// for "X*X < 5", for example, we should not return a root of 2.
ConstantInt *R1Val = EvaluateConstantChrecAtConstant(this,		ConstantInt *R1Val = EvaluateConstantChrecAtConstant(this,
R1->getValue(),		R1->getValue(),
SE);		SE);
if (Range.contains(R1Val->getValue())) {		if (Range.contains(R1Val->getValue())) {
[... 116 lines not shown ...]	void SCEVAddRecExpr::collectParametricTerms(

DEBUG({		DEBUG({
dbgs() << "Terms:\n";		dbgs() << "Terms:\n";
for (const SCEV *T : Terms)		for (const SCEV *T : Terms)
dbgs() << *T << "\n";		dbgs() << *T << "\n";
});		});
}		}

static const APInt srem(const SCEVConstant *C1, const SCEVConstant *C2) {
APInt A = C1->getValue()->getValue();
APInt B = C2->getValue()->getValue();
uint32_t ABW = A.getBitWidth();
uint32_t BBW = B.getBitWidth();

if (ABW > BBW)
B = B.sext(ABW);
else if (ABW < BBW)
A = A.sext(BBW);

return APIntOps::srem(A, B);
}

static const APInt sdiv(const SCEVConstant *C1, const SCEVConstant *C2) {
APInt A = C1->getValue()->getValue();
APInt B = C2->getValue()->getValue();
uint32_t ABW = A.getBitWidth();
uint32_t BBW = B.getBitWidth();

if (ABW > BBW)
B = B.sext(ABW);
else if (ABW < BBW)
A = A.sext(BBW);

return APIntOps::sdiv(A, B);
}

namespace {
struct FindSCEVSize {
int Size;
FindSCEVSize() : Size(0) {}

bool follow(const SCEV *S) {
++Size;
// Keep looking at all operands of S.
return true;
}
bool isDone() const {
return false;
}
};
}

// Returns the size of the SCEV S.
static inline int sizeOfSCEV(const SCEV *S) {
FindSCEVSize F;
SCEVTraversal<FindSCEVSize> ST(F);
ST.visitAll(S);
return F.Size;
}

namespace {

struct SCEVDivision : public SCEVVisitor<SCEVDivision, void> {
public:
// Computes the Quotient and Remainder of the division of Numerator by
// Denominator.
static void divide(ScalarEvolution &SE, const SCEV *Numerator,
const SCEV *Denominator, const SCEV **Quotient,
const SCEV **Remainder) {
assert(Numerator && Denominator && "Uninitialized SCEV");

SCEVDivision D(SE, Numerator, Denominator);

// Check for the trivial case here to avoid having to check for it in the
// rest of the code.
if (Numerator == Denominator) {
*Quotient = D.One;
*Remainder = D.Zero;
return;
}

if (Numerator->isZero()) {
*Quotient = D.Zero;
*Remainder = D.Zero;
return;
}

// Split the Denominator when it is a product.
if (const SCEVMulExpr *T = dyn_cast<const SCEVMulExpr>(Denominator)) {
const SCEV *Q, *R;
*Quotient = Numerator;
for (const SCEV *Op : T->operands()) {
divide(SE, *Quotient, Op, &Q, &R);
*Quotient = Q;

// Bail out when the Numerator is not divisible by one of the terms of
// the Denominator.
if (!R->isZero()) {
*Quotient = D.Zero;
*Remainder = Numerator;
return;
}
}
*Remainder = D.Zero;
return;
}

D.visit(Numerator);
*Quotient = D.Quotient;
*Remainder = D.Remainder;
}

SCEVDivision(ScalarEvolution &S, const SCEV *Numerator, const SCEV *Denominator)
: SE(S), Denominator(Denominator) {
Zero = SE.getConstant(Denominator->getType(), 0);
One = SE.getConstant(Denominator->getType(), 1);

// By default, we don't know how to divide Expr by Denominator.
// Providing the default here simplifies the rest of the code.
Quotient = Zero;
Remainder = Numerator;
}

// Except in the trivial case described above, we do not know how to divide
// Expr by Denominator for the following functions with empty implementation.
void visitTruncateExpr(const SCEVTruncateExpr *Numerator) {}
void visitZeroExtendExpr(const SCEVZeroExtendExpr *Numerator) {}
void visitSignExtendExpr(const SCEVSignExtendExpr *Numerator) {}
void visitUDivExpr(const SCEVUDivExpr *Numerator) {}
void visitSMaxExpr(const SCEVSMaxExpr *Numerator) {}
void visitUMaxExpr(const SCEVUMaxExpr *Numerator) {}
void visitUnknown(const SCEVUnknown *Numerator) {}
void visitCouldNotCompute(const SCEVCouldNotCompute *Numerator) {}

void visitConstant(const SCEVConstant *Numerator) {
if (const SCEVConstant *D = dyn_cast<SCEVConstant>(Denominator)) {
Quotient = SE.getConstant(sdiv(Numerator, D));
Remainder = SE.getConstant(srem(Numerator, D));
return;
}
}

void visitAddRecExpr(const SCEVAddRecExpr *Numerator) {
const SCEV *StartQ, *StartR, *StepQ, *StepR;
assert(Numerator->isAffine() && "Numerator should be affine");
divide(SE, Numerator->getStart(), Denominator, &StartQ, &StartR);
divide(SE, Numerator->getStepRecurrence(SE), Denominator, &StepQ, &StepR);
Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
Numerator->getNoWrapFlags());
Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
Numerator->getNoWrapFlags());
}

void visitAddExpr(const SCEVAddExpr *Numerator) {
SmallVector<const SCEV *, 2> Qs, Rs;
Type *Ty = Denominator->getType();

for (const SCEV *Op : Numerator->operands()) {
const SCEV *Q, *R;
divide(SE, Op, Denominator, &Q, &R);

// Bail out if types do not match.
if (Ty != Q->getType() || Ty != R->getType()) {
Quotient = Zero;
Remainder = Numerator;
return;
}

Qs.push_back(Q);
Rs.push_back(R);
}

if (Qs.size() == 1) {
Quotient = Qs[0];
Remainder = Rs[0];
return;
}

Quotient = SE.getAddExpr(Qs);
Remainder = SE.getAddExpr(Rs);
}

void visitMulExpr(const SCEVMulExpr *Numerator) {
SmallVector<const SCEV *, 2> Qs;
Type *Ty = Denominator->getType();

bool FoundDenominatorTerm = false;
for (const SCEV *Op : Numerator->operands()) {
// Bail out if types do not match.
if (Ty != Op->getType()) {
Quotient = Zero;
Remainder = Numerator;
return;
}

if (FoundDenominatorTerm) {
Qs.push_back(Op);
continue;
}

// Check whether Denominator divides one of the product operands.
const SCEV *Q, *R;
divide(SE, Op, Denominator, &Q, &R);
if (!R->isZero()) {
Qs.push_back(Op);
continue;
}

// Bail out if types do not match.
if (Ty != Q->getType()) {
Quotient = Zero;
Remainder = Numerator;
return;
}

FoundDenominatorTerm = true;
Qs.push_back(Q);
}

if (FoundDenominatorTerm) {
Remainder = Zero;
if (Qs.size() == 1)
Quotient = Qs[0];
else
Quotient = SE.getMulExpr(Qs);
return;
}

if (!isa<SCEVUnknown>(Denominator)) {
Quotient = Zero;
Remainder = Numerator;
return;
}

// The Remainder is obtained by replacing Denominator by 0 in Numerator.
ValueToValueMap RewriteMap;
RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
cast<SCEVConstant>(Zero)->getValue();
Remainder = SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);

if (Remainder->isZero()) {
// The Quotient is obtained by replacing Denominator by 1 in Numerator.
RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
cast<SCEVConstant>(One)->getValue();
Quotient =
SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);
return;
}

// Quotient is (Numerator - Remainder) divided by Denominator.
const SCEV *Q, *R;
const SCEV *Diff = SE.getMinusSCEV(Numerator, Remainder);
if (sizeOfSCEV(Diff) > sizeOfSCEV(Numerator)) {
// This SCEV does not seem to simplify: fail the division here.
Quotient = Zero;
Remainder = Numerator;
return;
}
divide(SE, Diff, Denominator, &Q, &R);
assert(R == Zero &&
"(Numerator - Remainder) should evenly divide Denominator");
Quotient = Q;
}

private:
ScalarEvolution &SE;
const SCEV *Denominator, *Quotient, *Remainder, *Zero, *One;
};
}

static bool findArrayDimensionsRec(ScalarEvolution &SE,		static bool findArrayDimensionsRec(ScalarEvolution &SE,
SmallVectorImpl<const SCEV *> &Terms,		SmallVectorImpl<const SCEV *> &Terms,
SmallVectorImpl<const SCEV *> &Sizes) {		SmallVectorImpl<const SCEV *> &Sizes) {
int Last = Terms.size() - 1;		int Last = Terms.size() - 1;
const SCEV *Step = Terms[Last];		const SCEV *Step = Terms[Last];

// End of recursion.		// End of recursion.
if (Last == 0) {		if (Last == 0) {
[... 838 lines not shown ...]
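
The way ComputeExitLimitFromCond merges the two operand limits of an `or` condition in this file can be illustrated in isolation. The following is only a sketch, not LLVM code: `Count`, `Limit`, `combineEitherMayExit` and `combineConservative` are hypothetical names, and `std::nullopt` stands in for the value returned by getCouldNotCompute().

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a SCEV count; nullopt models CouldNotCompute.
using Count = std::optional<uint64_t>;

// Hypothetical stand-in for ExitLimit after the patch (no MustExit field).
struct Limit {
  Count Exact;
  Count Max;
};

// Mirrors the EitherMayExit branch: the loop leaves as soon as either
// operand says so, so the smaller count wins.
Limit combineEitherMayExit(const Limit &A, const Limit &B) {
  Limit R;
  // The exact count is known only if both operand counts are known.
  if (A.Exact && B.Exact)
    R.Exact = std::min(*A.Exact, *B.Exact);
  // For the maximum, one known bound is enough.
  if (!A.Max)
    R.Max = B.Max;
  else if (!B.Max)
    R.Max = A.Max;
  else
    R.Max = std::min(*A.Max, *B.Max);
  return R;
}

// Mirrors the conservative branch: keep a count only when both operands
// agree on it.
Limit combineConservative(const Limit &A, const Limit &B) {
  Limit R;
  if (A.Max == B.Max)
    R.Max = A.Max;
  if (A.Exact == B.Exact)
    R.Exact = A.Exact;
  return R;
}
```

For example, merging an operand with exact count 3 and maximum 10 with an operand whose exact count is 7 and whose maximum is unknown gives exact count 3 and maximum 10 under `combineEitherMayExit`, while `combineConservative` reports both as unknown.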

test/Analysis/ScalarEvolution/trip-count-pow2.ll

[... 42 lines not shown ...]	loop:
%i = phi i32 [ 0, %entry ], [ %i.next, %loop ]		%i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
%i.next = add i32 %i, 96		%i.next = add i32 %i, 96
%t = icmp ne i32 %i.next, %s		%t = icmp ne i32 %i.next, %s
br i1 %t, label %loop, label %exit		br i1 %t, label %loop, label %exit
exit:		exit:
ret void		ret void

; CHECK-LABEL: @test3		; CHECK-LABEL: @test3
; CHECK: Loop %loop: Unpredictable backedge-taken count.		; CHECK: Loop %loop: backedge-taken count is ((-96 + (96 * %n)) /u 96)
; CHECK: Loop %loop: Unpredictable max backedge-taken count.		; CHECK: Loop %loop: max backedge-taken count is ((-96 + (96 * %n)) /u 96)
}		}
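
The new CHECK lines for @test3 can be sanity-checked with a small simulation. This is a minimal sketch under stated assumptions: the definition of %s is not visible in this excerpt, so the code assumes %s equals 96 * %n (inferred from the expected expression), and it only covers values of n for which 96 * n does not wrap in 32 bits. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Runs the loop from @test3 literally and counts how often the backedge is
// taken. Assumes s is a nonzero multiple of 96, so the exit test is hit.
uint32_t simulatedBackedgeCount(uint32_t s) {
  assert(s != 0 && s % 96 == 0 && "sketch only handles reachable exits");
  uint32_t i = 0;
  uint32_t backedges = 0;
  for (;;) {
    uint32_t next = i + 96; // %i.next = add i32 %i, 96
    if (next == s)          // icmp ne is false, so the loop exits
      return backedges;
    ++backedges;            // branch back to %loop
    i = next;
  }
}

// The expression from the CHECK line, ((-96 + (96 * n)) /u 96), evaluated
// in 32-bit unsigned arithmetic.
uint32_t closedFormBackedgeCount(uint32_t n) {
  return (96u * n - 96u) / 96u;
}

// Compares both computations for n = 1 .. maxN (assumed small enough that
// 96 * n does not wrap).
bool countsAgreeUpTo(uint32_t maxN) {
  for (uint32_t n = 1; n <= maxN; ++n)
    if (simulatedBackedgeCount(96u * n) != closedFormBackedgeCount(n))
      return false;
  return true;
}
```

With n = 5 the loop compares 96, 192, 288, 384 and 480 against %s = 480, so the backedge is taken four times, which matches (480 - 96) / 96.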

test/Transforms/LoopUnroll/scevunroll.ll

[... 60 lines not shown ...]	exit1:
ret i64 %s		ret i64 %s

exit2:		exit2:
ret i64 %s.next		ret i64 %s.next
}		}

; SCEV properly unrolls multi-exit loops.		; SCEV properly unrolls multi-exit loops.
;		;
; SCEV cannot currently unroll this loop.
; It should ideally detect a trip count of 5.
; rdar:14038809 [SCEV]: Optimize trip count computation for multi-exit loops.
; CHECK-LABEL: @multiExit(		; CHECK-LABEL: @multiExit(
; CHECKFIXME: getelementptr i32* %base, i32 10		; CHECK: getelementptr i32* %base, i32 10
; CHECKFIXME-NEXT: load i32*		; CHECK-NEXT: load i32*
; CHECKFIXME: br i1 false, label %l2.10, label %exit1		; CHECK: br i1 false, label %l2.10, label %exit1
; CHECKFIXME: l2.10:		; CHECK: l2.10:
; CHECKFIXME-NOT: br		; CHECK-NOT: br
; CHECKFIXME: ret i32		; CHECK: ret i32
define i32 @multiExit(i32* %base) nounwind {		define i32 @multiExit(i32* %base) nounwind {
entry:		entry:
br label %l1		br label %l1
l1:		l1:
%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l2 ]		%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l2 ]
%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l2 ]		%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l2 ]
%inc1 = add i32 %iv1, 1		%inc1 = add i32 %iv1, 1
%inc2 = add i32 %iv2, 1		%inc2 = add i32 %iv2, 1
[... 124 lines not shown ...]


Revision Contents

Diff 14335

include/llvm/Analysis/ScalarEvolution.h

lib/Analysis/ScalarEvolution.cpp

test/Analysis/ScalarEvolution/trip-count-pow2.ll

test/Transforms/LoopUnroll/scevunroll.ll
