This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Extend trip count to avoid overflow by default
ClosedPublic

Authored by reames on Sep 27 2021, 1:59 PM.

Details

Reviewers

nikic
mkazantsev
efriedma

Commits

rG7f55209cee55: [SCEV] Extend trip count to avoid overflow by default

Summary

As a brief reminder, an "exit count" is the number of times the backedge executes before some event. It can be zero if we exit before the backedge is reached. A "trip count" is the number of times the loop header is entered if we branch into the loop. A trip count *cannot* be zero, and in general, TC = BTC + 1.

There is a corner case which we don't handle well. Let's assume i8 for our examples to keep things simple. If BTC = 255, then the correct trip count is 256. However, 256 is not representable in i8.
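The i8 corner case can be reproduced with ordinary fixed-width integers. What follows is a minimal sketch, not code from the patch: `wrappingTripCount` and `widenedTripCount` are hypothetical names, and `uint16_t` merely stands in for "some wider type".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: TC = BTC + 1 computed in the same 8-bit type.
// Unsigned arithmetic wraps, so BTC = 255 yields 0 rather than 256.
std::uint8_t wrappingTripCount(std::uint8_t Btc) {
  return static_cast<std::uint8_t>(Btc + 1u);
}

// Hypothetical helper: widen first, then add one, so 256 is representable.
std::uint16_t widenedTripCount(std::uint8_t Btc) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(Btc) + 1u);
}
```

For every BTC below 255 the two helpers agree; only at BTC = 255 does the 8-bit version return 0, a value no real trip count can have.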

In theory, code which needs to reason about trip counts is responsible for checking for this corner case, and either bailing out or handling it correctly. Historically, we don't have a great track record of actually doing so.

When reviewing D109676, I found myself asking a basic question. Was there any good reason to preserve the current wrap-to-zero behavior when converting from backedge taken counts to trip counts? After reviewing existing code, I could not find a single case which appears to correctly and precisely handle the overflow case.

This patch changes the default behavior to extend instead of wrap. That is, if the result might be 256, we return a value of i9 type to ensure we interpret the count correctly. I did leave the legacy behavior as an option since a) loop-flatten stops triggering if I extend, due to weirdly specific pattern matching I didn't understand, and b) we could reasonably use the mode if we'd externally established a lack of overflow.
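The difference between the two modes can be modelled outside of SCEV. The sketch below is an illustration under stated assumptions, not the patch itself: `TypedCount` is a toy stand-in for a typed integer value (not an LLVM type), `tripCountFromExitCount` is a hypothetical name, and the bit width is assumed to lie in the range 1 to 63 so the widened result still fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a typed integer: a bit width plus a value that fits in
// that width. This is an assumption for illustration, not an LLVM type.
struct TypedCount {
  unsigned Bits;
  std::uint64_t Value;
};

// Hypothetical model of the two modes. Assumes 1 <= Bits <= 63.
TypedCount tripCountFromExitCount(TypedCount ExitCount, bool Extend = true) {
  if (!Extend) {
    // Legacy mode: add one and wrap modulo 2^Bits; the type is unchanged.
    std::uint64_t Mask = (std::uint64_t{1} << ExitCount.Bits) - 1;
    return {ExitCount.Bits, (ExitCount.Value + 1) & Mask};
  }
  // Default mode: zero-extend by one bit, then add one. The sum always
  // fits in the wider type, so no wrap can occur.
  return {ExitCount.Bits + 1, ExitCount.Value + 1};
}
```

With an exit count of 255 in 8 bits, the default mode gives 256 in 9 bits while the legacy mode gives 0 in 8 bits. One extra bit is always enough, because the largest N-bit value plus one is exactly 2^N.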

I want to emphasize that this change is *not* NFC. There are two call sites (one in ScalarEvolution.cpp, one in LoopCacheAnalysis.cpp) which are switched to the extend semantics. The former appears imprecise (but correct) for a constant 255 BTC. The latter appears incorrect, though I don't have a test case.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Sep 27 2021, 1:59 PM

Herald added subscribers: bollu, hiraditya, mcrosier. Sep 27 2021, 1:59 PM

reames requested review of this revision. Sep 27 2021, 1:59 PM

Herald added a project: Restricted Project. Sep 27 2021, 1:59 PM

reames mentioned this in D109676: [HardwareLoops] put +1 for loop count before zero extension. Sep 27 2021, 2:03 PM

Harbormaster completed remote builds in B125969: Diff 375393. Sep 27 2021, 2:30 PM

shchenz added a subscriber: shchenz. Sep 27 2021, 5:22 PM

I'm fine with the change.

What do you think: can we add something to SCEV's verifier to catch this theoretical latent bug?

This revision is now accepted and ready to land. Oct 7 2021, 10:26 PM

This revision was landed with ongoing or failed builds. Oct 11 2021, 9:56 AM

Closed by commit rG7f55209cee55: [SCEV] Extend trip count to avoid overflow by default (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rG7f55209cee55: [SCEV] Extend trip count to avoid overflow by default.

SjoerdMeijer mentioned this in rGada6d78a7802: [LoopFlatten] Address FIXME about getTripCountFromExitCount. NFC. Jan 24 2022, 5:57 AM

I did leave the legacy behavior as an option since a) loop-flatten stops triggering if I extend, due to weirdly specific pattern matching I didn't understand, and b) we could reasonably use the mode if we'd externally established a lack of overflow.

I wasn't aware of this until I noticed the FIXME and have committed NFC patches rGf6ac8088b0e8 and rGada6d78a7802 to try and clarify this.

caojoshua mentioned this in D141823: [SCEV] More precise trip multiples. Jan 16 2023, 1:19 AM

caojoshua mentioned this in D147117: [SCEV] When computing trip count, only zext if necessary. Mar 29 2023, 12:17 AM

efriedma mentioned this in D147868: [SCEV] Strengthen huge constant trip multiples. Apr 9 2023, 11:26 AM

caojoshua mentioned this in D149529: [SCEV][reland] More precise trip multiples. Apr 29 2023, 1:24 PM

caojoshua mentioned this in rG9c1d5e4ae349: [SCEV][reland] More precise trip multiples. May 7 2023, 10:02 PM

Revision Contents

Path                                            Size
llvm/include/llvm/Analysis/ScalarEvolution.h    10 lines
llvm/lib/Analysis/ScalarEvolution.cpp           19 lines
llvm/lib/Transforms/Scalar/LoopFlatten.cpp      9 lines

Diff 378712

llvm/include/llvm/Analysis/ScalarEvolution.h

@@ public:
   /// Test whether the backedge of the loop is protected by a conditional
   /// between LHS and RHS. This is used to eliminate casts.
   bool isLoopBackedgeGuardedByCond(const Loop *L, ICmpInst::Predicate Pred,
                                    const SCEV *LHS, const SCEV *RHS);

   /// Convert from an "exit count" (i.e. "backedge taken count") to a "trip
   /// count". A "trip count" is the number of times the header of the loop
   /// will execute if an exit is taken after the specified number of backedges
-  /// have been taken. (e.g. TripCount = ExitCount + 1) A zero result
-  /// must be interpreted as a loop having an unknown trip count.
-  const SCEV *getTripCountFromExitCount(const SCEV *ExitCount);
+  /// have been taken. (e.g. TripCount = ExitCount + 1). Note that the
+  /// expression can overflow if ExitCount = UINT_MAX. \p Extend controls
+  /// how potential overflow is handled. If true, a wider result type is
+  /// returned. ex: EC = 255 (i8), TC = 256 (i9). If false, result unsigned
+  /// wraps with 2s-complement semantics. ex: EC = 255 (i8), TC = 0 (i8)
+  const SCEV *getTripCountFromExitCount(const SCEV *ExitCount,
+                                        bool Extend = true);

   /// Returns the exact trip count of the loop if we can compute it, and
   /// the result is a small constant. '0' is used to represent an unknown
   /// or non-constant trip count. Note that a trip count is simply one more
   /// than the backedge taken count for the loop.
   unsigned getSmallConstantTripCount(const Loop *L);

   /// Return the exact trip count for this loop if we exit through ExitingBlock.

llvm/lib/Analysis/ScalarEvolution.cpp

@@ const SCEV *ScalarEvolution::createSCEV(Value *V) {

   return getUnknown(V);
 }

 //===----------------------------------------------------------------------===//
 // Iteration Count Computation Code
 //

-const SCEV *ScalarEvolution::getTripCountFromExitCount(const SCEV *ExitCount) {
-  // Get the trip count from the BE count by adding 1. Overflow, results
-  // in zero which means "unknown".
-  return getAddExpr(ExitCount, getOne(ExitCount->getType()));
+const SCEV *ScalarEvolution::getTripCountFromExitCount(const SCEV *ExitCount,
+                                                       bool Extend) {
+  if (isa<SCEVCouldNotCompute>(ExitCount))
+    return getCouldNotCompute();
+
+  auto *ExitCountType = ExitCount->getType();
+  assert(ExitCountType->isIntegerTy());
+
+  if (!Extend)
+    return getAddExpr(ExitCount, getOne(ExitCountType));
+
+  auto *WiderType = Type::getIntNTy(ExitCountType->getContext(),
+                                    1 + ExitCountType->getScalarSizeInBits());
+  return getAddExpr(getNoopOrZeroExtend(ExitCount, WiderType),
+                    getOne(WiderType));
 }

 static unsigned getConstantTripCount(const SCEVConstant *ExitCount) {
   if (!ExitCount)
     return 0;

   ConstantInt *ExitConst = ExitCount->getValue();

llvm/lib/Transforms/Scalar/LoopFlatten.cpp

@@ static bool findLoopComponents(
   // another transformation has changed the compare (e.g. icmp ult %inc,
   // tripcount -> icmp ult %j, tripcount-1), or both.
   Value *RHS = Compare->getOperand(1);
   const SCEV *BackedgeTakenCount = SE->getBackedgeTakenCount(L);
   if (isa<SCEVCouldNotCompute>(BackedgeTakenCount)) {
     LLVM_DEBUG(dbgs() << "Backedge-taken count is not predictable\n");
     return false;
   }
-  const SCEV *SCEVTripCount = SE->getTripCountFromExitCount(BackedgeTakenCount);
+  // The use of the Extend=false flag on getTripCountFromExitCount was added
+  // during a refactoring to preserve existing behavior. However, there's
+  // nothing obvious in the surrounding code when handles the overflow case.
+  // FIXME: audit code to establish whether there's a latent bug here.
+  const SCEV *SCEVTripCount =
+      SE->getTripCountFromExitCount(BackedgeTakenCount, false);
   const SCEV *SCEVRHS = SE->getSCEV(RHS);
   if (SCEVRHS == SCEVTripCount)
     return setLoopComponents(RHS, TripCount, Increment, IterationInstructions);
   ConstantInt *ConstantRHS = dyn_cast<ConstantInt>(RHS);
   if (ConstantRHS) {
     const SCEV *BackedgeTCExt = nullptr;
     if (IsWidened) {
       const SCEV *SCEVTripCountExt;
       // Find the extended backedge taken count and extended trip count using
       // SCEV. One of these should now match the RHS of the compare.
       BackedgeTCExt = SE->getZeroExtendExpr(BackedgeTakenCount, RHS->getType());
-      SCEVTripCountExt = SE->getTripCountFromExitCount(BackedgeTCExt);
+      SCEVTripCountExt = SE->getTripCountFromExitCount(BackedgeTCExt, false);
       if (SCEVRHS != BackedgeTCExt && SCEVRHS != SCEVTripCountExt) {
         LLVM_DEBUG(dbgs() << "Could not find valid trip count\n");
         return false;
       }
     }
     // If the RHS of the compare is equal to the backedge taken count we need
     // to add one to get the trip count.
     if (SCEVRHS == BackedgeTCExt || SCEVRHS == BackedgeTakenCount) {