This is an archive of the discontinued LLVM Phabricator instance.

Differential D62880

Prepare for multi-exit LFTR [NFC]
ClosedPublic

Authored by reames on Jun 4 2019, 2:25 PM.

Details

Reviewers

nikic
sanjoy
apilipenko

Commits

rG5d84ccb2303b: Prepare for multi-exit LFTR [NFC]
rL362971: Prepare for multi-exit LFTR [NFC]

Summary

This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. (The actual multi-exit LFTR patch is in D62625 for context.)
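A minimal sketch of the per-exit structure described above. The types here (`BasicBlock`, `Loop`), the `EndsInBranch`/`NeedsRewrite` fields, and the helper `exitsToRewrite` are hypothetical stand-ins for illustration, not LLVM's classes; the real code works on `llvm::Loop` and consults SCEV. It only shows the shape of the loop over exiting blocks, which for now holds at most one entry:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::BasicBlock (illustration only).
struct BasicBlock {
  std::string Name;
  bool EndsInBranch;  // stands in for isa<BranchInst>(getTerminator())
  bool NeedsRewrite;  // stands in for needsLFTR(L, ExitingBB)
};

// Hypothetical stand-in for llvm::Loop (illustration only).
struct Loop {
  std::vector<BasicBlock *> Exiting;
  // Mirrors Loop::getExitingBlock(): the unique exiting block, or nullptr
  // when the loop has zero or several exiting blocks.
  BasicBlock *getExitingBlock() const {
    return Exiting.size() == 1 ? Exiting.front() : nullptr;
  }
};

// Returns the names of the exits whose tests would be rewritten.
std::vector<std::string> exitsToRewrite(const Loop &L) {
  // For now the set is populated only for single-exit loops.
  std::vector<BasicBlock *> ExitingBlocks;
  if (BasicBlock *ExitingBB = L.getExitingBlock())
    ExitingBlocks.push_back(ExitingBB);

  std::vector<std::string> Rewritten;
  for (BasicBlock *ExitingBB : ExitingBlocks) {
    if (!ExitingBB->EndsInBranch)  // can't rewrite a non-branch exit
      continue;
    if (!ExitingBB->NeedsRewrite)  // exit test is already canonical
      continue;
    Rewritten.push_back(ExitingBB->Name);
  }
  return Rewritten;
}
```

Each precondition failure skips only the current exit via `continue`, so generalizing to several exits should only require changing how `ExitingBlocks` is populated.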

Specifically, it turns out the existing code uses the backedge-taken count computed before an IV is widened. This means that we can end up with a different, narrower BE count for the loop than we would get by requerying after widening.

For the nestedIV example from elim-extend, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

This is the only in-tree test which seems sensitive to this difference. Using the wider BETC on this example actually produces slightly better code. :)
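To make the two expressions concrete, here is a small sketch that evaluates them in plain C++. The function names are hypothetical, the `<nsw>` flag is ignored, and the conversion from `uint32_t` back to `int32_t` is assumed to be the usual two's-complement wraparound (guaranteed from C++20, and what mainstream compilers do earlier). It illustrates the arithmetic only, not how SCEV derives either form:

```cpp
#include <cassert>
#include <cstdint>

// BEFORE: (-2 + (-1 * %innercount) + %limit), evaluated in wrapping i32.
uint32_t narrowBECount(int32_t InnerCount, int32_t Limit) {
  return uint32_t(-2) - uint32_t(InnerCount) + uint32_t(Limit);
}

// AFTER: (-1 + (sext i32 (-1 + %limit) to i64)
//            + (-1 * (sext i32 %innercount to i64))), evaluated in i64.
uint64_t wideBECount(int32_t InnerCount, int32_t Limit) {
  // (-1 + %limit) is still computed in wrapping i32 before the sext.
  int32_t LimitMinusOne = int32_t(uint32_t(Limit) - 1u);
  uint64_t SExtLimit = uint64_t(int64_t(LimitMinusOne));
  uint64_t SExtInner = uint64_t(int64_t(InnerCount));
  return uint64_t(-1) + SExtLimit - SExtInner;
}
```

Truncating the wide value to 32 bits always reproduces the narrow value, because truncation commutes with addition and negation. The two forms differ only in the upper 32 bits, for example when `%limit - 1 - %innercount` is negative.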

For the moment, the code is actually NFC. I'd like to land this as is, and then adjust the BETC used in a separate patch. (In particular, I want to investigate *why* they're different. They should be equivalent.) I'm open to being convinced that either a) I should investigate that first, or b) I should just include the resulting test change in this commit and not worry about it. Thoughts?

Diff Detail

Repository: rL LLVM

Event Timeline

reames created this revision. Jun 4 2019, 2:25 PM

Herald added a project: Restricted Project. Jun 4 2019, 2:25 PM

Herald added subscribers: bollu, mcrosier.

reames edited the summary of this revision. Jun 4 2019, 2:26 PM

reames mentioned this in D62625: LFTR for multiple exit loops. Jun 4 2019, 2:30 PM

I should just include the resulting test change in this commit and not worry about it.

I would say don't worry about it. This should be rare, but not unheard of. E.g. there are extreme cases where SCEV will be able to constant-fold A-B but not B-A (where A and B are complex expressions).

lib/Transforms/Scalar/IndVarSimplify.cpp
2201 ↗	(On Diff #203023)	Generalize
2203 ↗	(On Diff #203023)	It would probably be more obvious if you did bool HasOnlyOneExitingBlock = L->getExitingBlock() != nullptr; if (HasOnlyOneExitingBlock) { ... } or something like that. But isn't this called only when the loop has one exiting block (so the check will always return true)?
2650 ↗	(On Diff #203023)	I think this comment can be clearer about how it ties to the code. Are you saying asking SCEV to recompute `getBackedgeTakenCount` here can produce different results so you take care to "re-use" `BackedgeTakenCount`?

This revision is now accepted and ready to land. Jun 6 2019, 9:22 PM

Closed by commit rL362971: Prepare for multi-exit LFTR [NFC] (authored by reames). Jun 10 2019, 10:48 AM

This revision was automatically updated to reflect the committed changes.

reames mentioned this in rL362975: [LFTR] Use recomputed BE count. Jun 10 2019, 12:16 PM

reames mentioned this in rGa9633d5f0b3d: [LFTR] Use recomputed BE count.

Revision Contents

Path: llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp
Size: 142 lines

Diff 203861

llvm/trunk/lib/Transforms/Scalar/IndVarSimplify.cpp

[140 lines not shown]	class IndVarSimplify {

bool simplifyAndExtend(Loop *L, SCEVExpander &Rewriter, LoopInfo *LI);		bool simplifyAndExtend(Loop *L, SCEVExpander &Rewriter, LoopInfo *LI);

bool canLoopBeDeleted(Loop *L, SmallVector<RewritePhi, 8> &RewritePhiSet);		bool canLoopBeDeleted(Loop *L, SmallVector<RewritePhi, 8> &RewritePhiSet);
bool rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter);		bool rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter);
bool rewriteFirstIterationLoopExitValues(Loop *L);		bool rewriteFirstIterationLoopExitValues(Loop *L);
bool hasHardUserWithinLoop(const Loop *L, const Instruction *I) const;		bool hasHardUserWithinLoop(const Loop *L, const Instruction *I) const;

bool linearFunctionTestReplace(Loop *L, const SCEV *BackedgeTakenCount,		bool linearFunctionTestReplace(Loop *L, BasicBlock *ExitingBB,
		const SCEV *BackedgeTakenCount,
PHINode *IndVar, SCEVExpander &Rewriter);		PHINode *IndVar, SCEVExpander &Rewriter);

bool sinkUnusedInvariants(Loop *L);		bool sinkUnusedInvariants(Loop *L);

public:		public:
IndVarSimplify(LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,		IndVarSimplify(LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,
const DataLayout &DL, TargetLibraryInfo *TLI,		const DataLayout &DL, TargetLibraryInfo *TLI,
TargetTransformInfo *TTI)		TargetTransformInfo *TTI)
[1,816 lines not shown]	bool IndVarSimplify::simplifyAndExtend(Loop *L,
}		}
return Changed;		return Changed;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// linearFunctionTestReplace and its kin. Rewrite the loop exit condition.		// linearFunctionTestReplace and its kin. Rewrite the loop exit condition.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Return true if this loop's backedge taken count expression can be safely and
/// cheaply expanded into an instruction sequence that can be used by
/// linearFunctionTestReplace.
static bool canExpandBackedgeTakenCount(Loop *L, ScalarEvolution *SE,
SCEVExpander &Rewriter) {
const SCEV *BackedgeTakenCount = SE->getBackedgeTakenCount(L);
if (isa<SCEVCouldNotCompute>(BackedgeTakenCount))
return false;

// Better to break the backedge
if (BackedgeTakenCount->isZero())
return false;

// Loops with multiple exits are not currently suported by lftr
if (!L->getExitingBlock())
return false;

// Can't rewrite non-branch yet.
if (!isa<BranchInst>(L->getExitingBlock()->getTerminator()))
return false;

if (Rewriter.isHighCostExpansion(BackedgeTakenCount, L))
return false;

return true;
}

/// Given an Value which is hoped to be part of an add recurance in the given		/// Given an Value which is hoped to be part of an add recurance in the given
/// loop, return the associated Phi node if so. Otherwise, return null. Note		/// loop, return the associated Phi node if so. Otherwise, return null. Note
/// that this is less general than SCEVs AddRec checking.		/// that this is less general than SCEVs AddRec checking.
static PHINode *getLoopPhiForCounter(Value *IncV, Loop *L) {		static PHINode *getLoopPhiForCounter(Value *IncV, Loop *L) {
Instruction *IncI = dyn_cast<Instruction>(IncV);		Instruction *IncI = dyn_cast<Instruction>(IncV);
if (!IncI)		if (!IncI)
return nullptr;		return nullptr;

[26 lines not shown]	if (L->isLoopInvariant(IncI->getOperand(0)))
return Phi;		return Phi;
}		}
return nullptr;		return nullptr;
}		}

/// Given a loop with one backedge and one exit, return the ICmpInst		/// Given a loop with one backedge and one exit, return the ICmpInst
/// controlling the sole loop exit. There is no guarantee that the exiting		/// controlling the sole loop exit. There is no guarantee that the exiting
/// block is also the latch.		/// block is also the latch.
static ICmpInst *getLoopTest(Loop *L) {		static ICmpInst *getLoopTest(Loop *L, BasicBlock *ExitingBB) {
assert(L->getExitingBlock() && "expected loop exit");

BasicBlock *LatchBlock = L->getLoopLatch();		BasicBlock *LatchBlock = L->getLoopLatch();
// Don't bother with LFTR if the loop is not properly simplified.		// Don't bother with LFTR if the loop is not properly simplified.
if (!LatchBlock)		if (!LatchBlock)
return nullptr;		return nullptr;

BranchInst *BI = dyn_cast<BranchInst>(L->getExitingBlock()->getTerminator());		BranchInst *BI = dyn_cast<BranchInst>(ExitingBB->getTerminator());
assert(BI && "expected exit branch");		assert(BI && "expected exit branch");

return dyn_cast<ICmpInst>(BI->getCondition());		return dyn_cast<ICmpInst>(BI->getCondition());
}		}

/// linearFunctionTestReplace policy. Return true unless we can show that the		/// linearFunctionTestReplace policy. Return true unless we can show that the
/// current exit test is already sufficiently canonical.		/// current exit test is already sufficiently canonical.
static bool needsLFTR(Loop *L) {		static bool needsLFTR(Loop *L, BasicBlock *ExitingBB) {
// Do LFTR to simplify the exit condition to an ICMP.		// Do LFTR to simplify the exit condition to an ICMP.
ICmpInst *Cond = getLoopTest(L);		ICmpInst *Cond = getLoopTest(L, ExitingBB);
if (!Cond)		if (!Cond)
return true;		return true;

// Do LFTR to simplify the exit ICMP to EQ/NE		// Do LFTR to simplify the exit ICMP to EQ/NE
ICmpInst::Predicate Pred = Cond->getPredicate();		ICmpInst::Predicate Pred = Cond->getPredicate();
if (Pred != ICmpInst::ICMP_NE && Pred != ICmpInst::ICMP_EQ)		if (Pred != ICmpInst::ICMP_NE && Pred != ICmpInst::ICMP_EQ)
return true;		return true;

[105 lines not shown]

/// Search the loop header for a loop counter (anadd rec w/step of one)		/// Search the loop header for a loop counter (anadd rec w/step of one)
/// suitable for use by LFTR. If multiple counters are available, select the		/// suitable for use by LFTR. If multiple counters are available, select the
/// "best" one based profitable heuristics.		/// "best" one based profitable heuristics.
///		///
/// BECount may be an i8* pointer type. The pointer difference is already		/// BECount may be an i8* pointer type. The pointer difference is already
/// valid count without scaling the address stride, so it remains a pointer		/// valid count without scaling the address stride, so it remains a pointer
/// expression as far as SCEV is concerned.		/// expression as far as SCEV is concerned.
static PHINode *FindLoopCounter(Loop *L, const SCEV *BECount,		static PHINode *FindLoopCounter(Loop *L, BasicBlock *ExitingBB,
ScalarEvolution *SE) {		const SCEV *BECount, ScalarEvolution *SE) {
uint64_t BCWidth = SE->getTypeSizeInBits(BECount->getType());		uint64_t BCWidth = SE->getTypeSizeInBits(BECount->getType());

Value *Cond =		Value *Cond = cast<BranchInst>(ExitingBB->getTerminator())->getCondition();
cast<BranchInst>(L->getExitingBlock()->getTerminator())->getCondition();

// Loop over all of the PHI nodes, looking for a simple counter.		// Loop over all of the PHI nodes, looking for a simple counter.
PHINode *BestPhi = nullptr;		PHINode *BestPhi = nullptr;
const SCEV *BestInit = nullptr;		const SCEV *BestInit = nullptr;
BasicBlock *LatchBlock = L->getLoopLatch();		BasicBlock *LatchBlock = L->getLoopLatch();
assert(LatchBlock && "needsLFTR should guarantee a loop latch");		assert(LatchBlock && "needsLFTR should guarantee a loop latch");
const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();		const DataLayout &DL = L->getHeader()->getModule()->getDataLayout();

[16 lines not shown]	if (PhiWidth < BCWidth || !DL.isLegalInteger(PhiWidth))
continue;		continue;

// Avoid reusing a potentially undef value to compute other values that may		// Avoid reusing a potentially undef value to compute other values that may
// have originally had a concrete definition.		// have originally had a concrete definition.
if (!hasConcreteDef(Phi)) {		if (!hasConcreteDef(Phi)) {
// We explicitly allow unknown phis as long as they are already used by		// We explicitly allow unknown phis as long as they are already used by
// the loop test. In this case we assume that performing LFTR could not		// the loop test. In this case we assume that performing LFTR could not
// increase the number of undef users.		// increase the number of undef users.
if (ICmpInst *Cond = getLoopTest(L)) {		// TODO: Generalize this to allow any loop exit which is known to
		// execute on each iteration
		if (L->getExitingBlock())
		if (ICmpInst *Cond = getLoopTest(L, ExitingBB))
if (Phi != getLoopPhiForCounter(Cond->getOperand(0), L) &&		if (Phi != getLoopPhiForCounter(Cond->getOperand(0), L) &&
Phi != getLoopPhiForCounter(Cond->getOperand(1), L)) {		Phi != getLoopPhiForCounter(Cond->getOperand(1), L))
continue;		continue;
}		}
}
}
const SCEV *Init = AR->getStart();		const SCEV *Init = AR->getStart();

if (BestPhi && !AlmostDeadIV(BestPhi, LatchBlock, Cond)) {		if (BestPhi && !AlmostDeadIV(BestPhi, LatchBlock, Cond)) {
// Don't force a live loop counter if another IV can be used.		// Don't force a live loop counter if another IV can be used.
if (AlmostDeadIV(Phi, LatchBlock, Cond))		if (AlmostDeadIV(Phi, LatchBlock, Cond))
continue;		continue;

// Prefer to count-from-zero. This is a more "canonical" counter form. It		// Prefer to count-from-zero. This is a more "canonical" counter form. It
[12 lines not shown]	for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I) {
BestInit = Init;		BestInit = Init;
}		}
return BestPhi;		return BestPhi;
}		}

/// Insert an IR expression which computes the value held by the IV IndVar		/// Insert an IR expression which computes the value held by the IV IndVar
/// (which must be an loop counter w/unit stride) after the backedge of loop L		/// (which must be an loop counter w/unit stride) after the backedge of loop L
/// is taken IVCount times.		/// is taken IVCount times.
static Value *genLoopLimit(PHINode *IndVar, const SCEV *IVCount, Loop *L,		static Value *genLoopLimit(PHINode *IndVar, BasicBlock *ExitingBB,
		const SCEV *IVCount, Loop *L,
SCEVExpander &Rewriter, ScalarEvolution *SE) {		SCEVExpander &Rewriter, ScalarEvolution *SE) {
assert(isLoopCounter(IndVar, L, SE));		assert(isLoopCounter(IndVar, L, SE));
const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(IndVar));		const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(IndVar));
const SCEV *IVInit = AR->getStart();		const SCEV *IVInit = AR->getStart();

// IVInit may be a pointer while IVCount is an integer when FindLoopCounter		// IVInit may be a pointer while IVCount is an integer when FindLoopCounter
// finds a valid pointer IV. Sign extend BECount in order to materialize a		// finds a valid pointer IV. Sign extend BECount in order to materialize a
// GEP. Avoid running SCEVExpander on a new pointer value, instead reusing		// GEP. Avoid running SCEVExpander on a new pointer value, instead reusing
// the existing GEPs whenever possible.		// the existing GEPs whenever possible.
if (IndVar->getType()->isPointerTy() && !IVCount->getType()->isPointerTy()) {		if (IndVar->getType()->isPointerTy() && !IVCount->getType()->isPointerTy()) {
// IVOffset will be the new GEP offset that is interpreted by GEP as a		// IVOffset will be the new GEP offset that is interpreted by GEP as a
// signed value. IVCount on the other hand represents the loop trip count,		// signed value. IVCount on the other hand represents the loop trip count,
// which is an unsigned value. FindLoopCounter only allows induction		// which is an unsigned value. FindLoopCounter only allows induction
// variables that have a positive unit stride of one. This means we don't		// variables that have a positive unit stride of one. This means we don't
// have to handle the case of negative offsets (yet) and just need to zero		// have to handle the case of negative offsets (yet) and just need to zero
// extend IVCount.		// extend IVCount.
Type *OfsTy = SE->getEffectiveSCEVType(IVInit->getType());		Type *OfsTy = SE->getEffectiveSCEVType(IVInit->getType());
const SCEV *IVOffset = SE->getTruncateOrZeroExtend(IVCount, OfsTy);		const SCEV *IVOffset = SE->getTruncateOrZeroExtend(IVCount, OfsTy);

// Expand the code for the iteration count.		// Expand the code for the iteration count.
assert(SE->isLoopInvariant(IVOffset, L) &&		assert(SE->isLoopInvariant(IVOffset, L) &&
"Computed iteration count is not loop invariant!");		"Computed iteration count is not loop invariant!");
BranchInst *BI = cast<BranchInst>(L->getExitingBlock()->getTerminator());		BranchInst *BI = cast<BranchInst>(ExitingBB->getTerminator());
Value *GEPOffset = Rewriter.expandCodeFor(IVOffset, OfsTy, BI);		Value *GEPOffset = Rewriter.expandCodeFor(IVOffset, OfsTy, BI);

Value *GEPBase = IndVar->getIncomingValueForBlock(L->getLoopPreheader());		Value *GEPBase = IndVar->getIncomingValueForBlock(L->getLoopPreheader());
assert(AR->getStart() == SE->getSCEV(GEPBase) && "bad loop counter");		assert(AR->getStart() == SE->getSCEV(GEPBase) && "bad loop counter");
// We could handle pointer IVs other than i8*, but we need to compensate for		// We could handle pointer IVs other than i8*, but we need to compensate for
// gep index scaling. See canExpandBackedgeTakenCount comments.		// gep index scaling.
assert(SE->getSizeOfExpr(IntegerType::getInt64Ty(IndVar->getContext()),		assert(SE->getSizeOfExpr(IntegerType::getInt64Ty(IndVar->getContext()),
cast<PointerType>(GEPBase->getType())		cast<PointerType>(GEPBase->getType())
->getElementType())->isOne() &&		->getElementType())->isOne() &&
"unit stride pointer IV must be i8*");		"unit stride pointer IV must be i8*");

IRBuilder<> Builder(L->getLoopPreheader()->getTerminator());		IRBuilder<> Builder(L->getLoopPreheader()->getTerminator());
return Builder.CreateGEP(GEPBase->getType()->getPointerElementType(),		return Builder.CreateGEP(GEPBase->getType()->getPointerElementType(),
GEPBase, GEPOffset, "lftr.limit");		GEPBase, GEPOffset, "lftr.limit");
[21 lines not shown]	else {
// For integer IVs, truncate the IV before computing IVInit + BECount.		// For integer IVs, truncate the IV before computing IVInit + BECount.
if (SE->getTypeSizeInBits(IVInit->getType())		if (SE->getTypeSizeInBits(IVInit->getType())
> SE->getTypeSizeInBits(IVCount->getType()))		> SE->getTypeSizeInBits(IVCount->getType()))
IVInit = SE->getTruncateExpr(IVInit, IVCount->getType());		IVInit = SE->getTruncateExpr(IVInit, IVCount->getType());

IVLimit = SE->getAddExpr(IVInit, IVCount);		IVLimit = SE->getAddExpr(IVInit, IVCount);
}		}
// Expand the code for the iteration count.		// Expand the code for the iteration count.
BranchInst *BI = cast<BranchInst>(L->getExitingBlock()->getTerminator());		BranchInst *BI = cast<BranchInst>(ExitingBB->getTerminator());
IRBuilder<> Builder(BI);		IRBuilder<> Builder(BI);
assert(SE->isLoopInvariant(IVLimit, L) &&		assert(SE->isLoopInvariant(IVLimit, L) &&
"Computed iteration count is not loop invariant!");		"Computed iteration count is not loop invariant!");
// Ensure that we generate the same type as IndVar, or a smaller integer		// Ensure that we generate the same type as IndVar, or a smaller integer
// type. In the presence of null pointer values, we have an integer type		// type. In the presence of null pointer values, we have an integer type
// SCEV expression (IVInit) for a pointer type IV value (IndVar).		// SCEV expression (IVInit) for a pointer type IV value (IndVar).
Type *LimitTy = IVCount->getType()->isPointerTy() ?		Type *LimitTy = IVCount->getType()->isPointerTy() ?
IndVar->getType() : IVCount->getType();		IndVar->getType() : IVCount->getType();
return Rewriter.expandCodeFor(IVLimit, LimitTy, BI);		return Rewriter.expandCodeFor(IVLimit, LimitTy, BI);
}		}
}		}

/// This method rewrites the exit condition of the loop to be a canonical !=		/// This method rewrites the exit condition of the loop to be a canonical !=
/// comparison against the incremented loop induction variable. This pass is		/// comparison against the incremented loop induction variable. This pass is
/// able to rewrite the exit tests of any loop where the SCEV analysis can		/// able to rewrite the exit tests of any loop where the SCEV analysis can
/// determine a loop-invariant trip count of the loop, which is actually a much		/// determine a loop-invariant trip count of the loop, which is actually a much
/// broader range than just linear tests.		/// broader range than just linear tests.
bool IndVarSimplify::		bool IndVarSimplify::
linearFunctionTestReplace(Loop *L, const SCEV *BackedgeTakenCount,		linearFunctionTestReplace(Loop *L, BasicBlock *ExitingBB,
		const SCEV *BackedgeTakenCount,
PHINode *IndVar, SCEVExpander &Rewriter) {		PHINode *IndVar, SCEVExpander &Rewriter) {
assert(canExpandBackedgeTakenCount(L, SE, Rewriter) && "precondition");

// Initialize CmpIndVar and IVCount to their preincremented values.		// Initialize CmpIndVar and IVCount to their preincremented values.
Value *CmpIndVar = IndVar;		Value *CmpIndVar = IndVar;
const SCEV *IVCount = BackedgeTakenCount;		const SCEV *IVCount = BackedgeTakenCount;

assert(L->getLoopLatch() && "Loop no longer in simplified form?");		assert(L->getLoopLatch() && "Loop no longer in simplified form?");

// If the exiting block is the same as the backedge block, we prefer to		// If the exiting block is the same as the backedge block, we prefer to
// compare against the post-incremented value, otherwise we must compare		// compare against the post-incremented value, otherwise we must compare
// against the preincremented value.		// against the preincremented value.
if (L->getExitingBlock() == L->getLoopLatch()) {		if (ExitingBB == L->getLoopLatch()) {
// Add one to the "backedge-taken" count to get the trip count.		// Add one to the "backedge-taken" count to get the trip count.
// This addition may overflow, which is valid as long as the comparison is		// This addition may overflow, which is valid as long as the comparison is
// truncated to BackedgeTakenCount->getType().		// truncated to BackedgeTakenCount->getType().
IVCount = SE->getAddExpr(BackedgeTakenCount,		IVCount = SE->getAddExpr(BackedgeTakenCount,
SE->getOne(BackedgeTakenCount->getType()));		SE->getOne(BackedgeTakenCount->getType()));
// The BackedgeTaken expression contains the number of times that the		// The BackedgeTaken expression contains the number of times that the
// backedge branches to the loop header. This is one less than the		// backedge branches to the loop header. This is one less than the
// number of times the loop executes, so use the incremented indvar.		// number of times the loop executes, so use the incremented indvar.
CmpIndVar = IndVar->getIncomingValueForBlock(L->getExitingBlock());		CmpIndVar = IndVar->getIncomingValueForBlock(ExitingBB);
}		}

// It may be necessary to drop nowrap flags on the incrementing instruction		// It may be necessary to drop nowrap flags on the incrementing instruction
// if either LFTR moves from a pre-inc check to a post-inc check (in which		// if either LFTR moves from a pre-inc check to a post-inc check (in which
// case the increment might have previously been poison on the last iteration		// case the increment might have previously been poison on the last iteration
// only) or if LFTR switches to a different IV that was previously dynamically		// only) or if LFTR switches to a different IV that was previously dynamically
// dead (and as such may be arbitrarily poison). We remove any nowrap flags		// dead (and as such may be arbitrarily poison). We remove any nowrap flags
// that SCEV didn't infer for the post-inc addrec (even if we use a pre-inc		// that SCEV didn't infer for the post-inc addrec (even if we use a pre-inc
// check), because the pre-inc addrec flags may be adopted from the original		// check), because the pre-inc addrec flags may be adopted from the original
// instruction, while SCEV has to explicitly prove the post-inc nowrap flags.		// instruction, while SCEV has to explicitly prove the post-inc nowrap flags.
// TODO: This handling is inaccurate for one case: If we switch to a		// TODO: This handling is inaccurate for one case: If we switch to a
// dynamically dead IV that wraps on the first loop iteration only, which is		// dynamically dead IV that wraps on the first loop iteration only, which is
// not covered by the post-inc addrec. (If the new IV was not dynamically		// not covered by the post-inc addrec. (If the new IV was not dynamically
// dead, it could not be poison on the first iteration in the first place.)		// dead, it could not be poison on the first iteration in the first place.)
Value *IncVar = IndVar->getIncomingValueForBlock(L->getLoopLatch());		Value *IncVar = IndVar->getIncomingValueForBlock(L->getLoopLatch());
if (auto *BO = dyn_cast<BinaryOperator>(IncVar)) {		if (auto *BO = dyn_cast<BinaryOperator>(IncVar)) {
const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(IncVar));		const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(IncVar));
if (BO->hasNoUnsignedWrap())		if (BO->hasNoUnsignedWrap())
BO->setHasNoUnsignedWrap(AR->hasNoUnsignedWrap());		BO->setHasNoUnsignedWrap(AR->hasNoUnsignedWrap());
if (BO->hasNoSignedWrap())		if (BO->hasNoSignedWrap())
BO->setHasNoSignedWrap(AR->hasNoSignedWrap());		BO->setHasNoSignedWrap(AR->hasNoSignedWrap());
}		}

Value *ExitCnt = genLoopLimit(IndVar, IVCount, L, Rewriter, SE);		Value *ExitCnt = genLoopLimit(IndVar, ExitingBB, IVCount, L, Rewriter, SE);
assert(ExitCnt->getType()->isPointerTy() ==		assert(ExitCnt->getType()->isPointerTy() ==
IndVar->getType()->isPointerTy() &&		IndVar->getType()->isPointerTy() &&
"genLoopLimit missed a cast");		"genLoopLimit missed a cast");

// Insert a new icmp_ne or icmp_eq instruction before the branch.		// Insert a new icmp_ne or icmp_eq instruction before the branch.
BranchInst *BI = cast<BranchInst>(L->getExitingBlock()->getTerminator());		BranchInst *BI = cast<BranchInst>(ExitingBB->getTerminator());
ICmpInst::Predicate P;		ICmpInst::Predicate P;
if (L->contains(BI->getSuccessor(0)))		if (L->contains(BI->getSuccessor(0)))
P = ICmpInst::ICMP_NE;		P = ICmpInst::ICMP_NE;
else		else
P = ICmpInst::ICMP_EQ;		P = ICmpInst::ICMP_EQ;

LLVM_DEBUG(dbgs() << "INDVARS: Rewriting loop exit condition to:\n"		LLVM_DEBUG(dbgs() << "INDVARS: Rewriting loop exit condition to:\n"
<< " LHS:" << *CmpIndVar << '\n'		<< " LHS:" << *CmpIndVar << '\n'
[229 lines not shown]	#endif
if (ReplaceExitValue != NeverRepl &&		if (ReplaceExitValue != NeverRepl &&
!isa<SCEVCouldNotCompute>(BackedgeTakenCount))		!isa<SCEVCouldNotCompute>(BackedgeTakenCount))
Changed |= rewriteLoopExitValues(L, Rewriter);		Changed |= rewriteLoopExitValues(L, Rewriter);

// Eliminate redundant IV cycles.		// Eliminate redundant IV cycles.
NumElimIV += Rewriter.replaceCongruentIVs(L, DT, DeadInsts);		NumElimIV += Rewriter.replaceCongruentIVs(L, DT, DeadInsts);

// If we have a trip count expression, rewrite the loop's exit condition		// If we have a trip count expression, rewrite the loop's exit condition
// using it. We can currently only handle loops with a single exit.		// using it.
if (!DisableLFTR && canExpandBackedgeTakenCount(L, SE, Rewriter) &&		if (!DisableLFTR) {
needsLFTR(L)) {		// For the moment, we only do LFTR for single exit loops. The code is
PHINode *IndVar = FindLoopCounter(L, BackedgeTakenCount, SE);		// structured as it is in the expectation of generalization to multi-exit
if (IndVar) {		// loops in the near future. See D62625 for context.
		SmallVector<BasicBlock*, 16> ExitingBlocks;
		if (auto *ExitingBB = L->getExitingBlock())
		ExitingBlocks.push_back(ExitingBB);
		for (BasicBlock *ExitingBB : ExitingBlocks) {
		// Can't rewrite non-branch yet.
		if (!isa<BranchInst>(ExitingBB->getTerminator()))
		continue;

		if (!needsLFTR(L, ExitingBB))
		continue;

		// Note: This block of code is here strictly to separate a change into
		// two parts: one NFC, one not. What's happening here is that SCEV is
		// returning a more expensive expression for the BackedgeTakenCount for
		// the loop after widening in rare circumstances. In review, we decided
		// to accept that small difference - since it has minimal test suite
		// impact - but for ease of attribution, the functional diff will be its
		// own change.
		const SCEV *BETakenCount = L->getExitingBlock() ?
		BackedgeTakenCount : SE->getExitCount(L, ExitingBB);
		if (isa<SCEVCouldNotCompute>(BETakenCount))
		continue;

		// Better to fold to true (TODO: do so!)
		if (BETakenCount->isZero())
		continue;

		PHINode *IndVar = FindLoopCounter(L, ExitingBB, BETakenCount, SE);
		if (!IndVar)
		continue;

		// Avoid high cost expansions. Note: This heuristic is questionable in
		// that our definition of "high cost" is not exactly principled.
		if (Rewriter.isHighCostExpansion(BETakenCount, L))
		continue;

// Check preconditions for proper SCEVExpander operation. SCEV does not		// Check preconditions for proper SCEVExpander operation. SCEV does not
// express SCEVExpander's dependencies, such as LoopSimplify. Instead any		// express SCEVExpander's dependencies, such as LoopSimplify. Instead
// pass that uses the SCEVExpander must do it. This does not work well for		// any pass that uses the SCEVExpander must do it. This does not work
// loop passes because SCEVExpander makes assumptions about all loops,		// well for loop passes because SCEVExpander makes assumptions about
// while LoopPassManager only forces the current loop to be simplified.		// all loops, while LoopPassManager only forces the current loop to be
		// simplified.
//		//
// FIXME: SCEV expansion has no way to bail out, so the caller must		// FIXME: SCEV expansion has no way to bail out, so the caller must
// explicitly check any assumptions made by SCEV. Brittle.		// explicitly check any assumptions made by SCEV. Brittle.
const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(BackedgeTakenCount);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(BETakenCount);
if (!AR || AR->getLoop()->getLoopPreheader())		if (!AR || AR->getLoop()->getLoopPreheader())
Changed |= linearFunctionTestReplace(L, BackedgeTakenCount, IndVar,		Changed |= linearFunctionTestReplace(L, ExitingBB,
		BETakenCount, IndVar,
Rewriter);		Rewriter);
}		}
}		}
// Clear the rewriter cache, because values that are in the rewriter's cache		// Clear the rewriter cache, because values that are in the rewriter's cache
// can be deleted in the loop below, causing the AssertingVH in the cache to		// can be deleted in the loop below, causing the AssertingVH in the cache to
// trigger.		// trigger.
Rewriter.clear();		Rewriter.clear();

[105 lines not shown]
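As a rough illustration of the exit-test rewrite that linearFunctionTestReplace performs when the exiting block is also the latch, the sketch below compares a relational exit test with the canonical `!=` test against `IVInit + (BackedgeTakenCount + 1)`. The function names are hypothetical, and the backedge-taken count is written out by hand here; in the pass it comes from SCEV. The sketch assumes `Start < Limit`, so the bottom-tested loop body runs at least once:

```cpp
#include <cassert>
#include <cstdint>

// Original form: bottom-tested loop with a relational exit test.
uint32_t countOriginal(uint32_t Start, uint32_t Limit) {
  uint32_t Iterations = 0;
  uint32_t I = Start;
  do {
    ++Iterations;
    ++I;               // post-incremented induction variable
  } while (I < Limit);
  return Iterations;
}

// Rewritten form: the exit is in the latch, so the post-incremented IV is
// compared with != against IVInit + trip count (trip count = BECount + 1).
uint32_t countRewritten(uint32_t Start, uint32_t Limit) {
  uint32_t BECount = Limit - Start - 1;        // hand-derived backedge-taken count
  uint32_t ExitLimit = Start + (BECount + 1);  // IVInit + trip count
  uint32_t Iterations = 0;
  uint32_t I = Start;
  do {
    ++Iterations;
    ++I;
  } while (I != ExitLimit);
  return Iterations;
}
```

Both functions run the body `Limit - Start` times. When the exiting block is not the latch, the pre-incremented IV would instead be compared against `IVInit + BECount`, as the comments in the diff describe.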