This is an archive of the discontinued LLVM Phabricator instance.

Differential D37569

Rework loop predication pass
ClosedPublic

Authored by apilipenko on Sep 7 2017, 7:18 AM.

Details

Reviewers

anna
sanjoy
mkazantsev
reames

Commits

rG889dc1e3a58c: Rework loop predication pass
rL313981: Rework loop predication pass

Summary

We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation.

The problem.

Consider the loop:

for (int i = b; i != e; i++)
  guard(i u< len)

The current implementation asks SCEV whether the guard condition is monotonic and, if it is, replaces the IV in the condition with the value of the IV at the end of the loop. In this loop the increment will be marked as nuw because of the guard. Based on that, SCEV will mark the corresponding add recurrence as nuw, which makes the guard condition monotonic. So loop predication replaces the guard condition with the condition e u< len. This transformation is not correct: for example, the loop is now broken for b = 5, e = 4. In that case the IV has to wrap around to reach e, so it can pass through values that fail i u< len even though e u< len holds.
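
The failure can be reproduced with a small simulation. The sketch below is illustrative only and is not part of the patch: it assumes 8-bit wrapping arithmetic instead of the IV's real width, picks len = 10 arbitrarily, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simulates `for (i = b; i != e; i++) guard(i u< len)` with 8-bit wrapping
// arithmetic (an assumption made to keep the example small).
// Returns true if the guard fails on some iteration.
bool originalGuardFails(uint8_t b, uint8_t e, uint8_t len) {
  for (uint8_t i = b; i != e; ++i)
    if (!(i < len))
      return true;
  return false;
}

// The unsound widened condition from the text above: the IV in the guard is
// replaced with its value at the end of the loop, giving `e u< len`.
bool unsoundWidenedGuardPasses(uint8_t e, uint8_t len) { return e < len; }

// For b = 5, e = 4 the IV must wrap to reach e, so it passes through i = 10
// and the original guard fails, yet the widened guard 4 u< 10 passes.
void demonstrateUnsoundWidening() {
  assert(originalGuardFails(5, 4, 10));
  assert(unsoundWidenedGuardPasses(4, 10));
}
```

With the widened guard in place the loop would keep running past i = len without ever deoptimizing, which is exactly the miscompile described above.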

Generally the facts SCEV provides are true if the backedge of the loop is taken, which implicitly assumes that the guard doesn't fail. Using these facts to optimize the guard results in circular logic, where the guard is optimized under the assumption that it never fails.

The fix.

An alternative way to reason about loop predication is to use an inductive proof
approach. Given the loop:

if (B(S)) {
  do {
    I = PHI(S, I.INC)
    I.INC = I + S
    guard(G(I));
  } while (B(I.INC));
}

where B(x) and G(x) are predicates that map integers to booleans, we want a
loop invariant expression M such that the following program has the same semantics
as the above:

if (B(S)) {
  do {
    I = PHI(S, I.INC)
    I.INC = I + S
    guard(G(S) && M);
  } while (B(I.INC));
}

One solution for M is M = forall X . (G(X) && B(X + S)) => G(X + S)

For now the transformation is limited to the following case:

The loop has a single latch with either ult or slt icmp condition.
The step of the IV used in the latch condition is 1.
The IV of the latch condition is the same as the post increment IV of the guard condition.
The guard condition is ult.

In this case the latch is of the form:
++i u< latchLimit or ++i s< latchLimit
and the guard is of the form:
i u< guardLimit

For the unsigned latch comparison case M is:
forall X . X u< guardLimit && (X + 1) u< latchLimit => (X + 1) u< guardLimit

This is true if latchLimit u<= guardLimit since then
(X + 1) u< latchLimit u<= guardLimit == (X + 1) u< guardLimit.

So the widened condition is:
i.start u< guardLimit && latchLimit u<= guardLimit
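
As a sanity check of the unsigned case, the claim can be tested exhaustively on a narrow integer type. This is a sketch under stated assumptions, not code from the patch: it uses 8-bit values in place of the real IV width, always runs the first iteration (which is at least as strict as the `if (B(Start))` form above), and all function names are hypothetical.

```cpp
#include <cstdint>

// Widened condition for the unsigned latch case, as stated above:
//   i.start u< guardLimit && latchLimit u<= guardLimit
bool widenedUnsigned(uint8_t start, uint8_t latchLimit, uint8_t guardLimit) {
  return start < guardLimit && latchLimit <= guardLimit;
}

// Simulates
//   i = start; do { guard(i u< guardLimit); } while (++i u< latchLimit);
// and returns true if every executed guard passes.
bool allGuardsPassUnsigned(uint8_t start, uint8_t latchLimit,
                           uint8_t guardLimit) {
  uint8_t i = start;
  do {
    if (!(i < guardLimit))
      return false;
    ++i;
  } while (i < latchLimit);
  return true;
}

// Checks, for every 8-bit start, latchLimit and guardLimit, that the widened
// condition implies that no in-loop guard fails.
bool checkUnsignedExhaustively() {
  for (int s = 0; s < 256; ++s)
    for (int l = 0; l < 256; ++l)
      for (int g = 0; g < 256; ++g) {
        auto S = static_cast<uint8_t>(s);
        auto L = static_cast<uint8_t>(l);
        auto G = static_cast<uint8_t>(g);
        if (widenedUnsigned(S, L, G) && !allGuardsPassUnsigned(S, L, G))
          return false;
      }
  return true;
}
```

The check only covers one direction: when the widened condition is false the guard deoptimizes, which is always permitted, so only the "widened condition true" case needs to match the original loop.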

For the signed latch comparison case M is:
forall X . X u< guardLimit && (X + 1) s< latchLimit => (X + 1) u< guardLimit

The only way the antecedent can be true and the consequent can be false is if
X == guardLimit - 1
(and guardLimit is non-zero, but we won't use this latter fact).
If X == guardLimit - 1 then the second half of the antecedent is
guardLimit s< latchLimit
and its negation is
latchLimit s<= guardLimit.

So the widened condition is:
i.start u< guardLimit && latchLimit s<= guardLimit
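
The signed case can be checked the same way. The sketch below again assumes 8-bit values and hypothetical function names, and it models signed comparison by reinterpreting the bit pattern as two's complement; it is an illustration, not the pass's implementation.

```cpp
#include <cstdint>

// Reinterprets an 8-bit pattern as a two's-complement signed value.
int asSigned(uint8_t v) { return v < 128 ? v : v - 256; }

// Widened condition for the signed latch case, as stated above:
//   i.start u< guardLimit && latchLimit s<= guardLimit
bool widenedSigned(uint8_t start, uint8_t latchLimit, uint8_t guardLimit) {
  return start < guardLimit && asSigned(latchLimit) <= asSigned(guardLimit);
}

// Simulates
//   i = start; do { guard(i u< guardLimit); } while (++i s< latchLimit);
// and returns true if every executed guard passes.
bool allGuardsPassSigned(uint8_t start, uint8_t latchLimit,
                         uint8_t guardLimit) {
  uint8_t i = start;
  do {
    if (!(i < guardLimit))
      return false;
    ++i;
  } while (asSigned(i) < asSigned(latchLimit));
  return true;
}

// Checks, for every 8-bit start, latchLimit and guardLimit, that the widened
// condition implies that no in-loop guard fails.
bool checkSignedExhaustively() {
  for (int s = 0; s < 256; ++s)
    for (int l = 0; l < 256; ++l)
      for (int g = 0; g < 256; ++g) {
        auto S = static_cast<uint8_t>(s);
        auto L = static_cast<uint8_t>(l);
        auto G = static_cast<uint8_t>(g);
        if (widenedSigned(S, L, G) && !allGuardsPassSigned(S, L, G))
          return false;
      }
  return true;
}
```

Note that the guard stays an unsigned comparison while only the latch and the limit check are signed, mirroring the mixed-signedness condition derived above.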

This patch implements the logic above. Follow-up changes will add support for:

ule, sle as the latch conditions
the guard IV and the latch IV start from different values (like, latch IV is {1,+,1}, range check IV {2,+,1})

Diff Detail

Repository: rL LLVM

Event Timeline

apilipenko created this revision. Sep 7 2017, 7:18 AM

apilipenko edited the summary of this revision.

apilipenko edited the summary of this revision. Sep 7 2017, 7:20 AM

apilipenko added a reviewer: reames.

mkazantsev added inline comments. Sep 11 2017, 5:21 AM

lib/Transforms/Scalar/LoopPredication.cpp
306 ↗	(On Diff #114176)	Why do we need to expand `Start`? It comes from above the loop, we should be able to reuse it.
415 ↗	(On Diff #114176)	Doesn't check on `isOne()` automatically check affinity?
422 ↗	(On Diff #114176)	Should be `*Step`.

I did not check the actual code changes carefully -- I've assumed they're in sync with the comments -- but let me know if you want me to take a look. The comments and the math bits LGTM.

lib/Transforms/Scalar/LoopPredication.cpp
38 ↗	(On Diff #114176)	I think we should be clearer here: the facts that SCEV proves about the increment step of add recurrences are true if the backedge of the loop is taken
75 ↗	(On Diff #114176)	Can you please add a short inductive informal proof on why this is sufficient? You should also state that `M` or anything "stronger" (i.e. any condition that implies `M`) will do.
110 ↗	(On Diff #114176)	I would append this to make things 100% clear:
"""
In other words, if latchLimit s<= guardLimit then (the ranges below are written in ConstantRange notation, where `[A, B)` is the set `for (I = A; I != B; I++ /*maywrap*/) yield(I);`):
forall X . X u< guardLimit && (X + 1) s< latchLimit => (X + 1) u< guardLimit
== forall X . X u< guardLimit && (X + 1) s< guardLimit => (X + 1) u< guardLimit
== forall X . X in [0, guardLimit) && (X + 1) in [INT_MIN, guardLimit) => (X + 1) in [0, guardLimit)
== forall X . X in [0, guardLimit) && X in [INT_MAX, guardLimit-1) => X in [-1, guardLimit-1)
== forall X . X in [0, guardLimit-1) => X in [-1, guardLimit-1)
== true
"""
397 ↗	(On Diff #114176)	I don't think you need parens around `A == B`.

This revision is now accepted and ready to land. Sep 12 2017, 11:35 PM

Closed by commit rL313981: Rework loop predication pass (authored by apilipenko). Sep 22 2017, 6:15 AM

This revision was automatically updated to reflect the committed changes.

apilipenko mentioned this in D38177: [LoopPredication] Support ule, sle latch predicates. Sep 22 2017, 6:20 AM

apilipenko mentioned this in D39097: [LoopPredication] Handle the case when the guard and the latch IV have different offsets. Oct 19 2017, 9:27 AM

apilipenko mentioned this in rL316768: [LoopPredication] Handle the case when the guard and the latch IV have…. Oct 27 2017, 7:46 AM

Revision Contents

llvm/trunk/lib/Transforms/Scalar/LoopPredication.cpp: 258 lines
llvm/trunk/test/Transforms/LoopPredication/basic.ll: 344 lines
llvm/trunk/test/Transforms/LoopPredication/nested.ll: 83 lines
llvm/trunk/test/Transforms/LoopPredication/visited.ll: 5 lines

Diff 116338

llvm/trunk/lib/Transforms/Scalar/LoopPredication.cpp

Show All 28 Lines
//		//
// if (n - 1 < len)		// if (n - 1 < len)
// for (i = 0; i < n; i++) {		// for (i = 0; i < n; i++) {
// ...		// ...
// }		// }
// else		// else
// deoptimize		// deoptimize
//		//
		// It's tempting to rely on SCEV here, but it has proven to be problematic.
		// Generally the facts SCEV provides about the increment step of add
		// recurrences are true if the backedge of the loop is taken, which implicitly
		// assumes that the guard doesn't fail. Using these facts to optimize the
		// guard results in a circular logic where the guard is optimized under the
		// assumption that it never fails.
		//
		// For example, in the loop below the induction variable will be marked as nuw
		// basing on the guard. Basing on nuw the guard predicate will be considered
		// monotonic. Given a monotonic condition it's tempting to replace the induction
		// variable in the condition with its value on the last iteration. But this
		// transformation is not correct, e.g. e = 4, b = 5 breaks the loop.
		//
		// for (int i = b; i != e; i++)
		// guard(i u< len)
		//
		// One of the ways to reason about this problem is to use an inductive proof
		// approach. Given the loop:
		//
		// if (B(Start)) {
		// do {
		// I = PHI(Start, I.INC)
		// I.INC = I + Step
		// guard(G(I));
		// } while (B(I.INC));
		// }
		//
		// where B(x) and G(x) are predicates that map integers to booleans, we want a
		// loop invariant expression M such that the following program has the same semantics
		// as the above:
		//
		// if (B(Start)) {
		// do {
		// I = PHI(Start, I.INC)
		// I.INC = I + Step
		// guard(G(Start) && M);
		// } while (B(I.INC));
		// }
		//
		// One solution for M is M = forall X . (G(X) && B(X + Step)) => G(X + Step)
		//
		// Informal proof that the transformation above is correct:
		//
		// By the definition of guards we can rewrite the guard condition to:
		// G(I) && G(Start) && M
		//
		// Let's prove that for each iteration of the loop:
		// G(Start) && M => G(I)
		// And the condition above can be simplified to G(Start) && M.
		//
		// Induction base.
		// G(Start) && M => G(Start)
		//
		// Induction step. Assuming G(Start) && M => G(I) on the subsequent
		// iteration:
		//
		// B(I + Step) is true because it's the backedge condition.
		// G(I) is true because the backedge is guarded by this condition.
		//
		// So M = forall X . (G(X) && B(X + Step)) => G(X + Step) implies
		// G(I + Step).
		//
		// Note that we can use anything stronger than M, i.e. any condition which
		// implies M.
		//
		// For now the transformation is limited to the following case:
		// * The loop has a single latch with either ult or slt icmp condition.
		// * The step of the IV used in the latch condition is 1.
		// * The IV of the latch condition is the same as the post increment IV of the
		// guard condition.
		// * The guard condition is ult.
		//
		// In this case the latch is of the form:
		// ++i u< latchLimit or ++i s< latchLimit
		// and the guard is of the form:
		// i u< guardLimit
		//
		// For the unsigned latch comparison case M is:
		// forall X . X u< guardLimit && (X + 1) u< latchLimit =>
		// (X + 1) u< guardLimit
		//
		// This is true if latchLimit u<= guardLimit since then
		// (X + 1) u< latchLimit u<= guardLimit == (X + 1) u< guardLimit.
		//
		// So the widened condition is:
		// i.start u< guardLimit && latchLimit u<= guardLimit
		//
		// For the signed latch comparison case M is:
		// forall X . X u< guardLimit && (X + 1) s< latchLimit =>
		// (X + 1) u< guardLimit
		//
		// The only way the antecedent can be true and the consequent can be false is
		// if
		// X == guardLimit - 1
		// (and guardLimit is non-zero, but we won't use this latter fact).
		// If X == guardLimit - 1 then the second half of the antecedent is
		// guardLimit s< latchLimit
		// and its negation is
		// latchLimit s<= guardLimit.
		//
		// In other words, if latchLimit s<= guardLimit then:
		// (the ranges below are written in ConstantRange notation, where [A, B) is the
		// set for (I = A; I != B; I++ /*maywrap*/) yield(I);)
		//
		// forall X . X u< guardLimit && (X + 1) s< latchLimit => (X + 1) u< guardLimit
		// == forall X . X u< guardLimit && (X + 1) s< guardLimit => (X + 1) u< guardLimit
		// == forall X . X in [0, guardLimit) && (X + 1) in [INT_MIN, guardLimit) => (X + 1) in [0, guardLimit)
		// == forall X . X in [0, guardLimit) && X in [INT_MAX, guardLimit-1) => X in [-1, guardLimit-1)
		// == forall X . X in [0, guardLimit-1) => X in [-1, guardLimit-1)
		// == true
		//
		// So the widened condition is:
		// i.start u< guardLimit && latchLimit s<= guardLimit
		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Scalar/LoopPredication.h"		#include "llvm/Transforms/Scalar/LoopPredication.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopPass.h"		#include "llvm/Analysis/LoopPass.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/ScalarEvolutionExpander.h"		#include "llvm/Analysis/ScalarEvolutionExpander.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"		#include "llvm/Analysis/ScalarEvolutionExpressions.h"
Show All 25 Lines	struct LoopICmp {
LoopICmp() {}		LoopICmp() {}
};		};

ScalarEvolution *SE;		ScalarEvolution *SE;

Loop *L;		Loop *L;
const DataLayout *DL;		const DataLayout *DL;
BasicBlock *Preheader;		BasicBlock *Preheader;
		LoopICmp LatchCheck;

Optional<LoopICmp> parseLoopICmp(ICmpInst *ICI);		Optional<LoopICmp> parseLoopICmp(ICmpInst *ICI) {
		return parseLoopICmp(ICI->getPredicate(), ICI->getOperand(0),
		ICI->getOperand(1));
		}
		Optional<LoopICmp> parseLoopICmp(ICmpInst::Predicate Pred, Value *LHS,
		Value *RHS);

		Optional<LoopICmp> parseLoopLatchICmp();

Value *expandCheck(SCEVExpander &Expander, IRBuilder<> &Builder,		Value *expandCheck(SCEVExpander &Expander, IRBuilder<> &Builder,
ICmpInst::Predicate Pred, const SCEV *LHS, const SCEV *RHS,		ICmpInst::Predicate Pred, const SCEV *LHS, const SCEV *RHS,
Instruction *InsertAt);		Instruction *InsertAt);

Optional<Value *> widenICmpRangeCheck(ICmpInst *ICI, SCEVExpander &Expander,		Optional<Value *> widenICmpRangeCheck(ICmpInst *ICI, SCEVExpander &Expander,
IRBuilder<> &Builder);		IRBuilder<> &Builder);
bool widenGuardConditions(IntrinsicInst *II, SCEVExpander &Expander);		bool widenGuardConditions(IntrinsicInst *II, SCEVExpander &Expander);
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	PreservedAnalyses LoopPredicationPass::run(Loop &L, LoopAnalysisManager &AM,
LoopPredication LP(&AR.SE);		LoopPredication LP(&AR.SE);
if (!LP.runOnLoop(&L))		if (!LP.runOnLoop(&L))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

return getLoopPassPreservedAnalyses();		return getLoopPassPreservedAnalyses();
}		}

Optional<LoopPredication::LoopICmp>		Optional<LoopPredication::LoopICmp>
LoopPredication::parseLoopICmp(ICmpInst *ICI) {		LoopPredication::parseLoopICmp(ICmpInst::Predicate Pred, Value *LHS,
ICmpInst::Predicate Pred = ICI->getPredicate();		Value *RHS) {

Value *LHS = ICI->getOperand(0);
Value *RHS = ICI->getOperand(1);
const SCEV *LHSS = SE->getSCEV(LHS);		const SCEV *LHSS = SE->getSCEV(LHS);
if (isa<SCEVCouldNotCompute>(LHSS))		if (isa<SCEVCouldNotCompute>(LHSS))
return None;		return None;
const SCEV *RHSS = SE->getSCEV(RHS);		const SCEV *RHSS = SE->getSCEV(RHS);
if (isa<SCEVCouldNotCompute>(RHSS))		if (isa<SCEVCouldNotCompute>(RHSS))
return None;		return None;

// Canonicalize RHS to be loop invariant bound, LHS - a loop computable IV		// Canonicalize RHS to be loop invariant bound, LHS - a loop computable IV
Show All 9 Lines	LoopPredication::parseLoopICmp(ICmpInst::Predicate Pred, Value *LHS,

return LoopICmp(Pred, AR, RHSS);		return LoopICmp(Pred, AR, RHSS);
}		}

Value *LoopPredication::expandCheck(SCEVExpander &Expander,		Value *LoopPredication::expandCheck(SCEVExpander &Expander,
IRBuilder<> &Builder,		IRBuilder<> &Builder,
ICmpInst::Predicate Pred, const SCEV *LHS,		ICmpInst::Predicate Pred, const SCEV *LHS,
const SCEV *RHS, Instruction *InsertAt) {		const SCEV *RHS, Instruction *InsertAt) {
		// TODO: we can check isLoopEntryGuardedByCond before emitting the check

Type *Ty = LHS->getType();		Type *Ty = LHS->getType();
assert(Ty == RHS->getType() && "expandCheck operands have different types?");		assert(Ty == RHS->getType() && "expandCheck operands have different types?");
Value *LHSV = Expander.expandCodeFor(LHS, Ty, InsertAt);		Value *LHSV = Expander.expandCodeFor(LHS, Ty, InsertAt);
Value *RHSV = Expander.expandCodeFor(RHS, Ty, InsertAt);		Value *RHSV = Expander.expandCodeFor(RHS, Ty, InsertAt);
return Builder.CreateICmp(Pred, LHSV, RHSV);		return Builder.CreateICmp(Pred, LHSV, RHSV);
}		}

/// If ICI can be widened to a loop invariant condition emits the loop		/// If ICI can be widened to a loop invariant condition emits the loop
/// invariant condition in the loop preheader and return it, otherwise		/// invariant condition in the loop preheader and return it, otherwise
/// returns None.		/// returns None.
Optional<Value *> LoopPredication::widenICmpRangeCheck(ICmpInst *ICI,		Optional<Value *> LoopPredication::widenICmpRangeCheck(ICmpInst *ICI,
SCEVExpander &Expander,		SCEVExpander &Expander,
IRBuilder<> &Builder) {		IRBuilder<> &Builder) {
DEBUG(dbgs() << "Analyzing ICmpInst condition:\n");		DEBUG(dbgs() << "Analyzing ICmpInst condition:\n");
DEBUG(ICI->dump());		DEBUG(ICI->dump());

		// parseLoopStructure guarantees that the latch condition is:
		// ++i u< latchLimit or ++i s< latchLimit
		// We are looking for the range checks of the form:
		// i u< guardLimit
auto RangeCheck = parseLoopICmp(ICI);		auto RangeCheck = parseLoopICmp(ICI);
if (!RangeCheck) {		if (!RangeCheck) {
DEBUG(dbgs() << "Failed to parse the loop latch condition!\n");		DEBUG(dbgs() << "Failed to parse the loop latch condition!\n");
return None;		return None;
}		}
		if (RangeCheck->Pred != ICmpInst::ICMP_ULT) {
ICmpInst::Predicate Pred = RangeCheck->Pred;		DEBUG(dbgs() << "Unsupported range check predicate(" << RangeCheck->Pred
const SCEVAddRecExpr *IndexAR = RangeCheck->IV;		<< ")!\n");
const SCEV *RHSS = RangeCheck->Limit;

auto CanExpand = [this](const SCEV *S) {
return SE->isLoopInvariant(S, L) && isSafeToExpand(S, *SE);
};
if (!CanExpand(RHSS))
return None;		return None;
		}
DEBUG(dbgs() << "IndexAR: ");		auto *RangeCheckIV = RangeCheck->IV;
DEBUG(IndexAR->dump());		auto *PostIncRangeCheckIV = RangeCheckIV->getPostIncExpr(*SE);
		if (LatchCheck.IV != PostIncRangeCheckIV) {
bool IsIncreasing = false;		DEBUG(dbgs() << "Post increment range check IV (" << *PostIncRangeCheckIV
if (!SE->isMonotonicPredicate(IndexAR, Pred, IsIncreasing))		<< ") is not the same as latch IV (" << *LatchCheck.IV
return None;		<< ")!\n");

// If the predicate is increasing the condition can change from false to true
// as the loop progresses, in this case take the value on the first iteration
// for the widened check. Otherwise the condition can change from true to
// false as the loop progresses, so take the value on the last iteration.
const SCEV *NewLHSS = IsIncreasing
? IndexAR->getStart()
: SE->getSCEVAtScope(IndexAR, L->getParentLoop());
if (NewLHSS == IndexAR) {
DEBUG(dbgs() << "Can't compute NewLHSS!\n");
return None;		return None;
}		}
		assert(RangeCheckIV->getStepRecurrence(*SE)->isOne() && "must be one");
		const SCEV *Start = RangeCheckIV->getStart();

DEBUG(dbgs() << "NewLHSS: ");		// Generate the widened condition. See the file header comment for reasoning.
DEBUG(NewLHSS->dump());		// If the latch condition is unsigned:
		// i.start u< guardLimit && latchLimit u<= guardLimit
		// If the latch condition is signed:
		// i.start u< guardLimit && latchLimit s<= guardLimit

		auto LimitCheckPred = ICmpInst::isSigned(LatchCheck.Pred)
		? ICmpInst::ICMP_SLE
		: ICmpInst::ICMP_ULE;

if (!CanExpand(NewLHSS))		auto CanExpand = [this](const SCEV *S) {
		return SE->isLoopInvariant(S, L) && isSafeToExpand(S, *SE);
		};
		if (!CanExpand(Start) || !CanExpand(LatchCheck.Limit) ||
		!CanExpand(RangeCheck->Limit))
return None;		return None;

DEBUG(dbgs() << "NewLHSS is loop invariant and safe to expand. Expand!\n");

Instruction *InsertAt = Preheader->getTerminator();		Instruction *InsertAt = Preheader->getTerminator();
return expandCheck(Expander, Builder, Pred, NewLHSS, RHSS, InsertAt);		auto *FirstIterationCheck = expandCheck(Expander, Builder, RangeCheck->Pred,
		Start, RangeCheck->Limit, InsertAt);
		auto *LimitCheck = expandCheck(Expander, Builder, LimitCheckPred,
		LatchCheck.Limit, RangeCheck->Limit, InsertAt);
		return Builder.CreateAnd(FirstIterationCheck, LimitCheck);
}		}

bool LoopPredication::widenGuardConditions(IntrinsicInst *Guard,		bool LoopPredication::widenGuardConditions(IntrinsicInst *Guard,
SCEVExpander &Expander) {		SCEVExpander &Expander) {
DEBUG(dbgs() << "Processing guard:\n");		DEBUG(dbgs() << "Processing guard:\n");
DEBUG(Guard->dump());		DEBUG(Guard->dump());

IRBuilder<> Builder(cast<Instruction>(Preheader->getTerminator()));		IRBuilder<> Builder(cast<Instruction>(Preheader->getTerminator()));
▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	for (auto *Check : Checks)
else		else
LastCheck = Builder.CreateAnd(LastCheck, Check);		LastCheck = Builder.CreateAnd(LastCheck, Check);
Guard->setOperand(0, LastCheck);		Guard->setOperand(0, LastCheck);

DEBUG(dbgs() << "Widened checks = " << NumWidened << "\n");		DEBUG(dbgs() << "Widened checks = " << NumWidened << "\n");
return true;		return true;
}		}

		Optional<LoopPredication::LoopICmp> LoopPredication::parseLoopLatchICmp() {
		using namespace PatternMatch;

		BasicBlock *LoopLatch = L->getLoopLatch();
		if (!LoopLatch) {
		DEBUG(dbgs() << "The loop doesn't have a single latch!\n");
		return None;
		}

		ICmpInst::Predicate Pred;
		Value *LHS, *RHS;
		BasicBlock *TrueDest, *FalseDest;

		if (!match(LoopLatch->getTerminator(),
		m_Br(m_ICmp(Pred, m_Value(LHS), m_Value(RHS)), TrueDest,
		FalseDest))) {
		DEBUG(dbgs() << "Failed to match the latch terminator!\n");
		return None;
		}
		assert((TrueDest == L->getHeader() || FalseDest == L->getHeader()) &&
		"One of the latch's destinations must be the header");
		if (TrueDest != L->getHeader())
		Pred = ICmpInst::getInversePredicate(Pred);

		auto Result = parseLoopICmp(Pred, LHS, RHS);
		if (!Result) {
		DEBUG(dbgs() << "Failed to parse the loop latch condition!\n");
		return None;
		}

		if (Result->Pred != ICmpInst::ICMP_ULT &&
		Result->Pred != ICmpInst::ICMP_SLT) {
		DEBUG(dbgs() << "Unsupported loop latch predicate(" << Result->Pred
		<< ")!\n");
		return None;
		}

		// Check affine first, so if it's not we don't try to compute the step
		// recurrence.
		if (!Result->IV->isAffine()) {
		DEBUG(dbgs() << "The induction variable is not affine!\n");
		return None;
		}

		auto *Step = Result->IV->getStepRecurrence(*SE);
		if (!Step->isOne()) {
		DEBUG(dbgs() << "Unsupported loop stride(" << *Step << ")!\n");
		return None;
		}

		return Result;
		}

bool LoopPredication::runOnLoop(Loop *Loop) {		bool LoopPredication::runOnLoop(Loop *Loop) {
L = Loop;		L = Loop;

DEBUG(dbgs() << "Analyzing ");		DEBUG(dbgs() << "Analyzing ");
DEBUG(L->dump());		DEBUG(L->dump());

Module *M = L->getHeader()->getModule();		Module *M = L->getHeader()->getModule();

// There is nothing to do if the module doesn't use guards		// There is nothing to do if the module doesn't use guards
auto *GuardDecl =		auto *GuardDecl =
M->getFunction(Intrinsic::getName(Intrinsic::experimental_guard));		M->getFunction(Intrinsic::getName(Intrinsic::experimental_guard));
if (!GuardDecl || GuardDecl->use_empty())		if (!GuardDecl || GuardDecl->use_empty())
return false;		return false;

DL = &M->getDataLayout();		DL = &M->getDataLayout();

Preheader = L->getLoopPreheader();		Preheader = L->getLoopPreheader();
if (!Preheader)		if (!Preheader)
return false;		return false;

		auto LatchCheckOpt = parseLoopLatchICmp();
		if (!LatchCheckOpt)
		return false;
		LatchCheck = *LatchCheckOpt;

// Collect all the guards into a vector and process later, so as not		// Collect all the guards into a vector and process later, so as not
// to invalidate the instruction iterator.		// to invalidate the instruction iterator.
SmallVector<IntrinsicInst *, 4> Guards;		SmallVector<IntrinsicInst *, 4> Guards;
for (const auto BB : L->blocks())		for (const auto BB : L->blocks())
for (auto &I : *BB)		for (auto &I : *BB)
if (auto *II = dyn_cast<IntrinsicInst>(&I))		if (auto *II = dyn_cast<IntrinsicInst>(&I))
if (II->getIntrinsicID() == Intrinsic::experimental_guard)		if (II->getIntrinsicID() == Intrinsic::experimental_guard)
Guards.push_back(II);		Guards.push_back(II);
Show All 12 Lines

llvm/trunk/test/Transforms/LoopPredication/basic.ll

; RUN: opt -S -loop-predication < %s 2>&1 | FileCheck %s		; RUN: opt -S -loop-predication < %s 2>&1 | FileCheck %s
; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 | FileCheck %s		; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 | FileCheck %s

declare void @llvm.experimental.guard(i1, ...)		declare void @llvm.experimental.guard(i1, ...)

define i32 @unsigned_loop_0_to_n_ult_check(i32* %array, i32 %length, i32 %n) {		define i32 @unsigned_loop_0_to_n_ult_check(i32* %array, i32 %length, i32 %n) {
; CHECK-LABEL: @unsigned_loop_0_to_n_ult_check		; CHECK-LABEL: @unsigned_loop_0_to_n_ult_check
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp ult i32 [[max_index]], %length		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp ule i32 %n, %length
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
Show All 17 Lines
define i32 @unsigned_loop_0_to_n_ugt_check(i32* %array, i32 %length, i32 %n) {		define i32 @unsigned_loop_0_to_n_ugt_check(i32* %array, i32 %length, i32 %n) {
; CHECK-LABEL: @unsigned_loop_0_to_n_ugt_check		; CHECK-LABEL: @unsigned_loop_0_to_n_ugt_check
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp ult i32 [[max_index]], %length		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp ule i32 %n, %length
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
Show All 9 Lines	; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%continue = icmp ult i32 %i.next, %n		%continue = icmp ult i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

		define i32 @signed_loop_0_to_n_ult_check(i32* %array, i32 %length, i32 %n) {
define i32 @two_range_checks(i32* %array.1, i32 %length.1,		; CHECK-LABEL: @signed_loop_0_to_n_ult_check
i32* %array.2, i32 %length.2, i32 %n) {
; CHECK-LABEL: @two_range_checks
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp sle i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = icmp ult i32 [[max_index]], %length.{{1|2}}		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp sle i32 %n, %length
; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = icmp ult i32 [[max_index]], %length.{{1|2}}		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: [[wide_cond:[^ ]+]] = and i1 [[wide_cond_1]], [[wide_cond_2]]
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
%within.bounds.1 = icmp ult i32 %i, %length.1		%within.bounds = icmp ult i32 %i, %length
%within.bounds.2 = icmp ult i32 %i, %length.2
%within.bounds = and i1 %within.bounds.1, %within.bounds.2
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.1.i = load i32, i32* %array.1.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.1 = add i32 %loop.acc, %array.1.i		%loop.acc.next = add i32 %loop.acc, %array.i

%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
%array.2.i = load i32, i32* %array.2.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc.1, %array.2.i

%i.next = add nuw i32 %i, 1		%i.next = add nuw i32 %i, 1
%continue = icmp ult i32 %i.next, %n		%continue = icmp slt i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @three_range_checks(i32* %array.1, i32 %length.1,		define i32 @unsupported_latch_pred_loop_0_to_n(i32* %array, i32 %length, i32 %n) {
i32* %array.2, i32 %length.2,		; CHECK-LABEL: @unsupported_latch_pred_loop_0_to_n
i32* %array.3, i32 %length.3, i32 %n) {
; CHECK-LABEL: @three_range_checks
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp sle i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1
; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = icmp ult i32 [[max_index]], %length.{{1|2|3}}
; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = icmp ult i32 [[max_index]], %length.{{1|2|3}}
; CHECK-NEXT: [[wide_cond_3:[^ ]+]] = icmp ult i32 [[max_index]], %length.{{1|2|3}}
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: [[wide_cond_and:[^ ]+]] = and i1 [[wide_cond_1]], [[wide_cond_2]]		; CHECK: %within.bounds = icmp ult i32 %i, %length
; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[wide_cond_and]], [[wide_cond_3]]		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
%within.bounds.1 = icmp ult i32 %i, %length.1		%within.bounds = icmp ult i32 %i, %length
%within.bounds.2 = icmp ult i32 %i, %length.2
%within.bounds.3 = icmp ult i32 %i, %length.3
%within.bounds.1.and.2 = and i1 %within.bounds.1, %within.bounds.2
%within.bounds = and i1 %within.bounds.1.and.2, %within.bounds.3
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.1.i = load i32, i32* %array.1.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.1 = add i32 %loop.acc, %array.1.i		%loop.acc.next = add i32 %loop.acc, %array.i

%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
%array.2.i = load i32, i32* %array.2.i.ptr, align 4
%loop.acc.2 = add i32 %loop.acc.1, %array.2.i

%array.3.i.ptr = getelementptr inbounds i32, i32* %array.3, i64 %i.i64
%array.3.i = load i32, i32* %array.3.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc.2, %array.3.i

%i.next = add nuw i32 %i, 1		%i.next = add nsw i32 %i, 1
%continue = icmp ult i32 %i.next, %n		%continue = icmp ne i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @three_guards(i32* %array.1, i32 %length.1,		define i32 @signed_loop_0_to_n_unsupported_iv_step(i32* %array, i32 %length, i32 %n) {
i32* %array.2, i32 %length.2,		; CHECK-LABEL: @signed_loop_0_to_n_unsupported_iv_step
i32* %array.3, i32 %length.3, i32 %n) {
; CHECK-LABEL: @three_guards
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp sle i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1
; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = icmp ult i32 [[max_index]], %length.1
; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = icmp ult i32 [[max_index]], %length.2
; CHECK-NEXT: [[wide_cond_3:[^ ]+]] = icmp ult i32 [[max_index]], %length.3
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_1]], i32 9) [ "deopt"() ]		; CHECK: %within.bounds = icmp ult i32 %i, %length
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_2]], i32 9) [ "deopt"() ]		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_3]], i32 9) [ "deopt"() ]

%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
		%within.bounds = icmp ult i32 %i, %length
%within.bounds.1 = icmp ult i32 %i, %length.1		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.1, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.1.i = load i32, i32* %array.1.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.1 = add i32 %loop.acc, %array.1.i		%loop.acc.next = add i32 %loop.acc, %array.i

%within.bounds.2 = icmp ult i32 %i, %length.2
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.2, i32 9) [ "deopt"() ]

%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
%array.2.i = load i32, i32* %array.2.i.ptr, align 4
%loop.acc.2 = add i32 %loop.acc.1, %array.2.i

%within.bounds.3 = icmp ult i32 %i, %length.3
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.3, i32 9) [ "deopt"() ]

%array.3.i.ptr = getelementptr inbounds i32, i32* %array.3, i64 %i.i64
%array.3.i = load i32, i32* %array.3.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc.2, %array.3.i

%i.next = add nuw i32 %i, 1		%i.next = add nsw i32 %i, 2
%continue = icmp ult i32 %i.next, %n		%continue = icmp slt i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @signed_loop_start_to_n_sge_0_check(i32* %array, i32 %length, i32 %start, i32 %n) {		define i32 @signed_loop_0_to_n_equal_iv_range_check(i32* %array, i32 %length, i32 %n) {
; CHECK-LABEL: @signed_loop_start_to_n_sge_0_check		; CHECK-LABEL: @signed_loop_0_to_n_equal_iv_range_check
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp sle i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp sge i32 %start, 0		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp sle i32 %n, %length
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ %start, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
%within.bounds = icmp sge i32 %i, 0		%j = phi i32 [ %j.next, %loop ], [ 0, %loop.preheader ]

		%within.bounds = icmp ult i32 %j, %length
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.i = load i32, i32* %array.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc, %array.i		%loop.acc.next = add i32 %loop.acc, %array.i

		%j.next = add nsw i32 %j, 1
%i.next = add nsw i32 %i, 1		%i.next = add nsw i32 %i, 1
%continue = icmp slt i32 %i.next, %n		%continue = icmp slt i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @signed_loop_start_to_n_upper_slt_length_check(i32* %array, i32 %length, i32 %start, i32 %n) {		define i32 @signed_loop_0_to_n_unrelated_iv_range_check(i32* %array, i32 %start, i32 %length, i32 %n) {
; CHECK-LABEL: @signed_loop_start_to_n_upper_slt_length_check		; CHECK-LABEL: @signed_loop_0_to_n_unrelated_iv_range_check
entry:		entry:
%tmp5 = icmp sle i32 %n, 0		%tmp5 = icmp sle i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[start_1:[^ ]+]] = add i32 %start, 1
; CHECK-NEXT: [[n_sgt_start_1:[^ ]+]] = icmp sgt i32 %n, [[start_1]]
; CHECK-NEXT: [[smax:[^ ]+]] = select i1 [[n_sgt_start_1]], i32 %n, i32 [[start_1]]
; CHECK-NEXT: [[max_index:[^ ]+]] = add i32 [[smax]], -1
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp slt i32 [[max_index]], %length
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: %within.bounds = icmp ult i32 %j, %length
		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ %start, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
%within.bounds = icmp slt i32 %i, %length		%j = phi i32 [ %j.next, %loop ], [ %start, %loop.preheader ]

		%within.bounds = icmp ult i32 %j, %length
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.i = load i32, i32* %array.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc, %array.i		%loop.acc.next = add i32 %loop.acc, %array.i

		%j.next = add nsw i32 %j, 1
%i.next = add nsw i32 %i, 1		%i.next = add nsw i32 %i, 1
%continue = icmp slt i32 %i.next, %n		%continue = icmp slt i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @signed_loop_start_to_n_both_checks(i32* %array, i32 %length, i32 %start, i32 %n) {		define i32 @two_range_checks(i32* %array.1, i32 %length.1,
; CHECK-LABEL: @signed_loop_start_to_n_both_checks		i32* %array.2, i32 %length.2, i32 %n) {
		; CHECK-LABEL: @two_range_checks
entry:		entry:
%tmp5 = icmp sle i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[lower_check:[^ ]+]] = icmp sge i32 %start, 0		; CHECK: [[first_iteration_check_1:[^ ]+]] = icmp ult i32 0, %length.{{1|2}}
; CHECK-NEXT: [[start_1:[^ ]+]] = add i32 %start, 1		; CHECK-NEXT: [[limit_check_1:[^ ]+]] = icmp ule i32 %n, %length.{{1|2}}
; CHECK-NEXT: [[n_sgt_start_1:[^ ]+]] = icmp sgt i32 %n, [[start_1]]		; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = and i1 [[first_iteration_check_1]], [[limit_check_1]]
; CHECK-NEXT: [[smax:[^ ]+]] = select i1 [[n_sgt_start_1]], i32 %n, i32 [[start_1]]		; CHECK-NEXT: [[first_iteration_check_2:[^ ]+]] = icmp ult i32 0, %length.{{1|2}}
; CHECK-NEXT: [[max_index:[^ ]+]] = add i32 [[smax]], -1		; CHECK-NEXT: [[limit_check_2:[^ ]+]] = icmp ule i32 %n, %length.{{1|2}}
; CHECK-NEXT: [[upper_check:[^ ]+]] = icmp slt i32 [[max_index]], %length		; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = and i1 [[first_iteration_check_2]], [[limit_check_2]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: [[wide_cond:[^ ]+]] = and i1 [[lower_check]], [[upper_check]]		; CHECK: [[wide_cond:[^ ]+]] = and i1 [[wide_cond_1]], [[wide_cond_2]]
; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ %start, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
%within.bounds.1 = icmp slt i32 %i, %length		%within.bounds.1 = icmp ult i32 %i, %length.1
%within.bounds.2 = icmp sge i32 %i, 0		%within.bounds.2 = icmp ult i32 %i, %length.2
%within.bounds = and i1 %within.bounds.1, %within.bounds.2		%within.bounds = and i1 %within.bounds.1, %within.bounds.2
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64		%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64
%array.i = load i32, i32* %array.i.ptr, align 4		%array.1.i = load i32, i32* %array.1.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc, %array.i		%loop.acc.1 = add i32 %loop.acc, %array.1.i

%i.next = add nsw i32 %i, 1		%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
%continue = icmp slt i32 %i.next, %n		%array.2.i = load i32, i32* %array.2.i.ptr, align 4
		%loop.acc.next = add i32 %loop.acc.1, %array.2.i

		%i.next = add nuw i32 %i, 1
		%continue = icmp ult i32 %i.next, %n
		br i1 %continue, label %loop, label %exit

		exit:
		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
		ret i32 %result
		}

		define i32 @three_range_checks(i32* %array.1, i32 %length.1,
		i32* %array.2, i32 %length.2,
		i32* %array.3, i32 %length.3, i32 %n) {
		; CHECK-LABEL: @three_range_checks
		entry:
		%tmp5 = icmp eq i32 %n, 0
		br i1 %tmp5, label %exit, label %loop.preheader

		loop.preheader:
		; CHECK: loop.preheader:
		; CHECK: [[first_iteration_check_1:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_1:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = and i1 [[first_iteration_check_1]], [[limit_check_1]]
		; CHECK-NEXT: [[first_iteration_check_2:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_2:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = and i1 [[first_iteration_check_2]], [[limit_check_2]]
		; CHECK-NEXT: [[first_iteration_check_3:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_3:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_3:[^ ]+]] = and i1 [[first_iteration_check_3]], [[limit_check_3]]
		; CHECK-NEXT: br label %loop
		br label %loop

		loop:
		; CHECK: loop:
		; CHECK: [[wide_cond_and:[^ ]+]] = and i1 [[wide_cond_1]], [[wide_cond_2]]
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[wide_cond_and]], [[wide_cond_3]]
		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
		%within.bounds.1 = icmp ult i32 %i, %length.1
		%within.bounds.2 = icmp ult i32 %i, %length.2
		%within.bounds.3 = icmp ult i32 %i, %length.3
		%within.bounds.1.and.2 = and i1 %within.bounds.1, %within.bounds.2
		%within.bounds = and i1 %within.bounds.1.and.2, %within.bounds.3
		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

		%i.i64 = zext i32 %i to i64
		%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64
		%array.1.i = load i32, i32* %array.1.i.ptr, align 4
		%loop.acc.1 = add i32 %loop.acc, %array.1.i

		%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
		%array.2.i = load i32, i32* %array.2.i.ptr, align 4
		%loop.acc.2 = add i32 %loop.acc.1, %array.2.i

		%array.3.i.ptr = getelementptr inbounds i32, i32* %array.3, i64 %i.i64
		%array.3.i = load i32, i32* %array.3.i.ptr, align 4
		%loop.acc.next = add i32 %loop.acc.2, %array.3.i

		%i.next = add nuw i32 %i, 1
		%continue = icmp ult i32 %i.next, %n
		br i1 %continue, label %loop, label %exit

		exit:
		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
		ret i32 %result
		}

		define i32 @three_guards(i32* %array.1, i32 %length.1,
		i32* %array.2, i32 %length.2,
		i32* %array.3, i32 %length.3, i32 %n) {
		; CHECK-LABEL: @three_guards
		entry:
		%tmp5 = icmp eq i32 %n, 0
		br i1 %tmp5, label %exit, label %loop.preheader

		loop.preheader:
		; CHECK: loop.preheader:
		; CHECK: [[first_iteration_check_1:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_1:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_1:[^ ]+]] = and i1 [[first_iteration_check_1]], [[limit_check_1]]
		; CHECK-NEXT: [[first_iteration_check_2:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_2:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_2:[^ ]+]] = and i1 [[first_iteration_check_2]], [[limit_check_2]]
		; CHECK-NEXT: [[first_iteration_check_3:[^ ]+]] = icmp ult i32 0, %length.{{1|2|3}}
		; CHECK-NEXT: [[limit_check_3:[^ ]+]] = icmp ule i32 %n, %length.{{1|2|3}}
		; CHECK-NEXT: [[wide_cond_3:[^ ]+]] = and i1 [[first_iteration_check_3]], [[limit_check_3]]
		; CHECK-NEXT: br label %loop
		br label %loop

		loop:
		; CHECK: loop:
		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_1]], i32 9) [ "deopt"() ]
		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_2]], i32 9) [ "deopt"() ]
		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond_3]], i32 9) [ "deopt"() ]

		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]

		%within.bounds.1 = icmp ult i32 %i, %length.1
		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.1, i32 9) [ "deopt"() ]

		%i.i64 = zext i32 %i to i64
		%array.1.i.ptr = getelementptr inbounds i32, i32* %array.1, i64 %i.i64
		%array.1.i = load i32, i32* %array.1.i.ptr, align 4
		%loop.acc.1 = add i32 %loop.acc, %array.1.i

		%within.bounds.2 = icmp ult i32 %i, %length.2
		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.2, i32 9) [ "deopt"() ]

		%array.2.i.ptr = getelementptr inbounds i32, i32* %array.2, i64 %i.i64
		%array.2.i = load i32, i32* %array.2.i.ptr, align 4
		%loop.acc.2 = add i32 %loop.acc.1, %array.2.i

		%within.bounds.3 = icmp ult i32 %i, %length.3
		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds.3, i32 9) [ "deopt"() ]

		%array.3.i.ptr = getelementptr inbounds i32, i32* %array.3, i64 %i.i64
		%array.3.i = load i32, i32* %array.3.i.ptr, align 4
		%loop.acc.next = add i32 %loop.acc.2, %array.3.i

		%i.next = add nuw i32 %i, 1
		%continue = icmp ult i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}

define i32 @unsigned_loop_0_to_n_unrelated_condition(i32* %array, i32 %length, i32 %n, i32 %x) {		define i32 @unsigned_loop_0_to_n_unrelated_condition(i32* %array, i32 %length, i32 %n, i32 %x) {
; CHECK-LABEL: @unsigned_loop_0_to_n_unrelated_condition		; CHECK-LABEL: @unsigned_loop_0_to_n_unrelated_condition
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp ult i32 [[max_index]], %length		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp ule i32 %n, %length
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: %unrelated.cond = icmp ult i32 %x, %length		; CHECK: %unrelated.cond = icmp ult i32 %x, %length
; CHECK: [[guard_cond:[^ ]+]] = and i1 %unrelated.cond, [[wide_cond]]		; CHECK: [[guard_cond:[^ ]+]] = and i1 %unrelated.cond, [[wide_cond]]
; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 [[guard_cond]], i32 9) [ "deopt"() ]		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 [[guard_cond]], i32 9) [ "deopt"() ]
(71 lines not shown)
loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: %bound = add i32 %i, %x		; CHECK: %bound = add i32 %i, %x
; CHECK-NEXT: %within.bounds = icmp slt i32 %i, %bound		; CHECK-NEXT: %within.bounds = icmp ult i32 %i, %bound
; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ %start, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ %start, %loop.preheader ]
%bound = add i32 %i, %x		%bound = add i32 %i, %x
%within.bounds = icmp slt i32 %i, %bound		%within.bounds = icmp ult i32 %i, %bound
call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]		call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

%i.i64 = zext i32 %i to i64		%i.i64 = zext i32 %i to i64
%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64		%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
%array.i = load i32, i32* %array.i.ptr, align 4		%array.i = load i32, i32* %array.i.ptr, align 4
%loop.acc.next = add i32 %loop.acc, %array.i		%loop.acc.next = add i32 %loop.acc, %array.i

%i.next = add nsw i32 %i, 1		%i.next = add nsw i32 %i, 1
(42 lines not shown)
define i32 @unsigned_loop_0_to_n_hoist_length(i32* %array, i16 %length.i16, i32 %n) {		define i32 @unsigned_loop_0_to_n_hoist_length(i32* %array, i16 %length.i16, i32 %n) {
; CHECK-LABEL: @unsigned_loop_0_to_n_hoist_length		; CHECK-LABEL: @unsigned_loop_0_to_n_hoist_length
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[max_index:[^ ]+]] = add i32 %n, -1		; CHECK: [[length:[^ ]+]] = zext i16 %length.i16 to i32
; CHECK-NEXT: [[length:[^ ]+]] = zext i16 %length.i16 to i32		; CHECK-NEXT: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, [[length]]
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp ult i32 [[max_index]], [[length]]		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp ule i32 %n, [[length]]
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]		; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]		%loop.acc = phi i32 [ %loop.acc.next, %loop ], [ 0, %loop.preheader ]
%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]		%i = phi i32 [ %i.next, %loop ], [ 0, %loop.preheader ]
(55 lines not shown)
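
The updated CHECK lines in basic.ll expect the widened guard condition as a first-iteration check (`icmp ult i32 0, %length`) and a limit check (`icmp ule i32 %n, %length`) joined by `and`. The sketch below is not part of the patch. It brute-forces an 8-bit model of the unsigned test loops (index starts at 0, step 1, latch `icmp ult %i.next, %n`, entry skipped when `%n == 0`) to see whether the widened condition implies that every per-iteration guard passes. The 8-bit width and the names `wideCond`, `allGuardsPass` and `widenedConditionIsSound` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Widened condition in the shape the new CHECK lines expect:
//   first_iteration_check = icmp ult 0, length
//   limit_check           = icmp ule n, length
//   wide_cond             = and first_iteration_check, limit_check
// (8-bit model; the tests use i32.)
bool wideCond(std::uint8_t n, std::uint8_t length) {
  bool firstIterationCheck = 0 < length;
  bool limitCheck = n <= length;
  return firstIterationCheck && limitCheck;
}

// Executes the original loop shape and reports whether every
// guard(i u< length) passes. Mirrors the tests: the loop is
// skipped entirely when n == 0.
bool allGuardsPass(std::uint8_t n, std::uint8_t length) {
  if (n == 0)
    return true;
  std::uint8_t i = 0;
  do {
    if (!(i < length))
      return false;                        // guard would deoptimize here
    i = static_cast<std::uint8_t>(i + 1);  // %i.next = add nuw %i, 1
  } while (i < n);                         // %continue = icmp ult %i.next, %n
  return true;
}

// Exhaustive check over all 8-bit (n, length) pairs: whenever the
// widened condition holds, no guard in the loop may fail.
bool widenedConditionIsSound() {
  for (unsigned n = 1; n < 256; ++n)
    for (unsigned len = 0; len < 256; ++len) {
      auto n8 = static_cast<std::uint8_t>(n);
      auto len8 = static_cast<std::uint8_t>(len);
      if (wideCond(n8, len8) && !allGuardsPass(n8, len8))
        return false;
    }
  return true;
}
```

Calling `widenedConditionIsSound()` covers only this one loop shape in the 8-bit model. The signed and nested cases checked elsewhere in the diff would need their own models.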

llvm/trunk/test/Transforms/LoopPredication/nested.ll

	; RUN: opt -S -loop-predication < %s 2>&1 | FileCheck %s			; RUN: opt -S -loop-predication < %s 2>&1 | FileCheck %s
	; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 | FileCheck %s			; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 | FileCheck %s

	declare void @llvm.experimental.guard(i1, ...)			declare void @llvm.experimental.guard(i1, ...)

	define i32 @signed_loop_0_to_n_nested_0_to_l_inner_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {			define i32 @signed_loop_0_to_n_nested_0_to_l_inner_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {
	; CHECK-LABEL: @signed_loop_0_to_n_nested_0_to_l_inner_index_check			; CHECK-LABEL: @signed_loop_0_to_n_nested_0_to_l_inner_index_check
	entry:			entry:
	%tmp5 = icmp sle i32 %n, 0			%tmp5 = icmp sle i32 %n, 0
	br i1 %tmp5, label %exit, label %outer.loop.preheader			br i1 %tmp5, label %exit, label %outer.loop.preheader

	outer.loop.preheader:			outer.loop.preheader:
	; CHECK: outer.loop.preheader:
	; CHECK: [[iteration_count:[^ ]+]] = add i32 %l, -1
	br label %outer.loop			br label %outer.loop

	outer.loop:			outer.loop:
	%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%tmp6 = icmp sle i32 %l, 0			%tmp6 = icmp sle i32 %l, 0
	br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader			br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader

	inner.loop.preheader:			inner.loop.preheader:
	; CHECK: inner.loop.preheader:			; CHECK: inner.loop.preheader:
	; CHECK: [[wide_cond:[^ ]+]] = icmp slt i32 [[iteration_count]], %length			; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
				; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp sle i32 %l, %length
				; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
				; CHECK-NEXT: br label %inner.loop
	br label %inner.loop			br label %inner.loop

	inner.loop:			inner.loop:
	; CHECK: inner.loop:			; CHECK: inner.loop:
	; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]			; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
	%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]			%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]
	%j = phi i32 [ %j.next, %inner.loop ], [ 0, %inner.loop.preheader ]			%j = phi i32 [ %j.next, %inner.loop ], [ 0, %inner.loop.preheader ]

	%within.bounds = icmp slt i32 %j, %length			%within.bounds = icmp ult i32 %j, %length
	call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]			call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

	%j.i64 = zext i32 %j to i64			%j.i64 = zext i32 %j to i64
	%array.j.ptr = getelementptr inbounds i32, i32* %array, i64 %j.i64			%array.j.ptr = getelementptr inbounds i32, i32* %array, i64 %j.i64
	%array.j = load i32, i32* %array.j.ptr, align 4			%array.j = load i32, i32* %array.j.ptr, align 4
	%inner.loop.acc.next = add i32 %inner.loop.acc, %array.j			%inner.loop.acc.next = add i32 %inner.loop.acc, %array.j

	%j.next = add nsw i32 %j, 1			%j.next = add nsw i32 %j, 1
	(14 lines not shown)
	define i32 @signed_loop_0_to_n_nested_0_to_l_outer_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {			define i32 @signed_loop_0_to_n_nested_0_to_l_outer_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {
	; CHECK-LABEL: @signed_loop_0_to_n_nested_0_to_l_outer_index_check			; CHECK-LABEL: @signed_loop_0_to_n_nested_0_to_l_outer_index_check
	entry:			entry:
	%tmp5 = icmp sle i32 %n, 0			%tmp5 = icmp sle i32 %n, 0
	br i1 %tmp5, label %exit, label %outer.loop.preheader			br i1 %tmp5, label %exit, label %outer.loop.preheader

	outer.loop.preheader:			outer.loop.preheader:
	; CHECK: outer.loop.preheader:			; CHECK: outer.loop.preheader:
	; CHECK: [[iteration_count:[^ ]+]] = add i32 %n, -1			; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
	; CHECK: [[wide_cond:[^ ]+]] = icmp slt i32 [[iteration_count]], %length			; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp sle i32 %n, %length
				; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
				; CHECK-NEXT: br label %outer.loop
	br label %outer.loop			br label %outer.loop

	outer.loop:			outer.loop:
	%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%tmp6 = icmp sle i32 %l, 0			%tmp6 = icmp sle i32 %l, 0
	br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader			br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader

	inner.loop.preheader:			inner.loop.preheader:
	br label %inner.loop			br label %inner.loop

	inner.loop:			inner.loop:
	; CHECK: inner.loop:			; CHECK: inner.loop:
	; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]			; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]

	%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]			%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]
	%j = phi i32 [ %j.next, %inner.loop ], [ 0, %inner.loop.preheader ]			%j = phi i32 [ %j.next, %inner.loop ], [ 0, %inner.loop.preheader ]

	%within.bounds = icmp slt i32 %i, %length			%within.bounds = icmp ult i32 %i, %length
	call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]			call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

	%i.i64 = zext i32 %i to i64			%i.i64 = zext i32 %i to i64
	%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64			%array.i.ptr = getelementptr inbounds i32, i32* %array, i64 %i.i64
	%array.i = load i32, i32* %array.i.ptr, align 4			%array.i = load i32, i32* %array.i.ptr, align 4
	%inner.loop.acc.next = add i32 %inner.loop.acc, %array.i			%inner.loop.acc.next = add i32 %inner.loop.acc, %array.i

	%j.next = add nsw i32 %j, 1			%j.next = add nsw i32 %j, 1
	(13 lines not shown)

	define i32 @signed_loop_0_to_n_nested_i_to_l_inner_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {			define i32 @signed_loop_0_to_n_nested_i_to_l_inner_index_check(i32* %array, i32 %length, i32 %n, i32 %l) {
	; CHECK-LABEL: @signed_loop_0_to_n_nested_i_to_l_inner_index_check			; CHECK-LABEL: @signed_loop_0_to_n_nested_i_to_l_inner_index_check
	entry:			entry:
	%tmp5 = icmp sle i32 %n, 0			%tmp5 = icmp sle i32 %n, 0
	br i1 %tmp5, label %exit, label %outer.loop.preheader			br i1 %tmp5, label %exit, label %outer.loop.preheader

	outer.loop.preheader:			outer.loop.preheader:
				; CHECK: outer.loop.preheader:
				; CHECK-NEXT: [[first_iteration_check_outer:[^ ]+]] = icmp ult i32 0, %length
				; CHECK-NEXT: [[limit_check_outer:[^ ]+]] = icmp sle i32 %n, %length
				; CHECK-NEXT: [[wide_cond_outer:[^ ]+]] = and i1 [[first_iteration_check_outer]], [[limit_check_outer]]
				; CHECK-NEXT: br label %outer.loop
	br label %outer.loop			br label %outer.loop

	outer.loop:			outer.loop:
	; CHECK: outer.loop:			; CHECK: outer.loop:
	; CHECK: [[i_1:[^ ]+]] = add i32 %i, 1
	; CHECK-NEXT: [[l_sgt_i_1:[^ ]+]] = icmp sgt i32 %l, [[i_1]]
	; CHECK-NEXT: [[smax:[^ ]+]] = select i1 [[l_sgt_i_1]], i32 %l, i32 [[i_1]]
	; CHECK-NEXT: [[max_j:[^ ]+]] = add i32 [[smax]], -1
	%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]			%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
	%tmp6 = icmp sle i32 %l, 0			%tmp6 = icmp sle i32 %l, 0
	br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader			br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader

	inner.loop.preheader:			inner.loop.preheader:
	; CHECK: inner.loop.preheader:			; CHECK: inner.loop.preheader:
	; CHECK: [[wide_cond:[^ ]+]] = icmp slt i32 [[max_j]], %length			; CHECK: [[limit_check_inner:[^ ]+]] = icmp sle i32 %l, %length
				; CHECK: br label %inner.loop
	br label %inner.loop			br label %inner.loop

	inner.loop:			inner.loop:
	; CHECK: inner.loop:			; CHECK: inner.loop:
				; CHECK: [[wide_cond:[^ ]+]] = and i1 [[limit_check_inner]], [[wide_cond_outer]]
	; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]			; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 [[wide_cond]], i32 9) [ "deopt"() ]
	%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]			%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]
	%j = phi i32 [ %j.next, %inner.loop ], [ %i, %inner.loop.preheader ]			%j = phi i32 [ %j.next, %inner.loop ], [ %i, %inner.loop.preheader ]

	%within.bounds = icmp slt i32 %j, %length			%within.bounds = icmp ult i32 %j, %length
				call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

				%j.i64 = zext i32 %j to i64
				%array.j.ptr = getelementptr inbounds i32, i32* %array, i64 %j.i64
				%array.j = load i32, i32* %array.j.ptr, align 4
				%inner.loop.acc.next = add i32 %inner.loop.acc, %array.j

				%j.next = add nsw i32 %j, 1
				%inner.continue = icmp slt i32 %j.next, %l
				br i1 %inner.continue, label %inner.loop, label %outer.loop.inc

				outer.loop.inc:
				%outer.loop.acc.next = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %outer.loop ]
				%i.next = add nsw i32 %i, 1
				%outer.continue = icmp slt i32 %i.next, %n
				br i1 %outer.continue, label %outer.loop, label %exit

				exit:
				%result = phi i32 [ 0, %entry ], [ %outer.loop.acc.next, %outer.loop.inc ]
				ret i32 %result
				}

				define i32 @cant_expand_guard_check_start(i32* %array, i32 %length, i32 %n, i32 %l, i32 %maybezero) {
				; CHECK-LABEL: @cant_expand_guard_check_start
				entry:
				%tmp5 = icmp sle i32 %n, 0
				br i1 %tmp5, label %exit, label %outer.loop.preheader

				outer.loop.preheader:
				br label %outer.loop

				outer.loop:
				%outer.loop.acc = phi i32 [ %outer.loop.acc.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
				%i = phi i32 [ %i.next, %outer.loop.inc ], [ 0, %outer.loop.preheader ]
				%tmp6 = icmp sle i32 %l, 0
				%div = udiv i32 %i, %maybezero
				br i1 %tmp6, label %outer.loop.inc, label %inner.loop.preheader

				inner.loop.preheader:
				; CHECK: inner.loop.preheader:
				; CHECK: br label %inner.loop
				br label %inner.loop

				inner.loop:
				; CHECK: inner.loop:
				; CHECK: %within.bounds = icmp ult i32 %j, %length
				; CHECK: call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]
				%inner.loop.acc = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %inner.loop.preheader ]
				%j = phi i32 [ %j.next, %inner.loop ], [ %div, %inner.loop.preheader ]

				%within.bounds = icmp ult i32 %j, %length
	call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]			call void (i1, ...) @llvm.experimental.guard(i1 %within.bounds, i32 9) [ "deopt"() ]

	%j.i64 = zext i32 %j to i64			%j.i64 = zext i32 %j to i64
	%array.j.ptr = getelementptr inbounds i32, i32* %array, i64 %j.i64			%array.j.ptr = getelementptr inbounds i32, i32* %array, i64 %j.i64
	%array.j = load i32, i32* %array.j.ptr, align 4			%array.j = load i32, i32* %array.j.ptr, align 4
	%inner.loop.acc.next = add i32 %inner.loop.acc, %array.j			%inner.loop.acc.next = add i32 %inner.loop.acc, %array.j

	%j.next = add nsw i32 %j, 1			%j.next = add nsw i32 %j, 1
	%inner.continue = icmp slt i32 %j.next, %l			%inner.continue = icmp slt i32 %j.next, %l
	br i1 %inner.continue, label %inner.loop, label %outer.loop.inc			br i1 %inner.continue, label %inner.loop, label %outer.loop.inc

	outer.loop.inc:			outer.loop.inc:
	%outer.loop.acc.next = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %outer.loop ]			%outer.loop.acc.next = phi i32 [ %inner.loop.acc.next, %inner.loop ], [ %outer.loop.acc, %outer.loop ]
	%i.next = add nsw i32 %i, 1			%i.next = add nsw i32 %i, 1
	%outer.continue = icmp slt i32 %i.next, %n			%outer.continue = icmp slt i32 %i.next, %n
	br i1 %outer.continue, label %outer.loop, label %exit			br i1 %outer.continue, label %outer.loop, label %exit

	exit:			exit:
	%result = phi i32 [ 0, %entry ], [ %outer.loop.acc.next, %outer.loop.inc ]			%result = phi i32 [ 0, %entry ], [ %outer.loop.acc.next, %outer.loop.inc ]
	ret i32 %result			ret i32 %result
	}			}
	No newline at end of file			No newline at end of file
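
A note on the expected output of @signed_loop_0_to_n_nested_i_to_l_inner_index_check above: the new CHECK lines widen the inner guard to (l s<= length) && (0 u< length) && (n s<= length). The following is a minimal sketch, not part of the patch, that brute-forces this condition on a hypothetical 5-bit integer model to see whether it ever holds while the original guard (j u< length) fails. The 5-bit width and all function names are assumptions made for illustration, and the loop shape is transcribed by hand from the test IR.

```cpp
#include <cassert>

// Hypothetical 5-bit model: signed values live in [-16, 15].
constexpr int kMin5 = -16;
constexpr int kMax5 = 15;

// Unsigned 5-bit comparison (icmp ult) on values stored as signed ints.
static bool ult5(int a, int b) { return (a & 31) < (b & 31); }

// Widened inner-loop condition, spelled as in the CHECK lines.
static bool wideCondNested(int n, int l, int length) {
  bool firstIterationCheckOuter = ult5(0, length); // icmp ult i32 0, %length
  bool limitCheckOuter = n <= length;              // icmp sle i32 %n, %length
  bool limitCheckInner = l <= length;              // icmp sle i32 %l, %length
  return limitCheckInner && (firstIterationCheckOuter && limitCheckOuter);
}

// Executes the loop nest from the test with the original, unwidened guard.
// Returns false as soon as the guard (j u< length) would fail.
static bool nestedGuardNeverFails(int n, int l, int length) {
  if (n <= 0)
    return true; // entry branches straight to exit
  int i = 0;
  do {
    if (!(l <= 0)) { // otherwise the outer loop skips the inner loop
      int j = i;     // the inner induction variable starts at %i
      do {
        if (!ult5(j, length))
          return false;
        ++j;
      } while (j < l); // icmp slt i32 %j.next, %l
    }
    ++i;
  } while (i < n); // icmp slt i32 %i.next, %n
  return true;
}

// Counts inputs where the widened condition holds but the guard fails.
static int countUnsoundNestedCases() {
  int bad = 0;
  for (int n = kMin5; n <= kMax5; ++n)
    for (int l = kMin5; l <= kMax5; ++l)
      for (int length = kMin5; length <= kMax5; ++length)
        if (wideCondNested(n, l, length) && !nestedGuardNeverFails(n, l, length))
          ++bad;
  return bad;
}
```

A count of zero from countUnsoundNestedCases() would mean that, within this small model, the widened condition never passes on an input where the original guard would have deoptimized. It says nothing about the opposite direction (inputs where the widened condition is more conservative than necessary).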

llvm/trunk/test/Transforms/LoopPredication/visited.ll

; RUN: opt -S -loop-predication < %s 2>&1 \| FileCheck %s		; RUN: opt -S -loop-predication < %s 2>&1 \| FileCheck %s
; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 \| FileCheck %s		; RUN: opt -S -passes='require<scalar-evolution>,loop(loop-predication)' < %s 2>&1 \| FileCheck %s

declare void @llvm.experimental.guard(i1, ...)		declare void @llvm.experimental.guard(i1, ...)

define i32 @test_visited(i32* %array, i32 %length, i32 %n, i32 %x) {		define i32 @test_visited(i32* %array, i32 %length, i32 %n, i32 %x) {
; CHECK-LABEL: @test_visited		; CHECK-LABEL: @test_visited
entry:		entry:
%tmp5 = icmp eq i32 %n, 0		%tmp5 = icmp eq i32 %n, 0
br i1 %tmp5, label %exit, label %loop.preheader		br i1 %tmp5, label %exit, label %loop.preheader

loop.preheader:		loop.preheader:
; CHECK: loop.preheader:		; CHECK: loop.preheader:
; CHECK: [[iteration_count:[^ ]+]] = add i32 %n, -1		; CHECK: [[first_iteration_check:[^ ]+]] = icmp ult i32 0, %length
; CHECK-NEXT: [[wide_cond:[^ ]+]] = icmp ult i32 [[iteration_count]], %length		; CHECK-NEXT: [[limit_check:[^ ]+]] = icmp ule i32 %n, %length
		; CHECK-NEXT: [[wide_cond:[^ ]+]] = and i1 [[first_iteration_check]], [[limit_check]]
; CHECK-NEXT: br label %loop		; CHECK-NEXT: br label %loop
br label %loop		br label %loop

loop:		loop:
; CHECK: loop:		; CHECK: loop:
; CHECK: %unrelated.cond = icmp eq i32 %x, %i		; CHECK: %unrelated.cond = icmp eq i32 %x, %i
; CHECK: [[guard_cond:[^ ]+]] = and i1 %unrelated.cond, [[wide_cond]]		; CHECK: [[guard_cond:[^ ]+]] = and i1 %unrelated.cond, [[wide_cond]]
; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 [[guard_cond]], i32 9) [ "deopt"() ]		; CHECK-NEXT: call void (i1, ...) @llvm.experimental.guard(i1 [[guard_cond]], i32 9) [ "deopt"() ]
%i.next = add nuw i32 %i, 1		%i.next = add nuw i32 %i, 1
%continue = icmp ult i32 %i.next, %n		%continue = icmp ult i32 %i.next, %n
br i1 %continue, label %loop, label %exit		br i1 %continue, label %loop, label %exit

exit:		exit:
%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]		%result = phi i32 [ 0, %entry ], [ %loop.acc.next, %loop ]
ret i32 %result		ret i32 %result
}		}
No newline at end of file		No newline at end of file
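
For reference, the new expectation in @test_visited replaces the single check (n - 1) u< length with the conjunction (0 u< length) && (n u<= length). The sketch below is an illustration only, assuming an 8-bit model of the loop and hypothetical helper names. It exhaustively checks that whenever the conjunction holds, the range check i u< length passes on every iteration of the loop, which starts at 0, steps by 1, and is entered only when n != 0. The unrelated %x == %i part of the guard is left out of the model.

```cpp
#include <cassert>
#include <cstdint>

// Widened condition from the new CHECK lines.
static bool wideCond(uint8_t n, uint8_t length) {
  bool firstIterationCheck = 0 < length; // icmp ult i32 0, %length
  bool limitCheck = n <= length;         // icmp ule i32 %n, %length
  return firstIterationCheck && limitCheck;
}

// Runs the loop with the original range check; assumes n != 0, matching
// the entry block that skips the loop when %n is zero.
static bool guardNeverFails(uint8_t n, uint8_t length) {
  uint8_t i = 0;
  do {
    if (!(i < length)) // original guard: i u< length
      return false;
    ++i;
  } while (i < n); // icmp ult i32 %i.next, %n
  return true;
}

// Counts (n, length) pairs where the widened condition holds but the
// original range check fails on some iteration.
static int countUnsoundCases() {
  int bad = 0;
  for (int n = 1; n <= 255; ++n)
    for (int length = 0; length <= 255; ++length)
      if (wideCond(static_cast<uint8_t>(n), static_cast<uint8_t>(length)) &&
          !guardNeverFails(static_cast<uint8_t>(n), static_cast<uint8_t>(length)))
        ++bad;
  return bad;
}
```

In this model countUnsoundCases() is expected to return 0, since every executed i satisfies i < n <= length. The exhaustive loop is cheap at 8 bits, which is why the sketch uses a narrow type rather than i32.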