This is an archive of the discontinued LLVM Phabricator instance.

Paths

- llvm/trunk/lib/Transforms/Scalar/LoopRerollPass.cpp
- llvm/trunk/test/Transforms/LoopReroll/basic.ll
- llvm/trunk/test/Transforms/LoopReroll/complex_reroll.ll
- llvm/trunk/test/Transforms/LoopReroll/indvar_with_ext.ll
- llvm/trunk/test/Transforms/LoopReroll/nonconst_lb.ll
- llvm/trunk/test/Transforms/LoopReroll/ptrindvar.ll
- llvm/trunk/test/Transforms/LoopReroll/reduction.ll

Differential D45191

[LoopReroll] Rewrite induction variable rewriting.
ClosedPublic

Authored by efriedma on Apr 2 2018, 4:33 PM.

Details

Reviewers

hfinkel
jmolloy
fhahn

Commits

rG203eaaf5badb: [LoopReroll] Rewrite induction variable rewriting.
rL335400: [LoopReroll] Rewrite induction variable rewriting.

Summary

This gets rid of a bunch of weird special cases: instead of trying to understand the structure of the induction variable, just use SCEV expansion for everything. In addition to being simpler, this fixes a bug where we would use the wrong stride in certain edge cases; see the new test pointer_bitcast_baseinst. (The bug might be a regression from D26529; not sure.)

The one bit I'm not quite sure about is the trip count handling, specifically the FIXME about overflow. In general, I think we need to widen the exit condition, but that's probably not profitable if the new type isn't legal, so we probably need a check somewhere. That said, I don't think this patch makes the existing problem any worse.

As a followup to this, a bunch of other IV-related code could be cleaned up and generalized, since the rewriting can handle more cases.
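The rewriting idea in the summary can be modeled outside LLVM. The following is a minimal sketch in plain C++, not the pass itself: `AddRec`, `runOriginal`, and `runRerolled` are hypothetical names, and `AddRec` only stands in for an affine SCEV add-recurrence {Start,+,Step}. It assumes the base IV steps by Scale times the distance between the base and its first root, which is the property the pass is expected to have validated before rewriting.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an affine add-recurrence {Start,+,Step}.
struct AddRec {
  int64_t Start;
  int64_t Step;
  int64_t at(uint64_t Iter) const {
    return Start + Step * static_cast<int64_t>(Iter);
  }
};

// Original (unrolled) loop: BackedgeTakenCount + 1 iterations, each of which
// uses the base value plus Scale - 1 roots spaced RootDelta apart.
std::vector<int64_t> runOriginal(AddRec Base, int64_t RootDelta,
                                 unsigned Scale, uint64_t BackedgeTakenCount) {
  std::vector<int64_t> Visited;
  for (uint64_t It = 0; It <= BackedgeTakenCount; ++It)
    for (unsigned K = 0; K < Scale; ++K)
      Visited.push_back(Base.at(It) + static_cast<int64_t>(K) * RootDelta);
  return Visited;
}

// Rerolled loop: the base IV becomes {Start,+,RootDelta}, and a fresh counter
// {0,+,1} is compared for equality against (BackedgeTakenCount + 1) * Scale - 1.
std::vector<int64_t> runRerolled(AddRec Base, int64_t RootDelta,
                                 unsigned Scale, uint64_t BackedgeTakenCount) {
  // Assumption of this sketch: the stride was already validated.
  assert(Base.Step == static_cast<int64_t>(Scale) * RootDelta);
  AddRec NewIV{Base.Start, RootDelta};
  uint64_t ScaledBECount = (BackedgeTakenCount + 1) * Scale - 1;
  std::vector<int64_t> Visited;
  for (uint64_t Counter = 0;; ++Counter) {
    Visited.push_back(NewIV.at(Counter));
    if (Counter == ScaledBECount) // models the new "exitcond"
      break;
  }
  return Visited;
}
```

In this model, `runOriginal({0, 3}, 1, 3, 499)` and `runRerolled({0, 3}, 1, 3, 499)` visit the same 1500 values in the same order. The sketch ignores the overflow question raised in the second paragraph of the summary.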

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma created this revision. Apr 2 2018, 4:33 PM

fhahn added a subscriber: fhahn. Apr 2 2018, 10:07 PM

Ping.

Ping

Ping.

Herald added a subscriber: javed.absar. Apr 23 2018, 4:55 PM

I had a high-level look and left some minor comments inline. With respect to the overflow, the original code seems to have the same problem, so this patch does not make things worse.

IIUC the induction variables must be wide enough to represent the scaled number of iterations in the original code, but an overflow can happen because we use EQ with the scaled trip count and the new induction variables start at 0 (e.g. if we have something like for (int64_t i = -100; i < INT64_MAX; i += 10)). I am probably missing something, but maybe we could avoid the overflow by rewriting the induction variables and exit conditions closer to the original range of the induction variables?

lib/Transforms/Scalar/LoopRerollPass.cpp
397 ↗	(On Diff #140701)	Stale doxygen comment. Could you document LIBETC?
1409 ↗	(On Diff #140701)	Typo, Copute -> Compute
test/Transforms/LoopReroll/basic.ll
86 ↗	(On Diff #140701)	I think we should also check the instruction producing `%0`, which should truncate `%indvar` in this and other changed tests.

javed.absar added inline comments. Apr 24 2018, 9:30 AM

lib/Transforms/Scalar/LoopRerollPass.cpp
1415 ↗	(On Diff #140701)	Should probably use dyn_cast if we can't be 100% sure it will be a SCEVAddRecExpr.
1618 ↗	(On Diff #140701)	Can LIBETC be renamed to something more meaningful? Is LIBETC = Loop ??? back edge taken count?

I am probably missing something, but maybe we could avoid the overflow by doing the re-writing of the induction variables/exit conditions closer to the original range for the induction variables?

The fundamental problem is that the rerolled loop could have a trip count that doesn't fit into a single register, even if the original loop's trip count does fit (e.g. on a 32-bit target, the original loop has a trip count 2^31, we reroll by a factor of four, and the new loop has a trip count of 2^33). So handling overflow correctly in general requires some extra code in the rerolled loop.

In some cases, we could probably prove the trip count can't actually overflow. And in the cases where it could overflow, there are a few ways to handle it: we could emit an induction variable with an illegal type, or we could emit a nested loop. I'm not sure what the best approach is.
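The 32-bit example above can be checked with ordinary fixed-width arithmetic. This is a minimal sketch, not code from the patch: `scaledBECount32`, `scaledBECount64`, and `scaledCountWraps` are hypothetical helpers that mirror the expression `(BackedgeTakenCount + 1) * Scale - 1` behind the FIXME, evaluated once in 32 bits and once in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// (BackedgeTakenCount + 1) * Scale - 1, computed in the 32-bit width of the
// original count; unsigned arithmetic wraps modulo 2^32.
uint32_t scaledBECount32(uint32_t BackedgeTakenCount, uint32_t Scale) {
  uint32_t TripCount = BackedgeTakenCount + 1u;
  return static_cast<uint32_t>(TripCount * Scale) - 1u;
}

// The same expression computed in a wider type, so it cannot wrap for any
// 32-bit inputs.
uint64_t scaledBECount64(uint32_t BackedgeTakenCount, uint32_t Scale) {
  uint64_t TripCount = static_cast<uint64_t>(BackedgeTakenCount) + 1;
  return TripCount * Scale - 1;
}

// True when the narrow computation loses information.
bool scaledCountWraps(uint32_t BackedgeTakenCount, uint32_t Scale) {
  return scaledBECount64(BackedgeTakenCount, Scale) !=
         scaledBECount32(BackedgeTakenCount, Scale);
}
```

With a trip count of 2^31 (backedge-taken count 0x7FFFFFFF) and a factor of four, the 32-bit result is 0xFFFFFFFF while the exact value is 2^33 - 1, so in this model an equality-based exit test would leave the loop after 2^32 iterations instead of 2^33. A small case such as a backedge-taken count of 499 and a factor of three gives 1499 in both widths.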

lib/Transforms/Scalar/LoopRerollPass.cpp
1415 ↗	(On Diff #140701)	It will always be a SCEVAddRecExpr; we check earlier. (This is a fundamental part of proving that the rerolled loop is equivalent to the original loop.)

In D45191#1077281, @efriedma wrote:

I am probably missing something, but maybe we could avoid the overflow by doing the re-writing of the induction variables/exit conditions closer to the original range for the induction variables?

The fundamental problem is that the rerolled loop could have a trip count that doesn't fit into a single register, even if the original loop's trip count does fit (e.g. on a 32-bit target, the original loop has a trip count 2^31, we reroll by a factor of four, and the new loop has a trip count of 2^33). So handling overflow correctly in general requires some extra code in the rerolled loop.

Ah okay. After a very brief look I thought we only support loops like for (i = start; i < end; i += scale), for which the trip count should be something like (end - start) / scale, and the original i should fit into (start, end]. In that case, the induction variable in the re-rolled loop could fit into (start, end] too.
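For reference, the trip count of that loop shape can be written out directly. This is a minimal sketch under stated assumptions: `tripCount` is a hypothetical helper, the step is positive, the loop is top-tested as in the C form quoted above, and the induction variable itself does not wrap. The division is rounded up because a partial final step still executes the body once.

```cpp
#include <cassert>
#include <cstdint>

// Number of body executions of: for (i = Start; i < End; i += Step).
// Sketch only: assumes Step > 0 and that i never overflows.
uint64_t tripCount(int64_t Start, int64_t End, int64_t Step) {
  assert(Step > 0 && "sketch only handles positive steps");
  if (Start >= End)
    return 0;
  // Subtract in unsigned arithmetic so that End - Start cannot overflow even
  // when Start is negative and End is close to INT64_MAX.
  uint64_t Distance = static_cast<uint64_t>(End) - static_cast<uint64_t>(Start);
  uint64_t S = static_cast<uint64_t>(Step);
  return Distance / S + (Distance % S != 0 ? 1 : 0);
}
```

For example, `tripCount(0, 1500, 3)` is 500 and `tripCount(0, 10, 3)` is 4. The backedge-taken count that the pass works with is one less than the trip count whenever the body runs at least once.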

Address review comments

Thanks Eli, LGTM. As you said, I think the patch does not make things worse with respect to the FIXME you added, and it nicely simplifies the induction variable rewriting. It is probably best to wait a few days before committing, though, in case James or @hfinkel have any additional thoughts.

This revision is now accepted and ready to land. Apr 30 2018, 1:36 AM

Closed by commit rL335400: [LoopReroll] Rewrite induction variable rewriting. (authored by efriedma). Jun 22 2018, 4:03 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Scalar/LoopRerollPass.cpp	236 lines
llvm/trunk/test/Transforms/LoopReroll/ (6 files)	87, 41, 18, 34, 4, and 12 lines

Diff 152552

llvm/trunk/lib/Transforms/Scalar/LoopRerollPass.cpp

... (64 lines not shown) ...

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "loop-reroll"		#define DEBUG_TYPE "loop-reroll"

STATISTIC(NumRerolledLoops, "Number of rerolled loops");		STATISTIC(NumRerolledLoops, "Number of rerolled loops");

static cl::opt<unsigned>		static cl::opt<unsigned>
MaxInc("max-reroll-increment", cl::init(2048), cl::Hidden,
cl::desc("The maximum increment for loop rerolling"));

static cl::opt<unsigned>
NumToleratedFailedMatches("reroll-num-tolerated-failed-matches", cl::init(400),		NumToleratedFailedMatches("reroll-num-tolerated-failed-matches", cl::init(400),
cl::Hidden,		cl::Hidden,
cl::desc("The maximum number of failures to tolerate"		cl::desc("The maximum number of failures to tolerate"
" during fuzzy matching. (default: 400)"));		" during fuzzy matching. (default: 400)"));

// This loop re-rolling transformation aims to transform loops like this:		// This loop re-rolling transformation aims to transform loops like this:
//		//
// int foo(int a);		// int foo(int a);
... (308 lines not shown) ...	struct DAGRootTracker {
/// Stage 1: Find all the DAG roots for the induction variable.		/// Stage 1: Find all the DAG roots for the induction variable.
bool findRoots();		bool findRoots();

/// Stage 2: Validate if the found roots are valid.		/// Stage 2: Validate if the found roots are valid.
bool validate(ReductionTracker &Reductions);		bool validate(ReductionTracker &Reductions);

/// Stage 3: Assuming validate() returned true, perform the		/// Stage 3: Assuming validate() returned true, perform the
/// replacement.		/// replacement.
/// @param IterCount The maximum iteration count of L.		/// @param BackedgeTakenCount The backedge-taken count of L.
void replace(const SCEV *IterCount);		void replace(const SCEV *BackedgeTakenCount);

protected:		protected:
using UsesTy = MapVector<Instruction *, BitVector>;		using UsesTy = MapVector<Instruction *, BitVector>;

void findRootsRecursive(Instruction *IVU,		void findRootsRecursive(Instruction *IVU,
SmallInstructionSet SubsumedInsts);		SmallInstructionSet SubsumedInsts);
bool findRootsBase(Instruction *IVU, SmallInstructionSet SubsumedInsts);		bool findRootsBase(Instruction *IVU, SmallInstructionSet SubsumedInsts);
bool collectPossibleRoots(Instruction *Base,		bool collectPossibleRoots(Instruction *Base,
... (13 lines not shown) ...	protected:
UsesTy::iterator nextInstr(int Val, UsesTy &In,		UsesTy::iterator nextInstr(int Val, UsesTy &In,
const SmallInstructionSet &Exclude,		const SmallInstructionSet &Exclude,
UsesTy::iterator *StartI=nullptr);		UsesTy::iterator *StartI=nullptr);
bool isBaseInst(Instruction *I);		bool isBaseInst(Instruction *I);
bool isRootInst(Instruction *I);		bool isRootInst(Instruction *I);
bool instrDependsOn(Instruction *I,		bool instrDependsOn(Instruction *I,
UsesTy::iterator Start,		UsesTy::iterator Start,
UsesTy::iterator End);		UsesTy::iterator End);
void replaceIV(Instruction *Inst, Instruction *IV, const SCEV *IterCount);		void replaceIV(DAGRootSet &DRS, const SCEV *Start, const SCEV *IncrExpr);
void updateNonLoopCtrlIncr();

LoopReroll *Parent;		LoopReroll *Parent;

// Members of Parent, replicated here for brevity.		// Members of Parent, replicated here for brevity.
Loop *L;		Loop *L;
ScalarEvolution *SE;		ScalarEvolution *SE;
AliasAnalysis *AA;		AliasAnalysis *AA;
TargetLibraryInfo *TLI;		TargetLibraryInfo *TLI;
... (36 lines not shown) ...	bool isCompareUsedByBranch(Instruction *I) {
return false;		return false;
return I->hasOneUse() && TI->getOperand(0) == I;		return I->hasOneUse() && TI->getOperand(0) == I;
};		};

bool isLoopControlIV(Loop *L, Instruction *IV);		bool isLoopControlIV(Loop *L, Instruction *IV);
void collectPossibleIVs(Loop *L, SmallInstructionVector &PossibleIVs);		void collectPossibleIVs(Loop *L, SmallInstructionVector &PossibleIVs);
void collectPossibleReductions(Loop *L,		void collectPossibleReductions(Loop *L,
ReductionTracker &Reductions);		ReductionTracker &Reductions);
bool reroll(Instruction *IV, Loop *L, BasicBlock *Header, const SCEV *IterCount,		bool reroll(Instruction *IV, Loop *L, BasicBlock *Header,
ReductionTracker &Reductions);		const SCEV *BackedgeTakenCount, ReductionTracker &Reductions);
};		};

} // end anonymous namespace		} // end anonymous namespace

char LoopReroll::ID = 0;		char LoopReroll::ID = 0;

INITIALIZE_PASS_BEGIN(LoopReroll, "loop-reroll", "Reroll loops", false, false)		INITIALIZE_PASS_BEGIN(LoopReroll, "loop-reroll", "Reroll loops", false, false)
INITIALIZE_PASS_DEPENDENCY(LoopPass)		INITIALIZE_PASS_DEPENDENCY(LoopPass)
... (10 lines not shown) ...
static bool hasUsesOutsideLoop(Instruction *I, Loop *L) {		static bool hasUsesOutsideLoop(Instruction *I, Loop *L) {
for (User *U : I->users()) {		for (User *U : I->users()) {
if (!L->contains(cast<Instruction>(U)))		if (!L->contains(cast<Instruction>(U)))
return true;		return true;
}		}
return false;		return false;
}		}

static const SCEVConstant *getIncrmentFactorSCEV(ScalarEvolution *SE,
const SCEV *SCEVExpr,
Instruction &IV) {
const SCEVMulExpr *MulSCEV = dyn_cast<SCEVMulExpr>(SCEVExpr);

// If StepRecurrence of a SCEVExpr is a constant (c1 * c2, c2 = sizeof(ptr)),
// Return c1.
if (!MulSCEV && IV.getType()->isPointerTy())
if (const SCEVConstant *IncSCEV = dyn_cast<SCEVConstant>(SCEVExpr)) {
const PointerType *PTy = cast<PointerType>(IV.getType());
Type *ElTy = PTy->getElementType();
const SCEV *SizeOfExpr =
SE->getSizeOfExpr(SE->getEffectiveSCEVType(IV.getType()), ElTy);
if (IncSCEV->getValue()->getValue().isNegative()) {
const SCEV *NewSCEV =
SE->getUDivExpr(SE->getNegativeSCEV(SCEVExpr), SizeOfExpr);
return dyn_cast<SCEVConstant>(SE->getNegativeSCEV(NewSCEV));
} else {
return dyn_cast<SCEVConstant>(SE->getUDivExpr(SCEVExpr, SizeOfExpr));
}
}

if (!MulSCEV)
return nullptr;

// If StepRecurrence of a SCEVExpr is a c * sizeof(x), where c is constant,
// Return c.
const SCEVConstant *CIncSCEV = nullptr;
for (const SCEV *Operand : MulSCEV->operands()) {
if (const SCEVConstant *Constant = dyn_cast<SCEVConstant>(Operand)) {
CIncSCEV = Constant;
} else if (const SCEVUnknown *Unknown = dyn_cast<SCEVUnknown>(Operand)) {
Type *AllocTy;
if (!Unknown->isSizeOf(AllocTy))
break;
} else {
return nullptr;
}
}
return CIncSCEV;
}

// Check if an IV is only used to control the loop. There are two cases:		// Check if an IV is only used to control the loop. There are two cases:
// 1. It only has one use which is loop increment, and the increment is only		// 1. It only has one use which is loop increment, and the increment is only
// used by comparison and the PHI (could has sext with nsw in between), and the		// used by comparison and the PHI (could has sext with nsw in between), and the
// comparison is only used by branch.		// comparison is only used by branch.
// 2. It is used by loop increment and the comparison, the loop increment is		// 2. It is used by loop increment and the comparison, the loop increment is
// only used by the PHI, and the comparison is used only by the branch.		// only used by the PHI, and the comparison is used only by the branch.
bool LoopReroll::isLoopControlIV(Loop *L, Instruction *IV) {		bool LoopReroll::isLoopControlIV(Loop *L, Instruction *IV) {
unsigned IVUses = IV->getNumUses();		unsigned IVUses = IV->getNumUses();
... (64 lines not shown) ...	if (!I->getType()->isIntegerTy() && !I->getType()->isPointerTy())
continue;		continue;

if (const SCEVAddRecExpr *PHISCEV =		if (const SCEVAddRecExpr *PHISCEV =
dyn_cast<SCEVAddRecExpr>(SE->getSCEV(&*I))) {		dyn_cast<SCEVAddRecExpr>(SE->getSCEV(&*I))) {
if (PHISCEV->getLoop() != L)		if (PHISCEV->getLoop() != L)
continue;		continue;
if (!PHISCEV->isAffine())		if (!PHISCEV->isAffine())
continue;		continue;
const SCEVConstant *IncSCEV = nullptr;		auto IncSCEV = dyn_cast<SCEVConstant>(PHISCEV->getStepRecurrence(*SE));
if (I->getType()->isPointerTy())
IncSCEV =
getIncrmentFactorSCEV(SE, PHISCEV->getStepRecurrence(*SE), *I);
else
IncSCEV = dyn_cast<SCEVConstant>(PHISCEV->getStepRecurrence(*SE));
if (IncSCEV) {		if (IncSCEV) {
const APInt &AInt = IncSCEV->getValue()->getValue().abs();
if (IncSCEV->getValue()->isZero() || AInt.uge(MaxInc))
continue;
IVToIncMap[&*I] = IncSCEV->getValue()->getSExtValue();		IVToIncMap[&*I] = IncSCEV->getValue()->getSExtValue();
LLVM_DEBUG(dbgs() << "LRR: Possible IV: " << *I << " = " << *PHISCEV		LLVM_DEBUG(dbgs() << "LRR: Possible IV: " << *I << " = " << *PHISCEV
<< "\n");		<< "\n");

if (isLoopControlIV(L, &*I)) {		if (isLoopControlIV(L, &*I)) {
assert(!LoopControlIV && "Found two loop control only IV");		assert(!LoopControlIV && "Found two loop control only IV");
LoopControlIV = &(*I);		LoopControlIV = &(*I);
LLVM_DEBUG(dbgs() << "LRR: Possible loop control only IV: " << *I		LLVM_DEBUG(dbgs() << "LRR: Possible loop control only IV: " << *I
... (804 lines not shown) ...	bool LoopReroll::DAGRootTracker::validate(ReductionTracker &Reductions) {
}		}

LLVM_DEBUG(dbgs() << "LRR: Matched all iteration increments for " << *IV		LLVM_DEBUG(dbgs() << "LRR: Matched all iteration increments for " << *IV
<< "\n");		<< "\n");

return true;		return true;
}		}

void LoopReroll::DAGRootTracker::replace(const SCEV *IterCount) {		void LoopReroll::DAGRootTracker::replace(const SCEV *BackedgeTakenCount) {
BasicBlock *Header = L->getHeader();		BasicBlock *Header = L->getHeader();

		// Compute the start and increment for each BaseInst before we start erasing
		// instructions.
		SmallVector<const SCEV *, 8> StartExprs;
		SmallVector<const SCEV *, 8> IncrExprs;
		for (auto &DRS : RootSets) {
		const SCEVAddRecExpr *IVSCEV =
		cast<SCEVAddRecExpr>(SE->getSCEV(DRS.BaseInst));
		StartExprs.push_back(IVSCEV->getStart());
		IncrExprs.push_back(SE->getMinusSCEV(SE->getSCEV(DRS.Roots[0]), IVSCEV));
		}

// Remove instructions associated with non-base iterations.		// Remove instructions associated with non-base iterations.
for (BasicBlock::reverse_iterator J = Header->rbegin(), JE = Header->rend();		for (BasicBlock::reverse_iterator J = Header->rbegin(), JE = Header->rend();
J != JE;) {		J != JE;) {
unsigned I = Uses[&*J].find_first();		unsigned I = Uses[&*J].find_first();
if (I > 0 && I < IL_All) {		if (I > 0 && I < IL_All) {
LLVM_DEBUG(dbgs() << "LRR: removing: " << *J << "\n");		LLVM_DEBUG(dbgs() << "LRR: removing: " << *J << "\n");
J++->eraseFromParent();		J++->eraseFromParent();
continue;		continue;
}		}

++J;		++J;
}		}

bool HasTwoIVs = LoopControlIV && LoopControlIV != IV;		// Rewrite each BaseInst using SCEV.
		for (size_t i = 0, e = RootSets.size(); i != e; ++i)
if (HasTwoIVs) {
updateNonLoopCtrlIncr();
replaceIV(LoopControlIV, LoopControlIV, IterCount);
} else
// We need to create a new induction variable for each different BaseInst.
for (auto &DRS : RootSets)
// Insert the new induction variable.		// Insert the new induction variable.
replaceIV(DRS.BaseInst, IV, IterCount);		replaceIV(RootSets[i], StartExprs[i], IncrExprs[i]);

SimplifyInstructionsInBlock(Header, TLI);		{ // Limit the lifetime of SCEVExpander.
DeleteDeadPHIs(Header, TLI);		BranchInst *BI = cast<BranchInst>(Header->getTerminator());
}		const DataLayout &DL = Header->getModule()->getDataLayout();
		SCEVExpander Expander(*SE, DL, "reroll");
		auto Zero = SE->getZero(BackedgeTakenCount->getType());
		auto One = SE->getOne(BackedgeTakenCount->getType());
		auto NewIVSCEV = SE->getAddRecExpr(Zero, One, L, SCEV::FlagAnyWrap);
		Value *NewIV =
		Expander.expandCodeFor(NewIVSCEV, BackedgeTakenCount->getType(),
		Header->getFirstNonPHIOrDbg());
		// FIXME: This arithmetic can overflow.
		auto TripCount = SE->getAddExpr(BackedgeTakenCount, One);
		auto ScaledTripCount = SE->getMulExpr(
		TripCount, SE->getConstant(BackedgeTakenCount->getType(), Scale));
		auto ScaledBECount = SE->getMinusSCEV(ScaledTripCount, One);
		Value *TakenCount =
		Expander.expandCodeFor(ScaledBECount, BackedgeTakenCount->getType(),
		Header->getFirstNonPHIOrDbg());
		Value *Cond =
		new ICmpInst(BI, CmpInst::ICMP_EQ, NewIV, TakenCount, "exitcond");
		BI->setCondition(Cond);

// For non-loop-control IVs, we only need to update the last increment		if (BI->getSuccessor(1) != Header)
// with right amount, then we are done.		BI->swapSuccessors();
void LoopReroll::DAGRootTracker::updateNonLoopCtrlIncr() {
const SCEV *NewInc = nullptr;
for (auto *LoopInc : LoopIncs) {
GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(LoopInc);
const SCEVConstant *COp = nullptr;
if (GEP && LoopInc->getOperand(0)->getType()->isPointerTy()) {
COp = dyn_cast<SCEVConstant>(SE->getSCEV(LoopInc->getOperand(1)));
} else {
COp = dyn_cast<SCEVConstant>(SE->getSCEV(LoopInc->getOperand(0)));
if (!COp)
COp = dyn_cast<SCEVConstant>(SE->getSCEV(LoopInc->getOperand(1)));
}		}

assert(COp && "Didn't find constant operand of LoopInc!\n");		SimplifyInstructionsInBlock(Header, TLI);
		DeleteDeadPHIs(Header, TLI);
const APInt &AInt = COp->getValue()->getValue();
const SCEV *ScaleSCEV = SE->getConstant(COp->getType(), Scale);
if (AInt.isNegative()) {
NewInc = SE->getNegativeSCEV(COp);
NewInc = SE->getUDivExpr(NewInc, ScaleSCEV);
NewInc = SE->getNegativeSCEV(NewInc);
} else
NewInc = SE->getUDivExpr(COp, ScaleSCEV);

LoopInc->setOperand(1, dyn_cast<SCEVConstant>(NewInc)->getValue());
}
}		}

void LoopReroll::DAGRootTracker::replaceIV(Instruction *Inst,		void LoopReroll::DAGRootTracker::replaceIV(DAGRootSet &DRS,
Instruction *InstIV,		const SCEV *Start,
const SCEV *IterCount) {		const SCEV *IncrExpr) {
BasicBlock *Header = L->getHeader();		BasicBlock *Header = L->getHeader();
int64_t Inc = IVToIncMap[InstIV];		Instruction *Inst = DRS.BaseInst;
bool NeedNewIV = InstIV == LoopControlIV;
bool Negative = !NeedNewIV && Inc < 0;

const SCEVAddRecExpr *RealIVSCEV = cast<SCEVAddRecExpr>(SE->getSCEV(Inst));
const SCEV *Start = RealIVSCEV->getStart();

if (NeedNewIV)
Start = SE->getConstant(Start->getType(), 0);

const SCEV *SizeOfExpr = nullptr;
const SCEV *IncrExpr =
SE->getConstant(RealIVSCEV->getType(), Negative ? -1 : 1);
if (auto *PTy = dyn_cast<PointerType>(Inst->getType())) {
Type *ElTy = PTy->getElementType();
SizeOfExpr =
SE->getSizeOfExpr(SE->getEffectiveSCEVType(Inst->getType()), ElTy);
IncrExpr = SE->getMulExpr(IncrExpr, SizeOfExpr);
}
const SCEV *NewIVSCEV =		const SCEV *NewIVSCEV =
SE->getAddRecExpr(Start, IncrExpr, L, SCEV::FlagAnyWrap);		SE->getAddRecExpr(Start, IncrExpr, L, SCEV::FlagAnyWrap);

{ // Limit the lifetime of SCEVExpander.		{ // Limit the lifetime of SCEVExpander.
const DataLayout &DL = Header->getModule()->getDataLayout();		const DataLayout &DL = Header->getModule()->getDataLayout();
SCEVExpander Expander(*SE, DL, "reroll");		SCEVExpander Expander(*SE, DL, "reroll");
Value *NewIV = Expander.expandCodeFor(NewIVSCEV, Inst->getType(),		Value *NewIV = Expander.expandCodeFor(NewIVSCEV, Inst->getType(),
Header->getFirstNonPHIOrDbg());		Header->getFirstNonPHIOrDbg());

for (auto &KV : Uses)		for (auto &KV : Uses)
if (KV.second.find_first() == 0)		if (KV.second.find_first() == 0)
KV.first->replaceUsesOfWith(Inst, NewIV);		KV.first->replaceUsesOfWith(Inst, NewIV);

if (BranchInst *BI = dyn_cast<BranchInst>(Header->getTerminator())) {
// FIXME: Why do we need this check?
if (Uses[BI].find_first() == IL_All) {
const SCEV *ICSCEV = RealIVSCEV->evaluateAtIteration(IterCount, *SE);

if (NeedNewIV)
ICSCEV = SE->getMulExpr(IterCount,
SE->getConstant(IterCount->getType(), Scale));

// Iteration count SCEV minus or plus 1
const SCEV *MinusPlus1SCEV =
SE->getConstant(ICSCEV->getType(), Negative ? -1 : 1);
if (Inst->getType()->isPointerTy()) {
assert(SizeOfExpr && "SizeOfExpr is not initialized");
MinusPlus1SCEV = SE->getMulExpr(MinusPlus1SCEV, SizeOfExpr);
}

const SCEV *ICMinusPlus1SCEV = SE->getMinusSCEV(ICSCEV, MinusPlus1SCEV);
// Iteration count minus 1
Instruction *InsertPtr = nullptr;
if (isa<SCEVConstant>(ICMinusPlus1SCEV)) {
InsertPtr = BI;
} else {
BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader)
Preheader = InsertPreheaderForLoop(L, DT, LI, PreserveLCSSA);
InsertPtr = Preheader->getTerminator();
}

if (!isa<PointerType>(NewIV->getType()) && NeedNewIV &&
(SE->getTypeSizeInBits(NewIV->getType()) <
SE->getTypeSizeInBits(ICMinusPlus1SCEV->getType()))) {
IRBuilder<> Builder(BI);
Builder.SetCurrentDebugLocation(BI->getDebugLoc());
NewIV = Builder.CreateSExt(NewIV, ICMinusPlus1SCEV->getType());
}
Value *ICMinusPlus1 = Expander.expandCodeFor(
ICMinusPlus1SCEV, NewIV->getType(), InsertPtr);

Value *Cond =
new ICmpInst(BI, CmpInst::ICMP_EQ, NewIV, ICMinusPlus1, "exitcond");
BI->setCondition(Cond);

if (BI->getSuccessor(1) != Header)
BI->swapSuccessors();
}
}
}		}
}		}

// Validate the selected reductions. All iterations must have an isomorphic		// Validate the selected reductions. All iterations must have an isomorphic
// part of the reduction chain and, for non-associative reductions, the chain		// part of the reduction chain and, for non-associative reductions, the chain
// entries must appear in order.		// entries must appear in order.
bool LoopReroll::ReductionTracker::validateSelected() {		bool LoopReroll::ReductionTracker::validateSelected() {
// For a non-associative reduction, the chain entries must appear in order.		// For a non-associative reduction, the chain entries must appear in order.
... (100 lines not shown) ...
// cannot reorder those side-effect-producing instructions, and rerolling		// cannot reorder those side-effect-producing instructions, and rerolling
// fails.		// fails.
//		//
// Finally, we make sure that all loop instructions are either loop increment		// Finally, we make sure that all loop instructions are either loop increment
// roots, belong to simple latch code, parts of validated reductions, part of		// roots, belong to simple latch code, parts of validated reductions, part of
// f(%iv) or part of some f(%iv.i). If all of that is true (and all reductions		// f(%iv) or part of some f(%iv.i). If all of that is true (and all reductions
// have been validated), then we reroll the loop.		// have been validated), then we reroll the loop.
bool LoopReroll::reroll(Instruction *IV, Loop *L, BasicBlock *Header,		bool LoopReroll::reroll(Instruction *IV, Loop *L, BasicBlock *Header,
const SCEV *IterCount,		const SCEV *BackedgeTakenCount,
ReductionTracker &Reductions) {		ReductionTracker &Reductions) {
DAGRootTracker DAGRoots(this, L, IV, SE, AA, TLI, DT, LI, PreserveLCSSA,		DAGRootTracker DAGRoots(this, L, IV, SE, AA, TLI, DT, LI, PreserveLCSSA,
IVToIncMap, LoopControlIV);		IVToIncMap, LoopControlIV);

if (!DAGRoots.findRoots())		if (!DAGRoots.findRoots())
return false;		return false;
LLVM_DEBUG(dbgs() << "LRR: Found all root induction increments for: " << *IV		LLVM_DEBUG(dbgs() << "LRR: Found all root induction increments for: " << *IV
<< "\n");		<< "\n");

if (!DAGRoots.validate(Reductions))		if (!DAGRoots.validate(Reductions))
return false;		return false;
if (!Reductions.validateSelected())		if (!Reductions.validateSelected())
return false;		return false;
// At this point, we've validated the rerolling, and we're committed to		// At this point, we've validated the rerolling, and we're committed to
// making changes!		// making changes!

Reductions.replaceSelected();		Reductions.replaceSelected();
DAGRoots.replace(IterCount);		DAGRoots.replace(BackedgeTakenCount);

++NumRerolledLoops;		++NumRerolledLoops;
return true;		return true;
}		}

bool LoopReroll::runOnLoop(Loop *L, LPPassManager &LPM) {		bool LoopReroll::runOnLoop(Loop *L, LPPassManager &LPM) {
if (skipLoop(L))		if (skipLoop(L))
return false;		return false;
... (12 lines not shown) ...	bool LoopReroll::runOnLoop(Loop *L, LPPassManager &LPM) {

// For now, we'll handle only single BB loops.		// For now, we'll handle only single BB loops.
if (L->getNumBlocks() > 1)		if (L->getNumBlocks() > 1)
return false;		return false;

if (!SE->hasLoopInvariantBackedgeTakenCount(L))		if (!SE->hasLoopInvariantBackedgeTakenCount(L))
return false;		return false;

const SCEV *LIBETC = SE->getBackedgeTakenCount(L);		const SCEV *BackedgeTakenCount = SE->getBackedgeTakenCount(L);
const SCEV *IterCount = SE->getAddExpr(LIBETC, SE->getOne(LIBETC->getType()));
LLVM_DEBUG(dbgs() << "\n Before Reroll:\n" << *(L->getHeader()) << "\n");		LLVM_DEBUG(dbgs() << "\n Before Reroll:\n" << *(L->getHeader()) << "\n");
LLVM_DEBUG(dbgs() << "LRR: iteration count = " << *IterCount << "\n");		LLVM_DEBUG(dbgs() << "LRR: backedge-taken count = " << *BackedgeTakenCount
		<< "\n");

// First, we need to find the induction variable with respect to which we can		// First, we need to find the induction variable with respect to which we can
// reroll (there may be several possible options).		// reroll (there may be several possible options).
SmallInstructionVector PossibleIVs;		SmallInstructionVector PossibleIVs;
IVToIncMap.clear();		IVToIncMap.clear();
LoopControlIV = nullptr;		LoopControlIV = nullptr;
collectPossibleIVs(L, PossibleIVs);		collectPossibleIVs(L, PossibleIVs);

if (PossibleIVs.empty()) {		if (PossibleIVs.empty()) {
LLVM_DEBUG(dbgs() << "LRR: No possible IVs found\n");		LLVM_DEBUG(dbgs() << "LRR: No possible IVs found\n");
return false;		return false;
}		}

ReductionTracker Reductions;		ReductionTracker Reductions;
collectPossibleReductions(L, Reductions);		collectPossibleReductions(L, Reductions);
bool Changed = false;		bool Changed = false;

// For each possible IV, collect the associated possible set of 'root' nodes		// For each possible IV, collect the associated possible set of 'root' nodes
// (i+1, i+2, etc.).		// (i+1, i+2, etc.).
for (Instruction *PossibleIV : PossibleIVs)		for (Instruction *PossibleIV : PossibleIVs)
if (reroll(PossibleIV, L, Header, IterCount, Reductions)) {		if (reroll(PossibleIV, L, Header, BackedgeTakenCount, Reductions)) {
Changed = true;		Changed = true;
break;		break;
}		}
LLVM_DEBUG(dbgs() << "\n After Reroll:\n" << *(L->getHeader()) << "\n");		LLVM_DEBUG(dbgs() << "\n After Reroll:\n" << *(L->getHeader()) << "\n");

// Trip count of L has changed so SE must be re-evaluated.		// Trip count of L has changed so SE must be re-evaluated.
if (Changed)		if (Changed)
SE->forgetLoop(L);		SE->forgetLoop(L);

return Changed;		return Changed;
}		}

llvm/trunk/test/Transforms/LoopReroll/basic.ll

... (73 lines not shown) ...	for.body: ; preds = %entry, %for.body
%2 = trunc i64 %indvars.iv.next to i32		%2 = trunc i64 %indvars.iv.next to i32
%cmp = icmp slt i32 %2, 1500		%cmp = icmp slt i32 %2, 1500
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

; CHECK-LABEL: @hi1		; CHECK-LABEL: @hi1

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]		; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]
		; CHECK: %0 = trunc i64 %indvar to i32
; CHECK: %call = tail call i32 @foo(i32 0) #1		; CHECK: %call = tail call i32 @foo(i32 0) #1
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvar		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvar
; CHECK: store i32 %call, i32* %arrayidx, align 4		; CHECK: store i32 %call, i32* %arrayidx, align 4
; CHECK: %indvar.next = add i64 %indvar, 1		; CHECK: %indvar.next = add i64 %indvar, 1
; CHECK: %exitcond = icmp eq i64 %indvar, 1499		; CHECK: %exitcond = icmp eq i32 %0, 1499
; CHECK: br i1 %exitcond, label %for.end, label %for.body		; CHECK: br i1 %exitcond, label %for.end, label %for.body

; CHECK: ret		; CHECK: ret

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

... (105 lines not shown) ...	for.body: ; preds = %entry, %for.body
%14 = trunc i64 %indvars.iv.next to i32		%14 = trunc i64 %indvars.iv.next to i32
%cmp = icmp slt i32 %14, 3200		%cmp = icmp slt i32 %14, 3200
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

; CHECK-LABEL: @goo		; CHECK-LABEL: @goo

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]		; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]
		; CHECK: %0 = trunc i64 %indvar to i32
; CHECK: %arrayidx = getelementptr inbounds float, float* %b, i64 %indvar		; CHECK: %arrayidx = getelementptr inbounds float, float* %b, i64 %indvar
; CHECK: %0 = load float, float* %arrayidx, align 4		; CHECK: %1 = load float, float* %arrayidx, align 4
; CHECK: %mul = fmul float %0, %alpha		; CHECK: %mul = fmul float %1, %alpha
; CHECK: %arrayidx2 = getelementptr inbounds float, float* %a, i64 %indvar		; CHECK: %arrayidx2 = getelementptr inbounds float, float* %a, i64 %indvar
; CHECK: %1 = load float, float* %arrayidx2, align 4		; CHECK: %2 = load float, float* %arrayidx2, align 4
; CHECK: %add = fadd float %1, %mul		; CHECK: %add = fadd float %2, %mul
; CHECK: store float %add, float* %arrayidx2, align 4		; CHECK: store float %add, float* %arrayidx2, align 4
; CHECK: %indvar.next = add i64 %indvar, 1		; CHECK: %indvar.next = add i64 %indvar, 1
; CHECK: %exitcond = icmp eq i64 %indvar, 3199		; CHECK: %exitcond = icmp eq i32 %0, 3199
; CHECK: br i1 %exitcond, label %for.end, label %for.body		; CHECK: br i1 %exitcond, label %for.end, label %for.body

; CHECK: ret		; CHECK: ret

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

... (72 lines not shown) ...	for.body: ; preds = %entry, %for.body
%19 = trunc i64 %indvars.iv.next to i32		%19 = trunc i64 %indvars.iv.next to i32
%cmp = icmp slt i32 %19, 3200		%cmp = icmp slt i32 %19, 3200
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

; CHECK-LABEL: @hoo		; CHECK-LABEL: @hoo

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]		; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]
		; CHECK: %0 = trunc i64 %indvar to i32
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %ip, i64 %indvar		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %ip, i64 %indvar
; CHECK: %0 = load i32, i32* %arrayidx, align 4		; CHECK: %1 = load i32, i32* %arrayidx, align 4
; CHECK: %idxprom1 = sext i32 %0 to i64		; CHECK: %idxprom1 = sext i32 %1 to i64
; CHECK: %arrayidx2 = getelementptr inbounds float, float* %b, i64 %idxprom1		; CHECK: %arrayidx2 = getelementptr inbounds float, float* %b, i64 %idxprom1
; CHECK: %1 = load float, float* %arrayidx2, align 4		; CHECK: %2 = load float, float* %arrayidx2, align 4
; CHECK: %mul = fmul float %1, %alpha		; CHECK: %mul = fmul float %2, %alpha
; CHECK: %arrayidx4 = getelementptr inbounds float, float* %a, i64 %indvar		; CHECK: %arrayidx4 = getelementptr inbounds float, float* %a, i64 %indvar
; CHECK: %2 = load float, float* %arrayidx4, align 4		; CHECK: %3 = load float, float* %arrayidx4, align 4
; CHECK: %add = fadd float %2, %mul		; CHECK: %add = fadd float %3, %mul
; CHECK: store float %add, float* %arrayidx4, align 4		; CHECK: store float %add, float* %arrayidx4, align 4
; CHECK: %indvar.next = add i64 %indvar, 1		; CHECK: %indvar.next = add i64 %indvar, 1
; CHECK: %exitcond = icmp eq i64 %indvar, 3199		; CHECK: %exitcond = icmp eq i32 %0, 3199
; CHECK: br i1 %exitcond, label %for.end, label %for.body		; CHECK: br i1 %exitcond, label %for.end, label %for.body

; CHECK: ret		; CHECK: ret

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

; CHECK:for.body:		; CHECK:for.body:
; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]		; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK: %0 = add i64 %indvars.iv, 6		; CHECK: %0 = add i64 %indvars.iv, 6
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv
; CHECK: store i32 %call, i32* %arrayidx, align 4		; CHECK: store i32 %call, i32* %arrayidx, align 4
; CHECK: %arrayidx6 = getelementptr inbounds i32, i32* %x, i64 %0		; CHECK: %arrayidx6 = getelementptr inbounds i32, i32* %x, i64 %0
; CHECK: store i32 %call, i32* %arrayidx6, align 4		; CHECK: store i32 %call, i32* %arrayidx6, align 4
; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
; CHECK: %exitcond2 = icmp eq i64 %0, 1505		; CHECK: %exitcond1 = icmp eq i64 %indvars.iv, 1499
; CHECK: br i1 %exitcond2, label %for.end, label %for.body		; CHECK: br i1 %exitcond1, label %for.end, label %for.body

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

; void multi2(int *x) {		; void multi2(int *x) {
; y = foo(0)		; y = foo(0)
; for (int i = 0; i < 500; ++i) {		; for (int i = 0; i < 500; ++i) {
; CHECK:for.body:		; CHECK:for.body:
; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]		; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK: %0 = add i64 %indvars.iv, 3		; CHECK: %0 = add i64 %indvars.iv, 3
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv
; CHECK: store i32 %call, i32* %arrayidx, align 4		; CHECK: store i32 %call, i32* %arrayidx, align 4
; CHECK: %arrayidx6 = getelementptr inbounds i32, i32* %x, i64 %0		; CHECK: %arrayidx6 = getelementptr inbounds i32, i32* %x, i64 %0
; CHECK: store i32 %call, i32* %arrayidx6, align 4		; CHECK: store i32 %call, i32* %arrayidx6, align 4
; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
; CHECK: %exitcond2 = icmp eq i64 %indvars.iv, 1499		; CHECK: %exitcond1 = icmp eq i64 %indvars.iv, 1499
; CHECK: br i1 %exitcond2, label %for.end, label %for.body		; CHECK: br i1 %exitcond1, label %for.end, label %for.body

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

; void multi3(int *x) {		; void multi3(int *x) {
; y = foo(0)		; y = foo(0)
; for (int i = 0; i < 500; ++i) {		; for (int i = 0; i < 500; ++i) {

; CHECK-LABEL: @multi3		; CHECK-LABEL: @multi3
; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]		; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK: %0 = add i64 %indvars.iv, 3		; CHECK: %0 = add i64 %indvars.iv, 3
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %0		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %0
; CHECK: store i32 %call, i32* %arrayidx, align 4		; CHECK: store i32 %call, i32* %arrayidx, align 4
; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		; CHECK: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
; CHECK: %exitcond1 = icmp eq i64 %0, 1502		; CHECK: %exitcond1 = icmp eq i64 %indvars.iv, 1499
; CHECK: br i1 %exitcond1, label %for.end, label %for.body		; CHECK: br i1 %exitcond1, label %for.end, label %for.body

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}

; int foo(int a);		; int foo(int a);
; void bar2(int *x, int y, int z) {		; void bar2(int *x, int y, int z) {
for.body: ; preds = %for.body, %entry
br i1 %exitcond, label %for.end, label %for.body		br i1 %exitcond, label %for.end, label %for.body

; CHECK-LABEL: @gep-indexing		; CHECK-LABEL: @gep-indexing
; CHECK: for.body:		; CHECK: for.body:
; CHECK-NEXT: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]		; CHECK-NEXT: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
; CHECK-NEXT: %scevgep = getelementptr i32, i32* %x, i64 %indvars.iv		; CHECK-NEXT: %scevgep = getelementptr i32, i32* %x, i64 %indvars.iv
; CHECK-NEXT: store i32 %call, i32* %scevgep, align 4		; CHECK-NEXT: store i32 %call, i32* %scevgep, align 4
; CHECK-NEXT: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		; CHECK-NEXT: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
; CHECK-NEXT: %exitcond2 = icmp eq i32* %scevgep, %scevgep1		; CHECK-NEXT: %exitcond1 = icmp eq i64 %indvars.iv, 1499
; CHECK-NEXT: br i1 %exitcond2, label %for.end, label %for.body		; CHECK-NEXT: br i1 %exitcond1, label %for.end, label %for.body

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret void		ret void
}		}


define void @unordered_atomic_ops(i32* noalias %buf_0, i32* noalias %buf_1) {		define void @unordered_atomic_ops(i32* noalias %buf_0, i32* noalias %buf_1) {
; CHECK-LABEL: @unordered_atomic_ops(		; CHECK-LABEL: @unordered_atomic_ops(
; CHECK-NEXT: store atomic i32 %vb, i32* %buf1_b unordered, align 4
store atomic i32 %vb, i32* %buf1_b unordered, align 4		store atomic i32 %vb, i32* %buf1_b unordered, align 4
%cmp = icmp slt i32 %indvars.iv.next, 3200		%cmp = icmp slt i32 %indvars.iv.next, 3200
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

for.end:		for.end:
ret void		ret void
}		}

		define void @pointer_bitcast_baseinst(i16* %arg, i8* %arg1, i64 %arg2) {
		; CHECK-LABEL: @pointer_bitcast_baseinst(
		; CHECK: bb3:
		; CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %bb3 ], [ 0, %bb ]
		; CHECK-NEXT: %4 = shl i64 %indvar, 3
		; CHECK-NEXT: %5 = add i64 %4, 1
		; CHECK-NEXT: %tmp5 = shl nuw i64 %5, 1
		; CHECK-NEXT: %tmp6 = getelementptr i8, i8* %arg1, i64 %tmp5
		; CHECK-NEXT: %tmp7 = bitcast i8* %tmp6 to <8 x i16>*
		; CHECK-NEXT: %tmp8 = load <8 x i16>, <8 x i16>* %tmp7, align 2
		; CHECK-NEXT: %tmp13 = getelementptr i16, i16* %arg, i64 %5
		; CHECK-NEXT: %tmp14 = bitcast i16* %tmp13 to <8 x i16>*
		; CHECK-NEXT: store <8 x i16> %tmp8, <8 x i16>* %tmp14, align 2
		; CHECK-NEXT: %indvar.next = add i64 %indvar, 1
		; CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %3
		; CHECK-NEXT: br i1 %exitcond, label %bb19, label %bb3
		bb:
		br label %bb3

		bb3: ; preds = %bb3, %bb
		%tmp = phi i64 [ 1, %bb ], [ %tmp17, %bb3 ]
		%tmp4 = add nuw i64 %tmp, 8
		%tmp5 = shl nuw i64 %tmp, 1
		%tmp6 = getelementptr i8, i8* %arg1, i64 %tmp5
		%tmp7 = bitcast i8* %tmp6 to <8 x i16>*
		%tmp8 = load <8 x i16>, <8 x i16>* %tmp7, align 2
		%tmp9 = shl i64 %tmp4, 1
		%tmp10 = getelementptr i8, i8* %arg1, i64 %tmp9
		%tmp11 = bitcast i8* %tmp10 to <8 x i16>*
		%tmp12 = load <8 x i16>, <8 x i16>* %tmp11, align 2
		%tmp13 = getelementptr i16, i16* %arg, i64 %tmp
		%tmp14 = bitcast i16* %tmp13 to <8 x i16>*
		store <8 x i16> %tmp8, <8 x i16>* %tmp14, align 2
		%tmp15 = getelementptr i16, i16* %arg, i64 %tmp4
		%tmp16 = bitcast i16* %tmp15 to <8 x i16>*
		store <8 x i16> %tmp12, <8 x i16>* %tmp16, align 2
		%tmp17 = add nuw nsw i64 %tmp, 16
		%tmp18 = icmp eq i64 %tmp17, %arg2
		br i1 %tmp18, label %bb19, label %bb3

		bb19: ; preds = %bb3
		ret void
		}

attributes #0 = { nounwind uwtable }		attributes #0 = { nounwind uwtable }
attributes #1 = { nounwind }		attributes #1 = { nounwind }
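
The review summary cites the new `pointer_bitcast_baseinst` test for a stride bug. As a rough sanity check, not part of the patch, the Python sketch below models the byte offsets into `%arg1` read by the original two-access loop and by the rerolled loop in the CHECK lines. The function names are hypothetical. The loop bound `%3` is computed in a preheader that is not shown here, so the value used for it (`2 * trip_count - 1`) is an assumption, as is the precondition that `%arg2` is at least 17 and congruent to 1 modulo 16 (otherwise the original `icmp eq` exit would never fire in this model).

```python
def original_offsets(arg2):
    """Byte offsets read from %arg1 by the original loop (two loads per iteration)."""
    offsets = []
    tmp = 1                          # %tmp starts at 1
    while True:
        tmp4 = tmp + 8               # %tmp4 = add nuw i64 %tmp, 8
        offsets.append(tmp << 1)     # %tmp5 = shl nuw i64 %tmp, 1
        offsets.append(tmp4 << 1)    # %tmp9 = shl i64 %tmp4, 1
        tmp += 16                    # %tmp17 = add nuw nsw i64 %tmp, 16
        if tmp == arg2:              # %tmp18 = icmp eq i64 %tmp17, %arg2
            break
    return offsets


def rerolled_offsets(arg2):
    """Byte offsets read from %arg1 by the rerolled loop (one load per iteration)."""
    trip_count = (arg2 - 1) // 16    # iterations of the original loop
    last = 2 * trip_count - 1        # ASSUMED value of %3 (preheader not shown)
    offsets = []
    indvar = 0
    while True:
        idx = (indvar << 3) + 1      # %4 = shl i64 %indvar, 3 ; %5 = add i64 %4, 1
        offsets.append(idx << 1)     # %tmp5 = shl nuw i64 %5, 1
        if indvar == last:           # %exitcond = icmp eq i64 %indvar, %3
            break
        indvar += 1                  # %indvar.next = add i64 %indvar, 1
    return offsets


for arg2 in (17, 33, 49):
    assert original_offsets(arg2) == rerolled_offsets(arg2)
```

Under these assumptions both loops read the same offsets in the same order, with the rerolled loop advancing by 8 `i16` elements (16 bytes) per iteration.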

llvm/trunk/test/Transforms/LoopReroll/complex_reroll.ll

; RUN: opt -S -loop-reroll %s | FileCheck %s		; RUN: opt -S -loop-reroll %s | FileCheck %s
declare i32 @goo(i32, i32)		declare i32 @goo(i32, i32)

@buf = external global i8*		@buf = external global i8*
@aaa = global [16 x i8] c"\01\02\03\04\05\06\07\08\09\0A\0B\0C\0D\0E\0F\10", align 1		@aaa = global [16 x i8] c"\01\02\03\04\05\06\07\08\09\0A\0B\0C\0D\0E\0F\10", align 1

define i32 @test1(i32 %len) {		define i32 @test1(i32 %len) {
entry:		entry:
br label %while.body		br label %while.body

while.body:		while.body:
;CHECK-LABEL: while.body:		;CHECK-LABEL: while.body:
;CHECK-NEXT: %indvar = phi i32 [ %indvar.next, %while.body ], [ 0, %entry ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %entry ]
;CHECK-NEXT: %buf.021 = phi i8* [ getelementptr inbounds ([16 x i8], [16 x i8]* @aaa, i64 0, i64 0), %entry ], [ %add.ptr, %while.body ]
;CHECK-NEXT: %sum44.020 = phi i64 [ 0, %entry ], [ %add, %while.body ]		;CHECK-NEXT: %sum44.020 = phi i64 [ 0, %entry ], [ %add, %while.body ]
;CHECK-NEXT: [[T2:%[0-9]+]] = load i8, i8* %buf.021, align 1		;CHECK-NEXT: %0 = trunc i64 %indvar to i32
		;CHECK-NEXT: %scevgep = getelementptr [16 x i8], [16 x i8]* @aaa, i64 0, i64 %indvar
		;CHECK-NEXT: [[T2:%[0-9]+]] = load i8, i8* %scevgep, align 1
;CHECK-NEXT: %conv = zext i8 [[T2]] to i64		;CHECK-NEXT: %conv = zext i8 [[T2]] to i64
;CHECK-NEXT: %add = add i64 %conv, %sum44.020		;CHECK-NEXT: %add = add i64 %conv, %sum44.020
;CHECK-NEXT: %add.ptr = getelementptr inbounds i8, i8* %buf.021, i64 1		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
;CHECK-NEXT: %indvar.next = add i32 %indvar, 1		;CHECK-NEXT: %exitcond = icmp eq i32 %0, 15
;CHECK-NEXT: %exitcond = icmp eq i32 %indvar, 1
;CHECK-NEXT: br i1 %exitcond, label %while.end, label %while.body		;CHECK-NEXT: br i1 %exitcond, label %while.end, label %while.body

%dec22 = phi i32 [ 4, %entry ], [ %dec, %while.body ]		%dec22 = phi i32 [ 4, %entry ], [ %dec, %while.body ]
%buf.021 = phi i8* [ getelementptr inbounds ([16 x i8], [16 x i8]* @aaa, i64 0, i64 0), %entry ], [ %add.ptr, %while.body ]		%buf.021 = phi i8* [ getelementptr inbounds ([16 x i8], [16 x i8]* @aaa, i64 0, i64 0), %entry ], [ %add.ptr, %while.body ]
%sum44.020 = phi i64 [ 0, %entry ], [ %add9, %while.body ]		%sum44.020 = phi i64 [ 0, %entry ], [ %add9, %while.body ]
%0 = load i8, i8* %buf.021, align 1		%0 = load i8, i8* %buf.021, align 1
%conv = zext i8 %0 to i64		%conv = zext i8 %0 to i64
%add = add i64 %conv, %sum44.020		%add = add i64 %conv, %sum44.020
for.cond.for.cond.cleanup_crit_edge:
br label %for.cond.cleanup		br label %for.cond.cleanup

for.cond.cleanup:		for.cond.cleanup:
%S.addr.0.lcssa = phi i32 [ %add2, %for.cond.for.cond.cleanup_crit_edge ], [ %S, %entry ]		%S.addr.0.lcssa = phi i32 [ %add2, %for.cond.for.cond.cleanup_crit_edge ], [ %S, %entry ]
ret i32 %S.addr.0.lcssa		ret i32 %S.addr.0.lcssa

for.body:		for.body:
;CHECK-LABEL: for.body:		;CHECK-LABEL: for.body:
;CHECK-NEXT: %indvar = phi i32 [ %indvar.next, %for.body ], [ 0, %for.body.lr.ph ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %for.body.lr.ph ]
;CHECK-NEXT: %S.addr.011 = phi i32 [ %S, %for.body.lr.ph ], [ %add, %for.body ]		;CHECK-NEXT: %S.addr.011 = phi i32 [ %S, %for.body.lr.ph ], [ %add, %for.body ]
;CHECK-NEXT: %a.addr.010 = phi i32* [ %a, %for.body.lr.ph ], [ %incdec.ptr1, %for.body ]		;CHECK-NEXT: %4 = trunc i64 %indvar to i32
;CHECK-NEXT: %4 = load i32, i32* %a.addr.010, align 4		;CHECK-NEXT: %scevgep = getelementptr i32, i32* %a, i64 %indvar
;CHECK-NEXT: %add = add nsw i32 %4, %S.addr.011		;CHECK-NEXT: %5 = load i32, i32* %scevgep, align 4
;CHECK-NEXT: %incdec.ptr1 = getelementptr inbounds i32, i32* %a.addr.010, i64 1		;CHECK-NEXT: %add = add nsw i32 %5, %S.addr.011
;CHECK-NEXT: %indvar.next = add i32 %indvar, 1		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
;CHECK-NEXT: %exitcond = icmp eq i32 %indvar, %3		;CHECK-NEXT: %exitcond = icmp eq i32 %4, %3
;CHECK-NEXT: br i1 %exitcond, label %for.cond.for.cond.cleanup_crit_edge, label %for.body		;CHECK-NEXT: br i1 %exitcond, label %for.cond.for.cond.cleanup_crit_edge, label %for.body

%i.012 = phi i32 [ 0, %for.body.lr.ph ], [ %add3, %for.body ]		%i.012 = phi i32 [ 0, %for.body.lr.ph ], [ %add3, %for.body ]
%S.addr.011 = phi i32 [ %S, %for.body.lr.ph ], [ %add2, %for.body ]		%S.addr.011 = phi i32 [ %S, %for.body.lr.ph ], [ %add2, %for.body ]
%a.addr.010 = phi i32* [ %a, %for.body.lr.ph ], [ %incdec.ptr1, %for.body ]		%a.addr.010 = phi i32* [ %a, %for.body.lr.ph ], [ %incdec.ptr1, %for.body ]
%incdec.ptr = getelementptr inbounds i32, i32* %a.addr.010, i64 1		%incdec.ptr = getelementptr inbounds i32, i32* %a.addr.010, i64 1
%0 = load i32, i32* %a.addr.010, align 4		%0 = load i32, i32* %a.addr.010, align 4
%add = add nsw i32 %0, %S.addr.011		%add = add nsw i32 %0, %S.addr.011
entry:
%cmp10 = icmp sgt i32 %len, 1		%cmp10 = icmp sgt i32 %len, 1
br i1 %cmp10, label %while.body.preheader, label %while.end		br i1 %cmp10, label %while.body.preheader, label %while.end

while.body.preheader: ; preds = %entry		while.body.preheader: ; preds = %entry
br label %while.body		br label %while.body

while.body: ; preds = %while.body.preheader, %while.body		while.body: ; preds = %while.body.preheader, %while.body
;CHECK-LABEL: while.body:		;CHECK-LABEL: while.body:
;CHECK-NEXT: %indvar = phi i32 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]
;CHECK-NEXT: %S.012 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]		;CHECK-NEXT: %S.012 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]
;CHECK-NEXT: %buf.addr.011 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]		;CHECK-NEXT: %4 = trunc i64 %indvar to i32
;CHECK-NEXT: %4 = load i32, i32* %buf.addr.011, align 4		;CHECK-NEXT: %5 = mul i64 %indvar, -1
;CHECK-NEXT: %add = add nsw i32 %4, %S.012		;CHECK-NEXT: %scevgep = getelementptr i32, i32* %buf, i64 %5
;CHECK-NEXT: %add.ptr = getelementptr inbounds i32, i32* %buf.addr.011, i64 -1		;CHECK-NEXT: %6 = load i32, i32* %scevgep, align 4
;CHECK-NEXT: %indvar.next = add i32 %indvar, 1		;CHECK-NEXT: %add = add nsw i32 %6, %S.012
;CHECK-NEXT: %exitcond = icmp eq i32 %indvar, %3		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
		;CHECK-NEXT: %exitcond = icmp eq i32 %4, %3
;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body		;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body

%i.013 = phi i32 [ %sub, %while.body ], [ %len, %while.body.preheader ]		%i.013 = phi i32 [ %sub, %while.body ], [ %len, %while.body.preheader ]
%S.012 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]		%S.012 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]
%buf.addr.011 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]		%buf.addr.011 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]
%0 = load i32, i32* %buf.addr.011, align 4		%0 = load i32, i32* %buf.addr.011, align 4
%add = add nsw i32 %0, %S.012		%add = add nsw i32 %0, %S.012
%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.011, i64 -1		%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.011, i64 -1

llvm/trunk/test/Transforms/LoopReroll/indvar_with_ext.ll

; RUN: opt -S -loop-reroll %s | FileCheck %s		; RUN: opt -S -loop-reroll %s | FileCheck %s
target triple = "aarch64--linux-gnu"		target triple = "aarch64--linux-gnu"

define void @test(i32 %n, float* %arrayidx200, float* %arrayidx164, float* %arrayidx172) {		define void @test(i32 %n, float* %arrayidx200, float* %arrayidx164, float* %arrayidx172) {
entry:		entry:
%rem.i = srem i32 %n, 4		%rem.i = srem i32 %n, 4
%t22 = load float, float* %arrayidx172, align 4		%t22 = load float, float* %arrayidx172, align 4
%cmp.9 = icmp eq i32 %n, 0		%cmp.9 = icmp eq i32 %n, 0
%t7 = sext i32 %n to i64		%t7 = sext i32 %n to i64
br i1 %cmp.9, label %while.end, label %while.body.preheader		br i1 %cmp.9, label %while.end, label %while.body.preheader

while.body.preheader:		while.body.preheader:
br label %while.body		br label %while.body

while.body:		while.body:
;CHECK-LABEL: while.body:		;CHECK-LABEL: while.body:
;CHECK-NEXT: %indvars.iv.i423 = phi i64 [ %indvars.iv.next.i424, %while.body ], [ 0, %while.body.preheader ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]
;CHECK-NEXT: [[T1:%[0-9]+]] = trunc i64 %indvars.iv.i423 to i32		;CHECK-NEXT: %arrayidx62.i = getelementptr inbounds float, float* %arrayidx200, i64 %indvar
;CHECK-NEXT: %arrayidx62.i = getelementptr inbounds float, float* %arrayidx200, i64 %indvars.iv.i423
;CHECK-NEXT: %t1 = load float, float* %arrayidx62.i, align 4		;CHECK-NEXT: %t1 = load float, float* %arrayidx62.i, align 4
;CHECK-NEXT: %arrayidx64.i = getelementptr inbounds float, float* %arrayidx164, i64 %indvars.iv.i423		;CHECK-NEXT: %arrayidx64.i = getelementptr inbounds float, float* %arrayidx164, i64 %indvar
;CHECK-NEXT: %t2 = load float, float* %arrayidx64.i, align 4		;CHECK-NEXT: %t2 = load float, float* %arrayidx64.i, align 4
;CHECK-NEXT: %mul65.i = fmul fast float %t2, %t22		;CHECK-NEXT: %mul65.i = fmul fast float %t2, %t22
;CHECK-NEXT: %add66.i = fadd fast float %mul65.i, %t1		;CHECK-NEXT: %add66.i = fadd fast float %mul65.i, %t1
;CHECK-NEXT: store float %add66.i, float* %arrayidx62.i, align 4		;CHECK-NEXT: store float %add66.i, float* %arrayidx62.i, align 4
;CHECK-NEXT: %indvars.iv.next.i424 = add i64 %indvars.iv.i423, 1		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
;CHECK-NEXT: [[T2:%[0-9]+]] = sext i32 [[T1]] to i64		;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %{{[0-9]+}}
;CHECK-NEXT: %exitcond = icmp eq i64 [[T2]], %{{[0-9]+}}
;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body		;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body

%indvars.iv.i423 = phi i64 [ %indvars.iv.next.i424, %while.body ], [ 0, %while.body.preheader ]		%indvars.iv.i423 = phi i64 [ %indvars.iv.next.i424, %while.body ], [ 0, %while.body.preheader ]
%i.22.i = phi i32 [ %add103.i, %while.body ], [ %rem.i, %while.body.preheader ]		%i.22.i = phi i32 [ %add103.i, %while.body ], [ %rem.i, %while.body.preheader ]
%arrayidx62.i = getelementptr inbounds float, float* %arrayidx200, i64 %indvars.iv.i423		%arrayidx62.i = getelementptr inbounds float, float* %arrayidx200, i64 %indvars.iv.i423
%t1 = load float, float* %arrayidx62.i, align 4		%t1 = load float, float* %arrayidx62.i, align 4
%arrayidx64.i = getelementptr inbounds float, float* %arrayidx164, i64 %indvars.iv.i423		%arrayidx64.i = getelementptr inbounds float, float* %arrayidx164, i64 %indvars.iv.i423
%t2 = load float, float* %arrayidx64.i, align 4		%t2 = load float, float* %arrayidx64.i, align 4
entry:
%cmp18 = icmp sgt i64 %n, 0		%cmp18 = icmp sgt i64 %n, 0
br i1 %cmp18, label %for.body.preheader, label %for.end		br i1 %cmp18, label %for.body.preheader, label %for.end

for.body.preheader: ; preds = %entry		for.body.preheader: ; preds = %entry
br label %for.body		br label %for.body

for.body: ; preds = %for.body.preheader, %for.body		for.body: ; preds = %for.body.preheader, %for.body

;CHECK: for.body:		;CHECK-LABEL: for.body:
;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %for.body.preheader ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %for.body.preheader ]
;CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %y, i64 %indvar		;CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %y, i64 %indvar
;CHECK-NEXT: [[T1:%[0-9]+]] = load i32, i32* %arrayidx, align 4		;CHECK-NEXT: [[T1:%[0-9]+]] = load i32, i32* %arrayidx, align 4
;CHECK-NEXT: %arrayidx3 = getelementptr inbounds i32, i32* %x, i64 %indvar		;CHECK-NEXT: %arrayidx3 = getelementptr inbounds i32, i32* %x, i64 %indvar
;CHECK-NEXT: store i32 [[T1]], i32* %arrayidx3, align 4		;CHECK-NEXT: store i32 [[T1]], i32* %arrayidx3, align 4
;CHECK-NEXT: %indvar.next = add i64 %indvar, 1		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %{{[0-9]+}}		;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %{{[0-9]+}}
;CHECK-NEXT: br i1 %exitcond, label %for.end.loopexit, label %for.body		;CHECK-NEXT: br i1 %exitcond, label %for.end.loopexit, label %for.body
entry:
%cmp21 = icmp sgt i32 %n, 0		%cmp21 = icmp sgt i32 %n, 0
br i1 %cmp21, label %for.body.preheader, label %for.end		br i1 %cmp21, label %for.body.preheader, label %for.end

for.body.preheader: ; preds = %entry		for.body.preheader: ; preds = %entry
br label %for.body		br label %for.body

for.body: ; preds = %for.body.preheader, %for.body		for.body: ; preds = %for.body.preheader, %for.body

;CHECK: for.body:		;CHECK-LABEL: for.body:
;CHECK: %add12 = add i8 %i.022, 2		;CHECK: %add12 = add i8 %i.022, 2
;CHECK-NEXT: %conv = sext i8 %add12 to i32		;CHECK-NEXT: %conv = sext i8 %add12 to i32
;CHECK-NEXT: %cmp = icmp slt i32 %conv, %n		;CHECK-NEXT: %cmp = icmp slt i32 %conv, %n
;CHECK-NEXT: br i1 %cmp, label %for.body, label %for.end.loopexit		;CHECK-NEXT: br i1 %cmp, label %for.body, label %for.end.loopexit

%conv23 = phi i32 [ %conv, %for.body ], [ 0, %for.body.preheader ]		%conv23 = phi i32 [ %conv, %for.body ], [ 0, %for.body.preheader ]
%i.022 = phi i8 [ %add12, %for.body ], [ 0, %for.body.preheader ]		%i.022 = phi i8 [ %add12, %for.body ], [ 0, %for.body.preheader ]
%idxprom = sext i8 %i.022 to i64		%idxprom = sext i8 %i.022 to i64
entry:
%cmp18 = icmp eq i64 %n, 0		%cmp18 = icmp eq i64 %n, 0
br i1 %cmp18, label %for.end, label %for.body.preheader		br i1 %cmp18, label %for.end, label %for.body.preheader

for.body.preheader: ; preds = %entry		for.body.preheader: ; preds = %entry
br label %for.body		br label %for.body

for.body: ; preds = %for.body.preheader, %for.body		for.body: ; preds = %for.body.preheader, %for.body

;CHECK: for.body:		;CHECK-LABEL: for.body:
;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %for.body.preheader ]		;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %for.body.preheader ]
;CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %y, i64 %indvar		;CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %y, i64 %indvar
;CHECK-NEXT: [[T1:%[0-9]+]] = load i32, i32* %arrayidx, align 4		;CHECK-NEXT: [[T1:%[0-9]+]] = load i32, i32* %arrayidx, align 4
;CHECK-NEXT: %arrayidx3 = getelementptr inbounds i32, i32* %x, i64 %indvar		;CHECK-NEXT: %arrayidx3 = getelementptr inbounds i32, i32* %x, i64 %indvar
;CHECK-NEXT: store i32 [[T1]], i32* %arrayidx3, align 4		;CHECK-NEXT: store i32 [[T1]], i32* %arrayidx3, align 4
;CHECK-NEXT: %indvar.next = add i64 %indvar, 1		;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %{{[0-9]+}}		;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %{{[0-9]+}}
;CHECK-NEXT: br i1 %exitcond, label %for.end.loopexit, label %for.body		;CHECK-NEXT: br i1 %exitcond, label %for.end.loopexit, label %for.body

llvm/trunk/test/Transforms/LoopReroll/nonconst_lb.ll

for.end: ; preds = %for.body, %entry
ret void		ret void
}		}
; CHECK-LABEL: @foo		; CHECK-LABEL: @foo
; CHECK: for.body.preheader: ; preds = %entry		; CHECK: for.body.preheader: ; preds = %entry
; CHECK: %0 = add i32 %n, -1		; CHECK: %0 = add i32 %n, -1
; CHECK: %1 = sub i32 %0, %m		; CHECK: %1 = sub i32 %0, %m
; CHECK: %2 = lshr i32 %1, 2		; CHECK: %2 = lshr i32 %1, 2
; CHECK: %3 = shl i32 %2, 2		; CHECK: %3 = shl i32 %2, 2
; CHECK: %4 = add i32 %m, %3		; CHECK: %4 = add i32 %3, 3
; CHECK: %5 = add i32 %4, 3
; CHECK: br label %for.body		; CHECK: br label %for.body

; CHECK: for.body: ; preds = %for.body, %for.body.preheader		; CHECK: for.body: ; preds = %for.body, %for.body.preheader
; CHECK: %indvar = phi i32 [ 0, %for.body.preheader ], [ %indvar.next, %for.body ]		; CHECK: %indvar = phi i32 [ 0, %for.body.preheader ], [ %indvar.next, %for.body ]
; CHECK: %6 = add i32 %m, %indvar		; CHECK: %5 = add i32 %m, %indvar
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %B, i32 %6		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %B, i32 %5
; CHECK: %7 = load i32, i32* %arrayidx, align 4		; CHECK: %6 = load i32, i32* %arrayidx, align 4
; CHECK: %mul = shl nsw i32 %7, 2		; CHECK: %mul = shl nsw i32 %6, 2
; CHECK: %arrayidx2 = getelementptr inbounds i32, i32* %A, i32 %6		; CHECK: %arrayidx2 = getelementptr inbounds i32, i32* %A, i32 %5
; CHECK: store i32 %mul, i32* %arrayidx2, align 4		; CHECK: store i32 %mul, i32* %arrayidx2, align 4
; CHECK: %indvar.next = add i32 %indvar, 1		; CHECK: %indvar.next = add i32 %indvar, 1
; CHECK: %exitcond = icmp eq i32 %6, %5		; CHECK: %exitcond = icmp eq i32 %indvar, %4
; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body		; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body

;void daxpy_ur(int n,float da,float dx,float dy)		;void daxpy_ur(int n,float da,float dx,float dy)
; {		; {
; int m = n % 4;		; int m = n % 4;
; for (int i = m; i < n; i = i + 4)		; for (int i = m; i < n; i = i + 4)
; {		; {
; dy[i] = dy[i] + da*dx[i];		; dy[i] = dy[i] + da*dx[i];
}		}

; CHECK-LABEL: @daxpy_ur		; CHECK-LABEL: @daxpy_ur
; CHECK: for.body.preheader:		; CHECK: for.body.preheader:
; CHECK: %0 = add i32 %n, -1		; CHECK: %0 = add i32 %n, -1
; CHECK: %1 = sub i32 %0, %rem		; CHECK: %1 = sub i32 %0, %rem
; CHECK: %2 = lshr i32 %1, 2		; CHECK: %2 = lshr i32 %1, 2
; CHECK: %3 = shl i32 %2, 2		; CHECK: %3 = shl i32 %2, 2
; CHECK: %4 = add i32 %rem, %3		; CHECK: %4 = add i32 %3, 3
; CHECK: %5 = add i32 %4, 3
; CHECK: br label %for.body		; CHECK: br label %for.body

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i32 [ 0, %for.body.preheader ], [ %indvar.next, %for.body ]		; CHECK: %indvar = phi i32 [ 0, %for.body.preheader ], [ %indvar.next, %for.body ]
; CHECK: %6 = add i32 %rem, %indvar		; CHECK: %5 = add i32 %rem, %indvar
; CHECK: %arrayidx = getelementptr inbounds float, float* %dy, i32 %6		; CHECK: %arrayidx = getelementptr inbounds float, float* %dy, i32 %5
; CHECK: %7 = load float, float* %arrayidx, align 4		; CHECK: %6 = load float, float* %arrayidx, align 4
; CHECK: %arrayidx1 = getelementptr inbounds float, float* %dx, i32 %6		; CHECK: %arrayidx1 = getelementptr inbounds float, float* %dx, i32 %5
; CHECK: %8 = load float, float* %arrayidx1, align 4		; CHECK: %7 = load float, float* %arrayidx1, align 4
; CHECK: %mul = fmul float %8, %da		; CHECK: %mul = fmul float %7, %da
; CHECK: %add = fadd float %7, %mul		; CHECK: %add = fadd float %6, %mul
; CHECK: store float %add, float* %arrayidx, align 4		; CHECK: store float %add, float* %arrayidx, align 4
; CHECK: %indvar.next = add i32 %indvar, 1		; CHECK: %indvar.next = add i32 %indvar, 1
; CHECK: %exitcond = icmp eq i32 %6, %5		; CHECK: %exitcond = icmp eq i32 %indvar, %4
; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body		; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body

llvm/trunk/test/Transforms/LoopReroll/ptrindvar.ll

	while.body:			while.body:
	;CHECK-LABEL: while.body:			;CHECK-LABEL: while.body:
	;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]			;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]
	;CHECK-NEXT: %S.011 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]			;CHECK-NEXT: %S.011 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]
	;CHECK-NEXT: %scevgep = getelementptr i32, i32* %buf, i64 %indvar			;CHECK-NEXT: %scevgep = getelementptr i32, i32* %buf, i64 %indvar
	;CHECK-NEXT: %4 = load i32, i32* %scevgep, align 4			;CHECK-NEXT: %4 = load i32, i32* %scevgep, align 4
	;CHECK-NEXT: %add = add nsw i32 %4, %S.011			;CHECK-NEXT: %add = add nsw i32 %4, %S.011
	;CHECK-NEXT: %indvar.next = add i64 %indvar, 1			;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
	;CHECK-NEXT: %exitcond = icmp eq i32* %scevgep, %scevgep5			;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %3
	;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body			;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body

	%S.011 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]			%S.011 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]
	%buf.addr.010 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]			%buf.addr.010 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]
	%0 = load i32, i32* %buf.addr.010, align 4			%0 = load i32, i32* %buf.addr.010, align 4
	%add = add nsw i32 %0, %S.011			%add = add nsw i32 %0, %S.011
	%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.010, i64 1			%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.010, i64 1
	%1 = load i32, i32* %arrayidx1, align 4			%1 = load i32, i32* %arrayidx1, align 4
	;CHECK-LABEL: while.body:			;CHECK-LABEL: while.body:
	;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]			;CHECK-NEXT: %indvar = phi i64 [ %indvar.next, %while.body ], [ 0, %while.body.preheader ]
	;CHECK-NEXT: %S.011 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]			;CHECK-NEXT: %S.011 = phi i32 [ %add, %while.body ], [ undef, %while.body.preheader ]
	;CHECK-NEXT: %4 = mul i64 %indvar, -1			;CHECK-NEXT: %4 = mul i64 %indvar, -1
	;CHECK-NEXT: %scevgep = getelementptr i32, i32* %buf, i64 %4			;CHECK-NEXT: %scevgep = getelementptr i32, i32* %buf, i64 %4
	;CHECK-NEXT: %5 = load i32, i32* %scevgep, align 4			;CHECK-NEXT: %5 = load i32, i32* %scevgep, align 4
	;CHECK-NEXT: %add = add nsw i32 %5, %S.011			;CHECK-NEXT: %add = add nsw i32 %5, %S.011
	;CHECK-NEXT: %indvar.next = add i64 %indvar, 1			;CHECK-NEXT: %indvar.next = add i64 %indvar, 1
	;CHECK-NEXT: %exitcond = icmp eq i32* %scevgep, %scevgep5			;CHECK-NEXT: %exitcond = icmp eq i64 %indvar, %3
	;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body			;CHECK-NEXT: br i1 %exitcond, label %while.end.loopexit, label %while.body

	%S.011 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]			%S.011 = phi i32 [ %add2, %while.body ], [ undef, %while.body.preheader ]
	%buf.addr.010 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]			%buf.addr.010 = phi i32* [ %add.ptr, %while.body ], [ %buf, %while.body.preheader ]
	%0 = load i32, i32* %buf.addr.010, align 4			%0 = load i32, i32* %buf.addr.010, align 4
	%add = add nsw i32 %0, %S.011			%add = add nsw i32 %0, %S.011
	%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.010, i64 -1			%arrayidx1 = getelementptr inbounds i32, i32* %buf.addr.010, i64 -1
	%1 = load i32, i32* %arrayidx1, align 4			%1 = load i32, i32* %arrayidx1, align 4

llvm/trunk/test/Transforms/LoopReroll/reduction.ll

for.body: ; preds = %entry, %for.body
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

; CHECK-LABEL: @foo		; CHECK-LABEL: @foo

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]		; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]
; CHECK: %r.029 = phi i32 [ 0, %entry ], [ %add, %for.body ]		; CHECK: %r.029 = phi i32 [ 0, %entry ], [ %add, %for.body ]
; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvar		; CHECK: %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvar
; CHECK: %0 = load i32, i32* %arrayidx, align 4		; CHECK: %1 = load i32, i32* %arrayidx, align 4
; CHECK: %add = add nsw i32 %0, %r.029		; CHECK: %add = add nsw i32 %1, %r.029
; CHECK: %indvar.next = add i64 %indvar, 1		; CHECK: %indvar.next = add i64 %indvar, 1
; CHECK: %exitcond = icmp eq i64 %indvar, 399		; CHECK: %exitcond = icmp eq i32 %0, 399
; CHECK: br i1 %exitcond, label %for.end, label %for.body		; CHECK: br i1 %exitcond, label %for.end, label %for.body

; CHECK: ret		; CHECK: ret

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret i32 %add12		ret i32 %add12
}		}

for.body: ; preds = %entry, %for.body
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

; CHECK-LABEL: @bar		; CHECK-LABEL: @bar

; CHECK: for.body:		; CHECK: for.body:
; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]		; CHECK: %indvar = phi i64 [ %indvar.next, %for.body ], [ 0, %entry ]
; CHECK: %r.029 = phi float [ 0.000000e+00, %entry ], [ %add, %for.body ]		; CHECK: %r.029 = phi float [ 0.000000e+00, %entry ], [ %add, %for.body ]
; CHECK: %arrayidx = getelementptr inbounds float, float* %x, i64 %indvar		; CHECK: %arrayidx = getelementptr inbounds float, float* %x, i64 %indvar
; CHECK: %0 = load float, float* %arrayidx, align 4		; CHECK: %1 = load float, float* %arrayidx, align 4
; CHECK: %add = fadd float %0, %r.029		; CHECK: %add = fadd float %1, %r.029
; CHECK: %indvar.next = add i64 %indvar, 1		; CHECK: %indvar.next = add i64 %indvar, 1
; CHECK: %exitcond = icmp eq i64 %indvar, 399		; CHECK: %exitcond = icmp eq i32 %0, 399
; CHECK: br i1 %exitcond, label %for.end, label %for.body		; CHECK: br i1 %exitcond, label %for.end, label %for.body

; CHECK: ret		; CHECK: ret

for.end: ; preds = %for.body		for.end: ; preds = %for.body
ret float %add12		ret float %add12
}		}
