This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
ClosedPublic

Authored by john.brawn on Oct 17 2016, 9:09 AM.

Details

Reviewers

silviu.baranga
mzolotukhin
christof
sanjoy
haicheng

Commits

rG84b21835f1ed: [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
rL284818: [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops

Summary

When a loop has a known upper bound on its number of iterations, and we also know that the number of iterations is either exactly that upper bound or zero, we can fully unroll up to that upper bound, keeping only the first loop test to check for the zero-iteration case.
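
As a rough source-level illustration of the transformation described above (not LLVM IR and not code from this patch), the sketch below shows a loop whose trip count is assumed to be either 0 or 4, next to a fully unrolled form that keeps only the first test. The names `kMaxTrip`, `sumRolled`, and `sumUnrolled` are hypothetical, and no claim is made that SCEV recognizes this exact loop shape.

```cpp
#include <array>
#include <cassert>

// Hypothetical illustration only; the upper bound is fixed at 4 here.
constexpr int kMaxTrip = 4;

// Rolled form: `n` is assumed to be either 0 or kMaxTrip, so the body runs
// either zero times or exactly kMaxTrip times.
int sumRolled(const std::array<int, kMaxTrip> &a, int n) {
  assert(n == 0 || n == kMaxTrip);
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}

// Fully unrolled form: only the first loop test survives, and it covers the
// zero-iteration case. The later tests are dropped because, under the
// max-or-zero assumption, they would always pass.
int sumUnrolled(const std::array<int, kMaxTrip> &a, int n) {
  assert(n == 0 || n == kMaxTrip);
  int sum = 0;
  if (0 < n) {
    sum += a[0];
    sum += a[1];
    sum += a[2];
    sum += a[3];
  }
  return sum;
}
```

Both functions return the same value for every input that satisfies the assumption, which is what makes dropping the later tests safe.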

Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere except howManyLessThans, which could probably be improved.

Diff Detail

Repository: rL LLVM

Event Timeline

john.brawn updated this revision to Diff 74857. Oct 17 2016, 9:09 AM

john.brawn retitled this revision from to [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops.

john.brawn updated this object.

john.brawn added reviewers: sanjoy, christof, haicheng, mzolotukhin.

john.brawn set the repository for this revision to rL LLVM.

john.brawn added a subscriber: llvm-commits.

john.brawn added a parent revision: D25607: [SCEV] More accurate calculation of max backedge count of some less-than loops. Oct 17 2016, 9:12 AM

The LoopUnroll changes look very reasonable to me, but I'd like Silviu's opinion on the SCEV plumbing.

The loop unrolling part looks good to me, too. I am not an expert on SCEV.

include/llvm/Transforms/Utils/UnrollLoop.h
35 ↗	(On Diff #74857)	Thank you for catching my mistake.

sbaranga added a subscriber: sbaranga. Oct 19 2016, 11:35 AM

sbaranga added inline comments.

include/llvm/Analysis/ScalarEvolution.h
1394 ↗	(On Diff #74857)	I've only taken a brief look at this, but it might be a nicer interface to add a getMaxOrZeroBackedgeTakenCount. That way we don't have to add the MaxOrZero everywhere (same in other places) and would look more like the existing code.
lib/Analysis/ScalarEvolution.cpp
5741 ↗	(On Diff #74857)	Does this work with more than one loop exit and with predicated loop exits? I'm not certain this would hold if we have at least one loop exit that doesn't dominate the latch, given the very strict definition (the backedge gets taken exactly 0 or MaxNotTaken times).

john.brawn added inline comments. Oct 20 2016, 3:26 AM

lib/Analysis/ScalarEvolution.cpp
5741 ↗	(On Diff #74857)	That's handled down on line 5759 - MaxOrZero is set to false on loops with more than one exit.

sbaranga added inline comments. Oct 20 2016, 6:50 AM

lib/Analysis/ScalarEvolution.cpp
5741 ↗	(On Diff #74857)	Good point, I've missed that. It should be fine then.
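
The single-exit condition discussed in this thread can be sketched as follows. This is a simplified model, assuming plain integers in place of SCEV expressions and ignoring the may-exit bookkeeping that the real computeBackedgeTakenCount performs; `ExitSketch`, `LoopSummary`, and `summarize` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a per-exit limit; the real ExitLimit holds SCEV
// expressions rather than plain integers.
struct ExitSketch {
  std::optional<unsigned> MaxNotTaken; // nullopt models "could not compute"
  bool MaxOrZero;                      // exit not taken exactly Max or 0 times
  bool DominatesLatch;                 // true for a "must exit"
};

struct LoopSummary {
  std::optional<unsigned> MaxBECount;
  bool MaxOrZero;
};

LoopSummary summarize(const std::vector<ExitSketch> &Exits) {
  std::optional<unsigned> MustExitMax;
  bool MustExitMaxOrZero = false;
  for (const ExitSketch &E : Exits) {
    if (!E.MaxNotTaken || !E.DominatesLatch)
      continue; // may-exits are ignored in this simplified sketch
    if (!MustExitMax) {
      // The first computable must-exit seeds both the bound and the flag.
      MustExitMax = E.MaxNotTaken;
      MustExitMaxOrZero = E.MaxOrZero;
    } else {
      MustExitMax = std::min(*MustExitMax, *E.MaxNotTaken);
    }
  }
  // The loop-level flag only survives when there is exactly one exiting block.
  bool MaxOrZero = MustExitMaxOrZero && Exits.size() == 1;
  return {MustExitMax, MaxOrZero};
}
```

With a second exiting block present, the flag is cleared even if the first exit on its own is max-or-zero, which is the behaviour the reply above refers to.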

Pass the MaxOrZero information through to loop unrolling using a separate function instead of via a pointer argument.
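
A minimal sketch of the two interface shapes mentioned in this update, using a mock analysis class. Everything here is hypothetical: `MockAnalysis`, `record`, `UnrollFlags`, and `chooseFlags` are illustrative names, loops are identified by integers instead of `Loop *`, the exact form of the earlier pointer-argument version is assumed, and the mapping onto `PreserveCondBr` and `PreserveOnlyFirst` is an assumption about how a caller might use the query.

```cpp
#include <cassert>
#include <map>

// Hypothetical mock; the real class is ScalarEvolution and it works on Loop
// pointers and SCEV expressions, not integers.
class MockAnalysis {
  struct Info {
    unsigned Max;
    bool MaxOrZero;
  };
  std::map<int, Info> Loops;

public:
  void record(int LoopId, unsigned Max, bool MaxOrZero) {
    Loops[LoopId] = {Max, MaxOrZero};
  }

  // Shape A (assumed form of the earlier diff): the flag comes back through
  // an extra pointer argument on the existing getter.
  unsigned getMaxBackedgeTakenCount(int LoopId, bool *MaxOrZero) const {
    const Info &I = Loops.at(LoopId);
    if (MaxOrZero)
      *MaxOrZero = I.MaxOrZero;
    return I.Max;
  }

  // Shape B: the getter keeps its signature and a separate query returns
  // the flag.
  unsigned getMaxBackedgeTakenCount(int LoopId) const {
    return Loops.at(LoopId).Max;
  }
  bool isBackedgeTakenCountMaxOrZero(int LoopId) const {
    return Loops.at(LoopId).MaxOrZero;
  }
};

struct UnrollFlags {
  bool PreserveCondBr;
  bool PreserveOnlyFirst;
};

// Assumed caller logic: the flag only matters when unrolling by an upper bound.
UnrollFlags chooseFlags(const MockAnalysis &A, int LoopId, bool UseUpperBound) {
  bool MaxOrZero = UseUpperBound && A.isBackedgeTakenCountMaxOrZero(LoopId);
  return {UseUpperBound, MaxOrZero};
}
```

With shape B, existing callers of the getter stay untouched and only code that cares about the max-or-zero property makes the second call.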

LGTM for the SCEV part.

Closed by commit rL284818: [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops (authored by john.brawn). Oct 21 2016, 4:18 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/Analysis/ScalarEvolution.h	32 lines
llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h	5 lines
llvm/trunk/lib/Analysis/ScalarEvolution.cpp	57 lines
llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp	28 lines
llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp	13 lines
llvm/trunk/test/Analysis/ScalarEvolution/trip-count13.ll	8 lines
llvm/trunk/test/Analysis/ScalarEvolution/trip-count14.ll	16 lines
llvm/trunk/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll	207 lines

Diff 75412

llvm/trunk/include/llvm/Analysis/ScalarEvolution.h

Show First 20 Lines • Show All 542 Lines • ▼ Show 20 Lines	private:
/// predicate by splitting it into a set of independent predicates.		/// predicate by splitting it into a set of independent predicates.
bool ProvingSplitPredicate;		bool ProvingSplitPredicate;

/// Information about the number of loop iterations for which a loop exit's		/// Information about the number of loop iterations for which a loop exit's
/// branch condition evaluates to the not-taken path. This is a temporary		/// branch condition evaluates to the not-taken path. This is a temporary
/// pair of exact and max expressions that are eventually summarized in		/// pair of exact and max expressions that are eventually summarized in
/// ExitNotTakenInfo and BackedgeTakenInfo.		/// ExitNotTakenInfo and BackedgeTakenInfo.
struct ExitLimit {		struct ExitLimit {
const SCEV *ExactNotTaken;		const SCEV *ExactNotTaken; //< The exit is not taken exactly this many times
const SCEV *MaxNotTaken;		const SCEV *MaxNotTaken; //< The exit is not taken at most this many times
		bool MaxOrZero; //< Not taken either exactly MaxNotTaken or zero times

/// A set of predicate guards for this ExitLimit. The result is only valid		/// A set of predicate guards for this ExitLimit. The result is only valid
/// if all of the predicates in \c Predicates evaluate to 'true' at		/// if all of the predicates in \c Predicates evaluate to 'true' at
/// run-time.		/// run-time.
SmallPtrSet<const SCEVPredicate *, 4> Predicates;		SmallPtrSet<const SCEVPredicate *, 4> Predicates;

void addPredicate(const SCEVPredicate *P) {		void addPredicate(const SCEVPredicate *P) {
assert(!isa<SCEVUnionPredicate>(P) && "Only add leaf predicates here!");		assert(!isa<SCEVUnionPredicate>(P) && "Only add leaf predicates here!");
Predicates.insert(P);		Predicates.insert(P);
}		}

/*implicit*/ ExitLimit(const SCEV *E) : ExactNotTaken(E), MaxNotTaken(E) {}		/*implicit*/ ExitLimit(const SCEV *E)
		: ExactNotTaken(E), MaxNotTaken(E), MaxOrZero(false) {}

ExitLimit(		ExitLimit(
const SCEV *E, const SCEV *M,		const SCEV *E, const SCEV *M, bool MaxOrZero,
ArrayRef<const SmallPtrSetImpl<const SCEVPredicate *> *> PredSetList)		ArrayRef<const SmallPtrSetImpl<const SCEVPredicate *> *> PredSetList)
: ExactNotTaken(E), MaxNotTaken(M) {		: ExactNotTaken(E), MaxNotTaken(M), MaxOrZero(MaxOrZero) {
assert((isa<SCEVCouldNotCompute>(ExactNotTaken) ||		assert((isa<SCEVCouldNotCompute>(ExactNotTaken) ||
!isa<SCEVCouldNotCompute>(MaxNotTaken)) &&		!isa<SCEVCouldNotCompute>(MaxNotTaken)) &&
"Exact is not allowed to be less precise than Max");		"Exact is not allowed to be less precise than Max");
for (auto *PredSet : PredSetList)		for (auto *PredSet : PredSetList)
for (auto *P : *PredSet)		for (auto *P : *PredSet)
addPredicate(P);		addPredicate(P);
}		}

ExitLimit(const SCEV *E, const SCEV *M,		ExitLimit(const SCEV *E, const SCEV *M, bool MaxOrZero,
const SmallPtrSetImpl<const SCEVPredicate *> &PredSet)		const SmallPtrSetImpl<const SCEVPredicate *> &PredSet)
: ExitLimit(E, M, {&PredSet}) {}		: ExitLimit(E, M, MaxOrZero, {&PredSet}) {}

ExitLimit(const SCEV *E, const SCEV *M) : ExitLimit(E, M, None) {}		ExitLimit(const SCEV *E, const SCEV *M, bool MaxOrZero)
		: ExitLimit(E, M, MaxOrZero, None) {}

/// Test whether this ExitLimit contains any computed information, or		/// Test whether this ExitLimit contains any computed information, or
/// whether it's all SCEVCouldNotCompute values.		/// whether it's all SCEVCouldNotCompute values.
bool hasAnyInfo() const {		bool hasAnyInfo() const {
return !isa<SCEVCouldNotCompute>(ExactNotTaken) ||		return !isa<SCEVCouldNotCompute>(ExactNotTaken) ||
!isa<SCEVCouldNotCompute>(MaxNotTaken);		!isa<SCEVCouldNotCompute>(MaxNotTaken);
}		}

Show All 32 Lines	class BackedgeTakenInfo {
/// least maximum backedge-taken count of the loop that is known, or a		/// least maximum backedge-taken count of the loop that is known, or a
/// SCEVCouldNotCompute. This expression is only valid if the predicates		/// SCEVCouldNotCompute. This expression is only valid if the predicates
/// associated with all loop exits are true.		/// associated with all loop exits are true.
///		///
/// The integer part of \c MaxAndComplete is a boolean indicating if \c		/// The integer part of \c MaxAndComplete is a boolean indicating if \c
/// ExitNotTaken has an element for every exiting block in the loop.		/// ExitNotTaken has an element for every exiting block in the loop.
PointerIntPair<const SCEV *, 1> MaxAndComplete;		PointerIntPair<const SCEV *, 1> MaxAndComplete;

		/// True iff the backedge is taken either exactly Max or zero times.
		bool MaxOrZero;

/// \name Helper projection functions on \c MaxAndComplete.		/// \name Helper projection functions on \c MaxAndComplete.
/// @{		/// @{
bool isComplete() const { return MaxAndComplete.getInt(); }		bool isComplete() const { return MaxAndComplete.getInt(); }
const SCEV *getMax() const { return MaxAndComplete.getPointer(); }		const SCEV *getMax() const { return MaxAndComplete.getPointer(); }
/// @}		/// @}

public:		public:
BackedgeTakenInfo() : MaxAndComplete(nullptr, 0) {}		BackedgeTakenInfo() : MaxAndComplete(nullptr, 0) {}

BackedgeTakenInfo(BackedgeTakenInfo &&) = default;		BackedgeTakenInfo(BackedgeTakenInfo &&) = default;
BackedgeTakenInfo &operator=(BackedgeTakenInfo &&) = default;		BackedgeTakenInfo &operator=(BackedgeTakenInfo &&) = default;

typedef std::pair<BasicBlock *, ExitLimit> EdgeExitInfo;		typedef std::pair<BasicBlock *, ExitLimit> EdgeExitInfo;

/// Initialize BackedgeTakenInfo from a list of exact exit counts.		/// Initialize BackedgeTakenInfo from a list of exact exit counts.
BackedgeTakenInfo(SmallVectorImpl<EdgeExitInfo> &&ExitCounts, bool Complete,		BackedgeTakenInfo(SmallVectorImpl<EdgeExitInfo> &&ExitCounts, bool Complete,
const SCEV *MaxCount);		const SCEV *MaxCount, bool MaxOrZero);

/// Test whether this BackedgeTakenInfo contains any computed information,		/// Test whether this BackedgeTakenInfo contains any computed information,
/// or whether it's all SCEVCouldNotCompute values.		/// or whether it's all SCEVCouldNotCompute values.
bool hasAnyInfo() const {		bool hasAnyInfo() const {
return !ExitNotTaken.empty() || !isa<SCEVCouldNotCompute>(getMax());		return !ExitNotTaken.empty() || !isa<SCEVCouldNotCompute>(getMax());
}		}

/// Test whether this BackedgeTakenInfo contains complete information.		/// Test whether this BackedgeTakenInfo contains complete information.
Show All 22 Lines	public:
/// edge, or SCEVCouldNotCompute. The loop is guaranteed not to exit via		/// edge, or SCEVCouldNotCompute. The loop is guaranteed not to exit via
/// this block before this number of iterations, but may exit via another		/// this block before this number of iterations, but may exit via another
/// block.		/// block.
const SCEV *getExact(BasicBlock *ExitingBlock, ScalarEvolution *SE) const;		const SCEV *getExact(BasicBlock *ExitingBlock, ScalarEvolution *SE) const;

/// Get the max backedge taken count for the loop.		/// Get the max backedge taken count for the loop.
const SCEV *getMax(ScalarEvolution *SE) const;		const SCEV *getMax(ScalarEvolution *SE) const;

		/// Return true if the number of times this backedge is taken is either the
		/// value returned by getMax or zero.
		bool isMaxOrZero(ScalarEvolution *SE) const;

/// Return true if any backedge taken count expressions refer to the given		/// Return true if any backedge taken count expressions refer to the given
/// subexpression.		/// subexpression.
bool hasOperand(const SCEV *S, ScalarEvolution *SE) const;		bool hasOperand(const SCEV *S, ScalarEvolution *SE) const;

/// Invalidate this result and free associated memory.		/// Invalidate this result and free associated memory.
void clear();		void clear();
};		};

▲ Show 20 Lines • Show All 655 Lines • ▼ Show 20 Lines	public:
/// checks and can be used to perform loop versioning.		/// checks and can be used to perform loop versioning.
const SCEV *getPredicatedBackedgeTakenCount(const Loop *L,		const SCEV *getPredicatedBackedgeTakenCount(const Loop *L,
SCEVUnionPredicate &Predicates);		SCEVUnionPredicate &Predicates);

/// Similar to getBackedgeTakenCount, except return the least SCEV value		/// Similar to getBackedgeTakenCount, except return the least SCEV value
/// that is known never to be less than the actual backedge taken count.		/// that is known never to be less than the actual backedge taken count.
const SCEV *getMaxBackedgeTakenCount(const Loop *L);		const SCEV *getMaxBackedgeTakenCount(const Loop *L);

		/// Return true if the backedge taken count is either the value returned by
		/// getMaxBackedgeTakenCount or zero.
		bool isBackedgeTakenCountMaxOrZero(const Loop *L);

/// Return true if the specified loop has an analyzable loop-invariant		/// Return true if the specified loop has an analyzable loop-invariant
/// backedge-taken count.		/// backedge-taken count.
bool hasLoopInvariantBackedgeTakenCount(const Loop *L);		bool hasLoopInvariantBackedgeTakenCount(const Loop *L);

/// This method should be called by the client when it has changed a loop in		/// This method should be called by the client when it has changed a loop in
/// a way that may effect ScalarEvolution's ability to compute a trip count,		/// a way that may effect ScalarEvolution's ability to compute a trip count,
/// or if the loop is deleted. This call is potentially expensive for large		/// or if the loop is deleted. This call is potentially expensive for large
/// loop bodies.		/// loop bodies.

llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h

	Show All 26 Lines
	class LPPassManager;			class LPPassManager;
	class MDNode;			class MDNode;
	class Pass;			class Pass;
	class OptimizationRemarkEmitter;			class OptimizationRemarkEmitter;
	class ScalarEvolution;			class ScalarEvolution;

	bool UnrollLoop(Loop *L, unsigned Count, unsigned TripCount, bool Force,			bool UnrollLoop(Loop *L, unsigned Count, unsigned TripCount, bool Force,
	bool AllowRuntime, bool AllowExpensiveTripCount,			bool AllowRuntime, bool AllowExpensiveTripCount,
	bool UseUpperBound, unsigned TripMultiple, LoopInfo *LI,			bool PreserveCondBr, bool PreserveOnlyFirst,
	ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,			unsigned TripMultiple, LoopInfo *LI, ScalarEvolution *SE,
				DominatorTree *DT, AssumptionCache *AC,
	OptimizationRemarkEmitter *ORE, bool PreserveLCSSA);			OptimizationRemarkEmitter *ORE, bool PreserveLCSSA);

	bool UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,			bool UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
	bool AllowExpensiveTripCount,			bool AllowExpensiveTripCount,
	bool UseEpilogRemainder, LoopInfo *LI,			bool UseEpilogRemainder, LoopInfo *LI,
	ScalarEvolution *SE, DominatorTree *DT,			ScalarEvolution *SE, DominatorTree *DT,
	bool PreserveLCSSA);			bool PreserveLCSSA);

	MDNode *GetUnrollMetadata(MDNode *LoopID, StringRef Name);			MDNode *GetUnrollMetadata(MDNode *LoopID, StringRef Name);
	}			}

	#endif			#endif

llvm/trunk/lib/Analysis/ScalarEvolution.cpp

Show First 20 Lines • Show All 5,418 Lines • ▼ Show 20 Lines
}		}

/// Similar to getBackedgeTakenCount, except return the least SCEV value that is		/// Similar to getBackedgeTakenCount, except return the least SCEV value that is
/// known never to be less than the actual backedge taken count.		/// known never to be less than the actual backedge taken count.
const SCEV *ScalarEvolution::getMaxBackedgeTakenCount(const Loop *L) {		const SCEV *ScalarEvolution::getMaxBackedgeTakenCount(const Loop *L) {
return getBackedgeTakenInfo(L).getMax(this);		return getBackedgeTakenInfo(L).getMax(this);
}		}

		bool ScalarEvolution::isBackedgeTakenCountMaxOrZero(const Loop *L) {
		return getBackedgeTakenInfo(L).isMaxOrZero(this);
		}

/// Push PHI nodes in the header of the given loop onto the given Worklist.		/// Push PHI nodes in the header of the given loop onto the given Worklist.
static void		static void
PushLoopPHIs(const Loop *L, SmallVectorImpl<Instruction *> &Worklist) {		PushLoopPHIs(const Loop *L, SmallVectorImpl<Instruction *> &Worklist) {
BasicBlock *Header = L->getHeader();		BasicBlock *Header = L->getHeader();

// Push all Loop-header PHIs onto the Worklist stack.		// Push all Loop-header PHIs onto the Worklist stack.
for (BasicBlock::iterator I = Header->begin();		for (BasicBlock::iterator I = Header->begin();
PHINode *PN = dyn_cast<PHINode>(I); ++I)		PHINode *PN = dyn_cast<PHINode>(I); ++I)
▲ Show 20 Lines • Show All 216 Lines • ▼ Show 20 Lines	ScalarEvolution::BackedgeTakenInfo::getMax(ScalarEvolution *SE) const {
};		};

if (any_of(ExitNotTaken, PredicateNotAlwaysTrue) || !getMax())		if (any_of(ExitNotTaken, PredicateNotAlwaysTrue) || !getMax())
return SE->getCouldNotCompute();		return SE->getCouldNotCompute();

return getMax();		return getMax();
}		}

		bool ScalarEvolution::BackedgeTakenInfo::isMaxOrZero(ScalarEvolution *SE) const {
		auto PredicateNotAlwaysTrue = [](const ExitNotTakenInfo &ENT) {
		return !ENT.hasAlwaysTruePredicate();
		};
		return MaxOrZero && !any_of(ExitNotTaken, PredicateNotAlwaysTrue);
		}

bool ScalarEvolution::BackedgeTakenInfo::hasOperand(const SCEV *S,		bool ScalarEvolution::BackedgeTakenInfo::hasOperand(const SCEV *S,
ScalarEvolution *SE) const {		ScalarEvolution *SE) const {
if (getMax() && getMax() != SE->getCouldNotCompute() &&		if (getMax() && getMax() != SE->getCouldNotCompute() &&
SE->hasOperand(getMax(), S))		SE->hasOperand(getMax(), S))
return true;		return true;

for (auto &ENT : ExitNotTaken)		for (auto &ENT : ExitNotTaken)
if (ENT.ExactNotTaken != SE->getCouldNotCompute() &&		if (ENT.ExactNotTaken != SE->getCouldNotCompute() &&
SE->hasOperand(ENT.ExactNotTaken, S))		SE->hasOperand(ENT.ExactNotTaken, S))
return true;		return true;

return false;		return false;
}		}

/// Allocate memory for BackedgeTakenInfo and copy the not-taken count of each		/// Allocate memory for BackedgeTakenInfo and copy the not-taken count of each
/// computable exit into a persistent ExitNotTakenInfo array.		/// computable exit into a persistent ExitNotTakenInfo array.
ScalarEvolution::BackedgeTakenInfo::BackedgeTakenInfo(		ScalarEvolution::BackedgeTakenInfo::BackedgeTakenInfo(
SmallVectorImpl<ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo>		SmallVectorImpl<ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo>
&&ExitCounts,		&&ExitCounts,
bool Complete, const SCEV *MaxCount)		bool Complete, const SCEV *MaxCount, bool MaxOrZero)
: MaxAndComplete(MaxCount, Complete) {		: MaxAndComplete(MaxCount, Complete), MaxOrZero(MaxOrZero) {
typedef ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo EdgeExitInfo;		typedef ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo EdgeExitInfo;
ExitNotTaken.reserve(ExitCounts.size());		ExitNotTaken.reserve(ExitCounts.size());
std::transform(		std::transform(
ExitCounts.begin(), ExitCounts.end(), std::back_inserter(ExitNotTaken),		ExitCounts.begin(), ExitCounts.end(), std::back_inserter(ExitNotTaken),
[&](const EdgeExitInfo &EEI) {		[&](const EdgeExitInfo &EEI) {
BasicBlock *ExitBB = EEI.first;		BasicBlock *ExitBB = EEI.first;
const ExitLimit &EL = EEI.second;		const ExitLimit &EL = EEI.second;
if (EL.Predicates.empty())		if (EL.Predicates.empty())
Show All 21 Lines	ScalarEvolution::computeBackedgeTakenCount(const Loop *L,

typedef ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo EdgeExitInfo;		typedef ScalarEvolution::BackedgeTakenInfo::EdgeExitInfo EdgeExitInfo;

SmallVector<EdgeExitInfo, 4> ExitCounts;		SmallVector<EdgeExitInfo, 4> ExitCounts;
bool CouldComputeBECount = true;		bool CouldComputeBECount = true;
BasicBlock *Latch = L->getLoopLatch(); // may be NULL.		BasicBlock *Latch = L->getLoopLatch(); // may be NULL.
const SCEV *MustExitMaxBECount = nullptr;		const SCEV *MustExitMaxBECount = nullptr;
const SCEV *MayExitMaxBECount = nullptr;		const SCEV *MayExitMaxBECount = nullptr;
		bool MustExitMaxOrZero = false;

// Compute the ExitLimit for each loop exit. Use this to populate ExitCounts		// Compute the ExitLimit for each loop exit. Use this to populate ExitCounts
// and compute maxBECount.		// and compute maxBECount.
// Do a union of all the predicates here.		// Do a union of all the predicates here.
for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {		for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {
BasicBlock *ExitBB = ExitingBlocks[i];		BasicBlock *ExitBB = ExitingBlocks[i];
ExitLimit EL = computeExitLimit(L, ExitBB, AllowPredicates);		ExitLimit EL = computeExitLimit(L, ExitBB, AllowPredicates);

Show All 16 Lines	for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {
// If the exit dominates the loop latch, it is a LoopMustExit otherwise it		// If the exit dominates the loop latch, it is a LoopMustExit otherwise it
// is a LoopMayExit. If any computable LoopMustExit is found, then		// is a LoopMayExit. If any computable LoopMustExit is found, then
// MaxBECount is the minimum EL.MaxNotTaken of computable		// MaxBECount is the minimum EL.MaxNotTaken of computable
// LoopMustExits. Otherwise, MaxBECount is conservatively the maximum		// LoopMustExits. Otherwise, MaxBECount is conservatively the maximum
// EL.MaxNotTaken, where CouldNotCompute is considered greater than any		// EL.MaxNotTaken, where CouldNotCompute is considered greater than any
// computable EL.MaxNotTaken.		// computable EL.MaxNotTaken.
if (EL.MaxNotTaken != getCouldNotCompute() && Latch &&		if (EL.MaxNotTaken != getCouldNotCompute() && Latch &&
DT.dominates(ExitBB, Latch)) {		DT.dominates(ExitBB, Latch)) {
if (!MustExitMaxBECount)		if (!MustExitMaxBECount) {
MustExitMaxBECount = EL.MaxNotTaken;		MustExitMaxBECount = EL.MaxNotTaken;
else {		MustExitMaxOrZero = EL.MaxOrZero;
		} else {
MustExitMaxBECount =		MustExitMaxBECount =
getUMinFromMismatchedTypes(MustExitMaxBECount, EL.MaxNotTaken);		getUMinFromMismatchedTypes(MustExitMaxBECount, EL.MaxNotTaken);
}		}
} else if (MayExitMaxBECount != getCouldNotCompute()) {		} else if (MayExitMaxBECount != getCouldNotCompute()) {
if (!MayExitMaxBECount || EL.MaxNotTaken == getCouldNotCompute())		if (!MayExitMaxBECount || EL.MaxNotTaken == getCouldNotCompute())
MayExitMaxBECount = EL.MaxNotTaken;		MayExitMaxBECount = EL.MaxNotTaken;
else {		else {
MayExitMaxBECount =		MayExitMaxBECount =
getUMaxFromMismatchedTypes(MayExitMaxBECount, EL.MaxNotTaken);		getUMaxFromMismatchedTypes(MayExitMaxBECount, EL.MaxNotTaken);
}		}
}		}
}		}
const SCEV *MaxBECount = MustExitMaxBECount ? MustExitMaxBECount :		const SCEV *MaxBECount = MustExitMaxBECount ? MustExitMaxBECount :
(MayExitMaxBECount ? MayExitMaxBECount : getCouldNotCompute());		(MayExitMaxBECount ? MayExitMaxBECount : getCouldNotCompute());
		// The loop backedge will be taken the maximum or zero times if there's
		// a single exit that must be taken the maximum or zero times.
		bool MaxOrZero = (MustExitMaxOrZero && ExitingBlocks.size() == 1);
return BackedgeTakenInfo(std::move(ExitCounts), CouldComputeBECount,		return BackedgeTakenInfo(std::move(ExitCounts), CouldComputeBECount,
MaxBECount);		MaxBECount, MaxOrZero);
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::computeExitLimit(const Loop *L, BasicBlock *ExitingBlock,		ScalarEvolution::computeExitLimit(const Loop *L, BasicBlock *ExitingBlock,
bool AllowPredicates) {		bool AllowPredicates) {

// Okay, we've chosen an exiting block. See what condition causes us to exit		// Okay, we've chosen an exiting block. See what condition causes us to exit
// at this block and remember the exit block and whether all other targets		// at this block and remember the exit block and whether all other targets
▲ Show 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	if (BO->getOpcode() == Instruction::And) {
// to be more aggressive when computing BECount than when computing		// to be more aggressive when computing BECount than when computing
// MaxBECount. In these cases it is possible for EL0.ExactNotTaken and		// MaxBECount. In these cases it is possible for EL0.ExactNotTaken and
// EL1.ExactNotTaken to match, but for EL0.MaxNotTaken and EL1.MaxNotTaken		// EL1.ExactNotTaken to match, but for EL0.MaxNotTaken and EL1.MaxNotTaken
// to not.		// to not.
if (isa<SCEVCouldNotCompute>(MaxBECount) &&		if (isa<SCEVCouldNotCompute>(MaxBECount) &&
!isa<SCEVCouldNotCompute>(BECount))		!isa<SCEVCouldNotCompute>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount, {&EL0.Predicates, &EL1.Predicates});		return ExitLimit(BECount, MaxBECount, false,
		{&EL0.Predicates, &EL1.Predicates});
}		}
if (BO->getOpcode() == Instruction::Or) {		if (BO->getOpcode() == Instruction::Or) {
// Recurse on the operands of the or.		// Recurse on the operands of the or.
bool EitherMayExit = L->contains(FBB);		bool EitherMayExit = L->contains(FBB);
ExitLimit EL0 = computeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,		ExitLimit EL0 = computeExitLimitFromCond(L, BO->getOperand(0), TBB, FBB,
ControlsExit && !EitherMayExit,		ControlsExit && !EitherMayExit,
AllowPredicates);		AllowPredicates);
ExitLimit EL1 = computeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,		ExitLimit EL1 = computeExitLimitFromCond(L, BO->getOperand(1), TBB, FBB,
Show All 22 Lines	if (BO->getOpcode() == Instruction::Or) {
// For now, be conservative.		// For now, be conservative.
assert(L->contains(TBB) && "Loop block has no successor in loop!");		assert(L->contains(TBB) && "Loop block has no successor in loop!");
if (EL0.MaxNotTaken == EL1.MaxNotTaken)		if (EL0.MaxNotTaken == EL1.MaxNotTaken)
MaxBECount = EL0.MaxNotTaken;		MaxBECount = EL0.MaxNotTaken;
if (EL0.ExactNotTaken == EL1.ExactNotTaken)		if (EL0.ExactNotTaken == EL1.ExactNotTaken)
BECount = EL0.ExactNotTaken;		BECount = EL0.ExactNotTaken;
}		}

return ExitLimit(BECount, MaxBECount, {&EL0.Predicates, &EL1.Predicates});		return ExitLimit(BECount, MaxBECount, false,
		{&EL0.Predicates, &EL1.Predicates});
}		}
}		}

// With an icmp, it may be feasible to compute an exact backedge-taken count.		// With an icmp, it may be feasible to compute an exact backedge-taken count.
// Proceed to the next level to examine the icmp.		// Proceed to the next level to examine the icmp.
if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond)) {		if (ICmpInst *ExitCondICmp = dyn_cast<ICmpInst>(ExitCond)) {
ExitLimit EL =		ExitLimit EL =
computeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, ControlsExit);		computeExitLimitFromICmp(L, ExitCondICmp, TBB, FBB, ControlsExit);
▲ Show 20 Lines • Show All 368 Lines • ▼ Show 20 Lines	auto *Result =
ConstantFoldCompareInstOperands(Pred, StableValue, RHS, DL, &TLI);		ConstantFoldCompareInstOperands(Pred, StableValue, RHS, DL, &TLI);
assert(Result->getType()->isIntegerTy(1) &&		assert(Result->getType()->isIntegerTy(1) &&
"Otherwise cannot be an operand to a branch instruction");		"Otherwise cannot be an operand to a branch instruction");

if (Result->isZeroValue()) {		if (Result->isZeroValue()) {
unsigned BitWidth = getTypeSizeInBits(RHS->getType());		unsigned BitWidth = getTypeSizeInBits(RHS->getType());
const SCEV *UpperBound =		const SCEV *UpperBound =
getConstant(getEffectiveSCEVType(RHS->getType()), BitWidth);		getConstant(getEffectiveSCEVType(RHS->getType()), BitWidth);
return ExitLimit(getCouldNotCompute(), UpperBound);		return ExitLimit(getCouldNotCompute(), UpperBound, false);
}		}

return getCouldNotCompute();		return getCouldNotCompute();
}		}

/// Return true if we can constant fold an instruction of the specified type,		/// Return true if we can constant fold an instruction of the specified type,
/// assuming that all operands were constants.		/// assuming that all operands were constants.
static bool CanConstantFold(const Instruction *I) {		static bool CanConstantFold(const Instruction *I) {
▲ Show 20 Lines • Show All 779 Lines • ▼ Show 20 Lines	if (auto Roots = SolveQuadraticEquation(AddRec, *this)) {
if (!CB->getZExtValue())		if (!CB->getZExtValue())
std::swap(R1, R2); // R1 is the minimum root now.		std::swap(R1, R2); // R1 is the minimum root now.

// We can only use this value if the chrec ends up with an exact zero		// We can only use this value if the chrec ends up with an exact zero
// value at this index. When solving for "X*X != 5", for example, we		// value at this index. When solving for "X*X != 5", for example, we
// should not accept a root of 2.		// should not accept a root of 2.
const SCEV *Val = AddRec->evaluateAtIteration(R1, *this);		const SCEV *Val = AddRec->evaluateAtIteration(R1, *this);
if (Val->isZero())		if (Val->isZero())
return ExitLimit(R1, R1, Predicates); // We found a quadratic root!		// We found a quadratic root!
		return ExitLimit(R1, R1, false, Predicates);
}		}
}		}
return getCouldNotCompute();		return getCouldNotCompute();
}		}

// Otherwise we can only handle this if it is affine.		// Otherwise we can only handle this if it is affine.
if (!AddRec->isAffine())		if (!AddRec->isAffine())
return getCouldNotCompute();		return getCouldNotCompute();
Show All 40 Lines	if (StepC->getValue()->equalsInt(1) || StepC->getValue()->isAllOnesValue()) {
if (!CountDown && CR.getUnsignedMin().isMinValue())		if (!CountDown && CR.getUnsignedMin().isMinValue())
// When counting up, the worst starting value is 1, not 0.		// When counting up, the worst starting value is 1, not 0.
MaxBECount = CR.getUnsignedMax().isMinValue()		MaxBECount = CR.getUnsignedMax().isMinValue()
? getConstant(APInt::getMinValue(CR.getBitWidth()))		? getConstant(APInt::getMinValue(CR.getBitWidth()))
: getConstant(APInt::getMaxValue(CR.getBitWidth()));		: getConstant(APInt::getMaxValue(CR.getBitWidth()));
else		else
MaxBECount = getConstant(CountDown ? CR.getUnsignedMax()		MaxBECount = getConstant(CountDown ? CR.getUnsignedMax()
: -CR.getUnsignedMin());		: -CR.getUnsignedMin());
return ExitLimit(Distance, MaxBECount, Predicates);		return ExitLimit(Distance, MaxBECount, false, Predicates);
}		}

// As a special case, handle the instance where Step is a positive power of		// As a special case, handle the instance where Step is a positive power of
// two. In this case, determining whether Step divides Distance evenly can be		// two. In this case, determining whether Step divides Distance evenly can be
// done by counting and comparing the number of trailing zeros of Step and		// done by counting and comparing the number of trailing zeros of Step and
// Distance.		// Distance.
if (!CountDown) {		if (!CountDown) {
const APInt &StepV = StepC->getAPInt();		const APInt &StepV = StepC->getAPInt();
Show All 38 Lines	if (StepV.isPowerOf2() &&
// and a zero extend.		// and a zero extend.

unsigned NarrowWidth = StepV.getBitWidth() - StepV.countTrailingZeros();		unsigned NarrowWidth = StepV.getBitWidth() - StepV.countTrailingZeros();
auto *NarrowTy = IntegerType::get(getContext(), NarrowWidth);		auto *NarrowTy = IntegerType::get(getContext(), NarrowWidth);
auto *WideTy = Distance->getType();		auto *WideTy = Distance->getType();

const SCEV *Limit =		const SCEV *Limit =
getZeroExtendExpr(getTruncateExpr(ModuloResult, NarrowTy), WideTy);		getZeroExtendExpr(getTruncateExpr(ModuloResult, NarrowTy), WideTy);
return ExitLimit(Limit, Limit, Predicates);		return ExitLimit(Limit, Limit, false, Predicates);
}		}
}		}

// If the condition controls loop exit (the loop exits only if the expression		// If the condition controls loop exit (the loop exits only if the expression
// is true) and the addition is no-wrap we can use unsigned divide to		// is true) and the addition is no-wrap we can use unsigned divide to
// compute the backedge count. In this case, the step may not divide the		// compute the backedge count. In this case, the step may not divide the
// distance, but we don't care because if the condition is "missed" the loop		// distance, but we don't care because if the condition is "missed" the loop
// will have undefined behavior due to wrapping.		// will have undefined behavior due to wrapping.
if (ControlsExit && AddRec->hasNoSelfWrap() &&		if (ControlsExit && AddRec->hasNoSelfWrap() &&
loopHasNoAbnormalExits(AddRec->getLoop())) {		loopHasNoAbnormalExits(AddRec->getLoop())) {
const SCEV *Exact =		const SCEV *Exact =
getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);		getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);
return ExitLimit(Exact, Exact, Predicates);		return ExitLimit(Exact, Exact, false, Predicates);
}		}

// Then, try to solve the above equation provided that Start is constant.		// Then, try to solve the above equation provided that Start is constant.
if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start)) {		if (const SCEVConstant *StartC = dyn_cast<SCEVConstant>(Start)) {
const SCEV *E = SolveLinEquationWithOverflow(		const SCEV *E = SolveLinEquationWithOverflow(
StepC->getValue()->getValue(), -StartC->getValue()->getValue(), *this);		StepC->getValue()->getValue(), -StartC->getValue()->getValue(), *this);
return ExitLimit(E, E, Predicates);		return ExitLimit(E, E, false, Predicates);
}		}
return getCouldNotCompute();		return getCouldNotCompute();
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::howFarToNonZero(const SCEV *V, const Loop *L) {		ScalarEvolution::howFarToNonZero(const SCEV *V, const Loop *L) {
// Loops that look like: while (X == 0) are very strange indeed. We don't		// Loops that look like: while (X == 0) are very strange indeed. We don't
// handle them yet except for the trivial case. This could be expanded in the		// handle them yet except for the trivial case. This could be expanded in the
▲ Show 20 Lines • Show All 1,425 Lines • ▼ Show 20 Lines	ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
if (isLoopEntryGuardedByCond(L, Cond, getMinusSCEV(Start, Stride), RHS))		if (isLoopEntryGuardedByCond(L, Cond, getMinusSCEV(Start, Stride), RHS))
BECount = BECountIfBackedgeTaken;		BECount = BECountIfBackedgeTaken;
else {		else {
End = IsSigned ? getSMaxExpr(RHS, Start) : getUMaxExpr(RHS, Start);		End = IsSigned ? getSMaxExpr(RHS, Start) : getUMaxExpr(RHS, Start);
BECount = computeBECount(getMinusSCEV(End, Start), Stride, false);		BECount = computeBECount(getMinusSCEV(End, Start), Stride, false);
}		}

const SCEV *MaxBECount;		const SCEV *MaxBECount;
		bool MaxOrZero = false;
if (isa<SCEVConstant>(BECount))		if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else if (isa<SCEVConstant>(BECountIfBackedgeTaken))		else if (isa<SCEVConstant>(BECountIfBackedgeTaken)) {
// If we know exactly how many times the backedge will be taken if it's		// If we know exactly how many times the backedge will be taken if it's
// taken at least once, then the backedge count will either be that or		// taken at least once, then the backedge count will either be that or
// zero.		// zero.
MaxBECount = BECountIfBackedgeTaken;		MaxBECount = BECountIfBackedgeTaken;
else {		MaxOrZero = true;
		} else {
// Calculate the maximum backedge count based on the range of values		// Calculate the maximum backedge count based on the range of values
// permitted by Start, End, and Stride.		// permitted by Start, End, and Stride.
APInt MinStart = IsSigned ? getSignedRange(Start).getSignedMin()		APInt MinStart = IsSigned ? getSignedRange(Start).getSignedMin()
: getUnsignedRange(Start).getUnsignedMin();		: getUnsignedRange(Start).getUnsignedMin();

unsigned BitWidth = getTypeSizeInBits(LHS->getType());		unsigned BitWidth = getTypeSizeInBits(LHS->getType());

APInt StrideForMaxBECount;		APInt StrideForMaxBECount;
Show All 20 Lines	else if (isa<SCEVConstant>(BECountIfBackedgeTaken)) {

MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),		MaxBECount = computeBECount(getConstant(MaxEnd - MinStart),
getConstant(StrideForMaxBECount), false);		getConstant(StrideForMaxBECount), false);
}		}

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount, Predicates);		return ExitLimit(BECount, MaxBECount, MaxOrZero, Predicates);
}		}

ScalarEvolution::ExitLimit		ScalarEvolution::ExitLimit
ScalarEvolution::howManyGreaterThans(const SCEV *LHS, const SCEV *RHS,		ScalarEvolution::howManyGreaterThans(const SCEV *LHS, const SCEV *RHS,
const Loop *L, bool IsSigned,		const Loop *L, bool IsSigned,
bool ControlsExit, bool AllowPredicates) {		bool ControlsExit, bool AllowPredicates) {
SmallPtrSet<const SCEVPredicate *, 4> Predicates;		SmallPtrSet<const SCEVPredicate *, 4> Predicates;
// We handle only IV > Invariant		// We handle only IV > Invariant
▲ Show 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	if (isa<SCEVConstant>(BECount))
MaxBECount = BECount;		MaxBECount = BECount;
else		else
MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),		MaxBECount = computeBECount(getConstant(MaxStart - MinEnd),
getConstant(MinStride), false);		getConstant(MinStride), false);

if (isa<SCEVCouldNotCompute>(MaxBECount))		if (isa<SCEVCouldNotCompute>(MaxBECount))
MaxBECount = BECount;		MaxBECount = BECount;

return ExitLimit(BECount, MaxBECount, Predicates);		return ExitLimit(BECount, MaxBECount, false, Predicates);
}		}

const SCEV *SCEVAddRecExpr::getNumIterationsInRange(const ConstantRange &Range,		const SCEV *SCEVAddRecExpr::getNumIterationsInRange(const ConstantRange &Range,
ScalarEvolution &SE) const {		ScalarEvolution &SE) const {
if (Range.isFullSet()) // Infinite loop.		if (Range.isFullSet()) // Infinite loop.
return SE.getCouldNotCompute();		return SE.getCouldNotCompute();

// If the start is a non-zero constant, shift the range to simplify things.		// If the start is a non-zero constant, shift the range to simplify things.
▲ Show 20 Lines • Show All 765 Lines • ▼ Show 20 Lines	static void PrintLoopInfo(raw_ostream &OS, ScalarEvolution *SE,

OS << "\n"		OS << "\n"
"Loop ";		"Loop ";
L->getHeader()->printAsOperand(OS, /*PrintType=*/false);		L->getHeader()->printAsOperand(OS, /*PrintType=*/false);
OS << ": ";		OS << ": ";

if (!isa<SCEVCouldNotCompute>(SE->getMaxBackedgeTakenCount(L))) {		if (!isa<SCEVCouldNotCompute>(SE->getMaxBackedgeTakenCount(L))) {
OS << "max backedge-taken count is " << *SE->getMaxBackedgeTakenCount(L);		OS << "max backedge-taken count is " << *SE->getMaxBackedgeTakenCount(L);
		if (SE->isBackedgeTakenCountMaxOrZero(L))
		OS << ", actual taken count either this or zero.";
} else {		} else {
OS << "Unpredictable max backedge-taken count. ";		OS << "Unpredictable max backedge-taken count. ";
}		}

OS << "\n"		OS << "\n"
"Loop ";		"Loop ";
L->getHeader()->printAsOperand(OS, /*PrintType=*/false);		L->getHeader()->printAsOperand(OS, /*PrintType=*/false);
OS << ": ";		OS << ": ";
▲ Show 20 Lines • Show All 904 Lines • Show Last 20 Lines

llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp

Show First 20 Lines • Show All 994 Lines • ▼ Show 20 Lines	static bool tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
// case, the program would be ill-formed (on most architectures)		// case, the program would be ill-formed (on most architectures)
// unless n were the same on all threads in a thread group.		// unless n were the same on all threads in a thread group.
// Assuming n is the same on all threads, any kind of unrolling is		// Assuming n is the same on all threads, any kind of unrolling is
// safe. But currently llvm's notion of convergence isn't powerful		// safe. But currently llvm's notion of convergence isn't powerful
// enough to express this.		// enough to express this.
if (Convergent)		if (Convergent)
UP.AllowRemainder = false;		UP.AllowRemainder = false;

// Try to find the trip count upper bound if it is allowed and we cannot find		// Try to find the trip count upper bound if we cannot find the exact trip
// exact trip count.		// count.
if (UP.UpperBound) {		bool MaxOrZero = false;
if (!TripCount) {		if (!TripCount) {
MaxTripCount = SE->getSmallConstantMaxTripCount(L);		MaxTripCount = SE->getSmallConstantMaxTripCount(L);
// Only unroll with small upper bound.		MaxOrZero = SE->isBackedgeTakenCountMaxOrZero(L);
if (MaxTripCount > UnrollMaxUpperBound)		// We can unroll by the upper bound amount if it's generally allowed or if
		// we know that the loop is executed either the upper bound or zero times.
		// (MaxOrZero unrolling keeps only the first loop test, so the number of
		// loop tests remains the same compared to the non-unrolled version, whereas
		// the generic upper bound unrolling keeps all but the last loop test so the
		// number of loop tests goes up which may end up being worse on targets with
		// constrained branch predictor resources so is controlled by an option.)
		// In addition we only unroll small upper bounds.
		if (!(UP.UpperBound || MaxOrZero) || MaxTripCount > UnrollMaxUpperBound) {
MaxTripCount = 0;		MaxTripCount = 0;
}		}
}		}

// computeUnrollCount() decides whether it is beneficial to use upper bound to		// computeUnrollCount() decides whether it is beneficial to use upper bound to
// fully unroll the loop.		// fully unroll the loop.
bool UseUpperBound = false;		bool UseUpperBound = false;
bool IsCountSetExplicitly =		bool IsCountSetExplicitly =
computeUnrollCount(L, TTI, DT, LI, SE, &ORE, TripCount, MaxTripCount,		computeUnrollCount(L, TTI, DT, LI, SE, &ORE, TripCount, MaxTripCount,
TripMultiple, LoopSize, UP, UseUpperBound);		TripMultiple, LoopSize, UP, UseUpperBound);
if (!UP.Count)		if (!UP.Count)
return false;		return false;
// Unroll factor (Count) must be less or equal to TripCount.		// Unroll factor (Count) must be less or equal to TripCount.
if (TripCount && UP.Count > TripCount)		if (TripCount && UP.Count > TripCount)
UP.Count = TripCount;		UP.Count = TripCount;

// Unroll the loop.		// Unroll the loop.
if (!UnrollLoop(L, UP.Count, TripCount, UP.Force, UP.Runtime,		if (!UnrollLoop(L, UP.Count, TripCount, UP.Force, UP.Runtime,
UP.AllowExpensiveTripCount, UseUpperBound, TripMultiple, LI,		UP.AllowExpensiveTripCount, UseUpperBound, MaxOrZero,
SE, &DT, &AC, &ORE, PreserveLCSSA))		TripMultiple, LI, SE, &DT, &AC, &ORE, PreserveLCSSA))
return false;		return false;

// If loop has an unroll count pragma or unrolled by explicitly set count		// If loop has an unroll count pragma or unrolled by explicitly set count
// mark loop as unrolled to prevent unrolling beyond that requested.		// mark loop as unrolled to prevent unrolling beyond that requested.
if (IsCountSetExplicitly)		if (IsCountSetExplicitly)
SetLoopAlreadyUnrolled(L);		SetLoopAlreadyUnrolled(L);
return true;		return true;
}		}
▲ Show 20 Lines • Show All 117 Lines • Show Last 20 Lines
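
The gating condition in the hunk above can be restated as a small standalone function. This is only a sketch under stated assumptions: `effectiveMaxTripCount` and its parameter names are hypothetical, the value that `UnrollMaxUpperBound` supplies in the pass is passed in explicitly as `Limit`, and `MaxTripCount` is assumed to stay at 0 whenever an exact trip count is known.

```cpp
#include <cassert>

// Hypothetical restatement of the new gating logic in tryToUnrollLoop.
// Returns the max trip count the unroller may use for upper-bound unrolling,
// or 0 if upper-bound unrolling should not be attempted.
unsigned effectiveMaxTripCount(unsigned TripCount, unsigned MaxTripCount,
                               bool AllowUpperBound, bool MaxOrZero,
                               unsigned Limit) {
  // Assumption: with an exact trip count the upper bound is never consulted.
  if (TripCount)
    return 0;
  // Upper-bound unrolling is permitted if it is generally allowed or if the
  // loop runs either the upper bound or zero times; it is still limited to
  // small bounds.
  if (!(AllowUpperBound || MaxOrZero) || MaxTripCount > Limit)
    return 0;
  return MaxTripCount;
}
```

With `AllowUpperBound` false, a max-or-zero loop with a small bound still gets a nonzero result, which is the behavioural change this hunk introduces.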

llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp

Show First 20 Lines • Show All 183 Lines • ▼ Show 20 Lines
/// exits were taken. Note that UnrollLoop assumes that the loop counter test		/// exits were taken. Note that UnrollLoop assumes that the loop counter test
/// terminates LatchBlock in order to remove unnecessary instances of the		/// terminates LatchBlock in order to remove unnecessary instances of the
/// test. In other words, control may exit the loop prior to TripCount		/// test. In other words, control may exit the loop prior to TripCount
/// iterations via an early branch, but control may not exit the loop from the		/// iterations via an early branch, but control may not exit the loop from the
/// LatchBlock's terminator prior to TripCount iterations.		/// LatchBlock's terminator prior to TripCount iterations.
///		///
/// PreserveCondBr indicates whether the conditional branch of the LatchBlock		/// PreserveCondBr indicates whether the conditional branch of the LatchBlock
/// needs to be preserved. It is needed when we use trip count upper bound to		/// needs to be preserved. It is needed when we use trip count upper bound to
/// fully unroll the loop.		/// fully unroll the loop. If PreserveOnlyFirst is also set then only the first
		/// conditional branch needs to be preserved.
///		///
/// Similarly, TripMultiple divides the number of times that the LatchBlock may		/// Similarly, TripMultiple divides the number of times that the LatchBlock may
/// execute without exiting the loop.		/// execute without exiting the loop.
///		///
/// If AllowRuntime is true then UnrollLoop will consider unrolling loops that		/// If AllowRuntime is true then UnrollLoop will consider unrolling loops that
/// have a runtime (i.e. not compile time constant) trip count. Unrolling these		/// have a runtime (i.e. not compile time constant) trip count. Unrolling these
/// loops require a unroll "prologue" that runs "RuntimeTripCount % Count"		/// loops require a unroll "prologue" that runs "RuntimeTripCount % Count"
/// iterations before branching into the unrolled loop. UnrollLoop will not		/// iterations before branching into the unrolled loop. UnrollLoop will not
/// runtime-unroll the loop if computing RuntimeTripCount will be expensive and		/// runtime-unroll the loop if computing RuntimeTripCount will be expensive and
/// AllowExpensiveTripCount is false.		/// AllowExpensiveTripCount is false.
///		///
/// The LoopInfo Analysis that is passed will be kept consistent.		/// The LoopInfo Analysis that is passed will be kept consistent.
///		///
/// This utility preserves LoopInfo. It will also preserve ScalarEvolution and		/// This utility preserves LoopInfo. It will also preserve ScalarEvolution and
/// DominatorTree if they are non-null.		/// DominatorTree if they are non-null.
bool llvm::UnrollLoop(Loop *L, unsigned Count, unsigned TripCount, bool Force,		bool llvm::UnrollLoop(Loop *L, unsigned Count, unsigned TripCount, bool Force,
bool AllowRuntime, bool AllowExpensiveTripCount,		bool AllowRuntime, bool AllowExpensiveTripCount,
bool PreserveCondBr, unsigned TripMultiple, LoopInfo *LI,		bool PreserveCondBr, bool PreserveOnlyFirst,
ScalarEvolution *SE, DominatorTree *DT,		unsigned TripMultiple, LoopInfo *LI, ScalarEvolution *SE,
AssumptionCache *AC, OptimizationRemarkEmitter *ORE,		DominatorTree *DT, AssumptionCache *AC,
bool PreserveLCSSA) {		OptimizationRemarkEmitter *ORE, bool PreserveLCSSA) {
BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader) {		if (!Preheader) {
DEBUG(dbgs() << " Can't unroll; loop preheader-insertion failed.\n");		DEBUG(dbgs() << " Can't unroll; loop preheader-insertion failed.\n");
return false;		return false;
}		}

BasicBlock *LatchBlock = L->getLoopLatch();		BasicBlock *LatchBlock = L->getLoopLatch();
if (!LatchBlock) {		if (!LatchBlock) {
▲ Show 20 Lines • Show All 323 Lines • ▼ Show 20 Lines	if (CompletelyUnroll) {
if (j == 0)		if (j == 0)
Dest = LoopExit;		Dest = LoopExit;
// If using trip count upper bound to completely unroll, we need to keep		// If using trip count upper bound to completely unroll, we need to keep
// the conditional branch except the last one because the loop may exit		// the conditional branch except the last one because the loop may exit
// after any iteration.		// after any iteration.
assert(NeedConditional &&		assert(NeedConditional &&
"NeedCondition cannot be modified by both complete "		"NeedCondition cannot be modified by both complete "
"unrolling and runtime unrolling");		"unrolling and runtime unrolling");
NeedConditional = (PreserveCondBr && j);		NeedConditional = (PreserveCondBr && j && !(PreserveOnlyFirst && i != 0));
} else if (j != BreakoutTrip && (TripMultiple == 0 || j % TripMultiple != 0)) {		} else if (j != BreakoutTrip && (TripMultiple == 0 || j % TripMultiple != 0)) {
// If we know the trip count or a multiple of it, we can safely use an		// If we know the trip count or a multiple of it, we can safely use an
// unconditional branch for some iterations.		// unconditional branch for some iterations.
NeedConditional = false;		NeedConditional = false;
}		}

if (NeedConditional) {		if (NeedConditional) {
// Update the conditional branch's successor for the following		// Update the conditional branch's successor for the following
▲ Show 20 Lines • Show All 176 Lines • Show Last 20 Lines
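
To see which unrolled copies keep their loop test under the new `NeedConditional` expression, the complete-unroll case can be modelled in isolation. This is a sketch, not the code from the patch: `keptConditionals` is a hypothetical helper, and it assumes that `j` is the index of the copy that latch `i` branches to, i.e. `j = (i + 1) % NumCopies`, which is elided from the hunk shown above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: for a loop completely unrolled into NumCopies copies,
// report which copies keep the conditional exit branch in their latch.
// Assumption: j == 0 identifies the last copy, whose latch leaves the loop.
std::vector<bool> keptConditionals(unsigned NumCopies, bool PreserveCondBr,
                                   bool PreserveOnlyFirst) {
  assert(NumCopies > 0 && "need at least one copy of the loop body");
  std::vector<bool> Kept;
  for (unsigned I = 0; I != NumCopies; ++I) {
    unsigned J = (I + 1) % NumCopies;
    // Same shape as the expression in the patch.
    bool NeedConditional =
        PreserveCondBr && J != 0 && !(PreserveOnlyFirst && I != 0);
    Kept.push_back(NeedConditional);
  }
  return Kept;
}
```

For three copies, plain upper-bound unrolling keeps the test in the first two copies, while max-or-zero unrolling keeps it only in the first. That agrees with the `s32_max2` expectations in full-unroll-keep-first-exit.ll, where one `br i1` is followed by an unconditional `br label %do.end`.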

llvm/trunk/test/Analysis/ScalarEvolution/trip-count13.ll

	; RUN: opt -S -analyze -scalar-evolution < %s | FileCheck %s			; RUN: opt -S -analyze -scalar-evolution < %s | FileCheck %s

	define void @u_0(i8 %rhs) {			define void @u_0(i8 %rhs) {
	; E.g.: %rhs = 255, %start = 99, backedge taken 156 times			; E.g.: %rhs = 255, %start = 99, backedge taken 156 times
	entry:			entry:
	%start = add i8 %rhs, 100			%start = add i8 %rhs, 100
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]			%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]
	%iv.inc = add nuw i8 %iv, 1 ;; Note: this never unsigned-wraps			%iv.inc = add nuw i8 %iv, 1 ;; Note: this never unsigned-wraps
	%iv.cmp = icmp ult i8 %iv, %rhs			%iv.cmp = icmp ult i8 %iv, %rhs
	br i1 %iv.cmp, label %loop, label %leave			br i1 %iv.cmp, label %loop, label %leave

	; CHECK-LABEL: Determining loop execution counts for: @u_0			; CHECK-LABEL: Determining loop execution counts for: @u_0
	; CHECK-NEXT: Loop %loop: backedge-taken count is (-100 + (-1 * %rhs) + ((100 + %rhs) umax %rhs))			; CHECK-NEXT: Loop %loop: backedge-taken count is (-100 + (-1 * %rhs) + ((100 + %rhs) umax %rhs))
	; CHECK-NEXT: Loop %loop: max backedge-taken count is -100			; CHECK-NEXT: Loop %loop: max backedge-taken count is -100, actual taken count either this or zero.

	leave:			leave:
	ret void			ret void
	}			}

	define void @u_1(i8 %start) {			define void @u_1(i8 %start) {
	entry:			entry:
	; E.g.: %start = 99, %rhs = 255, backedge taken 156 times			; E.g.: %start = 99, %rhs = 255, backedge taken 156 times
	%rhs = add i8 %start, -100			%rhs = add i8 %start, -100
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]			%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]
	%iv.inc = add nuw i8 %iv, 1 ;; Note: this never unsigned-wraps			%iv.inc = add nuw i8 %iv, 1 ;; Note: this never unsigned-wraps
	%iv.cmp = icmp ult i8 %iv, %rhs			%iv.cmp = icmp ult i8 %iv, %rhs
	br i1 %iv.cmp, label %loop, label %leave			br i1 %iv.cmp, label %loop, label %leave

	; CHECK-LABEL: Determining loop execution counts for: @u_1			; CHECK-LABEL: Determining loop execution counts for: @u_1
	; CHECK-NEXT: Loop %loop: backedge-taken count is ((-1 * %start) + ((-100 + %start) umax %start))			; CHECK-NEXT: Loop %loop: backedge-taken count is ((-1 * %start) + ((-100 + %start) umax %start))
	; CHECK-NEXT: Loop %loop: max backedge-taken count is -100			; CHECK-NEXT: Loop %loop: max backedge-taken count is -100, actual taken count either this or zero.

	leave:			leave:
	ret void			ret void
	}			}

	define void @s_0(i8 %rhs) {			define void @s_0(i8 %rhs) {
	entry:			entry:
	; E.g.: %rhs = 127, %start = -29, backedge taken 156 times			; E.g.: %rhs = 127, %start = -29, backedge taken 156 times
	%start = add i8 %rhs, 100			%start = add i8 %rhs, 100
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]			%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]
	%iv.inc = add nsw i8 %iv, 1 ;; Note: this never signed-wraps			%iv.inc = add nsw i8 %iv, 1 ;; Note: this never signed-wraps
	%iv.cmp = icmp slt i8 %iv, %rhs			%iv.cmp = icmp slt i8 %iv, %rhs
	br i1 %iv.cmp, label %loop, label %leave			br i1 %iv.cmp, label %loop, label %leave

	; CHECK-LABEL: Determining loop execution counts for: @s_0			; CHECK-LABEL: Determining loop execution counts for: @s_0
	; CHECK-NEXT: Loop %loop: backedge-taken count is (-100 + (-1 * %rhs) + ((100 + %rhs) smax %rhs))			; CHECK-NEXT: Loop %loop: backedge-taken count is (-100 + (-1 * %rhs) + ((100 + %rhs) smax %rhs))
	; CHECK-NEXT: Loop %loop: max backedge-taken count is -100			; CHECK-NEXT: Loop %loop: max backedge-taken count is -100, actual taken count either this or zero.

	leave:			leave:
	ret void			ret void
	}			}

	define void @s_1(i8 %start) {			define void @s_1(i8 %start) {
	entry:			entry:
	; E.g.: %start = -29, %rhs = 127, backedge taken 156 times			; E.g.: %start = -29, %rhs = 127, backedge taken 156 times
	%rhs = add i8 %start, -100			%rhs = add i8 %start, -100
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]			%iv = phi i8 [ %start, %entry ], [ %iv.inc, %loop ]
	%iv.inc = add nsw i8 %iv, 1			%iv.inc = add nsw i8 %iv, 1
	%iv.cmp = icmp slt i8 %iv, %rhs			%iv.cmp = icmp slt i8 %iv, %rhs
	br i1 %iv.cmp, label %loop, label %leave			br i1 %iv.cmp, label %loop, label %leave

	; CHECK-LABEL: Determining loop execution counts for: @s_1			; CHECK-LABEL: Determining loop execution counts for: @s_1
	; CHECK-NEXT: Loop %loop: backedge-taken count is ((-1 * %start) + ((-100 + %start) smax %start))			; CHECK-NEXT: Loop %loop: backedge-taken count is ((-1 * %start) + ((-100 + %start) smax %start))
	; CHECK-NEXT: Loop %loop: max backedge-taken count is -100			; CHECK-NEXT: Loop %loop: max backedge-taken count is -100, actual taken count either this or zero.

	leave:			leave:
	ret void			ret void
	}			}

llvm/trunk/test/Analysis/ScalarEvolution/trip-count14.ll

Show All 9 Lines	do.body:
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp slt i32 %i.0, %add		%cmp = icmp slt i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times

; CHECK-LABEL: Determining loop execution counts for: @s32_max1		; CHECK-LABEL: Determining loop execution counts for: @s32_max1
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((1 + %n) smax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((1 + %n) smax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 1		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 1, actual taken count either this or zero.

do.end:		do.end:
ret void		ret void
}		}

define void @s32_max2(i32 %n, i32* %p) {		define void @s32_max2(i32 %n, i32* %p) {
entry:		entry:
%add = add i32 %n, 2		%add = add i32 %n, 2
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp slt i32 %i.0, %add		%cmp = icmp slt i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times

; CHECK-LABEL: Determining loop execution counts for: @s32_max2		; CHECK-LABEL: Determining loop execution counts for: @s32_max2
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((2 + %n) smax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((2 + %n) smax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2, actual taken count either this or zero.

do.end:		do.end:
ret void		ret void
}		}

define void @s32_maxx(i32 %n, i32 %x, i32* %p) {		define void @s32_maxx(i32 %n, i32 %x, i32* %p) {
entry:		entry:
%add = add i32 %x, %n		%add = add i32 %x, %n
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp slt i32 %i.0, %add		%cmp = icmp slt i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times

; CHECK-LABEL: Determining loop execution counts for: @s32_maxx		; CHECK-LABEL: Determining loop execution counts for: @s32_maxx
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((%n + %x) smax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((%n + %x) smax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is -1		; CHECK-NEXT: Loop %do.body: max backedge-taken count is -1{{$}}

do.end:		do.end:
ret void		ret void
}		}

define void @s32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {		define void @s32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {
entry:		entry:
%add = add i32 %n, 2		%add = add i32 %n, 2
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]
%cmp = icmp eq i32 %i.0, %x		%cmp = icmp eq i32 %i.0, %x
br i1 %cmp, label %do.end, label %if.end ; unpredictable		br i1 %cmp, label %do.end, label %if.end ; unpredictable

if.end:		if.end:
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp1 = icmp slt i32 %i.0, %add		%cmp1 = icmp slt i32 %i.0, %add
br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times		br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

; CHECK-LABEL: Determining loop execution counts for: @s32_max2_unpredictable_exit		; CHECK-LABEL: Determining loop execution counts for: @s32_max2_unpredictable_exit
; CHECK-NEXT: Loop %do.body: <multiple exits> Unpredictable backedge-taken count.		; CHECK-NEXT: Loop %do.body: <multiple exits> Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}

do.end:		do.end:
ret void		ret void
}		}

define void @u32_max1(i32 %n, i32* %p) {		define void @u32_max1(i32 %n, i32* %p) {
entry:		entry:
%add = add i32 %n, 1		%add = add i32 %n, 1
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp ult i32 %i.0, %add		%cmp = icmp ult i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times

; CHECK-LABEL: Determining loop execution counts for: @u32_max1		; CHECK-LABEL: Determining loop execution counts for: @u32_max1
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((1 + %n) umax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((1 + %n) umax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 1		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 1, actual taken count either this or zero.

do.end:		do.end:
ret void		ret void
}		}

define void @u32_max2(i32 %n, i32* %p) {		define void @u32_max2(i32 %n, i32* %p) {
entry:		entry:
%add = add i32 %n, 2		%add = add i32 %n, 2
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp ult i32 %i.0, %add		%cmp = icmp ult i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times

; CHECK-LABEL: Determining loop execution counts for: @u32_max2		; CHECK-LABEL: Determining loop execution counts for: @u32_max2
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((2 + %n) umax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((2 + %n) umax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2, actual taken count either this or zero.

do.end:		do.end:
ret void		ret void
}		}

define void @u32_maxx(i32 %n, i32 %x, i32* %p) {		define void @u32_maxx(i32 %n, i32 %x, i32* %p) {
entry:		entry:
%add = add i32 %x, %n		%add = add i32 %x, %n
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp = icmp ult i32 %i.0, %add		%cmp = icmp ult i32 %i.0, %add
br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times		br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times

; CHECK-LABEL: Determining loop execution counts for: @u32_maxx		; CHECK-LABEL: Determining loop execution counts for: @u32_maxx
; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((%n + %x) umax %n))		; CHECK-NEXT: Loop %do.body: backedge-taken count is ((-1 * %n) + ((%n + %x) umax %n))
; CHECK-NEXT: Loop %do.body: max backedge-taken count is -1		; CHECK-NEXT: Loop %do.body: max backedge-taken count is -1{{$}}

do.end:		do.end:
ret void		ret void
}		}

define void @u32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {		define void @u32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {
entry:		entry:
%add = add i32 %n, 2		%add = add i32 %n, 2
br label %do.body		br label %do.body

do.body:		do.body:
%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]		%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]
%cmp = icmp eq i32 %i.0, %x		%cmp = icmp eq i32 %i.0, %x
br i1 %cmp, label %do.end, label %if.end ; unpredictable		br i1 %cmp, label %do.end, label %if.end ; unpredictable

if.end:		if.end:
%arrayidx = getelementptr i32, i32* %p, i32 %i.0		%arrayidx = getelementptr i32, i32* %p, i32 %i.0
store i32 %i.0, i32* %arrayidx, align 4		store i32 %i.0, i32* %arrayidx, align 4
%inc = add i32 %i.0, 1		%inc = add i32 %i.0, 1
%cmp1 = icmp ult i32 %i.0, %add		%cmp1 = icmp ult i32 %i.0, %add
br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times		br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

; CHECK-LABEL: Determining loop execution counts for: @u32_max2_unpredictable_exit		; CHECK-LABEL: Determining loop execution counts for: @u32_max2_unpredictable_exit
; CHECK-NEXT: Loop %do.body: <multiple exits> Unpredictable backedge-taken count.		; CHECK-NEXT: Loop %do.body: <multiple exits> Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2		; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}

do.end:		do.end:
ret void		ret void
}		}
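
The "either this or zero" property checked by the signed tests above can be confirmed by brute force on a scaled-down model. The sketch below is an assumption-laden illustration, not part of the patch: it uses 8-bit wrapping arithmetic instead of `i32`, models only the `slt` form of the `do.body` loops, and all function names are hypothetical.

```cpp
#include <cassert>
#include <set>

// Reduce X to the signed 8-bit range [-128, 127], like an `add i8` that wraps.
static int wrap8(int X) {
  int U = ((X % 256) + 256) % 256; // now in 0..255
  return U >= 128 ? U - 256 : U;
}

// Count how often the backedge is taken in a loop shaped like the tests:
//   i = n; do { cmp = i < n + Bound; i = i + 1; } while (cmp);
// with all arithmetic wrapping in signed 8 bits.
static unsigned countBackedges(int N, int Bound) {
  assert(N >= -128 && N <= 127 && "N must fit in i8");
  const int Add = wrap8(N + Bound);
  int I = N;
  unsigned Count = 0;
  for (;;) {
    bool Cmp = I < Add; // compares the pre-increment value, as in the IR
    I = wrap8(I + 1);
    if (!Cmp)
      return Count;
    ++Count;
  }
}

// Collect every backedge count seen over all 8-bit start values.
std::set<unsigned> observedBackedgeCounts(int Bound) {
  std::set<unsigned> Seen;
  for (int N = -128; N <= 127; ++N)
    Seen.insert(countBackedges(N, Bound));
  return Seen;
}
```

In this model the zero case arises only when `n + Bound` wraps, so the first comparison already fails. Otherwise the backedge is taken exactly `Bound` times.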

llvm/trunk/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll

				; RUN: opt -S -loop-unroll < %s | FileCheck %s

				; Unroll twice, with first loop exit kept
				; CHECK-LABEL: @s32_max1
				; CHECK: do.body:
				; CHECK: store
				; CHECK: br i1 %cmp, label %do.body.1, label %do.end
				; CHECK: do.end:
				; CHECK: ret void
				; CHECK: do.body.1:
				; CHECK: store
				; CHECK: br label %do.end
				define void @s32_max1(i32 %n, i32* %p) {
				entry:
				%add = add i32 %n, 1
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp slt i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times

				do.end:
				ret void
				}

				; Unroll thrice, with first loop exit kept
				; CHECK-LABEL: @s32_max2
				; CHECK: do.body:
				; CHECK: store
				; CHECK: br i1 %cmp, label %do.body.1, label %do.end
				; CHECK: do.end:
				; CHECK: ret void
				; CHECK: do.body.1:
				; CHECK: store
				; CHECK: store
				; CHECK: br label %do.end
				define void @s32_max2(i32 %n, i32* %p) {
				entry:
				%add = add i32 %n, 2
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp slt i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times

				do.end:
				ret void
				}

				; Should not be unrolled
				; CHECK-LABEL: @s32_maxx
				; CHECK: do.body:
				; CHECK: do.end:
				; CHECK-NOT: do.body.1:
				define void @s32_maxx(i32 %n, i32 %x, i32* %p) {
				entry:
				%add = add i32 %x, %n
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp slt i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times

				do.end:
				ret void
				}

				; Should not be unrolled
				; CHECK-LABEL: @s32_max2_unpredictable_exit
				; CHECK: do.body:
				; CHECK: do.end:
				; CHECK-NOT: do.body.1:
				define void @s32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {
				entry:
				%add = add i32 %n, 2
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]
				%cmp = icmp eq i32 %i.0, %x
				br i1 %cmp, label %do.end, label %if.end ; unpredictable

				if.end:
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp1 = icmp slt i32 %i.0, %add
				br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

				do.end:
				ret void
				}

				; Unroll twice, with first loop exit kept
				; CHECK-LABEL: @u32_max1
				; CHECK: do.body:
				; CHECK: store
				; CHECK: br i1 %cmp, label %do.body.1, label %do.end
				; CHECK: do.end:
				; CHECK: ret void
				; CHECK: do.body.1:
				; CHECK: store
				; CHECK: br label %do.end
				define void @u32_max1(i32 %n, i32* %p) {
				entry:
				%add = add i32 %n, 1
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp ult i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 1 times

				do.end:
				ret void
				}

				; Unroll thrice, with first loop exit kept
				; CHECK-LABEL: @u32_max2
				; CHECK: do.body:
				; CHECK: store
				; CHECK: br i1 %cmp, label %do.body.1, label %do.end
				; CHECK: do.end:
				; CHECK: ret void
				; CHECK: do.body.1:
				; CHECK: store
				; CHECK: store
				; CHECK: br label %do.end
				define void @u32_max2(i32 %n, i32* %p) {
				entry:
				%add = add i32 %n, 2
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp ult i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or 2 times

				do.end:
				ret void
				}

				; Should not be unrolled
				; CHECK-LABEL: @u32_maxx
				; CHECK: do.body:
				; CHECK: do.end:
				; CHECK-NOT: do.body.1:
				define void @u32_maxx(i32 %n, i32 %x, i32* %p) {
				entry:
				%add = add i32 %x, %n
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %do.body ]
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp = icmp ult i32 %i.0, %add
				br i1 %cmp, label %do.body, label %do.end ; taken either 0 or x times

				do.end:
				ret void
				}

				; Should not be unrolled
				; CHECK-LABEL: @u32_max2_unpredictable_exit
				; CHECK: do.body:
				; CHECK: do.end:
				; CHECK-NOT: do.body.1:
				define void @u32_max2_unpredictable_exit(i32 %n, i32 %x, i32* %p) {
				entry:
				%add = add i32 %n, 2
				br label %do.body

				do.body:
				%i.0 = phi i32 [ %n, %entry ], [ %inc, %if.end ]
				%cmp = icmp eq i32 %i.0, %x
				br i1 %cmp, label %do.end, label %if.end ; unpredictable

				if.end:
				%arrayidx = getelementptr i32, i32* %p, i32 %i.0
				store i32 %i.0, i32* %arrayidx, align 4
				%inc = add i32 %i.0, 1
				%cmp1 = icmp ult i32 %i.0, %add
				br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

				do.end:
				ret void
				}
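
To make the "taken either 0 or 2 times" comments concrete, here is a minimal C++ sketch of the @u32_max2 loop and of the unrolled shape the CHECK lines expect. It is a model under stated assumptions, not the LoopUnroll implementation: the function names (originalLoop, unrolledKeepFirstExit, sameStores) are hypothetical, stores to %p are modelled by recording the index that would be written, and uint32_t arithmetic stands in for LLVM's wrapping `add`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of @u32_max2 before unrolling: the exit test runs on
// every iteration. Each store to %p is recorded as the index written.
std::vector<uint32_t> originalLoop(uint32_t n) {
  std::vector<uint32_t> stores;
  const uint32_t add = n + 2u;  // %add = add i32 %n, 2 (wraps on overflow)
  uint32_t i = n;               // %i.0 starts at %n
  for (;;) {
    stores.push_back(i);        // store i32 %i.0, i32* %arrayidx
    const uint32_t cur = i;
    i = cur + 1u;               // %inc = add i32 %i.0, 1
    if (!(cur < add)) break;    // %cmp = icmp ult i32 %i.0, %add
  }
  return stores;
}

// Hypothetical model of the expected output: only the first exit test is
// kept, and the remaining two iterations run unconditionally.
std::vector<uint32_t> unrolledKeepFirstExit(uint32_t n) {
  std::vector<uint32_t> stores;
  const uint32_t add = n + 2u;
  stores.push_back(n);            // do.body
  if (!(n < add)) return stores;  // backedge taken zero times
  stores.push_back(n + 1u);       // do.body.1, first copy
  stores.push_back(n + 2u);       // do.body.1, second copy
  return stores;
}

// True when both shapes perform the same stores for this starting value.
bool sameStores(uint32_t n) {
  return originalLoop(n) == unrolledKeepFirstExit(n);
}
```

The zero-backedge case arises only when `n + 2` wraps, which is why the first test must be kept while the later ones can be dropped.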

Revision Contents

Diff 75412

llvm/trunk/include/llvm/Analysis/ScalarEvolution.h

llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h

llvm/trunk/lib/Analysis/ScalarEvolution.cpp

llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp

llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp

llvm/trunk/test/Analysis/ScalarEvolution/trip-count13.ll

llvm/trunk/test/Analysis/ScalarEvolution/trip-count14.ll

llvm/trunk/test/Transforms/LoopUnroll/full-unroll-keep-first-exit.ll
