This is an archive of the discontinued LLVM Phabricator instance.

Differential D25364

[LCSSA] Use linear algorithm for isRecursivelyLCSSAForm
ClosedPublic

Authored by igor-laevsky on Oct 7 2016, 4:58 AM.

Details

Reviewers

philip
mzolotukhin
sanjoy
sunfish

Commits

rG04423cf785f9: [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm
rL283877: [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm

Summary

Currently 'isRecursivelyLCSSAForm' has quadratic complexity in the number of
loop basic blocks. This is caused by the fact that at each recursion level we
visit all blocks of the current loop and all of its sub-loops. On the next
recursive call we step into one of the sub-loops and revisit the same basic blocks.
We can address that by limiting iteration to the blocks that are located
directly in the current loop. We know that all sub-loops will be checked by the
subsequent recursive calls.
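
To make the complexity argument above concrete, here is a minimal sketch on a toy loop nest. It only counts block visits and is not LLVM code: `ToyLoop`, `visitsScanningAllBlocks`, `visitsScanningOwnBlocks` and `makeNestedChain` are hypothetical names introduced for illustration, and the model assumes each loop stores the number of blocks that belong to it directly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy model of a loop nest; this is not LLVM's Loop class.
struct ToyLoop {
  std::size_t OwnBlocks = 0;                      // blocks directly in this loop
  std::vector<std::unique_ptr<ToyLoop>> SubLoops; // immediately nested loops

  // Number of blocks in this loop including all of its sub-loops.
  std::size_t totalBlocks() const {
    std::size_t N = OwnBlocks;
    for (const auto &S : SubLoops)
      N += S->totalBlocks();
    return N;
  }
};

// Scheme described as quadratic: every recursion level scans all blocks of
// the loop, including the blocks of its sub-loops.
std::size_t visitsScanningAllBlocks(const ToyLoop &L) {
  std::size_t Visits = L.totalBlocks();
  for (const auto &S : L.SubLoops)
    Visits += visitsScanningAllBlocks(*S);
  return Visits;
}

// Scheme described as linear: every recursion level scans only the blocks
// located directly in that loop.
std::size_t visitsScanningOwnBlocks(const ToyLoop &L) {
  std::size_t Visits = L.OwnBlocks;
  for (const auto &S : L.SubLoops)
    Visits += visitsScanningOwnBlocks(*S);
  return Visits;
}

// Builds Depth loops nested one inside another, each with one own block.
std::unique_ptr<ToyLoop> makeNestedChain(std::size_t Depth) {
  assert(Depth > 0 && "need at least one loop");
  auto Outer = std::make_unique<ToyLoop>();
  Outer->OwnBlocks = 1;
  if (Depth > 1)
    Outer->SubLoops.push_back(makeNestedChain(Depth - 1));
  return Outer;
}
```

For a chain of n singly nested loops with one block each, the first count is n(n+1)/2 and the second is n, which is the gap the summary describes.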

This change is motivated by a 10x slowdown on one of our internal
loop-intensive tests (Release+Asserts build). It started to happen after the
introduction of the additional LCSSA verification - https://github.com/llvm-mirror/llvm/commit/79e702086510fa7b52de178354eab34a7f641025

Unfortunately, even with this change we are still experiencing a huge slowdown. I think
this may be the result of calling LCSSAWrapperPass::verifyAnalysis for the whole
function for each loop. Maybe there is a way to structure this code more efficiently?

Event Timeline

igor-laevsky updated this revision to Diff 73906.Oct 7 2016, 4:58 AM

igor-laevsky retitled this revision to [LCSSA] Use linear algorithm for isRecursivelyLCSSAForm.

igor-laevsky updated this object.

igor-laevsky added reviewers: philip, sanjoy, mzolotukhin, sunfish.

igor-laevsky added a subscriber: llvm-commits.

Hi Igor,

Thanks for working on this! I have two concerns/thoughts:

1. Is it actually equivalent to the original implementation? It looks so, but we need to check the corner cases and be 100% sure that it's the case. Usually the bugs we detect with such verifiers are in corner cases.
2. If it's equivalent to the original, why do we need to be recursive at all? We can go through all blocks of the given loop but for each block find the innermost loop containing it and check if the value isn't used outside of it. What do you think?

As for the slowdown, we can put it under some flag, but I'm a bit reluctant to do that - we found quite a few bugs when we started to verify LCSSA, and it's good to have it turned on by default. And unfortunately checking only the current loop is not sufficient, because transformations can change outer loops.

Thanks,
Michael

In D25364#564603, @mzolotukhin wrote:

If it's equivalent to the original, why do we need to be recursive at all? We can go through all blocks of the given loop but for each block find the innermost loop containing it and check if the value isn't used outside of it. What do you think?

I really like this framing. Rather than asking whether each sub-loop is in LCSSA, we would essentially ask whether each BB is in LCSSA w.r.t. its innermost containing loop. Because we know all of the blocks within a sub-loop are also contained within an outer loop, this gives us the transitive property we need.
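
The per-block framing above can be sketched on a toy IR as follows. This is an illustration under stated assumptions, not the code from the patch: `ToyFunction`, `loopContains` and `isLCSSAWrtInnermostLoops` are hypothetical names, blocks and loops are plain integer indices, and each use is assumed to be already mapped to its effective user block (for a PHI, the incoming block).

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy IR; the names are illustrative and not LLVM API.
constexpr int NoLoop = -1;

struct ToyFunction {
  // ParentLoop[L] is the loop enclosing loop L, or NoLoop for top-level loops.
  std::vector<int> ParentLoop;
  // InnermostLoop[BB] is the innermost loop containing block BB, or NoLoop.
  std::vector<int> InnermostLoop;

  struct Value {
    int DefBlock;               // block that defines the value
    std::vector<int> UseBlocks; // effective user block of every use
  };
  std::vector<Value> Values;
};

// Returns true if block BB lies inside loop L, directly or via a nested loop.
bool loopContains(const ToyFunction &F, int L, int BB) {
  assert(L != NoLoop && "expected a real loop");
  for (int Cur = F.InnermostLoop[BB]; Cur != NoLoop; Cur = F.ParentLoop[Cur])
    if (Cur == L)
      return true;
  return false;
}

// Non-recursive check: every value is tested only against the innermost loop
// containing its definition. A use inside that loop is also inside every
// enclosing loop, so outer loops need no separate pass over this value.
bool isLCSSAWrtInnermostLoops(const ToyFunction &F) {
  for (const auto &V : F.Values) {
    int L = F.InnermostLoop[V.DefBlock];
    if (L == NoLoop)
      continue; // defined outside of any loop, nothing to check
    for (int UseBB : V.UseBlocks)
      if (!loopContains(F, L, UseBB))
        return false; // the value escapes its innermost loop
  }
  return true;
}
```

Each value is visited once, so the amount of work no longer depends on how deeply the loops are nested, apart from the walk up the parent chain in `loopContains`.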

However, it does look like we don't currently verify the contains relation for sub-loops. This may be covered by the off-by-default LoopInfo.verify(), but it's not directly enforced. I'm not sure we need to block this patch on that fact, but the lack of verification makes me a bit nervous.

In D25364#564896, @reames wrote:

However, it does look like we don't currently verify the contains relation for sub-loops.

Correction: This is checked by verifyLoop

Thanks for the comments!

Michael, your second point seems very reasonable. Please take a look at the updated diff. I kept the original name "isRecursivelyLCSSAForm", even though the function is no longer recursive. I think it still describes the process fairly well.

I agree that putting LCSSAWrapperPass::verifyAnalysis under a flag isn't a good way to deal with the slowdown. Can't we apply the same technique that was used for verifyLoop (https://github.com/llvm-mirror/llvm/blob/93e6e5414ded14bcbb233baaaa5567132fee9a0c/lib/Analysis/LoopPass.cpp#L219)? I.e., add explicit verification inside LPPassManager for each of the top-level loops.

Hi Igor,

The patch looks good to me, thank you for doing this!

I agree that putting LCSSAWrapperPass::verifyAnalysis under a flag isn't a good way to deal with the slowdown. Can't we apply same technique as was used for the verifyLoop (https://github.com/llvm-mirror/llvm/blob/93e6e5414ded14bcbb233baaaa5567132fee9a0c/lib/Analysis/LoopPass.cpp#L219)? I.e add explicit verification inside LPPassManager for each of the top-level loops.

Do I understand it correctly, that we'll be checking LCSSA only for the current loop after each loop-pass? I think it should be sufficient, because if a pass breaks LCSSA, we should fail when LPM proceeds to the parent loop. However, for the sake of easier debugging I would prefer to allow full verification as we have it right now too, probably under -verify-loop-info as well (or under another flag). It really helps to find the broken pass faster.

Thanks,
Michael
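
The trade-off discussed in the comment above (verifying only the current loop after each loop pass versus verifying everything under a flag) can be illustrated with a toy driver. This is a sketch under loud assumptions: it does not model LPPassManager's real interface, and `ToyPass`, `firstDetection` and `VerifyAllLoops` are hypothetical names. Loops are numbered innermost-first, and a pass is any callback that may break the LCSSA state of some loop.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical toy driver; it does not model LPPassManager's real interface.
// InLCSSA[i] records whether loop i is currently in LCSSA form. Loops are
// numbered innermost-first, which is also the order they are processed in.
using ToyPass =
    std::function<void(std::size_t Cur, std::vector<bool> &InLCSSA)>;

// Runs Pass on every loop in order and verifies after each run. Returns the
// index of the loop being processed when a violation is first detected, or
// the number of loops if no violation is detected.
std::size_t firstDetection(std::vector<bool> InLCSSA, const ToyPass &Pass,
                           bool VerifyAllLoops) {
  for (std::size_t Cur = 0; Cur < InLCSSA.size(); ++Cur) {
    Pass(Cur, InLCSSA);
    if (VerifyAllLoops) {
      // Full verification: expensive, but it pinpoints the offending run.
      for (bool Ok : InLCSSA)
        if (!Ok)
          return Cur;
    } else if (!InLCSSA[Cur]) {
      // Cheap verification: only the loop that was just processed.
      return Cur;
    }
  }
  return InLCSSA.size();
}
```

With a pass that breaks an outer loop while running on the innermost one, the cheap mode still reports the problem, but only once the driver reaches the broken loop; the full mode reports it right after the offending run, which is the debugging benefit mentioned in the comment.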

Closed by commit rL283877: [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm (authored by igor.laevsky). Oct 11 2016, 6:46 AM

This revision was automatically updated to reflect the committed changes.

Hi Michael,

Do I understand it correctly, that we'll be checking LCSSA only for the current loop after each loop-pass? I think it should be sufficient, because if a pass breaks LCSSA, we should fail when LPM proceeds to the parent loop. However, for the sake of easier debugging I would prefer to allow full verification as we have it right now too, probably under -verify-loop-info as well (or under another flag). It really helps to find the broken pass faster.

Yes, this is the plan. I will send a patch up for review in the near future.

Revision Contents

Path                                        Size
include/llvm/Analysis/LoopInfo.h            5 lines
lib/Analysis/LoopInfo.cpp                   29 lines
lib/Transforms/Scalar/IndVarSimplify.cpp    9 lines
lib/Transforms/Utils/LCSSA.cpp              3 lines
lib/Transforms/Utils/LoopSimplify.cpp       10 lines

Diff 73906

include/llvm/Analysis/LoopInfo.h

@@ 406 lines not shown @@ public:
   /// variable.
   ///
   PHINode *getCanonicalInductionVariable() const;

   /// Return true if the Loop is in LCSSA form.
   bool isLCSSAForm(DominatorTree &DT) const;

   /// Return true if this Loop and all inner subloops are in LCSSA form.
-  bool isRecursivelyLCSSAForm(DominatorTree &DT) const;
+  /// When 'LI' is null this function has quadratic complexity on the number
+  /// of loop blocks. Linear algorithm is used otherwise.
+  bool isRecursivelyLCSSAForm(DominatorTree &DT,
+                              const LoopInfo *LI = nullptr) const;

   /// Return true if the Loop is in the form that the LoopSimplify form
   /// transforms loops to, which is sometimes called normal form.
   bool isLoopSimplifyForm() const;

   /// Return true if the loop body is safe to clone in practice.
   bool isSafeToClone() const;

@@ 423 lines not shown @@

lib/Analysis/LoopInfo.cpp

@@ 137 lines not shown @@ if (ConstantInt *CI =
                 Inc->getOperand(0) == PN)
             if (ConstantInt *CI = dyn_cast<ConstantInt>(Inc->getOperand(1)))
               if (CI->equalsInt(1))
                 return PN;
   }
   return nullptr;
 }

-bool Loop::isLCSSAForm(DominatorTree &DT) const {
-  for (BasicBlock *BB : this->blocks()) {
+// Return true if the Loop is in LCSSA form.
+// When 'LI' is non-null we only check blocks located directly inside 'L' and
+// not in one of it's sub-loops.
+// This functionality is used in 'isRecursivelyLCSSAForm' to prevent us
+// from re-checking basic blocks from the nested loops.
+static bool isLCSSAFormImpl(const Loop &L, DominatorTree &DT,
+                            const LoopInfo *LI) {
+  for (BasicBlock *BB : L.blocks()) {
     for (Instruction &I : *BB) {
       // Tokens can't be used in PHI nodes and live-out tokens prevent loop
       // optimizations, so for the purposes of considered LCSSA form, we
       // can ignore them.
       if (I.getType()->isTokenTy())
         continue;

+      // Only visit blocks located directly inside the current loop
+      if (LI && LI->getLoopFor(BB) != &L)
+        continue;
+
       for (Use &U : I.uses()) {
         Instruction *UI = cast<Instruction>(U.getUser());
         BasicBlock *UserBB = UI->getParent();
         if (PHINode *P = dyn_cast<PHINode>(UI))
           UserBB = P->getIncomingBlock(U);

         // Check the current block, as a fast-path, before checking whether
         // the use is anywhere in the loop. Most values are used in the same
         // block they are defined in. Also, blocks not reachable from the
         // entry are special; uses in them don't need to go through PHIs.
         if (UserBB != BB &&
-            !contains(UserBB) &&
+            !L.contains(UserBB) &&
             DT.isReachableFromEntry(UserBB))
           return false;
       }
     }
   }

   return true;
 }

-bool Loop::isRecursivelyLCSSAForm(DominatorTree &DT) const {
-  if (!isLCSSAForm(DT))
+bool Loop::isLCSSAForm(DominatorTree &DT) const {
+  return isLCSSAFormImpl(*this, DT, nullptr);
+}
+
+bool Loop::isRecursivelyLCSSAForm(DominatorTree &DT,
+                                  const LoopInfo *LI /* = nullptr */) const {
+  if (!isLCSSAFormImpl(*this, DT, LI))
     return false;

-  return all_of(*this,
-                [&](const Loop *L) { return L->isRecursivelyLCSSAForm(DT); });
+  return all_of(
+      *this, [&](const Loop *L) { return L->isRecursivelyLCSSAForm(DT, LI); });
 }

 bool Loop::isLoopSimplifyForm() const {
   // Normal-form loops have a preheader, a single backedge, and all of their
   // exits have all their predecessors inside the loop.
   return getLoopPreheader() && getLoopLatch() && hasDedicatedExits();
 }

@@ 551 lines not shown @@

lib/Transforms/Scalar/IndVarSimplify.cpp

@@ 500 lines not shown @@
 /// current expressions.
 ///
 /// This is mostly redundant with the regular IndVarSimplify activities that
 /// happen later, except that it's more powerful in some cases, because it's
 /// able to brute-force evaluate arbitrary instructions as long as they have
 /// constant operands at the beginning of the loop.
 void IndVarSimplify::rewriteLoopExitValues(Loop *L, SCEVExpander &Rewriter) {
   // Check a pre-condition.
-  assert(L->isRecursivelyLCSSAForm(*DT) && "Indvars did not preserve LCSSA!");
+  assert(L->isRecursivelyLCSSAForm(*DT, LI) &&
+         "Indvars did not preserve LCSSA!");

   SmallVector<BasicBlock*, 8> ExitBlocks;
   L->getUniqueExitBlocks(ExitBlocks);

   SmallVector<RewritePhi, 8> RewritePhiSet;
   // Find all values that are computed inside the loop, but used outside of it.
   // Because of LCSSA, these values will only occur in LCSSA PHI Nodes. Scan
   // the exit blocks of the loop to find them.
@@ 1,662 lines not shown @@
 }

 //===----------------------------------------------------------------------===//
 // IndVarSimplify driver. Manage several subpasses of IV simplification.
 //===----------------------------------------------------------------------===//

 bool IndVarSimplify::run(Loop *L) {
   // We need (and expect!) the incoming loop to be in LCSSA.
-  assert(L->isRecursivelyLCSSAForm(*DT) && "LCSSA required to run indvars!");
+  assert(L->isRecursivelyLCSSAForm(*DT, LI) &&
+         "LCSSA required to run indvars!");

   // If LoopSimplify form is not available, stay out of trouble. Some notes:
   // - LSR currently only supports LoopSimplify-form loops. Indvars'
   // canonicalization can be a pessimization without LSR to "clean up"
   // afterwards.
   // - We depend on having a preheader; in particular,
   // Loop::getCanonicalInductionVariable only supports loops with preheaders,
   // and we're in trouble if we can't find the induction variable even when
@@ 76 lines not shown @@ #endif
   // trip count and therefore can further simplify exit values in addition to
   // rewriteLoopExitValues.
   rewriteFirstIterationLoopExitValues(L);

   // Clean up dead instructions.
   Changed |= DeleteDeadPHIs(L->getHeader(), TLI);

   // Check a post-condition.
-  assert(L->isRecursivelyLCSSAForm(*DT) && "Indvars did not preserve LCSSA!");
+  assert(L->isRecursivelyLCSSAForm(*DT, LI) &&
+         "Indvars did not preserve LCSSA!");

   // Verify that LFTR, and any other change have not interfered with SCEV's
   // ability to compute trip count.
 #ifndef NDEBUG
   if (VerifyIndvars && !isa<SCEVCouldNotCompute>(BackedgeTakenCount)) {
     SE->forgetLoop(L);
     const SCEV *NewBECount = SE->getBackedgeTakenCount(L);
     if (SE->getTypeSizeInBits(BackedgeTakenCount->getType()) <
@@ 78 lines not shown @@

lib/Transforms/Utils/LCSSA.cpp

@@ 317 lines not shown @@ struct LCSSAWrapperPass : public FunctionPass {
   // Cached analysis information for the current function.
   DominatorTree *DT;
   LoopInfo *LI;
   ScalarEvolution *SE;

   bool runOnFunction(Function &F) override;
   void verifyAnalysis() const override {
     assert(
-        all_of(*LI, [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT); }) &&
+        all_of(*LI,
+               [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT, LI); }) &&
         "LCSSA form is broken!");
   };

   /// This transformation requires natural loop information & requires that
   /// loop preheaders be inserted into the CFG. It maintains both of these,
   /// as well as the CFG. It also requires dominator information.
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.setPreservesCFG();
@@ 49 lines not shown @@

lib/Transforms/Utils/LoopSimplify.cpp

@@ 360 lines not shown @@ if (PreserveLCSSA) {
     // Fix LCSSA form for L. Some values, which previously were only used inside
     // L, can now be used in NewOuter loop. We need to insert phi-nodes for them
     // in corresponding exit blocks.
     // We don't need to form LCSSA recursively, because there cannot be uses
     // inside a newly created loop of defs from inner loops as those would
     // already be a use of an LCSSA phi node.
     formLCSSA(L, DT, LI, SE);

-    assert(NewOuter->isRecursivelyLCSSAForm(*DT) &&
+    assert(NewOuter->isRecursivelyLCSSAForm(*DT, LI) &&
            "LCSSA is broken after separating nested loops!");
   }

   return NewOuter;
 }

 /// \brief This method is called when the specified loop has more than one
 /// backedge in it.
@@ 427 lines not shown @@ bool LoopSimplify::runOnFunction(Function &F) {
   AssumptionCache *AC =
       &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);

   bool PreserveLCSSA = mustPreserveAnalysisID(LCSSAID);
 #ifndef NDEBUG
   if (PreserveLCSSA) {
     assert(DT && "DT not available.");
     assert(LI && "LI not available.");
-    bool InLCSSA =
-        all_of(*LI, [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT); });
+    bool InLCSSA = all_of(
+        *LI, [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT, LI); });
     assert(InLCSSA && "Requested to preserve LCSSA, but it's already broken.");
   }
 #endif

   // Simplify each loop nest in the function.
   for (LoopInfo::iterator I = LI->begin(), E = LI->end(); I != E; ++I)
     Changed |= simplifyLoop(*I, DT, LI, SE, AC, PreserveLCSSA);

 #ifndef NDEBUG
   if (PreserveLCSSA) {
-    bool InLCSSA =
-        all_of(*LI, [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT); });
+    bool InLCSSA = all_of(
+        *LI, [&](Loop *L) { return L->isRecursivelyLCSSAForm(*DT, LI); });
     assert(InLCSSA && "LCSSA is broken after loop-simplify.");
   }
 #endif
   return Changed;
 }

 PreservedAnalyses LoopSimplifyPass::run(Function &F,
                                         FunctionAnalysisManager &AM) {
@@ 84 lines not shown @@