This is an archive of the discontinued LLVM Phabricator instance.

Throttling LICM to reduce compile time of Value Propagation and Reg Splitting
Needs Review · Public

Authored by wmi on Mar 10 2016, 3:23 PM.

Details

Summary

This is part 2 of the work to fix https://llvm.org/bugs/show_bug.cgi?id=10584

For the testcases in the bug, LICM creates many variables with long live ranges spanning many BBs. When CVP queries Lazy Value Information (LVI) for such a variable, LVI walks all the BBs the variable lives across; with many such long-lived variables, CVP becomes very slow. Register splitting has a similar problem: for a vreg with a very long live range, computing spill placement when splitting the vreg is expensive.
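As an illustration of the problem (a hypothetical source-level example, not taken from the bug's testcases), hoisting loop-invariant expressions turns per-iteration temporaries into values that stay live across the entire loop body, which is what makes the later per-variable backward walks costly:

```cpp
// Hypothetical example: after LICM, t0 and t1 are computed once in the
// preheader and are therefore live across every block of the loop, so a
// per-variable analysis walk (as LVI performs) touches the whole loop
// once per hoisted value.
int sum_scaled(const int *a, int n, int x, int y) {
    // Before LICM, x*y and x+y would be recomputed each iteration.
    // After LICM they become preheader temporaries with long live ranges:
    int t0 = x * y;   // hoisted; live across the whole loop
    int t1 = x + y;   // hoisted; live across the whole loop
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i] * t0 + t1;
    return s;
}
```

With a handful of such temporaries the transformation is clearly profitable; the pathological case in the bug is when there are very many of them in a very large loop.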

The patch throttles LICM to reduce the number of non-expensive instructions hoisted out of very large loops. For a small loop, hoisting values out of the loop does not increase cost much, so we keep the current logic and hoist as much as possible. For a large loop, expensive instructions like mul/div/rem/... are still hoisted unconditionally. For non-expensive instructions in a large loop, we use the register number to limit how many instructions are hoisted.
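A rough sketch of that policy (names, types, and the loop-size cutoff below are illustrative stand-ins, not the patch's actual code):

```cpp
#include <cstddef>

// Stand-in for llvm::Instruction; "expensive" models mul/div/rem etc.
struct Inst { bool expensive; };

// Hypothetical throttle: always hoist expensive instructions; in large
// loops, stop hoisting cheap ones once an assumed register budget is spent.
struct HoistThrottle {
    std::size_t loopBlocks;     // number of basic blocks in the loop
    std::size_t regBudget;      // assumed register budget for cheap hoists
    std::size_t cheapHoisted;   // cheap instructions hoisted so far

    static constexpr std::size_t LargeLoopBlocks = 32;  // illustrative cutoff

    HoistThrottle(std::size_t blocks, std::size_t budget)
        : loopBlocks(blocks), regBudget(budget), cheapHoisted(0) {}

    bool shouldHoist(const Inst &I) {
        if (I.expensive)
            return true;                        // always hoist expensive ops
        if (loopBlocks < LargeLoopBlocks)
            return true;                        // small loop: current logic
        if (cheapHoisted >= regBudget)
            return false;                       // budget exhausted: stop
        ++cheapHoisted;                         // charge one register
        return true;
    }
};
```

The key design point is the asymmetry: the cap applies only to cheap instructions in large loops, so redundant expensive computation is never reintroduced into the loop body.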

No unit test is added because I could not find a small testcase for it.

Tested the LLVM test-suite and Google internal benchmarks on x86_64-linux-gnu and found no performance regression.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi updated this revision to Diff 50363.Mar 10 2016, 3:23 PM
wmi retitled this revision from to Throttling LICM to reduce compile time of Value Propagation and Reg Splitting.
wmi updated this object.
wmi added reviewers: atrick, reames, hfinkel.
wmi set the repository for this revision to rL LLVM.
wmi added subscribers: llvm-commits, davidxl.
reames edited edge metadata.Mar 11 2016, 4:34 PM

I understand your thought process, but this approach is just not going to work. As a matter of policy, we canonicalize at the IR level and LICM is pretty much the classic definition of canonicalization. Restricting the aggressiveness of LICM to resolve a compile time problem elsewhere in the optimizer is fundamentally unacceptable.

I'm with Philip on this one.
If LVI is being asked about every variable, either:
A. We should make something that produces the same info but computes it
non-lazily (the evaluation order that LVI uses is non-optimal), or
B. We should stop asking about every variable.

LazyValueInfo is supposed to be for lazy queries. It is a backwards
solver. This is going to be the worst possible order to ask about things
in :)
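As a toy illustration of the ordering argument (just counting work, no actual LVI code): on a def-use chain v0 -> v1 -> ... -> v[n-1], answering each query by lazily walking back to the root does quadratic total work, while one eager forward pass over the same chain is linear.

```cpp
#include <cstddef>

// Lazy-style: each of the n queries re-walks its backward chain to the
// root. Returns the total number of visit steps taken.
std::size_t lazyQueryAll(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i)       // query v[i]...
        for (std::size_t j = i + 1; j > 0; --j)
            ++steps;                          // ...walks i+1 values back
    return steps;                             // n*(n+1)/2
}

// Eager-style: one forward pass computes every fact exactly once.
std::size_t eagerSolveAll(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i)
        ++steps;
    return steps;                             // n
}
```

Real LVI caches intermediate results, so its behavior is better than the worst case sketched here, but the point about query order stands: asking backward about everything is the unfriendly direction for a backwards solver.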

There are better orderings and better solving strategies that will produce
identical info, far faster.
However, most of these can't be stopped "in the middle" like LVI can.
If we are asking for every variable, we should do that.

Given that meet/join/etc is already abstracted out pretty well, writing
such a solver using SparseSolver or something should be pretty trivial.
My guess is a couple hundred lines of code at most.
I would do that instead, and use it in passes that are asking about every
variable.
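To make the suggestion concrete, here is a heavily simplified sketch of the kind of eager sparse solver meant above (this is not llvm::SparseSolver's API; it is a toy forward worklist solver over a three-level constant lattice, for a toy IR of constants and adds):

```cpp
#include <deque>
#include <map>
#include <vector>

enum class Kind { Undef, Constant, Overdef };
struct LatticeVal { Kind kind; int value; };   // zero-init => Undef

struct Node { std::vector<int> ops; bool isConst; int constVal; };

class SparseConstSolver {
    std::map<int, Node> nodes;
    std::map<int, std::vector<int>> users;     // def-use edges
    std::map<int, LatticeVal> state;
    std::deque<int> work;

public:
    void addConst(int id, int v) {
        nodes[id] = {{}, true, v};
        work.push_back(id);
    }
    void addAdd(int id, int a, int b) {        // id = a + b
        nodes[id] = {{a, b}, false, 0};
        users[a].push_back(id);
        users[b].push_back(id);
        work.push_back(id);
    }

    // Visit each value only when one of its operands changes: sparse
    // propagation, every fact computed eagerly and at most a few times.
    void solve() {
        while (!work.empty()) {
            int id = work.front(); work.pop_front();
            LatticeVal nv = evaluate(id);
            LatticeVal &old = state[id];
            if (nv.kind != old.kind || nv.value != old.value) {
                old = nv;
                for (int u : users[id]) work.push_back(u);
            }
        }
    }

    LatticeVal get(int id) { return state[id]; }

private:
    LatticeVal evaluate(int id) {
        const Node &n = nodes[id];
        if (n.isConst) return {Kind::Constant, n.constVal};
        LatticeVal a = state[n.ops[0]], b = state[n.ops[1]];
        if (a.kind == Kind::Overdef || b.kind == Kind::Overdef)
            return {Kind::Overdef, 0};
        if (a.kind == Kind::Undef || b.kind == Kind::Undef)
            return {Kind::Undef, 0};
        return {Kind::Constant, a.value + b.value};
    }
};
```

A real replacement would of course need the full meet/join over constant ranges and CFG-edge handling, but the worklist skeleton is the whole trick, which is why the "couple hundred lines" estimate is plausible.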

reames added a subscriber: reames.Mar 11 2016, 5:01 PM

On the topic of LVI vs. eager value info, I'm much less convinced than
Danny that jumping straight to that is a good idea. There's lots of
room to improve the implementation of LVI (e.g.
https://llvm.org/bugs/show_bug.cgi?id=26921). I'm not opposed to the
idea of an eager algorithm, but getting the eager code to handle all of
the cases LVI does is not as obvious as it might first seem. In
particular, LVI's handling of loops is surprisingly sophisticated and
non-obvious: it can do fairly involved inductive proofs of constant
ranges.

Philip

I have not read the patch in detail, but it seems the main purpose of the patch is to throttle LICM to reduce register pressure, and the compile-time benefit from reduced LVI queries is just a good side effect? Do we have any runtime performance data showing the usefulness of the throttling? (Whether this is the right approach to the spill problem is also open to discussion.)

David,

I very deeply believe that limiting hoisting in the IR to reduce
register pressure is fundamentally the wrong approach. The backend is
responsible for sinking if desired to reduce register pressure. We do
not model register pressure in the IR. At all. Period.
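(For contrast with the hoisted form discussed earlier, "sinking" is the inverse transformation: recomputing the value inside the loop shortens its live range at the cost of redundant work per iteration. A hypothetical source-level illustration, not code from any pass:)

```cpp
// Sunk form: x*y is recomputed every iteration, so no extra value is
// live across the loop. The backend can choose this form when register
// pressure makes the long live range more costly than the recomputation.
int sum_sunk(const int *a, int n, int x, int y) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i] * (x * y);   // recomputed per iteration
    return s;
}
```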

Philip

reames resigned from this revision.Feb 25 2020, 9:01 AM

Resigning from a stale review (2016). Feel free to re-add me if the thread is ever revived.

Herald added a project: Restricted Project. Feb 25 2020, 9:01 AM
Herald added a subscriber: asbirlea.