This is an archive of the discontinued LLVM Phabricator instance.

code hoisting using GVN
AbandonedPublic

Authored by sebpop on Apr 1 2016, 1:28 PM.

Details

Reviewers

chandlerc
mehdi_amini
mcrosier

Summary

This pass hoists common computations across branches that share a common immediate
dominator. Like early-cse, the primary goal of early-gvn is to reduce the size
of functions before the inlining heuristics run, to reduce the total cost of function
inlining. In some cases this pass also shortens the critical path by exposing
more ILP.
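
As a source-level illustration of the transformation described in this summary, here is a minimal sketch. The pass itself works on LLVM IR, and the function names below are made up for this example. Both arms of the branch compute `A * B`, so the computation can be done once in the block that dominates both arms:

```cpp
// Hypothetical example: both branches compute the same value A * B.
int beforeHoisting(bool Cond, int A, int B) {
  if (Cond) {
    int T = A * B; // computed in the "then" block
    return T + 1;
  }
  int T = A * B;   // same computation repeated in the "else" block
  return T - 1;
}

// What hoisting conceptually produces: a single multiply in the
// dominating block, shared by both arms.
int afterHoisting(bool Cond, int A, int B) {
  int T = A * B;   // hoisted above the branch
  return Cond ? T + 1 : T - 1;
}
```

The two functions return the same value for every input; the second contains one multiply instead of two, which is the code-size effect the summary refers to.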

Passes the LLVM regression tests and the test-suite.

Pass written by:
Sebastian Pop
Aditya Kumar
Xiaoyu Hu
Brian Rzycki

Diff Detail

Event Timeline

sebpop updated this revision to Diff 52415. Apr 1 2016, 1:28 PM

sebpop retitled this revision from to code hoisting using GVN.

sebpop updated this object.

sebpop added reviewers: mcrosier, chandlerc, mehdi_amini.

sebpop set the repository for this revision to rL LLVM.

sebpop added a subscriber: hiraditya.

Herald added a subscriber: mehdi_amini. Apr 1 2016, 1:28 PM

sebpop added a subscriber: flyingforyou. Apr 1 2016, 1:29 PM

hxy9243 added a subscriber: hxy9243. Apr 1 2016, 2:19 PM

mcrosier added inline comments. Apr 4 2016, 6:42 AM

llvm/lib/Transforms/Scalar/GVN.cpp
2726	ZeroOrMore isn't generally used with cl::opt.
2730	Same as above.

sebpop added a subscriber: llvm-commits. Apr 4 2016, 8:40 AM

sebpop added inline comments.

llvm/lib/Transforms/Scalar/GVN.cpp
2726	I copied the stmt from above: GVN.cpp line 76. Do you have an alternative option I should be using?

This needs a ton more correctness tests. It also needs real performance numbers, both on the compile time and execution time side.

For loads, i have trouble seeing how it gets the following case right:
  1
 / \
2   3
|   |
4   5
Two loads, one in 4, one in 5.
All pointer operands are defined in block 1 (so you are all good there).
Both 2 and 3 contain calls that kill memory.
It looks like you will try to hoist to 1 anyway, because nowhere do you use memory dependence or memoryssa to figure out *if the memory state* is the same. For scalars, it doesn't matter.

You do stop trying to hoist when you hit modifying expressions in 4 or 5, but that won't help you here.
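
To make the concern concrete, here is a small self-contained sketch of that CFG (block 1 branches to 2 and 3, which fall through to 4 and 5). Everything in it is a hypothetical stand-in: `Cell` models the loaded memory location and `clobber` models the memory-killing calls in blocks 2 and 3. It only illustrates why the memory state matters and is not taken from the patch:

```cpp
// Toy model of the CFG 1 -> {2, 3}, 2 -> 4, 3 -> 5. All names are hypothetical.
static int Cell = 0;

// Stands in for the call in blocks 2 and 3 that kills memory.
void clobber(int V) { Cell = V; }

// Original program: the loads sit in blocks 4 and 5, after the calls.
int original(bool Cond) {
  Cell = 1;       // block 1
  if (Cond) {
    clobber(10);  // block 2
    return Cell;  // block 4: the load sees 10
  }
  clobber(20);    // block 3
  return Cell;    // block 5: the load sees 20
}

// Invalid hoist: the load is moved to block 1, above both calls.
int wronglyHoisted(bool Cond) {
  Cell = 1;          // block 1
  int Loaded = Cell; // hoisted load sees the stale value 1
  if (Cond) {
    clobber(10);
    return Loaded;
  }
  clobber(20);
  return Loaded;
}
```

The pointer operand is available in block 1 in both versions, yet the results differ, so operand availability alone is not a sufficient condition for hoisting a load.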

This seems like a really expensive way of doing this :)

This is pretty badly N^2.

Can i suggest a different method, closely based on what we do in GCC for speed reasons? (You will discover that the vast majority of VNs have only one expression each, so it's pointless to walk and do lookups on them.)

Make a multimap from VN to each expression with that VN (VNTable is not currently this) over the entire program.

For each VN in the table:
 if (size (expressions with a given VN) > 1):
   One of:
    A. Calculate VBE (very busy expressions) over the expressions with that VN.
    B. Do something similar to what you do now (now you have N^2 where N is the number of expressions with a given VN, instead of the number of blocks).
    C. Something like this should work for completeness and sparseness:
        For each block in domtree in DFS order:
              For each expression, if DFSIn/DFSOut(expression->parent) is within the range of DFSIn/DFSOut(current domtree node), push(possiblehoistlist (a list), expression (current expression), block (insertion point)).
              If you have 2 or more things on possiblehoistlist, calculate availability of operands for each expression in block (you can cache the highest point you can move them to, to avoid checking again and again. Since you are using dominance, you know they can only be hoisted to blocks dominated by the highest point you can hoist to).
              If 2 or more things are still available, hoist.

Note you can also likely skip any domtree block that does not have two or more children, i just haven't proven it to myself yet.
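
The first step of this suggestion (group expressions by VN, then only look at groups with more than one member) can be sketched as follows. This is a minimal standalone illustration, not code from GCC or LLVM: `Expr` and `hoistCandidates` are hypothetical names, and a `std::map` of vectors is used as an ordered equivalent of the multimap mentioned above.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction: a name and its value number.
struct Expr {
  std::string Name;
  uint32_t VN;
};

// Group expression names by value number and drop every group that has
// fewer than two members: a VN with a single expression can never be a
// hoisting candidate, so later phases never need to visit it.
std::map<uint32_t, std::vector<std::string>>
hoistCandidates(const std::vector<Expr> &Exprs) {
  std::map<uint32_t, std::vector<std::string>> ByVN;
  for (const Expr &E : Exprs)
    ByVN[E.VN].push_back(E.Name);

  for (auto It = ByVN.begin(); It != ByVN.end();) {
    if (It->second.size() < 2)
      It = ByVN.erase(It); // singleton group: nothing to hoist
    else
      ++It;
  }
  return ByVN;
}
```

Options A, B or C would then run only over the surviving groups, which keeps the work proportional to the number of duplicated expressions rather than to the size of the function.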

Thanks Danny for your feedback.

In D18710#391283, @dberlin wrote:

This needs a ton more correctness tests.

We will add more testcases.

It also needs real performance numbers, both on the compile time and execution time side.

We ran the LLVM nightly test-suite on x86 and did not get any meaningful numbers we could report: the results are too noisy because of the short execution times. We will run SPEC 2k and 2k6 on x86, where things should be more stable.

For loads, i have trouble seeing how it gets the following case right:
  1
 / \
2   3
|   |
4   5
Two loads, one in 4, one in 5.
All pointer operands are defined in block 1 (so you are all good there).
Both 2 and 3 contain calls that kill memory.
It looks like you will try to hoist to 1 anyway, because nowhere do you use memory dependence or memoryssa to figure out *if the memory state* is the same. For scalars, it doesn't matter.

You do stop trying to hoist when you hit modifying expressions in 4 or 5, but that won't help you here.

The pattern we are matching does not match this example: please correct me if I'm wrong here.
The only case we are handling is without the intermediate blocks 2 and 3 on your example:

BasicBlock *BB = Dom->getBlock();
// Only handle two branches for now: it is possible to extend the hoisting
// to switch statements.
BranchInst *BI = dyn_cast<BranchInst>(BB->getTerminator());
if (!BI || BI->getNumSuccessors() != 2)
  return false;

BasicBlock *BB1 = BI->getSuccessor(0);
BasicBlock *BB2 = BI->getSuccessor(1);
assert(BB1 != BB2 && "invalid CFG");

if (!DT->properlyDominates(BB, BB1) ||
    !DT->properlyDominates(BB, BB2) ||
    BB1->isEHPad() || BB1->hasAddressTaken() ||
    BB2->isEHPad() || BB2->hasAddressTaken())
  return false;

BB1 and BB2 are the direct successors of BB: there is no other block in between.
We only hoist expressions from BB1 and BB2 into BB.
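
The restriction described here, hoisting only from the two direct successors of a conditional branch, can be sketched on a toy CFG. The `Block` struct and the `hoistSources` helper are hypothetical stand-ins for illustration, not LLVM classes, and the sketch leaves out the dominance, EH-pad and address-taken checks from the code quoted above.

```cpp
#include <vector>

// Hypothetical toy CFG node; not an LLVM class.
struct Block {
  std::vector<Block *> Succs;
};

// Return the blocks that expressions may be hoisted from into BB:
// the two distinct direct successors of a two-way branch, or nothing
// for any other terminator shape.
std::vector<const Block *> hoistSources(const Block &BB) {
  if (BB.Succs.size() != 2 || BB.Succs[0] == BB.Succs[1])
    return {};
  return {BB.Succs[0], BB.Succs[1]};
}
```

On the five-block example from the review (1 branches to 2 and 3, which lead to 4 and 5), `hoistSources` for block 1 yields only blocks 2 and 3, so loads in 4 and 5 are never considered for hoisting into 1 in a single step.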

This seems like a really expensive way of doing this :)

This is pretty badly N^2.

Can i suggest a different method, closely based on what we do in GCC for speed reasons

We will implement your suggestion: Aditya has also remarked that we could be more efficient on compile time by using the internal structures of the VN table, though I agree with you that we will need some more changes.

mcrosier added inline comments. Apr 4 2016, 10:59 AM

llvm/lib/Transforms/Scalar/GVN.cpp
2726	We should be using the default for cl::opt, which is cl::Optional (i.e., the option may be specified zero or once). My suggestion is to remove cl::ZeroOrMore from the command line option as this should only be used with cl::list. Hopefully, that makes sense.

The pattern we are matching does not match this example: please correct me if I'm wrong here.
The only case we are handling is without the intermediate blocks 2 and 3 on your example:

BasicBlock *BB = Dom->getBlock();
// Only handle two branches for now: it is possible to extend the hoisting
// to switch statements.
BranchInst *BI = dyn_cast<BranchInst>(BB->getTerminator());
if (!BI || BI->getNumSuccessors() != 2)
  return false;

BB1 and BB2 are the direct successors of BB: there is no other block in between.
We only hoist expressions from BB1 and BB2 into BB.

Yes, i misread this part.

This seems like a really expensive way of doing this :)
This is pretty badly N^2.
Can i suggest a different method, closely based on what we do in GCC for speed reasons

We will implement your suggestion: Aditya has also remarked that we could be more efficient on compile time by using the internal structures of the VN table, though I agree with you that we will need some more changes.

(FWIW: New GVN already has such a mapping/structure. For each value, it can
tell you all the member expressions of that value)

(FWIW: New GVN already has such a mapping/structure. For each value, it can
tell you all the member expressions of that value)

Thanks for the feedback. You mention a 'new GVN': could you point to where the new GVN is?
Are you referring to the LeaderTable?

-Aditya

Addressed comments from Chad and Danny.
Added more testcases and a flag to limit the O(n^2) behavior.

Folks, you didn't follow the specific guidance about how to create a patch
with Phabricator. Specifically, when you created the revision, the
'llvm-commits' list was not CC'ed.

As a result, I suspect a large number of folks are getting inline comments
with *zero context* about what this patch even is. That includes myself. It
makes it super hard to even understand what is going on.

Please nuke this entry on Phab, and create a fresh one with the mailing
list attached. That way it will send a nice original email with the actual
patch file attached.

It would also be really nice if the summary were substantially more useful
than "code hoisting using GVN" which sounds like adding a feature to GVN.
The actual description in Phab seems *very* different:
"""
This pass hoists common computations across branches sharing common
immediate dominator. Like early-cse, the primary goal of early-gvn is to
reduce the size of functions before inline heuristics to reduce the total
cost of function inlining.
"""

So this is actually introducing a totally new pass?? Very confused. Waiting
for a fresh code review to make detailed comments.

We don't want to add flags to control time when we know of better
algorithms that can do it.
Please, let's not add more N^2 algorithms to the compiler just because it
was a bit faster to write. Let's just do it well the first time :)

https://github.com/dberlin/llvm-gvn-rewrite/tree/newgvn2

(and in particular,
https://github.com/dberlin/llvm-gvn-rewrite/blob/newgvn2/lib/Transforms/Scalar/NewGVN.cpp
)

I'm rewriting GVN.
MemorySSA, recently committed, was the first step along this route.

It's at the point where it works, it needs to be broken down again and
pieces sent for review.

In any case, you can see struct congruenceclass, which tracks all the
expressions that belong to a given congruence class (the memberset).

It even tells you things that are equivalent to that congruence class in
subparts of the CFG, or can be coerced through type changes to be the same
as things in the class, etc.
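
For readers unfamiliar with the idea, a congruence class with a member set can be sketched as below. This is a deliberately simplified, hypothetical illustration: the field and method names are assumptions made for this example and do not reflect the actual layout of the struct in the linked NewGVN.cpp.

```cpp
#include <set>
#include <string>

// Hypothetical sketch of a congruence class; names are assumptions,
// not the real NewGVN data structure.
struct CongruenceClass {
  std::string Leader;            // representative of the class (assumed)
  std::set<std::string> Members; // every expression known to be equivalent

  // Add an expression; the first one added becomes the leader.
  void addMember(const std::string &E) {
    if (Members.empty())
      Leader = E;
    Members.insert(E);
  }

  // A class with two or more members contains redundant computations.
  bool hasRedundancy() const { return Members.size() > 1; }
};
```

With such a structure, asking for all expressions that share a value is a direct lookup of the member set, which is the mapping the earlier suggestion needs.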

Note: It catches 99% of the things GVN does, and plenty it doesn't. I have
no plans on porting every possible optimization to it (and that would
include code hoisting :P)

Moving review to http://reviews.llvm.org/D18798

Revision Contents

Path                                              Size
llvm/include/llvm/InitializePasses.h              1 line
llvm/include/llvm/LinkAllPasses.h                 1 line
llvm/include/llvm/Transforms/Scalar.h             7 lines
llvm/include/llvm/Transforms/Scalar/GVN.h         15 lines
llvm/lib/Passes/PassRegistry.def                  1 line
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp    1 line
llvm/lib/Transforms/Scalar/GVN.cpp                323 lines
llvm/lib/Transforms/Scalar/Scalar.cpp             5 lines
llvm/test/Transforms/GVN/hoist.ll                 246 lines

Diff 52632

llvm/include/llvm/InitializePasses.h

[...]
  void initializeAddressSanitizerPass(PassRegistry&);
  void initializeAddressSanitizerModulePass(PassRegistry&);
  void initializeMemorySanitizerPass(PassRegistry&);
  void initializeThreadSanitizerPass(PassRegistry&);
  void initializeSanitizerCoverageModulePass(PassRegistry&);
  void initializeDataFlowSanitizerPass(PassRegistry&);
  void initializeScalarizerPass(PassRegistry&);
  void initializeEarlyCSELegacyPassPass(PassRegistry &);
+ void initializeEarlyGVNLegacyPassPass(PassRegistry &);
  void initializeEliminateAvailableExternallyPass(PassRegistry&);
  void initializeExpandISelPseudosPass(PassRegistry&);
  void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);
  void initializeGCMachineCodeAnalysisPass(PassRegistry&);
  void initializeGCModuleInfoPass(PassRegistry&);
  void initializeGVNLegacyPassPass(PassRegistry&);
  void initializeGlobalDCEPass(PassRegistry&);
  void initializeGlobalOptPass(PassRegistry&);
[...]

llvm/include/llvm/LinkAllPasses.h

[...] ForcePassLinking() {
  (void) llvm::createStripDeadPrototypesPass();
  (void) llvm::createTailCallEliminationPass();
  (void) llvm::createJumpThreadingPass();
  (void) llvm::createUnifyFunctionExitNodesPass();
  (void) llvm::createInstCountPass();
  (void) llvm::createConstantHoistingPass();
  (void) llvm::createCodeGenPreparePass();
  (void) llvm::createEarlyCSEPass();
+ (void) llvm::createEarlyGVNPass();
  (void) llvm::createMergedLoadStoreMotionPass();
  (void) llvm::createGVNPass();
  (void) llvm::createMemCpyOptPass();
  (void) llvm::createLoopDeletionPass();
  (void) llvm::createPostDomTree();
  (void) llvm::createInstructionNamerPass();
  (void) llvm::createMetaRenamerPass();
  (void) llvm::createPostOrderFunctionAttrsLegacyPass();
[...]

llvm/include/llvm/Transforms/Scalar.h

[...]
  //
  // EarlyCSE - This pass performs a simple and fast CSE pass over the dominator
  // tree.
  //
  FunctionPass *createEarlyCSEPass();

  //===----------------------------------------------------------------------===//
  //
+ // EarlyGVN - This pass performs a simple and fast GVN pass over the dominator
+ // tree to hoist common expressions from sibling branches.
+ //
+ FunctionPass *createEarlyGVNPass();
+
+ //===----------------------------------------------------------------------===//
+ //
  // MergedLoadStoreMotion - This pass merges loads and stores in diamonds. Loads
  // are hoisted into the header, while stores sink into the footer.
  //
  FunctionPass *createMergedLoadStoreMotionPass();

  //===----------------------------------------------------------------------===//
  //
  // MemCpyOpt - This pass performs optimizations related to eliminating memcpy
[...]

llvm/include/llvm/Transforms/Scalar/GVN.h

[...] void markInstructionForDeletion(Instruction *I) {
  VN.erase(I);
  InstrsToErase.push_back(I);
  }

  DominatorTree &getDominatorTree() const { return *DT; }
  AliasAnalysis *getAliasAnalysis() const { return VN.getAliasAnalysis(); }
  MemoryDependenceResults &getMemDep() const { return *MD; }

- private:
- friend class gvn::GVNLegacyPass;

  struct Expression;
- friend struct DenseMapInfo<Expression>;

  /// This class holds the mapping between values and value numbers. It is used
  /// as an efficient mechanism to determine the expression-wise equivalence of
  /// two values.
  class ValueTable {
  DenseMap<Value *, uint32_t> valueNumbering;
  DenseMap<Expression, uint32_t> expressionNumbering;
  AliasAnalysis *AA;
[...] public:
  void setAliasAnalysis(AliasAnalysis *A) { AA = A; }
  AliasAnalysis *getAliasAnalysis() const { return AA; }
  void setMemDep(MemoryDependenceResults *M) { MD = M; }
  void setDomTree(DominatorTree *D) { DT = D; }
  uint32_t getNextUnusedValueNumber() { return nextValueNumber; }
  void verifyRemoved(const Value *) const;
  };

+ private:
+ friend class gvn::GVNLegacyPass;
+ friend struct DenseMapInfo<Expression>;

  MemoryDependenceResults *MD;
  DominatorTree *DT;
  const TargetLibraryInfo *TLI;
  AssumptionCache *AC;
  SetVector<BasicBlock *> DeadBlocks;

  ValueTable VN;

[...] private:
  void addDeadBlock(BasicBlock *BB);
  void assignValNumForDeadCode();
  };

  /// Create a legacy GVN pass. This also allows parameterizing whether or not
  /// loads are eliminated by the pass.
  FunctionPass *createGVNPass(bool NoLoads = false);

+ /// \brief A simple and fast domtree-based GVN pass to hoist common expressions
+ /// from sibling branches.
+ struct EarlyGVNPass : PassInfoMixin<EarlyGVNPass> {
+ /// \brief Run the pass over the function.
+ PreservedAnalyses run(Function &F, AnalysisManager<Function> &AM);
+ };

  }

  #endif

llvm/lib/Passes/PassRegistry.def

[...]
  #undef FUNCTION_ANALYSIS

  #ifndef FUNCTION_PASS
  #define FUNCTION_PASS(NAME, CREATE_PASS)
  #endif
  FUNCTION_PASS("aa-eval", AAEvaluator())
  FUNCTION_PASS("adce", ADCEPass())
  FUNCTION_PASS("early-cse", EarlyCSEPass())
+ FUNCTION_PASS("early-gvn", EarlyGVNPass())
  FUNCTION_PASS("instcombine", InstCombinePass())
  FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())
  FUNCTION_PASS("no-op-function", NoOpFunctionPass())
  FUNCTION_PASS("lower-expect", LowerExpectIntrinsicPass())
  FUNCTION_PASS("gvn", GVN())
  FUNCTION_PASS("print", PrintFunctionPass(dbgs()))
  FUNCTION_PASS("print<assumptions>", AssumptionPrinterPass(dbgs()))
  FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))
[...]

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

[...] void PassManagerBuilder::populateFunctionPassManager(
  addInitialAliasAnalysisPasses(FPM);

  FPM.add(createCFGSimplificationPass());
  if (UseNewSROA)
  FPM.add(createSROAPass());
  else
  FPM.add(createScalarReplAggregatesPass());
  FPM.add(createEarlyCSEPass());
+ FPM.add(createEarlyGVNPass());
  FPM.add(createLowerExpectIntrinsicPass());
  }

  // Do PGO instrumentation generation or use pass as the option specified.
  void PassManagerBuilder::addPGOInstrPasses(legacy::PassManagerBase &MPM) {
  if (!PGOInstrGen.empty()) {
  MPM.add(createPGOInstrumentationGenPass());
  // Add the profile lowering pass.
[...]

llvm/lib/Transforms/Scalar/GVN.cpp

[...]
  #include "llvm/IR/IntrinsicInst.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Metadata.h"
  #include "llvm/IR/PatternMatch.h"
  #include "llvm/Support/Allocator.h"
  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/Debug.h"
  #include "llvm/Support/raw_ostream.h"
+ #include "llvm/Transforms/Scalar.h"
  #include "llvm/Transforms/Utils/BasicBlockUtils.h"
  #include "llvm/Transforms/Utils/Local.h"
  #include "llvm/Transforms/Utils/SSAUpdater.h"
  #include <vector>
+ #include <unordered_map>
  using namespace llvm;
  using namespace llvm::gvn;
  using namespace PatternMatch;

  #define DEBUG_TYPE "gvn"

  STATISTIC(NumGVNInstr, "Number of instructions deleted");
  STATISTIC(NumGVNLoad, "Number of loads deleted");
[...]
  INITIALIZE_PASS_BEGIN(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)
  INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
  INITIALIZE_PASS_DEPENDENCY(MemoryDependenceWrapperPass)
  INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
  INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
  INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
  INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
  INITIALIZE_PASS_END(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)

				static cl::opt<int>
				HoistedScalarsThreshold("hoisted-scalars-threshold", cl::Hidden, cl::init(-1),
				cl::desc("Max number of scalar instructions to hoist "
				"(default unlimited = -1)"));
				static cl::opt<int>
				HoistedLoadsThreshold("hoisted-loads-threshold", cl::Hidden, cl::init(-1),
				cl::desc("Max number of loads to hoist "
				"(default unlimited = -1)"));
				static cl::opt<int>
				HoistMaxDepthDependence("hoist-max-depth-dependence", cl::Hidden, cl::init(5),
				cl::desc("Max depth of a dependence chain to be hoisted "
				"(default 5, unlimited = -1)"));

				static int ScalarCounter = 0;
				static int LoadCounter = 0;

				namespace {
				// This pass hoists common computations across branches sharing
				// common immediate dominator. The primary goal is to reduce the code size,
				// and in some cases reduce critical path (by exposing more ILP).
				class EarlyGVNLegacyPassImpl {
				public:
				GVN::ValueTable VN;
				DominatorTree *DT;
				AliasAnalysis *AA;
				MemoryDependenceResults *MD;
				static char ID;

				EarlyGVNLegacyPassImpl(DominatorTree *dt, AliasAnalysis *aa, MemoryDependenceResults *md)
				: DT (dt), AA (aa), MD (md)
				{ }

				// Return true when all operands of Instr are available at insertion point
				// InsertPt. When limiting the number of hoisted expressions, one could hoist
				// a load without hoisting its access function. So before hoisting any
				// expression, make sure that all its operands are available at insert point.
				bool allOperandsAvailable(Instruction *I, Instruction *InsertPt) {
				for (unsigned i = 0, e = I->getNumOperands(); i != e; ++i) {
				Value *Op = I->getOperand(i);
				Instruction *Inst = dyn_cast<Instruction>(Op);
				if (!Inst)
				continue;

				if (!DT->dominates(Inst->getParent(), InsertPt->getParent()))
				return false;
				}

				return true;
				}

				// Hoist all instructions in H at InsertPt.
				void hoist(std::vector<std::pair<Instruction *, Instruction *> > &H,
				Instruction *InsertPt) {
				for (std::pair<Instruction *, Instruction *> &P : H) {
				Instruction *I1 = P.first;
				Instruction *I2 = P.second;
				I1->moveBefore(InsertPt);
				patchAndReplaceAllUsesWith(I2, I1);
				I2->eraseFromParent();
				DEBUG(dbgs() << "GVN hoisting: " << *I1 << '\n');
				}
				}

				// Hoist scalar operations.
				bool hoistScalars(Instruction *InsertPt, BasicBlock *BB1, BasicBlock *BB2) {
				bool Changed = false;

				// Record from BB1 all instructions and their VN.
				std::unordered_map<unsigned, Instruction *> VNtoInstruction;
				for (Instruction &I1 : *BB1) {
				unsigned V = VN.lookup_or_add(&I1);
				VNtoInstruction.insert(std::make_pair(V, &I1));
				}

				// Scan BB2 for instructions appearing in BB1 with identical VN.
				std::vector<std::pair<Instruction *, Instruction *> > HoistInstructions;
				for (Instruction &I2 : *BB2) {
				unsigned V = VN.lookup_or_add(&I2);

				if (I2.mayWriteToMemory())
				continue;

				// We are not dealing with loads here.
				LoadInst *Load = dyn_cast<LoadInst>(&I2);
				if (Load)
				continue;

				// Check whether BB1 contains a similar scalar instruction.
				auto It = VNtoInstruction.find(V);
				if (It == VNtoInstruction.end())
				continue;

				// Make sure all operands are available at insertion point.
				if (!allOperandsAvailable(&I2, InsertPt))
				continue;

				// Bound the number of hoisted scalar expressions.
				if (HoistedScalarsThreshold != -1 &&
				ScalarCounter >= HoistedScalarsThreshold)
				break;
				ScalarCounter++;

				// Hoist identical instructions I2 and I1.
				Changed = true;
				Instruction *I1 = It->second;
				HoistInstructions.push_back(std::make_pair(I1, &I2));
				}

				hoist(HoistInstructions, InsertPt);

				return Changed;
				}

				// Hoist identical loads from BB1 and BB2 into BB.
				bool hoistLoads(Instruction *InsertPt, BasicBlock *BB1, BasicBlock *BB2) {
				bool Changed = false;
				bool IsTriangle = false;

				// The First BB to be traversed should be the one with single predecessor.
				if (!BB1->getSinglePredecessor()) {
				if (!BB2->getSinglePredecessor())
				return false;
				if (BB2->getSingleSuccessor() != BB1)
				return false;

				std::swap(BB1, BB2);
				IsTriangle = true;
				} else if (!BB2->getSinglePredecessor()) {
				if (BB1->getSingleSuccessor() != BB2)
				return false;

				IsTriangle = true;
				}

				assert (BB1->getSinglePredecessor() == InsertPt->getParent());

				// Record from BB1 all loads and their access function VN.
				std::unordered_map<unsigned, LoadInst *> VNtoLoad;
				for (Instruction &I1 : *BB1) {
				if (I1.mayHaveSideEffects()) {
				if (IsTriangle)
				return false;
				break;
				}
				LoadInst *Load = dyn_cast<LoadInst>(&I1);
				if (!Load)
				continue;
				if (!Load->isSimple())
				break;

				Value *Ptr = Load->getPointerOperand();
				unsigned V = VN.lookup_or_add(Ptr);
				VNtoLoad.insert(std::make_pair(V, Load));
				}

				if (VNtoLoad.empty())
				return false;

				// Scan BB2 for loads appearing in BB1 with identical access functions.
				std::vector<std::pair<Instruction *, Instruction *> > HoistInstructions;
				for (Instruction &I2 : *BB2) {
				if (I2.mayHaveSideEffects())
				break;

				LoadInst *Load = dyn_cast<LoadInst>(&I2);
				if (!Load)
				continue;
				if (!Load->isSimple())
				break;

				Value *Ptr = Load->getPointerOperand();
				unsigned V = VN.lookup_or_add(Ptr);

				// Check whether BB1 contains a similar load.
				auto It = VNtoLoad.find(V);
				if (It == VNtoLoad.end())
				continue;

				// Check whether the load elements are of the same type.
				LoadInst *I1 = It->second;
				if (cast<PointerType>(I1->getPointerOperand()->getType())->getElementType() !=
				cast<PointerType>(Ptr->getType())->getElementType())
				continue;

				// Make sure all operands are available at insertion point.
				if (!allOperandsAvailable(&I2, InsertPt))
				continue;

				// Bound the number of hoisted load expressions.
				if (HoistedLoadsThreshold != -1 &&
				LoadCounter >= HoistedLoadsThreshold)
				break;
				LoadCounter++;

				// Hoist identical load instructions I2 and I1.
				Changed = true;
				HoistInstructions.push_back(std::make_pair(I1, &I2));
				}

				// Code generate.
				hoist(HoistInstructions, InsertPt);

				return Changed;
				}

  // Hoist all expressions. Return true when code has been hoisted under Dom.
  bool hoistExpressions(DomTreeNodeBase<BasicBlock> *Dom) {
    // Depth first search for the leaves of the dominator tree. We start
    // hoisting expressions from the bottom up because that would allow some
    // expressions to be hoisted several times.
    for (auto BB : *Dom)
      hoistExpressions(BB);

    BasicBlock *BB = Dom->getBlock();
    // Only handle two branches for now: it is possible to extend the hoisting
    // to switch statements.
    BranchInst *BI = dyn_cast<BranchInst>(BB->getTerminator());
    if (!BI || BI->getNumSuccessors() != 2)
      return false;

    BasicBlock *BB1 = BI->getSuccessor(0);
    BasicBlock *BB2 = BI->getSuccessor(1);
    assert(BB1 != BB2 && "invalid CFG");

    if (!DT->properlyDominates(BB, BB1) ||
        !DT->properlyDominates(BB, BB2) ||
        BB1->isEHPad() || BB1->hasAddressTaken() ||
        BB2->isEHPad() || BB2->hasAddressTaken())
      return false;

    bool Changed, Res = false;
    int Depth = HoistMaxDepthDependence;
    do {
      // Limit to HoistMaxDepthDependence the number of iterations in order to
      // avoid O(N^2) behavior: dependent instructions are hoisted one at a time
      // in subsequent iterations of this loop.
      if (Depth == 0)
        break;
      if (Depth != -1)
        --Depth;

      Changed = hoistScalars(BB->getTerminator(), BB1, BB2);
      if (hoistLoads(BB->getTerminator(), BB1, BB2)) {
        // Clear the value number table as otherwise the scalar computations
        // depending on the loads would not get value numbered again based on the
        // hoisted loads.
        VN.clear();
        Changed = true;
      }

      if (Changed)
        Res = true;
    } while (Changed);

    return Res;
  }

  bool run() {
    VN.setDomTree(DT);
    VN.setAliasAnalysis(AA);
    VN.setMemDep(MD);
    hoistExpressions(DT->getNode(DT->getRoot()));
    return false;
  }
};

class EarlyGVNLegacyPass : public FunctionPass {
public:
  static char ID;

  EarlyGVNLegacyPass() : FunctionPass(ID) {
    initializeEarlyGVNLegacyPassPass(*PassRegistry::getPassRegistry());
  }

  bool runOnFunction(Function &F) override {
    if (skipOptnoneFunction(F))
      return false;

    auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
    auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
    auto &MD = getAnalysis<MemoryDependenceWrapperPass>().getMemDep();

    EarlyGVNLegacyPassImpl G (&DT, &AA, &MD);
    return G.run();
  }

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addRequired<DominatorTreeWrapperPass>();
    AU.addRequired<MemoryDependenceWrapperPass>();
    AU.addRequired<AAResultsWrapperPass>();

    AU.addPreserved<DominatorTreeWrapperPass>();
  }
};
} // namespace

PreservedAnalyses
EarlyGVNPass::run(Function &F,
                  AnalysisManager<Function> &AM) {
  DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);
  AliasAnalysis &AA = AM.getResult<AAManager>(F);
  MemoryDependenceResults &MD = AM.getResult<MemoryDependenceAnalysis>(F);

  EarlyGVNLegacyPassImpl G (&DT, &AA, &MD);
  if (!G.run())
    return PreservedAnalyses::all();

  PreservedAnalyses PA;
  PA.preserve<DominatorTreeAnalysis>();
  return PA;
}

char EarlyGVNLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(EarlyGVNLegacyPass, "early-gvn", "Early GVN Hoisting of Expressions", false, false)
INITIALIZE_PASS_DEPENDENCY(MemoryDependenceWrapperPass)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_END(EarlyGVNLegacyPass, "early-gvn", "Early GVN Hoisting of Expressions", false, false)

FunctionPass *llvm::createEarlyGVNPass() { return new EarlyGVNLegacyPass(); }
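
The load matching done by hoistLoads in the GVN.cpp code above can be illustrated outside of LLVM with a small standalone model. This is only a sketch, not the patch's implementation: `ToyInst`, `matchLoads`, and the integer value numbers are hypothetical names introduced here, a store stands in for any instruction with side effects, and the scan of BB1 is assumed to stop at the first side effect in the same way the visible scan of BB2 does. It models only the per-block matching; it does not model whether memory is unchanged between the hoist point and the loads.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction. A load is described by the value
// number of its pointer operand and by the type it loads.
struct ToyInst {
  enum Kind { Load, Store, Other } kind;
  unsigned ptrVN;   // value number of the pointer operand (loads only)
  std::string type; // loaded element type, e.g. "float"
};

// Pair up loads of bb1 and bb2 that read through the same pointer value
// number and load the same type. Each scan stops at the first Store, which
// models "may have side effects". Returns (index in bb1, index in bb2) pairs.
std::vector<std::pair<int, int>> matchLoads(const std::vector<ToyInst> &bb1,
                                            const std::vector<ToyInst> &bb2) {
  // Pointer value number -> index of the first such load in bb1.
  std::map<unsigned, int> vnToLoad;
  for (int i = 0; i < static_cast<int>(bb1.size()); ++i) {
    if (bb1[i].kind == ToyInst::Store)
      break;
    if (bb1[i].kind == ToyInst::Load)
      vnToLoad.insert(std::make_pair(bb1[i].ptrVN, i));
  }

  std::vector<std::pair<int, int>> pairs;
  for (int j = 0; j < static_cast<int>(bb2.size()); ++j) {
    if (bb2[j].kind == ToyInst::Store)
      break;
    if (bb2[j].kind != ToyInst::Load)
      continue;

    // Does bb1 contain a load through the same pointer value?
    auto it = vnToLoad.find(bb2[j].ptrVN);
    if (it == vnToLoad.end())
      continue;

    const ToyInst &l1 = bb1[it->second];
    assert(l1.kind == ToyInst::Load);
    // Both loads must produce the same type.
    if (l1.type != bb2[j].type)
      continue;

    pairs.push_back(std::make_pair(it->second, j));
  }
  return pairs;
}
```

Encoding the two branches of @readsAndWrites from hoist.ll this way, with pointer value numbers 1, 2, 3 for %min, %a, %max, `matchLoads` pairs the loads of %a and %min and leaves the load of %max unmatched, because in if.then it follows the store.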

llvm/lib/Transforms/Scalar/Scalar.cpp

void llvm::initializeScalarOpts(PassRegistry &Registry) {
  initializeConstantPropagationPass(Registry);
  initializeCorrelatedValuePropagationPass(Registry);
  initializeDCEPass(Registry);
  initializeDeadInstEliminationPass(Registry);
  initializeScalarizerPass(Registry);
  initializeDSEPass(Registry);
  initializeGVNLegacyPassPass(Registry);
  initializeEarlyCSELegacyPassPass(Registry);
  initializeEarlyGVNLegacyPassPass(Registry);
  initializeFlattenCFGPassPass(Registry);
  initializeInductiveRangeCheckEliminationPass(Registry);
  initializeIndVarSimplifyPass(Registry);
  initializeJumpThreadingPass(Registry);
  initializeLICMPass(Registry);
  initializeLoopDataPrefetchPass(Registry);
  initializeLoopDeletionPass(Registry);
  initializeLoopAccessAnalysisPass(Registry);
...
void LLVMAddCorrelatedValuePropagationPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createCorrelatedValuePropagationPass());
}

void LLVMAddEarlyCSEPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createEarlyCSEPass());
}

void LLVMAddEarlyGVNLegacyPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createEarlyGVNPass());
}

void LLVMAddTypeBasedAliasAnalysisPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createTypeBasedAAWrapperPass());
}

void LLVMAddScopedNoAliasAAPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createScopedNoAliasAAWrapperPass());
}

void LLVMAddBasicAliasAnalysisPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createBasicAAWrapperPass());
}

void LLVMAddLowerExpectIntrinsicPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createLowerExpectIntrinsicPass());
}

llvm/test/Transforms/GVN/hoist.ll

This file was added.

; RUN: opt -early-gvn -hoist-max-depth-dependence=1 -S < %s | FileCheck --check-prefix=DEP1 %s
; RUN: opt -early-gvn -hoist-max-depth-dependence=2 -S < %s | FileCheck --check-prefix=DEP2 %s
; RUN: opt -early-gvn -hoist-max-depth-dependence=3 -S < %s | FileCheck --check-prefix=DEP3 %s
; RUN: opt -early-gvn -hoist-max-depth-dependence=4 -S < %s | FileCheck --check-prefix=DEP4 %s

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; After hoisting the expressions from the branches we should only have 3 loads,
; 2 fsub, 2 fmul, instead of 6 loads, 4 fsub, 4 fmul.
;
; DEP1-LABEL: @memgvn
; DEP2-LABEL: @memgvn
; DEP3-LABEL: @memgvn
; DEP4-LABEL: @memgvn
; DEP3: load
; DEP3: load
; DEP3: load
; DEP3: fsub
; DEP3: fsub
; DEP3: fmul
; DEP3: fmul
; DEP3-NOT: load
; DEP3-NOT: fmul
; DEP3-NOT: fsub

define float @memgvn(float %d, float* %min, float* %max, float* %a) {
entry:
  %div = fdiv float 1.000000e+00, %d
  %cmp = fcmp oge float %div, 0.000000e+00
  br i1 %cmp, label %if.then, label %if.else

if.then: ; preds = %entry
  %0 = load float, float* %min, align 4
  %1 = load float, float* %a, align 4
  %sub = fsub float %0, %1
  %mul = fmul float %sub, %div
  %2 = load float, float* %max, align 4
  %sub1 = fsub float %2, %1
  %mul2 = fmul float %sub1, %div
  br label %if.end

if.else: ; preds = %entry
  %3 = load float, float* %max, align 4
  %4 = load float, float* %a, align 4
  %sub3 = fsub float %3, %4
  %mul4 = fmul float %sub3, %div
  %5 = load float, float* %min, align 4
  %sub5 = fsub float %5, %4
  %mul6 = fmul float %sub5, %div
  br label %if.end

if.end: ; preds = %if.else, %if.then
  %tmax.0 = phi float [ %mul2, %if.then ], [ %mul6, %if.else ]
  %tmin.0 = phi float [ %mul, %if.then ], [ %mul4, %if.else ]
  %add = fadd float %tmax.0, %tmin.0
  ret float %add
}

; Check that we do not hoist loads after a store: the first two loads will be
; hoisted, and then the third load will not be hoisted.
;
; DEP1-LABEL: @readsAndWrites
; DEP2-LABEL: @readsAndWrites
; DEP3-LABEL: @readsAndWrites
; DEP4-LABEL: @readsAndWrites
; DEP3: load
; DEP3: load
; DEP3: fsub
; DEP3: fmul
; DEP3: store
; DEP3: load
; DEP3: fsub
; DEP3: fmul
; DEP3: load
; DEP3: fsub
; DEP3: fmul
; DEP3-NOT: load
; DEP3-NOT: fmul
; DEP3-NOT: fsub

@G = internal global float 1.000000e+00

define float @readsAndWrites(float %d, float* %min, float* %max, float* %a) {
entry:
  %div = fdiv float 1.000000e+00, %d
  %cmp = fcmp oge float %div, 0.000000e+00
  br i1 %cmp, label %if.then, label %if.else

if.then: ; preds = %entry
  %0 = load float, float* %min, align 4
  %1 = load float, float* %a, align 4
  store float %0, float* @G
  %sub = fsub float %0, %1
  %mul = fmul float %sub, %div
  %2 = load float, float* %max, align 4
  %sub1 = fsub float %2, %1
  %mul2 = fmul float %sub1, %div
  br label %if.end

if.else: ; preds = %entry
  %3 = load float, float* %max, align 4
  %4 = load float, float* %a, align 4
  %sub3 = fsub float %3, %4
  %mul4 = fmul float %sub3, %div
  %5 = load float, float* %min, align 4
  %sub5 = fsub float %5, %4
  %mul6 = fmul float %sub5, %div
  br label %if.end

if.end: ; preds = %if.else, %if.then
  %tmax.0 = phi float [ %mul2, %if.then ], [ %mul6, %if.else ]
  %tmin.0 = phi float [ %mul, %if.then ], [ %mul4, %if.else ]
  %add = fadd float %tmax.0, %tmin.0
  ret float %add
}

; Check that we can hoist all independent expressions in one iteration.
; DEP1-LABEL: @dependenceChain1
; DEP2-LABEL: @dependenceChain1
; DEP3-LABEL: @dependenceChain1
; DEP4-LABEL: @dependenceChain1
; DEP1: fadd
; DEP1: fsub
; DEP1: fdiv
; DEP1: fmul
; DEP1-NOT: fsub
; DEP1-NOT: fdiv
; DEP1-NOT: fmul
define float @dependenceChain1(float %a, float %b, i1 %c) {
entry:
  br i1 %c, label %if.then, label %if.else

if.then:
  %d = fadd float %b, %a
  %e = fsub float %b, %a
  %f = fdiv float %b, %a
  %g = fmul float %b, %a
  br label %if.end

if.else:
  %i = fadd float %b, %a
  %h = fsub float %b, %a
  %j = fdiv float %b, %a
  %k = fmul float %b, %a
  br label %if.end

if.end:
  %p = phi float [ %d, %if.then ], [ %i, %if.else ]
  %q = phi float [ %e, %if.then ], [ %h, %if.else ]
  %r = phi float [ %f, %if.then ], [ %j, %if.else ]
  %s = phi float [ %g, %if.then ], [ %k, %if.else ]
  %t = fadd float %p, %q
  %u = fadd float %r, %s
  %v = fadd float %t, %u
  ret float %v
}

; After hoisting the expressions from the branches we should only have 2 fsub and 2 fmul
; instead of 4 fsub and 4 fmul.
;
; DEP1-LABEL: @scalars
; DEP2-LABEL: @scalars
; DEP3-LABEL: @scalars
; DEP4-LABEL: @scalars
; DEP2: fsub
; DEP2: fsub
; DEP2: fmul
; DEP2: fmul
; DEP2-NOT: fmul
; DEP2-NOT: fsub
define float @scalars(float %d, float %min, float %max, float %a) {
entry:
  %div = fdiv float 1.000000e+00, %d
  %cmp = fcmp oge float %div, 0.000000e+00
  br i1 %cmp, label %if.then, label %if.else

if.then: ; preds = %entry
  %sub = fsub float %min, %a
  %mul = fmul float %sub, %div
  %sub1 = fsub float %max, %a
  %mul2 = fmul float %sub1, %div
  br label %if.end

if.else: ; preds = %entry
  %sub3 = fsub float %max, %a
  %mul4 = fmul float %sub3, %div
  %sub5 = fsub float %min, %a
  %mul6 = fmul float %sub5, %div
  br label %if.end

if.end: ; preds = %if.else, %if.then
  %tmax.0 = phi float [ %mul2, %if.then ], [ %mul6, %if.else ]
  %tmin.0 = phi float [ %mul, %if.then ], [ %mul4, %if.else ]
  %add = fadd float %tmax.0, %tmin.0
  ret float %add
}

; Check whether the flag -hoist-max-depth-dependence works: at depth 4 we should
; hoist all 4 expressions, whereas at depth 3 we should see the last fmul
; instruction twice.

; DEP1-LABEL: @dependenceChain4
; DEP2-LABEL: @dependenceChain4
; DEP3-LABEL: @dependenceChain4
; DEP4-LABEL: @dependenceChain4
; DEP4: fsub
; DEP4: fadd
; DEP4: fdiv
; DEP4: fmul
; DEP4-NOT: fsub
; DEP4-NOT: fadd
; DEP4-NOT: fdiv
; DEP4-NOT: fmul

; DEP3: fsub
; DEP3: fadd
; DEP3: fdiv
; DEP3: fmul
; DEP3: fmul
; DEP3-NOT: fsub
; DEP3-NOT: fadd
; DEP3-NOT: fdiv
; DEP3-NOT: fmul
define float @dependenceChain4(float %a, float %b, i1 %c) {
entry:
  br i1 %c, label %if.then, label %if.else

if.then:
  %d = fsub float %b, %a
  %e = fadd float %d, %a
  %f = fdiv float %e, %a
  %g = fmul float %f, %a
  br label %if.end

if.else:
  %h = fsub float %b, %a
  %i = fadd float %h, %a
  %j = fdiv float %i, %a
  %k = fmul float %j, %a
  br label %if.end

if.end:
  %r = phi float [ %g, %if.then ], [ %k, %if.else ]
  ret float %r
}
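
The effect of -hoist-max-depth-dependence on the two dependence-chain tests in hoist.ll can be modelled with a few lines of standalone C++. This is a sketch under simplifying assumptions, not the pass itself: `hoistedAfter` and the `dep` encoding are hypothetical, both branches are assumed to contain the same expressions, and each round is assumed to hoist every expression whose operands are already available above the branch, so a dependent expression has to wait for a later round.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dep[i] is the index of the in-branch expression that expression i depends
// on, or -1 when all of its operands are defined above the branch.
// maxDepth plays the role of the depth limit: -1 means no limit.
// Returns how many expressions end up hoisted.
int hoistedAfter(const std::vector<int> &dep, int maxDepth) {
  std::vector<bool> hoisted(dep.size(), false);
  int total = 0;
  int depth = maxDepth;
  bool changed = true;
  while (changed) {
    // Stop once the allowed number of rounds has been used up.
    if (depth == 0)
      break;
    if (depth != -1)
      --depth;

    changed = false;
    // Decide from the state at the start of the round, so that a dependent
    // expression is only hoisted in the round after its operand.
    std::vector<bool> next = hoisted;
    for (std::size_t i = 0; i < dep.size(); ++i) {
      assert(dep[i] < static_cast<int>(i) && "dependence must point backwards");
      if (hoisted[i])
        continue;
      if (dep[i] == -1 || hoisted[dep[i]]) {
        next[i] = true;
        ++total;
        changed = true;
      }
    }
    hoisted = next;
  }
  return total;
}
```

In this model @dependenceChain4 is the chain `{-1, 0, 1, 2}`: a limit of 4 hoists all four expressions, while a limit of 3 hoists only three and leaves the final fmul in both branches. @dependenceChain1 is `{-1, -1, -1, -1}`, and a limit of 1 is enough to hoist everything.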

Revision Contents

Diff 52632

llvm/include/llvm/InitializePasses.h

llvm/include/llvm/LinkAllPasses.h

llvm/include/llvm/Transforms/Scalar.h

llvm/include/llvm/Transforms/Scalar/GVN.h

llvm/lib/Passes/PassRegistry.def

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

llvm/lib/Transforms/Scalar/GVN.cpp

llvm/lib/Transforms/Scalar/Scalar.cpp

llvm/test/Transforms/GVN/hoist.ll
