This is an archive of the discontinued LLVM Phabricator instance.

It looks like the test is failing on some of the Arm buildbots. In particular some of them don't have the X86 backend: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4611 so it looks like it is most likely missing a REQUIRES: x86. Can you take a look?

In D53437#1270521, @peter.smith wrote:

It looks like the test is failing on some of the Arm buildbots. In particular some of them don't have the X86 backend: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4611 so it looks like it is most likely missing a REQUIRES: x86. Can you take a look?

@peter.smith @hiraditya r344919 should address this.

Note that this addresses in the old PM one issue I found when enabling splitting with ThinLTO, but exposes some related issues. In the old location, which was before the early return when PrepareForThinLTO==true around line 590, we were doing two rounds of splitting: one during the compile step, and a second later in the ThinLTO backends. I saw that we split an already split function (see below for more info). But I notice that there are still a couple issues:

1) In the old PM, we will still do splitting early (during the compile step) for regular LTO, which doesn't return early from this routine. You should presumably guard the adding of the hot cold split pass by "if (!PrepareForLTO)".
2) Once you've done 1), regular LTO will no longer perform any splitting, since its backend doesn't invoke populateModulePassManager. You can fix this by adding your hot cold splitting pass to an equivalent location in addLTOOptimizationPasses(), which is invoked only in the regular LTO backend.
3) In the new PM, the move of the hot cold splitting pass does not fix the issue for ThinLTO. The reason is that there is no early exit from buildModuleSimplificationPipeline. To fix, you can guard the hot cold split pass by "Phase != ThinLTOPhase::PreLink".
4) In the new PM, regular LTO invokes buildModuleSimplificationPipeline in the compile step, but without any extra info to indicate that it is an LTO compile. So something else will need to be done to suppress the hot cold splitting from happening early for regular LTO (e.g. pass down a new flag?).
5) Once 4) is done, regular LTO will no longer perform any splitting in the new PM since its backend doesn't invoke buildModuleSimplificationPipeline. To fix, you will want to add the hot cold split pass to an appropriate place in buildLTODefaultPipeline().
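
The five points above amount to a small decision table: with any flavor of LTO, splitting should be scheduled only in the backend; without LTO, it runs in the ordinary compile. The sketch below is a self-contained toy model of that intent, not LLVM code. `LTOKind`, `Step`, `shouldScheduleSplit`, and `countSplitRounds` are hypothetical names, and the model assumes the goal is exactly one round of splitting per build.

```cpp
#include <cassert>

// Toy model only: these enums and functions are hypothetical and are not
// part of LLVM. They encode the intended scheduling, not the real pipeline.
enum class LTOKind { None, Thin, Regular };
enum class Step { Compile, Backend };

// Returns true if hot/cold splitting should be scheduled in this step,
// assuming the goal is to split once, after cross-module inlining.
bool shouldScheduleSplit(bool EnableHotColdSplit, LTOKind Kind, Step S) {
  if (!EnableHotColdSplit)
    return false;
  // Without LTO there is no backend step, so split during the compile.
  if (Kind == LTOKind::None)
    return S == Step::Compile;
  // With ThinLTO or regular LTO, defer splitting to the backend.
  return S == Step::Backend;
}

// Counts how many rounds of splitting run over one full build.
int countSplitRounds(bool EnableHotColdSplit, LTOKind Kind) {
  int Rounds = 0;
  if (shouldScheduleSplit(EnableHotColdSplit, Kind, Step::Compile))
    ++Rounds;
  // Only LTO builds have a backend step.
  if (Kind != LTOKind::None &&
      shouldScheduleSplit(EnableHotColdSplit, Kind, Step::Backend))
    ++Rounds;
  assert(Rounds <= 1 && "splitting must not run twice in one build");
  return Rounds;
}
```

In this model every configuration with the flag enabled yields exactly one round, which is the property the guards in points 1) through 5) are meant to establish.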

This all assumes that you want to only do splitting in the *LTO backends (i.e. after cross module inlining). What are the tradeoffs of doing the splitting before/after inlining? One advantage of doing the splitting earlier for ThinLTO, in the compile step, is that it would reduce the size of the functions during the import analysis, and we might import more hot (split) functions. But then the splitting would be happening before the cross-module inlining in the backends.
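
To make the import-size argument concrete, here is a toy calculation. It assumes importing is gated by a simple instruction-count limit and that splitting replaces the cold region with a single call; the struct, the function names, and the numbers used with them are all hypothetical and do not come from LLVM.

```cpp
#include <cassert>

// Toy model only: none of these names or numbers come from LLVM.
struct ToyFunction {
  unsigned HotInsts;  // instructions on the hot path
  unsigned ColdInsts; // instructions in the cold region
};

// Size the import analysis would see if no splitting ran in the compile step.
unsigned sizeWithoutSplit(const ToyFunction &F) {
  return F.HotInsts + F.ColdInsts;
}

// Size after splitting: the cold region is replaced by one call instruction.
unsigned sizeWithSplit(const ToyFunction &F) {
  return F.HotInsts + (F.ColdInsts > 0 ? 1u : 0u);
}

// Assumed importing rule: import only functions at or below the limit.
bool isImportCandidate(unsigned Size, unsigned Limit) {
  assert(Limit > 0 && "an import limit of zero would import nothing");
  return Size <= Limit;
}
```

With 60 hot and 80 cold instructions and an assumed limit of 100, the unsplit function (140 instructions) is rejected while the split one (61 instructions) qualifies.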

Regarding what I am currently seeing with two rounds of splitting with ThinLTO: I noticed that the second round of splitting in the ThinLTO backends was resplitting an already split function - specifically, the cold outlined function was getting split again, which doesn't make a lot of sense since presumably the edge weights were all cold. Note this is after r344558 (the fix for the issue of outlining the whole function). Why would that happen?

Thanks for your notes @tejohnson.

In D53437#1271156, @tejohnson wrote:

Note that this addresses in the old PM one issue I found when enabling splitting with ThinLTO, but exposes some related issues. In the old location, which was before the early return when PrepareForThinLTO==true around line 590, we were doing two rounds of splitting: one during the compile step, and a second later in the ThinLTO backends. I saw that we split an already split function (see below for more info). But I notice that there are still a couple issues:

1) In the old PM, we will still do splitting early (during the compile step) for regular LTO, which doesn't return early from this routine. You should presumably guard the adding of the hot cold split pass by "if (!PrepareForLTO)".

2) Once you've done 1), regular LTO will no longer perform any splitting, since its backend doesn't invoke populateModulePassManager. You can fix this by adding your hot cold splitting pass to an equivalent location in addLTOOptimizationPasses(), which is invoked only in the regular LTO backend.

3) In the new PM, the move of the hot cold splitting pass does not fix the issue for ThinLTO. The reason is that there is no early exit from buildModuleSimplificationPipeline. To fix, you can guard the hot cold split pass by "Phase != ThinLTOPhase::PreLink".

4) In the new PM, regular LTO invokes buildModuleSimplificationPipeline in the compile step, but without any extra info to indicate that it is an LTO compile. So something else will need to be done to suppress the hot cold splitting from happening early for regular LTO (e.g. pass down a new flag?).

5) Once 4) is done, regular LTO will no longer perform any splitting in the new PM since its backend doesn't invoke buildModuleSimplificationPipeline. To fix, you will want to add the hot cold split pass to an appropriate place in buildLTODefaultPipeline().

This all assumes that you want to only do splitting in the *LTO backends (i.e. after cross module inlining). What are the tradeoffs of doing the splitting before/after inlining?

One advantage of doing splitting before inlining is, as you pointed out, that more inlining opportunities are created because hot functions become smaller. The downside is that outlining may be a barrier for intra-function optimizations which benefit from seeing both sides of a branch (hoisting/sinking, const prop, maybe jump threading?).

I think the right thing to do is to conservatively avoid introducing barriers by delaying splitting as much as possible. I'll work on gathering some pre/post-patch performance numbers on arm64.
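
As a hand-written illustration of the "barrier" concern (not compiler output), the sketch below shows the same computation before and after a cold region is moved into its own function. All names are hypothetical. In the first form the constant is visible on both sides of the branch; in the second, the cold side only sees it as a parameter, so folding it would need interprocedural reasoning.

```cpp
#include <cassert>

// "Before splitting" shape: Scale is a visible constant on both sides of the
// branch, so the cold side can be folded locally to -4.
int scaleBeforeSplit(int X) {
  const int Scale = 4;
  if (X < 0)           // assume this side is cold
    return Scale * -1; // foldable within this function
  return X * Scale;
}

// Hypothetical outlined cold region: Scale arrives as a parameter, so its
// value is no longer known from this function's body alone.
int scaleColdPath(int Scale) { return Scale * -1; }

// "After splitting" shape: the cold side is now a call.
int scaleAfterSplit(int X) {
  const int Scale = 4;
  if (X < 0)
    return scaleColdPath(Scale);
  return X * Scale;
}

// Self-check: outlining must not change observable behavior.
void checkSplitPreservesBehavior() {
  for (int X = -3; X <= 3; ++X)
    assert(scaleBeforeSplit(X) == scaleAfterSplit(X));
}
```

Both forms return the same values; the difference is only in how much a function-local optimization can see, which is the cost being weighed against the inlining benefit.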

One advantage of doing the splitting earlier for ThinLTO, in the compile step, is that it would reduce the size of the functions during the import analysis, and we might import more hot (split) functions. But then the splitting would be happening before the cross-module inlining in the backends.

Regarding what I am currently seeing with two rounds of splitting with ThinLTO: I noticed that the second round of splitting in the ThinLTO backends was resplitting an already split function - specifically, the cold outlined function was getting split again, which doesn't make a lot of sense since presumably the edge weights were all cold. Note this is after r344558 (the fix for the issue of outlining the whole function). Why would that happen?

tejohnson mentioned this in D41474: Fix a crash in lazy loading of Metadata in ThinLTO. (Oct 23 2018, 9:48 AM)

In D53437#1271395, @vsk wrote:

Thanks for your notes @tejohnson.

In D53437#1271156, @tejohnson wrote:

Note that this addresses in the old PM one issue I found when enabling splitting with ThinLTO, but exposes some related issues. In the old location, which was before the early return when PrepareForThinLTO==true around line 590, we were doing two rounds of splitting: one during the compile step, and a second later in the ThinLTO backends. I saw that we split an already split function (see below for more info). But I notice that there are still a couple issues:

1) In the old PM, we will still do splitting early (during the compile step) for regular LTO, which doesn't return early from this routine. You should presumably guard the adding of the hot cold split pass by "if (!PrepareForLTO)".

2) Once you've done 1), regular LTO will no longer perform any splitting, since its backend doesn't invoke populateModulePassManager. You can fix this by adding your hot cold splitting pass to an equivalent location in addLTOOptimizationPasses(), which is invoked only in the regular LTO backend.

3) In the new PM, the move of the hot cold splitting pass does not fix the issue for ThinLTO. The reason is that there is no early exit from buildModuleSimplificationPipeline. To fix, you can guard the hot cold split pass by "Phase != ThinLTOPhase::PreLink".

4) In the new PM, regular LTO invokes buildModuleSimplificationPipeline in the compile step, but without any extra info to indicate that it is an LTO compile. So something else will need to be done to suppress the hot cold splitting from happening early for regular LTO (e.g. pass down a new flag?).

5) Once 4) is done, regular LTO will no longer perform any splitting in the new PM since its backend doesn't invoke buildModuleSimplificationPipeline. To fix, you will want to add the hot cold split pass to an appropriate place in buildLTODefaultPipeline().

This all assumes that you want to only do splitting in the *LTO backends (i.e. after cross module inlining). What are the tradeoffs of doing the splitting before/after inlining?

One advantage of doing splitting before inlining is, as you pointed out, that more inlining opportunities are created because hot functions become smaller. The downside is that outlining may be a barrier for intra-function optimizations which benefit from seeing both sides of a branch (hoisting/sinking, const prop, maybe jump threading?).

I think the right thing to do is to conservatively avoid introducing barriers by delaying splitting as much as possible. I'll work on gathering some pre/post-patch performance numbers on arm64.

One advantage of doing the splitting earlier for ThinLTO, in the compile step, is that it would reduce the size of the functions during the import analysis, and we might import more hot (split) functions. But then the splitting would be happening before the cross-module inlining in the backends.

Regarding what I am currently seeing with two rounds of splitting with ThinLTO: I noticed that the second round of splitting in the ThinLTO backends was resplitting an already split function - specifically, the cold outlined function was getting split again, which doesn't make a lot of sense since presumably the edge weights were all cold. Note this is after r344558 (the fix for the issue of outlining the whole function). Why would that happen?

I have some thoughts about @tejohnson's second question here about how a cold outlined function could be split twice.

First, I think the cache of outlined Function objects (see shouldOutlineFrom) may be wiped out before LTO begins. Second, CodeExtractor inserts a (useless?) "header" block with an unconditional jump in the outlined function, and I think this creates more work for the splitting pass.

I'll look into getting rid of the cache and teaching CodeExtractor to run MergeBasicBlockIntoOnlyPred on the outlined function. I also have an upcoming patch that should guarantee that cold regions are maximal & have a warm predecessor -- these two conditions should prevent outlined code from being split again.
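
If the cache hypothesis holds, the effect can be reproduced with a toy model in which each pass invocation owns an in-memory set of functions it has already outlined. The class below is a hypothetical stand-in, not the real pass: it borrows the name `shouldOutlineFrom` from the discussion, but its signature and logic are assumptions made for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for one invocation of the splitting pass. The cache lives
// only in memory, so a new invocation (e.g. in an LTO backend) starts empty.
class ToySplitter {
  std::set<std::string> Outlined; // names of functions this run created

public:
  // Assumed rule: never outline from a function this run itself outlined.
  bool shouldOutlineFrom(const std::string &Name) const {
    return Outlined.count(Name) == 0;
  }

  // Returns the name of the new cold function, or "" if splitting is skipped.
  std::string split(const std::string &Name) {
    assert(!Name.empty() && "function name must not be empty");
    if (!shouldOutlineFrom(Name))
      return "";
    std::string Cold = Name + ".cold";
    Outlined.insert(Cold);
    return Cold;
  }
};
```

Within one `ToySplitter`, a second attempt on the outlined function is refused. A fresh instance has no record of it and splits it again, which matches the resplitting seen across the compile step and the ThinLTO backend.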

In D53437#1271156, @tejohnson wrote:

Note that this addresses in the old PM one issue I found when enabling splitting with ThinLTO, but exposes some related issues. In the old location, which was before the early return when PrepareForThinLTO==true around line 590, we were doing two rounds of splitting: one during the compile step, and a second later in the ThinLTO backends. I saw that we split an already split function (see below for more info). But I notice that there are still a couple issues:

1) In the old PM, we will still do splitting early (during the compile step) for regular LTO, which doesn't return early from this routine. You should presumably guard the adding of the hot cold split pass by "if (!PrepareForLTO)".

2) Once you've done 1), regular LTO will no longer perform any splitting, since its backend doesn't invoke populateModulePassManager. You can fix this by adding your hot cold splitting pass to an equivalent location in addLTOOptimizationPasses(), which is invoked only in the regular LTO backend.

3) In the new PM, the move of the hot cold splitting pass does not fix the issue for ThinLTO. The reason is that there is no early exit from buildModuleSimplificationPipeline. To fix, you can guard the hot cold split pass by "Phase != ThinLTOPhase::PreLink".

I'm going to send a patch to fix #3 so that we no longer run hot cold splitting multiple times anywhere to unblock me.

4) In the new PM, regular LTO invokes buildModuleSimplificationPipeline in the compile step, but without any extra info to indicate that it is an LTO compile. So something else will need to be done to suppress the hot cold splitting from happening early for regular LTO (e.g. pass down a new flag?).

5) Once 4) is done, regular LTO will no longer perform any splitting in the new PM since its backend doesn't invoke buildModuleSimplificationPipeline. To fix, you will want to add the hot cold split pass to an appropriate place in buildLTODefaultPipeline().

This all assumes that you want to only do splitting in the *LTO backends (i.e. after cross module inlining). What are the tradeoffs of doing the splitting before/after inlining? One advantage of doing the splitting earlier for ThinLTO, in the compile step, is that it would reduce the size of the functions during the import analysis, and we might import more hot (split) functions. But then the splitting would be happening before the cross-module inlining in the backends.

Regarding what I am currently seeing with two rounds of splitting with ThinLTO: I noticed that the second round of splitting in the ThinLTO backends was resplitting an already split function - specifically, the cold outlined function was getting split again, which doesn't make a lot of sense since presumably the edge weights were all cold. Note this is after r344558 (the fix for the issue of outlining the whole function). Why would that happen?

vsk mentioned this in D57082: [HotColdSplit] Move splitting earlier in the pipeline. (Jan 22 2019, 8:12 PM)

Diffusion mentioned this in rL352080: [HotColdSplit] Move splitting earlier in the pipeline. (Jan 24 2019, 10:56 AM)

Revision Contents

Path                                                    Size
llvm/trunk/lib/Passes/PassBuilder.cpp                   6 lines
llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp    6 lines
llvm/trunk/test/Other/opt-hot-cold-split.ll             292 lines

Diff 170344

llvm/trunk/lib/Passes/PassBuilder.cpp

@@ if (Phase != ThinLTOPhase::PreLink)
     // We perform early indirect call promotion here, before globalopt.
     // This is important for the ThinLTO backend phase because otherwise
     // imported available_externally functions look unreferenced and are
     // removed.
     MPM.addPass(PGOIndirectCallPromotion(Phase == ThinLTOPhase::PostLink,
                                          true));
   }

-  if (EnableHotColdSplit)
-    MPM.addPass(HotColdSplittingPass());
-
   // Interprocedural constant propagation now that basic cleanup has occurred
   // and prior to optimizing globals.
   // FIXME: This position in the pipeline hasn't been carefully considered in
   // years, it should be re-analyzed.
   MPM.addPass(IPSCCPPass());

   // Attach metadata to indirect call sites indicating the set of functions
   // they may target at run-time. This should follow IPSCCP.
@@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   if (Level == O3)
     MainCGPipeline.addPass(ArgumentPromotionPass());

   // Lastly, add the core function simplification pipeline nested inside the
   // CGSCC walk.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
       buildFunctionSimplificationPipeline(Level, Phase, DebugLogging)));

+  if (EnableHotColdSplit)
+    MPM.addPass(HotColdSplittingPass());
+
   for (auto &C : CGSCCOptimizerLateEPCallbacks)
     C(MainCGPipeline, Level);

   // We wrap the CGSCC pipeline in a devirtualization repeater. This will try
   // to detect when we devirtualize indirect calls and iterate the SCC passes
   // in that case to try and catch knock-on inlining or function attrs
   // opportunities. Then we add it to the module pipeline by walking the SCCs
   // in postorder (or bottom-up).

llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp

@@ void PassManagerBuilder::populateModulePassManager(
   bool PrepareForThinLTOUsingPGOSampleProfile =
       PrepareForThinLTO && !PGOSampleUse.empty();
   if (PrepareForThinLTOUsingPGOSampleProfile)
     DisableUnrollLoops = true;

   // Infer attributes about declarations if possible.
   MPM.add(createInferFunctionAttrsLegacyPass());

-  if (EnableHotColdSplit)
-    MPM.add(createHotColdSplittingPass());
-
   addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);

   if (OptLevel > 2)
     MPM.add(createCallSiteSplittingPass());

   MPM.add(createIPSCCPPass()); // IP SCCP
   MPM.add(createCalledValuePropagationPass());
   MPM.add(createGlobalOptimizerPass()); // Optimize out global vars
@@ void PassManagerBuilder::populateModulePassManager(
   // Get rid of LCSSA nodes.
   MPM.add(createInstSimplifyLegacyPass());

   // This hoists/decomposes div/rem ops. It should run after other sink/hoist
   // passes to avoid re-sinking, but before SimplifyCFG because it can allow
   // flattening of blocks.
   MPM.add(createDivRemPairsPass());

+  if (EnableHotColdSplit)
+    MPM.add(createHotColdSplittingPass());
+
   // LoopSink (and other loop passes since the last simplifyCFG) might have
   // resulted in single-entry-single-exit or empty blocks. Clean up the CFG.
   MPM.add(createCFGSimplificationPass());

   addExtensionsToPM(EP_OptimizerLast, MPM);

   // Rename anon globals to be able to handle them in the summary
   if (PrepareForLTO)

llvm/trunk/test/Other/opt-hot-cold-split.ll

				; RUN: opt -mtriple=x86_64-- -Os -hotcoldsplit -debug-pass=Structure < %s -o /dev/null 2>&1 | FileCheck %s
				; REQUIRES: asserts

				; CHECK-LABEL: Pass Arguments:
				; CHECK-NEXT: Target Transform Information
				; CHECK-NEXT: Type-Based Alias Analysis
				; CHECK-NEXT: Scoped NoAlias Alias Analysis
				; CHECK-NEXT: Assumption Cache Tracker
				; CHECK-NEXT: Target Library Information
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Module Verifier
				; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (pre inlining)
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: SROA
				; CHECK-NEXT: Early CSE
				; CHECK-NEXT: Lower 'expect' Intrinsics
				; CHECK-NEXT: Pass Arguments:
				; CHECK-NEXT: Target Library Information
				; CHECK-NEXT: Target Transform Information
				; CHECK-NEXT: Target Pass Configuration
				; CHECK-NEXT: Type-Based Alias Analysis
				; CHECK-NEXT: Scoped NoAlias Alias Analysis
				; CHECK-NEXT: Assumption Cache Tracker
				; CHECK-NEXT: Profile summary info
				; CHECK-NEXT: ModulePass Manager
				; CHECK-NEXT: Force set function attributes
				; CHECK-NEXT: Infer set function attributes
				; CHECK-NEXT: Interprocedural Sparse Conditional Constant Propagation
				; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
				; CHECK-NEXT: Called Value Propagation
				; CHECK-NEXT: Global Variable Optimizer
				; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Promote Memory to Register
				; CHECK-NEXT: Dead Argument Elimination
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: CallGraph Construction
				; CHECK-NEXT: Globals Alias Analysis
				; CHECK-NEXT: Call Graph SCC Pass Manager
				; CHECK-NEXT: Remove unused exception handling info
				; CHECK-NEXT: Function Integration/Inlining
				; CHECK-NEXT: Deduce function attributes
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: SROA
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Memory SSA
				; CHECK-NEXT: Early CSE w/ MemorySSA
				; CHECK-NEXT: Speculatively execute instructions if target has divergent branches
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Lazy Value Information Analysis
				; CHECK-NEXT: Jump Threading
				; CHECK-NEXT: Value Propagation
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Tail Call Elimination
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Reassociate expressions
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Rotate Loops
				; CHECK-NEXT: Loop Invariant Code Motion
				; CHECK-NEXT: Unswitch loops
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Induction Variable Simplification
				; CHECK-NEXT: Recognize loop idioms
				; CHECK-NEXT: Delete dead loops
				; CHECK-NEXT: Unroll loops
				; CHECK-NEXT: MergedLoadStoreMotion
				; CHECK-NEXT: Phi Values Analysis
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Memory Dependence Analysis
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Global Value Numbering
				; CHECK-NEXT: Phi Values Analysis
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Memory Dependence Analysis
				; CHECK-NEXT: MemCpy Optimization
				; CHECK-NEXT: Sparse Conditional Constant Propagation
				; CHECK-NEXT: Demanded bits analysis
				; CHECK-NEXT: Bit-Tracking Dead Code Elimination
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Lazy Value Information Analysis
				; CHECK-NEXT: Jump Threading
				; CHECK-NEXT: Value Propagation
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Phi Values Analysis
				; CHECK-NEXT: Memory Dependence Analysis
				; CHECK-NEXT: Dead Store Elimination
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Loop Invariant Code Motion
				; CHECK-NEXT: Post-Dominator Tree Construction
				; CHECK-NEXT: Aggressive Dead Code Elimination
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: A No-Op Barrier Pass
				; CHECK-NEXT: Eliminate Available Externally Globals
				; CHECK-NEXT: CallGraph Construction
				; CHECK-NEXT: Deduce function attributes in RPO
				; CHECK-NEXT: Global Variable Optimizer
				; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
				; CHECK-NEXT: Dead Global Elimination
				; CHECK-NEXT: CallGraph Construction
				; CHECK-NEXT: Globals Alias Analysis
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Float to int
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Rotate Loops
				; CHECK-NEXT: Loop Access Analysis
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Loop Distribution
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Loop Access Analysis
				; CHECK-NEXT: Demanded bits analysis
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Loop Vectorization
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Loop Access Analysis
				; CHECK-NEXT: Loop Load Elimination
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Demanded bits analysis
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: SLP Vectorizer
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Unroll loops
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Combine redundant instructions
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Loop Invariant Code Motion
				; CHECK-NEXT: Alignment from assumptions
				; CHECK-NEXT: Strip Unused Function Prototypes
				; CHECK-NEXT: Dead Global Elimination
				; CHECK-NEXT: Merge Duplicate Global Constants
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis
				; CHECK-NEXT: Canonicalize natural loops
				; CHECK-NEXT: LCSSA Verifier
				; CHECK-NEXT: Loop-Closed SSA Form Pass
				; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
				; CHECK-NEXT: Function Alias Analysis Results
				; CHECK-NEXT: Scalar Evolution Analysis
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis
				; CHECK-NEXT: Loop Pass Manager
				; CHECK-NEXT: Loop Sink
				; CHECK-NEXT: Lazy Branch Probability Analysis
				; CHECK-NEXT: Lazy Block Frequency Analysis
				; CHECK-NEXT: Optimization Remark Emitter
				; CHECK-NEXT: Remove redundant instructions
				; CHECK-NEXT: Hoist/decompose integer division and remainder
				; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Hot Cold Splitting
				; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Module Verifier
				; CHECK-NEXT: Bitcode Writer
				; CHECK-NEXT: Pass Arguments: -domtree
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
				; CHECK-NEXT: Target Library Information
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis
				; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
				; CHECK-NEXT: Target Library Information
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis
				; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
				; CHECK-NEXT: Target Library Information
				; CHECK-NEXT: FunctionPass Manager
				; CHECK-NEXT: Dominator Tree Construction
				; CHECK-NEXT: Natural Loop Information
				; CHECK-NEXT: Branch Probability Analysis
				; CHECK-NEXT: Block Frequency Analysis


Schedule Hot Cold Splitting pass after most optimization passes (Closed, Public)
