This is an archive of the discontinued LLVM Phabricator instance.

Differential D57082

[HotColdSplit] Move splitting earlier in the pipeline
Closed, Public

Authored by vsk on Jan 22 2019, 8:12 PM.

Details

Reviewers

hiraditya
tejohnson
thegameg
sebpop

Commits

rGef1ebed1c685: [HotColdSplit] Move splitting earlier in the pipeline
rL352080: [HotColdSplit] Move splitting earlier in the pipeline

Summary

Performing splitting early has several advantages:

- Inhibiting inlining of cold code early improves code size. Compared to scheduling splitting at the end of the pipeline, this cuts code size growth in half within the iOS shared cache (0.69% to 0.34%).
- Inhibiting inlining of cold code improves compile time. There's no need to inline split cold functions, or to inline as much *within* those split functions, as they are marked minsize.
- During LTO, extra work is only done in the pre-link step. Less code must be inlined during cross-module inlining.
- The most common cold regions identified by the static/conservative splitting heuristic (a) can be found before inlining and (b) do not grow after inlining, e.g. calls to __assert_fail or os_log_error.
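
As an illustration of the last point, the following is a hand-written C++ sketch of what outlining such a cold region amounts to. It is not output of the pass: the function names are hypothetical, and `[[gnu::cold]]` / `[[gnu::noinline]]` are GCC/Clang attributes used here to stand in for the attributes a split function would carry. On glibc, `assert()` expands to a branch that calls `__assert_fail`, which is why the failure path is already visible before any inlining has happened.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Before splitting: the failure path lives inside the hot function. With
// glibc, the assert() below expands to a branch that calls __assert_fail.
int loadChecked(const int *Data, int Size, int Index) {
  assert(Index >= 0 && Index < Size && "index out of range");
  return Data[Index];
}

// Hypothetical outlined helper: everything on the failure path moves here.
// It never returns, and it is marked cold and noinline so that the inliner
// has no reason to pull it back into its caller.
[[gnu::cold, gnu::noinline]] static void loadCheckedCold(int Index, int Size) {
  std::fprintf(stderr, "index %d out of range [0, %d)\n", Index, Size);
  std::abort();
}

// After splitting (by hand): the hot function keeps only the branch and a
// call, so it is smaller and cheaper to inline into its own callers.
int loadCheckedSplit(const int *Data, int Size, int Index) {
  if (Index < 0 || Index >= Size)
    loadCheckedCold(Index, Size);
  return Data[Index];
}
```

Both versions behave identically on valid input; only the placement of the failure code differs.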

The disadvantages are:

- Some opportunities for splitting out cold code may be missed. This gap can potentially be narrowed by adding a worklist algorithm to the splitting pass.
- Some opportunities to reduce code size may be lost (e.g. store sinking, when one side of the CFG diamond is split). This does not outweigh the code size benefits of splitting earlier.

On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.
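
The summary does not say how the ktrace data was reduced to the "pages needed to capture 95% of text page faults" figure. One plausible reading, sketched below in C++ under that assumption, is to rank pages by fault count and count how many of the most-faulted pages are needed to reach the target share. The function name `pagesToCover` and its input format are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Returns the smallest number of pages whose fault counts add up to at least
// Fraction of all recorded faults, taking the most-faulted pages first.
// FaultsPerPage holds one fault count per text page (assumed input format).
std::size_t pagesToCover(std::vector<std::uint64_t> FaultsPerPage,
                         double Fraction) {
  assert(Fraction >= 0.0 && Fraction <= 1.0 && "Fraction must be in [0, 1]");
  std::sort(FaultsPerPage.begin(), FaultsPerPage.end(),
            std::greater<std::uint64_t>());
  const std::uint64_t Total = std::accumulate(
      FaultsPerPage.begin(), FaultsPerPage.end(), std::uint64_t{0});
  const double Target = Fraction * static_cast<double>(Total);

  std::uint64_t Covered = 0;
  std::size_t Pages = 0;
  while (Pages < FaultsPerPage.size() && static_cast<double>(Covered) < Target)
    Covered += FaultsPerPage[Pages++];
  return Pages;
}
```

Under this reading, a lower result for the same workload means that faults are concentrated on fewer pages.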

This reverses course on the decision made to schedule splitting late in
r344869 (D53437).

Diff Detail

Repository: rL LLVM

Event Timeline

vsk created this revision. Jan 22 2019, 8:12 PM

Herald added subscribers: dexonsmith, steven_wu, eraman, mehdi_amini. Jan 22 2019, 8:12 PM

tejohnson added inline comments. Jan 23 2019, 7:03 AM

llvm/lib/Passes/PassBuilder.cpp
Line 661 (On Diff #183030): Probably should have similar comment about why here (like you added in old PM).

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
Line 521 (On Diff #183030): I think you have the wrong LTO guards. This is the opposite of the ThinLTO guard you have in the new PM. Here you are preventing this from running during the ThinLTO (and regular LTO) pre-link steps.

The following are equivalent:

    PrepareForThinLTO (oldPM) == ThinLTOPhase::PreLink (newPM)
    PerformThinLTO (oldPM) == ThinLTOPhase::PostLink (newPM)

Note that there is no PerformLTO, since in the old PM this function is not even called during the regular LTO post link (which has a different, specialized pipeline).

So in a nutshell, I believe this should be:

    if (EnableHotColdSplit && !PerformThinLTO)

This will give you splitting during a non-LTO compile, and during the pre-link *LTO compiles.
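
The equivalence described in this comment can be checked with a small self-contained C++ model. This is only a sketch: `OldPMFlags` and `phaseFor` are hypothetical stand-ins, the enum mirrors the names quoted in the comment rather than including any LLVM header, and mapping the regular LTO pre-link to `ThinLTOPhase::None` is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant old-PM PassManagerBuilder flags.
struct OldPMFlags {
  bool PrepareForThinLTO = false; // ThinLTO pre-link
  bool PrepareForLTO = false;     // regular LTO pre-link
  bool PerformThinLTO = false;    // ThinLTO post-link
};

// Stand-in for the new-PM phase enum named in the comment.
enum class ThinLTOPhase { None, PreLink, PostLink };

// Maps an old-PM configuration to the corresponding new-PM phase. A regular
// LTO pre-link and a plain compile are both modelled as None (assumption).
ThinLTOPhase phaseFor(const OldPMFlags &F) {
  assert(!(F.PrepareForThinLTO && F.PerformThinLTO) &&
         "a pipeline cannot be both pre-link and post-link");
  if (F.PerformThinLTO)
    return ThinLTOPhase::PostLink;
  if (F.PrepareForThinLTO)
    return ThinLTOPhase::PreLink;
  return ThinLTOPhase::None;
}

// The guard suggested in the comment for the old PM.
bool splitsInOldPM(bool EnableHotColdSplit, const OldPMFlags &F) {
  return EnableHotColdSplit && !F.PerformThinLTO;
}

// The guard used in the new PM.
bool splitsInNewPM(bool EnableHotColdSplit, ThinLTOPhase Phase) {
  return EnableHotColdSplit && Phase != ThinLTOPhase::PostLink;
}
```

With these definitions the two guards agree for a default compile, both pre-link configurations, and the ThinLTO post-link, where neither runs the pass.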

Fix the pipeline issue in the old PM pointed out by Teresa. Add a test for it.

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
Line 521 (On Diff #183030): Thanks for explaining this.

LGTM with suggestion about cutting down test.

llvm/test/Other/opt-hot-cold-split.ll
Line 30 (On Diff #183143): Suggest cutting the expected output here and in the pre-link cases down to just look for Hot Cold Splitting. Otherwise this test will need to be changed for unrelated opt pipeline changes. And I don't think we need to check the whole pipeline?

This revision is now accepted and ready to land. Jan 23 2019, 11:54 AM

vsk updated this revision to Diff 183159. Jan 23 2019, 12:47 PM

vsk marked an inline comment as done.

vsk added inline comments.

llvm/test/Other/opt-hot-cold-split.ll
Line 30 (On Diff #183143): I'll trim this down, and just check the higher-level invariants described earlier (i.e., after mem2reg, before function simplification passes), if that's all right.

tejohnson added inline comments. Jan 23 2019, 12:53 PM

llvm/test/Other/opt-hot-cold-split.ll
Line 30 (On Diff #183143): Sure.
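
The difference between the two styles of check discussed in this thread can be modelled in a few lines of C++. This is a deliberately simplified sketch, not FileCheck itself: patterns are plain substrings, each pattern is matched against whole lines, and the function names are hypothetical. `matchesInOrder` models a sequence of plain check lines, which tolerate unrelated lines in between; `matchesConsecutively` models a check followed by `-NEXT` checks, which break as soon as any pass is inserted between the checked ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Plain checks: each pattern must occur on some line after the line that
// matched the previous pattern. Unrelated lines in between are allowed.
bool matchesInOrder(const std::vector<std::string> &Lines,
                    const std::vector<std::string> &Patterns) {
  std::size_t Next = 0;
  for (const std::string &Line : Lines) {
    if (Next == Patterns.size())
      break;
    if (Line.find(Patterns[Next]) != std::string::npos)
      ++Next;
  }
  return Next == Patterns.size();
}

// -NEXT checks: the first pattern is located at its first occurrence, and
// every later pattern must match the line immediately after the previous one.
bool matchesConsecutively(const std::vector<std::string> &Lines,
                          const std::vector<std::string> &Patterns) {
  assert(!Patterns.empty() && "need at least one pattern");
  std::size_t Start = 0;
  while (Start < Lines.size() &&
         Lines[Start].find(Patterns[0]) == std::string::npos)
    ++Start;
  if (Start == Lines.size())
    return false;
  for (std::size_t I = 1; I < Patterns.size(); ++I)
    if (Start + I >= Lines.size() ||
        Lines[Start + I].find(Patterns[I]) == std::string::npos)
      return false;
  return true;
}
```

For a pipeline dump in which "Dead Argument Elimination" sits between "Promote Memory to Register" and "Hot Cold Splitting", the in-order model accepts the three-pattern sequence used for the trimmed test, while the consecutive model rejects it.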

vsk mentioned this in D57125: [HotColdSplit] Introduce a cost model to control splitting behavior. Jan 23 2019, 3:48 PM

Closed by commit rL352080: [HotColdSplit] Move splitting earlier in the pipeline (authored by vedantk). Jan 24 2019, 10:56 AM

This revision was automatically updated to reflect the committed changes.

Test fails on Windows: https://logs.chromium.org/logs/chromium/bb/tryserver.chromium.win/win_upload_clang/467/+/recipes/steps/package_clang/0/stdout

FAIL: LLVM :: Other/opt-hot-cold-split.ll (36787 of 46358)
******************** TEST 'LLVM :: Other/opt-hot-cold-split.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE -mtriple=x86_64-- -Os -hot-cold-split=true -debug-pass=Structure < C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -o /dev/null 2>&1 | C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -check-prefix=DEFAULT-Os
: 'RUN: at line 2';   C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE -mtriple=x86_64-- -Os -hot-cold-split=true -passes='lto-pre-link<Os>' -debug-pass-manager < C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -o /dev/null 2>&1 | C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -check-prefix=LTO-PRELINK-Os
: 'RUN: at line 3';   C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE -mtriple=x86_64-- -Os -hot-cold-split=true -passes='thinlto-pre-link<Os>' -debug-pass-manager < C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -o /dev/null 2>&1 | C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -check-prefix=THINLTO-PRELINK-Os
: 'RUN: at line 4';   C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE -mtriple=x86_64-- -Os -hot-cold-split=true -passes='thinlto<Os>' -debug-pass-manager < C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -o /dev/null 2>&1 | C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll -check-prefix=THINLTO-POSTLINK-Os
--
Exit Code: 1

Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE" "-mtriple=x86_64--" "-Os" "-hot-cold-split=true" "-debug-pass=Structure" "-o" "/dev/null"
$ "C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE" "C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll" "-check-prefix=DEFAULT-Os"
$ ":" "RUN: at line 2"
$ "C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\opt.EXE" "-mtriple=x86_64--" "-Os" "-hot-cold-split=true" "-passes=lto-pre-link<Os>" "-debug-pass-manager" "-o" "/dev/null"
$ "C:\b\rr\tmpgtkggu\w\src\third_party\llvm-bootstrap\bin\FileCheck.EXE" "C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll" "-check-prefix=LTO-PRELINK-Os"
# command stderr:
C:\b\rr\tmpgtkggu\w\src\third_party\llvm\test\Other\opt-hot-cold-split.ll:15:19: error: LTO-PRELINK-Os: expected string not found in input
; LTO-PRELINK-Os: Running pass: ModuleToFunctionPassAdaptor<llvm::PromotePass>
                  ^
<stdin>:3:1: note: scanning from here
Running pass: VerifierPass on <stdin>
^
<stdin>:17:1: note: possible intended match here
Running pass: ModuleToFunctionPassAdaptor<class llvm::PromotePass> on <stdin>

(note additional "class ")

Sorry about that, taking a look now

I relaxed the test in r352138, and think that should address the issue.

Diffusion mentioned this in rL352228: [HotColdSplit] Introduce a cost model to control splitting behavior. Jan 25 2019, 10:32 AM

tejohnson mentioned this in D57805: [HotColdSplit] Move splitting after instrumented PGO use. Feb 5 2019, 8:23 PM

tejohnson mentioned this in rL353270: [HotColdSplit] Move splitting after instrumented PGO use. Feb 5 2019, 8:29 PM

tejohnson mentioned this in rG716abbeb4382: [HotColdSplit] Move splitting after instrumented PGO use.

Revision Contents

Path                                                   Size
llvm/trunk/lib/Passes/PassBuilder.cpp                  13 lines
llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp   14 lines
llvm/trunk/test/Other/X86/lto-hot-cold-split.ll        10 lines
llvm/trunk/test/Other/new-pm-thinlto-defaults.ll       4 lines
llvm/trunk/test/Other/opt-hot-cold-split.ll            316 lines

Diff 183348

llvm/trunk/lib/Passes/PassBuilder.cpp

@@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // delete control flows that are dead once globals have been folded to
   // constants.
   MPM.addPass(createModuleToFunctionPassAdaptor(PromotePass()));

   // Remove any dead arguments exposed by cleanups and constand folding
   // globals.
   MPM.addPass(DeadArgumentEliminationPass());

+  // Split out cold code. Splitting is done before inlining because 1) the most
+  // common kinds of cold regions can (a) be found before inlining and (b) do
+  // not grow after inlining, and 2) inhibiting inlining of cold code improves
+  // code size & compile time. Split after Mem2Reg to make code model estimates
+  // more accurate, but before InstCombine to allow it to clean things up.
+  if (EnableHotColdSplit && Phase != ThinLTOPhase::PostLink)
+    MPM.addPass(HotColdSplittingPass());
+
   // Create a small function pass pipeline to cleanup after all the global
   // optimizations.
   FunctionPassManager GlobalCleanupPM(DebugLogging);
   GlobalCleanupPM.addPass(InstCombinePass());
   invokePeepholeEPCallbacks(GlobalCleanupPM, Level);

   GlobalCleanupPM.addPass(SimplifyCFGPass());
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(GlobalCleanupPM)));
@@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   if (Level == O3)
     MainCGPipeline.addPass(ArgumentPromotionPass());

   // Lastly, add the core function simplification pipeline nested inside the
   // CGSCC walk.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
       buildFunctionSimplificationPipeline(Level, Phase, DebugLogging)));

-  // We only want to do hot cold splitting once for ThinLTO, during the
-  // post-link ThinLTO.
-  if (EnableHotColdSplit && Phase != ThinLTOPhase::PreLink)
-    MPM.addPass(HotColdSplittingPass());
-
   for (auto &C : CGSCCOptimizerLateEPCallbacks)
     C(MainCGPipeline, Level);

   // We wrap the CGSCC pipeline in a devirtualization repeater. This will try
   // to detect when we devirtualize indirect calls and iterate the SCC passes
   // in that case to try and catch knock-on inlining or function attrs
   // opportunities. Then we add it to the module pipeline by walking the SCCs
   // in postorder (or bottom-up).

llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp

@@ void PassManagerBuilder::addFunctionSimplificationPasses(

   if (EnableCHR && OptLevel >= 3 &&
       (!PGOInstrUse.empty() || !PGOSampleUse.empty()))
     MPM.add(createControlHeightReductionLegacyPass());
 }

 void PassManagerBuilder::populateModulePassManager(
     legacy::PassManagerBase &MPM) {
+  // Whether this is a default or *LTO pre-link pipeline. The FullLTO post-link
+  // is handled separately, so just check this is not the ThinLTO post-link.
+  bool DefaultOrPreLinkPipeline = !PerformThinLTO;
+
   if (!PGOSampleUse.empty()) {
     MPM.add(createPruneEHPass());
     // In ThinLTO mode, when flattened profile is used, all the available
     // profile information will be annotated in PreLink phase so there is
     // no need to load the profile again in PostLink.
     if (!(FlattenedProfileUsed && PerformThinLTO))
       MPM.add(createSampleProfileLoaderPass(PGOSampleUse));
   }
@@ void PassManagerBuilder::populateModulePassManager(
   MPM.add(createIPSCCPPass()); // IP SCCP
   MPM.add(createCalledValuePropagationPass());
   MPM.add(createGlobalOptimizerPass()); // Optimize out global vars
   // Promote any localized global vars.
   MPM.add(createPromoteMemoryToRegisterPass());

   MPM.add(createDeadArgEliminationPass()); // Dead argument elimination

+  // Split out cold code before inlining. See comment in the new PM
+  // (\ref buildModuleSimplificationPipeline).
+  if (EnableHotColdSplit && DefaultOrPreLinkPipeline)
+    MPM.add(createHotColdSplittingPass());
+
   addInstructionCombiningPass(MPM); // Clean up after IPCP & DAE
   addExtensionsToPM(EP_Peephole, MPM);
   MPM.add(createCFGSimplificationPass()); // Clean up after IPCP & DAE

   // For SamplePGO in ThinLTO compile phase, we do not want to do indirect
   // call promotion as it will change the CFG too much to make the 2nd
   // profile annotation in backend more difficult.
   // PGO instrumentation is added during the compile phase for ThinLTO, do
   // not run it a second time
-  if (!PerformThinLTO && !PrepareForThinLTOUsingPGOSampleProfile)
+  if (DefaultOrPreLinkPipeline && !PrepareForThinLTOUsingPGOSampleProfile)
     addPGOInstrPasses(MPM);

   // We add a module alias analysis pass here. In part due to bugs in the
   // analysis infrastructure this "works" in that the analysis stays alive
   // for the entire SCC pass run below.
   MPM.add(createGlobalsAAWrapperPass());

   // Start of CallGraph SCC passes.
@@ void PassManagerBuilder::populateModulePassManager(
   // Get rid of LCSSA nodes.
   MPM.add(createInstSimplifyLegacyPass());

   // This hoists/decomposes div/rem ops. It should run after other sink/hoist
   // passes to avoid re-sinking, but before SimplifyCFG because it can allow
   // flattening of blocks.
   MPM.add(createDivRemPairsPass());

-  if (EnableHotColdSplit)
-    MPM.add(createHotColdSplittingPass());
-
   // LoopSink (and other loop passes since the last simplifyCFG) might have
   // resulted in single-entry-single-exit or empty blocks. Clean up the CFG.
   MPM.add(createCFGSimplificationPass());

   addExtensionsToPM(EP_OptimizerLast, MPM);

   if (PrepareForLTO) {
     MPM.add(createCanonicalizeAliasesPass());

llvm/trunk/test/Other/X86/lto-hot-cold-split.ll

+; RUN: opt -module-summary %s -o %t.bc
+; RUN: llvm-lto -hot-cold-split=true -thinlto-action=run %t.bc -debug-pass=Structure 2>&1 | FileCheck %s -check-prefix=OLDPM-THINLTO-POSTLINK-Os
+
+; REQUIRES: asserts
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; OLDPM-THINLTO-POSTLINK-Os-LABEL: Pass Arguments
+; OLDPM-THINLTO-POSTLINK-Os-NOT: Hot Cold Splitting

llvm/trunk/test/Other/new-pm-thinlto-defaults.ll

@@
 ; RUN: -passes='thinlto-pre-link<Os>,name-anon-globals' -S %s 2>&1 \
 ; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Os
 ; RUN: opt -disable-verify -debug-pass-manager \
 ; RUN: -passes='thinlto-pre-link<Oz>,name-anon-globals' -S %s 2>&1 \
 ; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-Oz
 ; RUN: opt -disable-verify -debug-pass-manager -new-pm-debug-info-for-profiling \
 ; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -S %s 2>&1 \
 ; RUN: | FileCheck %s --check-prefixes=CHECK-DIS,CHECK-O,CHECK-O2,CHECK-PRELINK-O,CHECK-PRELINK-O2
-; Enabling the hot-cold-split pass should not affect the ThinLTO pre-link
-; RUN: opt -disable-verify -debug-pass-manager \
-; RUN: -passes='thinlto-pre-link<O2>,name-anon-globals' -hot-cold-split -S %s 2>&1 \
-; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-PRELINK-O2
 ;
 ; Postlink pipelines:
 ; RUN: opt -disable-verify -debug-pass-manager \
 ; RUN: -passes='thinlto<O1>' -S %s 2>&1 \
 ; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-POSTLINK-O,CHECK-POSTLINK-O1
 ; RUN: opt -disable-verify -debug-pass-manager \
 ; RUN: -passes='thinlto<O2>' -S %s 2>&1 \
 ; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-POSTLINK-O,CHECK-POSTLINK-O2

llvm/trunk/test/Other/opt-hot-cold-split.ll

-; RUN: opt -mtriple=x86_64-- -Os -hotcoldsplit -debug-pass=Structure < %s -o /dev/null 2>&1 | FileCheck %s
+; RUN: opt -mtriple=x86_64-- -Os -hot-cold-split=true -debug-pass=Structure < %s -o /dev/null 2>&1 | FileCheck %s -check-prefix=DEFAULT-Os
+; RUN: opt -mtriple=x86_64-- -Os -hot-cold-split=true -passes='lto-pre-link<Os>' -debug-pass-manager < %s -o /dev/null 2>&1 | FileCheck %s -check-prefix=LTO-PRELINK-Os
+; RUN: opt -mtriple=x86_64-- -Os -hot-cold-split=true -passes='thinlto-pre-link<Os>' -debug-pass-manager < %s -o /dev/null 2>&1 | FileCheck %s -check-prefix=THINLTO-PRELINK-Os
+; RUN: opt -mtriple=x86_64-- -Os -hot-cold-split=true -passes='thinlto<Os>' -debug-pass-manager < %s -o /dev/null 2>&1 | FileCheck %s -check-prefix=THINLTO-POSTLINK-Os

 ; REQUIRES: asserts

-; CHECK-LABEL: Pass Arguments:
-; CHECK-NEXT: Target Transform Information
-; CHECK-NEXT: Type-Based Alias Analysis
-; CHECK-NEXT: Scoped NoAlias Alias Analysis
-; CHECK-NEXT: Assumption Cache Tracker
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Module Verifier
-; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (pre inlining)
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: SROA
-; CHECK-NEXT: Early CSE
-; CHECK-NEXT: Lower 'expect' Intrinsics
-; CHECK-NEXT: Pass Arguments:
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: Target Transform Information
+; Splitting should occur after Mem2Reg and should be followed by InstCombine.
+
+; DEFAULT-Os: Promote Memory to Register
+; DEFAULT-Os: Hot Cold Splitting
+; DEFAULT-Os: Combine redundant instructions
+
+; LTO-PRELINK-Os-LABEL: Starting llvm::Module pass manager run.
+; LTO-PRELINK-Os: Running pass: ModuleToFunctionPassAdaptor<llvm::PromotePass>
+; LTO-PRELINK-Os: Running pass: HotColdSplittingPass
+; LTO-PRELINK-Os: Running pass: ModuleToFunctionPassAdaptor<llvm::PassManager<llvm::Function> >
+
+; THINLTO-PRELINK-Os-LABEL: Running analysis: PassInstrumentationAnalysis
+; THINLTO-PRELINK-Os: Running pass: ModuleToFunctionPassAdaptor<llvm::PromotePass>
+; THINLTO-PRELINK-Os: Running pass: HotColdSplittingPass
+; THINLTO-PRELINK-Os: Running pass: ModuleToFunctionPassAdaptor<llvm::PassManager<llvm::Function> >
+
+; THINLTO-POSTLINK-Os-NOT: HotColdSplitting
-; Target Pass Configuration
-; CHECK: Type-Based Alias Analysis
-; CHECK-NEXT: Scoped NoAlias Alias Analysis
-; CHECK-NEXT: Assumption Cache Tracker
-; CHECK-NEXT: Profile summary info
-; CHECK-NEXT: ModulePass Manager
-; CHECK-NEXT: Force set function attributes
-; CHECK-NEXT: Infer set function attributes
-; CHECK-NEXT: Interprocedural Sparse Conditional Constant Propagation
-; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
-; CHECK-NEXT: Called Value Propagation
-; CHECK-NEXT: Global Variable Optimizer
-; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Promote Memory to Register
-; CHECK-NEXT: Dead Argument Elimination
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: CallGraph Construction
-; CHECK-NEXT: Globals Alias Analysis
-; CHECK-NEXT: Call Graph SCC Pass Manager
-; CHECK-NEXT: Remove unused exception handling info
-; CHECK-NEXT: Function Integration/Inlining
-; CHECK-NEXT: Deduce function attributes
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: SROA
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Memory SSA
-; CHECK-NEXT: Early CSE w/ MemorySSA
-; CHECK-NEXT: Speculatively execute instructions if target has divergent branches
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Lazy Value Information Analysis
-; CHECK-NEXT: Jump Threading
-; CHECK-NEXT: Value Propagation
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Tail Call Elimination
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Reassociate expressions
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Rotate Loops
-; CHECK-NEXT: Loop Invariant Code Motion
-; CHECK-NEXT: Unswitch loops
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Induction Variable Simplification
-; CHECK-NEXT: Recognize loop idioms
-; CHECK-NEXT: Delete dead loops
-; CHECK-NEXT: Unroll loops
-; CHECK-NEXT: MergedLoadStoreMotion
-; CHECK-NEXT: Phi Values Analysis
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Memory Dependence Analysis
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Global Value Numbering
-; CHECK-NEXT: Phi Values Analysis
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Memory Dependence Analysis
-; CHECK-NEXT: MemCpy Optimization
-; CHECK-NEXT: Sparse Conditional Constant Propagation
-; CHECK-NEXT: Demanded bits analysis
-; CHECK-NEXT: Bit-Tracking Dead Code Elimination
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Lazy Value Information Analysis
-; CHECK-NEXT: Jump Threading
-; CHECK-NEXT: Value Propagation
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Phi Values Analysis
-; CHECK-NEXT: Memory Dependence Analysis
-; CHECK-NEXT: Dead Store Elimination
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
-; CHECK-NEXT: Post-Dominator Tree Construction
-; CHECK-NEXT: Aggressive Dead Code Elimination
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: A No-Op Barrier Pass
-; CHECK-NEXT: Eliminate Available Externally Globals
-; CHECK-NEXT: CallGraph Construction
-; CHECK-NEXT: Deduce function attributes in RPO
-; CHECK-NEXT: Global Variable Optimizer
-; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
-; CHECK-NEXT: Dead Global Elimination
-; CHECK-NEXT: CallGraph Construction
-; CHECK-NEXT: Globals Alias Analysis
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Float to int
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Rotate Loops
-; CHECK-NEXT: Loop Access Analysis
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Loop Distribution
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Loop Access Analysis
-; CHECK-NEXT: Demanded bits analysis
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Loop Vectorization
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Loop Access Analysis
-; CHECK-NEXT: Loop Load Elimination
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Demanded bits analysis
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: SLP Vectorizer
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Unroll loops
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Combine redundant instructions
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Warn about non-applied transformations
-; CHECK-NEXT: Alignment from assumptions
-; CHECK-NEXT: Strip Unused Function Prototypes
-; CHECK-NEXT: Dead Global Elimination
-; CHECK-NEXT: Merge Duplicate Global Constants
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Canonicalize natural loops
-; CHECK-NEXT: LCSSA Verifier
-; CHECK-NEXT: Loop-Closed SSA Form Pass
-; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
-; CHECK-NEXT: Function Alias Analysis Results
-; CHECK-NEXT: Scalar Evolution Analysis
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Loop Pass Manager
-; CHECK-NEXT: Loop Sink
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Remove redundant instructions
-; CHECK-NEXT: Hoist/decompose integer division and remainder
-; CHECK-NEXT: Simplify the CFG
-; CHECK-NEXT: Hot Cold Splitting
-; CHECK-NEXT: Unnamed pass: implement Pass::getPassName()
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Module Verifier
-; CHECK-NEXT: Bitcode Writer
-; CHECK-NEXT: Pass Arguments: -domtree
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Pass Arguments: -targetlibinfo -domtree -loops -branch-prob -block-freq
-; CHECK-NEXT: Target Library Information
-; CHECK-NEXT: FunctionPass Manager
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis