This is an archive of the discontinued LLVM Phabricator instance.

Differential D19773

Tweak the ThinLTO pass pipeline
ClosedPublic

Authored by mehdi_amini on Apr 30 2016, 5:56 PM.

Details

Reviewers

tejohnson

Commits

rG31407ba009c8: Tweak the ThinLTO pass pipeline

Summary

The original ThinLTO pipeline was derived from some
work I did tuning FullLTO on the test suite and SPEC. This
patch reduces the amount of work done in the "linker phase" of
the build, and extends the function simplification passes
performed during the "compile phase". This helps the build time
by reducing the IR as much as possible during the compile phase
and limiting the work to be performed during the "link phase",
while keeping the performance "on par" with the existing pipeline.
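
To make the compile-phase change concrete, here is a toy C++ model of the portion of `addFunctionSimplificationPasses` that is visible in the diff on this page. It is only a sketch under stated assumptions: passes are represented as plain strings rather than real LLVM pass objects, the `EP_Peephole` extension point is omitted, the list stops where the visible diff stops, and `functionSimplificationSketch` with its `earlyExitForThinLTO` flag are hypothetical names introduced for illustration.

```cpp
#include <string>
#include <vector>

// Toy model only: pass names are plain strings, not real LLVM passes.
using Pipeline = std::vector<std::string>;

// Hypothetical helper mirroring the visible part of
// addFunctionSimplificationPasses. `earlyExitForThinLTO == true` models the
// pre-patch behavior; `false` models the behavior after this patch.
Pipeline functionSimplificationSketch(bool prepareForThinLTO,
                                      bool earlyExitForThinLTO) {
  Pipeline P = {"CFGSimplification", "InstructionCombining",
                "TailCallElimination", "CFGSimplification", "Reassociate"};

  if (prepareForThinLTO && earlyExitForThinLTO) {
    // Pre-patch: the ThinLTO compile phase stopped here.
    P.push_back("AggressiveDCE");
    P.push_back("InstructionCombining");
    return P;
  }

  // Post-patch (and non-ThinLTO builds): continue into the loop passes.
  static const char *const LoopPasses[] = {
      "LoopRotate",        "LICM",
      "LoopUnswitch",      "CFGSimplification",
      "InstructionCombining", "IndVarSimplify",
      "LoopIdiom"};
  for (const char *Name : LoopPasses)
    P.push_back(Name);
  return P;
}
```

Calling `functionSimplificationSketch(true, true)` gives the shorter pre-patch list, while `functionSimplificationSketch(true, false)` gives the extended list that the compile phase runs once the early return is removed.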

Event Timeline

mehdi_amini updated this revision to Diff 55731. (Apr 30 2016, 5:56 PM)

mehdi_amini retitled this revision from to Tweak the ThinLTO pass pipeline.

mehdi_amini updated this object.

mehdi_amini added a reviewer: tejohnson.

mehdi_amini added a subscriber: llvm-commits.

Herald added a subscriber: mehdi_amini. (Apr 30 2016, 5:56 PM)

Fix typo check for LTO

Rebase after splitting other changes

I compared the performance of all the C/C++ benchmarks in SPEC CPU2006 on an Ivy Bridge x86 machine with gold+ThinLTO, with 3 runs each. It looks performance-neutral (some run-to-run variation that looks like noise, nothing significant).

lib/Transforms/IPO/PassManagerBuilder.cpp
428	Is there any significance to the move of the globalopt invocation here from its original location below LoopVersioningLICM?
448	Why are the globalDCE before and after globalopt no longer needed (for the reasons stated in the comments)?

mehdi_amini added inline comments. (May 6 2016, 9:08 AM)

lib/Transforms/IPO/PassManagerBuilder.cpp
428	I didn't really design the original position to be after LICM. I wrote this back in October, before `ReversePostOrderFunctionAttrs` and `LoopVersioningLICM` existed, and originally inserted it right after `EliminateAvailableExternally`. In the meantime, `LoopVersioningLICM` was inserted before `EliminateAvailableExternally`. Last week, in this patch, I reorganized `ReversePostOrderFunctionAttrs`, `EliminateAvailableExternally`, and `GlobalOpt` to run before `LICM` (which is more a part of the function optimization pipeline that follows). Then I was asked to split the patch, so what remains is only this part of the move.
448	I think my original comments were buggy: `GlobalOptimizer` performs DCE as well.
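
The link-phase reordering discussed in these two inline comments can be sketched the same way. The following toy C++ model covers only the fragment of `populateModulePassManager` shown in the diff on this page; passes are plain strings rather than real LLVM pass objects, `"FunctionSimplificationPasses"` stands in for the call to `addFunctionSimplificationPasses(MPM)`, and `moduleFragmentSketch` and its `patched` flag are hypothetical names used purely for illustration.

```cpp
#include <string>
#include <vector>

// Toy model only: pass names are plain strings, not real LLVM passes.
using Pipeline = std::vector<std::string>;

// Hypothetical helper mirroring the fragment of populateModulePassManager
// around LoopVersioningLICM. `patched == false` models the pipeline before
// this revision; `patched == true` models it afterwards.
Pipeline moduleFragmentSketch(bool performThinLTO, bool useLoopVersioningLICM,
                              bool patched) {
  Pipeline P;

  // After the patch: a single GlobalOptimizer run, placed before the
  // LoopVersioningLICM block.
  if (patched && performThinLTO)
    P.push_back("GlobalOptimizer");

  if (useLoopVersioningLICM) {
    P.push_back("LoopVersioningLICM");
    P.push_back("LICM");
  }

  // Before the patch: GlobalDCE, GlobalOptimizer, GlobalDCE, then the whole
  // function simplification pipeline, all after the LoopVersioningLICM block.
  if (!patched && performThinLTO) {
    P.push_back("GlobalDCE");
    P.push_back("GlobalOptimizer");
    P.push_back("GlobalDCE");
    P.push_back("FunctionSimplificationPasses");
  }
  return P;
}
```

With `performThinLTO` and `useLoopVersioningLICM` both set, the pre-patch fragment has six entries and the patched one has three, which matches the stated goal of doing less work in the link phase.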

tejohnson accepted this revision. (May 6 2016, 9:30 AM)

tejohnson edited edge metadata.

This revision is now accepted and ready to land. (May 6 2016, 9:30 AM)

r268769

Revision Contents

Path: lib/Transforms/IPO/PassManagerBuilder.cpp
Size: 24 lines

Diff 55851

lib/Transforms/IPO/PassManagerBuilder.cpp

@@ 236 unchanged lines hidden @@ void PassManagerBuilder::addFunctionSimplificationPasses(
   MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
   // Combine silly seq's
   addInstructionCombiningPass(MPM);
   addExtensionsToPM(EP_Peephole, MPM);

   MPM.add(createTailCallEliminationPass()); // Eliminate tail calls
   MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
   MPM.add(createReassociatePass()); // Reassociate expressions
-  if (PrepareForThinLTO) {
-    MPM.add(createAggressiveDCEPass()); // Delete dead instructions
-    addInstructionCombiningPass(MPM); // Combine silly seq's
-    return;
-  }
   // Rotate Loop - disable header duplication at -Oz
   MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
   MPM.add(createLICMPass()); // Hoist loop invariants
   MPM.add(createLoopUnswitchPass(SizeLevel || OptLevel < 3));
   MPM.add(createCFGSimplificationPass());
   addInstructionCombiningPass(MPM);
   MPM.add(createIndVarSimplifyPass()); // Canonicalize indvars
   MPM.add(createLoopIdiomPass()); // Recognize idioms like memset.
@@ 167 unchanged lines hidden @@ void PassManagerBuilder::populateModulePassManager(
   if (PrepareForThinLTO) {
     // Reduce the size of the IR as much as possible.
     MPM.add(createGlobalOptimizerPass());
     // Rename anon function to be able to export them in the summary.
     MPM.add(createNameAnonFunctionPass());
     return;
   }

+  if (PerformThinLTO)
+    // Optimize globals now when performing ThinLTO, this enables more
+    // optimizations later.
+    MPM.add(createGlobalOptimizerPass());
+
   // Scheduling LoopVersioningLICM when inlining is over, because after that
   // we may see more accurate aliasing. Reason to run this late is that too
   // early versioning may prevent further inlining due to increase of code
   // size. By placing it just after inlining other optimizations which runs
   // later might get benefit of no-alias assumption in clone loop.
   if (UseLoopVersioningLICM) {
     MPM.add(createLoopVersioningLICMPass()); // Do LoopVersioningLICM
     MPM.add(createLICMPass()); // Hoist loop invariants
   }

-  if (PerformThinLTO) {
-    // Remove dead fns and globals. Removing unreferenced functions could lead
-    // to more opportunities for globalopt.
-    MPM.add(createGlobalDCEPass());
-    MPM.add(createGlobalOptimizerPass());
-    // Remove dead fns and globals after globalopt.
-    MPM.add(createGlobalDCEPass());
-    addFunctionSimplificationPasses(MPM);
-  }
-
   if (EnableNonLTOGlobalsModRef)
     // We add a fresh GlobalsModRef run at this point. This is particularly
     // useful as the above will have inlined, DCE'ed, and function-attr
     // propagated everything. We should at this point have a reasonably minimal
     // and richly annotated call graph. By computing aliasing and mod/ref
     // information for all local globals here, the late loop passes and notably
     // the vectorizer will be able to use them to help recognize vectorizable
     // memory operations.
@@ 407 unchanged lines hidden @@