This is an archive of the discontinued LLVM Phabricator instance.

Differential D73994

[InlineCost] Relax bonus restrictions on uninlinable functions
Needs Review · Public

Authored by george.burgess.iv on Feb 4 2020, 1:58 PM.

Details

Reviewers

davidxl
mtrofin
eraman

Summary

We currently ignore inlining bonuses sometimes (importantly to me, LastCallToStaticBonus) in order to make the caller of a function more inlinable. This appears suboptimal if the caller can't be inlined in the first place.

This is an attempt to fix that suboptimality (which showed up for me most prominently with the new PM), though it's not quite finished. Before I plumb everything through, I'd like to know if anyone's strongly against a change like this.

Importantly, I can see the following downsides:

- This patch makes our inline cost depend on state of the caller, which is generally discouraged, since we try to cache these things.
- This patch basically causes the inliner to inline differently depending on whether we're LTO'ing or not, which AFAICT is new.

The last bullet is the case because, in a traditional build, I don't think we care about how inlineable exportedFunction is below, assuming it has no users in its TU:

static void helper() {
  // large body
}

void exportedFunction(bool i) {
  if (UNLIKELY(i))
    helper();
}

OTOH, with ThinLTO or LTO enabled, we might discover uses of exportedFunction during our link, so the inlinability of it starts to matter.

Thoughts appreciated. :)
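To make the traditional-versus-LTO distinction in the summary concrete, here is a minimal sketch of the decision it describes. This is not code from the patch: `CallerInfo`, `useListsAreIncomplete`, and `callerInlinabilityMatters` are hypothetical names, and the rule for callers without visible uses is an assumption drawn from the summary's reasoning.

```cpp
// Hypothetical model of a caller as the inliner might see it; not an LLVM type.
struct CallerInfo {
  bool HasLocalLinkage; // e.g. declared `static`
  bool HasVisibleUses;  // any use of the caller visible in this module
};

// Mirrors the condition clang passes in this revision:
// PrepareForThinLTO || PrepareForLTO.
constexpr bool useListsAreIncomplete(bool PrepareForThinLTO,
                                     bool PrepareForLTO) {
  return PrepareForThinLTO || PrepareForLTO;
}

// Assumed rule: the caller's inlinability matters if it has visible uses, or
// if it is external and new uses may still be discovered at link time.
constexpr bool callerInlinabilityMatters(CallerInfo Caller,
                                         bool UseListsAreIncomplete) {
  return Caller.HasVisibleUses ||
         (!Caller.HasLocalLinkage && UseListsAreIncomplete);
}

// `exportedFunction` from the summary: external linkage, no users in its TU.
constexpr CallerInfo ExportedFunction{/*HasLocalLinkage=*/false,
                                      /*HasVisibleUses=*/false};

// Traditional build: nobody else can inline it, so bonuses could be kept.
static_assert(!callerInlinabilityMatters(ExportedFunction,
                                         useListsAreIncomplete(false, false)),
              "traditional build");
// ThinLTO pre-link: uses may appear during the link, so it starts to matter.
static_assert(callerInlinabilityMatters(ExportedFunction,
                                        useListsAreIncomplete(true, false)),
              "ThinLTO pre-link");
```

The two `static_assert`s restate the summary's point: the same function gets a different answer depending only on whether the build is preparing for (Thin)LTO.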

Event Timeline

george.burgess.iv created this revision. Feb 4 2020, 1:58 PM

Herald added a project: Restricted Project. Feb 4 2020, 1:58 PM

Herald added subscribers: llvm-commits, dexonsmith, haicheng and 2 others.

On the first bullet, I'm trying to find where InlineCost values are cached. Are we doing that?

On the first bullet, I'm trying to find where InlineCost values are cached. Are we doing that?

I don't actually know myself; I got that from the comment above CandidateCall: "The candidate callsite being analyzed. Please do not use this to do analysis in the caller function; we want the inline cost query to be easily cacheable. Instead, use the cover function paramHasAttr."

This is sort of an extension of the 'shouldBeDeferred' check in Inliner.cpp, except that here the caller may not actually be inlined. Blindly eliminating the bonus may prevent the callee from being inlined while not actually enabling the caller to be inlined.

In D73994#1858006, @george.burgess.iv wrote:

On the first bullet, I'm trying to find where InlineCost values are cached. Are we doing that?

I don't actually know myself; I got that from the comment above CandidateCall: "The candidate callsite being analyzed. Please do not use this to do analysis in the caller function; we want the inline cost query to be easily cacheable. Instead, use the cover function paramHasAttr."

Yup - I also don't know of any location caching the cost. I wonder if this was a never-realized goal, and whether it has been abandoned. It may be worth checking with the author of the comment?

Yup - I also don't know of any location caching the cost. I wonder if this was a never-realized goal, and whether it has been abandoned. It may be worth checking with the author of the comment?

Good call -- looks like the comment was added in 9b5c9580e384 around 5 years ago. The commit message says "We're not currently [caching inline costs], but initial results on LTO indicate this will quickly become important," so this sounds like a nice-to-have that was never had.

Removed the cacheable bit of that comment, since I think it's a bit misleading if there's no motion on it.

This is sort of an extension of the 'shouldBeDeferred' check in Inliner.cpp, except that here the caller may not actually be inlined. Blindly eliminating the bonus may prevent the callee from being inlined while not actually enabling the caller to be inlined.

Oof, yeah, these look similar, but it's not clear to me if there's a good way to unify them as a part of this patch (or lift this out to there). Lemme poke it a bit and see :)

DisallowAllBonuses is called only when the callsite or the callee entry is cold. The only reason to make it conditional on caller's inlinability would be for code size improvement. Among the 3 bonuses disabled in DisallowAllBonuses, only the LastCallToStaticBonus is useful to allow for code size reduction. You don't want to give a single BB bonus or a vector bonus for a cold callee (or a callee at a cold callsite) whose caller is unlikely to be inlined.
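The bonus handling suggested in this comment can be modeled in isolation. The sketch below follows the shape of the `SetBonusesForColdCalee` lambda in this revision's InlineCost.cpp diff, but `Bonuses` and `bonusesForColdCallee` are hypothetical names, and the starting values other than the 50% single-BB bonus are placeholders, not LLVM's real constants.

```cpp
// Hypothetical bundle of the three bonuses named in the comment above.
struct Bonuses {
  int SingleBBBonusPercent;
  int VectorBonusPercent;
  int LastCallToStaticBonus;
};

// For a cold callee or cold callsite: always drop the single-BB and vector
// bonuses; drop LastCallToStaticBonus only when the caller's own
// inlinability matters.
constexpr Bonuses bonusesForColdCallee(Bonuses B,
                                       bool CallerInlinabilityMatters) {
  return Bonuses{0, 0,
                 CallerInlinabilityMatters ? 0 : B.LastCallToStaticBonus};
}

// 50 comes from the diff; 100 and 1000 are placeholder values.
constexpr Bonuses Initial{50, 100, 1000};

static_assert(bonusesForColdCallee(Initial, true).LastCallToStaticBonus == 0,
              "bonus dropped when the caller may be inlined elsewhere");
static_assert(bonusesForColdCallee(Initial, false).LastCallToStaticBonus ==
                  1000,
              "bonus kept when the caller cannot be inlined anyway");
static_assert(bonusesForColdCallee(Initial, false).SingleBBBonusPercent == 0,
              "speed-oriented bonuses are always dropped for cold code");
```

Keeping only the last-call-to-static bonus conditional reflects the reviewer's argument: it is the one bonus of the three that can reduce code size.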

Updated to plumb through the "are we doing ThinLTO || prelink LTO" bit in a way that makes sense to me. Suggestions for potentially better approaches are welcome :)

Among the 3 bonuses disabled in DisallowAllBonuses, only the LastCallToStaticBonus is useful to allow for code size reduction [...].

Fixed; thanks!

Also, RE

Oof, yeah, these look similar, but it's not clear to me if there's a good way to unify them as a part of this patch (or lift this out to there). Lemme poke it a bit and see :)

Unfortunately, I couldn't figure out a clean way to do this, so I added a FIXME for now.

ping :)

I have concerns about so many plumbing changes to make a small heuristic change.

On the other hand, can you first extract the pass manager builder changes (i.e, passing Phase kind) into a separate patch? That one seems independent.

Especially for inliner changes I suggest a) sharing performance numbers on standard benchmarks with O2 & O2 (thin) LTO, and b) offering a flag to toggle individual changes. Performance here includes run-time performance, compile-time, and code size. A simple change can have a detrimental impact on any of the key metrics folks might be watching, and that impact could be different depending on the target.

I have concerns about so many plumbing changes to make a small heuristic change.

Agreed that that's discouraging. Lemme take numbers and see what this does for us. If this still looks promising, I'll split as requested.

sharing performance numbers on standard benchmarks with O2 & O2 (thin) LTO,

Thanks for the feedback! I'm not sure what benchmarks are considered standard. Do we have documentation on this, or might you have specific ones in mind I can try this on?

Friendly ping Gerolf (or anyone else who has opinions on standardish benchmarks) :)

SPEC2006/2017 would be good public benchmarks. Clang performance/size (with a self-build, with and without PGO) is another one. Internal benchmark results are also meaningful.

Revision Contents

Path                                                Size
clang/lib/CodeGen/BackendUtil.cpp                   4 lines
llvm/include/llvm/Analysis/InlineCost.h             15 lines
llvm/include/llvm/Passes/PassBuilder.h              21 lines
llvm/include/llvm/Transforms/IPO.h                  14 lines
llvm/include/llvm/Transforms/IPO/Inliner.h          13 lines
llvm/lib/Analysis/InlineCost.cpp                    55 lines
llvm/lib/Passes/PassBuilder.cpp                     71 lines
llvm/lib/Transforms/IPO/InlineSimple.cpp            43 lines
llvm/lib/Transforms/IPO/Inliner.cpp                 10 lines
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp      5 lines
llvm/test/Transforms/Inline/bpi-cold-inlining.ll    76 lines

Diff 243060

clang/lib/CodeGen/BackendUtil.cpp

@@ if (CodeGenOpts.OptimizationLevel <= 1) {
 PMBuilder.Inliner = createAlwaysInlinerLegacyPass(InsertLifetimeIntrinsics);
 } else {
 // We do not want to inline hot callsites for SamplePGO module-summary build
 // because profile annotation will happen again in ThinLTO backend, and we
 // want the IR of the hot path to match the profile.
 PMBuilder.Inliner = createFunctionInliningPass(
 CodeGenOpts.OptimizationLevel, CodeGenOpts.OptimizeSize,
 (!CodeGenOpts.SampleProfileFile.empty() &&
-CodeGenOpts.PrepareForThinLTO));
+CodeGenOpts.PrepareForThinLTO),
+/*ExternalFunctionUseListsAreIncomplete=*/
+CodeGenOpts.PrepareForThinLTO || CodeGenOpts.PrepareForLTO);
 }

 PMBuilder.OptLevel = CodeGenOpts.OptimizationLevel;
 PMBuilder.SizeLevel = CodeGenOpts.OptimizeSize;
 PMBuilder.SLPVectorize = CodeGenOpts.VectorizeSLP;
 PMBuilder.LoopVectorize = CodeGenOpts.VectorizeLoop;

 PMBuilder.DisableUnrollLoops = !CodeGenOpts.UnrollLoops;

llvm/include/llvm/Analysis/InlineCost.h

@@
 /// Note that a default threshold is passed into this function. This threshold
 /// could be modified based on callsite's properties and only costs below this
 /// new threshold are computed with any accuracy. The new threshold can be
 /// used to bound the computation necessary to determine whether the cost is
 /// sufficiently low to warrant inlining.
 ///
 /// Also note that calling this function dynamically computes the cost of
 /// inlining the callsite. It is an expensive, heavyweight call.
-InlineCost getInlineCost(
-CallBase &Call, const InlineParams &Params, TargetTransformInfo &CalleeTTI,
+InlineCost
+getInlineCost(CallBase &Call, const InlineParams &Params,
+TargetTransformInfo &CalleeTTI,
 std::function<AssumptionCache &(Function &)> &GetAssumptionCache,
 Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
-ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE = nullptr);
+ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE = nullptr,
+bool CallerInlinabilityMatters = true);

 /// Get an InlineCost with the callee explicitly specified.
 /// This allows you to calculate the cost of inlining a function via a
 /// pointer. This behaves exactly as the version with no explicit callee
 /// parameter in all other respects.
 //
 InlineCost
 getInlineCost(CallBase &Call, Function *Callee, const InlineParams &Params,
 TargetTransformInfo &CalleeTTI,
 std::function<AssumptionCache &(Function &)> &GetAssumptionCache,
 Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
-ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE);
+ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE,
+bool CallerInlinabilityMatters = true);

 /// Minimal filter to detect invalid constructs for inlining.
 InlineResult isInlineViable(Function &Callee);
 } // namespace llvm

 #endif

llvm/include/llvm/Passes/PassBuilder.h

@@ public:
 /// Note that \p Level cannot be `O0` here. The pipelines produced are
 /// only intended for use when attempting to optimize code. If frontends
 /// require some transformations for semantic reasons, they should explicitly
 /// build them.
 ///
 /// \p Phase indicates the current ThinLTO phase.
 ModulePassManager
 buildModuleSimplificationPipeline(OptimizationLevel Level,
-ThinLTOPhase Phase,
-bool DebugLogging = false);
+bool DebugLogging = false,
+ThinLTOPhase Phase = ThinLTOPhase::None);

 /// Construct the core LLVM module optimization pipeline.
 ///
 /// This pipeline focuses on optimizing the execution speed of the IR. It
 /// uses cost modeling and thresholds to balance code growth against runtime
 /// improvements. It includes vectorization and other information destroying
 /// transformations. It also cannot generally be run repeatedly on a module
 /// without potentially seriously regressing either runtime performance of
 /// the code or serious code size growth.
 ///
 /// Note that \p Level cannot be `O0` here. The pipelines produced are
 /// only intended for use when attempting to optimize code. If frontends
 /// require some transformations for semantic reasons, they should explicitly
 /// build them.
-ModulePassManager buildModuleOptimizationPipeline(OptimizationLevel Level,
+ModulePassManager
+buildModuleOptimizationPipeline(OptimizationLevel Level,
 bool DebugLogging = false,
-bool LTOPreLink = false);
+ThinLTOPhase Phase = ThinLTOPhase::None);

 /// Build a per-module default optimization pipeline.
 ///
 /// This provides a good default optimization pipeline for per-module
 /// optimization and code generation without any link-time optimization. It
 /// typically correspond to frontend "-O[123]" options for optimization
 /// levels \c O1, \c O2 and \c O3 resp.
 ///
 /// Note that \p Level cannot be `O0` here. The pipelines produced are
 /// only intended for use when attempting to optimize code. If frontends
 /// require some transformations for semantic reasons, they should explicitly
 /// build them.
-ModulePassManager buildPerModuleDefaultPipeline(OptimizationLevel Level,
+ModulePassManager
+buildPerModuleDefaultPipeline(OptimizationLevel Level,
 bool DebugLogging = false,
-bool LTOPreLink = false);
+ThinLTOPhase Phase = ThinLTOPhase::None);

 /// Build a pre-link, ThinLTO-targeting default optimization pipeline to
 /// a pass manager.
 ///
 /// This adds the pre-link optimizations tuned to prepare a module for
 /// a ThinLTO run. It works to minimize the IR which needs to be analyzed
 /// without making irreversible decisions which could be made better during
 /// the LTO run.
@@ Error parseCGSCCPassPipeline(CGSCCPassManager &CGPM,
 bool VerifyEachPass, bool DebugLogging);
 Error parseModulePassPipeline(ModulePassManager &MPM,
 ArrayRef<PipelineElement> Pipeline,
 bool VerifyEachPass, bool DebugLogging);

 void addPGOInstrPasses(ModulePassManager &MPM, bool DebugLogging,
 OptimizationLevel Level, bool RunProfileGen, bool IsCS,
 std::string ProfileFile,
-std::string ProfileRemappingFile);
+std::string ProfileRemappingFile,
+bool ExternalFunctionUseListsAreIncomplete);
 void invokePeepholeEPCallbacks(FunctionPassManager &, OptimizationLevel);

 // Extension Point callbacks
 SmallVector<std::function<void(FunctionPassManager &, OptimizationLevel)>, 2>
 PeepholeEPCallbacks;
 SmallVector<std::function<void(LoopPassManager &, OptimizationLevel)>, 2>
 LateLoopOptimizationsEPCallbacks;
 SmallVector<std::function<void(LoopPassManager &, OptimizationLevel)>, 2>

llvm/include/llvm/Transforms/IPO.h

@@
 /// to inline direct function calls to small functions.
 ///
 /// The Threshold can be passed directly, or asked to be computed from the
 /// given optimization and size optimization arguments.
 ///
 /// The -inline-threshold command line option takes precedence over the
 /// threshold given here.
 Pass *createFunctionInliningPass();
-Pass *createFunctionInliningPass(int Threshold);
-Pass *createFunctionInliningPass(unsigned OptLevel, unsigned SizeOptLevel,
-bool DisableInlineHotCallSite);
-Pass *createFunctionInliningPass(InlineParams &Params);
+Pass *
+createFunctionInliningPass(int Threshold,
+bool ExternalFunctionUseListsAreIncomplete = false);
+Pass *
+createFunctionInliningPass(unsigned OptLevel, unsigned SizeOptLevel,
+bool DisableInlineHotCallSite,
+bool ExternalFunctionUseListsAreIncomplete = false);
+Pass *
+createFunctionInliningPass(InlineParams &Params,
+bool ExternalFunctionUseListsAreIncomplete = false);

 //===----------------------------------------------------------------------===//
 /// createPruneEHPass - Return a new pass object which transforms invoke
 /// instructions into calls, if the callee can _not_ unwind the stack.
 ///
 Pass *createPruneEHPass();

 //===----------------------------------------------------------------------===//

llvm/include/llvm/Transforms/IPO/Inliner.h

@@
 /// It should be noted that the legacy inliners do considerably more than this
 /// inliner pass does. They provide logic for manually merging allocas, and
 /// doing considerable DCE including the DCE of dead functions. This pass makes
 /// every attempt to be simpler. DCE of functions requires complex reasoning
 /// about comdat groups, etc. Instead, it is expected that other more focused
 /// passes be composed to achieve the same end result.
 class InlinerPass : public PassInfoMixin<InlinerPass> {
 public:
-InlinerPass(InlineParams Params = getInlineParams())
-: Params(std::move(Params)) {}
+InlinerPass(InlineParams Params = getInlineParams(),
+bool ExternalFunctionUseListsAreIncomplete = false)
+: Params(std::move(Params)), ExternalFunctionUseListsAreIncomplete(
+ExternalFunctionUseListsAreIncomplete) {}
 ~InlinerPass();
-InlinerPass(InlinerPass &&Arg)
-: Params(std::move(Arg.Params)),
-ImportedFunctionsStats(std::move(Arg.ImportedFunctionsStats)) {}
+InlinerPass(InlinerPass &&Arg) = default;

 PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
 LazyCallGraph &CG, CGSCCUpdateResult &UR);

 private:
 InlineParams Params;
 std::unique_ptr<ImportedFunctionsInliningStatistics> ImportedFunctionsStats;
+// Assume that new references can't appear on otherwise-unreferenced `extern`
+// functions. This isn't the case in e.g., ThinLTO.
+bool ExternalFunctionUseListsAreIncomplete;
 };

 } // end namespace llvm

 #endif // LLVM_TRANSFORMS_IPO_INLINER_H

llvm/lib/Analysis/InlineCost.cpp

@@ protected:
 Function &F;

 // Cache the DataLayout since we use it a lot.
 const DataLayout &DL;

 /// The OptimizationRemarkEmitter available for this compilation.
 OptimizationRemarkEmitter *ORE;

-/// The candidate callsite being analyzed. Please do not use this to do
-/// analysis in the caller function; we want the inline cost query to be
-/// easily cacheable. Instead, use the cover function paramHasAttr.
+/// The candidate callsite being analyzed.
 CallBase &CandidateCall;

 /// Extension points for handling callsite features.
 /// Called after a basic block was analyzed.
 virtual void onBlockAnalyzed(const BasicBlock *BB) {}

 /// Called at the end of the analysis of the callsite. Return the outcome of
 /// the analysis, i.e. 'InlineResult(true)' if the inlining may happen, or
@@ class InlineCostCallAnalyzer final : public CallAnalyzer {

 /// Upper bound for the inlining cost. Bonuses are being applied to account
 /// for speculative "expected profit" of the inlining decision.
 int Threshold = 0;

 /// Attempt to evaluate indirect calls to boost its inline cost.
 const bool BoostIndirectCalls;

+/// If true, inlining may be more conservative to take the caller's
+/// inlineability into account.
+const bool CallerInlinabilityMatters;
+
 /// Inlining cost measured in abstract units, accounts for all the
 /// instructions expected to be executed for a given function invocation.
 /// Instructions that are statically proven to be dead based on call-site
 /// arguments are not counted here.
 int Cost = 0;

 bool SingleBB = true;

@@ class InlineCostCallAnalyzer final : public CallAnalyzer {
 }

 public:
 InlineCostCallAnalyzer(
 const TargetTransformInfo &TTI,
 std::function<AssumptionCache &(Function &)> &GetAssumptionCache,
 Optional<function_ref<BlockFrequencyInfo &(Function &)>> &GetBFI,
 ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE, Function &Callee,
-CallBase &Call, const InlineParams &Params, bool BoostIndirect = true)
+CallBase &Call, const InlineParams &Params, bool BoostIndirect = true,
+bool CallerInlinabilityMatters = true)
 : CallAnalyzer(TTI, GetAssumptionCache, GetBFI, PSI, ORE, Callee, Call),
 ComputeFullInlineCost(OptComputeFullInlineCost ||
 Params.ComputeFullInlineCost || ORE),
 Params(Params), Threshold(Params.DefaultThreshold),
-BoostIndirectCalls(BoostIndirect) {}
+BoostIndirectCalls(BoostIndirect),
+CallerInlinabilityMatters(CallerInlinabilityMatters) {}
 void dump();

 virtual ~InlineCostCallAnalyzer() {}
 int getThreshold() { return Threshold; }
 int getCost() { return Cost; }
 };
 } // namespace

@@ void InlineCostCallAnalyzer::updateThreshold(CallBase &Call, Function &Callee) {
 // guaranteed to reduce code size.
 //
 // These bonus percentages may be set to 0 based on properties of the caller
 // and the callsite.
 int SingleBBBonusPercent = 50;
 int VectorBonusPercent = TTI.getInlinerVectorBonusPercent();
 int LastCallToStaticBonus = InlineConstants::LastCallToStaticBonus;

-// Lambda to set all the above bonus and bonus percentages to 0.
-auto DisallowAllBonuses = [&]() {
+// Lambda to set all the above bonus and bonus percentages to 0, if doing so
+// might help us inline the caller elsewhere.
+auto SetBonusesForColdCalee = [&]() {
+// Cold callsites should be kept small; disallow their bonuses.
 SingleBBBonusPercent = 0;
 VectorBonusPercent = 0;
+
+// While keeping LastCallToStaticBonus might result in code size reduction,
+// it can cause the size of the caller to increase, which may prevent it
+// from being inlined.
+//
+// FIXME: This is logically a part of our `shouldBeDeferred` logic in the
+// main `Inliner` pass. Figuring out how to hoist some of this there (or
+// sink it here, as the note there mentions) might be nice.
+if (CallerInlinabilityMatters)
 LastCallToStaticBonus = 0;
 };

 // Use the OptMinSizeThreshold or OptSizeThreshold knob if they are available
 // and reduce the threshold if the caller has the necessary attribute.
 if (Caller->hasMinSize()) {
 Threshold = MinIfValid(Threshold, Params.OptMinSizeThreshold);
 // For minsize, we want to disable the single BB bonus and the vector
 // bonuses, but not the last-call-to-static bonus. Inlining the last call to
@@ if (!Caller->hasOptSize() && HotCallSiteThreshold) {
 LLVM_DEBUG(dbgs() << "Hot callsite.\n");
 // FIXME: This should update the threshold only if it exceeds the
 // current threshold, but AutoFDO + ThinLTO currently relies on this
 // behavior to prevent inlining of hot callsites during ThinLTO
 // compile phase.
 Threshold = HotCallSiteThreshold.getValue();
 } else if (isColdCallSite(Call, CallerBFI)) {
 LLVM_DEBUG(dbgs() << "Cold callsite.\n");
-// Do not apply bonuses for a cold callsite including the
-// LastCallToStatic bonus. While this bonus might result in code size
-// reduction, it can cause the size of a non-cold caller to increase
-// preventing it from being inlined.
-DisallowAllBonuses();
+SetBonusesForColdCalee();
 Threshold = MinIfValid(Threshold, Params.ColdCallSiteThreshold);
 } else if (PSI) {
 // Use callee's global profile information only if we have no way of
 // determining this via callsite information.
 if (PSI->isFunctionEntryHot(&Callee)) {
 LLVM_DEBUG(dbgs() << "Hot callee.\n");
 // If callsite hotness can not be determined, we may still know
 // that the callee is hot and treat it as a weaker hint for threshold
 // increase.
 Threshold = MaxIfValid(Threshold, Params.HintThreshold);
 } else if (PSI->isFunctionEntryCold(&Callee)) {
 LLVM_DEBUG(dbgs() << "Cold callee.\n");
-// Do not apply bonuses for a cold callee including the
-// LastCallToStatic bonus. While this bonus might result in code size
-// reduction, it can cause the size of a non-cold caller to increase
-// preventing it from being inlined.
-DisallowAllBonuses();
+SetBonusesForColdCalee();
 Threshold = MinIfValid(Threshold, Params.ColdThreshold);
 }
 }
 }

 // Finally, take the target-specific inlining threshold multiplier into
 // account.
 Threshold *= TTI.getInliningThresholdMultiplier();
@@ int llvm::getCallsiteCost(CallBase &Call, const DataLayout &DL) {
 Cost += InlineConstants::InstrCost + InlineConstants::CallPenalty;
 return Cost;
 }

 InlineCost llvm::getInlineCost(
 CallBase &Call, const InlineParams &Params, TargetTransformInfo &CalleeTTI,
 std::function<AssumptionCache &(Function &)> &GetAssumptionCache,
Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,		Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE) {		ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE,
		bool CallerInlinabilityMatters) {
return getInlineCost(Call, Call.getCalledFunction(), Params, CalleeTTI,		return getInlineCost(Call, Call.getCalledFunction(), Params, CalleeTTI,
GetAssumptionCache, GetBFI, PSI, ORE);		GetAssumptionCache, GetBFI, PSI, ORE,
		CallerInlinabilityMatters);
}		}

InlineCost llvm::getInlineCost(		InlineCost llvm::getInlineCost(
CallBase &Call, Function *Callee, const InlineParams &Params,		CallBase &Call, Function *Callee, const InlineParams &Params,
TargetTransformInfo &CalleeTTI,		TargetTransformInfo &CalleeTTI,
std::function<AssumptionCache &(Function &)> &GetAssumptionCache,		std::function<AssumptionCache &(Function &)> &GetAssumptionCache,
Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,		Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE) {		ProfileSummaryInfo *PSI, OptimizationRemarkEmitter *ORE,
		bool CallerInlinabilityMatters) {

// Cannot inline indirect calls.		// Cannot inline indirect calls.
if (!Callee)		if (!Callee)
return llvm::InlineCost::getNever("indirect call");		return llvm::InlineCost::getNever("indirect call");

// Never inline calls with byval arguments that does not have the alloca		// Never inline calls with byval arguments that does not have the alloca
// address space. Since byval arguments can be replaced with a copy to an		// address space. Since byval arguments can be replaced with a copy to an
// alloca, the inlined code would need to be adjusted to handle that the		// alloca, the inlined code would need to be adjusted to handle that the
▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines	InlineCost llvm::getInlineCost(
// Don't inline call sites marked noinline.		// Don't inline call sites marked noinline.
if (Call.isNoInline())		if (Call.isNoInline())
return llvm::InlineCost::getNever("noinline call site attribute");		return llvm::InlineCost::getNever("noinline call site attribute");

LLVM_DEBUG(llvm::dbgs() << " Analyzing call of " << Callee->getName()		LLVM_DEBUG(llvm::dbgs() << " Analyzing call of " << Callee->getName()
<< "... (caller:" << Caller->getName() << ")\n");		<< "... (caller:" << Caller->getName() << ")\n");

InlineCostCallAnalyzer CA(CalleeTTI, GetAssumptionCache, GetBFI, PSI, ORE,		InlineCostCallAnalyzer CA(CalleeTTI, GetAssumptionCache, GetBFI, PSI, ORE,
*Callee, Call, Params);		*Callee, Call, Params, /*BoostIndirect=*/true,
		CallerInlinabilityMatters);
InlineResult ShouldInline = CA.analyze();		InlineResult ShouldInline = CA.analyze();

LLVM_DEBUG(CA.dump());		LLVM_DEBUG(CA.dump());

// Check if there was a reason to force inlining or no inlining.		// Check if there was a reason to force inlining or no inlining.
if (!ShouldInline.isSuccess() && CA.getCost() < CA.getThreshold())		if (!ShouldInline.isSuccess() && CA.getCost() < CA.getThreshold())
return InlineCost::getNever(ShouldInline.getFailureReason());		return InlineCost::getNever(ShouldInline.getFailureReason());
if (ShouldInline.isSuccess() && CA.getCost() >= CA.getThreshold())		if (ShouldInline.isSuccess() && CA.getCost() >= CA.getThreshold())
▲ Show 20 Lines • Show All 143 Lines • Show Last 20 Lines
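
For readers following the threshold updates in the InlineCost.cpp hunk above, here is a minimal sketch of how the `MinIfValid` / `MaxIfValid` helpers are assumed to behave: an unset parameter leaves the threshold untouched, while a set one clamps it down or raises it. The lowercase names and the use of `std::optional` are stand-ins for illustration, not the code in the patch.

```cpp
#include <algorithm>
#include <optional>

// Assumed behavior of MinIfValid: clamp A down to B when B is set,
// otherwise leave A unchanged.
int minIfValid(int A, std::optional<int> B) {
  return B ? std::min(A, *B) : A;
}

// Assumed behavior of MaxIfValid: raise A to B when B is set,
// otherwise leave A unchanged.
int maxIfValid(int A, std::optional<int> B) {
  return B ? std::max(A, *B) : A;
}
```

Under this reading, the cold paths can only lower the threshold and the hot-callee hint can only raise it, which is independent of which bonuses remain enabled.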

llvm/lib/Passes/PassBuilder.cpp

Show First 20 Lines • Show All 566 Lines • ▼ Show 20 Lines	PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
if (EnableCHR && Level == OptimizationLevel::O3 && PGOOpt &&		if (EnableCHR && Level == OptimizationLevel::O3 && PGOOpt &&
(PGOOpt->Action == PGOOptions::IRUse ||		(PGOOpt->Action == PGOOptions::IRUse ||
PGOOpt->Action == PGOOptions::SampleUse))		PGOOpt->Action == PGOOptions::SampleUse))
FPM.addPass(ControlHeightReductionPass());		FPM.addPass(ControlHeightReductionPass());

return FPM;		return FPM;
}		}

void PassBuilder::addPGOInstrPasses(ModulePassManager &MPM, bool DebugLogging,		void PassBuilder::addPGOInstrPasses(
PassBuilder::OptimizationLevel Level,		ModulePassManager &MPM, bool DebugLogging,
bool RunProfileGen, bool IsCS,		PassBuilder::OptimizationLevel Level, bool RunProfileGen, bool IsCS,
std::string ProfileFile,		std::string ProfileFile, std::string ProfileRemappingFile,
std::string ProfileRemappingFile) {		bool ExternalFunctionUseListsAreIncomplete) {
assert(Level != OptimizationLevel::O0 && "Not expecting O0 here!");		assert(Level != OptimizationLevel::O0 && "Not expecting O0 here!");
// Generally running simplification passes and the inliner with an high		// Generally running simplification passes and the inliner with an high
// threshold results in smaller executables, but there may be cases where		// threshold results in smaller executables, but there may be cases where
// the size grows, so let's be conservative here and skip this simplification		// the size grows, so let's be conservative here and skip this simplification
// at -Os/Oz. We will not do this inline for context sensistive PGO (when		// at -Os/Oz. We will not do this inline for context sensistive PGO (when
// IsCS is true).		// IsCS is true).
if (!Level.isOptimizingForSize() && !IsCS) {		if (!Level.isOptimizingForSize() && !IsCS) {
InlineParams IP;		InlineParams IP;

IP.DefaultThreshold = PreInlineThreshold;		IP.DefaultThreshold = PreInlineThreshold;

// FIXME: The hint threshold has the same value used by the regular inliner.		// FIXME: The hint threshold has the same value used by the regular inliner.
// This should probably be lowered after performance testing.		// This should probably be lowered after performance testing.
// FIXME: this comment is cargo culted from the old pass manager, revisit).		// FIXME: this comment is cargo culted from the old pass manager, revisit).
IP.HintThreshold = 325;		IP.HintThreshold = 325;

CGSCCPassManager CGPipeline(DebugLogging);		CGSCCPassManager CGPipeline(DebugLogging);

CGPipeline.addPass(InlinerPass(IP));		CGPipeline.addPass(InlinerPass(IP, ExternalFunctionUseListsAreIncomplete));

FunctionPassManager FPM;		FunctionPassManager FPM;
FPM.addPass(SROA());		FPM.addPass(SROA());
FPM.addPass(EarlyCSEPass()); // Catch trivial redundancies.		FPM.addPass(EarlyCSEPass()); // Catch trivial redundancies.
FPM.addPass(SimplifyCFGPass()); // Merge & remove basic blocks.		FPM.addPass(SimplifyCFGPass()); // Merge & remove basic blocks.
FPM.addPass(InstCombinePass()); // Combine silly sequences.		FPM.addPass(InstCombinePass()); // Combine silly sequences.
invokePeepholeEPCallbacks(FPM, Level);		invokePeepholeEPCallbacks(FPM, Level);

▲ Show 20 Lines • Show All 59 Lines • ▼ Show 20 Lines	void PassBuilder::addPGOInstrPassesForO0(ModulePassManager &MPM,
MPM.addPass(InstrProfiling(Options, IsCS));		MPM.addPass(InstrProfiling(Options, IsCS));
}		}

static InlineParams		static InlineParams
getInlineParamsFromOptLevel(PassBuilder::OptimizationLevel Level) {		getInlineParamsFromOptLevel(PassBuilder::OptimizationLevel Level) {
return getInlineParams(Level.getSpeedupLevel(), Level.getSizeLevel());		return getInlineParams(Level.getSpeedupLevel(), Level.getSizeLevel());
}		}

ModulePassManager		static bool areFuncUseListsIncompleteDuring(PassBuilder::ThinLTOPhase Phase) {
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,		return Phase != PassBuilder::ThinLTOPhase::None;
ThinLTOPhase Phase,		}
bool DebugLogging) {
		ModulePassManager PassBuilder::buildModuleSimplificationPipeline(
		OptimizationLevel Level, bool DebugLogging, ThinLTOPhase Phase) {
ModulePassManager MPM(DebugLogging);		ModulePassManager MPM(DebugLogging);

bool HasSampleProfile = PGOOpt && (PGOOpt->Action == PGOOptions::SampleUse);		bool HasSampleProfile = PGOOpt && (PGOOpt->Action == PGOOptions::SampleUse);

// In ThinLTO mode, when flattened profile is used, all the available		// In ThinLTO mode, when flattened profile is used, all the available
// profile information will be annotated in PreLink phase so there is		// profile information will be annotated in PreLink phase so there is
// no need to load the profile again in PostLink.		// no need to load the profile again in PostLink.
bool LoadSampleProfile =		bool LoadSampleProfile =
▲ Show 20 Lines • Show All 96 Lines • ▼ Show 20 Lines	ModulePassManager PassBuilder::buildModuleSimplificationPipeline(

// Add all the requested passes for instrumentation PGO, if requested.		// Add all the requested passes for instrumentation PGO, if requested.
if (PGOOpt && Phase != ThinLTOPhase::PostLink &&		if (PGOOpt && Phase != ThinLTOPhase::PostLink &&
(PGOOpt->Action == PGOOptions::IRInstr ||		(PGOOpt->Action == PGOOptions::IRInstr ||
PGOOpt->Action == PGOOptions::IRUse)) {		PGOOpt->Action == PGOOptions::IRUse)) {
addPGOInstrPasses(MPM, DebugLogging, Level,		addPGOInstrPasses(MPM, DebugLogging, Level,
/* RunProfileGen */ PGOOpt->Action == PGOOptions::IRInstr,		/* RunProfileGen */ PGOOpt->Action == PGOOptions::IRInstr,
/* IsCS */ false, PGOOpt->ProfileFile,		/* IsCS */ false, PGOOpt->ProfileFile,
PGOOpt->ProfileRemappingFile);		PGOOpt->ProfileRemappingFile,
		areFuncUseListsIncompleteDuring(Phase));
MPM.addPass(PGOIndirectCallPromotion(false, false));		MPM.addPass(PGOIndirectCallPromotion(false, false));
}		}
if (PGOOpt && Phase != ThinLTOPhase::PostLink &&		if (PGOOpt && Phase != ThinLTOPhase::PostLink &&
PGOOpt->CSAction == PGOOptions::CSIRInstr)		PGOOpt->CSAction == PGOOptions::CSIRInstr)
MPM.addPass(PGOInstrumentationGenCreateVar(PGOOpt->CSProfileGenFile));		MPM.addPass(PGOInstrumentationGenCreateVar(PGOOpt->CSProfileGenFile));

// Synthesize function entry counts for non-PGO compilation.		// Synthesize function entry counts for non-PGO compilation.
if (EnableSyntheticCounts && !PGOOpt)		if (EnableSyntheticCounts && !PGOOpt)
Show All 22 Lines	ModulePassManager PassBuilder::buildModuleSimplificationPipeline(
// the callees have already been fully optimized, and we want to inline them		// the callees have already been fully optimized, and we want to inline them
// into the callers so that our optimizations can reflect that.		// into the callers so that our optimizations can reflect that.
// For PreLinkThinLTO pass, we disable hot-caller heuristic for sample PGO		// For PreLinkThinLTO pass, we disable hot-caller heuristic for sample PGO
// because it makes profile annotation in the backend inaccurate.		// because it makes profile annotation in the backend inaccurate.
InlineParams IP = getInlineParamsFromOptLevel(Level);		InlineParams IP = getInlineParamsFromOptLevel(Level);
if (Phase == ThinLTOPhase::PreLink && PGOOpt &&		if (Phase == ThinLTOPhase::PreLink && PGOOpt &&
PGOOpt->Action == PGOOptions::SampleUse)		PGOOpt->Action == PGOOptions::SampleUse)
IP.HotCallSiteThreshold = 0;		IP.HotCallSiteThreshold = 0;
MainCGPipeline.addPass(InlinerPass(IP));		MainCGPipeline.addPass(
		InlinerPass(IP, areFuncUseListsIncompleteDuring(Phase)));

// Now deduce any function attributes based in the current code.		// Now deduce any function attributes based in the current code.
MainCGPipeline.addPass(PostOrderFunctionAttrsPass());		MainCGPipeline.addPass(PostOrderFunctionAttrsPass());

// When at O3 add argument promotion to the pass pipeline.		// When at O3 add argument promotion to the pass pipeline.
// FIXME: It isn't at all clear why this should be limited to O3.		// FIXME: It isn't at all clear why this should be limited to O3.
if (Level == OptimizationLevel::O3)		if (Level == OptimizationLevel::O3)
MainCGPipeline.addPass(ArgumentPromotionPass());		MainCGPipeline.addPass(ArgumentPromotionPass());
Show All 14 Lines	ModulePassManager PassBuilder::buildModuleSimplificationPipeline(
MPM.addPass(		MPM.addPass(
createModuleToPostOrderCGSCCPassAdaptor(createDevirtSCCRepeatedPass(		createModuleToPostOrderCGSCCPassAdaptor(createDevirtSCCRepeatedPass(
std::move(MainCGPipeline), MaxDevirtIterations)));		std::move(MainCGPipeline), MaxDevirtIterations)));

return MPM;		return MPM;
}		}

ModulePassManager PassBuilder::buildModuleOptimizationPipeline(		ModulePassManager PassBuilder::buildModuleOptimizationPipeline(
OptimizationLevel Level, bool DebugLogging, bool LTOPreLink) {		OptimizationLevel Level, bool DebugLogging, ThinLTOPhase Phase) {
ModulePassManager MPM(DebugLogging);		ModulePassManager MPM(DebugLogging);

// Optimize globals now that the module is fully simplified.		// Optimize globals now that the module is fully simplified.
MPM.addPass(GlobalOptPass());		MPM.addPass(GlobalOptPass());
MPM.addPass(GlobalDCEPass());		MPM.addPass(GlobalDCEPass());

// Run partial inlining pass to partially inline functions that have		// Run partial inlining pass to partially inline functions that have
// large bodies.		// large bodies.
if (RunPartialInlining)		if (RunPartialInlining)
MPM.addPass(PartialInlinerPass());		MPM.addPass(PartialInlinerPass());

// Remove avail extern fns and globals definitions since we aren't compiling		// Remove avail extern fns and globals definitions since we aren't compiling
// an object file for later LTO. For LTO we want to preserve these so they		// an object file for later LTO. For LTO we want to preserve these so they
// are eligible for inlining at link-time. Note if they are unreferenced they		// are eligible for inlining at link-time. Note if they are unreferenced they
// will be removed by GlobalDCE later, so this only impacts referenced		// will be removed by GlobalDCE later, so this only impacts referenced
// available externally globals. Eventually they will be suppressed during		// available externally globals. Eventually they will be suppressed during
// codegen, but eliminating here enables more opportunity for GlobalDCE as it		// codegen, but eliminating here enables more opportunity for GlobalDCE as it
// may make globals referenced by available external functions dead and saves		// may make globals referenced by available external functions dead and saves
// running remaining passes on the eliminated functions. These should be		// running remaining passes on the eliminated functions. These should be
// preserved during prelinking for link-time inlining decisions.		// preserved during prelinking for link-time inlining decisions.
if (!LTOPreLink)		if (Phase != ThinLTOPhase::PreLink)
MPM.addPass(EliminateAvailableExternallyPass());		MPM.addPass(EliminateAvailableExternallyPass());

if (EnableOrderFileInstrumentation)		if (EnableOrderFileInstrumentation)
MPM.addPass(InstrOrderFilePass());		MPM.addPass(InstrOrderFilePass());

// Do RPO function attribute inference across the module to forward-propagate		// Do RPO function attribute inference across the module to forward-propagate
// attributes where applicable.		// attributes where applicable.
// FIXME: Is this really an optimization rather than a canonicalization?		// FIXME: Is this really an optimization rather than a canonicalization?
MPM.addPass(ReversePostOrderFunctionAttrsPass());		MPM.addPass(ReversePostOrderFunctionAttrsPass());

// Do a post inline PGO instrumentation and use pass. This is a context		// Do a post inline PGO instrumentation and use pass. This is a context
// sensitive PGO pass. We don't want to do this in LTOPreLink phrase as		// sensitive PGO pass. We don't want to do this in LTOPreLink phrase as
// cross-module inline has not been done yet. The context sensitive		// cross-module inline has not been done yet. The context sensitive
// instrumentation is after all the inlines are done.		// instrumentation is after all the inlines are done.
if (!LTOPreLink && PGOOpt) {		if (Phase != ThinLTOPhase::PreLink && PGOOpt) {
if (PGOOpt->CSAction == PGOOptions::CSIRInstr)		if (PGOOpt->CSAction == PGOOptions::CSIRInstr)
addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ true,		addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ true,
/* IsCS */ true, PGOOpt->CSProfileGenFile,		/* IsCS */ true, PGOOpt->CSProfileGenFile,
PGOOpt->ProfileRemappingFile);		PGOOpt->ProfileRemappingFile,
		areFuncUseListsIncompleteDuring(Phase));
else if (PGOOpt->CSAction == PGOOptions::CSIRUse)		else if (PGOOpt->CSAction == PGOOptions::CSIRUse)
addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ false,		addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ false,
/* IsCS */ true, PGOOpt->ProfileFile,		/* IsCS */ true, PGOOpt->ProfileFile,
PGOOpt->ProfileRemappingFile);		PGOOpt->ProfileRemappingFile,
		areFuncUseListsIncompleteDuring(Phase));
}		}

// Re-require GloblasAA here prior to function passes. This is particularly		// Re-require GloblasAA here prior to function passes. This is particularly
// useful as the above will have inlined, DCE'ed, and function-attr		// useful as the above will have inlined, DCE'ed, and function-attr
// propagated everything. We should at this point have a reasonably minimal		// propagated everything. We should at this point have a reasonably minimal
// and richly annotated call graph. By computing aliasing and mod/ref		// and richly annotated call graph. By computing aliasing and mod/ref
// information for all local globals here, the late loop passes and notably		// information for all local globals here, the late loop passes and notably
// the vectorizer will be able to use them to help recognize vectorizable		// the vectorizer will be able to use them to help recognize vectorizable
▲ Show 20 Lines • Show All 78 Lines • ▼ Show 20 Lines	ModulePassManager PassBuilder::buildModuleOptimizationPipeline(

// Now that we've vectorized and unrolled loops, we may have more refined		// Now that we've vectorized and unrolled loops, we may have more refined
// alignment information, try to re-derive it here.		// alignment information, try to re-derive it here.
OptimizePM.addPass(AlignmentFromAssumptionsPass());		OptimizePM.addPass(AlignmentFromAssumptionsPass());

// Split out cold code. Splitting is done late to avoid hiding context from		// Split out cold code. Splitting is done late to avoid hiding context from
// other optimizations and inadvertently regressing performance. The tradeoff		// other optimizations and inadvertently regressing performance. The tradeoff
// is that this has a higher code size cost than splitting early.		// is that this has a higher code size cost than splitting early.
if (EnableHotColdSplit && !LTOPreLink)		if (EnableHotColdSplit && Phase != ThinLTOPhase::PreLink)
MPM.addPass(HotColdSplittingPass());		MPM.addPass(HotColdSplittingPass());

// LoopSink pass sinks instructions hoisted by LICM, which serves as a		// LoopSink pass sinks instructions hoisted by LICM, which serves as a
// canonicalization pass that enables other optimizations. As a result,		// canonicalization pass that enables other optimizations. As a result,
// LoopSink pass needs to be a very late IR pass to avoid undoing LICM		// LoopSink pass needs to be a very late IR pass to avoid undoing LICM
// result too early.		// result too early.
OptimizePM.addPass(LoopSinkPass());		OptimizePM.addPass(LoopSinkPass());

Show All 27 Lines	ModulePassManager PassBuilder::buildModuleOptimizationPipeline(
// pipeline and maybe be the bottom of the canonicalization pipeline? Weird		// pipeline and maybe be the bottom of the canonicalization pipeline? Weird
// ordering here.		// ordering here.
MPM.addPass(GlobalDCEPass());		MPM.addPass(GlobalDCEPass());
MPM.addPass(ConstantMergePass());		MPM.addPass(ConstantMergePass());

return MPM;		return MPM;
}		}

ModulePassManager		ModulePassManager PassBuilder::buildPerModuleDefaultPipeline(
PassBuilder::buildPerModuleDefaultPipeline(OptimizationLevel Level,		OptimizationLevel Level, bool DebugLogging, ThinLTOPhase Phase) {
bool DebugLogging, bool LTOPreLink) {
assert(Level != OptimizationLevel::O0 &&		assert(Level != OptimizationLevel::O0 &&
"Must request optimizations for the default pipeline!");		"Must request optimizations for the default pipeline!");

ModulePassManager MPM(DebugLogging);		ModulePassManager MPM(DebugLogging);

// Force any function attributes we want the rest of the pipeline to observe.		// Force any function attributes we want the rest of the pipeline to observe.
MPM.addPass(ForceFunctionAttrsPass());		MPM.addPass(ForceFunctionAttrsPass());

// Apply module pipeline start EP callback.		// Apply module pipeline start EP callback.
for (auto &C : PipelineStartEPCallbacks)		for (auto &C : PipelineStartEPCallbacks)
C(MPM);		C(MPM);

if (PGOOpt && PGOOpt->SamplePGOSupport)		if (PGOOpt && PGOOpt->SamplePGOSupport)
MPM.addPass(createModuleToFunctionPassAdaptor(AddDiscriminatorsPass()));		MPM.addPass(createModuleToFunctionPassAdaptor(AddDiscriminatorsPass()));

// Add the core simplification pipeline.		// Add the core simplification pipeline.
MPM.addPass(buildModuleSimplificationPipeline(Level, ThinLTOPhase::None,		MPM.addPass(buildModuleSimplificationPipeline(Level, DebugLogging, Phase));
DebugLogging));

// Now add the optimization pipeline.		// Now add the optimization pipeline.
MPM.addPass(buildModuleOptimizationPipeline(Level, DebugLogging, LTOPreLink));		MPM.addPass(buildModuleOptimizationPipeline(Level, DebugLogging, Phase));

return MPM;		return MPM;
}		}

ModulePassManager		ModulePassManager
PassBuilder::buildThinLTOPreLinkDefaultPipeline(OptimizationLevel Level,		PassBuilder::buildThinLTOPreLinkDefaultPipeline(OptimizationLevel Level,
bool DebugLogging) {		bool DebugLogging) {
assert(Level != OptimizationLevel::O0 &&		assert(Level != OptimizationLevel::O0 &&
Show All 9 Lines	PassBuilder::buildThinLTOPreLinkDefaultPipeline(OptimizationLevel Level,

// Apply module pipeline start EP callback.		// Apply module pipeline start EP callback.
for (auto &C : PipelineStartEPCallbacks)		for (auto &C : PipelineStartEPCallbacks)
C(MPM);		C(MPM);

// If we are planning to perform ThinLTO later, we don't bloat the code with		// If we are planning to perform ThinLTO later, we don't bloat the code with
// unrolling/vectorization/... now. Just simplify the module as much as we		// unrolling/vectorization/... now. Just simplify the module as much as we
// can.		// can.
MPM.addPass(buildModuleSimplificationPipeline(Level, ThinLTOPhase::PreLink,		MPM.addPass(buildModuleSimplificationPipeline(Level, DebugLogging,
DebugLogging));		ThinLTOPhase::PreLink));

// Run partial inlining pass to partially inline functions that have		// Run partial inlining pass to partially inline functions that have
// large bodies.		// large bodies.
// FIXME: It isn't clear whether this is really the right place to run this		// FIXME: It isn't clear whether this is really the right place to run this
// in ThinLTO. Because there is another canonicalization and simplification		// in ThinLTO. Because there is another canonicalization and simplification
// phase that will run after the thin link, running this here ends up with		// phase that will run after the thin link, running this here ends up with
// less information than will be available later and it may grow functions in		// less information than will be available later and it may grow functions in
// ways that aren't beneficial.		// ways that aren't beneficial.
Show All 33 Lines	ModulePassManager PassBuilder::buildThinLTODefaultPipeline(

if (Level == OptimizationLevel::O0)		if (Level == OptimizationLevel::O0)
return MPM;		return MPM;

// Force any function attributes we want the rest of the pipeline to observe.		// Force any function attributes we want the rest of the pipeline to observe.
MPM.addPass(ForceFunctionAttrsPass());		MPM.addPass(ForceFunctionAttrsPass());

// Add the core simplification pipeline.		// Add the core simplification pipeline.
MPM.addPass(buildModuleSimplificationPipeline(Level, ThinLTOPhase::PostLink,		MPM.addPass(buildModuleSimplificationPipeline(Level, DebugLogging,
DebugLogging));		ThinLTOPhase::PostLink));

// Now add the optimization pipeline.		// Now add the optimization pipeline.
MPM.addPass(buildModuleOptimizationPipeline(Level, DebugLogging));		MPM.addPass(buildModuleOptimizationPipeline(Level, DebugLogging,
		ThinLTOPhase::PostLink));

return MPM;		return MPM;
}		}

ModulePassManager		ModulePassManager
PassBuilder::buildLTOPreLinkDefaultPipeline(OptimizationLevel Level,		PassBuilder::buildLTOPreLinkDefaultPipeline(OptimizationLevel Level,
bool DebugLogging) {		bool DebugLogging) {
assert(Level != OptimizationLevel::O0 &&		assert(Level != OptimizationLevel::O0 &&
"Must request optimizations for the default pipeline!");		"Must request optimizations for the default pipeline!");
// FIXME: We should use a customized pre-link pipeline!		// FIXME: We should use a customized pre-link pipeline!
return buildPerModuleDefaultPipeline(Level, DebugLogging,		return buildPerModuleDefaultPipeline(Level, DebugLogging,
/* LTOPreLink */ true);		ThinLTOPhase::PreLink);
}		}

ModulePassManager		ModulePassManager
PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,		PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,
ModuleSummaryIndex *ExportSummary) {		ModuleSummaryIndex *ExportSummary) {
ModulePassManager MPM(DebugLogging);		ModulePassManager MPM(DebugLogging);

if (Level == OptimizationLevel::O0) {		if (Level == OptimizationLevel::O0) {
▲ Show 20 Lines • Show All 118 Lines • ▼ Show 20 Lines	PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,
FPM.addPass(JumpThreadingPass());		FPM.addPass(JumpThreadingPass());

// Do a post inline PGO instrumentation and use pass. This is a context		// Do a post inline PGO instrumentation and use pass. This is a context
// sensitive PGO pass.		// sensitive PGO pass.
if (PGOOpt) {		if (PGOOpt) {
if (PGOOpt->CSAction == PGOOptions::CSIRInstr)		if (PGOOpt->CSAction == PGOOptions::CSIRInstr)
addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ true,		addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ true,
/* IsCS */ true, PGOOpt->CSProfileGenFile,		/* IsCS */ true, PGOOpt->CSProfileGenFile,
PGOOpt->ProfileRemappingFile);		PGOOpt->ProfileRemappingFile,
		/*ExternalFunctionUseListsAreIncomplete=*/false);
else if (PGOOpt->CSAction == PGOOptions::CSIRUse)		else if (PGOOpt->CSAction == PGOOptions::CSIRUse)
addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ false,		addPGOInstrPasses(MPM, DebugLogging, Level, /* RunProfileGen */ false,
/* IsCS */ true, PGOOpt->ProfileFile,		/* IsCS */ true, PGOOpt->ProfileFile,
PGOOpt->ProfileRemappingFile);		PGOOpt->ProfileRemappingFile,
		/*ExternalFunctionUseListsAreIncomplete=*/false);
}		}

// Break up allocas		// Break up allocas
FPM.addPass(SROA());		FPM.addPass(SROA());

// LTO provides additional opportunities for tailcall elimination due to		// LTO provides additional opportunities for tailcall elimination due to
// link-time inlining, and visibility of nocapture attribute.		// link-time inlining, and visibility of nocapture attribute.
FPM.addPass(TailCallElimPass());		FPM.addPass(TailCallElimPass());
▲ Show 20 Lines • Show All 1,134 Lines • Show Last 20 Lines
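
The PassBuilder.cpp changes above and the InlineSimple.cpp changes below feed the same decision: whether the caller's own inlinability should influence the callee's cost. A minimal, self-contained sketch of that decision follows. `ThinLTOPhase` and `CallerInfo` are hypothetical stand-ins for `PassBuilder::ThinLTOPhase` and the two `llvm::Function` queries the patch makes, and the function names are illustrative only.

```cpp
// Hypothetical stand-in for PassBuilder::ThinLTOPhase.
enum class ThinLTOPhase { None, PreLink, PostLink };

// Same rule as areFuncUseListsIncompleteDuring in the diff: outside a
// plain (non-LTO) build, new uses of external functions may still appear.
bool useListsIncompleteDuring(ThinLTOPhase Phase) {
  return Phase != ThinLTOPhase::None;
}

// Hypothetical stand-in for the caller properties the patch inspects.
struct CallerInfo {
  bool HasUses;    // models !Caller.use_empty()
  bool IsNoInline; // models Caller.hasFnAttribute(Attribute::NoInline)
};

// Mirrors the CallerMayBeInlined computation in the patch: the caller's
// inlinability matters only if it may have uses and is not noinline.
bool callerInlinabilityMatters(const CallerInfo &Caller,
                               bool ExternalFunctionUseListsAreIncomplete) {
  bool CallerMayHaveUses =
      ExternalFunctionUseListsAreIncomplete || Caller.HasUses;
  return CallerMayHaveUses && !Caller.IsNoInline;
}
```

With complete use lists, a caller that has no uses yields `false`, which is the case where the summary suggests bonuses such as LastCallToStaticBonus need not be suppressed.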

llvm/lib/Transforms/IPO/InlineSimple.cpp

Show All 39 Lines	class SimpleInliner : public LegacyInlinerBase {

InlineParams Params;		InlineParams Params;

public:		public:
SimpleInliner() : LegacyInlinerBase(ID), Params(llvm::getInlineParams()) {		SimpleInliner() : LegacyInlinerBase(ID), Params(llvm::getInlineParams()) {
initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());		initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());
}		}

explicit SimpleInliner(InlineParams Params)		explicit SimpleInliner(InlineParams Params,
: LegacyInlinerBase(ID), Params(std::move(Params)) {		bool ExternalFunctionUseListsAreIncomplete)
		: LegacyInlinerBase(ID), Params(std::move(Params)),
		ExternalFunctionUseListsAreIncomplete(
		ExternalFunctionUseListsAreIncomplete) {
initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());		initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());
}		}

static char ID; // Pass identification, replacement for typeid		static char ID; // Pass identification, replacement for typeid

InlineCost getInlineCost(CallSite CS) override {		InlineCost getInlineCost(CallSite CS) override {
Function *Callee = CS.getCalledFunction();		Function *Callee = CS.getCalledFunction();
TargetTransformInfo &TTI = TTIWP->getTTI(*Callee);		TargetTransformInfo &TTI = TTIWP->getTTI(*Callee);

bool RemarksEnabled = false;		bool RemarksEnabled = false;
const auto &BBs = CS.getCaller()->getBasicBlockList();		const auto &BBs = CS.getCaller()->getBasicBlockList();
if (!BBs.empty()) {		if (!BBs.empty()) {
auto DI = OptimizationRemark(DEBUG_TYPE, "", DebugLoc(), &BBs.front());		auto DI = OptimizationRemark(DEBUG_TYPE, "", DebugLoc(), &BBs.front());
if (DI.isEnabled())		if (DI.isEnabled())
RemarksEnabled = true;		RemarksEnabled = true;
}		}
OptimizationRemarkEmitter ORE(CS.getCaller());		OptimizationRemarkEmitter ORE(CS.getCaller());

std::function<AssumptionCache &(Function &)> GetAssumptionCache =		std::function<AssumptionCache &(Function &)> GetAssumptionCache =
[&](Function &F) -> AssumptionCache & {		[&](Function &F) -> AssumptionCache & {
return ACT->getAssumptionCache(F);		return ACT->getAssumptionCache(F);
};		};

		Function &Caller = *CS.getCaller();
		bool CallerMayHaveUses =
		ExternalFunctionUseListsAreIncomplete || !Caller.use_empty();
		bool CallerMayBeInlined =
		CallerMayHaveUses && !Caller.hasFnAttribute(Attribute::NoInline);

return llvm::getInlineCost(		return llvm::getInlineCost(
cast<CallBase>(*CS.getInstruction()), Params, TTI, GetAssumptionCache,		cast<CallBase>(*CS.getInstruction()), Params, TTI, GetAssumptionCache,
/*GetBFI=*/None, PSI, RemarksEnabled ? &ORE : nullptr);		/*GetBFI=*/None, PSI, RemarksEnabled ? &ORE : nullptr,
		/*CallerInlineabilityMatters=*/CallerMayBeInlined);
}		}

bool runOnSCC(CallGraphSCC &SCC) override;		bool runOnSCC(CallGraphSCC &SCC) override;
void getAnalysisUsage(AnalysisUsage &AU) const override;		void getAnalysisUsage(AnalysisUsage &AU) const override;

private:		private:
TargetTransformInfoWrapperPass *TTIWP;		TargetTransformInfoWrapperPass *TTIWP;
		// Assume that new references can't appear on otherwise-unreferenced `extern`
		// functions. This isn't the case in e.g., ThinLTO.
		bool ExternalFunctionUseListsAreIncomplete = true;
};		};

} // end anonymous namespace		} // end anonymous namespace

char SimpleInliner::ID = 0;		char SimpleInliner::ID = 0;
INITIALIZE_PASS_BEGIN(SimpleInliner, "inline", "Function Integration/Inlining",		INITIALIZE_PASS_BEGIN(SimpleInliner, "inline", "Function Integration/Inlining",
false, false)		false, false)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)		INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ProfileSummaryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(ProfileSummaryInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(SimpleInliner, "inline", "Function Integration/Inlining",		INITIALIZE_PASS_END(SimpleInliner, "inline", "Function Integration/Inlining",
false, false)		false, false)

Pass *llvm::createFunctionInliningPass() { return new SimpleInliner(); }		Pass *llvm::createFunctionInliningPass() { return new SimpleInliner(); }

Pass *llvm::createFunctionInliningPass(int Threshold) {		Pass *
return new SimpleInliner(llvm::getInlineParams(Threshold));		llvm::createFunctionInliningPass(int Threshold,
		bool ExternalFunctionUseListsAreIncomplete) {
		return new SimpleInliner(llvm::getInlineParams(Threshold),
		ExternalFunctionUseListsAreIncomplete);
}		}

Pass *llvm::createFunctionInliningPass(unsigned OptLevel,		Pass *
unsigned SizeOptLevel,		llvm::createFunctionInliningPass(unsigned OptLevel, unsigned SizeOptLevel,
bool DisableInlineHotCallSite) {		bool DisableInlineHotCallSite,
		bool ExternalFunctionUseListsAreIncomplete) {
auto Param = llvm::getInlineParams(OptLevel, SizeOptLevel);		auto Param = llvm::getInlineParams(OptLevel, SizeOptLevel);
if (DisableInlineHotCallSite)		if (DisableInlineHotCallSite)
Param.HotCallSiteThreshold = 0;		Param.HotCallSiteThreshold = 0;
return new SimpleInliner(Param);		return new SimpleInliner(Param, ExternalFunctionUseListsAreIncomplete);
}		}

Pass *llvm::createFunctionInliningPass(InlineParams &Params) {		Pass *
return new SimpleInliner(Params);		llvm::createFunctionInliningPass(InlineParams &Params,
		bool ExternalFunctionUseListsAreIncomplete) {
		return new SimpleInliner(Params, ExternalFunctionUseListsAreIncomplete);
}		}

bool SimpleInliner::runOnSCC(CallGraphSCC &SCC) {		bool SimpleInliner::runOnSCC(CallGraphSCC &SCC) {
TTIWP = &getAnalysis<TargetTransformInfoWrapperPass>();		TTIWP = &getAnalysis<TargetTransformInfoWrapperPass>();
return LegacyInlinerBase::runOnSCC(SCC);		return LegacyInlinerBase::runOnSCC(SCC);
}		}

void SimpleInliner::getAnalysisUsage(AnalysisUsage &AU) const {		void SimpleInliner::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<TargetTransformInfoWrapperPass>();		AU.addRequired<TargetTransformInfoWrapperPass>();
LegacyInlinerBase::getAnalysisUsage(AU);		LegacyInlinerBase::getAnalysisUsage(AU);
}		}
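
The caller-inlinability check that this file's `getInlineCost` gains can be looked at in isolation. The following is a minimal sketch, not the patch's code: `CallerInfo` and `callerMayBeInlined` are hypothetical stand-ins for `llvm::Function` and the two locals `CallerMayHaveUses` / `CallerMayBeInlined`, so that the truth table can be checked without LLVM headers.

```cpp
// Hypothetical stand-in for the two properties of llvm::Function that the
// patch queries; this is not an LLVM type.
struct CallerInfo {
  bool HasUses;    // models !Caller.use_empty()
  bool IsNoInline; // models Caller.hasFnAttribute(Attribute::NoInline)
};

// Mirrors the predicate in the diff: the caller's own inlinability only
// matters if the caller might have users (known now, or discovered later when
// use lists are incomplete) and it is not marked noinline.
inline bool callerMayBeInlined(const CallerInfo &Caller,
                               bool ExternalFunctionUseListsAreIncomplete) {
  const bool CallerMayHaveUses =
      ExternalFunctionUseListsAreIncomplete || Caller.HasUses;
  return CallerMayHaveUses && !Caller.IsNoInline;
}
```

Under this model, an unreferenced caller in a build with complete use lists yields `false`, which is the case where the patch stops holding back bonuses on behalf of the caller. A `noinline` caller yields `false` regardless of the flag.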

llvm/lib/Transforms/IPO/Inliner.cpp

for (int i = 0; i < (int)Calls.size(); ++i) {
};		};

auto GetInlineCost = [&](CallSite CS) {		auto GetInlineCost = [&](CallSite CS) {
Function &Callee = *CS.getCalledFunction();		Function &Callee = *CS.getCalledFunction();
auto &CalleeTTI = FAM.getResult<TargetIRAnalysis>(Callee);		auto &CalleeTTI = FAM.getResult<TargetIRAnalysis>(Callee);
bool RemarksEnabled =		bool RemarksEnabled =
Callee.getContext().getDiagHandlerPtr()->isMissedOptRemarkEnabled(		Callee.getContext().getDiagHandlerPtr()->isMissedOptRemarkEnabled(
DEBUG_TYPE);		DEBUG_TYPE);

		Function &Caller = *CS.getCaller();
		bool CallerMayHaveUses =
		ExternalFunctionUseListsAreIncomplete || !Caller.use_empty();
		bool CallerMayBeInlined =
		CallerMayHaveUses && !Caller.hasFnAttribute(Attribute::NoInline);

return getInlineCost(cast<CallBase>(*CS.getInstruction()), Params,		return getInlineCost(cast<CallBase>(*CS.getInstruction()), Params,
CalleeTTI, GetAssumptionCache, {GetBFI}, PSI,		CalleeTTI, GetAssumptionCache, {GetBFI}, PSI,
RemarksEnabled ? &ORE : nullptr);		RemarksEnabled ? &ORE : nullptr,
		/*CallerInlineabilityMatters=*/CallerMayBeInlined);
};		};

// Now process as many calls as we have within this caller in the sequnece.		// Now process as many calls as we have within this caller in the sequnece.
// We bail out as soon as the caller has to change so we can update the		// We bail out as soon as the caller has to change so we can update the
// call graph and prepare the context of that new caller.		// call graph and prepare the context of that new caller.
bool DidInline = false;		bool DidInline = false;
for (; i < (int)Calls.size() && Calls[i].first.getCaller() == &F; ++i) {		for (; i < (int)Calls.size() && Calls[i].first.getCaller() == &F; ++i) {
int InlineHistoryID;		int InlineHistoryID;

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

if (OptLevel > 0 && SizeLevel == 0 && !DisablePreInliner &&
// inliner to influence pre-inlining. The only fields of InlineParams we		// inliner to influence pre-inlining. The only fields of InlineParams we
// care about are DefaultThreshold and HintThreshold.		// care about are DefaultThreshold and HintThreshold.
InlineParams IP;		InlineParams IP;
IP.DefaultThreshold = PreInlineThreshold;		IP.DefaultThreshold = PreInlineThreshold;
// FIXME: The hint threshold has the same value used by the regular inliner.		// FIXME: The hint threshold has the same value used by the regular inliner.
// This should probably be lowered after performance testing.		// This should probably be lowered after performance testing.
IP.HintThreshold = 325;		IP.HintThreshold = 325;

MPM.add(createFunctionInliningPass(IP));		MPM.add(createFunctionInliningPass(
		IP,
		/*ExternalFunctionUseListsAreIncomplete=*/PrepareForLTO ||
		PrepareForThinLTO || PerformThinLTO));
MPM.add(createSROAPass());		MPM.add(createSROAPass());
MPM.add(createEarlyCSEPass()); // Catch trivial redundancies		MPM.add(createEarlyCSEPass()); // Catch trivial redundancies
MPM.add(createCFGSimplificationPass()); // Merge & remove BBs		MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
MPM.add(createInstructionCombiningPass()); // Combine silly seq's		MPM.add(createInstructionCombiningPass()); // Combine silly seq's
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);
}		}
if ((EnablePGOInstrGen && !IsCS) || (EnablePGOCSInstrGen && IsCS)) {		if ((EnablePGOInstrGen && !IsCS) || (EnablePGOCSInstrGen && IsCS)) {
MPM.add(createPGOInstrumentationGenLegacyPass(IsCS));		MPM.add(createPGOInstrumentationGenLegacyPass(IsCS));

llvm/test/Transforms/Inline/bpi-cold-inlining.ll

This file was added.

				; RUN: opt < %s -passes=inline -inline-cold-callsite-threshold=0 -S | FileCheck %s

				declare void @foo(i32)

				@a = external global i1

				define internal void @callee1() {
				call void @foo(i32 1)
				ret void
				}

				; CHECK-LABEL: define void @not_inlined_if_cold
				define void @not_inlined_if_cold() {
				entry:
				%a = load i1, i1* @a
				br i1 %a, label %if.then, label %if.end, !prof !0

				if.then:
				; CHECK: call void @callee1()
				call void @callee1()
				br label %if.end

				if.end:
				; CHECK: ret void
				ret void
				}

				@not_inlined_if_cold_addr = global void ()* @not_inlined_if_cold

				define internal void @callee2() {
				call void @foo(i32 2)
				ret void
				}

				; CHECK-LABEL: define void @gets_inlined_noinline_and_cold
				define void @gets_inlined_noinline_and_cold() #0 {
				entry:
				%a = load i1, i1* @a
				br i1 %a, label %if.then, label %if.end, !prof !0

				if.then:
				; CHECK: call void @foo(i32 2)
				call void @callee2()
				br label %if.end

				if.end:
				; CHECK: ret void
				ret void
				}

				@gets_inlined_noinline_and_cold_addr = global void ()* @gets_inlined_noinline_and_cold

				define internal void @callee3() {
				call void @foo(i32 3)
				ret void
				}

				; CHECK-LABEL: define void @gets_inlined_no_uses_and_cold
				define void @gets_inlined_no_uses_and_cold() {
				entry:
				%a = load i1, i1* @a
				br i1 %a, label %if.then, label %if.end, !prof !0

				if.then:
				; CHECK: call void @foo(i32 3)
				call void @callee3()
				br label %if.end

				if.end:
				; CHECK: ret void
				ret void
				}

				attributes #0 = { noinline }

				!0 = !{!"branch_weights", i32 1, i32 2000}
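
For readers unfamiliar with `!prof` metadata: the two integers in `branch_weights` correspond, in order, to the branch's successors, so `i32 1, i32 2000` weights `if.then` at 1 and `if.end` at 2000. The sketch below works out the implied probability of reaching each call site. `takenProbability` is a hypothetical helper written for this note, not an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Probability of taking the first successor of a two-way branch, given the
// pair of weights from its branch_weights metadata. Hypothetical helper for
// illustration only.
inline double takenProbability(std::uint64_t FirstWeight,
                               std::uint64_t SecondWeight) {
  const std::uint64_t Total = FirstWeight + SecondWeight;
  assert(Total > 0 && "at least one weight must be nonzero");
  return static_cast<double>(FirstWeight) / static_cast<double>(Total);
}
```

With the weights used in this test, `takenProbability(1, 2000)` is 1/2001, i.e. about 0.05%, which is why each `if.then` block is treated as a cold call site.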

Revision Contents

Diff 243060