This is an archive of the discontinued LLVM Phabricator instance.

[Inliner,OptDiag] Add hotness attribute to opt diagnostics
ClosedPublic

Authored by anemet on Jul 22 2016, 11:58 AM.

Details

Summary

Because the inliner is not a function pass, it requires the work-around of
generating the OptimizationRemarkEmitter, and in turn BFI, on demand.
This will go away once the new pass manager is ready.

BFI is only computed inside ORE if the user has requested hotness information for optimization diagnostics (-pass-remarks-with-hotness at the 'opt' level). Thus there is no additional overhead without the flag.
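The on-demand scheme described in the summary can be sketched in plain C++. This is an illustrative pattern only: `RemarkEmitter` and `BlockFreqInfo` below are hypothetical stand-ins, not the actual LLVM `OptimizationRemarkEmitter` and `BlockFrequencyInfo` classes.

```cpp
#include <memory>

// Hypothetical stand-in for the expensive BFI analysis.
struct BlockFreqInfo {
  int ComputeCount = 0;
};

// Hypothetical stand-in for the emitter: BFI is built lazily, and only
// when hotness output was requested, so the flag-off path stays free.
class RemarkEmitter {
  bool HotnessRequested;
  std::unique_ptr<BlockFreqInfo> BFI;

public:
  explicit RemarkEmitter(bool Hotness) : HotnessRequested(Hotness) {}

  // Returns the analysis, computing it on first use; returns nullptr
  // (doing no work at all) when hotness was not requested.
  BlockFreqInfo *getBFI() {
    if (!HotnessRequested)
      return nullptr;
    if (!BFI) {
      BFI = std::make_unique<BlockFreqInfo>();
      ++BFI->ComputeCount; // the expensive computation would happen here
    }
    return BFI.get();
  }
};
```

The key property is that a compilation without the flag never touches the analysis at all, which is why the patch claims zero overhead in the default configuration.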

Diff Detail

Repository
rL LLVM

Event Timeline

anemet updated this revision to Diff 65120.Jul 22 2016, 11:58 AM
anemet retitled this revision from to [Inliner,OptDiag] Add hotness attribute to opt diagnostics.
anemet updated this object.
anemet added reviewers: hfinkel, davidxl.
anemet added a subscriber: llvm-commits.
anemet updated this object.Jul 22 2016, 12:15 PM
anemet updated this revision to Diff 65185.Jul 22 2016, 4:06 PM

Rebase on top of r276488.

eraman edited edge metadata.Jul 28 2016, 2:59 PM

This computes DominatorTree, LoopInfo, BPI and BFI for *every callsite* that is considered for inlining. Even though this is (I assume) turned off by default, it's still very expensive. Even in the new PM, this can be avoided only by incrementally updating the BFI of the caller after inlining a callee into it.

This computes DominatorTree, LoopInfo, BPI and BFI for *every callsite* that is considered for inlining. Even though this is (I assume) turned off by default, it's still very expensive.

Certainly it is off by default, and it's meant as a performance-analysis tool, so the budget is different. That said, we could, I guess, cache ORE per caller and only invalidate it once we inline. Does that sound reasonable to you?

Even in the new PM, this can be avoided only by incrementally updating the BFI of the caller after inlining a callee into it.

Sure but I thought that was required for PGO-based inlining anyway. I was hoping to piggyback on that.

This computes DominatorTree, LoopInfo, BPI and BFI for *every callsite* that is considered for inlining. Even though this is (I assume) turned off by default, it's still very expensive.

Certainly it is off by default, and it's meant as a performance-analysis tool, so the budget is different.

Yes, I understand that this is a reporting tool, so compilation time is not a major constraint. I was wondering whether this makes it practical for use in real applications. Perhaps it is. Have you run this on anything large (SPEC, for example) and measured the overhead? I don't have any objections to this patch if you have measured the overhead and found it reasonable.

That said, we could, I guess, cache ORE per caller and only invalidate it once we inline. Does that sound reasonable to you?

That'll certainly reduce the overhead in cases where inlining is considered but does not happen. I am not sure it is worth the additional complexity, though. If you go this route, please isolate the changes within ORE and invalidate this in emitOptimizationRemark.
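A minimal sketch of that per-caller caching idea, in plain C++ with hypothetical types (`Emitter` and `EmitterCache` are illustrative names, not LLVM interfaces): keep one emitter per caller across many callsite queries, and drop it once inlining actually changes the caller.

```cpp
#include <map>
#include <memory>
#include <string>

// Illustrative emitter; in the real patch this would own the lazily
// computed analyses (BFI etc.) for one caller.
struct Emitter {
  std::string Caller;
  explicit Emitter(std::string C) : Caller(std::move(C)) {}
};

class EmitterCache {
  std::map<std::string, std::unique_ptr<Emitter>> Cache;

public:
  // Reuse the cached emitter while the caller's IR is unchanged.
  Emitter *get(const std::string &Caller) {
    auto &Slot = Cache[Caller];
    if (!Slot)
      Slot = std::make_unique<Emitter>(Caller);
    return Slot.get();
  }

  // Once we inline into Caller, its analyses are stale: drop the entry
  // so the next query recomputes from scratch.
  void invalidate(const std::string &Caller) { Cache.erase(Caller); }
};
```

This saves recomputation for callsites that are evaluated but not inlined, which is exactly the case the reviewer notes above.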

Even in the new PM, this can be avoided only by incrementally updating the BFI of the caller after inlining a callee into it.

Sure but I thought that was required for PGO-based inlining anyway. I was hoping to piggyback on that.

Yes, that's true. I didn't mean to say that ORE has to specifically do this.

test/Transforms/Inline/optimization-remarks-with-hotness.ll
1 ↗(On Diff #65185)

The test case could be vastly simplified. The contents of the functions do not affect the test case.

This computes DominatorTree, LoopInfo, BPI and BFI for *every callsite* that is considered for inlining. Even though this is (I assume) turned off by default, it's still very expensive.

Certainly it is off by default, and it's meant as a performance-analysis tool, so the budget is different.

Yes, I understand that this is a reporting tool, so compilation time is not a major constraint. I was wondering whether this makes it practical for use in real applications. Perhaps it is. Have you run this on anything large (SPEC, for example) and measured the overhead? I don't have any objections to this patch if you have measured the overhead and found it reasonable.

No, I haven't run it with anything yet. I can keep this patch local for now and start doing the evaluation and go from there.

That said, we could, I guess, cache ORE per caller and only invalidate it once we inline. Does that sound reasonable to you?

That'll certainly reduce the overhead in cases where inlining is considered but does not happen. I am not sure it is worth the additional complexity, though. If you go this route, please isolate the changes within ORE and invalidate this in emitOptimizationRemark.

Yes, certainly. It would be good if as little of this as possible would spill over into the inliner.

anemet added inline comments.Jul 29 2016, 11:08 AM
test/Transforms/Inline/optimization-remarks-with-hotness.ll
1 ↗(On Diff #65185)

It is just a copy of optimization-remarks.ll further annotated with !prof metadata. Anyhow, I can simplify it.

anemet added a comment.Aug 2 2016, 9:32 PM

Yes, I understand that this is a reporting tool, so compilation time is not a major constraint. I was wondering whether this makes it practical for use in real applications. Perhaps it is. Have you run this on anything large (SPEC, for example) and measured the overhead? I don't have any objections to this patch if you have measured the overhead and found it reasonable.

I've now run this on SPEC2000_int and SPEC2006_int/fp, and the overhead seems reasonable with a release (no-asserts) compiler. The worst ones were perlbmk (both 2000 and 2006) and omnetpp, at around 30% slower. The differences trail off quickly after those. Also, this was done without LTO; LTO is probably worse.

Let me know if you have further questions.

Adam

anemet updated this revision to Diff 67187.Aug 8 2016, 9:38 AM
anemet edited edge metadata.

Simplified the test per @eraman's comments.

chandlerc accepted this revision.Aug 8 2016, 11:25 PM
chandlerc added a reviewer: chandlerc.
chandlerc added a subscriber: chandlerc.

LGTM w.r.t. the inliner. Please make sure the other heavy users of remarks are happy as well, since you are changing the output mechanism here.

I agree with the sentiment in the patch description that this is a bit gross, but it at least seems sufficiently isolated.

include/llvm/Analysis/OptimizationDiagnosticInfo.h
37–38 ↗(On Diff #67187)

This needs a proper doxygen comment, especially regarding the cost of doing this. And doubly especially that it is free unless the context has hotness emission specifically enabled. I had to read the implementation to understand why this is OK. =[

I'll also note that this applies to this entire file which seems ... remarkably devoid of doxygen API comments. Please document this interface and infrastructure in subsequent commits. =/
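For what it's worth, the kind of comment being asked for might look like the sketch below. The function name, signature, and body are made up for illustration; this is not the actual OptimizationDiagnosticInfo.h interface.

```cpp
#include <cstdint>

/// Compute the hotness (profile count) of a block, identified here by a
/// made-up index parameter.
///
/// Cost note, of the kind requested in review: this is free unless
/// hotness emission was specifically enabled on the context, because
/// the underlying block-frequency analysis only runs lazily on first
/// use.
uint64_t computeHotness(unsigned BlockIndex) {
  // Placeholder body; the point of this sketch is the comment style.
  return BlockIndex;
}
```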

156 ↗(On Diff #67187)

No need for '\brief'.

This revision is now accepted and ready to land.Aug 8 2016, 11:25 PM
davidxl accepted this revision.Aug 9 2016, 9:02 AM
davidxl edited edge metadata.

lgtm -- the compile time issue is temporary (and not affecting the default path).

anemet marked 2 inline comments as done.Aug 9 2016, 5:20 PM
This revision was automatically updated to reflect the committed changes.
anemet added inline comments.Aug 9 2016, 5:57 PM
include/llvm/Analysis/OptimizationDiagnosticInfo.h
37–38 ↗(On Diff #67187)

Actually, I think that only the class comment was missing. The member functions were heavily commented.

Fixed in rL278186.

Thanks to all of you for the reviews!