This is an archive of the discontinued LLVM Phabricator instance.

[InlineCost] Don't return early when allowSizeGrowth(CS) is false
Needs Revision · Public

Authored by paquette on Mar 13 2018, 4:28 PM.

Details

Reviewers

fhahn
efriedma
haicheng
davide

Summary

This patch concerns chains of calls like so:

pluto();

foo() {
  pluto();
}

wibble() {
  foo();
}

Suppose foo has internal linkage. Also let's say that the call to foo within wibble is the only call to foo within the module.

Normally, the inlining cost model applies a large bonus to ensure that foo is inlined into wibble, because there is essentially no cost in doing so. However, if the block in wibble that contains the call is unreachable-terminated, this doesn't happen. This is because the cost model currently does the following:

Check if the block containing the call is unreachable-terminated (allowSizeGrowth)
If it is, set the threshold to 0 and return

This happens before the bonus is applied. Therefore, any "zero-cost" case relying on the bonus won't ever be inlined when we're dealing with unreachable-terminated blocks.
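
The ordering problem described above can be sketched with a toy model. This is a simplified illustration, not the real CallAnalyzer: ToyAnalyzer, its fields, the starting threshold of 225, the body cost of 30, and the bonus value of 15000 are all assumptions made up for the example, and the final comparison in shouldInline is assumed as well. The model also folds the callee's body cost in up front, which the real analysis does not necessarily do. Only the control flow mirrors the description.

```cpp
#include <algorithm>
#include <cassert>

// Toy stand-in for the cost model. Every name and number is hypothetical.
struct ToyAnalyzer {
  int Threshold = 225;           // illustrative starting threshold
  int Cost = 30;                 // illustrative cost of the callee body
  bool SizeGrowthAllowed = true; // models the result of allowSizeGrowth(CS)
  bool LastCallToStatic = false; // callee has local linkage and one use
};

// Illustrative value for the "large bonus".
constexpr int ToyLastCallToStaticBonus = 15000;

// Models the pre-patch ordering: the early return skips the bonus.
void updateThresholdOld(ToyAnalyzer &A) {
  if (!A.SizeGrowthAllowed) {
    A.Threshold = 0;
    return; // the bonus below is never reached
  }
  if (A.LastCallToStatic)
    A.Cost -= ToyLastCallToStaticBonus;
}

// Assumed final decision: inline only if the cost is below the threshold,
// where the threshold is treated as at least 1.
bool shouldInline(const ToyAnalyzer &A) {
  return A.Cost < std::max(1, A.Threshold);
}
```

With SizeGrowthAllowed set to false and LastCallToStatic set to true, this model leaves Cost at 30 and Threshold at 0, so shouldInline returns false even though the callee is the last call to a static function.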

This commit

Removes the early return when allowSizeGrowth is false
Wraps the threshold tweaks in a conditional which is true only when size growth is allowed.

The tweaks are wrapped in a conditional to reflect that we only want to inline when the cost of inlining is truly 0 or better; any modifications to the threshold would break this assertion. The early return is removed to facilitate inlining the example case.
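
A minimal sketch of the flow this summary describes, under stated assumptions: wouldInlinePatched and the Sketch* constants are hypothetical names, the numbers are illustrative, a single additive bump stands in for all of the threshold tweaks, and the final comparison against max(1, Threshold) is assumed rather than taken from the patch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical constants, for illustration only.
constexpr int SketchDefaultThreshold = 225;
constexpr int SketchThresholdTweak = 100; // stand-in for all threshold tweaks
constexpr int SketchLastCallToStaticBonus = 15000;

// Sketch of the patched flow: no early return, and threshold tweaks run
// only when size growth is allowed.
bool wouldInlinePatched(int BodyCost, bool SizeGrowthAllowed,
                        bool LastCallToStatic) {
  int Threshold = SketchDefaultThreshold;
  int Cost = BodyCost;

  if (SizeGrowthAllowed) {
    // Tweaks may only change the threshold when growth is allowed.
    Threshold += SketchThresholdTweak;
  } else {
    // Threshold is pinned at 0, but the function keeps going.
    Threshold = 0;
  }

  // The bonus is now reachable in both cases.
  if (LastCallToStatic)
    Cost -= SketchLastCallToStaticBonus;

  // Assumed final decision: with Threshold == 0 this accepts only Cost <= 0.
  return Cost < std::max(1, Threshold);
}
```

In this sketch, a call with no size growth allowed is accepted only when the bonus applies or the body cost is already 0 or lower, which matches the stated intent of inlining only when the cost is truly 0 or better.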

This produced some minor code size improvements for ARM, AArch64, and x86-64 at Oz.

Output from compare.py here for Oz: https://hastebin.com/fojuquzoru.erl

Edit: More clarity. The word salad from before wasn't that great.

Event Timeline

paquette created this revision. Mar 13 2018, 4:28 PM

Herald added subscribers: kristof.beyls, eraman, javed.absar. Mar 13 2018, 4:28 PM

efriedma added a subscriber: zzheng. Mar 13 2018, 4:57 PM

paquette edited the summary of this revision. Mar 14 2018, 10:04 AM

eraman added inline comments. Mar 14 2018, 10:17 AM

lib/Analysis/InlineCost.cpp
Line 844: Move the code applying the last-call-to-static bonus to the top. Then you could exit early after setting the threshold to 0 under the !allowSizeGrowth condition.

Update patch to move the LastCallToStaticBonus logic to the top of updateThreshold.

fhahn added inline comments. Mar 15 2018, 8:26 AM

lib/Analysis/InlineCost.cpp
Line 847: The changes after this point do not seem to be necessary, i.e. could they stay in the same place?

Jessica, are you still planning on pushing this change? :)

Wow, I entirely forgot about this somehow.

I'll take a look and see if it's still relevant...

Is this still needed? Marking as changes requested to clear from review queue.

This revision now requires changes to proceed. Feb 1 2023, 3:03 AM

Herald added a project: Restricted Project. Feb 1 2023, 3:03 AM

Herald added subscribers: ChuanqiXu, StephenFan, pengfei.

Revision Contents

Path	Size
lib/Analysis/InlineCost.cpp	48 lines
test/Transforms/Inline/ARM/unreachable.ll	21 lines

Diff 138429

lib/Analysis/InlineCost.cpp

(834 lines not shown)
   if (CallSiteFreq >= CallerEntryFreq * HotCallSiteRelFreq)
     return Params.LocallyHotCallSiteThreshold;

   // Otherwise treat it normally.
   return None;
 }

 void CallAnalyzer::updateThreshold(CallSite CS, Function &Callee) {
-  // If no size growth is allowed for this inlining, set Threshold to 0.
-  if (!allowSizeGrowth(CS)) {
-    Threshold = 0;
-    return;
-  }
-
-  Function *Caller = CS.getCaller();
-
-  // return min(A, B) if B is valid.
-  auto MinIfValid = [](int A, Optional<int> B) {
-    return B ? std::min(A, B.getValue()) : A;
-  };
-
-  // return max(A, B) if B is valid.
-  auto MaxIfValid = [](int A, Optional<int> B) {
-    return B ? std::max(A, B.getValue()) : A;
-  };
-
   // Various bonus percentages. These are multiplied by Threshold to get the
   // bonus values.
   // SingleBBBonus: This bonus is applied if the callee has a single reachable
   // basic block at the given callsite context. This is speculatively applied
   // and withdrawn if more than one basic block is seen.
   //
   // Vector bonuses: We want to more aggressively inline vector-dense kernels
   // and apply this bonus based on the percentage of vector instructions. A
   // bonus is applied if the vector instructions exceed 50% and half that amount
   // is applied if it exceeds 10%. Note that these bonuses are some what
   // arbitrary and evolved over time by accident as much as because they are
   // principled bonuses.
   // FIXME: It would be nice to base the bonus values on something more
   // scientific.
   //
   // LstCallToStaticBonus: This large bonus is applied to ensure the inlining
   // of the last call to a static function as inlining such functions is
   // guaranteed to reduce code size.
   //
   // These bonus percentages may be set to 0 based on properties of the caller
   // and the callsite.
   int SingleBBBonusPercent = 50;
   int VectorBonusPercent = 150;
   int LastCallToStaticBonus = InlineConstants::LastCallToStaticBonus;

+  // If this is true, then we can apply the LastCallToStaticBonus.
+  bool OnlyOneCallAndLocalLinkage =
+      F.hasLocalLinkage() && F.hasOneUse() && &F == CS.getCalledFunction();
+
+  // If no size growth is allowed in this inlining, set the threshold to 0 and
+  // apply any cost bonuses. After that, return.
+  if (!allowSizeGrowth(CS)) {
+    Threshold = 0;
+
+    // This could still be "zero-cost" if F only has one use and has local
+    // linkage. Apply the LastCallToStatic bonus if it applies.
+    if (OnlyOneCallAndLocalLinkage)
+      Cost -= LastCallToStaticBonus;
+    return;
+  }
+
+  Function *Caller = CS.getCaller();
+
+  // return min(A, B) if B is valid.
+  auto MinIfValid = [](int A, Optional<int> B) {
+    return B ? std::min(A, B.getValue()) : A;
+  };
+
+  // return max(A, B) if B is valid.
+  auto MaxIfValid = [](int A, Optional<int> B) {
+    return B ? std::max(A, B.getValue()) : A;
+  };
+
   // Lambda to set all the above bonus and bonus percentages to 0.
   auto DisallowAllBonuses = [&]() {
     SingleBBBonusPercent = 0;
     VectorBonusPercent = 0;
     LastCallToStaticBonus = 0;
   };

   // Use the OptMinSizeThreshold or OptSizeThreshold knob if they are available
(62 lines not shown)

   // Finally, take the target-specific inlining threshold multiplier into
   // account.
   Threshold *= TTI.getInliningThresholdMultiplier();

   SingleBBBonus = Threshold * SingleBBBonusPercent / 100;
   VectorBonus = Threshold * VectorBonusPercent / 100;

-  bool OnlyOneCallAndLocalLinkage =
-      F.hasLocalLinkage() && F.hasOneUse() && &F == CS.getCalledFunction();
   // If there is only one call of the function, and it has internal linkage,
   // the cost of inlining it drops dramatically. It may seem odd to update
   // Cost in updateThreshold, but the bonus depends on the logic in this method.
   if (OnlyOneCallAndLocalLinkage)
     Cost -= LastCallToStaticBonus;
 }

 bool CallAnalyzer::visitCmpInst(CmpInst &I) {
(1,169 lines not shown)

test/Transforms/Inline/ARM/unreachable.ll

This file was added.

; RUN: opt < %s -mtriple=arm--- -S -inline | FileCheck %s

declare i32* @pluto() local_unnamed_addr #0

define internal void @foo() local_unnamed_addr #0 {
bb:
  call i32* @pluto() #0
  ret void
}

define void @wibble() unnamed_addr #0 {
; CHECK-LABEL: wibble
; CHECK-NOT: @foo()
; CHECK: %0 = call i32* @pluto()
; CHECK-NEXT: unreachable
bb:
  tail call void @foo() #0
  unreachable
}

attributes #0 = { minsize optsize }