This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Analysis/
-
Analysis/
2
InlineCost.cpp
-
test/Transforms/Inline/
-
Transforms/
-
Inline/
-
inline-indirect-always.ll

Differential D34471

[Inliner] Boost inlining of an indirect call to always_inline function.
AbandonedPublic

Authored by eraman on Jun 21 2017, 12:11 PM.

Download Raw Diff

Details

Reviewers

chandlerc

Summary

If inlining baz->bar makes an indirect call inside bar to be a direct
call to foo with always_inline hint, boost the chances of baz->bar
inlining (and therefore bar->foo inlining) by adding a large negative
cost. I have used LastCallToStaticBonus here, but if that makes it
confusing I will create another constant with similar high value.

Diff Detail

Build Status

Buildable 7466
Build 7466: arc lint + arc unit

Event Timeline

eraman created this revision.Jun 21 2017, 12:11 PM

Herald added a subscriber: mehdi_amini. · View Herald TranscriptJun 21 2017, 12:11 PM

davidxl added inline comments.Jun 21 2017, 9:20 PM

lib/Analysis/InlineCost.cpp
978	Maybe rename LastCallToStaticBonus more appropriately ?

eraman added inline comments.Jun 22 2017, 10:15 AM

lib/Analysis/InlineCost.cpp
978	I am open to suggestions. How about AlmostAlwaysInlineBonus ?

This seems likely to lead to surprising behavior. LastCallToStaticBonus is very large, so we'll end up duplicating a lot of code in some cases, which could be surprising. On the other hand, it isn't infinitely large, so you can't depend on always_inline to trigger the behavior reliably.

What's the motivation for this change?

In D34471#788221, @efriedma wrote:

This seems likely to lead to surprising behavior. LastCallToStaticBonus is very large, so we'll end up duplicating a lot of code in some cases, which could be surprising. On the other hand, it isn't infinitely large, so you can't depend on always_inline to trigger the behavior reliably.

What's the motivation for this change?

Consider a function B that takes a function pointer as an argument and calls that. Let's assume A invokes B and passes a pointer to C which has always_inline attribute. It is reasonable to relax the A->B inlining limits so that we could honor the always_inline hint on C. I think this is a fairly uncommon scenario that size increase is not a concern in general, but yes this could end up duplicating a lot in some cases. One way to think about is what could the programmer do to ensure C always gets inlined. They could clone B to B' that directly calls C and change a subset of B's callers to B'. This roughly doubles the size of B. We could perhaps approximate this by giving a bonus of <some large value> / (# callers of B in the module).

I think this is a fairly uncommon scenario that size increase is not a concern in general,

All code written in C contains always_inline functions because C libraries mark inline functions always_inline. This means we're potentially introducing gigantic size regressions into every piece of C code in existence. I don't want to deal with that.

FWIW, I'm not really a fan of this behavior in the inliner.

I know the test case that motivates this, and as was discussed previously I suspect using a template argument would be better. I think this is actually a general pattern we can recommend rather than using alway_inline for this:

__attribute__((always_inline)) inline void f(int, int) {
  volatile int x;
  x = 42; x = 42; x = 42; x = 42;
}

template <typename FunctionT, FunctionT *Function>
void g(int *a, int *b, int count)  {
  for (int i = 0; i < count; ++i)
    Function(a[i], b[i]);
}

void h(int *a, int *b, int count) {
  g<decltype(f), f>(a, b, count);
}

Even at -O0 this doesn't allows the always_inline to have its effect: https://godbolt.org/g/S4HzTD

It also makes it explicit in the source code that the programmer *wants* code duplication. For something this powerful, that seems like the right general approach.

Chandler's example does not involve any indirect calls so it is not related to this patch.

I think Easwaran's patch is consistent with the current inline behavior -- it boots inlining to callee with indirect callsites that can be turned into direct calls and then further inlined. The amount of boost is the benefit of inlining the indirect callsites (i.e., the difference between inline cost and allowed threshold). In fact, if the CA.analyzeCall call in the existing code

auto IndirectCallParams = Params;
IndirectCallParams.DefaultThreshold = InlineConstants::IndirectCallThreshold;
CallAnalyzer CA(TTI, GetAssumptionCache, GetBFI, PSI, *F, CS,
                IndirectCallParams);
if (CA.analyzeCall(CS)) {
    ...

is turned into getInlineCost(CS), it will achieve similar effect -- except that InlineCost associated with AlwaysInline is a little too extreme.

In many cases, AlwaysInline is very strong hint by the user to indicate performance benefit of inlining the function, it does not make sense to completely ignore it.

In D34471#788736, @davidxl wrote:

Chandler's example does not involve any indirect calls so it is not related to this patch.

Actually, it is...

The original test case doesn't look exactly like my example. Rather than a template argument and a call through that template argument, it uses an indirect call to a normal function argument.

My suggestion is that the code should be changed to use a template argument if the intent is to make always_inline actually *always* inline.

I think Easwaran's patch is consistent with the current inline behavior -- it boots inlining to callee with indirect callsites that can be turned into direct calls and then further inlined. The amount of boost is the benefit of inlining the indirect callsites (i.e., the difference between inline cost and allowed threshold). In fact, if the CA.analyzeCall call in the existing code
auto IndirectCallParams = Params;
IndirectCallParams.DefaultThreshold = InlineConstants::IndirectCallThreshold;
CallAnalyzer CA(TTI, GetAssumptionCache, GetBFI, PSI, *F, CS,
                IndirectCallParams);
if (CA.analyzeCall(CS)) {
    ...
is turned into getInlineCost(CS), it will achieve similar effect -- except that InlineCost associated with AlwaysInline is a little too extreme.

In many cases, AlwaysInline is very strong hint by the user to indicate performance benefit of inlining the function, it does not make sense to completely ignore it.

I don't think this is really the right way to model the always_inline attribute. I continue to think that this attribute should mean that we very literally *always* inline. See the discussion here for more details:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089466.html

(Note that I *also* am still in favor of having an attribute that is actually a very strong hint to do inlining, I just don't think any of the existing attributes really fit that bill, and I don't think the original motivating test case is a particularly compelling example. Even if a user *does* want to get really strong inlining because of a weird indirect function call, I don't know why the inliner has to go to great lengths to detect this rather than expecting the user to also attribute the function receiving the function pointer...)

eraman abandoned this revision.Jun 26 2017, 11:01 AM

Revision Contents

Path

Size

lib/

Analysis/

InlineCost.cpp

12 lines

test/

Transforms/

Inline/

inline-indirect-always.ll

32 lines

Diff 103438

lib/Analysis/InlineCost.cpp

Show First 20 Lines • Show All 964 Lines • ▼ Show 20 Lines	bool CallAnalyzer::visitCallSite(CallSite CS) {
Function *F = dyn_cast_or_null<Function>(SimplifiedValues.lookup(Callee));		Function *F = dyn_cast_or_null<Function>(SimplifiedValues.lookup(Callee));
if (!F)		if (!F)
return Base::visitCallSite(CS);		return Base::visitCallSite(CS);

// If we have a constant that we are calling as a function, we can peer		// If we have a constant that we are calling as a function, we can peer
// through it and see the function target. This happens not infrequently		// through it and see the function target. This happens not infrequently
// during devirtualization and so we want to give it a hefty bonus for		// during devirtualization and so we want to give it a hefty bonus for
// inlining, but cap that bonus in the event that inlining wouldn't pan		// inlining, but cap that bonus in the event that inlining wouldn't pan
// out. Pretend to inline the function, with a custom threshold.		// out.

		// First, check if the function has AlwaysInline attribute and give a huge
		// bonus if that is the case.
		if (F->hasFnAttribute(Attribute::AlwaysInline)) {
		Cost -= InlineConstants::LastCallToStaticBonus;
		davidxlUnsubmitted Not Done Reply Inline Actions Maybe rename LastCallToStaticBonus more appropriately ? davidxl: Maybe rename LastCallToStaticBonus more appropriately ?
		eramanAuthorUnsubmitted Not Done Reply Inline Actions I am open to suggestions. How about AlmostAlwaysInlineBonus ? eraman: I am open to suggestions. How about AlmostAlwaysInlineBonus ?
		return Base::visitCallSite(CS);
		}

		// Otherwise, pretend to inline the function, with a custom threshold and give
		// a bonus based on this threshold and the cost of inlining F.
auto IndirectCallParams = Params;		auto IndirectCallParams = Params;
IndirectCallParams.DefaultThreshold = InlineConstants::IndirectCallThreshold;		IndirectCallParams.DefaultThreshold = InlineConstants::IndirectCallThreshold;
CallAnalyzer CA(TTI, GetAssumptionCache, GetBFI, PSI, *F, CS,		CallAnalyzer CA(TTI, GetAssumptionCache, GetBFI, PSI, *F, CS,
IndirectCallParams);		IndirectCallParams);
if (CA.analyzeCall(CS)) {		if (CA.analyzeCall(CS)) {
// We were able to inline the indirect call! Subtract the cost from the		// We were able to inline the indirect call! Subtract the cost from the
// threshold to get the bonus we want to apply, but don't go below zero.		// threshold to get the bonus we want to apply, but don't go below zero.
Cost -= std::max(0, CA.getThreshold() - CA.getCost());		Cost -= std::max(0, CA.getThreshold() - CA.getCost());
▲ Show 20 Lines • Show All 679 Lines • Show Last 20 Lines

test/Transforms/Inline/inline-indirect-always.ll

This file was added.

				; RUN: opt -S -inline -inline-threshold=0 < %s \| FileCheck %s
				declare void @ext();

				define void @foo() #1 {
				call void @ext();
				ret void
				}

				define void @bar(void ()* %fptr) {
				call void %fptr();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				call void @ext();
				ret void
				}
				; CHECK-LABEL: @baz
				define void @baz() {
				; CHECK-NOT: call void @bar
				call void @bar(void ()* @foo)
				; CHECK: ret
				ret void
				}

				attributes #1 = { alwaysinline }