This is an archive of the discontinued LLVM Phabricator instance.

[NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive
ClosedPublic

Authored by aeubanks on Mar 6 2022, 11:53 PM.

Details

Summary

Introduce a new attribute "function-inline-cost-multiplier" which
multiplies the inline cost of a call site (or all calls to a callee) by
the multiplier.

When processing the list of calls created by inlining, check each call
to see if the new call's callee is in the same SCC as the original
callee. If so, set the "function-inline-cost-multiplier" attribute of
the new call site to double the original call site's attribute value.
This does not happen when the original call site is intra-SCC.

This is an alternative to D120584, which marks the call sites as
noinline.

Hopefully fixes PR45253.

Diff Detail

Event Timeline

aeubanks created this revision.Mar 6 2022, 11:53 PM
Herald added a project: Restricted Project. · View Herald Transcript
aeubanks requested review of this revision.Mar 6 2022, 11:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 6 2022, 11:53 PM
aeubanks added inline comments.Mar 7 2022, 12:25 AM
llvm/test/Transforms/Inline/inline-cost-attributes.ll
27

this is due to implementation details of the printer pass

nikic requested changes to this revision.Mar 7 2022, 7:36 AM
nikic added a subscriber: nikic.
nikic added inline comments.
llvm/lib/Transforms/IPO/Inliner.cpp
973

CB has been deleted at this point -- the old cost multiplier needs to be fetched before InlineFunction is called.

This revision now requires changes to proceed.Mar 7 2022, 7:36 AM
ormris removed a subscriber: ormris.Mar 7 2022, 9:54 AM
nikic resigned from this revision.Mar 7 2022, 9:59 AM
nikic added a subscriber: davidxl.

I've not tested this as extensively, but this variant is also performance neutral on my side, and does seem to address the issue in practice.

This approach is a bit more ad-hoc than D120584, and it wouldn't be my first choice, unless we had evidence that it actually performs better than D120584 in some real-world instance. However, if this approach is more appealing to @davidxl, then I'm also fine with it as a compromise solution.

llvm/lib/Analysis/InlineCost.cpp
912

We should probably export this string so it's not repeated three times.

davidxl added inline comments.Mar 7 2022, 10:09 AM
llvm/lib/Transforms/IPO/Inliner.cpp
973

let '2' be controllable by an option.

aeubanks updated this revision to Diff 413555.Mar 7 2022, 10:34 AM

address comments

davidxl accepted this revision.Mar 7 2022, 10:58 AM
This revision is now accepted and ready to land.Mar 7 2022, 10:58 AM
nikic added a comment.Mar 7 2022, 12:24 PM

I've been testing this with more cases, and unfortunately this one does sometimes do appreciably worse than D120584. For a fluent_bundle-derived test case, this approach takes 2x longer to compile than the previous version.

Perhaps we can experiment with the default value for -intra-scc-cost-multiplier. I know davidxl has done a lot of work with inline tuning so I'm inclined to go with this patch for now.

I did see a consistent 0.75-1% regression on an important internal benchmark with the other approach (which I believe we actually care about), although some benchmarks are sensitive to alignment.

nikic added a comment.Mar 7 2022, 1:21 PM

I did see a consistent 0.75-1% regression on an important internal benchmark with the other approach (which I believe we actually care about), although some benchmarks are sensitive to alignment.

And the regressions is not present with this variant? If so, yeah, let's go with this.

llvm/lib/Transforms/IPO/Inliner.cpp
971

Should we also fetch the previous multiplier of ICB here and multiply it in? Not sure if it's really relevant in practice, but currently we're only preserving the existing multiplier from CB, but not ICB.

I did see a consistent 0.75-1% regression on an important internal benchmark with the other approach (which I believe we actually care about), although some benchmarks are sensitive to alignment.

And the regressions is not present with this variant? If so, yeah, let's go with this.

I haven't tested this variant yet and the benchmarks take hours to run. I'm going to start it now, but given that 14.0.0 is tomorrow I'll submit this tonight if the benchmark isn't done by then and request a cherry-pick since davidxl is happy with this and it helps with many of the worst compile time issues. If this does turn out to be similar to the noinline patch in terms of performance then we can do something with the noinline patch.

llvm/lib/Transforms/IPO/Inliner.cpp
971

if we had

a -> b <-> c

z -> a <-> d

where both the a -> b and z -> a inlinings were problematic, then multiplying the value in ICB could help, else we'd restart the multipliers

as you said, probably very unlikely, we can do that in a follow-up change if we think it's important

This revision was landed with ongoing or failed builds.Mar 7 2022, 11:54 PM
This revision was automatically updated to reflect the committed changes.

Is this the patch we should backport to release/14.x?

Is this the patch we should backport to release/14.x?

yeah that'd be awesome, thanks

davide added a subscriber: davide.Nov 23 2022, 1:04 PM

Hey folks, this is still not enough. We're seeing exponential behaviors when linking real-world projects.

Here's a reduced example that still goes exponential.
https://github.com/llvm/llvm-project/issues/59126

cc: @davidxl @aeubanks