This is an archive of the discontinued LLVM Phabricator instance.

[Inlining] Set inline-deferral-scale to 2.
ClosedPublic

Authored by kazu on May 19 2020, 12:12 PM.

Details

Summary

This patch sets inline-deferral-scale to 2.

Both internal and SPEC benchmarking show that 2 is the best number
among -1, 2, 3, and 4.

inline-deferral-scale SPECint2006

-1  38.0 (the default without this patch)
 2  38.5
 3  38.1
 4  38.1

With the new default number, shouldBeDeferred returns true if:

TotalCost < IC.getCost() * 2

where

TotalCost is TotalSecondaryCost + IC.getCost() * NumCallerUsers.

If TotalCost >= 0 and NumCallerUsers >= 2, then
TotalCost >= IC.getCost() * 2, so shouldBeDeferred returns true only
when NumCallerUsers is 1.

Now, if TotalSecondaryCost < 0, which can happen if
InlineConstants::LastCallToStaticBonus, a huge number, has been
subtracted from TotalSecondaryCost, then TotalCost may be negative.
In this case, shouldBeDeferred may return true even when
NumCallerUsers >= 2.

Diff Detail

Event Timeline

kazu created this revision.May 19 2020, 12:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2020, 12:12 PM

SPEC numbers look great:)

kazu added a comment.May 20 2020, 9:47 PM

PTAL Thanks!

davidxl accepted this revision.May 20 2020, 10:12 PM

How does it affect text size overall?

This revision is now accepted and ready to land.May 20 2020, 10:12 PM
kazu added a comment.May 21 2020, 10:57 AM

How does it affect text size overall?

Thanks for a review!

PGO+ThinLTO bootstrap of clang results in smaller .text sections:

SectionBeforeAfterChange
.text.hot59205865917770-0.05%
.text71552097085897-0.97%
.text.unlikely3592264035625712-0.83%
.text.startup728048728032-0.00%

thanks for the data. Would it be possible to collect size data with plain O2 or O3 build?

nikic added a comment.May 21 2020, 1:44 PM

Here are some text size numbers from CTMark for this patch: http://llvm-compile-time-tracker.com/compare.php?from=361e4f14e35981e7498a38ab7c31da7fb7414fb8&to=d9666f0e34439c226438f1ad0e64a86456c323db&stat=size-text There is a huge (4-6%) text size reduction on sqlite. Don't know if that's a good sign or a bad sign in this case...

performance win + size reduction is double win. LGTM

This revision was automatically updated to reflect the committed changes.

It's kinda off-topic and not sure if anyone is interested in more test cases,
but i have a codebase for which the current inlining decisions still seem
just a little bit too weak. I.e. if i drop all __attribute__((always_inline)) attrs,
i see +10% perf drops in all of those cases. Inline parameters isn't something i looked into,
so i'm not really sure where to look. Feel free to let me know if anyone is interested.

nikic added a comment.Dec 9 2021, 3:30 AM

I finally understood what this code is doing ... and I'm wondering why you only tested scales from 2 upwards, and did not test a scale of 1. Scale 1 should be the one that is actually "correct" in terms of modeling, as we are choosing to incur TotalCost in place of IC.getCost().