This is an archive of the discontinued LLVM Phabricator instance.

[Inlining] Teach shouldBeDeferred to take the total cost into account
Closed · Public

Authored by kazu on Apr 29 2020, 3:45 PM.

Details

Summary

This patch teaches shouldBeDeferred to take into account the total
cost of inlining.

Suppose we have a call hierarchy {A1,A2,A3,...}->B->C. (Each of A1,
A2, A3, ... calls B, which in turn calls C.)

Without this patch, shouldBeDeferred essentially returns true if

TotalSecondaryCost < IC.getCost()

where TotalSecondaryCost is the total cost of inlining B into the As.
This means that if B is a small wrapper function, for example, it would
get inlined into all of the As. In turn, C gets inlined into all of the
As. In other words, shouldBeDeferred ignores the cost of inlining C into
each of the As.

This patch replaces the expression above with:

TotalCost < Allowance

where

  • TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and
  • Allowance is IC.getCost() * Scale

For now, Scale defaults to 2, which essentially limits the number of
As to 1 for shouldBeDeferred to return true.
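
To make the effect of Scale concrete, here is a minimal sketch of the two comparisons side by side. It is not the code in Inliner.cpp: the struct, its field names, and both function names are hypothetical stand-ins for the quantities named above, with CalleeCost playing the role of IC.getCost().

```cpp
// Hypothetical stand-ins for the quantities in the summary; not LLVM types.
struct DeferralInputs {
  int TotalSecondaryCost; // total cost of inlining B into every A
  int CalleeCost;         // stands in for IC.getCost()
  int NumCallerUsers;     // number of As that call B
};

// Comparison used before the patch: the number of As is ignored.
bool oldShouldDefer(const DeferralInputs &In) {
  return In.TotalSecondaryCost < In.CalleeCost;
}

// Comparison proposed by the patch: C's cost is charged once per A.
bool newShouldDefer(const DeferralInputs &In, int Scale = 2) {
  int TotalCost = In.TotalSecondaryCost + In.CalleeCost * In.NumCallerUsers;
  int Allowance = In.CalleeCost * Scale;
  return TotalCost < Allowance;
}
```

With Scale = 2 and a single A, the new test reduces to TotalSecondaryCost < IC.getCost(), the same as before. With two As it requires TotalSecondaryCost < 0, so for non-negative secondary costs deferral is rejected.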

With this patch, a Clang PGO bootstrap results in 0.33% smaller .text*
sections. Compiling the 10 largest preprocessed files of Clang with
the PGO bootstrapped clang takes:

  • 69.677 seconds on average of five runs without the patch, and
  • 68.939 seconds on average of five runs with the patch.

Event Timeline

kazu created this revision. Apr 29 2020, 3:45 PM
davidxl added inline comments. Apr 30 2020, 2:26 PM
llvm/lib/Transforms/IPO/Inliner.cpp, line 348

NumCallerUsers may be more explicit.

kazu updated this revision to Diff 261386. Apr 30 2020, 4:25 PM

I've renamed SecondaryUsers to NumCallerUsers.

kazu marked an inline comment as done. Apr 30 2020, 4:25 PM

This is conceptually a very good change. I think it should be split into two stages. The first is to make the changes but with default settings as NFC. The second stage is to reset the parameter with some benchmark numbers (e.g., SPEC performance should not regress; also check the code size impact).

One way to do the first stage is to not consider the NumCaller adjustment if the deferral scale is a special value (such as -1).

kazu updated this revision to Diff 261927. May 4 2020, 2:05 PM

I've updated the patch to turn off the new cost calculation by default.
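
As a rough illustration of the staged approach discussed above, the sketch below falls back to the original comparison when the scale is a negative sentinel. This is an assumption about the shape of the change, not the committed code: the function name, parameter names, and the kLegacyScale constant are hypothetical, and -1 is simply the example value suggested in the review.

```cpp
// Hypothetical sentinel: a negative scale means "keep the old behavior".
constexpr int kLegacyScale = -1;

// CalleeCost stands in for IC.getCost(); all names here are illustrative.
bool shouldDefer(int TotalSecondaryCost, int CalleeCost, int NumCallerUsers,
                 int Scale = kLegacyScale) {
  // Sentinel set: ignore the number of callers, as before the patch.
  if (Scale < 0)
    return TotalSecondaryCost < CalleeCost;

  // Otherwise charge the callee's cost once per caller and compare
  // against the scaled allowance.
  int TotalCost = TotalSecondaryCost + CalleeCost * NumCallerUsers;
  return TotalCost < CalleeCost * Scale;
}
```

Defaulting to the sentinel keeps the change NFC until benchmark numbers justify a specific scale.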

davidxl accepted this revision. May 5 2020, 10:42 AM

lgtm

This revision is now accepted and ready to land. May 5 2020, 10:42 AM
This revision was automatically updated to reflect the committed changes.