This is an archive of the discontinued LLVM Phabricator instance.

Refactor threshold computation for inline cost analysis
ClosedPublic

Authored by eraman on Dec 9 2015, 5:14 PM.

Details

Summary

This refactoring

  • Moves the threshold computation methods to InlineCost.cpp.
  • Removes state related to Threshold from the base Inliner class and moves it to SimpleInliner since threshold is only relevant for ICA which is used by SimpleInliner.
  • Cleans up getInlineThreshold a little bit.

NFC intended. The motivation for this refactor is to allow getInlineThreshold to be called from inside CallAnalyzer when peering through indirect calls instead of using a constant value.

Diff Detail

Repository
rL LLVM

Event Timeline

eraman updated this revision to Diff 42359.Dec 9 2015, 5:14 PM
eraman retitled this revision from to Refactor threshold computation for inline cost analysis.
eraman updated this object.
eraman added reviewers: chandlerc, stoklund.
eraman set the repository for this revision to rL LLVM.
eraman added subscribers: llvm-commits, davidxl.
hfinkel added a subscriber: hfinkel.Dec 9 2015, 8:29 PM
hfinkel added inline comments.
include/llvm/Analysis/InlineCost.h
145 ↗(On Diff #42359)

Given that we don't really have a SizeOptLevel in LLVM, but -Os, -Oz settings, we should document here what SizeOptLevel means.

lib/Analysis/InlineCost.cpp
1421 ↗(On Diff #42359)

Seems like we could fix this FIXME while we're here.

davidxl added inline comments.Dec 9 2015, 9:34 PM
include/llvm/Analysis/InlineCost.h
145 ↗(On Diff #42359)

make it a static member function of InlineCostAnalysis?

148 ↗(On Diff #42359)

Same here.

lib/Analysis/InlineCost.cpp
1385 ↗(On Diff #42359)

Change it to DefaultInlineThreshold?

1414 ↗(On Diff #42359)

Should it be a static member function of InlineCostAnalysis class?

1414 ↗(On Diff #42359)

Missing documentation of the method.

1414 ↗(On Diff #42359)

The interface of this method also needs to be changed to take an additional callee function as the argument so that indirect callsite's threshold can be computed.

The existing method can be a wrapper to this one.

lib/Transforms/IPO/InlineSimple.cpp
41 ↗(On Diff #42359)

-> DefaultThreshold

57 ↗(On Diff #42359)

This does not look right -- DefaultThreshold is used here which should not be.

I also don't see the need to pass CallsiteIndependent Threshold all the way down from SimpleInliner::getInlineCost all the way down to CallAnalyzer (without being used anywhere else) -- the CallAnalyzer can simply compute the CallSiteIndependentThreshold itself. This will greatly simplify the interfaces.

lib/Analysis/InlineCost.cpp
1414 ↗(On Diff #42359)

Why do we need a threshold parameter here? The base threshold should be the default threshold.

eraman marked 6 inline comments as done.Dec 10 2015, 2:30 PM
eraman added inline comments.
include/llvm/Analysis/InlineCost.h
145 ↗(On Diff #42359)

I initially had it as a static member function, but the thought of calling this inside CallAnalyzer (not in this patch, but when replacing the constant threshold for indirect call bonus) didn't feel right to me (CallAnalyzer being 'aware' of InlineCostAnalysis). Functionally there is no difference and if you strongly prefer I'll make it a static member function.

lib/Analysis/InlineCost.cpp
1385 ↗(On Diff #42359)

I don't really like the name, but went ahead with it. Of course we don't want to change the option itself.

1414 ↗(On Diff #42359)

The idea of calling a method of InlineCostAnalysis from this inside CallAnalyzer (not in this patch, but when replacing the constant threshold for indirect call bonus) doesn't feel right to me. Functionally there is no difference and if you strongly prefer I'll make it a static member function.

1414 ↗(On Diff #42359)

Since there is only one caller, I haven't added the wrapper.

1414 ↗(On Diff #42359)

I am not sure what you mean by the base threshold. You pass a threshold that is not dependent on the callsite. This may get changed based on callsite properties and returned.

1421 ↗(On Diff #42359)

If the function has a MinSize attribute, this OptSize bool will be false, but I think that is not intentional. If I want a NFC patch, I should replace Caller->hasFnAttribute(Attribute::OptimizeForSize) with (Caller->optForSize() && ! Caller->optForMinSize()) to replicate the exact behavior. Instead I thought I'll just replace it with Caller->optForSize() as a separate patch (or perhaps I am obsessing a lot over keeping this NFC since this not using a smaller threshold for MinSize attribute seems unintentional)

lib/Transforms/IPO/InlineSimple.cpp
57 ↗(On Diff #42359)

Why? the Threshold could be either the default one or based on opt levels. This is the callsite independent threshold which is modified by getInlineCost

eraman updated this revision to Diff 42464.Dec 10 2015, 2:31 PM
eraman marked 2 inline comments as done.

Addressed reviewer comments (mainly variable renaming, documentation and making some functions static member functions)

Threshold is used in inline cost analysis to bail out early for compile time reasons, so it is an implementation detail that should not be exposed at interface level. Currently it is needed to be passed because the DefaultThreshold (aka callSiteIndependent Threshold) depends on opt level, sizeOptLevel which ICA does not have access to.

The solution is pretty simple: just pass the optlevel to ICA when ICA is created. If Threshold does need to be passed for some reason (which I doubt) please rename such parameters (including the member field in SimpleInliner) to be DefaultThreshold (with comment it depends on opt level) to avoid confusion. CallsiteIndependentThreshold is too long a name to use.

include/llvm/Analysis/InlineCost.h
145 ↗(On Diff #42359)

SizeOptLevel == 1 maps to -Os, while SizeOptLevel == 2 maps to -Oz. In fact, FE also sets OptimizeForSize and MinSize function attribute when the command line option is on:

if (!HasOptnone) {

  if (CodeGenOpts.OptimizeSize)
    FuncAttrs.addAttribute(llvm::Attribute::OptimizeForSize);
  if (CodeGenOpts.OptimizeSize == 2)
    FuncAttrs.addAttribute(llvm::Attribute::MinSize);
}

This means inliner SizeOptLevel is not needed here.

lib/Transforms/IPO/InlineSimple.cpp
57 ↗(On Diff #42359)

Ok -- I see that the getInlineThreshold call is pushed into getInlineCost method so this is correct.

davidxl added inline comments.Dec 11 2015, 10:41 PM
lib/Analysis/InlineCost.cpp
1414 ↗(On Diff #42464)

sounds reasonable.

1414 ↗(On Diff #42464)

I thought the threshold was the opt level independent one -- but looks like not. See my previous comments -- we should need to pass any threshold.

Threshold is used in inline cost analysis to bail out early for compile time reasons, so it is an implementation detail that should not be exposed at interface level. Currently it is needed to be passed because the DefaultThreshold (aka callSiteIndependent Threshold) depends on opt level, sizeOptLevel which ICA does not have access to.

The solution is pretty simple: just pass the optlevel to ICA when ICA is created. If Threshold does need to be passed for some reason (which I doubt) please rename such parameters (including the member field in SimpleInliner) to be DefaultThreshold (with comment it depends on opt level) to avoid confusion. CallsiteIndependentThreshold is too long a name to use.

SimpleInliner instances are created through createFunctionInliningPass. There are three overloads of this method. One passes the size and opt level, but the other two either explicitly passes the threshold (this is used in bugpoint) or uses a default threshold. So there is no way to avoid passing the threshold. I will do the renaming.

include/llvm/Analysis/InlineCost.h
153 ↗(On Diff #42464)

Unfortunately, there is subtle difference in how the threshold is computed for -Os and OptimizeForSize attribute. If -Os is specified, the default threshold is lowered to 75. This is independent of whether we pass -inline-threshold or not. However, if -Os is not specified and the caller function has OptimizeForSize attribute, the threshold is lowered only if -inline-threshold is not specified.

Additionally if -Oz is specified, the threshold is lowered to 25, but the getInlineThreshold does not check for MinSize attribute at all.

Clearly, this needs to be fixed to give a consistent behavior but I think should be done separately.

junbuml added inline comments.Dec 15 2015, 11:15 AM
lib/Analysis/InlineCost.cpp
1444 ↗(On Diff #42464)

It appears that here we check only fn attributes for Callee, but I think it's also possible for a CS can have its own attributes. Shouldn't we check CS.hasFnAttr(Attribute::Cold) or CS.hasFnAttr(Attribute::InlineHint) as well? E.g., a Callee itself is not Cold, but a CS to the Callee is Cold. If yes, may be a separate patch.

davidxl added inline comments.Dec 15 2015, 12:42 PM
lib/Analysis/InlineCost.cpp
1444 ↗(On Diff #42464)

This is a refactoring patch, so what you suggested can be done in a separate patch.

eraman updated this revision to Diff 42907.Dec 15 2015, 2:03 PM

Renamed CallSiteIndependentThreshold to DefaultThreshold and fixed some comments.

chandlerc edited edge metadata.Dec 15 2015, 2:28 PM

I generally like the direction here, but I think you should actually go further than this.

I think you should change the input to the cost analysis from a numeric threshold to a symbolic enum that selects between the high-level "kinds" of thresholds to use. Then you can internalize all of the logic inside of the cost analysis.

Further, you should feel relatively free to change the behavior of the '-inline-threshold' flag. That flag is primarily a debugging aid. We don't want to flagrantly change its behavior (as that would just be annoying), we don't have any firm or hard contract around exactly how that flag is interpreted. Does that make sense?

I generally like the direction here, but I think you should actually go further than this.

I think you should change the input to the cost analysis from a numeric threshold to a symbolic enum that selects between the high-level "kinds" of thresholds to use. Then you can internalize all of the logic inside of the cost analysis.

Further, you should feel relatively free to change the behavior of the '-inline-threshold' flag. That flag is primarily a debugging aid. We don't want to flagrantly change its behavior (as that would just be annoying), we don't have any firm or hard contract around exactly how that flag is interpreted. Does that make sense?

Passing an absolute number in inline-threshold is useful in writing tests. How about passing a symbolic enum to the getInlineCost, modify it based on properties of callsite but finally override it with -inline-threshold (or its variants) if it is explicitly passed. Does this sound reasonable?

I generally like the direction here, but I think you should actually go further than this.

I think you should change the input to the cost analysis from a numeric threshold to a symbolic enum that selects between the high-level "kinds" of thresholds to use. Then you can internalize all of the logic inside of the cost analysis.

Further, you should feel relatively free to change the behavior of the '-inline-threshold' flag. That flag is primarily a debugging aid. We don't want to flagrantly change its behavior (as that would just be annoying), we don't have any firm or hard contract around exactly how that flag is interpreted. Does that make sense?

Passing an absolute number in inline-threshold is useful in writing tests. How about passing a symbolic enum to the getInlineCost, modify it based on properties of callsite but finally override it with -inline-threshold (or its variants) if it is explicitly passed. Does this sound reasonable?

I'm just suggesting changing the *APIs* to deal in symbolic settings. Clearly the flag will be numeric. I'm suggested sinking the inline threshold flag into the inline cost analysis completely.

I think we should do what Chandler suggested as a follow up. This patch IMO is good enough as an incremental step. We should move a little faster as there are also other cleanups mentioned in the review (Os command line vs attribute, Cold Function attribute and callsite attribute) needed to to be done.

I'm just suggesting changing the *APIs* to deal in symbolic settings. Clearly the flag will be numeric. I'm suggested sinking the inline threshold flag into the inline cost analysis completely.

I attempted to do that and hit an issue. InlineSimple.cpp provides a createFunctionInliningPass((int Threshold) API. To sink thresholds to InlineCost, this needs to be removed, but this is called by LLVMPassManagerBuilderUseInlinerWithThreshold which is exposed by the llvm-c API

I'm just suggesting changing the *APIs* to deal in symbolic settings. Clearly the flag will be numeric. I'm suggested sinking the inline threshold flag into the inline cost analysis completely.

I attempted to do that and hit an issue. InlineSimple.cpp provides a createFunctionInliningPass((int Threshold) API. To sink thresholds to InlineCost, this needs to be removed, but this is called by LLVMPassManagerBuilderUseInlinerWithThreshold which is exposed by the llvm-c API

I had to spend a bunch of time thinking about this.

On one hand, I think exposing this kind of configurability is really frustrating from an API-design perspective. But I think I can imagine users of LLVM (particularly library users) wanting to have pretty wildly different inlining tolerances. However, this raises an important question of how that should be propagated to when the cost analysis has to recurse across yet another function call.

I think we need to move *completely* away from having different *initial* thresholds for things like inline-hint and opt-size or min-size. We have numerous adjustments to the threshold based on different analyses properties. I think inline-hint and size based stuff should work the same way. This will let you just sink the capping and ballooning of the threshold into analyzeCall where we also compute all the bonuses. Does that make sense? It should also avoid the need to separately call 'getInlineThreshold' -- you'll just store and pass along the initial threshold.

I'm just suggesting changing the *APIs* to deal in symbolic settings. Clearly the flag will be numeric. I'm suggested sinking the inline threshold flag into the inline cost analysis completely.

I attempted to do that and hit an issue. InlineSimple.cpp provides a createFunctionInliningPass((int Threshold) API. To sink thresholds to InlineCost, this needs to be removed, but this is called by LLVMPassManagerBuilderUseInlinerWithThreshold which is exposed by the llvm-c API

I had to spend a bunch of time thinking about this.

On one hand, I think exposing this kind of configurability is really frustrating from an API-design perspective. But I think I can imagine users of LLVM (particularly library users) wanting to have pretty wildly different inlining tolerances. However, this raises an important question of how that should be propagated to when the cost analysis has to recurse across yet another function call.

I think we need to move *completely* away from having different *initial* thresholds for things like inline-hint and opt-size or min-size. We have numerous adjustments to the threshold based on different analyses properties. I think inline-hint and size based stuff should work the same way. This will let you just sink the capping and ballooning of the threshold into analyzeCall where we also compute all the bonuses. Does that make sense? It should also avoid the need to separately call 'getInlineThreshold' -- you'll just store and pass along the initial threshold.

One thing that may make it much easier to innact this refactoring in a simple way would be to first do another much over-due refactoring to make InlineCostAnalysis not *actually* be an analysis *pass*. The only "analysis" done is to capture two pointers. Instead, this should just be a utility class and method that the inliner pass constructs (passing in the relevant bits) and uses.

If it would be helpful, I can send you a patch that does that, but I don't want to make merging and such more complicated if it makes more sense for you to do this on your end.

eraman updated this revision to Diff 43710.Dec 28 2015, 5:21 PM
eraman edited edge metadata.
eraman removed rL LLVM as the repository for this revision.

Moved the code that updates threshold based on callsite properties to CallAnalyzer. This also merges in the changes made in http://reviews.llvm.org/D15245

This version looks very good -- the only thing I dislike is that we still need to pass default threshold around via getInlineCost interface -- I think we can deal with that in the future.

davidxl accepted this revision.Jan 12 2016, 11:43 AM
davidxl added a reviewer: davidxl.

Sorry about the delay!

This patch is NFC cleanup with general direction agreed upon. The version LGTM (with the documentation nit). Please follow up post-commit if there are more issues to be handled.

lib/Transforms/IPO/InlineSimple.cpp
41 ↗(On Diff #43710)

Document this field. The DefaultThreshold can either be the default threshold specified by -inline-threshold, or the opt level dependent default threshold or a user specified value.

This revision is now accepted and ready to land.Jan 12 2016, 11:43 AM
davidxl added inline comments.Jan 12 2016, 12:39 PM
lib/Analysis/InlineCost.cpp
197 ↗(On Diff #43710)

A related follow up that can be done:

  1. change the parameter name to DefaultThreshold
  2. Save the default threshold value in a different member so that it is not overriden by updatethreshold
  3. Use the default threshold when recurisively analyizing the indirect call (instead of a constant).

(Not required for this patch).

Chandler, if you don't have any other comments, I'll check this in tomorrow.

lib/Analysis/InlineCost.cpp
197 ↗(On Diff #43710)

Ok, will address these in a separate patch.

This revision was automatically updated to reflect the committed changes.