This is an archive of the discontinued LLVM Phabricator instance.

[InlineAdvisor] Add -inline-replay-scope=<Function|Module> to control replay scope
ClosedPublic

Authored by modimo on Sep 28 2021, 2:25 PM.

Details

Summary

The goal is to allow grafting an inline tree from Clang or GCC into a new compilation without affecting other functions. For GCC, we're doing this by extracting the inline tree from dwarf information and generating the equivalent remarks.

This allows easier side-by-side asm analysis and a trial way to see if a particular inlining setup provides benefits by itself.

Testing:
ninja check-all

Diff Detail

Event Timeline

modimo created this revision.Sep 28 2021, 2:25 PM
modimo requested review of this revision.Sep 28 2021, 2:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2021, 2:25 PM
modimo edited the summary of this revision. (Show Details)Sep 28 2021, 2:30 PM
modimo added reviewers: wenlei, mtrofin.
modimo added inline comments.Sep 28 2021, 2:34 PM
llvm/lib/Transforms/IPO/Inliner.cpp
835

Since getAdvice returns a unique_ptr, returning a null seems like a good way to indicate "no advice". Does that also make sense for the ML inliner @mtrofin?

mtrofin added inline comments.Sep 28 2021, 2:39 PM
llvm/lib/Transforms/IPO/Inliner.cpp
835

The design intent was for the advice to be clear, i.e. either inline or not. You probably want to delegate the decision to some other advisor if you don't have one? i.e. the remarks advisor could delegate to the default one - and return a non-null Advice. WDYT?

modimo added inline comments.Sep 28 2021, 2:51 PM
llvm/lib/Transforms/IPO/Inliner.cpp
835

It does delegate it in the CGSCC case because we have a nicely nested system with the default one taking over.

Unfortunately the SampleProfile inliner currently isn't using the InlineAdvisor setup. That being said, perhaps the better approach is to adapt the SampleProfile inliner to an InlineAdvisor so the replay advisor will always return a non-null.

modimo updated this revision to Diff 375744.Sep 28 2021, 5:05 PM

Formatting

modimo marked an inline comment as not done.Sep 28 2021, 5:13 PM
modimo added inline comments.
llvm/lib/Transforms/IPO/Inliner.cpp
835

The SampleProfile inliner needs additional information beyond CallBase to make its decision. Namely, InlineCandidate which contains sampling information. Does it make sense to extend getAdvice to take additional information?

mtrofin added inline comments.Sep 29 2021, 8:16 AM
llvm/lib/Analysis/ReplayInlineAdvisor.cpp
72

nit: probably no need for curly brackets.

llvm/lib/Transforms/IPO/Inliner.cpp
835

I don't follow why, but to the scope of this patch, couldn't the ReplayInlineAdvisor return an Advice that says "no" when it has no advice to provide?

modimo marked an inline comment as not done.Sep 29 2021, 1:15 PM
modimo added inline comments.
llvm/lib/Transforms/IPO/Inliner.cpp
835

In SampleProfile, it needs 3 states:

  1. Yes when it matches replay
  2. No when it doesn't match replay
  3. HasNoAdvice when in strict mode and we want the SampleProfile inliner to make the decision

ATM there's only (1) and (2), I'm using null to represent (3). An alternative could be to have a HasNoAdvice state that maps to no inline but can be queried to differentiate between (2) and (3)

mtrofin added inline comments.Sep 29 2021, 1:57 PM
llvm/lib/Transforms/IPO/Inliner.cpp
835

I think I see now. OK, let's allow a null return then from getAdvice; to answer your original question, I don't think it affects advisors that want to be categorical (i.e. an advisor can always return yes/no), but it affects consumers, and your patch handles that. The ML stuff doesn't consume, it just implements the advisor interfaces.

We can explore afterwards if/how to adapt SampleProfile to an advisor design; we'd definitely not want to introduce that InlineCandidate in the advisor interface, the former is too specific to the sample profiler.

Could you leave a comment on the getAdvice() abstract API re. the fact that it can return nothing, a pointer to this review, and a TODO that maybe we can tighten it?

Thanks!

modimo updated this revision to Diff 376041.Sep 29 2021, 2:27 PM

Add comments on getAdvice abstract API. Remove curly brackets in ReplayInlineAdvisor destructor

mtrofin accepted this revision.Sep 29 2021, 2:31 PM
This revision is now accepted and ready to land.Sep 29 2021, 2:31 PM

Thanks for working on this. Though perhaps there's some misunderstanding on what "strict" mode means (sorry I should have clarified more beforehand). In the original inline replay patch (https://reviews.llvm.org/D83743), I mentioned strict/positive/negative mode.

The only mode we have before and after this patch is the strict mode. "strict" means existing remarks indicate positive decision, and the rest are all considered negative decision. "positive" means existing remarks indicate positive decision, but the rest is left to other heuristics (or fall back advisors). Accordingly, negative means the input only specifies negative decision.

The change is limiting the scope of replay to function, as opposed to module or whole program. A powerful inline replay framework would support scope={module|function} x mode={strict|positive|negative}.

So I would suggest adding a flag -inline-replay-scope=[function|module] (with a default value), instead of -inline-replay-strict. Later on if we add positive/negative mode, we could add flag -inline-replay-mode=[strict|positive|negative], but for now we don't need it. Thoughts?

llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
43

nit: InlineCallersFromRemarks -> InlinerFromRemarks

llvm/lib/Analysis/InlineAdvisor.cpp
50

Sample loader replay was independent of cgscc replay as they use different switches for taking different inputs. Now the "mode" switch is global, so both replays are tied to a single mode even if they use different input files. I think it's more flexible if they are still kept independent (the hot/cold thresholds etc are all kept separate for the two inliners).

llvm/lib/Analysis/ReplayInlineAdvisor.cpp
46

nits:

  • Pair.first.split(" inlined into ") is duplicated string search and can be merged.
  • If you do Pair.first.split("' inlined into '"), the drop_back() and drop_front() can be removed, and it helps readability..
52

Pair.second.drop_back() works? or you expect something after ;?

54

Check Caller.epmty() too?

I would also add a warning if any of the three is empty - invalid remarks, can't be replayed etc.. instead of silently swallow it.

68–71

I'm not sure whether printing such debug dumps from dtor is a good idea. Lifetime of objects could potentially be somewhat arbitrary, but logs better be printed in more controlled/organized fashion. Looking at the test case, it also seems like these prints are added for testing purpose? See the comments in the test case - we better avoid using debug prints for testing.

81–82

Perhaps we can remove this for simplicity. The shortcut isn't helping much.

120

nit:

if (!Strict || InlineCallersFromRemarks.count(CB.getFunction()->getName())) {
  ...
}
else if (Strict) {
  if (OriginalAdvisor)
    return OriginalAdvisor->getAdvice(CB);
  return {};
}

This way we don't need a lambda - since it's only called once it can be inlined.

llvm/lib/Transforms/IPO/Inliner.cpp
835

@mtrofin I added InlineCandidate stuff in sample loader in https://reviews.llvm.org/D94001 and https://reviews.llvm.org/D95024, but the use of extra info for inline decision predates these changes. As to why we can't just use CallBase there - the inlining happens before profile is annotated on IR as branch metadata (it's the sample *loader*), so none of the BPI, BFI stuff is available, hence we have to use the raw context profile representation alongside CallBase to make inline decisions.

I agree that the InlineCandidate there is very specific and not good for advisor API. However I think having three state yes/no/unknown is more flexible, and some advisors can still choose to not use the unknown state. We can use NULL for unknown, or a different representation.

You probably want to delegate the decision to some other advisor if you don't have one

Conceptually there's a local "unknown" for some advisor already. Forcing a yes/no decision and not having a representation for unknown makes the inline advisor system "closed" instead of "open" in the sense that everything needs to reach a decision within the advisor hierarchy without any external help. In ideal world, it's clean, and sure we can try to convert everything into the system, but practically it might be too restrictive?

llvm/test/Transforms/Inline/cgscc-inline-replay.ll
21

This would fail ninja check with release builds. Usually we try to avoid testing that requires debug builds unless it's critical and no other way to observe.

mtrofin added inline comments.Oct 1 2021, 7:02 AM
llvm/lib/Transforms/IPO/Inliner.cpp
835

Maybe. Right now though I think a good chunk of the discussion stems from the fact that SampleProfile does 2 things in one: populates profile info; and inlines. We should probably split this, but that's both very much out of the scope of this patch, and orthogonal to the tri/bi-state advisor topic.

modimo updated this revision to Diff 378885.Oct 11 2021, 11:17 PM
modimo marked 6 inline comments as done.

Address feedback and change over to replay-scope

Thanks for working on this. Though perhaps there's some misunderstanding on what "strict" mode means (sorry I should have clarified more beforehand). In the original inline replay patch (https://reviews.llvm.org/D83743), I mentioned strict/positive/negative mode.

The only mode we have before and after this patch is the strict mode. "strict" means existing remarks indicate positive decision, and the rest are all considered negative decision. "positive" means existing remarks indicate positive decision, but the rest is left to other heuristics (or fall back advisors). Accordingly, negative means the input only specifies negative decision.

The change is limiting the scope of replay to function, as opposed to module or whole program. A powerful inline replay framework would support scope={module|function} x mode={strict|positive|negative}.

So I would suggest adding a flag -inline-replay-scope=[function|module] (with a default value), instead of -inline-replay-strict. Later on if we add positive/negative mode, we could add flag -inline-replay-mode=[strict|positive|negative], but for now we don't need it. Thoughts?

I think using scope to capture the functionality here makes sense. For your definition of strict I see it as modifier to positive/negative. A strict+negative replay would not inline the remarks present but will attempt to inline everything else and strict+positive would inline the remarks but not inline anything else. Non-strict+negative would make sure the remarks are not inlined but defer the rest of the decisions to the original advisor and non-strict+positive will inline the remarks but defer the rest to the original advisor.

Strict+negative is useful because one way we can capture replay is by looking at all call-sites that remain in the assembly and eliminate all the rest (i.e. making sure they're inlined) which has the advantage that inlined sites no longer present in the assembly (e.g. simple accessors whose bodies completely get folded) will properly be inlined while in strict+positive mode because the dwarf information is gone no remarks are generated and the original call remains.

llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
43

This is to indicate which callers should have replay enforced in function scope, don't think InlinerFromRemarks is enough to capture that.

llvm/lib/Analysis/InlineAdvisor.cpp
50

Makes sense, split

llvm/lib/Analysis/ReplayInlineAdvisor.cpp
52

If the remarks are generated via -Rpass there can be more after:

inline.cpp:8:12: remark: _Z3foov inlined into main with (cost=0, threshold=375) at callsite main:2:12; [-Rpass=inline]
    return foo();
68–71

This was originally intended as a way to easily check which remark was not processed to give a hint to the user that an unexpected inlining difference could be caused by it. In practice I haven't found it to be too useful because other reasons more often led to replay mismatching:

  1. Source line mismatch between GCC and LLVM, seen on destructor placement
  2. Pre-inline IR difference, GCC performs tail recursion elimination pre-inline which changes what call-sites remain post-inline compared to Clang.

So these end up being red herrings. Removed.

120

I like it, changed.

llvm/test/Transforms/Inline/cgscc-inline-replay.ll
21

Removed

Thanks for working on this. Though perhaps there's some misunderstanding on what "strict" mode means (sorry I should have clarified more beforehand). In the original inline replay patch (https://reviews.llvm.org/D83743), I mentioned strict/positive/negative mode.

The only mode we have before and after this patch is the strict mode. "strict" means existing remarks indicate positive decision, and the rest are all considered negative decision. "positive" means existing remarks indicate positive decision, but the rest is left to other heuristics (or fall back advisors). Accordingly, negative means the input only specifies negative decision.

The change is limiting the scope of replay to function, as opposed to module or whole program. A powerful inline replay framework would support scope={module|function} x mode={strict|positive|negative}.

So I would suggest adding a flag -inline-replay-scope=[function|module] (with a default value), instead of -inline-replay-strict. Later on if we add positive/negative mode, we could add flag -inline-replay-mode=[strict|positive|negative], but for now we don't need it. Thoughts?

I think using scope to capture the functionality here makes sense. For your definition of strict I see it as modifier to positive/negative. A strict+negative replay would not inline the remarks present but will attempt to inline everything else and strict+positive would inline the remarks but not inline anything else. Non-strict+negative would make sure the remarks are not inlined but defer the rest of the decisions to the original advisor and non-strict+positive will inline the remarks but defer the rest to the original advisor.

Strict+negative is useful because one way we can capture replay is by looking at all call-sites that remain in the assembly and eliminate all the rest (i.e. making sure they're inlined) which has the advantage that inlined sites no longer present in the assembly (e.g. simple accessors whose bodies completely get folded) will properly be inlined while in strict+positive mode because the dwarf information is gone no remarks are generated and the original call remains.

Using strict as modifier would make it even more flexible. I didn't think strict+negative would be useful, so didn't give it much thought. Actually it's a bit weird to represent negative decision using the same remark format (... inlined into ...) but relying on flag to tell positive vs negative, perhaps negative-ness can be represented with the remark line using ... not inlined into ... (same as missed remarks). This eliminates the need to tell positive/negative from command line and also enable mixing of positive and negative within a file (for non-strict mode). But we can defer that discussion/decision.

llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
43

What I meant is Inliner is equivalent to InlineCaller, but more of a concise and canonical term. I guess you intend to emphasize on top level inliner, but InlinerCaller doesn't really carry that info either. Not a big deal though.

llvm/lib/Transforms/IPO/Inliner.cpp
103

Please use enum directly to avoid parsing strings and error checking in ReplayInlineAdvisor's ctor. Search for cl::opt<SampleProfileFormat> OutputFormat or cl::opt<enum PassDebugLevel>PassDebugging. Description needs to be updated to be per value too.

llvm/test/Transforms/SampleProfile/inline-replay.ll
15

inline-replay-strict.txt does't seem to be included here? And let's rename it accordingly.

modimo updated this revision to Diff 379235.Oct 12 2021, 5:45 PM
modimo marked an inline comment as done.

Review feedback. InlineCallersFromRemarks->CallersToReplay. Refactor creating a ReplayInlineAdvisor to use getReplayInlineAdvisor.

Thanks for working on this. Though perhaps there's some misunderstanding on what "strict" mode means (sorry I should have clarified more beforehand). In the original inline replay patch (https://reviews.llvm.org/D83743), I mentioned strict/positive/negative mode.

The only mode we have before and after this patch is the strict mode. "strict" means existing remarks indicate positive decision, and the rest are all considered negative decision. "positive" means existing remarks indicate positive decision, but the rest is left to other heuristics (or fall back advisors). Accordingly, negative means the input only specifies negative decision.

The change is limiting the scope of replay to function, as opposed to module or whole program. A powerful inline replay framework would support scope={module|function} x mode={strict|positive|negative}.

So I would suggest adding a flag -inline-replay-scope=[function|module] (with a default value), instead of -inline-replay-strict. Later on if we add positive/negative mode, we could add flag -inline-replay-mode=[strict|positive|negative], but for now we don't need it. Thoughts?

I think using scope to capture the functionality here makes sense. For your definition of strict I see it as modifier to positive/negative. A strict+negative replay would not inline the remarks present but will attempt to inline everything else and strict+positive would inline the remarks but not inline anything else. Non-strict+negative would make sure the remarks are not inlined but defer the rest of the decisions to the original advisor and non-strict+positive will inline the remarks but defer the rest to the original advisor.

Strict+negative is useful because one way we can capture replay is by looking at all call-sites that remain in the assembly and eliminate all the rest (i.e. making sure they're inlined) which has the advantage that inlined sites no longer present in the assembly (e.g. simple accessors whose bodies completely get folded) will properly be inlined while in strict+positive mode because the dwarf information is gone no remarks are generated and the original call remains.

Using strict as modifier would make it even more flexible. I didn't think strict+negative would be useful, so didn't give it much thought. Actually it's a bit weird to represent negative decision using the same remark format (... inlined into ...) but relying on flag to tell positive vs negative, perhaps negative-ness can be represented with the remark line using ... not inlined into ... (same as missed remarks). This eliminates the need to tell positive/negative from command line and also enable mixing of positive and negative within a file (for non-strict mode). But we can defer that discussion/decision.

Looks like the remarks format for not inlining is "will not be inlined into" (see OptimizationRemarkMissed in Inliner.cpp) which can be used to distinguish negative decisions. That being said, negative mode itself is different in that it assumes all other legal sites will be inlined. Remark format is for when replay has a decision, mode format is what happens to all the other inline sites.

Negative mode only really works in function-scope since module scope would likely lead to never-ending compilation. In my testing, however, it has an opportunity to be more faithful to replay a full inline tree than positive/strict since the outcome is directly going after having identical call-sites at the end. Positive can deviate since non-matching sites get heuristics applied and strict loses information if no assembly instructions remain detailing that an inlining has occurred. Negative OTOH will leave all call sites that existed earlier intact while generally ensuring every other call site gets inlined since the call didn't exist in inlining that generated the replay. Regardless, separate discussion for separate patch.

llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
43

Hmm maybe CallersToReplay? The key point is that this is a list of callers to replay on.

llvm/lib/Transforms/IPO/Inliner.cpp
103

Ah nice, I was looking for functionality like that but couldn't find it. Changed.

@mtrofin Adding the enum ReplayInlineScope revealed that I was adding #include "llvm/Analysis/ReplayInlineAdvisor.h" to InlineAdvisor.cpp which seems wrong from a dependency perspective. Looking at how the other advisors interact it seems the interface is to declare a getReplayInlineAdvisor in InlineAdvisor.h similar to getReleaseModeAdvisor and also define ReplayInlineScope there so including InlineAdvisor.h by itself is enough to be able to create a ReplayInlineAdvisor. Does this look right?

wenlei accepted this revision.Oct 12 2021, 6:03 PM

lgtm, thanks.

That being said, negative mode itself is different in that it assumes all other legal sites will be inlined.

Depends on how we define it. This is how I think about it now: in non-strict mode, positive/negative means do or don't do certain inlining, but leave the rest to default heuristic. With strict mode, anything unspecific gets the opposite replay decision (not inline for positive, inline for negative). With that definition, negative is no different in that it can have both strict and non-strict mode.

Negative mode only really works in function-scope since module scope would likely lead to never-ending compilation.

I think negative + strict only works in function-scope; but negative + non-strict should be fine. On the other hand, people can spell out all negative decision in input, same as positive ones, then negative + strict works for whole program too.

llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
43

sounds good.

modimo added a comment.EditedOct 12 2021, 6:28 PM

lgtm, thanks.

That being said, negative mode itself is different in that it assumes all other legal sites will be inlined.

Depends on how we define it. This is how I think about it now: in non-strict mode, positive/negative means do or don't do certain inlining, but leave the rest to default heuristic. With strict mode, anything unspecific gets the opposite replay decision (not inline for positive, inline for negative). With that definition, negative is no different in that it can have both strict and non-strict mode.

Ah, I'm thinking ""will not be inlined into" vs. "inlined into" in the remarks is the differentiation for positive/negative as you suggested before. Thinking further, in that setup only strict-negative and strict-positive would exist as modifiers. If implemented we would drop the strict naming completely: unspecified would mean leave it to the heuristic and positive/negative would mean always not inline/inline call sites without remarks associated with them.

Negative mode only really works in function-scope since module scope would likely lead to never-ending compilation.

I think negative + strict only works in function-scope; but negative + non-strict should be fine. On the other hand, people can spell out all negative decision in input, same as positive ones, then negative + strict works for whole program too.

True, what I described is actually strict-negative mode under your definition.

modimo retitled this revision from [InlineAdvisor] Add -inline-replay-strict to replay inline decisions only in callers that have remarks to [InlineAdvisor] Add -inline-replay-scope=<Function|Module> to control replay scope.Oct 13 2021, 4:54 PM
modimo updated this revision to Diff 379555.Oct 13 2021, 4:56 PM

Fix up description so to be consistent in that Function scope is default.

This revision was landed with ongoing or failed builds.Oct 18 2021, 1:09 PM
This revision was automatically updated to reflect the committed changes.