This is an archive of the discontinued LLVM Phabricator instance.

[InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks
ClosedPublic

Authored by modimo on Jan 8 2021, 12:36 PM.

Details

Summary

This change leverages the work done in D83743 to replay in the SampleProfile inliner to also be used in the CGSCC inliner. NOTE: currently restricted to non-ML advisors only.

The added switch -cgscc-inline-replay=<remarks file> will replay the inlining decisions in that file where the remarks file is generated via -Rpass=inline. The aim here is to make it easier to analyze changes that would modify inlining heuristics to be separated from this behavior. Doing so allows easier examination of assembly and runtime behavior compared to the baseline rather than trying to dig through the large churn caused by inlining.

In LTO compilation, since inlining is done twice you can separately specify replay by passing the flag to the FE (-cgscc-inline-replay=) and to the linker (-Wl,cgscc-inline-replay=) with the remarks generated from their respective places.

Testing on mysqld by comparing the inline decisions between base (generates remarks.txt) and diff (replay using identical input/tools with remarks.txt) and examining the inlining sites with diff shows 14,000 mismatches out of 247,341 for a ~94% replay accuracy. I believe this gap can be narrowed further though for the general case we may never achieve full accuracy. For my personal use, this is close enough to be representative: I set the baseline as the one generated by the replay on identical input/toolset and compare that to my modified input/toolset using the same replay.

Testing:
ninja check-llvm
newly added test correctly replays CGSCC inlining decisions

Diff Detail

Event Timeline

modimo created this revision.Jan 8 2021, 12:36 PM
modimo requested review of this revision.Jan 8 2021, 12:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 8 2021, 12:36 PM
modimo retitled this revision from cgscc replay to [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks.Jan 8 2021, 1:39 PM
modimo edited the summary of this revision. (Show Details)
modimo added reviewers: mtrofin, wenlei, wmi.
modimo added a reviewer: davidxl.
wenlei added inline comments.Jan 8 2021, 2:17 PM
llvm/lib/Analysis/ReplayInlineAdvisor.cpp
61

This looks redundant/similar to DefaultInlineAdvice, is that just for controlling EmitRemarks? ORE should be able to handle remark printing (or not) correctly without extra guard.

mtrofin added inline comments.Jan 8 2021, 2:28 PM
llvm/lib/Analysis/ReplayInlineAdvisor.cpp
61

Along the same lines as @wenlei 's comment - if the Advisor can both generate and digest the trace of decisions, why rely on ORE and not, instead, use a more structured format that wouldn't need parsing like ReplayInlineAdvisor.cpp:43?

Also (if using ORE is desirable, case in which I share @wenlei's question), I think there's a yaml output format ORE generates, perhaps requiring that as input would also simplify ingestion?

modimo updated this revision to Diff 315559.Jan 8 2021, 5:45 PM
modimo edited the summary of this revision. (Show Details)

Move the ReplayInlineAdvisor.cpp/h and SampleProfile.cpp files to D94333 as they need to be atomic with the remarks format change.

modimo edited the summary of this revision. (Show Details)Jan 8 2021, 6:15 PM
modimo added inline comments.Jan 8 2021, 6:29 PM
llvm/lib/Analysis/ReplayInlineAdvisor.cpp
61

I've moved this section to D94333 since the replay mechanism change to consume the new line:col.discriminator format needed to be together with format change. I've folded ReplayInlineAdvice back into DefaultInlineAdvice with the additional features I need. The extra guard is needed because the SampleProfile inliner uses the "legacy PM" mechanism of inline printing rather than bundling it with InlineAdvice calls. Since the use of InlineAdvice in SampleProfile is purely to support replay right now I'm leaving that refactoring (if we want to go after it) for the future.

As far as using yaml I like how condensed the format is in remarks form. Something that's a single line in remarks ends up as 24 lines (like in llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll) which makes manual reading and modification tedious especially on larger binaries. The current text processing is also fairly simple as is which makes a change here less pressing.

That being said I'm not against using the yaml file as the official/supported format. A nice advantage there is that if we wanted to add more replay data (say negative inline decisions) it'll be smoother in yaml than adding new parsing of the text remark.

wenlei added inline comments.Jan 12 2021, 9:07 PM
llvm/lib/Transforms/IPO/Inliner.cpp
644

Plug in replay inline advisor here isn't extensible. In the future we want to be able to use inline replay only for a specific function, or enforce/prevent certain inlining at particular callsite, and fall back to regular advisor for the rest (see comments in D83743).

That means we would need to be able to fall back from replay advisor to default advisor (or whatever main advisor being used) when replay advisor doesn't have info. For that cascaded model, we would need inline advise to have something like hasInlineRecommendation in addition to isInliningRecommended. We should probably still record inlining on each advice, but don't want to emit duplicated remarks from each advice. These changes can come later, but current change better offer that flexibility - we don't to stick to replay advisor for the entire module inliner pass.

llvm/test/Transforms/Inline/cgscc-inline-replay.ll
5

Better verify inline decision without replay first, to make sure the replay has visible impact on inlining. See DEFAULT and REPLAY check for SampleProfile/inline-replay.ll.

modimo edited the summary of this revision. (Show Details)Jan 14 2021, 3:00 PM
modimo added inline comments.Jan 14 2021, 3:15 PM
llvm/lib/Transforms/IPO/Inliner.cpp
644

I think your suggested change is to initialize both Advisors and allow fallback if we defer on one. With how the current scheme is setup up though advisors are single entities and we only ask it once:
auto Advice = Advisor.getAdvice(*CB);
A proven approach to doing something like this is with alias analysis where we query AA in a specific order until we hit a real recommendation. That would cause a rehaul of this function rather than a small tweak so I don't see a good intermediate step to take for this patch. Let me know if you think there's something to do here now for it.

llvm/test/Transforms/Inline/cgscc-inline-replay.ll
5

Makes sense, added.

modimo updated this revision to Diff 316796.Jan 14 2021, 3:20 PM

Add DEFAULT testing to make sure baseline inlining differs from replay. Fix copy-paste error in flag description for -cgscc-inline-replay

modimo marked an inline comment as not done.Jan 15 2021, 11:56 AM
modimo added inline comments.
llvm/lib/Analysis/ReplayInlineAdvisor.cpp
61

Along the same lines as @wenlei 's comment - if the Advisor can both generate and digest the trace of decisions, why rely on ORE and not, instead, use a more structured format that wouldn't need parsing like ReplayInlineAdvisor.cpp:43?

@mtrofin I very much agree on this point. Personal front-runners for me is a CSV file which gives you line density but makes parsing easy or a tree format based on what @wenlei shows in D82213:

Inlinees for main
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 369
[P]  _Z7random1i @ 363
[C]    _Z8myrandomv @ 2
[P]  _Z7random1i @ 364
[C]    _Z8myrandomv @ 2
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 366
[P]  _ZN6wayobj9createwayEiiiiRP8point16tRi @ 327
[P]    _ZN6wayobj11createwayarEiiRP8point16tRi @ 37.1
[P]      _ZN6wayobj5indexEii @ 143
[P]      _ZN6wayobj5indexEii @ 130
[P]      _ZN6wayobj6indexxEi @ 31
[P]      _ZN6wayobj6indexyEi @ 32
[C]      _ZN8point16tC2Ess @ 2
[C]      _ZN8point16tC2Ess @ 2.1

I do want to see what users think about the current flow that's currently the same between CGSCC and sample inliner because there's definitely more refinements (additional replay accuracy, more logging, global allow-list/block-list etc.) that can be pursued but which I don't have a sense of value/priority for. I'm hoping that'll give us more information on what path to pursue here.

wenlei added inline comments.Jan 18 2021, 11:34 PM
llvm/lib/Transforms/IPO/Inliner.cpp
644

Well, we don't have to have everything setup for the cascaded query like how AA works, but something more flexible than having entire inliner sticking to one advisor would be good (and does not seem like a significant change)

What I was thinking about is that the main advisor can still go through getAdvisor interface, then for inline replay, we can just let ModuleInlinerWrapperPass own an ExternalInlineAdvisor just like how SampleProfileLoader owns one. Then it can be passed to InlinerPass and serve as a short-circuit look up or side look up when available in addition to the main advisor from getAdvisor.

The changes to add hasInlineRecommendation etc are not what I'm suggesting for this patch though I don't think these are significant either. It can evolve into cascaded advice support in the framework if needed, but if replay inline advice is the only case needing that support, generalizing it doesn't not seem like a must do.

mtrofin added inline comments.Jan 19 2021, 8:08 AM
llvm/lib/Transforms/IPO/Inliner.cpp
644

+1 to what @wenlei said - you can wrap advisors in advisors, basically. There may be some helper utilities that we may need factored in InlineAdvisor, and I was at a point thinking of doing that, but the motivating scenario at the time ended up not really needing that. If this turns out to be that scenario, I'd be happy to help!

modimo updated this revision to Diff 318339.Jan 21 2021, 3:54 PM

Wrap ReplayInlineAdvisor into InlineAdvisor so that we can fall back to the original Advisor if we don't want to follow the replay. I think the composition of advisors here makes sense but I'm not sure so I'm very open to different approaches.

wenlei added inline comments.Jan 21 2021, 4:24 PM
llvm/include/llvm/Analysis/InlineAdvisor.h
199

By adding a ReplayAdvisor field into every InlineAdvisor, we're allowing advisors to be chained. It'd be weird if a replay advisor itself has a non-empty replay advisor though the current implementation doesn't prohibit that. However if we change the names to be like a fall back advisor, and let replay advisor be just a use case of the fall back chain, I think that would be more reasonable.

mtrofin added inline comments.Jan 21 2021, 5:43 PM
llvm/include/llvm/Analysis/InlineAdvisor.h
199

Instead of all advisors knowing about replay, why not doing it vice-versa: Replay wraps other advisors. Then, in InlineAdvisorAnalysis::Result::tryCreate (in InlineAdvisor.cpp), you see if replaying is requested, and build the ReplayInlineAdvisor wrapping the advisor requested initially - something like adding, right before return:

if (ReplayRequired)

Advisor = std::make_unique<ReplayInlineAdvisor>(<params>, std::move(Advisor))

I believe this keeps the concerns (replaying vs regular advising) separated, while also allowing future usecases where the the replay advisor can delegate to some other advisor, generically.

Note: I think you'd also need to have a ReplayInlineAdvice wrapping the InlineAdvice coming from the contained InlineAdvisor.

modimo updated this revision to Diff 318639.Jan 22 2021, 1:36 PM

Nest original advisors in ReplayInlineAdvisor rather than the other way around.

mtrofin accepted this revision.Jan 22 2021, 1:41 PM

Nice! LGTM (assuming comment in tryCreate is addressed)

llvm/lib/Analysis/InlineAdvisor.cpp
187

Probably best to check first if Advisor isn't null before line 180, then not bother making a replay advisor if the underlying one can't be made in the first place (and just return false)

This revision is now accepted and ready to land.Jan 22 2021, 1:41 PM
modimo added inline comments.Jan 22 2021, 1:42 PM
llvm/include/llvm/Analysis/InlineAdvisor.h
199

However if we change the names to be like a fall back advisor, and let replay advisor be just a use case of the fall back chain, I think that would be more reasonable.

Fall back would be better wording for it, agreed.

Instead of all advisors knowing about replay, why not doing it vice-versa: Replay wraps other advisors.

I like it, knowing about tryCreate makes it easier than I first thought given it's a centralizing creation point where we can wrap.

Note: I think you'd also need to have a ReplayInlineAdvice wrapping the InlineAdvice coming from the contained InlineAdvisor.

Can you elaborate on what would go into ReplayInlineAdvice? My thinking is that if the ReplayInlineAdvisor::getAdviceImpl declines to offer advice then we go to OriginalAdvisor->getAdvice(CB) so wrapping doesn't seem needed.

mtrofin added inline comments.Jan 22 2021, 1:49 PM
llvm/include/llvm/Analysis/InlineAdvisor.h
199

TL;DR; right now it's probably fine.

Longer story: the ML advisors are stateful - they track module-wide changes. So *if* we wanted to combine replaying with one of those, then the replayer would always have to get an advice from the underlying advisor, so it'd be able to notify back through it on what actually happened.

But the motivation for that scenario is kind of tenuous, I think, and it'd complicate the design unnecessarily. May be better to just disallow replaying with anything else other than the default advisor and we can add there a comment as to why.

modimo updated this revision to Diff 318652.Jan 22 2021, 2:21 PM
modimo edited the summary of this revision. (Show Details)

Restrict replay to default advisor only

wenlei accepted this revision.Jan 23 2021, 7:44 AM

Looks great, thanks!