Add call site location info into inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct full hierarchical inline
tree from parsing such remarks. The messege of inline remark is also tweaked so we can
differentiate SampleProfileLoader inline from CGSCC inline.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Can you add a test case where there is more than one level of inline contexts for the callsite?
llvm/lib/Analysis/InlineAdvisor.cpp | ||
---|---|---|
391 | is this necessary? User should know if their build has profile or not. What is more useful is when PGO is on, but some callsite does not have profile data, then it is worth reporting. |
llvm/lib/Analysis/InlineAdvisor.cpp | ||
---|---|---|
391 |
This was used to differentiate between SampleProfileLoader inline vs CGSCC inline. Maybe the message by profile guided inliner isn't great, but can't think of a better and concise way.. With the differentiation in the message, the inlinee tree recovered through some parsing is what I'm looking for ([P] for SampleProfileLoader inline, [C] for CGSCC inline): Inlinees for main [P] _ZN15largesolidarrayIP6regobjEixEi @ 369 [P] _Z7random1i @ 363 [C] _Z8myrandomv @ 2 [P] _Z7random1i @ 364 [C] _Z8myrandomv @ 2 [P] _ZN15largesolidarrayIP6regobjEixEi @ 366 [P] _ZN6wayobj9createwayEiiiiRP8point16tRi @ 327 [P] _ZN6wayobj11createwayarEiiRP8point16tRi @ 37.1 [P] _ZN6wayobj5indexEii @ 143 [P] _ZN6wayobj5indexEii @ 130 [P] _ZN6wayobj6indexxEi @ 31 [P] _ZN6wayobj6indexyEi @ 32 [C] _ZN8point16tC2Ess @ 2 [C] _ZN8point16tC2Ess @ 2.1
That can be useful. I was also looking for a way to get call site count printed (if we have a count), but looks like it's not available from InlineCost. I'm going to defer that for now if that's ok. |
llvm/lib/Analysis/InlineAdvisor.cpp | ||
---|---|---|
392 | Perhaps reword it to " to match profiling context" .. |
llvm/lib/Analysis/InlineAdvisor.cpp | ||
---|---|---|
392 | Sounds good, updated. |
That's interesting. We are also using something similar for the matrix lowering remarks [1]: we traverse the inlining chain bottom up and emit a remark at each step which contains the expression available at that level. I think those approaches could be useful in general to surface remarks at the right level and it might be worth moving them somewhere so they can be shared. What do you think?
That's indeed similar, though it seems like what you're doing is more than just showing the full inline stack as location. Agreed that if we start to do these in more places for optimization remarks, it'd make sense to build it into remarks infra. But we may not always want full inline stack names as location (considering deep inlining with long template instantiation names that can "pollute" the remark messages), so I'm guessing what we could do is move that into remarks infra, but still use a separate switch to control whether we show inline locations (just like how -fdiagnostics-show-hotness controls whether we show hotness for remarks)?
ProfileGuidedInline --> ForProfileContext