Page MenuHomePhabricator

[Remarks] Add callsite locations to inline remarks
ClosedPublic

Authored by wenlei on Jun 19 2020, 10:29 AM.

Details

Summary

Add call site location info into inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct full hierarchical inline
tree from parsing such remarks. The messege of inline remark is also tweaked so we can
differentiate SampleProfileLoader inline from CGSCC inline.

Diff Detail

Event Timeline

wenlei created this revision.Jun 19 2020, 10:29 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 19 2020, 10:29 AM

This sounds useful indeed. @fhahn, @anemet might want to take a look.

Can you add a test case where there is more than one level of inline contexts for the callsite?

llvm/lib/Analysis/InlineAdvisor.cpp
391

is this necessary? User should know if their build has profile or not.

What is more useful is when PGO is on, but some callsite does not have profile data, then it is worth reporting.

wenlei marked an inline comment as done.Jun 19 2020, 5:33 PM
wenlei added inline comments.
llvm/lib/Analysis/InlineAdvisor.cpp
391

is this necessary? User should know if their build has profile or not.

This was used to differentiate between SampleProfileLoader inline vs CGSCC inline. Maybe the message by profile guided inliner isn't great, but can't think of a better and concise way..

With the differentiation in the message, the inlinee tree recovered through some parsing is what I'm looking for ([P] for SampleProfileLoader inline, [C] for CGSCC inline):

Inlinees for main
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 369
[P]  _Z7random1i @ 363
[C]    _Z8myrandomv @ 2
[P]  _Z7random1i @ 364
[C]    _Z8myrandomv @ 2
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 366
[P]  _ZN6wayobj9createwayEiiiiRP8point16tRi @ 327
[P]    _ZN6wayobj11createwayarEiiRP8point16tRi @ 37.1
[P]      _ZN6wayobj5indexEii @ 143
[P]      _ZN6wayobj5indexEii @ 130
[P]      _ZN6wayobj6indexxEi @ 31
[P]      _ZN6wayobj6indexyEi @ 32
[C]      _ZN8point16tC2Ess @ 2
[C]      _ZN8point16tC2Ess @ 2.1

What is more useful is when PGO is on, but some callsite does not have profile data, then it is worth reporting.

That can be useful. I was also looking for a way to get call site count printed (if we have a count), but looks like it's not available from InlineCost. I'm going to defer that for now if that's ok.

wenlei updated this revision to Diff 272240.Jun 19 2020, 10:45 PM

Address David's comments, add test for nested inlinining.

davidxl added inline comments.Jun 20 2020, 8:52 AM
llvm/lib/Analysis/InlineAdvisor.cpp
392

Perhaps reword it to " to match profiling context" ..

wenlei marked an inline comment as done.Jun 20 2020, 10:07 AM
wenlei added inline comments.
llvm/lib/Analysis/InlineAdvisor.cpp
392

Sounds good, updated.

wenlei updated this revision to Diff 272263.Jun 20 2020, 10:07 AM

Update remark message.

davidxl accepted this revision.Jun 20 2020, 8:03 PM

lgtm

llvm/lib/Analysis/InlineAdvisor.cpp
383

ProfileGuidedInline --> ForProfileContext

This revision is now accepted and ready to land.Jun 20 2020, 8:03 PM
This revision was automatically updated to reflect the committed changes.
fhahn added a comment.Jun 24 2020, 2:20 AM

That's interesting. We are also using something similar for the matrix lowering remarks [1]: we traverse the inlining chain bottom up and emit a remark at each step which contains the expression available at that level. I think those approaches could be useful in general to surface remarks at the right level and it might be worth moving them somewhere so they can be shared. What do you think?

[1] https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp#L1783

That's interesting. We are also using something similar for the matrix lowering remarks [1]: we traverse the inlining chain bottom up and emit a remark at each step which contains the expression available at that level. I think those approaches could be useful in general to surface remarks at the right level and it might be worth moving them somewhere so they can be shared. What do you think?

[1] https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp#L1783

That's indeed similar, though it seems like what you're doing is more than just showing the full inline stack as location. Agreed that if we start to do these in more places for optimization remarks, it'd make sense to build it into remarks infra. But we may not always want full inline stack names as location (considering deep inlining with long template instantiation names that can "pollute" the remark messages), so I'm guessing what we could do is move that into remarks infra, but still use a separate switch to control whether we show inline locations (just like how -fdiagnostics-show-hotness controls whether we show hotness for remarks)?