This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO][llvm-profgen] Fix external address issues of perf reader (return to external addr part)
ClosedPublic

Authored by wlei on Dec 10 2021, 1:42 PM.

Details

Summary

Before we have an issue with artificial LBR whose source is a return, recalling that "an internal code(A) can return to external address, then from the external address call a new internal code(B), making an artificial branch that looks like a return from A to B can confuse the unwinder". We just ignore the LBRs after this artificial LBR which can miss some samples. This change aims at fixing this by correctly unwinding them instead of ignoring them.

List some typical scenarios covered by this change.

  1. multiple sequential call back happen in external address, e.g.
[ext, call, foo] [foo, return, ext] [ext, call, bar]

Unwinder should avoid having foo return from bar. Wrong call stack is like [foo, bar]

  1. the call stack before and after external call should be correctly unwinded.
{call stack1}                                            {call stack2}  
[foo, call, ext]  [ext, call, bar]  [bar, return, ext]  [ext, return, foo ]

call stack 1 should be the same to call stack2. Both shouldn't be truncated

  1. call stack should be truncated after call into external code since we can't do inlining with external code.
[foo, call, ext]  [ext, call, bar]  [bar, call, baz] [baz, return, bar ] [bar, return, ext]

the call stack of code in baz should not include foo.

Implementation:

We leverage artificial frame to fix #2 and #3: when we got a return artificial LBR, push an extra artificial frame to the stack. when we pop frame, check if the parent is an artificial frame to pop(fix #2). Therefore, call/ return artificial LBR is just the same as regular LBR which can keep the call stack.

While recording context on the trie, artificial frame is used as a tag indicating that we should truncate the call stack(fix #3).

To differentiate #1 and #2, we leverage getCallAddrFromFrameAddr. Normally the target of the return should be the next inst of a call inst and getCallAddrFromFrameAddr will return the address of call inst. Otherwise, getCallAddrFromFrameAddr will return to 0 which is the case of #1.

Diff Detail

Event Timeline

wlei created this revision.Dec 10 2021, 1:42 PM
wlei requested review of this revision.Dec 10 2021, 1:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2021, 1:42 PM
wlei updated this revision to Diff 393589.Dec 10 2021, 1:57 PM

Updating D115550: [CSSPGO][llvm-profgen] Fix external address issues of perf reader (return to external addr part)

wlei edited the summary of this revision. (Show Details)Dec 10 2021, 4:13 PM
wlei added reviewers: hoy, wenlei.
wlei edited the summary of this revision. (Show Details)Dec 10 2021, 4:25 PM
wenlei added inline comments.Dec 11 2021, 2:49 PM
llvm/tools/llvm-profgen/PerfReader.cpp
147–159

Can we move this check into isReturnState? This should not be considered as return state, then we would naturally fall back to unwindBranchWithinFrame which does what we want. We may need to rename unwindBranchWithinFrame to simply unwindBranch because this now can be inter-function too.

llvm/tools/llvm-profgen/PerfReader.h
296

Comment for the definition and purpose for artificial frame.

423

nit: if we don't expect pushing artificial frame, the message can be explicit, something like "Artificial frame not expected for context stack."

hoy added inline comments.Dec 12 2021, 11:33 PM
llvm/tools/llvm-profgen/PerfReader.cpp
222–224

nit: ... since this isn't a real call context the compiler will see

llvm/tools/llvm-profgen/PerfReader.h
309

nit: switch the two statements for readability? A comment would also be helpful.

423

Or we just return false for artificial frames so that it doesn't need to be handled on the caller side?

wenlei added inline comments.Dec 12 2021, 11:37 PM
llvm/tools/llvm-profgen/PerfReader.h
309

iiuc, switching changes the semantic though - we will be checking IsArtificialFrame the grand parent instead of parent ..

hoy added inline comments.Dec 12 2021, 11:43 PM
llvm/tools/llvm-profgen/PerfReader.h
309

Right, the code needs to be slightly tweaked. I was thinking of reading this routine like we unwind once first. When an artificial frame is found, we unwind once more to skip that frame. Not a strong preference though.

wlei updated this revision to Diff 394129.Dec 13 2021, 8:46 PM
wlei edited the summary of this revision. (Show Details)

addressing reviewer's feedback

wlei updated this revision to Diff 394138.Dec 13 2021, 9:47 PM

refine some comments

wlei added inline comments.Dec 13 2021, 9:47 PM
llvm/tools/llvm-profgen/PerfReader.cpp
147–159

That's a good point! moved to isReturnState

222–224

Moved to pushFrame

llvm/tools/llvm-profgen/PerfReader.h
296

changed to reuse external frame instead of artificail frame. Added some comments near it's definition.

309

ok, I realized that moving this check`if (CurrentLeafFrame->Parent->IsArtificialFrame())` to its caller seems better, then we can have better handling the unmatched cases.

423

Sounds good, changed to return false for external frame.

wenlei added inline comments.Dec 13 2021, 11:39 PM
llvm/tools/llvm-profgen/PerfReader.cpp
147

replace "artificial frame" with "external frame" now that we're converging on the term?

llvm/tools/llvm-profgen/PerfReader.h
423

I actually think that it's cleaner to handle this on caller side and assert we don't have external frame when it comes here.

525–527

nit: return (CallAddr != 0); to make the boolean return explicit.

hoy added inline comments.Dec 14 2021, 10:57 AM
llvm/tools/llvm-profgen/PerfReader.h
423

Sounds good to me.

wlei updated this revision to Diff 394350.Dec 14 2021, 12:28 PM

addressing feedbacks.

wenlei accepted this revision.Dec 14 2021, 12:36 PM

lgtm, thanks.

This revision is now accepted and ready to land.Dec 14 2021, 12:36 PM
hoy accepted this revision.Dec 14 2021, 12:50 PM

lgtm, thx.