This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO] Pseudo probes for function calls.
ClosedPublic

Authored by hoy on Nov 18 2020, 5:40 PM.

Details

Summary

An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work.

One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the !dbg metadata is a first-class metadata and has all essential support in place , leveraging the !dbg metadata as a channel to encode pseudo probe information is probably the easiest solution.

With the requirement of not inflating !dbg metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The !dbg metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original !dbg metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a calliste probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst.

To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the -pseudo-probe-for-profiling switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reversed for pseudo probe use.

Callsite pseudo probes are inserted in SampleProfileProbePass and a target-independent MIR pass PseudoProbeInserter is added to unpack the probe ID/type from !dbg.

Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time.

Diff Detail

Event Timeline

hoy created this revision.Nov 18 2020, 5:40 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 18 2020, 5:40 PM
hoy requested review of this revision.Nov 18 2020, 5:40 PM
hoy updated this revision to Diff 306280.Nov 18 2020, 6:28 PM
hoy edited the summary of this revision. (Show Details)

Updating D91756: [CSSPGO] Pseudo probes for function calls.

hoy updated this revision to Diff 306497.Nov 19 2020, 11:55 AM

Updating D91756: [CSSPGO] Pseudo probes for function calls.

Corresponding changes to support IR/MIR intrinsic attributes.

wmi added inline comments.Dec 1 2020, 3:33 PM
llvm/lib/CodeGen/PseudoProbeInserter.cpp
51

What is the usage of FirstInstr?

55

Will tailcall or other optimizations convert call into something else before PseudoProbeInserter pass?

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
166–168

for (auto &[Call, Index] : CallProbeIds) {

hoy added inline comments.Dec 1 2020, 4:04 PM
llvm/lib/CodeGen/PseudoProbeInserter.cpp
51

Good patch. It's not used in this change. It's used in an upcoming change where a probe probing an empty block will be marked specially for better counts inference.

55

Good point. A tailcall instruction (a special MIR jump instruction like TAILJMPd) is considered as a call and can be identified as MI.isCall() && MI.isTerminator() && MI.isReturn(). Tail calls are bad to the frame-pointer-based virtual unwinding since they may cause missing frames. An upcoming change will mark them specially and a tail call tracker in llvm-profgen will try to track the missing frames based on some static analyses.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
166–168

Thanks for the suggestion. This is a C++17 usage and may cause a warning with the current build setup which defaults to C++14 (due to -Wc++17)

hoy updated this revision to Diff 308806.Dec 1 2020, 4:04 PM

Updating D91756: [CSSPGO] Pseudo probes for function calls.

wmi added inline comments.Dec 1 2020, 9:16 PM
llvm/include/llvm/IR/PseudoProbe.h
34–35

Add assertion messages here and the other places.

llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
166–168

Ah, right. I mistakenly remember I saw that somewhere in llvm codebase. llvm coding standard says we should use c++14.

hoy marked an inline comment as done.Dec 1 2020, 10:06 PM
hoy updated this revision to Diff 308874.Dec 1 2020, 10:08 PM

Updating D91756: [CSSPGO] Pseudo probes for function calls.

Adding messages for asserts.

wmi added a comment.Dec 1 2020, 11:08 PM

Another question. Sorry for not bringing it up earlier. When a call with probe metadata attached is inlined, the probe will be gone or it will be kept somehow? I think you want to keep the probe especially for inline instance to reconstruct the context but I didn't figure it out how you did that from the description.

hoy added a comment.Dec 1 2020, 11:24 PM
In D91756#2427759, @wmi wrote:

Another question. Sorry for not bringing it up earlier. When a call with probe metadata attached is inlined, the probe will be gone or it will be kept somehow? I think you want to keep the probe especially for inline instance to reconstruct the context but I didn't figure it out how you did that from the description.

No problem. Sorry for not clarifying it in the description. When a callee is inlined, the probe metadata will go with the inlined instructions. The !dbg metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original !dbg metadata and the bottom of the stack is the for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator fields (which actually represents a calliste probe) can be used to represent the inline context of an inlined PseudoProbeInst or a CallInst. I'll update the description.

hoy edited the summary of this revision. (Show Details)Dec 1 2020, 11:32 PM
wmi accepted this revision.Dec 2 2020, 11:46 AM
In D91756#2427795, @hoy wrote:
In D91756#2427759, @wmi wrote:

Another question. Sorry for not bringing it up earlier. When a call with probe metadata attached is inlined, the probe will be gone or it will be kept somehow? I think you want to keep the probe especially for inline instance to reconstruct the context but I didn't figure it out how you did that from the description.

No problem. Sorry for not clarifying it in the description. When a callee is inlined, the probe metadata will go with the inlined instructions. The !dbg metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original !dbg metadata and the bottom of the stack is the for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator fields (which actually represents a calliste probe) can be used to represent the inline context of an inlined PseudoProbeInst or a CallInst. I'll update the description.

I see. Thanks for the explanation.

This revision is now accepted and ready to land.Dec 2 2020, 11:46 AM
This revision was landed with ongoing or failed builds.Dec 2 2020, 1:45 PM
This revision was automatically updated to reflect the committed changes.