This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
ELF section format
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(.pseudo_probe_desc and .pseudoprobe ) is emitted in ELF to support pseudo probe.
The format of .pseudo_probe_desc section looks like:
.section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi"
For each .pseudoprobe section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the .text section). A function record has the following format :
FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function.
Disassembling
A switch --show-pseudo-probe is added to use along with --show-disassembly to print disassembly code with pseudo probe directives.
For example:
00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret
Implementation
- PseudoProbeDecoder is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: GUIDProbeFunctionMap stores all the PseudoProbeFunction which is the abstraction of a general function. AddressProbesMap stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(PseudoProbeInlineTree) and will be constructed from the decoding. Each pseudo probe can get its inline context(getInlineContext) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Some parts of this file can be merged with llvm/include/llvm/IR/PseudoProbe.h. Consider moving non-decoding-specific stuff into the common header?