[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling

Authored by wlei on Nov 23 2020, 8:33 PM.


This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

ELF section format
Please see the encoding patch (https://reviews.llvm.org/D91878) for more details of the format; the example is copied here:

Two sections (.pseudo_probe_desc and .pseudoprobe) are emitted in ELF to support pseudo probes.
The format of the .pseudo_probe_desc section looks like:

.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
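
To make the record layout concrete, here is a minimal Python sketch that reads records of this shape back from raw bytes. It is not the PseudoProbeDecoder code from this patch. It assumes little-endian 64-bit fields and a one-byte name length, as the listing shows, and the name decode_probe_descs is hypothetical.

```python
import struct

def decode_probe_descs(data: bytes):
    """Decode consecutive (GUID, hash, name) records from raw section bytes.

    Assumptions: little-endian fields and a single length byte before the
    name, mirroring the .quad/.quad/.byte/.ascii layout in the listing.
    """
    descs = {}
    offset = 0
    while offset < len(data):
        # Two 64-bit fields: function GUID, then function hash.
        guid, func_hash = struct.unpack_from("<QQ", data, offset)
        offset += 16
        # One length byte followed by that many bytes of function name.
        name_len = data[offset]
        offset += 1
        name = data[offset:offset + name_len].decode("ascii")
        offset += name_len
        descs[guid] = (func_hash, name)
    return descs

# Hand-built input: the first two records of the listing, packed and decoded back.
raw = (struct.pack("<QQB", 6309742469962978389, 4294967295, 9) + b"_Z5funcAi"
       + struct.pack("<QQB", 7102633082150537521, 138828622701, 12) + b"_Z8funcLeafi")
descs = decode_probe_descs(raw)
```

The GUID is read as unsigned here, so a record written with a negative .quad would show up under its two's-complement value.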

For each .pseudoprobe section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e., a function with a code entry in the .text section). A function record has the following format:

FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
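
As an illustration of the probe entries described above, the following Python sketch decodes a list of them. It is a sketch under stated assumptions, not the decoder added by this patch: TYPE, ATTRIBUTE and ADDRESS_TYPE are assumed to share one byte with TYPE in the low four bits and ADDRESS_TYPE in the top bit, an address delta is assumed to be relative to the previous probe's address, and the names read_uleb128 and decode_probes are hypothetical.

```python
import struct

def read_uleb128(data: bytes, offset: int):
    """Return (value, new_offset) for the ULEB128 number starting at offset."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift   # low 7 bits carry payload
        if (byte & 0x80) == 0:            # high bit clear marks the last byte
            return value, offset
        shift += 7

PROBE_TYPES = {0: "Block", 1: "IndirectCall", 2: "DirectCall"}

def decode_probes(data: bytes, offset: int, count: int):
    """Decode `count` probe entries; return (probes, new_offset)."""
    probes = []
    last_address = 0
    for _ in range(count):
        index, offset = read_uleb128(data, offset)
        packed = data[offset]
        offset += 1
        # Assumed bit layout: TYPE in bits 0-3, ATTRIBUTE in bits 4-6,
        # ADDRESS_TYPE in bit 7.
        probe_type = packed & 0x0F
        attribute = (packed >> 4) & 0x07
        is_delta = packed >> 7
        if is_delta:
            # Assumption: the delta is relative to the previous probe's address.
            delta, offset = read_uleb128(data, offset)
            address = last_address + delta
        else:
            (address,) = struct.unpack_from("<Q", data, offset)
            offset += 8
        last_address = address
        probes.append((index, PROBE_TYPES.get(probe_type, "Unknown"),
                       attribute, address))
    return probes, offset

# Hand-built input: probe 1 at an absolute address, probe 3 two bytes later.
raw = bytes([1, 0x00]) + struct.pack("<Q", 0x2011A3) + bytes([3, 0x80, 2])
probes, end = decode_probes(raw, 0, 2)
```

A full function record decoder would read GUID, NPROBES and NUM_INLINED_FUNCTIONS first, call something like decode_probes, and then recurse once per inlined callee.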

A new switch, --show-pseudo-probe, is added for use together with --show-disassembly; it prints the disassembly annotated with pseudo probe directives.

For example:

00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret


  • PseudoProbeDecoder is added to ProfiledBinary as the infrastructure for decoding. It decodes the two sections and generates two maps: GUIDProbeFunctionMap stores every PseudoProbeFunction, which is the abstraction of a general function, and AddressProbesMap stores all pseudo probe info indexed by address.
  • All inline info is encoded in the binary as a trie (PseudoProbeInlineTree), which is reconstructed during decoding. Each pseudo probe can get its inline context (getInlineContext) by traversing its inline tree node backwards.
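
The backward traversal can be pictured with a small Python sketch. This is an illustration only and does not mirror the C++ classes in this patch: InlineTreeNode, add_inlinee and get_inline_context are hypothetical stand-ins, the address map is a plain dict, and the function bar with its call site index 2 is invented for the example.

```python
class InlineTreeNode:
    """One trie node: a function, plus where it was inlined into its parent."""

    def __init__(self, func_name, parent=None, callsite_index=0):
        self.func_name = func_name
        self.parent = parent                  # None for an outlined function
        self.callsite_index = callsite_index  # call site probe index in the parent
        self.children = {}

    def add_inlinee(self, callsite_index, func_name):
        """Return the child for (call site, callee), creating it if needed."""
        key = (callsite_index, func_name)
        if key not in self.children:
            self.children[key] = InlineTreeNode(func_name, self, callsite_index)
        return self.children[key]

def get_inline_context(node):
    """Walk from a node up to the root and return frames, outermost first."""
    frames = []
    while node.parent is not None:
        frames.append(f"{node.parent.func_name}:{node.callsite_index}")
        node = node.parent
    frames.reverse()
    return frames

# foo is inlined into foo2 at call site probe 6, as in the listing above.
foo2 = InlineTreeNode("foo2")
foo = foo2.add_inlinee(6, "foo")
bar = foo.add_inlinee(2, "bar")   # hypothetical deeper inlinee

# Stand-in for an address-indexed probe map: address -> [(tree node, probe index)].
address_probes = {0x2011A5: [(foo2, 3), (foo2, 4), (foo, 1)]}
contexts = [get_inline_context(node) for node, _ in address_probes[0x2011A5]]
```

Storing only a parent pointer and a call site index per node keeps each probe small, since the full context is recomputed on demand rather than stored with every probe.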

Test Plan:
ninja && ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92334