This is an archive of the discontinued LLVM Phabricator instance.

[BPF] Add BTF generation for BPF target
Abandoned, Public

Authored by yonghong-song on Oct 14 2018, 12:19 PM.

Details

Summary

This patch tries to add BPF Type Format (BTF) generation
for the BPF target in LLVM.

What is BTF?

First, BPF is a Linux kernel virtual machine that is
widely used for tracing, networking, and security.

https://www.kernel.org/doc/Documentation/networking/filter.txt
https://cilium.readthedocs.io/en/v1.2/bpf/

BTF is the debug info format for BPF, introduced in the
Linux patch below

https://github.com/torvalds/linux/commit/69b693f0aefa0ed521e8bd02260523b5ae446ad7#diff-06fb1c8825f653d7e539058b72c83332

in the patch set described in the LWN article below.

https://lwn.net/Articles/752047/

The BTF debug info is passed to the kernel, so
it is designed to (1) contain just enough information
for what the kernel BPF subsystem cares about, and
(2) be simple enough for the kernel to parse and verify.

The BTF format is specified in the above github commit.
In summary, its layout looks like

struct btf_header
type subsection (a list of types)
string subsection (a list of strings)
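
As a rough illustration of the three-part layout above, here is a minimal sketch of a header that locates the type and string subsections. The field names, the magic value, and the 24-byte size follow later upstream kernel headers as best I recall them; they are assumptions and may differ from both the linked commit and this patch's MCBTFContext.h. BTFHeaderSketch and btfSectionSize are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed magic value, taken from later upstream kernel headers.
constexpr uint16_t BTFMagic = 0xeB9F;

// Hypothetical mirror of a BTF header; not copied from this patch.
struct BTFHeaderSketch {
  uint16_t Magic;
  uint8_t Version;
  uint8_t Flags;
  uint32_t HdrLen;  // Size of this header in bytes.
  uint32_t TypeOff; // Type subsection offset, relative to the end of the header.
  uint32_t TypeLen; // Type subsection length in bytes.
  uint32_t StrOff;  // String subsection offset, relative to the end of the header.
  uint32_t StrLen;  // String subsection length in bytes.
};
static_assert(sizeof(BTFHeaderSketch) == 24,
              "header sketch should have no padding");

// Total section size implied by a header whose string subsection comes last.
inline uint32_t btfSectionSize(const BTFHeaderSketch &H) {
  assert(H.Magic == BTFMagic && "not a BTF header");
  return H.HdrLen + H.StrOff + H.StrLen;
}
```

A consumer would read the header first, then use the offsets and lengths to find the list of types and the list of strings that follow it.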

With such information, the kernel and user space are able to
pretty-print a particular bpf map key/value. One possible example:

Without BTF:
  key: [ 0x01, 0x01, 0x00, 0x00 ]
With BTF:
  key: struct t { a : 1; b : 1; c : 0}
where struct is defined as
  struct t { char a; char b; short c; };
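
To make the example concrete, the sketch below turns the four raw key bytes into the "With BTF" rendering. It hard-codes the layout of struct t; a real BTF-driven printer would instead walk the type records to discover member names, offsets, and sizes. formatKey is a hypothetical helper, not part of the kernel or of this patch.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// The struct from the example above.
struct t { char a; char b; short c; };

// Interprets 4 raw key bytes as struct t and renders the members by name.
// Assumes sizeof(t) == 4, which holds on common ABIs.
inline std::string formatKey(const unsigned char (&Raw)[4]) {
  static_assert(sizeof(t) == 4, "unexpected padding in struct t");
  t Key;
  std::memcpy(&Key, Raw, sizeof(Key));
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "struct t { a : %d; b : %d; c : %d}",
                static_cast<int>(Key.a), static_cast<int>(Key.b),
                static_cast<int>(Key.c));
  return Buf;
}
```

For the bytes 0x01, 0x01, 0x00, 0x00 this yields the same text as the "With BTF" line.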

How is BTF generated?

Currently, the BTF is generated through pahole.

https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=68645f7facc2eb69d0aeb2dd7d2f0cac0feb4d69

This support is available in pahole v1.12:

https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4a21c5c8db0fcd2a279d067ecfb731596de822d4

Basically, the bpf program needs to be compiled with -g so that
dwarf sections are generated. pahole has been enhanced so that
dwarf can be converted to a .BTF section. The format
of the .BTF section matches the format expected by
the kernel, so a bpf loader can just take the .BTF section
and load it into the kernel.

https://github.com/torvalds/linux/commit/8a138aed4a807ceb143882fb23a423d524dcdb35

The .BTF section layout is also specified in this patch,
in the file include/llvm/MC/MCBTFContext.h.

What use cases does this patch try to address?

Currently, only the bpf instruction stream needs to be
passed to the kernel. The kernel verifies it, jits it if configured
to do so, attaches it to a particular kernel attachment point,
and later executes it when a particular event happens.

This patch tries to expand BTF to support two more use cases below:

(1). BPF supports subroutine calls.
     During performance analysis, it would be good to
     differentiate which call is hot instead of just
     providing a virtual address. This requires
     passing a unique identifier for each subroutine to
     the kernel; the subroutine name is a natural choice.
(2). If a particular jitted instruction is hot, we want
     the user to know which source line this jitted instruction
     belongs to. This requires the source information
     to be available to various profiling tools.

Note that in a single ELF file,

. there may be multiple loadable bpf programs,
. for a particular to-be-loaded bpf instruction stream,
  its instructions may come from multiple PROGBITS sections,
  the bpf loader needs to merge them together to a single
  consecutive insn stream before loading to the kernel.

For example:

section .text: subroutines funcFoo
section _progA: calling funcFoo
section _progB: calling funcFoo

The bpf loader could construct two loadable bpf instruction
streams and load them into the kernel:

. _progA funcFoo
. _progB funcFoo

So per-ELF-section function offsets and instruction offsets
need to be adjusted before being passed to the kernel, since
the kernel essentially expects only one code section regardless
of how many there are in the ELF file.
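
The offset adjustment described above can be sketched as follows. This is only an illustration of the idea, not the loader's actual code: SectionSize, FuncRecord, and rebaseFuncInfo are hypothetical names, and the section sizes in the usage note are made up.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one ELF code section placed in the merged stream.
struct SectionSize {
  std::string Name;
  uint32_t Bytes;
};

// Hypothetical func_info record whose offset is relative to its own section.
struct FuncRecord {
  std::string Section;
  uint32_t InsnOff;
  uint32_t TypeId;
};

// Rewrites section-relative offsets into offsets within the single merged
// instruction stream, given the order in which sections are concatenated.
inline std::vector<FuncRecord>
rebaseFuncInfo(const std::vector<SectionSize> &MergeOrder,
               std::vector<FuncRecord> Records) {
  std::map<std::string, uint32_t> Base;
  uint32_t Next = 0;
  for (const SectionSize &S : MergeOrder) {
    Base[S.Name] = Next; // This section starts where the previous one ended.
    Next += S.Bytes;
  }
  for (FuncRecord &R : Records)
    R.InsnOff += Base.at(R.Section); // Throws if the section was not merged.
  return Records;
}
```

For the "_progA funcFoo" stream, merging a 64-byte _progA followed by .text would move a function at .text offset 0 to merged offset 64, while records in _progA keep their offsets.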

What do we propose and why?

To support the above two use cases, we propose to
add an additional section, .BTF.ext, to the ELF file
which is the input of the bpf loader.
The .BTF.ext section has a similar header to the .BTF section
and it contains two subsections for func_info and line_info.

. the func_info maps the func insn byte offset to a func
  type in the .BTF type subsection.
. the line_info maps the insn byte offset to a line info.
. both func_info and line_info subsections are organized
  by ELF PROGBITS AX sections.
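
As an illustration of how a consumer might use such a mapping, the sketch below looks up the line record covering a given instruction byte offset. LineInfoView is a hypothetical in-memory view whose fields mirror the names in the debug dump later in this summary; it is not the on-disk encoding, and lineForInsn is an invented helper.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical in-memory view of one line_info record.
struct LineInfoView {
  uint32_t InsnOffset;  // Byte offset of the first instruction for this line.
  uint32_t FileNameOff; // Offset of the file name in the string subsection.
  uint32_t LineOff;     // Offset of the source text in the string subsection.
  uint32_t LineNum;
  uint32_t ColumnNum;
};

// Returns the record covering InsnOff, or nullptr if InsnOff precedes all
// records. Records must be sorted by InsnOffset.
inline const LineInfoView *lineForInsn(const std::vector<LineInfoView> &Sorted,
                                       uint32_t InsnOff) {
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), InsnOff,
      [](uint32_t Off, const LineInfoView &R) { return Off < R.InsnOffset; });
  if (It == Sorted.begin())
    return nullptr;
  return &*std::prev(It);
}
```

A func_info lookup would work the same way, returning a type id instead of a line record.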

The reason for using a separate ELF section, .BTF.ext, rather than
extending the existing .BTF section:

. The existing .BTF section can be directly loaded into
  the kernel. The above proposed .BTF.ext contents
  cannot since bpf loader needs to perform certain
  merging among multiple ELF sections before the loading.

The layout of the .BTF.ext section can be found at
include/llvm/MC/MCBTFContext.h in this patch.

pahole does not seem a good place to implement .BTF.ext, as
pahole is mostly for structure hole information and, more
importantly, we want to pass the actual source code to the
kernel for the following reasons:

. a bpf program is typically small, so the storage overhead
  should be small.
. in bpf land, it is entirely possible that
  an application loads the bpf program into the
  kernel and then quits, so
  having the user space application hold the debug info
  is not practical, as you may not even know which
  application loaded this bpf program.
. having the source code kept directly by the kernel
  would ease deployment, since the original source
  code does not need to be shipped to every host and
  the kernel-devel package does not need to be
  deployed even if kernel headers are used.

LLVM seems a good place to implement this, for the following reasons:

. The only reliable time to get the source code is
  during compilation time. This will result in both more
  accurate information and easier deployment as
  stated in the above.
. Another consideration is JIT. Projects like bcc
  (https://github.com/iovisor/bcc)
  use MCJIT to compile a C program into bpf insns and
  load them into the kernel. The llvm-generated BTF sections
  will be readily available for such cases as well.

Design and implementation of emitting .BTF.ext section

This patch implements generation of the .BTF.ext
section in the llvm compiler. It implements generation of
.BTF as well, since .BTF.ext depends on it
for types and strings.

The BTF related ELF sections will be generated
when both -target bpf and -g are specified. Two sections
are generated:

.BTF contains all the type and string information, and
.BTF.ext contains the func_info and line_info.

Note that dwarf sections will still be generated to
satisfy userspace tools like llvm-objdump or others
which rely on dwarf info.

. dwarf info is used for userspace applications
  like llvm-objdump or any others which inspect
  dwarf debug information.
. BTF sections are used by the kernel.
. When ready to deploy to different machines for
  execution, dwarf related sections can be
  stripped since the BPF loader and kernel only
  need the BTF sections.

The type and func_info are gathered during CodeGen/AsmPrinter
by traversing dwarf debug_info. The line_info is
gathered in MCObjectStreamer before writing to
the object file. After all the information is gathered,
the two sections are emitted in MCObjectStreamer::finishImpl.
The instruction byte offsets are produced by emitting
Fixup records in the MCObjectStreamer BTF emit function.

With cmake CMAKE_BUILD_TYPE=Debug, the compiler can
dump out all the tables except insn offset, which
will be resolved later as relocation records.
The debug type "btf" is used for BTFContext dump.

This patch also contains tests to verify generated
.BTF and .BTF.ext contents for all supported types,
func_info and line_info subsections, by comparing
llvm-readelf dumping of the section contents to
the expected values.

Note that the .BTF and .BTF.ext information will not
be emitted to assembly code and there is no assembler
support for BTF either.

Below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug,
the contents of each table are shown for a simple C program.

-bash-4.2$ cat -n test.c
   1  struct A {
   2    int a;
   3    char b;
   4  };
   5
   6  int test(struct A *t) {
   7    return t->a;
   8  }
-bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c
Type Table:
[1] FUNC NameOff=1 Info=0x0c000001 Size/Type=2
      ParamType=3
[2] INT NameOff=12 Info=0x01000000 Size/Type=4
      Desc=0x01000020
[3] PTR NameOff=0 Info=0x02000000 Size/Type=4
[4] STRUCT NameOff=16 Info=0x04000002 Size/Type=8
      NameOff=18 Type=2 BitOffset=0
      NameOff=20 Type=5 BitOffset=32
[5] INT NameOff=22 Info=0x01000000 Size/Type=1
      Desc=0x02000008

String Table:
0 :
1 : test
6 : .text
12 : int
16 : A
18 : a
20 : b
22 : char
27 : test.c
34 : int test(struct A *t) {
58 :   return t->a;

FuncInfo Table:
SecNameOff=6
      InsnOffset=<Omitted> TypeId=1

LineInfo Table:
SecNameOff=6
      InsnOffset=<Omitted> FileNameOff=27 LineOff=34 LineNum=6 ColumnNum=0
      InsnOffset=<Omitted> FileNameOff=27 LineOff=58 LineNum=7 ColumnNum=3
-bash-4.2$ readelf -S test.o
......
  [12] .BTF              PROGBITS         0000000000000000  0000028d
     00000000000000c1  0000000000000000           0     0     1
  [13] .BTF.ext          PROGBITS         0000000000000000  0000034e
     0000000000000050  0000000000000000           0     0     1
  [14] .rel.BTF.ext      REL              0000000000000000  00000648
     0000000000000030  0000000000000010          16    13     8
......
-bash-4.2$
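
The Info and Desc words in the dump above can be unpacked with a few shifts and masks. The sketch below is an illustration under stated assumptions: the kind numbers (1, 2, 4, 12) are read off the dump itself, while the bit positions, the 12-byte common record, and the per-kind trailing sizes are my reading of the kernel's BTF encoding rather than something taken from this patch. All names are hypothetical.

```cpp
#include <cstdint>

// Kind numbers as they appear in the top byte of Info in the dump above.
enum : uint32_t { KindInt = 1, KindPtr = 2, KindStruct = 4, KindFunc = 12 };

// Assumed bit layout: kind in bits 24..27, element count in the low 16 bits.
constexpr uint32_t btfKind(uint32_t Info) { return (Info >> 24) & 0x0f; }
constexpr uint32_t btfVlen(uint32_t Info) { return Info & 0xffff; }

// Assumed layout of an INT's Desc word: encoding in bits 24..27, width in bits 0..7.
constexpr uint32_t intEncoding(uint32_t Desc) { return (Desc >> 24) & 0x0f; }
constexpr uint32_t intBits(uint32_t Desc) { return Desc & 0xff; }

// Bytes occupied by one type record: a 12-byte common part (NameOff, Info,
// Size/Type) plus kind-specific trailing data.
inline uint32_t recordBytes(uint32_t Info) {
  const uint32_t Common = 12;
  switch (btfKind(Info)) {
  case KindInt:
    return Common + 4; // One Desc word.
  case KindStruct:
    return Common + 12 * btfVlen(Info); // NameOff, Type, BitOffset per member.
  case KindFunc:
    return Common + 4 * btfVlen(Info); // One ParamType per parameter.
  default:
    return Common; // PTR carries no trailing data.
  }
}

// Sums the record sizes for a list of Info words.
inline uint32_t typeSectionBytes(const uint32_t *Infos, unsigned Count) {
  uint32_t Total = 0;
  for (unsigned I = 0; I < Count; ++I)
    Total += recordBytes(Infos[I]);
  return Total;
}
```

Under these assumptions the five records occupy 16 + 16 + 12 + 36 + 16 = 96 bytes. The string table ends at offset 58 + 15 = 73, and adding an assumed 24-byte header gives 24 + 96 + 73 = 193 = 0xc1, which agrees with the .BTF size that readelf reports.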

Linux kernel 4.18 already supports .BTF with type information,
except for BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO, which are added in this patch.
The following patch set submitted to linux netdev:

https://www.spinics.net/lists/netdev/msg528817.html

adds support in the kernel for the .BTF.ext func_info subsection.
The patchset refers to a previous commit, which was reverted for
lack of proper review. But it can still be tried together with this
patchset, as there is no internal implementation change between this one and
https://reviews.llvm.org/rL344366.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>

Diff Detail

Repository
rL LLVM

Event Timeline

yonghong-song created this revision. Oct 14 2018, 12:19 PM
aprantl added inline comments. Oct 15 2018, 8:46 AM
include/llvm/MC/MCBTFContext.h
76

Could you convert this into a static function?

79

Could you convert this into an enum with doxygen comments?

162

Please add doxygen comments to all new data types.

243

... and classes :-)

339

StringRef instead of std::string& ?

include/llvm/MC/MCObjectFileInfo.h
210

The proper way to document a group in doxygen is

/// BTF specific sections.
/// @{
MCSection *BTFSection;
MCSection *BTFExtSection;
/// @}
lib/CodeGen/AsmPrinter/Dwarf2BTF.h
25

enum please

27

Please document the purpose of this class.

30

///

36

///

lib/MC/MCBTFContext.cpp
98

Please use full sentences:
// Emit header.

test/MC/BPF/btf-func-line-1.ll
3 ↗(On Diff #169619)

Please add a comment explaining what is being tested here.

yonghong-song edited the summary of this revision.

Address comments raised by Adrian Prantl and a few other changes:

  • variable names conform to the llvm coding standard
  • added doxygen comments for certain data structures, all classes and some methods
  • removed a few unused routines (forgot to remove them last time); I double checked that this version does not have any unused routines
  • a few code improvements (better C++ style implementation, for example, "for (auto &TypeEntry : TypeEntries) ..." instead of the old way "for (uint32_t I = 0; I < TypeEntries.size(); I++) ...")
  • removed dead code in the Die2BTFEntry constructor

Hi, @aprantl, thanks a lot for the review! I just updated the patch, which I hope addresses all your comments. Please take a look and let
me know whether I missed something or whether additional changes are needed. Thanks again!

yonghong-song marked 12 inline comments as done. Oct 15 2018, 10:54 PM

Thanks!

include/llvm/MC/MCBTFContext.h
24
/// \file
/// ...
168

do you need any kind of packing attribute here for this to work on all platforms?

208

You might want to look at include/llvm/BinaryFormat/Dwarf.h and include/llvm/BinaryFormat/Dwarf.def for how we usually do enums<->string conversions

lib/CodeGen/AsmPrinter/Dwarf2BTF.cpp
176

// Handle BTF_INT_ENCODING in IntVal.

182

ditto...

yonghong-song added inline comments. Oct 16 2018, 11:30 AM
include/llvm/MC/MCBTFContext.h
168

The structure here is designed to have no holes under the C standard's layout rules. Do you know which platform could require packing?

aprantl added inline comments. Oct 16 2018, 1:58 PM
include/llvm/MC/MCBTFContext.h
168

No, I'm no expert on the C record layout; that's why I was asking :-)
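
One way to settle the packing question without platform-specific attributes is to let the compiler check the layout. The sketch below is a hedged illustration: ExtHeaderSketch is a hypothetical header-like struct whose fields are assumed, not copied from BTFHeader or BTFExtHeader in the patch. If any platform inserted padding, the build would fail instead of silently emitting a malformed section.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical header-like record; field names are assumptions.
struct ExtHeaderSketch {
  uint16_t Magic;
  uint8_t Version;
  uint8_t Flags;
  uint32_t HdrLen;
  uint32_t FuncInfoOff;
  uint32_t FuncInfoLen;
  uint32_t LineInfoOff;
  uint32_t LineInfoLen;
};

// Compile-time checks that the layout has no holes on this platform.
static_assert(std::is_standard_layout<ExtHeaderSketch>::value,
              "offsetof requires a standard-layout type");
static_assert(offsetof(ExtHeaderSketch, HdrLen) == 4,
              "unexpected padding before HdrLen");
static_assert(offsetof(ExtHeaderSketch, LineInfoLen) == 20,
              "unexpected padding before LineInfoLen");
static_assert(sizeof(ExtHeaderSketch) == 24, "unexpected tail padding");

// True when the struct size equals the sum of its member sizes.
constexpr bool extHeaderHasNoPadding() {
  return sizeof(ExtHeaderSketch) == 2 + 1 + 1 + 5 * 4;
}
```

Each member here sits at an offset that is a multiple of its own size, which is why common ABIs insert no padding.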

aprantl added inline comments. Oct 16 2018, 1:59 PM
test/MC/BPF/btf-type-llong.ll
11

Are you planning to add a BTF disassembler to LLVM in the future? It would make these tests easier to maintain. If not that's fine, too.

yonghong-song added inline comments. Oct 16 2018, 2:58 PM
test/MC/BPF/btf-type-llong.ll
11

Currently, we do not have such a plan yet. We will have some kind of BTF dumper in the kernel tree (e.g., linux:tools/bpf/bpftool). Yes, that is the reason I just compare bytes...

Addressed Adrian's comments:

  • introduced include/llvm/MC/MCBTF.def to better sync enum and strings which are related to enum
  • added some file-level descriptions in include/llvm/MC/MCBTFContext.h
  • some other minor format fixes

@aprantl, just addressed your comments. Indeed enum/strings definition is much cleaner.
I did not add packed attributes to BTFHeader and BTFExtHeader
as the data structure is designed not to have holes based on C standard. Please take a look. Thanks!

Thanks! Stylistically this looks good now.
@echristo may have an opinion on the integration with MC and DwarfDebug?

lib/CodeGen/AsmPrinter/DwarfFile.h
117

/// ... .

Address a few more format-related changes.

Pending Eric's review, I just uploaded another revision to fix the format issue mentioned by Adrian in the previous comments.

yonghong-song edited the summary of this revision.

. change name convertDwarf2BTF to generateBTFromDwarf
  as Dwarf is not removed
. make it clear in the commit message that
  with "-target bpf -g" both Dwarf and BTF will be generated.

I just added some more commit messages to make it clear. The below is
what happens when "-target bpf -g" is specified:

. both dwarf and BTF sections will be generated.
. We still need dwarf for llvm-objdump or any other
  user space tools which inspect dwarf, which may
  help debug/inspect the BPF byte codes.
. BTF sections are consumed by kernel BTF loader
  and kernel itself.
. When the BPF object files are ready to deploy to
  different machines for kernel loading/execution,
  all dwarf sections can be stripped as they will
  not be used any more.

I briefly looked at generating BTF directly from IR.
CodeView provides a good example,
and it is certainly possible. But since we need
to generate dwarf anyway, dwarf2BTF seems
more compelling?

If File.Source is available, use it; otherwise, try to get the contents from the file.
Add a test for it with -g -gdwarf-5 -gembed-source.
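
The fallback described in this update could look roughly like the sketch below: prefer the embedded source when present, and only otherwise open the file on disk. This is an illustration, not the patch's code; sourceLine and nthLine are hypothetical helpers, and the nullable pointer stands in for "File.Source is available".

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns the 1-based line LineNo from In, or an empty string if the stream
// has fewer lines (or could not be read at all).
inline std::string nthLine(std::istream &In, unsigned LineNo) {
  std::string Line;
  for (unsigned I = 1; std::getline(In, Line); ++I)
    if (I == LineNo)
      return Line;
  return std::string();
}

// Uses the embedded source when provided; otherwise reads the file at Path.
inline std::string sourceLine(const std::string *EmbeddedSource,
                              const std::string &Path, unsigned LineNo) {
  if (EmbeddedSource) {
    std::istringstream In(*EmbeddedSource);
    return nthLine(In, LineNo);
  }
  std::ifstream In(Path);
  return nthLine(In, LineNo);
}
```

With embedded source the result does not depend on the file still existing at its original path, which matters when the compile and deploy machines differ.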

One thing we discussed in person at the LLVM dev meeting was that it might be nicer to go directly from LLVM debug info metadata -> BTF rather than go from metadata -> DWARF -> BTF.

@aprantl, yes, I am experimenting with IR->BTF conversion now.

One thing we discussed in person at the LLVM dev meeting was that it might be nicer to go directly from LLVM debug info metadata -> BTF rather than go from metadata -> DWARF -> BTF.

Yeah, looking at the amount of code here, seems like it might be worth considering metadata->BTF, avoiding more reliance on the DWARF emission code (would worry that changes to DWARF emission structure, forms, etc, would break this in inconvenient ways)

Is this a feature already implemented in another compiler? How much room is there for design input? (wondering if, rather than reinventing the format, there could be an extension attribute in the DWARF saying "yeah, I guarantee I've met the requirements to use this in the driver" (or whatever it is))

include/llvm/MC/MCBTFContext.h
53

Doesn't look like this type is used, except for sizeof()? Perhaps it'd be better to hardcode the size? (for example, as-is, this might not portably provide a consistent size - different implementations might use different padding, etc)

193

Naming a variable the same as a type is pretty confusing - (any place where you have to qualify a type name with "struct"/"class"/"enum" is a bit of a hint that something's a bit odd) best avoided?

201

Similarly - little hesitant to rely on the sizeof a type that's not "packed", and even that's not super awesome (non-portable, etc).

lib/CodeGen/AsmPrinter/Dwarf2BTF.cpp
18

Maybe use "= default" here to define a default dtor?

lib/CodeGen/AsmPrinter/Dwarf2BTF.h
28

Not sure LLVM really uses the '2'/too thing in its naming convention? Maybe "BTFFromDwarf" (for the file name), "BTFEntryFromDie", etc?

@dblaikie Thanks for the review. All of the code review comments make sense. Currently I am experimenting with IR->BTF. Once that is ready we can decide whether to do IR->BTF or Dwarf->BTF. If Dwarf->BTF is preferred, I will make the changes.
For sizeof, the structure is already "packed" under the C standard's layout rules. There are no holes. But I can add packed attributes to the definition to make it portable across different platforms.
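
An alternative that sidesteps both sizeof and packed attributes is to emit each field explicitly with a fixed width and byte order. The sketch below is a hedged illustration of that idea: it assumes a little-endian target and a 24-byte header with the fields shown, and HeaderFields, emitLE, and encodeHeader are hypothetical names rather than anything in the patch.

```cpp
#include <cstdint>
#include <vector>

// Appends the low NumBytes bytes of Value in little-endian order.
inline void emitLE(std::vector<uint8_t> &Out, uint32_t Value,
                   unsigned NumBytes) {
  for (unsigned I = 0; I < NumBytes; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Hypothetical header fields; the in-memory layout of this struct is irrelevant.
struct HeaderFields {
  uint16_t Magic;
  uint8_t Version;
  uint8_t Flags;
  uint32_t HdrLen, TypeOff, TypeLen, StrOff, StrLen;
};

// Hard-coded encoded size, independent of sizeof(HeaderFields).
constexpr unsigned EncodedHeaderBytes = 24;

// Serializes the header field by field, so host padding cannot leak into the output.
inline std::vector<uint8_t> encodeHeader(const HeaderFields &H) {
  std::vector<uint8_t> Out;
  Out.reserve(EncodedHeaderBytes);
  emitLE(Out, H.Magic, 2);
  emitLE(Out, H.Version, 1);
  emitLE(Out, H.Flags, 1);
  emitLE(Out, H.HdrLen, 4);
  emitLE(Out, H.TypeOff, 4);
  emitLE(Out, H.TypeLen, 4);
  emitLE(Out, H.StrOff, 4);
  emitLE(Out, H.StrLen, 4);
  return Out;
}
```

The trade-off is a little more code per structure in exchange for output that is identical on every host.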

For the question "Is this a feature already implemented in another compiler? How much room is there for design input? (wondering if, rather than reinventing the format, there could be an extension attribute in the DWARF saying "yeah, I guarantee I've met the requirements to use this in the driver" (or whatever it is))"

The .BTF section has been implemented in pahole and in the linux kernel. Since it has been implemented in the linux kernel as part of the syscall interface, it has become a kernel ABI and we cannot change it any more.
But the structure itself has a version number, so we could bump the version number to introduce different structures. The .BTF.ext section is new and has not landed in the kernel yet, so changes are possible.

The debug info passed to the kernel needs to be parsed and *verified* by the kernel before it is used. In the past, the kernel community has rejected a dwarf parser/verifier inside the kernel due to its complexity,
and that is why people designed BTF (simple parser and verifier). Any idea to improve the interface is surely welcome.

Thanks!

@aprantl, @dblaikie I just submitted a new patch which generates BTF from IR directly. https://reviews.llvm.org/D53736 Looks like that is indeed better. Could you take a look? Thanks!

Thanks, to avoid confusion, could you please close/abandon this review?

yonghong-song abandoned this revision. Oct 26 2018, 9:37 AM

Let us close this revision. We have agreed that IR->BTF translation (patch https://reviews.llvm.org/D53736) is a better approach.