"BTF" is a debug information format used by LLVM's BPF backend.
The format is much smaller in scope than DWARF; the following
information is available:
- full set of C types used in the binary file;
- types for global values;
- line number / line source code information.
BTF information is embedded in ELF as .BTF and .BTF.ext sections.
A detailed format description can be found in the Linux source
tree, e.g. here: [1].
This commit modifies the llvm-objdump utility to use line number
information provided by BTF if DWARF information is not available.
E.g., the goal is to make the following print source code lines
interleaved with disassembly:
  $ clang --target=bpf -g test.c -o test.o
  $ llvm-strip --strip-debug test.o
  $ llvm-objdump -Sd test.o

  test.o: file format elf64-bpf

  Disassembly of section .text:

  <foo>:
  ; void foo(void) {
          r1 = 0x1
  ; consume(1);
          call -0x1
          r1 = 0x2
  ; consume(2);
          call -0x1
  ; }
          exit
A common production use case for BPF programs is to:
- compile separate object files using clang with -g -c flags;
- link these files as a final "static" binary using bpftool linker ([2]).
The bpftool linker discards most of the DWARF sections
(including the line information sections) but merges the .BTF and
.BTF.ext sections. Hence, it is valuable for llvm-objdump to be able
to print source code using .BTF.ext.
The commit consists of the following modifications:
- llvm/lib/DebugInfo/BTF, aka the DebugInfoBTF component, is added to
  host the code needed to process BTF (on the assumption that BTF
  support will be added to other tools as well, e.g. llvm-readelf):
  - DebugInfoBTF provides the llvm::BTFParser class, which loads
    information from the .BTF and .BTF.ext sections of a given
    object::ObjectFile instance and allows querying this information.
    Currently only line number information is loaded.
  - DebugInfoBTF also provides the llvm::BTFContext class, an
    implementation of the DIContext interface, used by llvm-objdump
    to query line number information for specific instructions.
- The DILineInfo structure is extended with a LineSource field.
  The DIContext interface uses the DILineInfo structure to communicate
  line number and source code information. Specifically, the
  DILineInfo::Source field encodes the full source code of a file, if
  available. BTF only stores source code for selected lines of the
  file, not the complete source file. Moreover, the stored lines are
  not guaranteed to be sorted in any specific order.
  To avoid reconstructing a file's source code from the set of
  available lines, this commit adds the LineSource field instead.
- The Symbolize class is modified to use BTFContext instead of
  DWARFContext when DWARF sections are not available but BTF sections
  are present in the object file. (Symbolize is instantiated by
  llvm-objdump.)
- Integration and unit tests.
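To illustrate the LineSource design described in the list above, here
is a minimal standalone sketch. It does not use LLVM headers:
LineInfoSketch and sourceLineFor() are hypothetical names invented for
this illustration, and the field types are assumptions rather than the
actual DILineInfo definition. It shows how a consumer can prefer a
single stored line and only fall back to scanning the full file text.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for DILineInfo; only the fields relevant here.
struct LineInfoSketch {
  uint32_t Line = 0;                     // 1-based line number, 0 if unknown
  std::optional<std::string> Source;     // full file text, if available
  std::optional<std::string> LineSource; // text of this one line, if available
};

// Returns the text to print for Info, or std::nullopt if none is available.
std::optional<std::string> sourceLineFor(const LineInfoSketch &Info) {
  // Prefer the single stored line: no file reconstruction is needed.
  if (Info.LineSource)
    return Info.LineSource;
  if (!Info.Source || Info.Line == 0)
    return std::nullopt;
  // Otherwise walk the full file text up to the requested line.
  std::istringstream In(*Info.Source);
  std::string Text;
  for (uint32_t N = 1; std::getline(In, Text); ++N)
    if (N == Info.Line)
      return Text;
  return std::nullopt;
}
```

With this shape, a BTF-backed producer only has to fill Line and
LineSource, and never needs to sort or stitch together the lines it
happens to have.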
Note that DWARF has a notion of an "instruction sequence".
The DWARF implementation of DIContext::getLineInfoForAddress() provides
inexact responses if exact address information is not available but
the address falls within an "instruction sequence" with some known line
information (see DWARFDebugLine::LineTable::findRowInSeq()).
BTF does not provide instruction sequence groupings, thus
getLineInfoForAddress() queries only return exact matches.
This does not seem to be a big issue in practice, but the output
of llvm-objdump -Sd might differ slightly when BTF
is used instead of DWARF.
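The difference between the two lookup styles can be sketched as
follows. This is a simplified standalone illustration, not the LLVM
implementation: Row, lookupExact() and lookupInSequence() are
hypothetical names, and the sequence lookup is only a rough
approximation of what a sequence-aware search does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical line table row; Rows are assumed sorted by Address.
struct Row {
  uint64_t Address;
  uint32_t Line;
};

// BTF-style lookup: succeeds only if a row has exactly this address.
std::optional<uint32_t> lookupExact(const std::vector<Row> &Rows,
                                    uint64_t Addr) {
  auto It = std::lower_bound(
      Rows.begin(), Rows.end(), Addr,
      [](const Row &R, uint64_t A) { return R.Address < A; });
  if (It != Rows.end() && It->Address == Addr)
    return It->Line;
  return std::nullopt;
}

// Sequence-style lookup: Rows describe one sequence ending at SeqEnd
// (exclusive); returns the closest row at or below Addr.
std::optional<uint32_t> lookupInSequence(const std::vector<Row> &Rows,
                                         uint64_t SeqEnd, uint64_t Addr) {
  if (Rows.empty() || Addr < Rows.front().Address || Addr >= SeqEnd)
    return std::nullopt;
  // First row strictly above Addr; the row before it covers Addr.
  auto It = std::upper_bound(
      Rows.begin(), Rows.end(), Addr,
      [](uint64_t A, const Row &R) { return A < R.Address; });
  return std::prev(It)->Line;
}
```

For an address that lies between two rows, lookupExact() returns
nothing while lookupInSequence() reports the preceding row's line,
which is the kind of discrepancy that can show up in the output.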
[1] https://www.kernel.org/doc/html/latest/bpf/btf.html
[2] https://github.com/libbpf/bpftool
Depends on https://reviews.llvm.org/D149501