This is an archive of the discontinued LLVM Phabricator instance.

[BPF][DebugInfo] Show CO-RE relocations in llvm-objdump
ClosedPublic

Authored by eddyz87 on May 7 2023, 5:47 PM.

Details

Summary

Extend llvm-objdump to show CO-RE relocations when -r option is
passed and object file has .BTF and .BTF.ext sections.

For example, the following C program:

#define __pai __attribute__((preserve_access_index))

struct foo { int i; int j;} __pai;
struct bar { struct foo f[7]; } __pai;
extern void sink(void *);

void root(struct bar *bar) {
  sink(&bar[2].f[3].j);
}

Should lead to the following objdump output:

$ clang --target=bpf -O2 -g t.c -c -o - | \
    llvm-objdump  --no-addresses --no-show-raw-insn -dr -

...
        r2 = 0x94
                CO-RE <byte_off> [2] struct bar::[2].f[3].j (2:0:3:1)
        r1 += r2
        call -0x1
                R_BPF_64_32     sink
        exit
...

More examples could be found in unit tests, see BTFParserTest.cpp.

To achieve this:

  • Move CO-RE relocation kinds definitions from BPFCORE.h to BTF.h.
  • Extend BTF.h with types derived from BTF::CommonType, e.g. BTF::IntType and BTF::StrutType, to allow dyn_cast() and access to type additional data.
  • Extend BTFParser to load BTF type and relocation data.
  • Modify llvm-objdump.cpp to create instance of BTFParser when disassembly of object file with BTF sections is processed and -r flag is supplied.

Additional information about CO-RE is available at [1].

[1] https://docs.kernel.org/bpf/llvm_reloc.html

Depends on D149058

Diff Detail

Event Timeline

eddyz87 created this revision.May 7 2023, 5:47 PM
Herald added a reviewer: MaskRay. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

Do you think it would be possible to keep output format close to libbpf's one? Here are few different CO-RE relocations as logged by libbpf, to give you an idea:

<byte_off> [6] struct task_struct.comm (0:115 @ offset 1944)
<field_exists> [586] struct bpf_testmod_test_read_ctx.buf (0:1 @ offset 8)
<enumval_value> [7] enum named_enum::NAMED_ENUM_VAL2 = 2
<enumval_value> [5] typedef anon_enum::ANON_ENUM_VAL1 = 16
<byte_off> [5] struct core_reloc_arrays.a[2] (0:0:2 @ offset 8)
<byte_off> [5] struct core_reloc_arrays.b[1][2][3] (0:1:1:2:3 @ offset 43)
<byte_off> [5] struct core_reloc_ptr_as_arr[2].a (2:0 @ offset 8)
<byte_off> [5] struct core_reloc_nesting.a.a.a (0:0:0:0 @ offset 0)
<byte_off> [5] struct core_reloc_nesting.b.b.b (0:1:0:0 @ offset 4)

Note how libbpf outputs almost C-like access syntax, and skips unnecessary [0] for initial array reference.

eddyz87 updated this revision to Diff 537664.Jul 6 2023, 4:35 AM

Rebase, update in line with fixes suggeested for D149058.

eddyz87 updated this revision to Diff 540808.Jul 16 2023, 7:16 AM
  • styling fixes
  • test case to check formatting
  • adjust-vma option support
eddyz87 published this revision for review.Jul 16 2023, 7:35 AM
eddyz87 edited reviewers, added: yonghong-song; removed: jhenderson.
eddyz87 added a subscriber: yonghong-song.

Hi @MaskRay, @yonghong-song,

Could you please take a look at this revision?
It adds support for CO-RE relocations printing in llvm-objdump.
While current implementation works, there are two aspects that might be sub-optimal:

  • ELF relocations are not unified with BTF relocations and thus there are two functions: llvm-objdump.cpp:printRelocation(), llvm-objdump.cpp:printBTFRelocation();
  • The BTFParser instance is not shared between LLVMSymbolizer (used for source line printing) and llvm-objdump.cpp:disassembleObject() (used for relocations printing).

Fixing both items might require some interface level changes that I opted not to do at the moment. Please let me know what you think.

Herald added a project: Restricted Project. · View Herald Transcript
eddyz87 edited the summary of this revision. (Show Details)Jul 16 2023, 7:37 AM

Hi @MaskRay,

if you have a minute, could you please take a look?
I used this locally and it proved to be convenient for debugging (e.g. here).

Hi @MaskRay,

if you have a minute, could you please take a look?
I used this locally and it proved to be convenient for debugging (e.g. here).

Sorry, I am on a trip and will be slow to respond.

Hi @MaskRay ,

If you have some time, could you please take a look?

FWIW, output format looks great, thanks!

Hi @MaskRay ,

If you have some time, could you please take a look?

Sorry, was pretty busy handling a backlog of reviews and day work... Feel free to ping after a week https://llvm.org/docs/CodeReview.html#lgtm-how-a-patch-is-accepted

In a -DBUILD_SHARED_LIBS=on build, I get ld.lld: error: undefined symbol: llvm::BTFParser::parse(llvm::object::ObjectFile const&, llvm::BTFParser::ParseOptions const&). We need

--- i/llvm/tools/llvm-objdump/CMakeLists.txt
+++ w/llvm/tools/llvm-objdump/CMakeLists.txt
@@ -5,2 +5,3 @@ set(LLVM_LINK_COMPONENTS
   BinaryFormat
+  DebugInfoBTF
   DebugInfoDWARF

https://github.com/MaskRay/Config/wiki/LLVM#build-modes Personally, when I make CMake changes, I'll check both the default config and -DBUILD_SHARED_LIBS=on

MaskRay added inline comments.Aug 22 2023, 10:00 PM
llvm/test/tools/llvm-objdump/BPF/core-relo-formatting.s
1 ↗(On Diff #540808)

We probably should change one of the files to test both -d and -dr, so that the effect of -r is clearer.

53 ↗(On Diff #540808)

Use -NEXT: whenever applicable https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-next-directive https://maskray.me/blog/2021-08-08-toolchain-testing#use-the-check-next-directive (I decided to add a section to this article so that I don't need to repeat the argument for future reviews:)

ditto throughout

llvm/tools/llvm-objdump/llvm-objdump.cpp
1474

BTF = std::make_unique<BTFParser>();

However, std::optional<BTFParser> BTF and BTF.emplace() is better as we avoid an unneeded dynamic allocation.

MaskRay added inline comments.Aug 22 2023, 10:00 PM
llvm/include/llvm/DebugInfo/BTF/BTF.h
280

Macros in public headers should be used with caution as they may collide with downstream users.

308

These derived classes are only referenced in byteSize. This doesn't justify creating so many derived classes.

See that printMod you only use the kind value, not the derived classes.

385

Unneeded if we don't define these classes.

llvm/include/llvm/DebugInfo/BTF/BTFParser.h
113

use member initializers, otherwise it's error-prone (without {} or = {}, we get uninitialized values).

127

avoid default argument. this doesn't

llvm/lib/DebugInfo/BTF/BTFParser.cpp
231

Use the OwningArrayRef(ArrayRef<T> Data) constructor to avoid a separate memcpy

239
MaskRay added a subscriber: ast.Aug 22 2023, 10:57 PM
MaskRay added inline comments.
llvm/lib/DebugInfo/BTF/BTFParser.cpp
379–426
715

typo: Splis

splits SpecStr by ':', parses numbers, and pushes them to RawSpec.

725

What if SpecStr is empty?

https://en.cppreference.com/w/cpp/string/basic_string_view/substr I think we may harden StringRef::substr to abort when pos > size in the future.

731
751

This is untested

754
787

Unlike Linux kernel, we don't keep a blank line after a variable declaration.
Group is immediately used so not having a blank line is arguably more readable.

llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
353

This macro may collide with others (e.g. llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h). Use a static function

397

clear is more idiomatic

525

If we are unsure whether this works on a big-endian machine, we can ask someone to verify when this patch is about to land...

617

C++ prefers using

MaskRay added inline comments.Aug 22 2023, 11:36 PM
llvm/include/llvm/DebugInfo/BTF/BTF.h
340

This and Enum*Type are used in BTFParser::symbolize. Keeping just these classes is fine but other classes are not needed.

Note: we can also utilize llvm/include/llvm/Support/TrailingObjects.h to access the trailing object, instead of DEFINE_TAIL_ARR

eddyz87 updated this revision to Diff 552936.Aug 23 2023, 5:47 PM
eddyz87 marked 20 inline comments as done.

Fixes basing on feedback from @MaskRay.
Fix for incorrect BTF::CommonType::getVlen() mask.

Sorry, was pretty busy handling a backlog of reviews and day work... Feel free to ping after a week https://llvm.org/docs/CodeReview.html#lgtm-how-a-patch-is-accepted

Thanks a lot for taking a look.

In a -DBUILD_SHARED_LIBS=on build, I get ld.lld: error: undefined symbol: llvm::BTFParser::parse(llvm::object::ObjectFile const&, llvm::BTFParser::ParseOptions const&). We need

--- i/llvm/tools/llvm-objdump/CMakeLists.txt
+++ w/llvm/tools/llvm-objdump/CMakeLists.txt
@@ -5,2 +5,3 @@ set(LLVM_LINK_COMPONENTS
   BinaryFormat
+  DebugInfoBTF
   DebugInfoDWARF

Done

https://github.com/MaskRay/Config/wiki/LLVM#build-modes Personally, when I make CMake changes, I'll check both the default config and -DBUILD_SHARED_LIBS=on

Noted, thank you.

llvm/include/llvm/DebugInfo/BTF/BTF.h
280

Removed CoreRelo.def, I reused it for printable names in the original revision but it is changed since then, so there is no need in this file.

308

Those are all types that have trailing items. I removed the types not used outside of byteSize and switched byteSize to use switch over kind (which is a bit less safe, imo).

As a final step for BTF support in objdump Yonghong wanted me to add support for printing BTF definitions like bpftool btf dump file does. The definitions in question would have been useful in that case. But I can add them back when necessary.

llvm/include/llvm/DebugInfo/BTF/BTFParser.h
127

Replaced with overload.

llvm/lib/DebugInfo/BTF/BTFParser.cpp
231

Done, but it is kind-off ugly because StringRef encapsulates char and I use uint8_t for TypesBuffer:

TypesBuffer = OwningArrayRef<uint8_t>(
  ArrayRef<uint8_t>((const uint8_t *)RawData.data(), RawData.size()));
379–426

I will add necessary documentation.

725

Changed to exit loop if SpecStr is empty after consumeUnsignedInteger, added missing check that type relocation spec should be '0'.

751

Missed this one :-/

llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
525

I'll try to check in the VM.

yonghong-song added inline comments.Aug 23 2023, 11:55 PM
llvm/lib/DebugInfo/BTF/BTFParser.cpp
379–426
eddyz87 marked an inline comment as done.Aug 24 2023, 4:07 PM
eddyz87 added inline comments.
llvm/lib/DebugInfo/BTF/BTFParser.cpp
379–426

Working on it, submitted patch here.

Seems good in shape! Wonder whether @jhenderson has some comments.

llvm/include/llvm/DebugInfo/BTF/BTF.h
102

prefer constexpr uint32_t to enum, especially that these are bit masks.

llvm/lib/DebugInfo/BTF/BTFParser.cpp
233

Some using declarations such as big,little, seem unnecessary?

240

arrayRefFromStringRef

llvm/test/tools/llvm-objdump/BPF/core-relo-byte-offset.ll
7

For continuation lines we indent by 2.

I did a quick skim of comments and the like, but I don't really have the time to read up on BTF, so can't contirbute anything more meaningful really, sorry.

llvm/include/llvm/DebugInfo/BTF/BTF.h
101

Nit.

279

Nit.

llvm/include/llvm/DebugInfo/BTF/BTFParser.h
50–51

The "Mb" suffix is ambiguous/potentially incorrect. Is it meant to be megabits (I doubt it)?

Assuming it is intended to be binary megabytes (i.e. 1024*1024 bytes), consider using "MiB" (mebibytes).

111

Nit.

llvm/lib/DebugInfo/BTF/BTFParser.cpp
241
596

reclocKindName maybe? ("reloc" is a more common abbreviation for "relocation").

Same goes below at line 559 ("relo kind #") and at various other places (e.g. ReloKindGroup).

Neither of these apply if "relo" is an abbreviation used in the BTF spec or similar.

648–649

I'm a little nervous about referencing file paths in comments - I would expect them to fairly easily rot. Do you actually need to do that?

715

"splits" is duplicated.

730
llvm/test/tools/llvm-objdump/BPF/core-relo-byte-offset.ll
6–8

I have a personal preference to put the | at the end of the previous line. My reasoning is that by looking at any given line, you can tell if the command itself is continued (it ends with | \ or simply \), without needing to look at the next line, whilst the continuation lines are clearly continuations (due to being indented), and will either start with an executable name (indicating they are the start of a new command) or otherwise (indicating that they aren't).

Inline edit is to demonstrate - not a requirement if you object to the proposal.

llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
560

std::numeric_limits::max<uint32_t>() might be clearer?

591–592

Similar can be done in various other places. There's also EXPECT variations and an ASSERT_THAT_EXPECTED for Expected types, and you can use FailedWithMessage() to check that the error failed with a specific error message too.

659

Nit (also below for other comments).

eddyz87 updated this revision to Diff 555774.Sep 4 2023, 12:27 PM
eddyz87 marked 19 inline comments as done.

Stylistic changes according to feedback from @MaskRay and @jhenderson.

Hi @MaskRay, @jhenderson,

Thank you for the feedback, I think I covered all but one items.
The last remaining item is to verify that unit test is working on the bigendian system.
I'll update the revision once I check that.

llvm/include/llvm/DebugInfo/BTF/BTFParser.h
50–51

Changed to MiB.

llvm/lib/DebugInfo/BTF/BTFParser.cpp
233

support::endian::system_endianness was unused, indeed. The big and little are used in the following declaration:

endianness Endianness = Ctx.Obj.isLittleEndian() ? little : big;
379–426

Documentation was merged to bpf tree: here and here.

596

Changed to reloc in identifiers and "relocation" in strings.

648–649

Removed.

715

Right, sorry about that.

llvm/test/tools/llvm-objdump/BPF/core-relo-byte-offset.ll
6–8

Updated to | \. I find the version with | at the new line a bit more aesthetically pleasing, but oh-well.

llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
560

Documentation refers to value -1 (for unsigned field :) ), so I think it's better to keep -1 here to avoid confusion.

591–592

Thanks, replaced with local macro:

#define ASSERT_SUCCEEDED(E) ASSERT_THAT_ERROR((E), Succeeded())

I do use FailedWithMessage in macro EXPECT_PARSE_ERROR.

659

Added dots for comments that look like English sentences, did not add dots for comments that refer to type definitions.

eddyz87 updated this revision to Diff 555779.Sep 4 2023, 4:19 PM

Unit tests are passing on S390 under emulation w/o issues.

eddyz87 marked an inline comment as done.Sep 4 2023, 4:23 PM

Only a couple of nits remaining from me.

llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
67

Now I'm being really nit-picky - here and in the E struct below it should be capitalized "No" not "no".

560

Fair enough.

eddyz87 updated this revision to Diff 555849.Sep 5 2023, 4:38 AM
eddyz87 marked 2 inline comments as done.
eddyz87 edited the summary of this revision. (Show Details)

Missing dot at the end of the comment.

Hi @MaskRay,

Could you please take a look at this revision?
I think I addressed all feedback.

MaskRay accepted this revision.Sep 19 2023, 11:57 PM

Sorry for the delay. LG!

Additional information about CO-RE is available at [1].
[1] https://nakryiko.com/posts/bpf-core-reference-guide/

Link to https://docs.kernel.org/bpf/llvm_reloc.html ? Thank you for https://git.kernel.org/linus/be4033d36070e44fba766a21ef2d0c24fa04c377 :)

llvm/include/llvm/DebugInfo/BTF/BTF.h
101

nit: in this case we don't add parens

llvm/lib/DebugInfo/BTF/BTFContext.cpp
77–80
llvm/lib/DebugInfo/BTF/BTFParser.cpp
581
llvm/unittests/DebugInfo/BTF/BTFParserTest.cpp
375
1000

EXPECT_EQ as findString is non-fatail if failed

http://google.github.io/googletest/reference/assertions.html

1068

These find* calls can use EXPECT (nonfatal if failed)

This revision is now accepted and ready to land.Sep 19 2023, 11:57 PM
eddyz87 updated this revision to Diff 557152.Sep 20 2023, 4:48 PM
eddyz87 marked 6 inline comments as done.
eddyz87 edited the summary of this revision. (Show Details)

Changes in accordance with @MaskRay feedback.

Hi @MaskRay, @jhenderson,

Thank you for all the feedback!
I've applied the latest suggestions and going to merge this revision once CI is done.

Link to https://docs.kernel.org/bpf/llvm_reloc.html ?

Commit message updated, thank you for the --verbatim tip.

llvm/lib/DebugInfo/BTF/BTFParser.cpp
581

Forgot about namespaces here, sorry. Surprisingly, to avoid compilation errors operator<< had to be moved inside the namespace as well.

This revision was automatically updated to reflect the committed changes.