Page MenuHomePhabricator

[llvm-objdump] Create fake sections for a ELF core file
ClosedPublic

Authored by namhyung on Jun 28 2022, 12:19 AM.

Details

Summary

The linux perf tools use /proc/kcore for disassembly kernel functions.
Actually it copies the relevant parts to a temp file and then pass it to
objdump. But it doesn't have section headers so llvm-objdump cannot
handle it.

Let's create fake section headers for the program headers. It'd have a
single section for each segment to cover the entire range. And for this
purpose we can consider only executable code segments.

With this change, I can see the following command shows proper outputs.

perf annotate --stdio --objdump=/path/to/llvm-objdump

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jhenderson added inline comments.Jul 1 2022, 1:14 AM
llvm/test/tools/llvm-objdump/X86/disassemble-no-section-kcore.test
35 ↗(On Diff #441512)

To reduce test noise, this could be just PF_X.

36 ↗(On Diff #441512)

Is the high VAddr actually important to the test?

37 ↗(On Diff #441512)

Is this relevant? If not, I believe it can be omitted.

38–41 ↗(On Diff #441512)

I believe the Align field is unnecessary for the test. I believe the Offset and FileSize are derived by the associated "section" too, so those should be omittable too. It's potentially the same for MemSize, but I'm not certain.

45 ↗(On Diff #441512)

Nit: too many blank lines at EOF (the file should end with exactly one \n).

llvm/tools/llvm-objdump/llvm-objdump.cpp
1924

Is it possible to just remove the const from the input parameter, to avoid the const_cast, or does that require further changes in the call tree?

MaskRay added inline comments.Jul 1 2022, 10:34 AM
llvm/include/llvm/Object/ELF.h
767

This should not be restricted to ET_CORE. llvm-strip --strip-sections processed files can be disassembled by objdump -d

perf annotate --stdio --objdump=/path/to/llvm-objdump

This command does not say how perf finds core

Thanks for your detailed review! Really appreciated!

llvm/include/llvm/Object/ELF.h
767

Hmm.. ok.

llvm/include/llvm/Object/ELFObjectFile.h
462

Right, will remove.

llvm/test/tools/llvm-objdump/X86/disassemble-no-section-kcore.test
1 ↗(On Diff #441512)

I guess normal core files don't have the contents (code) since it's in the binary. But linux perf tools generate a core file with binary to disassemble a kernel function easily. That's why I need to add content to phdr (now it works with section fill though).

1–2 ↗(On Diff #441512)

Ok, will change.

4–6 ↗(On Diff #441512)

Is there any particular reason to do two separate llvm-objdump invocations?

@MaskRay asked to test it with -h option too.

Do you actually need the explicit --start-address and --stop-address options?

No, but that's how the liinux perf tools run it. I'd like to make it as same as the actual use case.

10 ↗(On Diff #441512)

No, they generate section name based on p_type and index. Probably I can add it as a follow up.

12 ↗(On Diff #441512)

We usually add additional spacing to ensure lines without -NEXT suffixes line up with those that do (see inline edit).

Thanks, will change.

Again, does this style match GNU?

Do you mean symbol name? They just show the same name of the section. Not sure it's from a (fake) symbol or falls back to section name.

35 ↗(On Diff #441512)

It's the same as kcore extracted by linux perf tools and I'd like have the same environment as much as possible.

36 ↗(On Diff #441512)

Not really, but again, I just wanted to have a similar binary.

37 ↗(On Diff #441512)

Then it'd have the same value as VAddr. FYI, kcore files have 0 for PAddr. But for tests, it doesn't matter.

38–41 ↗(On Diff #441512)

For this test, they don't matter. It follows values in a kcore.

45 ↗(On Diff #441512)

Will remove.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1924

Not really, dumpObject already has a non-const ObjectFile pointer. I'll make the change.

perf annotate --stdio --objdump=/path/to/llvm-objdump

This command does not say how perf finds core

In fact, perf internally creates a kcore file for a kernel function and pass it to objdump, after that it's removed. So users don't know about the kcore.

jhenderson added inline comments.Jul 4 2022, 12:00 AM
llvm/test/tools/llvm-objdump/X86/disassemble-no-section-kcore.test
1 ↗(On Diff #441512)

Okay, thanks for the explanation. As @MaskRay points out, llvm-strip can result in binaries without section headers, which could also benefit from being disassembled. Consequently, this isn't a feature specific to kcore files. I'd suggest renaming the test to simply disassemble-no-section-headers.test. You should also update the comment to say something like: "This test checks -d disassembles an ELF file without section headers. Such files include kcore files extracted by linux perf tools, or executables with section headers stripped by e.g. llvm-strip --strip-sections." (The second sentence is entirely optional, but it's nice to have the original motivating example).

4–6 ↗(On Diff #441512)

Is there any particular reason to do two separate llvm-objdump invocations?

@MaskRay asked to test it with -h option too.

Yeah, I understand that. I meant you could do something like llvm-objdump -d -h ... rather than two separate executions of llvm-objdump. This would help the test performance. On the other hand, there's an argument for splitting out the -h case into an entirely separate file, because it's not about disassembly.

Do you actually need the explicit --start-address and --stop-address options?

No, but that's how the liinux perf tools run it. I'd like to make it as same as the actual use case.

Okay, fair enough. In that case, I think it would be good to have test cases both with and without those options. This is to show that llvm-objdump will dump the all the relevant bytes, but also that it can be restricted by address for kcore files. What do you think?

10 ↗(On Diff #441512)

Sounds good.

12 ↗(On Diff #441512)

I guess my main objection is to the <>: format, which just looks weird. I guess we should either completely omit it or synthesize a name for it. It might be that the section name addition above will cover this though automatically.

35 ↗(On Diff #441512)

The risk is that by having extra stuff in the file, you obscure what's actually important for the test case. We try to keep things as minimal as possible in general in our testing. PF_R and PF_W have no bearing on how llvm-objdump operates, so they are unnecessary and just obscure the important bit (PF_X), meaning people may not realise that the PF_X is actually the important bit.

As a spin-off thought, should we have some content covered by a phdr without PF_X, to show that this isn't disassembled? Also, what should happen with -D (which disassembles non-executable stuff too)?

36 ↗(On Diff #441512)

Same sort of response, although here it's less of an issue, but it would be nice if the VAddr was some easy-to-parse value, e.g. 0xFFFFFFFF00000000 or similar.

37 ↗(On Diff #441512)

Yeah, exactly - it doesn't matter, so let's omit it.

38–41 ↗(On Diff #441512)

Again, same points as above - reduce the noise in the test to make it easy to see what's important.

namhyung added inline comments.Jul 4 2022, 11:23 AM
llvm/test/tools/llvm-objdump/X86/disassemble-no-section-kcore.test
1 ↗(On Diff #441512)

Ok, I'll update the test as you suggested.

4–6 ↗(On Diff #441512)

Is there any particular reason to do two separate llvm-objdump invocations?

@MaskRay asked to test it with -h option too.

Yeah, I understand that. I meant you could do something like llvm-objdump -d -h ... rather than two separate executions of llvm-objdump. This would help the test performance. On the other hand, there's an argument for splitting out the -h case into an entirely separate file, because it's not about disassembly.

Yeah, I'd like to split the -h case as there might be more to come later.

Do you actually need the explicit --start-address and --stop-address options?

No, but that's how the liinux perf tools run it. I'd like to make it as same as the actual use case.

Okay, fair enough. In that case, I think it would be good to have test cases both with and without those options. This is to show that llvm-objdump will dump the all the relevant bytes, but also that it can be restricted by address for kcore files. What do you think?

Then I can update the disassemble test with and without the addresses restrictions.

12 ↗(On Diff #441512)

I'll omit it for now and add it back when I work on the section name, ok?

35 ↗(On Diff #441512)

The risk is that by having extra stuff in the file, you obscure what's actually important for the test case. We try to keep things as minimal as possible in general in our testing. PF_R and PF_W have no bearing on how llvm-objdump operates, so they are unnecessary and just obscure the important bit (PF_X), meaning people may not realise that the PF_X is actually the important bit.

Ok, will remove unnecessary bits.

As a spin-off thought, should we have some content covered by a phdr without PF_X, to show that this isn't disassembled? Also, what should happen with -D (which disassembles non-executable stuff too)?

I'm not sure if it worth adding the test now. In the current implementation, it only creates fake sections if the phdr has PF_X so non-executable phdrs won't work. When I add proper section headers for all types of phdr, I can add the tests.

36 ↗(On Diff #441512)

Fair enough, will change.

37 ↗(On Diff #441512)

Sure.

38–41 ↗(On Diff #441512)

Ok.

namhyung updated this revision to Diff 442133.Jul 4 2022, 12:24 PM
namhyung marked 21 inline comments as done.

Updates:

  • split test cases and improve comments
  • remove unnecessary info in the test yaml
  • remove const from obj to remove the const_cast

Your patch (probably) is causing llvm/test/Object/objdump-no-sectionheaders.test to fail. Please fix!

llvm/test/tools/llvm-objdump/X86/disassemble-no-section.test
9

Sorry, when I said omit the <>: entirely, I meant prevent llvm-objdump from producing it at all. If that's not practical (I suspect it isn't for various reasons), I'd suggest leaving it in, so that the check will fail and points out to whoever populates the name where to update a test.

Also, minor nit, but in the tests I'm usually involved in, we have a single blank line between RUN and CHECK directives, to make it easier to identify the different blocks.

16

Nit: I usually indent continuation lines a little, as per the inline edit.

17

Same as above.

20

This is a little bit fragile, as it wouldn't be unreasonable for the format of the instruction and operands to change somewhat, which would mean this check would no longer be valid. It might be best simply to do something like RANGE-EMPTY: to show that the next line is empty.

42

Nit: too many blank lines at EOF (files should end with exactly one \n).

namhyung updated this revision to Diff 442385.Jul 5 2022, 1:41 PM

Updates:

  • small formatting issues on the test files
namhyung marked 5 inline comments as done.Jul 5 2022, 1:42 PM
namhyung added inline comments.
llvm/test/tools/llvm-objdump/X86/disassemble-no-section.test
9

Ok, then I'll leave the part as is and add a blank line.

16

Will change.

17

Ok.

20

Right, will change.

42

Will remove.

namhyung updated this revision to Diff 442966.Jul 7 2022, 10:07 AM
namhyung marked 5 inline comments as done.

Updated, please take a look!

MaskRay accepted this revision.Jul 7 2022, 12:28 PM

LGTM when @jhenderson feels ok.

llvm/include/llvm/Object/ELF.h
188

Move this together with the other member variable StringRef Buf.

llvm/test/tools/llvm-objdump/X86/disassemble-no-section.test
9

Consider adding the two lines

# CHECK:       Disassembly of section :
# CHECK-EMPTY:
19

There is a warning (may be spurious) but you don't need to fix it in this patch.

Use 2>&1 | and test the warning to catch changes when someone fixes it.

23

Better to add needed anchor lines (like <>: you added for the test above) before 55 pushq %rbp.

It may not look great, but the intention is to catch display change which others may improve.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1134–1148

For non-null pointers, the convention in llvm-project is to use references. The llvm-objdump code traditionally did not follow the best practice. I just pushed a commit to migrate a bunch of functions. You will need to rebase and change these to references.

This revision is now accepted and ready to land.Jul 7 2022, 12:28 PM
namhyung added inline comments.Jul 7 2022, 2:27 PM
llvm/include/llvm/Object/ELF.h
188

Will do.

llvm/test/tools/llvm-objdump/X86/disassemble-no-section.test
9

Ok, will add.

19

Sure.

23

Will add.

llvm/tools/llvm-objdump/llvm-objdump.cpp
1134–1148

I see, I will make the change.

namhyung added inline comments.Jul 7 2022, 2:54 PM
llvm/test/tools/llvm-objdump/no-section-headers.test
6 ↗(On Diff #442966)

Hmm.. it seems we already have a test in llvm/test/Object/objdump-no-sectionheaders.test with an binary input file. That means we can drop this test?

MaskRay added inline comments.Jul 7 2022, 3:07 PM
llvm/test/tools/llvm-objdump/no-section-headers.test
6 ↗(On Diff #442966)

If so, yes.

namhyung updated this revision to Diff 443077.Jul 7 2022, 3:21 PM

Updates:

  • drop tools/llvm-objdump/no-section-header.test
  • fix test failure in Object/objdump-no-sectionheaders.test
  • rebase onto main branch
  • a few formatting changes
namhyung marked 4 inline comments as done.Jul 7 2022, 3:23 PM
jhenderson accepted this revision.Jul 8 2022, 12:50 AM

Looks good, once the remaining nits have been addressed.

Do you need assistance merging this patch into the main branch?

llvm/test/Object/objdump-no-sectionheaders.test
4–8

Use CHECK-NEXT, rather than simply CHECK, since these lines should be in order. Probably still want the CHECK-NOT at the end though, to catch further sections being added.

llvm/test/tools/llvm-objdump/X86/disassemble-no-section.test
19

I think @MaskRay suggested also checking the contents of the printed out warning.

20

Nit: Since this comment is relevant to the test case immediately following, rather than a batch of test cases, I'd typically not include a blank line between the comment and run lines.

namhyung updated this revision to Diff 443163.Jul 8 2022, 1:24 AM

Updates:

  • address reviews on test formatting
namhyung marked 3 inline comments as done.Jul 8 2022, 1:28 AM

Do you need assistance merging this patch into the main branch?

Yes, I don't have the commit right. Please use "Namhyung Kim <namhyung@google.com>" as author. Thanks!

Hi, I've addressed all the comments from the previous one. Could you please review it again and commit with "Namhyung Kim <namhyung@google.com>"?

Apologies, I'm not sure how I missed your last comments. I'll go ahead and land this for you.

This revision was landed with ongoing or failed builds.Jul 14 2022, 5:40 AM
This revision was automatically updated to reflect the committed changes.

I've landed the change. Please keep an eye out for build bot failure notifications to see if any might have been caused by your change.