This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Further rearrange llvm-objdump sections for compatability
ClosedPublic

Authored by rupprecht on Sep 25 2019, 5:59 PM.

Details

Summary

rL371826 rearranged some output from llvm-objdump for GNU objdump compatability, but there still seem to be some more.

I think this rearrangement is a little closer. Overview of the ordering which matches GNU objdump:

  • Archive headers
  • File headers
  • Section headers
  • Symbol table
  • Dwarf debugging
  • Relocations (if --disassemble is not used)
  • Section contents
  • Disassembly

Diff Detail

Repository
rL LLVM

Event Timeline

rupprecht created this revision.Sep 25 2019, 5:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2019, 5:59 PM
jhenderson added inline comments.Sep 26 2019, 3:15 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
2090–2091 ↗(On Diff #221873)

GNU objdump section contents appear after relocations in the version I'm using.

2094 ↗(On Diff #221873)

Is there a test case for this behaviour, i.e. suppressing -R output (also -r) when --disassemble is specified?

2118 ↗(On Diff #221873)

Not sure if this is the right bit to comment on, but --dwarf=frames output for GNU objdump appears immediately after the symbol table, and llvm-objdump doesn't appear to be doing this currently, even with these changes.

rupprecht marked 4 inline comments as done.
  • Rearrange section contents & dwarf output to correct locations
  • Add test cases for orderings not covered by all-headers.test
  • Add test case for --disassemble + --dyn-relocs (--disassemble + --reloc already exists)
  • Update lld tests
rupprecht edited the summary of this revision. (Show Details)Sep 27 2019, 12:03 PM

Thanks, sorry for missing the sections contents & dwarf output ordering. Fixed those and added tests for the missing pieces.

I didn't run check-lld when mailing, and it turns a lot of those test make frequent use of llvm-objdump. Most cases were just rearranging lines, but a few places required tweaking the matchers for variable captures.

llvm/tools/llvm-objdump/llvm-objdump.cpp
2094 ↗(On Diff #221873)

-r is covered by elf-disassemble-relocs.test. I forked one that runs on a dynamic object and covers this case.

@rupprecht A bit of a beginner to LLVM, but why add the suppression for DynamicRelocations when --disassemble is provided?

@rupprecht A bit of a beginner to LLVM, but why add the suppression for DynamicRelocations when --disassemble is provided?

The LLVM binutils, like llvm-objdump, aim to produce output identical to the equivalent GNU binutils (with some exceptions for various specific reasons). This change aims to improve that compatibility. It could be argued however, that the dynamic relocations change should be put in a separate change, since it's a slightly different issue.

lld/test/ELF/mips-32.s
29 ↗(On Diff #222222)

Delete the extra blank line.

lld/test/ELF/mips-got16-relocatable.s
25–26 ↗(On Diff #222222)

What's the purpose of the new 00020{{0*}}? I'm guessing the address printed by the disassembly is not the exact value? Could this be improved using some extra switches, so that the pattern is robust to future changes, e.g. different base addresses?

llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-relocs.test
1 ↗(On Diff #222222)

This test doesn't make any checks for relocations being printed inline as far as I can see.

56 ↗(On Diff #222222)

Is this needed any more? I know @grimar has been making various improvements to yaml2obj to fix things like this.

llvm/test/tools/llvm-objdump/all-headers.test
15 ↗(On Diff #222222)

What's the point of this line?

34 ↗(On Diff #222222)

Ditto

44 ↗(On Diff #222222)

Ditto

llvm/test/tools/llvm-objdump/headers-ordering.test
1 ↗(On Diff #222222)

I'd rename the file "output-ordering.test" - it tests things other than just headers.

2–3 ↗(On Diff #222222)

I think it would be better to be explicit with all the switches rather than use --all-headers. It would help figure out what output to actually expect.

17 ↗(On Diff #222222)

Do you need this?

18 ↗(On Diff #222222)

-h

rupprecht marked 2 inline comments as done.Oct 1 2019, 2:38 PM

@rupprecht A bit of a beginner to LLVM, but why add the suppression for DynamicRelocations when --disassemble is provided?

The LLVM binutils, like llvm-objdump, aim to produce output identical to the equivalent GNU binutils (with some exceptions for various specific reasons). This change aims to improve that compatibility. It could be argued however, that the dynamic relocations change should be put in a separate change, since it's a slightly different issue.

Good point, I split that off to D68303. Will remove it from this revision and address the other points shortly.

rupprecht updated this revision to Diff 222737.Oct 1 2019, 6:15 PM
rupprecht marked 10 inline comments as done.
  • Remove --dyn-relocs + --disassemble change and move it to a different patch
  • Delete whitespace
  • Rename test to output-ordering
  • Remove dynamic section content checks (INIT)
  • Expand -x flag in test
lld/test/ELF/mips-got16-relocatable.s
25–26 ↗(On Diff #222222)

The instruction printed is: addiu $4, $25, 0x220. This corresponds to symbol value 00020220.

I'm not familiar with this test (or mips in general), and I can't quite figure out where the 0002xxxx offset is coming from. Does this regex look better? It should get the last 4 digits, zeroes removed, so the base address can change (as long as the offset is a multiple of 00010000).

Alternatively, I could just force the old ordering by using an intermediate file, e.g.
llvm-objdump -d file.o > tmp
llvm-objdump -t file.o >> tmp
FileCheck --input-file=tmp

llvm/test/tools/llvm-objdump/all-headers.test
15 ↗(On Diff #222222)

I suppose I don't need it anymore.

GNU objdump doesn't print "Dynamic Section:" if there is nothing there, but llvm-objdump does. To compare sections, I added it to the test data. I think it's still useful to keep in the test data, but we don't need to assert on it -- just the header.

@rupprecht A bit of a beginner to LLVM, but why add the suppression for DynamicRelocations when --disassemble is provided?

As James mentioned, this is to emulate GNU objdump better; and AIUI the original intention is that --disassemble should include relocations inline in the assembly output, so including a separate section is redundant.

rupprecht edited the summary of this revision. (Show Details)Oct 1 2019, 6:17 PM
jhenderson added inline comments.Oct 2 2019, 1:37 AM
lld/test/ELF/mips-got16-relocatable.s
25–26 ↗(On Diff #222222)

I'm not familiar with this test (or mips in general), and I can't quite figure out where the 0002xxxx offset is coming from

Without looking in detail, I'd guess it's just the standard image base, so effectively the default value of '.' in the internal linker script.

llvm/test/tools/llvm-objdump/output-ordering.test
9 ↗(On Diff #222737)

headers -> output

54–67 ↗(On Diff #222737)

Get rid of the excessive spacing

rupprecht updated this revision to Diff 222864.Oct 2 2019, 10:25 AM
  • Fix comments and whitespace
rupprecht marked 2 inline comments as done.Oct 2 2019, 10:26 AM

@rupprecht With most of the lit test adjustments here the only change was the ordering, which makes sense. Could you help me understand why the addition of the various [1-9] style regex expressions were also added to certain tests? Are those just ensuring more verbose matches with FileCheck?

jhenderson accepted this revision.Oct 3 2019, 1:40 AM

LGTM. I'll leave @rupprecht to answer @justice_adams's comment.

This revision is now accepted and ready to land.Oct 3 2019, 1:40 AM

@rupprecht With most of the lit test adjustments here the only change was the ordering, which makes sense. Could you help me understand why the addition of the various [1-9] style regex expressions were also added to certain tests? Are those just ensuring more verbose matches with FileCheck?

In llvm/test/MC/COFF/symidx.s for example, the thing we're matching was originally:

Contents of section .data:
  0000 0b000000 0a000000
SYMBOL TABLE:
  [ a] ... foo
  [ b] ... bar

where a and b are single digit numbers. We don't necessarily know or care what those values are, just that they should have the same value in both places. So we capture it the first time we see the pattern, and then match against it the second time we see it.

For the line:

// CHECK-NEXT:  0000 0[[BAR:[1-9]]]000000 0[[FOO:[1-9]]]000000

The [[BAR:[1-9]]] means to look for the regex [1-9] and capture it in variable BAR. We can then subsequently write a check that verifies we see a line with that previously captured value:

// CHECK-NEXT: [ [[BAR]]](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000001 bar

e.g. if BAR is captured to be 7, the test is equivalent to:

// CHECK-NEXT: [ 7](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000001 bar

This is described in more detail here: http://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-string-substitution-blocks

Anyway: this patch changes the order of sections to:

SYMBOL TABLE:
  [ a] ... foo
  [ b] ... bar
Contents of section .data:
  0000 0b000000 0a000000

(hit submit too soon)

Because the order of sections changes, we have to move the regex definition to the first place we capture the variable. That's why the :[1-9] part moves to the first var match.

@justice_adams I hope this answers your question, if not I'm happy to elaborate some more.

@rupprecht Thanks for the explanation, that makes perfect sense. This LGTM (but note that I'm fairly new to the binutils)

justice_adams accepted this revision.Oct 3 2019, 2:54 PM
This revision was automatically updated to reflect the committed changes.