This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Match GNU objdump on symbol types shown in disassembly output.
ClosedPublic

Authored by ychen on Jun 6 2019, 9:28 AM.

Event Timeline

ychen created this revision.Jun 6 2019, 9:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 6 2019, 9:28 AM
ychen edited the summary of this revision. (Show Details)Jun 6 2019, 9:45 AM
ychen edited the summary of this revision. (Show Details)Jun 6 2019, 6:00 PM
ychen removed a subscriber: rupprecht.

The test doesn't have a dynamic symbole table (.dynsym) so the source change seems irrelevant.
I think we have the current output without the source change.

See not_func is misaligned, so something goes wrong.

0000000000001001 both_dyn:
    1001: 90                            nop
    1002: 90                            nop

0000000000001003 not_func:
    1003:        90                              .

STT_SECTION exists primarily for relocations are they normally have STB_LOCAL binding.
None of ld.bfd/gold/lld places STB_LOCAL symbols in .dynsym (I am less confident with the statement regarding ld.bfd) so the updated test may not be used in practice.

The test doesn't have a dynamic symbole table (.dynsym) so the source change seems irrelevant.

@MaskRay, the test DOES have a dynamic symbol table. Line 61 in the updated test starts the DynamicSymbols list, which causes one to be implicitly generated.

I think the code change is mostly just to bring things into line with static symbol dumping, but I could see an argument for deleting the if statement entirely. In that case, I'd just confirm against GNU objdump behaviour to see if it really matters.

ychen added a comment.EditedJun 7 2019, 5:03 PM

Thanks for the review, MaskRay.

See not_func is misaligned, so something goes wrong.

I think it might be better to take care of the formatting issue (dumpELFData emits extra tab) in a separate patch. And this one focuses on which symbols should be included in the disassembly output?

STT_SECTION exists primarily for relocations are they normally have STB_LOCAL binding.
None of ld.bfd/gold/lld places STB_LOCAL symbols in .dynsym (I am less confident with the statement regarding ld.bfd) so the updated test may not be used in practice.

@MaskRay @jhenderson @grimar @rupprecht I'm not sure if this has been discussed before, I think it comes down to whether we want llvm-objdump to be compatible with GNU objdump on:

  1. well-formed object files (definitely yes)
  2. well-formed object files with something not usually produced by static linker etc.. and not explicitly disallowed by ABI (like STT_SECTION symbol in .dynsym in this case)
  3. ill-formed object files (maybe not?)

What's your thought?

deleting the if statement entirely

@jhenderson this seems to assume 1. The patch in its current shape assumes 2?

My personal thought is definitely 1), and no need for 3), as long as the error/output makes sense (note that if GNU produces something sensible and doesn't error for 3), we should do the same). I'm in two minds about 2), mostly because we can't be certain that every case we think is unusual applies to every use case out there for downstream users. Ultimately, I'm okay with 2) going either way. I'd plump for matching GNU, but if there's general opinions from others against it, I don't mind.

How about only addressing what PR41947 need (STT_OBJECT and STT_FUNC) in this patch and have a separate patch addressing matching with GNU on what other types of symbols should show up in output? Additionally, a patch for matching GNU on output format for these symbols got shown.

How about only addressing what PR41947 need (STT_OBJECT and STT_FUNC) in this patch and have a separate patch addressing matching with GNU on what other types of symbols should show up in output?

PR41947 only used a STT_OBJECT as an example, so we shouldn't just limit this patch to the two types. Note that STT_NOTYPE, as well as symbols with processor etc specific types, can also appear in dynamic symbol tables.

Additionally, a patch for matching GNU on output format for these symbols got shown.

Sorry, I'm not sure I understand what you mean.

ychen updated this revision to Diff 204985.Jun 16 2019, 6:46 PM
  • Match GNU objdump on symbol types shown in disassembly output.
  • Update test
jhenderson added inline comments.Jun 17 2019, 4:36 AM
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
112

As noted in D63393, we shouldn't do this and should instead use 0x10 or whatever explicitly, since STT_LOOS etc are just range markers. Also, it's probably only necessary to test one value in each range.

ychen updated this revision to Diff 205214.Jun 17 2019, 4:30 PM
  • update test
ychen marked an inline comment as done.Jun 17 2019, 4:31 PM
ychen retitled this revision from [llvm-objdump] Include dynamic zero-sized and non-function symbols in disassembly to [llvm-objdump] Match GNU objdump on symbol types shown in disassembly output..
ychen edited the summary of this revision. (Show Details)

Aside from one test comment, this looks good to me.

llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
91

I'm not sure I follow what the purpose of this particular symbol is beyond what func_dyn gives us?

ychen updated this revision to Diff 205353.Jun 18 2019, 7:58 AM
ychen marked 2 inline comments as done.
  • update
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
91

I think it is not needed. It was in the test file but I'm not sure its purpose so I left it here.

Deleted.

jhenderson added inline comments.Jun 18 2019, 8:41 AM
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
63

I think you may have gone overboard with the symbol renaming. These first two are deliberately named so that they clearly clash/don't clash with a corresponding static symbol (only_[static|dyn] means only the static/dynamic symbol respectively is present at that address whilst the both_[static|dyn] means both symbols have that value).

86

This value is bad, because the test case was supposed to be for just a static symbol at this address. See the symbol name comments.

91

I don't think you need to rename this symbol from zero_sized.

ychen updated this revision to Diff 205368.Jun 18 2019, 8:57 AM
  • update test
ychen marked 4 inline comments as done.Jun 18 2019, 8:59 AM
ychen added inline comments.
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
63

Thank you. Now I see the point.

ychen updated this revision to Diff 205369.Jun 18 2019, 9:01 AM
ychen marked an inline comment as done.
  • update test
jhenderson added inline comments.Jun 19 2019, 3:59 AM
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
25–27

This looks wrong, and I'm surprised this test isn't failing now. As things stand 0x1002 isn't being checked for at all in dynamic output, but there's no reason it shouldn't be there.

74

You can reduce the diff by making this symbol the STT_FUNC one and the func symbol the STT_NOTYPE one.

ychen updated this revision to Diff 205670.Jun 19 2019, 2:19 PM
ychen marked an inline comment as done.
  • update
ychen updated this revision to Diff 205676.Jun 19 2019, 2:36 PM
ychen marked 2 inline comments as done.
  • update
ychen added inline comments.Jun 19 2019, 2:37 PM
llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
25–27

both_dyn is STT_OBJECT, dumped as data in one line until the next symbol regardless of its size.

jhenderson accepted this revision.Jun 20 2019, 2:16 AM

LGTM, with the suggested change, but please wait for the other reviewers to confirm too.

llvm/test/tools/llvm-objdump/X86/elf-disassemble-dynamic-symbols.test
25–27

Ah, okay. Probably worth making both_dyn the STT_NOTYPE symbol then and making what is currently notype the STT_OBJECT one. That'll be less confusing.

This revision is now accepted and ready to land.Jun 20 2019, 2:16 AM
ychen updated this revision to Diff 205838.Jun 20 2019, 9:21 AM
  • update
ychen marked an inline comment as done.Jun 20 2019, 9:22 AM
ychen added a comment.Jun 23 2019, 9:13 PM

Hi @MaskRay, does this version look good to you?

MaskRay accepted this revision.Jun 23 2019, 9:29 PM
This revision was automatically updated to reflect the committed changes.