This is an archive of the discontinued LLVM Phabricator instance.

[yaml2obj][obj2yaml] - Change how symbol's binding is descibed when parsing/dumping.
ClosedPublic

Authored by grimar on Apr 2 2019, 5:38 AM.

Details

Summary

Currently, YAML has the following syntax for describing the symbols:

Symbols:
  Local:
    LocalSymbol1:
    ...
    LocalSymbol2:
    ...
  ...
  Global:
    GlobalSymbol1:
  ...
  Weak:
  ...
  GNUUnique:

I.e. symbols are grouped by their bindings. That is not very convenient,
because:

  • It does not allow to set a custom binding, what can be useful for producing

broken/special outputs for test cases. Adding a new binding would require to
change a syntax (what we observed when added GNUUnique recently).

  • It does not allow to change the order of the symbols in .symtab/.dynsym,

i.e. currently all Local symbols are placed first, then Global, Weak and GNUUnique
are following, but we are not able to change the order.

  • It is not consistent. Binding is just one of the properties of the symbol,

we do not group them by other properties.

  • It makes the code more complex that it can be. This patch shows it can be simplified

with the change performed.

The patch changes the syntax to just:

Symbols:
  Symbol1:
  ...
  Symbol2:
  ...
...

With that, we are able to work with the binding field just like with any other symbol property.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Apr 2 2019, 5:38 AM
rupprecht added inline comments.Apr 2 2019, 11:59 AM
tools/yaml2obj/yaml2elf.cpp
807–808 ↗(On Diff #193259)

Why does this need to be an error?

grimar marked an inline comment as done.Apr 2 2019, 2:18 PM
grimar added inline comments.
tools/yaml2obj/yaml2elf.cpp
807–808 ↗(On Diff #193259)

To always produce the correct ELF output for now:
"In each symbol table, all symbols with STB_LOCAL binding precede the weak and global symbols."
(http://refspecs.linuxbase.org/elf/gabi4+/ch4.symtab.html)

Before this change, we followed this rule. And the current code does the same.

I think we might want to remove this restriction soon (to allow producing broken outputs),
but I just did not want to do it in this patch. That would require adding more test case(s) I think
to document our behavior in different situations.

Also atm findLocalsNum I added to yaml2elf.cpp assumes that locals always go first for simplicity.

LGTM, but since this is a pretty big change, it'd be good to get someone else's opinion

tools/yaml2obj/yaml2elf.cpp
807–808 ↗(On Diff #193259)

Thanks, that makes sense. I wasn't aware of that restriction.

When you do want to break it, it would be great if the *default* mode was to be strict like this, and require additional syntax when you want to create a broken elf file.

grimar marked an inline comment as done.Apr 3 2019, 1:13 AM

LGTM, but since this is a pretty big change, it'd be good to get someone else's opinion

Yeah. Thanks, Jordan!

tools/yaml2obj/yaml2elf.cpp
807–808 ↗(On Diff #193259)

When you do want to break it, it would be great if the *default* mode was to be strict like this, and require additional syntax when you want to create a broken elf file.

In recent reviews of yaml2obj/obj2yaml we briefly discussed an idea about adding one more flag, -no-validate or alike,
which would allow disabling validation of the output. Perhaps this is one more case when such a flag might be useful.

Thanks for doing this!

test/Object/X86/yaml2obj-elf-x86-rel.yaml
41 ↗(On Diff #193259)

Nit: missing new line at EOF.

test/tools/llvm-objdump/verneed-elf.test
48 ↗(On Diff #193259)

Nit: No new line at EOF.

test/tools/llvm-readobj/broken-group.test
79 ↗(On Diff #193259)

This doesn't match up with their old binding. Was that deliberate?

test/tools/llvm-readobj/elf-section-types.test
223 ↗(On Diff #193259)

Nit: no new line at EOF.

test/tools/obj2yaml/verneed-section.yaml
70 ↗(On Diff #193259)

Not: no new line at EOF.

test/tools/yaml2obj/dynamic-symbols.yaml
25–28 ↗(On Diff #193259)

Why doesn't this cause an error?

test/tools/yaml2obj/dynsym-dynstr-addr.yaml
39 ↗(On Diff #193259)

Nit: no binding here? Probably harmless though...

test/tools/yaml2obj/elf-symbols-binding-order.yaml
3 ↗(On Diff #193259)

Nit: delete "the"

6 ↗(On Diff #193259)

Nit: I'd get rid of "Has a". I don't think it's needed. I also think the message should disambiguate between dynamic and static symbols, e.g. "local symbol 'bar' after global in Symbols list".

test/tools/yaml2obj/elf-symtab-shtype.yaml
22 ↗(On Diff #193259)

Nit: no new line at EOF.

tools/yaml2obj/yaml2elf.cpp
324 ↗(On Diff #193259)

Perhaps better to call this "findLastLocal" or something along those lines. It's not clear what "LocalsNum" is.

343 ↗(On Diff #193259)

Is this behaviour tested anywhere?

grimar marked an inline comment as done.Apr 3 2019, 2:11 AM
grimar added inline comments.
test/tools/yaml2obj/dynamic-symbols.yaml
25–28 ↗(On Diff #193259)

We do not have a check for .dynsym symbols order. buildSymbolIndex is called for .symtab only.
I believe .dynsym never contains local symbols in a normally produced ELF.
I.e. this test case is kind of a bit broken from start.

grimar updated this revision to Diff 193465.Apr 3 2019, 4:06 AM
grimar marked 7 inline comments as done.
  • Addressed review comments.
test/tools/llvm-readobj/broken-group.test
79 ↗(On Diff #193259)

No. Thanks for catching!

test/tools/yaml2obj/dynsym-dynstr-addr.yaml
39 ↗(On Diff #193259)

It was not my intention to change any bindings. Fixed, thanks!

test/tools/yaml2obj/elf-symbols-binding-order.yaml
6 ↗(On Diff #193259)

Done. buildSymbolIndex is not called for dynamic symbols. Hence the error message now always mentions the "Symbols" list.

I thought about renaming buildSymbolIndex to checkSymbolsAndBuildIndex and pass both Symbols
and DynamicSymbols there. Since .dynsym normally do not contain local symbols, for DynamicSymbols
we should just probably report any local symbol occurrence. Since it would break the dynamic-symbols.yaml,
I would probably do such a change in an independent review.

tools/yaml2obj/yaml2elf.cpp
343 ↗(On Diff #193259)

I think no, I posted a patch: D60192 for that.

jhenderson accepted this revision.Apr 3 2019, 5:00 AM

LGTM, with one minor comment.

tools/yaml2obj/yaml2elf.cpp
324 ↗(On Diff #193465)

Sorry, actually, I think findFirstNonGlobal would be better, now that I think about it!

This revision is now accepted and ready to land.Apr 3 2019, 5:00 AM
grimar marked an inline comment as done.Apr 3 2019, 5:03 AM
grimar added inline comments.
tools/yaml2obj/yaml2elf.cpp
324 ↗(On Diff #193465)

Yeah, that probably makes more sense. Thanks for the review!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2019, 7:52 AM
Herald added a subscriber: jrtc27. · View Herald Transcript

This broke the LLDB tests on the Windows Bot:

http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/3237

And I just saw you fix it. Thanks!