This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Support -X|--discard-locals.
ClosedPublic

Authored by rupprecht on Jan 25 2019, 10:23 AM.

Details

Summary

This adds support for the --discard-locals flag, which acts similarly to --discard-all, except it only applies to compiler-generated symbols (i.e. symbols starting with .L in ELF).

I am not sure about COFF local symbols: those appear to also use .L in most cases, but also use just L in other cases, so for now I am just leaving it unimplemented there.

Fixes PR36160

Diff Detail

Repository
rL LLVM

Event Timeline

rupprecht created this revision.Jan 25 2019, 10:23 AM
rupprecht set the repository for this revision to rL LLVM.Jan 25 2019, 10:41 AM

I am not sure about COFF local symbols: those appear to also use .L in most cases, but also use just L in other cases, so for now I am just leaving it unimplemented there.

AFAIK they only use .L - however GNU objcopy seems to strip out all local symbols that start with either .L or L. On arches other than i386, where symbols have no prefix otherwise, this actually seems to get rid of any normal local symbol that happens to start with such a char.

But leaving it unimplemented here is fine by me.

Please add a test for the case where the symbol that would be discarded is referenced by a relocation, and show that it matches GNU behaviour, or at least that we are consistent with our existing behaviour.

Another behaviour to check is what happens with SHN_ABS and SHN_COMMON .L* local symbols?

llvm/test/tools/llvm-objcopy/ELF/discard-locals.test
29–39 ↗(On Diff #183569)

You can simplify this a bit, by removing one of the sections, removing the content of the other section, and removing things like the flags, address and alignment fields. You don't need much of an ELF to test symbol discarding.

45–46 ↗(On Diff #183569)

The size and value of these symbols is irrelevant, so let's not list them.

llvm/test/tools/llvm-objcopy/ELF/discard-mix-local-and-all.test
52 ↗(On Diff #183569)

Same comments as in the other test apply here.

100 ↗(On Diff #183569)

COMP-LOCAL? What does that mean? Probably better to call it something like NO-DISCARD.

llvm/tools/llvm-objcopy/CopyConfig.cpp
351–355 ↗(On Diff #183569)

I just want to confirm that this matches GNU objcopy. If using both --discard-all and --discard-locals in GNU objcopy, does the last triumph, or does --discard-all always triumph, regardless of order?

llvm/tools/llvm-objcopy/CopyConfig.h
86 ↗(On Diff #183569)

This isn't a boolean option any more, so should be elsewhere in the config.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
331–332 ↗(On Diff #183569)

I don't think a number of these cases are being tested, from what I can tell. I'd expect to see the following test cases:

  1. normal local symbol (e.g. STT_OBJECT) with .L prefix.
  2. other normal local symbol (i.e. without .L prefix).
  3. Undefined local symbol with .L prefix.
  4. Local file symbol with .L prefix.
  5. Local section symbol with .L prefix.
  6. Global/weak symbol with .L prefix.

As things stand, the test only tests 1) and 2).

rupprecht marked 9 inline comments as done.Jan 28 2019, 3:08 PM

I am not sure about COFF local symbols: those appear to also use .L in most cases, but also use just L in other cases, so for now I am just leaving it unimplemented there.

AFAIK they only use .L - however GNU objcopy seems to strip out all local symbols that start with either .L or L. On arches other than i386, where symbols have no prefix otherwise, this actually seems to get rid of any normal local symbol that happens to start with such a char.

But leaving it unimplemented here is fine by me.

I would prefer to leave it unimplemented as part of this patch, because I don't know anything about COFF :) but I can follow up with an attempt to add it specifically for COFF once all this wiring is in place

Please add a test for the case where the symbol that would be discarded is referenced by a relocation, and show that it matches GNU behaviour, or at least that we are consistent with our existing behaviour.

Added a test -- we're consistent with existing behavior, but not with GNU here. i.e.:

$ llvm-objcopy --discard-all <yaml obj from discard-locals-rel.test> /tmp/foo.o
llvm-objcopy: error: not stripping symbol '.L.rel' because it is named in a relocation.
$ llvm-objcopy --discard-locals <yaml obj from discard-locals-rel.test> /tmp/foo.o
llvm-objcopy: error: not stripping symbol '.L.rel' because it is named in a relocation.
$ objcopy --discard-locals <yaml obj from discard-locals-rel.test> /tmp/foo.o
# No errors, copies the object and avoids stripping .L.rel

I'm not sure if this is a problem in practice; I can test a bit more once this patch is committed. If it is a problem, it's an existing problem with --discard-all.

Another behaviour to check is what happens with SHN_ABS and SHN_COMMON .L* local symbols?

Added these tests, and verified we're consistent with GNU objcopy --discard-locals for these test files.

llvm/test/tools/llvm-objcopy/ELF/discard-mix-local-and-all.test
100 ↗(On Diff #183569)

COMP-LOCAL is for "Compiler-generated local symbols" as it is called in documentation (at least it is in gnu man pages & as I added to --help in this patch); clarified with a comment at the top and expanded to COMPILER-LOCAL

llvm/tools/llvm-objcopy/CopyConfig.cpp
351–355 ↗(On Diff #183569)

Yes, see here: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=binutils/objcopy.c;h=0e17b863d5a7d4b95b1dfbc7aa8d6b683b85d592;hb=HEAD#l4453

  while ((c = getopt(...)) {
      switch (c) {
...
	case 'x':
	  discard_locals = LOCALS_ALL;
	  break;
	case 'X':
	  discard_locals = LOCALS_START_L;
	  break;

i.e. last one wins. This can be double checked with objcopy -x -X vs objcopy -X -x.

rupprecht updated this revision to Diff 183969.Jan 28 2019, 3:12 PM
rupprecht marked 2 inline comments as done.
  • Move option out of boolean-option section
  • Simplify tests
  • Add more test cases

Something that's worth noting is that section symbols often don't actually have a name. Some tools use their section name instead for clarity. I'm wondering if this might cause an issue for such a section symbol for a section named ".L..." e.g. ".LLVM.Custom.Section" or whatever? Definitely it deserves including in the testcase.

llvm/test/tools/llvm-objcopy/ELF/discard-locals.test
37–39 ↗(On Diff #183969)

You can probably get rid of this and the LocalFile symbol below, and just keep the .L versions of those symbols. They don't provide any additional coverage.

54–56 ↗(On Diff #183969)

Ditto.

60–64 ↗(On Diff #183969)

These aren't local, so won't be discarded anyway (and therefore aren't testing anything). They should probably be in the local list...

rupprecht marked 4 inline comments as done.Jan 29 2019, 7:53 AM

Something that's worth noting is that section symbols often don't actually have a name. Some tools use their section name instead for clarity. I'm wondering if this might cause an issue for such a section symbol for a section named ".L..." e.g. ".LLVM.Custom.Section" or whatever? Definitely it deserves including in the testcase.

Added

Of note -- GNU objcopy is indeed enforcing the empty section name by clearing the symbol name when copying the .L.LocalSection symbol here. We preserve it, not sure if that matters.
Also two other differences: GNU objcopy is promoting .L.LocalFile and .L.undefined from locals to globals. Again, I think that doesn't matter too much, those files should probably be global to begin with.

llvm/test/tools/llvm-objcopy/ELF/discard-locals.test
60–64 ↗(On Diff #183969)

I don't think a local SHN_COMMON symbol makes any sense, so I just removed that case, but moved the others to local... although I'm not sure the others make sense to be local either.

rupprecht updated this revision to Diff 184093.Jan 29 2019, 7:54 AM
rupprecht marked an inline comment as done.
  • Update test cases
jhenderson accepted this revision.Jan 30 2019, 1:56 AM

I'm happy with this as is. LGTM.

llvm/test/tools/llvm-objcopy/ELF/discard-locals.test
60–64 ↗(On Diff #183969)

I guess it depends what you mean by "make sense". At least in assembly, you can generate them (although you can't generate section symbols with names, I believe).

llvm/tools/llvm-objcopy/StripOpts.td
61 ↗(On Diff #184093)

Perhaps this should be "i.e." rather than "e.g."?

This revision is now accepted and ready to land.Jan 30 2019, 1:56 AM
rupprecht marked an inline comment as done.Jan 30 2019, 6:28 AM
rupprecht added inline comments.
llvm/tools/llvm-objcopy/StripOpts.td
61 ↗(On Diff #184093)

I think .L is not universal -- it seems macho and (sometimes) COFF uses just L, and mips uses $ -- hence "e.g." as this is just ELF's example.

This revision was automatically updated to reflect the committed changes.