This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add --keep-symbol (-K) option
ClosedPublic

Authored by paulsemel on May 14 2018, 2:20 AM.

Details

Summary

This option permits to explicitly keep the specified symbol so that it doesn't get removed.

Diff Detail

Repository
rL LLVM

Event Timeline

paulsemel created this revision.May 14 2018, 2:20 AM

Code change looks fine to me. However, what is the GNU objcopy behaviour for doing --strip-symbol/--keep-symbol in the same command-line? Does last win, or is the symbol always kept, or is the symbol always discarded? Please also add a test for our behaviour for this pair of switches, as it is an interesting edge case.

Also, I know that at this stage the behaviour of --keep-symbol doesn't affect symbols in discarded sections (e.g. due to --remove-section), but could you add a test for attempting to keep a symbol in such a section anyway, please, to document the behaviour, in case we decide to refactor things later.

Code change looks fine to me. However, what is the GNU objcopy behaviour for doing --strip-symbol/--keep-symbol in the same command-line? Does last win, or is the symbol always kept, or is the symbol always discarded? Please also add a test for our behaviour for this pair of switches, as it is an interesting edge case.

So, for --strip-symbol as for --discard options, the --keep-symbol option win on the previous ones, that's why I didn't add a second test, but I can do it for sure :)

Also, I know that at this stage the behaviour of --keep-symbol doesn't affect symbols in discarded sections (e.g. due to --remove-section), but could you add a test for attempting to keep a symbol in such a section anyway, please, to document the behaviour, in case we decide to refactor things later.

Sure, I will add a test for this behavior !

paulsemel updated this revision to Diff 146573.May 14 2018, 3:55 AM

Added some tests for :

  • in addition with --remove-section
  • in addition with the --strip-symbol option
jhenderson accepted this revision.May 14 2018, 4:40 AM

LGTM with the suggested comments.

test/tools/llvm-objcopy/keep-symbol-remove-section.test
2

Nit: I'd suggest using only long options in this sort of test, to make it a little clearer.

test/tools/llvm-objcopy/keep-symbol.test
3

Could you make this "-K foo -N foo -N bar --keep-symbol bar -N baz" or similar, to show that order doesn't matter, please?

This revision is now accepted and ready to land.May 14 2018, 4:40 AM
paulsemel updated this revision to Diff 146585.May 14 2018, 5:33 AM
paulsemel marked 2 inline comments as done.

Applied James' nits

I'll wait for @jakehehrlich approval to land this one :)

Would be great to expose this option in llvm-strip, not necessarily in this
patch, though.

jakehehrlich accepted this revision.May 14 2018, 4:15 PM

LGTM

test/tools/llvm-objcopy/keep-symbol-remove-section.test
2

I'd like to +1. It's good to test that the short options work but I can never remember what the aliases are. I had to go look them up for keep-symbol.test

test/tools/llvm-objcopy/keep-symbol.test
3

nit: Can you use --discard-all following James recommendation above. Since --discard-all isn't being tested (nor any related option really) here it shouldn't use an alias.

This revision was automatically updated to reflect the committed changes.

btw, i've looked at the code one more time, it seems to me it's not completely correct, the situation is a bit more subtle.
For example,

main.c:
int xx= 1;
int yy() { return 239; }

gcc -g -c main.c -o main.o
cp main.o _main.o

objcopy --keep-symbol xx --strip-all main.o -o main2.o
llvm-objcopy --keep-symbol xx --strip-all _main.o -o _main2.o

llvm-objcopy will entirely wipe out the symbols table, the logic appears to be wrong
(bintuils objcopy does the right thing at the same time)

yeah, the behavior of --keep-symbol is more complicated than what's going on here.
In particular,

main.c

attribute((section(".data.foo"))) int xx= 1;
int yy() { return 239; }

objcopy --keep-symbol xx --remove-section .data.foo main.o main2.o

objcopy: main1.o: symbol `xx' required but not present
objcopy:main1.o: No symbols
[alexshap@devvm ~/local] echo $?
1

so probably the exit code should not be 0, error reporting can be better, though, than what binutils objcopy has to offer.

@paulsemel , can you send fix (looks like it's not that involved to fix it) ? otherwise i can send a patch for review myself . cc: @jakehehrlich

In D46819#1100415, @alexshap wrote:

yeah, the behavior of --keep-symbol is more complicated than what's going on here.
In particular,

main.c

attribute((section(".data.foo"))) int xx= 1;
int yy() { return 239; }

objcopy --keep-symbol xx --remove-section .data.foo main.o main2.o

objcopy: main1.o: symbol `xx' required but not present
objcopy:main1.o: No symbols
[alexshap@devvm ~/local] echo $?
1

so probably the exit code should not be 0, error reporting can be better, though, than what binutils objcopy has to offer.

@paulsemel , can you send fix (looks like it's not that involved to fix it) ? otherwise i can send a patch for review myself . cc: @jakehehrlich

I'm not sure how concerned I am about this but I've got some thoughts I suppose. The issue here is that GNU objcopy implicitly keeps a section if a symbol is explicitly kept that is defined in that section. Right now we don't keep such sections. I'm fine with the added error for --remove-section and --keep-symbol conflicts because those are explicit but I don't think we should error out for all such cases and I don't want to put any extra effort into erroring out in that case. In more complicated mixtures of symbol keeping and section removals we should probably err on the side of caution by not removing sections (which appears to be what GNU objcopy does as well). This is probably a slightly non-trivial feature to implement. The reasoning is that in some big build success is more important than reducing size. A warning should be issued at best in that case and I'd only consider an error valid in the case of an explicit-keep explicit-remove conflict. In past discussions of this form it was decided that we shouldn't worry about that sort of error. In general an explict-keep should override *everything*.

So my recommendation would be to implement keeping of sections for kept symbols defined in those sections but not to worry about the error case. In fact I'm not even sure I'd be ok with an error in that case.

@jakehehrlich Jake, it's one issue, a separate issue (which is probably more important and creating troubles right now when i first tried to use this newly-implemented flag) (mentioned in my first comment) is
that llvm-objcopy removes the symbols table regardless of -K (for a detailed example have a look at that comment), imo this needs to be fixed. Am I missing smth ?

In D46819#1100483, @alexshap wrote:

@jakehehrlich Jake, it's one issue, a separate issue (which is probably more important and creating troubles right now when i first tried to use this newly-implemented flag) (mentioned in my first comment) is
that llvm-objcopy removes the symbols table regardless of -K (for a detailed example have a look at that comment), imo this needs to be fixed. Am I missing smth ?

I've not looked at the GNU behaviour, but if GNU keeps that symbol (and only that symbol), it implies that we cannot simply strip the symbol table section via the removeSections code. Instead I'd suggest we need to do two things: 1) If Config.StripAll is set, remove a symbol, unless it is marked as keep. 2) If Config.StripAll is set, and the symbol table is empty (i.e. no symbols were marked as keep), we strip the symbol table.

I think --strip-all is an example of an "implicit" removal, rather than explicit, so the explicit keep should win out with no warning or error.

Hi @alexshap !

I actually found this behavior yesterday, but I don't think the problem resides in this option implementation, but more in option conflicts (I dive deep in objcopy source code, and it seems that they have such conflict management).
So, I think you're right and we might do something here, but I truly think that it might not be a fix of this option, but more a real option conflicts management thing.
As far as I remember during my tests, this is not the only case where we have options conflicts now.
What do you think ?
cc @jakehehrlich @jhenderson

@paulsemel, as i said, there are 2 (similar but separate) issue.
I'm mostly concerned with the fact that we incorrectly remove the symbols table if --strip-all -K foo is specified because it's a real world use-case.
As @jhenderson suggested

>Instead I'd suggest we need to do two things: 1) If Config.StripAll is set, remove a symbol, unless it is marked as keep. 2) If Config.StripAll is set, and the symbol table is empty (i.e. no symbols were marked as 
>keep), we strip the symbol table.

i think this is how it should work, so i completely agree with James.
Does it sound good to you ?

In D46819#1101523, @alexshap wrote:

@paulsemel, as i said, there are 2 (similar but separate) issue.
I'm mostly concerned with the fact that we incorrectly remove the symbols table if --strip-all -K foo is specified because it's a real world use-case.
As @jhenderson suggested

>Instead I'd suggest we need to do two things: 1) If Config.StripAll is set, remove a symbol, unless it is marked as keep. 2) If Config.StripAll is set, and the symbol table is empty (i.e. no symbols were marked as 
>keep), we strip the symbol table.

i think this is how it should work, so i completely agree with James.
Does it sound good to you ?

Sure, totally fine with this. My point was more that the problem was not in this patch (I mean, even if you would have find this case before, this would still have been a separate one, as it more concerns the --strip-all option implementation than this one).
Anyway, I think that the current architecture is not really compliant with this (maybe the symbol removal should happen before the section removal ?)

! In D46819#1101534, @paulsemel wrote:
Sure, totally fine with this. My point was more that the problem was not in this patch (I mean, even if you would have find this case before, this would still have been a separate one, as it more concerns the --strip-all option implementation than this one).
Anyway, I think that the current architecture is not really compliant with this (maybe the symbol removal should happen before the section removal ?)

No, forget about it, we should probably do another pass after symbol removal.

to avoid misunderstanding - how will we proceed ? (should i send a diff myself or you will fix it)

In D46819#1101548, @alexshap wrote:

to avoid misunderstanding - how will we proceed ? (should i send a diff myself or you will fix it)

You seem to have something clear in mind, so you can go for it, I don't matter :)