This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add --strip-symbol (-N) option
ClosedPublic

Authored by paulsemel on May 2 2018, 3:53 PM.

Details

Summary

This option strips a symbol from the binary.
As for the --weaken-symbol, --localize-symbol and --globalize-symbol options, we are not following GNU objcopy behavior, as we consider that the user might expect the symbol to be removed, even if it doesn't totally make sense (for this option, it appears that undefined weaken symbols are not stripped).

Diff Detail

Repository
rL LLVM

Event Timeline

paulsemel created this revision.May 2 2018, 3:53 PM
This revision is now accepted and ready to land.May 2 2018, 4:44 PM
jakehehrlich accepted this revision.May 2 2018, 5:03 PM
jhenderson requested changes to this revision.EditedMay 3 2018, 1:31 AM

What happens if the requested symbol is a) referred to by a relocation? b) The signature symbol for a group section?

There might be other cases I haven't thought of either.

Please add tests for this and show that the behaviour is sensible.

This revision now requires changes to proceed.May 3 2018, 1:31 AM

What happens if the requested symbol is a) referred to by a relocation? b) The signature symbol for a group section?

There might be other cases I haven't thought of either.

Please add tests for this and show that the behaviour is sensible.

Well, I thought we concurred in a previous review that we were letting the user do what he wanted, even if it was not making sense.
But, to answer your questions, as you already guessed, for those two cases, objcopy doesn't do anything with the symbols, as it doesn't make any sense to strip them.
That's why I didn't add any tests for those one, as we are not handling symbols differently anyway.

Whilst we want to enable users to do slightly strange things, I don't think we want to leave the ELF in an invalid state. I am aware of two issues with removing symbols:

  1. Relocations targeting the specified symbol:

GNU objcopy's behaviour: don't strip the symbol, and print a message (printing a message is incredibly rare in GNU objcopy!).
With this patch llvm-objcopy's behaviour: strip the symbol, and leave the relocation with an invalid symbol reference, rendering tools unable to use the file.

  1. Groups using the specified symbol as its signature:

GNU objcopy's behaviour: discard the group.
With this patch llvm-objcopy's behaviour: strip the symbol, and leave an invalid reference in the group's sh_info field (in my case, on a Windows debug build, I get the value 0xDDDDDDDD).

Neither of these behaviour's are good in my opinion. When we strip sections, we don't allow stripping sections which are referenced from elsewhere, for example.

Alright, I understand what you're saying to me !
However, in this case, we might want to change other options behavior (I can also leave the ELF file in an invalid state with --localize-symbol or --globalize-symbol.
I actually truly think that, for the options where the user explicitly gives a symbol name, we might just do whatever he wants. But I agree with you for other options like --weaken or --strip-unneeded, where in this case we really want to have the ELF file in a valid state.
Anyway, if you want it, I can follow, the GNU objcopy behavior for this one (if I understood you well, this might consist of issuing a warning, and removing section)

Alright, I understand what you're saying to me !
However, in this case, we might want to change other options behavior (I can also leave the ELF file in an invalid state with --localize-symbol or --globalize-symbol.
I actually truly think that, for the options where the user explicitly gives a symbol name, we might just do whatever he wants. But I agree with you for other options like --weaken or --strip-unneeded, where in this case we really want to have the ELF file in a valid state.
Anyway, if you want it, I can follow, the GNU objcopy behavior for this one (if I understood you well, this might consist of issuing a warning, and removing section)

I think --strip-symbol is slightly different to --localize-symbol etc, because the relocations and groups will still be valid (if a little less than expected). I'm not quite sure what context you meant by leaving it in an invalid state for these.

I have no particular issue about doing what the user wants, but if we do that, we need to do something with the relocations/groups in this situation - we can't leave dangling references, so we'd need to remove these references, if we did. I'd like to defer this to @jakehehrlich to decide, whether we match the GNU objcopy behaviour or attempt to do something slightly different (but still leave things in a valid structural state).

Indeed, the options I mentioned are different..

I did some investigation while implementing --unneeded-symbol, and it appears that the warning you are mentioning only appears when we are manipulating a relocatable ELF file..
Anyway, I would need to better check what this option is really doing (or probably checking the objcopy source code), but it's way more complicated than just this.
Anyway, I would really like to have @jakehehrlich toughts about it, so I'll wait for this too :)

Btw, thanks for reviewing @jhenderson !

jakehehrlich requested changes to this revision.May 3 2018, 4:41 PM

I totally forgot about those isues! Yeah we don't want to leave dangling references around. I really should have caught that issue but I was thinking. I just figured I had already worked out all the issues with symbol removal when I wrote removeSectionReferences but you don't have to worry about those sections sticking around in that case where you *do* have to worry about it in the generic symbol removal case. The behavior I'd like to see here is that the relocations referencing a remove symbol cause an error to be thrown. I'm considering what I want to happen for groups but I belive it should also cause an error if you try to remove the symbol of a group section. I couldn't find any other Symbol pointers and that is the canonical way to refer to a pointer so I believe James caught all the cases. This is going to require extending the hierarchy with a special 'removeSymbolReferences' method in SectionBase and then adding a special 'removeSymbols' function in Object...sorry this just got a lot more complicated.

Can you see what GNU objcopy does when the symbol is defined (both local and global cases)? Roland mentioned that in the defined case it might actually transform the relocation into a section relative relocation instead! That'd be pretty neat IMO but is likely out of the scope of this diff (even with added complexity already mentioned). If it turns out that GNU objcopy emits a section reletive relocation in that case then we can add a TODO there.

Roland also provided some thoughts on what to do with groups. I think I agree that for now should throw an error and make sure to mention all sections that would need to be removed via -R to make it go though. If someone ever submits a bug saying that removing the symbol for a group section doesn't remove the group we can change that behavior.

paulsemel updated this revision to Diff 145404.May 6 2018, 11:05 AM

This change include the handling of symbols referred by group sections.
It also forbids relocation symbol removal for relocatable ELF (like GNU objcopy).

@jakehehrlich I wasn't able to reproduce what Roland told you, but I might have miss something (in my opinion, what he told you only make sense for relocatable ELF files, but I also got the error for defined symbols..)
After doing some tests, it seems that I am getting the same results than GNU objcopy.
I tried to do my best not to change too much the previous architecture.
RFC :)

paulsemel updated this revision to Diff 145405.May 6 2018, 11:14 AM

small fix : removed useless function

jakehehrlich added inline comments.May 7 2018, 3:03 PM
tools/llvm-objcopy/Object.cpp
926

I'd like to move everything over to using function_ref where possible. The way we accumulate in HandleArgs will remain std::function but I think preety much everywhere else we can use function_ref.

935

If this is still needed in the final version of this method, make sure it comes before any pre-computation.

941

This is the wrong way to do this. Link and info are not the only way to reference symbols and you can't know how those sections are using Link/Info here so you might get false positives. This is also unacceptably inefficient. For instance we support chromium and chromium uses --ffunction-sections. That would make this an n^2 loop where n = about 60k (also n will grow to over 100k after I implement large section indexes properly. They're currently using tricks to only run llvm-objcopy on fewer sections at a time.) It's actually worse than n^2 because data symbols are also present. So this is unacceptably inefficient as is. We need a linear time algorithm for removing symbols. In general I want to be linear with respect to number of symbols and n*lg(n) with respect to sections (and I'd like to push for linear in the case where no program headers exist but I don't think that's the case right now).

I didn't think about this at the time but its slightly tricky to get this efficient. I think instead of using 'removeSymbolReferences' giving every section a 'removeSymbols' method that takes a predicate might be called for. The idea is that each section can a) ignore b) throw an error or c) actually remove symbols.

Then you're code is a loop over sections, one of which contains a loop over symbols. You also don't need the extra loop over sections to get the group sections.

946

Actually for this patch, I'd prefer we not remove groups and instead throw an error explaining that you could instead just remove those group sections if you really want that. I don't *think* anyone is relaying on llvm-objcopy to remove those sections when that symbol is removed. If they are they can file a bug report.

paulsemel added inline comments.May 8 2018, 2:56 AM
tools/llvm-objcopy/Object.cpp
941

Arf.. I was totally focused on groups, and totally forgot about this -ffunction-sections option. So yeah, in this case, this might definitely grow very much the size of the Groups (which name doesn't fit anymore btw). Sorry for not thinking about it before..

Anyway, there is something I am missing with the implementation you are suggesting. Basically, what you're saying is that our principal removeSymbols function will call another removeSymbols function on every section. The problem is that only SymbolTable seciton might be able to access symbols, despite we might need to issue a check in for example group sections, to know whether the predicate includes the symbol we are actually referencing. I was thinking about passing the SymbolTable as an argument so that we might be able to search for the referencing symbol and check whether it matches the predicate, but it would, again, lead to inefficient code...

The second thing I was thinking about was to get nested predicates from each sections (except SymbolTable), and then call the SymbolTable removeSymbols function with this last predicate so that other sections might be able to check for conflicts in symbols they are referencing without impacting the complexity of the algorithm.

Am I getting wrong ? What do you think about it ?

Alright, I come back here because I completely missed something.. I didn't realized that the groups sections were already linked to symbols, which makes it completely possible to remove the symbol directly from here.
Anyway, for -ffunction-sections, we are not linking the section with the symbol (which we are doing for the Groups), which will again lead to the problem I mentioned above (I think).
For the moment, I will refactor the code as you told me, and wait until we find a solution for this functions sections. :)

paulsemel updated this revision to Diff 145676.May 8 2018, 6:13 AM

Refactor the code to make it more efficient.
Throwing an error when trying to remove Group symbol or relocation symbols.

Not sure this is specifically an issue with your change, but I think we need a specific test for the first non-local in a symbol table being removed. This is because that affects the sh_info field of that section, but I suspect we don't (deliberately) test that anywhere.

tools/llvm-objcopy/Object.cpp
352

Test case?

383

Test case?

I think this is okay, but not ideal - the section name for a .group section is officially unspecified, but essentially always ".group" so the message won't provide a useful way of identifying the section. Perhaps the input section index should be printed instead/as well?

jakehehrlich added inline comments.May 8 2018, 12:24 PM
tools/llvm-objcopy/Object.cpp
349

You can just use a range based for loop here. In general my preference is for range based for loops. I feel that std::for_each should be used when you have a more complicated iterator situation that isn't neatly handled by ranges.

383

+1 on adding section index.

930

Please use ranged based for loops instead of std::for_each

932

The symbol table is just any old section. You don't have to special case it.

941

I don't think I understand the issue you're having with SymbolTables. The only objects that can store references to symbols are sections. If we loop over the sections they'll either fatally error out or all succeed together (meaning removing the section was valid). I don't think other sections need access to SymbolTable in order to remove their own internal references (or to throw an error).

paulsemel updated this revision to Diff 145807.May 8 2018, 4:09 PM
paulsemel marked 6 inline comments as done.

Added suggestions.
Added test cases for the two errors.

tools/llvm-objcopy/Object.cpp
941

There is actually no issues, I just confused myself, but I then understood the whole thing and changed the patch (as you can see). Sorry about it.. :/

jakehehrlich accepted this revision.May 8 2018, 5:27 PM

This LGTM with a couple fixes but please wait on James for the final approval.

tools/llvm-objcopy/Object.cpp
383

Use Twine instead of std::to_string

941

No worries. I live in a constant state of confusion myself.

tools/llvm-objcopy/llvm-objcopy.cpp
349

Just thought about this. Can we do symbol removals before symbol updates?

rja accepted this revision.May 8 2018, 11:53 PM
rja added a subscriber: rja.

LGTM too

paulsemel added inline comments.May 9 2018, 2:11 AM
tools/llvm-objcopy/llvm-objcopy.cpp
349

Well, I took a look and, to me, it doesn't seem that there is technical reasons not to do it.
For the moment, I don't think this is really relevant, as the user might explicitly enter collapsing options to trigger the counter performance (like LocalizeHidden and DiscardAll for example). Anyway, for the future options I'm planning to implement, I'm pretty sure that the collapsing will be less evident to the user (and in all cases, we might not count on the user not to enter collapsing options, so we might do it anyway).
If you want, I can propose this change in its own patch, to get better track of changes we make ? :)

paulsemel updated this revision to Diff 145882.May 9 2018, 2:14 AM

Replaced std::to_string to llvm::Twine.

paulsemel updated this revision to Diff 145884.May 9 2018, 2:20 AM

Removed llvm namespace specification to Twine to remain compliant with the rest of the code

Side thought: we should probably ensure all our messages reporting sections include the section indexes, since section names don't have to be unique (although it's unusual for them to be otherwise).

test/tools/llvm-objcopy/strip-symbol-group.test
2

I'd rename this test "strip-group-symbol.test", as the current name is a little confusing.

test/tools/llvm-objcopy/strip-symbol-reloc.test
2

Similarly, please rename this test to "strip-reloc-symbol.test" or similar.

27

Could you give these symbols different names, please, that don't indicate something to do with symbol bindings, e.g. 'foo' and 'bar'

We also probably need only 1 symbol.

test/tools/llvm-objcopy/strip-symbol.test
21

As noted above, the bindings are irrelevant, so please rename these symbols to generic names.

tools/llvm-objcopy/llvm-objcopy.cpp
349

I agree with Paul that this should be its own patch. I was thinking about the comment myself, but there are a number of cases we need to consider, as he's suggested, where it's unclear which we should do first (and I'd suggest it follow GNU objcopy if sensible).

paulsemel updated this revision to Diff 145889.May 9 2018, 3:05 AM
paulsemel marked 4 inline comments as done.

Tests: replaced symbol names with more generic names.

removed unnecessary symbols
jhenderson accepted this revision.May 9 2018, 3:10 AM

LGTM, with nit.

test/tools/llvm-objcopy/strip-group-symbol.test
26 ↗(On Diff #145889)

Nit: I don't think it's important, but group signature symbols are usually STB_WEAK binding.

This revision is now accepted and ready to land.May 9 2018, 3:10 AM
paulsemel updated this revision to Diff 145891.May 9 2018, 3:24 AM
paulsemel marked an inline comment as done.

Changed group symbol binding in tests
Rebased

committed on r331924
Still waiting for @jakehehrlich about the symbol removal before the symbol update to close this review.

committed on r331924
Still waiting for @jakehehrlich about the symbol removal before the symbol update to close this review.

Oh sorry; I didn't realize you were waiting on me. I agree with you guys. It's not clear what should happen and at the very least given the complexity that I overlooked it should be another change if it should happen at all. Feel free to land this.

Ping. Do you need anything else from one of us?

paulsemel closed this revision.May 11 2018, 1:02 PM

Ping. Do you need anything else from one of us?

I'm sorry, this patch has been push 2 days ago, just forgot to close the revision.. my bad :)