Page MenuHomePhabricator

[llvm-objcopy] Add --strip-unneeded option
ClosedPublic

Authored by paulsemel on May 15 2018, 10:48 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I'm now getting a bit lost. Please could you outline the GNU objcopy behaviour for the following axes:

Sure, I'll try to be as much consistent as I can.

Axis 1: elf type (ET_REL versus ET_DYN/ET_EXEC)
Axis 2: symbol binding (Global/Weak/Local)

There is a real thing here. The fact is that, as you already know, for ET_REL, global symbols might be link with other relocatable files. For this reason, and only for ET_REL, objcopy is keeping those symbols.
I will try to give you an example. Consider the test file test/tools/llvm-objcopy/localize.test. If you yaml2obj it, and launch objcopy --strip-unneeded %t %t1, you will notice that those the weak and global symbols are kept in the result binary, despite the fact there are not referenced in a relocation.
Now, if you take this same test file but change from ET_REL to ET_DYN (and do the same procedure again), you will notice that the symbols are not present anymore (symtab should be removed actually).
This is what I'm trying to reproduce with this behavior. The fact is that, if you take a look at objcopy code, you won't see anything about this ET_REL handling.
But the fact is that they are also handling this option in the section removal part of objcopy, which makes them removing symbol table when this option is set.

I didn't do it this way because I truly think that the symtab removal might be done in the writing part of llvm-objcopy. By this I mean that we might test whether the symtab is empty, and if so, just remove it. The second thing is that I wanted to avoid the problem mentioned by @alexshap in an other review with the keep-symbol option. This way of doing, we are avoiding the problem.

To summarize: For ELF other than ET_REL, objcopy is basically stripping everything (except if in a reloc). For ET_REL, it tries not to leave the binary in a broken state. Indeed, if we also remove Global/Weak symbols, we are just breaking the linking part (that's not what we want).

Axis 3: undefined/defined symbols

For other than ET_REL, objcopy doesn't seem to care about undefined/defined symbols. BUT, as @jakehehrlich mentioned, undefined Global/Weak symbols (again, that are not present in reloc) are stripped from the binary. This is, as far as I can tell, the only time objcopy strips G/W symbols in ET_REL Elf files.

This should help us evaluate the behaviour and determine if the approach taken is correct. It should be straightforward to generate these different cases using yaml2obj, like you do in the tests.

Hope this clarifies, please tell me if you want me to elaborate on some other points :)

Eh I think this is kind of obscure behavior for GNU objcopy. I don't think it matters what we choose to do with symbols for linked binaries. For ET_REL we definitely need to preserve any symbol referenced by a relocation and we definitely need to preserve defined global/weak symbols. All other symbols should be removed. I'm fine being more aggressive for none ET_REL but I also don't care if we aren't. It's fine if you match the behavior for ET_EXEC and ET_DYN and its fine if you don't. My preference is for simpler code with fewer edge cases.

tools/llvm-objcopy/Object.cpp
653–659 ↗(On Diff #146897)

That's fine.

tools/llvm-objcopy/Object.h
347 ↗(On Diff #146897)

I was thinking about this and I was wondering if this isn't something that should be more general. Like it there are other ways symbols might be needed other than relocations. This should be a generic reference count. But we already have shared_ptr and weak_ptr to tell us how this should work!

Can we think of a way to reuse that infrastructure? For instance those types already have use_count methods so that can be used to determine weather a symbol is needed or not. Alternatively it would make sense to create a similar sort of abstraction that managed the reference counting for us.

tools/llvm-objcopy/llvm-objcopy.cpp
392 ↗(On Diff #146897)

We probably don't want to remove empty symbol tables IMO. I don't think it has any real tendency to come up for any practical purpose where the desire wouldn't just be to remove the section directly.

I'm happy to defer to @jakehehrlich's preferences here, but FWIW, I think GNU objcopy's behaviour is actually the wrong way around regarding the Global/Weak symbols: if you strip a defined symbol from an ET_REL ELF that isn't referenced within that ELF, the file can still be used in a link, you just can't reference anything that wasn't already referenced by that object file. Stripping symbols from ET_DYN/EXEC is okay, as it should have zero impact on the runtime behaviour (although debugging etc might not be as trivial). Anything that requires symbols at runtime should be using dynamic relocations and dynamic symbols, which this behaviour won't touch.

I think we need to be slightly wary of using ET_* types directly, because it will make it hard for vendors to add their own ET_* types to the list of things that should/should not be affected by some behaviour. If we do decide to go down the route of matching objcopy's behaviour here and therefore need to distinguish between relocatable and executable types, I'd suggest adding helper functions to wrap them, and call those instead of testing e_type directly.

Hi !

I kind of disagree with one point. If we chose to not differentiate between ET_* files, we need to keep Global/Weak defined symbol in every cases.
Indeed, it just doesn't make sense to strip thoses symbols for ET_REL files, as it will make the relocatable file not usable.. (or kind of not usable).
I actually don't care for the remaining part (even if I admit I like objcopy current behavior).
I will wait for you to tell me what to do, I don't want to take decisions you will not approve :)

tools/llvm-objcopy/Object.h
347 ↗(On Diff #146897)

Hm.. ok do you already have something in mind ?
The fact is that not only relocation can reference symbols, so we need to be able to distinguish between blocking references and normal references if you know what I mean.

tools/llvm-objcopy/llvm-objcopy.cpp
392 ↗(On Diff #146897)

I don't get it. What's the point of keeping an empty symtab ?

paulsemel updated this revision to Diff 147321.May 17 2018, 8:06 AM

Applied some suggestions.
For the moment, a counter has been implemented for the references.. waiting for some suggestions to change this.
Same, I let the ET_* things, waiting for some thoughts about it.

I'll be back to respond further. I'm pressed for time today.b

tools/llvm-objcopy/Object.h
347 ↗(On Diff #146897)

Right which seems to be like a shared_ptr vs weak_ptr difference. I thought a tiny bit more about it and we don't want to connect collection to this. So I think I'd actually prefer an abstraction that has an explicit "clone" method that wraps pointers without calling delete on them. It seems like most ways of doing this I can think of require an extra layer of indirection. If you want we do explicit reference counting and I'll try and come up with something better later. Basically I think manual reference counting is error prone and I'd like to avoid it.

I'm happy to defer to @jakehehrlich's preferences here, but FWIW, I think GNU objcopy's behaviour is actually the wrong way around regarding the Global/Weak symbols: if you strip a defined symbol from an ET_REL ELF that isn't referenced within that ELF, the file can still be used in a link, you just can't reference anything that wasn't already referenced by that object file. Stripping symbols from ET_DYN/EXEC is okay, as it should have zero impact on the runtime behaviour (although debugging etc might not be as trivial). Anything that requires symbols at runtime should be using dynamic relocations and dynamic symbols, which this behaviour won't touch.

I think we need to be slightly wary of using ET_* types directly, because it will make it hard for vendors to add their own ET_* types to the list of things that should/should not be affected by some behaviour. If we do decide to go down the route of matching objcopy's behaviour here and therefore need to distinguish between relocatable and executable types, I'd suggest adding helper functions to wrap them, and call those instead of testing e_type directly.

First off, I think I agree with you on ET_* and the more I think about it the more I agree. I think given a) You're preference for not using ET_ and b) my initial and now growing preference we should not use ET_. @paulsemel please do not use ET_* checks in this way (if at all really)

Second. I think I disagree with removing defined symbols using --strip-uneeded. I interpret "uneeded" as "100% not needed in any possible link" not "uneeded to fully link this". For instance a library might define lots of functions that it doesn't call and are only meant to be used by other bits of code. Such symbols are quite common and shouldn't be removed by --strip-uneeded IMO. I'm happy for --strip-uneeded-symbol to behave the way you describe however.

I'm happy to defer to @jakehehrlich's preferences here, but FWIW, I think GNU objcopy's behaviour is actually the wrong way around regarding the Global/Weak symbols: if you strip a defined symbol from an ET_REL ELF that isn't referenced within that ELF, the file can still be used in a link, you just can't reference anything that wasn't already referenced by that object file. Stripping symbols from ET_DYN/EXEC is okay, as it should have zero impact on the runtime behaviour (although debugging etc might not be as trivial). Anything that requires symbols at runtime should be using dynamic relocations and dynamic symbols, which this behaviour won't touch.

I think we need to be slightly wary of using ET_* types directly, because it will make it hard for vendors to add their own ET_* types to the list of things that should/should not be affected by some behaviour. If we do decide to go down the route of matching objcopy's behaviour here and therefore need to distinguish between relocatable and executable types, I'd suggest adding helper functions to wrap them, and call those instead of testing e_type directly.

First off, I think I agree with you on ET_* and the more I think about it the more I agree. I think given a) You're preference for not using ET_ and b) my initial and now growing preference we should not use ET_. @paulsemel please do not use ET_* checks in this way (if at all really)

Second. I think I disagree with removing defined symbols using --strip-uneeded. I interpret "uneeded" as "100% not needed in any possible link" not "uneeded to fully link this". For instance a library might define lots of functions that it doesn't call and are only meant to be used by other bits of code. Such symbols are quite common and shouldn't be removed by --strip-uneeded IMO. I'm happy for --strip-uneeded-symbol to behave the way you describe however.

Ok, so we should keep defined symbols in any way, regardless the ELF type ? That's sounds really weird for me, as there is plenty of way we want to remove them (especially in ET_EXEC), but we can go for it :)

I'm happy to defer to @jakehehrlich's preferences here, but FWIW, I think GNU objcopy's behaviour is actually the wrong way around regarding the Global/Weak symbols: if you strip a defined symbol from an ET_REL ELF that isn't referenced within that ELF, the file can still be used in a link, you just can't reference anything that wasn't already referenced by that object file. Stripping symbols from ET_DYN/EXEC is okay, as it should have zero impact on the runtime behaviour (although debugging etc might not be as trivial). Anything that requires symbols at runtime should be using dynamic relocations and dynamic symbols, which this behaviour won't touch.

I think we need to be slightly wary of using ET_* types directly, because it will make it hard for vendors to add their own ET_* types to the list of things that should/should not be affected by some behaviour. If we do decide to go down the route of matching objcopy's behaviour here and therefore need to distinguish between relocatable and executable types, I'd suggest adding helper functions to wrap them, and call those instead of testing e_type directly.

First off, I think I agree with you on ET_* and the more I think about it the more I agree. I think given a) You're preference for not using ET_ and b) my initial and now growing preference we should not use ET_. @paulsemel please do not use ET_* checks in this way (if at all really)

Second. I think I disagree with removing defined symbols using --strip-uneeded. I interpret "uneeded" as "100% not needed in any possible link" not "uneeded to fully link this". For instance a library might define lots of functions that it doesn't call and are only meant to be used by other bits of code. Such symbols are quite common and shouldn't be removed by --strip-uneeded IMO. I'm happy for --strip-uneeded-symbol to behave the way you describe however.

Ok, so we should keep defined symbols in any way, regardless the ELF type ? That's sounds really weird for me, as there is plenty of way we want to remove them (especially in ET_EXEC), but we can go for it :)

In practice, in an ET_EXEC/DYN type, there are no regular relocations, so GNU objcopy's behaviour in this area is almost the same as --strip-all, so I don't think it matters.

paulsemel updated this revision to Diff 147468.May 18 2018, 4:08 AM

Not stripping Global/Weak symbols regardless the elf type.

Since the ELF type is no longer relevant, could you remove all but one of the tests, please.

test/tools/llvm-objcopy/strip-unneeded-rel.test
44 ↗(On Diff #147468)

Please add an undefined weak and a defined global to this test. Also a STT_FILE symbol and STT_SECTION symbol to show they are not stripped.

tools/llvm-objcopy/Object.cpp
264–265 ↗(On Diff #147468)

I might be inclined to make this check into a separate shared method, to avoid risk of the two diverging.

662 ↗(On Diff #147468)

++Sym->RelocRefs instead of += 1?

tools/llvm-objcopy/llvm-objcopy.cpp
392 ↗(On Diff #147468)

I'd refer to the "zero symbol" as the "null symbol". Ditto the section.

paulsemel updated this revision to Diff 147501.May 18 2018, 7:02 AM
paulsemel marked 3 inline comments as done.

Applied James' suggestions

tools/llvm-objcopy/Object.cpp
264–265 ↗(On Diff #147468)

What do you think about this ?

I'm not sure if it makes much sense, but could this interact badly with group section signature symbols?

tools/llvm-objcopy/Object.cpp
264–265 ↗(On Diff #147468)

I'm always a bit wary of const_cast, but I think it looks okay to me.

I'm not sure if it makes much sense, but could this interact badly with group section signature symbols?

If by interact badly you mean "having an error if this situation occurs", the answer is yes badly..
But I told myself that this problem is handle in the symbol removal function, so this is not the "problem" of this option if I can say.
Btw, we need to get rid of the error throwing and handle this case correctly.

I'm not sure if it makes much sense, but could this interact badly with group section signature symbols?

If by interact badly you mean "having an error if this situation occurs", the answer is yes badly..
But I told myself that this problem is handle in the symbol removal function, so this is not the "problem" of this option if I can say.
Btw, we need to get rid of the error throwing and handle this case correctly.

Shouldn't we consider a symbol needed if it is referenced by a group section? I think it should add to the reference count.

tools/llvm-objcopy/Object.cpp
199 ↗(On Diff #147321)

The relocation section can do this itself by storing a non-const pointer.

tools/llvm-objcopy/Object.h
443 ↗(On Diff #147321)

I don't think you need this function.

paulsemel updated this revision to Diff 147583.May 18 2018, 2:16 PM
paulsemel marked 2 inline comments as done.

Applied Jake's suggestions

Indeed, I didn't think enough about Group symbols. You are right @jhenderson @jakehehrlich :)

Could you add a test for --strip-unneeded and group sections, please.

Thinking more about testing, what's the behaviour of --strip-unneeded + --remove-section <some relocation section> and are we happy with it?

tools/llvm-objcopy/Object.h
347 ↗(On Diff #147583)

As this now includes group references, it should probably be renamed. Perhaps simply "ReferenceCount" or something similar?

468 ↗(On Diff #147583)

Is this really how clang-format formats this?

Could you add a test for --strip-unneeded and group sections, please.

Thinking more about testing, what's the behaviour of --strip-unneeded + --remove-section <some relocation section> and are we happy with it?

Could you add a test for --strip-unneeded and group sections, please.

Sure.

Thinking more about testing, what's the behaviour of --strip-unneeded + --remove-section <some relocation section> and are we happy with it?

For the moment, as the section removal occurs before, the symbol will be deleted, as the relocation referencing it doesn't exist anymore. In objcopy, I believe the section removal occurs after, as the symbol is not deleted.
Anyway, if this PR (https://reviews.llvm.org/D47052) is accepted, we will match objcopy behavior (I actually prefer the current way).

paulsemel updated this revision to Diff 147760.May 21 2018, 4:01 AM
paulsemel marked 2 inline comments as done.

Applied James' suggestions.

paulsemel marked an inline comment as done.May 21 2018, 4:01 AM
paulsemel added inline comments.
tools/llvm-objcopy/Object.h
468 ↗(On Diff #147583)

Oups.. probably missed my clang-format.. sorry about it !

jhenderson accepted this revision.May 21 2018, 4:05 AM

LGTM, but please wait for @jakehehrlich to approve as well.

This revision is now accepted and ready to land.May 21 2018, 4:05 AM

One little thing and this LGTM

I think this pointed out some additional refactoring that's going to be needed though and I'm already behind on previously agreed on refactors :(

tools/llvm-objcopy/Object.h
465 ↗(On Diff #147760)

Please add nullptr check

jakehehrlich accepted this revision.May 21 2018, 3:15 PM
This revision was automatically updated to reflect the committed changes.
paulsemel reopened this revision.May 21 2018, 6:12 PM

It appears that RelocationSection is destroyed after SymbolTableSection. Thus, we are using some symbol pointers that have been freed before. I think we should directly move to shared_ptr or something similar for the symbols storage.

This revision is now accepted and ready to land.May 21 2018, 6:12 PM

I didn't think about that. I think the key thing here then is to use shared_ptr to store all symbols and then use use_count to keep track of the references.

@jakehehrlich
if i'm not mistaken it could be done differently, in particular, without introducing these subtle dependencies on the order of destruction of different section types.
In fact, if i'm not mistaken it's not really necessary to introduce that logic into the destructors at all.
Would the following approach work : simply calculate the values of Symbol.UseCount after the removal of sections (it would be one loop over all the relocations and group sections) ?

(it seems to me that using shared ownership (shared_ptr) might not be the right solution here, the model when SymbolsTable owns the symbols seems better)

@jakehehrlich
if i'm not mistaken it could be done differently, in particular, without introducing these subtle dependencies on the order of destruction of different section types.
In fact, if i'm not mistaken it's not really necessary to introduce that logic into the destructors at all.
Would the following approach work : simply calculate the values of Symbol.UseCount after the removal of sections (it would be one loop over all the relocations and group sections) ?

I tentatively agree with Alex that the SymbolTable should be the owner of a symbol, and only the SymbolTable. It just feels "right"! Everything else in ELF just refers to the symbol. I think this approach could work, but I'd suggest that it becomes another section interface, so that we just iterate once through all of the sections, and each section type either does nothing (because it doesn't refer to symbols) or updates the use count of the relevant symbols - that would allow implementing of other section types that might rely on symbols, and would also mean we don't need to track what group and relocation sections we have.

Ok, I've a working version with shared_ptr (I should have checked my emails before going on the implementation :) ). What should we do ?
If we go for the new section interface, we should probably introduce it in a separate patch, right ?
Do I still update this patch with the shared_ptr waiting for a better implementation ?
If you guys don't have time to do it, I can try if you want (maybe I'll need a bit more explanation about what you have in mind :) )

cc: @jakehehrlich @jhenderson @alexshap

Just a heads up I'm not convinced my shared_ptr idea is ok. I want to
consider what Alex and James are saying. My hope has always been to keep
the symbol table as the owner so if Alex has a proposal I want to read it
first.

Yeah I agree with Alex and James here. I think I'd prefer some sort of interface extension. Something as simple as 'markSymbols' that's done as a loop though the sections which then just sets a Boolean like you had originally. That should a) give the desired time complexity and b) not be super complicated. That loop can also only be triggered if --strip-uneeded is used which is nice.

paulsemel updated this revision to Diff 148408.May 24 2018, 7:32 AM

I got a bit confused about what you wanted @jakehehrlich .
Tell me if it does fit your expectations :)

jhenderson accepted this revision.May 24 2018, 7:39 AM

I think this makes sense to me. LGTM, with one nit, but obviously please wait for @jakehehrlich to give his opinion.

tools/llvm-objcopy/Object.cpp
189 ↗(On Diff #148408)

This could probably be initialised in the class inline.

paulsemel updated this revision to Diff 148419.May 24 2018, 8:25 AM
paulsemel marked an inline comment as done.

Initialized Referenced field directly in the structure.

alexshap added inline comments.May 24 2018, 12:20 PM
tools/llvm-objcopy/Object.cpp
261 ↗(On Diff #148419)

probably you don't need the "internal" cast static_cast<const SymbolTableSection *>(this), do you ?

alexshap added inline comments.May 24 2018, 12:22 PM
tools/llvm-objcopy/Object.h
632 ↗(On Diff #148419)

missing override ?

As always the right solution winds up 10000 times simpler than anything I considered in the interim. This looks good to me after fixing Alex's and my comments.

tools/llvm-objcopy/Object.cpp
950 ↗(On Diff #148419)

This can be done externally via a loop over sections(). I'd like to keep the exposed methods as simple as possible.

paulsemel added inline comments.May 24 2018, 1:05 PM
tools/llvm-objcopy/Object.cpp
261 ↗(On Diff #148419)

I'm sorry, but I don't see how I can do this differently. A bit of help would be appreciated.
If I do not static_cast this, I will self call, no ?

950 ↗(On Diff #148419)

Alright, I'll strip this function :)

tools/llvm-objcopy/Object.h
632 ↗(On Diff #148419)

Actually no, this method has the same name as the markSymbols BaseSection method, but is different (I should I named it differently, anyway, @jakehehrlich wants me to remove it, so no more issues :) )

alexshap added inline comments.May 24 2018, 2:15 PM
tools/llvm-objcopy/Object.cpp
261 ↗(On Diff #148419)

oh, right, it's fine

paulsemel updated this revision to Diff 148488.May 24 2018, 2:26 PM

Move away loop over sections to HandleArgs, as suggested by @jakehehrlich

alexshap accepted this revision.May 24 2018, 2:46 PM

Does it look ok for you @jakehehrlich ?

jakehehrlich accepted this revision.May 24 2018, 3:14 PM

nit: Add comment above loop but otherwise LGTM. Feel free to submit after that.

tools/llvm-objcopy/llvm-objcopy.cpp
267 ↗(On Diff #148488)

nit: If you could add a comment explaining what this is doing and why it needs to be here that would be great.

alexshap added inline comments.May 24 2018, 9:39 PM
tools/llvm-objcopy/llvm-objcopy.cpp
267 ↗(On Diff #148488)

@paulsemel, wait, but shouldn't it be right after Obj.removeSections(RemovePred); ?
(and in that case update the symbols table again)
For example, if one removes a group of sections (including the .group section itself),
the list of "referenced" symbols might potentially change, it's necessary to add a test for this case too, i think.

alexshap requested changes to this revision.May 24 2018, 9:39 PM
This revision now requires changes to proceed.May 24 2018, 9:39 PM
paulsemel added inline comments.May 25 2018, 12:12 AM
tools/llvm-objcopy/llvm-objcopy.cpp
267 ↗(On Diff #148488)

The problem you are mentioning is actually a bigger one I told you about in your last review... (https://reviews.llvm.org/D47052)

Indeed, you introduced the whole behavior. This behavior is now the same as objcopy's. It seems that they are first removing symbols, and then sections, which in this case remove the group section but not the symbol. As far as I recall, I gave the example of removing a relocation and in the same time remove the referenced symbols (try it with objcopy and please read the comment I'm referring to). This is the exact same problem.

As I said in the mentioned comment, I still think we shouldn't have handled options conflicts this way, but anyway, if we want to handle all of those problems differently, I'm convinced this is not about this patch, but (again) more generally about how we handle option conflicts.

This revision is now accepted and ready to land.May 25 2018, 1:53 AM
paulsemel updated this revision to Diff 148566.May 25 2018, 2:18 AM

Added comment above the markSymbols loop.
Now landing.

This revision was automatically updated to reflect the committed changes.