This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Fix null symbol handling
ClosedPublic

Authored by paulsemel on May 26 2018, 10:13 AM.

Details

Summary

This fixes the bug where strip-all option was leading to a malformed outputted ELF file.

Diff Detail

Repository
rL LLVM

Event Timeline

paulsemel created this revision.May 26 2018, 10:13 AM

thanks for addressing this (i noticed this isssue too, but didn't have a minute to send a fix).
Regarding the code: to be honest i would prefer to avoid these somewhat "hacky" manipulations with the index (i.e. in getSymbolByIndex etc),
instead, i would keep this special symbol in the symbol table and in llvm-objcopy.cpp I would replace the check !Obj.SymbolTable->empty() with
Obj.SymbolTable->size() >= 2 or smth like this (+ add a comment that we need to take into account that special symbol).
What are your thoughts ?

Hi !

In D47414#1113435, @alexshap wrote:

thanks for addressing this (i noticed this isssue too, but didn't have a minute to send a fix).
Regarding the code: to be honest i would prefer to avoid these somewhat "hacky" manipulations with the index (i.e. in getSymbolByIndex etc),
instead, i would keep this special symbol in the symbol table and in llvm-objcopy.cpp I would replace the check !Obj.SymbolTable->empty() with
Obj.SymbolTable->size() >= 2 or smth like this (+ add a comment that we need to take into account that special symbol).
What are your thoughts ?

Thansk for reviewing ! I actually agree with you that this hacky thing with the indexes is not perfect at all.. However, as it is with the null section, I would have really liked not to expose this symbol at all to the user.
Indeed, we might really want not to think about this dummy symbol while implementing new options or smth.
I couldn't find a smarter way to do this, do you have smth in mind ? :)

I've been thinking about the structure a lot here. It seems to me that the right thing to do IS to have a null symbol that getSymbolByIndex() returns. If a user has explicitly asked for the symbol at index 0, who are we to say they can't have it? As far as I can see, there is nothing in the ELF standard that prevents us having one.

Indeed, this is a wider problem - our internal implementation of the SymbolTableSection needs to not expose that we don't have a real null symbol. This includes things like how the size is calculated, how to iterate over things, whether or not it is empty etc. If there is a need to explicitly distinguish the null symbol from other symbols on a per-symbol basis, we should have something to identify it as such (e.g. Index == 0).

I know this is different from how the section header table works, and I'm beginning to wonder whether we made the wrong choice there too.

This then brings up the question, how do we prevent the symbol from being stripped? Unlike @alexshap, I'd prefer to keep the empty() check in the symbol table. Perhaps the definition of empty() should be "no symbols other than the null symbol"? In other words Symbols.size() == 1. The removeSymbols predicate would explicitly not remove the null symbol, or better yet, SymbolTableSection::removeSymbols (and probably also updateSymbols) start at the second symbol, rather than the first. What do you guys think?

test/tools/llvm-objcopy/strip-all-and-keep-symbol.test
61 ↗(On Diff #148724)

Could you add something here (or in a different test) to show that we haven't given the null symbol a real name, please?

tools/llvm-objcopy/Object.cpp
121 ↗(On Diff #148724)

zero symbol -> null symbol

It's worth pointing out that the standard requires the first byte to be a null character, so maybe this should be added as a string in the StringTableSection initialize method or constructor instead? That will also ensure that the size is correct, which could affect the behaviour of assignOffsets etc.

170 ↗(On Diff #148724)

Comment here please, to explain why we start from 1.

1142 ↗(On Diff #148724)

It feels to me like we should actually have a virtual .size() method on the section base class (which will be overriden to return Symbols.size() + 1 in this case), which wraps this sort of thing, hiding the details from assignOffsets (or its caller). Alternatively, can't this be done in the symbol table constructor/initialize method?

in general i agree with @jhenderson. Regarding changing the definition of empty() or checking the size() (what i suggested) - the latter seemes to be more explicit to me and more intuitive / causing less surprise, but i don't insist.

In D47414#1114472, @alexshap wrote:

in general i agree with @jhenderson. Regarding changing the definition of empty() or checking the size() (what i suggested) - the latter seemes to be more explicit to me and more intuitive / causing less surprise, but i don't insist.

My main concern is that a developer can easily forget the difference, and use the wrong one, but I'm not particularly fussed either way.

to me this is fine either way, i don't really have a strong opinion on this, we can change the definition of empty() - it's fine too.

Hi!

I understand your point but still, there is smth that I don't get. Why would a user need to access the null symbol ? I see two different things :

  • For reading. This is useless, this is just all zeros right, it makes no sense to read it.
  • For writing. This is forbidden, we might never change the null symbol.

So, if we let the user play with it, that means we might add complexity while writing the new binary, to make sure the user didn't touched this symbol.
Can you elaborate more on the benefit to let the user access this symbol ?

The fact is that I find the way of changing the different functions that are using the SymbolTable not to touch the null symbol way more hacky than the current way (which is also hacky I admit it).

Our tool is an auxiliary tool, to help with manipulating ELF files. It is not a verification tool, so we should not be trying to identify every single problem with the ELF, unless it makes it hard/impossible to do something (e.g. a link field refers to a non-existent section). The only two clients of the symbol index that we care about, as far as I am aware, are relocations (which can have symbols index 0, but in this case just refer to no symbol - we should already special case this, although we don't), and group sections (which shouldn't refer to index 0, but it doesn't really matter if they do, as we do nothing with that symbol except get its updated index). End users can't directly manipulate the null symbol.

tools/llvm-objcopy/Object.cpp
219 ↗(On Diff #148724)

How about std::begin(Symbols)+1? That's really simple, and will mean the null symbol is not removed.

Hi @jhenderson,

I'm really sorry, but still don't really get your point. I think that the goal is to make llvm-objcopy as close as possible as what GNU objcopy is doing in terms of functionality right ? My point was more that, in the future, we might want to implement other options that deals with symbols. And maybe it could be nice that, at this time, we don't have to deal with the null symbol anymore. My point was more : "There won't be any case (or options) where we will need to do smth with the null symbol, so we are better not exposing it".

I mean, it totally makes sense to make that, when the symbol vector is empty, that means that there is no symbol in the SymbolTable anymore, as no one needs to know whether the null symbol is present, as it always is. I don't know if you get my point, maybe I am going completely wrong, I would just need some explanations on this one :)

Btw, thanks for reviewing !

Let's imagine the case where we add an option --print-symbol-value=<index> which prints the value of the symbol at the specified index. I think it's clear that we should not prevent a user from printing the symbol at index 0 (i.e. the null symbol) - yes the value is always 0, but that doesn't stop us from printing it. It is clearly not the SymbolTableSection's responsibility to get the value from the symbol - it should be the symbol's responsibility to provide such an interface, and the caller should use it directly. As such, we would need a way to retrieve the symbol at index 0 from the symbol table. If a developer is trying to implement this, it makes sense for them to call getSymbolByIndex, but then they see that there is an error check for index 0. How do they then retrieve the null symbol? They can't, so they'll have to put some special case handling in for index 0. In other words, I disagree with you that a user will never want to read the null symbol. Manipulating it is perhaps another matter, but I don't care too much about verifying that it hasn't been adjusted - perhaps there's a good platform-specific reason somebody wants to?

A couple of related questions I'd like to be clear on:

  1. What do you think should be the return value of SymbolTable.size() when there is only the null symbol around?
  2. What do you think should be the return value of SymbolTable.empty() when there is only the null symbol around?

Well, the fact is that , in my opinion, the goal of llvm-objcopy or objcopy is not to print anything. Their only goal is to modify/alter the given binary. And the fact is that, as you can see with objcopy, for every options that plays with symbols, the null symbol is obviously omitted (I think that's also why you only search for symbol names, as you shouldn't need to do anything with null symbol).
But anyway, I can keep this symbol if you want, and modify the helper functions so that we omit the null symbol everywhere else !

Ok, I'm getting back to you because I really think that we want to go forward on this one, as this fixes a really "big" bug.

Are you guys totally convinced that we might keep this symbol ?

If so, I will change the implementation, even though I truly think not exposing the symbol will simplify the future development of llvm-objcopy.

cc: @jhenderson @alexshap

Quick drop in. What I was thinking about before was to just never pass the
null symbol to remove or anything else. It can still be there it just
shouldn't be allowed to be updated or removed.

As for how the null section header should be treated we have to consider
large indexes before proceeding.

Quick drop in. What I was thinking about before was to just never pass the
null symbol to remove or anything else. It can still be there it just
shouldn't be allowed to be updated or removed.

As for how the null section header should be treated we have to consider
large indexes before proceeding.

Yes, I think this is essentially my suggestion with the change to removeSymbols etc.

paulsemel updated this revision to Diff 149273.May 31 2018, 5:57 AM
paulsemel edited the summary of this revision. (Show Details)
jhenderson added inline comments.May 31 2018, 6:13 AM
tools/llvm-objcopy/Object.cpp
203 ↗(On Diff #149273)

Usual style in LLVM is to assign begin and end values in the initialization clause. See Don't evaluate end() every time through a loop. Similarly use ++Sym instead of Sym++.

tools/llvm-objcopy/Object.h
370–371 ↗(On Diff #149273)

I'd probably rephrase this, and say something like "An 'empty' symbol table still contains a null symbol." The fact that we don't alter it is not really relevant.

paulsemel updated this revision to Diff 149283.May 31 2018, 6:52 AM
paulsemel marked 2 inline comments as done.

Applied James' changes.
Added test for null symbol name.

jhenderson accepted this revision.May 31 2018, 7:09 AM

LGTM with the inline comments, but please wait for @alexshap or @jakehehrlich to confirm they're happy.

test/tools/llvm-objcopy/null-symbol.test
21 ↗(On Diff #149283)

I think this can be simplified to Name: (0). Whitespace (except new lines) is not significant for FileCheck, except to divide other non-whitespace, i.e. Name: (0) (many spaces) is identical to Name: (0) (one space) but not Name:(0) (no spaces).

tools/llvm-objcopy/Object.cpp
203–205 ↗(On Diff #149283)

Hmm... now that I think about it, could we use std::for_each here?

This revision is now accepted and ready to land.May 31 2018, 7:09 AM

Don't wait on me until June 10th. I'm just skimming emails.

paulsemel updated this revision to Diff 149424.Jun 1 2018, 3:20 AM
paulsemel marked 2 inline comments as done.

Applied James' suggestions.

test/tools/llvm-objcopy/null-symbol.test
21 ↗(On Diff #149283)

Thanks for the tips :)

tools/llvm-objcopy/Object.cpp
203–205 ↗(On Diff #149283)

Actually yes :)

This revision was automatically updated to reflect the committed changes.