This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add --weaken option
ClosedPublic

Authored by paulsemel on Apr 27 2018, 3:42 PM.

Details

Summary

This option mark all global symbols as weaken.
This option follows the GNU objcopy --weaken option, as it doesn't mark the symbol as weak if this one is undefined (a relocation).

Diff Detail

Repository
rL LLVM

Event Timeline

paulsemel created this revision.Apr 27 2018, 3:42 PM
jakehehrlich added inline comments.Apr 27 2018, 4:00 PM
tools/llvm-objcopy/llvm-objcopy.cpp
333–337

Can you make this two separate if statements?

paulsemel updated this revision to Diff 144417.Apr 27 2018, 4:11 PM
paulsemel marked an inline comment as done.

Applied Jake's suggestions :

  • Separated the two if statements
jakehehrlich added inline comments.Apr 27 2018, 4:21 PM
tools/llvm-objcopy/llvm-objcopy.cpp
341

Why is this NameIndex check here?

paulsemel added inline comments.Apr 27 2018, 4:25 PM
tools/llvm-objcopy/llvm-objcopy.cpp
341

This check tells me whether the symbol is defined or not. If not, it means that it's a relocation.
In this case (as GNU objdump does to), I'm not marking the symbol.

paulsemel added inline comments.Apr 27 2018, 4:39 PM
tools/llvm-objcopy/llvm-objcopy.cpp
341

Sorry, I meant GNU objcopy..

jakehehrlich added inline comments.Apr 27 2018, 5:18 PM
tools/llvm-objcopy/llvm-objcopy.cpp
341

Well at the very least this isn't the right way to handle the situation you describe. I wasn't aware of this as it doesn't seem to be documented behavior (*highly* common with GNU objcopy). I'm going to ask around on what the behavior *should* be for the intended use cases to see what should happen. It sounds like we may want other 'weaken' options to do the same thing.

  1. NameIndex is the index in the string table so it has nothing todo with weather a symbol is defined or not. It also isn't valid or meaningful until *after* finalization.
  2. There isn't a good notion of "defined" in llvm-objcopy right now. There's 'defined in section X' but that dosn't cover absolute symbols and the like so there are some edge cases to check there.
  3. Did we miss this behavior in the other 'weaken' options? It seems strange to me that this case would behave differently from the other weaken cases.

I've asked Roland McGrath whats supposed to be happening here but I don't suspect he'll get back to me until Monday. Sorry about that! I'll also do some black box testing of GNU objcopy to see what its doing but it isn't the case that we always want to copy it.

paulsemel added inline comments.Apr 27 2018, 5:55 PM
tools/llvm-objcopy/llvm-objcopy.cpp
341

Well, you're definitely right ! I was looking for the section index, and I switched from getShndx() to NameIndex and I still can't explain why I did this..
But yes, it might neither respect the exact behavior of GNU objcopy, as you pointed out. I will wait for the answer of Roland McGrath then !
Anyway, about your third question, I am going to do some experiments to check whether I missed a behavior (which in my opinion is highly probable), and I'll come back to you !

paulsemel added inline comments.Apr 28 2018, 2:09 AM
tools/llvm-objcopy/llvm-objcopy.cpp
341

Alright, the behavior is the same for the --weaken-symbol option. Anyway, my thoughts are that we might not need to change the behavior of this option.

Indeed, as the user explicitly want to weaken a symbol, I think we might weaken it in anyway; or at least, we might throw some sort of warnings (GNU objcopy silently does nothing..)

For this --weaken option, I would also go for this behavior (i.e. weaken every global symbols), as it is the behavior explained in the objcopy man page. Anyway, maybe we are missing something, and Roland McGrath will tell us why they made this choice of keeping undefined symbols global.

What do you think about it ?

jhenderson added inline comments.Apr 30 2018, 1:54 AM
tools/llvm-objcopy/llvm-objcopy.cpp
341

FWIW, checking getShndx() == SHN_UNDEF is the correct way to test for undefined symbols currently. This deals with absolute symbols etc happily.

On a related note, what is GNU objcopy's behaviour for --weaken etc and global absolute and common symbols?

Switching back from NameIndex to getShndx, as it is an any way the right thing to do here.
Waiting for Jake's feedback to fit better with GNU objcopy behavior.

So I asked the gods about this and got a half response on this issue but this appears to be intended for localize and globalize and *might* have been intended for weaken. After some thought I don't think it makes a lot of sense to weaken an undefined symbol and it might be unexpected behavior so I'm cool with it working this way. That would be my preference and it gives compatibility with objcopy.

Reading the code snippet Roland pointed out to me I don't think absolute symbols are treated specially but I don't know how common symbols are treated. Paul, can you do some blackbox testing of GNU objcopy to see how common symbols are treated?

tools/llvm-objcopy/llvm-objcopy.cpp
341

Ah! Good catch. I forgot about that.

Sure, here is what I've got so far :

Generally speaking, while copying, objcopy turns common symbols into object symbols. Otherway, it goes the same as it is with other symbol types.

For --globalize-symbol option :

  • doesn't do anything for weak symbols
  • works with undefined local symbols

For --localize-symbol option:

  • works with all (but undef of course) symbols

So, do you think that I need to change the behavior of those two options to match the objcopy behavior ?

Thanks for taking the time to ask Roland McGrath !

I'm happy with this change. I think we should change --globalize and --localize to match but if --globalize-symbol or --localize-symbol is used it shouldn't bother checking; if people want to do non-sense that's their choice. I didn't think about it but it makes a lot of sense not to globalize weak symbols and it makes sense to not localize undefined symbols. So I think the choices objcopy chose makes sense. If you could change those to match that, it would be awesome!

I'm happy with this change. I think we should change --globalize and --localize to match but if --globalize-symbol or --localize-symbol is used it shouldn't bother checking; if people want to do non-sense that's their choice. I didn't think about it but it makes a lot of sense not to globalize weak symbols and it makes sense to not localize undefined symbols. So I think the choices objcopy chose makes sense. If you could change those to match that, it would be awesome!

Cool, nice to hear ! However, I don't think there is --globalize and --localize option implemented.. so I don't really see what you want me to change

jakehehrlich accepted this revision.May 2 2018, 11:42 AM

I'm happy with this change. I think we should change --globalize and --localize to match but if --globalize-symbol or --localize-symbol is used it shouldn't bother checking; if people want to do non-sense that's their choice. I didn't think about it but it makes a lot of sense not to globalize weak symbols and it makes sense to not localize undefined symbols. So I think the choices objcopy chose makes sense. If you could change those to match that, it would be awesome!

Cool, nice to hear ! However, I don't think there is --globalize and --localize option implemented.. so I don't really see what you want me to change

I have an imaginative memory apparently. For some reason I thought you implemented those. Whoops. LGTM on this

This revision is now accepted and ready to land.May 2 2018, 11:42 AM
paulsemel closed this revision.May 2 2018, 1:19 PM

commited on r331397