This is an archive of the discontinued LLVM Phabricator instance.

Fix for -wrap linker option and LTO, PR 33145
ClosedPublic

Authored by dmikulin on May 26 2017, 9:32 PM.

Details

Summary

An attempt to fix PR33145, --wrap doesn't work with lto.
Wrapped symbols need to excluded from inter-procedural optimizations to prevent dropping symbols and allow the linker to process re-directs.
Current limitation: doesn't work with ThinLTO.

Diff Detail

Repository
rL LLVM

Event Timeline

dmikulin created this revision.May 26 2017, 9:32 PM
davide requested changes to this revision.May 26 2017, 9:41 PM

There are no tests associated with this patch.

lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)

Can you explain why you need all three?

338–340 ↗(On Diff #100521)

empty ifs are not really common in llvm.

lld/ELF/SymbolTable.h
140 ↗(On Diff #100521)

A StringMap<bool> is a little silly.
If you really want a set container, you should use a proper one.

llvm/include/llvm/LTO/LTO.h
385 ↗(On Diff #100521)

I'd say, LinkerRedefined.

llvm/lib/LTO/LTO.cpp
423 ↗(On Diff #100521)

why do you need this being part of GlobalResolution?
If you do, please add a test showing that you fail switching the linkage earlier.

This revision now requires changes to proceed.May 26 2017, 9:41 PM
ruiu added a subscriber: ruiu.May 27 2017, 3:01 AM

Do you really need this much change? I expected something smaller than this. I mean, I thought what we need to do is to just pass a list of symbols which LTO should keep.

pcc added a reviewer: pcc.May 27 2017, 9:27 AM
dmikulin added inline comments.May 30 2017, 11:05 AM
lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)

We need all three if we want to match non-LTO behavior.

  • _wrap_bar is needed for the case where it is defined by the user but may be eliminated during LTO.
  • _real_bar is needed for the test case in the PR: the direct call to _real_bar() from foo() will be bound and inlined by LTO to the user-defined _real_bar() instead of bar() if _real_bar() is not on the list.
  • similarly, if bar() is not on the list, the call to bar() from foo() will be inlined instead of being routed to _wrap_bar().
338–340 ↗(On Diff #100521)

I'll fix this. Negating the condition seamed counter-intuitive...

lld/ELF/SymbolTable.h
140 ↗(On Diff #100521)

Is a set container like SmallSet good enough in this case? What's the worst case for the number of -wrap functions? I picked a map because we have to do a lookup in this container for every symbol.

llvm/include/llvm/LTO/LTO.h
385 ↗(On Diff #100521)

I'll change this.

llvm/lib/LTO/LTO.cpp
423 ↗(On Diff #100521)

You're right, I can do it sooner, in addRegularLTO(). I'll take out the GlobalRes stuff.

davide added inline comments.May 30 2017, 11:38 AM
lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)

Can you please add a test showing when 3) happens?

lld/ELF/SymbolTable.h
140 ↗(On Diff #100521)

What I'm trying to say is that a map from something to bool is just a Set.
If you think your set will grow a lot (unlikely, but OK, I'd recommend DenseSet).

dmikulin added inline comments.May 30 2017, 11:51 AM
lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)
  1. is also demonstrated by the test case in the PR. I'm working on regression tests for all known cases and will submit them with the new version of the patch.
lld/ELF/SymbolTable.h
140 ↗(On Diff #100521)

OK, thanks. Will fix.

pcc added inline comments.May 30 2017, 1:07 PM
lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)

_wrap_bar is needed for the case where it is defined by the user but may be eliminated during LTO.

This one only requires VisibleToRegularObj, doesn't it?

338–340 ↗(On Diff #100521)

Is the purpose of this code just to restore the binding of the original symbol? I think it would be simpler to have a post-LTO step that manually restores the binding for each renamed symbol rather than dealing with it here.

pcc added inline comments.May 30 2017, 1:16 PM
lld/ELF/SymbolTable.cpp
167 ↗(On Diff #100521)

What about the rename target? If I pass --defsym=foo=bar, the linker should not allow IPO past foo, right?

dmikulin added inline comments.May 30 2017, 5:44 PM
lld/ELF/SymbolTable.cpp
159–163 ↗(On Diff #100521)

Looks like it. Do we want to handle _wrap_* symbols specially and not set the weak binding for LTO?

167 ↗(On Diff #100521)

Right. I'll add it to the list.

338–340 ↗(On Diff #100521)

Yes, it was to restore the binding. I changed it to have a separate pass over renamed symbols after LTO.

dmikulin updated this revision to Diff 100812.May 30 2017, 6:01 PM
dmikulin edited edge metadata.

Revised patch. Will add test cases tomorrow.

pcc added inline comments.May 30 2017, 7:06 PM
lld/ELF/SymbolTable.cpp
124 ↗(On Diff #100812)

If you move the list of renamed symbols to Config you can avoid this lambda.

192 ↗(On Diff #100812)

You mean Alias here, right?

159–163 ↗(On Diff #100521)

Up to you. Note that you can make LTO keep a symbol by calling addUndefined with the symbol name, so it wouldn't be *that* special.

I'll wait for your tests before another careful review. Some drive by comments.

lld/ELF/SymbolTable.cpp
134–136 ↗(On Diff #100812)

Do you need an explicit iterator? Can't you use a range for ?

lld/ELF/SymbolTable.h
141 ↗(On Diff #100812)

I'd be more explicit here. e.g. "this is a map from Symbol name to binding" or something like that.

dmikulin marked an inline comment as done.May 30 2017, 9:54 PM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
124 ↗(On Diff #100812)

OK, will do.

192 ↗(On Diff #100812)

Of course.

lld/ELF/SymbolTable.h
141 ↗(On Diff #100812)

OK

dmikulin updated this revision to Diff 100921.May 31 2017, 2:32 PM
dmikulin marked an inline comment as done.

Addressed review comments. Added test cases. Tweaked saved pre-LTO bindings to more accurately reflect the state of the symbol table.

davide requested changes to this revision.May 31 2017, 10:37 PM

This is getting better.
Please rebase the patch because it didn't apply cleanly.

lld/ELF/SymbolTable.cpp
131 ↗(On Diff #100921)

Convention says all Variable names should start with capital letter. Also, i is a little vague.

llvm/include/llvm/LTO/LTO.h
369–370 ↗(On Diff #100921)

Is this formatted? Doesn't seem so.

llvm/lib/LTO/LTO.cpp
421 ↗(On Diff #100921)

If I remove this line, all the tests still pass.

441–442 ↗(On Diff #100921)

If I comment out this line, all the tests still pass.

547–548 ↗(On Diff #100921)

This needs a comment.

This revision now requires changes to proceed.May 31 2017, 10:37 PM
dmikulin marked 5 inline comments as done.Jun 1 2017, 9:08 AM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
131 ↗(On Diff #100921)

Changed it to RSI (Renamed Symbol Iterator)

llvm/lib/LTO/LTO.cpp
421 ↗(On Diff #100921)

This was meant for ThinLTO support, which hasn't been implemented yet. I can take it out from this patch.

441–442 ↗(On Diff #100921)

Added a test point for this

pcc added inline comments.Jun 1 2017, 10:59 AM
lld/test/ELF/lto/defsym.ll
13–14 ↗(On Diff #100921)

These are addresses of PLT entries, right? Can you avoid magic numbers here and make what you are testing for more explicit? I think you can do that by making the symbols hidden and checking that the calls correspond to the correct symbol addresses. Same for wrap-2.ll.

llvm/lib/LTO/LTO.cpp
442 ↗(On Diff #100921)

You also need to add support for 'r' to llvm-lto2. Please also add an llvm-lto2 based test that verifies that the linkage has changed.

dmikulin updated this revision to Diff 101119.Jun 1 2017, 2:57 PM
dmikulin marked 5 inline comments as done.
dmikulin edited edge metadata.

Addressed review comments.
Made changes and added a test to llvm-lto2.

dmikulin updated this revision to Diff 101247.Jun 2 2017, 11:26 AM

Changed the mapping from string --> binding to Symbol* --> binding.

pcc added inline comments.Jun 2 2017, 11:48 AM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

Can we now replace these functions with something that applies the rename to each symbol in Config->RenamedSymbols? I think it should be possible with this change to how we declare RenamedSymbols.

struct RenamedSymbol {
  uint8_t OrigBinding;
  Symbol *Target;
};
MapVector<Symbol*, RenamedSymbol> RenamedSymbols;
dmikulin added inline comments.Jun 2 2017, 11:53 AM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

But __wrap_ symbols are different: they need to be on the list, but they are not being replaced.

pcc added inline comments.Jun 2 2017, 11:56 AM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

They only need to be marked as IsUsedInRegularObj. And that's taken care of for you by addUndefined.

dmikulin added inline comments.Jun 2 2017, 11:56 AM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

Map it to NULL?

davide added inline comments.Jun 2 2017, 11:57 AM
lld/ELF/Config.h
102 ↗(On Diff #101247)

This is the only field in Config that has a comment. Not sure how I feel about it, I'll defer it to Rui.

lld/ELF/SymbolTable.cpp
191–194 ↗(On Diff #101247)

Is this bit needed? I don't see a test for it.

195 ↗(On Diff #101247)

Can this ever be null? If so, you should check for it.
If not, you may as well inline this in the single use.

lld/ELF/SymbolTable.h
42–43 ↗(On Diff #101247)

Unsorted.

llvm/lib/LTO/LTO.cpp
407–410 ↗(On Diff #101247)

This comment is now stale after your change.

pcc added inline comments.Jun 2 2017, 12:04 PM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

I mean that you can just set IsUsedInRegularObj on the __wrap_ symbol before you assign it to Target. Then LTO will keep it alive. There's no need to add it (as a key) to RenamedSymbols then.
http://llvm-cs.pcc.me.uk/tools/lld/ELF/LTO.cpp#136

dmikulin added inline comments.Jun 2 2017, 2:02 PM
lld/ELF/SymbolTable.cpp
203–231 ↗(On Diff #101247)

This works fine for -wrap, but not for -defsym.

For -wrap, we want to keep the __real_ and the original symbols on the list, so the mappings are:
Real --> Orig and Orig --> Wrap
After we're done with LTO, we copy targets of the mapping back to their keys, Orig --> Real and Wrap --> Orig.

For -defsym, we want to preserve the original symbol, so we would add Orig --> Alias mapping to communicate with LTO. But after we're done, we need to copy Orig --> Alias. I added a flag to the mapping structure to reverse the direction of the copy, so now in the -defsym test the call to bar3() gets inlined and the call to bar2() goes to its alias, bar3. This sounds reasonable to me, just want to make sure it's acceptable.

dmikulin marked 4 inline comments as done.Jun 2 2017, 2:15 PM
dmikulin added inline comments.
lld/ELF/Config.h
102 ↗(On Diff #101247)

Removed the comment. Now it's next to the RemappedSymbol structure.

lld/ELF/SymbolTable.cpp
191–194 ↗(On Diff #101247)

It was moved here from SymbolTable::alias, there is a test case for it already.

195 ↗(On Diff #101247)

It can't be null, I just find it easier to read, that's all.

dmikulin updated this revision to Diff 101280.Jun 2 2017, 2:56 PM
dmikulin marked 2 inline comments as done.

Changed the way -defsym and -wrap are handled.

pcc added inline comments.Jun 2 2017, 3:26 PM
lld/ELF/Driver.cpp
972 ↗(On Diff #101280)

This comment and the one below aren't really accurate any more because this code is used in the non-LTO case as well.

lld/ELF/LTO.cpp
139 ↗(On Diff #101280)

I think this can just be Config->RenamedSymbols.count(Sym).

lld/ELF/SymbolTable.cpp
160 ↗(On Diff #101280)

Is this function necessary? All callers pass a non-null SymbolBody, so they can just query the binding directly.

167–169 ↗(On Diff #101280)

Move this comment to applySymbolRenames.

172 ↗(On Diff #101280)

I'd minimise the diff here and rename this variable back.

dmikulin marked 5 inline comments as done.Jun 2 2017, 3:46 PM
dmikulin updated this revision to Diff 101289.Jun 2 2017, 4:28 PM

Fixed comments.
Rebased the patch.

davide added inline comments.Jun 2 2017, 5:20 PM
lld/ELF/SymbolTable.cpp
167 ↗(On Diff #101289)

You're setting IsUsedInRegularObj from what I can see, here, for all the symbols passed with --wrap.
Can you check this is really NFC? IIRC it could impact the way symbols are added (or not) to the symbol table.

184 ↗(On Diff #101289)

Same here.

rafael accepted this revision.Jun 3 2017, 6:24 PM

This LGTM. I am not a big fan of keeping a copy or the original binding, but it sure saves a lot of plumbing.

Please just wait to see if Davide and Peter have something to add.

pcc accepted this revision.Jun 3 2017, 10:47 PM

LGTM as well

lld/test/ELF/lto/wrap-1.ll
13 ↗(On Diff #101289)

Please update this comment. I think you also want to test 'r' for __real_bar.

davide accepted this revision.Jun 3 2017, 10:59 PM

I'm fine with this once Peter's comments are addressed. I'm not a fan of having to carry around the original binding as well, but I think it's tolerable.

This revision is now accepted and ready to land.Jun 3 2017, 10:59 PM
ruiu added inline comments.Jun 5 2017, 6:41 AM
lld/ELF/Config.h
107 ↗(On Diff #101289)

Please run clang-format-diff. It should be formatted as <Symbol *, RenamedSymbol>

lld/ELF/Driver.cpp
973 ↗(On Diff #101289)

I prefer a more concrete comment. Handle -wrap option.

977 ↗(On Diff #101289)

Ditto. Handle --defsym=sym=alias option.

lld/ELF/SymbolTable.cpp
166 ↗(On Diff #101289)

Add a blank line before a comment.

183 ↗(On Diff #101289)

Add a blank line before a comment.

189 ↗(On Diff #101289)

It is not obvious why you need this function from the source code. You need to add a comment.

190 ↗(On Diff #101289)

What is RSI? I found just KV (short for key-value) makes sense in most cases.

194 ↗(On Diff #101289)

Ditto

dmikulin marked 11 inline comments as done.Jun 5 2017, 8:38 AM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
190 ↗(On Diff #101289)

RSI is supposed to be Renamed Symbol Iterator, but KV is ok with me too

lld/test/ELF/lto/wrap-1.ll
13 ↗(On Diff #101289)

Resolutions file is driven by input file symbol tables. Since __real_bar is not defined in an object, it does not show up there. I added a definition and a test for the 'r' flag.

This revision was automatically updated to reflect the committed changes.
dmikulin marked 2 inline comments as done.

There were some comments pending by Rui. You addressed them, but you still need to wait for an explicit LGTM before committing.
I don't think it's worth reverting this patch, but this is not okay and you should consider it next time.