This is an archive of the discontinued LLVM Phabricator instance.

Fix for -wrap linker option and LTO, PR 33145
ClosedPublic

Authored by dmikulin on May 26 2017, 9:32 PM.

Details

Summary

An attempt to fix PR33145, --wrap doesn't work with lto.
Wrapped symbols need to excluded from inter-procedural optimizations to prevent dropping symbols and allow the linker to process re-directs.
Current limitation: doesn't work with ThinLTO.

Diff Detail

Event Timeline

dmikulin created this revision.May 26 2017, 9:32 PM
davide requested changes to this revision.May 26 2017, 9:41 PM

There are no tests associated with this patch.

lld/ELF/SymbolTable.cpp
172–176

Can you explain why you need all three?

367–369

empty ifs are not really common in llvm.

lld/ELF/SymbolTable.h
140

A StringMap<bool> is a little silly.
If you really want a set container, you should use a proper one.

llvm/include/llvm/LTO/LTO.h
384

I'd say, LinkerRedefined.

llvm/lib/LTO/LTO.cpp
423

why do you need this being part of GlobalResolution?
If you do, please add a test showing that you fail switching the linkage earlier.

This revision now requires changes to proceed.May 26 2017, 9:41 PM
ruiu added a subscriber: ruiu.May 27 2017, 3:01 AM

Do you really need this much change? I expected something smaller than this. I mean, I thought what we need to do is to just pass a list of symbols which LTO should keep.

pcc added a reviewer: pcc.May 27 2017, 9:27 AM
dmikulin added inline comments.May 30 2017, 11:05 AM
lld/ELF/SymbolTable.cpp
172–176

We need all three if we want to match non-LTO behavior.

  • _wrap_bar is needed for the case where it is defined by the user but may be eliminated during LTO.
  • _real_bar is needed for the test case in the PR: the direct call to _real_bar() from foo() will be bound and inlined by LTO to the user-defined _real_bar() instead of bar() if _real_bar() is not on the list.
  • similarly, if bar() is not on the list, the call to bar() from foo() will be inlined instead of being routed to _wrap_bar().
367–369

I'll fix this. Negating the condition seamed counter-intuitive...

lld/ELF/SymbolTable.h
140

Is a set container like SmallSet good enough in this case? What's the worst case for the number of -wrap functions? I picked a map because we have to do a lookup in this container for every symbol.

llvm/include/llvm/LTO/LTO.h
384

I'll change this.

llvm/lib/LTO/LTO.cpp
423

You're right, I can do it sooner, in addRegularLTO(). I'll take out the GlobalRes stuff.

davide added inline comments.May 30 2017, 11:38 AM
lld/ELF/SymbolTable.cpp
172–176

Can you please add a test showing when 3) happens?

lld/ELF/SymbolTable.h
140

What I'm trying to say is that a map from something to bool is just a Set.
If you think your set will grow a lot (unlikely, but OK, I'd recommend DenseSet).

dmikulin added inline comments.May 30 2017, 11:51 AM
lld/ELF/SymbolTable.cpp
172–176
  1. is also demonstrated by the test case in the PR. I'm working on regression tests for all known cases and will submit them with the new version of the patch.
lld/ELF/SymbolTable.h
140

OK, thanks. Will fix.

pcc added inline comments.May 30 2017, 1:07 PM
lld/ELF/SymbolTable.cpp
172–176

_wrap_bar is needed for the case where it is defined by the user but may be eliminated during LTO.

This one only requires VisibleToRegularObj, doesn't it?

367–369

Is the purpose of this code just to restore the binding of the original symbol? I think it would be simpler to have a post-LTO step that manually restores the binding for each renamed symbol rather than dealing with it here.

pcc added inline comments.May 30 2017, 1:16 PM
lld/ELF/SymbolTable.cpp
180

What about the rename target? If I pass --defsym=foo=bar, the linker should not allow IPO past foo, right?

dmikulin added inline comments.May 30 2017, 5:44 PM
lld/ELF/SymbolTable.cpp
172–176

Looks like it. Do we want to handle _wrap_* symbols specially and not set the weak binding for LTO?

180

Right. I'll add it to the list.

367–369

Yes, it was to restore the binding. I changed it to have a separate pass over renamed symbols after LTO.

dmikulin updated this revision to Diff 100812.May 30 2017, 6:01 PM
dmikulin edited edge metadata.

Revised patch. Will add test cases tomorrow.

pcc added inline comments.May 30 2017, 7:06 PM
lld/ELF/SymbolTable.cpp
124

If you move the list of renamed symbols to Config you can avoid this lambda.

172–176

Up to you. Note that you can make LTO keep a symbol by calling addUndefined with the symbol name, so it wouldn't be *that* special.

192

You mean Alias here, right?

I'll wait for your tests before another careful review. Some drive by comments.

lld/ELF/SymbolTable.cpp
134–136

Do you need an explicit iterator? Can't you use a range for ?

lld/ELF/SymbolTable.h
141

I'd be more explicit here. e.g. "this is a map from Symbol name to binding" or something like that.

dmikulin marked an inline comment as done.May 30 2017, 9:54 PM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
124

OK, will do.

192

Of course.

lld/ELF/SymbolTable.h
141

OK

dmikulin updated this revision to Diff 100921.May 31 2017, 2:32 PM
dmikulin marked an inline comment as done.

Addressed review comments. Added test cases. Tweaked saved pre-LTO bindings to more accurately reflect the state of the symbol table.

davide requested changes to this revision.May 31 2017, 10:37 PM

This is getting better.
Please rebase the patch because it didn't apply cleanly.

lld/ELF/SymbolTable.cpp
134

Convention says all Variable names should start with capital letter. Also, i is a little vague.

llvm/include/llvm/LTO/LTO.h
369–371

Is this formatted? Doesn't seem so.

llvm/lib/LTO/LTO.cpp
421

If I remove this line, all the tests still pass.

441–442

If I comment out this line, all the tests still pass.

547–548

This needs a comment.

This revision now requires changes to proceed.May 31 2017, 10:37 PM
dmikulin marked 5 inline comments as done.Jun 1 2017, 9:08 AM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
134

Changed it to RSI (Renamed Symbol Iterator)

llvm/lib/LTO/LTO.cpp
421

This was meant for ThinLTO support, which hasn't been implemented yet. I can take it out from this patch.

441–442

Added a test point for this

pcc added inline comments.Jun 1 2017, 10:59 AM
lld/test/ELF/lto/defsym.ll
13–14 ↗(On Diff #100921)

These are addresses of PLT entries, right? Can you avoid magic numbers here and make what you are testing for more explicit? I think you can do that by making the symbols hidden and checking that the calls correspond to the correct symbol addresses. Same for wrap-2.ll.

llvm/lib/LTO/LTO.cpp
442

You also need to add support for 'r' to llvm-lto2. Please also add an llvm-lto2 based test that verifies that the linkage has changed.

dmikulin updated this revision to Diff 101119.Jun 1 2017, 2:57 PM
dmikulin marked 5 inline comments as done.
dmikulin edited edge metadata.

Addressed review comments.
Made changes and added a test to llvm-lto2.

dmikulin updated this revision to Diff 101247.Jun 2 2017, 11:26 AM

Changed the mapping from string --> binding to Symbol* --> binding.

pcc added inline comments.Jun 2 2017, 11:48 AM
lld/ELF/SymbolTable.cpp
198–224

Can we now replace these functions with something that applies the rename to each symbol in Config->RenamedSymbols? I think it should be possible with this change to how we declare RenamedSymbols.

struct RenamedSymbol {
  uint8_t OrigBinding;
  Symbol *Target;
};
MapVector<Symbol*, RenamedSymbol> RenamedSymbols;
dmikulin added inline comments.Jun 2 2017, 11:53 AM
lld/ELF/SymbolTable.cpp
198–224

But __wrap_ symbols are different: they need to be on the list, but they are not being replaced.

pcc added inline comments.Jun 2 2017, 11:56 AM
lld/ELF/SymbolTable.cpp
198–224

They only need to be marked as IsUsedInRegularObj. And that's taken care of for you by addUndefined.

dmikulin added inline comments.Jun 2 2017, 11:56 AM
lld/ELF/SymbolTable.cpp
198–224

Map it to NULL?

davide added inline comments.Jun 2 2017, 11:57 AM
lld/ELF/Config.h
102 ↗(On Diff #101247)

This is the only field in Config that has a comment. Not sure how I feel about it, I'll defer it to Rui.

lld/ELF/SymbolTable.cpp
199–202

Is this bit needed? I don't see a test for it.

203

Can this ever be null? If so, you should check for it.
If not, you may as well inline this in the single use.

lld/ELF/SymbolTable.h
42–43

Unsorted.

llvm/lib/LTO/LTO.cpp
407–410

This comment is now stale after your change.

pcc added inline comments.Jun 2 2017, 12:04 PM
lld/ELF/SymbolTable.cpp
198–224

I mean that you can just set IsUsedInRegularObj on the __wrap_ symbol before you assign it to Target. Then LTO will keep it alive. There's no need to add it (as a key) to RenamedSymbols then.
http://llvm-cs.pcc.me.uk/tools/lld/ELF/LTO.cpp#136

dmikulin added inline comments.Jun 2 2017, 2:02 PM
lld/ELF/SymbolTable.cpp
198–224

This works fine for -wrap, but not for -defsym.

For -wrap, we want to keep the __real_ and the original symbols on the list, so the mappings are:
Real --> Orig and Orig --> Wrap
After we're done with LTO, we copy targets of the mapping back to their keys, Orig --> Real and Wrap --> Orig.

For -defsym, we want to preserve the original symbol, so we would add Orig --> Alias mapping to communicate with LTO. But after we're done, we need to copy Orig --> Alias. I added a flag to the mapping structure to reverse the direction of the copy, so now in the -defsym test the call to bar3() gets inlined and the call to bar2() goes to its alias, bar3. This sounds reasonable to me, just want to make sure it's acceptable.

dmikulin marked 4 inline comments as done.Jun 2 2017, 2:15 PM
dmikulin added inline comments.
lld/ELF/Config.h
102 ↗(On Diff #101247)

Removed the comment. Now it's next to the RemappedSymbol structure.

lld/ELF/SymbolTable.cpp
199–202

It was moved here from SymbolTable::alias, there is a test case for it already.

203

It can't be null, I just find it easier to read, that's all.

dmikulin updated this revision to Diff 101280.Jun 2 2017, 2:56 PM
dmikulin marked 2 inline comments as done.

Changed the way -defsym and -wrap are handled.

pcc added inline comments.Jun 2 2017, 3:26 PM
lld/ELF/Driver.cpp
972

This comment and the one below aren't really accurate any more because this code is used in the non-LTO case as well.

lld/ELF/LTO.cpp
139

I think this can just be Config->RenamedSymbols.count(Sym).

lld/ELF/SymbolTable.cpp
167–169

Move this comment to applySymbolRenames.

174

Is this function necessary? All callers pass a non-null SymbolBody, so they can just query the binding directly.

201

I'd minimise the diff here and rename this variable back.

dmikulin marked 5 inline comments as done.Jun 2 2017, 3:46 PM
dmikulin updated this revision to Diff 101289.Jun 2 2017, 4:28 PM

Fixed comments.
Rebased the patch.

davide added inline comments.Jun 2 2017, 5:20 PM
lld/ELF/SymbolTable.cpp
206

You're setting IsUsedInRegularObj from what I can see, here, for all the symbols passed with --wrap.
Can you check this is really NFC? IIRC it could impact the way symbols are added (or not) to the symbol table.

224

Same here.

rafael accepted this revision.Jun 3 2017, 6:24 PM

This LGTM. I am not a big fan of keeping a copy or the original binding, but it sure saves a lot of plumbing.

Please just wait to see if Davide and Peter have something to add.

pcc accepted this revision.Jun 3 2017, 10:47 PM

LGTM as well

lld/test/ELF/lto/wrap-1.ll
13 ↗(On Diff #101289)

Please update this comment. I think you also want to test 'r' for __real_bar.

davide accepted this revision.Jun 3 2017, 10:59 PM

I'm fine with this once Peter's comments are addressed. I'm not a fan of having to carry around the original binding as well, but I think it's tolerable.

This revision is now accepted and ready to land.Jun 3 2017, 10:59 PM
ruiu added inline comments.Jun 5 2017, 6:41 AM
lld/ELF/Config.h
107 ↗(On Diff #101289)

Please run clang-format-diff. It should be formatted as <Symbol *, RenamedSymbol>

lld/ELF/Driver.cpp
972

I prefer a more concrete comment. Handle -wrap option.

976

Ditto. Handle --defsym=sym=alias option.

lld/ELF/SymbolTable.cpp
205

Add a blank line before a comment.

223

Add a blank line before a comment.

229

It is not obvious why you need this function from the source code. You need to add a comment.

230

What is RSI? I found just KV (short for key-value) makes sense in most cases.

234

Ditto

dmikulin marked 11 inline comments as done.Jun 5 2017, 8:38 AM
dmikulin added inline comments.
lld/ELF/SymbolTable.cpp
230

RSI is supposed to be Renamed Symbol Iterator, but KV is ok with me too

lld/test/ELF/lto/wrap-1.ll
13 ↗(On Diff #101289)

Resolutions file is driven by input file symbol tables. Since __real_bar is not defined in an object, it does not show up there. I added a definition and a test for the 'r' flag.

This revision was automatically updated to reflect the committed changes.
dmikulin marked 2 inline comments as done.

There were some comments pending by Rui. You addressed them, but you still need to wait for an explicit LGTM before committing.
I don't think it's worth reverting this patch, but this is not okay and you should consider it next time.