Page MenuHomePhabricator

Change how we handle -wrap.

Authored by ruiu on Aug 10 2018, 10:15 AM.



We have an issue with -wrap that the option doesn't work well when
renamed symbols get PLT entries. I'll explain what is the issue and
how this patch solves it.

For one -wrap option, we have three symbols: foo, wrap_foo and real_foo.
Currently, we use memcpy to overwrite wrapped symbols so that they get
the same contents. This works in most cases but doesn't when the relocation
processor sets some flags in the symbol. memcpy'ed symbols are just
aliases, so they always have to have the same contents, but the
relocation processor breaks that assumption.

r336609 is an attempt to fix the issue by memcpy'ing again after
processing relocations, so that symbols that are out of sync get the
same contents again. That works in most cases as well, but it breaks
ASan build in a mysterious way.

We could probably fix the issue by choosing symbol attributes that need
to be copied after they are updated. But it feels too complicated to me.

So, in this patch, I fixed it once and for all. With this patch, we no
longer memcpy symbols. All references to renamed symbols point to new
symbols after wrapSymbols() is done.

Diff Detail

rLLD LLVM Linker

Event Timeline

ruiu created this revision.Aug 10 2018, 10:15 AM
ruiu updated this revision to Diff 160136.Aug 10 2018, 10:18 AM
  • removed unrelated space changes
pcc added inline comments.Aug 10 2018, 10:59 AM
1404–1407 ↗(On Diff #160136)

I think you don't need to visit binary files because they do not contain relocations. Nor do you need to visit bitcode files because at this point they will have been replaced by real object files.

ruiu updated this revision to Diff 160147.Aug 10 2018, 11:02 AM
  • Do not visit BinaryFiles and BitcodeFiles

    Technically, binary file contain __start_<filename> and _end_<filename> symbols, but it is hard to imagine that you want to wrap these symbols.
pcc added inline comments.Aug 10 2018, 11:04 AM
1404–1407 ↗(On Diff #160136)

Looks like you're still visiting bitcode files.

1386 ↗(On Diff #160147)

This noticeably increases the link time of the project I work on (from ~6 to ~7 seconds on my computer). Placing the WrappedSymbols inside of a temporary lookup table like std::unordered_map fixes this. Maybe this should be considered in case other projects wrap as many symbols as mine?

ruiu updated this revision to Diff 160176.Aug 10 2018, 2:15 PM
  • Do not touch BinaryFiles
  • Use DenseMap to make it faster
ruiu edited the summary of this revision. (Show Details)Aug 10 2018, 2:16 PM
ruiu updated this revision to Diff 160178.Aug 10 2018, 2:23 PM
ruiu edited the summary of this revision. (Show Details)
  • use ParallelForEach to rewrite symbols in object files
grimar added inline comments.Aug 11 2018, 1:47 AM
159 ↗(On Diff #160178)

Maybe naming with a something consistent with the arguments, like SymIdx, RealIdx and WrapIdx would be better?

162 ↗(On Diff #160178)

It is technically correct I think, but no tests are failing without these 2 lines.
Can it be tested?

31 ↗(On Diff #160178)

I would reveal more information here to show where the jmps are actually pointing to.

// DISASM:       _start:
// DISASM-NEXT:   1002:       {{.*}}  jmp     41
// DISASM-NEXT:   1007:       {{.*}}  jmp     36
// DISASM-NEXT:   100c:       {{.*}}  jmp     47

// 0x1002 + 41 + 5 == 0x1030 <__wrap_foo@plt>
// 0x1007 + 36 + 5 == 0x1030 <__wrap_foo@plt>
// 0x100c + 47 + 5 == 0x1040 <_start@plt>
ikudrin added inline comments.
89 ↗(On Diff #160178)

Don't you consider to add a method like void remapSymbols(const DenseMap<Symbol *, Symbol *>&) instead of this change?
I believe that granting the access to a protected member through a get method makes the behavior a bit unexpected. Note, that the original getSymbols method might be declared as const.

ruiu updated this revision to Diff 161676.Aug 21 2018, 2:40 AM
  • address review comments
ruiu added inline comments.Aug 21 2018, 2:46 AM
89 ↗(On Diff #160178)

I want to write a logic in one place rather than scatter them as remapSymbols, but I can see your concern, so I made a new function getMutableVector so that it is explicit that the function returns a mutable symbol array.

159 ↗(On Diff #160178)

Since these variables have very narrow scope, this kind of name should be fine. I didn't choose SymIdx and RealIdx because they don't align vertically.

162 ↗(On Diff #160178)


grimar accepted this revision.Aug 21 2018, 3:02 AM

LGTM with a nit.

31 ↗(On Diff #160178)

This was not addressed.

This revision is now accepted and ready to land.Aug 21 2018, 3:02 AM
ruiu added inline comments.Aug 21 2018, 3:11 AM
31 ↗(On Diff #160178)

This test was not written by me, but honestly I think this is good enough.

grimar added inline comments.Aug 21 2018, 3:22 AM
31 ↗(On Diff #160178)


Closed by commit rLLD340387: Change how we handle -wrap. (authored by ruiu, committed by ). · Explain WhyAug 22 2018, 12:03 AM
This revision was automatically updated to reflect the committed changes.