This is an archive of the discontinued LLVM Phabricator instance.

ELF: Create dynamic symbols for symbol aliases of copy relocations.
ClosedPublic

Authored by pcc on Apr 2 2016, 2:35 PM.

Details

Summary

For each copy relocation that we create, look through the DSO's symbol table
for aliases and create a dynamic symbol for each one. This causes the copy
relocation to correctly interpose any aliases.

Copy relocations are relatively uncommon (on my machine, 56% of binaries in
/usr/bin have no copy relocations probably due to being PIEs, 97% of them
have <10, and the binary with the largest number of them has 97) so it's
probably fine to do this in a relatively inefficient way.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 52479.Apr 2 2016, 2:35 PM
pcc retitled this revision from to ELF: Create dynamic symbols for symbol aliases of copy relocations..
pcc updated this object.
pcc added reviewers: ruiu, rafael.
pcc added a subscriber: llvm-commits.
ruiu added inline comments.Apr 2 2016, 2:50 PM
ELF/Writer.cpp
388 ↗(On Diff #52479)

How is it guaranteed that this function is not called more than once for each shared symbol?

pcc added inline comments.Apr 2 2016, 2:55 PM
ELF/Writer.cpp
388 ↗(On Diff #52479)

We set the NeedsCopyOrPltAddr flag on each alias. That causes the test on line 385 to succeed.

pcc updated this revision to Diff 52617.Apr 4 2016, 1:27 PM

Use a sorted list of object aliases

pcc added a comment.Apr 4 2016, 1:27 PM

For comparison, here's a version of the patch that produces a list of object aliases for the DSO sorted by symbol address upon first encountering a copy relocation, and uses that to optimize lookups. Perf improvement was ~1% for CMake's ctest binary, one of the heavier users of copy relocations on my system. Not sure if this is worth the added code complexity, though.

rafael edited edge metadata.Apr 4 2016, 1:57 PM
rafael added a subscriber: rafael.

I will take a look at the patch, but I agree that when copy relocations are
involved we should push for simplicity.

ruiu edited edge metadata.Apr 4 2016, 1:57 PM

I think the original one is better because as you said copy relocations are rather rare.

ELF/OutputSections.cpp
842 ↗(On Diff #52479)

What is this change for?

ELF/Writer.cpp
764 ↗(On Diff #52479)

You set MustBeInDynSym true here for all symbols at the same offset. Assume that you have symbols A and B where B is a weak alias to A. If your program uses symbol A, then will both A and B be in the program's dynsym? Is this expected?

Please check all relocations in the test.

Please check the section of the symbols.

pcc added a comment.Apr 4 2016, 2:16 PM

Okay, I'll revert to the previous version.

ELF/OutputSections.cpp
842 ↗(On Diff #52617)

This change moves copy relocation allocation from after offset assignment to before assignment. This function needs to take into account any space allocated that way.

ELF/Writer.cpp
816 ↗(On Diff #52617)

Yes, both symbols will be in the dynsym. I think we probably need to create dynsyms for all symbols at the same offset regardless of whether they are strong or weak, as otherwise the DSO will be able to observe breakage.

e.g. if the DSO looks like this:

int a;
__attribute__((weak, alias("a"))) int b;

bool f() {
  return &a == &b;
}

the function f should return true regardless of which symbols are used in the main executable (if they're being overridden, that's another matter, though).

ruiu added inline comments.Apr 4 2016, 2:27 PM
ELF/Writer.cpp
816 ↗(On Diff #52617)

Makes sense, although it's very subtle. But I don't know how to fix it. Probably the very notion of the copy relocation is subtle.

pcc added a comment.Apr 4 2016, 2:29 PM

Please check all relocations in the test.

I don't understand what you're asking for. There are two relocations, and I'm checking both of them (and that there are no others).

Please check the section of the symbols.

Done.

pcc updated this revision to Diff 52627.Apr 4 2016, 2:29 PM
pcc edited edge metadata.
  • Revert to previous version of the patch; improve test
pcc updated this revision to Diff 52628.Apr 4 2016, 2:38 PM
  • Use CHECK-NEXT to check the whole relocation block instead of CHECK-NOT

I am pretty sure this is OK. Just give me a minute to dig why gold treats weak specially.

ELF/Writer.cpp
752 ↗(On Diff #52627)

Just merge this on the previous line now that the loop is gone.

764 ↗(On Diff #52627)

I think this is expected. Consider the case:

Program uses LibA and LibB.
LibA also uses LibB.
Program has copy relocation to X of LibB.
LibA assumes that X and Y have the same address.

Under these conditions the Program has to have both X and Y.

Adding something like that to the comment would be nice :-)

pcc updated this revision to Diff 52629.Apr 4 2016, 3:05 PM
pcc marked an inline comment as done.
  • Simplify
ruiu accepted this revision.Apr 4 2016, 3:11 PM
ruiu edited edge metadata.

LGTM

ELF/Writer.cpp
759–760 ↗(On Diff #52629)

Let's use a shorter name and early continue.

for (SharedSymbol<ELFT> &S : SS->File->getSharedSymbols()) {
  if (S.Sym.st_shndx != Shndx || S.Sym.st_value != Value)
    continue;
This revision is now accepted and ready to land.Apr 4 2016, 3:11 PM
rafael added inline comments.Apr 4 2016, 3:11 PM
ELF/Writer.cpp
765 ↗(On Diff #52629)

Why do you need this?

Can you leave it out and just pass SS to addReloc?

pcc marked 2 inline comments as done.Apr 4 2016, 3:34 PM
pcc added inline comments.
ELF/Writer.cpp
765 ↗(On Diff #52629)

I observed that gold was creating a copy relocation to the strong symbol if the referenced symbol was weak. That's not a very good justification though, and nothing seems to break if I do as you suggest, so let's do that.

This revision was automatically updated to reflect the committed changes.
pcc marked an inline comment as done.