This is an archive of the discontinued LLVM Phabricator instance.

[ELF] LLD does not create a record in the .dynsym if a strong symbol defined in the executable file replaces a weak symbol from a shared library.
AbandonedPublic

Authored by atanasyan on Aug 5 2014, 6:21 AM.

Details

Summary

LLD does not create a record in the .dynsym if a strong symbol defined in the executable file replaces a weak symbol from a shared library. This bug affects all ELF architectures and leads to segfault:

% cat foo.c
extern int __attribute__((weak)) flag;
int foo() { return flag; }

% cat main.c
int flag = 1;
int foo();
int main() { return foo() == 1 ? 0 : -1; }

% clang -c -fPIC foo.c main.c
% lld -flavor gnu -target x86_64 -shared -o libfoo.so ... foo.o
% lld -flavor gnu -target x86_64 -o a.out ... main.o libfoo.so
% ./a.out
Segmentation fault

The problem is in the Resolver class. We lose all information about coalesced symbols after the Resolver::resolve() method is finished. Besides, the symbol resolving code is target independent and we need to keep it in this state.

The patch solves the problem by introducing a new class ELFResolver which is inherited from the Resolver class. In this new class we override the checkUndefines() method, iterate over atoms and configure dynamic export settings for ELFDefinedAtom when necessary.

The Driver class requests an instance if the Resolver class from the LinkingContext. For non-ELF targets we continue to use Resolver class. For ELF targets we create an instance of ELFResolver.

Diff Detail

Event Timeline

atanasyan updated this revision to Diff 12194.Aug 5 2014, 6:21 AM
atanasyan retitled this revision from to [ELF] LLD does not create a record in the .dynsym if a strong symbol defined in the executable file replaces a weak symbol from a shared library..
atanasyan updated this object.
atanasyan edited the test plan for this revision. (Show Details)
atanasyan added reviewers: Bigcheese, shankarke, ruiu.
atanasyan added a project: lld.
atanasyan added a subscriber: Unknown Object (MLST).

The best way to do this is using a ELF reference that sets a dynamic export tag. We should not be adding a export dynamic to an atom as that's not a property.

atanasyan updated this revision to Diff 12206.Aug 5 2014, 11:26 AM

Symbols require to be exported is now tagged using an ELFReference.

ruiu edited edge metadata.Aug 5 2014, 1:13 PM

I'd think we shouldn't subclass Resolver class. Windows linker is probably more different from the original Resolver class behavior but we can still be able to share the class. Also Resolver class already contains some arch/binary format specific code -- for example, fallback atom for Windows and GNU LinkOnce attribute for ELF. I understand the reason why you are adding this code, but what this patch is trying to do seems rather simple compared to its patch size. It feels it's probably better not to subclassing the entire class but instead add a few lines to Resolver directly.

That being said, I don't have a concrete idea on how to do it. The issue is that we lose the information that an undefined atom had after it's coalesced away. That happens in Resolver::updateReference(), in which we completely ditches coalesed-away atoms. Maybe we should add code there to copy some information from a coalesed-away atom to a replacing atom. Then the ELF writer is able to see that to decide if it needs a .dynsym entry for a replacement atom.

shankarke edited edge metadata.Aug 5 2014, 4:42 PM

IMO

  • We should not be adding dynamicExport() to the Atom, as its not a property of the Atom.
  • We should not add a new Resolver.

What we could do is

a) Create references dynamicExportNone, dynamicExportAsNeeded, dynamicExportAlways(For shared libraries)

b) The reader would set the reference to be dynamicExportNone for all atoms that it creates,

c) A new pass (DynamicSymbolTableCreate) would go through all atoms, and if has been resolved from a shared library, sets it to dynamicExportAsNeeded.

d) The ELF writer would go through all the atoms that have been set dynamicExportAsNeeded and create dynsym entries.

With this change, there is no change needed in the Resolver and there is no need of a ELFResolver.

atanasyan abandoned this revision.Aug 12 2014, 4:45 AM

Thanks for review. I will try to implement another solution using your ideas.

From the description of the problem, I thought that libfoo would have a weak definition of "flag" and main had a strong definition of "flag" and the problem was that the binaries were not set up properly so that ld.so would pick main's flag to override libfoo's flag.

But looking at the example code, "flag" is extern. So this seems like the other meaning of weak - that the symbol can be missing at runtime. But the use of "flag" in foo() is not checking if its address is NULL before accessing. But in this case it should not be NULL because ld.so should find it in main.

Which case is this?

What info is missing after resolve() is done?

When the weak symbol in the shared library is resolved from another object file when building an executable that symbol would need to be exported.

I was suggesting to have a reference in the atom that if it needs to be exported, be set to exportDynamic during the resolution phase.

It could be recorded in the elf context as well but to be cleaner I felt that a reference used.

Does that sound the right way (or) I am over complicating things ?

Shankar Easwaran