This is an archive of the discontinued LLVM Phabricator instance.

[lld] [ELF] Support for general dynamic TLS relocations on X86_64
ClosedPublic

Authored by rafaelauler on Sep 26 2014, 11:38 AM.

Details

Summary

This patch adds support for the general dynamic TLS access model for X86_64 (see www.akkadia.org/drepper/tls.pdf).

To properly support TLS, the patch also changes the tls_get_addr atom to be a shared library atom instead of a regularly defined atom (the previous lld approach). This closely models the reality of a function that will be resolved at runtime by the dynamic linker and loader itself (ld.so). I was tempted to force LLD to link against ld.so itself to resolve these symbols, but since GNU ld does not need the ld.so library to resolve this symbol, I decided to mimic its behavior and keep hardwired a definition of tls_get_addr in the lld code.

This patch also moves some important logic that previously was only available to the MIPS lld backend to be used to all ELF backends. This logic, which now lives in the DefaultLayout class, will monitor which external (shared lib) symbols are really imported by the current module and will only populate the dynamic symbol table with used symbols, as opposed to the previous approach of dumping all shared lib symbols in the dynamic symbol table. This is important to this patch to avoid __tls_get_addr from getting injected into all dynamic symbol tables.

By solving the previous problem of always adding __tls_get_addr, now the produced symbol tables are slightly smaller. But this impacted several tests that relied on hardwired/predefined sizes of the symbol table, requiring this patch to update such tests.

Diff Detail

Event Timeline

rafaelauler retitled this revision from to [lld] [ELF] Support for general dynamic TLS relocations on X86_64.
rafaelauler updated this object.
rafaelauler edited the test plan for this revision. (Show Details)
rafaelauler added reviewers: ruiu, rafael, Bigcheese.
rafaelauler added a project: lld.
rafaelauler added a subscriber: Unknown Object (MLST).
Bigcheese accepted this revision.Sep 26 2014, 12:47 PM
Bigcheese edited edge metadata.

lgtm with the change.

lib/ReaderWriter/ELF/Atoms.h
652–657

Not sure this is needed. TLSGETADDRAtom can just directly derive from SharedLibraryAtom.

This revision is now accepted and ready to land.Sep 26 2014, 12:47 PM
rafaelauler edited edge metadata.

Thanks for putting in time to see this. I agree with the changes and I updated my patch. I don't have commit access, could you please commit it?

One area of linking not fleshed out in lld is the "canBeNullAtBuildtime" attribute for UndefinedSymbols. That was intended to model ELF weak undefines. Is that how tls_get_addr works with gnu ld? If so, then getting canBeNullAtBuildtime to work in general could solve this specific issue without needing to special case tls_get_addr.

__tls_get_addr is a regular, non-weak versioned symbol from the dynamic linker and loader. For example:

$ readelf -a /lib64/ld-linux-x86-64.so.2
...

21: 0000000000013340    55 FUNC    GLOBAL DEFAULT   11 __tls_get_addr@@GLIBC_2.3

...

The only difference to a regular lib is that you don't need to pass "-lsomelibname" to link against it but, rather, it is an implicit library that is always present. Another difference is that you don't need to put a DT_NEEDED tag to request the "ld.so" library: it will already be loaded because it is the dynamic linker and loader.

Since we can't rely on a ld.so to link against, but we must implicitly provide this symbol, I put this as an explicit atom. Since it will come from a lib at runtime, I modelled this as a subclass of SharedLibraryAtom.

For example, if you strace the GNU ld when linking a TLS library, it will never open "ld.so". This suggests that the ELF linker should be self-contained with respect to the symbols of ld.so, providing hardwired definitions of them.

I guess I was not clear. I did not mean where does the *definition* of tls_get_addr come from. I was asking where does the *undefine* for tls_get_addr come from? If that undefine is "weak", that tells the linker that if no definition is found, to assume it was in some DSO. Seems like that is what you are special casing tls_get_addr to do.

Sorry, my bad, I mixed up definitions. You are right, that would work well. However, usually compilers generate an undefine for __tls_get_addr that is just a regular, non-weak undefine. For example:

$ clang -c -o test.o -x c - <<< '__thread int my_tls_number = 0; int mynum() { return my_tls_number; }'
$ readelf -a test.o
...

12: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __tls_get_addr

...

joerg added a subscriber: joerg.Sep 26 2014, 2:59 PM

I'm not sure this patch is the right direction. Essentially, for statically linked binaries, libc is responsible for providing the symbol. For dynamically linked binaries, libc or libdl may provide a definition, but it is a symbol that is explicitly known. This leads back to the main issue that lld deals badly with symbols that are known to be defined somewhere, but not necessarily explicitly. IMO it would be better to address that issue first.

shankarke requested changes to this revision.Sep 26 2014, 4:15 PM
shankarke added a reviewer: shankarke.
shankarke added a subscriber: shankarke.
shankarke added inline comments.
lib/ReaderWriter/ELF/DynamicLibraryWriter.h
70 ↗(On Diff #14126)

Why do we want to add this atom ? This should be an undefined symbol with fPIC.

lib/ReaderWriter/ELF/ExecutableWriter.h
69–71 ↗(On Diff #14126)

as my previous comment, we should not add this atom by default.

This revision now requires changes to proceed.Sep 26 2014, 4:15 PM

Hi Joerg,

I see, I agree that this does not mix well with static linking, I was very much glibc-focused. In fact, this is a preliminary patch that addresses only the first TLS general dynamic use case, I would work in the static cases later. To address both dynamic and static cases, I think that Kledzik's suggestion of putting weak symbols to work will fit in perfectly: if the linker doesn't find any objects defining tl_get_addr at build time, it will leave to the dynamic linker at runtime. What do you think?

Hi Shankar,

Thanks for putting in time to review this. I agree that we should not add this atom in dynamic libraries, I did this to replace the previous code that would always add a definition of tls_get_addr regardless the output type, but removing this atom would still work well. However, I am not sure if I fully understood your last comment -- I cannot see why we should remove this atom from ExecutableWriter, since, to complete linking an executable, all symbols must be resolved and we cannot finish it with undefined references.

Best regards,
Rafael Auler

rafaelauler edited edge metadata.

Addressed Shankar's first comment and removed atom adding logic from DynamicLibraryWriter.

to complete linking an executable, all symbols must be resolved and we cannot finish it with undefined references

The way linking should work is that if any UndefinedAtoms remain, if all of them are marked canBeNullAtBuildtime, the LinkingContext should instantiate SharedLibraryAtoms to resolve those. canBeNullAtBuildtime is something only ELF supports, so the LinkingContext needs to called to do this.

With the functionality in place, the trick is to create the UndefinedAtoms for tls_get_addr as canBeNullAtBuildtime.

Sounds reasonable. Since the undefined references for __tls_get_addr are usually not weak by themselves, should we add a special case to the reader to convert them to weak undefines?

canBeNullAtBuildTime is associated with a weak symbol while __tls_get_addr is not a weak symbo in the writer, which changes the symbol semantics.

The reason I was mentioning about removing the __tls_get_addr atom for the executable is for the code was not PIC.

should we add a special case to the reader to convert them to weak undefines?

That is what I was thinking. But you should investigate how this works with the gnu linker. Does the linker really hard code "__tls_get_addr" to be special? Or does it always implicitly link with ld.so?

canBeNullAtBuildTime is associated with a weak symbol while __tls_get_addr is not a weak symbo in the writer, which changes the symbol semantics.

I'm not sure what you are saying here. If "tls_get_adder" was a weak *undefined* symbol in the .o file, everything would work correctly. Static linking ignores the canBeNullAtBuildTime aspect of UndefinedAtoms, so either it is defined in an archive, or you get a linker error. For dynamic executables, if it "tls_get_adder" not defined anywhere, the linker notices the canBeNullAtBuildTime and conjures up a SharedLibraryAtom for it.

I'm sorry, I think I was influenced by the previous lld code that always defined _tls_get_addr to be a regular symbol, I'm checking here and I think we don't need to define an atom for __tls_get_addr at all.

What happens is that the linker command line that clang uses eventually includes a linker script that foces it to link against ld.so.... so no need to explictly define this atom.

I will continue to investigate, though, to see if there are cases where we do need to special case tls_get_addr. Does anyone know why tls_get_addr was explictly defined as an atom in previous lld code in the first place?

rafaelauler edited edge metadata.

This new version of the patch completely removes the special case for the "tls_get_addr" atom/symbol, since it will be provided by input libraries (usually ld.so). However, some tests depended on tls_get_addr being always defined, independent of inputs. To avoid breaking these tests, this patch adds "--defsym=tls_get_addr=0" to the linker command line in these tests. This patch also updates a "TODO" file that asked to remove tls_get_addr from the output.

emaste added a subscriber: emaste.Sep 27 2014, 7:15 AM
shankarke accepted this revision.Sep 27 2014, 4:58 PM
shankarke edited edge metadata.
This revision is now accepted and ready to land.Sep 27 2014, 4:58 PM
rafaelauler edited edge metadata.

I have a few minor updates to this patch.

  1. I enhanced the test case to also check whether the PLT32 relocation against tls_get_addr is being properly generated in a shared lib, if it uses TLS.
  2. When I removed the tlsgetaddr atom, the PLT32 relocation against tls_get_addr stopped being generated. I fixed this in a logic in the X86_64RelocationPass.cpp that would only add a PLT relocation if the symbol came from a SharedLib. I changed this to add whenever it comes from a SharedLIb <or> if it is undefined, because we can't leave a PLT32 unresolved if the target is not defined.
  3. I removed the logic that checks whether the symbol comes from the dynamic linker and, if so, do not add it to DT_NEEDED. This logic is quite expensive (needs to strcmp every symbol that is being added to the dynamic symbol table) and I am not sure if this is necessary, I only did it to mimic gnu ld.

If you want to accept this patch, could you please commit it? I don't have commit access.

rafaelauler closed this revision.Sep 29 2014, 3:15 PM

Committed as r218633.