This is an archive of the discontinued LLVM Phabricator instance.

[lld] Initial implementation of TLSDESC relocation handling
AbandonedPublic

Authored by zatrazz on Apr 11 2016, 5:39 AM.

Details

Reviewers
ruiu
rafael
Summary

This patch implements the initial lld support for TLSDESC relocation
types. They have lazy relocation nature, so require entries in got.plt.
Each one is double sized. Also special Plt entry with resolver is created in
addition to single got entry. A module that uses lazy TLSDESC relocations
must define next two entries in dynamic section:

  • DT_TLSDESC_PLT - Location of PLT entry for TLS descriptor resolver calls.
  • DT_TLSDESC_GOT - Location of GOT entry used by TLS descriptor resolver PLT entry.

More information can be found in "Thread-Local Storage Descriptors for IA32 and
AMD64/EM64T" and "Thread-Local Storage Descriptors for the ARM platform"
(http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt,
http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt)

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 53226.Apr 11 2016, 5:39 AM
zatrazz retitled this revision from to [lld] Initial implementation of TLSDESC relocation handling.
zatrazz updated this object.
zatrazz added reviewers: ruiu, rafael.
zatrazz set the repository for this revision to rL LLVM.
zatrazz added a project: lld.
zatrazz added subscribers: llvm-commits, lld.
emaste added a subscriber: emaste.Apr 11 2016, 5:49 AM
grimar added a subscriber: grimar.Apr 11 2016, 8:46 AM
grimar added inline comments.
ELF/OutputSections.cpp
63–64

Method does not seem to be clang formatted.

76

I would write:

Entries[DynReloc_TlsDesc].insert(Entries[DynReloc_TlsDesc].end(), { nullptr, nullptr });

As in fact this is single entry

80
EntriesSize
342

Not formatted.

358–369

I don't think we use auto in such cases as type is not obvious here. So I would use

for (std::vector<DynamicReloc<ELFT>> &Vec : Relocs) {
  for (const DynamicReloc<ELFT> &Rel : Vec) {
ELF/OutputSections.h
153

Do you really need direct assignmens here ? Why don't write like:

DynReloc_JumpSlot,
DynReloc_TlsDesc,
DynReloc_NumKind
ELF/Symbols.cpp
156

This one not formatted either. Please use clang-format for this patch.

ELF/Target.cpp
177 ↗(On Diff #53226)

Remove the empty line please.

zatrazz updated this revision to Diff 53284.Apr 11 2016, 11:00 AM
zatrazz removed rL LLVM as the repository for this revision.

Updated patch based on previous comment.

ruiu edited edge metadata.Apr 11 2016, 3:08 PM

This seems to conflict with http://reviews.llvm.org/D18711, so you may want to wait for that.

ELF/InputSection.cpp
308

I'd name this getTlsDescVA rather than getGotPltTlsDescVA if there's no reason to include "GotPlt".

ELF/OutputSections.h
162–163

This code is hard to understand. If I understand it correctly, it is a vector of vectors binned according to their DynReloc_* types. Is this correct?

There seems to be no reason to aggregate multiple vectors into one vector (of vectors) here. Why don't you define multiple vectors?

emaste added inline comments.Apr 11 2016, 5:51 PM
ELF/OutputSections.h
162–163

I think this comes out of a discussion in D18330: http://reviews.llvm.org/D18330#inline-153615 @grimar or @zatrazz?

In D18960#397780, @ruiu wrote:

This seems to conflict with http://reviews.llvm.org/D18711, so you may want to wait for that.

Right, I will rebase using this patch as base.

ELF/InputSection.cpp
308

Ack.

ELF/OutputSections.cpp
63–64

Indeed, I will fix it and other parts as well (I forgot to use clang-format-diff in this patch).

76

I think this is slight worse regarding readability, but I will change it.

80

Ack.

358–369

Ack.

ELF/OutputSections.h
153

It is just to make it explicit, since C++11 does state first enum is 0 and next one will have the value incremented by 1. I will change it.

162–163

It is correct and I thought about adding default vector for each type of relocation, I used this approach mainly to no tie relocation types to a specific vector. Although I think now using different vectors seems a better alternative, I will change it.

ELF/Symbols.cpp
156

Ack.

ELF/Target.cpp
177 ↗(On Diff #53226)

Ack.

I'm looking at this from an AARCH64 perspective. I've also checked to the best of my knowledge that the architecture independent parts of the descriptor model implemented here are compatible with other architectures such as X86.

This is my first time looking at LLD, so I hope any misunderstandings of the code-base don't show through too heavily.

General comments:

  • it looks like you are only implementing the case where lazy specialization of TLS is permitted. Is this intentional?
  • it looks like you are not defining _TLS_MODULE_BASE_ (points at beginning of TLS segment for module) which I think is used by the local dynamic model (local dynamic is not well supported for AARCH64 I belive)

I've left some comments above where I found it more difficult than usual to follow the code. Feel free to ignore.

Some more references that I found useful when looking at this and the (now abandoned http://reviews.llvm.org/D18839):

ARM doesn't have much public documentation on AARCH64 TLS besides the relocations defined in the ABI, the best alternative source of information I have seen is in https://llvm.org/bugs/show_bug.cgi?id=22408 clang no longer bootstraps itself on AArch64 linux.

Paper "Speeding Up Thread-Local Storage Access in Dynamic Libraries in the ARM platform": http://www.fsfla.org/~lxoliva/writeups/TLS/paper-lk2006.pdf

Relocations for AARCH64 are defined in ELF for the ARM 64-bit Architecture (AArch64) Table 4.6.10.5 Thread-local storage descriptors:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0056b/IHI0056B_aaelf64.pdf

ELF/InputSection.cpp
301–302

A descriptor can only have one parameter (2nd word in .got.plt) in the comment below would it be better to say "a function pointer and its argument",

ELF/OutputSections.cpp
87–89

Position of this comment is a bit confusing as the TLS Desc entries come after the Jump Slot entries. Perhaps move after the for loop?

I assume that nullptr will be written out as zero or some other padding value in the actual file.

305

I found this comment a bit confusing.

The double .got.plt entries are for the TLS descriptors, but we are talking about the PLT entry for the lazy resolver here. The writePltDescEntry() from what I can see of your aarch64 patch is really writing the contents of the _dl_tlsdesc_lazy_trampoline function, and nothing else. The DT_TLSDESC_PLT is created somewhere else.

Would it be worth simplifying the comment to what the code is doing locally?

ELF/OutputSections.h
114–115

both Got and PltGot have an addTlsDescEntry member function that do different things.

In GotSection::addTlsDescEntry() we are reserving a single entry in the GOT that DT_TLSDESC_GOT will refer to. This member function will only add this entry once.

In GotPltSection::addTlsDescEntry(SymbolBody&) we are adding a TLS Descriptor to the .got.plt section for the symbol.

Would it be better to rename one of these member functions to capture the difference in meaning. Or perhaps add a comment.

161

Is a vector of vectors really needed here? For example you could have two separate vectors: JumpSlotEntries and TlsDescEntries. I can't see too much advantage in the vector here.

https://sourceware.org/bugzilla/show_bug.cgi?id=13302 IRELATIVE relocation should come last shows that there is at least one more sub-section that will be needed, but I'm not aware of more than 3.

I've not been able to find a reference to why the Jump Slot entries and TlsDesc entries must be separated, although Gold does make the same separation. Do you happen to know? It might be worth adding a comment if there is a reason as in your AARCH64 review you explicitly test for the order.

294–295

As a first time look at the code base I had to check whether size meant number of relocs or size of relocs in bytes. Would getNumRelocs() work better? Apologies if this is an lld convention I've missed.

466

Rather than add special which doesn't help you much when you see special in isolation, would it be better to introduce 2 separate kinds, one for each of DT_TLSDESC_PLT and DT_TLSDESC_GOT?

ELF/Target.h
45–52

Is it worth a comment on what the PltDescEntry must do, seeing as it is a target specific function that we don't have an implementation for (in this patch anyway)?

ELF/Writer.cpp
397

Would it be worth splitting the comment so that 1. appeared alongside Got->addTlsDescEntry() and 2. Appeared alongside GotPlt->addTlsDescEntry(Body).

zatrazz added inline comments.Apr 12 2016, 12:08 PM
ELF/InputSection.cpp
301–302

Ack.

ELF/OutputSections.cpp
87–89

I will change this class to contain two internal vector representing both the jump slot and tlcdesc relocation. And the the idea here is just to write down the Entries VA from actually symbols, the size will be computed 'finalize'.

305

What about something like:

// If module uses TLS description relocations writes down the architecture specific
// lazy relocation entrypoint.
ELF/OutputSections.h
114–116

I think the idea is to follow the same one for addEntry, which will also for GotSection reserve a entry and for GotPltSection will add on the .got.plt. I do not see why changes their names.

161

Yes, from Rui comments I think just adding two vectors seems a better strategy and for IRELATIVE I think we can just add another vector if is the case.

I am trying to found out exactly why gold is doing this separation and based on IRELATIVE issue I am assuming it some dependency issue, so I assumed it would be wiser to follow up gold example.

294–295

I think it is, I will change it.

466

Yes I think is would be better indeed, I will change it.

ELF/Target.h
45–53

I will add it.

ELF/Writer.cpp
397

Indeed, I will change it.

zatrazz updated this revision to Diff 54251.Apr 19 2016, 1:38 PM
zatrazz edited edge metadata.

This is an updated patch based on previous comments and rebased against master.
From Peter's comments, the main points I did not add on this change is the inclusion
of _TLS_MODULE_BASE_. Although I have patch to include it, I am still checking
if the code if fully correct and on which circumstances the symbol is used for aarch64.
Also this is not preventing any test and the bootstrap itself, so I think we can postpone
its implementation.

ruiu added inline comments.Apr 19 2016, 1:43 PM
ELF/Writer.cpp
357

Can you upload this again? This patch contains merge conflict markers.

zatrazz updated this revision to Diff 54355.Apr 20 2016, 6:06 AM

Sorry about that, correct patch uploaded now.

ruiu added a comment.Apr 20 2016, 3:17 PM

Hi Adhemerval,

As this change heavily relies on the code that Rafael recently updated, I'll defer him to review this.

zatrazz updated this revision to Diff 54578.Apr 21 2016, 2:30 PM

Alright, that's fine. I am rebasing the patch based on recent modifications.

Current status linking FreeBSD/arm64 buildworld with lld, with this patch included:

--- libc.so.7.full ---
unrecognized reloc 562
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libc.so.7.full] Error code 1

That is, libc.so linking fails due to R_AARCH64_TLSDESC_ADR_PAGE21

ELF/Writer.cpp
397

extra blank line left over from rebasing?

Current status linking FreeBSD/arm64 buildworld with lld, with this patch included:

--- libc.so.7.full ---
unrecognized reloc 562

Yes I split the original patch in TLSDESC arch independent support and aarch64 specific bits. I did not upload to review the aarch64 bits yet to avoid diverge review efforts. To actually bootstrap on linux I am using two more fixes (one to implement R_AARCH64_TLSDESC_* relocations and another to Add GD to IE TLS relax optimization).

cc: error: linker command failed with exit code 1 (use -v to see invocation)

  • [libc.so.7.full] Error code 1
That is, libc.so linking fails due to `R_AARCH64_TLSDESC_ADR_PAGE21`
emaste added a comment.May 1 2016, 5:43 AM

(Needs updating after rL268178)

zatrazz updated this revision to Diff 55818.May 2 2016, 8:01 AM

Updated patch against master branch. Basically it changes how the Symbol GlobalDynIndex is set (since it was moved to Symbol class) and add the R_TLSDESC{_PAGE} to refersToGotEntry.

emaste added inline comments.May 4 2016, 10:51 AM
ELF/InputSection.cpp
161–162

This is the MI part; why does it reference AArch64?

ELF/InputSection.h
58–59

Keep in alpha order

ELF/OutputSections.cpp
88

Comment is unclear to me.

emaste added a comment.May 8 2016, 6:31 PM

Needs to be updated again.

@ruiu, @rafael, what do you think of this patch? I'm really hopeful that we can make progress with this and the AArch64 TLSDESC change.

ruiu added a comment.May 9 2016, 2:54 PM

Rafael should know more about relocations, so you want to have him take a look.

ELF/InputSection.cpp
160–162

You are not using T_TLSDESC_PAGE. Please remove.

ELF/OutputSections.cpp
735

"Special" is not a good name. Once this code is submitted, this code becomes a regular part of the linker.

ELF/OutputSections.h
142

Use uint32_t instead of uintX_t. Realistically, we'll never have a GOT which contains more than 2^32 entries.

ELF/Writer.cpp
282

GD -> Gd

300

TLS -> Tls

rafael edited edge metadata.May 9 2016, 3:27 PM

The main issue is that this adds the notion of tls descriptor base relocations while the code for the existing relocation processing is still not finalized.

I would like to make sure we at least have a clean framework for creating and optimizing relocations before adding more kinds of them.

ELF/InputSection.cpp
160

I would call it PAGE_PC since it is relative to the PC.

ELF/OutputSections.h
297

Why do you need the second vector? It seems you just relocate it at the same time as Relocs.

emaste added inline comments.May 10 2016, 7:57 AM
ELF/OutputSections.h
297

The TLSDESC relocs have to follow the others; earlier versions of the patch kept track of the number of different types and inserted at the appropriate point in the vector (and a middle version had a vector of reloc vectors), but this way seems more clear to me.

BTW, I setting up an aarch64 dev env. This is the first patch to review, right?

Cheers,
Rafael

emaste added a comment.Jun 2 2016, 6:29 AM

BTW, I setting up an aarch64 dev env. This is the first patch to review, right?

Yes, I believe it is.

zatrazz abandoned this revision.Jun 6 2016, 9:07 AM

Since Rafael implemented it in another patch, I will just drop...