Page MenuHomePhabricator

[ELF/AArch64] Implemented set of R_AARCH64_TLSDESC_* relocations.
AbandonedPublic

Authored by zatrazz on Mar 21 2016, 2:15 PM.

Details

Reviewers
ruiu
rafael
Summary

This patch implements next relocations:

R_AARCH64_TLSDESC_ADR_PAGE21, R_AARCH64_TLSDESC_LD64_LO12_NC,
R_AARCH64_TLSDESC_ADD_LO12_NC, R_AARCH64_TLSDESC_CALL.

They have lazy relocation nature, so require entries in got.plt. Each one
is double sized. Also special Plt entry with resolver is created in
addition to single got entry. A module that uses lazy TLSDESC relocations
must define next two entries in dynamic section:

DT_TLSDESC_PLT - Location of PLT entry for TLS descriptor resolver calls.
DT_TLSDESC_GOT - Location of GOT entry used by TLS descriptor resolver PLT entry.

More information can be found in "Thread-Local Storage Descriptors for IA32 and
AMD64/EM64T" and "Thread-Local Storage Descriptors for the ARM platform"
(http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt,
http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt)

Sample app to see them on AArch64 is:
__thread int foo;
int main() {

foo = 5;

}
aarch64-linux-gnu-g++ -fPIC test.cpp -S -o test.s

PS: this patch is based on http://reviews.llvm.org/D16201

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 51225.Mar 21 2016, 2:15 PM
zatrazz retitled this revision from to [ELF/AArch64] Implemented set of R_AARCH64_TLSDESC_* relocations..
zatrazz updated this object.
zatrazz added reviewers: ruiu, rafael.
zatrazz set the repository for this revision to rL LLVM.
zatrazz added a project: lld.
zatrazz added subscribers: emaste, grimar, rengolin, llvm-commits.
yinma added a subscriber: yinma.Mar 21 2016, 3:08 PM

With this and D18331, D18332, D18333 applied, linking FreeBSD/arm64 with lld still gives me:

relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.

This happens when linking libc.so and is proving slightly tricky to track down.

With this and D18331, D18332, D18333 applied, linking FreeBSD/arm64 with lld still gives me:

relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.
relocation R_AARCH64_LDST64_ABS_LO12_NC cannot be used when making a shared object; recompile with -fPIC.

This happens when linking libc.so and is proving slightly tricky to track down.

Does this happen while trying to bootstrap or using a smaller testscase?

Does this happen while trying to bootstrap or using a smaller testscase?

Some small test cases are fine and I can self-host lld and pass ninja check-lld.

grimar added a comment.EditedMar 23 2016, 3:51 AM

So it is really very close with http://reviews.llvm.org/D16201.
Can I abandon my patch then ? If you want and going to support aarch64 stuff - I would be happy with that.

So it is really very close with http://reviews.llvm.org/D16201.
Can I abandon my patch then ? If you want and going to support aarch64 stuff - I would be happy with that.

I think so, this patch is based directly on yours (I basically rebase against trunk).

Does this happen while trying to bootstrap or using a smaller testscase?

I tracked the issue down by adding an abort() at the "the recompile with -fPIC" warning to find the name of the symbol triggering it. It turned out to be a similar case as I previously had on X86 -- non-hidden symbols in assembly language routines.

A workaround on the FreeBSD side is fairly straightforward and a build is ongoing. I do wonder if we should handle this as GNU ld does, but this can be revisited later on.

So it is really very close with http://reviews.llvm.org/D16201.
Can I abandon my patch then ? If you want and going to support aarch64 stuff - I would be happy with that.

I think so, this patch is based directly on yours (I basically rebase against trunk).

Ok. My last change was rebasing also. But if you are willing to support that, then I am happy to take that from my shoulders :)
I`ll abandon my version.

minor nits and a question

ELF/InputSection.cpp
272

"are specific" seems to be missing information. Maybe "...follow a specific format. They use two..."?

275

Maybe mention "is based on" or such to make it clear this wasn't accidentally copied and pasted from X86-specific stuff.

ELF/OutputSections.cpp
322

I wonder if there's a nicer way to handle this?

ELF/OutputSections.h
177

Comment?

zatrazz added inline comments.Mar 23 2016, 11:17 AM
ELF/InputSection.cpp
275

What about:

TLS descriptor uses two words in the .got.plt which describes a structure
of a function pointer and its arguments. Both fields will be adjusted at
runtime by the dynamic loader with a single relocation (usually
R_<ARCH>_TLSDESC).

ELF/OutputSections.cpp
322

I though about this, but I think to remove the Tail it would require to maintain two or move relocation vectors (one for normal and another for tlsdesc) and this will just add more complexity IMHO.

However we might change in later if this shows to be a performance issue.

ELF/OutputSections.h
177

I think it would be better to change to Off_GotPltTlsdesc and add a comment such 'the got.plt entry for a tlsdesc Sym'.

Does this happen while trying to bootstrap or using a smaller testscase?

I tracked the issue down by adding an abort() at the "the recompile with -fPIC" warning to find the name of the symbol triggering it. It turned out to be a similar case as I previously had on X86 -- non-hidden symbols in assembly language routines.

A workaround on the FreeBSD side is fairly straightforward and a build is ongoing. I do wonder if we should handle this as GNU ld does, but this can be revisited later on.

How GNU ld handles it?

emaste added inline comments.Mar 23 2016, 11:43 AM
ELF/InputSection.cpp
275

Makes sense to me.

ELF/OutputSections.cpp
322

However we might change in later if this shows to be a performance issue.

To me having two vectors seems more clear. Anyway, we can certainly try it both ways alter on and evaluate the performance difference.

How GNU ld handles it?

I'm investigating exactly what happens, but it seems the two symbols responsible for lld's failure messages (minbrk and curbrk) do not appear in a ld.bfd-linked libc.so and have no relocations.

rafael added inline comments.Mar 23 2016, 2:11 PM
ELF/OutputSections.cpp
81

Can you provide a description of what this is?

In general, any reason you cannot just use GlobalDynIndex for the position of the pair of got entries you need?

grimar added inline comments.Mar 24 2016, 3:21 AM
ELF/OutputSections.cpp
322

Problem here is that 2 vectors are not really a solution. I mean for this place we just need to put these relocations after all others, but actually there are other similar group of relocations also: IRelative, which are also should be placed at tail. If we one day we will want to implement them (I mean dynamic case, for static one we already did it and static case does not have that problem), then 2 vectors probably is not the best design as we will need 3 of them then ?

At the same time, probably, to have two vectors for now would be more clear, I just kept in mind that all stuff above when implemented that in my version of patch and decided to have one vector basing on that thoughts.

This needs to be rebased again; hopefully it can be reviewed soon.

ELF/OutputSections.cpp
322

I tried a quick experiment with a std::vector<int>. First case called push_back 10 times, then insert(v.end() - 10, i)

zatrazz added inline comments.Mar 25 2016, 7:01 AM
ELF/OutputSections.cpp
81

The 'Tail' basically keep the position of last default relocation since tlsdesc for aarch64 are place after the jump slots and IRELATIVE.

And I think we can GlobalDynIndex instead of using this extra function. I will check that.

zatrazz added inline comments.Mar 25 2016, 7:08 AM
ELF/OutputSections.cpp
322

Indeed 2 vectors is not a definitely solution, but I think we can use current approach now and think about a more flexible solution later (since now only aarch64 requires tlsdesc support in this layout).

I think a better solution would be to add a variable number of vectors with the addTls* function to add on the specific one (for instance 'jump slot', 'tlsdesc', and 'irelative'). It could be a DenseMap or similar associating an type to a vector. The Target will then provide the desirable layout for the got.plt and the class will write down as specified. What do you think?

grimar added inline comments.Mar 25 2016, 7:17 AM
ELF/OutputSections.cpp
322

I would not use 2 vectors. Because that is probably not solves the problem in general at all. Having 2 of them now and 3 later just will bring the complication of whole code.
I am sure there should be much simplier solution for that. I have no ideas currently unfortunately though.
So I would leave that as is now, that place anyways needs refactoring, I think we all agree with that.

zatrazz updated this revision to Diff 51651.Mar 25 2016, 10:19 AM
zatrazz removed rL LLVM as the repository for this revision.

I have rebased and simplified the patch a bit, based on previous comments.
Changes from previous version:

  • Reword the comment about TLSDESC description;
  • Remove getTailSize function and used internal SymbolBody fields instead;
  • Changes PltRev to PltTlsDesc to describe better the idea of the new field.

I have check and bootstrapped the compiler (along with http://reviews.llvm.org/D18332).

zatrazz updated this revision to Diff 51655.Mar 25 2016, 10:26 AM

I forgot to add a fix on AArch64TargetInfo::needsGot and
AArch64TargetInfo::refersToGotEntry in last patch. This one is the
correct update.

emaste added inline comments.Mar 28 2016, 6:46 AM
ELF/OutputSections.cpp
322

I tried a simple experiment with std::vector<int> to get a sense of the scale of the difference. One case started with 10 elements and then added 13 million more with v.insert(v.end() - 10 , i) while the other case maintained two vectors and concatenated them at the end. I took the 13 million number from the relocation count in @ruiu's note on linking debug Chrome.

The insert approach takes about 200 times as long in this test. Times in ms:

x concat.txt
+ insert.txt
+------------------------------------------------------------------------------+
|x                                                                      ++++  +|
|A                                                                      |_MA_| |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5             2             5             4             4     1.2247449
+   5           770           840           800           802     25.884358
Difference at 95.0% confidence
        798 +/- 26.7237
        19950% +/- 668.094%
        (Student's t, pooled s = 18.3235)

Something like @zatrazz's suggestion above sounds reasonable to me, although I'd be happy enough to have a functional version committed first and the change made in a subsequent commit.

Needs to be rebased again

ELF/Target.cpp
1378–1379

git complained about trailing whitespace here

zatrazz updated this revision to Diff 52259.Mar 31 2016, 12:26 PM

Rebase against master, fixed trailing whitespace and ping.

zatrazz abandoned this revision.Apr 5 2016, 7:23 AM