There is a piece of code to set TLS module index to 1 only when
it is ARM and we are linking an executable (as opposed to DSOs).
I'm not sure how useful it is. If this is just to save one dynamic
relocation at load-time, I want to remove this to simplify this code.
We'll probably be able to restore this functionality in future once
I'm done with refactoring.
Details
- Reviewers
psmith
Diff Detail
- Build Status
Buildable 4996 Build 4996: arc lint + arc unit
Event Timeline
Hello, thanks for giving me the chance to review.
The reason I put the optimization in was for static linking, there are no relaxations for TLS for ARM, and we can't use a dynamic relocation to set the module index. So according to the specification we need to set the index to 1 when static linking. Implementing the optimization seemed like the cleanest way of doing it at the time, the alternative was to go through the writeTo() part of the GOT and hard code the slot to 1.
I thought I'd try and write a test case to show why it was required, only to find that when static linking with glibc, the __tls_get_addr that we pass the address of the module-index to, hard codes the module index to 1, so for glibc it doesn't matter what we put in the module index. However not all the world uses glibc so being a purist I'd prefer to stick to the specification just in case other C libraries rely on the static linker to set the module index. Pragmatically given the low user-base I think we could temporarily remove it if we were to put it back at least for static linking soon.
Sadly while writing the test case
#include <stdio.h> extern __thread int val __attribute__((tls_model("global-dynamic"))); extern __thread int val2 __attribute__((tls_model("global-dynamic"))); extern __thread int val3 __attribute__((tls_model("global-dynamic"))); int main(void) { printf("%d %d %d\n", val, val2, val3); return 0; }
file2.c
__thread int val = 10; __thread int val2 = 20; __thread int val3;
lld is giving the wrong result for both static and dynamic linking, and printing out 10, 10, 10 rather than 10, 20, 0. What seems to be happening is that the offset from the TLS block is always being written as 0. I don't know whether this has been broken recently and the tests didn't pick it up or never worked in the first place. Back when I wrote the ARM TLS I seem to remember the getVA() when writing the GOT picked up that the symbol was TLS and put the correct offset in. I'll need to work out what happened and fix it. For what I can easily test (X86_64, ARM and AArch64) this problem looks ARM specific, although it may impact Mips as well as ARM is most similar to the Mips TLS model.
I'm very near to finishing a reviewable range extension Thunks implementation, I'm hoping to get this finished soon after Euro LLVM, when I've done that I'll look at the TLS offset problem.
Thank you for reviewing. I now understand why this is needed. I'll retract this patch and try something else. I think the issue we have here is that there's no easy way to write just 1 to the GOT entry, so we add a R_ARM_TLS_DTPMOD32 relocation and do that later, which is not very intuitive. I'll try to see if I can implement a more direct way of doing it.
Could you explain the following expression for me? I think I understand !Config->Pic and Config->Emachine == EM_ARM, but why !Body.isPreemptible()?
if (!Body.isPreemptible() && !Config->Pic && Config->EMachine == EM_ARM)
A symbol passed to this expression is somewhat arbitrary (it's just the first symbol that happened to this function), so it seems odd to me that that affects the decision how we set up the TLS module index.
I haven't got a good answer for why !isPreemptible() is there. My best guess is that it started as isPreemptible(), there was originally no !Config->Pic, when the !Config->Pic got added this made the !isPreemptible() redundant in that case. The isPreemptible() is important in the Target->TlsOffsetRel case as even for a shared library the linker will know the offset of the Symbol in the TLS block if the symbol isn't preemptible.
In summary, for the module index I think !Config-Pic covers all the cases that !isPreemptible() would so it can be removed.
I've done some more investigation into the problem with global dynamic TLS that I found yesterday. I can trace it back to r288107, this changed the GotSection::writeTo() to only use relocations. The previous ARM TLS code was relying on the generic GotSection entry writing code to write the offset from the module index. This means that handleNoRelaxTlsRelocation needs to add a relocation to the Target->TlsOffsetRel as well as the module index. The Mips TLS should be unaffected as it has its own MipsGotSection that will handle this case.
if (Target->isTlsGlobalDynamicRel(Type)) { if (Got->addDynTlsEntry(Body) && (Body.isPreemptible() || Config->EMachine == EM_ARM)) { uintX_t Off = Got->getGlobalDynOffset(Body); addModuleReloc(Body, Got, Off, false); if (Body.isPreemptible()) In<ELFT>::RelaDyn->addReloc({Target->TlsOffsetRel, Got, Off + (uintX_t)sizeof(uintX_t), false, &Body, 0}); // Next 3 lines are new else if (Config->EMachine == EM_ARM) Got->Relocations.push_back({R_ABS, Target->TlsOffsetRel, Off + (uintX_t)sizeof(uintX_t), 0, &Body }); } C.Relocations.push_back({Expr, Type, Offset, Addend, &Body}); return 1; }
And an addition of R_ARM_TLS_DTPOFF32: to the ARMTargetInfo::relocateOne: (same as R_ARM_TLS_TPOFF32).
I mention this just in case it affects your refactoring. I'll post a patch with test case soon, hopefully next week.