This is an archive of the discontinued LLVM Phabricator instance.

I had suggested doing something like this at https://reviews.llvm.org/D53906#1326683. I like the idea, but I think we need to fix up the variant 2 case in getTlsTpOffset:

case EM_386:
case EM_X86_64:
  // Variant 2. The TLS segment is located just before the thread pointer.
  return -Out::TlsPhdr->p_memsz;

E.g., if the TLS segment is 4 bytes in size with 8-byte alignment (and assuming p_vaddr % p_align == 0), then the segment will be located at TP-8 on variant 2 targets, not TP-4.

I haven't tested this, but something like this would probably work:

return -alignTo(Out::TlsPhdr->p_memsz, Out::TlsPhdr->p_align);

Otherwise, lld and the loader will disagree about the location of TLS symbols.
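
To make the arithmetic concrete, here is a minimal sketch of the two candidate offsets. It is not lld's code: `alignUp` is a hypothetical stand-in for LLVM's `alignTo`, the function names are illustrative, and it assumes p_vaddr % p_align == 0 and a power-of-two alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo: round value up to the next
// multiple of align. Assumes align is a nonzero power of two.
inline uint64_t alignUp(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

// Variant 2 offset of the TLS segment start relative to the thread pointer,
// computed from the raw segment size only (the behaviour being questioned).
inline int64_t variant2OffsetUnaligned(uint64_t memsz) {
  return -static_cast<int64_t>(memsz);
}

// Variant 2 offset with the size rounded up to the segment alignment
// (the suggested fix). Only valid under the p_vaddr % p_align == 0 assumption.
inline int64_t variant2OffsetAligned(uint64_t memsz, uint64_t align) {
  return -static_cast<int64_t>(alignUp(memsz, align));
}
```

With memsz = 4 and align = 8, `variant2OffsetUnaligned(4)` yields -4 while `variant2OffsetAligned(4, 8)` yields -8, which corresponds to the TP-8 placement described above.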

MaskRay mentioned this in rL361084: [ELF] Fix TP offset of TLS Variant I after D62059. (May 17 2019, 5:44 PM)

MaskRay mentioned this in rLLD361084: [ELF] Fix TP offset of TLS Variant I after D62059.

MaskRay mentioned this in rG348731aeed4b: [ELF] Fix TP offset of TLS Variant I after D62059.

Thanks for pointing this out! I pushed a quick fix. To be more precise, it should be:

diff --git c/ELF/InputSection.cpp w/ELF/InputSection.cpp
index 72a2c298d..2eb9a0452 100644
--- c/ELF/InputSection.cpp
+++ w/ELF/InputSection.cpp
@@ -594,11 +594,13 @@ static int64_t getTlsTpOffset() {
     // NB: While the ARM/AArch64 ABI formally has a 2-word TCB size, lld
     // effectively increases the TCB size to 8 words for Android compatibility.
     // It accomplishes this by increasing the segment's alignment.
-    return alignTo(Config->Wordsize * 2, Out::TlsPhdr->p_align);
+    return alignTo(Config->Wordsize * 2, Out::TlsPhdr->p_align, Out::TlsPhdr->FirstSec->Addr);
   case EM_386:
   case EM_X86_64:
     // Variant 2. The TLS segment is located just before the thread pointer.
-    return -alignTo(Out::TlsPhdr->p_memsz, Out::TlsPhdr->p_align);
+    return -(Out::TlsPhdr->p_memsz +
+             (-Out::TlsPhdr->p_memsz - Out::TlsPhdr->FirstSec->Addr &
+              Out::TlsPhdr->p_align - 1));
   case EM_PPC64:
     // The thread pointer points to a fixed offset from the start of the
     // executable's TLS segment. An offset of 0x7000 allows a signed 16-bit

The formulae are complex because they take p_vaddr % p_align != 0 into account. I didn't do that in the commit because I cannot think of a way to make p_vaddr % p_align != 0 (after we remove the ARM/AArch64 overalignment hack).
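
As a rough illustration of what the two expressions in the diff compute, here is a standalone sketch with plain integers in place of `Out::TlsPhdr` fields. The function names and the sample values are assumptions made for this example; the variant 1 helper reimplements the skewed round-up that the three-argument `alignTo` performs (smallest result >= value that is congruent to the skew modulo the alignment).

```cpp
#include <cassert>
#include <cstdint>

// Variant 2: the thread pointer is assumed to be aligned to `align`, and the
// segment start must be congruent to vaddr modulo align. Mirrors
// -(memsz + ((-memsz - vaddr) & (align - 1))) from the diff.
inline int64_t variant2TpOffset(uint64_t memsz, uint64_t vaddr,
                                uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  // Unsigned wraparound gives the same low bits as the signed expression.
  uint64_t padding = (0 - memsz - vaddr) & (align - 1);
  return -static_cast<int64_t>(memsz + padding);
}

// Variant 1: smallest offset >= tcbSize that is congruent to vaddr modulo
// align, i.e. a round-up with skew.
inline uint64_t variant1TpOffset(uint64_t tcbSize, uint64_t vaddr,
                                 uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  uint64_t skew = vaddr & (align - 1);
  return ((tcbSize - skew + align - 1) & ~(align - 1)) + skew;
}
```

When vaddr is a multiple of align, both helpers reduce to the simpler `alignTo` forms: `variant2TpOffset(4, 0, 8)` is -8 and `variant1TpOffset(16, 0, 64)` is 64. They differ once the segment start is skewed: for a hypothetical segment with memsz 24, vaddr 0x1008 and align 16, `variant2TpOffset` gives -24, whereas rounding the size up to the alignment would give -32.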

MaskRay mentioned this in rL361088: [ELF][X86] Fix R_RELAX_TLS_GD_TO_LE_NEG and R_NEG_TLS after D62059. (May 17 2019, 6:58 PM)

MaskRay mentioned this in rLLD361088: [ELF][X86] Fix R_RELAX_TLS_GD_TO_LE_NEG and R_NEG_TLS after D62059.

MaskRay mentioned this in rG898896836dd1: [ELF][X86] Fix R_RELAX_TLS_GD_TO_LE_NEG and R_NEG_TLS after D62059.

@rprichard @ruiu I missed another two cases for x86-32: GD to LE relaxation and @tpoff. These are fixed in rLLD361088 now.

Anyway, I think with recent llvm/lld changes, it is low overhead to implement the workaround in crtbegin*.o now.

.section .tdata,"awT",@progbits
.p2align 8
.zero 1 # delete if you don't care about ld.bfd (ld.bfd strips empty PT_TLS)

.text
_start:
  .reloc 0, R_AARCH64_NONE, .tdata

MaskRay mentioned this in D62055: [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack. (May 17 2019, 7:26 PM)

MaskRay mentioned this in rL361090: [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack. (May 17 2019, 8:14 PM)

MaskRay mentioned this in rLLD361090: [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack.

MaskRay mentioned this in rGed2ad77ccb00: [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack.

sidorovd mentioned this in rGfe81463e0071: [ELF] Fix TP offset of TLS Variant I after D62059. (May 30 2019, 10:40 AM)

sidorovd mentioned this in rGd6a49527f6cc: [ELF][X86] Fix R_RELAX_TLS_GD_TO_LE_NEG and R_NEG_TLS after D62059.

sidorovd mentioned this in rG18bfb2b0f9fa: [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack.

tstellar mentioned this in rL362858: Merging r361090: (Jun 7 2019, 5:06 PM)

tstellar mentioned this in rGad5bcd4ee602: Merging r361090:.

I'm just working on some internal testing in relation to this. It's quite possible to get p_vaddr % p_align != 0 using a linker script. All you need is for the PT_TLS segment to appear somewhere other than at the front of its PT_LOAD segment, with data before it. In my experiment, I had an 8-byte section first, followed by an 8-byte, 8-byte-aligned .tdata and a 16-byte-aligned .tbss. This resulted in a PT_TLS with p_vaddr % p_align != 0.
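
The layout described above can be modelled with a few lines of address arithmetic. This is only a sketch under stated assumptions, not how lld assigns addresses: the base address 0x1000, the 16-byte .tbss size, the leading section's alignment, and all type and function names here are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Section {
  uint64_t size;
  uint64_t align; // assumed to be a nonzero power of two
};

struct TlsSegment {
  uint64_t vaddr;
  uint64_t memsz;
  uint64_t align;
};

inline uint64_t alignUp(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

// Place a leading non-TLS section at base, then .tdata and .tbss back to
// back, each rounded up only to its own alignment. The resulting PT_TLS
// starts at .tdata and takes the larger of the two TLS alignments.
inline TlsSegment layoutTls(uint64_t base, Section lead, Section tdata,
                            Section tbss) {
  uint64_t tdataAddr = alignUp(base + lead.size, tdata.align);
  uint64_t tbssAddr = alignUp(tdataAddr + tdata.size, tbss.align);
  TlsSegment seg;
  seg.vaddr = tdataAddr;
  seg.memsz = tbssAddr + tbss.size - tdataAddr;
  seg.align = std::max(tdata.align, tbss.align);
  return seg;
}
```

For example, `layoutTls(0x1000, {8, 8}, {8, 8}, {16, 16})` produces a segment at vaddr 0x1008 with align 16, so vaddr % align is 8 rather than 0.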

I'm posting this because I wasn't sure if a test case has since been added, or whether I should just try posting my example as a test case. @MaskRay, what do you think?

Adding back the code does not break any test, so there is no such test. If you are going to add one, it is probably best to reuse an existing TLS test and check PT_TLS.

Revision Contents

Path                            Size
lld/trunk/ELF/Writer.cpp        5 lines
lld/trunk/test/ELF/tls-align.s  21 lines

Diff 200024

lld/trunk/ELF/Writer.cpp

 if (P->p_type == PT_TLS && P->p_memsz) {
 // On ARM/AArch64, reserve extra space (8 words) between the thread
 // pointer and an executable's TLS segment by overaligning the segment.
 // This reservation is needed for backwards compatibility with Android's
 // TCB, which allocates several slots after the thread pointer (e.g.
 // TLS_SLOT_STACK_GUARD==5). For simplicity, this overalignment is also
 // done on other operating systems.
 P->p_align = std::max<uint64_t>(P->p_align, Config->Wordsize * 8);
 }
-
-// The TLS pointer goes after PT_TLS for variant 2 targets. At least glibc
-// will align it, so round up the size to make sure the offsets are
-// correct.
-P->p_memsz = alignTo(P->p_memsz, P->p_align);
 }
 }
 }

 // A helper struct for checkSectionOverlap.
 namespace {
 struct SectionOffset {
 OutputSection *Sec;

lld/trunk/test/ELF/tls-align.s

	// REQUIRES: x86
	// RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
	// RUN: ld.lld %t -o %tout -shared
	// RUN: llvm-readobj -l %tout | FileCheck %s

	.section .tbss,"awT",@nobits
	.align 8
	.long 0

	// CHECK: ProgramHeader {
	// CHECK: Type: PT_TLS
	// CHECK-NEXT: Offset:
	// CHECK-NEXT: VirtualAddress:
	// CHECK-NEXT: PhysicalAddress:
	// CHECK-NEXT: FileSize: 0
	// CHECK-NEXT: MemSize: 8
	// CHECK-NEXT: Flags [
	// CHECK-NEXT: PF_R (0x4)
	// CHECK-NEXT: ]
	// CHECK-NEXT: Alignment: 8
	// CHECK-NEXT: }

[ELF] Don't align PT_TLS's p_memsz (Closed, Public)
