This is an archive of the discontinued LLVM Phabricator instance.

Differential D53905

[ELF] Refactor per-target TLS layout configuration. NFC.
ClosedPublic

Authored by rprichard on Oct 30 2018, 3:18 PM.

Details

Reviewers

espindola
ruiu
PkmX
jrtc27

Commits

rGe7cb0225a088: [ELF] Refactor per-target TLS layout configuration. NFC.
rL345775: [ELF] Refactor per-target TLS layout configuration. NFC.
rLLD345775: [ELF] Refactor per-target TLS layout configuration. NFC.

Summary

There are really three different kinds of TLS layouts:

1. A fixed TLS-to-TP offset. On architectures like PowerPC, MIPS, and RISC-V, the thread pointer points to a fixed offset from the start of the executable's TLS segment. The offset is 0x7000 for PowerPC and MIPS, which allows a signed 16-bit offset to reach 0x1000 of per-thread implementation data and 0xf000 of the application's TLS segment. The size and layout of the TCB aren't relevant to the static linker and might not be known.
2. A fixed TCB size. This is the format documented as "variant 1" in Ulrich Drepper's TLS spec. The thread pointer points to a 2-word TCB followed by the executable's TLS segment. The first word is always the DTV pointer. The thread pointer must be aligned to the TLS segment's alignment, possibly creating alignment padding. It is used on ARM and AArch64.
3. Variant 2. This format predates variant 1 and is also documented in Drepper's TLS spec. It allocates the executable's TLS segment before the thread pointer, apparently for backwards-compatibility. It's used on x86 and SPARC.
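The 0x1000 / 0xf000 split quoted for the 0x7000 offset follows from the range of a signed 16-bit displacement. A minimal sketch of that arithmetic is below; all names are hypothetical and chosen for illustration, none of them appear in lld.

```cpp
#include <cstdint>

// Hypothetical names for illustration only.
constexpr int64_t TpBias = 0x7000;      // TP = start of TLS segment + 0x7000
constexpr int64_t MinDisp = INT16_MIN;  // -0x8000, lowest signed 16-bit offset
constexpr int64_t MaxDisp = INT16_MAX;  // +0x7fff, highest signed 16-bit offset

// Lowest and highest addresses reachable from TP with one 16-bit displacement,
// expressed relative to the start of the executable's TLS segment.
constexpr int64_t lowestReachable() { return TpBias + MinDisp; }
constexpr int64_t highestReachable() { return TpBias + MaxDisp; }

// Bytes reachable below the TLS segment (per-thread implementation data).
constexpr int64_t bytesBelowTls() { return -lowestReachable(); }
// Bytes of the TLS segment itself that are reachable.
constexpr int64_t bytesOfTls() { return highestReachable() + 1; }

static_assert(bytesBelowTls() == 0x1000, "0x1000 bytes below the segment");
static_assert(bytesOfTls() == 0xf000, "0xf000 bytes of the segment");
```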

Factor out an lld::elf::getTlsTpOffset() function for use in a
follow-up patch for Android. The TcbSize/TlsTpOffset fields are only used
in getTlsTpOffset, so replace them with a switch on Config->EMachine.
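As a rough standalone illustration of the three layouts, the sketch below computes the adjustment that turns a TLS-segment-relative address into a thread-pointer-relative one. The `Machine` enum, `TlsSegment` struct, `alignUp` and `tlsTpOffset` are hypothetical stand-ins for `Config->EMachine`, `Out::TlsPhdr`, `alignTo` and the patch's function; only the three rules themselves come from the summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for lld's configuration and output state.
enum class Machine { ARM, AArch64, X86, X86_64, PPC64 };

struct TlsSegment {
  uint64_t MemSize; // plays the role of p_memsz
  uint64_t Align;   // plays the role of p_align, a power of two
};

// Round Value up to a multiple of Align (Align must be a power of two).
inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Offset to add to a TLS-segment-relative address to make it TP-relative.
inline int64_t tlsTpOffset(Machine M, uint64_t WordSize, const TlsSegment &Tls) {
  switch (M) {
  case Machine::ARM:
  case Machine::AArch64:
    // Fixed 2-word TCB, then alignment padding, then the TLS segment.
    return static_cast<int64_t>(alignUp(WordSize * 2, Tls.Align));
  case Machine::X86:
  case Machine::X86_64:
    // Variant 2: the TLS segment ends at the thread pointer.
    return -static_cast<int64_t>(Tls.MemSize);
  case Machine::PPC64:
    // Fixed TLS-to-TP offset of 0x7000.
    return -0x7000;
  }
  return 0; // not reached for the enumerators above
}
```

For example, with 8-byte words and a 64-byte-aligned TLS segment, the variant 1 case yields 64 rather than 16, because the two-word TCB is padded up to the segment's alignment.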

Diff Detail

Repository: rLLD LLVM Linker

Event Timeline

rprichard created this revision. Oct 30 2018, 3:18 PM

Herald added a reviewer: espindola. Oct 30 2018, 3:18 PM

Herald added subscribers: llvm-commits, jsji, PkmX and 10 others.

Harbormaster completed remote builds in B24360: Diff 171823. Oct 30 2018, 3:18 PM

rprichard added a parent revision: D53904: [ELF] Define PT_ANDROID_TLS_TPOFF. Oct 30 2018, 3:23 PM

rprichard added a child revision: D53906: [ARM][AArch64] Increase TLS alignment to reserve space for Android's TCB.

This change would make it possible to model RISC-V TLS without the (Config->EMachine == EM_RISCV) special case in https://reviews.llvm.org/D39324 (InputSection.cpp). RISC-V expects a fixed offset of 0.

Hmm, I'm not sure I like making TlsTpOffset negative. The name reads as meaning the offset of the thread pointer, which is *positive*, and in general linkers, loaders and libcs like to define it as a positive constant (along with others, such as PPC64TocOffset and DynamicThreadPointerOffset in PPC64.cpp). Personally I'd leave TlsTpOffset as being positive, rename getTlsTpOffset (maybe getTlsOffsetFromTp?) and make it negate TlsTpOffset. Thoughts? Otherwise looks correct to me.
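One way to read this suggestion, as a sketch only: keep the constant positive and confine the negation to the accessor. The name `getTlsOffsetFromTp` is the one floated in the comment; `tpRelative` and the standalone constant are hypothetical, and the value 0x7000 is the PPC64 one from the summary.

```cpp
#include <cstdint>

// Positive constant, matching how loaders and libcs usually define it.
constexpr uint64_t TlsTpOffset = 0x7000;

// Offset of the TLS segment start as seen from the thread pointer: the
// negation of the positive constant above.
constexpr int64_t getTlsOffsetFromTp() {
  return -static_cast<int64_t>(TlsTpOffset);
}

// Hypothetical helper: convert a TLS-segment-relative address into a
// thread-pointer-relative one.
constexpr int64_t tpRelative(uint64_t SymVA) {
  return static_cast<int64_t>(SymVA) + getTlsOffsetFromTp();
}

static_assert(getTlsOffsetFromTp() == -0x7000, "negated positive constant");
```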

ruiu added inline comments. Oct 30 2018, 4:14 PM

ELF/InputSection.cpp
572–573	I don't think you need to define TlsLayoutKind. I'd just dispatch based on `Config->EMachine` in this function.

jrtc27 added inline comments. Oct 30 2018, 4:21 PM

ELF/InputSection.cpp
572–573	That works, and there are certainly other parts of LLD that do things like that, but you could say the same thing about `TlsTpOffset` and `TcbSize` themselves... I personally like to see `Config->EMachine` used as little as possible and put the logic in `ELF/Arch/$ARCH.cpp`, but of course as maintainer you ultimately decide where to draw the line for when to abstract, and it's not exactly a big deal (though does then disregard the original motivation for this patch). Also, on a separate issue, the case labels (and bodies) should be deindented one level to be flush with the `switch`.

jrtc27 requested changes to this revision. Oct 30 2018, 4:22 PM

This revision now requires changes to proceed. Oct 30 2018, 4:22 PM

ruiu added inline comments. Oct 30 2018, 4:29 PM

ELF/InputSection.cpp
572–573	`TlsTpOffset` and `TcbSize` are just member variables that don't have any logic, but `TlsLayoutKind` is a new thing with which we compute some value, and we use that member variable only in this function. So I think eliminating `TlsLayoutKind` and directly using `Config->EMachine` matches the taste of lld's code. I do understand your motivation to write target-dependent code in files under `Arch/`, but we are not too serious about doing that. We have target-dependent code in many other places if it makes code easier to read. As to the indentation depth, `case` should be at the same nesting level as `switch` in the LLVM coding style.

rprichard added inline comments. Oct 30 2018, 5:32 PM

ELF/InputSection.cpp
572–573	Using `Config->EMachine` makes sense to me. I'm wondering if we should keep the `TcbSize` / `TlsTpOffset` fields or move them into `getTlsTpOffset`. I'm leaning toward removing the fields so that all the TP-to-TLS-segment logic is in one place.

Replace the TargetInfo TLS layout fields with a switch on Config->EMachine.

Harbormaster completed remote builds in B24372: Diff 171855. Oct 30 2018, 6:34 PM

rprichard retitled this revision from [ELF] Refactor TLS layout TargetInfo config. NFC. to [ELF] Refactor per-target TLS layout configuration. NFC. Oct 30 2018, 6:39 PM

rprichard edited the summary of this revision.

LGTM

Nice! The new code is much easier to understand IMO.

Yeah, if we're not using a field for which TLS variant the target wants, this is the clearest way to do it, putting all the logic in one place rather than some in the target and some in InputSection.

This revision is now accepted and ready to land. Oct 31 2018, 8:07 AM

ruiu added inline comments. Oct 31 2018, 8:11 AM

ELF/InputSection.cpp
571	I think you can make this a file-scope `static` function.

LGTM. I will update D39324 to reflect this change.

Make getTlsTpOffset a file-scope static function.

Harbormaster completed remote builds in B24409: Diff 172001. Oct 31 2018, 1:20 PM

rprichard mentioned this in D53906: [ARM][AArch64] Increase TLS alignment to reserve space for Android's TCB. Oct 31 2018, 1:21 PM

Still LGTM, please commit.

Closed by commit rLLD345775: [ELF] Refactor per-target TLS layout configuration. NFC. (authored by rprichard). Oct 31 2018, 1:56 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                    Size
ELF/Arch/AArch64.cpp    3 lines
ELF/Arch/ARM.cpp        2 lines
ELF/Arch/PPC64.cpp      2 lines
ELF/InputSection.cpp    44 lines
ELF/Target.h            9 lines

Diff 172013

ELF/Arch/AArch64.cpp

[66 lines not shown; context: AArch64::AArch64() {]
    PltEntrySize = 16;
    PltHeaderSize = 32;
    DefaultMaxPageSize = 65536;

    // Align to the 2 MiB page size (known as a superpage or huge page).
    // FreeBSD automatically promotes 2 MiB-aligned allocations.
    DefaultImageBase = 0x200000;

-   // It doesn't seem to be documented anywhere, but tls on aarch64 uses variant
-   // 1 of the tls structures and the tcb size is 16.
-   TcbSize = 16;
    NeedsThunks = true;
  }

  RelExpr AArch64::getRelExpr(RelType Type, const Symbol &S,
                              const uint8_t *Loc) const {
    switch (Type) {
    case R_AARCH64_TLSDESC_ADR_PAGE21:
      return R_TLSDESC_PAGE;
[358 lines not shown]

ELF/Arch/ARM.cpp

[56 lines not shown; context: ARM::ARM() {]
    TlsModuleIndexRel = R_ARM_TLS_DTPMOD32;
    TlsOffsetRel = R_ARM_TLS_DTPOFF32;
    GotBaseSymInGotPlt = false;
    GotEntrySize = 4;
    GotPltEntrySize = 4;
    PltEntrySize = 16;
    PltHeaderSize = 32;
    TrapInstr = 0xd4d4d4d4;
-   // ARM uses Variant 1 TLS
-   TcbSize = 8;
    NeedsThunks = true;
  }

  uint32_t ARM::calcEFlags() const {
    // The ABIFloatType is used by loaders to detect the floating point calling
    // convention.
    uint32_t ABIFloatType = 0;
    if (Config->ARMVFPArgs == ARMVFPArgKind::Base ||
[536 lines not shown]

ELF/Arch/PPC64.cpp

[201 lines not shown; context: PPC64::PPC64() {]
    PltEntrySize = 4;
    GotPltEntrySize = 8;
    GotBaseSymInGotPlt = false;
    GotBaseSymOff = 0x8000;
    GotHeaderEntriesNum = 1;
    GotPltHeaderEntriesNum = 2;
    PltHeaderSize = 60;
    NeedsThunks = true;
-   TcbSize = 8;
-   TlsTpOffset = 0x7000;

    TlsModuleIndexRel = R_PPC64_DTPMOD64;
    TlsOffsetRel = R_PPC64_DTPREL64;

    TlsGotRel = R_PPC64_TPREL64;

    NeedsMoreStackNonSplit = false;

[661 lines not shown]

ELF/InputSection.cpp

[560 lines not shown; context: for (auto It = std::get<0>(Range); It != std::get<1>(Range); ++It)]
      if (isRelExprOneOf<R_PC>(It->Expr))
        return &*It;

    error("R_RISCV_PCREL_LO12 relocation points to " + IS->getObjMsg(D->Value) +
          " without an associated R_RISCV_PCREL_HI20 relocation");
    return nullptr;
  }

+ // A TLS symbol's virtual address is relative to the TLS segment. Add a
+ // target-specific adjustment to produce a thread-pointer-relative offset.
+ static int64_t getTlsTpOffset() {
+   switch (Config->EMachine) {
+   case EM_ARM:
+   case EM_AARCH64:
+     // Variant 1. The thread pointer points to a TCB with a fixed 2-word size,
+     // followed by a variable amount of alignment padding, followed by the TLS
+     // segment.
+     return alignTo(Config->Wordsize * 2, Out::TlsPhdr->p_align);
+   case EM_386:
+   case EM_X86_64:
+     // Variant 2. The TLS segment is located just before the thread pointer.
+     return -Out::TlsPhdr->p_memsz;
+   case EM_PPC64:
+     // The thread pointer points to a fixed offset from the start of the
+     // executable's TLS segment. An offset of 0x7000 allows a signed 16-bit
+     // offset to reach 0x1000 of TCB/thread-library data and 0xf000 of the
+     // program's TLS segment.
+     return -0x7000;
+   default:
+     llvm_unreachable("unhandled Config->EMachine");
+   }
+ }
+
  static uint64_t getRelocTargetVA(const InputFile *File, RelType Type, int64_t A,
                                   uint64_t P, const Symbol &Sym, RelExpr Expr) {
    switch (Expr) {
    case R_INVALID:
      return 0;
    case R_ABS:
    case R_RELAX_TLS_LD_TO_LE_ABS:
    case R_RELAX_GOT_PC_NOPIC:
[129 lines not shown; context: static uint64_t getRelocTargetVA(const InputFile *File, RelType Type, int64_t A,]
    case R_TLS:
      // A weak undefined TLS symbol resolves to the base of the TLS
      // block, i.e. gets a value of zero. If we pass --gc-sections to
      // lld and .tbss is not referenced, it gets reclaimed and we don't
      // create a TLS program header. Therefore, we resolve this
      // statically to zero.
      if (Sym.isTls() && Sym.isUndefWeak())
        return 0;
-     // For TLS variant 1 the TCB is a fixed size, whereas for TLS variant 2 the
-     // TCB is on unspecified size and content. Targets that implement variant 1
-     // should set TcbSize.
-     if (Target->TcbSize) {
-       // PPC64 V2 ABI has the thread pointer offset into the middle of the TLS
-       // storage area by TlsTpOffset for efficient addressing TCB and up to
-       // 4KB – 8 B of other thread library information (placed before the TCB).
-       // Subtracting this offset will get the address of the first TLS block.
-       if (Target->TlsTpOffset)
-         return Sym.getVA(A) - Target->TlsTpOffset;
-
-       // If thread pointer is not offset into the middle, the first thing in the
-       // TLS storage area is the TCB. Add the TcbSize to get the address of the
-       // first TLS block.
-       return Sym.getVA(A) + alignTo(Target->TcbSize, Out::TlsPhdr->p_align);
-     }
-     return Sym.getVA(A) - Out::TlsPhdr->p_memsz;
+     return Sym.getVA(A) + getTlsTpOffset();
    case R_RELAX_TLS_GD_TO_LE_NEG:
    case R_NEG_TLS:
      return Out::TlsPhdr->p_memsz - Sym.getVA(A);
    case R_SIZE:
      return Sym.getSize() + A;
    case R_TLSDESC:
      return In.Got->getGlobalDynAddr(Sym) + A;
    case R_TLSDESC_PAGE:
[551 lines not shown]

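The "NFC" claim for the R_TLS case can be spot-checked outside of lld. The sketch below re-creates the removed field-based logic and the new switch side by side; every name in it (`Machine`, `OldFields`, `oldFieldsFor`, `wordSize`, `alignUp`, `oldTlsVA`, `newTlsVA`) is a hypothetical stand-in, with the per-target field values and word sizes taken from the constructors and the switch shown in this diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; not lld's real types.
enum class Machine { ARM, AArch64, X86_64, PPC64 };

struct OldFields {
  uint64_t TcbSize;
  uint64_t TlsTpOffset;
};

// Field values the removed constructor lines used to set (0 is the default).
inline OldFields oldFieldsFor(Machine M) {
  switch (M) {
  case Machine::ARM:     return {8, 0};
  case Machine::AArch64: return {16, 0};
  case Machine::PPC64:   return {8, 0x7000};
  case Machine::X86_64:  break;
  }
  return {0, 0};
}

inline uint64_t wordSize(Machine M) { return M == Machine::ARM ? 4 : 8; }

// Round Value up to a multiple of Align (Align must be a power of two).
inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// The removed logic: nested checks on TcbSize and TlsTpOffset.
inline uint64_t oldTlsVA(Machine M, uint64_t VA, uint64_t MemSz, uint64_t Align) {
  OldFields T = oldFieldsFor(M);
  if (T.TcbSize) {
    if (T.TlsTpOffset)
      return VA - T.TlsTpOffset;
    return VA + alignUp(T.TcbSize, Align);
  }
  return VA - MemSz;
}

// The new logic: one switch on the machine type.
inline uint64_t newTlsVA(Machine M, uint64_t VA, uint64_t MemSz, uint64_t Align) {
  switch (M) {
  case Machine::ARM:
  case Machine::AArch64:
    return VA + alignUp(wordSize(M) * 2, Align);
  case Machine::X86_64:
    return VA - MemSz;
  case Machine::PPC64:
    return VA - 0x7000;
  }
  return VA; // not reached for the enumerators above
}
```

Both functions use unsigned 64-bit arithmetic, so the negative adjustments wrap identically in the old and new forms.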
ELF/Target.h

[111 lines not shown; context: public:]

    // At least on x86_64 positions 1 and 2 are used by the first plt entry
    // to support lazy loading.
    unsigned GotPltHeaderEntriesNum = 3;

    // On PPC ELF V2 abi, the first entry in the .got is the .TOC.
    unsigned GotHeaderEntriesNum = 0;

-   // For TLS variant 1, the TCB is a fixed size specified by the Target.
-   // For variant 2, the TCB is an unspecified size.
-   // Set to 0 for variant 2.
-   unsigned TcbSize = 0;
-
-   // Set to the offset (in bytes) that the thread pointer is initialized to
-   // point to, relative to the start of the thread local storage.
-   unsigned TlsTpOffset = 0;
-
    bool NeedsThunks = false;

    // A 4-byte field corresponding to one or more trap instructions, used to pad
    // executable OutputSections.
    uint32_t TrapInstr = 0;

    // If a target needs to rewrite calls to __morestack to instead call
    // __morestack_non_split when a split-stack enabled caller calls a
[131 lines not shown]