This is an archive of the discontinued LLVM Phabricator instance.


Differential D133003

[ELF] Parallelize relocation scanning
ClosedPublic

Authored by MaskRay on Aug 31 2022, 1:38 AM.


Details

Reviewers

andrewng
ikudrin
peter.smith

Commits

rGe6aebff67426: [ELF] Parallelize relocation scanning

Summary

Change Symbol::flags to a std::atomic<uint16_t>
Add llvm::parallel::threadIndex as a thread-local non-negative integer
Add relocsVec to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.

MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.
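The first item above can be pictured with a minimal sketch, assuming a stand-in type `SymbolSketch` and made-up flag values (the real `NEEDS_*` constants and `Symbol` class live in lld/ELF/Symbols.h and may differ). It only illustrates why an atomic bit-set lets several scanning threads mark the same symbol without a lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag values, chosen only for this sketch.
enum : uint16_t {
  NEEDS_GOT = 1 << 0,
  NEEDS_PLT = 1 << 1,
  NEEDS_COPY = 1 << 2,
};

// Stand-in for a symbol; not the real lld Symbol class.
struct SymbolSketch {
  std::atomic<uint16_t> flags{0};

  // fetch_or is one atomic read-modify-write, so two threads that set
  // different bits at the same time cannot lose each other's update.
  void setFlags(uint16_t bits) {
    flags.fetch_or(bits, std::memory_order_relaxed);
  }

  // Tests exactly one bit; the assert rejects zero and multi-bit masks.
  bool hasFlag(uint16_t bit) const {
    assert(bit != 0 && (bit & (bit - 1)) == 0 && "expected a single bit");
    return (flags.load(std::memory_order_relaxed) & bit) != 0;
  }
};
```

Relaxed ordering is used here on the assumption that the flags are only read after the parallel phase has been joined; code that reads them concurrently would need to revisit that choice.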

Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:

clang (Release): 1.27x as fast
clang (Debug): 1.06x as fast
chrome (default): 1.05x as fast
scylladb (default): 1.04x as fast

Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):

clang (Release): 1.31x as fast
scylladb (default): 1.06x as fast

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MaskRay created this revision. Aug 31 2022, 1:38 AM

Herald added a project: Restricted Project. Aug 31 2022, 1:38 AM

Herald added subscribers: StephenFan, atanasyan, arichardson and 2 others.

MaskRay requested review of this revision. Aug 31 2022, 1:38 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B184313: Diff 456893. Aug 31 2022, 2:08 AM

ikudrin added inline comments. Aug 31 2022, 11:45 AM

lld/ELF/Relocations.cpp
300–301	Why not define a copy constructor?
lld/ELF/Symbols.h
299–300	Why isn't `needsTlsGdToIe` moved under `atomic` like `needsTlsGd` and the others?
315	You use it with two flags at least once, maybe call it `setFlags`?

MaskRay added inline comments. Aug 31 2022, 12:24 PM

lld/ELF/Symbols.h
299–300	All the 8 bits of `std::atomic<uint8_t>` have been used. We need one not in atomic if we want to keep the size of `SymbolUnion` unchanged.

ikudrin added inline comments. Sep 1 2022, 2:10 AM

lld/ELF/Symbols.h
299–300	Does that mean that some flags in the atomic do not really need to be handled as such, or that this flag is left outside even though it can potentially be updated concurrently, simply because there is no space for it in `flags`? In any case, that is worth documenting, at least.

rebase. address comments

lld/ELF/Relocations.cpp
300–301	Good idea. Adopted
lld/ELF/Symbols.h
299–300	I have replaced `Symbol::visibility` with `Symbol::stOther` and atomic<uint16_t> is fine now, but I suspect 16-bit atomic operations are not efficient on common architectures.

MaskRay retitled this revision from [WIP][ELF] Parallelize relocation scanning to [ELF] Parallelize relocation scanning. Sep 4 2022, 6:29 PM

Harbormaster completed remote builds in B185029: Diff 457877. Sep 4 2022, 6:37 PM

rebase

Harbormaster completed remote builds in B185039: Diff 457889. Sep 4 2022, 11:48 PM

lkail added a subscriber: lkail. Sep 4 2022, 11:53 PM

ikudrin added inline comments. Sep 5 2022, 7:11 AM

lld/ELF/Relocations.cpp
1241	If `GotSection::hasGotOffRel` and `GotPltSection::hasGotPltOffRel` are converted to `atomic<bool>`, the same should be done for `Configuration::needsTlsLd` because their usage pattern is similar.
1296	Shouldn't `relocMutex` be locked before this call?
1575	`uint8_t` -> `uint16_t`; not that it changes anything because the only flag that exceeds the range is `NEEDS_TLSIE` which is not used here, but still.

Sorry, I've been busy, so have only just had some time to look at this patch. Looks promising but unfortunately there are performance regressions on Windows for both chrome (~3%) and mozilla (~5%) from lld-speed-test.tar.xz. Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.

Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.

Had some time to investigate a bit more and it seems that the slow down, at least on my 12C/24T Windows PC, is actually a result of contention over relocMutex in RelocationScanner::processAux. So "too many" concurrent threads running RelocationScanner::processAux can result in an overall slow down to scan the relocations and in these cases, it's likely to slow down even further with more available threads. Unfortunately, there's no mechanism in parallel::TaskGroup to limit the number of concurrent tasks being run by the pool from the group, so there's no "easy" solution. I've been experimenting with some ideas that shard the input sections such that there are fewer concurrent threads running the relocation scanning code.

In D133003#3772446, @andrewng wrote:

Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.

Had some time to investigate a bit more and it seems that the slow down, at least on my 12C/24T Windows PC, is actually a result of contention over relocMutex in RelocationScanner::processAux. So "too many" concurrent threads running RelocationScanner::processAux can result in an overall slow down to scan the relocations and in these cases, it's likely to slow down even further with more available threads. Unfortunately, there's no mechanism in parallel::TaskGroup to limit the number of concurrent tasks being run by the pool from the group, so there's no "easy" solution. I've been experimenting with some ideas that shard the input sections such that there are fewer concurrent threads running the relocation scanning code.

Thanks for catching the issue. Perhaps we can add a thread_local thread index (for getDefaultExecutor) to llvm/Support/Parallel.h and allocate a relocation vector for each thread. Finally merge and sort the relocation vectors.
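A rough sketch of that idea, assuming hypothetical names (`ShardedRelocs`, `RelocSketch`) and a caller that hands each worker its own index: every thread appends only to its own vector, so no mutex is needed, and a single merge step afterwards restores a deterministic order. This is an illustration of the approach, not the code in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical relocation record used only for this sketch.
struct RelocSketch {
  int section;
  unsigned offset;
};

class ShardedRelocs {
public:
  explicit ShardedRelocs(std::size_t numThreads) : shards(numThreads) {}

  // Each worker passes its own index, so concurrent calls from different
  // workers touch different vectors and need no lock.
  void add(std::size_t threadIndex, RelocSketch r) {
    assert(threadIndex < shards.size());
    shards[threadIndex].push_back(r);
  }

  // Call only after all workers have finished. Sorting makes the result
  // independent of which thread happened to record which relocation.
  std::vector<RelocSketch> merge() const {
    std::vector<RelocSketch> out;
    for (const auto &v : shards)
      out.insert(out.end(), v.begin(), v.end());
    std::sort(out.begin(), out.end(),
              [](const RelocSketch &a, const RelocSketch &b) {
                return std::tie(a.section, a.offset) <
                       std::tie(b.section, b.offset);
              });
    return out;
  }

private:
  std::vector<std::vector<RelocSketch>> shards;
};
```

The sort key here (section, then offset) is an assumption made for the example; any total order that does not depend on thread scheduling would serve the same purpose.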

lld/ELF/Relocations.cpp
1575	Thanks for catching this!

andrewng mentioned this in D133431: [WIP][ELF] Parallelize relocation scanning. Sep 7 2022, 9:15 AM

I've created D133431 which is the result of my experimentation thus far. In my testing, it does slightly improve performance in the test cases that regressed in performance. In other test cases, it's around the same or slightly lower performance increase. However, I can't help but feel there should be a "better" solution. Although, I guess you've always got to balance that with complexity/maintainability. The hard coded concurrency limit of 8 tasks in D133431 also doesn't feel great.

Thanks for catching the issue. Perhaps we can add a thread_local thread index (for getDefaultExecutor) to llvm/Support/Parallel.h and allocate a relocation vector for each thread. Finally merge and sort the relocation vectors.

Yes, trying to eliminate the lock contention does sound like a good approach, although it feels like it would add complexity.

Also forgot to mention that there were 2 other ELF tests that seemed to need the --threads=1 treatment: comdat-discarded-error.s and debug-line-obj.s (although this might be due to the change in D133431).

Remove mutex for relative relocations. Thanks to @andrewng for finding the issue

If the NEEDS_* change looks good, I'll pre-commit it (without using std::atomic) to reduce diff for future updates.

Herald added subscribers: ctetreau, hiraditya. Sep 9 2022, 12:44 AM

Harbormaster completed remote builds in B185781: Diff 458972. Sep 9 2022, 2:07 AM

If the NEEDS_* change looks good, I'll pre-commit it (without using std::atomic) to reduce diff for future updates.

The NEEDS_* change LGTM.

This approach definitely looks better and hasn't added too much complexity. Initial testing on Windows is looking good, but I need to do a bit more.

lld/ELF/Relocations.cpp
1559	I wonder if it might be worthwhile using the previous code for the serial case? Although, it probably doesn't make a big difference to performance.
lld/ELF/Symbols.h
318	The argument name implies a single bit but perhaps add an assert, e.g. `assert((bit & (bit - 1)) == 0)`?
lld/ELF/SyntheticSections.h
546–547	Typo: `should will` -> `will`? Is it worth adding the same comment to `relocsVec` in `class RelrBaseSection`?
llvm/lib/Support/Parallel.cpp
21	Perhaps `int` -> `unsigned`?
53	Perhaps move this initialisation of `threadIndex` and the one below into `work()`?

Performance on Windows looks good! Every test case I've tried has shown an improvement.

lld/ELF/SyntheticSections.cpp
1603–1606	Perhaps `const auto &v`? Same for `RelrBaseSection::mergeRels()`.

MaskRay mentioned this in rGbd16ffb38981: [ELF] Merge Symbol::needs* into uint16_t flags. NFC. Sep 9 2022, 2:37 PM

Thanks a lot for the comments.
Updated.

MaskRay added inline comments. Sep 9 2022, 2:46 PM

lld/ELF/Relocations.cpp
1543	I'll remove `AndroidPackedRelocationSection does not support parallelism.` . It works with deterministic parallelism.
1559	Use which piece of code for the serial case?

Harbormaster completed remote builds in B185954: Diff 459206. Sep 9 2022, 4:24 PM

MaskRay edited the summary of this revision. Sep 9 2022, 11:45 PM

andrewng added inline comments. Sep 10 2022, 11:23 AM

lld/ELF/Relocations.cpp
1559	I was thinking this:

    for (InputSectionBase *sec : inputSections)
      if (sec->isLive() && (sec->flags & SHF_ALLOC))
        scanner.template scanSection<ELFT>(*sec);

But on the other hand, in terms of future development and maintenance, it's probably better to use as much of the same code as possible for both "paths", even if there's a minor performance penalty for the serial one.
1566–1571	This is running on the main thread. Is there a chance that this might clash with thread 0 of the task pool?

reduce contention

lld/ELF/Relocations.cpp
1559	Yes, using the same code for both paths is better for maintenance.
1566–1571	Thanks for catching this. The main thread doing heavy work will contend with the thread pool. Changed to use `tg.execute`.

Harbormaster completed remote builds in B186035: Diff 459312. Sep 10 2022, 4:16 PM

andrewng added inline comments. Sep 11 2022, 6:10 AM

lld/ELF/Relocations.cpp
1559	Yes, I think I agree. If it only affected the single threaded case, I wouldn't have mentioned it. But as there are specific configurations that are limited to serial I thought that it might be worth considering.
1566–1571	I think the previous code could have actually caused a threading issue, i.e. concurrent updates to the `0` indexed relocation vector. This will ensure that can't happen. The only minor thing is it looks a little odd that the "serial" case uses `tg` but I guess it is still serial.

MaskRay marked an inline comment as done. Sep 11 2022, 11:00 AM

MaskRay added inline comments.

lld/ELF/Relocations.cpp

1559

Add the comment before the tg.execute([] { line?

+  // Both the main thread and thread pool index 0 use threadIndex==0.  Be
+  // careful that they don't concurrently run scanSections. When serial is
+  // true, fn() has finished at this point, so running execute is safe

This LGTM now, but I think it would be good to get another opinion too.

lld/ELF/Relocations.cpp
1559	Yes, I think it would be worth adding the comment for clarity.

This revision is now accepted and ready to land. Sep 12 2022, 1:46 AM

No objections from me. Some small suggestions for comments and a way that might catch someone using the relocs array before mergeRels has been called.

lld/ELF/SyntheticSections.h
506	Would it be better to move the text into the /// comment as it is a precondition for calling the function?
545–548	Now that mergeRels has to be called before this is useable, is it worth making this private with an interface that asserts mergeRels has been called?
546	Suggest "// will be moved into relocs by mergeRels()."

Add concurrency to constructors and make RelocationBaseSection::relocsVec protected

MaskRay marked 3 inline comments as done. Sep 12 2022, 10:46 AM

Harbormaster completed remote builds in B186197: Diff 459518. Sep 12 2022, 11:55 AM

MaskRay edited the summary of this revision. Sep 12 2022, 12:50 PM

Herald added a subscriber: kristof.beyls. Sep 12 2022, 12:50 PM

Closed by commit rGe6aebff67426: [ELF] Parallelize relocation scanning (authored by MaskRay). Sep 12 2022, 12:56 PM

This revision was automatically updated to reflect the committed changes.

MaskRay added a commit: rGe6aebff67426: [ELF] Parallelize relocation scanning.

Unfortunately, this commit broke mingw dylib builds with Windows native TLS. The reason for this is that with Windows native TLS, you can't directly access a TLS variable residing in a different DLL.

(Mingw setups that use emulated TLS don't have that drawback themselves. But GCC/binutils does occasionally have issues with non-static TLS variables accessed from multiple source files - such variables end up with a bunch of extra wrapper functions, which use weak linkage, which has a couple of issues in GCC/binutils too; see D111779 where we avoided cross-translation-unit TLS variables in LLDB to avoid crashes when built with GCC.)

Is it possible to wrap the accesses to parallel::threadIndex into a wrapper function, i.e. like parallel::getThreadIndex()? I presume that would add a tiny bit of overhead, in a routine that we want to tune for performance anyway. We could have that wrapper be inline, in the case of non-Windows platforms (which should result in the same code generated, I guess) and be defined in Parallel.cpp for Windows cases. (There are build configurations on Windows where this wouldn't be strictly necessary, but the overhead is probably small enough that it's not worth the effort to try to distinguish all the individual cases.)
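The wrapper proposal can be sketched as follows, assuming a hypothetical `sketch` namespace and collapsing what would be a header and a .cpp file into one listing. On Windows the accessor is declared only and defined out of line, so callers in other DLLs never touch the TLS variable directly; elsewhere it is an inline function that should compile down to a direct access.

```cpp
#include <cassert>

// --- what a header might contain (names are illustrative only) ---
namespace sketch {
#if defined(_WIN32)
// Out-of-line accessor: the thread-local stays private to the defining
// module, so no cross-DLL TLS access is generated.
unsigned getThreadIndex();
#else
extern thread_local unsigned threadIndex;
inline unsigned getThreadIndex() { return threadIndex; }
#endif
} // namespace sketch

// --- what the single defining .cpp file might contain ---
namespace sketch {
#if defined(_WIN32)
static thread_local unsigned threadIndex = 0;
unsigned getThreadIndex() { return threadIndex; }
#else
thread_local unsigned threadIndex = 0;
#endif
} // namespace sketch
```

Callers would then write `sketch::getThreadIndex()` wherever they previously read the variable; how the real change (D133759) splits this is not shown in this thread.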

On z/OS this would also break the build because there is no support for TLS. To work around this, we have disabled LLVM_ENABLE_THREADS here: https://github.com/llvm/llvm-project/blob/main/llvm/CMakeLists.txt#L478 . Would we be able to move the declaration inside #if LLVM_ENABLE_THREADS? Thanks in advance.

andrewng mentioned this in D133759: [Support] Access threadIndex via a wrapper function. Sep 13 2022, 7:57 AM

We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380

In D133003#3805979, @hans wrote:

We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380

I've put an lld repro here: https://drive.google.com/file/d/19zRK4jUxghCA5Pg_OJUugLM-D4yZ7iQR/view?usp=sharing (1.4 GB, requires google.com login)

(The lack of thread_local problem on some configurations of Windows/zOS has been resolved.)

In D133003#3806508, @hans wrote:

In D133003#3805979, @hans wrote:

We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380

I've put an lld repro here: https://drive.google.com/file/d/19zRK4jUxghCA5Pg_OJUugLM-D4yZ7iQR/view?usp=sharing (1.4 GB, requires google.com login)

Thanks for the reproducer. The nondeterminism is due to --pack-dyn-relocs=android.
I suspected it might be a problem in a previous revision, but after reading some code I thought it was OK.

I'll remove AndroidPackedRelocationSection does not support parallelism. . It works with deterministic parallelism.

So the section still has some problems. --pack-dyn-relocs=relr is deterministic from the many experiments I have done.

MaskRay mentioned this in rGbce6416775ea: [ELF] --pack-dyn-relocs=android: scan relocation serially after D133003. Sep 21 2022, 11:43 AM

We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?

In D133003#3817988, @hans wrote:

We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?

For the repro tarball, I've verified it's fixed.

    while :; do fld.lld @response.txt --threads=4 -o 0; fld.lld @response.txt --threads=4 -o 1; cmp 0 1; done

(no output)

In D133003#3818518, @MaskRay wrote:

In D133003#3817988, @hans wrote:

We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?

For the repro tarball, I've verified it's fixed.

    while :; do fld.lld @response.txt --threads=4 -o 0; fld.lld @response.txt --threads=4 -o 1; cmp 0 1; done

(no output)

Okay, thanks. I'll see if I can provide some kind of reproducer for the new problem.

MaskRay mentioned this in rG62e7c5b4e2e1: Revert "[ELF] --pack-dyn-relocs=android: scan relocation serially after D133003". Sep 28 2022, 12:06 AM

Sorry, turns out the bot which was failing hadn't picked up your change yet. I've verified that we're good locally, and also at tip-of-tree which includes the revert above.

Revision Contents

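Path                                      Size
lld/ELF/Arch/AArch64.cpp                  4 lines
lld/ELF/MapFile.cpp                       2 lines
lld/ELF/Relocations.cpp                   112 lines
lld/ELF/Symbols.h                         55 lines
lld/ELF/Symbols.cpp                       2 lines
lld/ELF/SyntheticSections.h               51 lines
lld/ELF/SyntheticSections.cpp             50 lines
lld/ELF/Writer.cpp                        9 lines
lld/test/ELF/combreloc.s                  2 lines
lld/test/ELF/comdat-discarded-error.s     2 lines
lld/test/ELF/undef-multi.s                4 lines
lld/test/ELF/undef.s                      4 lines
llvm/include/llvm/Support/Parallel.h      1 line
llvm/lib/Support/Parallel.cpp             7 lines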

Diff 458972

lld/ELF/Arch/AArch64.cpp

@@ 850 lines not shown @@ const uint8_t pacBr[] = {
       0x20, 0x02, 0x1f, 0xd6 // br x17
   };
   const uint8_t stdBr[] = {
       0x20, 0x02, 0x1f, 0xd6, // br x17
       0x1f, 0x20, 0x03, 0xd5 // nop
   };
   const uint8_t nopData[] = { 0x1f, 0x20, 0x03, 0xd5 }; // nop

-  // needsCopy indicates a non-ifunc canonical PLT entry whose address may
+  // NEEDS_COPY indicates a non-ifunc canonical PLT entry whose address may
   // escape to shared objects. isInIplt indicates a non-preemptible ifunc. Its
   // address may escape if referenced by a direct relocation. The condition is
   // conservative.
-  bool hasBti = btiHeader && (sym.needsCopy || sym.isInIplt);
+  bool hasBti = btiHeader && (sym.hasFlag(NEEDS_COPY) || sym.isInIplt);
   if (hasBti) {
     memcpy(buf, btiData, sizeof(btiData));
     buf += sizeof(btiData);
     pltEntryAddr += sizeof(btiData);
   }

   uint64_t gotPltEntryAddr = sym.getGotPltVA();
   memcpy(buf, addrInst, sizeof(addrInst));
@@ 25 lines not shown @@

lld/ELF/MapFile.cpp

@@ 53 lines not shown @@

 // Returns a list of all symbols that we want to print out.
 static std::vector<Defined *> getSymbols() {
   std::vector<Defined *> v;
   for (ELFFileBase *file : ctx->objectFiles)
     for (Symbol *b : file->getSymbols())
       if (auto *dr = dyn_cast<Defined>(b))
         if (!dr->isSection() && dr->section && dr->section->isLive() &&
-            (dr->file == file || dr->needsCopy || dr->section->bss))
+            (dr->file == file || dr->hasFlag(NEEDS_COPY) || dr->section->bss))
           v.push_back(dr);
   return v;
 }

 // Returns a map from sections to their symbols.
 static SymbolMapTy getSectionSyms(ArrayRef<Defined *> syms) {
   SymbolMapTy ret;
   for (Defined *dr : syms)
@@ 206 lines not shown @@

lld/ELF/Relocations.cpp

@@ 291 lines not shown @@
 // When a symbol is copy relocated or we create a canonical plt entry, it is
 // effectively a defined symbol. In the case of copy relocation the symbol is
 // in .bss and in the case of a canonical plt entry it is in .plt. This function
 // replaces the existing symbol with a Defined pointing to the appropriate
 // location.
 static void replaceWithDefined(Symbol &sym, SectionBase &sec, uint64_t value,
                                uint64_t size) {
   Symbol old = sym;

   sym.replace(Defined{sym.file, StringRef(), sym.binding, sym.stOther,
                       sym.type, value, size, &sec});

   sym.auxIdx = old.auxIdx;
   sym.verdefIndex = old.verdefIndex;
   sym.exportDynamic = true;
   sym.isUsedInRegularObj = true;
   // A copy relocated alias may need a GOT entry.
-  sym.needsGot = old.needsGot;
+  if (old.hasFlag(NEEDS_GOT))
+    sym.setFlags(NEEDS_GOT);
 }

 // Reserve space in .bss or .bss.rel.ro for copy relocation.
 //
 // The copy relocation is pretty much a hack. If you use a copy relocation
 // in your program, not only the symbol name but the symbol's size, RW/RO
 // bit and alignment become part of the ABI. In addition to that, if the
 // symbol has aliases, the aliases become part of the ABI. That's subtle,
@@ 256 lines not shown @@ struct Loc {
     InputSectionBase *sec;
     uint64_t offset;
   };
   std::vector<Loc> locs;
   bool isWarning;
 };

 std::vector<UndefinedDiag> undefs;
+std::mutex relocMutex;
 }

 // Check whether the definition name def is a mangled function name that matches
 // the reference name ref.
 static bool canSuggestExternCForCXX(StringRef ref, StringRef def) {
   llvm::ItaniumPartialDemangler d;
   std::string name = def.str();
   if (d.partialDemangle(name.c_str()))
@@ 226 lines not shown @@ if (!undef.locs.empty())
       reportUndefinedSymbol(undef, i < 2);
   undefs.clear();
 }

 // Report an undefined symbol if necessary.
 // Returns true if the undefined symbol will produce an error message.
 static bool maybeReportUndefined(Undefined &sym, InputSectionBase &sec,
                                  uint64_t offset) {
+  std::lock_guard<std::mutex> lock(relocMutex);
   // If versioned, issue an error (even if the symbol is weak) because we don't
   // know the defining filename which is required to construct a Verneed entry.
   if (sym.hasVersionSuffix) {
     undefs.push_back({&sym, {{&sec, offset}}, false});
     return true;
   }
   if (sym.isWeak())
     return false;
@@ 32 lines not shown @@ RelType RelocationScanner::getMipsN32RelType(RelTy *&rel) const {
   uint64_t offset = rel->r_offset;

   int n = 0;
   while (rel != static_cast<const RelTy *>(end) && rel->r_offset == offset)
     type |= (rel++)->getType(config->isMips64EL) << (8 * n++);
   return type;
 }

+template <bool shard = false>
 static void addRelativeReloc(InputSectionBase &isec, uint64_t offsetInSec,
                              Symbol &sym, int64_t addend, RelExpr expr,
                              RelType type) {
   Partition &part = isec.getPartition();

   // Add a relative relocation. If relrDyn section is enabled, and the
   // relocation offset is guaranteed to be even, add the relocation to
   // the relrDyn section, otherwise add it to the relaDyn section.
   // relrDyn sections don't support odd offsets. Also, relrDyn sections
   // don't store the addend values, so we must write it to the relocated
   // address.
   if (part.relrDyn && isec.alignment >= 2 && offsetInSec % 2 == 0) {
     isec.relocations.push_back({expr, type, offsetInSec, addend, &sym});
-    part.relrDyn->relocs.push_back({&isec, offsetInSec});
+    if (shard)
+      part.relrDyn->relocsVec[parallel::threadIndex].push_back({&isec, offsetInSec});
+    else
+      part.relrDyn->relocs.push_back({&isec, offsetInSec});
     return;
   }
-  part.relaDyn->addRelativeReloc(target->relativeRel, isec, offsetInSec, sym,
-                                 addend, type, expr);
+  part.relaDyn->addRelativeReloc<shard>(target->relativeRel, isec, offsetInSec,
+                                        sym, addend, type, expr);
 }

 template <class PltSection, class GotPltSection>
 static void addPltEntry(PltSection &plt, GotPltSection &gotPlt,
                         RelocationBaseSection &rel, RelType type, Symbol &sym) {
   plt.addEntry(sym);
   gotPlt.addEntry(sym);
   rel.addReloc({type, &gotPlt, sym.getGotPltOffset(),
@@ 151 lines not shown @@ if (isStaticLinkTimeConstant(expr, type, sym, offset) ||
     sec->relocations.push_back({expr, type, offset, addend, &sym});
     return;
   }

   bool canWrite = (sec->flags & SHF_WRITE) || !config->zText;
   if (canWrite) {
     RelType rel = target.getDynRel(type);
     if (expr == R_GOT || (rel == target.symbolicRel && !sym.isPreemptible)) {
-      addRelativeReloc(*sec, offset, sym, addend, expr, type);
+      addRelativeReloc<true>(*sec, offset, sym, addend, expr, type);
       return;
     } else if (rel != 0) {
       if (config->emachine == EM_MIPS && rel == target.symbolicRel)
         rel = target.relativeRel;
+      std::lock_guard<std::mutex> lock(relocMutex);
       sec->getPartition().relaDyn->addSymbolReloc(rel, *sec, offset, sym,
                                                   addend, type);

       // MIPS ABI turns using of GOT and dynamic relocations inside out.
       // While regular ABI uses dynamic relocations to fill up GOT entries
       // MIPS ABI requires dynamic linker to fills up GOT entries using
       // specially sorted dynamic symbol table. This affects even dynamic
       // relocations against symbols which do not require GOT entries
@@ 25 lines not shown @@ if (!config->shared) {
     if (sym.isObject()) {
       // Produce a copy relocation.
       if (auto *ss = dyn_cast<SharedSymbol>(&sym)) {
         if (!config->zCopyreloc)
           error("unresolvable relocation " + toString(type) +
                 " against symbol '" + toString(*ss) +
                 "'; recompile with -fPIC or remove '-z nocopyreloc'" +
                 getLocation(*sec, sym, offset));
-        sym.needsCopy = true;
+        sym.setFlags(NEEDS_COPY);
       }
       sec->relocations.push_back({expr, type, offset, addend, &sym});
       return;
     }

     // This handles a non PIC program call to function in a shared library. In
     // an ideal world, we could just report an error saying the relocation can
     // overflow at runtime. In the real world with glibc, crt1.o has a
Show All 21 Lines	if (!config->shared) {
// compiled without -fPIE/-fPIC and doesn't maintain ebx.		// compiled without -fPIE/-fPIC and doesn't maintain ebx.
// * If a library definition gets preempted to the executable, it will have		// * If a library definition gets preempted to the executable, it will have
// the wrong ebx value.		// the wrong ebx value.
if (sym.isFunc()) {		if (sym.isFunc()) {
if (config->pie && config->emachine == EM_386)		if (config->pie && config->emachine == EM_386)
errorOrWarn("symbol '" + toString(sym) +		errorOrWarn("symbol '" + toString(sym) +
"' cannot be preempted; recompile with -fPIE" +		"' cannot be preempted; recompile with -fPIE" +
getLocation(*sec, sym, offset));		getLocation(*sec, sym, offset));
sym.needsCopy = true;		sym.setFlags(NEEDS_COPY | NEEDS_PLT);
sym.needsPlt = true;
sec->relocations.push_back({expr, type, offset, addend, &sym});		sec->relocations.push_back({expr, type, offset, addend, &sym});
return;		return;
}		}
}		}

errorOrWarn("relocation " + toString(type) + " cannot be used against " +		errorOrWarn("relocation " + toString(type) + " cannot be used against " +
(sym.getName().empty() ? "local symbol"		(sym.getName().empty() ? "local symbol"
: "symbol '" + toString(sym) + "'") +		: "symbol '" + toString(sym) + "'") +
[37 lines not shown]	static unsigned handleTlsRelocation(RelType type, Symbol &sym,

if (config->emachine == EM_MIPS)		if (config->emachine == EM_MIPS)
return handleMipsTlsRelocation(type, sym, c, offset, addend, expr);		return handleMipsTlsRelocation(type, sym, c, offset, addend, expr);

if (oneof<R_AARCH64_TLSDESC_PAGE, R_TLSDESC, R_TLSDESC_CALL, R_TLSDESC_PC,		if (oneof<R_AARCH64_TLSDESC_PAGE, R_TLSDESC, R_TLSDESC_CALL, R_TLSDESC_PC,
R_TLSDESC_GOTPLT>(expr) &&		R_TLSDESC_GOTPLT>(expr) &&
config->shared) {		config->shared) {
if (expr != R_TLSDESC_CALL) {		if (expr != R_TLSDESC_CALL) {
sym.needsTlsDesc = true;		sym.setFlags(NEEDS_TLSDESC);
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
}		}
return 1;		return 1;
}		}

// ARM, Hexagon and RISC-V do not support GD/LD to IE/LE relaxation. For		// ARM, Hexagon and RISC-V do not support GD/LD to IE/LE relaxation. For
// PPC64, if the file has missing R_PPC64_TLSGD/R_PPC64_TLSLD, disable		// PPC64, if the file has missing R_PPC64_TLSGD/R_PPC64_TLSLD, disable
// relaxation as well.		// relaxation as well.
[21 lines not shown]	if (oneof<R_TLSLD_GOT, R_TLSLD_GOTPLT, R_TLSLD_PC, R_TLSLD_HINT>(
if (toExecRelax) {		if (toExecRelax) {
c.relocations.push_back(		c.relocations.push_back(
{target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE), type, offset,		{target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE), type, offset,
addend, &sym});		addend, &sym});
return target->getTlsGdRelaxSkip(type);		return target->getTlsGdRelaxSkip(type);
}		}
if (expr == R_TLSLD_HINT)		if (expr == R_TLSLD_HINT)
return 1;		return 1;
config->needsTlsLd = true;		config->needsTlsLd = true;
		ikudrin: If `GotSection::hasGotOffRel` and `GotPltSection::hasGotPltOffRel` are converted to `atomic<bool>`, the same should be done for `Configuration::needsTlsLd` because their usage pattern is similar.
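		Note on the comment above: the pattern under discussion is a "set once from any thread, read after scanning" flag. A minimal sketch of that pattern follows, assuming the flag is only read after all workers have been joined. `ScanState` and `scanInParallel` are hypothetical names for illustration and are not part of lld.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a "did any relocation need this?" bit such as
// Configuration::needsTlsLd. Not lld code.
struct ScanState {
  std::atomic<bool> needsTlsLd{false};
};

// Workers with index >= firstHit store `true`. Several workers may store
// concurrently; because the object is atomic this is not a data race.
// Relaxed ordering is enough here because the value is only read after
// join(), which already synchronizes with each worker's completion.
inline bool scanInParallel(unsigned numThreads, unsigned firstHit) {
  ScanState state;
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < numThreads; ++t)
    workers.emplace_back([&state, t, firstHit] {
      if (t >= firstHit)
        state.needsTlsLd.store(true, std::memory_order_relaxed);
    });
  for (std::thread &w : workers)
    w.join();
  return state.needsTlsLd.load(std::memory_order_relaxed);
}
```

		With a plain `bool`, concurrent writes of the same value would still formally be a data race in C++, which is the motivation for treating all flags with this usage pattern alike.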
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
return 1;		return 1;
}		}

// Local-Dynamic relocs can be relaxed to Local-Exec.		// Local-Dynamic relocs can be relaxed to Local-Exec.
if (expr == R_DTPREL) {		if (expr == R_DTPREL) {
if (toExecRelax)		if (toExecRelax)
expr = target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE);		expr = target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE);
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
return 1;		return 1;
}		}

// Local-Dynamic sequence where offset of tls variable relative to dynamic		// Local-Dynamic sequence where offset of tls variable relative to dynamic
// thread pointer is stored in the got. This cannot be relaxed to Local-Exec.		// thread pointer is stored in the got. This cannot be relaxed to Local-Exec.
if (expr == R_TLSLD_GOT_OFF) {		if (expr == R_TLSLD_GOT_OFF) {
sym.needsGotDtprel = true;		sym.setFlags(NEEDS_GOT_DTPREL);
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
return 1;		return 1;
}		}

if (oneof<R_AARCH64_TLSDESC_PAGE, R_TLSDESC, R_TLSDESC_CALL, R_TLSDESC_PC,		if (oneof<R_AARCH64_TLSDESC_PAGE, R_TLSDESC, R_TLSDESC_CALL, R_TLSDESC_PC,
R_TLSDESC_GOTPLT, R_TLSGD_GOT, R_TLSGD_GOTPLT, R_TLSGD_PC>(expr)) {		R_TLSDESC_GOTPLT, R_TLSGD_GOT, R_TLSGD_GOTPLT, R_TLSGD_PC>(expr)) {
if (!toExecRelax) {		if (!toExecRelax) {
sym.needsTlsGd = true;		sym.setFlags(NEEDS_TLSGD);
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
return 1;		return 1;
}		}

// Global-Dynamic relocs can be relaxed to Initial-Exec or Local-Exec		// Global-Dynamic relocs can be relaxed to Initial-Exec or Local-Exec
// depending on the symbol being locally defined or not.		// depending on the symbol being locally defined or not.
if (sym.isPreemptible) {		if (sym.isPreemptible) {
sym.needsTlsGdToIe = true;		sym.setFlags(NEEDS_TLSGD_TO_IE);
c.relocations.push_back(		c.relocations.push_back(
{target->adjustTlsExpr(type, R_RELAX_TLS_GD_TO_IE), type, offset,		{target->adjustTlsExpr(type, R_RELAX_TLS_GD_TO_IE), type, offset,
addend, &sym});		addend, &sym});
} else {		} else {
c.relocations.push_back(		c.relocations.push_back(
{target->adjustTlsExpr(type, R_RELAX_TLS_GD_TO_LE), type, offset,		{target->adjustTlsExpr(type, R_RELAX_TLS_GD_TO_LE), type, offset,
addend, &sym});		addend, &sym});
}		}
return target->getTlsGdRelaxSkip(type);		return target->getTlsGdRelaxSkip(type);
}		}

if (oneof<R_GOT, R_GOTPLT, R_GOT_PC, R_AARCH64_GOT_PAGE_PC, R_GOT_OFF,		if (oneof<R_GOT, R_GOTPLT, R_GOT_PC, R_AARCH64_GOT_PAGE_PC, R_GOT_OFF,
R_TLSIE_HINT>(expr)) {		R_TLSIE_HINT>(expr)) {
// Initial-Exec relocs can be relaxed to Local-Exec if the symbol is locally		// Initial-Exec relocs can be relaxed to Local-Exec if the symbol is locally
// defined.		// defined.
if (toExecRelax && isLocalInExecutable) {		if (toExecRelax && isLocalInExecutable) {
c.relocations.push_back(		c.relocations.push_back(
{R_RELAX_TLS_IE_TO_LE, type, offset, addend, &sym});		{R_RELAX_TLS_IE_TO_LE, type, offset, addend, &sym});
} else if (expr != R_TLSIE_HINT) {		} else if (expr != R_TLSIE_HINT) {
sym.needsTlsIe = true;		sym.setFlags(NEEDS_TLSIE);
// R_GOT needs a relative relocation for PIC on i386 and Hexagon.		// R_GOT needs a relative relocation for PIC on i386 and Hexagon.
if (expr == R_GOT && config->isPic && !target->usesOnlyLowPageBits(type))		if (expr == R_GOT && config->isPic && !target->usesOnlyLowPageBits(type))
addRelativeReloc(c, offset, sym, addend, expr, type);		addRelativeReloc<true>(c, offset, sym, addend, expr, type);
		ikudrin: Shouldn't `relocMutex` be locked before this call?
else		else
c.relocations.push_back({expr, type, offset, addend, &sym});		c.relocations.push_back({expr, type, offset, addend, &sym});
}		}
return 1;		return 1;
}		}

return 0;		return 0;
}		}
[68 lines not shown]	template <class ELFT, class RelTy> void RelocationScanner::scanOne(RelTy *&i) {
}		}

// If the relocation does not emit a GOT or GOTPLT entry but its computation		// If the relocation does not emit a GOT or GOTPLT entry but its computation
// uses their addresses, we need GOT or GOTPLT to be created.		// uses their addresses, we need GOT or GOTPLT to be created.
//		//
// The 5 types that relative GOTPLT are all x86 and x86-64 specific.		// The 5 types that relative GOTPLT are all x86 and x86-64 specific.
if (oneof<R_GOTPLTONLY_PC, R_GOTPLTREL, R_GOTPLT, R_PLT_GOTPLT,		if (oneof<R_GOTPLTONLY_PC, R_GOTPLTREL, R_GOTPLT, R_PLT_GOTPLT,
R_TLSDESC_GOTPLT, R_TLSGD_GOTPLT>(expr)) {		R_TLSDESC_GOTPLT, R_TLSGD_GOTPLT>(expr)) {
in.gotPlt->hasGotPltOffRel = true;		in.gotPlt->hasGotPltOffRel.store(true, std::memory_order_relaxed);
} else if (oneof<R_GOTONLY_PC, R_GOTREL, R_PPC32_PLTREL, R_PPC64_TOCBASE,		} else if (oneof<R_GOTONLY_PC, R_GOTREL, R_PPC32_PLTREL, R_PPC64_TOCBASE,
R_PPC64_RELAX_TOC>(expr)) {		R_PPC64_RELAX_TOC>(expr)) {
in.got->hasGotOffRel = true;		in.got->hasGotOffRel.store(true, std::memory_order_relaxed);
}		}

// Process TLS relocations, including relaxing TLS relocations. Note that		// Process TLS relocations, including relaxing TLS relocations. Note that
// R_TPREL and R_TPREL_NEG relocations are resolved in processAux.		// R_TPREL and R_TPREL_NEG relocations are resolved in processAux.
if (expr == R_TPREL || expr == R_TPREL_NEG) {		if (expr == R_TPREL || expr == R_TPREL_NEG) {
if (config->shared) {		if (config->shared) {
errorOrWarn("relocation " + toString(type) + " against " + toString(sym) +		errorOrWarn("relocation " + toString(type) + " against " + toString(sym) +
" cannot be used with -shared" +		" cannot be used with -shared" +
[31 lines not shown]	if (!sym.isPreemptible && (!isIfunc || config->zIfuncNoplt)) {
} else if (!isAbsoluteValue(sym)) {		} else if (!isAbsoluteValue(sym)) {
expr = target.adjustGotPcExpr(type, addend, relocatedAddr);		expr = target.adjustGotPcExpr(type, addend, relocatedAddr);
}		}
}		}

// We were asked not to generate PLT entries for ifuncs. Instead, pass the		// We were asked not to generate PLT entries for ifuncs. Instead, pass the
// direct relocation on through.		// direct relocation on through.
if (LLVM_UNLIKELY(isIfunc) && config->zIfuncNoplt) {		if (LLVM_UNLIKELY(isIfunc) && config->zIfuncNoplt) {
		std::lock_guard<std::mutex> lock(relocMutex);
sym.exportDynamic = true;		sym.exportDynamic = true;
mainPart->relaDyn->addSymbolReloc(type, *sec, offset, sym, addend, type);		mainPart->relaDyn->addSymbolReloc(type, *sec, offset, sym, addend, type);
return;		return;
}		}

if (needsGot(expr)) {		if (needsGot(expr)) {
if (config->emachine == EM_MIPS) {		if (config->emachine == EM_MIPS) {
// MIPS ABI has special rules to process GOT entries and doesn't		// MIPS ABI has special rules to process GOT entries and doesn't
// require relocation entries for them. A special case is TLS		// require relocation entries for them. A special case is TLS
// relocations. In that case dynamic loader applies dynamic		// relocations. In that case dynamic loader applies dynamic
// relocations to initialize TLS GOT entries.		// relocations to initialize TLS GOT entries.
// See "Global Offset Table" in Chapter 5 in the following document		// See "Global Offset Table" in Chapter 5 in the following document
// for detailed description:		// for detailed description:
// ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf		// ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
in.mipsGot->addEntry(*sec->file, sym, addend, expr);		in.mipsGot->addEntry(*sec->file, sym, addend, expr);
} else {		} else {
sym.needsGot = true;		sym.setFlags(NEEDS_GOT);
}		}
} else if (needsPlt(expr)) {		} else if (needsPlt(expr)) {
sym.needsPlt = true;		sym.setFlags(NEEDS_PLT);
} else if (LLVM_UNLIKELY(isIfunc)) {		} else if (LLVM_UNLIKELY(isIfunc)) {
sym.hasDirectReloc = true;		sym.setFlags(HAS_DIRECT_RELOC);
}		}

processAux(expr, type, offset, sym, addend);		processAux(expr, type, offset, sym, addend);
}		}

// R_PPC64_TLSGD/R_PPC64_TLSLD is required to mark `bl __tls_get_addr` for		// R_PPC64_TLSGD/R_PPC64_TLSLD is required to mark `bl __tls_get_addr` for
// General Dynamic/Local Dynamic code sequences. If a GD/LD GOT relocation is		// General Dynamic/Local Dynamic code sequences. If a GD/LD GOT relocation is
// found but no R_PPC64_TLSGD/R_PPC64_TLSLD is seen, we assume that the		// found but no R_PPC64_TLSGD/R_PPC64_TLSLD is seen, we assume that the
[70 lines not shown]	else
scan<ELFT>(rels.relas);		scan<ELFT>(rels.relas);
}		}

template <class ELFT> void elf::scanRelocations() {		template <class ELFT> void elf::scanRelocations() {
// Scan all relocations. Each relocation goes through a series of tests to		// Scan all relocations. Each relocation goes through a series of tests to
// determine if it needs special treatment, such as creating GOT, PLT,		// determine if it needs special treatment, such as creating GOT, PLT,
// copy relocations, etc. Note that relocations for non-alloc sections are		// copy relocations, etc. Note that relocations for non-alloc sections are
// directly processed by InputSection::relocateNonAlloc.		// directly processed by InputSection::relocateNonAlloc.

		// Deterministic parallelism needs sorting relocations which is unsuitable
		// for -z nocombreloc. AndroidPackedRelocationSection does not support
		MaskRay: I'll remove `AndroidPackedRelocationSection does not support parallelism.` It works with deterministic parallelism.
		// parallelism. MIPS and PPC64 use global states which are not suitable for
		// parallelism.
		bool serial = !config->zCombreloc || config->emachine == EM_MIPS ||
		config->emachine == EM_PPC64;
		parallel::TaskGroup tg;
		for (ELFFileBase *f : ctx->objectFiles) {
		auto fn = [f]() {
		RelocationScanner scanner;
		for (InputSectionBase *s : f->getSections()) {
		if (s && s->kind() == SectionBase::Regular && s->isLive() &&
		(s->flags & SHF_ALLOC) &&
		!(s->type == SHT_ARM_EXIDX && config->emachine == EM_ARM))
		scanner.template scanSection<ELFT>(*s);
		}
		};
		if (serial)
		andrewng: I wonder if it might be worthwhile using the previous code for the serial case? Although, it probably doesn't make a big difference to performance.
		MaskRay: Use which piece of code for the serial case?
		andrewng: I was thinking this: `for (InputSectionBase *sec : inputSections) if (sec->isLive() && (sec->flags & SHF_ALLOC)) scanner.template scanSection<ELFT>(*sec);` But on the other hand, in terms of future development and maintenance, it's probably better to use as much of the same code for both "paths", even if there's a minor performance penalty for the serial one.
		MaskRay: Yes, using the same code for both paths is better for maintenance.
		andrewng: Yes, I think I agree. If it only affected the single threaded case, I wouldn't have mentioned it. But as there are specific configurations that are limited to serial I thought that it might be worth considering.
		MaskRay: Add the comment before the `tg.execute([] {` line?
		+ // Both the main thread and thread pool index 0 use threadIndex==0. Be
		+ // careful that they don't concurrently run scanSections. When serial is
		+ // true, fn() has finished at this point, so running execute is safe
		andrewng: Yes, I think it would be worth adding the comment for clarity.
		fn();
		else
		tg.execute(fn);
		}

RelocationScanner scanner;		RelocationScanner scanner;
for (InputSectionBase *sec : inputSections)
if (sec->isLive() && (sec->flags & SHF_ALLOC))
scanner.template scanSection<ELFT>(*sec);
for (Partition &part : partitions) {		for (Partition &part : partitions) {
for (EhInputSection *sec : part.ehFrame->sections)		for (EhInputSection *sec : part.ehFrame->sections)
scanner.template scanSection<ELFT>(*sec);		scanner.template scanSection<ELFT>(*sec);
if (part.armExidx && part.armExidx->isLive())		if (part.armExidx && part.armExidx->isLive())
for (InputSection *sec : part.armExidx->exidxSections)		for (InputSection *sec : part.armExidx->exidxSections)
scanner.template scanSection<ELFT>(*sec);		scanner.template scanSection<ELFT>(*sec);
		andrewng: This is running on the main thread. Is there a chance that this might clash with thread 0 of the task pool?
		MaskRay: Thanks for catching this. The main thread doing heavy work will contend with the thread pool. Changed to use `tg.execute`.
		andrewng: I think the previous code could have actually caused a threading issue, i.e. concurrent updates to the `0` indexed relocation vector. This will ensure that can't happen. The only minor thing is it looks a little odd that the "serial" case uses `tg` but I guess it is still serial.
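		Note on the thread above: the hazard is easier to see in a reduced form. The sketch below keeps one output vector per worker index so that appends need no mutex, which is roughly what the summary describes for `relocsVec`. It is only safe while no two concurrently running workers share an index, which is exactly the property the main thread and pool thread 0 could violate. `ShardedRelocs` and `scanSharded` are hypothetical names, and plain `std::thread` stands in for the LLVM thread pool.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch, not lld code: one shard per worker index.
struct ShardedRelocs {
  std::vector<std::vector<int>> shards;
  explicit ShardedRelocs(std::size_t n) : shards(n) {}

  // Lock-free append. Correct only if each index is used by at most one
  // thread at a time; two threads sharing index 0 would race here.
  void add(std::size_t threadIdx, int reloc) {
    assert(threadIdx < shards.size());
    shards[threadIdx].push_back(reloc);
  }

  // Concatenate shards in index order once all workers have finished.
  std::vector<int> merge() const {
    std::vector<int> out;
    for (const std::vector<int> &s : shards)
      out.insert(out.end(), s.begin(), s.end());
    return out;
  }
};

// Each worker t appends the values t*perThread .. t*perThread+perThread-1.
inline std::vector<int> scanSharded(std::size_t numThreads, int perThread) {
  ShardedRelocs relocs(numThreads);
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < numThreads; ++t)
    workers.emplace_back([&relocs, t, perThread] {
      for (int i = 0; i < perThread; ++i)
        relocs.add(t, static_cast<int>(t) * perThread + i);
    });
  for (std::thread &w : workers)
    w.join();
  return relocs.merge();
}
```

		In this toy the merged order is deterministic because each worker owns a fixed range. In a real thread pool the assignment of work to indices varies between runs, which is presumably why the patch's code comment says deterministic output needs the relocations to be sorted.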
}		}
}		}

static bool handleNonPreemptibleIfunc(Symbol &sym) {		static bool handleNonPreemptibleIfunc(Symbol &sym, uint16_t flags) {
		ikudrin: `uint8_t` -> `uint16_t`; not that it changes anything because the only flag that exceeds the range is `NEEDS_TLSIE` which is not used here, but still.
		MaskRay: Thanks for catching this!
// Handle a reference to a non-preemptible ifunc. These are special in a		// Handle a reference to a non-preemptible ifunc. These are special in a
// few ways:		// few ways:
//		//
// - Unlike most non-preemptible symbols, non-preemptible ifuncs do not have		// - Unlike most non-preemptible symbols, non-preemptible ifuncs do not have
// a fixed value. But assuming that all references to the ifunc are		// a fixed value. But assuming that all references to the ifunc are
// GOT-generating or PLT-generating, the handling of an ifunc is		// GOT-generating or PLT-generating, the handling of an ifunc is
// relatively straightforward. We create a PLT entry in Iplt, which is		// relatively straightforward. We create a PLT entry in Iplt, which is
// usually at the end of .plt, which makes an indirect call using a		// usually at the end of .plt, which makes an indirect call using a
[27 lines not shown]	static bool handleNonPreemptibleIfunc(Symbol &sym, uint16_t flags) {
// exception to the general rule that a statically linked executable does		// exception to the general rule that a statically linked executable does
// not require any form of dynamic relocation. To handle these relocations		// not require any form of dynamic relocation. To handle these relocations
// correctly, the IRELATIVE relocations are stored in an array which a		// correctly, the IRELATIVE relocations are stored in an array which a
// statically linked executable's startup code must enumerate using the		// statically linked executable's startup code must enumerate using the
// linker-defined symbols __rela?_iplt_{start,end}.		// linker-defined symbols __rela?_iplt_{start,end}.
if (!sym.isGnuIFunc() || sym.isPreemptible || config->zIfuncNoplt)		if (!sym.isGnuIFunc() || sym.isPreemptible || config->zIfuncNoplt)
return false;		return false;
// Skip unreferenced non-preemptible ifunc.		// Skip unreferenced non-preemptible ifunc.
if (!(sym.needsGot || sym.needsPlt || sym.hasDirectReloc))		if (!(flags & (NEEDS_GOT | NEEDS_PLT | HAS_DIRECT_RELOC)))
return true;		return true;

sym.isInIplt = true;		sym.isInIplt = true;

// Create an Iplt and the associated IRELATIVE relocation pointing to the		// Create an Iplt and the associated IRELATIVE relocation pointing to the
// original section/value pairs. For non-GOT non-PLT relocation case below, we		// original section/value pairs. For non-GOT non-PLT relocation case below, we
// may alter section/value, so create a copy of the symbol to make		// may alter section/value, so create a copy of the symbol to make
// section/value fixed.		// section/value fixed.
auto *directSym = makeDefined(cast<Defined>(sym));		auto *directSym = makeDefined(cast<Defined>(sym));
directSym->allocateAux();		directSym->allocateAux();
addPltEntry(in.iplt, in.igotPlt, *in.relaIplt, target->iRelativeRel,		addPltEntry(in.iplt, in.igotPlt, *in.relaIplt, target->iRelativeRel,
*directSym);		*directSym);
sym.allocateAux();		sym.allocateAux();
symAux.back().pltIdx = symAux[directSym->auxIdx].pltIdx;		symAux.back().pltIdx = symAux[directSym->auxIdx].pltIdx;

if (sym.hasDirectReloc) {		if (flags & HAS_DIRECT_RELOC) {
// Change the value to the IPLT and redirect all references to it.		// Change the value to the IPLT and redirect all references to it.
auto &d = cast<Defined>(sym);		auto &d = cast<Defined>(sym);
d.section = in.iplt.get();		d.section = in.iplt.get();
d.value = d.getPltIdx() * target->ipltEntrySize;		d.value = d.getPltIdx() * target->ipltEntrySize;
d.size = 0;		d.size = 0;
// It's important to set the symbol type here so that dynamic loaders		// It's important to set the symbol type here so that dynamic loaders
// don't try to call the PLT as if it were an ifunc resolver.		// don't try to call the PLT as if it were an ifunc resolver.
d.type = STT_FUNC;		d.type = STT_FUNC;

if (sym.needsGot)		if (flags & NEEDS_GOT)
addGotEntry(sym);		addGotEntry(sym);
} else if (sym.needsGot) {		} else if (flags & NEEDS_GOT) {
// Redirect GOT accesses to point to the Igot.		// Redirect GOT accesses to point to the Igot.
sym.gotInIgot = true;		sym.gotInIgot = true;
}		}
return true;		return true;
}		}

void elf::postScanRelocations() {		void elf::postScanRelocations() {
auto fn = [](Symbol &sym) {		auto fn = [](Symbol &sym) {
if (handleNonPreemptibleIfunc(sym))		auto flags = sym.flags.load(std::memory_order_relaxed);
		if (handleNonPreemptibleIfunc(sym, flags))
return;		return;
if (!sym.needsDynReloc())		if (!sym.needsDynReloc())
return;		return;
sym.allocateAux();		sym.allocateAux();

if (sym.needsGot)		if (flags & NEEDS_GOT)
addGotEntry(sym);		addGotEntry(sym);
if (sym.needsPlt)		if (flags & NEEDS_PLT)
addPltEntry(in.plt, in.gotPlt, *in.relaPlt, target->pltRel, sym);		addPltEntry(in.plt, in.gotPlt, *in.relaPlt, target->pltRel, sym);
if (sym.needsCopy) {		if (flags & NEEDS_COPY) {
if (sym.isObject()) {		if (sym.isObject()) {
invokeELFT(addCopyRelSymbol, cast<SharedSymbol>(sym));		invokeELFT(addCopyRelSymbol, cast<SharedSymbol>(sym));
// needsCopy is cleared for sym and its aliases so that in later		// NEEDS_COPY is cleared for sym and its aliases so that in
// iterations aliases won't cause redundant copies.		// later iterations aliases won't cause redundant copies.
assert(!sym.needsCopy);		assert(!sym.hasFlag(NEEDS_COPY));
} else {		} else {
assert(sym.isFunc() && sym.needsPlt);		assert(sym.isFunc() && sym.hasFlag(NEEDS_PLT));
if (!sym.isDefined()) {		if (!sym.isDefined()) {
replaceWithDefined(sym, *in.plt,		replaceWithDefined(sym, *in.plt,
target->pltHeaderSize +		target->pltHeaderSize +
target->pltEntrySize * sym.getPltIdx(),		target->pltEntrySize * sym.getPltIdx(),
0);		0);
sym.needsCopy = true;		sym.setFlags(NEEDS_COPY);
if (config->emachine == EM_PPC) {		if (config->emachine == EM_PPC) {
// PPC32 canonical PLT entries are at the beginning of .glink		// PPC32 canonical PLT entries are at the beginning of .glink
cast<Defined>(sym).value = in.plt->headerSize;		cast<Defined>(sym).value = in.plt->headerSize;
in.plt->headerSize += 16;		in.plt->headerSize += 16;
cast<PPC32GlinkSection>(*in.plt).canonical_plts.push_back(&sym);		cast<PPC32GlinkSection>(*in.plt).canonical_plts.push_back(&sym);
}		}
}		}
}		}
}		}

if (!sym.isTls())		if (!sym.isTls())
return;		return;
bool isLocalInExecutable = !sym.isPreemptible && !config->shared;		bool isLocalInExecutable = !sym.isPreemptible && !config->shared;

if (sym.needsTlsDesc) {		if (flags & NEEDS_TLSDESC) {
in.got->addTlsDescEntry(sym);		in.got->addTlsDescEntry(sym);
mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(		mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(
target->tlsDescRel, *in.got, in.got->getTlsDescOffset(sym), sym,		target->tlsDescRel, *in.got, in.got->getTlsDescOffset(sym), sym,
target->tlsDescRel);		target->tlsDescRel);
}		}
if (sym.needsTlsGd) {		if (flags & NEEDS_TLSGD) {
in.got->addDynTlsEntry(sym);		in.got->addDynTlsEntry(sym);
uint64_t off = in.got->getGlobalDynOffset(sym);		uint64_t off = in.got->getGlobalDynOffset(sym);
if (isLocalInExecutable)		if (isLocalInExecutable)
// Write one to the GOT slot.		// Write one to the GOT slot.
in.got->relocations.push_back(		in.got->relocations.push_back(
{R_ADDEND, target->symbolicRel, off, 1, &sym});		{R_ADDEND, target->symbolicRel, off, 1, &sym});
else		else
mainPart->relaDyn->addSymbolReloc(target->tlsModuleIndexRel, *in.got,		mainPart->relaDyn->addSymbolReloc(target->tlsModuleIndexRel, *in.got,
off, sym);		off, sym);

// If the symbol is preemptible we need the dynamic linker to write		// If the symbol is preemptible we need the dynamic linker to write
// the offset too.		// the offset too.
uint64_t offsetOff = off + config->wordsize;		uint64_t offsetOff = off + config->wordsize;
if (sym.isPreemptible)		if (sym.isPreemptible)
mainPart->relaDyn->addSymbolReloc(target->tlsOffsetRel, *in.got,		mainPart->relaDyn->addSymbolReloc(target->tlsOffsetRel, *in.got,
offsetOff, sym);		offsetOff, sym);
else		else
in.got->relocations.push_back(		in.got->relocations.push_back(
{R_ABS, target->tlsOffsetRel, offsetOff, 0, &sym});		{R_ABS, target->tlsOffsetRel, offsetOff, 0, &sym});
}		}
if (sym.needsTlsGdToIe) {		if (flags & NEEDS_TLSGD_TO_IE) {
in.got->addEntry(sym);		in.got->addEntry(sym);
mainPart->relaDyn->addSymbolReloc(target->tlsGotRel, *in.got,		mainPart->relaDyn->addSymbolReloc(target->tlsGotRel, *in.got,
sym.getGotOffset(), sym);		sym.getGotOffset(), sym);
}		}
if (sym.needsGotDtprel) {		if (flags & NEEDS_GOT_DTPREL) {
in.got->addEntry(sym);		in.got->addEntry(sym);
in.got->relocations.push_back(		in.got->relocations.push_back(
{R_ABS, target->tlsOffsetRel, sym.getGotOffset(), 0, &sym});		{R_ABS, target->tlsOffsetRel, sym.getGotOffset(), 0, &sym});
}		}

if (sym.needsTlsIe && !sym.needsTlsGdToIe)		if ((flags & NEEDS_TLSIE) && !(flags & NEEDS_TLSGD_TO_IE))
addTpOffsetGotEntry(sym);		addTpOffsetGotEntry(sym);
};		};

if (config->needsTlsLd && in.got->addTlsIndex()) {		if (config->needsTlsLd && in.got->addTlsIndex()) {
static Undefined dummy(nullptr, "", STB_LOCAL, 0, 0);		static Undefined dummy(nullptr, "", STB_LOCAL, 0, 0);
if (config->shared)		if (config->shared)
mainPart->relaDyn->addReloc(		mainPart->relaDyn->addReloc(
{target->tlsModuleIndexRel, in.got.get(), in.got->getTlsIndexOff()});		{target->tlsModuleIndexRel, in.got.get(), in.got->getTlsIndexOff()});
[545 lines not shown]

lld/ELF/Symbols.h

[33 lines not shown]
class SectionBase;		class SectionBase;
class InputSectionBase;		class InputSectionBase;
class SharedSymbol;		class SharedSymbol;
class Symbol;		class Symbol;
class Undefined;		class Undefined;
class LazyObject;		class LazyObject;
class InputFile;		class InputFile;

		enum {
		NEEDS_GOT = 1 << 0,
		NEEDS_PLT = 1 << 1,
		HAS_DIRECT_RELOC = 1 << 2,
		// True if this symbol needs a canonical PLT entry, or (during
		// postScanRelocations) a copy relocation.
		NEEDS_COPY = 1 << 3,
		NEEDS_TLSDESC = 1 << 4,
		NEEDS_TLSGD = 1 << 5,
		NEEDS_TLSGD_TO_IE = 1 << 6,
		NEEDS_GOT_DTPREL = 1 << 7,
		NEEDS_TLSIE = 1 << 8,
		};

// Some index properties of a symbol are stored separately in this auxiliary		// Some index properties of a symbol are stored separately in this auxiliary
// struct to decrease sizeof(SymbolUnion) in the majority of cases.		// struct to decrease sizeof(SymbolUnion) in the majority of cases.
struct SymbolAux {		struct SymbolAux {
uint32_t gotIdx = -1;		uint32_t gotIdx = -1;
uint32_t pltIdx = -1;		uint32_t pltIdx = -1;
uint32_t tlsDescIdx = -1;		uint32_t tlsDescIdx = -1;
uint32_t tlsGdIdx = -1;		uint32_t tlsGdIdx = -1;
};		};
[12 lines not shown]	enum Kind {
LazyObjectKind,		LazyObjectKind,
};		};

Kind kind() const { return static_cast<Kind>(symbolKind); }		Kind kind() const { return static_cast<Kind>(symbolKind); }

// The file from which this symbol was created.		// The file from which this symbol was created.
InputFile *file;		InputFile *file;

		// The default copy constructor is deleted due to atomic flags. Define one for
		// places where no atomic is needed.
		Symbol(const Symbol &o) { memcpy(this, &o, sizeof(o)); }

protected:		protected:
const char *nameData;		const char *nameData;
// 32-bit size saves space.		// 32-bit size saves space.
uint32_t nameSize;		uint32_t nameSize;

public:		public:
// The next three fields have the same meaning as the ELF symbol attributes.		// The next three fields have the same meaning as the ELF symbol attributes.
// type and binding are placed in this order to optimize generating st_info,		// type and binding are placed in this order to optimize generating st_info,
[169 lines not shown]	protected:
Symbol(Kind k, InputFile *file, StringRef name, uint8_t binding,		Symbol(Kind k, InputFile *file, StringRef name, uint8_t binding,
uint8_t stOther, uint8_t type)		uint8_t stOther, uint8_t type)
: file(file), nameData(name.data()), nameSize(name.size()), type(type),		: file(file), nameData(name.data()), nameSize(name.size()), type(type),
binding(binding), stOther(stOther), symbolKind(k), isPreemptible(false),		binding(binding), stOther(stOther), symbolKind(k), isPreemptible(false),
isUsedInRegularObj(false), used(false), exportDynamic(false),		isUsedInRegularObj(false), used(false), exportDynamic(false),
inDynamicList(false), referenced(false), referencedAfterWrap(false),		inDynamicList(false), referenced(false), referencedAfterWrap(false),
traced(false), hasVersionSuffix(false), isInIplt(false),		traced(false), hasVersionSuffix(false), isInIplt(false),
gotInIgot(false), folded(false), needsTocRestore(false),		gotInIgot(false), folded(false), needsTocRestore(false),
scriptDefined(false), dsoProtected(false), needsCopy(false),		scriptDefined(false), dsoProtected(false) {}
needsGot(false), needsPlt(false), needsTlsDesc(false),
needsTlsGd(false), needsTlsGdToIe(false), needsGotDtprel(false),
needsTlsIe(false), hasDirectReloc(false) {}

public:		public:
// True if this symbol is in the Iplt sub-section of the Plt and the Igot		// True if this symbol is in the Iplt sub-section of the Plt and the Igot
// sub-section of the .got.plt or .got.		// sub-section of the .got.plt or .got.
uint8_t isInIplt : 1;		uint8_t isInIplt : 1;

// True if this symbol needs a GOT entry and its GOT entry is actually in		// True if this symbol needs a GOT entry and its GOT entry is actually in
// Igot. This will be true only for certain non-preemptible ifuncs.		// Igot. This will be true only for certain non-preemptible ifuncs.
[9 lines not shown]	public:
// True if this symbol is defined by a symbol assignment or wrapped by --wrap.		// True if this symbol is defined by a symbol assignment or wrapped by --wrap.
//		//
// LTO shouldn't inline the symbol because it doesn't know the final content		// LTO shouldn't inline the symbol because it doesn't know the final content
// of the symbol.		// of the symbol.
uint8_t scriptDefined : 1;		uint8_t scriptDefined : 1;

// True if defined in a DSO as protected visibility.		// True if defined in a DSO as protected visibility.
uint8_t dsoProtected : 1;		uint8_t dsoProtected : 1;

// True if this symbol needs a canonical PLT entry, or (during
// postScanRelocations) a copy relocation.
uint8_t needsCopy : 1;

// Temporary flags used to communicate which symbol entries need PLT and GOT		// Temporary flags used to communicate which symbol entries need PLT and GOT
		ikudrin: Why is not `needsTlsGdToIe` moved under `atomic` like `needsTlsGd` and alike?
		MaskRay: All the 8 bits of `std::atomic<uint8_t>` have been used. We need one not in atomic if we want to keep the size of `SymbolUnion` unchanged.
		ikudrin: Does that mean that some flags in the atomic do not really need to be handled as such, or that this flag is left outside despite it can be potentially updated concurrently, but there is no space for it in `flags`? In any case, that is worth documenting, at least.
		MaskRay: I have replaced `Symbol::visibility` with `Symbol::stOther` and atomic<uint16_t> is fine now, but I suspect 16-bit atomic operations are not efficient on common architectures.
// entries during postScanRelocations();		// entries during postScanRelocations();
uint8_t needsGot : 1;		std::atomic<uint16_t> flags = 0;
uint8_t needsPlt : 1;
uint8_t needsTlsDesc : 1;
uint8_t needsTlsGd : 1;
uint8_t needsTlsGdToIe : 1;
uint8_t needsGotDtprel : 1;
uint8_t needsTlsIe : 1;
uint8_t hasDirectReloc : 1;

// A symAux index used to access GOT/PLT entry indexes. This is allocated in		// A symAux index used to access GOT/PLT entry indexes. This is allocated in
// postScanRelocations().		// postScanRelocations().
uint32_t auxIdx = -1;		uint32_t auxIdx = -1;
uint32_t dynsymIndex = 0;		uint32_t dynsymIndex = 0;

// This field is a index to the symbol's version definition.		// This field is a index to the symbol's version definition.
uint16_t verdefIndex = -1;		uint16_t verdefIndex = -1;

// Version definition index.		// Version definition index.
uint16_t versionId;		uint16_t versionId;

		void setFlags(uint16_t bits) {
		ikudrin: You use it with two flags at least once, maybe call it `setFlags`?
		flags.fetch_or(bits, std::memory_order_relaxed);
		}
		bool hasFlag(uint16_t bit) const {
		andrewng: The argument name implies a single bit but perhaps add an assert, e.g. `assert((bit & (bit - 1)) == 0)`?
		return flags.load(std::memory_order_relaxed) & bit;
		}

bool needsDynReloc() const {		bool needsDynReloc() const {
return needsCopy || needsGot || needsPlt || needsTlsDesc || needsTlsGd ||		return flags.load(std::memory_order_relaxed) &
needsTlsGdToIe || needsGotDtprel || needsTlsIe;		(NEEDS_COPY | NEEDS_GOT | NEEDS_PLT | NEEDS_TLSDESC | NEEDS_TLSGD |
		NEEDS_TLSGD_TO_IE | NEEDS_GOT_DTPREL | NEEDS_TLSIE);
}		}
void allocateAux() {		void allocateAux() {
assert(auxIdx == uint32_t(-1));		assert(auxIdx == uint32_t(-1));
auxIdx = symAux.size();		auxIdx = symAux.size();
symAux.emplace_back();		symAux.emplace_back();
}		}

bool isSection() const { return type == llvm::ELF::STT_SECTION; }		bool isSection() const { return type == llvm::ELF::STT_SECTION; }
[229 lines not shown]
}		}

template <typename... T> Defined *makeDefined(T &&...args) {		template <typename... T> Defined *makeDefined(T &&...args) {
return new (reinterpret_cast<Defined *>(		return new (reinterpret_cast<Defined *>(
getSpecificAllocSingleton<SymbolUnion>().Allocate()))		getSpecificAllocSingleton<SymbolUnion>().Allocate()))
Defined(std::forward<T>(args)...);		Defined(std::forward<T>(args)...);
}		}

		inline Defined *makeDefined(Defined &o) {
		auto *ret = reinterpret_cast<Defined *>(
		getSpecificAllocSingleton<SymbolUnion>().Allocate());
		memcpy(ret, &o, sizeof(o));
		return ret;
		}

void reportDuplicate(const Symbol &sym, const InputFile *newFile,		void reportDuplicate(const Symbol &sym, const InputFile *newFile,
InputSectionBase *errSec, uint64_t errOffset);		InputSectionBase *errSec, uint64_t errOffset);
void maybeWarnUnorderableSymbol(const Symbol *sym);		void maybeWarnUnorderableSymbol(const Symbol *sym);
bool computeIsPreemptible(const Symbol &sym);		bool computeIsPreemptible(const Symbol &sym);

} // namespace elf		} // namespace elf
} // namespace lld		} // namespace lld

#endif		#endif
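
Editorial sketch (not part of the patch): the `setFlags`/`hasFlag` pattern discussed in the inline comments above can be illustrated in isolation as follows. The `SymbolSketch` type and the flag values are hypothetical stand-ins; the real `NEEDS_*` enumerators are defined elsewhere in the patch and are not shown in this excerpt. The single-bit assert follows andrewng's suggestion, with an added non-zero check that is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration only.
enum : uint16_t {
  NEEDS_GOT = 1 << 0,
  NEEDS_PLT = 1 << 1,
  NEEDS_COPY = 1 << 2,
};

struct SymbolSketch {
  std::atomic<uint16_t> flags{0};

  // Several threads may set bits on the same symbol concurrently.
  // fetch_or makes each update atomic; relaxed ordering is used because
  // the bits are only read back after the parallel phase has finished.
  void setFlags(uint16_t bits) {
    flags.fetch_or(bits, std::memory_order_relaxed);
  }

  // Tests exactly one bit; the assert rejects zero and multi-bit masks.
  bool hasFlag(uint16_t bit) const {
    assert(bit != 0 && (bit & (bit - 1)) == 0 && "expected a single bit");
    return flags.load(std::memory_order_relaxed) & bit;
  }

  // True if any of the listed bits is set.
  bool needsDynReloc() const {
    return flags.load(std::memory_order_relaxed) &
           (NEEDS_GOT | NEEDS_PLT | NEEDS_COPY);
  }
};
```

Packing the bits into one atomic word avoids a per-symbol mutex. Plain `uint8_t x : 1` bit-fields that share a byte cannot be updated safely from several threads, which is presumably why the patch replaces them.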

lld/ELF/Symbols.cpp

[116 lines not shown]	case Symbol::DefinedKind: {
// symbols has the `STO_MIPS_MICROMIPS` flag in the `st_other`		// symbols has the `STO_MIPS_MICROMIPS` flag in the `st_other`
// field. Unfortunately, the `MIPS::relocate()` method has		// field. Unfortunately, the `MIPS::relocate()` method has
// a symbol value only. To pass type of the symbol (regular/microMIPS)		// a symbol value only. To pass type of the symbol (regular/microMIPS)
// to that routine as well as other places where we write		// to that routine as well as other places where we write
// a symbol value as-is (.dynamic section, `Elf_Ehdr::e_entry`		// a symbol value as-is (.dynamic section, `Elf_Ehdr::e_entry`
// field etc) do the same trick as compiler uses to mark microMIPS		// field etc) do the same trick as compiler uses to mark microMIPS
// for CPU - set the less-significant bit.		// for CPU - set the less-significant bit.
if (config->emachine == EM_MIPS && isMicroMips() &&		if (config->emachine == EM_MIPS && isMicroMips() &&
((sym.stOther & STO_MIPS_MICROMIPS) || sym.needsCopy))		((sym.stOther & STO_MIPS_MICROMIPS) || sym.hasFlag(NEEDS_COPY)))
va |= 1;		va |= 1;

if (d.isTls() && !config->relocatable) {		if (d.isTls() && !config->relocatable) {
// Use the address of the TLS segment's first section rather than the		// Use the address of the TLS segment's first section rather than the
// segment's address, because segment addresses aren't initialized until		// segment's address, because segment addresses aren't initialized until
// after sections are finalized. (e.g. Measuring the size of .rela.dyn		// after sections are finalized. (e.g. Measuring the size of .rela.dyn
// for Android relocation packing requires knowing TLS symbol addresses		// for Android relocation packing requires knowing TLS symbol addresses
// during section finalization.)		// during section finalization.)
[544 lines not shown]

lld/ELF/SyntheticSections.h

[20 lines not shown]
#define LLD_ELF_SYNTHETIC_SECTIONS_H		#define LLD_ELF_SYNTHETIC_SECTIONS_H

#include "Config.h"		#include "Config.h"
#include "InputSection.h"		#include "InputSection.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/MC/StringTableBuilder.h"		#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
		#include "llvm/Support/Parallel.h"
#include "llvm/Support/Threading.h"		#include "llvm/Support/Threading.h"

namespace lld::elf {		namespace lld::elf {
class Defined;		class Defined;
struct PhdrEntry;		struct PhdrEntry;
class SymbolTableBaseSection;		class SymbolTableBaseSection;

struct CieRecord {		struct CieRecord {
[73 lines not shown]	public:
uint64_t getGlobalDynAddr(const Symbol &b) const;		uint64_t getGlobalDynAddr(const Symbol &b) const;
uint64_t getGlobalDynOffset(const Symbol &b) const;		uint64_t getGlobalDynOffset(const Symbol &b) const;

uint64_t getTlsIndexVA() { return this->getVA() + tlsIndexOff; }		uint64_t getTlsIndexVA() { return this->getVA() + tlsIndexOff; }
uint32_t getTlsIndexOff() const { return tlsIndexOff; }		uint32_t getTlsIndexOff() const { return tlsIndexOff; }

// Flag to force GOT to be in output if we have relocations		// Flag to force GOT to be in output if we have relocations
// that relies on its address.		// that relies on its address.
bool hasGotOffRel = false;		std::atomic<bool> hasGotOffRel = false;

protected:		protected:
size_t numEntries = 0;		size_t numEntries = 0;
uint32_t tlsIndexOff = -1;		uint32_t tlsIndexOff = -1;
uint64_t size = 0;		uint64_t size = 0;
};		};

// .note.GNU-stack section.		// .note.GNU-stack section.
[225 lines not shown]	public:
GotPltSection();		GotPltSection();
void addEntry(Symbol &sym);		void addEntry(Symbol &sym);
size_t getSize() const override;		size_t getSize() const override;
void writeTo(uint8_t *buf) override;		void writeTo(uint8_t *buf) override;
bool isNeeded() const override;		bool isNeeded() const override;

// Flag to force GotPlt to be in output if we have relocations		// Flag to force GotPlt to be in output if we have relocations
// that relies on its address.		// that relies on its address.
bool hasGotPltOffRel = false;		std::atomic<bool> hasGotPltOffRel = false;

private:		private:
SmallVector<const Symbol *, 0> entries;		SmallVector<const Symbol *, 0> entries;
};		};

// The IgotPltSection is a Got associated with the PltSection for GNU Ifunc		// The IgotPltSection is a Got associated with the PltSection for GNU Ifunc
// Symbols that will be relocated by Target->IRelativeRel.		// Symbols that will be relocated by Target->IRelativeRel.
// On most Targets the IgotPltSection will immediately follow the GotPltSection		// On most Targets the IgotPltSection will immediately follow the GotPltSection
[116 lines not shown]

class RelocationBaseSection : public SyntheticSection {		class RelocationBaseSection : public SyntheticSection {
public:		public:
RelocationBaseSection(StringRef name, uint32_t type, int32_t dynamicTag,		RelocationBaseSection(StringRef name, uint32_t type, int32_t dynamicTag,
int32_t sizeDynamicTag, bool combreloc);		int32_t sizeDynamicTag, bool combreloc);
/// Add a dynamic relocation without writing an addend to the output section.		/// Add a dynamic relocation without writing an addend to the output section.
/// This overload can be used if the addends are written directly instead of		/// This overload can be used if the addends are written directly instead of
/// using relocations on the input section (e.g. MipsGotSection::writeTo()).		/// using relocations on the input section (e.g. MipsGotSection::writeTo()).
		template <bool shard = false>
void addReloc(const DynamicReloc &reloc) { relocs.push_back(reloc); }		void addReloc(const DynamicReloc &reloc) { relocs.push_back(reloc); }
/// Add a dynamic relocation against \p sym with an optional addend.		/// Add a dynamic relocation against \p sym with an optional addend.
void addSymbolReloc(RelType dynType, InputSectionBase &isec,		void addSymbolReloc(RelType dynType, InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym, int64_t addend = 0,		uint64_t offsetInSec, Symbol &sym, int64_t addend = 0,
llvm::Optional<RelType> addendRelType = llvm::None);		llvm::Optional<RelType> addendRelType = llvm::None);
/// Add a relative dynamic relocation that uses the target address of \p sym		/// Add a relative dynamic relocation that uses the target address of \p sym
/// (i.e. InputSection::getRelocTargetVA()) + \p addend as the addend.		/// (i.e. InputSection::getRelocTargetVA()) + \p addend as the addend.
		template <bool shard = false>
void addRelativeReloc(RelType dynType, InputSectionBase &isec,		void addRelativeReloc(RelType dynType, InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym, int64_t addend,		uint64_t offsetInSec, Symbol &sym, int64_t addend,
RelType addendRelType, RelExpr expr);		RelType addendRelType, RelExpr expr) {
		// This function should only be called for non-preemptible symbols or
		peter.smith: Would it be better to move the text into the /// comment as it is a precondition for calling the function?
		// RelExpr values that refer to an address inside the output file (e.g. the
		// address of the GOT entry for a potentially preemptible symbol).
		assert(expr != R_ADDEND && "expected non-addend relocation expression");
		addReloc<shard>(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec,
		offsetInSec, sym, addend, expr, addendRelType);
		}
/// Add a dynamic relocation using the target address of \p sym as the addend		/// Add a dynamic relocation using the target address of \p sym as the addend
/// if \p sym is non-preemptible. Otherwise add a relocation against \p sym.		/// if \p sym is non-preemptible. Otherwise add a relocation against \p sym.
void addAddendOnlyRelocIfNonPreemptible(RelType dynType,		void addAddendOnlyRelocIfNonPreemptible(RelType dynType,
InputSectionBase &isec,		InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym,		uint64_t offsetInSec, Symbol &sym,
RelType addendRelType);		RelType addendRelType);
void addReloc(DynamicReloc::Kind kind, RelType dynType,		template <bool shard = false>
InputSectionBase &inputSec, uint64_t offsetInSec, Symbol &sym,		void addReloc(DynamicReloc::Kind kind, RelType dynType, InputSectionBase &sec,
int64_t addend, RelExpr expr, RelType addendRelType);		uint64_t offsetInSec, Symbol &sym, int64_t addend, RelExpr expr,
bool isNeeded() const override { return !relocs.empty(); }		RelType addendRelType) {
		// Write the addends to the relocated address if required. We skip
		// it if the written value would be zero.
		if (config->writeAddends && (expr != R_ADDEND || addend != 0))
		sec.relocations.push_back(
		{expr, addendRelType, offsetInSec, addend, &sym});
		addReloc<shard>({dynType, &sec, offsetInSec, kind, sym, addend, expr});
		}
		bool isNeeded() const override {
		return !relocs.empty() ||
		llvm::any_of(relocsVec, [](auto &v) { return !v.empty(); });
		}
size_t getSize() const override { return relocs.size() * this->entsize; }		size_t getSize() const override { return relocs.size() * this->entsize; }
size_t getRelativeRelocCount() const { return numRelativeRelocs; }		size_t getRelativeRelocCount() const { return numRelativeRelocs; }
		void mergeRels();
void partitionRels();		void partitionRels();
void finalizeContents() override;		void finalizeContents() override;
static bool classof(const SectionBase *d) {		static bool classof(const SectionBase *d) {
return SyntheticSection::classof(d) &&		return SyntheticSection::classof(d) &&
(d->type == llvm::ELF::SHT_RELA || d->type == llvm::ELF::SHT_REL ||		(d->type == llvm::ELF::SHT_RELA || d->type == llvm::ELF::SHT_REL ||
d->type == llvm::ELF::SHT_RELR);		d->type == llvm::ELF::SHT_RELR);
}		}
int32_t dynamicTag, sizeDynamicTag;		int32_t dynamicTag, sizeDynamicTag;
SmallVector<DynamicReloc, 0> relocs;		SmallVector<DynamicReloc, 0> relocs;
		// Used when parallel relocation scanning adds relocations. The elements
		peter.smith: Suggest "// will be moved into relocs by mergeRels()."
		// should will be moved into relocs.
		andrewng: Typo: `should will` -> `will`? Is it worth adding the same comment to `relocsVec` in `class RelrBaseSection`?
		SmallVector<SmallVector<DynamicReloc, 0>, 0> relocsVec;
		peter.smith: Now that mergeRels has to be called before this is useable, is it worth making this private with an interface that asserts mergeRels has been called?

protected:		protected:
void computeRels();		void computeRels();
size_t numRelativeRelocs = 0; // used by -z combreloc		size_t numRelativeRelocs = 0; // used by -z combreloc
bool combreloc;		bool combreloc;
};		};

		template <>
		inline void RelocationBaseSection::addReloc<true>(const DynamicReloc &reloc) {
		relocsVec[llvm::parallel::threadIndex].push_back(reloc);
		}

template <class ELFT>		template <class ELFT>
class RelocationSection final : public RelocationBaseSection {		class RelocationSection final : public RelocationBaseSection {
using Elf_Rel = typename ELFT::Rel;		using Elf_Rel = typename ELFT::Rel;
using Elf_Rela = typename ELFT::Rela;		using Elf_Rela = typename ELFT::Rela;

public:		public:
RelocationSection(StringRef name, bool combreloc);		RelocationSection(StringRef name, bool combreloc);
void writeTo(uint8_t *buf) override;		void writeTo(uint8_t *buf) override;
[22 lines not shown]	struct RelativeReloc {

const InputSectionBase *inputSec;		const InputSectionBase *inputSec;
uint64_t offsetInSec;		uint64_t offsetInSec;
};		};

class RelrBaseSection : public SyntheticSection {		class RelrBaseSection : public SyntheticSection {
public:		public:
RelrBaseSection();		RelrBaseSection();
bool isNeeded() const override { return !relocs.empty(); }		void mergeRels();
		bool isNeeded() const override {
		return !relocs.empty() ||
		llvm::any_of(relocsVec, [](auto &v) { return !v.empty(); });
		}
SmallVector<RelativeReloc, 0> relocs;		SmallVector<RelativeReloc, 0> relocs;
		SmallVector<SmallVector<RelativeReloc, 0>, 0> relocsVec;
};		};

// RelrSection is used to encode offsets for relative relocations.		// RelrSection is used to encode offsets for relative relocations.
// Proposal for adding SHT_RELR sections to generic-abi is here:		// Proposal for adding SHT_RELR sections to generic-abi is here:
// https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg		// https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
// For more details, see the comment in RelrSection::updateAllocSize().		// For more details, see the comment in RelrSection::updateAllocSize().
template <class ELFT> class RelrSection final : public RelrBaseSection {		template <class ELFT> class RelrSection final : public RelrBaseSection {
using Elf_Relr = typename ELFT::Relr;		using Elf_Relr = typename ELFT::Relr;
[692 lines not shown]
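
Editorial sketch (not part of the patch): the `relocsVec` scheme in this header gives every worker thread its own vector, so relocations can be appended without a mutex and are concatenated once scanning is done. The minimal illustration below uses standard containers. `ShardedRelocs` and `addSharded` are hypothetical names, and the shard index is passed explicitly here, whereas the patch reads it from `llvm::parallel::threadIndex`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a relocation section with per-thread shards.
struct ShardedRelocs {
  std::vector<int> relocs;                 // final, merged list
  std::vector<std::vector<int>> relocsVec; // one shard per worker thread

  explicit ShardedRelocs(unsigned threadCount) : relocsVec(threadCount) {}

  // Each worker appends only to its own shard, so no lock is needed as
  // long as the outer vector is not resized while workers are running.
  void addSharded(unsigned threadIndex, int reloc) {
    assert(threadIndex < relocsVec.size() && "shard index out of range");
    relocsVec[threadIndex].push_back(reloc);
  }

  // Called once after all workers have finished: concatenate the shards
  // into `relocs` in shard order, then drop the shards.
  void mergeRels() {
    size_t newSize = relocs.size();
    for (const auto &v : relocsVec)
      newSize += v.size();
    relocs.reserve(newSize);
    for (const auto &v : relocsVec)
      relocs.insert(relocs.end(), v.begin(), v.end());
    relocsVec.clear();
  }
};
```

In this sketch the merged order depends on which shard received which relocation, so output is only deterministic when the assignment of work to threads is.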

lld/ELF/SyntheticSections.cpp

[1,580 lines not shown]	void RelocationBaseSection::addSymbolReloc(RelType dynType,
InputSectionBase &isec,		InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym,		uint64_t offsetInSec, Symbol &sym,
int64_t addend,		int64_t addend,
Optional<RelType> addendRelType) {		Optional<RelType> addendRelType) {
addReloc(DynamicReloc::AgainstSymbol, dynType, isec, offsetInSec, sym, addend,		addReloc(DynamicReloc::AgainstSymbol, dynType, isec, offsetInSec, sym, addend,
R_ADDEND, addendRelType ? *addendRelType : target->noneRel);		R_ADDEND, addendRelType ? *addendRelType : target->noneRel);
}		}

void RelocationBaseSection::addRelativeReloc(
RelType dynType, InputSectionBase &inputSec, uint64_t offsetInSec,
Symbol &sym, int64_t addend, RelType addendRelType, RelExpr expr) {
// This function should only be called for non-preemptible symbols or
// RelExpr values that refer to an address inside the output file (e.g. the
// address of the GOT entry for a potentially preemptible symbol).
assert((!sym.isPreemptible || expr == R_GOT) &&
"cannot add relative relocation against preemptible symbol");
assert(expr != R_ADDEND && "expected non-addend relocation expression");
addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, inputSec, offsetInSec,
sym, addend, expr, addendRelType);
}

void RelocationBaseSection::addAddendOnlyRelocIfNonPreemptible(		void RelocationBaseSection::addAddendOnlyRelocIfNonPreemptible(
RelType dynType, InputSectionBase &isec, uint64_t offsetInSec, Symbol &sym,		RelType dynType, InputSectionBase &isec, uint64_t offsetInSec, Symbol &sym,
RelType addendRelType) {		RelType addendRelType) {
// No need to write an addend to the section for preemptible symbols.		// No need to write an addend to the section for preemptible symbols.
if (sym.isPreemptible)		if (sym.isPreemptible)
addReloc({dynType, &isec, offsetInSec, DynamicReloc::AgainstSymbol, sym, 0,		addReloc({dynType, &isec, offsetInSec, DynamicReloc::AgainstSymbol, sym, 0,
R_ABS});		R_ABS});
else		else
addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec, offsetInSec,		addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec, offsetInSec,
sym, 0, R_ABS, addendRelType);		sym, 0, R_ABS, addendRelType);
}		}

void RelocationBaseSection::addReloc(DynamicReloc::Kind kind, RelType dynType,		void RelocationBaseSection::mergeRels() {
InputSectionBase &inputSec,		size_t newSize = relocs.size();
uint64_t offsetInSec, Symbol &sym,		for (auto &v : relocsVec)
int64_t addend, RelExpr expr,		newSize += v.size();
RelType addendRelType) {		relocs.reserve(newSize);
// Write the addends to the relocated address if required. We skip		for (auto &v : relocsVec)
		andrewng: Perhaps `const auto &v`? Same for `RelrBaseSection::mergeRels()`.
// it if the written value would be zero.		llvm::append_range(relocs, v);
if (config->writeAddends && (expr != R_ADDEND || addend != 0))		relocsVec.clear();
inputSec.relocations.push_back(
{expr, addendRelType, offsetInSec, addend, &sym});
addReloc({dynType, &inputSec, offsetInSec, kind, sym, addend, expr});
}		}

void RelocationBaseSection::partitionRels() {		void RelocationBaseSection::partitionRels() {
if (!combreloc)		if (!combreloc)
return;		return;
const RelType relativeRel = target->relativeRel;		const RelType relativeRel = target->relativeRel;
numRelativeRelocs =		numRelativeRelocs =
llvm::partition(relocs, [=](auto &r) { return r.type == relativeRel; }) -		llvm::partition(relocs, [=](auto &r) { return r.type == relativeRel; }) -
[66 lines not shown]	template <class ELFT> void RelocationSection<ELFT>::writeTo(uint8_t *buf) {
}		}
}		}

RelrBaseSection::RelrBaseSection()		RelrBaseSection::RelrBaseSection()
: SyntheticSection(SHF_ALLOC,		: SyntheticSection(SHF_ALLOC,
config->useAndroidRelrTags ? SHT_ANDROID_RELR : SHT_RELR,		config->useAndroidRelrTags ? SHT_ANDROID_RELR : SHT_RELR,
config->wordsize, ".relr.dyn") {}		config->wordsize, ".relr.dyn") {}

		void RelrBaseSection::mergeRels() {
		size_t newSize = relocs.size();
		for (auto &v : relocsVec)
		newSize += v.size();
		relocs.reserve(newSize);
		for (auto &v : relocsVec)
		llvm::append_range(relocs, v);
		relocsVec.clear();
		}

template <class ELFT>		template <class ELFT>
AndroidPackedRelocationSection<ELFT>::AndroidPackedRelocationSection(		AndroidPackedRelocationSection<ELFT>::AndroidPackedRelocationSection(
StringRef name)		StringRef name)
: RelocationBaseSection(		: RelocationBaseSection(
name, config->isRela ? SHT_ANDROID_RELA : SHT_ANDROID_REL,		name, config->isRela ? SHT_ANDROID_RELA : SHT_ANDROID_REL,
config->isRela ? DT_ANDROID_RELA : DT_ANDROID_REL,		config->isRela ? DT_ANDROID_RELA : DT_ANDROID_REL,
config->isRela ? DT_ANDROID_RELASZ : DT_ANDROID_RELSZ,		config->isRela ? DT_ANDROID_RELASZ : DT_ANDROID_RELSZ,
/*combreloc=*/false) {		/*combreloc=*/false) {
[455 lines not shown]
static BssSection *getCommonSec(Symbol *sym) {		static BssSection *getCommonSec(Symbol *sym) {
if (config->relocatable)		if (config->relocatable)
if (auto *d = dyn_cast<Defined>(sym))		if (auto *d = dyn_cast<Defined>(sym))
return dyn_cast_or_null<BssSection>(d->section);		return dyn_cast_or_null<BssSection>(d->section);
return nullptr;		return nullptr;
}		}

static uint32_t getSymSectionIndex(Symbol *sym) {		static uint32_t getSymSectionIndex(Symbol *sym) {
assert(!(sym->needsCopy && sym->isObject()));		assert(!(sym->hasFlag(NEEDS_COPY) && sym->isObject()));
if (!isa<Defined>(sym) || sym->needsCopy)		if (!isa<Defined>(sym) || sym->hasFlag(NEEDS_COPY))
return SHN_UNDEF;		return SHN_UNDEF;
if (const OutputSection *os = sym->getOutputSection())		if (const OutputSection *os = sym->getOutputSection())
return os->sectionIndex >= SHN_LORESERVE ? (uint32_t)SHN_XINDEX		return os->sectionIndex >= SHN_LORESERVE ? (uint32_t)SHN_XINDEX
: os->sectionIndex;		: os->sectionIndex;
return SHN_ABS;		return SHN_ABS;
}		}

// Write the internal symbol table contents to the output symbol table.		// Write the internal symbol table contents to the output symbol table.
[43 lines not shown]	template <class ELFT> void SymbolTableSection<ELFT>::writeTo(uint8_t *buf) {
// pointer equality by STO_MIPS_PLT flag. That is necessary to help		// pointer equality by STO_MIPS_PLT flag. That is necessary to help
// dynamic linker distinguish such symbols and MIPS lazy-binding stubs.		// dynamic linker distinguish such symbols and MIPS lazy-binding stubs.
// https://sourceware.org/ml/binutils/2008-07/txt00000.txt		// https://sourceware.org/ml/binutils/2008-07/txt00000.txt
if (config->emachine == EM_MIPS) {		if (config->emachine == EM_MIPS) {
auto *eSym = reinterpret_cast<Elf_Sym *>(buf);		auto *eSym = reinterpret_cast<Elf_Sym *>(buf);

for (SymbolTableEntry &ent : symbols) {		for (SymbolTableEntry &ent : symbols) {
Symbol *sym = ent.sym;		Symbol *sym = ent.sym;
if (sym->isInPlt() && sym->needsCopy)		if (sym->isInPlt() && sym->hasFlag(NEEDS_COPY))
eSym->st_other |= STO_MIPS_PLT;		eSym->st_other |= STO_MIPS_PLT;
if (isMicroMips()) {		if (isMicroMips()) {
// We already set the less-significant bit for symbols		// We already set the less-significant bit for symbols
// marked by the `STO_MIPS_MICROMIPS` flag and for microMIPS PLT		// marked by the `STO_MIPS_MICROMIPS` flag and for microMIPS PLT
// records. That allows us to distinguish such symbols in		// records. That allows us to distinguish such symbols in
// the `MIPS<ELFT>::relocate()` routine. Now we should		// the `MIPS<ELFT>::relocate()` routine. Now we should
// clear that bit for non-dynamic symbol table, so tools		// clear that bit for non-dynamic symbol table, so tools
// like `objdump` will be able to deal with a correct		// like `objdump` will be able to deal with a correct
// symbol position.		// symbol position.
if (sym->isDefined() &&		if (sym->isDefined() &&
((sym->stOther & STO_MIPS_MICROMIPS) || sym->needsCopy)) {		((sym->stOther & STO_MIPS_MICROMIPS) || sym->hasFlag(NEEDS_COPY))) {
if (!strTabSec.isDynamic())		if (!strTabSec.isDynamic())
eSym->st_value &= ~1;		eSym->st_value &= ~1;
eSym->st_other |= STO_MIPS_MICROMIPS;		eSym->st_other |= STO_MIPS_MICROMIPS;
}		}
}		}
if (config->relocatable)		if (config->relocatable)
if (auto *d = dyn_cast<Defined>(sym))		if (auto *d = dyn_cast<Defined>(sym))
if (isMipsPIC<ELFT>(d))		if (isMipsPIC<ELFT>(d))
[1,703 lines not shown]
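
Editorial sketch (not part of the patch): `partitionRels()` above moves relative relocations to the front of the list and records how many there are, which is the count the Writer.cpp comment describes as DT_RELACOUNT. A minimal standalone version of that step might look like the following. `RelocSketch` and `partitionRelative` are hypothetical names, and the record is far simpler than lld's `DynamicReloc`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified relocation record for illustration.
struct RelocSketch {
  uint32_t type;
  uint64_t offset;
};

// Moves all relocations of type `relativeType` to the front of `relocs`
// and returns how many there are. std::partition does not preserve the
// relative order of elements within either group.
size_t partitionRelative(std::vector<RelocSketch> &relocs,
                         uint32_t relativeType) {
  auto isRelative = [=](const RelocSketch &r) {
    return r.type == relativeType;
  };
  auto mid = std::partition(relocs.begin(), relocs.end(), isRelative);
  assert(std::is_partitioned(relocs.begin(), relocs.end(), isRelative));
  return static_cast<size_t>(mid - relocs.begin());
}
```

The returned count is the only extra state the caller needs, because the relative relocations now form a contiguous prefix of the vector.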

lld/ELF/Writer.cpp

[343 lines not shown]	for (Partition &part : partitions) {
part.dynamic = std::make_unique<DynamicSection<ELFT>>();		part.dynamic = std::make_unique<DynamicSection<ELFT>>();

if (config->emachine == EM_AARCH64 &&		if (config->emachine == EM_AARCH64 &&
config->androidMemtagMode != ELF::NT_MEMTAG_LEVEL_NONE) {		config->androidMemtagMode != ELF::NT_MEMTAG_LEVEL_NONE) {
part.memtagAndroidNote = std::make_unique<MemtagAndroidNote>();		part.memtagAndroidNote = std::make_unique<MemtagAndroidNote>();
add(*part.memtagAndroidNote);		add(*part.memtagAndroidNote);
}		}

		const unsigned threadCount = parallel::strategy.compute_thread_count();
if (config->androidPackDynRelocs)		if (config->androidPackDynRelocs)
part.relaDyn =		part.relaDyn =
std::make_unique<AndroidPackedRelocationSection<ELFT>>(relaDynName);		std::make_unique<AndroidPackedRelocationSection<ELFT>>(relaDynName);
else		else
part.relaDyn = std::make_unique<RelocationSection<ELFT>>(		part.relaDyn = std::make_unique<RelocationSection<ELFT>>(
relaDynName, config->zCombreloc);		relaDynName, config->zCombreloc);
		part.relaDyn->relocsVec.resize(threadCount);

if (config->hasDynSymTab) {		if (config->hasDynSymTab) {
add(*part.dynSymTab);		add(*part.dynSymTab);

part.verSym = std::make_unique<VersionTableSection>();		part.verSym = std::make_unique<VersionTableSection>();
add(*part.verSym);		add(*part.verSym);

if (!namedVersionDefs().empty()) {		if (!namedVersionDefs().empty()) {
[16 lines not shown]	if (config->hasDynSymTab) {

add(*part.dynamic);		add(*part.dynamic);
add(*part.dynStrTab);		add(*part.dynStrTab);
add(*part.relaDyn);		add(*part.relaDyn);
}		}

if (config->relrPackDynRelocs) {		if (config->relrPackDynRelocs) {
part.relrDyn = std::make_unique<RelrSection<ELFT>>();		part.relrDyn = std::make_unique<RelrSection<ELFT>>();
		part.relrDyn->relocsVec.resize(threadCount);
add(*part.relrDyn);		add(*part.relrDyn);
}		}

if (!config->relocatable) {		if (!config->relocatable) {
if (config->ehFrameHdr) {		if (config->ehFrameHdr) {
part.ehFrameHdr = std::make_unique<EhFrameHeader>();		part.ehFrameHdr = std::make_unique<EhFrameHeader>();
add(*part.ehFrameHdr);		add(*part.ehFrameHdr);
}		}
[1,671 lines not shown]	setReservedSymbolSections();
finalizeSynthetic(in.iplt.get());		finalizeSynthetic(in.iplt.get());
finalizeSynthetic(in.ppc32Got2.get());		finalizeSynthetic(in.ppc32Got2.get());
finalizeSynthetic(in.partIndex.get());		finalizeSynthetic(in.partIndex.get());

// Dynamic section must be the last one in this list and dynamic		// Dynamic section must be the last one in this list and dynamic
// symbol table section (dynSymTab) must be the first one.		// symbol table section (dynSymTab) must be the first one.
for (Partition &part : partitions) {		for (Partition &part : partitions) {
if (part.relaDyn) {		if (part.relaDyn) {
		part.relaDyn->mergeRels();
// Compute DT_RELACOUNT to be used by part.dynamic.		// Compute DT_RELACOUNT to be used by part.dynamic.
part.relaDyn->partitionRels();		part.relaDyn->partitionRels();
finalizeSynthetic(part.relaDyn.get());		finalizeSynthetic(part.relaDyn.get());
}		}
		if (part.relrDyn) {
		part.relrDyn->mergeRels();
		finalizeSynthetic(part.relrDyn.get());
		}

finalizeSynthetic(part.dynSymTab.get());		finalizeSynthetic(part.dynSymTab.get());
finalizeSynthetic(part.gnuHashTab.get());		finalizeSynthetic(part.gnuHashTab.get());
finalizeSynthetic(part.hashTab.get());		finalizeSynthetic(part.hashTab.get());
finalizeSynthetic(part.verDef.get());		finalizeSynthetic(part.verDef.get());
finalizeSynthetic(part.relrDyn.get());
finalizeSynthetic(part.ehFrameHdr.get());		finalizeSynthetic(part.ehFrameHdr.get());
finalizeSynthetic(part.verSym.get());		finalizeSynthetic(part.verSym.get());
finalizeSynthetic(part.verNeed.get());		finalizeSynthetic(part.verNeed.get());
finalizeSynthetic(part.dynamic.get());		finalizeSynthetic(part.dynamic.get());
}		}
}		}

if (!script->hasSectionsCommand && !config->relocatable)		if (!script->hasSectionsCommand && !config->relocatable)
[882 lines not shown]

lld/test/ELF/combreloc.s

	[29 lines not shown]
	# NOCOMB: DynamicSection [			# NOCOMB: DynamicSection [
	# NOCOMB-NOT: RELACOUNT			# NOCOMB-NOT: RELACOUNT
	# NOCOMB: Relocations [			# NOCOMB: Relocations [
	# NOCOMB-NEXT: Section ({{.*}}) .rela.dyn {			# NOCOMB-NEXT: Section ({{.*}}) .rela.dyn {
	# NOCOMB-NEXT: 0x33F8 R_X86_64_64 aaa 0x0			# NOCOMB-NEXT: 0x33F8 R_X86_64_64 aaa 0x0
	# NOCOMB-NEXT: 0x3400 R_X86_64_64 ccc 0x0			# NOCOMB-NEXT: 0x3400 R_X86_64_64 ccc 0x0
	# NOCOMB-NEXT: 0x3408 R_X86_64_64 bbb 0x0			# NOCOMB-NEXT: 0x3408 R_X86_64_64 bbb 0x0
	# NOCOMB-NEXT: 0x3410 R_X86_64_64 aaa 0x0			# NOCOMB-NEXT: 0x3410 R_X86_64_64 aaa 0x0
	# NOCOMB-NEXT: 0x3418 R_X86_64_RELATIVE - 0x3420
	# NOCOMB-NEXT: 0x23F0 R_X86_64_GLOB_DAT aaa 0x0			# NOCOMB-NEXT: 0x23F0 R_X86_64_GLOB_DAT aaa 0x0
				# NOCOMB-NEXT: 0x3418 R_X86_64_RELATIVE - 0x3420
	# NOCOMB-NEXT: }			# NOCOMB-NEXT: }

	.globl aaa, bbb, ccc			.globl aaa, bbb, ccc
	.data			.data
	.quad aaa			.quad aaa
	.quad ccc			.quad ccc
	.quad bbb			.quad bbb
	.quad aaa			.quad aaa
	.quad relative			.quad relative
	relative:			relative:

lld/test/ELF/comdat-discarded-error.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
	# RUN: echo '.section .text.foo,"axG",@progbits,foo,comdat; .globl foo; foo:' |\			# RUN: echo '.section .text.foo,"axG",@progbits,foo,comdat; .globl foo; foo:' |\
	# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t2.o
	# RUN: echo '.weak foo; foo: .section .text.foo,"axG",@progbits,foo,comdat; .globl bar; bar:' |\			# RUN: echo '.weak foo; foo: .section .text.foo,"axG",@progbits,foo,comdat; .globl bar; bar:' |\
	# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t3.o

	# RUN: not ld.lld %t2.o %t3.o %t1.o -o /dev/null 2>&1 | FileCheck %s			# RUN: not ld.lld --threads=1 %t2.o %t3.o %t1.o -o /dev/null 2>&1 | FileCheck %s

	# CHECK: error: relocation refers to a symbol in a discarded section: bar			# CHECK: error: relocation refers to a symbol in a discarded section: bar
	# CHECK-NEXT: >>> defined in {{.*}}3.o			# CHECK-NEXT: >>> defined in {{.*}}3.o
	# CHECK-NEXT: >>> section group signature: foo			# CHECK-NEXT: >>> section group signature: foo
	# CHECK-NEXT: >>> prevailing definition is in {{.*}}2.o			# CHECK-NEXT: >>> prevailing definition is in {{.*}}2.o
	# CHECK-NEXT: >>> or the symbol in the prevailing group {{.*}}			# CHECK-NEXT: >>> or the symbol in the prevailing group {{.*}}
	# CHECK-NEXT: >>> referenced by {{.*}}1.o:(.text+0x1)			# CHECK-NEXT: >>> referenced by {{.*}}1.o:(.text+0x1)


lld/test/ELF/undef-multi.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o
	# RUN: not ld.lld %t.o %t2.o -o /dev/null 2>&1 | FileCheck %s			# RUN: not ld.lld --threads=1 %t.o %t2.o -o /dev/null 2>&1 | FileCheck %s

	# CHECK: error: undefined symbol: zed2			# CHECK: error: undefined symbol: zed2
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0x1)			# CHECK-NEXT: >>> {{.*}}:(.text+0x1)
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0x6)			# CHECK-NEXT: >>> {{.*}}:(.text+0x6)
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0xB)			# CHECK-NEXT: >>> {{.*}}:(.text+0xB)
	# CHECK-NEXT: >>> referenced 2 more times			# CHECK-NEXT: >>> referenced 2 more times

	# All references to a single undefined symbol count as a single error -- but			# All references to a single undefined symbol count as a single error -- but
	# at most 10 references are printed.			# at most 10 references are printed.
	# RUN: echo ".globl _bar" > %t.moreref.s			# RUN: echo ".globl _bar" > %t.moreref.s
	# RUN: echo "_bar:" >> %t.moreref.s			# RUN: echo "_bar:" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %t.moreref.s -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %t.moreref.s -o %t3.o
	# RUN: not ld.lld %t.o %t2.o %t3.o -o /dev/null -error-limit=2 2>&1 | \			# RUN: not ld.lld --threads=1 %t.o %t2.o %t3.o -o /dev/null -error-limit=2 2>&1 | \
	# RUN: FileCheck --check-prefix=LIMIT %s			# RUN: FileCheck --check-prefix=LIMIT %s

	# LIMIT: error: undefined symbol: zed2			# LIMIT: error: undefined symbol: zed2
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s
	# LIMIT-NEXT: >>> {{.*}}:(.text+0x1)			# LIMIT-NEXT: >>> {{.*}}:(.text+0x1)
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s
	# LIMIT-NEXT: >>> {{.*}}:(.text+0x6)			# LIMIT-NEXT: >>> {{.*}}:(.text+0x6)
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s

lld/test/ELF/undef.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-debug.s -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-debug.s -o %t3.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-bad-debug.s -o %t4.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-bad-debug.s -o %t4.o
	# RUN: rm -f %t2.a			# RUN: rm -f %t2.a
	# RUN: llvm-ar rc %t2.a %t2.o			# RUN: llvm-ar rc %t2.a %t2.o
	# RUN: not ld.lld %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \			# RUN: not ld.lld --threads=1 %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \
	# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"			# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"
	# RUN: not ld.lld -pie %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \			# RUN: not ld.lld --threads=1 -pie %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \
	# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"			# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"

	# CHECK: error: undefined symbol: foo			# CHECK: error: undefined symbol: foo
	# CHECK-NEXT: >>> referenced by undef.s			# CHECK-NEXT: >>> referenced by undef.s
	# CHECK-NEXT: {{.*}}:(.text+0x1)			# CHECK-NEXT: {{.*}}:(.text+0x1)

	# CHECK: error: undefined symbol: bar			# CHECK: error: undefined symbol: bar
	# CHECK-NEXT: >>> referenced by undef.s			# CHECK-NEXT: >>> referenced by undef.s

llvm/include/llvm/Support/Parallel.h

	namespace llvm {			namespace llvm {

	namespace parallel {			namespace parallel {

	// Strategy for the default executor used by the parallel routines provided by			// Strategy for the default executor used by the parallel routines provided by
	// this file. It defaults to using all hardware threads and should be			// this file. It defaults to using all hardware threads and should be
	// initialized before the first use of parallel routines.			// initialized before the first use of parallel routines.
	extern ThreadPoolStrategy strategy;			extern ThreadPoolStrategy strategy;
				extern thread_local int threadIndex;

	namespace detail {			namespace detail {
	class Latch {			class Latch {
	uint32_t Count;			uint32_t Count;
	mutable std::mutex Mutex;			mutable std::mutex Mutex;
	mutable std::condition_variable Cond;			mutable std::condition_variable Cond;

	public:			public:

llvm/lib/Support/Parallel.cpp


#include <atomic>		#include <atomic>
#include <future>		#include <future>
#include <stack>		#include <stack>
#include <thread>		#include <thread>
#include <vector>		#include <vector>

llvm::ThreadPoolStrategy llvm::parallel::strategy;		llvm::ThreadPoolStrategy llvm::parallel::strategy;
		thread_local int llvm::parallel::threadIndex;
		andrewng: Perhaps `int` -> `unsigned`?

namespace llvm {		namespace llvm {
namespace parallel {		namespace parallel {
#if LLVM_ENABLE_THREADS		#if LLVM_ENABLE_THREADS
namespace detail {		namespace detail {

namespace {		namespace {

explicit ThreadPoolExecutor(ThreadPoolStrategy S = hardware_concurrency()) {		explicit ThreadPoolExecutor(ThreadPoolStrategy S = hardware_concurrency()) {
unsigned ThreadCount = S.compute_thread_count();		unsigned ThreadCount = S.compute_thread_count();
// Spawn all but one of the threads in another thread as spawning threads		// Spawn all but one of the threads in another thread as spawning threads
// can take a while.		// can take a while.
Threads.reserve(ThreadCount);		Threads.reserve(ThreadCount);
Threads.resize(1);		Threads.resize(1);
std::lock_guard<std::mutex> Lock(Mutex);		std::lock_guard<std::mutex> Lock(Mutex);
Threads[0] = std::thread([this, ThreadCount, S] {		Threads[0] = std::thread([this, ThreadCount, S] {
for (unsigned I = 1; I < ThreadCount; ++I) {		for (unsigned I = 1; I < ThreadCount; ++I) {
Threads.emplace_back([=] { work(S, I); });		Threads.emplace_back([=] {
		threadIndex = I;
		andrewng: Perhaps move this initialisation of `threadIndex` and the one below into `work()`?
		work(S, I);
		});
if (Stop)		if (Stop)
break;		break;
}		}
ThreadsCreated.set_value();		ThreadsCreated.set_value();
		threadIndex = 0;
work(S, 0);		work(S, 0);
});		});
}		}

void stop() {		void stop() {
{		{
std::lock_guard<std::mutex> Lock(Mutex);		std::lock_guard<std::mutex> Lock(Mutex);
if (Stop)		if (Stop)
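For readers following the `threadIndex` change above: a per-thread index like this is typically used to select a thread-private bucket, so that workers can append results without taking a mutex. The following is a minimal standalone sketch of that idea, not the code from this patch. The names `RelocBuckets` and `scanInParallel` are hypothetical, the global `threadIndex` here is a local stand-in for `llvm::parallel::threadIndex`, and plain `std::thread` is used instead of LLVM's executor. The sketch hands each worker a contiguous chunk so the merged order is deterministic; that is a simplifying assumption and may not reflect how the real parallel scan distributes work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for llvm::parallel::threadIndex (assumption: each worker sets its
// own copy once before doing any work).
thread_local int threadIndex = 0;

// Hypothetical container: one bucket per worker, so concurrent appends never
// touch the same std::vector.
struct RelocBuckets {
  explicit RelocBuckets(unsigned numThreads) : perThread(numThreads) {}

  // Called from worker threads. No mutex is needed because a thread only
  // writes to the bucket selected by its own threadIndex.
  void add(int reloc) {
    assert(threadIndex >= 0 &&
           static_cast<std::size_t>(threadIndex) < perThread.size());
    perThread[threadIndex].push_back(reloc);
  }

  // Called after all workers have joined: concatenate buckets in index order.
  std::vector<int> merge() const {
    std::vector<int> out;
    for (const auto &bucket : perThread)
      out.insert(out.end(), bucket.begin(), bucket.end());
    return out;
  }

  std::vector<std::vector<int>> perThread;
};

// Processes `items` with `numThreads` workers; worker i handles the i-th
// contiguous chunk of the input.
std::vector<int> scanInParallel(const std::vector<int> &items,
                                unsigned numThreads) {
  assert(numThreads > 0);
  RelocBuckets buckets(numThreads);
  std::vector<std::thread> workers;
  const std::size_t chunk = (items.size() + numThreads - 1) / numThreads;
  for (unsigned i = 0; i < numThreads; ++i) {
    workers.emplace_back([&, i] {
      threadIndex = static_cast<int>(i); // set once per worker
      const std::size_t begin = std::min(items.size(), i * chunk);
      const std::size_t end = std::min(items.size(), begin + chunk);
      for (std::size_t j = begin; j < end; ++j)
        buckets.add(items[j]);
    });
  }
  for (auto &t : workers)
    t.join();
  return buckets.merge();
}
```

Calling `scanInParallel(items, 3)` on a small vector returns the same elements in input order, because the chunks are contiguous and the buckets are merged by index. With a work-stealing or dynamically scheduled executor the bucket contents would depend on scheduling, which is one plausible reason to fall back to serial scanning when a fixed output order is required.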
