This is an archive of the discontinued LLVM Phabricator instance.

Differential D133003

[ELF] Parallelize relocation scanning
ClosedPublic

Authored by MaskRay on Aug 31 2022, 1:38 AM.

Details

Reviewers

andrewng
ikudrin
peter.smith

Commits

rGe6aebff67426: [ELF] Parallelize relocation scanning

Summary

Change Symbol::flags to a std::atomic<uint16_t>
Add llvm::parallel::threadIndex as a thread-local non-negative integer
Add relocsVec to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.

MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.

Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:

clang (Release): 1.27x as fast
clang (Debug): 1.06x as fast
chrome (default): 1.05x as fast
scylladb (default): 1.04x as fast

Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):

clang (Release): 1.31x as fast
scylladb (default): 1.06x as fast

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MaskRay created this revision. Aug 31 2022, 1:38 AM

Herald added a project: Restricted Project. Aug 31 2022, 1:38 AM

Herald added subscribers: StephenFan, atanasyan, arichardson and 2 others.

MaskRay requested review of this revision. Aug 31 2022, 1:38 AM

Herald added a project: Restricted Project. Aug 31 2022, 1:38 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B184313: Diff 456893. Aug 31 2022, 2:08 AM

ikudrin added inline comments. Aug 31 2022, 11:45 AM

lld/ELF/Relocations.cpp
300–301	Why not define a copy constructor?
lld/ELF/Symbols.h
297–298	Why isn't `needsTlsGdToIe` moved under `atomic` like `needsTlsGd` and the like?
315	You use it with two flags at least once, maybe call it `setFlags`?

MaskRay added inline comments. Aug 31 2022, 12:24 PM

lld/ELF/Symbols.h
297–298	All 8 bits of `std::atomic<uint8_t>` are already in use. One flag has to stay outside the atomic if we want to keep the size of `SymbolUnion` unchanged.

ikudrin added inline comments. Sep 1 2022, 2:10 AM

lld/ELF/Symbols.h
297–298	Does that mean that some flags in the atomic do not really need to be handled as such, or that this flag is left outside even though it can potentially be updated concurrently, because there is no space for it in `flags`? In any case, that is worth documenting, at least.

rebase. address comments

lld/ELF/Relocations.cpp
300–301	Good idea. Adopted
lld/ELF/Symbols.h
297–298	I have replaced `Symbol::visibility` with `Symbol::stOther` and atomic<uint16_t> is fine now, but I suspect 16-bit atomic operations are not efficient on common architectures.
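
As an illustration of the atomic flag word discussed in these comments, here is a minimal sketch. It is not lld's actual code: `DemoSymbol` and the `DEMO_*` constants are hypothetical names chosen for the example, and only the idea of OR-ing bits into a `std::atomic<uint16_t>` is taken from the review.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration only.
enum : uint16_t {
  DEMO_NEEDS_GOT = 1 << 0,
  DEMO_NEEDS_PLT = 1 << 1,
  DEMO_NEEDS_TLSIE = 1 << 8, // does not fit in 8 bits, hence a 16-bit word
};

// Hypothetical stand-in for a symbol that several threads may mark.
struct DemoSymbol {
  std::atomic<uint16_t> flags{0};

  // Set one or more bits. fetch_or is a single atomic read-modify-write,
  // so concurrent callers cannot lose each other's bits.
  void setFlags(uint16_t bits) {
    assert(bits != 0 && "expected at least one flag");
    flags.fetch_or(bits, std::memory_order_relaxed);
  }

  // Test a bit; intended to be called after all writers have finished.
  bool hasFlag(uint16_t bit) const {
    return (flags.load(std::memory_order_relaxed) & bit) != 0;
  }
};
```

Relaxed ordering is an assumption in this sketch: it is enough when the flags are only inspected after the parallel phase has completed and the threads have been joined.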

MaskRay retitled this revision from [WIP][ELF] Parallelize relocation scanning to [ELF] Parallelize relocation scanning. Sep 4 2022, 6:29 PM

Harbormaster completed remote builds in B185029: Diff 457877. Sep 4 2022, 6:37 PM

rebase

Harbormaster completed remote builds in B185039: Diff 457889. Sep 4 2022, 11:48 PM

lkail added a subscriber: lkail. Sep 4 2022, 11:53 PM

ikudrin added inline comments. Sep 5 2022, 7:11 AM

lld/ELF/Relocations.cpp
1242	If `GotSection::hasGotOffRel` and `GotPltSection::hasGotPltOffRel` are converted to `atomic<bool>`, the same should be done for `Configuration::needsTlsLd` because their usage pattern is similar.
1297	Shouldn't `relocMutex` be locked before this call?
1580	`uint8_t` -> `uint16_t`; not that it changes anything because the only flag that exceeds the range is `NEEDS_TLSIE` which is not used here, but still.

Sorry, I've been busy, so have only just had some time to look at this patch. Looks promising but unfortunately there are performance regressions on Windows for both chrome (~3%) and mozilla (~5%) from lld-speed-test.tar.xz. Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.

> Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.

Had some time to investigate a bit more and it seems that the slow down, at least on my 12C/24T Windows PC, is actually a result of contention over relocMutex in RelocationScanner::processAux. So "too many" concurrent threads running RelocationScanner::processAux can result in an overall slow down to scan the relocations and in these cases, it's likely to slow down even further with more available threads. Unfortunately, there's no mechanism in parallel::TaskGroup to limit the number of concurrent tasks being run by the pool from the group, so there's no "easy" solution. I've been experimenting with some ideas that shard the input sections such that there are fewer concurrent threads running the relocation scanning code.

In D133003#3772446, @andrewng wrote:

> Don't yet know the reason for the slow down but I suspect it will be related to the "size" of the tasks being spawned in parallel.
>
> Had some time to investigate a bit more and it seems that the slow down, at least on my 12C/24T Windows PC, is actually a result of contention over relocMutex in RelocationScanner::processAux. So "too many" concurrent threads running RelocationScanner::processAux can result in an overall slow down to scan the relocations and in these cases, it's likely to slow down even further with more available threads. Unfortunately, there's no mechanism in parallel::TaskGroup to limit the number of concurrent tasks being run by the pool from the group, so there's no "easy" solution. I've been experimenting with some ideas that shard the input sections such that there are fewer concurrent threads running the relocation scanning code.

Thanks for catching the issue. Perhaps we can add a thread_local thread index (for getDefaultExecutor) to llvm/Support/Parallel.h and allocate a relocation vector for each thread. Finally merge and sort the relocation vectors.

lld/ELF/Relocations.cpp
1580	Thanks for catching this!
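
The per-thread relocation vector idea proposed in this reply can be sketched roughly as follows. This is a simplified illustration under assumptions, not the code from the patch: `DemoReloc`, `ShardedRelocs`, `add`, and `mergeAndSort` are hypothetical names, and the thread index is passed in explicitly instead of being read from a thread-local variable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic relocation record.
struct DemoReloc {
  uint64_t offset;
};

// One vector ("shard") per worker thread, merged once at the end.
class ShardedRelocs {
public:
  explicit ShardedRelocs(size_t concurrency) : shards(concurrency) {
    assert(concurrency > 0);
  }

  // Called by the worker with the given index. It touches only that
  // worker's shard, so no mutex is needed as long as indices are unique.
  void add(size_t threadIndex, DemoReloc r) {
    assert(threadIndex < shards.size());
    shards[threadIndex].push_back(r);
  }

  // Called once, after all workers have finished. Sorting makes the result
  // independent of which thread happened to process which input.
  std::vector<DemoReloc> mergeAndSort() const {
    std::vector<DemoReloc> merged;
    for (const std::vector<DemoReloc> &shard : shards)
      merged.insert(merged.end(), shard.begin(), shard.end());
    std::sort(merged.begin(), merged.end(),
              [](const DemoReloc &a, const DemoReloc &b) {
                return a.offset < b.offset;
              });
    return merged;
  }

private:
  std::vector<std::vector<DemoReloc>> shards;
};
```

The sketch assumes the sort key is unique; with duplicate keys a tie-breaker would be needed to keep the merged order deterministic.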

andrewng mentioned this in D133431: [WIP][ELF] Parallelize relocation scanning. Sep 7 2022, 9:15 AM

I've created D133431 which is the result of my experimentation thus far. In my testing, it does slightly improve performance in the test cases that regressed in performance. In other test cases, it's around the same or slightly lower performance increase. However, I can't help but feel there should be a "better" solution. Although, I guess you've always got to balance that with complexity/maintainability. The hard coded concurrency limit of 8 tasks in D133431 also doesn't feel great.

> Thanks for catching the issue. Perhaps we can add a thread_local thread index (for getDefaultExecutor) to llvm/Support/Parallel.h and allocate a relocation vector for each thread. Finally merge and sort the relocation vectors.

Yes, trying to eliminate the lock contention does sound like a good approach, although it feels like it would add complexity.

Also forgot to mention that there were 2 other ELF tests that seemed to need the --threads=1 treatment: comdat-discarded-error.s and debug-line-obj.s (although this might be due to the change in D133431).

Remove mutex for relative relocations. Thanks to @andrewng for finding the issue

If the NEEDS_* change looks good, I'll pre-commit it (without using std::atomic) to reduce diff for future updates.

Herald added subscribers: ctetreau, hiraditya. Sep 9 2022, 12:44 AM

Harbormaster completed remote builds in B185781: Diff 458972. Sep 9 2022, 2:07 AM

> If the NEEDS_* change looks good, I'll pre-commit it (without using std::atomic) to reduce diff for future updates.

The NEEDS_* change LGTM.

This approach definitely looks better and hasn't added too much complexity. Initial testing on Windows is looking good, but I need to do a bit more.

lld/ELF/Relocations.cpp
1560	I wonder if it might be worthwhile using the previous code for the serial case? Although, it probably doesn't make a big difference to performance.
lld/ELF/Symbols.h
318	The argument name implies a single bit but perhaps add an assert, e.g. `assert((bit & (bit - 1)) == 0)`?
lld/ELF/SyntheticSections.h
548–549	Typo: `should will` -> `will`? Is it worth adding the same comment to `relocsVec` in `class RelrBaseSection`?
llvm/lib/Support/Parallel.cpp
21	Perhaps `int` -> `unsigned`?
53	Perhaps move this initialisation of `threadIndex` and the one below into `work()`?
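
The single-bit assertion suggested for `lld/ELF/Symbols.h` in the comments above relies on a standard bit trick, sketched here. `isSingleBit` and `checkedFlag` are hypothetical helper names used only for this example. Note that the expression as written in the review, `(bit & (bit - 1)) == 0`, also holds for zero, so this sketch adds an explicit non-zero check.

```cpp
#include <cassert>
#include <cstdint>

// x & (x - 1) clears the lowest set bit of x. The result is zero only when
// x had at most one bit set, so the extra x != 0 test rules out "no bits".
constexpr bool isSingleBit(uint16_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical helper showing where such an assertion could sit.
inline uint16_t checkedFlag(uint16_t bit) {
  assert(isSingleBit(bit) && "expected exactly one flag bit");
  return bit;
}
```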

Performance on Windows looks good! Every test case I've tried has shown an improvement.

lld/ELF/SyntheticSections.cpp
1604–1607	Perhaps `const auto &v`? Same for `RelrBaseSection::mergeRels()`.

MaskRay mentioned this in rGbd16ffb38981: [ELF] Merge Symbol::needs* into uint16_t flags. NFC. Sep 9 2022, 2:37 PM

Thanks a lot for the comments.
Updated.

MaskRay added inline comments. Sep 9 2022, 2:46 PM

lld/ELF/Relocations.cpp
1544	I'll remove `AndroidPackedRelocationSection does not support parallelism.` It works with deterministic parallelism.
1560	Which piece of code do you mean for the serial case?

Harbormaster completed remote builds in B185954: Diff 459206. Sep 9 2022, 4:24 PM

MaskRay edited the summary of this revision. Sep 9 2022, 11:45 PM

andrewng added inline comments. Sep 10 2022, 11:23 AM

lld/ELF/Relocations.cpp
1541–1562	This is running on the main thread. Is there a chance that this might clash with thread 0 of the task pool?
1560	I was thinking of this:
	for (InputSectionBase *sec : inputSections)
	  if (sec->isLive() && (sec->flags & SHF_ALLOC))
	    scanner.template scanSection<ELFT>(sec);
	But on the other hand, in terms of future development and maintenance, it's probably better to use as much of the same code as possible for both "paths", even if there's a minor performance penalty for the serial one.

reduce contention

lld/ELF/Relocations.cpp
1541–1562	Thanks for catching this. The main thread doing heavy work will contend with the thread pool. Changed to use `tg.execute`.
1560	Yes, using the same code for both paths is better for maintenance.

Harbormaster completed remote builds in B186035: Diff 459312. Sep 10 2022, 4:16 PM

andrewng added inline comments. Sep 11 2022, 6:10 AM

lld/ELF/Relocations.cpp
1541–1562	I think the previous code could have actually caused a threading issue, i.e. concurrent updates to the `0` indexed relocation vector. This will ensure that can't happen. The only minor thing is it looks a little odd that the "serial" case uses `tg` but I guess it is still serial.
1560	Yes, I think I agree. If it only affected the single threaded case, I wouldn't have mentioned it. But as there are specific configurations that are limited to serial I thought that it might be worth considering.

MaskRay marked an inline comment as done. Sep 11 2022, 11:00 AM

MaskRay added inline comments.

lld/ELF/Relocations.cpp

1560

Add the comment before the tg.execute([] { line?

+  // Both the main thread and thread pool index 0 use threadIndex==0.  Be
+  // careful that they don't concurrently run scanSections. When serial is
+  // true, fn() has finished at this point, so running execute is safe

This LGTM now, but I think it would be good to get another opinion too.

lld/ELF/Relocations.cpp
1560	Yes, I think it would be worth adding the comment for clarity.

This revision is now accepted and ready to land. Sep 12 2022, 1:46 AM

No objections from me. Some small suggestions for comments and a way that might catch someone using the relocs array before mergeRels has been called.

lld/ELF/SyntheticSections.h
511	Would it be better to move the text into the /// comment as it is a precondition for calling the function?
547	Now that mergeRels has to be called before this is useable, is it worth making this private with an interface that asserts mergeRels has been called?
548	Suggest "// will be moved into relocs by mergeRels()."

Add concurrency to constructors and make RelocationBaseSection::relocsVec protected

MaskRay marked 3 inline comments as done. Sep 12 2022, 10:46 AM

Harbormaster completed remote builds in B186197: Diff 459518. Sep 12 2022, 11:55 AM

MaskRay edited the summary of this revision. Sep 12 2022, 12:50 PM

Herald added a subscriber: kristof.beyls. Sep 12 2022, 12:50 PM

Closed by commit rGe6aebff67426: [ELF] Parallelize relocation scanning (authored by MaskRay). Sep 12 2022, 12:56 PM

This revision was automatically updated to reflect the committed changes.

MaskRay added a commit: rGe6aebff67426: [ELF] Parallelize relocation scanning.

Unfortunately, this commit broke mingw dylib builds with Windows native TLS. The reason for this is that with Windows native TLS, you can't directly access a TLS variable residing in a different DLL.

(Mingw setups that use emulated TLS don't have that drawback in themselves. But GCC/binutils does occasionally have issues with non-static TLS variables accessed from multiple source files: such variables end up with a bunch of extra wrapper functions that use weak linkage, which has a couple of issues in GCC/binutils too; see D111779, where we avoided cross-translation-unit TLS variables in LLDB to avoid crashes when built with GCC.)

Is it possible to wrap the accesses to parallel::threadIndex into a wrapper function, i.e. like parallel::getThreadIndex()? I presume that would add a tiny bit of overhead, in a routine that we want to tune for performance anyway. We could have that wrapper be inline, in the case of non-Windows platforms (which should result in the same code generated, I guess) and be defined in Parallel.cpp for Windows cases. (There are build configurations on Windows where this wouldn't be strictly necessary, but the overhead is probably small enough that it's not worth the effort to try to distinguish all the individual cases.)
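
The wrapper arrangement described in this comment could look roughly like the sketch below. It is an illustration under assumptions rather than the change that eventually landed: the `demo` namespace, `setThreadIndex`, and `demoRoundTrip` are hypothetical, and the "header" and "source file" parts are shown in one block only to keep the example self-contained.

```cpp
#include <cassert>

namespace demo {

// "Header" part.
#ifdef _WIN32
// On Windows the accessor is only declared here and defined out of line, so
// code in other DLLs never references the TLS variable directly.
unsigned getThreadIndex();
#else
// Elsewhere the accessor can be inline and reduce to a plain TLS read.
extern thread_local unsigned threadIndex;
inline unsigned getThreadIndex() { return threadIndex; }
#endif

// "Source file" part: the single definition of the thread-local variable.
thread_local unsigned threadIndex = 0;
#ifdef _WIN32
unsigned getThreadIndex() { return threadIndex; }
#endif

// Hypothetical setter that a worker thread would call once at startup.
void setThreadIndex(unsigned index) { threadIndex = index; }

// Small self-check: a value written on this thread is read back through
// the accessor on the same thread.
inline void demoRoundTrip() {
  setThreadIndex(3);
  assert(getThreadIndex() == 3);
}

} // namespace demo
```

Callers would use `demo::getThreadIndex()` everywhere, so switching a platform between the inline and out-of-line variants needs no changes at the call sites.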

On z/OS this would also break the build because there is no support for TLS. To work around this we have disabled LLVM_ENABLE_THREADS here: https://github.com/llvm/llvm-project/blob/main/llvm/CMakeLists.txt#L478 . Would we be able to move the declaration inside #if LLVM_ENABLE_THREADS? Thanks in advance.

andrewng mentioned this in D133759: [Support] Access threadIndex via a wrapper function. Sep 13 2022, 7:57 AM

We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380

In D133003#3805979, @hans wrote:

> We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380

I've put an lld repro here: https://drive.google.com/file/d/19zRK4jUxghCA5Pg_OJUugLM-D4yZ7iQR/view?usp=sharing (1.4 GB, requires google.com login)

(The lack of thread_local problem on some configurations of Windows/zOS has been resolved.)

In D133003#3806508, @hans wrote:

> In D133003#3805979, @hans wrote:
>
> > We're seeing non-deterministic build output after this change: https://bugs.chromium.org/p/chromium/issues/detail?id=1364380
>
> I've put an lld repro here: https://drive.google.com/file/d/19zRK4jUxghCA5Pg_OJUugLM-D4yZ7iQR/view?usp=sharing (1.4 GB, requires google.com login)

Thanks for the reproducer. The nondeterminism is due to --pack-dyn-relocs=android.
In a previous revision I suspected that it might have trouble, but after reading some code I thought it was OK.

> I'll remove AndroidPackedRelocationSection does not support parallelism. It works with deterministic parallelism.

So the section still has some problems. --pack-dyn-relocs=relr has been deterministic in the many experiments I have done.

MaskRay mentioned this in rGbce6416775ea: [ELF] --pack-dyn-relocs=android: scan relocation serially after D133003. Sep 21 2022, 11:43 AM

We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?

In D133003#3817988, @hans wrote:

> We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?

For the repro tarball, I've verified it's fixed.

while :; do fld.lld @response.txt --threads=4 -o 0; fld.lld @response.txt --threads=4 -o 1; cmp 0 1; done  # no output

In D133003#3818518, @MaskRay wrote:

> In D133003#3817988, @hans wrote:
>
> > We're still seeing non-determinism after D133003. Did you verify that your change fixed the non-determinism in the repro tarball?
>
> For the repro tarball, I've verified it's fixed.
>
> while :; do fld.lld @response.txt --threads=4 -o 0; fld.lld @response.txt --threads=4 -o 1; cmp 0 1; done  # no output

Okay, thanks. I'll see if I can provide some kind of reproducer for the new problem.

MaskRay mentioned this in rG62e7c5b4e2e1: Revert "[ELF] --pack-dyn-relocs=android: scan relocation serially after D133003". Sep 28 2022, 12:06 AM

Sorry, turns out the bot which was failing hadn't picked up your change yet. I've verified that we're good locally, and also at tip-of-tree which includes the revert above.

Revision Contents

Path	Size
lld/ELF/Config.h	5 lines
lld/ELF/Relocations.cpp	75 lines
lld/ELF/Symbols.h	12 lines
lld/ELF/SyntheticSections.h	65 lines
lld/ELF/SyntheticSections.cpp	66 lines
lld/ELF/Writer.cpp	20 lines
lld/test/ELF/combreloc.s	2 lines
lld/test/ELF/comdat-discarded-error.s	2 lines
lld/test/ELF/undef-multi.s	4 lines
lld/test/ELF/undef.s	4 lines
llvm/include/llvm/Support/Parallel.h	1 line
llvm/lib/Support/Parallel.cpp	2 lines

Diff 459538

lld/ELF/Config.h

[317 lines hidden] struct Configuration {
  // not in various places.
  //
  // (Note that MIPS64EL is not a typo for MIPS64LE. This is the official
  // name whatever that means. A fun hypothesis is that "EL" is short for
  // little-endian written in the little-endian order, but I don't know
  // if that's true.)
  bool isMips64EL;

- // True if we need to reserve two .got entries for local-dynamic TLS model.
- bool needsTlsLd = false;
-
  // True if we need to set the DF_STATIC_TLS flag to an output file, which
  // works as a hint to the dynamic loader that the shared object contains code
  // compiled with the initial-exec TLS model.
  bool hasTlsIe = false;

  // Holds set of ELF header flags for the target.
  uint32_t eflags = 0;

[51 lines hidden] struct Ctx {
  SmallVector<BitcodeFile *, 0> lazyBitcodeFiles;
  // Duplicate symbol candidates.
  SmallVector<DuplicateSymbol, 0> duplicates;
  // Symbols in a non-prevailing COMDAT group which should be changed to an
  // Undefined.
  SmallVector<std::pair<Symbol *, unsigned>, 0> nonPrevailingSyms;
  // True if SHT_LLVM_SYMPART is used.
  std::atomic<bool> hasSympart{false};
+ // True if we need to reserve two .got entries for local-dynamic TLS model.
+ std::atomic<bool> needsTlsLd{false};
  // A tuple of (reference, extractedFile, sym). Used by --why-extract=.
  SmallVector<std::tuple<std::string, const InputFile *, const Symbol &>, 0>
  whyExtractRecords;
  // A mapping from a symbol to an InputFile referencing it backward. Used by
  // --warn-backrefs.
  llvm::DenseMap<const Symbol *,
  std::pair<const InputFile *, const InputFile *>>
  backwardReferences;
[21 lines hidden]

lld/ELF/Relocations.cpp

[291 lines hidden]
  // When a symbol is copy relocated or we create a canonical plt entry, it is
  // effectively a defined symbol. In the case of copy relocation the symbol is
  // in .bss and in the case of a canonical plt entry it is in .plt. This function
  // replaces the existing symbol with a Defined pointing to the appropriate
  // location.
  static void replaceWithDefined(Symbol &sym, SectionBase &sec, uint64_t value,
  uint64_t size) {
  Symbol old = sym;

  sym.replace(Defined{sym.file, StringRef(), sym.binding, sym.stOther,
  sym.type, value, size, &sec});

  sym.auxIdx = old.auxIdx;
  sym.verdefIndex = old.verdefIndex;
  sym.exportDynamic = true;
  sym.isUsedInRegularObj = true;
  // A copy relocated alias may need a GOT entry.
  if (old.hasFlag(NEEDS_GOT))
[265 lines hidden] struct Loc {
  InputSectionBase *sec;
  uint64_t offset;
  };
  std::vector<Loc> locs;
  bool isWarning;
  };

  std::vector<UndefinedDiag> undefs;
+ std::mutex relocMutex;
  }

  // Check whether the definition name def is a mangled function name that matches
  // the reference name ref.
  static bool canSuggestExternCForCXX(StringRef ref, StringRef def) {
  llvm::ItaniumPartialDemangler d;
  std::string name = def.str();
  if (d.partialDemangle(name.c_str()))
[226 lines hidden] if (!undef.locs.empty())
  reportUndefinedSymbol(undef, i < 2);
  undefs.clear();
  }

  // Report an undefined symbol if necessary.
  // Returns true if the undefined symbol will produce an error message.
  static bool maybeReportUndefined(Undefined &sym, InputSectionBase &sec,
  uint64_t offset) {
+ std::lock_guard<std::mutex> lock(relocMutex);
  // If versioned, issue an error (even if the symbol is weak) because we don't
  // know the defining filename which is required to construct a Verneed entry.
  if (sym.hasVersionSuffix) {
  undefs.push_back({&sym, {{&sec, offset}}, false});
  return true;
  }
  if (sym.isWeak())
  return false;
[32 lines hidden] RelType RelocationScanner::getMipsN32RelType(RelTy *&rel) const {
  uint64_t offset = rel->r_offset;

  int n = 0;
  while (rel != static_cast<const RelTy *>(end) && rel->r_offset == offset)
  type |= (rel++)->getType(config->isMips64EL) << (8 * n++);
  return type;
  }

+ template <bool shard = false>
  static void addRelativeReloc(InputSectionBase &isec, uint64_t offsetInSec,
  Symbol &sym, int64_t addend, RelExpr expr,
  RelType type) {
  Partition &part = isec.getPartition();

  // Add a relative relocation. If relrDyn section is enabled, and the
  // relocation offset is guaranteed to be even, add the relocation to
  // the relrDyn section, otherwise add it to the relaDyn section.
  // relrDyn sections don't support odd offsets. Also, relrDyn sections
  // don't store the addend values, so we must write it to the relocated
  // address.
  if (part.relrDyn && isec.alignment >= 2 && offsetInSec % 2 == 0) {
  isec.relocations.push_back({expr, type, offsetInSec, addend, &sym});
+ if (shard)
+ part.relrDyn->relocsVec[parallel::threadIndex].push_back(
+ {&isec, offsetInSec});
+ else
  part.relrDyn->relocs.push_back({&isec, offsetInSec});
  return;
  }
- part.relaDyn->addRelativeReloc(target->relativeRel, isec, offsetInSec, sym,
- addend, type, expr);
+ part.relaDyn->addRelativeReloc<shard>(target->relativeRel, isec, offsetInSec,
+ sym, addend, type, expr);
  }

  template <class PltSection, class GotPltSection>
  static void addPltEntry(PltSection &plt, GotPltSection &gotPlt,
  RelocationBaseSection &rel, RelType type, Symbol &sym) {
  plt.addEntry(sym);
  gotPlt.addEntry(sym);
  rel.addReloc({type, &gotPlt, sym.getGotPltOffset(),
[151 lines hidden] if (isStaticLinkTimeConstant(expr, type, sym, offset) ||
  sec->relocations.push_back({expr, type, offset, addend, &sym});
  return;
  }

  bool canWrite = (sec->flags & SHF_WRITE) || !config->zText;
  if (canWrite) {
  RelType rel = target.getDynRel(type);
  if (expr == R_GOT || (rel == target.symbolicRel && !sym.isPreemptible)) {
- addRelativeReloc(*sec, offset, sym, addend, expr, type);
+ addRelativeReloc<true>(*sec, offset, sym, addend, expr, type);
  return;
  } else if (rel != 0) {
  if (config->emachine == EM_MIPS && rel == target.symbolicRel)
  rel = target.relativeRel;
+ std::lock_guard<std::mutex> lock(relocMutex);
  sec->getPartition().relaDyn->addSymbolReloc(rel, *sec, offset, sym,
  addend, type);

  // MIPS ABI turns using of GOT and dynamic relocations inside out.
  // While regular ABI uses dynamic relocations to fill up GOT entries
  // MIPS ABI requires dynamic linker to fills up GOT entries using
  // specially sorted dynamic symbol table. This affects even dynamic
  // relocations against symbols which do not require GOT entries
[155 lines hidden] if (oneof<R_TLSLD_GOT, R_TLSLD_GOTPLT, R_TLSLD_PC, R_TLSLD_HINT>(
  if (toExecRelax) {
  c.relocations.push_back(
  {target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE), type, offset,
  addend, &sym});
  return target->getTlsGdRelaxSkip(type);
  }
  if (expr == R_TLSLD_HINT)
  return 1;
- config->needsTlsLd = true;
+ ctx->needsTlsLd.store(true, std::memory_order_relaxed);
  c.relocations.push_back({expr, type, offset, addend, &sym});
  return 1;
  }

  // Local-Dynamic relocs can be relaxed to Local-Exec.
  if (expr == R_DTPREL) {
  if (toExecRelax)
  expr = target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE);
[38 lines hidden] if (oneof<R_GOT, R_GOTPLT, R_GOT_PC, R_AARCH64_GOT_PAGE_PC, R_GOT_OFF,
  // defined.
  if (toExecRelax && isLocalInExecutable) {
  c.relocations.push_back(
  {R_RELAX_TLS_IE_TO_LE, type, offset, addend, &sym});
  } else if (expr != R_TLSIE_HINT) {
  sym.setFlags(NEEDS_TLSIE);
  // R_GOT needs a relative relocation for PIC on i386 and Hexagon.
  if (expr == R_GOT && config->isPic && !target->usesOnlyLowPageBits(type))
- addRelativeReloc(c, offset, sym, addend, expr, type);
+ addRelativeReloc<true>(c, offset, sym, addend, expr, type);
  else
  c.relocations.push_back({expr, type, offset, addend, &sym});
  }
  return 1;
  }

  return 0;
  }
▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	template <class ELFT, class RelTy> void RelocationScanner::scanOne(RelTy *&i) {
}		}

// If the relocation does not emit a GOT or GOTPLT entry but its computation		// If the relocation does not emit a GOT or GOTPLT entry but its computation
// uses their addresses, we need GOT or GOTPLT to be created.		// uses their addresses, we need GOT or GOTPLT to be created.
//		//
// The 5 types that relative GOTPLT are all x86 and x86-64 specific.		// The 5 types that relative GOTPLT are all x86 and x86-64 specific.
if (oneof<R_GOTPLTONLY_PC, R_GOTPLTREL, R_GOTPLT, R_PLT_GOTPLT,		if (oneof<R_GOTPLTONLY_PC, R_GOTPLTREL, R_GOTPLT, R_PLT_GOTPLT,
R_TLSDESC_GOTPLT, R_TLSGD_GOTPLT>(expr)) {		R_TLSDESC_GOTPLT, R_TLSGD_GOTPLT>(expr)) {
in.gotPlt->hasGotPltOffRel = true;		in.gotPlt->hasGotPltOffRel.store(true, std::memory_order_relaxed);
} else if (oneof<R_GOTONLY_PC, R_GOTREL, R_PPC32_PLTREL, R_PPC64_TOCBASE,		} else if (oneof<R_GOTONLY_PC, R_GOTREL, R_PPC32_PLTREL, R_PPC64_TOCBASE,
R_PPC64_RELAX_TOC>(expr)) {		R_PPC64_RELAX_TOC>(expr)) {
in.got->hasGotOffRel = true;		in.got->hasGotOffRel.store(true, std::memory_order_relaxed);
}		}

// Process TLS relocations, including relaxing TLS relocations. Note that		// Process TLS relocations, including relaxing TLS relocations. Note that
// R_TPREL and R_TPREL_NEG relocations are resolved in processAux.		// R_TPREL and R_TPREL_NEG relocations are resolved in processAux.
if (expr == R_TPREL || expr == R_TPREL_NEG) {		if (expr == R_TPREL || expr == R_TPREL_NEG) {
if (config->shared) {		if (config->shared) {
errorOrWarn("relocation " + toString(type) + " against " + toString(sym) +		errorOrWarn("relocation " + toString(type) + " against " + toString(sym) +
" cannot be used with -shared" +		" cannot be used with -shared" +
Show All 31 Lines	if (!sym.isPreemptible && (!isIfunc || config->zIfuncNoplt)) {
} else if (!isAbsoluteValue(sym)) {		} else if (!isAbsoluteValue(sym)) {
expr = target.adjustGotPcExpr(type, addend, relocatedAddr);		expr = target.adjustGotPcExpr(type, addend, relocatedAddr);
}		}
}		}

// We were asked not to generate PLT entries for ifuncs. Instead, pass the		// We were asked not to generate PLT entries for ifuncs. Instead, pass the
// direct relocation on through.		// direct relocation on through.
if (LLVM_UNLIKELY(isIfunc) && config->zIfuncNoplt) {		if (LLVM_UNLIKELY(isIfunc) && config->zIfuncNoplt) {
		std::lock_guard<std::mutex> lock(relocMutex);
sym.exportDynamic = true;		sym.exportDynamic = true;
mainPart->relaDyn->addSymbolReloc(type, *sec, offset, sym, addend, type);		mainPart->relaDyn->addSymbolReloc(type, *sec, offset, sym, addend, type);
return;		return;
}		}

if (needsGot(expr)) {		if (needsGot(expr)) {
if (config->emachine == EM_MIPS) {		if (config->emachine == EM_MIPS) {
// MIPS ABI has special rules to process GOT entries and doesn't		// MIPS ABI has special rules to process GOT entries and doesn't
▲ Show 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	template <class ELFT> void RelocationScanner::scanSection(InputSectionBase &s) {
else		else
scan<ELFT>(rels.relas);		scan<ELFT>(rels.relas);
}		}

template <class ELFT> void elf::scanRelocations() {		template <class ELFT> void elf::scanRelocations() {
// Scan all relocations. Each relocation goes through a series of tests to		// Scan all relocations. Each relocation goes through a series of tests to
// determine if it needs special treatment, such as creating GOT, PLT,		// determine if it needs special treatment, such as creating GOT, PLT,
// copy relocations, etc. Note that relocations for non-alloc sections are		// copy relocations, etc. Note that relocations for non-alloc sections are
// directly processed by InputSection::relocateNonAlloc.		// directly processed by InputSection::relocateNonAlloc.

		// Deterministic parallelism needs sorting relocations which is unsuitable
		// for -z nocombreloc. MIPS and PPC64 use global states which are not suitable
		MaskRay: I'll remove `AndroidPackedRelocationSection does not support parallelism.`. It works with deterministic parallelism.
		// for parallelism.
		bool serial = !config->zCombreloc || config->emachine == EM_MIPS ||
		config->emachine == EM_PPC64;
		parallel::TaskGroup tg;
		for (ELFFileBase *f : ctx->objectFiles) {
		auto fn = [f]() {
		RelocationScanner scanner;
		for (InputSectionBase *s : f->getSections()) {
		if (s && s->kind() == SectionBase::Regular && s->isLive() &&
		(s->flags & SHF_ALLOC) &&
		!(s->type == SHT_ARM_EXIDX && config->emachine == EM_ARM))
		scanner.template scanSection<ELFT>(*s);
		}
		};
		if (serial)
		fn();
		andrewng: I wonder if it might be worthwhile using the previous code for the serial case? Although, it probably doesn't make a big difference to performance.
		MaskRay: Use which piece of code for the serial case?
		andrewng: I was thinking this: `for (InputSectionBase *sec : inputSections) if (sec->isLive() && (sec->flags & SHF_ALLOC)) scanner.template scanSection<ELFT>(*sec);` But on the other hand, in terms of future development and maintenance, it's probably better to use as much of the same code for both "paths", even if there's a minor performance penalty for the serial one.
		MaskRay: Yes, using the same code for both paths is better for maintenance.
		andrewng: Yes, I think I agree. If it only affected the single threaded case, I wouldn't have mentioned it. But as there are specific configurations that are limited to serial I thought that it might be worth considering.
		MaskRay: Add the comment before the `tg.execute([] {` line? `// Both the main thread and thread pool index 0 use threadIndex==0. Be careful that they don't concurrently run scanSections. When serial is true, fn() has finished at this point, so running execute is safe`
		andrewng: Yes, I think it would be worth adding the comment for clarity.
		else
		tg.execute(fn);
		andrewng: This is running on the main thread. Is there a chance that this might clash with thread 0 of the task pool?
		MaskRay: Thanks for catching this. The main thread doing heavy work will contend with the thread pool. Changed to use `tg.execute`.
		andrewng: I think the previous code could have actually caused a threading issue, i.e. concurrent updates to the `0` indexed relocation vector. This will ensure that can't happen. The only minor thing is it looks a little odd that the "serial" case uses `tg` but I guess it is still serial.
		}

		// Both the main thread and thread pool index 0 use threadIndex==0. Be
		// careful that they don't concurrently run scanSections. When serial is
		// true, fn() has finished at this point, so running execute is safe.
		tg.execute([] {
RelocationScanner scanner;		RelocationScanner scanner;
for (InputSectionBase *sec : inputSections)
if (sec->isLive() && (sec->flags & SHF_ALLOC))
scanner.template scanSection<ELFT>(*sec);
for (Partition &part : partitions) {		for (Partition &part : partitions) {
for (EhInputSection *sec : part.ehFrame->sections)		for (EhInputSection *sec : part.ehFrame->sections)
scanner.template scanSection<ELFT>(*sec);		scanner.template scanSection<ELFT>(*sec);
if (part.armExidx && part.armExidx->isLive())		if (part.armExidx && part.armExidx->isLive())
for (InputSection *sec : part.armExidx->exidxSections)		for (InputSection *sec : part.armExidx->exidxSections)
scanner.template scanSection<ELFT>(*sec);		scanner.template scanSection<ELFT>(*sec);
}		}
		});
}		}

static bool handleNonPreemptibleIfunc(Symbol &sym, uint16_t flags) {		static bool handleNonPreemptibleIfunc(Symbol &sym, uint16_t flags) {
		ikudrin: `uint8_t` -> `uint16_t`; not that it changes anything because the only flag that exceeds the range is `NEEDS_TLSIE` which is not used here, but still.
		MaskRay: Thanks for catching this!
// Handle a reference to a non-preemptible ifunc. These are special in a		// Handle a reference to a non-preemptible ifunc. These are special in a
// few ways:		// few ways:
//		//
// - Unlike most non-preemptible symbols, non-preemptible ifuncs do not have		// - Unlike most non-preemptible symbols, non-preemptible ifuncs do not have
// a fixed value. But assuming that all references to the ifunc are		// a fixed value. But assuming that all references to the ifunc are
// GOT-generating or PLT-generating, the handling of an ifunc is		// GOT-generating or PLT-generating, the handling of an ifunc is
// relatively straightforward. We create a PLT entry in Iplt, which is		// relatively straightforward. We create a PLT entry in Iplt, which is
// usually at the end of .plt, which makes an indirect call using a		// usually at the end of .plt, which makes an indirect call using a
▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines	if (flags & HAS_DIRECT_RELOC) {
// Redirect GOT accesses to point to the Igot.		// Redirect GOT accesses to point to the Igot.
sym.gotInIgot = true;		sym.gotInIgot = true;
}		}
return true;		return true;
}		}

void elf::postScanRelocations() {		void elf::postScanRelocations() {
auto fn = [](Symbol &sym) {		auto fn = [](Symbol &sym) {
auto flags = sym.flags;		auto flags = sym.flags.load(std::memory_order_relaxed);
if (handleNonPreemptibleIfunc(sym, flags))		if (handleNonPreemptibleIfunc(sym, flags))
return;		return;
if (!sym.needsDynReloc())		if (!sym.needsDynReloc())
return;		return;
sym.allocateAux();		sym.allocateAux();

if (flags & NEEDS_GOT)		if (flags & NEEDS_GOT)
addGotEntry(sym);		addGotEntry(sym);
▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines	if (flags & NEEDS_GOT_DTPREL) {
in.got->relocations.push_back(		in.got->relocations.push_back(
{R_ABS, target->tlsOffsetRel, sym.getGotOffset(), 0, &sym});		{R_ABS, target->tlsOffsetRel, sym.getGotOffset(), 0, &sym});
}		}

if ((flags & NEEDS_TLSIE) && !(flags & NEEDS_TLSGD_TO_IE))		if ((flags & NEEDS_TLSIE) && !(flags & NEEDS_TLSGD_TO_IE))
addTpOffsetGotEntry(sym);		addTpOffsetGotEntry(sym);
};		};

if (config->needsTlsLd && in.got->addTlsIndex()) {		if (ctx->needsTlsLd.load(std::memory_order_relaxed) &&
		in.got->addTlsIndex()) {
static Undefined dummy(nullptr, "", STB_LOCAL, 0, 0);		static Undefined dummy(nullptr, "", STB_LOCAL, 0, 0);
if (config->shared)		if (config->shared)
mainPart->relaDyn->addReloc(		mainPart->relaDyn->addReloc(
{target->tlsModuleIndexRel, in.got.get(), in.got->getTlsIndexOff()});		{target->tlsModuleIndexRel, in.got.get(), in.got->getTlsIndexOff()});
else		else
in.got->relocations.push_back(		in.got->relocations.push_back(
{R_ADDEND, target->symbolicRel, in.got->getTlsIndexOff(), 1, &dummy});		{R_ADDEND, target->symbolicRel, in.got->getTlsIndexOff(), 1, &dummy});
}		}
▲ Show 20 Lines • Show All 541 Lines • Show Last 20 Lines
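
The control flow of `scanRelocations` shown above (one closure per input file, run inline when `serial` is true and handed to a task group otherwise, followed by one trailing task) can be sketched with the standard library alone. This is only an illustration under stated assumptions: `SimpleTaskGroup` and `scanAll` are hypothetical names, the "files" are plain vectors of integers, and the group spawns one thread per task instead of using a thread pool as `llvm::parallel::TaskGroup` does.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task group: each task gets its own thread and
// the destructor joins all of them. This is NOT llvm::parallel::TaskGroup.
class SimpleTaskGroup {
public:
  void execute(std::function<void()> fn) {
    threads.emplace_back(std::move(fn));
  }
  ~SimpleTaskGroup() {
    for (std::thread &t : threads)
      t.join();
  }

private:
  std::vector<std::thread> threads;
};

// Counts the entries of every "file", then adds one trailing unit of work.
// With serial == true the per-file closure runs inline on the caller's
// thread; otherwise it is handed to the task group.
std::size_t scanAll(const std::vector<std::vector<int>> &files, bool serial) {
  std::atomic<std::size_t> scanned{0};
  {
    SimpleTaskGroup tg;
    for (const std::vector<int> &f : files) {
      auto fn = [&scanned, &f] {
        scanned.fetch_add(f.size(), std::memory_order_relaxed);
      };
      if (serial)
        fn();
      else
        tg.execute(fn);
    }
    // The trailing task always goes through the group, as in the diff.
    tg.execute([&scanned] {
      scanned.fetch_add(1, std::memory_order_relaxed);
    });
  } // All tasks have been joined once the group is destroyed.
  return scanned.load();
}
```

Both modes share the same closure, which is the maintenance argument made in the review thread; only the dispatch differs.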

lld/ELF/Symbols.h

Show First 20 Lines • Show All 75 Lines • ▼ Show 20 Lines	enum Kind {
LazyObjectKind,		LazyObjectKind,
};		};

Kind kind() const { return static_cast<Kind>(symbolKind); }		Kind kind() const { return static_cast<Kind>(symbolKind); }

// The file from which this symbol was created.		// The file from which this symbol was created.
InputFile *file;		InputFile *file;

		// The default copy constructor is deleted due to atomic flags. Define one for
		// places where no atomic is needed.
		Symbol(const Symbol &o) { memcpy(this, &o, sizeof(o)); }

protected:		protected:
const char *nameData;		const char *nameData;
// 32-bit size saves space.		// 32-bit size saves space.
uint32_t nameSize;		uint32_t nameSize;

public:		public:
// The next three fields have the same meaning as the ELF symbol attributes.		// The next three fields have the same meaning as the ELF symbol attributes.
// type and binding are placed in this order to optimize generating st_info,		// type and binding are placed in this order to optimize generating st_info,
▲ Show 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	public:
uint8_t needsTocRestore : 1;		uint8_t needsTocRestore : 1;

// True if this symbol is defined by a symbol assignment or wrapped by --wrap.		// True if this symbol is defined by a symbol assignment or wrapped by --wrap.
//		//
// LTO shouldn't inline the symbol because it doesn't know the final content		// LTO shouldn't inline the symbol because it doesn't know the final content
// of the symbol.		// of the symbol.
uint8_t scriptDefined : 1;		uint8_t scriptDefined : 1;

// True if defined in a DSO as protected visibility.		// True if defined in a DSO as protected visibility.
uint8_t dsoProtected : 1;		uint8_t dsoProtected : 1;
		ikudrin: Why isn't `needsTlsGdToIe` moved under `atomic` like `needsTlsGd` and alike?
		MaskRay: All the 8 bits of `std::atomic<uint8_t>` have been used. We need one not in atomic if we want to keep the size of `SymbolUnion` unchanged.
		ikudrin: Does that mean that some flags in the atomic do not really need to be handled as such, or that this flag is left outside even though it can potentially be updated concurrently, because there is no space for it in `flags`? In any case, that is worth documenting, at least.
		MaskRay: I have replaced `Symbol::visibility` with `Symbol::stOther` and `atomic<uint16_t>` is fine now, but I suspect 16-bit atomic operations are not efficient on common architectures.

// Temporary flags used to communicate which symbol entries need PLT and GOT		// Temporary flags used to communicate which symbol entries need PLT and GOT
// entries during postScanRelocations();		// entries during postScanRelocations();
uint16_t flags = 0;		std::atomic<uint16_t> flags = 0;

// A symAux index used to access GOT/PLT entry indexes. This is allocated in		// A symAux index used to access GOT/PLT entry indexes. This is allocated in
// postScanRelocations().		// postScanRelocations().
uint32_t auxIdx = -1;		uint32_t auxIdx = -1;
uint32_t dynsymIndex = 0;		uint32_t dynsymIndex = 0;

// This field is a index to the symbol's version definition.		// This field is a index to the symbol's version definition.
uint16_t verdefIndex = -1;		uint16_t verdefIndex = -1;

// Version definition index.		// Version definition index.
uint16_t versionId;		uint16_t versionId;

void setFlags(uint16_t bits) {		void setFlags(uint16_t bits) {
		ikudrin: You use it with two flags at least once, maybe call it `setFlags`?
flags |= bits;		flags.fetch_or(bits, std::memory_order_relaxed);
}		}
bool hasFlag(uint16_t bit) const {		bool hasFlag(uint16_t bit) const {
		andrewng: The argument name implies a single bit but perhaps add an assert, e.g. `assert((bit & (bit - 1)) == 0)`?
assert(bit && (bit & (bit - 1)) == 0 && "bit must be a power of 2");		assert(bit && (bit & (bit - 1)) == 0 && "bit must be a power of 2");
return flags & bit;		return flags.load(std::memory_order_relaxed) & bit;
}		}

bool needsDynReloc() const {		bool needsDynReloc() const {
return flags &		return flags.load(std::memory_order_relaxed) &
(NEEDS_COPY | NEEDS_GOT | NEEDS_PLT | NEEDS_TLSDESC | NEEDS_TLSGD |		(NEEDS_COPY | NEEDS_GOT | NEEDS_PLT | NEEDS_TLSDESC | NEEDS_TLSGD |
NEEDS_TLSGD_TO_IE | NEEDS_GOT_DTPREL | NEEDS_TLSIE);		NEEDS_TLSGD_TO_IE | NEEDS_GOT_DTPREL | NEEDS_TLSIE);
}		}
void allocateAux() {		void allocateAux() {
assert(auxIdx == uint32_t(-1));		assert(auxIdx == uint32_t(-1));
auxIdx = symAux.size();		auxIdx = symAux.size();
symAux.emplace_back();		symAux.emplace_back();
}		}
▲ Show 20 Lines • Show All 248 Lines • Show Last 20 Lines
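
The `setFlags`/`hasFlag` pair above relies on relaxed atomic read-modify-write operations so that several scanning threads can set bits on the same symbol without a mutex. A minimal standalone sketch follows; `FlagHolder` and `setFromManyThreads` are hypothetical names, and the flag values are illustrative (only the fact that `NEEDS_TLSIE` does not fit in 8 bits is taken from the review discussion).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Illustrative flag values; the real lld enumerators may differ.
enum : uint16_t {
  NEEDS_GOT = 1 << 0,
  NEEDS_PLT = 1 << 1,
  NEEDS_TLSIE = 1 << 8, // does not fit in uint8_t
};

// Hypothetical reduction of Symbol to just its atomic flags.
struct FlagHolder {
  std::atomic<uint16_t> flags{0};

  // Several bits may be set in one call; fetch_or makes the update atomic.
  void setFlags(uint16_t bits) {
    flags.fetch_or(bits, std::memory_order_relaxed);
  }

  // Exactly one bit must be queried, which the assertion enforces.
  bool hasFlag(uint16_t bit) const {
    assert(bit && (bit & (bit - 1)) == 0 && "bit must be a power of 2");
    return flags.load(std::memory_order_relaxed) & bit;
  }
};

// Sets each flag from four threads concurrently and returns the final value.
uint16_t setFromManyThreads() {
  FlagHolder sym;
  std::vector<std::thread> workers;
  const uint16_t bits[] = {NEEDS_GOT, NEEDS_PLT, NEEDS_TLSIE};
  for (uint16_t b : bits)
    for (int i = 0; i < 4; ++i)
      workers.emplace_back([&sym, b] { sym.setFlags(b); });
  for (std::thread &t : workers)
    t.join();
  return sym.flags.load(std::memory_order_relaxed);
}
```

Relaxed ordering is enough in this sketch because the flags are only read after every worker has been joined, which mirrors how the diff reads them in `postScanRelocations()` after scanning has finished.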

lld/ELF/SyntheticSections.h

Show All 20 Lines
#define LLD_ELF_SYNTHETIC_SECTIONS_H		#define LLD_ELF_SYNTHETIC_SECTIONS_H

#include "Config.h"		#include "Config.h"
#include "InputSection.h"		#include "InputSection.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/MC/StringTableBuilder.h"		#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
		#include "llvm/Support/Parallel.h"
#include "llvm/Support/Threading.h"		#include "llvm/Support/Threading.h"

namespace lld::elf {		namespace lld::elf {
class Defined;		class Defined;
struct PhdrEntry;		struct PhdrEntry;
class SymbolTableBaseSection;		class SymbolTableBaseSection;

struct CieRecord {		struct CieRecord {
▲ Show 20 Lines • Show All 73 Lines • ▼ Show 20 Lines	public:
uint64_t getGlobalDynAddr(const Symbol &b) const;		uint64_t getGlobalDynAddr(const Symbol &b) const;
uint64_t getGlobalDynOffset(const Symbol &b) const;		uint64_t getGlobalDynOffset(const Symbol &b) const;

uint64_t getTlsIndexVA() { return this->getVA() + tlsIndexOff; }		uint64_t getTlsIndexVA() { return this->getVA() + tlsIndexOff; }
uint32_t getTlsIndexOff() const { return tlsIndexOff; }		uint32_t getTlsIndexOff() const { return tlsIndexOff; }

// Flag to force GOT to be in output if we have relocations		// Flag to force GOT to be in output if we have relocations
// that relies on its address.		// that relies on its address.
bool hasGotOffRel = false;		std::atomic<bool> hasGotOffRel = false;

protected:		protected:
size_t numEntries = 0;		size_t numEntries = 0;
uint32_t tlsIndexOff = -1;		uint32_t tlsIndexOff = -1;
uint64_t size = 0;		uint64_t size = 0;
};		};

// .note.GNU-stack section.		// .note.GNU-stack section.
▲ Show 20 Lines • Show All 225 Lines • ▼ Show 20 Lines	public:
GotPltSection();		GotPltSection();
void addEntry(Symbol &sym);		void addEntry(Symbol &sym);
size_t getSize() const override;		size_t getSize() const override;
void writeTo(uint8_t *buf) override;		void writeTo(uint8_t *buf) override;
bool isNeeded() const override;		bool isNeeded() const override;

// Flag to force GotPlt to be in output if we have relocations		// Flag to force GotPlt to be in output if we have relocations
// that relies on its address.		// that relies on its address.
bool hasGotPltOffRel = false;		std::atomic<bool> hasGotPltOffRel = false;

private:		private:
SmallVector<const Symbol *, 0> entries;		SmallVector<const Symbol *, 0> entries;
};		};

// The IgotPltSection is a Got associated with the PltSection for GNU Ifunc		// The IgotPltSection is a Got associated with the PltSection for GNU Ifunc
// Symbols that will be relocated by Target->IRelativeRel.		// Symbols that will be relocated by Target->IRelativeRel.
// On most Targets the IgotPltSection will immediately follow the GotPltSection		// On most Targets the IgotPltSection will immediately follow the GotPltSection
▲ Show 20 Lines • Show All 112 Lines • ▼ Show 20 Lines
private:		private:
std::vector<std::pair<int32_t, uint64_t>> computeContents();		std::vector<std::pair<int32_t, uint64_t>> computeContents();
uint64_t size = 0;		uint64_t size = 0;
};		};

class RelocationBaseSection : public SyntheticSection {		class RelocationBaseSection : public SyntheticSection {
public:		public:
RelocationBaseSection(StringRef name, uint32_t type, int32_t dynamicTag,		RelocationBaseSection(StringRef name, uint32_t type, int32_t dynamicTag,
int32_t sizeDynamicTag, bool combreloc);		int32_t sizeDynamicTag, bool combreloc,
		unsigned concurrency);
/// Add a dynamic relocation without writing an addend to the output section.		/// Add a dynamic relocation without writing an addend to the output section.
/// This overload can be used if the addends are written directly instead of		/// This overload can be used if the addends are written directly instead of
/// using relocations on the input section (e.g. MipsGotSection::writeTo()).		/// using relocations on the input section (e.g. MipsGotSection::writeTo()).
void addReloc(const DynamicReloc &reloc) { relocs.push_back(reloc); }		template <bool shard = false> void addReloc(const DynamicReloc &reloc) {
		relocs.push_back(reloc);
		}
/// Add a dynamic relocation against \p sym with an optional addend.		/// Add a dynamic relocation against \p sym with an optional addend.
void addSymbolReloc(RelType dynType, InputSectionBase &isec,		void addSymbolReloc(RelType dynType, InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym, int64_t addend = 0,		uint64_t offsetInSec, Symbol &sym, int64_t addend = 0,
llvm::Optional<RelType> addendRelType = llvm::None);		llvm::Optional<RelType> addendRelType = llvm::None);
/// Add a relative dynamic relocation that uses the target address of \p sym		/// Add a relative dynamic relocation that uses the target address of \p sym
/// (i.e. InputSection::getRelocTargetVA()) + \p addend as the addend.		/// (i.e. InputSection::getRelocTargetVA()) + \p addend as the addend.
		/// This function should only be called for non-preemptible symbols or
		/// RelExpr values that refer to an address inside the output file (e.g. the
		/// address of the GOT entry for a potentially preemptible symbol).
		template <bool shard = false>
void addRelativeReloc(RelType dynType, InputSectionBase &isec,		void addRelativeReloc(RelType dynType, InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym, int64_t addend,		uint64_t offsetInSec, Symbol &sym, int64_t addend,
RelType addendRelType, RelExpr expr);		RelType addendRelType, RelExpr expr) {
		assert(expr != R_ADDEND && "expected non-addend relocation expression");
		peter.smith: Would it be better to move the text into the /// comment as it is a precondition for calling the function?
		addReloc<shard>(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec,
		offsetInSec, sym, addend, expr, addendRelType);
		}
/// Add a dynamic relocation using the target address of \p sym as the addend		/// Add a dynamic relocation using the target address of \p sym as the addend
/// if \p sym is non-preemptible. Otherwise add a relocation against \p sym.		/// if \p sym is non-preemptible. Otherwise add a relocation against \p sym.
void addAddendOnlyRelocIfNonPreemptible(RelType dynType,		void addAddendOnlyRelocIfNonPreemptible(RelType dynType,
InputSectionBase &isec,		InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym,		uint64_t offsetInSec, Symbol &sym,
RelType addendRelType);		RelType addendRelType);
void addReloc(DynamicReloc::Kind kind, RelType dynType,		template <bool shard = false>
InputSectionBase &inputSec, uint64_t offsetInSec, Symbol &sym,		void addReloc(DynamicReloc::Kind kind, RelType dynType, InputSectionBase &sec,
int64_t addend, RelExpr expr, RelType addendRelType);		uint64_t offsetInSec, Symbol &sym, int64_t addend, RelExpr expr,
bool isNeeded() const override { return !relocs.empty(); }		RelType addendRelType) {
		// Write the addends to the relocated address if required. We skip
		// it if the written value would be zero.
		if (config->writeAddends && (expr != R_ADDEND || addend != 0))
		sec.relocations.push_back(
		{expr, addendRelType, offsetInSec, addend, &sym});
		addReloc<shard>({dynType, &sec, offsetInSec, kind, sym, addend, expr});
		}
		bool isNeeded() const override {
		return !relocs.empty() ||
		llvm::any_of(relocsVec, [](auto &v) { return !v.empty(); });
		}
size_t getSize() const override { return relocs.size() * this->entsize; }		size_t getSize() const override { return relocs.size() * this->entsize; }
size_t getRelativeRelocCount() const { return numRelativeRelocs; }		size_t getRelativeRelocCount() const { return numRelativeRelocs; }
		void mergeRels();
void partitionRels();		void partitionRels();
void finalizeContents() override;		void finalizeContents() override;
static bool classof(const SectionBase *d) {		static bool classof(const SectionBase *d) {
return SyntheticSection::classof(d) &&		return SyntheticSection::classof(d) &&
(d->type == llvm::ELF::SHT_RELA || d->type == llvm::ELF::SHT_REL ||		(d->type == llvm::ELF::SHT_RELA || d->type == llvm::ELF::SHT_REL ||
d->type == llvm::ELF::SHT_RELR);		d->type == llvm::ELF::SHT_RELR);
}		}
int32_t dynamicTag, sizeDynamicTag;		int32_t dynamicTag, sizeDynamicTag;
SmallVector<DynamicReloc, 0> relocs;		SmallVector<DynamicReloc, 0> relocs;
		peter.smith: Now that mergeRels has to be called before this is useable, is it worth making this private with an interface that asserts mergeRels has been called?

		peter.smith: Suggest "// will be moved into relocs by mergeRels()."
protected:		protected:
		andrewng: Typo: `should will` -> `will`? Is it worth adding the same comment to `relocsVec` in `class RelrBaseSection`?
void computeRels();		void computeRels();
		// Used when parallel relocation scanning adds relocations. The elements
		// will be moved into relocs by mergeRels().
		SmallVector<SmallVector<DynamicReloc, 0>, 0> relocsVec;
size_t numRelativeRelocs = 0; // used by -z combreloc		size_t numRelativeRelocs = 0; // used by -z combreloc
bool combreloc;		bool combreloc;
};		};

		template <>
		inline void RelocationBaseSection::addReloc<true>(const DynamicReloc &reloc) {
		relocsVec[llvm::parallel::threadIndex].push_back(reloc);
		}

template <class ELFT>		template <class ELFT>
class RelocationSection final : public RelocationBaseSection {		class RelocationSection final : public RelocationBaseSection {
using Elf_Rel = typename ELFT::Rel;		using Elf_Rel = typename ELFT::Rel;
using Elf_Rela = typename ELFT::Rela;		using Elf_Rela = typename ELFT::Rela;

public:		public:
RelocationSection(StringRef name, bool combreloc);		RelocationSection(StringRef name, bool combreloc, unsigned concurrency);
void writeTo(uint8_t *buf) override;		void writeTo(uint8_t *buf) override;
};		};

template <class ELFT>		template <class ELFT>
class AndroidPackedRelocationSection final : public RelocationBaseSection {		class AndroidPackedRelocationSection final : public RelocationBaseSection {
using Elf_Rel = typename ELFT::Rel;		using Elf_Rel = typename ELFT::Rel;
using Elf_Rela = typename ELFT::Rela;		using Elf_Rela = typename ELFT::Rela;

public:		public:
AndroidPackedRelocationSection(StringRef name);		AndroidPackedRelocationSection(StringRef name, unsigned concurrency);

bool updateAllocSize() override;		bool updateAllocSize() override;
size_t getSize() const override { return relocData.size(); }		size_t getSize() const override { return relocData.size(); }
void writeTo(uint8_t *buf) override {		void writeTo(uint8_t *buf) override {
memcpy(buf, relocData.data(), relocData.size());		memcpy(buf, relocData.data(), relocData.size());
}		}

private:		private:
SmallVector<char, 0> relocData;		SmallVector<char, 0> relocData;
};		};

struct RelativeReloc {		struct RelativeReloc {
uint64_t getOffset() const { return inputSec->getVA(offsetInSec); }		uint64_t getOffset() const { return inputSec->getVA(offsetInSec); }

const InputSectionBase *inputSec;		const InputSectionBase *inputSec;
uint64_t offsetInSec;		uint64_t offsetInSec;
};		};

class RelrBaseSection : public SyntheticSection {		class RelrBaseSection : public SyntheticSection {
public:		public:
RelrBaseSection();		RelrBaseSection(unsigned concurrency);
bool isNeeded() const override { return !relocs.empty(); }		void mergeRels();
		bool isNeeded() const override {
		return !relocs.empty() ||
		llvm::any_of(relocsVec, [](auto &v) { return !v.empty(); });
		}
SmallVector<RelativeReloc, 0> relocs;		SmallVector<RelativeReloc, 0> relocs;
		SmallVector<SmallVector<RelativeReloc, 0>, 0> relocsVec;
};		};

// RelrSection is used to encode offsets for relative relocations.		// RelrSection is used to encode offsets for relative relocations.
// Proposal for adding SHT_RELR sections to generic-abi is here:		// Proposal for adding SHT_RELR sections to generic-abi is here:
// https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg		// https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
// For more details, see the comment in RelrSection::updateAllocSize().		// For more details, see the comment in RelrSection::updateAllocSize().
template <class ELFT> class RelrSection final : public RelrBaseSection {		template <class ELFT> class RelrSection final : public RelrBaseSection {
using Elf_Relr = typename ELFT::Relr;		using Elf_Relr = typename ELFT::Relr;

public:		public:
RelrSection();		RelrSection(unsigned concurrency);

bool updateAllocSize() override;		bool updateAllocSize() override;
size_t getSize() const override { return relrRelocs.size() * this->entsize; }		size_t getSize() const override { return relrRelocs.size() * this->entsize; }
void writeTo(uint8_t *buf) override {		void writeTo(uint8_t *buf) override {
memcpy(buf, relrRelocs.data(), getSize());		memcpy(buf, relrRelocs.data(), getSize());
}		}

private:		private:
▲ Show 20 Lines • Show All 681 Lines • Show Last 20 Lines
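
The `relocsVec` members above give each thread its own vector to append to, and `mergeRels()` later moves everything into `relocs`. A minimal sketch of that sharding idea follows, assuming plain `int` payloads and an explicitly passed shard index in place of `llvm::parallel::threadIndex`; `ShardedRelocs` and `collect` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical reduction of the relocsVec idea: one shard per thread, so
// concurrent appends never touch the same vector and need no mutex.
struct ShardedRelocs {
  explicit ShardedRelocs(std::size_t concurrency) : shards(concurrency) {}

  // Each thread must only ever pass its own index.
  void add(std::size_t threadIdx, int reloc) {
    shards[threadIdx].push_back(reloc);
  }

  // Must run after all workers have finished: concatenates the shards in
  // index order into the merged vector and empties them.
  void mergeRels() {
    for (std::vector<int> &v : shards) {
      merged.insert(merged.end(), v.begin(), v.end());
      v.clear();
    }
  }

  // The section is needed if anything was added, merged or not.
  bool isNeeded() const {
    if (!merged.empty())
      return true;
    for (const std::vector<int> &v : shards)
      if (!v.empty())
        return true;
    return false;
  }

  std::vector<int> merged;
  std::vector<std::vector<int>> shards;
};

// Spawns numThreads workers that each append perThread consecutive values
// to their own shard, then merges and returns the result.
std::vector<int> collect(std::size_t numThreads, int perThread) {
  ShardedRelocs rels(numThreads);
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < numThreads; ++t)
    workers.emplace_back([&rels, t, perThread] {
      for (int i = 0; i < perThread; ++i)
        rels.add(t, static_cast<int>(t) * perThread + i);
    });
  for (std::thread &w : workers)
    w.join();
  rels.mergeRels();
  return rels.merged;
}
```

In this sketch the merged order is fixed because each worker owns a known shard. In the diff the assignment of files to threads is not fixed, which is why the summary ties deterministic output to sorting under `-z combreloc` and keeps `-z nocombreloc` serial.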

lld/ELF/SyntheticSections.cpp

Show First 20 Lines • Show All 1,566 Lines • ▼ Show 20 Lines	assert((index != 0 || (type != target->gotRel && type != target->pltRel) ||
!mainPart->dynSymTab->getParent()) &&		!mainPart->dynSymTab->getParent()) &&
"GOT or PLT relocation must refer to symbol in dynamic symbol table");		"GOT or PLT relocation must refer to symbol in dynamic symbol table");
return index;		return index;
}		}

RelocationBaseSection::RelocationBaseSection(StringRef name, uint32_t type,		RelocationBaseSection::RelocationBaseSection(StringRef name, uint32_t type,
int32_t dynamicTag,		int32_t dynamicTag,
int32_t sizeDynamicTag,		int32_t sizeDynamicTag,
bool combreloc)		bool combreloc,
		unsigned concurrency)
: SyntheticSection(SHF_ALLOC, type, config->wordsize, name),		: SyntheticSection(SHF_ALLOC, type, config->wordsize, name),
dynamicTag(dynamicTag), sizeDynamicTag(sizeDynamicTag),		dynamicTag(dynamicTag), sizeDynamicTag(sizeDynamicTag),
combreloc(combreloc) {}		relocsVec(concurrency), combreloc(combreloc) {}

void RelocationBaseSection::addSymbolReloc(RelType dynType,		void RelocationBaseSection::addSymbolReloc(RelType dynType,
InputSectionBase &isec,		InputSectionBase &isec,
uint64_t offsetInSec, Symbol &sym,		uint64_t offsetInSec, Symbol &sym,
int64_t addend,		int64_t addend,
Optional<RelType> addendRelType) {		Optional<RelType> addendRelType) {
addReloc(DynamicReloc::AgainstSymbol, dynType, isec, offsetInSec, sym, addend,		addReloc(DynamicReloc::AgainstSymbol, dynType, isec, offsetInSec, sym, addend,
R_ADDEND, addendRelType ? *addendRelType : target->noneRel);		R_ADDEND, addendRelType ? *addendRelType : target->noneRel);
}		}

void RelocationBaseSection::addRelativeReloc(
RelType dynType, InputSectionBase &inputSec, uint64_t offsetInSec,
Symbol &sym, int64_t addend, RelType addendRelType, RelExpr expr) {
// This function should only be called for non-preemptible symbols or
// RelExpr values that refer to an address inside the output file (e.g. the
// address of the GOT entry for a potentially preemptible symbol).
assert((!sym.isPreemptible || expr == R_GOT) &&
"cannot add relative relocation against preemptible symbol");
assert(expr != R_ADDEND && "expected non-addend relocation expression");
addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, inputSec, offsetInSec,
sym, addend, expr, addendRelType);
}

void RelocationBaseSection::addAddendOnlyRelocIfNonPreemptible(		void RelocationBaseSection::addAddendOnlyRelocIfNonPreemptible(
RelType dynType, InputSectionBase &isec, uint64_t offsetInSec, Symbol &sym,		RelType dynType, InputSectionBase &isec, uint64_t offsetInSec, Symbol &sym,
RelType addendRelType) {		RelType addendRelType) {
// No need to write an addend to the section for preemptible symbols.		// No need to write an addend to the section for preemptible symbols.
if (sym.isPreemptible)		if (sym.isPreemptible)
addReloc({dynType, &isec, offsetInSec, DynamicReloc::AgainstSymbol, sym, 0,		addReloc({dynType, &isec, offsetInSec, DynamicReloc::AgainstSymbol, sym, 0,
R_ABS});		R_ABS});
else		else
addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec, offsetInSec,		addReloc(DynamicReloc::AddendOnlyWithTargetVA, dynType, isec, offsetInSec,
sym, 0, R_ABS, addendRelType);		sym, 0, R_ABS, addendRelType);
}		}

void RelocationBaseSection::addReloc(DynamicReloc::Kind kind, RelType dynType,		void RelocationBaseSection::mergeRels() {
InputSectionBase &inputSec,		size_t newSize = relocs.size();
uint64_t offsetInSec, Symbol &sym,		for (const auto &v : relocsVec)
int64_t addend, RelExpr expr,		newSize += v.size();
RelType addendRelType) {		relocs.reserve(newSize);
// Write the addends to the relocated address if required. We skip		for (const auto &v : relocsVec)
		andrewng: Perhaps `const auto &v`? Same for `RelrBaseSection::mergeRels()`.
// it if the written value would be zero.		llvm::append_range(relocs, v);
if (config->writeAddends && (expr != R_ADDEND || addend != 0))		relocsVec.clear();
inputSec.relocations.push_back(
{expr, addendRelType, offsetInSec, addend, &sym});
addReloc({dynType, &inputSec, offsetInSec, kind, sym, addend, expr});
}		}

void RelocationBaseSection::partitionRels() {		void RelocationBaseSection::partitionRels() {
if (!combreloc)		if (!combreloc)
return;		return;
const RelType relativeRel = target->relativeRel;		const RelType relativeRel = target->relativeRel;
numRelativeRelocs =		numRelativeRelocs =
llvm::partition(relocs, [=](auto &r) { return r.type == relativeRel; }) -		llvm::partition(relocs, [=](auto &r) { return r.type == relativeRel; }) -
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	if (combreloc) {
// Non-relative relocations are few, so don't bother with parallelSort.		// Non-relative relocations are few, so don't bother with parallelSort.
llvm::sort(nonRelative, relocs.end(), [&](auto &a, auto &b) {		llvm::sort(nonRelative, relocs.end(), [&](auto &a, auto &b) {
return std::tie(a.r_sym, a.r_offset) < std::tie(b.r_sym, b.r_offset);		return std::tie(a.r_sym, a.r_offset) < std::tie(b.r_sym, b.r_offset);
});		});
}		}
}		}

template <class ELFT>		template <class ELFT>
RelocationSection<ELFT>::RelocationSection(StringRef name, bool combreloc)		RelocationSection<ELFT>::RelocationSection(StringRef name, bool combreloc,
		unsigned concurrency)
: RelocationBaseSection(name, config->isRela ? SHT_RELA : SHT_REL,		: RelocationBaseSection(name, config->isRela ? SHT_RELA : SHT_REL,
config->isRela ? DT_RELA : DT_REL,		config->isRela ? DT_RELA : DT_REL,
config->isRela ? DT_RELASZ : DT_RELSZ, combreloc) {		config->isRela ? DT_RELASZ : DT_RELSZ, combreloc,
		concurrency) {
this->entsize = config->isRela ? sizeof(Elf_Rela) : sizeof(Elf_Rel);		this->entsize = config->isRela ? sizeof(Elf_Rela) : sizeof(Elf_Rel);
}		}

template <class ELFT> void RelocationSection<ELFT>::writeTo(uint8_t *buf) {		template <class ELFT> void RelocationSection<ELFT>::writeTo(uint8_t *buf) {
computeRels();		computeRels();
for (const DynamicReloc &rel : relocs) {		for (const DynamicReloc &rel : relocs) {
auto *p = reinterpret_cast<Elf_Rela *>(buf);		auto *p = reinterpret_cast<Elf_Rela *>(buf);
p->r_offset = rel.r_offset;		p->r_offset = rel.r_offset;
p->setSymbolAndType(rel.r_sym, rel.type, config->isMips64EL);		p->setSymbolAndType(rel.r_sym, rel.type, config->isMips64EL);
if (config->isRela)		if (config->isRela)
p->r_addend = rel.addend;		p->r_addend = rel.addend;
buf += config->isRela ? sizeof(Elf_Rela) : sizeof(Elf_Rel);		buf += config->isRela ? sizeof(Elf_Rela) : sizeof(Elf_Rel);
}		}
}		}

RelrBaseSection::RelrBaseSection()		RelrBaseSection::RelrBaseSection(unsigned concurrency)
: SyntheticSection(SHF_ALLOC,		: SyntheticSection(SHF_ALLOC,
config->useAndroidRelrTags ? SHT_ANDROID_RELR : SHT_RELR,		config->useAndroidRelrTags ? SHT_ANDROID_RELR : SHT_RELR,
config->wordsize, ".relr.dyn") {}		config->wordsize, ".relr.dyn"),
		relocsVec(concurrency) {}

		void RelrBaseSection::mergeRels() {
		size_t newSize = relocs.size();
		for (const auto &v : relocsVec)
		newSize += v.size();
		relocs.reserve(newSize);
		for (const auto &v : relocsVec)
		llvm::append_range(relocs, v);
		relocsVec.clear();
		}

template <class ELFT>		template <class ELFT>
AndroidPackedRelocationSection<ELFT>::AndroidPackedRelocationSection(		AndroidPackedRelocationSection<ELFT>::AndroidPackedRelocationSection(
StringRef name)		StringRef name, unsigned concurrency)
: RelocationBaseSection(		: RelocationBaseSection(
name, config->isRela ? SHT_ANDROID_RELA : SHT_ANDROID_REL,		name, config->isRela ? SHT_ANDROID_RELA : SHT_ANDROID_REL,
config->isRela ? DT_ANDROID_RELA : DT_ANDROID_REL,		config->isRela ? DT_ANDROID_RELA : DT_ANDROID_REL,
config->isRela ? DT_ANDROID_RELASZ : DT_ANDROID_RELSZ,		config->isRela ? DT_ANDROID_RELASZ : DT_ANDROID_RELSZ,
/*combreloc=*/false) {		/*combreloc=*/false, concurrency) {
this->entsize = 1;		this->entsize = 1;
}		}

template <class ELFT>		template <class ELFT>
bool AndroidPackedRelocationSection<ELFT>::updateAllocSize() {		bool AndroidPackedRelocationSection<ELFT>::updateAllocSize() {
// This function computes the contents of an Android-format packed relocation		// This function computes the contents of an Android-format packed relocation
// section.		// section.
//		//
▲ Show 20 Lines • Show All 231 Lines • ▼ Show 20 Lines	bool AndroidPackedRelocationSection<ELFT>::updateAllocSize() {
// Returns whether the section size changed. We need to keep recomputing both		// Returns whether the section size changed. We need to keep recomputing both
// section layout and the contents of this section until the size converges		// section layout and the contents of this section until the size converges
// because changing this section's size can affect section layout, which in		// because changing this section's size can affect section layout, which in
// turn can affect the sizes of the LEB-encoded integers stored in this		// turn can affect the sizes of the LEB-encoded integers stored in this
// section.		// section.
return relocData.size() != oldSize;		return relocData.size() != oldSize;
}		}

template <class ELFT> RelrSection<ELFT>::RelrSection() {		template <class ELFT>
		RelrSection<ELFT>::RelrSection(unsigned concurrency)
		: RelrBaseSection(concurrency) {
this->entsize = config->wordsize;		this->entsize = config->wordsize;
}		}

template <class ELFT> bool RelrSection<ELFT>::updateAllocSize() {		template <class ELFT> bool RelrSection<ELFT>::updateAllocSize() {
// This function computes the contents of an SHT_RELR packed relocation		// This function computes the contents of an SHT_RELR packed relocation
// section.		// section.
//		//
// Proposal for adding SHT_RELR sections to generic-abi is here:		// Proposal for adding SHT_RELR sections to generic-abi is here:
▲ Show 20 Lines • Show All 1,991 Lines • Show Last 20 Lines
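
The `relocsVec` and `mergeRels()` changes above can be read as a per-thread sharding pattern: each worker appends to its own vector during scanning, and a serial step concatenates the shards afterwards. The following is a minimal sketch of that idea, not the lld implementation. `Reloc`, `ShardedRelocs`, `add`, and `merged` are hypothetical names, and `std::vector::insert` stands in for `llvm::append_range`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for lld's DynamicReloc; only an offset is kept.
struct Reloc {
  uint64_t offset;
};

// Hypothetical container mirroring the relocs/relocsVec pair in the diff.
class ShardedRelocs {
public:
  explicit ShardedRelocs(unsigned concurrency) : shards(concurrency) {}

  // A worker appends only to the shard selected by its own index, so two
  // workers never touch the same vector and no mutex is required.
  void add(unsigned threadIndex, Reloc r) {
    assert(threadIndex < shards.size() && "thread index out of range");
    shards[threadIndex].push_back(r);
  }

  // Serial step: reserve once, then append the shards in index order after
  // any entries that are already present in relocs.
  void mergeRels() {
    std::size_t newSize = relocs.size();
    for (const auto &v : shards)
      newSize += v.size();
    relocs.reserve(newSize);
    for (const auto &v : shards)
      relocs.insert(relocs.end(), v.begin(), v.end());
    shards.clear();
  }

  const std::vector<Reloc> &merged() const { return relocs; }

private:
  std::vector<Reloc> relocs;
  std::vector<std::vector<Reloc>> shards;
};
```

In this sketch the order inside one shard follows insertion order, but which relocation lands in which shard depends on how work is scheduled, which is consistent with the summary's note that parallelism is disabled where deterministic output is needed.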

lld/ELF/Writer.cpp

Show First 20 Lines • Show All 311 Lines • ▼ Show 20 Lines	if (config->emachine == EM_MIPS) {
if ((in.mipsOptions = MipsOptionsSection<ELFT>::create()))		if ((in.mipsOptions = MipsOptionsSection<ELFT>::create()))
add(*in.mipsOptions);		add(*in.mipsOptions);
if ((in.mipsReginfo = MipsReginfoSection<ELFT>::create()))		if ((in.mipsReginfo = MipsReginfoSection<ELFT>::create()))
add(*in.mipsReginfo);		add(*in.mipsReginfo);
}		}

StringRef relaDynName = config->isRela ? ".rela.dyn" : ".rel.dyn";		StringRef relaDynName = config->isRela ? ".rela.dyn" : ".rel.dyn";

		const unsigned threadCount = parallel::strategy.compute_thread_count();
for (Partition &part : partitions) {		for (Partition &part : partitions) {
auto add = [&](SyntheticSection &sec) {		auto add = [&](SyntheticSection &sec) {
sec.partition = part.getNumber();		sec.partition = part.getNumber();
inputSections.push_back(&sec);		inputSections.push_back(&sec);
};		};

if (!part.name.empty()) {		if (!part.name.empty()) {
part.elfHeader = std::make_unique<PartitionElfHeaderSection<ELFT>>();		part.elfHeader = std::make_unique<PartitionElfHeaderSection<ELFT>>();
Show All 17 Lines	for (Partition &part : partitions) {

if (config->emachine == EM_AARCH64 &&		if (config->emachine == EM_AARCH64 &&
config->androidMemtagMode != ELF::NT_MEMTAG_LEVEL_NONE) {		config->androidMemtagMode != ELF::NT_MEMTAG_LEVEL_NONE) {
part.memtagAndroidNote = std::make_unique<MemtagAndroidNote>();		part.memtagAndroidNote = std::make_unique<MemtagAndroidNote>();
add(*part.memtagAndroidNote);		add(*part.memtagAndroidNote);
}		}

if (config->androidPackDynRelocs)		if (config->androidPackDynRelocs)
part.relaDyn =		part.relaDyn = std::make_unique<AndroidPackedRelocationSection<ELFT>>(
std::make_unique<AndroidPackedRelocationSection<ELFT>>(relaDynName);		relaDynName, threadCount);
else		else
part.relaDyn = std::make_unique<RelocationSection<ELFT>>(		part.relaDyn = std::make_unique<RelocationSection<ELFT>>(
relaDynName, config->zCombreloc);		relaDynName, config->zCombreloc, threadCount);

if (config->hasDynSymTab) {		if (config->hasDynSymTab) {
add(*part.dynSymTab);		add(*part.dynSymTab);

part.verSym = std::make_unique<VersionTableSection>();		part.verSym = std::make_unique<VersionTableSection>();
add(*part.verSym);		add(*part.verSym);

if (!namedVersionDefs().empty()) {		if (!namedVersionDefs().empty()) {
Show All 15 Lines	if (config->hasDynSymTab) {
}		}

add(*part.dynamic);		add(*part.dynamic);
add(*part.dynStrTab);		add(*part.dynStrTab);
add(*part.relaDyn);		add(*part.relaDyn);
}		}

if (config->relrPackDynRelocs) {		if (config->relrPackDynRelocs) {
part.relrDyn = std::make_unique<RelrSection<ELFT>>();		part.relrDyn = std::make_unique<RelrSection<ELFT>>(threadCount);
add(*part.relrDyn);		add(*part.relrDyn);
}		}

if (!config->relocatable) {		if (!config->relocatable) {
if (config->ehFrameHdr) {		if (config->ehFrameHdr) {
part.ehFrameHdr = std::make_unique<EhFrameHeader>();		part.ehFrameHdr = std::make_unique<EhFrameHeader>();
add(*part.ehFrameHdr);		add(*part.ehFrameHdr);
}		}
▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	template <class ELFT> void elf::createSyntheticSections() {
}		}

if (config->gdbIndex)		if (config->gdbIndex)
add(*GdbIndexSection::create<ELFT>());		add(*GdbIndexSection::create<ELFT>());

// We always need to add rel[a].plt to output if it has entries.		// We always need to add rel[a].plt to output if it has entries.
// Even for static linking it can contain R_[*]_IRELATIVE relocations.		// Even for static linking it can contain R_[*]_IRELATIVE relocations.
in.relaPlt = std::make_unique<RelocationSection<ELFT>>(		in.relaPlt = std::make_unique<RelocationSection<ELFT>>(
config->isRela ? ".rela.plt" : ".rel.plt", /*sort=*/false);		config->isRela ? ".rela.plt" : ".rel.plt", /*sort=*/false,
		/*threadCount=*/1);
add(*in.relaPlt);		add(*in.relaPlt);

// The relaIplt immediately follows .rel[a].dyn to ensure that the IRelative		// The relaIplt immediately follows .rel[a].dyn to ensure that the IRelative
// relocations are processed last by the dynamic loader. We cannot place the		// relocations are processed last by the dynamic loader. We cannot place the
// iplt section in .rel.dyn when Android relocation packing is enabled because		// iplt section in .rel.dyn when Android relocation packing is enabled because
// that would cause a section type mismatch. However, because the Android		// that would cause a section type mismatch. However, because the Android
// dynamic loader reads .rel.plt after .rel.dyn, we can get the desired		// dynamic loader reads .rel.plt after .rel.dyn, we can get the desired
// behaviour by placing the iplt section in .rel.plt.		// behaviour by placing the iplt section in .rel.plt.
in.relaIplt = std::make_unique<RelocationSection<ELFT>>(		in.relaIplt = std::make_unique<RelocationSection<ELFT>>(
config->androidPackDynRelocs ? in.relaPlt->name : relaDynName,		config->androidPackDynRelocs ? in.relaPlt->name : relaDynName,
/*sort=*/false);		/*sort=*/false, /*threadCount=*/1);
add(*in.relaIplt);		add(*in.relaIplt);

if ((config->emachine == EM_386 || config->emachine == EM_X86_64) &&		if ((config->emachine == EM_386 || config->emachine == EM_X86_64) &&
(config->andFeatures & GNU_PROPERTY_X86_FEATURE_1_IBT)) {		(config->andFeatures & GNU_PROPERTY_X86_FEATURE_1_IBT)) {
in.ibtPlt = std::make_unique<IBTPltSection>();		in.ibtPlt = std::make_unique<IBTPltSection>();
add(*in.ibtPlt);		add(*in.ibtPlt);
}		}

▲ Show 20 Lines • Show All 1,578 Lines • ▼ Show 20 Lines	setReservedSymbolSections();
finalizeSynthetic(in.iplt.get());		finalizeSynthetic(in.iplt.get());
finalizeSynthetic(in.ppc32Got2.get());		finalizeSynthetic(in.ppc32Got2.get());
finalizeSynthetic(in.partIndex.get());		finalizeSynthetic(in.partIndex.get());

// Dynamic section must be the last one in this list and dynamic		// Dynamic section must be the last one in this list and dynamic
// symbol table section (dynSymTab) must be the first one.		// symbol table section (dynSymTab) must be the first one.
for (Partition &part : partitions) {		for (Partition &part : partitions) {
if (part.relaDyn) {		if (part.relaDyn) {
		part.relaDyn->mergeRels();
// Compute DT_RELACOUNT to be used by part.dynamic.		// Compute DT_RELACOUNT to be used by part.dynamic.
part.relaDyn->partitionRels();		part.relaDyn->partitionRels();
finalizeSynthetic(part.relaDyn.get());		finalizeSynthetic(part.relaDyn.get());
}		}
		if (part.relrDyn) {
		part.relrDyn->mergeRels();
		finalizeSynthetic(part.relrDyn.get());
		}

finalizeSynthetic(part.dynSymTab.get());		finalizeSynthetic(part.dynSymTab.get());
finalizeSynthetic(part.gnuHashTab.get());		finalizeSynthetic(part.gnuHashTab.get());
finalizeSynthetic(part.hashTab.get());		finalizeSynthetic(part.hashTab.get());
finalizeSynthetic(part.verDef.get());		finalizeSynthetic(part.verDef.get());
finalizeSynthetic(part.relrDyn.get());
finalizeSynthetic(part.ehFrameHdr.get());		finalizeSynthetic(part.ehFrameHdr.get());
finalizeSynthetic(part.verSym.get());		finalizeSynthetic(part.verSym.get());
finalizeSynthetic(part.verNeed.get());		finalizeSynthetic(part.verNeed.get());
finalizeSynthetic(part.dynamic.get());		finalizeSynthetic(part.dynamic.get());
}		}
}		}

if (!script->hasSectionsCommand && !config->relocatable)		if (!script->hasSectionsCommand && !config->relocatable)
▲ Show 20 Lines • Show All 882 Lines • Show Last 20 Lines

lld/test/ELF/combreloc.s

	Show All 29 Lines
	# NOCOMB: DynamicSection [			# NOCOMB: DynamicSection [
	# NOCOMB-NOT: RELACOUNT			# NOCOMB-NOT: RELACOUNT
	# NOCOMB: Relocations [			# NOCOMB: Relocations [
	# NOCOMB-NEXT: Section ({{.*}}) .rela.dyn {			# NOCOMB-NEXT: Section ({{.*}}) .rela.dyn {
	# NOCOMB-NEXT: 0x33F8 R_X86_64_64 aaa 0x0			# NOCOMB-NEXT: 0x33F8 R_X86_64_64 aaa 0x0
	# NOCOMB-NEXT: 0x3400 R_X86_64_64 ccc 0x0			# NOCOMB-NEXT: 0x3400 R_X86_64_64 ccc 0x0
	# NOCOMB-NEXT: 0x3408 R_X86_64_64 bbb 0x0			# NOCOMB-NEXT: 0x3408 R_X86_64_64 bbb 0x0
	# NOCOMB-NEXT: 0x3410 R_X86_64_64 aaa 0x0			# NOCOMB-NEXT: 0x3410 R_X86_64_64 aaa 0x0
	# NOCOMB-NEXT: 0x3418 R_X86_64_RELATIVE - 0x3420
	# NOCOMB-NEXT: 0x23F0 R_X86_64_GLOB_DAT aaa 0x0			# NOCOMB-NEXT: 0x23F0 R_X86_64_GLOB_DAT aaa 0x0
				# NOCOMB-NEXT: 0x3418 R_X86_64_RELATIVE - 0x3420
	# NOCOMB-NEXT: }			# NOCOMB-NEXT: }

	.globl aaa, bbb, ccc			.globl aaa, bbb, ccc
	.data			.data
	.quad aaa			.quad aaa
	.quad ccc			.quad ccc
	.quad bbb			.quad bbb
	.quad aaa			.quad aaa
	.quad relative			.quad relative
	relative:			relative:
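
For contrast with the NOCOMB expectations above, the combreloc path shown in `partitionRels()` and the later sort can be sketched as a partition followed by a sort of the non-relative tail. This is a simplified illustration under stated assumptions, not lld's code: `Rel` and `combineRelocs` are hypothetical names, and the relative group is left unsorted here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for lld's DynamicReloc.
struct Rel {
  bool isRelative;
  uint32_t sym;
  uint64_t offset;
};

// Moves relative relocations to the front, sorts the remaining entries by
// (symbol index, offset), and returns the number of relative relocations,
// which is the kind of count a linker would report as DT_RELACOUNT.
inline std::size_t combineRelocs(std::vector<Rel> &relocs) {
  auto nonRelative = std::partition(relocs.begin(), relocs.end(),
                                    [](const Rel &r) { return r.isRelative; });
  std::sort(nonRelative, relocs.end(), [](const Rel &a, const Rel &b) {
    return std::tie(a.sym, a.offset) < std::tie(b.sym, b.offset);
  });
  return static_cast<std::size_t>(nonRelative - relocs.begin());
}
```

Because `std::partition` does not preserve relative order within each group, a real implementation that needs a stable result has to sort the relative group as well or use a stable partition.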

lld/test/ELF/comdat-discarded-error.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t1.o
	# RUN: echo '.section .text.foo,"axG",@progbits,foo,comdat; .globl foo; foo:' |\			# RUN: echo '.section .text.foo,"axG",@progbits,foo,comdat; .globl foo; foo:' |\
	# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t2.o
	# RUN: echo '.weak foo; foo: .section .text.foo,"axG",@progbits,foo,comdat; .globl bar; bar:' |\			# RUN: echo '.weak foo; foo: .section .text.foo,"axG",@progbits,foo,comdat; .globl bar; bar:' |\
	# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64 - -o %t3.o

	# RUN: not ld.lld %t2.o %t3.o %t1.o -o /dev/null 2>&1 | FileCheck %s			# RUN: not ld.lld --threads=1 %t2.o %t3.o %t1.o -o /dev/null 2>&1 | FileCheck %s

	# CHECK: error: relocation refers to a symbol in a discarded section: bar			# CHECK: error: relocation refers to a symbol in a discarded section: bar
	# CHECK-NEXT: >>> defined in {{.*}}3.o			# CHECK-NEXT: >>> defined in {{.*}}3.o
	# CHECK-NEXT: >>> section group signature: foo			# CHECK-NEXT: >>> section group signature: foo
	# CHECK-NEXT: >>> prevailing definition is in {{.*}}2.o			# CHECK-NEXT: >>> prevailing definition is in {{.*}}2.o
	# CHECK-NEXT: >>> or the symbol in the prevailing group {{.*}}			# CHECK-NEXT: >>> or the symbol in the prevailing group {{.*}}
	# CHECK-NEXT: >>> referenced by {{.*}}1.o:(.text+0x1)			# CHECK-NEXT: >>> referenced by {{.*}}1.o:(.text+0x1)

	Show All 15 Lines

lld/test/ELF/undef-multi.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o
	# RUN: not ld.lld %t.o %t2.o -o /dev/null 2>&1 | FileCheck %s			# RUN: not ld.lld --threads=1 %t.o %t2.o -o /dev/null 2>&1 | FileCheck %s

	# CHECK: error: undefined symbol: zed2			# CHECK: error: undefined symbol: zed2
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0x1)			# CHECK-NEXT: >>> {{.*}}:(.text+0x1)
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0x6)			# CHECK-NEXT: >>> {{.*}}:(.text+0x6)
	# CHECK-NEXT: >>> referenced by undef-multi.s			# CHECK-NEXT: >>> referenced by undef-multi.s
	# CHECK-NEXT: >>> {{.*}}:(.text+0xB)			# CHECK-NEXT: >>> {{.*}}:(.text+0xB)
	# CHECK-NEXT: >>> referenced 2 more times			# CHECK-NEXT: >>> referenced 2 more times

	# All references to a single undefined symbol count as a single error -- but			# All references to a single undefined symbol count as a single error -- but
	# at most 10 references are printed.			# at most 10 references are printed.
	# RUN: echo ".globl _bar" > %t.moreref.s			# RUN: echo ".globl _bar" > %t.moreref.s
	# RUN: echo "_bar:" >> %t.moreref.s			# RUN: echo "_bar:" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: echo " call zed2" >> %t.moreref.s			# RUN: echo " call zed2" >> %t.moreref.s
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %t.moreref.s -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %t.moreref.s -o %t3.o
	# RUN: not ld.lld %t.o %t2.o %t3.o -o /dev/null -error-limit=2 2>&1 | \			# RUN: not ld.lld --threads=1 %t.o %t2.o %t3.o -o /dev/null -error-limit=2 2>&1 | \
	# RUN: FileCheck --check-prefix=LIMIT %s			# RUN: FileCheck --check-prefix=LIMIT %s

	# LIMIT: error: undefined symbol: zed2			# LIMIT: error: undefined symbol: zed2
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s
	# LIMIT-NEXT: >>> {{.*}}:(.text+0x1)			# LIMIT-NEXT: >>> {{.*}}:(.text+0x1)
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s
	# LIMIT-NEXT: >>> {{.*}}:(.text+0x6)			# LIMIT-NEXT: >>> {{.*}}:(.text+0x6)
	# LIMIT-NEXT: >>> referenced by undef-multi.s			# LIMIT-NEXT: >>> referenced by undef-multi.s
	Show All 20 Lines

lld/test/ELF/undef.s

	# REQUIRES: x86			# REQUIRES: x86
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-debug.s -o %t3.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-debug.s -o %t3.o
	# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-bad-debug.s -o %t4.o			# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-bad-debug.s -o %t4.o
	# RUN: rm -f %t2.a			# RUN: rm -f %t2.a
	# RUN: llvm-ar rc %t2.a %t2.o			# RUN: llvm-ar rc %t2.a %t2.o
	# RUN: not ld.lld %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \			# RUN: not ld.lld --threads=1 %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \
	# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"			# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"
	# RUN: not ld.lld -pie %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \			# RUN: not ld.lld --threads=1 -pie %t.o %t2.a %t3.o %t4.o -o /dev/null 2>&1 \
	# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"			# RUN: | FileCheck %s --implicit-check-not="error:" --implicit-check-not="warning:"

	# CHECK: error: undefined symbol: foo			# CHECK: error: undefined symbol: foo
	# CHECK-NEXT: >>> referenced by undef.s			# CHECK-NEXT: >>> referenced by undef.s
	# CHECK-NEXT: {{.*}}:(.text+0x1)			# CHECK-NEXT: {{.*}}:(.text+0x1)

	# CHECK: error: undefined symbol: bar			# CHECK: error: undefined symbol: bar
	# CHECK-NEXT: >>> referenced by undef.s			# CHECK-NEXT: >>> referenced by undef.s
	▲ Show 20 Lines • Show All 83 Lines • Show Last 20 Lines

llvm/include/llvm/Support/Parallel.h

	Show All 22 Lines
	namespace llvm {			namespace llvm {

	namespace parallel {			namespace parallel {

	// Strategy for the default executor used by the parallel routines provided by			// Strategy for the default executor used by the parallel routines provided by
	// this file. It defaults to using all hardware threads and should be			// this file. It defaults to using all hardware threads and should be
	// initialized before the first use of parallel routines.			// initialized before the first use of parallel routines.
	extern ThreadPoolStrategy strategy;			extern ThreadPoolStrategy strategy;
				extern thread_local unsigned threadIndex;

	namespace detail {			namespace detail {
	class Latch {			class Latch {
	uint32_t Count;			uint32_t Count;
	mutable std::mutex Mutex;			mutable std::mutex Mutex;
	mutable std::condition_variable Cond;			mutable std::condition_variable Cond;

	public:			public:
	▲ Show 20 Lines • Show All 239 Lines • Show Last 20 Lines

llvm/lib/Support/Parallel.cpp

Show All 12 Lines

#include <atomic>		#include <atomic>
#include <future>		#include <future>
#include <stack>		#include <stack>
#include <thread>		#include <thread>
#include <vector>		#include <vector>

llvm::ThreadPoolStrategy llvm::parallel::strategy;		llvm::ThreadPoolStrategy llvm::parallel::strategy;
		thread_local unsigned llvm::parallel::threadIndex;
		andrewng: Perhaps `int` -> `unsigned`?

namespace llvm {		namespace llvm {
namespace parallel {		namespace parallel {
#if LLVM_ENABLE_THREADS		#if LLVM_ENABLE_THREADS
namespace detail {		namespace detail {

namespace {		namespace {

Show All 15 Lines	explicit ThreadPoolExecutor(ThreadPoolStrategy S = hardware_concurrency()) {
// Spawn all but one of the threads in another thread as spawning threads		// Spawn all but one of the threads in another thread as spawning threads
// can take a while.		// can take a while.
Threads.reserve(ThreadCount);		Threads.reserve(ThreadCount);
Threads.resize(1);		Threads.resize(1);
std::lock_guard<std::mutex> Lock(Mutex);		std::lock_guard<std::mutex> Lock(Mutex);
Threads[0] = std::thread([this, ThreadCount, S] {		Threads[0] = std::thread([this, ThreadCount, S] {
for (unsigned I = 1; I < ThreadCount; ++I) {		for (unsigned I = 1; I < ThreadCount; ++I) {
Threads.emplace_back([=] { work(S, I); });		Threads.emplace_back([=] { work(S, I); });
if (Stop)		if (Stop)
		andrewng: Perhaps move this initialisation of `threadIndex` and the one below into `work()`?
break;		break;
}		}
ThreadsCreated.set_value();		ThreadsCreated.set_value();
work(S, 0);		work(S, 0);
});		});
}		}

void stop() {		void stop() {
Show All 29 Lines	void add(std::function<void()> F) override {
std::lock_guard<std::mutex> Lock(Mutex);		std::lock_guard<std::mutex> Lock(Mutex);
WorkStack.push(std::move(F));		WorkStack.push(std::move(F));
}		}
Cond.notify_one();		Cond.notify_one();
}		}

private:		private:
void work(ThreadPoolStrategy S, unsigned ThreadID) {		void work(ThreadPoolStrategy S, unsigned ThreadID) {
		threadIndex = ThreadID;
S.apply_thread_strategy(ThreadID);		S.apply_thread_strategy(ThreadID);
while (true) {		while (true) {
std::unique_lock<std::mutex> Lock(Mutex);		std::unique_lock<std::mutex> Lock(Mutex);
Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });		Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
if (Stop)		if (Stop)
break;		break;
auto Task = std::move(WorkStack.top());		auto Task = std::move(WorkStack.top());
WorkStack.pop();		WorkStack.pop();
▲ Show 20 Lines • Show All 110 Lines • Show Last 20 Lines
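
The `threadIndex` assignment at the top of `work()` can be illustrated with a small standalone sketch: a `thread_local` variable is set once when a worker starts, and any code that later runs on that thread can read it to pick a private slot. This is only an illustration with plain `std::thread`, not the LLVM executor. The global `threadIndex` here is a local analogue of `llvm::parallel::threadIndex`, and `work`, `runWorkers`, and `seen` are hypothetical names.

```cpp
#include <thread>
#include <vector>

// Local analogue of llvm::parallel::threadIndex: each thread has its own copy.
thread_local unsigned threadIndex = 0;

// Hypothetical worker entry point: record the ID once on entry, as work()
// does in the diff, then use it to select a slot private to this thread.
inline void work(unsigned threadID, std::vector<unsigned> &seen) {
  threadIndex = threadID;
  // Code running later on this thread can read threadIndex without having
  // the ID passed down through every call.
  seen[threadIndex] = threadIndex + 1;
}

// Starts threadCount workers, waits for them, and returns what each stored.
inline std::vector<unsigned> runWorkers(unsigned threadCount) {
  std::vector<unsigned> seen(threadCount, 0);
  std::vector<std::thread> threads;
  for (unsigned i = 0; i < threadCount; ++i)
    threads.emplace_back([i, &seen] { work(i, seen); });
  for (auto &t : threads)
    t.join();
  return seen;
}
```

Each worker writes a distinct element of `seen`, so the writes do not race, and the calling thread's own `threadIndex` is unaffected by the assignments made inside the workers.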

Revision Contents

Diff 459538