This is an archive of the discontinued LLVM Phabricator instance.

Differential D103977

[lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection
ClosedPublic

Authored by int3 on Jun 9 2021, 10:25 AM.

Details

Reviewers

gkm
thakis

Group Reviewers

Restricted Project

Commits

rG7f2ba39b1688: [lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection

Summary

These fields currently live in the parent InputSection class,
but they should be specific to ConcatInputSection, since the other
InputSection classes (that contain literals) aren't atomically live or
dead -- rather their component string/int literals should have
individual liveness states. (An upcoming diff will add liveness bits for
StringPieces and fixed-sized literals.)
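
A minimal sketch of the split described above, using simplified stand-ins rather than the real lld classes. `WordLiteralSketch`, its `markLive` method, and its per-literal bit vector are hypothetical; in this diff the literal sections still return `true` unconditionally, and the per-literal bits are left to the upcoming change.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the base class; only the liveness query is modeled.
struct InputSection {
  virtual ~InputSection() = default;
  // Whether the data at offset `off` within this section is live.
  virtual bool isLive(uint64_t off) const = 0;
};

// Concatenated sections are live or dead as a whole, so the offset is ignored.
struct ConcatInputSection final : InputSection {
  bool live = false;
  bool isLive(uint64_t /*off*/) const override { return live; }
};

// Hypothetical literal section with one liveness bit per fixed-size literal.
struct WordLiteralSketch final : InputSection {
  WordLiteralSketch(uint64_t literalSize, std::size_t count)
      : literalSize(literalSize), liveBits(count, false) {}

  // The offset selects which literal's bit to consult.
  bool isLive(uint64_t off) const override {
    return liveBits.at(off / literalSize);
  }
  void markLive(uint64_t off) { liveBits.at(off / literalSize) = true; }

  uint64_t literalSize;
  std::vector<bool> liveBits;
};
```

Callers that only hold an `InputSection *` can then ask `isLive(off)` without knowing which kind of section they have.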

I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp.
We now avoid putting coalesced sections in the inputSections vector,
so we don't have to check/assert against it everywhere.
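
The filtering step can be sketched as follows. This is an illustration only: `SectionSketch` and `gatherInputSections` are hypothetical names, and the real code walks each file's subsection maps and applies the check only to ConcatInputSections.

```cpp
#include <cstdint>
#include <vector>

struct SectionSketch {
  bool wasCoalesced = false; // folded into another copy of the same weak def
  uint32_t numRefs = 0;      // how many symbols still refer to this section
  bool isCoalescedWeak() const { return wasCoalesced && numRefs == 0; }
};

// Keep only the sections that can still reach the output, so later passes
// never see a coalesced section and need no per-use assert.
inline std::vector<SectionSketch *>
gatherInputSections(std::vector<SectionSketch> &all) {
  std::vector<SectionSketch *> result;
  for (SectionSketch &s : all)
    if (!s.isCoalescedWeak())
      result.push_back(&s);
  return result;
}
```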

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

int3 created this revision. Jun 9 2021, 10:25 AM

Herald added a reviewer: gkm. Jun 9 2021, 10:25 AM

Herald added a project: Restricted Project.

Herald added a subscriber: jfb.

int3 requested review of this revision. Jun 9 2021, 10:25 AM

Herald added a project: Restricted Project. Jun 9 2021, 10:25 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B108450: Diff 350942. Jun 9 2021, 10:26 AM

thakis added a subscriber: thakis. Jun 10 2021, 5:37 AM

thakis added inline comments.

lld/MachO/MarkLive.cpp
165	Is that correct? Imagine a S_ATTR_LIVE_SUPPORT symbol pointing to a live literal.

thakis added a child revision: D103979: [lld-macho] Have dead-stripping work with literal sections. Jun 10 2021, 5:49 AM

This change as-is is fine.

But overall, it feels like things get a lot more complicated because we're not creating real InputSections for each literal in literal sections. Are there so many more literals than normal symboled inputsections? What's the memory / perf hit from just having normal InputSections for each literal?

lld/MachO/MarkLive.cpp
165	Never mind, this is the referring section, not the referent.

This revision is now accepted and ready to land. Jun 10 2021, 5:54 AM

(Also FYI if you use the "Edit Related Revisions…" button in the upper right on phab, the presubmit bots can correctly handle dependent changes)

But overall, it feels like things get a lot more complicated because we're not creating real InputSections for each literal in literal sections.

I agree, it's quite unfortunate...

Are there so many more literals than normal symboled inputsections? What's the memory / perf hit from just having normal InputSections for each literal?

Good questions! For chromium_framework, here are the relative counts of literals and subsections before deduplication (generated via D104158):

word literals: 145224 (8%)
cstring literals: 353031 (20%)
subsections: 1260720 (72%)

So actual subsections are still by far the largest, but cstrings take up a sizable chunk regardless. I also hacked up a diff that creates one InputSection per string, and even without doing dedup, it's already slower: D104159

The remaining question is... could we trim InputSection's size and close this performance gap? There are definitely opportunities here (e.g. we could replace name, segname, and flags with a pointer to the original section_64 header.) But there's still quite a number of other fields that I doubt can be removed. So while this is not a watertight analysis, I think it's enough to make a case for landing this architectural change.
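
To make the size-trimming idea concrete, here is a rough sketch of replacing the copied name, segname, and flags fields with a single pointer to the section header. All three structs are hypothetical: `SectionHeaderSketch` models only a few fields of a Mach-O `section_64` header, and neither `FatSection` nor `LeanSection` matches the real InputSection layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the on-disk header; the real one has more fields.
struct SectionHeaderSketch {
  char sectname[16]; // not necessarily NUL-terminated when all 16 bytes are used
  char segname[16];
  uint32_t flags;
};

// Per-section copies of the name, segment name, and flags.
struct FatSection {
  std::string_view name;
  std::string_view segname;
  uint32_t flags = 0;
  uint32_t align = 1;
};

// One pointer back to the shared header instead of three copied fields.
struct LeanSection {
  const SectionHeaderSketch *header = nullptr;
  uint32_t align = 1;

  std::string_view name() const {
    std::size_t len = 0;
    while (len < sizeof(header->sectname) && header->sectname[len] != '\0')
      ++len;
    return std::string_view(header->sectname, len);
  }
  uint32_t flags() const { return header->flags; }
};

static_assert(sizeof(LeanSection) < sizeof(FatSection),
              "the header pointer should be smaller than the copied fields");
```

The saving comes at the cost of an extra indirection on every name or flags lookup, and it does nothing for the other fields mentioned above.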

This revision was landed with ongoing or failed builds. Jun 11 2021, 4:50 PM

Closed by commit rG7f2ba39b1688: [lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection (authored by int3).

This revision was automatically updated to reflect the committed changes.

int3 added a commit: rG7f2ba39b1688: [lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection.

One of your 4 commits broke the build: http://45.33.8.238/linux/48738/step_4.txt

Please take a look and revert for now if it takes a while to fix.

And consider spreading out landing commits so that it's easier to see which one breaks things :)

We don't have internal buildbots that notify us of upstream breakages this quickly, so I usually depend on the public LLVM buildbots, which also typically take an hour+ to get to my diffs. Hence the batching.

Anyway, this one looks like my bad for not having tested with a debug/RelWithAsserts build... running one locally now

int3 mentioned this in rG5de7467e9821: [lld-macho] Fix debug build. Jun 11 2021, 5:21 PM

I ended up removing a lot of the asserts since sprinkling casts to ConcatInputSection seemed pretty awkward. I think you added most of these asserts, so lmk if you think we should put them back

You can look at http://45.33.8.238/ , it's public and the Linux bot on it cycles in a little over 3 minutes :)

The asserts are kind of load bearing in that if they fire we'll write invalid output, so I think it'd be nice to keep them. Maybe the cast can go in a helper or something?
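
One possible shape for such a helper, as a sketch with simplified stand-in classes. `isNotCoalescedWeak` and `checkedReferent` are hypothetical names, not functions from lld, and the real code would use LLVM's `dyn_cast` rather than a manual kind check.

```cpp
#include <cassert>
#include <cstdint>

struct InputSection {
  enum Kind { ConcatKind, CStringLiteralKind, WordLiteralKind };
  explicit InputSection(Kind k) : sectionKind(k) {}
  virtual ~InputSection() = default;
  Kind kind() const { return sectionKind; }

private:
  Kind sectionKind;
};

struct ConcatInputSection : InputSection {
  ConcatInputSection() : InputSection(ConcatKind) {}
  bool isCoalescedWeak() const { return wasCoalesced && numRefs == 0; }
  bool wasCoalesced = false;
  uint32_t numRefs = 0;
};

struct CStringInputSection : InputSection {
  CStringInputSection() : InputSection(CStringLiteralKind) {}
};

// Hides the cast in one place. Literal sections are never coalesced as a
// whole, so they always pass the check.
inline bool isNotCoalescedWeak(const InputSection *isec) {
  if (isec->kind() != InputSection::ConcatKind)
    return true;
  return !static_cast<const ConcatInputSection *>(isec)->isCoalescedWeak();
}

// Call sites keep a one-line assert instead of repeating the cast.
inline InputSection *checkedReferent(InputSection *isec) {
  assert(isNotCoalescedWeak(isec) && "referent was coalesced away");
  return isec;
}
```

A call site could then read `enqueue(checkedReferent(referentIsec));`.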

Thanks for collecting this data!

The ELF port's -fdata-sections behavior means it has one actual section per string, right?

Since most sections are "normal" subsection sections, we kind of need to optimize sections anyways. But we can do that first and then reconsider if we need the special cases for literals.

Looks like this change and its follow-up 5de7467e9821485f492eb97fafd796e1db4c6bb5 cause a test failure (lld::alignment-too-large.yaml) on the ppc buildbot: https://lab.llvm.org/buildbot/#/builders/36/builds/9356.

Please take a look.

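Revision Contents

Path	Size
lld/MachO/Driver.cpp	11 lines
lld/MachO/InputSection.h	33 lines
lld/MachO/MarkLive.cpp	47 lines
lld/MachO/SymbolTable.cpp	4 lines
lld/MachO/Symbols.h	4 lines
lld/MachO/Symbols.cpp	2 lines
lld/MachO/UnwindInfoSection.h	2 lines
lld/MachO/UnwindInfoSection.cpp	12 lines
lld/MachO/Writer.cpp	15 lines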
Diff 351586

lld/MachO/Driver.cpp

Show First 20 Lines • Show All 1,287 Lines • ▼ Show 20 Lines	for (const Arg *arg : args.filtered(OPT_sectcreate)) {
if (buffer)		if (buffer)
inputFiles.insert(make<OpaqueFile>(*buffer, segName, sectName));		inputFiles.insert(make<OpaqueFile>(*buffer, segName, sectName));
}		}

{		{
TimeTraceScope timeScope("Gathering input sections");		TimeTraceScope timeScope("Gathering input sections");
// Gather all InputSections into one vector.		// Gather all InputSections into one vector.
for (const InputFile *file : inputFiles) {		for (const InputFile *file : inputFiles) {
for (const SubsectionMap &map : file->subsections)		for (const SubsectionMap &map : file->subsections) {
for (const SubsectionEntry &subsectionEntry : map)		for (const SubsectionEntry &entry : map) {
inputSections.push_back(subsectionEntry.isec);		if (auto concatIsec = dyn_cast<ConcatInputSection>(entry.isec))
		if (concatIsec->isCoalescedWeak())
		continue;
		inputSections.push_back(entry.isec);
		}
		}
}		}
}		}

if (config->deadStrip)		if (config->deadStrip)
markLive();		markLive();

// Write to an output file.		// Write to an output file.
if (target->wordSize == 8)		if (target->wordSize == 8)
Show All 23 Lines

lld/MachO/InputSection.h

Show All 35 Lines	public:
virtual uint64_t getSize() const { return data.size(); }		virtual uint64_t getSize() const { return data.size(); }
uint64_t getFileSize() const;		uint64_t getFileSize() const;
// Translates \p off -- an offset relative to this InputSection -- into an		// Translates \p off -- an offset relative to this InputSection -- into an
// offset from the beginning of its parent OutputSection.		// offset from the beginning of its parent OutputSection.
virtual uint64_t getOffset(uint64_t off) const = 0;		virtual uint64_t getOffset(uint64_t off) const = 0;
// The offset from the beginning of the file.		// The offset from the beginning of the file.
virtual uint64_t getFileOffset(uint64_t off) const = 0;		virtual uint64_t getFileOffset(uint64_t off) const = 0;
uint64_t getVA(uint64_t off) const;		uint64_t getVA(uint64_t off) const;
		// Whether the data at \p off in this InputSection is live.
		virtual bool isLive(uint64_t off) const = 0;

void writeTo(uint8_t *buf);		void writeTo(uint8_t *buf);

InputFile *file = nullptr;		InputFile *file = nullptr;
StringRef name;		StringRef name;
StringRef segname;		StringRef segname;

OutputSection *parent = nullptr;		OutputSection *parent = nullptr;

uint32_t align = 1;		uint32_t align = 1;
uint32_t flags = 0;		uint32_t flags = 0;
uint32_t callSiteCount = 0;		uint32_t callSiteCount = 0;
bool isFinal = false; // is address assigned?		bool isFinal = false; // is address assigned?

// How many symbols refer to this InputSection.
uint32_t numRefs = 0;

// With subsections_via_symbols, most symbols have their own InputSection,
// and for weak symbols (e.g. from inline functions), only the
// InputSection from one translation unit will make it to the output,
// while all copies in other translation units are coalesced into the
// first and not copied to the output.
bool wasCoalesced = false;

bool isCoalescedWeak() const { return wasCoalesced && numRefs == 0; }
bool shouldOmitFromOutput() const { return !live || isCoalescedWeak(); }

bool live = !config->deadStrip;

ArrayRef<uint8_t> data;		ArrayRef<uint8_t> data;
std::vector<Reloc> relocs;		std::vector<Reloc> relocs;

protected:		protected:
explicit InputSection(Kind kind) : sectionKind(kind) {}		explicit InputSection(Kind kind) : sectionKind(kind) {}

private:		private:
Kind sectionKind;		Kind sectionKind;
};		};

// ConcatInputSections are combined into (Concat)OutputSections through simple		// ConcatInputSections are combined into (Concat)OutputSections through simple
// concatentation, in contrast with literal sections which may have their		// concatentation, in contrast with literal sections which may have their
// contents merged before output.		// contents merged before output.
class ConcatInputSection : public InputSection {		class ConcatInputSection : public InputSection {
public:		public:
ConcatInputSection() : InputSection(ConcatKind) {}		ConcatInputSection() : InputSection(ConcatKind) {}
uint64_t getFileOffset(uint64_t off) const override;		uint64_t getFileOffset(uint64_t off) const override;
uint64_t getOffset(uint64_t off) const override { return outSecOff + off; }		uint64_t getOffset(uint64_t off) const override { return outSecOff + off; }
uint64_t getVA() const { return InputSection::getVA(0); }		uint64_t getVA() const { return InputSection::getVA(0); }
		// ConcatInputSections are entirely live or dead, so the offset is irrelevant.
		bool isLive(uint64_t off) const override { return live; }
		bool isCoalescedWeak() const { return wasCoalesced && numRefs == 0; }
		bool shouldOmitFromOutput() const { return !live || isCoalescedWeak(); }

static bool classof(const InputSection *isec) {		static bool classof(const InputSection *isec) {
return isec->kind() == ConcatKind;		return isec->kind() == ConcatKind;
}		}

		// With subsections_via_symbols, most symbols have their own InputSection,
		// and for weak symbols (e.g. from inline functions), only the
		// InputSection from one translation unit will make it to the output,
		// while all copies in other translation units are coalesced into the
		// first and not copied to the output.
		bool wasCoalesced = false;
		bool live = !config->deadStrip;
		// How many symbols refer to this InputSection.
		uint32_t numRefs = 0;
uint64_t outSecOff = 0;		uint64_t outSecOff = 0;
uint64_t outSecFileOff = 0;		uint64_t outSecFileOff = 0;
};		};

// We allocate a lot of these and binary search on them, so they should be as		// We allocate a lot of these and binary search on them, so they should be as
// compact as possible. Hence the use of 32 rather than 64 bits for the hash.		// compact as possible. Hence the use of 32 rather than 64 bits for the hash.
struct StringPiece {		struct StringPiece {
// Offset from the start of the containing input section.		// Offset from the start of the containing input section.
Show All 15 Lines
// ld64 is more conservative and does not do that. This was mostly done for		// ld64 is more conservative and does not do that. This was mostly done for
// implementation simplicity; if we find programs that need the more		// implementation simplicity; if we find programs that need the more
// conservative behavior we can certainly implement that.		// conservative behavior we can certainly implement that.
class CStringInputSection : public InputSection {		class CStringInputSection : public InputSection {
public:		public:
CStringInputSection() : InputSection(CStringLiteralKind) {}		CStringInputSection() : InputSection(CStringLiteralKind) {}
uint64_t getFileOffset(uint64_t off) const override;		uint64_t getFileOffset(uint64_t off) const override;
uint64_t getOffset(uint64_t off) const override;		uint64_t getOffset(uint64_t off) const override;
		// FIXME implement this
		bool isLive(uint64_t off) const override { return true; }
// Find the StringPiece that contains this offset.		// Find the StringPiece that contains this offset.
const StringPiece &getStringPiece(uint64_t off) const;		const StringPiece &getStringPiece(uint64_t off) const;
// Split at each null byte.		// Split at each null byte.
void splitIntoPieces();		void splitIntoPieces();

// Returns i'th piece as a CachedHashStringRef. This function is very hot when		// Returns i'th piece as a CachedHashStringRef. This function is very hot when
// string merging is enabled, so we want to inline.		// string merging is enabled, so we want to inline.
LLVM_ATTRIBUTE_ALWAYS_INLINE		LLVM_ATTRIBUTE_ALWAYS_INLINE
Show All 11 Lines	public:
std::vector<StringPiece> pieces;		std::vector<StringPiece> pieces;
};		};

class WordLiteralInputSection : public InputSection {		class WordLiteralInputSection : public InputSection {
public:		public:
WordLiteralInputSection() : InputSection(WordLiteralKind) {}		WordLiteralInputSection() : InputSection(WordLiteralKind) {}
uint64_t getFileOffset(uint64_t off) const override;		uint64_t getFileOffset(uint64_t off) const override;
uint64_t getOffset(uint64_t off) const override;		uint64_t getOffset(uint64_t off) const override;
		// FIXME implement this
		bool isLive(uint64_t off) const override { return true; }

static bool classof(const InputSection *isec) {		static bool classof(const InputSection *isec) {
return isec->kind() == WordLiteralKind;		return isec->kind() == WordLiteralKind;
}		}
};		};

inline uint8_t sectionType(uint32_t flags) {		inline uint8_t sectionType(uint32_t flags) {
return flags & llvm::MachO::SECTION_TYPE;		return flags & llvm::MachO::SECTION_TYPE;
▲ Show 20 Lines • Show All 92 Lines • Show Last 20 Lines

lld/MachO/MarkLive.cpp

Show All 24 Lines
// InputSections will be ignored by Writer, so they will be excluded		// InputSections will be ignored by Writer, so they will be excluded
// from the final output.		// from the final output.
void markLive() {		void markLive() {
TimeTraceScope timeScope("markLive");		TimeTraceScope timeScope("markLive");

// We build up a worklist of sections which have been marked as live. We only		// We build up a worklist of sections which have been marked as live. We only
// push into the worklist when we discover an unmarked section, and we mark		// push into the worklist when we discover an unmarked section, and we mark
// as we push, so sections never appear twice in the list.		// as we push, so sections never appear twice in the list.
SmallVector<InputSection *, 256> worklist;		// Literal sections cannot contain references to other sections, so we only
		// store ConcatInputSections in our worklist.
auto enqueue = [&](InputSection *s) {		SmallVector<ConcatInputSection *, 256> worklist;

		auto enqueue = [&](InputSection *isec) {
		if (auto s = dyn_cast<ConcatInputSection>(isec)) {
		assert(!s->isCoalescedWeak());
if (s->live)		if (s->live)
return;		return;
s->live = true;		s->live = true;
worklist.push_back(s);		worklist.push_back(s);
		}
};		};

auto addSym = [&](Symbol *s) {		auto addSym = [&](Symbol *s) {
s->used = true;		s->used = true;
if (auto *d = dyn_cast<Defined>(s))		if (auto *d = dyn_cast<Defined>(s))
if (d->isec)		if (d->isec)
enqueue(d->isec);		enqueue(d->isec);
};		};
▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	for (InputSection *isec : inputSections) {
// __LD,__compact_unwind alive here.		// __LD,__compact_unwind alive here.
// But that section contains absolute references to __TEXT,__text and		// But that section contains absolute references to __TEXT,__text and
// keeps most code alive due to that. So we can't just enqueue() the		// keeps most code alive due to that. So we can't just enqueue() the
// section: We must skip the relocations for the functionAddress		// section: We must skip the relocations for the functionAddress
// in each CompactUnwindEntry.		// in each CompactUnwindEntry.
// See also scanEhFrameSection() in lld/ELF/MarkLive.cpp.		// See also scanEhFrameSection() in lld/ELF/MarkLive.cpp.
if (isec->segname == segment_names::ld &&		if (isec->segname == segment_names::ld &&
isec->name == section_names::compactUnwind) {		isec->name == section_names::compactUnwind) {
isec->live = true;		auto concatIsec = cast<ConcatInputSection>(isec);
		concatIsec->live = true;
const int compactUnwindEntrySize =		const int compactUnwindEntrySize =
target->wordSize == 8 ? sizeof(CompactUnwindEntry<uint64_t>)		target->wordSize == 8 ? sizeof(CompactUnwindEntry<uint64_t>)
: sizeof(CompactUnwindEntry<uint32_t>);		: sizeof(CompactUnwindEntry<uint32_t>);
for (const Reloc &r : isec->relocs) {		for (const Reloc &r : isec->relocs) {
// This is the relocation for the address of the function itself.		// This is the relocation for the address of the function itself.
// Ignore it, else these would keep everything alive.		// Ignore it, else these would keep everything alive.
if (r.offset % compactUnwindEntrySize == 0)		if (r.offset % compactUnwindEntrySize == 0)
continue;		continue;

if (auto *s = r.referent.dyn_cast<Symbol *>())		if (auto *s = r.referent.dyn_cast<Symbol *>())
addSym(s);		addSym(s);
else {		else
auto *referentIsec = r.referent.get<InputSection *>();		enqueue(r.referent.get<InputSection *>());
assert(!referentIsec->isCoalescedWeak());
enqueue(referentIsec);
}
}		}
continue;		continue;
}		}
}		}

do {		do {
// Mark things reachable from GC roots as live.		// Mark things reachable from GC roots as live.
while (!worklist.empty()) {		while (!worklist.empty()) {
InputSection *s = worklist.pop_back_val();		ConcatInputSection *s = worklist.pop_back_val();
assert(s->live && "We mark as live when pushing onto the worklist!");		assert(s->live && "We mark as live when pushing onto the worklist!");

// Mark all symbols listed in the relocation table for this section.		// Mark all symbols listed in the relocation table for this section.
for (const Reloc &r : s->relocs) {		for (const Reloc &r : s->relocs) {
if (auto *s = r.referent.dyn_cast<Symbol *>()) {		if (auto *s = r.referent.dyn_cast<Symbol *>())
addSym(s);		addSym(s);
} else {		else
auto *referentIsec = r.referent.get<InputSection *>();		enqueue(r.referent.get<InputSection *>());
assert(!referentIsec->isCoalescedWeak());
enqueue(referentIsec);
}
}		}
}		}

// S_ATTR_LIVE_SUPPORT sections are live if they point _to_ a live section.		// S_ATTR_LIVE_SUPPORT sections are live if they point _to_ a live section.
// Process them in a second pass.		// Process them in a second pass.
for (InputSection *isec : inputSections) {		for (InputSection *isec : inputSections) {
		if (!isa<ConcatInputSection>(isec))
		continue;
		auto concatIsec = cast<ConcatInputSection>(isec);
// FIXME: Check if copying all S_ATTR_LIVE_SUPPORT sections into a		// FIXME: Check if copying all S_ATTR_LIVE_SUPPORT sections into a
// separate vector and only walking that here is faster.		// separate vector and only walking that here is faster.
if (!(isec->flags & S_ATTR_LIVE_SUPPORT) || isec->live)		if (!(concatIsec->flags & S_ATTR_LIVE_SUPPORT) || concatIsec->live)
continue;		continue;

for (const Reloc &r : isec->relocs) {		for (const Reloc &r : isec->relocs) {
bool referentLive;		bool referentLive;
if (auto *s = r.referent.dyn_cast<Symbol *>())		if (auto *s = r.referent.dyn_cast<Symbol *>())
referentLive = s->isLive();		referentLive = s->isLive();
else		else
referentLive = r.referent.get<InputSection *>()->live;		referentLive = r.referent.get<InputSection *>()->isLive(r.addend);
if (referentLive)		if (referentLive)
enqueue(isec);		enqueue(isec);
}		}
}		}

// S_ATTR_LIVE_SUPPORT could have marked additional sections live,		// S_ATTR_LIVE_SUPPORT could have marked additional sections live,
// which in turn could mark additional S_ATTR_LIVE_SUPPORT sections live.		// which in turn could mark additional S_ATTR_LIVE_SUPPORT sections live.
// Iterate. In practice, the second iteration won't mark additional		// Iterate. In practice, the second iteration won't mark additional
// S_ATTR_LIVE_SUPPORT sections live.		// S_ATTR_LIVE_SUPPORT sections live.
} while (!worklist.empty());		} while (!worklist.empty());
}		}

} // namespace macho		} // namespace macho
} // namespace lld		} // namespace lld
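
The marking scheme in this file can be condensed into a small standalone sketch. It is an illustration under simplifying assumptions, not the lld implementation: `Section`, `liveSupport`, and `markLiveSketch` are hypothetical, plain pointers stand in for relocation referents, and symbols, compact unwind handling, and literal sections are left out.

```cpp
#include <vector>

// Hypothetical simplified section: edges stand in for relocation referents.
struct Section {
  bool live = false;
  bool liveSupport = false; // models S_ATTR_LIVE_SUPPORT
  std::vector<Section *> referents;
};

inline void markLiveSketch(const std::vector<Section *> &roots,
                           const std::vector<Section *> &all) {
  std::vector<Section *> worklist;
  // Mark on push so that a section never enters the worklist twice.
  auto enqueue = [&](Section *s) {
    if (s->live)
      return;
    s->live = true;
    worklist.push_back(s);
  };

  for (Section *r : roots)
    enqueue(r);

  do {
    // Pass 1: everything reachable from the worklist is live.
    while (!worklist.empty()) {
      Section *s = worklist.back();
      worklist.pop_back();
      for (Section *ref : s->referents)
        enqueue(ref);
    }
    // Pass 2: a live-support section becomes live if it points to a live
    // section; anything it enqueues triggers another round of pass 1.
    for (Section *s : all) {
      if (!s->liveSupport || s->live)
        continue;
      for (Section *ref : s->referents)
        if (ref->live) {
          enqueue(s);
          break;
        }
    }
  } while (!worklist.empty());
}
```

Note the direction of the second pass: it inspects the referring section's flag and the referent's liveness, which is the distinction clarified in the inline comments on line 165.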

lld/MachO/SymbolTable.cpp

Show First 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	if (auto *defined = dyn_cast<Defined>(s)) {
defined->referencedDynamically |= isReferencedDynamically;		defined->referencedDynamically |= isReferencedDynamically;
defined->noDeadStrip |= noDeadStrip;		defined->noDeadStrip |= noDeadStrip;

// FIXME: Handle this for bitcode files.		// FIXME: Handle this for bitcode files.
// FIXME: We currently only do this if both symbols are weak.		// FIXME: We currently only do this if both symbols are weak.
// We could do this if either is weak (but getting the		// We could do this if either is weak (but getting the
// case where !isWeakDef && defined->isWeakDef() right		// case where !isWeakDef && defined->isWeakDef() right
// requires some care and testing).		// requires some care and testing).
if (isec)		if (auto concatIsec = dyn_cast_or_null<ConcatInputSection>(isec))
isec->wasCoalesced = true;		concatIsec->wasCoalesced = true;
}		}

return defined;		return defined;
}		}
if (!defined->isWeakDef())		if (!defined->isWeakDef())
error("duplicate symbol: " + name + "\n>>> defined in " +		error("duplicate symbol: " + name + "\n>>> defined in " +
toString(defined->getFile()) + "\n>>> defined in " +		toString(defined->getFile()) + "\n>>> defined in " +
toString(file));		toString(file));
▲ Show 20 Lines • Show All 141 Lines • Show Last 20 Lines

lld/MachO/Symbols.h

Show First 20 Lines • Show All 118 Lines • ▼ Show 20 Lines	public:
Defined(StringRefZ name, InputFile *file, InputSection *isec, uint64_t value,		Defined(StringRefZ name, InputFile *file, InputSection *isec, uint64_t value,
uint64_t size, bool isWeakDef, bool isExternal, bool isPrivateExtern,		uint64_t size, bool isWeakDef, bool isExternal, bool isPrivateExtern,
bool isThumb, bool isReferencedDynamically, bool noDeadStrip)		bool isThumb, bool isReferencedDynamically, bool noDeadStrip)
: Symbol(DefinedKind, name, file), isec(isec), value(value), size(size),		: Symbol(DefinedKind, name, file), isec(isec), value(value), size(size),
overridesWeakDef(false), privateExtern(isPrivateExtern),		overridesWeakDef(false), privateExtern(isPrivateExtern),
includeInSymtab(true), thumb(isThumb),		includeInSymtab(true), thumb(isThumb),
referencedDynamically(isReferencedDynamically),		referencedDynamically(isReferencedDynamically),
noDeadStrip(noDeadStrip), weakDef(isWeakDef), external(isExternal) {		noDeadStrip(noDeadStrip), weakDef(isWeakDef), external(isExternal) {
if (isec)		if (auto concatIsec = dyn_cast_or_null<ConcatInputSection>(isec))
isec->numRefs++;		concatIsec->numRefs++;
}		}

bool isWeakDef() const override { return weakDef; }		bool isWeakDef() const override { return weakDef; }
bool isExternalWeakDef() const {		bool isExternalWeakDef() const {
return isWeakDef() && isExternal() && !privateExtern;		return isWeakDef() && isExternal() && !privateExtern;
}		}
bool isTlv() const override {		bool isTlv() const override {
return !isAbsolute() && isThreadLocalVariables(isec->flags);		return !isAbsolute() && isThreadLocalVariables(isec->flags);
▲ Show 20 Lines • Show All 190 Lines • Show Last 20 Lines

lld/MachO/Symbols.cpp

Show All 34 Lines	bool Symbol::isLive() const {
if (isa<DylibSymbol>(this) || isa<Undefined>(this))		if (isa<DylibSymbol>(this) || isa<Undefined>(this))
return used;		return used;

if (auto *d = dyn_cast<Defined>(this)) {		if (auto *d = dyn_cast<Defined>(this)) {
// Non-absolute symbols might be alive because their section is		// Non-absolute symbols might be alive because their section is
// no_dead_strip or live_support. In that case, the section will know		// no_dead_strip or live_support. In that case, the section will know
// that it's live but `used` might be false. Non-absolute symbols always		// that it's live but `used` might be false. Non-absolute symbols always
// have to use the section's `live` bit as source of truth.		// have to use the section's `live` bit as source of truth.
return d->isAbsolute() ? used : d->isec->live;		return d->isAbsolute() ? used : d->isec->isLive(d->value);
}		}

assert(!isa<CommonSymbol>(this) &&		assert(!isa<CommonSymbol>(this) &&
"replaceCommonSymbols() runs before dead code stripping, and isLive() "		"replaceCommonSymbols() runs before dead code stripping, and isLive() "
"should only be called after dead code stripping");		"should only be called after dead code stripping");

// Assume any other kind of symbol is live.		// Assume any other kind of symbol is live.
return true;		return true;
Show All 36 Lines

lld/MachO/UnwindInfoSection.h

Show All 26 Lines	template <class Ptr> struct CompactUnwindEntry {
Ptr personality;		Ptr personality;
Ptr lsda;		Ptr lsda;
};		};

class UnwindInfoSection : public SyntheticSection {		class UnwindInfoSection : public SyntheticSection {
public:		public:
bool isNeeded() const override { return compactUnwindSection != nullptr; }		bool isNeeded() const override { return compactUnwindSection != nullptr; }
uint64_t getSize() const override { return unwindInfoSize; }		uint64_t getSize() const override { return unwindInfoSize; }
virtual void prepareRelocations(InputSection *) = 0;		virtual void prepareRelocations(ConcatInputSection *) = 0;

void setCompactUnwindSection(ConcatOutputSection *cuSection) {		void setCompactUnwindSection(ConcatOutputSection *cuSection) {
compactUnwindSection = cuSection;		compactUnwindSection = cuSection;
}		}

protected:		protected:
UnwindInfoSection()		UnwindInfoSection()
: SyntheticSection(segment_names::text, section_names::unwindInfo) {		: SyntheticSection(segment_names::text, section_names::unwindInfo) {
Show All 13 Lines

lld/MachO/UnwindInfoSection.cpp

Show First 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	struct SecondLevelPage {
size_t entryCount;		size_t entryCount;
size_t byteCount;		size_t byteCount;
std::vector<compact_unwind_encoding_t> localEncodings;		std::vector<compact_unwind_encoding_t> localEncodings;
EncodingMap localEncodingIndexes;		EncodingMap localEncodingIndexes;
};		};

template <class Ptr> class UnwindInfoSectionImpl : public UnwindInfoSection {		template <class Ptr> class UnwindInfoSectionImpl : public UnwindInfoSection {
public:		public:
void prepareRelocations(InputSection *) override;		void prepareRelocations(ConcatInputSection *) override;
void finalize() override;		void finalize() override;
void writeTo(uint8_t *buf) const override;		void writeTo(uint8_t *buf) const override;

private:		private:
std::vector<std::pair<compact_unwind_encoding_t, size_t>> commonEncodings;		std::vector<std::pair<compact_unwind_encoding_t, size_t>> commonEncodings;
EncodingMap commonEncodingIndexes;		EncodingMap commonEncodingIndexes;
// Indices of personality functions within the GOT.		// Indices of personality functions within the GOT.
std::vector<uint32_t> personalities;		std::vector<uint32_t> personalities;
Show All 10 Lines
};		};

// Compact unwind relocations have different semantics, so we handle them in a		// Compact unwind relocations have different semantics, so we handle them in a
// separate code path from regular relocations. First, we do not wish to add		// separate code path from regular relocations. First, we do not wish to add
// rebase opcodes for __LD,__compact_unwind, because that section doesn't		// rebase opcodes for __LD,__compact_unwind, because that section doesn't
// actually end up in the final binary. Second, personality pointers always		// actually end up in the final binary. Second, personality pointers always
// reside in the GOT and must be treated specially.		// reside in the GOT and must be treated specially.
template <class Ptr>		template <class Ptr>
void UnwindInfoSectionImpl<Ptr>::prepareRelocations(InputSection *isec) {		void UnwindInfoSectionImpl<Ptr>::prepareRelocations(ConcatInputSection *isec) {
assert(isec->segname == segment_names::ld &&		assert(isec->segname == segment_names::ld &&
isec->name == section_names::compactUnwind);		isec->name == section_names::compactUnwind);
assert(!isec->shouldOmitFromOutput() &&		assert(!isec->shouldOmitFromOutput() &&
"__compact_unwind section should not be omitted");		"__compact_unwind section should not be omitted");

// FIXME: This could skip relocations for CompactUnwindEntries that		// FIXME: This could skip relocations for CompactUnwindEntries that
// point to dead-stripped functions. That might save some amount of		// point to dead-stripped functions. That might save some amount of
// work. But since there are usually just few personality functions		// work. But since there are usually just few personality functions
▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	for (Reloc &r : isec->relocs) {
}		}
}		}
}		}

// Unwind info lives in __DATA, and finalization of __TEXT will occur before		// Unwind info lives in __DATA, and finalization of __TEXT will occur before
// finalization of __DATA. Moreover, the finalization of unwind info depends on		// finalization of __DATA. Moreover, the finalization of unwind info depends on
// the exact addresses that it references. So it is safe for compact unwind to		// the exact addresses that it references. So it is safe for compact unwind to
// reference addresses in __TEXT, but not addresses in any other segment.		// reference addresses in __TEXT, but not addresses in any other segment.
static void checkTextSegment(InputSection *isec) {		static ConcatInputSection *checkTextSegment(InputSection *isec) {
if (isec->segname != segment_names::text)		if (isec->segname != segment_names::text)
error("compact unwind references address in " + toString(isec) +		error("compact unwind references address in " + toString(isec) +
" which is not in segment __TEXT");		" which is not in segment __TEXT");
		// __text should always be a ConcatInputSection.
		return cast<ConcatInputSection>(isec);
}		}

// We need to apply the relocations to the pre-link compact unwind section		// We need to apply the relocations to the pre-link compact unwind section
// before converting it to post-link form. There should only be absolute		// before converting it to post-link form. There should only be absolute
// relocations here: since we are not emitting the pre-link CU section, there		// relocations here: since we are not emitting the pre-link CU section, there
// is no source address to make a relative location meaningful.		// is no source address to make a relative location meaningful.
template <class Ptr>		template <class Ptr>
static void		static void
Show All 14 Lines	for (const Reloc &r : isec->relocs) {
if (auto *defined = dyn_cast<Defined>(referentSym))		if (auto *defined = dyn_cast<Defined>(referentSym))
checkTextSegment(defined->isec);		checkTextSegment(defined->isec);
// At this point in the link, we may not yet know the final address of		// At this point in the link, we may not yet know the final address of
// the GOT, so we just encode the index. We make it a 1-based index so		// the GOT, so we just encode the index. We make it a 1-based index so
// that we can distinguish the null pointer case.		// that we can distinguish the null pointer case.
referentVA = referentSym->gotIndex + 1;		referentVA = referentSym->gotIndex + 1;
}		}
} else if (auto referentIsec = r.referent.dyn_cast<InputSection *>()) {		} else if (auto referentIsec = r.referent.dyn_cast<InputSection *>()) {
checkTextSegment(referentIsec);		ConcatInputSection *concatIsec = checkTextSegment(referentIsec);
if (referentIsec->shouldOmitFromOutput())		if (concatIsec->shouldOmitFromOutput())
referentVA = UINT64_MAX; // Tombstone value		referentVA = UINT64_MAX; // Tombstone value
else		else
referentVA = referentIsec->getVA(r.addend);		referentVA = referentIsec->getVA(r.addend);
}		}

writeAddress(buf + r.offset, referentVA, r.length);		writeAddress(buf + r.offset, referentVA, r.length);
}		}
}		}
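The relocation code above stores a 1-based GOT index for personality pointers so that zero can still mean "no personality", and writes UINT64_MAX as a tombstone for sections omitted from the output. A minimal standalone sketch of that encoding follows; the helper names `encodeGotIndex`, `decodeGotIndex`, and `kTombstone` are hypothetical and are not part of lld.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant mirroring the "Tombstone value" used in the diff.
constexpr uint64_t kTombstone = UINT64_MAX;

// Encode a 0-based GOT slot index as a 1-based value, so that an encoded
// value of 0 remains available to represent a null personality pointer.
uint64_t encodeGotIndex(uint32_t gotIndex) {
  return static_cast<uint64_t>(gotIndex) + 1;
}

// Decode a 1-based value back to a 0-based GOT slot index.
// Returns false when the encoded value is 0 (no personality) or the
// tombstone; in that case gotIndex is left untouched.
bool decodeGotIndex(uint64_t encoded, uint32_t &gotIndex) {
  if (encoded == 0 || encoded == kTombstone)
    return false;
  gotIndex = static_cast<uint32_t>(encoded - 1);
  return true;
}
```

Widening to 64 bits before adding 1 avoids wraparound if the index were ever the maximum 32-bit value; whether lld needs that guard is an assumption of this sketch.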

lld/MachO/Writer.cpp

Show First 20 Lines • Show All 560 Lines • ▼ Show 20 Lines	if (relocAttrs.hasAttr(RelocAttrBits::BRANCH)) {
if (!(isThreadLocalVariables(isec->flags) && isa<Defined>(sym)))		if (!(isThreadLocalVariables(isec->flags) && isa<Defined>(sym)))
addNonLazyBindingEntries(sym, isec, r.offset, r.addend);		addNonLazyBindingEntries(sym, isec, r.offset, r.addend);
}		}
}		}

void Writer::scanRelocations() {		void Writer::scanRelocations() {
TimeTraceScope timeScope("Scan relocations");		TimeTraceScope timeScope("Scan relocations");
for (InputSection *isec : inputSections) {		for (InputSection *isec : inputSections) {
if (isec->shouldOmitFromOutput())		if (!isa<ConcatInputSection>(isec))
continue;		continue;
		auto concatIsec = cast<ConcatInputSection>(isec);

if (isec->segname == segment_names::ld) {		if (concatIsec->shouldOmitFromOutput())
in.unwindInfo->prepareRelocations(isec);		continue;

		if (concatIsec->segname == segment_names::ld) {
		in.unwindInfo->prepareRelocations(concatIsec);
continue;		continue;
}		}

for (auto it = isec->relocs.begin(); it != isec->relocs.end(); ++it) {		for (auto it = isec->relocs.begin(); it != isec->relocs.end(); ++it) {
Reloc &r = *it;		Reloc &r = *it;
if (target->hasAttr(r.type, RelocAttrBits::SUBTRAHEND)) {		if (target->hasAttr(r.type, RelocAttrBits::SUBTRAHEND)) {
// Skip over the following UNSIGNED relocation -- it's just there as the		// Skip over the following UNSIGNED relocation -- it's just there as the
// minuend, and doesn't have the usual UNSIGNED semantics. We don't want		// minuend, and doesn't have the usual UNSIGNED semantics. We don't want
// to emit rebase opcodes for it.		// to emit rebase opcodes for it.
it++;		it++;
continue;		continue;
}		}
if (auto sym = r.referent.dyn_cast<Symbol *>()) {		if (auto sym = r.referent.dyn_cast<Symbol *>()) {
if (auto *undefined = dyn_cast<Undefined>(sym))		if (auto *undefined = dyn_cast<Undefined>(sym))
treatUndefinedSymbol(*undefined);		treatUndefinedSymbol(*undefined);
// treatUndefinedSymbol() can replace sym with a DylibSymbol; re-check.		// treatUndefinedSymbol() can replace sym with a DylibSymbol; re-check.
if (!isa<Undefined>(sym) && validateSymbolRelocation(sym, isec, r))		if (!isa<Undefined>(sym) && validateSymbolRelocation(sym, isec, r))
prepareSymbolRelocation(sym, isec, r);		prepareSymbolRelocation(sym, isec, r);
} else {		} else {
assert(r.referent.is<InputSection *>());		assert(r.referent.is<InputSection *>());
assert(!r.referent.get<InputSection *>()->shouldOmitFromOutput());
if (!r.pcrel)		if (!r.pcrel)
in.rebase->addEntry(isec, r.offset);		in.rebase->addEntry(isec, r.offset);
}		}
}		}
}		}
}		}
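The loop in scanRelocations() skips the UNSIGNED relocation that follows a SUBTRAHEND, because that entry only serves as the minuend and should not produce a rebase opcode. The sketch below isolates that iteration pattern with a simplified, hypothetical `Reloc` struct (the real lld `Reloc` has different fields) and an added guard for a trailing SUBTRAHEND that the original code does not show.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified relocation record for illustration only.
struct Reloc {
  bool isSubtrahend; // stands in for hasAttr(r.type, RelocAttrBits::SUBTRAHEND)
  bool pcrel;
  uint64_t offset;
};

// Collect the offsets that would need a rebase entry: every non-pcrel
// relocation, except a SUBTRAHEND and the minuend that directly follows it.
std::vector<uint64_t> collectRebaseOffsets(const std::vector<Reloc> &relocs) {
  std::vector<uint64_t> offsets;
  for (auto it = relocs.begin(); it != relocs.end(); ++it) {
    if (it->isSubtrahend) {
      // Step onto the paired minuend; the loop increment then moves past it.
      if (++it == relocs.end())
        break; // malformed input: SUBTRAHEND without a following minuend
      continue;
    }
    if (!it->pcrel)
      offsets.push_back(it->offset);
  }
  return offsets;
}
```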

void Writer::scanSymbols() {		void Writer::scanSymbols() {
▲ Show 20 Lines • Show All 253 Lines • ▼ Show 20 Lines	template <class LP> void Writer::createOutputSections() {
default:		default:
llvm_unreachable("unhandled output file type");		llvm_unreachable("unhandled output file type");
}		}

// Then add input sections to output sections.		// Then add input sections to output sections.
DenseMap<NamePair, ConcatOutputSection *> concatOutputSections;		DenseMap<NamePair, ConcatOutputSection *> concatOutputSections;
for (const auto &p : enumerate(inputSections)) {		for (const auto &p : enumerate(inputSections)) {
InputSection *isec = p.value();		InputSection *isec = p.value();
if (isec->shouldOmitFromOutput())
continue;
OutputSection *osec;		OutputSection *osec;
if (auto *concatIsec = dyn_cast<ConcatInputSection>(isec)) {		if (auto *concatIsec = dyn_cast<ConcatInputSection>(isec)) {
		if (concatIsec->shouldOmitFromOutput())
		continue;
NamePair names = maybeRenameSection({isec->segname, isec->name});		NamePair names = maybeRenameSection({isec->segname, isec->name});
ConcatOutputSection *&concatOsec = concatOutputSections[names];		ConcatOutputSection *&concatOsec = concatOutputSections[names];
if (concatOsec == nullptr)		if (concatOsec == nullptr)
concatOsec = make<ConcatOutputSection>(names.second);		concatOsec = make<ConcatOutputSection>(names.second);
concatOsec->addInput(concatIsec);		concatOsec->addInput(concatIsec);
osec = concatOsec;		osec = concatOsec;
} else if (auto *cStringIsec = dyn_cast<CStringInputSection>(isec)) {		} else if (auto *cStringIsec = dyn_cast<CStringInputSection>(isec)) {
in.cStringSection->addInput(cStringIsec);		in.cStringSection->addInput(cStringIsec);
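The createOutputSections() change above moves the shouldOmitFromOutput() check inside the ConcatInputSection branch, since only that subclass now carries liveness state. A rough standalone sketch of the idea follows. The classes are pared-down stand-ins for the lld types, `asConcat` is a hypothetical substitute for `llvm::dyn_cast`, and `shouldOmitFromOutput()` here only looks at a `live` flag, which simplifies whatever the real method checks.

```cpp
#include <cstddef>
#include <vector>

// Pared-down stand-ins for the lld classes; not the real definitions.
struct InputSection {
  enum Kind { ConcatKind, CStringKind };
  explicit InputSection(Kind k) : kind(k) {}
  Kind kind;
};

// Liveness lives only on the concat subclass, as in the patch.
struct ConcatInputSection : InputSection {
  ConcatInputSection() : InputSection(ConcatKind) {}
  bool live = true;
  bool shouldOmitFromOutput() const { return !live; }
};

// Literal sections have no section-wide liveness bit in this sketch.
struct CStringInputSection : InputSection {
  CStringInputSection() : InputSection(CStringKind) {}
};

// Hypothetical substitute for llvm::dyn_cast<ConcatInputSection>.
ConcatInputSection *asConcat(InputSection *isec) {
  return isec->kind == InputSection::ConcatKind
             ? static_cast<ConcatInputSection *>(isec)
             : nullptr;
}

// Count the sections that would be added to an output section: dead concat
// sections are skipped, all other kinds are always kept.
std::size_t countEmitted(const std::vector<InputSection *> &sections) {
  std::size_t emitted = 0;
  for (InputSection *isec : sections) {
    if (ConcatInputSection *concat = asConcat(isec))
      if (concat->shouldOmitFromOutput())
        continue;
    ++emitted;
  }
  return emitted;
}
```

Keeping the flag off the base class means a caller cannot ask a literal section whether it is live as a whole, which matches the summary's point that literals should have per-piece liveness instead.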