This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

-
lld/
-
MachO/
-
Arch/
6/6
ARM64.cpp
-
Driver.cpp
1/1
InputSection.h
1/1
InputSection.cpp
10/10
MergedOutputSection.h
43/44
MergedOutputSection.cpp
-
Options.td
1/1
Symbols.h
-
Symbols.cpp
1/1
SyntheticSections.h
-
SyntheticSections.cpp
2/5
Target.h
18/18
Writer.cpp
-
test/MachO/
-
MachO/
9/9
arm64-thunks.s
-
tools/
3/3
generate-thunkable-program.py

Differential D100818

[lld-macho] Implement branch-range-extension thunks
ClosedPublic

Authored by gkm on Apr 19 2021, 11:59 PM.


Details

Reviewers

int3
jdoerfert

Group Reviewers

Restricted Project

Commits

rG93c8559baf55: [lld-macho] Implement branch-range-extension thunks

Summary

Extend the range of calls beyond an architecture's limited branch range by first calling a thunk, which loads the far address into a scratch register (x16 on ARM64) and branches through it.

Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead. This MachO algorithm places thunks during MergedOutputSection::finalize() in a single pass using exact thunk-space overheads. Thunks are kept in a separate vector to avoid the overhead of inserting into the inputs vector of MergedOutputSection.
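
A minimal sketch of the reachability test that motivates thunks, not the code from this patch. It assumes ARM64's `B`/`BL` encoding, a signed 26-bit word offset, which gives a byte offset in [-2^27, 2^27 - 4]. The names `kMinBranchOffset`, `kMaxBranchOffset`, and `needsThunk` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ARM64 B/BL hold a signed 26-bit word offset, i.e. a signed
// 28-bit byte offset that is a multiple of 4.
constexpr int64_t kMinBranchOffset = -(int64_t{1} << 27);
constexpr int64_t kMaxBranchOffset = (int64_t{1} << 27) - 4;

// Returns true when a direct branch located at callVA cannot reach targetVA,
// so the call would have to be redirected through a nearby thunk.
bool needsThunk(uint64_t callVA, uint64_t targetVA) {
  // Unsigned subtraction wraps; the cast recovers the signed distance.
  int64_t offset = static_cast<int64_t>(targetVA - callVA);
  return offset < kMinBranchOffset || offset > kMaxBranchOffset;
}
```

The patch itself stores a more conservative limit, `maxIntN(28) - thunkSize`, in `branchRange`, which leaves room for the thunk that may have to be placed within reach.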

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gkm created this revision. Apr 19 2021, 11:59 PM

Herald added a reviewer: int3. Apr 19 2021, 11:59 PM

Herald added a project: Restricted Project.

Herald added subscribers: dang, kristof.beyls.

Harbormaster completed remote builds in B99626: Diff 338729. Apr 20 2021, 1:08 AM

many updates

Harbormaster completed remote builds in B100645: Diff 340130. Apr 23 2021, 2:42 PM

gkm added a parent revision: D101395: [lld-macho] Implement builtin section renaming. Apr 27 2021, 12:51 PM

Major rewrite

gkm published this revision for review. Apr 29 2021, 9:07 AM

Herald added a project: Restricted Project. Apr 29 2021, 9:07 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B101656: Diff 341540. Apr 29 2021, 9:45 AM

tschuett added a subscriber: tschuett. Apr 30 2021, 2:22 AM

tschuett added inline comments.

lld/MachO/MergedOutputSection.cpp
224	LLVM prefers preincrement: https://llvm.org/docs/CodingStandards.html#prefer-preincrement There are a couple of postincrements.

int3 added inline comments. Apr 30 2021, 5:52 AM

lld/MachO/Arch/ARM64.cpp
136	could you comment on why the alignTo is needed? also remove llvm::
144	stale comment? (since data seems like it's being populated)
lld/MachO/InputSection.cpp
38–39	I guess we can do away with the braces now
lld/MachO/InputSection.h
46	would be good to have a comment; the link to `finalize()` isn't immediately obvious
lld/MachO/MergedOutputSection.cpp
38–39	nit: `isThunkable \|= input->isThunkable;`
87	it won't be 'new' after this lands :)
101
202	seems more readable
219–221	I find these kind of hard to read, could we use slightly more verbose names?
222	doesn't seem like a useful assert
224	I mean this line actually does want to return a copy of the original value, so I think it's fine... but yeah the increments-for-effect-only should be preincrements
233	redundant assert
235	I believe llvm-mc emits relocs in reverse order, but I don't think that's guaranteed anywhere in the format... we should probably sort it ourselves
245–246	(`get<>()` will assert)
283	I think the lint is right here... the return value of `to_string` is going to be unowned
312	LLD-ELF handles `--verbose` by assigning to `errorHandler().verbose`, I think we should do likewise
322–333	instead of two sorted arrays, would it be simpler to create a map of regular InputSection to an array of thunks that immediately follow it?
lld/MachO/MergedOutputSection.h
23	nit: this type seems too simple to be worth an alias. Also since other places in the codebase (like the global `inputSections`) don't use it, we now have two different ways to refer to the same type in the codebase...
40–41	doesn't seem like these are implemented
49–52	nor any of these
lld/MachO/Symbols.h
264–265	this can be committed on its own as an NFC commit... try to keep the diff to relevant changes only
lld/MachO/Target.h
69	nit: insert `llvm_unreachable()` in the body?
lld/MachO/Writer.cpp
536–537	why aren't externalWeakDefs thunkable?
540	do we actually get here in practice? I would assume that `prepareBranchTarget` is only ever called on DylibSymbols and Defined symbols...
596–597	I thought we discussed that this case should be impossible; how does it arise?
1064
1067	seems like you forgot to finish this sentence :p

Feel free to ignore me, but it is an odd dance with the default implementation of populateThunk, which is unreachable.

And the following is only used by ARM64.

size_t thunkSize = 0;
uint64_t branchRange = 0;

Would a subclass of TargetInfo, like e.g. ThunkableTargetInfo with populateThunk, thunkSize, and branchRange be a nicer solution?

See:
https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html
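
A rough sketch of what that suggestion could look like; it is not part of the patch. `ThunkableTargetInfo`, the `Kind` enum, and the `Fake*` targets are hypothetical, and `populateThunk` is given a simplified signature. The `classof` hook follows the pattern from the linked RTTI guide, while the hand-written `asThunkable` helper stands in for `llvm::dyn_cast<>` so that the example needs no LLVM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct TargetInfo {
  enum Kind { TK_Plain, TK_Thunkable };
  explicit TargetInfo(Kind k) : kind(k) {}
  virtual ~TargetInfo() = default;
  Kind getKind() const { return kind; }

private:
  const Kind kind;
};

// Hypothetical subclass that gathers everything thunk-related in one place.
struct ThunkableTargetInfo : TargetInfo {
  ThunkableTargetInfo(size_t size, uint64_t range)
      : TargetInfo(TK_Thunkable), thunkSize(size), branchRange(range) {}
  virtual void populateThunk() = 0; // simplified; no default body needed
  static bool classof(const TargetInfo *t) {
    return t->getKind() == TK_Thunkable;
  }
  size_t thunkSize;
  uint64_t branchRange;
};

struct FakeARM64 : ThunkableTargetInfo {
  FakeARM64() : ThunkableTargetInfo(12, (uint64_t{1} << 27) - 1 - 12) {}
  void populateThunk() override { ++populated; }
  int populated = 0;
};

struct FakeX86_64 : TargetInfo {
  FakeX86_64() : TargetInfo(TK_Plain) {}
};

// Returns nullptr for targets that never need thunks.
ThunkableTargetInfo *asThunkable(TargetInfo *t) {
  return ThunkableTargetInfo::classof(t)
             ? static_cast<ThunkableTargetInfo *>(t)
             : nullptr;
}
```

With this shape, targets that never need thunks carry neither the two fields nor an unreachable default `populateThunk`.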

In D100818#2729135, @tschuett wrote:

Feel free to ignore me,

I won't ignore you, but I will punt and keep your recommendations in mind for some future time. Thanx!

but it is an odd dance with the default implementation of populateThunk, which is unreachable.

It now calls llvm_unreachable() !

lld/MachO/MergedOutputSection.cpp
219–221	s/`ic`/`I`/; s/`ie`/`E`/, by convention I see many places in LLVM code, though not so much in LLD yet. s/`ix`/`iFinal`/ Let me know if you hate `I` and `E`.
235	Before I add code and overhead to sort already-sorted vectors, I'd like to look into this further.
312	Done. Note that ELF has no other use of `OPT_verbose`. COFF assigns to `errorHandler().verbose`, and also has `config->verbose` to enable other output.
322–333	Sounds like more overhead & thus slower.

int3 added inline comments. May 3 2021, 3:38 PM

lld/MachO/MergedOutputSection.cpp
219–221	how about finalIdx, callIdx, endIdx?
235	if nothing else we should assert for `is_sorted()`
322–333	Simpler code-wise though. This isn't likely to be perf-critical...
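
The `is_sorted()` assertion suggested for line 235 might look roughly like this. It is only a sketch: the one-field `Reloc` struct and the helper name `relocsAreDescending` are hypothetical, and the descending order reflects the observation above that llvm-mc emits relocations in reverse order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for lld's Reloc; only the offset matters here.
struct Reloc {
  uint32_t offset;
};

// True when offsets never increase from one relocation to the next.
bool relocsAreDescending(const std::vector<Reloc> &relocs) {
  return std::is_sorted(
      relocs.begin(), relocs.end(),
      [](const Reloc &a, const Reloc &b) { return a.offset > b.offset; });
}
```

A call site could then state the assumption as `assert(relocsAreDescending(relocs))`, which costs nothing in release builds.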

gkm marked 7 inline comments as done. May 3 2021, 10:26 PM

gkm added inline comments.

lld/MachO/MergedOutputSection.cpp
322–333	I wish to punt on this for now. I will revisit after I get big programs to run with thunks.

Revise according to review feedback

gkm edited the summary of this revision. May 3 2021, 10:54 PM

Harbormaster completed remote builds in B102460: Diff 342645. May 3 2021, 11:22 PM

looks pretty good. Let's add some tests :)

lld/MachO/Arch/ARM64.cpp
137	remove llvm::
lld/MachO/MergedOutputSection.cpp
88
312	I don't think this line below should be a warning though... `log()` would suffice (and obviate the need for Config::verbose)
lld/MachO/MergedOutputSection.h
21	leftover?
lld/MachO/Writer.cpp
36	leftover?
515–516	seems outdated now
537	why is this `config->entry` check necessary?
557	can't we just check for `relocAttrs.hasAttr(RelocAttrBits::BRANCH)` instead of adding a new property on Reloc?

Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead.

I've seen a worst case of 10 passes. COFF/Mach-O may be in a better situation because they don't tend to have too large binaries.

MaskRay added inline comments. May 5 2021, 11:44 AM

lld/MachO/SyntheticSections.h
126	`const {`
lld/MachO/Target.h
77	const
lld/MachO/Writer.cpp
558	pre-increment

gkm marked 10 inline comments as done. May 7 2021, 10:32 AM

gkm added inline comments.

lld/MachO/MergedOutputSection.h
21	Not anymore.
lld/MachO/Writer.cpp
537	Because of this ... `template <class LP> void Writer::run() { prepareBranchTarget(config->entry); . . .` ... and this ... `bool macho::link(ArrayRef<const char *> argsArr, bool canExitEarly, raw_ostream &stdoutOS, raw_ostream &stderrOS) { . . . config->entry = symtab->addUndefined(args.getLastArgValue(OPT_e, "_main"), /*file=*/nullptr, /*isWeakRef=*/false); . . .`
557	After seeing your abandoned diff about the performance drag of checking `relocAttrs`, I chose to save the work of doing so again. Perhaps it is a micro-efficiency that I should avoid?

I think you haven't uploaded the latest changes

lld/MachO/Writer.cpp
537	ah. I think it'd make more sense not to call `prepareBranchTarget` with an undefined symbol in the first place...
557	yes please. If you've noticed the theme of my other comments... let's write the simplest code possible first, and optimize it if profile data indicates it's worthwhile :)

revise according to review feedback
place thunks to reach far-away stubs, both when __stubs precedes __text and when __stubs follows __text

Herald added a reviewer: jdoerfert. May 7 2021, 6:40 PM

Herald added a subscriber: sstefan1.

Harbormaster completed remote builds in B103286: Diff 343797. May 7 2021, 7:13 PM

gkm marked 2 inline comments as done. May 9 2021, 12:13 PM

gkm added inline comments.

lld/MachO/Writer.cpp
537	Dropping the call to `prepareBranchTarget(config->entry)` causes an assert in `lld/test/MachO/entry-symbol.s`. We can debug that later.

revise according to review feedback

Harbormaster completed remote builds in B103397: Diff 343925. May 9 2021, 1:00 PM

seems like you're addressing the comments incrementally, lmk when this is ready for review

lld/MachO/Writer.cpp
537	Well I looked into it :) Didn't see the assert but noticed another issue: D102137

handle OPT_verbose via log()
Drop new Reloc member isCallSite, and use existing RelocAttrBits::BRANCH
add test case

In D100818#2746789, @int3 wrote:

seems like you're addressing the comments incrementally, lmk when this is ready for review

It only appeared to be incremental because I had neglected to properly address a couple items. Please review!

Harbormaster completed remote builds in B103538: Diff 344126. May 10 2021, 11:53 AM

add a call (__nan) through a dylib stub to the test case

Harbormaster completed remote builds in B103553: Diff 344143. May 10 2021, 12:48 PM

disable test for windows, which has no shell for loops

Harbormaster completed remote builds in B103617: Diff 344238. May 10 2021, 5:06 PM

smeenai added a subscriber: smeenai. May 10 2021, 5:10 PM

smeenai added inline comments.

lld/test/MachO/arm64-thunks.s
3	Is the `REQUIRES: shell` sufficient by itself, or do you also need the explicit `UNSUPPORTED: windows`?

Hopefully the last round of comments :)

Request: Can we profile this against some arm64 program that's not large enough to actually need thunks? Just want to be sure we're not creating lots of overhead when we don't need it. Though if it's a small regression, we can fine-tune it in follow-up diffs.

lld/MachO/Arch/ARM64.cpp
80	nit: 'always-relaxed' sounds like stubCode is sometimes itself relaxed... which I don't think is true. How about just 'relaxed'? Also it would be nice to spell out why this is the case (since the destination address of a thunk is always statically known)
81	nit: can we put this directly above `populateThunk`?
131	nit: can we keep the constructor at the bottom, right above `createARM64TargetInfo`?
lld/MachO/MergedOutputSection.cpp
58	might be worth mentioning somewhere that thunks themselves are presumed to have an effectively unlimited range, so thunks do not need to jump into other thunks
76
121	nit: rm the newline? I usually take a newline to indicate "the above comment may apply to more than just the function immediately below"
131	hm so we are calling `estimateStubsInRangeVA` itself in a loop... is this potentially expensive? I guess `thunkMap` and `endIdx - callIdx` aren't typically large, so the nested loops are probably fine... but might be worth calling out explicitly in a comment
133	unnecessary parens
145–151	all these `to_hexString()` calls makes me wonder if we should have a `LOG` macro that doesn't actually evaluate its arguments till they are needed... (I still think we should avoid adding `Config::verbose`, we can just extend ErrorHandler) if this turns out to be necessary let's do it in a stacked diff... there's too much going on in this one already
216	can we drop this functionality for now, until if/when we decide to make stubs-before-text an option?
250	from STLExtras.h
258	since you capitalized everything else
262	nit aside... I'm a bit confused by this comment. What address computation is going on? Isn't it more like "Determine whether the call's referent is reachable" or "Determine if the referent should be replaced by a thunk"?
274
296	I think this should work
lld/MachO/MergedOutputSection.h
66	nit :p
70
79	is this different from `Symbol::isInStubs()`?
82	looks like this is only used in one place... IMO we could just use `auto` there, but up to you
lld/MachO/Target.h
93–95	This seems unnecessarily cute... Can we just use `numeric_limits<uint64_t>::max()`? Also it's not really target-specific so I'd prefer it not be put in Target
lld/MachO/Writer.cpp
516	rm newline plz
557–560	should we wrap this in an `if (target->needsThunks())`?
lld/test/MachO/arm64-thunks.s
13	looks like this is failing on Windows. Not sure there's a cross-platform way to loop, but writing 9 `llvm-mc` invocations doesn't seem too terrible either. Out of curiosity: what happens if we put everything into the same file, instead of multiple files -- what does `llvm-mc` generate?
18–19	seems like you could pipe it directly into FileCheck
23	since you're using the `%.6x` syntax, the `{{0*}}` seems unnecessary... (you might have to change `.6` to `.8` though)
lld/test/MachO/tools/generate-thunkable-program.py
16–18	Couldn't we just generate a bunch of random strings for the symbols? This list isn't really helping us exercise new code paths in LLD...
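
The lazily evaluated `LOG` macro floated in the comment on lines 145–151 could be sketched as below. This is an illustration only: `verboseEnabled` is a hypothetical stand-in for `errorHandler().verbose`, `toHex` stands in for the hex-formatting helper, and the macro writes to `std::cerr` rather than going through lld's logging.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for errorHandler().verbose.
static bool verboseEnabled = false;

// The streamed expression sits inside the `if` body, so none of its operands
// are evaluated unless verbose output is enabled.
#define LOG(expr)                                                              \
  do {                                                                         \
    if (verboseEnabled) {                                                      \
      std::ostringstream logStream;                                            \
      logStream << expr;                                                       \
      std::cerr << logStream.str() << '\n';                                    \
    }                                                                          \
  } while (false)

// Counts its own calls so the laziness is observable.
static int hexCalls = 0;
std::string toHex(unsigned long long value) {
  ++hexCalls;
  std::ostringstream out;
  out << std::hex << value;
  return out.str();
}
```

Usage would be `LOG("thunk at 0x" << toHex(addr));`. With `verboseEnabled` false, `toHex` is never called.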

It only appeared to be incremental because I had neglected to properly address a couple items.

JFYI, if you click on the M / N Comments button in the top-right-hand corner, it will take you to the un-done comments :)

lld/test/MachO/arm64-thunks.s
13	ah I see you disabled the test on Windows... I don't think this is a good enough reason to disable stuff on Windows unfortunately (I got pushback for it in https://bugs.llvm.org/show_bug.cgi?id=49512)

int3 added inline comments. May 10 2021, 5:19 PM

lld/test/MachO/arm64-thunks.s
3	It should be sufficient by itself. I believe they're largely the same thing as far as the buildbots are concerned, but `REQUIRES: shell` alone would be more semantically descriptive of what this test depends on in its current form

gkm marked 26 inline comments as done. May 11 2021, 3:33 PM

gkm added inline comments.

lld/MachO/MergedOutputSection.cpp
131	Although we call `estimateStubsInRangeVA()` within a loop, it is predicated by condition that is only true once. The call needs to happen within the loop in order to happen at the proper time: as soon as all input sections are finalized, i.e., when the end of __text is within forward-branch range of the current call site. I will add a comment to highlight that feature.
145–151	This is called only once, so the overhead of a few `to_hexString()` calls is negligible, and not worthy of a `LOG` macro. I already removed `Config::verbose` and inject `OPT_verbose` into `errorHandler().verbose`.
296	Sadly, it doesn't. Unlike old C compilers, `__FUNCTION__` is not expanded by the preprocessor into a string literal. It is a compiler-internal variable, and thus not amenable to concat via adjacent string literals.
lld/MachO/MergedOutputSection.h
79	No. Removed. Thanx!
lld/MachO/Target.h
93–95	FYI, this wasn't intended to be cute at all. `numeric_limits<uint64_t>::max()` is not viable since it is the tombstone value for `DenseMap<>` and induces weird assertions. Another disqualifier is that `max()` is only one increment away from wrapping to 0. My chosen value `0xf000'0000'0000'0000` is VERY FAR away from 0. Perhaps MachO guarantees that `__text` addresses will never be within even 4 GiB of 0, so I am unnecessarily cautious? Regarding target-specificity: it is OS/runtime specific (mostly Darwin & iOS), modulo CPU-arch variations. It so happens all are Apple creations, and common enough that we can choose a single constant that works for all. If not `Target.h`, where do you propose we define this? I don't see anywhere that seems a better fit ... `Config.h`? `OutputSection.h`? `Relocations.h`? What looks good to you?
lld/test/MachO/arm64-thunks.s
3	The saddest part is that the test still runs and fails on Windows.
13	With `.subsections_via_symbols`, it all works as a single input file. Thanx!
23	`.13`, since these are 64-bit values with 16 hex digits.
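
The `__FUNCTION__` point from line 296 can be reproduced in isolation. The sketch below uses the standard `__func__`, for which GCC and Clang treat `__FUNCTION__` as another name; `describeCaller` and `whereAmI` are made-up helpers.

```cpp
#include <cassert>
#include <string>

// Adjacent-literal concatenation happens in an early translation phase and
// only applies to string literals. __func__ is a predefined local array, so
// the following line would not compile:
//   const char *bad = "in " __func__;

// Concatenation therefore has to happen at run time instead.
std::string describeCaller(const char *func) {
  return std::string("in ") + func;
}

std::string whereAmI() { return describeCaller(__func__); }
```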

In D100818#2749379, @int3 wrote:

JFYI, if you click on the M / N Comments button in the top-right-hand corner, it will take you to the un-done comments :)

I already use that, but apparently make mistakes. 😦

revise according to review feedback

Harbormaster completed remote builds in B103881: Diff 344592. May 11 2021, 5:10 PM

int3 added inline comments. May 11 2021, 6:49 PM

lld/MachO/MergedOutputSection.cpp
131	oh I see. Thanks for adding the comment!
296	huh, TIL. I did vaguely wonder when C preprocessors became smart enough to parse function declarations :p makes more sense that the compiler is doing it
lld/MachO/Target.h
93–95	"since it is the tombstone value for DenseMap<> and induces weird assertions" ohh okay. That makes sense... please add that to the comment :) I thought you were trying to make a constant with the string "full" in it... :p I just was thinking of having it as a global. It can still be in Target.h. It's runtime-specific, but as you said it works uniformly for all the targets we support, and that's not quite obvious from seeing `target->outOfRangeVA`. Though if you don't want to pollute the global namespace further I guess changing all the use sites to `Target::outOfRangeVA` would work too.
lld/test/MachO/tools/generate-thunkable-program.py
16–18	not yet addressed
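
The sentinel discussion for `Target.h` lines 93–95 comes down to two properties: the value must differ from `numeric_limits<uint64_t>::max()`, and adding a branch-sized distance to it must not wrap around to a low address. A small sketch of those two checks follows. The struct only mirrors the `TargetInfo::outOfRangeVA` name and the `0xf000'0000'0000'0000` value quoted in the thread, while `kBranchRange` and `addWraps` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

struct TargetInfo {
  // Value quoted in the review thread; declared static so that use sites
  // read TargetInfo::outOfRangeVA rather than target->outOfRangeVA.
  static constexpr uint64_t outOfRangeVA = 0xf000'0000'0000'0000;
};

// Assumed branch distance of roughly 128 MiB, as on ARM64.
constexpr uint64_t kBranchRange = uint64_t{1} << 27;

// True when va + delta overflows 64 bits and wraps toward zero.
bool addWraps(uint64_t va, uint64_t delta) { return va + delta < va; }
```

Address arithmetic near the sentinel stays far from zero, whereas `max()` wraps after a single increment.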

remove call-site memoization overhead from prepareSymbolRelocation() to avoid penalizing programs that don't need thunks. Do that work in MergedOutputSection::needsThunks(), only after we have determined that we need thunks.

gkm marked 2 inline comments as done. May 11 2021, 9:28 PM

gkm added inline comments.

lld/test/MachO/tools/generate-thunkable-program.py
16–18	I am leveraging symbols already present in `libSystem.tbd`. The goal is to generate calls through dylib stubs, to make sure the thunker properly makes thunks for out-of-range stubs. That is the LLD code path I am exercising. I could generate random strings, and then generate a matching `libLOL.tbd`, but that seems like extra work for marginal benefit. I suppose an advantage to generating `libLOL.tbd` is that I can control its size and stress-test the thunker with a huge dylib, or with multiple generated dylibs containing random symbols.

Harbormaster completed remote builds in B103930: Diff 344649. May 11 2021, 9:30 PM

back out a few gratuitous changes
s/target->target->outOfRangeVA/TargetInfo::outOfRangeVA/ plus extra comments

Harbormaster completed remote builds in B103934: Diff 344654. May 11 2021, 10:10 PM

lgtm, thanks!

lld/MachO/MergedOutputSection.cpp
52
57
152	"Since `__stubs` is placed after __text"?

This revision is now accepted and ready to land. May 12 2021, 8:55 AM

gkm marked 3 inline comments as done. May 12 2021, 9:42 AM

gkm edited the summary of this revision.

final revisions according to review feedback

This revision was landed with ongoing or failed builds. May 12 2021, 9:46 AM

Closed by commit rG93c8559baf55: [lld-macho] Implement branch-range-extension thunks (authored by gkm).

This revision was automatically updated to reflect the committed changes.

gkm added a commit: rG93c8559baf55: [lld-macho] Implement branch-range-extension thunks.

Harbormaster completed remote builds in B104067: Diff 344851. May 12 2021, 10:24 AM

This seems to slow down (x86) links by 2.8% (using the repro at https://bugs.llvm.org/show_bug.cgi?id=48657#c0 , and ~/src/hack/bench.py -n10 -o at_thunk ../out/gn/bin/ld64.lld @response.txt , with https://github.com/nico/hack/blob/master/bench.py; I built lld right before this rev (git checkout 93c8559baf551a7a30ab17654569ac5ac92986f4^) and at this rev (git checkout 93c8559baf551a7a30ab17654569ac5ac92986f4)).

chromium_framework % ministat at*thunk
x at_before_thunk
+ at_thunk
    N           Min           Max        Median           Avg        Stddev
x  10     3.9163771     3.9668748     3.9350698     3.9360609   0.016793477
+  10      3.954447     4.2126601     4.0607598     4.0460292   0.072968429

Is that expected? Is this something you can repro?

Also, re patch description: Would be nice if the commit message had said _why_ a different algorithm was chosen :)

thakis mentioned this in D102655: [lld/mac] Inline a check. May 17 2021, 3:26 PM

thakis mentioned this in rGbc588f996111: [lld/mac] Inline a check. May 17 2021, 5:09 PM

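Revision Contents

Path                                                 Size
lld/MachO/Arch/ARM64.cpp                             27 lines
lld/MachO/Driver.cpp                                 1 line
lld/MachO/InputSection.h                             2 lines
lld/MachO/InputSection.cpp                           23 lines
lld/MachO/MergedOutputSection.h                      30 lines
lld/MachO/MergedOutputSection.cpp                    291 lines
lld/MachO/Options.td                                 2 lines
lld/MachO/Symbols.h                                  11 lines
lld/MachO/Symbols.cpp                                20 lines
lld/MachO/SyntheticSections.h                        14 lines
lld/MachO/SyntheticSections.cpp                      7 lines
lld/MachO/Target.h                                   16 lines
lld/MachO/Writer.cpp                                 12 lines
lld/test/MachO/arm64-thunks.s                        300 lines
lld/test/MachO/tools/generate-thunkable-program.py   429 lines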

Diff 344852

lld/MachO/Arch/ARM64.cpp

Show All 13 Lines

#include "lld/Common/ErrorHandler.h"		#include "lld/Common/ErrorHandler.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/BinaryFormat/MachO.h"		#include "llvm/BinaryFormat/MachO.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"

		using namespace llvm;
using namespace llvm::MachO;		using namespace llvm::MachO;
using namespace llvm::support::endian;		using namespace llvm::support::endian;
using namespace lld;		using namespace lld;
using namespace lld::macho;		using namespace lld::macho;

namespace {		namespace {

struct ARM64 : ARM64Common {		struct ARM64 : ARM64Common {
ARM64();		ARM64();
void writeStub(uint8_t *buf, const Symbol &) const override;		void writeStub(uint8_t *buf, const Symbol &) const override;
void writeStubHelperHeader(uint8_t *buf) const override;		void writeStubHelperHeader(uint8_t *buf) const override;
void writeStubHelperEntry(uint8_t *buf, const DylibSymbol &,		void writeStubHelperEntry(uint8_t *buf, const DylibSymbol &,
uint64_t entryAddr) const override;		uint64_t entryAddr) const override;
const RelocAttrs &getRelocAttrs(uint8_t type) const override;		const RelocAttrs &getRelocAttrs(uint8_t type) const override;
		void populateThunk(InputSection *thunk, Symbol *funcSym) override;
};		};

} // namespace		} // namespace

// Random notes on reloc types:		// Random notes on reloc types:
// ADDEND always pairs with BRANCH26, PAGE21, or PAGEOFF12		// ADDEND always pairs with BRANCH26, PAGE21, or PAGEOFF12
// POINTER_TO_GOT: ld64 supports a 4-byte pc-relative form as well as an 8-byte		// POINTER_TO_GOT: ld64 supports a 4-byte pc-relative form as well as an 8-byte
// absolute version of this relocation. The semantics of the absolute relocation		// absolute version of this relocation. The semantics of the absolute relocation
Show All 26 Lines
}		}

static constexpr uint32_t stubCode[] = {		static constexpr uint32_t stubCode[] = {
0x90000010, // 00: adrp x16, __la_symbol_ptr@page		0x90000010, // 00: adrp x16, __la_symbol_ptr@page
0xf9400210, // 04: ldr x16, [x16, __la_symbol_ptr@pageoff]		0xf9400210, // 04: ldr x16, [x16, __la_symbol_ptr@pageoff]
0xd61f0200, // 08: br x16		0xd61f0200, // 08: br x16
};		};

void ARM64::writeStub(uint8_t *buf8, const Symbol &sym) const {		void ARM64::writeStub(uint8_t *buf8, const Symbol &sym) const {
::writeStub<LP64>(buf8, stubCode, sym);		::writeStub<LP64>(buf8, stubCode, sym);
}		}

static constexpr uint32_t stubHelperHeaderCode[] = {		static constexpr uint32_t stubHelperHeaderCode[] = {
0x90000011, // 00: adrp x17, _dyld_private@page		0x90000011, // 00: adrp x17, _dyld_private@page
0x91000231, // 04: add x17, x17, _dyld_private@pageoff		0x91000231, // 04: add x17, x17, _dyld_private@pageoff
0xa9bf47f0, // 08: stp x16/x17, [sp, #-16]!		0xa9bf47f0, // 08: stp x16/x17, [sp, #-16]!
0x90000010, // 0c: adrp x16, dyld_stub_binder@page		0x90000010, // 0c: adrp x16, dyld_stub_binder@page
0xf9400210, // 10: ldr x16, [x16, dyld_stub_binder@pageoff]		0xf9400210, // 10: ldr x16, [x16, dyld_stub_binder@pageoff]
Show All 10 Lines	static constexpr uint32_t stubHelperEntryCode[] = {
0x00000000, // 08: l0: .long 0		0x00000000, // 08: l0: .long 0
};		};

void ARM64::writeStubHelperEntry(uint8_t *buf8, const DylibSymbol &sym,		void ARM64::writeStubHelperEntry(uint8_t *buf8, const DylibSymbol &sym,
uint64_t entryVA) const {		uint64_t entryVA) const {
::writeStubHelperEntry(buf8, stubHelperEntryCode, sym, entryVA);		::writeStubHelperEntry(buf8, stubHelperEntryCode, sym, entryVA);
}		}

		// A thunk is the relaxed variation of stubCode. We don't need the
		// extra indirection through a lazy pointer because the target address
		// is known at link time.
		static constexpr uint32_t thunkCode[] = {
		0x90000010, // 00: adrp x16, <thunk.ptr>@page
		0x91000210, // 04: add x16, [x16,<thunk.ptr>@pageoff]
		0xd61f0200, // 08: br x16
		};

		void ARM64::populateThunk(InputSection *thunk, Symbol *funcSym) {
		thunk->align = 4;
		thunk->data = {reinterpret_cast<const uint8_t *>(thunkCode),
		sizeof(thunkCode)};
		thunk->relocs.push_back({/*type=*/ARM64_RELOC_PAGEOFF12,
		/*pcrel=*/false, /*length=*/2,
		/*offset=*/4, /*addend=*/0,
		/*referent=*/funcSym});
		thunk->relocs.push_back({/*type=*/ARM64_RELOC_PAGE21,
		/*pcrel=*/true, /*length=*/2,
		/*offset=*/0, /*addend=*/0,
		/*referent=*/funcSym});
		}

ARM64::ARM64() : ARM64Common(LP64()) {		ARM64::ARM64() : ARM64Common(LP64()) {
cpuType = CPU_TYPE_ARM64;		cpuType = CPU_TYPE_ARM64;
cpuSubtype = CPU_SUBTYPE_ARM64_ALL;		cpuSubtype = CPU_SUBTYPE_ARM64_ALL;

stubSize = sizeof(stubCode);		stubSize = sizeof(stubCode);
		thunkSize = sizeof(thunkCode);
		branchRange = maxIntN(28) - thunkSize;
stubHelperHeaderSize = sizeof(stubHelperHeaderCode);		stubHelperHeaderSize = sizeof(stubHelperHeaderCode);
stubHelperEntrySize = sizeof(stubHelperEntryCode);		stubHelperEntrySize = sizeof(stubHelperEntryCode);
}		}

TargetInfo *macho::createARM64TargetInfo() {		TargetInfo *macho::createARM64TargetInfo() {
static ARM64 t;		static ARM64 t;
return &t;		return &t;
}		}
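
The two relocations attached to `thunkCode` above patch the `adrp` and `add` words with the page and page offset of the target. The following is a minimal sketch of that arithmetic under the standard AArch64 instruction encodings. It is not lld's relocation code, and `encodePage21` and `encodePageOff12` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// ADRP: a 21-bit signed page delta, split into immlo (bits 29-30) and
// immhi (bits 5-23). Unsigned wraparound yields the two's-complement low
// bits for backward references.
uint32_t encodePage21(uint32_t insn, uint64_t pc, uint64_t target) {
  uint64_t pageDelta = (target >> 12) - (pc >> 12);
  uint32_t imm = static_cast<uint32_t>(pageDelta & 0x1fffff);
  uint32_t immlo = imm & 0x3;
  uint32_t immhi = imm >> 2;
  return insn | (immlo << 29) | (immhi << 5);
}

// ADD (immediate): the low 12 bits of the target go into bits 10-21.
uint32_t encodePageOff12(uint32_t insn, uint64_t target) {
  return insn | (static_cast<uint32_t>(target & 0xfff) << 10);
}
```

For a thunk at `0x1000` that targets `0x3abc`, the `adrp` word `0x90000010` becomes `0xd0000010` and the `add` word `0x91000210` becomes `0x912af210`.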

lld/MachO/Driver.cpp

Show First 20 Lines • Show All 895 Lines • ▼ Show 20 Lines	bool macho::link(ArrayRef<const char *> argsArr, bool canExitEarly,

MachOOptTable parser;		MachOOptTable parser;
InputArgList args = parser.parse(argsArr.slice(1));		InputArgList args = parser.parse(argsArr.slice(1));

errorHandler().errorLimitExceededMsg =		errorHandler().errorLimitExceededMsg =
"too many errors emitted, stopping now "		"too many errors emitted, stopping now "
"(use --error-limit=0 to see all errors)";		"(use --error-limit=0 to see all errors)";
errorHandler().errorLimit = args::getInteger(args, OPT_error_limit_eq, 20);		errorHandler().errorLimit = args::getInteger(args, OPT_error_limit_eq, 20);
		errorHandler().verbose = args.hasArg(OPT_verbose);

if (args.hasArg(OPT_help_hidden)) {		if (args.hasArg(OPT_help_hidden)) {
parser.printHelp(argsArr[0], /*showHidden=*/true);		parser.printHelp(argsArr[0], /*showHidden=*/true);
return true;		return true;
}		}
if (args.hasArg(OPT_help)) {		if (args.hasArg(OPT_help)) {
parser.printHelp(argsArr[0], /*showHidden=*/false);		parser.printHelp(argsArr[0], /*showHidden=*/false);
return true;		return true;
▲ Show 20 Lines • Show All 329 Lines • Show Last 20 Lines

lld/MachO/InputSection.h

Show All 36 Lines	public:
StringRef segname;		StringRef segname;

OutputSection *parent = nullptr;		OutputSection *parent = nullptr;
uint64_t outSecOff = 0;		uint64_t outSecOff = 0;
uint64_t outSecFileOff = 0;		uint64_t outSecFileOff = 0;

uint32_t align = 1;		uint32_t align = 1;
uint32_t flags = 0;		uint32_t flags = 0;
		uint32_t callSiteCount = 0;
		bool isFinal = false; // is address assigned?

// How many symbols refer to this InputSection.		// How many symbols refer to this InputSection.
uint32_t numRefs = 0;		uint32_t numRefs = 0;

// True if this InputSection could not be written to the output file.		// True if this InputSection could not be written to the output file.
// With subsections_via_symbols, most symbol have its own InputSection,		// With subsections_via_symbols, most symbol have its own InputSection,
// and for weak symbols (e.g. from inline functions), only the		// and for weak symbols (e.g. from inline functions), only the
// InputSection from one translation unit will make it to the output,		// InputSection from one translation unit will make it to the output,
▲ Show 20 Lines • Show All 98 Lines • Show Last 20 Lines

lld/MachO/InputSection.cpp

Show All 28 Lines
}		}

uint64_t InputSection::getFileSize() const {		uint64_t InputSection::getFileSize() const {
return isZeroFill(flags) ? 0 : getSize();		return isZeroFill(flags) ? 0 : getSize();
}		}

uint64_t InputSection::getVA() const { return parent->addr + outSecOff; }		uint64_t InputSection::getVA() const { return parent->addr + outSecOff; }

static uint64_t resolveSymbolVA(uint8_t *loc, const Symbol &sym, uint8_t type) {		static uint64_t resolveSymbolVA(const Symbol *sym, uint8_t type) {
const RelocAttrs &relocAttrs = target->getRelocAttrs(type);		const RelocAttrs &relocAttrs = target->getRelocAttrs(type);
if (relocAttrs.hasAttr(RelocAttrBits::BRANCH)) {		if (relocAttrs.hasAttr(RelocAttrBits::BRANCH))
if (sym.isInStubs())		return sym->resolveBranchVA();
return in.stubs->addr + sym.stubsIndex * target->stubSize;		else if (relocAttrs.hasAttr(RelocAttrBits::GOT))
} else if (relocAttrs.hasAttr(RelocAttrBits::GOT)) {		return sym->resolveGotVA();
if (sym.isInGot())		else if (relocAttrs.hasAttr(RelocAttrBits::TLV))
return in.got->addr + sym.gotIndex * target->wordSize;		return sym->resolveTlvVA();
} else if (relocAttrs.hasAttr(RelocAttrBits::TLV)) {		return sym->getVA();
if (sym.isInGot())
return in.tlvPointers->addr + sym.gotIndex * target->wordSize;
assert(isa<Defined>(&sym));
}
return sym.getVA();
}		}

void InputSection::writeTo(uint8_t *buf) {		void InputSection::writeTo(uint8_t *buf) {
assert(!shouldOmitFromOutput());		assert(!shouldOmitFromOutput());

if (getFileSize() == 0)		if (getFileSize() == 0)
return;		return;

Show All 14 Lines	if (target->hasAttr(r.type, RelocAttrBits::SUBTRAHEND)) {
assert(!referentIsec->shouldOmitFromOutput());		assert(!referentIsec->shouldOmitFromOutput());
minuendVA = referentIsec->getVA();		minuendVA = referentIsec->getVA();
}		}
referentVA = minuendVA - fromSym->getVA() + minuend.addend;		referentVA = minuendVA - fromSym->getVA() + minuend.addend;
} else if (auto referentSym = r.referent.dyn_cast<Symbol >()) {		} else if (auto referentSym = r.referent.dyn_cast<Symbol >()) {
if (target->hasAttr(r.type, RelocAttrBits::LOAD) &&		if (target->hasAttr(r.type, RelocAttrBits::LOAD) &&
!referentSym->isInGot())		!referentSym->isInGot())
target->relaxGotLoad(loc, r.type);		target->relaxGotLoad(loc, r.type);
referentVA = resolveSymbolVA(loc, *referentSym, r.type);		referentVA = resolveSymbolVA(referentSym, r.type);

if (isThreadLocalVariables(flags)) {		if (isThreadLocalVariables(flags)) {
// References from thread-local variable sections are treated as offsets		// References from thread-local variable sections are treated as offsets
// relative to the start of the thread-local data memory area, which		// relative to the start of the thread-local data memory area, which
// is initialized via copying all the TLV data sections (which are all		// is initialized via copying all the TLV data sections (which are all
// contiguous).		// contiguous).
if (isa<Defined>(referentSym))		if (isa<Defined>(referentSym))
referentVA -= firstTLVDataSection->addr;		referentVA -= firstTLVDataSection->addr;

lld/MachO/MergedOutputSection.h

//===- OutputSection.h ------------------------------------------*- C++ -*-===// //===- OutputSection.h ------------------------------------------*- C++ -*-===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#ifndef LLD_MACHO_MERGED_OUTPUT_SECTION_H #ifndef LLD_MACHO_MERGED_OUTPUT_SECTION_H

#define LLD_MACHO_MERGED_OUTPUT_SECTION_H #define LLD_MACHO_MERGED_OUTPUT_SECTION_H

#include "InputSection.h" #include "InputSection.h"

#include "OutputSection.h" #include "OutputSection.h"

#include "lld/Common/LLVM.h" #include "lld/Common/LLVM.h"

#include "llvm/ADT/DenseMap.h"

#include "llvm/ADT/MapVector.h" #include "llvm/ADT/MapVector.h"

namespace lld { namespace lld {

namespace macho { namespace macho {

class Defined;

int3Unsubmitted

Done

leftover?


gkmAuthorUnsubmitted

Done

Not anymore.


// Linking multiple files will inevitably mean resolving sections in different // Linking multiple files will inevitably mean resolving sections in different

int3Unsubmitted

Done

nit: this type seems too simple to be worth an alias. Also since other places in the codebase (like the global inputSections) don't use it, we now have two different ways to refer to the same type in the codebase...


// files that are labeled with the same segment and section name. This class // files that are labeled with the same segment and section name. This class

// contains all such sections and writes the data from each section sequentially // contains all such sections and writes the data from each section sequentially

// in the final binary. // in the final binary.

class MergedOutputSection : public OutputSection { class MergedOutputSection : public OutputSection {

public: public:

MergedOutputSection(StringRef name) : OutputSection(MergedKind, name) {} MergedOutputSection(StringRef name) : OutputSection(MergedKind, name) {}

const InputSection *firstSection() const { return inputs.front(); } const InputSection *firstSection() const { return inputs.front(); }

const InputSection *lastSection() const { return inputs.back(); } const InputSection *lastSection() const { return inputs.back(); }

// These accessors will only be valid after finalizing the section // These accessors will only be valid after finalizing the section

uint64_t getSize() const override { return size; } uint64_t getSize() const override { return size; }

uint64_t getFileSize() const override { return fileSize; } uint64_t getFileSize() const override { return fileSize; }

void mergeInput(InputSection *input); void mergeInput(InputSection *input);

void finalize() override; void finalize() override;

bool needsThunks() const;

uint64_t estimateStubsInRangeVA(size_t callIdx) const;

int3Unsubmitted

Done

doesn't seem like these are implemented


void writeTo(uint8_t *buf) const override; void writeTo(uint8_t *buf) const override;

std::vector<InputSection *> inputs; std::vector<InputSection *> inputs;

std::vector<InputSection *> thunks;

static bool classof(const OutputSection *sec) { static bool classof(const OutputSection *sec) {

return sec->kind() == MergedKind; return sec->kind() == MergedKind;

} }

private: private:

int3Unsubmitted

Done

nor any of these


void mergeFlags(InputSection *input); void mergeFlags(InputSection *input);

size_t size = 0; size_t size = 0;

uint64_t fileSize = 0; uint64_t fileSize = 0;

}; };

// We maintain one ThunkInfo per real function.

// The "active thunk" is represented by the sym/isec pair that

// turns-over during finalize(): as the call-site address advances,

// the active thunk goes out of branch-range, and we create a new

// thunk to take its place.

// The remaining members -- bools and counters -- apply to the

int3Unsubmitted

Done

// thunk to take its place.

- // The remaining members--bools and counters--apply to the collection

+ // The remaining members -- bools and counters -- apply to the collection

// of thunks associated with the real function.

nit :p


// collection of thunks associated with the real function.

struct ThunkInfo {

// These denote the active thunk:

int3Unsubmitted

Done

struct ThunkInfo {

- // These denote the active thunk. isec

+ // These denote the active thunk.

Defined *sym = nullptr; // private-extern symbol for active thunk


Defined *sym = nullptr; // private-extern symbol for active thunk

InputSection *isec = nullptr; // input section for active thunk

// The following values are cumulative across all thunks on this function

uint32_t callSiteCount = 0; // how many calls to the real function?

uint32_t callSitesUsed = 0; // how many call sites processed so-far?

uint32_t thunkCallCount = 0; // how many call sites went to thunk?

uint8_t sequence = 0; // how many thunks created so-far?

};

int3Unsubmitted

Done

is this different from Symbol::isInStubs()?


gkmAuthorUnsubmitted

Done

No. Removed. Thanx!


extern llvm::DenseMap<Symbol *, ThunkInfo> thunkMap;

int3Unsubmitted

Done

looks like this is only used in one place... IMO we could just use auto there, but up to you


} // namespace macho } // namespace macho

} // namespace lld } // namespace lld

#endif #endif

lld/MachO/MergedOutputSection.cpp

//===- OutputSection.cpp --------------------------------------------------===// //===- OutputSection.cpp --------------------------------------------------===//

// //

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information. // See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// //

//===----------------------------------------------------------------------===// //===----------------------------------------------------------------------===//

#include "MergedOutputSection.h" #include "MergedOutputSection.h"

#include "Config.h"

#include "OutputSegment.h" #include "OutputSegment.h"

#include "SymbolTable.h"

#include "Symbols.h"

#include "SyntheticSections.h"

#include "Target.h"

#include "lld/Common/ErrorHandler.h" #include "lld/Common/ErrorHandler.h"

#include "lld/Common/Memory.h" #include "lld/Common/Memory.h"

#include "llvm/BinaryFormat/MachO.h" #include "llvm/BinaryFormat/MachO.h"

#include "llvm/Support/ScopedPrinter.h" #include "llvm/Support/ScopedPrinter.h"

#include <algorithm>

using namespace llvm; using namespace llvm;

using namespace llvm::MachO; using namespace llvm::MachO;

using namespace lld; using namespace lld;

using namespace lld::macho; using namespace lld::macho;

void MergedOutputSection::mergeInput(InputSection *input) { void MergedOutputSection::mergeInput(InputSection *input) {

if (inputs.empty()) { if (inputs.empty()) {

align = input->align; align = input->align;

flags = input->flags; flags = input->flags;

} else { } else {

align = std::max(align, input->align); align = std::max(align, input->align);

mergeFlags(input); mergeFlags(input);

} }

inputs.push_back(input); inputs.push_back(input);

input->parent = this; input->parent = this;

} }

int3Unsubmitted

Done

nit: isThunkable |= input->isThunkable;


// Branch-range extension can be implemented in two ways, either through ...

// (1) Branch islands: Single branch instructions (also of limited range),

// that might be chained in multiple hops to reach the desired

// destination. On ARM64, as many as 16 branch islands are needed to hop between

// opposite ends of a 2 GiB program. LD64 uses branch islands exclusively,

// even when it needs excessive hops.

// (2) Thunks: Instruction(s) to load the destination address into a scratch

// register, followed by a register-indirect branch. Thunks are

// constructed to reach any arbitrary address, so need not be

// chained. Although thunks need not be chained, a program might need

// multiple thunks to the same destination distributed throughout a large

int3Unsubmitted

Done

// chained. Although thunks need not be chained, a program might need

- // multiple thunks to the same destiation distributed throughout a large

+ // multiple thunks to the same destination distributed throughout a large

// program so that all call sites can have one within range.


// program so that all call sites can have one within range.

// The optimal approach is to mix islands for destinations within two hops,

// and use thunks for destinations at greater distance. For now, we only

// implement thunks. TODO: Add support for branch islands!

int3Unsubmitted

Done

// and use thunks for destinations at greater distance. For now, we only

- // implement thunks. TODO: Adding suppport for branch islands!

+ // implement thunks. TODO: Adding support for branch islands!

// Internally -- as expressed in LLD's data structures -- a


int3Unsubmitted

Done

might be worth mentioning somewhere that thunks themselves are presumed to have an effectively unlimited range, so thunks do not need to jump into other thunks

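To make the "16 hops across a 2 GiB program" figure from the comment concrete, here is a minimal sketch of the arithmetic. It assumes an ARM64 direct-branch reach of roughly 128 MiB in each direction; `kArm64BranchRange` and `hopsToSpan` are hypothetical names used only for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ARM64 B/BL reach roughly +/-128 MiB from the call site.
constexpr uint64_t kArm64BranchRange = 128ULL << 20;

// Hypothetical helper: how many limited-range hops are needed to span
// `distance` bytes when each hop covers at most `range` bytes.
constexpr uint64_t hopsToSpan(uint64_t distance, uint64_t range) {
  return (distance + range - 1) / range; // ceiling division
}

// 2 GiB / 128 MiB = 16 hops for a scheme built purely on chained branches.
static_assert(hopsToSpan(2ULL << 30, kArm64BranchRange) == 16,
              "2 GiB needs 16 hops at 128 MiB per hop");
```

A thunk, by contrast, loads the full destination address into a register, so under the same assumption it reaches the target in a single hop regardless of distance.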

// Internally -- as expressed in LLD's data structures -- a

// branch-range-extension thunk comprises ...

// (1) new Defined privateExtern symbol for the thunk named

// <FUNCTION>.thunk.<SEQUENCE>, which references ...

// (2) new InputSection, which contains ...

// (3.1) new data for the instructions to load & branch to the far address +

// (3.2) new Relocs on instructions to load the far address, which reference ...

// (4.1) existing Defined extern symbol for the real function in __text, or

// (4.2) existing DylibSymbol for the real function in a dylib

// Nearly-optimal thunk-placement algorithm features:

// * Single pass: O(n) on the number of call sites.

// * Accounts for the exact space overhead of thunks - no heuristics

// * Exploits the full range of call instructions - forward & backward

int3Unsubmitted

Done

// pointing to (b) an InputSection holding machine instructions

- // (same code as a MachO stub), and (c) Reloc(s) that reference the

+ // (similar code as a MachO stub), and (c) Reloc(s) that reference the

// real function for fixing-up the stub code.


// Data:

// * DenseMap<Symbol *, ThunkInfo> thunkMap: Maps the function symbol

// to its thunk bookkeeper.

// * struct ThunkInfo (bookkeeper): Call instructions have limited range, and

// distant call sites might be unable to reach the same thunk, so multiple

// thunks are necessary to serve all call sites in a very large program. A

// ThunkInfo stores state for all thunks associated with a particular

// function: (a) thunk symbol, (b) input section containing stub code, and

int3Unsubmitted

Done

// prepareSymbolRelocation() and prepareBranchTarget() dig into

- // Reloc records. They flip the new booleans Reloc::isThunkable,

+ // Reloc records. They flip the booleans Reloc::isThunkable,

// Inputsection::isThunkable, and MergedOutputSection::isThunkable

it won't be 'new' after this lands :)


// (c) sequence number for the active thunk incarnation. When an old thunk

int3Unsubmitted

Done

// prepareSymbolRelocation() and prepareBranchTarget() dig into

- // Reloc records. Relocs::isCallSite, Inputsection::callSiteCount,

+ // Reloc records. Relocs::isCallSite, InputSection::callSiteCount,

// and MergedOutputSection::callSiteCount memoize paths to call


// goes out of range, we increment the sequence number and create a new

// thunk named <FUNCTION>.thunk.<SEQUENCE>.

// * A thunk incarnation comprises (a) private-extern Defined symbol pointing

// to (b) an InputSection holding machine instructions (similar to a MachO

// stub), and (c) Reloc(s) that reference the real function for fixing-up

// the stub code.

// * std::vector<InputSection *> MergedOutputSection::thunks: A vector parallel

// to the inputs vector. We store new thunks via cheap vector append, rather

// than costly insertion into the inputs vector.

// Control Flow:

int3Unsubmitted

Done

// * MergedInputSection::finalize() and MergedInputSection::writeTo()

- // merge the inputs and thunks vectors (both ordered by asending

+ // merge the inputs and thunks vectors (both ordered by ascending

// address), which is simple and cheap.


// * During address assignment, MergedOutputSection::finalize() examines call

// sites by ascending address and creates thunks. When a function is beyond

// the range of a call site, we need a thunk. Place it at the largest

// available forward address from the call site. Call sites increase

// monotonically and thunks are always placed as far forward as possible;

// thus, we place thunks at monotonically increasing addresses. Once a thunk

// is placed, it and all previous input-section addresses are final.

// * MergedOutputSection::finalize() and MergedOutputSection::writeTo() merge

// the inputs and thunks vectors (both ordered by ascending address), which

// is simple and cheap.

DenseMap<Symbol *, ThunkInfo> lld::macho::thunkMap;

// Determine whether we need thunks, which depends on the target arch -- RISC

// (e.g., ARM) generally does because it has limited-range branch/call

// instructions, whereas CISC (e.g., x86) generally doesn't. RISC only needs

// thunks for programs so large that branch source & destination addresses

// might differ by more than the range of branch instruction(s).

int3Unsubmitted

Done

nit: rm the newline? I usually take a newline to indicate "the above comment may apply to more than just the function immediately below"

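The size test that needsThunks() performs first can be sketched on its own as follows. This is a simplified model, not the patch's code: `SectionLayout`, `alignUp`, and `mayNeedThunks` are hypothetical names, alignment is assumed to be given in bytes, and the stubs section is assumed to follow the text immediately.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the size and alignment of one input section.
struct SectionLayout {
  uint64_t size;
  uint64_t align; // in bytes, assumed non-zero
};

// Round `value` up to the next multiple of `align`.
inline uint64_t alignUp(uint64_t value, uint64_t align) {
  return (value + align - 1) / align * align;
}

// Thunks can only be needed when the laid-out text plus the stubs
// spans more than one branch range.
bool mayNeedThunks(uint64_t startVA, const std::vector<SectionLayout> &secs,
                   uint64_t stubsSize, uint64_t branchRange) {
  uint64_t endVA = startVA;
  for (const SectionLayout &s : secs)
    endVA = alignUp(endVA, s.align) + s.size;
  return endVA - startVA + stubsSize > branchRange;
}
```

When this check fails, every call can reach every destination directly, which is why the patch can skip the call-site bookkeeping entirely for small programs.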

bool MergedOutputSection::needsThunks() const {

if (!target->usesThunks())

return false;

uint64_t isecAddr = addr;

for (InputSection *isec : inputs)

isecAddr = alignTo(isecAddr, isec->align) + isec->getSize();

if (isecAddr - addr + in.stubs->getSize() <= target->branchRange)

return false;

// Yes, this program is large enough to need thunks.

for (InputSection *isec : inputs) {

int3Unsubmitted

Done

hm so we are calling estimateStubsInRangeVA itself in a loop... is this potentially expensive? I guess thunkMap and endIdx - callIdx aren't typically large, so the nested loops are probably fine... but might be worth calling out explicitly in a comment


gkmAuthorUnsubmitted

Done

Although we call estimateStubsInRangeVA() within a loop, it is guarded by a condition that is true only once. The call needs to sit within the loop so that it happens at the proper time: as soon as all input sections are finalized, i.e., when the end of __text is within forward-branch range of the current call site. I will add a comment to highlight that feature.


int3Unsubmitted

Done

oh I see. Thanks for adding the comment!


for (Reloc &r : isec->relocs) {

if (!target->hasAttr(r.type, RelocAttrBits::BRANCH))

int3Unsubmitted

Done

unnecessary parens


continue;

auto *sym = r.referent.get<Symbol *>();

// Pre-populate the thunkMap and memoize call site counts for every

// InputSection and ThunkInfo. We do this for the benefit of

// MergedOutputSection::estimateStubsInRangeVA()

ThunkInfo &thunkInfo = thunkMap[sym];

// Knowing ThunkInfo call site count will help us know whether or not we

// might need to create more for this referent at the time we are

// estimating distance to __stubs in estimateStubsInRangeVA().

++thunkInfo.callSiteCount;

// Knowing InputSection call site count will help us avoid work on those

// that have no BRANCH relocs.

++isec->callSiteCount;

}

return true;

}

int3Unsubmitted

Done

all these to_hexString() calls make me wonder if we should have a LOG macro that doesn't actually evaluate its arguments till they are needed...

(I still think we should avoid adding Config::verbose, we can just extend ErrorHandler)

if this turns out to be necessary let's do it in a stacked diff... there's too much going on in this one already


gkmAuthorUnsubmitted

Done

This is called only once, so the overhead of a few to_hexString() calls is negligible, and not worthy of a LOG macro. I already removed Config::verbose and inject OPT_verbose into errorHandler().verbose.


// Since __stubs is placed after __text, we must estimate the address

int3Unsubmitted

Done

"Since __stubs is placed after __text"?

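The threshold that estimateStubsInRangeVA() computes reduces to one expression: the worst-case end of __stubs minus the branch range. A minimal sketch, with the hypothetical name `stubsInRangeThreshold` and plain integers in place of LLD's section objects, might read as follows. Unlike the patch, this sketch clamps at zero rather than relying on unsigned wraparound; that is an assumption made here for safety, not something the diff does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the address from which a plain forward branch is
// guaranteed to reach any stub, assuming __stubs follows __text and every
// function with unprocessed call sites may still receive one more thunk.
uint64_t stubsInRangeThreshold(uint64_t textEndVA, uint64_t maxPotentialThunks,
                               uint64_t thunkSize, uint64_t stubsSize,
                               uint64_t branchRange) {
  uint64_t worstCaseStubsEnd =
      textEndVA + maxPotentialThunks * thunkSize + stubsSize;
  // Clamp instead of wrapping when everything is already within range.
  return worstCaseStubsEnd > branchRange ? worstCaseStubsEnd - branchRange : 0;
}
```

For example, with text ending at 0x10000000, ten potential thunks of 12 bytes each (an illustrative size), 0x1000 bytes of stubs, and a 128 MiB range, call sites at or above 0x8001078 can branch to stubs directly.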

// beyond which stubs are within range of a simple forward branch.

uint64_t MergedOutputSection::estimateStubsInRangeVA(size_t callIdx) const {

uint64_t branchRange = target->branchRange;

size_t endIdx = inputs.size();

InputSection *isec = inputs[callIdx];

uint64_t isecVA = isec->getVA();

// Tally the non-stub functions which still have call sites

// remaining to process, which yields the maximum number

// of thunks we might yet place.

size_t maxPotentialThunks = 0;

for (auto &tp : thunkMap) {

ThunkInfo &ti = tp.second;

maxPotentialThunks +=

!tp.first->isInStubs() && ti.callSitesUsed < ti.callSiteCount;

}

// Tally the total size of input sections remaining to process.

uint64_t isecEnd = isec->getVA();

for (size_t i = callIdx; i < endIdx; i++) {

InputSection *isec = inputs[i];

isecEnd = alignTo(isecEnd, isec->align) + isec->getSize();

}

// Estimate the address after which call sites can safely call stubs

// directly rather than through intermediary thunks.

uint64_t stubsInRangeVA = isecEnd + maxPotentialThunks * target->thunkSize +

in.stubs->getSize() - branchRange;

log("thunks = " + std::to_string(thunkMap.size()) +

", potential = " + std::to_string(maxPotentialThunks) +

", stubs = " + std::to_string(in.stubs->getSize()) + ", isecVA = " +

to_hexString(isecVA) + ", threshold = " + to_hexString(stubsInRangeVA) +

", isecEnd = " + to_hexString(isecEnd) +

", tail = " + to_hexString(isecEnd - isecVA) +

", slop = " + to_hexString(branchRange - (isecEnd - isecVA)));

return stubsInRangeVA;

}

void MergedOutputSection::finalize() { void MergedOutputSection::finalize() {

uint64_t isecAddr = addr; uint64_t isecAddr = addr;

uint64_t isecFileOff = fileOff; uint64_t isecFileOff = fileOff;

for (InputSection *isec : inputs) { auto finalizeOne = [&](InputSection *isec) {

isecAddr = alignTo(isecAddr, isec->align); isecAddr = alignTo(isecAddr, isec->align);

isecFileOff = alignTo(isecFileOff, isec->align); isecFileOff = alignTo(isecFileOff, isec->align);

isec->outSecOff = isecAddr - addr; isec->outSecOff = isecAddr - addr;

isec->outSecFileOff = isecFileOff - fileOff; isec->outSecFileOff = isecFileOff - fileOff;

isec->isFinal = true;

isecAddr += isec->getSize(); isecAddr += isec->getSize();

isecFileOff += isec->getFileSize(); isecFileOff += isec->getFileSize();

};

if (!needsThunks()) {

for (InputSection *isec : inputs)

int3Unsubmitted

Done

size_t thunkSize = target->thunkSize;

- if (thunkSize == 0) {

+ if (!target->usesThunks()) {

for (InputSection *isec : inputs)

seems more readable


finalizeOne(isec);

size = isecAddr - addr;

fileSize = isecFileOff - fileOff;

return;

}

uint64_t branchRange = target->branchRange;

uint64_t stubsInRangeVA = TargetInfo::outOfRangeVA;

size_t thunkSize = target->thunkSize;

size_t relocCount = 0;

size_t callSiteCount = 0;

size_t thunkCallCount = 0;

size_t thunkCount = 0;

int3Unsubmitted

Done

can we drop this functionality for now, until if/when we decide to make stubs-before-text an option?


// inputs[finalIdx] is for finalization (address-assignment)

size_t finalIdx = 0;

// Kick-off by ensuring that the first input section has an address

for (size_t callIdx = 0, endIdx = inputs.size(); callIdx < endIdx;

++callIdx) {

int3Unsubmitted

Done

I find these kind of hard to read, could we use slightly more verbose names?


gkmAuthorUnsubmitted

Done

s/ic/I/; s/ie/E/, by convention I see many places in LLVM code, though not so much in LLD yet.
s/ix/iFinal/

Let me know if you hate I and E.


int3Unsubmitted

Done

how about finalIdx, callIdx, endIdx?


if (finalIdx == callIdx)

int3Unsubmitted

Done

doesn't seem like a useful assert


finalizeOne(inputs[finalIdx++]);

InputSection *isec = inputs[callIdx];

tschuettUnsubmitted

Done

LLVM prefers preincrement:
https://llvm.org/docs/CodingStandards.html#prefer-preincrement

There are a couple of postincrements.


int3Unsubmitted

Done

I mean this line actually does want to return a copy of the original value, so I think it's fine... but yeah the increments-for-effect-only should be preincrements


assert(isec->isFinal);

uint64_t isecVA = isec->getVA();

// Assign addresses up-to the forward branch-range limit

while (finalIdx < endIdx &&

isecAddr + inputs[finalIdx]->getSize() < isecVA + branchRange)

finalizeOne(inputs[finalIdx++]);

if (isec->callSiteCount == 0)

continue;

if (finalIdx == endIdx && stubsInRangeVA == TargetInfo::outOfRangeVA) {

int3Unsubmitted

Done

redundant assert


// When we have finalized all input sections, __stubs (destined

// to follow __text) comes within range of forward branches and

int3Unsubmitted

Done

I believe llvm-mc emits relocs in reverse order, but I don't think that's guaranteed anywhere in the format... we should probably sort it ourselves


gkmAuthorUnsubmitted

Done

Before I add code and overhead to sort already-sorted vectors, I'd like to look into this further.


int3Unsubmitted

Done

if nothing else we should assert for is_sorted()


// we can estimate the threshold address after which we can

// reach any stub with a forward branch. Note that although it

// sits in the middle of a loop, this code executes only once.

// It is in the loop because we need to call it at the proper

// time: the earliest call site from which the end of __text

// (and start of __stubs) comes within range of a forward branch.

stubsInRangeVA = estimateStubsInRangeVA(callIdx);

}

// Process relocs by ascending address, i.e., ascending offset within isec

std::vector<Reloc> &relocs = isec->relocs;

assert(is_sorted(relocs,

int3Unsubmitted

Done

// Calculate our call referent address

- auto *funcSym = r.referent.dyn_cast<Symbol *>();

- assert(funcSym);

+ auto *funcSym = r.referent.get<Symbol *>();

assert(isa<Defined>(funcSym) || isa<DylibSymbol>(funcSym));

(get<>() will assert)


[](Reloc &a, Reloc &b) { return a.offset > b.offset; }));

for (Reloc &r : reverse(relocs)) {

++relocCount;

if (!target->hasAttr(r.type, RelocAttrBits::BRANCH))

int3Unsubmitted

Done

std::vector<Reloc> &relocs = isec->relocs;

- assert(std::is_sorted(relocs.begin(), relocs.end(), [](Reloc &a, Reloc &b) {

+ assert(is_sorted(relocs, [](Reloc &a, Reloc &b) {

return a.offset > b.offset;

from STLExtras.h


continue;

++callSiteCount;

// Calculate branch reachability boundaries

uint64_t callVA = isecVA + r.offset;

uint64_t lowVA = branchRange < callVA ? callVA - branchRange : 0;

uint64_t highVA = callVA + branchRange;

// Calculate our call referent address

auto *funcSym = r.referent.get<Symbol *>();

int3Unsubmitted

Done

++callSiteCount;

- // calculate branch reachability boundaries

+ // Calculate branch reachability boundaries

uint64_t callVA = isecVA + r.offset;

since you capitalized everything else


ThunkInfo &thunkInfo = thunkMap[funcSym];

// The referent is not reachable, so we need to use a thunk ...

if (funcSym->isInStubs() && callVA >= stubsInRangeVA) {

// ... Oh, wait! We are close enough to the end that __stubs

int3Unsubmitted

Done

uint64_t highVA = callVA + branchRange;

- // Calculate our call referent address

+ // Calculate our call's referent address

auto *funcSym = r.referent.get<Symbol *>();

nit aside... I'm a bit confused by this comment. What address computation is going on? Isn't it more like "Determine whether the call's referent is reachable" or "Determine if the referent should be replaced by a thunk"?

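The reachability test under discussion amounts to checking whether the destination falls inside a window centred on the call site. A small sketch follows; `BranchWindow`, `branchWindow`, and `isReachable` are hypothetical names, and the strict inequalities mirror the `lowVA < funcVA && funcVA < highVA` comparison in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of bounds around one call site.
struct BranchWindow {
  uint64_t lowVA;
  uint64_t highVA;
};

// Compute the window, clamping the lower bound at zero so the unsigned
// subtraction cannot wrap for call sites near the start of the image.
BranchWindow branchWindow(uint64_t callVA, uint64_t branchRange) {
  return {branchRange < callVA ? callVA - branchRange : 0,
          callVA + branchRange};
}

// A destination strictly inside the window needs no thunk.
bool isReachable(uint64_t targetVA, const BranchWindow &w) {
  return w.lowVA < targetVA && targetVA < w.highVA;
}
```

The same window applies to an existing thunk: if the active thunk's address is inside it, the call site can reuse that thunk instead of creating a new one.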

// are now within range of a simple forward branch.

continue;

}

uint64_t funcVA = funcSym->resolveBranchVA();

++thunkInfo.callSitesUsed;

if (lowVA < funcVA && funcVA < highVA) {

// The referent is reachable with a simple call instruction.

continue;

}

++thunkInfo.thunkCallCount;

++thunkCallCount;

// If an existing thunk is reachable, use it ...

int3Unsubmitted

Done

if (lowVA < funcVA && funcVA < highVA) {

- // The is referent reachable with a simple call instruction.

+ // The referent is reachable with a simple call instruction.

continue;


if (thunkInfo.sym) {

uint64_t thunkVA = thunkInfo.isec->getVA();

if (lowVA < thunkVA && thunkVA < highVA) {

r.referent = thunkInfo.sym;

continue;

}

}

// ... otherwise, create a new thunk

if (isecAddr > highVA) {

int3Unsubmitted

Done

I think the lint is right here... the return value of to_string is going to be unowned

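The ownership concern here is that a non-owning string reference must not point at a temporary. One way to picture the fix, which the diff appears to achieve via `saver.save(...)`, is to move each generated name into long-lived storage and hand out references to that. The sketch below uses a hypothetical `NameSaver` built on `std::deque`; it is not LLD's saver, only an illustration of the idea.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical owner of generated names. std::deque never relocates
// existing elements on push_back, so returned references stay valid.
class NameSaver {
  std::deque<std::string> storage;

public:
  const std::string &save(std::string s) {
    storage.push_back(std::move(s));
    return storage.back();
  }
};

// Build "<FUNCTION>.thunk.<SEQUENCE>" and keep it alive in the saver.
const std::string &makeThunkName(NameSaver &saver, const std::string &func,
                                 unsigned sequence) {
  return saver.save(func + ".thunk." + std::to_string(sequence));
}
```

The names returned for sequence numbers 0 and 1 remain usable after later calls, because each lives in the saver rather than in a temporary.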

// When there is small-to-no margin between highVA and

// isecAddr and the distance between subsequent call sites is

// smaller than thunkSize, then a new thunk can go out of

// range. Fix by unfinalizing inputs[finalIdx] to reduce the

// distance between callVA and highVA, then shift some thunks

// to occupy address-space formerly occupied by the

// unfinalized inputs[finalIdx].

fatal(Twine(__FUNCTION__) + ": FIXME: thunk range overrun");

}

thunkInfo.isec = make<InputSection>();

thunkInfo.isec->name = isec->name;

thunkInfo.isec->segname = isec->segname;

thunkInfo.isec->parent = this;

int3Unsubmitted

Done

// unfinalized inputs[finalIdx].

- fatal(Twine(__FUNCTION__) + ": FIXME: thunk range overrun");

+ fatal(__FUNCTION__ ": FIXME: thunk range overrun");

}

thunkInfo.isec = make<InputSection>();

I think this should work


gkmAuthorUnsubmitted

Done

Sadly, it doesn't. Unlike old C compilers, __FUNCTION__ is not expanded by the preprocessor into a string literal. It is a compiler-internal variable, and thus not amenable to concat via adjacent string literals.


int3Unsubmitted

Done

huh, TIL. I *did* vaguely wonder when C preprocessors became smart enough to parse function declarations :p makes more sense that the compiler is doing it

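A short illustration of the point made in this thread, using the standard `__func__` (which behaves like `__FUNCTION__` in this respect): because it is a predefined variable rather than a string literal, it has to be joined to the message at run time. The function name `thunkOverrunMessage` is hypothetical.

```cpp
#include <cassert>
#include <string>

std::string thunkOverrunMessage() {
  // Writing `__func__ ": FIXME: ..."` would not compile: adjacent-literal
  // concatenation applies only to string literals, and __func__ is a
  // predefined array variable. Build the message at run time instead.
  return std::string(__func__) + ": FIXME: thunk range overrun";
}
```

This is the same shape as the `Twine(__FUNCTION__) + "..."` form that the diff keeps.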

StringRef thunkName = saver.save(funcSym->getName() + ".thunk." +

std::to_string(thunkInfo.sequence++));

r.referent = thunkInfo.sym = symtab->addDefined(

thunkName, /*file=*/nullptr, thunkInfo.isec, /*value=*/0,

/*size=*/thunkSize, /*isWeakDef=*/false, /*isPrivateExtern=*/true,

/*isThumb=*/false);

target->populateThunk(thunkInfo.isec, funcSym);

finalizeOne(thunkInfo.isec);

thunks.push_back(thunkInfo.isec);

++thunkCount;

}

} }

size = isecAddr - addr; size = isecAddr - addr;

fileSize = isecFileOff - fileOff; fileSize = isecFileOff - fileOff;

log("thunks for " + parent->name + "," + name +

int3Unsubmitted

Done

LLD-ELF handles --verbose by assigning to errorHandler().verbose, I think we should do likewise


gkmAuthorUnsubmitted

Done

Done. Note that ELF has no other use of OPT_verbose. COFF assigns to errorHandler().verbose, and also has config->verbose to enable other output.


int3Unsubmitted

Not Done

I don't think this line below should be a warning though... log() would suffice (and obviate the need for Config::verbose)


": funcs = " + std::to_string(thunkMap.size()) +

", relocs = " + std::to_string(relocCount) +

", all calls = " + std::to_string(callSiteCount) +

", thunk calls = " + std::to_string(thunkCallCount) +

", thunks = " + std::to_string(thunkCount));

} }

void MergedOutputSection::writeTo(uint8_t *buf) const { void MergedOutputSection::writeTo(uint8_t *buf) const {

for (InputSection *isec : inputs) // Merge input sections from thunk & ordinary vectors

isec->writeTo(buf + isec->outSecFileOff); size_t i = 0, ie = inputs.size();

size_t t = 0, te = thunks.size();

while (i < ie || t < te) {

while (i < ie && (t == te || inputs[i]->getSize() == 0 ||

inputs[i]->outSecOff < thunks[t]->outSecOff)) {

inputs[i]->writeTo(buf + inputs[i]->outSecFileOff);

++i;

}

while (t < te && (i == ie || thunks[t]->outSecOff < inputs[i]->outSecOff)) {

thunks[t]->writeTo(buf + thunks[t]->outSecFileOff);

++t;

}

int3Unsubmitted

Done

instead of two sorted arrays, would it be simpler to create a map of regular InputSection to an array of thunks that immediately follow it?

gkmAuthorUnsubmitted

Done

Sounds like more overhead & thus slower.

int3Unsubmitted

Done

Simpler code-wise though. This isn't likely to be perf-critical...

gkmAuthorUnsubmitted

Done

I wish to punt on this for now. I will revisit after I get big programs to run with thunks.
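For reference, the alternative int3 suggests could look roughly like the sketch below: instead of merging two vectors sorted by offset, keep a map from each regular section to the thunks placed immediately after it. `Section`, `layoutOrder`, and `thunksAfter` are hypothetical names for illustration and do not appear in this diff:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, stripped-down stand-in for an input section.
struct Section {
  std::string name;
};

// Visit sections in layout order; after each regular section, visit
// the thunks recorded as immediately following it.
std::vector<std::string> layoutOrder(
    const std::vector<const Section *> &inputs,
    const std::map<const Section *, std::vector<const Section *>> &thunksAfter) {
  std::vector<std::string> order;
  for (const Section *isec : inputs) {
    order.push_back(isec->name);
    auto it = thunksAfter.find(isec);
    if (it != thunksAfter.end())
      for (const Section *thunk : it->second)
        order.push_back(thunk->name);
  }
  return order;
}
```

This trades one map lookup per input section for a simpler loop; whether that cost matters would need profile data.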

}

} }

// TODO: this is most likely wrong; reconsider how section flags // TODO: this is most likely wrong; reconsider how section flags

// are actually merged. The logic presented here was written without // are actually merged. The logic presented here was written without

// any form of informed research. // any form of informed research.

void MergedOutputSection::mergeFlags(InputSection *input) { void MergedOutputSection::mergeFlags(InputSection *input) {

uint8_t baseType = flags & SECTION_TYPE; uint8_t baseType = flags & SECTION_TYPE;

uint8_t inputType = input->flags & SECTION_TYPE; uint8_t inputType = input->flags & SECTION_TYPE;

Show All 19 Lines

lld/MachO/Options.td

	include "llvm/Option/OptParser.td"			include "llvm/Option/OptParser.td"

	// Flags that lld/MachO understands but ld64 doesn't. These take			// Flags that lld/MachO understands but ld64 doesn't. These take
	// '--' instead of '-' and use dashes instead of underscores, so			// '--' instead of '-' and use dashes instead of underscores, so
	// they don't collide with the ld64 compat options.			// they don't collide with the ld64 compat options.
	def grp_lld : OptionGroup<"kind">, HelpText<"LLD-SPECIFIC">;			def grp_lld : OptionGroup<"kind">, HelpText<"LLD-SPECIFIC">;

	def help : Flag<["-", "--"], "help">,			def help : Flag<["-", "--"], "help">,
	Group<grp_lld>;			Group<grp_lld>;
	def help_hidden : Flag<["--"], "help-hidden">,			def help_hidden : Flag<["--"], "help-hidden">,
	HelpText<"Display help for hidden options">,			HelpText<"Display help for hidden options">,
	Group<grp_lld>;			Group<grp_lld>;
				def verbose : Flag<["--"], "verbose">,
				Group<grp_lld>;
	def error_limit_eq : Joined<["--"], "error-limit=">,			def error_limit_eq : Joined<["--"], "error-limit=">,
	HelpText<"Maximum number of errors to print before exiting (default: 20)">,			HelpText<"Maximum number of errors to print before exiting (default: 20)">,
	Group<grp_lld>;			Group<grp_lld>;
	def color_diagnostics: Flag<["--"], "color-diagnostics">,			def color_diagnostics: Flag<["--"], "color-diagnostics">,
	HelpText<"Alias for --color-diagnostics=always">,			HelpText<"Alias for --color-diagnostics=always">,
	Group<grp_lld>;			Group<grp_lld>;
	def no_color_diagnostics: Flag<["--"], "no-color-diagnostics">,			def no_color_diagnostics: Flag<["--"], "no-color-diagnostics">,
	HelpText<"Alias for --color-diagnostics=never">,			HelpText<"Alias for --color-diagnostics=never">,
	▲ Show 20 Lines • Show All 1,292 Lines • Show Last 20 Lines

lld/MachO/Symbols.h

Show First 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	public:
virtual bool isTlv() const { llvm_unreachable("cannot be TLV"); }		virtual bool isTlv() const { llvm_unreachable("cannot be TLV"); }

// Whether this symbol is in the GOT or TLVPointer sections.		// Whether this symbol is in the GOT or TLVPointer sections.
bool isInGot() const { return gotIndex != UINT32_MAX; }		bool isInGot() const { return gotIndex != UINT32_MAX; }

// Whether this symbol is in the StubsSection.		// Whether this symbol is in the StubsSection.
bool isInStubs() const { return stubsIndex != UINT32_MAX; }		bool isInStubs() const { return stubsIndex != UINT32_MAX; }

		uint64_t getStubVA() const;
		uint64_t getGotVA() const;
		uint64_t getTlvVA() const;
		uint64_t resolveBranchVA() const {
		assert(isa<Defined>(this) || isa<DylibSymbol>(this));
		return isInStubs() ? getStubVA() : getVA();
		}
		uint64_t resolveGotVA() const { return isInGot() ? getGotVA() : getVA(); }
		uint64_t resolveTlvVA() const { return isInGot() ? getTlvVA() : getVA(); }

// The index of this symbol in the GOT or the TLVPointer section, depending		// The index of this symbol in the GOT or the TLVPointer section, depending
// on whether it is a thread-local. A given symbol cannot be referenced by		// on whether it is a thread-local. A given symbol cannot be referenced by
// both these sections at once.		// both these sections at once.
uint32_t gotIndex = UINT32_MAX;		uint32_t gotIndex = UINT32_MAX;

uint32_t stubsIndex = UINT32_MAX;		uint32_t stubsIndex = UINT32_MAX;

uint32_t symtabIndex = UINT32_MAX;		uint32_t symtabIndex = UINT32_MAX;
▲ Show 20 Lines • Show All 119 Lines • ▼ Show 20 Lines

class DylibSymbol : public Symbol {		class DylibSymbol : public Symbol {
public:		public:
DylibSymbol(DylibFile *file, StringRefZ name, bool isWeakDef,		DylibSymbol(DylibFile *file, StringRefZ name, bool isWeakDef,
RefState refState, bool isTlv)		RefState refState, bool isTlv)
: Symbol(DylibKind, name, file), refState(refState), weakDef(isWeakDef),		: Symbol(DylibKind, name, file), refState(refState), weakDef(isWeakDef),
tlv(isTlv) {}		tlv(isTlv) {}

		uint64_t getVA() const override;
bool isWeakDef() const override { return weakDef; }		bool isWeakDef() const override { return weakDef; }
bool isWeakRef() const override { return refState == RefState::Weak; }		bool isWeakRef() const override { return refState == RefState::Weak; }
bool isReferenced() const { return refState != RefState::Unreferenced; }		bool isReferenced() const { return refState != RefState::Unreferenced; }
bool isTlv() const override { return tlv; }		bool isTlv() const override { return tlv; }
bool isDynamicLookup() const { return file == nullptr; }		bool isDynamicLookup() const { return file == nullptr; }
bool hasStubsHelper() const { return stubsHelperIndex != UINT32_MAX; }		bool hasStubsHelper() const { return stubsHelperIndex != UINT32_MAX; }

DylibFile *getFile() const {		DylibFile *getFile() const {
Show All 27 Lines	private:
const llvm::object::Archive::Symbol sym;		const llvm::object::Archive::Symbol sym;
};		};

union SymbolUnion {		union SymbolUnion {
alignas(Defined) char a[sizeof(Defined)];		alignas(Defined) char a[sizeof(Defined)];
alignas(Undefined) char b[sizeof(Undefined)];		alignas(Undefined) char b[sizeof(Undefined)];
alignas(CommonSymbol) char c[sizeof(CommonSymbol)];		alignas(CommonSymbol) char c[sizeof(CommonSymbol)];
alignas(DylibSymbol) char d[sizeof(DylibSymbol)];		alignas(DylibSymbol) char d[sizeof(DylibSymbol)];
alignas(LazySymbol) char e[sizeof(LazySymbol)];		alignas(LazySymbol) char e[sizeof(LazySymbol)];
};		};
		int3: this can be committed on its own as an NFC commit... try to keep the diff to relevant changes only

template <typename T, typename... ArgT>		template <typename T, typename... ArgT>
T *replaceSymbol(Symbol *s, ArgT &&...arg) {		T *replaceSymbol(Symbol *s, ArgT &&...arg) {
static_assert(sizeof(T) <= sizeof(SymbolUnion), "SymbolUnion too small");		static_assert(sizeof(T) <= sizeof(SymbolUnion), "SymbolUnion too small");
static_assert(alignof(T) <= alignof(SymbolUnion),		static_assert(alignof(T) <= alignof(SymbolUnion),
"SymbolUnion not aligned enough");		"SymbolUnion not aligned enough");
assert(static_cast<Symbol *>(static_cast<T *>(nullptr)) == nullptr &&		assert(static_cast<Symbol *>(static_cast<T *>(nullptr)) == nullptr &&
"Not a Symbol");		"Not a Symbol");
Show All 15 Lines

lld/MachO/Symbols.cpp

	Show All 21 Lines
	}			}

	std::string lld::toString(const Symbol &sym) { return demangle(sym.getName()); }			std::string lld::toString(const Symbol &sym) { return demangle(sym.getName()); }

	std::string lld::toMachOString(const object::Archive::Symbol &b) {			std::string lld::toMachOString(const object::Archive::Symbol &b) {
	return demangle(b.getName());			return demangle(b.getName());
	}			}

				uint64_t Symbol::getStubVA() const { return in.stubs->getVA(stubsIndex); }
				uint64_t Symbol::getGotVA() const { return in.got->getVA(gotIndex); }
				uint64_t Symbol::getTlvVA() const { return in.tlvPointers->getVA(gotIndex); }

	uint64_t Defined::getVA() const {			uint64_t Defined::getVA() const {
	if (isAbsolute())			if (isAbsolute())
	return value;			return value;

				if (!isec->isFinal) {
				// A target arch that does not use thunks ought never ask for
				// the address of a function that has not yet been finalized.
				assert(target->usesThunks());

				// MergedOutputSection::finalize() can seek the address of a
				// function before its address is assigned. The thunking algorithm
				// knows that unfinalized functions will be out of range, so it is
				// expedient to return a contrived out-of-range address.
				return TargetInfo::outOfRangeVA;
				}
	return isec->getVA() + value;			return isec->getVA() + value;
	}			}
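The effect of returning a contrived address for unfinalized sections can be sketched as follows. This is a simplified model, not the diff's algorithm: `Target`, `targetVA`, and `needsThunk` are hypothetical, and the range constant assumes the ARM64 `B`/`BL` reach of roughly 128 MiB in either direction, treated here as symmetric for simplicity:

```cpp
#include <cstdint>

constexpr uint64_t kOutOfRangeVA = 0xfull << 60;         // value used in the diff
constexpr uint64_t kBranchRange = 128ull * 1024 * 1024;  // assumed ARM64 reach

// Hypothetical stand-in for a branch target whose address may not
// have been assigned yet.
struct Target {
  bool isFinal;
  uint64_t addr;
};

// Unfinalized targets report an address no call site can reach.
uint64_t targetVA(const Target &t) {
  return t.isFinal ? t.addr : kOutOfRangeVA;
}

// A call needs a thunk when the target lies outside the branch range.
bool needsThunk(uint64_t callSiteVA, const Target &t) {
  uint64_t va = targetVA(t);
  uint64_t dist = va > callSiteVA ? va - callSiteVA : callSiteVA - va;
  return dist >= kBranchRange;
}
```

Because the sentinel is far from every real address, an unfinalized target is always classified as needing a thunk without a separate code path.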

	uint64_t Defined::getFileOffset() const {			uint64_t Defined::getFileOffset() const {
	if (isAbsolute()) {			if (isAbsolute()) {
	error("absolute symbol " + toString(*this) +			error("absolute symbol " + toString(*this) +
	" does not have a file offset");			" does not have a file offset");
	return 0;			return 0;
	}			}
	return isec->getFileOffset() + value;			return isec->getFileOffset() + value;
	}			}

				uint64_t DylibSymbol::getVA() const {
				return isInStubs() ? getStubVA() : Symbol::getVA();
				}

	void LazySymbol::fetchArchiveMember() { getFile()->fetch(sym); }			void LazySymbol::fetchArchiveMember() { getFile()->fetch(sym); }

lld/MachO/SyntheticSections.h

Show First 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	public:
uint64_t getSize() const override {		uint64_t getSize() const override {
return entries.size() * target->wordSize;		return entries.size() * target->wordSize;
}		}

void writeTo(uint8_t *buf) const override;		void writeTo(uint8_t *buf) const override;

void addEntry(Symbol *sym);		void addEntry(Symbol *sym);

		uint64_t getVA(uint32_t gotIndex) const {
		MaskRay: `const {`
		return addr + gotIndex * target->wordSize;
		}

private:		private:
llvm::SetVector<const Symbol *> entries;		llvm::SetVector<const Symbol *> entries;
};		};

class GotSection : public NonLazyPointerSectionBase {		class GotSection : public NonLazyPointerSectionBase {
public:		public:
GotSection()		GotSection()
: NonLazyPointerSectionBase(segment_names::dataConst,		: NonLazyPointerSectionBase(segment_names::dataConst,
▲ Show 20 Lines • Show All 146 Lines • ▼ Show 20 Lines
// appropriate symbol is found at runtime. However, the bound addresses will		// appropriate symbol is found at runtime. However, the bound addresses will
// still be written (non-lazily) into the LazyPointerSection.		// still be written (non-lazily) into the LazyPointerSection.

class StubsSection : public SyntheticSection {		class StubsSection : public SyntheticSection {
public:		public:
StubsSection();		StubsSection();
uint64_t getSize() const override;		uint64_t getSize() const override;
bool isNeeded() const override { return !entries.empty(); }		bool isNeeded() const override { return !entries.empty(); }
		void finalize() override;
void writeTo(uint8_t *buf) const override;		void writeTo(uint8_t *buf) const override;
const llvm::SetVector<Symbol *> &getEntries() const { return entries; }		const llvm::SetVector<Symbol *> &getEntries() const { return entries; }
// Returns whether the symbol was added. Note that every stubs entry will		// Returns whether the symbol was added. Note that every stubs entry will
// have a corresponding entry in the LazyPointerSection.		// have a corresponding entry in the LazyPointerSection.
bool addEntry(Symbol *);		bool addEntry(Symbol *);
		uint64_t getVA(uint32_t stubsIndex) const {
		// MergedOutputSection::finalize() can seek the address of a
		// stub before its address is assigned. Before __stubs is
		// finalized, return a contrived out-of-range address.
		return isFinal ? addr + stubsIndex * target->stubSize
		: TargetInfo::outOfRangeVA;
		}

		bool isFinal = false; // is address assigned?

private:		private:
llvm::SetVector<Symbol *> entries;		llvm::SetVector<Symbol *> entries;
};		};

class StubHelperSection : public SyntheticSection {		class StubHelperSection : public SyntheticSection {
public:		public:
StubHelperSection();		StubHelperSection();
▲ Show 20 Lines • Show All 231 Lines • Show Last 20 Lines

lld/MachO/SyntheticSections.cpp

Show First 20 Lines • Show All 58 Lines • ▼ Show 20 Lines
// dyld3's MachOLoaded::getSlide() assumes that the __TEXT segment starts		// dyld3's MachOLoaded::getSlide() assumes that the __TEXT segment starts
// from the beginning of the file (i.e. the header).		// from the beginning of the file (i.e. the header).
MachHeaderSection::MachHeaderSection()		MachHeaderSection::MachHeaderSection()
: SyntheticSection(segment_names::text, section_names::header) {		: SyntheticSection(segment_names::text, section_names::header) {
// XXX: This is a hack. (See D97007)		// XXX: This is a hack. (See D97007)
// Setting the index to 1 to pretend that this section is the text		// Setting the index to 1 to pretend that this section is the text
// section.		// section.
index = 1;		index = 1;
		isec->isFinal = true;
}		}

void MachHeaderSection::addLoadCommand(LoadCommand *lc) {		void MachHeaderSection::addLoadCommand(LoadCommand *lc) {
loadCommands.push_back(lc);		loadCommands.push_back(lc);
sizeOfCmds += lc->getSize();		sizeOfCmds += lc->getSize();
}		}

uint64_t MachHeaderSection::getSize() const {		uint64_t MachHeaderSection::getSize() const {
▲ Show 20 Lines • Show All 345 Lines • ▼ Show 20 Lines
void StubsSection::writeTo(uint8_t *buf) const {		void StubsSection::writeTo(uint8_t *buf) const {
size_t off = 0;		size_t off = 0;
for (const Symbol *sym : entries) {		for (const Symbol *sym : entries) {
target->writeStub(buf + off, *sym);		target->writeStub(buf + off, *sym);
off += target->stubSize;		off += target->stubSize;
}		}
}		}

		void StubsSection::finalize() { isFinal = true; }

bool StubsSection::addEntry(Symbol *sym) {		bool StubsSection::addEntry(Symbol *sym) {
bool inserted = entries.insert(sym);		bool inserted = entries.insert(sym);
if (inserted)		if (inserted)
sym->stubsIndex = entries.size() - 1;		sym->stubsIndex = entries.size() - 1;
return inserted;		return inserted;
}		}

StubHelperSection::StubHelperSection()		StubHelperSection::StubHelperSection()
▲ Show 20 Lines • Show All 660 Lines • ▼ Show 20 Lines	void macho::createSyntheticSymbols() {
switch (config->outputType) {		switch (config->outputType) {
// FIXME: Assign the right address value for these symbols		// FIXME: Assign the right address value for these symbols
// (rather than 0). But we need to do that after assignAddresses().		// (rather than 0). But we need to do that after assignAddresses().
case MH_EXECUTE:		case MH_EXECUTE:
// If linking PIE, __mh_execute_header is a defined symbol in		// If linking PIE, __mh_execute_header is a defined symbol in
// __TEXT, __text)		// __TEXT, __text)
// Otherwise, it's an absolute symbol.		// Otherwise, it's an absolute symbol.
if (config->isPic)		if (config->isPic)
symtab->addSynthetic("__mh_execute_header", in.header->isec, 0,		symtab->addSynthetic("__mh_execute_header", in.header->isec, /*value=*/0,
/*privateExtern=*/false,		/*privateExtern=*/false,
/*includeInSymtab=*/true);		/*includeInSymtab=*/true);
else		else
symtab->addSynthetic("__mh_execute_header",		symtab->addSynthetic("__mh_execute_header",
/*isec*/ nullptr, 0,		/*isec*/ nullptr, /*value=*/0,
/*privateExtern=*/false,		/*privateExtern=*/false,
/*includeInSymtab=*/true);		/*includeInSymtab=*/true);
break;		break;

// The following symbols are N_SECT symbols, even though the header is not		// The following symbols are N_SECT symbols, even though the header is not
// part of any section and that they are private to the bundle/dylib/object		// part of any section and that they are private to the bundle/dylib/object
// they are part of.		// they are part of.
case MH_BUNDLE:		case MH_BUNDLE:
Show All 26 Lines

lld/MachO/Target.h

Show All 18 Lines
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>

namespace lld {		namespace lld {
namespace macho {		namespace macho {
LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();		LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();

class Symbol;		class Symbol;
		class Defined;
class DylibSymbol;		class DylibSymbol;
class InputSection;		class InputSection;

class TargetInfo {		class TargetInfo {
public:		public:
template <class LP> TargetInfo(LP) {		template <class LP> TargetInfo(LP) {
// Having these values available in TargetInfo allows us to access them		// Having these values available in TargetInfo allows us to access them
// without having to resort to templates.		// without having to resort to templates.
Show All 25 Lines	public:
// entries. resolveSymbolVA() may also relax the target instructions to save		// entries. resolveSymbolVA() may also relax the target instructions to save
// on a level of address indirection.		// on a level of address indirection.
virtual void relaxGotLoad(uint8_t *loc, uint8_t type) const = 0;		virtual void relaxGotLoad(uint8_t *loc, uint8_t type) const = 0;

virtual const RelocAttrs &getRelocAttrs(uint8_t type) const = 0;		virtual const RelocAttrs &getRelocAttrs(uint8_t type) const = 0;

virtual uint64_t getPageSize() const = 0;		virtual uint64_t getPageSize() const = 0;

		virtual void populateThunk(InputSection *thunk, Symbol *funcSym) {
		int3: nit: insert `llvm_unreachable()` in the body?
		llvm_unreachable("target does not use thunks");
		}

bool hasAttr(uint8_t type, RelocAttrBits bit) const {		bool hasAttr(uint8_t type, RelocAttrBits bit) const {
return getRelocAttrs(type).hasAttr(bit);		return getRelocAttrs(type).hasAttr(bit);
}		}

		bool usesThunks() const { return thunkSize > 0; }
		MaskRay: const

uint32_t magic;		uint32_t magic;
llvm::MachO::CPUType cpuType;		llvm::MachO::CPUType cpuType;
uint32_t cpuSubtype;		uint32_t cpuSubtype;

uint64_t pageZeroSize;		uint64_t pageZeroSize;
size_t headerSize;		size_t headerSize;
size_t stubSize;		size_t stubSize;
size_t stubHelperHeaderSize;		size_t stubHelperHeaderSize;
size_t stubHelperEntrySize;		size_t stubHelperEntrySize;
size_t wordSize;		size_t wordSize;

		size_t thunkSize = 0;
		uint64_t branchRange = 0;

		// We contrive this value as sufficiently far from any valid address that it
		// will always be out-of-range for any architecture. UINT64_MAX is not a
		// good choice because it is (a) only 1 away from wrapping to 0, and (b) the
		int3: This seems unnecessarily cute... Can we just use `numeric_limits<uint64_t>::max()`? Also it's not really target-specific so I'd prefer it not be put in Target
		gkm: FYI, this wasn't intended to be cute at all. `numeric_limits<uint64_t>::max()` is not viable since it is the tombstone value for `DenseMap<>` and induces weird assertions. Another disqualifier is that `max()` is only one increment away from wrapping to 0. My chosen value `0xf000'0000'0000'0000` is VERY FAR away from 0. Perhaps MachO guarantees that `__text` addresses will never be within even 4 GiB of 0, so I am unnecessarily cautious?
		Regarding target-specificity: it is OS/runtime specific (mostly Darwin & iOS), modulo CPU-arch variations. It so happens all are Apple creations, and common enough that we can choose a single constant that works for all. If not `Target.h`, where do you propose we define this? I don't see anywhere that seems a better fit ... `Config.h`? `OutputSection.h`? `Relocations.h`? What looks good to you?
		int3: > since it is the tombstone value for DenseMap<> and induces weird assertions
		ohh okay. That makes sense... please add that to the comment :) I thought you were trying to make a constant with the string "full" in it... :p
		I just was thinking of having it as a global. It can still be in Target.h. It's runtime-specific, but as you said it works uniformly for all the targets we support, and that's not quite obvious from seeing `target->outOfRangeVA`. Though if you don't want to pollute the global namespace further I guess changing all the use sites to `Target::outOfRangeVA` would work too.
		// tombstone value for DenseMap<> and caused weird assertions for me.
		static constexpr uint64_t outOfRangeVA = 0xfull << 60;
};		};
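The arithmetic behind the choice of constant can be checked directly. The sketch below only restates the two properties argued in the thread (wrap-around at the maximum value, and the headroom left above `0xfull << 60`); `bump` and `headroom` are hypothetical helpers, not part of the diff:

```cpp
#include <cstdint>
#include <limits>

constexpr uint64_t outOfRangeVA = 0xfull << 60;  // same expression as the diff
constexpr uint64_t maxVA = std::numeric_limits<uint64_t>::max();

// Unsigned addition wraps modulo 2^64.
constexpr uint64_t bump(uint64_t va, uint64_t delta) { return va + delta; }

// How far an address can grow before it wraps around to 0.
constexpr uint64_t headroom(uint64_t va) { return maxVA - va; }

static_assert(outOfRangeVA == 0xf000000000000000ull, "top four bits set");
static_assert(bump(maxVA, 1) == 0, "max() is one increment from wrapping");
static_assert(bump(outOfRangeVA, 4) > outOfRangeVA, "small offsets do not wrap");
```

Adding an offset or addend to `max()` can therefore land near address 0 and look in range, whereas the chosen value leaves roughly 2^60 bytes of headroom.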

TargetInfo *createX86_64TargetInfo();		TargetInfo *createX86_64TargetInfo();
TargetInfo *createARM64TargetInfo();		TargetInfo *createARM64TargetInfo();
TargetInfo *createARM64_32TargetInfo();		TargetInfo *createARM64_32TargetInfo();
TargetInfo *createARMTargetInfo(uint32_t cpuSubtype);		TargetInfo *createARMTargetInfo(uint32_t cpuSubtype);

struct LP64 {		struct LP64 {
Show All 37 Lines

lld/MachO/Writer.cpp

Show All 27 Lines

#include "llvm/Support/LEB128.h" #include "llvm/Support/LEB128.h"

#include "llvm/Support/MathExtras.h" #include "llvm/Support/MathExtras.h"

#include "llvm/Support/Parallel.h" #include "llvm/Support/Parallel.h"

#include "llvm/Support/Path.h" #include "llvm/Support/Path.h"

#include "llvm/Support/TimeProfiler.h" #include "llvm/Support/TimeProfiler.h"

#include "llvm/Support/xxhash.h" #include "llvm/Support/xxhash.h"

#include <algorithm> #include <algorithm>

int3Unsubmitted

Done

leftover?

using namespace llvm; using namespace llvm;

using namespace llvm::MachO; using namespace llvm::MachO;

using namespace llvm::sys; using namespace llvm::sys;

using namespace lld; using namespace lld;

using namespace lld::macho; using namespace lld::macho;

namespace { namespace {

class LCUuid; class LCUuid;

▲ Show 20 Lines • Show All 461 Lines • ▼ Show 20 Lines void writeTo(uint8_t *buf) const override {

c->datasize = section->getSize(); c->datasize = section->getSize();

} }

CodeSignatureSection *section; CodeSignatureSection *section;

}; };

} // namespace } // namespace

// Adds stubs and bindings where necessary (e.g. if the symbol is a // Add stubs and bindings where necessary (e.g. if the symbol is a

// DylibSymbol.) // DylibSymbol.)

static void prepareBranchTarget(Symbol *sym) { static void prepareBranchTarget(Symbol *sym) {

int3Unsubmitted

Done

seems outdated now

int3Unsubmitted

Done

rm newline plz

if (auto *dysym = dyn_cast<DylibSymbol>(sym)) { if (auto *dysym = dyn_cast<DylibSymbol>(sym)) {

if (in.stubs->addEntry(dysym)) { if (in.stubs->addEntry(dysym)) {

if (sym->isWeakDef()) { if (sym->isWeakDef()) {

in.binding->addEntry(dysym, in.lazyPointers->isec, in.binding->addEntry(dysym, in.lazyPointers->isec,

sym->stubsIndex * target->wordSize); sym->stubsIndex * target->wordSize);

in.weakBinding->addEntry(sym, in.lazyPointers->isec, in.weakBinding->addEntry(sym, in.lazyPointers->isec,

sym->stubsIndex * target->wordSize); sym->stubsIndex * target->wordSize);

} else { } else {

in.lazyBinding->addEntry(dysym); in.lazyBinding->addEntry(dysym);

} }

} else if (auto *defined = dyn_cast<Defined>(sym)) { } else if (auto *defined = dyn_cast<Defined>(sym)) {

if (defined->isExternalWeakDef()) { if (defined->isExternalWeakDef()) {

if (in.stubs->addEntry(sym)) { if (in.stubs->addEntry(sym)) {

in.rebase->addEntry(in.lazyPointers->isec, in.rebase->addEntry(in.lazyPointers->isec,

sym->stubsIndex * target->wordSize); sym->stubsIndex * target->wordSize);

in.weakBinding->addEntry(sym, in.lazyPointers->isec, in.weakBinding->addEntry(sym, in.lazyPointers->isec,

sym->stubsIndex * target->wordSize); sym->stubsIndex * target->wordSize);

} }

} else { } else {

int3Unsubmitted

Done

why aren't externalWeakDefs thunkable?

int3Unsubmitted

Done

why is this `config->entry` check necessary?

gkmAuthorUnsubmitted

Done

Because of this ...

template <class LP> void Writer::run() {
  prepareBranchTarget(config->entry);
  . . .

... and this ...

bool macho::link(ArrayRef<const char *> argsArr, bool canExitEarly,
                 raw_ostream &stdoutOS, raw_ostream &stderrOS) {
  . . .
  config->entry = symtab->addUndefined(args.getLastArgValue(OPT_e, "_main"),
                                       /*file=*/nullptr,
                                       /*isWeakRef=*/false);
  . . .

int3Unsubmitted

Done

ah. I think it'd make more sense not to call `prepareBranchTarget` with an undefined symbol in the first place...

gkmAuthorUnsubmitted

Done

Dropping the call to `prepareBranchTarget(config->entry)` triggers an assertion in `lld/test/MachO/entry-symbol.s`. We can debug that later.

int3Unsubmitted

Done

Well I looked into it :) Didn't see the assert but noticed another issue: D102137

assert(false && "invalid symbol type for branch"); llvm_unreachable("invalid branch target symbol type");

} }

int3Unsubmitted

Done

do we actually get here in practice? I would assume that `prepareBranchTarget` is only ever called on DylibSymbols and Defined symbols...

// Can a symbol's address only be resolved at runtime? // Can a symbol's address only be resolved at runtime?

static bool needsBinding(const Symbol *sym) { static bool needsBinding(const Symbol *sym) {

if (isa<DylibSymbol>(sym)) if (isa<DylibSymbol>(sym))

return true; return true;

if (const auto *defined = dyn_cast<Defined>(sym)) if (const auto *defined = dyn_cast<Defined>(sym))

return defined->isExternalWeakDef(); return defined->isExternalWeakDef();

return false; return false;

} }

static void prepareSymbolRelocation(Symbol *sym, const InputSection *isec, static void prepareSymbolRelocation(Symbol *sym, const InputSection *isec,

const Reloc &r) { const Reloc &r) {

const RelocAttrs &relocAttrs = target->getRelocAttrs(r.type); const RelocAttrs &relocAttrs = target->getRelocAttrs(r.type);

if (relocAttrs.hasAttr(RelocAttrBits::BRANCH)) { if (relocAttrs.hasAttr(RelocAttrBits::BRANCH)) {

prepareBranchTarget(sym); prepareBranchTarget(sym);

} else if (relocAttrs.hasAttr(RelocAttrBits::GOT)) { } else if (relocAttrs.hasAttr(RelocAttrBits::GOT)) {

int3Unsubmitted

Done

can't we just check for `relocAttrs.hasAttr(RelocAttrBits::BRANCH)` instead of adding a new property on Reloc?

gkmAuthorUnsubmitted

Done

After seeing your abandoned diff about the performance drag of checking `relocAttrs`, I chose to avoid repeating that check. Perhaps it is a micro-optimization that I should avoid?

int3Unsubmitted

Done

yes please. If you've noticed the theme of my other comments... let's write the simplest code possible first, and optimize it if profile data indicates it's worthwhile :)
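The simpler check int3 asks for amounts to a bitmask test on the relocation's attribute word. A minimal sketch of that idea; the enumerator names `BRANCH`, `GOT`, and `TLV` appear in the diff, but the bit positions, the `raw` helper, and `isBranchReloc` are assumptions made for illustration:

```cpp
#include <cstdint>

// Bit positions are illustrative only.
enum class RelocAttrBits : uint32_t {
  BRANCH = 1u << 0,
  GOT = 1u << 1,
  TLV = 1u << 2,
};

constexpr uint32_t raw(RelocAttrBits b) { return static_cast<uint32_t>(b); }

struct RelocAttrs {
  uint32_t bits;
  // True when every bit in b is set in this relocation's attributes.
  bool hasAttr(RelocAttrBits b) const { return (bits & raw(b)) == raw(b); }
};

// Classify a relocation on demand rather than caching a flag on it.
bool isBranchReloc(const RelocAttrs &attrs) {
  return attrs.hasAttr(RelocAttrBits::BRANCH);
}
```

Each query costs one mask and one compare, so caching the result on the relocation only pays off if profiling shows the attribute lookup itself is expensive.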

if (relocAttrs.hasAttr(RelocAttrBits::POINTER) || needsBinding(sym)) if (relocAttrs.hasAttr(RelocAttrBits::POINTER) || needsBinding(sym))

MaskRayUnsubmitted

Done

pre-increment

in.got->addEntry(sym); in.got->addEntry(sym);

} else if (relocAttrs.hasAttr(RelocAttrBits::TLV)) { } else if (relocAttrs.hasAttr(RelocAttrBits::TLV)) {

int3Unsubmitted

Done

should we wrap this in an `if (target->needsThunks())`?

if (needsBinding(sym)) if (needsBinding(sym))

in.tlvPointers->addEntry(sym); in.tlvPointers->addEntry(sym);

} else if (relocAttrs.hasAttr(RelocAttrBits::UNSIGNED)) { } else if (relocAttrs.hasAttr(RelocAttrBits::UNSIGNED)) {

// References from thread-local variable sections are treated as offsets // References from thread-local variable sections are treated as offsets

// relative to the start of the referent section, and therefore have no // relative to the start of the referent section, and therefore have no

// need of rebase opcodes. // need of rebase opcodes.

if (!(isThreadLocalVariables(isec->flags) && isa<Defined>(sym))) if (!(isThreadLocalVariables(isec->flags) && isa<Defined>(sym)))

addNonLazyBindingEntries(sym, isec, r.offset, r.addend); addNonLazyBindingEntries(sym, isec, r.offset, r.addend);

Show All 19 Lines for (auto it = isec->relocs.begin(); it != isec->relocs.end(); ++it) {

// to emit rebase opcodes for it. // to emit rebase opcodes for it.

it++; it++;

continue; continue;

} }

if (auto *sym = r.referent.dyn_cast<Symbol *>()) { if (auto *sym = r.referent.dyn_cast<Symbol *>()) {

if (auto *undefined = dyn_cast<Undefined>(sym)) if (auto *undefined = dyn_cast<Undefined>(sym))

treatUndefinedSymbol(*undefined); treatUndefinedSymbol(*undefined);

// treatUndefinedSymbol() can replace sym with a DylibSymbol; re-check. // treatUndefinedSymbol() can replace sym with a DylibSymbol; re-check.

if (!isa<Undefined>(sym) && validateSymbolRelocation(sym, isec, r)) if (!isa<Undefined>(sym) && validateSymbolRelocation(sym, isec, r))

prepareSymbolRelocation(sym, isec, r); prepareSymbolRelocation(sym, isec, r);

int3Unsubmitted

Done

I thought we discussed that this case should be impossible; how does it arise?

} else { } else {

assert(r.referent.is<InputSection *>()); assert(r.referent.is<InputSection *>());

assert(!r.referent.get<InputSection *>()->shouldOmitFromOutput()); assert(!r.referent.get<InputSection *>()->shouldOmitFromOutput());

if (!r.pcrel) if (!r.pcrel)

in.rebase->addEntry(isec, r.offset); in.rebase->addEntry(isec, r.offset);

} }

for (OutputSegment *seg : outputSegments) {
  // `fileOff + fileSize == next segment fileOff`. So we call alignTo() before
  // (instead of after) computing fileSize to ensure that the segments are
  // contiguous. We handle addr / vmSize similarly for the same reason.
  fileOff = alignTo(fileOff, pageSize);
  addr = alignTo(addr, pageSize);
  seg->vmSize = addr - seg->firstSection()->addr;
  seg->fileSize = fileOff - seg->fileOff;
}

// FIXME(gkm): create branch-extension thunks here, then adjust addresses

}

void Writer::finalizeLinkEditSegment() {
  TimeTraceScope timeScope("Finalize __LINKEDIT segment");
  // Fill __LINKEDIT contents.
  std::vector<LinkEditSection *> linkEditSections{
      in.rebase, in.binding, in.weakBinding, in.lazyBinding,
      in.exports, symtabSection, indirectSymtabSection, functionStartsSection,


template <class LP> void Writer::run() {
  if (config->entry && !isa<Undefined>(config->entry))
    prepareBranchTarget(config->entry);
  scanRelocations();
  if (in.stubHelper->isNeeded())
    in.stubHelper->setup();
  scanSymbols();
  createOutputSections<LP>();
-   // No more sections nor segments are created beyond this point.
+   // After this point, we create no new segments; HOWEVER, we might
+   // yet create branch-range extension thunks for architectures whose

int3:

// After this point, we create no new segments; HOWEVER, we might
- // yet create branch-range extention thunks for architectures whose
+ // yet create branch-range extension thunks for architectures whose
// hardware call instructions have limited range, e.g., ARM(64)

+   // hardware call instructions have limited range, e.g., ARM(64).
+   // The thunks are created as InputSections interspersed among
+   // the ordinary __TEXT,_text InputSections.

int3: seems like you forgot to finish this sentence :p

  sortSegmentsAndSections();
  createLoadCommands<LP>();
  finalizeAddresses();
  finalizeLinkEditSegment();
  writeMapFile();
  writeOutputFile();
}
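
The comment in `Writer::run()` refers to call instructions with limited range. As a rough illustration only (this is not lld's code), the Python sketch below models the reach of the ARM64 `bl` instruction, which encodes a signed 26-bit offset counted in 4-byte words, and decides whether a call could be emitted directly or would need a thunk. All names and constants in it are hypothetical and chosen for this example.

```python
# Toy model of ARM64 `bl` reach; names here are illustrative, not lld's.
INSN_SIZE = 4        # every ARM64 instruction is 4 bytes
BL_IMM_BITS = 26     # `bl` carries a signed 26-bit offset counted in words

# Farthest displacements, in bytes, that a single `bl` can encode.
MAX_FORWARD = ((1 << (BL_IMM_BITS - 1)) - 1) * INSN_SIZE   # 0x7fffffc
MAX_BACKWARD = (1 << (BL_IMM_BITS - 1)) * INSN_SIZE        # 0x8000000


def in_bl_range(call_site, target):
    """Return True if a `bl` at call_site can reach target directly."""
    delta = target - call_site
    return -MAX_BACKWARD <= delta <= MAX_FORWARD


def needs_thunk(call_site, target):
    """A call needs a thunk when the target is outside direct reach."""
    return not in_bl_range(call_site, target)


if __name__ == "__main__":
    base = 0x100000000
    print(needs_thunk(base, base + 0x4000000))  # target 64 MiB ahead
    print(needs_thunk(base, base + 0x8000000))  # target 128 MiB ahead
```

The asymmetry between the forward and backward limits comes from two's-complement encoding: the most negative 26-bit value has no positive counterpart.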


lld/test/MachO/arm64-thunks.s

This file was added.

				# REQUIRES: aarch64

				## Check for the following:
				smeenai: Is the `REQUIRES: shell` sufficient by itself, or do you also need the explicit `UNSUPPORTED: windows`?
				int3: It should be sufficient by itself. I believe they're largely the same thing as far as the buildbots are concerned, but `REQUIRES: shell` alone would be more semantically descriptive of what this test depends on in its current form.
				gkm: The saddest part is that the test still runs and fails on Windows.
				## (1) address match between thunk definitions and call destinations
				## (2) address match between thunk page+offset computations and function definitions
				## (3) a second thunk is created when the first one goes out of range
				## (4) early calls to a dylib stub use a thunk, and later calls the stub directly
				## Notes:
				## 0x4000000 = 64 Mi = half the magnitude of the forward-branch range

				# RUN: rm -rf %t; mkdir %t
				# RUN: llvm-mc -filetype=obj -triple=arm64-apple-darwin %s -o %t/input.o
				# RUN: %lld -arch arm64 -lSystem -o %t/thunk %t/input.o
				int3: looks like this is failing on Windows. Not sure there's a cross-platform way to loop, but writing 9 `llvm-mc` invocations doesn't seem too terrible either. Out of curiosity: what happens if we put everything into the same file, instead of multiple files -- what does `llvm-mc` generate?
				int3: ah I see you disabled the test on Windows... I don't think this is a good enough reason to disable stuff on Windows unfortunately (I got pushback for it in https://bugs.llvm.org/show_bug.cgi?id=49512)
				gkm: With `.subsections_via_symbols`, it all works as a single input file. Thanx!
				# RUN: llvm-objdump -d --no-show-raw-insn %t/thunk | FileCheck %s

				# CHECK: Disassembly of section __TEXT,__text:

				# CHECK: [[#%.13x, A_PAGE:]][[#%.3x, A_OFFSET:]] <_a>:
				# CHECK: bl 0x[[#%x, A:]] <_a>
				int3: seems like you could pipe it directly into FileCheck
				# CHECK: bl 0x[[#%x, B:]] <_b>
				# CHECK: bl 0x[[#%x, C:]] <_c>
				# CHECK: bl 0x[[#%x, D_THUNK_0:]] <_d.thunk.0>
				# CHECK: bl 0x[[#%x, E_THUNK_0:]] <_e.thunk.0>
				int3: since you're using the `%.6x` syntax, the `{{0*}}` seems unnecessary... (you might have to change `.6` to `.8` though)
				gkm: `.13`, since these are 64-bit values with 16 hex digits.
				# CHECK: bl 0x[[#%x, F_THUNK_0:]] <_f.thunk.0>
				# CHECK: bl 0x[[#%x, G_THUNK_0:]] <_g.thunk.0>
				# CHECK: bl 0x[[#%x, H_THUNK_0:]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0:]] <___nan.thunk.0>

				# CHECK: [[#%.13x, B_PAGE:]][[#%.3x, B_OFFSET:]] <_b>:
				# CHECK: bl 0x[[#%x, A]] <_a>
				# CHECK: bl 0x[[#%x, B]] <_b>
				# CHECK: bl 0x[[#%x, C]] <_c>
				# CHECK: bl 0x[[#%x, D_THUNK_0]] <_d.thunk.0>
				# CHECK: bl 0x[[#%x, E_THUNK_0]] <_e.thunk.0>
				# CHECK: bl 0x[[#%x, F_THUNK_0]] <_f.thunk.0>
				# CHECK: bl 0x[[#%x, G_THUNK_0]] <_g.thunk.0>
				# CHECK: bl 0x[[#%x, H_THUNK_0]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0]] <___nan.thunk.0>

				# CHECK: [[#%.13x, C_PAGE:]][[#%.3x, C_OFFSET:]] <_c>:
				# CHECK: bl 0x[[#%x, A]] <_a>
				# CHECK: bl 0x[[#%x, B]] <_b>
				# CHECK: bl 0x[[#%x, C]] <_c>
				# CHECK: bl 0x[[#%x, D:]] <_d>
				# CHECK: bl 0x[[#%x, E:]] <_e>
				# CHECK: bl 0x[[#%x, F_THUNK_0]] <_f.thunk.0>
				# CHECK: bl 0x[[#%x, G_THUNK_0]] <_g.thunk.0>
				# CHECK: bl 0x[[#%x, H_THUNK_0]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0]] <___nan.thunk.0>

				# CHECK: [[#%x, D_THUNK_0]] <_d.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, D_PAGE:]]
				# CHECK: add x16, x16, #[[#D_OFFSET:]]

				# CHECK: [[#%x, E_THUNK_0]] <_e.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, E_PAGE:]]
				# CHECK: add x16, x16, #[[#E_OFFSET:]]

				# CHECK: [[#%x, F_THUNK_0]] <_f.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, F_PAGE:]]
				# CHECK: add x16, x16, #[[#F_OFFSET:]]

				# CHECK: [[#%x, G_THUNK_0]] <_g.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, G_PAGE:]]
				# CHECK: add x16, x16, #[[#G_OFFSET:]]

				# CHECK: [[#%x, H_THUNK_0]] <_h.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, H_PAGE:]]
				# CHECK: add x16, x16, #[[#H_OFFSET:]]

				# CHECK: [[#%x, NAN_THUNK_0]] <___nan.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, NAN_PAGE:]]
				# CHECK: add x16, x16, #[[#NAN_OFFSET:]]

				# CHECK: [[#%x, D_PAGE + D_OFFSET]] <_d>:
				# CHECK: bl 0x[[#%x, A]] <_a>
				# CHECK: bl 0x[[#%x, B]] <_b>
				# CHECK: bl 0x[[#%x, C]] <_c>
				# CHECK: bl 0x[[#%x, D]] <_d>
				# CHECK: bl 0x[[#%x, E]] <_e>
				# CHECK: bl 0x[[#%x, F_THUNK_0]] <_f.thunk.0>
				# CHECK: bl 0x[[#%x, G_THUNK_0]] <_g.thunk.0>
				# CHECK: bl 0x[[#%x, H_THUNK_0]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0]] <___nan.thunk.0>

				# CHECK: [[#%x, E_PAGE + E_OFFSET]] <_e>:
				# CHECK: bl 0x[[#%x, A_THUNK_0:]] <_a.thunk.0>
				# CHECK: bl 0x[[#%x, B_THUNK_0:]] <_b.thunk.0>
				# CHECK: bl 0x[[#%x, C]] <_c>
				# CHECK: bl 0x[[#%x, D]] <_d>
				# CHECK: bl 0x[[#%x, E]] <_e>
				# CHECK: bl 0x[[#%x, F:]] <_f>
				# CHECK: bl 0x[[#%x, G:]] <_g>
				# CHECK: bl 0x[[#%x, H_THUNK_0]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0]] <___nan.thunk.0>

				# CHECK: [[#%x, F_PAGE + F_OFFSET]] <_f>:
				# CHECK: bl 0x[[#%x, A_THUNK_0]] <_a.thunk.0>
				# CHECK: bl 0x[[#%x, B_THUNK_0]] <_b.thunk.0>
				# CHECK: bl 0x[[#%x, C]] <_c>
				# CHECK: bl 0x[[#%x, D]] <_d>
				# CHECK: bl 0x[[#%x, E]] <_e>
				# CHECK: bl 0x[[#%x, F]] <_f>
				# CHECK: bl 0x[[#%x, G]] <_g>
				# CHECK: bl 0x[[#%x, H_THUNK_0]] <_h.thunk.0>
				# CHECK: bl 0x[[#%x, NAN_THUNK_0]] <___nan.thunk.0>

				# CHECK: [[#%x, G_PAGE + G_OFFSET]] <_g>:
				# CHECK: bl 0x[[#%x, A_THUNK_0]] <_a.thunk.0>
				# CHECK: bl 0x[[#%x, B_THUNK_0]] <_b.thunk.0>
				# CHECK: bl 0x[[#%x, C_THUNK_0:]] <_c.thunk.0>
				# CHECK: bl 0x[[#%x, D_THUNK_1:]] <_d.thunk.1>
				# CHECK: bl 0x[[#%x, E]] <_e>
				# CHECK: bl 0x[[#%x, F]] <_f>
				# CHECK: bl 0x[[#%x, G]] <_g>
				# CHECK: bl 0x[[#%x, H:]] <_h>
				# CHECK: bl 0x[[#%x, STUBS:]]

				# CHECK: [[#%x, A_THUNK_0]] <_a.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, A_PAGE]]000
				# CHECK: add x16, x16, #[[#%d, A_OFFSET]]

				# CHECK: [[#%x, B_THUNK_0]] <_b.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, B_PAGE]]000
				# CHECK: add x16, x16, #[[#%d, B_OFFSET]]

				# CHECK: [[#%x, H_PAGE + H_OFFSET]] <_h>:
				# CHECK: bl 0x[[#%x, A_THUNK_0]] <_a.thunk.0>
				# CHECK: bl 0x[[#%x, B_THUNK_0]] <_b.thunk.0>
				# CHECK: bl 0x[[#%x, C_THUNK_0]] <_c.thunk.0>
				# CHECK: bl 0x[[#%x, D_THUNK_1]] <_d.thunk.1>
				# CHECK: bl 0x[[#%x, E]] <_e>
				# CHECK: bl 0x[[#%x, F]] <_f>
				# CHECK: bl 0x[[#%x, G]] <_g>
				# CHECK: bl 0x[[#%x, H]] <_h>
				# CHECK: bl 0x[[#%x, STUBS]]

				# CHECK: <_main>:
				# CHECK: bl 0x[[#%x, A_THUNK_0]] <_a.thunk.0>
				# CHECK: bl 0x[[#%x, B_THUNK_0]] <_b.thunk.0>
				# CHECK: bl 0x[[#%x, C_THUNK_0]] <_c.thunk.0>
				# CHECK: bl 0x[[#%x, D_THUNK_1]] <_d.thunk.1>
				# CHECK: bl 0x[[#%x, E_THUNK_1:]] <_e.thunk.1>
				# CHECK: bl 0x[[#%x, F_THUNK_1:]] <_f.thunk.1>
				# CHECK: bl 0x[[#%x, G]] <_g>
				# CHECK: bl 0x[[#%x, H]] <_h>
				# CHECK: bl 0x[[#%x, STUBS]]

				# CHECK: [[#%x, C_THUNK_0]] <_c.thunk.0>:
				# CHECK: adrp x16, 0x[[#%x, C_PAGE]]000
				# CHECK: add x16, x16, #[[#%d, C_OFFSET]]

				# CHECK: [[#%x, D_THUNK_1]] <_d.thunk.1>:
				# CHECK: adrp x16, 0x[[#%x, D_PAGE]]
				# CHECK: add x16, x16, #[[#D_OFFSET]]

				# CHECK: [[#%x, E_THUNK_1]] <_e.thunk.1>:
				# CHECK: adrp x16, 0x[[#%x, E_PAGE]]
				# CHECK: add x16, x16, #[[#E_OFFSET]]

				# CHECK: [[#%x, F_THUNK_1]] <_f.thunk.1>:
				# CHECK: adrp x16, 0x[[#%x, F_PAGE]]
				# CHECK: add x16, x16, #[[#F_OFFSET]]

				# CHECK: Disassembly of section __TEXT,__stubs:

				# CHECK: [[#%x, NAN_PAGE + NAN_OFFSET]] <__stubs>:

				.subsections_via_symbols
				.text

				.globl _a
				.p2align 2
				_a:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				ret

				.globl _b
				.p2align 2
				_b:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				.space 0x4000000-0x3c
				ret

				.globl _c
				.p2align 2
				_c:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				ret

				.globl _d
				.p2align 2
				_d:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				.space 0x4000000-0x38
				ret

				.globl _e
				.p2align 2
				_e:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				ret

				.globl _f
				.p2align 2
				_f:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				.space 0x4000000-0x34
				ret

				.globl _g
				.p2align 2
				_g:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				ret

				.globl _h
				.p2align 2
				_h:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				.space 0x4000000-0x30
				ret

				.globl _main
				.p2align 2
				_main:
				bl _a
				bl _b
				bl _c
				bl _d
				bl _e
				bl _f
				bl _g
				bl _h
				bl ___nan
				ret
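
Checks (1) and (2) in the test above depend on a thunk's `adrp` page plus its `add` offset reproducing the callee's address, which is why the CHECK lines capture a 13-hex-digit page prefix followed by a 3-hex-digit offset. The following is a minimal sketch of that arithmetic, assuming 4 KiB `adrp` pages. The helper names are hypothetical and are not part of the test or of lld.

```python
# Sketch of the page+offset arithmetic the CHECK lines verify.
# Helper names are made up for illustration.
PAGE_BITS = 12                      # adrp works on 4 KiB pages
PAGE_MASK = (1 << PAGE_BITS) - 1    # low 12 bits = offset within the page


def split_page_offset(addr):
    """Split an address into (page base, offset within page)."""
    return addr & ~PAGE_MASK, addr & PAGE_MASK


def thunk_destination(adrp_page, add_imm):
    """Address left in x16 after `adrp x16, page; add x16, x16, #imm`."""
    return adrp_page + add_imm


def filecheck_fields(addr):
    """Render a 64-bit address as 16 hex digits, split 13 + 3 the way
    the `%.13x` / `%.3x` captures in the test do."""
    text = format(addr, "016x")
    return text[:13], text[13:]


if __name__ == "__main__":
    addr = 0x100004abc              # arbitrary example address
    page, offset = split_page_offset(addr)
    print(hex(page), hex(offset))
    print(thunk_destination(page, offset) == addr)
    print(filecheck_fields(addr))
```

Because the page base always ends in three zero hex digits, appending `000` to the 13-digit capture recovers the `adrp` operand, which matches the `0x[[#%x, A_PAGE]]000` pattern used in the test.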

lld/test/MachO/tools/generate-thunkable-program.py

This file was added.

Property	Old Value	New Value
File Mode	null	100755

				#!/usr/bin/env python3

				"""Generate many skeletal functions with a thick call graph spanning a
				large address space to induce lld to create branch-islands for arm64.

				"""
				from __future__ import print_function
				import random
				import argparse
				import string
				from pprint import pprint
				from math import factorial
				from itertools import permutations

				# This list comes from libSystem.tbd and contains a sizeable subset
				# of dylib calls available for all MacOS target archs.
				libSystem_calls = (
				"__CurrentRuneLocale", "__DefaultRuneLocale", "__Exit", "__NSGetArgc",
				int3: Couldn't we just generate a bunch of random strings for the symbols? This list isn't really helping us exercise new code paths in LLD...
				int3: not yet addressed
				gkm: I am leveraging symbols already present in `libSystem.tbd`. The goal is to generate calls through dylib stubs, to make sure the thunker properly makes thunks for out-of-range stubs. That is the LLD code path I am exercising. I could generate random strings, and then generate a matching `libLOL.tbd`, but that seems like extra work for marginal benefit. I suppose an advantage to generating `libLOL.tbd` is that I can control its size and stress-test the thunker with a huge dylib, or with multiple generated dylibs containing random symbols.
				"__NSGetArgv", "__NSGetEnviron", "__NSGetMachExecuteHeader",
				"__NSGetProgname", "__PathLocale", "__Read_RuneMagi", "___Balloc_D2A",
				"___Bfree_D2A", "___ULtod_D2A", "____mb_cur_max", "____mb_cur_max_l",
				"____runetype", "____runetype_l", "____tolower", "____tolower_l",
				"____toupper", "____toupper_l", "___add_ovflpage", "___addel",
				"___any_on_D2A", "___assert_rtn", "___b2d_D2A", "___big_delete",
				"___big_insert", "___big_keydata", "___big_return", "___big_split",
				"___bigtens_D2A", "___bt_close", "___bt_cmp", "___bt_defcmp",
				"___bt_defpfx", "___bt_delete", "___bt_dleaf", "___bt_fd",
				"___bt_free", "___bt_get", "___bt_new", "___bt_open", "___bt_pgin",
				"___bt_pgout", "___bt_put", "___bt_ret", "___bt_search", "___bt_seq",
				"___bt_setcur", "___bt_split", "___bt_sync", "___buf_free",
				"___call_hash", "___cleanup", "___cmp_D2A", "___collate_equiv_match",
				"___collate_load_error", "___collate_lookup", "___collate_lookup_l",
				"___copybits_D2A", "___cxa_atexit", "___cxa_finalize",
				"___cxa_finalize_ranges", "___cxa_thread_atexit", "___d2b_D2A",
				"___dbpanic", "___decrement_D2A", "___default_hash", "___default_utx",
				"___delpair", "___diff_D2A", "___dtoa", "___expand_table",
				"___fflush", "___fgetwc", "___find_bigpair", "___find_last_page",
				"___fix_locale_grouping_str", "___fread", "___free_ovflpage",
				"___freedtoa", "___gdtoa", "___gdtoa_locks", "___get_buf",
				"___get_page", "___gethex_D2A", "___getonlyClocaleconv",
				"___hash_open", "___hdtoa", "___hexdig_D2A", "___hexdig_init_D2A",
				"___hexnan_D2A", "___hi0bits_D2A", "___hldtoa", "___i2b_D2A",
				"___ibitmap", "___increment_D2A", "___isctype", "___istype",
				"___istype_l", "___ldtoa", "___libc_init", "___lo0bits_D2A",
				"___log2", "___lshift_D2A", "___maskrune", "___maskrune_l",
				"___match_D2A", "___mb_cur_max", "___mb_sb_limit", "___memccpy_chk",
				"___memcpy_chk", "___memmove_chk", "___memset_chk", "___mult_D2A",
				"___multadd_D2A", "___nrv_alloc_D2A", "___opendir2", "___ovfl_delete",
				"___ovfl_get", "___ovfl_put", "___pow5mult_D2A", "___put_page",
				"___quorem_D2A", "___ratio_D2A", "___rec_close", "___rec_delete",
				"___rec_dleaf", "___rec_fd", "___rec_fmap", "___rec_fpipe",
				"___rec_get", "___rec_iput", "___rec_open", "___rec_put",
				"___rec_ret", "___rec_search", "___rec_seq", "___rec_sync",
				"___rec_vmap", "___rec_vpipe", "___reclaim_buf", "___rshift_D2A",
				"___rv_alloc_D2A", "___s2b_D2A", "___sF", "___sclose", "___sdidinit",
				"___set_ones_D2A", "___setonlyClocaleconv", "___sflags", "___sflush",
				"___sfp", "___sfvwrite", "___sglue", "___sinit", "___slbexpand",
				"___smakebuf", "___snprintf_chk", "___snprintf_object_size_chk",
				"___split_page", "___sprintf_chk", "___sprintf_object_size_chk",
				"___sread", "___srefill", "___srget", "___sseek", "___stack_chk_fail",
				"___stack_chk_guard", "___stderrp", "___stdinp", "___stdoutp",
				"___stpcpy_chk", "___stpncpy_chk", "___strcat_chk", "___strcp_D2A",
				"___strcpy_chk", "___strlcat_chk", "___strlcpy_chk", "___strncat_chk",
				"___strncpy_chk", "___strtodg", "___strtopdd", "___sum_D2A",
				"___svfscanf", "___swbuf", "___swhatbuf", "___swrite", "___swsetup",
				"___tens_D2A", "___tinytens_D2A", "___tolower", "___tolower_l",
				"___toupper", "___toupper_l", "___trailz_D2A", "___ulp_D2A",
				"___ungetc", "___ungetwc", "___vsnprintf_chk", "___vsprintf_chk",
				"___wcwidth", "___wcwidth_l", "__allocenvstate", "__atexit_receipt",
				"__c_locale", "__cleanup", "__closeutx", "__copyenv",
				"__cthread_init_routine", "__deallocenvstate", "__endutxent",
				"__flockfile_debug_stub", "__fseeko", "__ftello", "__fwalk",
				"__getenvp", "__getutxent", "__getutxid", "__getutxline",
				"__inet_aton_check", "__init_clock_port", "__int_to_time",
				"__libc_fork_child", "__libc_initializer", "__long_to_time",
				"__mkpath_np", "__mktemp", "__openutx", "__os_assert_log",
				"__os_assert_log_ctx", "__os_assumes_log", "__os_assumes_log_ctx",
				"__os_avoid_tail_call", "__os_crash", "__os_crash_callback",
				"__os_crash_fmt", "__os_debug_log", "__os_debug_log_error_str",
				"__putenvp", "__pututxline", "__rand48_add", "__rand48_mult",
				"__rand48_seed", "__readdir_unlocked", "__reclaim_telldir",
				"__seekdir", "__setenvp", "__setutxent", "__sigaction_nobind",
				"__sigintr", "__signal_nobind", "__sigvec_nobind", "__sread",
				"__sseek", "__subsystem_init", "__swrite", "__time32_to_time",
				"__time64_to_time", "__time_to_int", "__time_to_long",
				"__time_to_time32", "__time_to_time64", "__unsetenvp", "__utmpxname",
				"_a64l", "_abort", "_abort_report_np", "_abs", "_acl_add_flag_np",
				"_acl_add_perm", "_acl_calc_mask", "_acl_clear_flags_np",
				"_acl_clear_perms", "_acl_copy_entry", "_acl_copy_ext",
				"_acl_copy_ext_native", "_acl_copy_int", "_acl_copy_int_native",
				"_acl_create_entry", "_acl_create_entry_np", "_acl_delete_def_file",
				"_acl_delete_entry", "_acl_delete_fd_np", "_acl_delete_file_np",
				"_acl_delete_flag_np", "_acl_delete_link_np", "_acl_delete_perm",
				"_acl_dup", "_acl_free", "_acl_from_text", "_acl_get_entry",
				"_acl_get_fd", "_acl_get_fd_np", "_acl_get_file", "_acl_get_flag_np",
				"_acl_get_flagset_np", "_acl_get_link_np", "_acl_get_perm_np",
				"_acl_get_permset", "_acl_get_permset_mask_np", "_acl_get_qualifier",
				"_acl_get_tag_type", "_acl_init", "_acl_maximal_permset_mask_np",
				"_acl_set_fd", "_acl_set_fd_np", "_acl_set_file", "_acl_set_flagset_np",
				"_acl_set_link_np", "_acl_set_permset", "_acl_set_permset_mask_np",
				"_acl_set_qualifier", "_acl_set_tag_type", "_acl_size", "_acl_to_text",
				"_acl_valid", "_acl_valid_fd_np", "_acl_valid_file_np",
				"_acl_valid_link", "_addr2ascii", "_alarm", "_alphasort",
				"_arc4random", "_arc4random_addrandom", "_arc4random_buf",
				"_arc4random_stir", "_arc4random_uniform", "_ascii2addr", "_asctime",
				"_asctime_r", "_asprintf", "_asprintf_l", "_asxprintf",
				"_asxprintf_exec", "_atexit", "_atexit_b", "_atof", "_atof_l",
				"_atoi", "_atoi_l", "_atol", "_atol_l", "_atoll", "_atoll_l",
				"_backtrace", "_backtrace_from_fp", "_backtrace_image_offsets",
				"_backtrace_symbols", "_backtrace_symbols_fd", "_basename",
				"_basename_r", "_bcopy", "_brk", "_bsd_signal", "_bsearch",
				"_bsearch_b", "_btowc", "_btowc_l", "_catclose", "_catgets",
				"_catopen", "_cfgetispeed", "_cfgetospeed", "_cfmakeraw",
				"_cfsetispeed", "_cfsetospeed", "_cfsetspeed", "_cgetcap",
				"_cgetclose", "_cgetent", "_cgetfirst", "_cgetmatch", "_cgetnext",
				"_cgetnum", "_cgetset", "_cgetstr", "_cgetustr", "_chmodx_np",
				"_clearerr", "_clearerr_unlocked", "_clock", "_clock_getres",
				"_clock_gettime", "_clock_gettime_nsec_np", "_clock_port",
				"_clock_sem", "_clock_settime", "_closedir", "_compat_mode",
				"_confstr", "_copy_printf_domain", "_creat", "_crypt", "_ctermid",
				"_ctermid_r", "_ctime", "_ctime_r", "_daemon", "_daylight",
				"_dbm_clearerr", "_dbm_close", "_dbm_delete", "_dbm_dirfno",
				"_dbm_error", "_dbm_fetch", "_dbm_firstkey", "_dbm_nextkey",
				"_dbm_open", "_dbm_store", "_dbopen", "_devname", "_devname_r",
				"_difftime", "_digittoint", "_digittoint_l", "_dirfd", "_dirname",
				"_dirname_r", "_div", "_dprintf", "_dprintf_l", "_drand48",
				"_duplocale", "_dxprintf", "_dxprintf_exec", "_ecvt", "_encrypt",
				"_endttyent", "_endusershell", "_endutxent", "_endutxent_wtmp",
				"_erand48", "_err", "_err_set_exit", "_err_set_exit_b",
				"_err_set_file", "_errc", "_errx", "_execl", "_execle", "_execlp",
				"_execv", "_execvP", "_execvp", "_exit", "_f_prealloc", "_fchmodx_np",
				"_fclose", "_fcvt", "_fdopen", "_fdopendir", "_feof", "_feof_unlocked",
				"_ferror", "_ferror_unlocked", "_fflagstostr", "_fflush", "_fgetc",
				"_fgetln", "_fgetpos", "_fgetrune", "_fgets", "_fgetwc", "_fgetwc_l",
				"_fgetwln", "_fgetwln_l", "_fgetws", "_fgetws_l", "_fileno",
				"_fileno_unlocked", "_filesec_dup", "_filesec_free",
				"_filesec_get_property", "_filesec_init", "_filesec_query_property",
				"_filesec_set_property", "_filesec_unset_property", "_flockfile",
				"_fmemopen", "_fmtcheck", "_fmtmsg", "_fnmatch", "_fopen", "_fork",
				"_forkpty", "_fparseln", "_fprintf", "_fprintf_l", "_fpurge",
				"_fputc", "_fputrune", "_fputs", "_fputwc", "_fputwc_l", "_fputws",
				"_fputws_l", "_fread", "_free_printf_comp", "_free_printf_domain",
				"_freelocale", "_freopen", "_fscanf", "_fscanf_l", "_fseek",
				"_fseeko", "_fsetpos", "_fstatvfs", "_fstatx_np", "_fsync_volume_np",
				"_ftell", "_ftello", "_ftime", "_ftok", "_ftrylockfile",
				"_fts_children", "_fts_close", "_fts_open", "_fts_open_b",
				"_fts_read", "_fts_set", "_ftw", "_fungetrune", "_funlockfile",
				"_funopen", "_fwide", "_fwprintf", "_fwprintf_l", "_fwrite",
				"_fwscanf", "_fwscanf_l", "_fxprintf", "_fxprintf_exec", "_gcvt",
				"_getbsize", "_getc", "_getc_unlocked", "_getchar", "_getchar_unlocked",
				"_getcwd", "_getdate", "_getdate_err", "_getdelim", "_getdiskbyname",
				"_getenv", "_gethostid", "_gethostname", "_getipv4sourcefilter",
				"_getlastlogx", "_getlastlogxbyname", "_getline", "_getloadavg",
				"_getlogin", "_getlogin_r", "_getmntinfo", "_getmntinfo_r_np",
				"_getmode", "_getopt", "_getopt_long", "_getopt_long_only",
				"_getpagesize", "_getpass", "_getpeereid", "_getprogname", "_gets",
				"_getsourcefilter", "_getsubopt", "_gettimeofday", "_getttyent",
				"_getttynam", "_getusershell", "_getutmp", "_getutmpx", "_getutxent",
				"_getutxent_wtmp", "_getutxid", "_getutxline", "_getvfsbyname",
				"_getw", "_getwc", "_getwc_l", "_getwchar", "_getwchar_l", "_getwd",
				"_glob", "_glob_b", "_globfree", "_gmtime", "_gmtime_r", "_grantpt",
				"_hash_create", "_hash_destroy", "_hash_purge", "_hash_search",
				"_hash_stats", "_hash_traverse", "_hcreate", "_hdestroy",
				"_heapsort", "_heapsort_b", "_hsearch", "_imaxabs", "_imaxdiv",
				"_inet_addr", "_inet_aton", "_inet_lnaof", "_inet_makeaddr",
				"_inet_net_ntop", "_inet_net_pton", "_inet_neta", "_inet_netof",
				"_inet_network", "_inet_nsap_addr", "_inet_nsap_ntoa", "_inet_ntoa",
				"_inet_ntop", "_inet_ntop4", "_inet_ntop6", "_inet_pton",
				"_initstate", "_insque", "_isalnum", "_isalnum_l", "_isalpha",
				"_isalpha_l", "_isascii", "_isatty", "_isblank", "_isblank_l",
				"_iscntrl", "_iscntrl_l", "_isdigit", "_isdigit_l", "_isgraph",
				"_isgraph_l", "_ishexnumber", "_ishexnumber_l", "_isideogram",
				"_isideogram_l", "_islower", "_islower_l", "_isnumber", "_isnumber_l",
				"_isphonogram", "_isphonogram_l", "_isprint", "_isprint_l",
				"_ispunct", "_ispunct_l", "_isrune", "_isrune_l", "_isspace",
				"_isspace_l", "_isspecial", "_isspecial_l", "_isupper", "_isupper_l",
				"_iswalnum", "_iswalnum_l", "_iswalpha", "_iswalpha_l", "_iswascii",
				"_iswblank", "_iswblank_l", "_iswcntrl", "_iswcntrl_l", "_iswctype",
				"_iswctype_l", "_iswdigit", "_iswdigit_l", "_iswgraph", "_iswgraph_l",
				"_iswhexnumber", "_iswhexnumber_l", "_iswideogram", "_iswideogram_l",
				"_iswlower", "_iswlower_l", "_iswnumber", "_iswnumber_l",
				"_iswphonogram", "_iswphonogram_l", "_iswprint", "_iswprint_l",
				"_iswpunct", "_iswpunct_l", "_iswrune", "_iswrune_l", "_iswspace",
				"_iswspace_l", "_iswspecial", "_iswspecial_l", "_iswupper",
				"_iswupper_l", "_iswxdigit", "_iswxdigit_l", "_isxdigit",
				"_isxdigit_l", "_jrand48", "_kOSThermalNotificationPressureLevelName",
				"_killpg", "_l64a", "_labs", "_lchflags", "_lchmod", "_lcong48",
				"_ldiv", "_lfind", "_link_addr", "_link_ntoa", "_llabs", "_lldiv",
				"_localeconv", "_localeconv_l", "_localtime", "_localtime_r",
				"_lockf", "_login", "_login_tty", "_logout", "_logwtmp", "_lrand48",
				"_lsearch", "_lstatx_np", "_lutimes", "_mblen", "_mblen_l",
				"_mbmb", "_mbrlen", "_mbrlen_l", "_mbrrune", "_mbrtowc", "_mbrtowc_l",
				"_mbrune", "_mbsinit", "_mbsinit_l", "_mbsnrtowcs", "_mbsnrtowcs_l",
				"_mbsrtowcs", "_mbsrtowcs_l", "_mbstowcs", "_mbstowcs_l", "_mbtowc",
				"_mbtowc_l", "_memmem", "_memset_s", "_mergesort", "_mergesort_b",
				"_mkdirx_np", "_mkdtemp", "_mkdtempat_np", "_mkfifox_np",
				"_mkostemp", "_mkostemps", "_mkostempsat_np", "_mkpath_np",
				"_mkpathat_np", "_mkstemp", "_mkstemp_dprotected_np", "_mkstemps",
				"_mkstempsat_np", "_mktemp", "_mktime", "_monaddition", "_moncontrol",
				"_moncount", "_moninit", "_monitor", "_monoutput", "_monreset",
				"_monstartup", "_mpool_close", "_mpool_filter", "_mpool_get",
				"_mpool_new", "_mpool_open", "_mpool_put", "_mpool_sync", "_mrand48",
				"_nanosleep", "_new_printf_comp", "_new_printf_domain", "_newlocale",
				"_nextwctype", "_nextwctype_l", "_nftw", "_nice", "_nl_langinfo",
				"_nl_langinfo_l", "_nrand48", "_nvis", "_off32", "_off64",
				"_offtime", "_open_memstream", "_open_with_subsystem",
				"_open_wmemstream", "_opendev", "_opendir", "_openpty", "_openx_np",
				"_optarg", "_opterr", "_optind", "_optopt", "_optreset", "_pause",
				"_pclose", "_perror", "_popen", "_posix2time", "_posix_openpt",
				"_posix_spawnp", "_printf", "_printf_l", "_psignal", "_psort",
				"_psort_b", "_psort_r", "_ptsname", "_ptsname_r", "_putc",
				"_putc_unlocked", "_putchar", "_putchar_unlocked", "_putenv",
				"_puts", "_pututxline", "_putw", "_putwc", "_putwc_l", "_putwchar",
				"_putwchar_l", "_qsort", "_qsort_b", "_qsort_r", "_querylocale",
				"_radixsort", "_raise", "_rand", "_rand_r", "_random", "_rb_tree_count",
				"_rb_tree_find_node", "_rb_tree_find_node_geq", "_rb_tree_find_node_leq",
				"_rb_tree_init", "_rb_tree_insert_node", "_rb_tree_iterate",
				"_rb_tree_remove_node", "_readdir", "_readdir_r", "_readpassphrase",
				"_reallocf", "_realpath", "_recv", "_regcomp", "_regcomp_l",
				"_regerror", "_regexec", "_regfree", "_register_printf_domain_function",
				"_register_printf_domain_render_std", "_regncomp", "_regncomp_l",
				"_regnexec", "_regwcomp", "_regwcomp_l", "_regwexec", "_regwncomp",
				"_regwncomp_l", "_regwnexec", "_remove", "_remque", "_rewind",
				"_rewinddir", "_rindex", "_rpmatch", "_sbrk", "_scandir",
				"_scandir_b", "_scanf", "_scanf_l", "_seed48", "_seekdir", "_send",
				"_setbuf", "_setbuffer", "_setenv", "_sethostid", "_sethostname",
				"_setinvalidrune", "_setipv4sourcefilter", "_setkey", "_setlinebuf",
				"_setlocale", "_setlogin", "_setmode", "_setpgrp", "_setprogname",
				"_setrgid", "_setruid", "_setrunelocale", "_setsourcefilter",
				"_setstate", "_settimeofday", "_setttyent", "_setusershell",
				"_setutxent", "_setutxent_wtmp", "_setvbuf", "_sigaction",
				"_sigaddset", "_sigaltstack", "_sigblock", "_sigdelset",
				"_sigemptyset", "_sigfillset", "_sighold", "_sigignore",
				"_siginterrupt", "_sigismember", "_signal", "_sigpause", "_sigrelse",
				"_sigset", "_sigsetmask", "_sigvec", "_skip", "_sl_add", "_sl_find",
				"_sl_free", "_sl_init", "_sleep", "_snprintf", "_snprintf_l",
				"_snvis", "_sockatmark", "_sprintf", "_sprintf_l", "_sradixsort",
				"_srand", "_srand48", "_sranddev", "_srandom", "_srandomdev",
				"_sscanf", "_sscanf_l", "_stat_with_subsystem", "_statvfs",
				"_statx_np", "_stpcpy", "_stpncpy", "_strcasecmp", "_strcasecmp_l",
				"_strcasestr", "_strcasestr_l", "_strcat", "_strcoll", "_strcoll_l",
				"_strcspn", "_strdup", "_strenvisx", "_strerror", "_strerror_r",
				"_strfmon", "_strfmon_l", "_strftime", "_strftime_l", "_strmode",
				"_strncasecmp", "_strncasecmp_l", "_strncat", "_strndup", "_strnstr",
				"_strnunvis", "_strnunvisx", "_strnvis", "_strnvisx", "_strpbrk",
				"_strptime", "_strptime_l", "_strrchr", "_strsenvisx", "_strsep",
				"_strsignal", "_strsignal_r", "_strsnvis", "_strsnvisx", "_strspn",
				"_strsvis", "_strsvisx", "_strtod", "_strtod_l", "_strtof",
				"_strtof_l", "_strtofflags", "_strtoimax", "_strtoimax_l",
				"_strtok", "_strtok_r", "_strtol", "_strtol_l", "_strtold",
				"_strtold_l", "_strtoll", "_strtoll_l", "_strtonum", "_strtoq",
				"_strtoq_l", "_strtoul", "_strtoul_l", "_strtoull", "_strtoull_l",
				"_strtoumax", "_strtoumax_l", "_strtouq", "_strtouq_l", "_strunvis",
				"_strunvisx", "_strvis", "_strvisx", "_strxfrm", "_strxfrm_l",
				"_suboptarg", "_svis", "_swab", "_swprintf", "_swprintf_l",
				"_swscanf", "_swscanf_l", "_sxprintf", "_sxprintf_exec",
				"_sync_volume_np", "_sys_errlist", "_sys_nerr", "_sys_siglist",
				"_sys_signame", "_sysconf", "_sysctl", "_sysctlbyname",
				"_sysctlnametomib", "_system", "_tcdrain", "_tcflow", "_tcflush",
				"_tcgetattr", "_tcgetpgrp", "_tcgetsid", "_tcsendbreak", "_tcsetattr",
				"_tcsetpgrp", "_tdelete", "_telldir", "_tempnam", "_tfind",
				"_thread_stack_pcs", "_time", "_time2posix", "_timegm", "_timelocal",
				"_timeoff", "_times", "_timespec_get", "_timezone", "_timingsafe_bcmp",
				"_tmpfile", "_tmpnam", "_toascii", "_tolower", "_tolower_l",
				"_toupper", "_toupper_l", "_towctrans", "_towctrans_l", "_towlower",
				"_towlower_l", "_towupper", "_towupper_l", "_tre_ast_new_catenation",
				"_tre_ast_new_iter", "_tre_ast_new_literal", "_tre_ast_new_node",
				"_tre_ast_new_union", "_tre_compile", "_tre_fill_pmatch",
				"_tre_free", "_tre_mem_alloc_impl", "_tre_mem_destroy",
				"_tre_mem_new_impl", "_tre_parse", "_tre_stack_destroy",
				"_tre_stack_new", "_tre_stack_num_objects", "_tre_tnfa_run_backtrack",
				"_tre_tnfa_run_parallel", "_tsearch", "_ttyname", "_ttyname_r",
				"_ttyslot", "_twalk", "_tzname", "_tzset", "_tzsetwall", "_ualarm",
				"_ulimit", "_umaskx_np", "_uname", "_ungetc", "_ungetwc",
				"_ungetwc_l", "_unlockpt", "_unsetenv", "_unvis", "_uselocale",
				"_usleep", "_utime", "_utmpxname", "_uuid_clear", "_uuid_compare",
				"_uuid_copy", "_uuid_generate", "_uuid_generate_random",
				"_uuid_generate_time", "_uuid_is_null", "_uuid_pack", "_uuid_parse",
				"_uuid_unpack", "_uuid_unparse", "_uuid_unparse_lower",
				"_uuid_unparse_upper", "_vasprintf", "_vasprintf_l", "_vasxprintf",
				"_vasxprintf_exec", "_vdprintf", "_vdprintf_l", "_vdxprintf",
				"_vdxprintf_exec", "_verr", "_verrc", "_verrx", "_vfprintf",
				"_vfprintf_l", "_vfscanf", "_vfscanf_l", "_vfwprintf", "_vfwprintf_l",
				"_vfwscanf", "_vfwscanf_l", "_vfxprintf", "_vfxprintf_exec",
				"_vis", "_vprintf", "_vprintf_l", "_vscanf", "_vscanf_l",
				"_vsnprintf", "_vsnprintf_l", "_vsprintf", "_vsprintf_l", "_vsscanf",
				"_vsscanf_l", "_vswprintf", "_vswprintf_l", "_vswscanf",
				"_vswscanf_l", "_vsxprintf", "_vsxprintf_exec", "_vwarn", "_vwarnc",
				"_vwarnx", "_vwprintf", "_vwprintf_l", "_vwscanf", "_vwscanf_l",
				"_vxprintf", "_vxprintf_exec", "_wait", "_wait3", "_waitpid",
				"_warn", "_warnc", "_warnx", "_wcpcpy", "_wcpncpy", "_wcrtomb",
				"_wcrtomb_l", "_wcscasecmp", "_wcscasecmp_l", "_wcscat", "_wcschr",
				"_wcscmp", "_wcscoll", "_wcscoll_l", "_wcscpy", "_wcscspn",
				"_wcsdup", "_wcsftime", "_wcsftime_l", "_wcslcat", "_wcslcpy",
				"_wcslen", "_wcsncasecmp", "_wcsncasecmp_l", "_wcsncat", "_wcsncmp",
				"_wcsncpy", "_wcsnlen", "_wcsnrtombs", "_wcsnrtombs_l", "_wcspbrk",
				"_wcsrchr", "_wcsrtombs", "_wcsrtombs_l", "_wcsspn", "_wcsstr",
				"_wcstod", "_wcstod_l", "_wcstof", "_wcstof_l", "_wcstoimax",
				"_wcstoimax_l", "_wcstok", "_wcstol", "_wcstol_l", "_wcstold",
				"_wcstold_l", "_wcstoll", "_wcstoll_l", "_wcstombs", "_wcstombs_l",
				"_wcstoul", "_wcstoul_l", "_wcstoull", "_wcstoull_l", "_wcstoumax",
				"_wcstoumax_l", "_wcswidth", "_wcswidth_l", "_wcsxfrm", "_wcsxfrm_l",
				"_wctob", "_wctob_l", "_wctomb", "_wctomb_l", "_wctrans",
				"_wctrans_l", "_wctype", "_wctype_l", "_wcwidth", "_wcwidth_l",
				"_wmemchr", "_wmemcmp", "_wmemcpy", "_wmemmove", "_wmemset",
				"_wordexp", "_wordfree", "_wprintf", "_wprintf_l", "_wscanf",
				"_wscanf_l", "_wtmpxname", "_xprintf", "_xprintf_exec"
				)

				def print_here_head(name):
				  print("""\
				(tee %s.s |llvm-mc -filetype=obj -triple %s -o %s.o) <<END_OF_FILE &""" % (name, triple, name))

				def print_here_tail():
				  print("""\
				END_OF_FILE
				""")

				def print_function_head(p2align, name):
				  if args.os == "macos":
				    print("""\
				.section __TEXT,__text,regular,pure_instructions
				.p2align %d, 0x90
				.globl _%s
				_%s:""" % (p2align, name, name))
				  elif args.os == "windows":
				    print("""\
				.text
				.def %s;
				.scl 2;
				.type 32;
				.endef
				.globl %s
				.p2align %d
				%s:""" % (name, name, p2align, name))
				  elif args.os == "linux":
				    print("""\
				.text
				.p2align %d
				.globl %s
				%s:""" % (p2align, name, name))

				def print_function(addr, size, addrs):
				  name = "x%08x" % addr
				  calls = random.randint(0, size>>12)
				  print_here_head(name)
				  print("""\
				### %s size=%x calls=%x""" % (name, size, calls))
				  print_function_head(4, name)
				  for i in range(calls):
				    print(" bl %sx%08x\n .p2align 4" %
				          ("_" if args.os == "macos" else "",
				           addrs[random.randint(0, len(addrs)-1)]))
				    if args.os == "macos":
				      print(" bl %s\n .p2align 4" %
				            (libSystem_calls[random.randint(0, len(libSystem_calls)-1)]))
				  fill = size - 4 * (calls + 1)
				  assert fill > 0
				  print("""\
				.fill 0x%x
				ret""" % (fill))
				  print_here_tail()

				def random_seed():
				  """Generate a seed that can easily be passed back in via --seed=STRING"""
				  return ''.join(random.choice(string.ascii_lowercase) for i in range(10))

				def generate_sizes(base, megabytes):
				  total = 0
				  while total < megabytes:
				    size = random.randint(0x100, 0x10000) * 0x10
				    yield size
				    total += size

				def generate_addrs(addr, sizes):
				  i = 0
				  while i < len(sizes):
				    yield addr
				    addr += sizes[i]
				    i += 1

				def main():
				  parser = argparse.ArgumentParser(
				    description=__doc__,
				    epilog="""\
				WRITEME
				""")
				  parser.add_argument('--seed', type=str, default=random_seed(),
				                      help='Seed the random number generator')
				  parser.add_argument('--size', type=int, default=None,
				                      help='Total text size to generate, in megabytes')
				  parser.add_argument('--os', type=str, default="macos",
				                      help='Target OS: macos, windows, or linux')
				  global args
				  args = parser.parse_args()
				  triples = {
				    "macos": "arm64-apple-macos",
				    "linux": "aarch64-pc-linux",
				    "windows": "aarch64-pc-windows"
				  }
				  global triple
				  triple = triples.get(args.os)

				  print("""\
				### seed=%s triple=%s
				""" % (args.seed, triple))

				  random.seed(args.seed)

				  base = 0x4010
				  megabytes = (int(args.size) if args.size else 512) * 1024 * 1024
				  sizes = [size for size in generate_sizes(base, megabytes)]
				  addrs = [addr for addr in generate_addrs(base, sizes)]

				  for i in range(len(addrs)):
				    print_function(addrs[i], sizes[i], addrs)

				  print_here_head("main")
				  print("""\
				### _x%08x
				""" % (addrs[-1] + sizes[-1]))
				  print_function_head(14 if args.os == "macos" else 4, "main")
				  print(" ret")
				  print_here_tail()
				  print("wait")


				if __name__ == '__main__':
				  main()
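
The generator above lays functions out back to back (as `generate_addrs` does) and defaults to 512 MiB of text. That is more than a single ARM64 `bl` can span, since its 26-bit signed word offset reaches only about ±128 MiB, so some of the randomly chosen calls should need thunks. The following is a minimal sketch of that range check, not code from the patch: `BL_RANGE`, `layout`, and `in_branch_range` are illustrative names, and the addresses are nominal ones rather than what the linker finally assigns.

```python
# Illustrative only: these names are assumptions, not part of D100818.
BL_RANGE = 128 * 1024 * 1024  # ARM64 `bl`: 26-bit signed offset, scaled by 4


def layout(base, sizes):
    """Return the nominal start address of each function, placed back to back."""
    addrs = []
    addr = base
    for size in sizes:
        addrs.append(addr)
        addr += size
    return addrs


def in_branch_range(caller, callee):
    """True if a `bl` located at `caller` can reach `callee` without a thunk."""
    offset = callee - caller
    return -BL_RANGE <= offset < BL_RANGE


# Three functions; the second is large enough to push the third out of range.
sizes = [0x1000, 0x8000000, 0x2000]
addrs = layout(0x4010, sizes)

for callee in addrs[1:]:
    reach = "direct" if in_branch_range(addrs[0], callee) else "needs thunk"
    print("0x%08x -> 0x%08x: %s" % (addrs[0], callee, reach))
```

A call from the first function to the second stays in range, while a call to the third is just over 128 MiB away and would have to go through a thunk.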


Revision Contents

Diff 344852
