Page MenuHomePhabricator

[lld] Initial commit for new Mach-O backend
ClosedPublic

Authored by int3 on Feb 28 2020, 1:14 PM.

Details

Summary

This is the first commit for the new Mach-O backend, designed to roughly follow the architecture of the existing ELF and COFF backends, and building off work that @ruiu and @pcc did in a branch a while back. Note that this is a very stripped-down commit with the bare minimum of functionality for ease of review. We'll be following up with more diffs soon.

Currently, we're able to generate a simple "Hello World!" executable that runs on OS X Catalina (and possibly on earlier OS X versions; I haven't tested them). (This executable can be obtained by compiling test/MachO/relocations.s.) We're mocking out a few load commands to achieve this -- for example, we can't load dynamic libraries, but Catalina requires binaries to be linked against dyld, so we hardcode the emission of a LC_LOAD_DYLIB command. Other mocked out load commands include LC_SYMTAB and LC_DYSYMTAB.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
int3 added a comment.Mar 12 2020, 11:38 AM

Thanks everyone for the reviews and suggestions!

lld/MachO/InputFiles.cpp
142

hm this is an enum value, not a member/parameter, so I think CamelCase is correct here?

smeenai added inline comments.Mar 12 2020, 11:59 AM
lld/MachO/InputFiles.cpp
142

Yeah ObjKind is consistent with LLD ELF.

I'll carefully read through the patch. Hope you don't mind keeping it open for more time.

I'll carefully read through the patch. Hope you don't mind keeping it open for more time.

Sure, go ahead. The more eyes on this the better :)

ruiu added a comment.Mar 13 2020, 12:09 AM

Jez and Shoaib, if you guys have more patches on this one, you can send them now for review if you want.

Jez and Shoaib, if you guys have more patches on this one, you can send them now for review if you want.

Yup, we'll start putting up follow-ups soon (probably next week).

int3 updated this revision to Diff 250317.Mar 13 2020, 3:54 PM

clang-tidy + rebase

MaskRay added inline comments.Mar 16 2020, 9:15 PM
lld/MachO/Driver.cpp
41

const char *NAME[] = VALUE;

75

Add a -arch i386 test. There is currently no test.

90

Add a test similar to test/ELF/invalid/executable.s

127

Add an -e test with an undefined symbol.

140

vec can be omitted.

lld/MachO/InputFiles.cpp
25

the beginning

Do you mean that a symbol defined in the section must be at the beginning of a subsection, unless the N_ALT_ENTRY attribute is set?

71

Add a cannot open test. See test/ELF/basic.s (I don't think placing many things in one file is good but you can copy one of the tests (%t.no.such.file))

85
86

reinterpret_cast is more idiomatic.

107

What if sec.align is >= 32 ?

MaskRay added inline comments.Mar 16 2020, 9:41 PM
lld/MachO/Arch/X86_64.cpp
51

getImplicitAddend errors for an unknown relocation type. Is here unreachable?

lld/MachO/Options.td
12

--version uses a double-dash form while other long options use single-dash form. Can you confirm?

lld/MachO/Target.h
19

pageSize
imageBase

28

In ELF, D73254 renamed relocateOne to relocate and passed in Relocation. You may think whether Mach-O will need other fields of Reloc.

lld/MachO/Writer.cpp
39

virtual uint64_t getSize() const = 0;

50

Why are the two variables not placed with other member variables?

95

const

100

Can os be local to createDyldInfoContents?

int3 updated this revision to Diff 250695.Mar 16 2020, 10:48 PM

minor update

int3 updated this revision to Diff 250999.Mar 17 2020, 11:46 PM
int3 marked 20 inline comments as done.

address comments

int3 updated this revision to Diff 251000.Mar 17 2020, 11:47 PM

update

lld/MachO/Arch/X86_64.cpp
52

good point, hadn't noticed that Writer would exit if errorCount was nonzero.

lld/MachO/InputFiles.cpp
25

It was @ruiu who originally wrote this, but I believe it means that a symbol w/o the N_ALT_ENTRY attribute defines the beginning of a subsection. So your statement should be true by definition. Maybe "by default" should be replaced with "by definition"...

86

fair enough. I've converted all the C-style casts I could find.

107

I haven't found the logic in ld64 that handles this, but some experimentation (creating object files with high alignments using yaml2obj and then passing it to ld64) suggests it does applies mod 256 to sec.align w/o emitting warnings or anything. Obviously 2 ** 256 is still ridiculously large, and ultimately ld64 emits a 2GB object file, so it looks like some silent overflows are happening.

I'm leaning towards applying mod 32 here and emitting a warning.

lld/MachO/Options.td
12

hm, ld64 actually doesn't have a version flag at all, I was just copying lld-elf here... but I suppose it would make sense for this to be single-dash to be consistent with the other flags here (and in ld64)

lld/MachO/Target.h
19

clang-tidy specifies enums to be CamelCase

int3 marked 3 inline comments as done.Mar 17 2020, 11:55 PM
ruiu added inline comments.Mar 18 2020, 12:22 AM
lld/MachO/InputFiles.cpp
107

mod 32 seems a bit too arbitrary, so how about always setting isec->align to sec.align with a warning if sec.align is too large?

lld/MachO/Options.td
12

It looks like ld64 does have -v instead of --version or -version.

Harbormaster failed remote builds in B49555: Diff 251003!
Harbormaster failed remote builds in B49551: Diff 250999!
int3 marked an inline comment as done.Mar 18 2020, 12:34 AM
int3 added inline comments.
lld/MachO/InputFiles.cpp
107

how do you propose we align the addresses for sections with large align values?

ruiu added inline comments.Mar 18 2020, 12:42 AM
lld/MachO/InputFiles.cpp
107

Oh, the first thing I'd like to point out is that we store raw alignment values instead of power-of-two values to alignment variables in other ports, and mach-o probably should follow that to avoid confusion, even if it means we compute a power-of-two and then convert it back to an exponent.

To answer to your question, if you use uint32_t for align member, the maximum alignment we can represent is naturally limited to 2^31. How about print out an error (instead of warning) if it is greater than that?

int3 updated this revision to Diff 251013.Mar 18 2020, 1:45 AM
int3 marked 3 inline comments as done.

change alignment handling

lld/MachO/InputFiles.cpp
107

Ah, thanks for the explanation. And yeah, I guess ld64's behavior isn't worth emulating, erroring out is better.

ruiu accepted this revision.Mar 18 2020, 2:47 AM

LGTM

lld/MachO/InputFiles.cpp
109

nit: error messages should start with a lowercase letter.

110

Set a dummy value (e.g. 1) to isec->align if the input is invalid, so that even after we meet an error condition, all the member variables are at least initialized with some value instead of left uninitialized.

int3 updated this revision to Diff 251075.Mar 18 2020, 7:14 AM
int3 marked an inline comment as done.

lowercase

int3 marked an inline comment as done.Mar 18 2020, 7:15 AM
int3 added inline comments.
lld/MachO/InputFiles.cpp
110

align is default-initialized to zero in the class definition

int3 updated this revision to Diff 251139.Mar 18 2020, 11:35 AM

more const + override

int3 updated this revision to Diff 251533.Mar 19 2020, 7:29 PM

fix align handling

int3 updated this revision to Diff 252808.Mar 26 2020, 5:24 AM

have LC::getSize() return uint32_t since sizeofcmds is 32 bit

gkm added a subscriber: gkm.Mar 26 2020, 12:54 PM
ruiu added a comment.Mar 30 2020, 10:19 PM

Can you submit this change? I think this is good enough as an initial commit, and you can make further changes on top of this.

int3 added a comment.Mar 31 2020, 11:13 AM

I don't have commit access yet. @smeenai, could you do it?

I don't have commit access yet. @smeenai, could you do it?

Sure, I can get to this soon.

I don't have commit access yet. @smeenai, could you do it?

Sure, I can get to this soon.

Make sure to use git c --amend --author='Jez Ng <.............>'

MaskRay accepted this revision.Mar 31 2020, 11:37 AM
This revision was automatically updated to reflect the committed changes.

I don't have commit access yet. @smeenai, could you do it?

Sure, I can get to this soon.

Make sure to use git c --amend --author='Jez Ng <.............>'

@int3 used arc diff to upload the patch, so arc patch sets the correct author information automatically.

MaskRay added a comment.EditedApr 2 2020, 11:55 AM

This was reverted by af39151f3c54b90ff0ae1064f540a97b640298e9 along with my gn build fix (c886e2be1ee4c57ef9f3a75509b4f2299ad9f61b).

@ostannard Hope you can paste a notice and give some time before reverting :)
This looks like an easy fix. A revert could be more disruptive.

Relanded with commit 6acd3003755db3944f7fcbea7542bd574a41c0a0