This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Implement support for PIC
ClosedPublic

Authored by int3 on Sep 5 2020, 7:05 PM.

Details

Reviewers
gkm
Group Reviewers
Restricted Project
Commits
rGe4e673e75a06: [lld-macho] Implement support for PIC
Summary
  • Implement rebase opcodes. Rebase opcodes tell dyld where absolute addresses have been encoded in the binary. If the binary is not loaded at its preferred address, dyld has to rebase these addresses by adding an offset to them.
  • Support -pie and use it to test rebase opcodes.

This is necessary for absolute address references in dylibs, bundles etc
to work.

Diff Detail

Event Timeline

int3 created this revision.Sep 5 2020, 7:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2020, 7:05 PM
int3 requested review of this revision.Sep 5 2020, 7:05 PM

Is there a funnel point that we know that all the rebase entries will pass through to ensure that we catch them rather than adding them at a couple of sites?

lld/MachO/SyntheticSections.h
187

What do you think of moving addend to after target?

214

Similar

int3 added a comment.Sep 8 2020, 9:02 PM

Is there a funnel point that we know that all the rebase entries will pass through to ensure that we catch them rather than adding them at a couple of sites?

It may be clearer if we scan the GOT/LazyPointerSection inside RebaseSection::finalizeContents() for any absolute addresses, rather than relying on the code that's adding entries to either of those sections to also handle the rebasing. I will try that... however, there isn't a funnel point that will cover absolute addresses from both our SyntheticSections as well as from InputSections.

lld/MachO/SyntheticSections.cpp
159

I should just return early instead of checking locations.empty() twice

lld/MachO/SyntheticSections.h
187

What's the rationale behind your preference? I thought putting the addend next to the symbol made sense since they both determine what address gets written into the target

int3 added a comment.Sep 8 2020, 9:48 PM

It may be clearer if we scan the GOT/LazyPointerSection inside RebaseSection::finalizeContents() for any absolute addresses

Actually, this doesn't really work either, because we need to add the rebases early in order that the isNeeded() filtering doesn't filter out RebaseSection. Also there's a fair bit of code duplication in this approach; it essentially mirrors the logic in NonLazyPointerSectionBase::writeTo() and LazyPointerSection::writeTo(). I don't see a nice solution here...

int3 updated this revision to Diff 290637.Sep 8 2020, 9:59 PM

minor update

smeenai added a subscriber: smeenai.Sep 9 2020, 6:16 PM

Is there a funnel point that we know that all the rebase entries will pass through to ensure that we catch them rather than adding them at a couple of sites?

It may be clearer if we scan the GOT/LazyPointerSection inside RebaseSection::finalizeContents() for any absolute addresses, rather than relying on the code that's adding entries to either of those sections to also handle the rebasing. I will try that... however, there isn't a funnel point that will cover absolute addresses from both our SyntheticSections as well as from InputSections.

I haven't reviewed this yet so I'm missing a bunch of context, and I haven't thought through this very much yet either, but would any of this be easier if SyntheticSections were InputSections instead of OutputSections? (There's definitely downsides to that too, but it's worth thinking about the trade-offs.)

int3 added a comment.Sep 9 2020, 6:58 PM

but would any of this be easier if SyntheticSections were InputSections instead of OutputSections

Maybe... we could make them InputSections and then generate unsigned relocations for each of their entries. There would be some additional overhead (e.g. the addend field would be useless) but it would allow us to handle the rebases in one place. One thing I'm not sure about is whether other architectures have an equivalent of X86_64_RELOC_UNSIGNED that can be used for this; it looks like arm64 does, but not sure about PPC (not that it really matters).

There are other benefits to having SyntheticSections be InputSections: there would no longer be a need for the SectionPointerUnion, and DSOHandle could just be an ordinary Defined symbol.

I actually can't think of many real downsides... do you have any in mind? That said, the current architecture is not terrible, and I'm inclined to work on implementing more features and pushing the refactoring till later.

int3 updated this revision to Diff 292073.Sep 15 2020, 6:58 PM

emit rebase opcodes for section relocations too

gkm added a subscriber: gkm.Sep 23 2020, 10:18 PM
gkm added inline comments.
lld/MachO/Writer.cpp
339–341

Without checking godbolt, I boldly assume that the computation for the string compare will be hoisted out of the inner loop. Please add a testcase for mixing -pie and compact unwind.

gkm added a comment.Sep 23 2020, 10:50 PM

In lld/MachO/SyntheticSections.cpp, there are three instances of ...

clang-tidy: warning: 'auto *osec' can be declared as 'const auto *osec' [llvm-qualified-auto]

... which seem worthy of addressing.

smeenai added inline comments.Sep 23 2020, 10:58 PM
lld/MachO/Writer.cpp
339–341

There's gonna be lots of relocations, so it's probably worth confirming that the hoisting happens :)

The section comparison seems fine for now, but it also seems a bit janky. Is there any long-term redesign we could do here to make it cleaner? We should add a TODO somewhere if so to keep track of it.

but would any of this be easier if SyntheticSections were InputSections instead of OutputSections

Maybe... we could make them InputSections and then generate unsigned relocations for each of their entries. There would be some additional overhead (e.g. the addend field would be useless) but it would allow us to handle the rebases in one place. One thing I'm not sure about is whether other architectures have an equivalent of X86_64_RELOC_UNSIGNED that can be used for this; it looks like arm64 does, but not sure about PPC (not that it really matters).

There are other benefits to having SyntheticSections be InputSections: there would no longer be a need for the SectionPointerUnion, and DSOHandle could just be an ordinary Defined symbol.

I actually can't think of many real downsides... do you have any in mind? That said, the current architecture is not terrible, and I'm inclined to work on implementing more features and pushing the refactoring till later.

I agree with refactoring later (not too much later, but it probably makes sense to at least get some experience with implementing other architectures first). We decided to make SyntheticSections be OutputSections back in https://reviews.llvm.org/D77893#inline-716459. We were debating why ELF makes them InputSections; it appears the main motivation for that design was to allow synthetic sections to be manipulated via linker scripts, which we don't need to support with Mach-O. We can decide what ultimately makes more sense when we have a bit more of the implementation completed though.

@psmith, in your talk on LLD's performance (https://archive.fosdem.org/2019/schedule/event/llvm_lld/), you mentioned something along the lines of LLD's synthetic sections being input sections (as opposed to output) significantly speeding up thunks (at the 25:10 mark). Would you be able to elaborate on why? We'll need to implement thunks when we add the ARM64 support, although our design will probably be closer to COFF's than ELF's (since we don't have linker scripts).

gkm added inline comments.Sep 23 2020, 11:25 PM
lld/MachO/Writer.cpp
339–341

The section comparison seems fine for now, but it also seems a bit janky. Is there any long-term redesign we could do here to make it cleaner? We should add a TODO somewhere if so to keep track of it.

What springs to mind is a flag indicating that an InputSection (or its parent MergedOutputSection) will not be written to the output file. We can add S_COMPACT_UNWIND to enum SectionType, which is cheaper to check than string compare on the name. And/or we can add an attribute S_ATTR_TEMPORARY or S_ATTR_INTERNAL or some such, indicating an input section is for internal use and not destined for output. The attribute will not be set on compact unwind sections with -r.

MaskRay added inline comments.
lld/MachO/SyntheticSections.cpp
165

If there are empty sections, there may be stability concerns.

gkm added inline comments.Sep 24 2020, 7:28 AM
lld/MachO/Writer.cpp
330–332

This is better: nothing in the relocs loop pertains to __compact_unwind so we can skip it at the top.

int3 added inline comments.Sep 24 2020, 12:41 PM
lld/MachO/SyntheticSections.cpp
165

we can't have a rebase entry pointing to an empty section though

int3 updated this revision to Diff 294170.Sep 24 2020, 2:39 PM

address comments

int3 marked 2 inline comments as done.Sep 24 2020, 2:40 PM
int3 added inline comments.
lld/MachO/Writer.cpp
330–332

good point. I've gone with this option, plus added a TODO so we revisit

gkm accepted this revision.Sep 24 2020, 11:04 PM

Feel free to land when you have the -pie + compact unwind testcase.

This revision is now accepted and ready to land.Sep 24 2020, 11:04 PM
int3 marked an inline comment as done.Sep 25 2020, 10:44 AM

@gkm the test is already in the latest revision :)

This revision was landed with ongoing or failed builds.Sep 25 2020, 11:28 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Nov 30 2020, 1:19 PM
thakis added inline comments.
lld/MachO/Driver.cpp
641

This didn't remove HelpHidden from pie, so we still warn that this is unimplemented.

Also, this should be on by default when targeting macOS 10.7+.

int3 marked an inline comment as done.Nov 30 2020, 9:13 PM
int3 added inline comments.
lld/MachO/Driver.cpp
641

Thanks! Addressed in D92362.