Page MenuHomePhabricator

[lld-macho] Move ICF earlier to avoid emitting redundant binds

Authored by int3 on Jun 28 2021, 11:34 AM.



This is a pretty big refactoring diff, so here are the motivations:

Previously, ICF ran after scanRelocations(), where we emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.

However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by

In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big inputSections vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.

Adding literal sections directly to their output sections means that we
can no longer determine inputOrder from iterating over
inputSections. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
outSecOff to store this order value.

One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since want to know that before processRelocations(). I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.

The upside of all this is that the inputSections vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.

I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular .text section, they're more common in .data sections, so it
seems more natural to test it that way.

This change is perf-neutral when linking chromium_framework.

Diff Detail

Event Timeline

int3 created this revision.Jun 28 2021, 11:34 AM
Herald added a project: Restricted Project. · View Herald Transcript
int3 requested review of this revision.Jun 28 2021, 11:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2021, 11:34 AM
int3 added inline comments.Jun 28 2021, 4:41 PM

another benefit of moving ICF earlier: the lack of canonicalization is now obvious, since the parent point of coalesced InputSections is null.

int3 planned changes to this revision.Jun 28 2021, 10:16 PM
int3 updated this revision to Diff 355114.Jun 28 2021, 10:35 PM
int3 edited the summary of this revision. (Show Details)

change how we implement inputOrder

oontvoo added inline comments.

(clang-tidy suggestion)


why not leave it as 0-based rather than 1?

not saying it's wrong ... just curious :)


why "4"? is this not platform dependent?


nit: just for consistency

int3 added inline comments.Jun 30 2021, 6:59 PM

well some reviewers seem to dislike post-increments :)

either a zero- or 1-based index works here, but yeah, maybe keeping it zero-based will be less surprising. I'll change it back


this is existing code, I think @gkm wrote this

int3 updated this revision to Diff 355752.Jun 30 2021, 7:37 PM
int3 marked 4 inline comments as done.

address comments

int3 edited the summary of this revision. (Show Details)Jul 1 2021, 6:30 AM
oontvoo accepted this revision as: oontvoo.Jul 1 2021, 2:47 PM

LGTM - thanks! (but still curious about the hard-coded alignment)

This revision is now accepted and ready to land.Jul 1 2021, 2:47 PM
int3 added a comment.Jul 1 2021, 4:11 PM

A quick check of ld64's output shows that __unwind_info is aligned to 4 across x86_64/x86/arm64/armv7, so I think we're good here :)