This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Have ICF operate on all sections at once
ClosedPublic

Authored by int3 on Jul 8 2021, 9:32 AM.

Details

Reviewers
gkm
smeenai
Group Reviewers
Restricted Project
Commits
rG428a7c1b38d2: [lld-macho] Have ICF operate on all sections at once
Summary

ICF previously operated only within a given OutputSection. We would
merge all CFStrings first, then merge all regular code sections in a
second phase. This worked fine since CFStrings would never reference
regular __text sections. However, I would like to expand ICF to merge
functions that reference unwind info. Unwind info references the LSDA
section, which can in turn reference the __text section, so we cannot
perform ICF in phases.

In order to have ICF operate on InputSections spanning multiple
OutputSections, we need a way to distinguish InputSections that are
destined for different OutputSections, so that we don't fold across
section boundaries. We achieve this by creating OutputSections early,
and setting InputSection::parent to point to them. This is what
LLD-ELF does. (This change should also make it easier to implement the
section$start$ symbols.)

This diff also merges InputSections w/o checking their flags, which I
think is the right behavior -- if they are destined for the same
OutputSection, they will have the same flags in the output (even if
their input flags differ). I.e. the parent pointer check subsumes the
flags check. In practice this has nearly no effect (ICF did not become
any more effective on chromium_framework).

I've also updated ICF.cpp's block comment to better reflect its current
status.

Diff Detail

Event Timeline

int3 created this revision.Jul 8 2021, 9:32 AM
Herald added a project: Restricted Project. · View Herald Transcript
int3 requested review of this revision.Jul 8 2021, 9:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2021, 9:33 AM
int3 updated this revision to Diff 357258.Jul 8 2021, 9:45 AM
int3 edited the summary of this revision. (Show Details)
int3 planned changes to this revision.Jul 8 2021, 9:55 AM
int3 updated this revision to Diff 357388.Jul 8 2021, 4:48 PM
int3 retitled this revision from [lld-macho][nfc] Create OutputSections earlier to [lld-macho] Have ICF operate on all sections at once.
int3 edited the summary of this revision. (Show Details)
smeenai accepted this revision.Jul 16 2021, 9:59 AM
smeenai added a subscriber: smeenai.

LGTM

My one concern is that we now have to set isec->parent in a whole bunch of places (vs. the previous centralized setup), and that feels easy to forget if we need it in more spots in the future. I imagine things would blow up pretty loudly if it were forgotten though.

This revision is now accepted and ready to land.Jul 16 2021, 9:59 AM
int3 added a comment.Jul 17 2021, 10:41 AM

Yeah, I agree that it's unfortunate, but we can't emit any InputSection that doesn't have a parent, so omissions should indeed be pretty easy to catch :)

This revision was automatically updated to reflect the committed changes.