This is an archive of the discontinued LLVM Phabricator instance.

[lld/mac] Implement -dead_strip
ClosedPublic

Authored by thakis on May 28 2021, 8:46 AM.

Details

Reviewers
int3
gkm
Group Reviewers
Restricted Project
Commits
rGa5645513dba7: [lld/mac] Implement -dead_strip
Summary

Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

% ministat lld_*
x lld_nostrip.txt
+ lld_strip.txt
    N           Min           Max        Median           Avg        Stddev
x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
+  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
Difference at 95.0% confidence
        -0.144711 +/- 0.0336749
        -3.60967% +/- 0.839989%
        (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added isLive() checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:

  • debug info
  • export tries
  • import opcodes
  • flags like -exported_symbol(s_list)
  • -U / dynamic_lookup
  • mod_init_funcs, mod_term_funcs
  • weak symbol handling
  • unwind info
  • stubs
  • map files
  • -sectcreate
  • undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:

  • check-llvm check-clang check-lld work with lld with this patch as host linker and -dead_strip enabled
  • Chromium still starts
  • Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:

  • Rename canOmitFromOutput to wasCoalesced (no behavior change) since it really is for weak coalesced symbols
  • Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP (.no_dead_strip in asm)

Fixes PR49276.

Diff Detail

Event Timeline

thakis created this revision.May 28 2021, 8:46 AM
Herald added a reviewer: gkm. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: dang, mgorny. · View Herald Transcript
thakis requested review of this revision.May 28 2021, 8:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2021, 8:46 AM

Thanks! Iget:

= note: lld: error: unknown argument '-dead_strip'
thakis updated this revision to Diff 348545.May 28 2021, 8:55 AM

minor tweak to a comment

thakis edited the summary of this revision. (Show Details)EditedMay 28 2021, 8:59 AM

Thanks! Iget:

= note: lld: error: unknown argument '-dead_strip'

…with this patch applied? If so, are you invoking lld trough the ld64.lld symlink to get the MachO driver?

Sorry. Before this patch. I am trying to link Rust.

Sorry. Before this patch. I am trying to link Rust.

Ah, I see . I'd recommend not passing -Wl,-fatal_warnings, then that's only a warning :) (But that particular diag will go away once this patch is in.)

thakis edited the summary of this revision. (Show Details)May 28 2021, 9:02 AM

Sorry for bothering with this with ld64.lld:

= note: ld64.lld: error: unknown argument '-m64'
        ld64.lld: error: unknown argument '-Wl,-dead_strip'
        ld64.lld: error: unknown argument '-nodefaultlibs'
        ld64.lld: error: must specify -platform_version

Sorry for bothering with this with ld64.lld:

= note: ld64.lld: error: unknown argument '-m64'
        ld64.lld: error: unknown argument '-Wl,-dead_strip'
        ld64.lld: error: unknown argument '-nodefaultlibs'
        ld64.lld: error: must specify -platform_version

Oh, if you call it through clang (which is the normal way), just -fuse-ld=lld does the right thing. This is really off topic for this review try though 😃

thakis updated this revision to Diff 348560.May 28 2021, 11:25 AM

make clang-format happy

not sure what clang-tidy wants from me though, that header guard looks fine. would be nice if that check had more details.

Sorry for bothering with this with ld64.lld:

= note: ld64.lld: error: unknown argument '-m64'
        ld64.lld: error: unknown argument '-Wl,-dead_strip'
        ld64.lld: error: unknown argument '-nodefaultlibs'
        ld64.lld: error: must specify -platform_version

Oh, if you call it through clang (which is the normal way), just -fuse-ld=lld does the right thing. This is really off topic for this review try though 😃

The last off topic comment. The rust compiler invokes the linker: https://doc.rust-lang.org/rustc/codegen-options/index.html#linker

make clang-format happy

not sure what clang-tidy wants from me though, that header guard looks fine. would be nice if that check had more details.

Maybe clang-tidy is in general unhappy with the MachO style, but it only complains about new files.

thakis updated this revision to Diff 348601.May 28 2021, 5:13 PM

tiny cosmetic tweaks

thakis updated this revision to Diff 348610.May 28 2021, 6:49 PM

more tiny cosmetic tweaks and some more clang-format appeasing

int3 added a comment.Jun 1 2021, 9:24 PM

Nice work :D

lld/MachO/Driver.cpp
930

can we put this together with the other config init lines around line 969?

edit: ah, I guess it's because Symbol's ctor uses it... a comment would be good. Although -- what do you think about always marking Symbols/InputSections as not-live and then have isLive() check for !config->deadStrip? I guess there'll be a bit more runtime overhead but not sure it'd be significant...

hm I see the other LLD ports have also implemented things this way :/

lld/MachO/InputSection.h
51
64

it seems a bit counterintuitive to me that something can be live even though it has been coalesced away :) having live default to false (as mentioned in my comment in Driver.cpp) would avoid this issue too

lld/MachO/MarkLive.cpp
28

This is a pretty large function, can we break it up? Maybe have the seed-marking vs transitive-closure-marking as separate functions

29

when should we use ScopedTimer over TimeTraceScope? also, I don't think we call Timer::print() anywhere...

108–109
124

I'd personally prefer target->wordSize == 8 ? sizeof(CompactUnwindEntry<uint64_t>) : sizeof(CompactUnwindEntry<uint32_t>). More verbose but also more explicit

lld/MachO/Symbols.cpp
43

seems clearer this way imo

lld/MachO/Symbols.h
113

nit for mach-o terminology consistency

lld/test/MachO/dead-strip.s
12
28–38

nit: the first {{.*}} on each line is unnecessary

43
128–129

is this line meant to check that no bindings are present in the bind table? I don't think it does the job... I think you want something like

STRIPDYLIB: Bind table:
STRIDYLIB-EMPTY:
STRIPDYLIB-NEXT: Lazy bind table:
214
243–245

nit: please align :D

562
lld/test/MachO/mh-header-link.s
12

why is this necessary?

thakis updated this revision to Diff 349243.Jun 2 2021, 5:48 AM
thakis marked 11 inline comments as done.

address comments

lld/MachO/Driver.cpp
930

Added a comment.

Right, it's for consistency with the other LLD ports. I had the explicit check in isLive first but I moved to this for consistency with the other LLDs and for some other reason I no longer remember. Maybe used in Symbol needed an accessor for reading then? Not sure that was it. It felt marginally nicer this way :)

lld/MachO/InputSection.h
64

If dead stripping is on, all coalesced weak InputSections should also be marked !live.

If dead stripping is not on, then yes, this can happen. It's a bit strange, but it matches the other ports.

lld/MachO/MarkLive.cpp
28

Yes, that's what I was getting at with

"""
I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
"""

The ELF port has a class with member functions and stuff, and I agree that the root collecting code should be its own function. If it's ok I'd like to iterate on this in-tree though.

(I also have a local branch that implements -why_live … actually let me upload that, D103517 … but implementing it the same way ld64 did is too expensive perf wise, so I don't know yet if I need to templatize this on the state bit or made dead_strip independent of why_live or what. And finally, it feels like several pieces of code walk all files just to iterate all local symbols, so maybe I'd to pull the localSymbols computation out of SyntheticSections.cpp and then use that in the several places that use it. And speaking of local symbols, the order file processing currently walks all symbols, exported and local, and checks each against the order file, even though if the order file is short and only contains public symbols it should be way faster to walk the order file and look each symbol up in the symbol table instead, then it's small-constant-times-amortized-linear-of-large instead of large-constant-times-amortized-linear-of-small.)

Sorry about the wall of text! Summary: I agree I should do this, but I'd like to do it after getting the first draft committed :)

29

Good catch. Apparently the COFF port uses ScopedTimer (for its fancy /time feature), while ELF and MachO use TimeTraceScope (for their fancy trace export). I think I should change TimeTraceScope to also be able to print time reports, then switch COFF to that so that it grows a fancy trace export, and then give the ELF and MachO ports flags for a time report on the console. (For MachO, I guess that'd go behind -print_statistics which already exists.)

For now, changed this to TimeTraceScope.

124

Also requires putting CompactUnwindEntry in the header. But sure, done.

lld/test/MachO/dead-strip.s
128–129

IIRC the --implicit-check-not _unref does most of the work and this here just checks that I remembered to pass --bind

lld/test/MachO/mh-header-link.s
12

Good question. I think it was test coverage for making sure createSyntheticSymbols() works with dead stripping. I think in the end it just worked without any changes, but I figured it's still nice to have the coverage.

int3 accepted this revision.Jun 2 2021, 7:57 AM

lgtm

lld/MachO/MarkLive.cpp
28

no need for llvm::

This revision is now accepted and ready to land.Jun 2 2021, 7:57 AM
thakis marked an inline comment as done.Jun 2 2021, 8:09 AM

Thanks! :)

lld/MachO/MarkLive.cpp
28
../../lld/MachO/MarkLive.cpp:27:3: error: unknown type name 'TimeTraceScope'; did you mean 'llvm::TimeTraceScope'?

(Added a using namespace llvm; for it)

This revision was landed with ongoing or failed builds.Jun 2 2021, 8:09 AM
This revision was automatically updated to reflect the committed changes.
thakis marked an inline comment as done.

@thakis dead-strip.s has been failing on our 32 bit armv8 bot: https://lab.llvm.org/buildbot/#/builders/178/builds/42
(I think we missed when this started due to some bot config changes we were doing)

I've disabled it for 32 bit builds in: https://reviews.llvm.org/rG6942076096e6dcfb0893a351a9a586490beec572

Seems like some of the symbol names change order when built for 32 bit. At least that's my guess, doesn't seem like an Arm specific thing unless it's distro related somehow. Happens with GCC or clang built lld.

Expected:

# UNWIND-NEXT:   g F __TEXT,__text __mh_execute_header
# UNWIND-NEXT:   *UND* ___cxa_allocate_exception
# UNWIND-NEXT:   *UND* ___cxa_end_catch
# UNWIND-NEXT:   *UND* __ZTIi
# UNWIND-NEXT:   *UND* ___cxa_throw
# UNWIND-NEXT:   *UND* ___gxx_personality_v0
# UNWIND-NEXT:   *UND* ___cxa_begin_catch
# UNWIND-NEXT:   *UND* dyld_stub_binder

Got:

 8: 0000000100000000 g F __TEXT,__text __mh_execute_header
 9: 0000000000000000 *UND* __ZTIi
10: 0000000000000000 *UND* ___gxx_personality_v0
11: 0000000000000000 *UND* ___cxa_throw
12: 0000000000000000 *UND* ___cxa_end_catch
13: 0000000000000000 *UND* ___cxa_allocate_exception
14: 0000000000000000 *UND* ___cxa_begin_catch
15: 0000000000000000 *UND* dyld_stub_binder

Any ideas?