This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Only emit one BIND_OPCODE_SET_SYMBOL per symbol
ClosedPublic

Authored by int3 on Jun 28 2021, 7:54 PM.

Details

Summary

Size-wise, BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM is the most
expensive opcode, since it comes with an associated symbol string. We
were previously emitting it once per binding, instead of once per
symbol. This diff groups all bindings for a given symbol together and
ensures we only emit one such opcode per symbol. This matches ld64's
behavior.

While this is a relatively small win on chromium_framework (-72KiB), for
programs that have more dynamic bindings, the difference can be quite
large.

This change is perf-neutral when linking chromium_framework.

Diff Detail

Event Timeline

int3 created this revision.Jun 28 2021, 7:54 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: mgrang. · View Herald Transcript
int3 requested review of this revision.Jun 28 2021, 7:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2021, 7:54 PM
thakis added a subscriber: thakis.Jun 30 2021, 8:56 AM
thakis added inline comments.
lld/MachO/SyntheticSections.cpp
358

Is it guaranteed that we have filtered out separate symbols pointing at the same address by now? If not, this has nondeterminism issues (here and in the sort below)

int3 added inline comments.Jun 30 2021, 1:01 PM
lld/MachO/SyntheticSections.cpp
358

These are binding address, not symbol addresses, and I don't think any well-formed program will have two bindings pointing to the same address. I'm not even sure how you would generate that; I suppose you could use obj2yaml to hand-craft two relocations pointing at the same address, but I don't think it's possible if you're writing assembly.

Also FWIW I double-checked with -DLLVM_ENABLE_EXPENSIVE_CHECKS, and our test cases don't find any issues with this

int3 added inline comments.Jun 30 2021, 2:18 PM
lld/MachO/SyntheticSections.cpp
358

ICF could cause two binds to point to the same dedup'ed address, but that gets fixed by D105044: [lld-macho] Move ICF earlier to avoid emitting redundant binds :)

int3 edited the summary of this revision. (Show Details)Jul 1 2021, 6:31 AM
int3 edited the summary of this revision. (Show Details)
thakis accepted this revision.Jul 5 2021, 4:51 PM

lgtm

This revision is now accepted and ready to land.Jul 5 2021, 4:51 PM
This revision was automatically updated to reflect the committed changes.