This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Align cstrings less conservatively
ClosedPublic

Authored by int3 on Mar 9 2022, 2:37 PM.

Details

Reviewers
oontvoo
Group Reviewers
Restricted Project
Commits
rG4308f031cd0c: [lld-macho] Align cstrings less conservatively
Summary

Previously, we aligned every cstring to 16 bytes as a temporary hack to
deal with https://github.com/llvm/llvm-project/issues/50135. However, it
was highly wasteful in terms of binary size.

To recap, in contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them
all in one section. Strings that need to be aligned have the .p2align
directive emitted before them, which simply translates into zero padding
in the object file. In other words, we have to infer the alignment of
the cstrings from their addresses.

We differ slightly from ld64 in how we've chosen to align these
cstrings. Both LLD and ld64 preserve the number of trailing zeros in
each cstring's address in the input object files. When deduplicating
identical cstrings, both linkers pick the cstring whose address has more
trailing zeros, and preserve the alignment of that address in the final
binary. However, ld64 goes a step further and also preserves the offset
of the cstring from the last section-aligned address. I.e. if a cstring
is at offset 18 in the input, with a section alignment of 16, then both
LLD and ld64 will ensure the final address is 2-byte aligned (since
18 == 16 + 2). But ld64 will also ensure that the final address is of
the form 16 * k + 2 for some k (which implies 2-byte alignment).

Note that ld64's heuristic means that a dedup'ed cstring's final address is
dependent on the order of the input object files. E.g. if in addition to the
cstring at offset 18 above, we have a duplicate one in another file with a
.cstring section alignment of 2 and an offset of zero, then ld64 will pick
the cstring from the object file earlier on the command line (since both have
the same number of trailing zeros in their address). So the final cstring may
either be at some address 16 * k + 2 or at some address 2 * k.

I've opted not to follow this behavior primarily for implementation
simplicity, and secondarily to save a few more bytes. It's not clear to me
that preserving the section alignment + offset is ever necessary, and there
are many cases that are clearly redundant. In particular, if an x86_64 object
file contains some strings that are accessed via SIMD instructions, then the
.cstring section in the object file will be 16-byte-aligned (since SIMD
requires its operand addresses to be 16-byte aligned). However, there will
typically also be other cstrings in the same file that aren't used via SIMD
and don't need this alignment. They will be emitted at some arbitrary address
A, but ld64 will treat them as being 16-byte aligned with an offset of
16 % A.

I have verified that the two repros in https://github.com/llvm/llvm-project/issues/50135
work well with the new alignment behavior.

Fixes https://github.com/llvm/llvm-project/issues/54036.

Diff Detail

Event Timeline

int3 created this revision.Mar 9 2022, 2:37 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 9 2022, 2:37 PM
Herald added a subscriber: pengfei. · View Herald Transcript
int3 requested review of this revision.Mar 9 2022, 2:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 2:37 PM
int3 edited the summary of this revision. (Show Details)Mar 9 2022, 2:56 PM
int3 edited the summary of this revision. (Show Details)
oontvoo added a subscriber: oontvoo.Mar 9 2022, 6:34 PM
oontvoo added inline comments.
lld/MachO/SyntheticSections.cpp
1366–1376

should we note soemthing about this in the lld-vs-ld64.rst doc?
(Not this implementation detail particularly but rather the possible observable difference in the final binaries)

1419–1429

how's this?
just to make it a bit clearer you're updating the entry - otherwise it's a bit easy to miss at first glance.

(related nit: could have a static constant (eg., TombStone) so we dont have to remember using UINT64_MAX)

lld/MachO/SyntheticSections.h
534

just delete the whole decl? (default ctor is implicitly provided already)

541

(consistency with "isec")

int3 edited the summary of this revision. (Show Details)Mar 9 2022, 7:35 PM
int3 marked 4 inline comments as done.Mar 9 2022, 7:58 PM
int3 added inline comments.
lld/MachO/SyntheticSections.cpp
1366–1376

Done (but with a rather brief note). I don't think it's worth diving into the specifics of e.g. 16 * k + 2 (maybe that's what you meant by 'implementation detail'), that would just confuse most end users... but yeah, on the off chance that some build depends on this exact behavior, it could help someone triage the problem. Thanks!

1419–1429

excellent suggestion, it's clearer this way :)

re static constant, we already use UINT*_MAX as a sentinel value in a lot of other places in the code (both in LLD/ELF and LLD/MachO), so I think it's a pretty common pattern overall and not likely to cause confusion

lld/MachO/SyntheticSections.h
541

outSecOff is already used in a lot of other places though, so I'm just following suit here. Arguably it's inconsistent with isec, but the counterargument is that member names should be more explicit/verbose than local variable names.

FWIW, LLD/ELF uses outSecOff together with isec as well.

int3 updated this revision to Diff 414266.Mar 9 2022, 7:58 PM
int3 marked 3 inline comments as done.

address comments

lld/MachO/SyntheticSections.cpp
1377–1396

I've added this paragraph to further make the case that it's unlikely for any build to depend on this specific behavior.

int3 edited the summary of this revision. (Show Details)Mar 9 2022, 9:02 PM
oontvoo accepted this revision.Mar 10 2022, 7:33 AM

LG thanks!

This revision is now accepted and ready to land.Mar 10 2022, 7:33 AM
This revision was automatically updated to reflect the committed changes.