This is an archive of the discontinued LLVM Phabricator instance.

[MachO] Shrink reloc from 32 bytes to 24 bytes
ClosedPublic

Authored by smeenai on Nov 12 2021, 4:23 PM.

Details

Reviewers
gkm
int3
Group Reviewers
Restricted Project
Commits
rG93bf271f2743: [MachO] Shrink reloc from 32 bytes to 24 bytes
Summary

The r_address field of relocation_info is only 4 bytes, so our
offset field (which is the r_address field adjusted for subsection
splitting) also only needs to be 4 bytes. This reduces the structure
size from 32 bytes to 24 bytes.

Combined with https://reviews.llvm.org/D113813, this is a minor perf
improvement for linking an internal app, tested on two machines:

           smol-relocs     baseline        difference (95% CI)
sys_time   7.367 ± 0.138   7.543 ± 0.157   [  +0.9% ..   +3.8%]
user_time  21.843 ± 0.351  21.861 ± 0.450  [  -1.3% ..   +1.4%]
wall_time  20.301 ± 0.307  20.556 ± 0.324  [  +0.1% ..   +2.4%]
samples    16              16

           smol-relocs     baseline        difference (95% CI)
sys_time   2.923 ± 0.050   2.992 ± 0.018   [  +1.4% ..   +3.4%]
user_time  10.345 ± 0.039  10.448 ± 0.023  [  +0.8% ..   +1.2%]
wall_time  12.068 ± 0.071  12.229 ± 0.021  [  +1.0% ..   +1.7%]
samples    15              12

More importantly though, this change by itself reduces our maximum
resident set size by 220 MB (2.75%, from 7.85 GB to 7.64 GB) on the
first machine. On the second machine, it reduces it by 125 MB (1.94%,
from 6.31 GB to 6.19 GB).

Diff Detail

Event Timeline

smeenai created this revision.Nov 12 2021, 4:23 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
smeenai requested review of this revision.Nov 12 2021, 4:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2021, 4:23 PM

Turning findContainingSubsection into a template is a little ugly, but there's legitimate cases where the offset can be 64 bits (e.g. embedded addends for relocations), so we need to support both uint32_t * and uint64_t * arguments.

If reviewers prefer, I could instead make findContainingSubsection take the InputSection * as the output argument and return the adjusted offset, so that the return value can just be downcast instead of needing a template.

int3 added a subscriber: int3.Nov 12 2021, 4:35 PM

from 7.85 MB to 7.64 MB

I think you meant GB :)

lld/MachO/InputFiles.cpp
355

hm, can't we leave it as uint64_t and have the uint32_t be automatically promoted?

smeenai edited the summary of this revision. (Show Details)Nov 12 2021, 4:40 PM

from 7.85 MB to 7.64 MB

I think you meant GB :)

Hah, fixed.

lld/MachO/InputFiles.cpp
355

Promotion doesn't happen for pointer arguments (you wouldn't wanna write 8 bytes into a 4 byte storage slot).

int3 accepted this revision.Nov 12 2021, 5:03 PM

Thanks!

lld/MachO/InputFiles.cpp
355

OH right. I knew that, I just couldn't read :P

If reviewers prefer, I could instead make findContainingSubsection take the InputSection * as the output argument and return the adjusted offset, so that the return value can just be downcast instead of needing a template.

Nah this seems ugly in a different way, findFoo functions should generally return Foos...

Maybe add a comment here that we expect T to be either uint32_t or uint64_t?

This revision is now accepted and ready to land.Nov 12 2021, 5:03 PM
This revision was automatically updated to reflect the committed changes.
smeenai added inline comments.
lld/MachO/InputFiles.cpp
355

I was gonna add a type assert but screwed up my push; whoops. @oontvoo fixed that in D114044; thank you!