HomePhabricator

[lld-macho] Teach ICF to dedup functions with identical unwind info

Authored by int3 on Nov 12 2021, 1:01 PM.

Description

[lld-macho] Teach ICF to dedup functions with identical unwind info

Dedup'ing unwind info is tricky because each CUE contains a different
function address, if ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on relocateCompactUnwind() to correctly handle
these truncated input sections.

Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:

Without any optimizations:

           base           diff           difference (95% CI)
sys_time   0.849 ± 0.015  0.896 ± 0.012  [  +4.8% ..   +6.2%]
user_time  3.357 ± 0.030  3.512 ± 0.023  [  +4.3% ..   +5.0%]
wall_time  3.944 ± 0.039  4.032 ± 0.031  [  +1.8% ..   +2.6%]
samples    40             38

With -dead_strip:

           base           diff           difference (95% CI)
sys_time   0.847 ± 0.010  0.896 ± 0.012  [  +5.2% ..   +6.5%]
user_time  3.377 ± 0.014  3.532 ± 0.015  [  +4.4% ..   +4.8%]
wall_time  3.962 ± 0.024  4.060 ± 0.030  [  +2.1% ..   +2.8%]
samples    47             30

With -dead_strip and --icf=all:

           base           diff           difference (95% CI)
sys_time   0.935 ± 0.013  0.957 ± 0.018  [  +1.5% ..   +3.2%]
user_time  3.472 ± 0.022  6.531 ± 0.046  [ +87.6% ..  +88.7%]
wall_time  4.080 ± 0.040  5.329 ± 0.060  [ +30.0% ..  +31.2%]
samples    37             30

Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.

Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D109946

Details

Committed
int3Nov 12 2021, 1:02 PM
Reviewer
Restricted Project
Differential Revision
D109946: [lld-macho] Teach ICF to dedup functions with identical unwind info
Parents
rG6c32dd4dfafe: [AArch64][x86] add tests for swapped cmp+vselect patterns; NFC
Branches
Unknown
Tags
Unknown