Previously, we had a hash table containing strings and their offsets to
manage mergeable strings. Technically we can live without it, because we
can do a binary search on a vector of mergeable string pieces to find the
piece for a given offset. The table was there only to speed up
offset -> string piece lookup.
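
For reference, the binary-search lookup can be sketched roughly as below.
This is a minimal illustration, not the actual lld code: `Piece`,
`inputOff` and `findPiece` are hypothetical names, and the sketch assumes
the pieces are sorted by start offset and that the first piece starts at
offset 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a mergeable string piece: the offset at which
// the piece starts in the input section, plus its contents.
struct Piece {
  uint64_t inputOff;
  std::string data;
};

// Returns the index of the piece that contains `offset`.
// Assumes `pieces` is non-empty and sorted by inputOff, and that `offset`
// is not smaller than the first piece's start offset.
size_t findPiece(const std::vector<Piece> &pieces, uint64_t offset) {
  assert(!pieces.empty() && pieces.front().inputOff <= offset);
  // Find the first piece that starts strictly after `offset` ...
  auto it = std::upper_bound(
      pieces.begin(), pieces.end(), offset,
      [](uint64_t off, const Piece &p) { return off < p.inputOff; });
  // ... so the piece just before it is the one containing `offset`.
  return static_cast<size_t>(std::prev(it) - pieces.begin());
}
```

Each lookup costs O(log n) comparisons instead of an expected O(1) hash
probe, but it needs no memory beyond the vector itself.
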
We recently observed that lld tends to consume more memory than gold when
linking executables with debug info, and we found that a few percent of
that memory is consumed by the hash table. I wondered if we could save
memory here, so I ran a few benchmarks with and without the hash table.
Here are the results.

Speed (seconds, measured by `perf stat -r10`)

  Program           w/patch        w/o patch      Slowdown
  chrome            2.004718511    1.988568454     0.81%
  clang             0.518536707    0.503528155     2.98%
  clang-fsds        0.566316070    0.551120769     2.75%
  clang-gdb-index   5.091346130    5.052952745     0.75%
  gold              0.325922095    0.318968615     2.17%
  gold-fsds         0.353570003    0.342080243     3.35%
  linux-kernel      0.872819415    0.866891291     0.68%
  llvm-as           0.057709114    0.054467349     5.95%
  llvm-as-fsds      0.054338842    0.053751006     1.09%
  mozilla           3.730120566    3.836634236    -2.77%
  scylla            1.011386393    1.031144224    -1.91%

Maximum RSS (KB, measured by `time -v`)

  Program           w/patch     w/o patch   Memory saving
  chrome             1163088      1163572       0.04%
  clang               369364       373760       1.17%
  clang-fsds          392484       396712       1.06%
  clang-gdb-index   10391636     10391680       0.00%
  gold                227508       229108       0.69%
  gold-fsds           237308       238960       0.69%
  linux-kernel        143028       146932       2.65%
  llvm-as              46792        46976       0.39%
  llvm-as-fsds         47112        47336       0.47%
  mozilla            4631424      4833940       4.18%
  scylla             2712036      2793468       2.91%

The slowdown is not negligible, but lld got noticeably slower only when
the program being linked is small. For large programs, the regression is
small, or lld even got faster. Given that, I don't think keeping the hash
table is a good tradeoff anymore; we should drop the hash table to save
memory.