This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Don't clone SymbolSlab::Builder arenas when finalizing.
ClosedPublic

Authored by sammccall on Oct 4 2022, 5:35 PM.

Details

Summary

SymbolSlab::Builder has an arena to store strings of owned symbols, and
deduplicates them. build() copies all the strings and deduplicates them again!
This is potentially useful: we may have overwritten a symbol and
rendered some strings unreachable.

However in practice this is not the case. When testing on a variety of
files in LLVM (e.g. SemaExpr.cpp), the strings for the full preamble
index are 3MB and shrink by 0.4% (12KB). For comparison the serializde
preamble is >50MB.
There are also hundreds of smaller slabs (file sharding) that do not shrink at
all.

CPU time spent on this is significant (something like 3-5% of buildPreamble).
We're better off not bothering.

Diff Detail

Event Timeline

sammccall created this revision.Oct 4 2022, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2022, 5:35 PM
sammccall requested review of this revision.Oct 4 2022, 5:35 PM

My first take on this was to keep track in various ways of whether GC was required (statically known at callsite, or have builder track whether symbols were overwritten).
However these failed to identify a bunch of cases where the arena wasn't shrinking, and this seems much simpler.

adamcz accepted this revision.Oct 5 2022, 9:35 AM
This revision is now accepted and ready to land.Oct 5 2022, 9:35 AM