This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Index symbols share storage within a slab.
ClosedPublic

Authored by sammccall on Dec 21 2017, 2:32 AM.

Details

Summary

Symbols are not self-contained - it's only safe to hand them out if you
guarantee the lifetime of the underlying data.

Memory usage tests:
I loaded all LLVM project symbols into a single slab.
Using the old implementation, this was 55MB, plus some malloc overhead for out-of-line string allocations that I couldn't easily measure.
Using the new implementation, this was 27MB - we saved half.
(I switched SymbolSlab from DenseMap to vector for both halves of this test to make measurement easier - this is a change @ilya-biryukov suggested would make sense to do for real in a followup patch)

Event Timeline

sammccall created this revision.Dec 21 2017, 2:32 AM
sammccall updated this revision to Diff 127849.Dec 21 2017, 2:32 AM

Don't intern unless the symbol was actually inserted.

ilya-biryukov added inline comments.Dec 21 2017, 6:03 AM
clangd/index/Index.h
134

A comment on why we use BumpPtrAllocator here might be useful.
I.e., it uses more memory than malloc, but we're getting better data locality. (I hope that I got its intention right)

sammccall edited the summary of this revision. (Show Details)Dec 21 2017, 6:39 AM
hokein accepted this revision.Dec 21 2017, 6:53 AM

Nice, LGTM.

clangd/index/Index.h
106–107

Do you want to remove this FIXME now or later?

126

nit: worth a comment here.

This revision is now accepted and ready to land.Dec 21 2017, 6:53 AM
sammccall marked an inline comment as done.Dec 21 2017, 6:54 AM
sammccall added inline comments.
clangd/index/Index.h
134

It's strictly better than malloc for us I think:

  • no malloc bookkeeping/padding memory overhead
  • we make lots of tiny allocations, they're much cheaper in CPU
  • the locality you mentioned

There's no extra memory usage: stringset doesn't use the allocator to allocate the hashtable, just the nodes.

Added a comment.

sammccall updated this revision to Diff 127877.Dec 21 2017, 6:56 AM
sammccall marked 4 inline comments as done.

Comment changes requiested in review.

This revision was automatically updated to reflect the committed changes.