This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Don't keep demangled names in memory after indexing
ClosedPublic

Authored by JDevlieghere on Feb 2 2022, 9:59 AM.

Details

Summary

The symbol table needs to demangle all symbol names when building its index. However, this doesn't require the full mangled name: we only need the base name and the function declaration context. Currently, we always construct the demangled string during indexing and cache it in the string pool as a way to speed up future lookups. Constructing the demangled string is by far the most expensive step of the demangling process, because the output string can be exponentially larger than the input and unless you're dumping the symbol table, many of those demangled names will not be needed again.

This patch avoids constructing the full demangled string when we can partially demangle. This speeds up indexing and reduces memory usage.

Diff Detail

Event Timeline

JDevlieghere created this revision.Feb 2 2022, 9:59 AM
JDevlieghere requested review of this revision.Feb 2 2022, 9:59 AM

the amount of memory we save here is well worth the small performance hit

any numbers to share?

the amount of memory we save here is well worth the small performance hit

any numbers to share?

We have one scenario where this bring memory usage after attach down from 7gb to 1.8gb. (rdar://86413848) This is definitely an extreme example though.

This patch does have one side-effect that I know of, which is that if we don't keep demangled names in the symbol table, then breaking on an overload by demangled name won't work. OTOH, that only worked if you exactly matched the demangler's output - no extra spaces, etc. We don't have a smart matcher for overloads, so I'm not sure that requiring you to supply the mangled name instead in this case is much of a burden.

And if we want to actually have a nice way to break on overloads, it would involve parsing the incoming specification, extracting the method name and arguments, finding the method name matches and then doing some kind of fuzzy match against the arguments. So we wouldn't need to keep all the demangled strings for that purpose either.

IMO the UE for breaking on overloads is not currently good enough that it outweighs the savings we get from not storing all the demangled names. And if this is important to somebody's workflow, they can always get the old behavior back by reversing the setting value.

JDevlieghere edited the summary of this revision. (Show Details)

The symbol table index is the only caller to DemangleWithRichManglingInfo, so I removed the optional argument. Based on Pavel's observations in D118812 I also removed the setting because the current behavior defeats the purpose of partial demangling.

Before
Memory usage: 280MB

Benchmark 1: ./bin/lldb -n Slack -o quit
  Time (mean ± σ):      4.829 s ±  0.518 s    [User: 4.012 s, System: 0.208 s]
  Range (min … max):    4.624 s …  6.294 s    10 runs

After
Memory usage: 189MB

Benchmark 1: ./bin/lldb -n Slack -o quit
  Time (mean ± σ):      4.182 s ±  0.025 s    [User: 3.536 s, System: 0.192 s]
  Range (min … max):    4.152 s …  4.233 s    10 runs

LGTM if the RichManglingContext isn't actually demangling the string anyway, no need to force it to do so and cache the results.

lldb/source/Symbol/Symtab.cpp
387–388

I was checking this function out as the only user of RichManglingContext. Seems that we could modify RichManglingContext::ParseFunctionBaseName() to just return the base name? Seems weird to call parse and then fetch. Could also just rename to "StringRef RichManglingContext::GetFunctionBaseName().

In fact the RichManglingContext class has the notion of an internal buffer where people call some parse routine and then call RichManglingContext::GetBufferRef()? Seems like we should just get rid of the internal buffer and just return it each time from all of the RichManglingContext::ParseXXX routines. Not needed, but just something I found interesting about the RichManglingContext class implementation.

JDevlieghere marked an inline comment as done.Feb 3 2022, 4:02 PM
JDevlieghere added inline comments.
lldb/source/Symbol/Symtab.cpp
387–388

Agreed. I created a separate patch for that: https://reviews.llvm.org/D118953

clayborg accepted this revision.Feb 3 2022, 4:58 PM
This revision is now accepted and ready to land.Feb 3 2022, 4:58 PM

This seems fine, though it's not clear to me what is the effect of this patch in terms of functionality. Does the "side-effect" mentioned by Jim still apply here, or is this NFC now? Either is probably fine, but I'd like to understand what is going on. It seems like it should be NFC, but does that mean that the demangling (and the cpu/memory cost) is delayed until the first operation which requests it (such as matching a breakpoint by the full demangled name) ?

lldb/source/Core/Mangled.cpp
198

I guess it would be more correct to call this GetRichManglingInfo now

This seems fine, though it's not clear to me what is the effect of this patch in terms of functionality. Does the "side-effect" mentioned by Jim still apply here, or is this NFC now? Either is probably fine, but I'd like to understand what is going on. It seems like it should be NFC, but does that mean that the demangling (and the cpu/memory cost) is delayed until the first operation which requests it (such as matching a breakpoint by the full demangled name) ?

I haven't gone back to read our lookups in detail, but I certainly hope that the first time we see a breakpoint on a symbol name we don't recognize, we wouldn't go demangling every symbol name in the system. We really try to keep mistypings from cascading into "unpack the entire world" events.

JDevlieghere marked an inline comment as done.Feb 4 2022, 9:35 AM

This seems fine, though it's not clear to me what is the effect of this patch in terms of functionality. Does the "side-effect" mentioned by Jim still apply here, or is this NFC now? Either is probably fine, but I'd like to understand what is going on. It seems like it should be NFC, but does that mean that the demangling (and the cpu/memory cost) is delayed until the first operation which requests it (such as matching a breakpoint by the full demangled name) ?

I haven't gone back to read our lookups in detail, but I certainly hope that the first time we see a breakpoint on a symbol name we don't recognize, we wouldn't go demangling every symbol name in the system. We really try to keep mistypings from cascading into "unpack the entire world" events.

Yes, this does break the ability to set breakpoints on full demangled names. Based on the code and the comments, it really looks like it was always the intention to avoid demangling the whole name, but then (accidentally?) made it work by storing it in the ConstString. The continue on line 333 is what prevents us from indexing the full name. Before this patch, GetDemangledName would return the cached full demanged name, which now isn't cached and would have to be computed on demand (effectively defeating the purpose of this patch and making things slower).

This seems fine, though it's not clear to me what is the effect of this patch in terms of functionality. Does the "side-effect" mentioned by Jim still apply here, or is this NFC now? Either is probably fine, but I'd like to understand what is going on. It seems like it should be NFC, but does that mean that the demangling (and the cpu/memory cost) is delayed until the first operation which requests it (such as matching a breakpoint by the full demangled name) ?

I haven't gone back to read our lookups in detail, but I certainly hope that the first time we see a breakpoint on a symbol name we don't recognize, we wouldn't go demangling every symbol name in the system. We really try to keep mistypings from cascading into "unpack the entire world" events.

Yes, this does break the ability to set breakpoints on full demangled names. Based on the code and the comments, it really looks like it was always the intention to avoid demangling the whole name, but then (accidentally?) made it work by storing it in the ConstString. The continue on line 333 is what prevents us from indexing the full name. Before this patch, GetDemangledName would return the cached full demanged name, which now isn't cached and would have to be computed on demand (effectively defeating the purpose of this patch and making things slower).

We really should come up with a good story for symbol lookup on function names that include arguments, but "match the exact output of the demangler" was never a good story. And I don't think we need pre-demangled names to do it right, rather we should pull the method name out of the user specification, find the matches to that - which goes quickly 'cause it uses the name chopper indices - then winnow down the matches. Since this is just for function name matches, we could even do smart stuff like (this is spitballing, not a proposed design):

(lldb) break set -n foo::bar::bar(int, *)

meaning, the first parameter has to be an int, I don't care about the others, etc.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2022, 10:18 AM

This seems fine, though it's not clear to me what is the effect of this patch in terms of functionality. Does the "side-effect" mentioned by Jim still apply here, or is this NFC now? Either is probably fine, but I'd like to understand what is going on. It seems like it should be NFC, but does that mean that the demangling (and the cpu/memory cost) is delayed until the first operation which requests it (such as matching a breakpoint by the full demangled name) ?

I haven't gone back to read our lookups in detail, but I certainly hope that the first time we see a breakpoint on a symbol name we don't recognize, we wouldn't go demangling every symbol name in the system. We really try to keep mistypings from cascading into "unpack the entire world" events.

Yes, this does break the ability to set breakpoints on full demangled names. Based on the code and the comments, it really looks like it was always the intention to avoid demangling the whole name, but then (accidentally?) made it work by storing it in the ConstString. The continue on line 333 is what prevents us from indexing the full name.

That's what I was missing. Thanks.

lldb/test/API/macosx/dyld-trie-symbols/TestDyldTrieSymbols.py
41

I guess this is no longer true