This is an archive of the discontinued LLVM Phabricator instance.

Create synthetic symbol names on demand to improve memory consumption and startup times.
Closed · Public

Authored by clayborg on Jun 29 2021, 4:28 PM.

Details

Summary

This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of each symbol name, which prevents symbols from different shared libraries from sharing names in the constant string pool.
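The effect of the basename suffix on string sharing can be sketched as follows. This is a minimal illustration, not LLDB code: `CountPooledNames` is a hypothetical helper, the `std::unordered_set` only stands in for a constant string pool, and the `$$` separator before the basename is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: counts the unique strings a pool would hold when each
// library in `basenames` contributes `count` unnamed symbols.
std::size_t CountPooledNames(const std::vector<std::string> &basenames,
                             unsigned count, bool append_basename) {
  std::unordered_set<std::string> pool; // stand-in for a constant string pool
  for (const std::string &basename : basenames) {
    for (unsigned id = 1; id <= count; ++id) {
      std::string name = "___lldb_unnamed_symbol" + std::to_string(id);
      if (append_basename)
        name += "$$" + basename; // assumed separator, for illustration only
      pool.insert(name);         // duplicates collapse into one entry
    }
  }
  return pool.size();
}
```

With the suffix, every library contributes its own set of strings; without it, libraries that use the same IDs reuse the same pooled strings.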

Prior to this fix, LLDB was creating 180MB of "___lldb_unnamed_symbol" symbol names, and it took a long time to generate each name, add it to the string pool, and then add it to the name index.

This patch fixes the issue by:

  • not adding a name to synthetic symbols at creation time, and allowing the name to be generated dynamically when it is accessed
  • not adding synthetic symbol names to the name indexes, but catching this special case at name lookup time. Users won't typically set breakpoints on or look up these synthetic names, but support was added to do the lookup in case it does happen
  • removing the object file basename from the generated names to allow the names to be shared in the constant string pool
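The lookup special case in the second bullet could look roughly like the sketch below. It is an assumption-laden illustration rather than the patch's code: `ParseSyntheticSymbolID` and `kSyntheticPrefix` are hypothetical names, and the ID is assumed to be written in decimal directly after the prefix. A caller would use the returned ID to find the symbol by UserID instead of consulting the name index.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

constexpr std::string_view kSyntheticPrefix = "___lldb_unnamed_symbol";

// Hypothetical helper: if `name` is the synthetic prefix followed only by
// decimal digits, return the parsed ID; otherwise return std::nullopt.
std::optional<std::uint32_t> ParseSyntheticSymbolID(std::string_view name) {
  if (name.size() <= kSyntheticPrefix.size() ||
      name.substr(0, kSyntheticPrefix.size()) != kSyntheticPrefix)
    return std::nullopt;
  std::string_view digits = name.substr(kSyntheticPrefix.size());
  const char *end = digits.data() + digits.size();
  std::uint32_t id = 0;
  auto [ptr, ec] = std::from_chars(digits.data(), end, id);
  // Reject overflow, non-digits, and trailing characters after the number.
  if (ec != std::errc() || ptr != end)
    return std::nullopt;
  return id;
}
```

Because the ID is recovered from the name itself, nothing has to be stored in the name index for these symbols.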

Prior to this fix, the startup times for a large application were:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)

After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)

The names of the symbols are auto-generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string. This is done only when the name is requested from a synthetic symbol that has no name.
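A minimal sketch of this on-demand naming, assuming a simplified stand-in type: `SketchSymbol` and its members are hypothetical and are not LLDB's actual `Symbol` class.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a symbol table entry.
struct SketchSymbol {
  std::uint32_t user_id = 0;
  bool is_synthetic = false;
  std::string stored_name; // left empty for synthetic symbols at creation

  // Returns the stored name if there is one; otherwise, for a synthetic
  // symbol, builds the name from the prefix and the user ID on each request.
  std::string GetName() const {
    if (!stored_name.empty() || !is_synthetic)
      return stored_name;
    return "___lldb_unnamed_symbol" + std::to_string(user_id);
  }
};
```

Nothing is allocated for a synthetic symbol's name until `GetName()` is called, which is the memory and startup saving the summary describes.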

Event Timeline

clayborg created this revision. Jun 29 2021, 4:28 PM
clayborg requested review of this revision. Jun 29 2021, 4:28 PM
Herald added a project: Restricted Project. Jun 29 2021, 4:28 PM

@stella.stamenova all test suite failures should be fixed now. This is a redo on https://reviews.llvm.org/D104488 which was reverted in less than a day...

lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2829

This is a bug that I tried to fix by putting "entry_point_addr.GetOffset()" instead of zero here. Restoring to zero to make sure that:

lldb-shell :: SymbolFile/dissassemble-entry-point.s
lldb-unit :: ObjectFile/ELF/./ObjectFileELFTests.exe/ObjectFileELFTest.GetSymtab_NoSymEntryPointArmThumbAddressClass

don't fail. I will fix this bug in another patch.

lldb/test/Shell/ObjectFile/ELF/eh_frame-symbols.yaml
6–8

This fixes the following previously failing test:

lldb-shell :: ObjectFile/ELF/eh_frame-symbols.yaml

lldb/test/Shell/SymbolFile/Breakpad/symtab.test
8

This fixes the following previously failing test:

lldb-shell :: SymbolFile/Breakpad/symtab.test

@stella.stamenova all test suite failures should be fixed now. This is a redo on https://reviews.llvm.org/D104488 which was reverted in less than a day...

@clayborg : Thanks! We ended up with several more failing tests because of issues that were committed while the bot was red and was not sending notifications. It's always good to keep things green, so that issues don't pile up. I'm sorry I didn't get to it sooner.

wallace accepted this revision. Jun 29 2021, 4:50 PM

Crossing fingers

This revision is now accepted and ready to land. Jun 29 2021, 4:50 PM
This revision was landed with ongoing or failed builds. Jun 29 2021, 5:44 PM
This revision was automatically updated to reflect the committed changes.

I'm sorry to be the bearer of bad news, but this breaks macosx/dyld-trie-symbols/TestDyldTrieSymbols.py. Given that it's Friday, I'm going to revert to turn the bot green for the (holiday) weekend.

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/33379/testReport/junit/lldb-api/macosx_dyld-trie-symbols/TestDyldTrieSymbols_py/