This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Decouple ObjCLanguage from Symtab
ClosedPublic

Authored by bulbazord on Jun 10 2021, 2:59 PM.

Details

Summary

We can extend/modify GetMethodNameVariants to suit our purposes here.
What symtab is looking for is alternate names we may want to use to
search for a specific symbol, and asking for variants of a name makes
the most sense here.

It might make more sense to wrap the ConstString and FunctionNameType
into a struct with a name, but for now I think a pair suffices.

Diff Detail

Event Timeline

bulbazord created this revision.Jun 10 2021, 2:59 PM
bulbazord requested review of this revision.Jun 10 2021, 2:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2021, 2:59 PM

Note that this is working towards solving the same problem as D103210 but it does so in a different way. Instead of repurposing the old diff, I made a new one so it is a little easier to compare/contrast. Hopefully it's not too confusing.

This looks pretty good to me.

It's a little awkward in InitNameIndexes that we look up the various NameToSymbolIndex maps by eFunctionNameType, use the function name type again to sort the names & index pairs into the bucket we looked up before. I wonder if that could be made cleaner by having an

AddToSymbolNameToIndexMap(symbol_name, index, func_name_type)

interface, which would just sort the symbol names into the right map. Not sure that's worth the bother, however.

lldb/source/Symbol/Symtab.cpp
343–353

Shouldn't this be in a loop over the supported languages?

teemperor requested changes to this revision.Jun 11 2021, 12:38 AM

This looks pretty good to me.

It's a little awkward in InitNameIndexes that we look up the various NameToSymbolIndex maps by eFunctionNameType, use the function name type again to sort the names & index pairs into the bucket we looked up before. I wonder if that could be made cleaner by having an

AddToSymbolNameToIndexMap(symbol_name, index, func_name_type)

interface, which would just sort the symbol names into the right map. Not sure that's worth the bother, however.

That sounds good to me as a follow-up refactoring.

Only some small complains but otherwise this seems pretty good. Someone (*puts finger on nose*) should maybe do some stats whether

lldb/source/Breakpoint/BreakpointResolverName.cpp
222–223

Shouldn't that use the type form the variant instead of the name_type_mask? FWIW, I would prefer if we keep this code unchanged as right now we introduce the selectors to this list of lookups.

So what about adding filter here for eFunctionNameTypeFull to keep this patch NFC? And then maybe a FIXME: to figure out if we should add variants that aren't the full name.

lldb/source/Plugins/Language/ObjC/ObjCLanguage.h
104

Could we make this a custom struct? first second are always so non-descriptive and we might want to extend what we return from this function in a future patch.

class MethodNameVariant {
  ConstString m_name;
  lldb::FunctionNameType m_type;
public:
  [...]
}
lldb/source/Symbol/Symtab.cpp
343–353

+1

I also wonder if this might become quite expensive in the future if we end up with more language plugins, but I guess in that case we can make some kind of initial query pass where plugins can register whether they care about this indexing stuff here. Anyway, I don't think that's a real concern at the moment.

This revision now requires changes to proceed.Jun 11 2021, 12:38 AM
shafik added a subscriber: shafik.Jun 11 2021, 11:29 AM
shafik added inline comments.
lldb/source/Plugins/Language/ObjC/ObjCLanguage.h
104

I second (pun intended?) this, if we can avoid using std::pair and instead use a custom struct we should.

bulbazord updated this revision to Diff 351696.Jun 12 2021, 7:58 PM

Addressing comments

teemperor requested changes to this revision.Jun 21 2021, 4:18 AM

ping!

Sorry, feel free to me directly sooner :)

I ran some benchmarks and this changed the average runtime of InitNameIndexes from 2.63s to 2.85s when debugging LLDB, which is ok-ish. The overhead seems to be from the language-plugin iteration and the new checks for the different function name types. We can't skip the new function name type checks in the new general approach, but we can avoid the expensive Language::ForEach call. If we just once collect the Language plugins in a list and then iterate over that we get down to an average runtime of 2.66s (which is pretty much just what we had before);

diff --git a/lldb/source/Symbol/Symtab.cpp b/lldb/source/Symbol/Symtab.cpp
index b61af37a356d..95c57243f20d 100644
--- a/lldb/source/Symbol/Symtab.cpp
+++ b/lldb/source/Symbol/Symtab.cpp
@@ -263,6 +263,13 @@ void Symtab::InitNameIndexes() {
     m_name_indexes_computed = true;
     LLDB_SCOPED_TIMER();
 
+    // Collect all loaded language plugins.
+    std::vector<Language *> languages;
+    Language::ForEach([&languages](Language *l) {
+      languages.push_back(l);
+      return true;
+    });
+
     auto &name_to_index = GetNameToSymbolIndexMap(lldb::eFunctionNameTypeNone);
     auto &basename_to_index =
         GetNameToSymbolIndexMap(lldb::eFunctionNameTypeBase);
@@ -329,7 +336,7 @@ void Symtab::InitNameIndexes() {
 
         // If the demangled name turns out to be an ObjC name, and is a category
         // name, add the version without categories to the index too.
-        auto map_variant_names = [&](Language *lang) {
+        for (Language *lang : languages) {
           for (auto variant : lang->GetMethodNameVariants(name)) {
             if (variant.GetType() & lldb::eFunctionNameTypeSelector)
               selector_to_index.Append(variant.GetName(), value);
@@ -340,10 +347,7 @@ void Symtab::InitNameIndexes() {
             else if (variant.GetType() & lldb::eFunctionNameTypeBase)
               basename_to_index.Append(variant.GetName(), value);
           }
-          return true;
         };
-
-        Language::ForEach(map_variant_names);
       }
     }

Beside that just two nits and this is ready to go. Thanks!

lldb/include/lldb/Target/Language.h
194

Both functions can be const

lldb/source/Plugins/Language/ObjC/ObjCLanguage.h
103–104

// Also returns the (the "We..." isn't something that is usually in doxygen comments which this comment is aspiring to be).

This revision now requires changes to proceed.Jun 21 2021, 4:18 AM
bulbazord updated this revision to Diff 353484.Jun 21 2021, 2:09 PM

Addressed comments! :)

teemperor accepted this revision.Jun 22 2021, 1:09 AM

LGTM, thanks!

This revision is now accepted and ready to land.Jun 22 2021, 1:09 AM
This revision was landed with ongoing or failed builds.Jun 23 2021, 1:57 PM
This revision was automatically updated to reflect the committed changes.