This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Remove CPlusPlusLanguage from Mangled
ClosedPublic

Authored by bulbazord on Jun 30 2021, 11:16 AM.

Details

Summary

The only remaining plugin dependency in Mangled is CPlusPlusLanguage which it
uses to extract information from C++ mangled names. The static function
GetDemangledNameWithoutArguments is written specifically for C++, so it
would make sense for this specific functionality to live in a
C++-related plugin. In order to keep this functionality in Mangled
without maintaining this dependency, I added
Language::GetDemangledFunctionNameWithoutArguments.

Diff Detail

Event Timeline

bulbazord requested review of this revision.Jun 30 2021, 11:16 AM
bulbazord created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2021, 11:16 AM

I kind of feel that Language::GetDemangledFunctionNameWithoutArguments may be a bit too specific for a generalized language plugins. I think it may be worth it to make Mangled an interface that language plugins can implement (e.g. CPlusPlusMangledName) but I haven't totally thought out what the ramifications of that would be yet.

I kind of feel that Language::GetDemangledFunctionNameWithoutArguments may be a bit too specific for a generalized language plugins. I think it may be worth it to make Mangled an interface that language plugins can implement (e.g. CPlusPlusMangledName) but I haven't totally thought out what the ramifications of that would be yet.

The name is unfortunate, but the notion that function types have an identifier, that is then decorated by arguments and maybe return types, seems pretty common. So in this particular case, maybe we just need a better name? GetBaseName isn't right since this function also returns any namespace information. Maybe GetFullyQualifiedBaseName?

I kind of feel that Language::GetDemangledFunctionNameWithoutArguments may be a bit too specific for a generalized language plugins. I think it may be worth it to make Mangled an interface that language plugins can implement (e.g. CPlusPlusMangledName) but I haven't totally thought out what the ramifications of that would be yet.

The name is unfortunate, but the notion that function types have an identifier, that is then decorated by arguments and maybe return types, seems pretty common. So in this particular case, maybe we just need a better name? GetBaseName isn't right since this function also returns any namespace information. Maybe GetFullyQualifiedBaseName?

This may be my ignorance of other languages speaking, but GetFullyQualifiedBaseName sounds a little more specific to C++. I'm not sure if other languages use this terminology, so I tried to keep it as generic as possible.

ping!

@jingham I can change the name to GetFullyQualifiedBaseName if needed. Is there anything else here that should be changed or addressed?

Mostly I was reacting to your saying the "fully qualified name" idea was too much of a C++ specific thing and didn't belong in the Language abstraction. That doesn't seem to me right. Swift has module names & class namespaces, etc. And you might want to dial up all methods with some base name or narrow down to the one implemented by a class or module in the same way. And we make the distinction between matching base names & qualified names in most of our searches; so this does seem a generally useful accessor.

The name was a way to phrase the access that made sense of it as a general thing, but I'm not stuck on that name . And you didn't make any actual moves to put it just in the C++ Language, so I'm doubly fine with that...

It does seem odd to stick "Demangled" in the name as that does seem more an implementation detail. But my main point was that I did think this belongs in the Language plugin, so I'm good.

Jim

clayborg added inline comments.
lldb/source/Core/Mangled.cpp
322–324

Maybe we should make a Language::FindPlugin(...) that like:

Language *Language::FindPlugin(Mangled::ManglingScheme mangling_scheme);

Should be easy to add since this change is kind of about refactoring and putting the code into plug-ins. It is essentially what "lldb::LanguageType Mangled::GuessLanguage() const" is doing. That code could be moved to where Language::FindPlugin(...) lives.

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
67

We are doing the work over and over here. Who calls this? Is this something we should think about caching? It would be easy to make a map of ConstString (demangled name) to ConstString (result of this function call) and we can possibly improve the efficiency of this.

bulbazord added inline comments.Jul 27 2021, 3:01 PM
lldb/source/Core/Mangled.cpp
322–324

GuessLanguage has a bunch of uses outside of Mangled itself, so I think that would make more sense if we performed other refactors first.

I don't think putting this into another Language::FindPlugin function is a good idea because (as I understand it) mangling schemes aren't specific to languages.

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
67

The only use of GetDemangledFunctionNameWithoutArguments is Mangled::GetName, and the only time it's ever actually executed is if the argument to GetName is ePreferDemangledWithoutArguments. This happens twice from what I can see: Symbol::GetNameNoArguments and Function::GetNameNoArguments. I believe these methods are used for dumping and formatting.

Not sure how often they are actually used though. May be useful to profile and potentially cache if it's used enough?

My comments were mostly about how we should refactor in a perfect world. Not sure we need to solve that right away. I am ok with this patch as it is if Jim is.

Jim? What do you think about this patch? OK? I am good if you are.

lldb/source/Core/Mangled.cpp
322–324

We _are_ using GuessLanguage to guess the language and we are using a language specific plug-in to do the work. So effectively it is the same thing.

I wonder if GetName(...) should take an extra Language parameter? If a lldb_private::Function wants the name as , then we _do_ know the language, then it could send the language in for it to be more correct. Itanium name mangling can mangle different languages right? Is there any way to differentiate correctly without actually knowing the language?

jingham accepted this revision.Jul 27 2021, 3:40 PM

I still think the "Demangled" part of the GetDemangledFunctionNameWithoutArguments is redundant. I'm not sure what sense it would make to get a mangled name without arguments? And you could have a language that doesn't mangle symbols but still has namespaces, classes & overloading, for which the notion of getting the full name w/o arguments makes sense even though no mangling/demangling is involved. But the current name isn't horrible.

We also have a general awkwardness with demangling names using Languages because C++ by itself is underdetermined - you really need {C++, mangling scheme}. We just end up guessing about the mangling based on symbols - or in this case so far as I can see we only ever handle the Itanium mangling scheme.

But this patch doesn't make that worse. So this is fine by me.

This revision is now accepted and ready to land.Jul 27 2021, 3:40 PM

Thanks for the reviews and comments! It's kind of unfortunate that LLDB is like this right now but I believe this patch is a step in the right direction. :)

lldb/source/Core/Mangled.cpp
322–324

Right, I understand. *Ideally* we wouldn't be doing that, but...

What you're suggesting might be better in the long run (using language information where available) but unfortunately I think that's far from where we are at currently. As for itanium name mangling, I don't think you can reliably determine the language the symbol came from unfortunately. Some languages have their own name mangling scheme luckily, so it may be easier for some languages.

This revision was landed with ongoing or failed builds.Jul 29 2021, 2:14 PM
This revision was automatically updated to reflect the committed changes.