This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lldb/
-
packages/Python/lldbsuite/test/make/
-
Python/
-
lldbsuite/
-
test/
-
make/
-
Makefile.rules
-
source/Plugins/Language/CPlusPlus/
-
Plugins/
-
Language/
-
CPlusPlus/
-
CPlusPlusLanguage.h
6/7
CPlusPlusLanguage.cpp
-
test/API/lang/cpp/abi_tag_lookup/
-
API/
-
lang/
-
cpp/
-
abi_tag_lookup/
-
Makefile
1/2
TestAbiTagLookup.py
-
main.cpp

Differential D131335

[lldb] abi_tag support 3/3 - Use mangle tree API to determine approximate mangled matches
Needs ReviewPublic

Authored by Michael137 on Aug 6 2022, 10:47 AM.

Download Raw Diff

Details

Reviewers

aprantl
jingham
labath

Summary

When resolving symbols during IR execution, lldb makes a last effort attempt
to resolve external symbols from object files by approximate name matching.
It currently uses CPlusPlusNameParser to parse the demangled function name
and arguments for the unresolved symbol and its candidates. However, this
hand-rolled C++ parser doesn’t support ABI tags which, depending on the demangler,
get demangled into [abi:tag]. This lack of parsing support causes lldb to never
consider a candidate mangled function name that has ABI tags.

The issue reproduces by calling an ABI-tagged template function from the
expression evaluator. This is particularly problematic with the recent
addition of ABI tags to numerous libcxx APIs.

This patch deals with the above symbol resolution issue by using the
Itanium mangle tree API to compare mangled function symbols and no
longer relying on parsing demangled names via CPlusPlusNameParser. Since
the MSVC mangler doesn't expose the mangle tree we continue using the
old matching logic when the candidate symbol has the MSVC mangling
scheme.

Testing

Added API tests

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Michael137 created this revision.Aug 6 2022, 10:47 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2022, 10:47 AM

Harbormaster completed remote builds in B179727: Diff 450560.Aug 6 2022, 10:48 AM

Michael137 added a parent revision: D131334: [lldb] abi_tag support 3/4 - Add char const* overload for SymbolNameFitsToLanguage.Aug 6 2022, 10:48 AM

Michael137 added a parent revision: D131330: [llvm][Demangle] Add getter for FunctionEncoding::Attrs.

Michael137 added a reviewer: jingham.Aug 6 2022, 10:57 AM

Michael137 mentioned this in D131331: [llvm] lldb abi_tag support 1/3 - Add equality comparison API to itanium_demangle::Node.Aug 6 2022, 11:28 AM

Michael137 edited the summary of this revision. (Show Details)Aug 6 2022, 12:10 PM

Rebase
Fix API tests:
- Weaken pre-condition for non-function mangled names
- Allow non-C++ mangled names

Harbormaster completed remote builds in B179875: Diff 450751.Aug 8 2022, 3:45 AM

Michael137 added a reviewer: labath.Aug 8 2022, 8:42 AM

Michael137 published this revision for review.Aug 8 2022, 3:55 PM

Herald added a project: Restricted Project. · View Herald TranscriptAug 8 2022, 3:55 PM

Herald added a subscriber: lldb-commits. · View Herald Transcript

Reword commit message
Replace SymbolNameFitsToLanguage check

Herald added a subscriber: JDevlieghere. · View Herald TranscriptAug 8 2022, 4:06 PM

Harbormaster completed remote builds in B180059: Diff 450992.Aug 8 2022, 4:07 PM

Michael137 retitled this revision from WIP: [lldb] abi_tag support 4/4 - Use mangle tree API to determine approximate mangle matches to WIP: [lldb] abi_tag support 3/3 - Use mangle tree API to determine approximate mangled matches.Aug 8 2022, 4:08 PM

I haven't been able to keep up with all the lldb happenings lately, so it's possible I have missed something, but can you give a summary (or point me to one) of what is the problem with abi tags in lldb. For example, why isn't the DW_AT_linkage_name string sufficient to find the correct function?

I am somewhat sceptical that this is the best way to fix this problem. As you've said yourself, the "alternate name" machinery is a last resort, and the things that it's doing are wrong on several levels (and quite slow). To tell the truth, I've been hoping that we could get rid of it completely one day. Relying on this mechanism for "core" functionality would preclude that from happening...

In D131335#3709788, @labath wrote:

I haven't been able to keep up with all the lldb happenings lately, so it's possible I have missed something, but can you give a summary (or point me to one) of what is the problem with abi tags in lldb. For example, why isn't the DW_AT_linkage_name string sufficient to find the correct function?

I am somewhat sceptical that this is the best way to fix this problem. As you've said yourself, the "alternate name" machinery is a last resort, and the things that it's doing are wrong on several levels (and quite slow). To tell the truth, I've been hoping that we could get rid of it completely one day. Relying on this mechanism for "core" functionality would preclude that from happening...

Unqualified lookup of template functions currently always resorts this "fall-back" mechanism. I investigated this mainly with import-std-module enabled since those were the tests that failed due to D127444. There are two parts to it:

During unqualified lookup clang::Sema asks LLDB for the decl corresponding to the template in question. DWARFASTParserClang adds two decls to the AST, one non-template FunctionDecl marked extern and another TemplateFunctionDecl. Because of unqualified lookup rules in C++ clang::Sema picks the extern FunctionDecl. This is why the generated IR has an unresolved symbol that we then try to find in the object files.

When resolving this external symbol we try to match mangled function names using the hand-rolled CPlusPlusNameParser which doesn't support ABI tags and we fail to find a suitable symbol.

We can fish out the ABI tag from the DW_AT_linkage_name when parsing DWARF but we would still fail at (2). The plan was to address both. First reduce the reliance on CPlusPlusNameParser since that fixes the unqualified lookup issue trivially and is more robust to C++ syntax changes/attribute additions/etc. And then address the unqualified template lookup which has had numerous workarounds added to it over the years (https://reviews.llvm.org/D33025,
https://reviews.llvm.org/D61044, https://reviews.llvm.org/D75761, etc.)

aprantl added inline comments.Aug 9 2022, 10:31 AM

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
619	Could this be `for (ConstString alternate_mangled_name : alternates)` ?
619	Comment what the loop is doing?
624	`alternate_mangled_name.GetLength()` ? It might sill need to call strlen under the hood, but it could do something more efficient.
651	`.`
1544	why not return the vector?
1551	I'd rather ask the conststring for its length again

Another interesting example of how the fallback mechanism produces unexpected result:

namespace A {
struct B {};

template <typename T>             
[[gnu::abi_tag ("vTest")]]        
int insideNS(T t) { return 0; }

[[gnu::abi_tag ("vTest")]]        
int insideNS(int t) { return 1; }
} // namespace A

template <typename T>             
[[gnu::abi_tag ("vTest")]]        
int outsideNS(T t) { return 0; }  
                                  
[[gnu::abi_tag ("vTest")]]        
int outsideNS(int t) { return 1; }

(lldb) p outsideNS(A::B{})    <<< Uses the fallback `outsideNS`
(int) $1 = 1
(lldb) p outsideNS(0)         <<< Uses the fallback `_Z9outsideNSi`
(int) $2 = 1

(inner) p insideNS(A::B{})
error: Couldn't lookup symbols:
(inner) p insideNS(0)
error: Couldn't lookup symbols:

I should add this to the API test

In D131335#3710236, @Michael137 wrote:

In D131335#3709788, @labath wrote:

I haven't been able to keep up with all the lldb happenings lately, so it's possible I have missed something, but can you give a summary (or point me to one) of what is the problem with abi tags in lldb. For example, why isn't the DW_AT_linkage_name string sufficient to find the correct function?

I am somewhat sceptical that this is the best way to fix this problem. As you've said yourself, the "alternate name" machinery is a last resort, and the things that it's doing are wrong on several levels (and quite slow). To tell the truth, I've been hoping that we could get rid of it completely one day. Relying on this mechanism for "core" functionality would preclude that from happening...

Unqualified lookup of template functions currently always resorts this "fall-back" mechanism. I investigated this mainly with import-std-module enabled since those were the tests that failed due to D127444. There are two parts to it:

During unqualified lookup clang::Sema asks LLDB for the decl corresponding to the template in question. DWARFASTParserClang adds two decls to the AST, one non-template FunctionDecl marked extern and another TemplateFunctionDecl. Because of unqualified lookup rules in C++ clang::Sema picks the extern FunctionDecl. This is why the generated IR has an unresolved symbol that we then try to find in the object files.

When resolving this external symbol we try to match mangled function names using the hand-rolled CPlusPlusNameParser which doesn't support ABI tags and we fail to find a suitable symbol.

We can fish out the ABI tag from the DW_AT_linkage_name when parsing DWARF but we would still fail at (2). The plan was to address both. First reduce the reliance on CPlusPlusNameParser since that fixes the unqualified lookup issue trivially and is more robust to C++ syntax changes/attribute additions/etc. And then address the unqualified template lookup which has had numerous workarounds added to it over the years (https://reviews.llvm.org/D33025,
https://reviews.llvm.org/D61044, https://reviews.llvm.org/D75761, etc.)

Thanks for the explanation. I have to admit I am surprised that this is motivated by the import-std-module feature. That's the last place I would expect, as my understanding was that this is basically about providing a fully faithful AST representation (of the module) -- by having clang compile itself rather than us trying to roundtrip it through dwarf.

From your description, it sounds to me like things go wrong right at the start -- with the two AST Decls. Do you understand why is that happening? I pretty sure I am missing something, but I can't escape the feeling that this could be solved by generating a "better" AST. I know that isn't always possible, but it's not clear to me why it is not be possible in this case.

The way I understand it, for regular (non-templated) functions we are generating an AST roughly corresponding to extern void foo(int) asm("mangled_name");. That means we tell clang precisely the mangled name it should use (the one we get from dwarf), and nobody will care whether it contains an abi tag or anything else. Could we do the same for templated functions as well? extern template void foo<int>() asm("whatever"); doesn't appear to be entirely valid syntax (gcc rejects it outright, while clang just ignores the asm part, but maybe it wouldn't be too hard to change that (we wouldn't actually need to accept that source code, just a way to express something similar in the AST)?

Would something like that help here? Or am I just completely in the dark?

(BTW, I really like the ability to do AST-based processing (comparisons) of mangled names. That seems like it could be very useful in some situations -- it's just not clear to me why this is one of them.)

dblaikie added a subscriber: dblaikie.Aug 10 2022, 11:07 AM

In D131335#3712243, @labath wrote:

In D131335#3710236, @Michael137 wrote:

In D131335#3709788, @labath wrote:

I haven't been able to keep up with all the lldb happenings lately, so it's possible I have missed something, but can you give a summary (or point me to one) of what is the problem with abi tags in lldb. For example, why isn't the DW_AT_linkage_name string sufficient to find the correct function?

I am somewhat sceptical that this is the best way to fix this problem. As you've said yourself, the "alternate name" machinery is a last resort, and the things that it's doing are wrong on several levels (and quite slow). To tell the truth, I've been hoping that we could get rid of it completely one day. Relying on this mechanism for "core" functionality would preclude that from happening...

Unqualified lookup of template functions currently always resorts this "fall-back" mechanism. I investigated this mainly with import-std-module enabled since those were the tests that failed due to D127444. There are two parts to it:

During unqualified lookup clang::Sema asks LLDB for the decl corresponding to the template in question. DWARFASTParserClang adds two decls to the AST, one non-template FunctionDecl marked extern and another TemplateFunctionDecl. Because of unqualified lookup rules in C++ clang::Sema picks the extern FunctionDecl. This is why the generated IR has an unresolved symbol that we then try to find in the object files.

When resolving this external symbol we try to match mangled function names using the hand-rolled CPlusPlusNameParser which doesn't support ABI tags and we fail to find a suitable symbol.

We can fish out the ABI tag from the DW_AT_linkage_name when parsing DWARF but we would still fail at (2). The plan was to address both. First reduce the reliance on CPlusPlusNameParser since that fixes the unqualified lookup issue trivially and is more robust to C++ syntax changes/attribute additions/etc. And then address the unqualified template lookup which has had numerous workarounds added to it over the years (https://reviews.llvm.org/D33025,
https://reviews.llvm.org/D61044, https://reviews.llvm.org/D75761, etc.)

Thanks for the explanation. I have to admit I am surprised that this is motivated by the import-std-module feature. That's the last place I would expect, as my understanding was that this is basically about providing a fully faithful AST representation (of the module) -- by having clang compile itself rather than us trying to roundtrip it through dwarf.

From your description, it sounds to me like things go wrong right at the start -- with the two AST Decls. Do you understand why is that happening? I pretty sure I am missing something, but I can't escape the feeling that this could be solved by generating a "better" AST. I know that isn't always possible, but it's not clear to me why it is not be possible in this case.

The way I understand it, for regular (non-templated) functions we are generating an AST roughly corresponding to extern void foo(int) asm("mangled_name");. That means we tell clang precisely the mangled name it should use (the one we get from dwarf), and nobody will care whether it contains an abi tag or anything else. Could we do the same for templated functions as well? extern template void foo<int>() asm("whatever"); doesn't appear to be entirely valid syntax (gcc rejects it outright, while clang just ignores the asm part, but maybe it wouldn't be too hard to change that (we wouldn't actually need to accept that source code, just a way to express something similar in the AST)?

Would something like that help here? Or am I just completely in the dark?

(BTW, I really like the ability to do AST-based processing (comparisons) of mangled names. That seems like it could be very useful in some situations -- it's just not clear to me why this is one of them.)

Thanks for taking a closer look. I'm revisiting the AST construction for template functions again and seeing some discrepancies in the structure of the AST generated by Clang's -ast-dump vs LLDB. We're not only dropping the ABI tags from the mangled name but also the template arguments. That's because the decl that clang::Sema finds during lookup looks like the following:

// main.cpp

namespace A {
struct B {};
template <typename T> [[gnu::abi_tag("vTest")]] bool tagged(T t) { return true; }

int main() { bool res = tagged(A::B{}); }

...
FunctionDecl 0x11910ffd0 <<invalid sloc>> <invalid sloc> used tagged 'bool (A::B)' extern
|-ParmVarDecl 0x11910ff68 <<invalid sloc>> <invalid sloc> t 'A::B'

It seems like the FunctionTemplateDecl we create in DWARFASTParserClang doesn't actually get attached to the AST properly. E.g., we never call addDecl on it. I'm not sure if that's by design, but that is one of the discrepancies; it has pretty much stayed this way since support for template functions was first introduced in 3c2e3ae49093a4f5e0660c96932d83f6d81bd798.

This lack of a FunctionTemplateDecl is reflected again when emitting the $__lldb_result_expr IR via CodeGen::EmitDirectCall (which is the component that emits the "incorrectly" mangled function call).

-ast-dump yields the following

...
    |         |-ImplicitCastExpr 0x13311c548 <col:28> 'bool (*)(A::B)' <FunctionToPointerDecay>                                                                                       
    |         | `-DeclRefExpr 0x13311c4c0 <col:28> 'bool (A::B)' lvalue Function 0x13311a130 'tagged' 'bool (A::B)' (FunctionTemplate 0x1331166c8 'tagged')      
...

while LLDB's AST looks like:

...
    |-ImplicitCastExpr 0x119110208 '_Bool (*)(struct A::B)' <FunctionToPointerDecay>
    | `-DeclRefExpr 0x1191101c0 '_Bool (struct A::B)' lvalue Function 0x11910ffd0 'tagged' '_Bool (struct A::B)'
...

Note how the DeclRefExpr node doesn't have a FunctionTemplate attached to it in LLDB's AST.

I'll have to see if there's something wrong with how TypeSystemClang::CreateFunctionTemplateDecl is used

So one of the problems with constructing a more accurate AST is that we don't have enough information in DWARF to construct a unspecialised FunctionProtoType, i.e., FunctionProtoType 0x1282fe870 'bool (T)' as opposed to the current FunctionProtoType 0x1282fe870 '_Bool (struct A::B)' cdecl.

This was also noted in https://reviews.llvm.org/D61044

creating the correct templated signature is just impossible. DWARF contains the template parameter, but not their mapping to the actual function signature. In the above example I know that foo has been parametrized by the type ‘int’ (and that T was the template argument name), but I don’t know whether the return type or the first argument were templated or plain ints…

E.g., here is an example DWARF entry for a template function:

0x000001f8:     DW_TAG_subprogram
                  DW_AT_low_pc  (0x0000000100003f00)
                  DW_AT_high_pc (0x0000000100003f18)
                  DW_AT_APPLE_omit_frame_ptr    (true)
                  DW_AT_frame_base      (DW_OP_reg31 WSP)
                  DW_AT_linkage_name    ("_ZN1A10multiParamINS_1BEiEEbT_T0_S1_")
                  DW_AT_name    ("multiParam<A::B, int>")
                  DW_AT_decl_file       ("adl.cpp")
                  DW_AT_decl_line       (34)
                  DW_AT_type    (0x000000000000007d "bool")
                  DW_AT_external        (true)

0x00000215:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_fbreg +15)
                    DW_AT_name  ("p1")
                    DW_AT_decl_file     ("adl.cpp")
                    DW_AT_decl_line     (34)
                    DW_AT_type  (0x0000000000000089 "A::B")

0x00000223:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_fbreg +8)
                    DW_AT_name  ("p2")
                    DW_AT_decl_file     ("adl.cpp")
                    DW_AT_decl_line     (34)
                    DW_AT_type  (0x0000000000000317 "int")

0x00000231:       DW_TAG_formal_parameter
                    DW_AT_location      (DW_OP_fbreg +14)
                    DW_AT_name  ("p3")
                    DW_AT_decl_file     ("adl.cpp")
                    DW_AT_decl_line     (34)
                    DW_AT_type  (0x0000000000000089 "A::B")

0x0000023f:       DW_TAG_template_type_parameter
                    DW_AT_type  (0x0000000000000089 "A::B")
                    DW_AT_name  ("First")

0x00000248:       DW_TAG_template_type_parameter
                    DW_AT_type  (0x0000000000000317 "int")
                    DW_AT_name  ("Second")

Just from this we wouldn't be able to reconstruct the function prototype (which we need to create a more accurate FunctionTemplateDecl). Mainly we don't know whether the formal parameters are of a template parameter type or a concrete type. Maybe someone has an idea of how to get the function prototype through other means at the point where we need to create the FunctionTemplateDecl, but I didn't manage to yet.

Use ConstString::GetLength where possible
Add more comments
Rebase
Add test case

Harbormaster completed remote builds in B181186: Diff 452547.Aug 14 2022, 11:36 AM

Michael137 added inline comments.Aug 14 2022, 11:36 AM

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
1544	We're adding to it in a loop in `CollectAlternateFunctionNames`. Of course we could push the loop into `CollectAlternateFunctionNamesItanium` and `CollectAlternateFunctionNamesMangled` but imo that makes the functions less readable

As @labath mentioned, we do force clang to preserve the linkage name via asm(), but only for class member functions. This was added in 675767a5910d2ec77ef8b51c78fe312cf9022896 (https://reviews.llvm.org/D40283) to also support abi_tag! But that didn't cover templates functions:

Use the DWARF linkage name when importing C++ methods.
When importing C++ methods into clang AST nodes from the DWARF symbol
table, preserve the DW_AT_linkage_name and use it as the linker
("asm") name for the symbol.

Concretely, this enables `expression` to call into names that use the
GNU `abi_tag` extension

I tried adding an AsmLabelAttr to the FunctionDecls we create when parsing DWARF and it does fix the ABI-tag problem on my small test case. But this only works because the way we create FunctionTemplateDecls is incorrect (as I've described in my previous comments).

So the options are any combination of the following:

carry this patch forward (and possibly remove the asm() hack for C++ member functions)
Add the asm() attribute hack to all function declarations (or just when we are dealing with template functions)
Fix the way we generate FunctionTemplateDecls when parsing DWARF (this likely needs a change to DWARF generation)

@aprantl @labath Any preference? To me it seems 1 and 3 are the more "proper" way to fix this issue. And once we fix 3 (which we should do anyway) #2 may break. The good thing about #2 is that we avoid searching object files, improving performance.

In D131335#3722595, @Michael137 wrote:
As @labath mentioned, we do force clang to preserve the linkage name via asm(), but only for class member functions. This was added in 675767a5910d2ec77ef8b51c78fe312cf9022896 (https://reviews.llvm.org/D40283) to also support abi_tag! But that didn't cover templates functions:
Use the DWARF linkage name when importing C++ methods.
When importing C++ methods into clang AST nodes from the DWARF symbol
table, preserve the DW_AT_linkage_name and use it as the linker
("asm") name for the symbol.

Concretely, this enables `expression` to call into names that use the
GNU `abi_tag` extension
I tried adding an AsmLabelAttr to the FunctionDecls we create when parsing DWARF and it does fix the ABI-tag problem on my small test case. But this only works because the way we create FunctionTemplateDecls is incorrect (as I've described in my previous comments).

So the options are any combination of the following:

carry this patch forward (and possibly remove the asm() hack for C++ member functions)

Add the asm() attribute hack to all function declarations (or just when we are dealing with template functions)

Fix the way we generate FunctionTemplateDecls when parsing DWARF (this likely needs a change to DWARF generation)

@aprantl @labath Any preference? To me it seems 1 and 3 are the more "proper" way to fix this issue. And once we fix 3 (which we should do anyway) #2 may break. The good thing about #2 is that we avoid searching object files, improving performance.

I think this sounds like a decent plan forward. I would suggest to start with (1) and then start a discussion about (3). Conceivably we could change the DWARF representation of function signatures to refer to the DW_TAG_template_type_parameter DIEs as types inside the DW_AT_types of the other parameters. This is probably something we'd need to prototype in the LLDB debugger tuning first because it would confuse other consumers.

Michael137 retitled this revision from WIP: [lldb] abi_tag support 3/3 - Use mangle tree API to determine approximate mangled matches to [lldb] abi_tag support 3/3 - Use mangle tree API to determine approximate mangled matches.Aug 15 2022, 11:26 AM

In D131335#3722595, @Michael137 wrote:

As @labath mentioned, we do force clang to preserve the linkage name via asm(), but only for class member functions. This was added in 675767a5910d2ec77ef8b51c78fe312cf9022896 (https://reviews.llvm.org/D40283) to also support abi_tag!

Yes, that's the patch I was alluding to. I'm sorry I was too lazy to look it up.

But that didn't cover templates functions:
Use the DWARF linkage name when importing C++ methods.
When importing C++ methods into clang AST nodes from the DWARF symbol
table, preserve the DW_AT_linkage_name and use it as the linker
("asm") name for the symbol.

Concretely, this enables `expression` to call into names that use the
GNU `abi_tag` extension
I tried adding an AsmLabelAttr to the FunctionDecls we create when parsing DWARF and it does fix the ABI-tag problem on my small test case.

Which test case are you referring to. The one you made inline in https://reviews.llvm.org/D131335#3714567. Or the test attached to this patch? If it's the former, what part of the big test case is missing (and can that be fixed)?

But this only works because the way we create FunctionTemplateDecls is incorrect (as I've described in my previous comments).

So the options are any combination of the following:

carry this patch forward (and possibly remove the asm() hack for C++ member functions)

Add the asm() attribute hack to all function declarations (or just when we are dealing with template functions)

Fix the way we generate FunctionTemplateDecls when parsing DWARF (this likely needs a change to DWARF generation)

@aprantl @labath Any preference? To me it seems 1 and 3 are the more "proper" way to fix this issue. And once we fix 3 (which we should do anyway) #2 may break.

I would like to understand what (if any) are the issues with extending the approach #2 to cover non-member functions as well. For what it's worth, I don't think #2 is a hack. I might actually say that's "using DWARF as it was meant to be used" -- the mangled name is there for us to use (*). Before LLDB came around, I doubt anybody on the DWARF committee imagined people would try round-tripping the DWARF information back into a "regular" C++ compiler (and expecting it to recreate "perfect" mangled names for templates and all).

I don't mean to offend, but if anything, I would say that #1 is a hack. What it essentially does is pretend that abi tags don't exist. The way I understand it, the main reason for the introduction of abi tags was to enable one to have two versions of the same function (class, etc.) co-exist in the same binary/program. This essentially defeats that. I haven't looked at the implementation too closely, but I don't see how one could properly disambiguate two differently-tagged versions of the same functions using this approach. It may be a hack that we have to live with, but I am not convinced of that (yet).

Regarding #3, yes, there are definite problems regarding the way we represent templates. The root cause is the fundamental mismatch between what clang wants (a completely accurate description of all objects "as if written in source") versus what DWARF provides (a description of the most important properties of entities that are present in the binary). Where the problem lies depends on your point of view. However, even if we come up with a better way to represent the debug information in AST, it's not clear to me why that would be incompatible with #2. Since we don't have the actual source code for the template function, we will never be able to provide a fully generic definition of it -- we will always need to deal with specific instantiations of that template. So what we just need is the attach a (asm) label to a specific template instantiation. I would hope that wouldn't be too much to ask of clang to support.

The idea of encoding this information into dwarf (via DW_TAG_template_type_parameter) sounds interesting but it's not clear to me how one would extend it to more complicated scenarios. E.g., I'm not sure how I'd describe template<typename T> void f(std::enable_if_t<std::is_whatever_v<T>, T>) in this way, but the result would likely be huge. One could also consider "fishing" this information out of the mangled name (at least the itanium scheme should have all of this in there, and in a more compact scheme), but that does seem rather hacky.

The good thing about #2 is that we avoid searching object files, improving performance.

Speaking of performance, how does this searching impact the evaluation time of expressions?

(*) The use of mangled names is not particularly important here. What we really need is just some way to associate the entities in the clang AST with the symbols in llvm IR. Mangled names, due to their almost-guaranteed uniqueness, happen to be very good for that. However, one can imagine an implementation where we tag each symbol with an asm("__lldb_deadbeef") label, where 0xdeadbeef is the address of that particular symbol in a binary, and then we translate that symbol to an actual address in llvm IR. That would arguably be a hack, but only because it works around an lldb/clang limitation. Functionality-wise, I think it would be even more correct than using mangled names.

lldb/test/API/lang/cpp/abi_tag_lookup/TestAbiTagLookup.py
14	This means the test is only expected to pass with the `dsym` debug info flavour. Is that really what you wanted to say?

Thanks for taking the time to look at this

I would like to understand what (if any) are the issues with extending the approach #2 to cover non-member functions as well. For what it's worth, I don't think #2 is a hack. I might actually say that's "using DWARF as it was meant to be used" -- the mangled name is there for us to use (*). Before LLDB came around, I doubt anybody on the DWARF committee imagined people would try round-tripping the DWARF information back into a "regular" C++ compiler (and expecting it to recreate "perfect" mangled names for templates and all).

There are no issues that I know of with this. My hesitation with this approach was just that if we eventually fix the way we construct template decls, we will need to make sure we continue attaching the labels correctly. Maybe this isn't a big deal if we note it in a comment

I don't mean to offend, but if anything, I would say that #1 is a hack. What it essentially does is pretend that abi tags don't exist. The way I understand it, the main reason for the introduction of abi tags was to enable one to have two versions of the same function (class, etc.) co-exist in the same binary/program. This essentially defeats that. I haven't looked at the implementation too closely, but I don't see how one could properly disambiguate two differently-tagged versions of the same functions using this approach. It may be a hack that we have to live with, but I am not convinced of that (yet).

The way this patch allows distinguishing differently-tagged versions is by checking the Attrs Node of the Itanium mangle tree. But you're right in that this change does rely on a follow-up patch that attaches AbiTagAttr nodes to the FunctionDecls we generate, which we were planning on doing. But I guess if we are going to attach labels to FunctionDecls anyway, we could just attach the AsmLabel instead and thus cover all types of attributes trivially.

Regarding #3, yes, there are definite problems regarding the way we represent templates. The root cause is the fundamental mismatch between what clang wants (a completely accurate description of all objects "as if written in source") versus what DWARF provides (a description of the most important properties of entities that are present in the binary). Where the problem lies depends on your point of view. However, even if we come up with a better way to represent the debug information in AST, it's not clear to me why that would be incompatible with #2. Since we don't have the actual source code for the template function, we will never be able to provide a fully generic definition of it -- we will always need to deal with specific instantiations of that template. So what we just need is the attach a (asm) label to a specific template instantiation. I would hope that wouldn't be too much to ask of clang to support.

This mismatch currently causes problems with unqualified lookups, especially with import-std-module enabled, so it would be nice to fix eventually. We don't need a fully generic definition, just the generic prototype to get the mangled name correctly. But as you say, this can be done with an asm() label with less trouble.

I can open a separate diff with approach #2 and see from there.

Michael137 added inline comments.Aug 16 2022, 8:31 AM

lldb/test/API/lang/cpp/abi_tag_lookup/TestAbiTagLookup.py
14	Yup this was intentional. All these suffer from issues with unqualified lookup due to the way we construct the template decls

Michael137 mentioned this in D131974: [lldb][ClangExpression] Add asm() label to all FunctionDecls we create from DWARF.Aug 16 2022, 9:11 AM

In D131335#3726204, @Michael137 wrote:

Thanks for taking the time to look at this

I would like to understand what (if any) are the issues with extending the approach #2 to cover non-member functions as well. For what it's worth, I don't think #2 is a hack. I might actually say that's "using DWARF as it was meant to be used" -- the mangled name is there for us to use (*). Before LLDB came around, I doubt anybody on the DWARF committee imagined people would try round-tripping the DWARF information back into a "regular" C++ compiler (and expecting it to recreate "perfect" mangled names for templates and all).

There are no issues that I know of with this. My hesitation with this approach was just that if we eventually fix the way we construct template decls, we will need to make sure we continue attaching the labels correctly. Maybe this isn't a big deal if we note it in a comment

Yeah, if you end up working on this, and find that you need to have clang to something in this area, I can try to loop in some clang people to help.

I don't mean to offend, but if anything, I would say that #1 is a hack. What it essentially does is pretend that abi tags don't exist. The way I understand it, the main reason for the introduction of abi tags was to enable one to have two versions of the same function (class, etc.) co-exist in the same binary/program. This essentially defeats that. I haven't looked at the implementation too closely, but I don't see how one could properly disambiguate two differently-tagged versions of the same functions using this approach. It may be a hack that we have to live with, but I am not convinced of that (yet).

The way this patch allows distinguishing differently-tagged versions is by checking the Attrs Node of the Itanium mangle tree. But you're right in that this change does rely on a follow-up patch that attaches AbiTagAttr nodes to the FunctionDecls we generate, which we were planning on doing. But I guess if we are going to attach labels to FunctionDecls anyway, we could just attach the AsmLabel instead and thus cover all types of attributes trivially.

FWIW, I would be all for attaching the abi tags to the clang declarations if they would be in some easily accessible form (e.g. a DWARF attribute). Parsing them out of the mangled name is somewhat dubious, but I am not entirely against that, if it is necessary for some use case. However, even if we did that, I'd still say that attaching the asm attribute is a good idea.

Regarding #3, yes, there are definite problems regarding the way we represent templates. The root cause is the fundamental mismatch between what clang wants (a completely accurate description of all objects "as if written in source") versus what DWARF provides (a description of the most important properties of entities that are present in the binary). Where the problem lies depends on your point of view. However, even if we come up with a better way to represent the debug information in AST, it's not clear to me why that would be incompatible with #2. Since we don't have the actual source code for the template function, we will never be able to provide a fully generic definition of it -- we will always need to deal with specific instantiations of that template. So what we just need is the attach a (asm) label to a specific template instantiation. I would hope that wouldn't be too much to ask of clang to support.

This mismatch currently causes problems with unqualified lookups, especially with import-std-module enabled, so it would be nice to fix eventually. We don't need a fully generic definition, just the generic prototype to get the mangled name correctly. But as you say, this can be done with an asm() label with less trouble.

I can open a separate diff with approach #2 and see from there.

Cool. Thanks for doing that.

In D131335#3732877, @labath wrote:

In D131335#3726204, @Michael137 wrote:

Thanks for taking the time to look at this

I would like to understand what (if any) are the issues with extending the approach #2 to cover non-member functions as well. For what it's worth, I don't think #2 is a hack. I might actually say that's "using DWARF as it was meant to be used" -- the mangled name is there for us to use (*). Before LLDB came around, I doubt anybody on the DWARF committee imagined people would try round-tripping the DWARF information back into a "regular" C++ compiler (and expecting it to recreate "perfect" mangled names for templates and all).

There are no issues that I know of with this. My hesitation with this approach was just that if we eventually fix the way we construct template decls, we will need to make sure we continue attaching the labels correctly. Maybe this isn't a big deal if we note it in a comment

Yeah, if you end up working on this, and find that you need to have clang to something in this area, I can try to loop in some clang people to help.

I don't mean to offend, but if anything, I would say that #1 is a hack. What it essentially does is pretend that abi tags don't exist. The way I understand it, the main reason for the introduction of abi tags was to enable one to have two versions of the same function (class, etc.) co-exist in the same binary/program. This essentially defeats that. I haven't looked at the implementation too closely, but I don't see how one could properly disambiguate two differently-tagged versions of the same functions using this approach. It may be a hack that we have to live with, but I am not convinced of that (yet).

The way this patch allows distinguishing differently-tagged versions is by checking the Attrs Node of the Itanium mangle tree. But you're right in that this change does rely on a follow-up patch that attaches AbiTagAttr nodes to the FunctionDecls we generate, which we were planning on doing. But I guess if we are going to attach labels to FunctionDecls anyway, we could just attach the AsmLabel instead and thus cover all types of attributes trivially.

FWIW, I would be all for attaching the abi tags to the clang declarations if they would be in some easily accessible form (e.g. a DWARF attribute). Parsing them out of the mangled name is somewhat dubious, but I am not entirely against that, if it is necessary for some use case. However, even if we did that, I'd still say that attaching the asm attribute is a good idea.

Regarding #3, yes, there are definite problems regarding the way we represent templates. The root cause is the fundamental mismatch between what clang wants (a completely accurate description of all objects "as if written in source") versus what DWARF provides (a description of the most important properties of entities that are present in the binary). Where the problem lies depends on your point of view. However, even if we come up with a better way to represent the debug information in AST, it's not clear to me why that would be incompatible with #2. Since we don't have the actual source code for the template function, we will never be able to provide a fully generic definition of it -- we will always need to deal with specific instantiations of that template. So what we just need is the attach a (asm) label to a specific template instantiation. I would hope that wouldn't be too much to ask of clang to support.

This mismatch currently causes problems with unqualified lookups, especially with import-std-module enabled, so it would be nice to fix eventually. We don't need a fully generic definition, just the generic prototype to get the mangled name correctly. But as you say, this can be done with an asm() label with less trouble.

I can open a separate diff with approach #2 and see from there.

Cool. Thanks for doing that.

Yeah, if you end up working on this, and find that you need to have clang to something in this area, I can try to loop in some clang people to help.

Sounds good. Planning to look into this soon since it's causing some subtle bugs with import-std-module and is also likely needed to get rid off the XFAILs (though haven't looked into this deeply).

FWIW, I would be all for attaching the abi tags to the clang declarations if they would be in some easily accessible form (e.g. a DWARF attribute). Parsing them out of the mangled name is somewhat dubious, but I am not entirely against that, if it is necessary for some use case. However, even if we did that, I'd still say that attaching the asm attribute is a good idea.

We're actually thinking of maybe getting rid off the fallback logic for the C++ plugin entirely. Only a handful of API tests seem to call into it (and some of them don't even expect the symbol to be found). But maybe there are some critical corner cases that require this that aren't covered by the tests.

In D131335#3733117, @Michael137 wrote:

In D131335#3732877, @labath wrote:

FWIW, I would be all for attaching the abi tags to the clang declarations if they would be in some easily accessible form (e.g. a DWARF attribute). Parsing them out of the mangled name is somewhat dubious, but I am not entirely against that, if it is necessary for some use case. However, even if we did that, I'd still say that attaching the asm attribute is a good idea.

We're actually thinking of maybe getting rid off the fallback logic for the C++ plugin entirely. Only a handful of API tests seem to call into it (and some of them don't even expect the symbol to be found). But maybe there are some critical corner cases that require this that aren't covered by the tests.

I expect that this logic exists mostly for the benefit of low-quality or "minimal" debug info, at least that's what it looks like for me when I see the kinds of fallbacks that are used. For example the char->signed char substitution exists probably to work around the fact that DWARF does not really recognize char as a separate type (not in the same way that C does). So the type char could end up having using DW_ATE_signed_char (same as signed char) or DW_ATE_unsigned_char (same as unsigned char), depending on whether the type actually is signed or not. Makes sense if all you want is to provide a description of the type, but it does not sufficient for reconstructing mangled names. This is why our DWARF parsing code (TypeSystemClang::GetBuiltinTypeForDWARFEncodingAndBitSize) needs to use string comparisons to get this right. One could say the same for int->long substitutions, etc.

That means that these things will only show up when using debug info produced by third-party compilers (I think even gcc should be fine), which is not something that we test regularly. That said, I don't think this is a good reason to keep this code around, and I'd hope that the asm labels will fix most of these issues (obviously, you won't be able to call the right overload based on the signedness of a char variable, but you couldn't do that before either). If it breaks anyones use case, we can figure out how to fix this in a better way.

Revision Contents

Path

Size

lldb/

packages/

Python/

lldbsuite/

test/

make/

Makefile.rules

3 lines

source/

Plugins/

Language/

CPlusPlus/

CPlusPlusLanguage.h

12 lines

CPlusPlusLanguage.cpp

198 lines

test/

API/

lang/

cpp/

abi_tag_lookup/

Makefile

3 lines

TestAbiTagLookup.py

56 lines

main.cpp

71 lines

Diff 452547

lldb/packages/Python/lldbsuite/test/make/Makefile.rules

Show First 20 Lines • Show All 381 Lines • ▼ Show 20 Lines	ifeq (1,$(USE_LIBSTDCPP))
# Clang requires an extra flag: -stdlib=libstdc++		# Clang requires an extra flag: -stdlib=libstdc++
ifneq (,$(findstring clang,$(CC)))		ifneq (,$(findstring clang,$(CC)))
CXXFLAGS += -stdlib=libstdc++ -DLLDB_USING_LIBSTDCPP		CXXFLAGS += -stdlib=libstdc++ -DLLDB_USING_LIBSTDCPP
LDFLAGS += -stdlib=libstdc++		LDFLAGS += -stdlib=libstdc++
endif		endif
endif		endif

ifeq (1,$(USE_LIBCPP))		ifeq (1,$(USE_LIBCPP))
# TODO: Teach LLDB to handle ABI tags in libc++ namespaces.		CXXFLAGS += -DLLDB_USING_LIBCPP
CXXFLAGS += -DLLDB_USING_LIBCPP -D_LIBCPP_NO_ABI_TAG
ifeq "$(OS)" "Android"		ifeq "$(OS)" "Android"
# Nothing to do, this is already handled in		# Nothing to do, this is already handled in
# Android.rules.		# Android.rules.
else		else
CXXFLAGS += -stdlib=libc++		CXXFLAGS += -stdlib=libc++
LDFLAGS += -stdlib=libc++		LDFLAGS += -stdlib=libc++
endif		endif
ifneq (,$(filter $(OS), FreeBSD Linux NetBSD))		ifneq (,$(filter $(OS), FreeBSD Linux NetBSD))
▲ Show 20 Lines • Show All 238 Lines • Show Last 20 Lines

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h

Show First 20 Lines • Show All 101 Lines • ▼ Show 20 Lines	public:

static void Terminate();		static void Terminate();

static lldb_private::Language *CreateInstance(lldb::LanguageType language);		static lldb_private::Language *CreateInstance(lldb::LanguageType language);

static llvm::StringRef GetPluginNameStatic() { return "cplusplus"; }		static llvm::StringRef GetPluginNameStatic() { return "cplusplus"; }

bool SymbolNameFitsToLanguage(Mangled mangled) const override;		bool SymbolNameFitsToLanguage(Mangled mangled) const override;

bool DemangledNameContainsPath(llvm::StringRef path,		bool DemangledNameContainsPath(llvm::StringRef path,
ConstString demangled) const override;		ConstString demangled) const override;

ConstString		ConstString
GetDemangledFunctionNameWithoutArguments(Mangled mangled) const override;		GetDemangledFunctionNameWithoutArguments(Mangled mangled) const override;

static bool IsCPPMangledName(llvm::StringRef name);		static bool IsCPPMangledName(llvm::StringRef name);

Show All 17 Lines	public:
GenerateAlternateFunctionManglings(const ConstString mangled) const override;		GenerateAlternateFunctionManglings(const ConstString mangled) const override;

std::vector<ConstString>		std::vector<ConstString>
CollectAlternateFunctionNames(const std::vector<ConstString> &mangled_names,		CollectAlternateFunctionNames(const std::vector<ConstString> &mangled_names,
const SymbolContext &sc) const override;		const SymbolContext &sc) const override;

// PluginInterface protocol		// PluginInterface protocol
llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }		llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }

		private:
		void
		CollectAlternateFunctionNamesFromMangled(std::vector<ConstString> &results,
		ConstString name,
		const SymbolContext &sc) const;

		void CollectAlternateFunctionNamesItanium(std::vector<ConstString> &results,
		ConstString name,
		const SymbolContext &sc) const;
};		};

} // namespace lldb_private		} // namespace lldb_private

#endif // LLDB_SOURCE_PLUGINS_LANGUAGE_CPLUSPLUS_CPLUSPLUSLANGUAGE_H		#endif // LLDB_SOURCE_PLUGINS_LANGUAGE_CPLUSPLUS_CPLUSPLUSLANGUAGE_H

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp

Show All 12 Lines

#include <functional>		#include <functional>
#include <memory>		#include <memory>
#include <mutex>		#include <mutex>
#include <set>		#include <set>

#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Demangle/ItaniumDemangle.h"		#include "llvm/Demangle/ItaniumDemangle.h"
		#include "llvm/Support/Compiler.h"
		#include "llvm/Support/ErrorHandling.h"

#include "lldb/Core/Mangled.h"		#include "lldb/Core/Mangled.h"
#include "lldb/Core/Module.h"		#include "lldb/Core/Module.h"
#include "lldb/Core/PluginManager.h"		#include "lldb/Core/PluginManager.h"
#include "lldb/Core/UniqueCStringMap.h"		#include "lldb/Core/UniqueCStringMap.h"
#include "lldb/DataFormatters/CXXFunctionPointer.h"		#include "lldb/DataFormatters/CXXFunctionPointer.h"
#include "lldb/DataFormatters/DataVisualization.h"		#include "lldb/DataFormatters/DataVisualization.h"
#include "lldb/DataFormatters/FormattersHelpers.h"		#include "lldb/DataFormatters/FormattersHelpers.h"
▲ Show 20 Lines • Show All 476 Lines • ▼ Show 20 Lines	std::vector<ConstString> CPlusPlusLanguage::GenerateAlternateFunctionManglings(

if (ConstString ctor_fixup =		if (ConstString ctor_fixup =
CtorDtorSubstitutor().substitute(mangled_name.GetStringRef()))		CtorDtorSubstitutor().substitute(mangled_name.GetStringRef()))
alternates.push_back(ctor_fixup);		alternates.push_back(ctor_fixup);

return alternates;		return alternates;
}		}

		char const printNode(const llvm::itanium_demangle::Node RootNode) {
		llvm::itanium_demangle::OutputBuffer OB;

		// Initial size of name buffer. Will get resized inside of
		// 'OutputBuffer' if necessary.
		constexpr size_t KInitBufferSize = 128;

		if (!initializeOutputBuffer(nullptr, nullptr, OB, KInitBufferSize))
		return nullptr;
		RootNode->print(OB);
		OB += '\0';
		return OB.getBuffer();
		}

		char const getFunctionName(llvm::itanium_demangle::Node const root) {
		using namespace llvm::itanium_demangle;

		if (root->getKind() != Node::KFunctionEncoding)
		return nullptr;

		auto Name = static_cast<FunctionEncoding const >(root)->getName();

		assert(Name != nullptr);

		return printNode(Name);
		}

static ConstString		static ConstString
FindBestAlternateFunctionMangledName(const Mangled mangled,		FindBestAlternateFunctionMangledName(const Mangled mangled,
const SymbolContext &sym_ctx) {		const SymbolContext &sym_ctx) {
ConstString demangled = mangled.GetDemangledName();		ConstString demangled = mangled.GetDemangledName();
if (!demangled)		if (!demangled)
return ConstString();		return ConstString();

CPlusPlusLanguage::MethodName cpp_name(demangled);		CPlusPlusLanguage::MethodName cpp_name(demangled);
Show All 35 Lines	if (param_and_qual_matches.size())
return param_and_qual_matches[0]; // It is assumed that there will be only		return param_and_qual_matches[0]; // It is assumed that there will be only
// one!		// one!
else if (param_matches.size())		else if (param_matches.size())
return param_matches[0]; // Return one of them as a best match		return param_matches[0]; // Return one of them as a best match
else		else
return ConstString();		return ConstString();
}		}

		ConstString
		FindBestAlternateFunctionMangledName(llvm::itanium_demangle::Node const *node,
		const SymbolContext &sym_ctx) {
		if (!sym_ctx.module_sp)
		return ConstString();

		lldb_private::SymbolFile *sym_file = sym_ctx.module_sp->GetSymbolFile();
		if (!sym_file)
		return ConstString();

		using namespace llvm::itanium_demangle;
		assert(node != nullptr);

		if (node->getKind() != Node::KFunctionEncoding)
		return ConstString();

		std::vector<ConstString> alternates;
		sym_file->GetMangledNamesForFunction(getFunctionName(node), alternates);

		auto const encoding = static_cast<FunctionEncoding const >(node);
		Qualifiers quals = encoding->getCVQuals();
		FunctionRefQual refQual = encoding->getRefQual();
		NodeArray params = encoding->getParams();
		auto const *attrs = encoding->getAttrs();

		// Iterate over all candidate function names and
		// find a match for the mangled function name
		aprantlUnsubmitted Done Reply Inline Actions Could this be `for (ConstString alternate_mangled_name : alternates)` ? aprantl: Could this be `for (ConstString alternate_mangled_name : alternates)` ?
		aprantlUnsubmitted Done Reply Inline Actions Comment what the loop is doing? aprantl: Comment what the loop is doing?
		// represented by 'node'. We demangle each candidate
		// into a mangle tree and compare it to the tree rooted
		// at 'node'. We return a perfect match if we find one.
		// Otherwise we collect all approximate matches and return
		// one of them. A "match" is defined as:
		aprantlUnsubmitted Done Reply Inline Actions `alternate_mangled_name.GetLength()` ? It might sill need to call strlen under the hood, but it could do something more efficient. aprantl: `alternate_mangled_name.GetLength()` ? It might sill need to call strlen under the hood, but it…
		//
		// * Perfect match: parameters, CV-quals, reference-quals,
		// attributes are all equal to those of
		// the input node
		//
		// * Approximate match: parameters are equal to those of the
		// input node
		std::vector<ConstString> param_matches;
		for (ConstString alternate_mangled_name : alternates) {
		if (!alternate_mangled_name)
		continue;

		char const *alternate = alternate_mangled_name.GetCString();
		const auto alternate_len = alternate_mangled_name.GetLength();
		ManglingParser<NodeAllocator> alternate_parser(alternate,
		alternate + alternate_len);

		if (auto const *alt_node = alternate_parser.parse()) {
		assert(alt_node->getKind() == Node::KFunctionEncoding);
		auto const *alt_encoding =
		static_cast<FunctionEncoding const *>(alt_node);
		Qualifiers alt_quals = alt_encoding->getCVQuals();
		FunctionRefQual alt_refQual = alt_encoding->getRefQual();
		NodeArray alt_params = alt_encoding->getParams();
		auto const *alt_attrs = alt_encoding->getAttrs();

		if (params == alt_params) {
		aprantlUnsubmitted Done Reply Inline Actions `.` aprantl: `.`
		if (quals == alt_quals && refQual == alt_refQual) {
		if ((attrs && alt_attrs && attrs->equals(alt_attrs)) \|\|
		(!attrs && !alt_attrs)) {
		// Perfect match. Return it.
		return alternate_mangled_name;
		}
		}

		// Not a perfect match, but count matching parameters
		// as an approximate match.
		param_matches.push_back(alternate_mangled_name);
		}
		}
		}

		// No perfect match. Return one of the approximate
		// matches as a best match.
		if (param_matches.size())
		return param_matches[0];

		// No matches.
		return ConstString();
		}

static void LoadLibCxxFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {		static void LoadLibCxxFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {
if (!cpp_category_sp)		if (!cpp_category_sp)
return;		return;

TypeSummaryImpl::Flags stl_summary_flags;		TypeSummaryImpl::Flags stl_summary_flags;
stl_summary_flags.SetCascades(true)		stl_summary_flags.SetCascades(true)
.SetSkipPointers(false)		.SetSkipPointers(false)
.SetSkipReferences(false)		.SetSkipReferences(false)
▲ Show 20 Lines • Show All 830 Lines • ▼ Show 20 Lines	if (file_path.endswith_insensitive(suffix))
return true;		return true;
}		}

// Check if we're in a STL path (where the files usually have no extension		// Check if we're in a STL path (where the files usually have no extension
// that we could check for.		// that we could check for.
return file_path.contains("/usr/include/c++/");		return file_path.contains("/usr/include/c++/");
}		}

std::vector<ConstString> CPlusPlusLanguage::CollectAlternateFunctionNames(		void CPlusPlusLanguage::CollectAlternateFunctionNamesFromMangled(
const std::vector<ConstString> &mangled_names,		std::vector<ConstString> &results, ConstString name,
const SymbolContext &sc) const {		const SymbolContext &sc) const {
std::vector<ConstString> results;
for (const ConstString &name : mangled_names) {
Mangled mangled(name);		Mangled mangled(name);
if (SymbolNameFitsToLanguage(mangled)) {		if (SymbolNameFitsToLanguage(mangled)) {
if (ConstString best_alternate =		if (ConstString best_alternate =
FindBestAlternateFunctionMangledName(mangled, sc)) {		FindBestAlternateFunctionMangledName(mangled, sc)) {
results.push_back(best_alternate);		results.push_back(best_alternate);
}		}
}		}

std::vector<ConstString> alternates =		std::vector<ConstString> alternates =
GenerateAlternateFunctionManglings(name);		GenerateAlternateFunctionManglings(name);
results.insert(results.end(), alternates.begin(), alternates.end());		results.insert(results.end(), alternates.begin(), alternates.end());

// As a last-ditch fallback, try the base name for C++ names. It's		// As a last-ditch fallback, try the base name for C++ names. It's
// terrible, but the DWARF doesn't always encode "extern C" correctly.		// terrible, but the DWARF doesn't always encode "extern C" correctly.
ConstString basename = GetDemangledFunctionNameWithoutArguments(mangled);		ConstString basename = GetDemangledFunctionNameWithoutArguments(mangled);
results.push_back(basename);		results.push_back(basename);
}		}

		void CPlusPlusLanguage::CollectAlternateFunctionNamesItanium(
		std::vector<ConstString> &results, ConstString name,
		aprantlUnsubmitted Not Done Reply Inline Actions why not return the vector? aprantl: why not return the vector?
		Michael137AuthorUnsubmitted Done Reply Inline Actions We're adding to it in a loop in `CollectAlternateFunctionNames`. Of course we could push the loop into `CollectAlternateFunctionNamesItanium` and `CollectAlternateFunctionNamesMangled` but imo that makes the functions less readable Michael137: We're adding to it in a loop in `CollectAlternateFunctionNames`. Of course we could push the…
		const SymbolContext &sc) const {
		if (!name)
		return;

		char const *mangled = name.GetCString();
		const auto mangled_len = name.GetLength();
		using namespace llvm::itanium_demangle;
		aprantlUnsubmitted Done Reply Inline Actions I'd rather ask the conststring for its length again aprantl: I'd rather ask the conststring for its length again
		ManglingParser<NodeAllocator> parser(mangled, mangled + mangled_len);
		auto const *node = parser.parse();

		if (node && CPlusPlusLanguage::IsCPPMangledName(name.GetStringRef())) {
		if (ConstString best_alternate =
		FindBestAlternateFunctionMangledName(node, sc)) {
		results.push_back(best_alternate);
		}
		}

		std::vector<ConstString> alternates =
		GenerateAlternateFunctionManglings(name);
		results.insert(results.end(), alternates.begin(), alternates.end());

		// As a last-ditch fallback, try the base name for C++ names. It's
		// terrible, but the DWARF doesn't always encode "extern C" correctly.
		if (node) {
		ConstString basename(getFunctionName(node));
		results.push_back(basename);
		}
		}

		std::vector<ConstString> CPlusPlusLanguage::CollectAlternateFunctionNames(
		const std::vector<ConstString> &mangled_names,
		const SymbolContext &sc) const {
		std::vector<ConstString> results;
		for (const ConstString &name : mangled_names) {
		auto scheme = Mangled::GetManglingScheme(name.GetStringRef());
		switch (scheme) {
		case Mangled::ManglingScheme::eManglingSchemeItanium:
		CollectAlternateFunctionNamesItanium(results, name, sc);
		break;
		case Mangled::ManglingScheme::eManglingSchemeNone:
		// Macho-O symbols sometimes contain the non-standard
		// double-underscore prefix which 'Mangled::GetManglingScheme'
		// purposefully doesn't support.
		if (name.GetStringRef().startswith("__Z")) {
		CollectAlternateFunctionNamesItanium(results, name, sc);
		break;
		}
		LLVM_FALLTHROUGH;
		case Mangled::ManglingScheme::eManglingSchemeMSVC:
		case Mangled::ManglingScheme::eManglingSchemeRustV0:
		case Mangled::ManglingScheme::eManglingSchemeD:
		// We don't have a mangle tree API available. Fall back
		// to parsing demangled function names to find alternates
		CollectAlternateFunctionNamesFromMangled(results, name, sc);
		break;
		}
		}

return results;		return results;
}		}

lldb/test/API/lang/cpp/abi_tag_lookup/Makefile

This file was added.

				CXX_SOURCES := main.cpp

				include Makefile.rules

lldb/test/API/lang/cpp/abi_tag_lookup/TestAbiTagLookup.py

This file was added.

				"""
				Test that we can call functions and use types
				annotated (and thus mangled) with ABI tags.
				"""

				import lldb
				from lldbsuite.test.decorators import *
				from lldbsuite.test.lldbtest import *
				from lldbsuite.test import lldbutil

				class AbiTagLookupTestCase(TestBase):

				@skipIfWindows
				@expectedFailureAll(debug_info=["dwarf", "gmodules", "dwo"])
				labathUnsubmitted Not Done Reply Inline Actions This means the test is only expected to pass with the `dsym` debug info flavour. Is that really what you wanted to say? labath: This means the test is only expected to pass with the `dsym` debug info flavour. Is that really…
				Michael137AuthorUnsubmitted Done Reply Inline Actions Yup this was intentional. All these suffer from issues with unqualified lookup due to the way we construct the template decls Michael137: Yup this was intentional. All these suffer from issues with unqualified lookup due to the way…
				def test_abi_tag_lookup(self):
				self.build()
				lldbutil.run_to_source_breakpoint(self, 'Break here',
				lldb.SBFileSpec("main.cpp", False))

				# Qualified/unqualified lookup to templates in namespace
				self.expect_expr("operator<(b1, b2)", result_type="bool", result_value="true")
				self.expect_expr("A::operator<(b1, b2)", result_type="bool", result_value="true")
				self.expect_expr("b1 < b2", result_type="bool", result_value="true")

				# Qualified/unqualified lookup to templates with ABI tags in namespace
				self.expect_expr("operator>(b1, b2)", result_type="bool", result_value="true")
				self.expect_expr("A::operator>(b1, b2)", result_type="bool", result_value="true")
				self.expect_expr("b1 > b2", result_type="bool", result_value="true")

				# Call non-operator templates with ABI tags
				self.expect_expr("A::withAbiTagInNS(1, 1)", result_type="int", result_value="1")

				self.expect_expr("A::withAbiTagInNS(1.0, 1.0)", result_type="int", result_value="2")
				self.expect_expr("withAbiTagInNS(b1, b2)", result_type="int", result_value="2")
				self.expect_expr("A::withAbiTagInNS(b1, b2)", result_type="int", result_value="2")

				self.expect_expr("withAbiTag(b1, b2)", result_type="int", result_value="3")
				self.expect_expr("withAbiTag(0, 0)", result_type="int", result_value="-3")

				# Structures with ABI tags
				self.expect_expr("t.Value()", result_type="const int", result_value="4")
				self.expect_expr("tt.Value()", result_type="const int", result_value="5")

				self.expect_expr("Tagged{.mem = 6}", result_type="Tagged",
				result_children=[ValueCheck(name="mem", value="6")])

				# Inline namespaces with ABI tags
				self.expect_expr("v1::withImplicitTag(Simple{.mem = 6})", result_type="int", result_value="6")
				self.expect_expr("withImplicitTag(Simple{.mem = 6})", result_type="int", result_value="6")

				# FIXME: this currently fails because CPlusPlusLanguage::FindBestAlternateFunctionMangledName
				# requires at least the parameters of the symbols it compares to match. However,
				# the IR we generate doesn't include ABI tags on the parameter. To fix this we need
				# to add ABI tag attributes to the decls that DWARFASTParserClang generates.
				self.expect("expr v1::withImplicitTag(Tagged{.mem = 6})", error=True, substrs=["error: Couldn't lookup symbols:"])
				self.expect("expr withImplicitTag(Tagged{.mem = 6})", error=True, substrs=["error: Couldn't lookup symbols:"])

lldb/test/API/lang/cpp/abi_tag_lookup/main.cpp

This file was added.

				#include <cstdio>

				namespace A {
				template <typename T> bool operator<(const T &, const T &) { return true; }

				template <typename T>
				[[gnu::abi_tag("Two", "Tags")]] bool operator>(const T &, const T &) {
				return true;
				}

				template <typename T>
				[[gnu::abi_tag("OneTag")]] bool operator==(const T &, const T &) {
				return true;
				}

				[[gnu::abi_tag("Foo")]] int withAbiTagInNS(const int &, const int &) {
				return 1;
				}

				template <typename T>
				[[gnu::abi_tag("Bar")]] int withAbiTagInNS(const T &, const T &) {
				return 2;
				}

				struct B {};
				} // namespace A

				template <typename T>
				[[gnu::abi_tag("Baz")]] int withAbiTag(const T &, const T &) {
				return 3;
				}

				[[gnu::abi_tag("Baz")]] int withAbiTag(const int &, const int &) { return -3; }

				struct Simple {
				int mem;
				};

				struct [[gnu::abi_tag("Qux")]] Tagged {
				int mem;

				int const &Value() const { return mem; }
				};

				template <typename T> struct [[gnu::abi_tag("Quux", "Quuux")]] TaggedTemplate {
				T mem;

				T const &Value() const { return mem; }
				};

				// clang-format off
				inline namespace [[gnu::abi_tag("Inline", "NS")]] v1 {
				template <typename T> int withImplicitTag(T const &t) { return t.mem; }
				} // namespace
				// clang-format on

				int main() {
				A::B b1;
				A::B b2;
				Tagged t{.mem = 4};
				TaggedTemplate<int> tt{.mem = 5};

				int result = (b1 < b2) + (b1 > b2) + (b1 == b2) + withAbiTag(b1, b2) +
				A::withAbiTagInNS(1.0, 2.0) + withAbiTagInNS(b1, b2) +
				A::withAbiTagInNS(1, 2) + withImplicitTag(Tagged{.mem = 6}) +
				withImplicitTag(Simple{.mem = 6}) + t.Value() + tt.Value();

				std::puts("Break here");

				return result;
				}