This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Support simplified template names
ClosedPublic

Authored by aeubanks on Sep 21 2022, 11:28 AM.

Details

Summary

See https://discourse.llvm.org/t/dwarf-using-simplified-template-names/58417 for background on simplified template names.

lldb doesn't work with simplified template names because it uses DW_AT_name which doesn't contain template parameters under simplified template names.

Two major changes are required to make lldb work with simplified template names.

  1. When building clang ASTs for struct-like dies, we use the name as a cache key. To distinguish between different instantiations of a template class, we need to add in the template parameters.
  1. When looking up types, if the requested type name contains '<' and we didn't initially find any types from the index searching the name, strip the template parameters and search the index, then filter out results with non-matching template parameters. This takes advantage of the clang AST's ability to print full names rather than doing it by ourself.

An alternative is to fix up the names in the index to contain the fully qualified name, but that doesn't respect .debug_names.

Diff Detail

Event Timeline

aeubanks created this revision.Sep 21 2022, 11:28 AM
Herald added a project: Restricted Project. · View Herald Transcript
aeubanks published this revision for review.Sep 21 2022, 11:41 AM
aeubanks added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
801 ↗(On Diff #461963)

moved since now we need the clang AST to rebuild the fully qualified name
the only other caller is some logging

Herald added a project: Restricted Project. · View Herald TranscriptSep 21 2022, 11:41 AM
dblaikie added inline comments.Oct 5 2022, 1:00 PM
lldb/packages/Python/lldbsuite/test/make/Makefile.rules
230–231 ↗(On Diff #461963)

Oh, nifty place to test this - I'd been testing with the default changed in clang itself.

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1654–1656

I think there's some existing problems with lldb here (& gdb, as it happens) - this doesn't specify where the parameter pack is in the argument list (looks like it assumes it's the only template parameters?) - and doesn't handle the possibility of multiple parameter packs in a single parameter list, which I think is possible in some corner cases.

I think maybe the existing lldb code can handle some non-pack parameters followed by a single pack, but this code assumes it's either non-pack or a pack, never both - so might be more restrictive than the existing limitations? But maybe not - I might've misread/misremembered the other lldb code.

1660

OK, so this line uses Clang's AST printing (avoiding having to reimplement all the type printing in lldb, like I've done in llvm-dwarfdump/libDebugInfoDWARF) - but I guess there's a reason we can't do that at a higher level for the whole template, but have to do it per template argument?

It'd be nice if we could do it for the whole parameter list/template specialization, to avoid having to have this <>, etc handling.

I guess it's a circular problem - have to build the name to look up the AST if we already have it... so can't build the name /from/ the AST, as such. Fair enough.

1681

What's incorrect about it at the moment?

Oh, I see this code has just moved around.

1686–1687
1701–1704
1816–1834

Maybe? But the named variable and less indentation's probably good too.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2508–2512

Maybe a comment here? I guess what's happening here is that if the name found isn't a template then it isn't relevant/can't match the original query that did have template parameters?

2515–2518

And this checks the stringified template params match exactly - so there's opportunity for improvement in the future to compare with more fuzzy matching (I guess we can't use clang's AST because that'd involve making two instances of the type somehow which doesn't make a lot of sense) so that users don't have to spell the template spec exactly the same way clang does (eg: x<int, int> versus x<int,int> - or other more complicated situations with function pointers, parentheses, casts, etc.

Michael137 added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1646

Any thoughts on making this a private method of DWARFASTParserClang instead of a lambda?

1681

Why move this into ParseStructureLikeDIE? Just to remove the need for a mutable std::string& param? I guess access to die? Might be easier to review (and maintain) as a helper function still

Michael137 added inline comments.Oct 5 2022, 3:01 PM
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2497

Minor: might be easier for future readers to cross-reference

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
9359–9362

Minor

Wonder if it's worth commenting why the template_basename may or may not contain a <.

aeubanks updated this revision to Diff 465586.Oct 5 2022, 4:01 PM
aeubanks marked 5 inline comments as done.

address comments

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1654–1656

you're right, fixed to check both, and added test case

1681

so that we have access to ParseTemplateParameterInfos, which calls ParseTemplateDIE which creates clang types and stuffs them into TemplateParameterInfos, which we use to create the template parameters string.

moved into helper function

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2515–2518

lldb already requires exact name matching when looking up classes

e.g.

$ cat /tmp/a.cc
template<class T>struct Foo {};
int main() {
        Foo<int> a;
        Foo<long> b;
        Foo<float> c;
}
template<class T>struct Foo {};
int main() {
        Foo<int> a;
        Foo<long> b;
        Foo<float> c;
}
~/repos/llvm-project/build/cmake$ ./bin/lldb /tmp/a
(lldb) target create "/tmp/a"
Current executable set to '/tmp/a' (x86_64).
(lldb) im loo -t 'Foo< int>'
(lldb) im loo -t 'Foo<int>'
1 match found in /tmp/a:
id = {0x00000058}, name = "Foo<int>", byte-size = 1, decl = a.cc:1, compiler_type = "template<> struct Foo<int> {
}"
Michael137 added inline comments.Oct 6 2022, 2:45 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1528

Can this affect operator< at all? Haven't thought much about it but that's one construct that comes to mind that can contain <

$ cat operator.cpp 
template<typename T>
struct Foo {
    int x = 0;

    template<typename U>
    friend bool operator<(Foo<U> const& lhs, Foo<U> const& rhs) {
        return lhs.x < rhs.x;
    }
};

int main() {
    Foo<int> x;
    Foo<int> y;
    return x < y;
}
0x000000c8:   DW_TAG_subprogram                                    
                DW_AT_low_pc    (0x0000000100003f5c)               
                DW_AT_high_pc   (0x0000000100003f8c)               
                DW_AT_APPLE_omit_frame_ptr      (true)        
                DW_AT_frame_base        (DW_OP_reg31 WSP)          
                DW_AT_linkage_name      ("_ZltIiEbRK3FooIT_ES4_")  
                DW_AT_name      ("operator<<int>")                 
                DW_AT_decl_file ("/Users/michaelbuch/operator.cpp")
                DW_AT_decl_line (6)                                
                DW_AT_type      (0x0000000000000135 "bool")        
                DW_AT_external  (true)
Michael137 added inline comments.Oct 6 2022, 2:46 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1528

Looks like -gsimple-template-names leaves the template params in DW_AT_name in this case so we're fine? Would be nice to test though

Michael137 added inline comments.Oct 6 2022, 2:48 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1528

Oh this wouldn't apply anyway since we only care about type names, ignore this comment

Michael137 added inline comments.Oct 6 2022, 3:17 AM
lldb/test/API/lang/cpp/unique-types2/TestUniqueTypes.py
2

I think it would be good to test this with -gsimple-template-names/-gno-simple-template-names

You can pass in a dictionary into self.build() and override Makefile flags

E..g, in lldb/test/API/lang/c/forward/TestForwardDeclaration.py we do this for -gdwarf-5.

So you could have two test_XXX methods like:

def do_test(self, debug_flags):
    self.build(dictionary=debug_flags)
    ... actual test ...

def test_simple_template_name(self):
    do_test(self, dict(CFLAGS_EXTRAS="-gsimple-template-names")

def test_no_simple_template_name(self):
    do_test(self, dict(CFLAGS_EXTRAS="-gno-simple-template-names")
aeubanks updated this revision to Diff 465926.Oct 6 2022, 4:35 PM

address comments

lldb/test/API/lang/cpp/unique-types2/TestUniqueTypes.py
2

thanks, that's very useful to know! done

Looking pretty good to me FWIW

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1561

Sorry, when I gave feedback on some pieces of this that were just moved around rather than new code written in this review - I don't mind the code moving around without changes (and optionally before/after the move making improvements, but not necessary).

If possible, might simplify the code review to move the code first/separately from this change, if the move isn't too meaningless independent of the rest of the changes?

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2515–2518

Yeah, sorry - I understand it requires exact matching currently (because the index has the exact string in it) - my comment was forward-looking/idle thought that now that with simplified template names we can/have to lookup by base name, we have the option to do fuzzier matching when filtering all the base name matches - not suggesting that it's a regression that this currently does exact matching.

aeubanks added inline comments.Oct 14 2022, 11:44 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1561
rnk added a subscriber: rnk.Oct 19 2022, 2:51 PM

@Michael137, are you comfortable approving this, or can you reroute to a reviewer who can help us with this?

Michael137 added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2500

Is there some other way to determine whether the debug-info was generated from -gsimple-template-names? Then we wouldn't have to check the existence of a < in the name in multiple places and wouldn't have to do this lookup speculatively. With this change, if we fail to find a template type in the index, we would do the lookup all over again, only to not find it again. Could that get expensive? Just trying to figure out if we can avoid doing this GetTypes call twice.

There have been talks about doing a base-name search by default followed by fuzzy matching on template parameters (though in the context of function names). The simple-template-names came up as a good motivator/facilitator for doing so. But for that idea to work here we'd have to be able to retrieve from the index by basename, so perhaps out of the scope of what we're trying to do here

tagging people involved in that discussion: @dblaikie @aprantl @jingham @labath

Other than this, generally LGTM

labath added inline comments.Oct 20 2022, 2:36 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1528

reverse the condition and use early exit?

1541

dump is a "debugging aid". I guess you should call print and pass the appropriate PrintingPolicy. It might just be the default, though maybe we could also try to make an effort to match the output in the non-simplified-names case.

1551

When can this be empty? Should we still include the <> in those cases?

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2497

AT_name ?

2498

I don't think this can be keyed off of Empty() for two reasons:

  • I believe the user can pass a non-empty type list into this function, expecting it to append to it
  • just because you found one templated type above, it doesn't mean that there can't be another non-templated type somewhere (if only a part of the binary was built with simplified names).

I think we should just unconditionally (possibly guarded by a max_matches check) try both.

2500

Negative matches should be fast, and since typically only one of the branches will produce any results, I think this should be fine. Filtering matches in the simplified case would be slower, since you'd have build the full name of all potential candidates (and that could be thousands of instantiations), but I don't know how slow. But that's the price we pay for better filtering (which we don't do right now, but we could start doing afterwards).

2577

Do we need to do something similar in this FindTypes overload as well?

aeubanks updated this revision to Diff 470321.Oct 24 2022, 4:29 PM
aeubanks marked 3 inline comments as done.

address comments

rebasing to ToT, TestCPPBreakpointsLocations.py is failing when using -gsimple-template-names. still working on that

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1551

it was empty when we're working with a non-templated die, so a lot of cases. I've added an early exit and changed this to an assert

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2500

Done.

I believe the only difference with -gsimple-template-names is that the name doesn't contain the template parameters, so I don't think there's any other way of determining if the debug info was generated from -gsimple-template-names.

2577

I was wondering when this code path is even taken, turns out it's only called from lldb-test, and only in two tests:

lldb-shell :: SymbolFile/DWARF/x86/compilercontext.ll
lldb-shell :: SymbolFile/DWARF/x86/module-ownership.mm

so probably doesn't need to be updated?

aeubanks updated this revision to Diff 470617.Oct 25 2022, 2:20 PM

revert breakpoint test change
we can leave that to a later change, so I think this change is ready to go

I can't say I fully understand all of this code, but I also don't know who would, so I guess I'll just say it "looks good" :)

I am wondering about the testing situation though. If I understand correctly, you've run the test suite with hardcoded simplified names, and it all passed (?)

I am definitely not suggesting we add a new test suite mode for that, but maybe we could extend this one test case with extra check that look at the type names in other contexts than in name lookup (e.g. expression evaluation, backtraces, ???) -- just to make sure that something doesn't break there in the future. WDYT?

lldb/test/API/lang/cpp/unique-types2/main.cpp
21

Would it be interesting to test nested types as well (Foo<T>::Bar<U>) ?

cmtice added a subscriber: cmtice.Oct 27 2022, 12:34 PM
aeubanks updated this revision to Diff 471330.Oct 27 2022, 4:56 PM

fix nested types by introducing Type::GetBaseName()
add expr eval tests

regarding the failure in TestCPPBreakpointLocations.py (added recently in https://reviews.llvm.org/D135921, seems like it's catching real issues with this patch), it looks like the manual dwarf index needs a similar fix to this

I can't say I fully understand all of this code, but I also don't know who would, so I guess I'll just say it "looks good" :)

I am wondering about the testing situation though. If I understand correctly, you've run the test suite with hardcoded simplified names, and it all passed (?)

yes, that was true until https://reviews.llvm.org/D135921, which catches some missing functionality in this patch around breakpoints (I'd still like to defer that to a separate patch)

I am definitely not suggesting we add a new test suite mode for that, but maybe we could extend this one test case with extra check that look at the type names in other contexts than in name lookup (e.g. expression evaluation, backtraces, ???) -- just to make sure that something doesn't break there in the future. WDYT?

I've added expression evaluation to the test. do you have examples of backtrace tests?

lldb/test/API/lang/cpp/unique-types2/main.cpp
21

yes, and that caught a bug, thanks for the suggestion

I've added a Type::GetBaseName(), could you take a look at that?

Michael137 added inline comments.Oct 27 2022, 5:19 PM
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
3834

Could we just re-use GetTypeName and add an optional printing policy argument to it?

lldb/test/API/lang/cpp/unique-types2/main.cpp
21

Out of curiosity, what was the issue with using the qualified name?

aeubanks updated this revision to Diff 471375.Oct 27 2022, 7:33 PM
aeubanks marked an inline comment as done.

combine GetBaseName into GetTypeName

aeubanks added inline comments.Oct 27 2022, 7:35 PM
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
3812

this was mostly copied from GetTypeName above, not sure if these two should be factored out?

lldb/test/API/lang/cpp/unique-types2/main.cpp
21

GetQualifiedName returns the name including any scopes, so in this case it gave Foo<int>::Nested<char> rather than Nested<char> and the '<' string logic ended up comparing <int>::Nested<char> to <char>

labath accepted this revision.Oct 28 2022, 1:05 AM

regarding the failure in TestCPPBreakpointLocations.py (added recently in https://reviews.llvm.org/D135921, seems like it's catching real issues with this patch), it looks like the manual dwarf index needs a similar fix to this

I can't say I fully understand all of this code, but I also don't know who would, so I guess I'll just say it "looks good" :)

I am wondering about the testing situation though. If I understand correctly, you've run the test suite with hardcoded simplified names, and it all passed (?)

yes, that was true until https://reviews.llvm.org/D135921, which catches some missing functionality in this patch around breakpoints (I'd still like to defer that to a separate patch)

Yes, sounds good.

Can you elaborate on the manual index issue. My impression that it just takes the strings from inside DW_AT_name, and so it should behave similarly (identically?) to a compiler index -- though, in this case, we have direct control over it, so we could e.g. make it always store simplified names regardless of the debug info.

I am definitely not suggesting we add a new test suite mode for that, but maybe we could extend this one test case with extra check that look at the type names in other contexts than in name lookup (e.g. expression evaluation, backtraces, ???) -- just to make sure that something doesn't break there in the future. WDYT?

I've added expression evaluation to the test. do you have examples of backtrace tests?

I don't have anything particular in mind. I figured you'd just stop the middle of some templated member function and run bt. Though, when I think of it now, these names come from the demangler, so they're probably not particularly interesting here. (Unless you build with -gsce, but in that case, they should come from the same code that is doing the type templatification).

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
2509

This isn't a qualified name anymore. I guess every instance of the string "qual" in this block should be replaced or deleted.

This revision is now accepted and ready to land.Oct 28 2022, 1:05 AM
aeubanks updated this revision to Diff 471585.Oct 28 2022, 10:10 AM

rename variables

This revision was automatically updated to reflect the committed changes.