This is an archive of the discontinued LLVM Phabricator instance.

Better scheme to lookup alternate mangled name when looking up function address.
ClosedPublic

Authored by sivachandra on Sep 11 2015, 11:25 AM.

Details

Summary

This change is relevant for inferiors compiled with GCC. GCC does not
emit complete debug info for std::basic_string<...>, and consequently, Clang
(the LLDB compiler) does not generate correct mangled names for certain
functions.

This change removes the hard-coded alternate names in
ItaniumABILanguageRuntime.cpp.

Before the hard-coded names were put in ItaniumABILanguageRuntime.cpp, one could
not evaluate std::string methods (ex. std::string::length). After putting in
the hard-coded names, one could evaluate them. However, it did not still
enable one to call methods on, say for example, std::vector<string>.
This change makes that possible.

There is some amount of incompleteness in this change. Consider the
following example:

std::string hello("hello"), world("world");
std::map<std::string, std::string> m;
m[hello] = world;

One can still not evaluate the expression "m[hello]" in LLDB. Will
address this issue in another pass.

Diff Detail

Event Timeline

sivachandra retitled this revision from to Better scheme to lookup alternate mangled name when looking up function address..
sivachandra updated this object.
sivachandra added reviewers: clayborg, spyffe.
sivachandra added a subscriber: lldb-commits.

Just a mention: This change does have an impact on expression evaluation performance and memory utilization.

clayborg requested changes to this revision.Sep 14 2015, 11:19 AM
clayborg edited edge metadata.

So let me try to understand. When we are asked during expressions to lookup some mangled named for "std::string::length()", it doesn't exist in GCC binaries. So we want to then find any alternate manglings and we do this by asking the symbol file to find alternate manglings for "std::string::length"? I.E. we remove the parens and any arguments and lookup just the fully qualified function name?

If so, we should just lookup functions using:

size_t
Module::FindFunctions (const ConstString &name,
               const CompilerDeclContext *parent_decl_ctx,
               uint32_t name_type_mask, 
               bool symbols_ok,
               bool inlines_ok,
               bool append, 
               SymbolContextList& sc_list);

And then look at all of the symbol contexts in sc_list and find a function that we want? I don't really see the need for any of the new SymbolFile::GetMangledNamesForFunction() functions.

This revision now requires changes to proceed.Sep 14 2015, 11:19 AM

I could very well be missing something obvious. However, let me explain what I am trying to solve here. Lets take the example of std::vector<string>::size method. The DWARF we get when compiled with GCC is as follows:

< 3><0x000020a3>        DW_TAG_subprogram
                          DW_AT_external              yes(1)
                          DW_AT_name                  "size"
                          DW_AT_decl_file             0x00000003 /usr/include/c++/4.8/bits/stl_vector.h
                          DW_AT_decl_line             0x00000285
                          DW_AT_linkage_name          "_ZNKSt6vectorISsSaISsEE4sizeEv"
                          DW_AT_type                  <0x00001eb1>
                          DW_AT_accessibility         DW_ACCESS_public
                          DW_AT_declaration           yes(1)
                          DW_AT_object_pointer        <0x000020bc>
                          DW_AT_sibling               <0x000020c2>

If we demangle the linkage name from here, we get "std::vector<std::string, std::allocator<std::string> >::size() const". So, the m_function_fullname_index of SymbolFileDWARF will have entry for this function with this full name.

However, due to missing debug info elsewhere, the IR generated by clang (the LLDB compiler), generates a mangled name like this:
"_ZNKSt6vectorISbIcSt17char_traits<char>St15allocator<char>ESt82allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >E4sizeEv"

This demangles to
"std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::size() const"

Since neither the clang generated mangled name is present in the ELF symtab, and nor its corresponding demangled name is not present in any of the DWARF indices, the existing FindFunctions will not be helpful. Also, only functions with debug info (those which have an address specified in the DWARF) are indexed.

What I am doing in my change is to use the fact that all methods (and their types) are grokked while creating the AST for clang (the LLDB compiler). So, when a method is grokked, store a map from its scoped name to its DIE. Even if there were any discrepancies in the mangled name in the debug info versus that generated by the LLDB compiler, the fully scoped names should be the same. In which case, use the fully scoped name to get to the DIE and retrieve its "actual" mangled name.

spyffe requested changes to this revision.Sep 14 2015, 1:22 PM
spyffe edited edge metadata.

I LOVE the idea of getting rid of those horrid "alternate manglings." We knew what the mangling was during name lookup, we should be able to recognize them later!
As listed in my inline comments, I have some concerns about the scope. This knowledge is built up during expression parsing and used during expression parsing – we're done.
Thanks for working on this, Siva.

source/Expression/ClangExpressionDeclMap.cpp
661 ↗(On Diff #34565)

This should definitely only be done if we can't find the name the original way. I'm always happy to pay extra runtime to fix an expression that would otherwise not work – but expressions that would work (the >90% case) shouldn't be paying for this.

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
207

Why is this attached to the DWARF? I would want to attach this to the ClangExpressionDeclMap because we identify these alternate names during function name lookup, and we just need to remember them when resolving the references in IR. After that, they are no longer needed.

sivachandra added inline comments.Sep 14 2015, 1:44 PM
source/Expression/ClangExpressionDeclMap.cpp
661 ↗(On Diff #34565)

The "original" way is still attempted first at line 643 above. Lines 669 to 674 below take care of another problem: http://reviews.llvm.org/D12613. Since that problem is much more a rare case than that solved in this change, I chose to keep this "try" before that "try".

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
207

My thinking was, DWARF is the only thing which knows about the correct mangled name, so keep it close to the code dealing with DWARF. Your suggestion also makes sense, but might (I have not yet thought enough about it) require us to expose DIE info into ClangExpressionDeclMap. I will think more about this approach and get back to you.

sivachandra added inline comments.Sep 15 2015, 12:43 PM
source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
207

I spent some time thinking about this. ClandExpressionDeclMap doesn't really explicitly lookup method names. If we have an expression like "v.size()", we lookup what "v" is, and that conveys to Clang about the existence of a method "size" in its class. The requirement for alternate names kicks in (so to say) when we are looking for the address of the method. I am not very clear on how we can cleanly keep track of all the methods parsed, while looking up variables, in ClangExpressionDeclMap and use that knowledge while looking up addresses. Do you have any suggestions?

I agree that it is indeed odd to have a method GetMangledNamesForFunction in SymbolFile which is useful only for expression evaluation. How about having a temp object that ClangExpressionDeclMap registers with SymbolFile, and cleans it up after expression evaluation is done? SymbolFile stuffs in to the object all info that ClangExpressionDeclMap could potentially use while parsing the DIEs. ExpressionEvaluationIndex?

Ping. Any suggestion on how to take this forward?

Sean is the best person to answer your questions. I will ping him and have him comment.

Ping.
The patch in its current state will require a rebase. However, any suggestions on how to take this forward would be great. I can rebase and make modifications together.

Friendly periodic ping. Let me know if once in 2 days is too frequent.

sivachandra updated this revision to Diff 36818.Oct 7 2015, 6:42 PM
sivachandra edited edge metadata.

Rebase.

Another friendly ping.

Ping. I have few follow up changes over whatever goes in here. Would greatly appreciate if someone can make a decision on this, or advice on how to take this forward.

clayborg resigned from this revision.Oct 13 2015, 4:24 PM
clayborg edited reviewers, added: jingham; removed: clayborg.

I will get Sean to take a look.

dawn added a subscriber: dawn.

Eugene has been doing some work in this area - perhaps he can accept this patch for you.

dawn requested changes to this revision.Nov 9 2015, 1:52 PM
dawn added a reviewer: dawn.
  1. Please rebase again - this patch no longer applies cleanly, and fails to build after fixing merge conflicts.
  2. Please add tests.
This revision now requires changes to proceed.Nov 9 2015, 1:52 PM
sivachandra updated this revision to Diff 41706.Dec 2 2015, 6:16 PM
sivachandra edited edge metadata.

Rebased and updated to take in DWO changes.

PTaL
I finally got time to work on this. I have now rebased it and updated it take in DWO changes.

I understand this change only enhances debug experience when GCC is used as the DWARF producer (the targeted functionality already works as expected when the producer is clang). However, GCC is still very important for Android development and hence this fix is very useful for us.

A test for this already exists: TestCallStdStringFunction.py

dawn accepted this revision.Dec 3 2015, 12:39 AM
dawn edited edge metadata.

lgtm. Patch was applied locally and tested with no regressions. Thanks for fixing the patch.

@dawn: Thanks for accepting the patch. I guess I still need to wait for sign-off/comments from a CODE_OWNER.

dawn added a comment.Dec 4 2015, 11:45 AM

@dawn: Thanks for accepting the patch. I guess I still need to wait for sign-off/comments from a CODE_OWNER.

Yeah. Maybe add more proactive reviewers? And definitely be a squeaky wheel about it. Keep adding reviewers and spamming them until someone gets irritated enough to have a look :)

dawn added a comment.Dec 16 2015, 11:45 AM

Can a "code owner" please review this patch? Thanks, -Dawn

dawn added a comment.Jan 6 2016, 12:39 PM

@spyffe I think this review is waiting on you since you had previously rejected it. Can you please have a look? It continues to show up on my "Blocking Others" list and there's nothing more I can do on my side. Please review or resign out of curtesy to sivachandra??? Thanks in advance.

@sivachandra If spyffe (or anyone for that matter) doesn't respond in a reasonable time frame, I would say it's safe to assume he is inactive and you can go ahead and commit based on my acceptance. This patch has been on review for months - you've given folks more than enough time to voice any objections. That said, it would be nice to get at least one other person's eyes on this.

dawn added a subscriber: clayborg.

Adding Greg - he is an excellent reviewer and quite responsive (poor guy - ends up with way more than his fair share of reviews as a result :( ).

@clayborg would you be willing to be a 2nd pair of eyes here?

clayborg resigned from this revision.Jan 6 2016, 1:23 PM
clayborg removed a reviewer: clayborg.

I looked at this before, and resigned to let Sean Callanan comment on the fix since this is in his area of expertise: the expression parser.

spyffe accepted this revision.Jan 6 2016, 1:30 PM
spyffe edited edge metadata.

The expression parser side fixes look fine to me, and they remove some hacks and make things slightly more elegant.
Obviously the ideal would be to have GCC generate full debug info, but to me this patch looks like a fine next-best thing.

This revision is now accepted and ready to land.Jan 6 2016, 1:30 PM

Thanks a lot Dwan Perchik, for persisting on my behalf.

And thanks to Sean and Greg for taking a look.

I second Dwan's earlier comment about Greg's code reviews. He has been the best reviewer I have worked with, internally and externally.

Will commit this change shortly after rebasing.

sivachandra updated this revision to Diff 44274.Jan 7 2016, 3:34 PM
sivachandra edited edge metadata.

Rebase

sivachandra closed this revision.Jan 7 2016, 3:36 PM