This is an archive of the discontinued LLVM Phabricator instance.

[lldb][ClangExpression] Add asm() label to all FunctionDecls we create from DWARF
ClosedPublic

Authored by Michael137 on Aug 16 2022, 9:08 AM.

Details

Summary

When resolving symbols during IR execution, lldb makes a last effort attempt
to resolve external symbols from object files by approximate name matching.
It currently uses CPlusPlusNameParser to parse the demangled function name
and arguments for the unresolved symbol and its candidates. However, this
hand-rolled C++ parser doesn’t support ABI tags which, depending on the demangler,
get demangled into [abi:tag]. This lack of parsing support causes lldb to never
consider a candidate mangled function name that has ABI tags.

The issue reproduces by calling an ABI-tagged template function from the
expression evaluator. This is particularly problematic with the recent
addition of ABI tags to numerous libcxx APIs.

The issue stems from the fact that clang::CodeGen emits function
function calls using the mangled name inferred from the FunctionDecl
LLDB constructs from DWARF. Debug info often lacks information for
us to construct a perfect FunctionDecl resulting in subtle mangled
name inaccuracies.

This patch side-steps the problem of inaccurate FunctionDecls by
attaching an asm() label to each FunctionDecl LLDB creates from DWARF.
clang::CodeGen consults this label to get the mangled name as one of
the first courses of action when emitting a function call.

LLDB already does this for C++ member functions as of
675767a5910d2ec77ef8b51c78fe312cf9022896

Testing

  • Added API tests

Diff Detail

Event Timeline

Michael137 created this revision.Aug 16 2022, 9:08 AM
Herald added a project: Restricted Project. · View Herald Transcript
Michael137 requested review of this revision.Aug 16 2022, 9:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 16 2022, 9:08 AM

This implements @labath's suggestion from https://reviews.llvm.org/D131335

Following tests fail with this:

lang/cpp/extern_c/TestExternCSymbols.py
lang/cpp/namespace/TestNamespaceLookup.py
macosx/rosetta/TestRosetta.py

Investigating currently...

lang/cpp/extern_c/TestExternCSymbols.py

I think that's because extern "C" symbols don't have a DW_AT_linkage_name attribute. Generally speaking, I don't think we can assume all debug info entries will have one (e.g. -gsce makes clang omit those), and should add them only if available.

This comment was removed by Michael137.
lang/cpp/extern_c/TestExternCSymbols.py

I think that's because extern "C" symbols don't have a DW_AT_linkage_name attribute. Generally speaking, I don't think we can assume all debug info entries will have one (e.g. -gsce makes clang omit those), and should add them only if available.

Yup makes sense. Checking the others. Wonder if it needs a check for attrs.storage == SC_Extern

  • Check mangled name and storage class before attaching label
  • Remove libcxx ABI-tag workaround from API test makefile
aprantl accepted this revision.Aug 16 2022, 9:57 AM

Thanks! This looks like a fine incremental improvement. But we should aim at eventually getting rid of these XFAILs.

This revision is now accepted and ready to land.Aug 16 2022, 9:57 AM
labath accepted this revision.Aug 16 2022, 10:17 AM

For me this wins on simplicity grounds alone.

  • Check mangled name and storage class before attaching label

I don't think that the storage class check is helping anything. Non-external functions can have DW_AT_linkage_names and ABI tags (although their usefulness is questionable) as well. Of course, that means that the local (indicated by the L in the name) mangled name is not necessarily unique (and that's why I said that going off of the address would be more correct). However, this is not something we can fix by letting clang deduce the name for itself.

For me this wins on simplicity grounds alone.

  • Check mangled name and storage class before attaching label

I don't think that the storage class check is helping anything. Non-external functions can have DW_AT_linkage_names and ABI tags (although their usefulness is questionable) as well. Of course, that means that the local (indicated by the L in the name) mangled name is not necessarily unique (and that's why I said that going off of the address would be more correct). However, this is not something we can fix by letting clang deduce the name for itself.

Any particular reason for not removing the SC_Extern check? I'm pretty sure I could create a test case with a static abi-tagged function which we wouldn't be able to call with that check in place...

For me this wins on simplicity grounds alone.

  • Check mangled name and storage class before attaching label

I don't think that the storage class check is helping anything. Non-external functions can have DW_AT_linkage_names and ABI tags (although their usefulness is questionable) as well. Of course, that means that the local (indicated by the L in the name) mangled name is not necessarily unique (and that's why I said that going off of the address would be more correct). However, this is not something we can fix by letting clang deduce the name for itself.

Any particular reason for not removing the SC_Extern check? I'm pretty sure I could create a test case with a static abi-tagged function which we wouldn't be able to call with that check in place...

Boiled down to this comment:

the local (indicated by the L in the name) mangled name is not necessarily unique

Attaching it to non-external functions caused cpp/namespace/TestNamespaceLookup.py to fail. With a non-static and a static version of func() LLDB chose the wrong one. I can follow up on whether this is a bug elsewhere and how feasible it would be to unconditionally attach the label

Any particular reason for not removing the SC_Extern check? I'm pretty sure I could create a test case with a static abi-tagged function which we wouldn't be able to call with that check in place...

Boiled down to this comment:

the local (indicated by the L in the name) mangled name is not necessarily unique

Attaching it to non-external functions caused cpp/namespace/TestNamespaceLookup.py to fail. With a non-static and a static version of func() LLDB chose the wrong one. I can follow up on whether this is a bug elsewhere and how feasible it would be to unconditionally attach the label

Ah, this is a fun one. So this breaks the test_scope_lookup_with_run_command test in TestNamespaceLookup.py, but simultaneously "fixes" test_file_scope_lookup_with_run_command. I would say this is a pre-existing bug, that just happens to manifest itself differently with this change. The problem is that lldb is not able to correctly scope CU-local objects, as it ends up putting all files into one huge AST context. It was mostly accidental that lldb picked the external version of the function (because it happened to get its name mangling "right"). Now it happens to pick the static one as they both have the same signature (and I guess this is the one that gets parsed first).

I'd say we should just flip the xfails around, since this is already pretty broken, and the asm labels definitely help in some situations.

Michael137 added a comment.EditedAug 19 2022, 6:30 AM

Any particular reason for not removing the SC_Extern check? I'm pretty sure I could create a test case with a static abi-tagged function which we wouldn't be able to call with that check in place...

Boiled down to this comment:

the local (indicated by the L in the name) mangled name is not necessarily unique

Attaching it to non-external functions caused cpp/namespace/TestNamespaceLookup.py to fail. With a non-static and a static version of func() LLDB chose the wrong one. I can follow up on whether this is a bug elsewhere and how feasible it would be to unconditionally attach the label

Ah, this is a fun one. So this breaks the test_scope_lookup_with_run_command test in TestNamespaceLookup.py, but simultaneously "fixes" test_file_scope_lookup_with_run_command. I would say this is a pre-existing bug, that just happens to manifest itself differently with this change. The problem is that lldb is not able to correctly scope CU-local objects, as it ends up putting all files into one huge AST context. It was mostly accidental that lldb picked the external version of the function (because it happened to get its name mangling "right"). Now it happens to pick the static one as they both have the same signature (and I guess this is the one that gets parsed first).

When I initially looked at it, it seemed unintuitive for LLDB to pick the file-static function (from a context where it's usually not accessible) over the one in the header so I kept the check in.
But it's true that the storage-class check just works around a different bug. There seem to always be cases where we pick an incorrect overload no matter how you flip it.
Happy to remove the check and flip the XFAIL