This is an archive of the discontinued LLVM Phabricator instance.

[lldb][CPlusPlusLanguage] Respect the step-avoid-regex for functions with auto return types
ClosedPublic

Authored by Michael137 on Oct 6 2022, 4:51 PM.

Details

Summary

Summary

The primary motivation for this patch is to make sure we handle
the step-in behaviour for functions in the std namespace which
have an auto return type. Currently the default step-avoid-regex
setting is ^std:: but LLDB will still step into template functions
with auto return types in the std namespace.

Details
When we hit a breakpoint and check whether we should stop, we call
into ThreadPlanStepInRange::FrameMatchesAvoidCriteria. We then ask
for the frame function name via SymbolContext::GetFunctionName(Mangled::ePreferDemangledWithoutArguments).
This ends up trying to parse the function name using CPlusPlusLanguage::MethodName::GetBasename which
parses the raw demangled name string.

CPlusPlusNameParser::ParseFunctionImpl calls ConsumeTypename to skip
the (in our case auto) return type of the demangled name (according to the
Itanium ABI this is a valid thing to encode into the mangled name). However,
ConsumeTypename doesn't strip out a plain auto identifier
(it will strip a `decltype(auto) return type though). So we are now left with
a basename that still has the return type in it, thus failing to match the ^std::
regex.

Example frame where the return type is still part of the function name:

Process 1234 stopped
* thread #1, stop reason = step in
    frame #0: 0x12345678 repro`auto std::test_return_auto<int>() at main.cpp:12:5
   9
   10   template <class>
   11   auto test_return_auto() {
-> 12       return 42;
   13   }

This is another case where the CPlusPlusNameParser breaks us in subtle ways
due to evolving C++ syntax. There are longer-term plans of replacing the hand-rolled
C++ parser with an alternative that uses the mangle tree API to do the parsing for us.

Testing

  • Added API and unit-tests
  • Adding support for ABI tags into the parser is a larger undertaking which we would rather solve properly by using libcxxabi's mangle tree parser

Diff Detail

Event Timeline

Michael137 created this revision.Oct 6 2022, 4:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2022, 4:51 PM
Michael137 requested review of this revision.Oct 6 2022, 4:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2022, 4:51 PM
Michael137 added inline comments.Oct 6 2022, 4:52 PM
lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
68–71

git clang-format

Michael137 edited the summary of this revision. (Show Details)Oct 6 2022, 4:58 PM
Michael137 edited the summary of this revision. (Show Details)
Michael137 edited the summary of this revision. (Show Details)Oct 6 2022, 5:00 PM
labath accepted this revision.Oct 10 2022, 3:08 AM

There are longer-term plans of replacing the hand-rolled C++ parser with an alternative that uses the mangle tree API to do the parsing for us.

You may be aware of this, but I feel I should mention that there are cases when a function just does not have a mangled name, either because it is in an extern "C" block, or because it was complied with --dwarf-linkage-names=Abstract (default for -gsce). In this case, we construct a fake demangled name from the DWARF debug info (the names of enclosing (DW_TAG_)namespaces, and the types for (DW_TAG_)formal_parameters. Of course, in this case, it makes even less sense to parse the resulting string, since we're the ones who constructed it in the first place. However, it may not be sufficient to assume that one can just start with a mangled name, and get everything out that way.

This revision is now accepted and ready to land.Oct 10 2022, 3:08 AM
  • Reword commit message
  • Remove redundant header from test

There are longer-term plans of replacing the hand-rolled C++ parser with an alternative that uses the mangle tree API to do the parsing for us.

You may be aware of this, but I feel I should mention that there are cases when a function just does not have a mangled name, either because it is in an extern "C" block, or because it was complied with --dwarf-linkage-names=Abstract (default for -gsce). In this case, we construct a fake demangled name from the DWARF debug info (the names of enclosing (DW_TAG_)namespaces, and the types for (DW_TAG_)formal_parameters. Of course, in this case, it makes even less sense to parse the resulting string, since we're the ones who constructed it in the first place. However, it may not be sufficient to assume that one can just start with a mangled name, and get everything out that way.

Good points, thanks! Didn't know about the --dwarf-linkage-names=Abstract option

Looks like this change broke the windows lldb bot: https://lab.llvm.org/buildbot/#/builders/83/builds/24631

I thought I had skipped it on Windows. Will fix now