Page MenuHomePhabricator

Fix debug-info generation for block invocations so that we set the LinkageName instead of the Name
ClosedPublic

Authored by shafik on Jan 23 2020, 10:41 AM.

Details

Summary

clang has an extension for block invocations whose mangled name starts with ___Z. Currently when generating debug-info for block invocations we are setting the Name to the mangled name as opposed to the LinkageName. This means we see the mangled name for block invcations ends up in DW_AT_Name as opposed to DW_AT_linkage_name.

This patch fixes this case so the LinkageName for block invocations is set to the mangled name.

Diff Detail

Event Timeline

shafik created this revision.Jan 23 2020, 10:41 AM
aprantl added inline comments.Jan 24 2020, 4:31 PM
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

Could you please add a comment that Clang Blocks are generated as raw llvm::Functions but do have a mangled name and that is handling this case? Otherwise this would look suspicious.

aprantl added a project: debug-info.
aprantl added inline comments.
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

Should *all* raw LLVM functions have their name as the linkage name? Perhaps a raw LLVM function should only have a linkage name and no human-readable name?

dblaikie added inline comments.Jan 24 2020, 7:50 PM
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

Seems plausible to me - do we have any data on other types of functions that hit this codepath?

shafik marked an inline comment as done.Jan 28 2020, 11:30 AM
shafik added inline comments.
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

So it was not obvious to me what other cases would this branch so I added an assert and ran check-clang and from that I saw four cases that ended up here:

GenerateCapturedStmtFunction
GenerateOpenMPCapturedStmtFunction
GenerateBlockFunction
generateDestroyHelper

It is not obvious to me we want to alter the behavior of any of the other cases.

dblaikie added inline comments.Jan 28 2020, 11:38 AM
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

Could you show any small source examples & their corresponding DWARF & how that DWARF would change? (what names are we using, what names would we end up using/what sort of things are they naming)

shafik added a comment.EditedJan 28 2020, 3:00 PM

Example as requested, using the same code from clang/test/CodeGenCXX/debug-info-block-invocation-linkage-name.cpp the DWARF for block created in f() and passed to g() as an argument changes from:

0x00000052:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000060)
                DW_AT_high_pc	(0x0000000000000089)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_name	("___Z1fU13block_pointerFviE_block_invoke")
                DW_AT_decl_file	("block_demangle_3.cpp")
                DW_AT_decl_line	(4)

to

0x00000056:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000060)
                DW_AT_high_pc	(0x000000000000008c)
                DW_AT_frame_base	(DW_OP_reg6 RBP)
                DW_AT_linkage_name	("___Z1fU13block_pointerFviE_block_invoke")
                DW_AT_decl_file	("block_demangle_3.cpp")
                DW_AT_decl_line	(4)

With the only change being that the mangled name ___Z1fU13block_pointerFviE_block_invoke instead of ended up in DW_AT_name end up in DW_AT_linkage_name

Yeah, in general I think special casing a particular name mangling is probably not a great idea (what about in Microsoft mode with a different mangling scheme?), so I'm somewhat inclined towards doing this in general as @aprantl was suggesting.

shafik updated this revision to Diff 241508.Jan 30 2020, 10:36 AM
shafik marked 4 inline comments as done.

Address comments by not special casing the ___Z case, this requires setting both the Name and Linkage name because DWARF requires subroutines to have a DW_AT_name. We discovered this when altering the code to just set the LinkageName and it broke several LLDB tests.

shafik marked an inline comment as done.Jan 30 2020, 10:39 AM
shafik added inline comments.
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

Ok, I understand the objections to special casing like this. We ended up setting both the Name and LinkageName unconditionally in this branch because not setting the name for subroutines end up with us generating DW_TAG_subprogram without a DW_AT_name which is not valid. We discovered this when running the LLDB test suite.

dblaikie added inline comments.Jan 30 2020, 12:27 PM
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

If we need to have a name, and these are mangled names - were they mangled /from/ something & have an unmangled name we should be using, then?

llvm-cxxfilt demangles the example/test as "invocation function for block in f(void (int) block_pointer)" - perhaps we should name this "invocation function"? & let the scope of the DIE communicate the rest of the information about this thing (like the "operator()" for a lambda is just "operator()")?

shafik marked an inline comment as done.Jan 30 2020, 3:52 PM
shafik added inline comments.
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

If we take the example from the test I added it demangles to:

invocation function for block in f(void (int) block_pointer)

There is no usable "short" name there, it is a complex description of the block and what wraps it. In general from the LLDB perspective we would want a name to set a breakpoint or display it during a back trace. In this case we care about displaying this properly in a back trace and it would not be reasonable to use such a name to set a breakpoint.

dblaikie added inline comments.Jan 30 2020, 3:59 PM
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

How's that compare to lambdas, for example?

shafik updated this revision to Diff 241638.Jan 30 2020, 10:36 PM

Using a more targeted approach.

shafik marked an inline comment as done.Jan 30 2020, 10:49 PM
shafik added inline comments.
clang/lib/CodeGen/CGDebugInfo.cpp
3668–3672

I updated the PR to used a more targeted approach based on CodeGenFunction::GenerateBlockFunction also generating a mangled name for blocks when it creates the llvm::function.

Lambdas are interesting, they are RecordDecl with a callable operator() so we have both a DW_AT_name for them and a DW_AT_linkage_name for the callable.

Also note that blocks are an clang extension and also work in C whereas lambdas are a standard C++ feature and naming although implementation defined mostly falls out of the fact that C++ has member functions and so can be placed inside class as operator().

I don't think the more targeted fix is a good thing - is the same problem of not demangling names not there for the other cases that the original fix addressed?

shafik added a comment.Feb 3 2020, 3:58 PM

I don't think the more targeted fix is a good thing - is the same problem of not demangling names not there for the other cases that the original fix addressed?

AFAIK no, the other cases I found via the test suite are basically compiler internals and are not real mangled names e.g.:

__omp_outlined__
__omp_offloading_1000008_2f4d793_main_l12_debug__
__cxx_global_array_dtor
dblaikie accepted this revision.EditedFeb 4 2020, 7:55 PM

I don't think the more targeted fix is a good thing - is the same problem of not demangling names not there for the other cases that the original fix addressed?

AFAIK no, the other cases I found via the test suite are basically compiler internals and are not real mangled names e.g.:

__omp_outlined__
__omp_offloading_1000008_2f4d793_main_l12_debug__
__cxx_global_array_dtor

Ah, fair enough then. Thanks for the explanation!

This revision is now accepted and ready to land.Feb 4 2020, 7:55 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2020, 11:12 AM