This is an archive of the discontinued LLVM Phabricator instance.

[DWARFv5][DWARFLinker] avoid stripping template names for .debug_names.
ClosedPublic

Authored by avl on Jun 27 2023, 6:37 AM.

Details

Summary

DWARFLinker puts three names for subprograms into the .apple_names and
.debug_names: short name, linkage name, name without template parameters.

DW_TAG_subprogram
   DW_AT_linkage_name "_Z3fooIcEvv"
   DW_AT_name "foo<char>"

short name: "foo<char>"
linkage name: "_Z3fooIcEvv"
name without template parameters: "foo"

DWARFv5 does not require stripping template parameters for subprogram name.
Current llvm-dwarfdump --verify reports the error if names stored in
accelerator table do not match with DIE name(name with stripped template
parameters stored in accelerator table does not match with original DIE name).
This patch does not store name without template parameters into the .debug_names table.

Diff Detail

Event Timeline

avl created this revision.Jun 27 2023, 6:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 6:37 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
avl requested review of this revision.Jun 27 2023, 6:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 6:37 AM
avl edited the summary of this revision. (Show Details)Jun 27 2023, 6:39 AM
This revision is now accepted and ready to land.Jun 27 2023, 9:15 AM
avl added a comment.Jun 28 2023, 2:43 AM

Thank you for the review!

FWIW, it might be worth considering always including the argument-stripped names. It'd make it easier for consumers to query for names when they don't match exactly (eg: foo<int,int> V foo<int, int> - minor whitespace differences, etc) and be more compatible with -gsimple-template-names.

avl added a comment.Jul 6 2023, 3:33 PM

FWIW, it might be worth considering always including the argument-stripped names. It'd make it easier for consumers to query for names when they don't match exactly (eg: foo<int,int> V foo<int, int> - minor whitespace differences, etc) and be more compatible with -gsimple-template-names.

Agreed, that might be useful. I think we need to do this in llvm-dwarfdump and clang also. so that all tools behave equally.

Sorry to come late to this, but this patch is regressing dsym debugging with DWARF 5 (which I am actively working on enabling) when compared to DWARF 4 dsym debugging.
LLDB relies heavily on querying these stripped names, and this patch creates a number of DWARF 5 failures in dsym debugging flows.

Was this patch submitted mostly to address the verifier failure? Or is there some other use case in mind?
I think we can make a strong enough case to simply disable this verifier check for now (see below)

I was going to post a review to make dsymutil keep its prior behavior (but leaving other users of DWARFLinker unchanged), however llvm-dwarfutil doesn't know how to consume YAML debug maps, so I can't update this test without other changes.

With apologies for the small wall of text, but this is the commit message I had typed prior to stumbling on this test:

[dsymutil] Add template-arguments-stripped names in debug_names

When LLDB needs to evaluate an expression like `foo(42)` and `foo` is defined as
`template<typename A>foo(A)`, LLDB queries the accelerator table for the name
`foo`. Note that the string `foo` itself is neither:

1. `AT_name`, which in this case would be `foo<int>`
2. nor `AT_linkage`, which in this case would be the mangled name of `foo<int>`.

As such, it is guaranteed that LLDB's query will fail, since those are the only
two attributes blessed by the DWARF standard to be used as keys in debug_names.

Even if LLDB were to attempt conjuring the string `foo<int>`, this is not
enough. The *contents* of the `AT_name` attribute are not standardized. For
example, how many white spaces separate nested templates like `<<<`?

With DWARF <= 4 and apple_names, `dsymutil` takes it upon itself to fix this
issue by adding the key `foo` into the table, linking the string `foo` to every
DIE corresponding to a specialization of this function. Note that this _enables_
more functionality in LLDB, but it is not a great solution, as we now have a
debug linker doing more than just linking & optimizing: it is enabling
functionality that would otherwise not be possible if the debug information were
left in object files.

This patch brings this dsymutil behavior to debug_names, so that we don't create
regressions by moving from DWARF 4 to DWARF 5 in dsym-based debugging. This is
slightly non-standard; as mentioned above, only AT_name and AT_linkage_name are
allowed to be keys. However, we believe this to be acceptable:

1. This change only affects dsym-based debugging, which already expects this
behavior.
2. The standard says that "A producer may choose to implement additional rules
for what names are placed in the index, and may communicate those rules to a
cooperating consumer via an augmentation string".
3. There have been favourable discussions about making AT_name be the basename
of the function without any template parameters [1]. In fact, some patches were
put upstream to just that for templated *types*, though the motivation there
seems to be "make debug info smaller". See [2][3][4].
4. These same discussions [1] also suggest this temporary fix as a stop-gap
measure.
avl added a comment.EditedJul 11 2023, 4:17 PM

yep. the motivation was to satisfy llvm-dwarfdump --verify. I think it would be OK to return old dsymutil behavior back and teach llvm-dwarfdump --verify to recognize this case. Probably we also need to change clang to use the same behavior. i.e. add record with stripped template names into .debug_names. So that lldb can rely on this on non-darwin platform.

I will create a patch restoring old dsymutil behavior tomorrow.

fdeazeve added a comment.EditedJul 12 2023, 3:01 AM

Probably we also need to change clang to use the same behavior. i.e. add record with stripped template names into .debug_names. So that lldb can rely on this on non-darwin platform.

Yup, I would like to make this a "non-dsym" feature as well. As I mentioned above, it doesn't seem right that the linker is trying to improve the quality of debug info, instead of only linking/optimizing it; the compiler should be the one generating all the required information.

... I've just realized that my links weren't pasted properly in the other message, so here they are for completeness. I think the relevant discussion for us is [1]

[1]: https://github.com/llvm/llvm-project/issues/58362
[2]: https://discourse.llvm.org/t/dwarf-using-simplified-template-names/58417/11
[3]: https://reviews.llvm.org/rG38c09ea2d279eddddabe3602e2002f8cdfcc5380
[4]: https://reviews.llvm.org/D134378

I will create a patch restoring old dsymutil behavior tomorrow.

Thank you!