This is an archive of the discontinued LLVM Phabricator instance.

Add DW_AT_linkage_name for DW_TAG_labels
Needs ReviewPublic

Authored by dimitry on May 15 2018, 7:06 AM.

Details

Summary

Since DW_AT_name is not a symbol name for DW_TAG_label DIEs
add DW_AT_linkage_name with original symbol name.

Bug: https://bugs.llvm.org/show_bug.cgi?id=37470

Diff Detail

Event Timeline

dimitry created this revision.May 15 2018, 7:06 AM
davide requested changes to this revision.May 15 2018, 7:51 AM
davide added a subscriber: davide.

This needs a testcase.

This revision now requires changes to proceed.May 15 2018, 7:51 AM
aprantl requested changes to this revision.May 15 2018, 7:52 AM
aprantl added a subscriber: aprantl.

I'm not sure that stripping the leading _ unconditionally is correct either. This code exists to emit DWARF for labels in assembler source code and I'm assuming that this patch will break that use-case.

It would be good to add both a testcase for your intended use-case and one with an assembler label (if that doesn't exist already).

Interestingly the test specifically checks if underscore is removed. Looks like this was very much intentional. But I still do not understand why.. It makes the DW_AT_name of the label differ from elf symbol name (in .dynsym section).

The goal would be for the name to match what the user wrote - not
necessarily what the symbol name is.

It looks like in our case it ended up being not what the user wrote.. and this is especially confusing because libc and exit symbol too (and __exit, which gets converted to _exit...

If there's a difference between the in-source name and the exposed-to-linker name, the in-source name should be provided to DW_AT_name and the exposed-to-linker name should be in DW_AT_linkage_name. Don't try to overload DW_AT_name with any kind of decoration imposed for linking purposes.

dimitry updated this revision to Diff 147074.May 16 2018, 6:47 AM
dimitry retitled this revision from Keep underscores in DW_TAG_label symbol names to Add DW_AT_linkage_name for DW_TAG_labels.
dimitry edited the summary of this revision. (Show Details)

Added DW_AT_linkage_name instead of keeping underscore in DW_AT_name.

dimitry updated this revision to Diff 147080.May 16 2018, 6:59 AM

git clang-format --style=file

Is this correct? What if the user has a label called "_foo"?

I would've thought this '_' handling thing was specifically a OSX/iOS thing (from what I've seen in its handling in LLVM in other places) - so I'm surprised it isn't conditionalized in some way?

It would be good to add both a testcase for your intended use-case and one with an assembler label (if that doesn't exist already).

I am not sure if I understand - my intended use-case here is in fact an assembler label (a .glob label). Can you please give me a pointer to another use-case? I am not very familiar with llvm code base.

aprantl accepted this revision.Jun 1 2018, 8:30 AM

I think this looks good now.

There is still a question about which fix is the correct one. This one or https://reviews.llvm.org/D47104. To reiterate which use case requires clang to remove leading underscores in DW_AT_name?

Based on @echristo's comment in D47104 it feels like this is the preferable way to move forward with this.

echristo accepted this revision.Oct 19 2018, 3:21 PM

LGTM. Thanks!

-eric

JohnReagan added inline comments.
lib/MC/MCDwarf.cpp
562

Uh, DW_AT_linkage_name isn't intended to be used with DW_TAG_labels. It is really meant to provide the mangled names with entries, etc. From DWARF4 spec

"Debugging information entries to which DW_AT_linkage_name may apply include: DW_TAG_common_block, DW_TAG_constant, DW_TAG_entry_point, DW_TAG_subprogram and DW_TAG_variable."

DWARF5 dropped that wording, but you can still infer that list from Appendix A, etc.

I can see that we already emit DW_AT_prototyped for DW_TAG_labels (also extra to the standard).

There are other debuggers who consume this stuff. We're about to start work on the OpenVMS debugger to bring it up past DWARF3 and I would have assumed (incorrectly) that nobody would emit these DW_AT_ attributes for labels.

Is there a document somewhere that tracks these extra-standard extensions?

aprantl added inline comments.Oct 22 2018, 8:56 AM
lib/MC/MCDwarf.cpp
562

DWARF5 dropped that wording, but you can still infer that list from Appendix A, etc.

I wonder if the information in Appendix A (which is non-normative) isn't an oversight. Might be worth bringing this up on the dwarf-discuss mailing list.

Is there a document somewhere that tracks these extra-standard extensions?

Not that I'm aware of. Generally LLVM tends to be fairly permissive about emitting extra dwarf attributes based on that consumers can always ignore extra attributes they don't understand.

probinson added inline comments.Oct 22 2018, 11:09 AM
lib/MC/MCDwarf.cpp
562

Appendix A is really a best-effort list, not definitive (which is explicitly stated in the opening text of the appendix). DWARF rarely mandates anything, and permissively allows extra attributes not specified in the normative text to appear in a DIE. Any reader needs to be ready for that sort of thing.
In the case at hand, it seems there are situations where the label's name in the source text is not the same as the name exposed to the linker, and that's pretty much exactly what DW_AT_linkage_name is for. I might prefer that the extra attribute appear only when needed, but we don't make a huge effort to support the best possible debug info for assembler source (which is the case here).

JohnReagan added inline comments.Oct 22 2018, 11:37 AM
lib/MC/MCDwarf.cpp
562

Got it. Thanks.

I do agree about omitting it if the linkage_name is the same as the name since that reduces the number of references to the string table. Given your lightning talk and the reduction of string references that led to the changes for DWARF5, no sense making a DWARF4 (or earlier) user pay that extra cost. You do end up with needing two Abbrevs however.

I still have a question about the use of DW_AT_prototyped (the normative description is in the "Miscellaneous Subprogram Properties" section) and what it means on a label but I'll go dig for when that was included and the rationale behind it.