This is an archive of the discontinued LLVM Phabricator instance.

Add debug info support for Swift/Clang APINotes.
ClosedPublic

Authored by aprantl on Mar 3 2020, 4:37 PM.

Details

Summary

Disclaimer: This is a little awkward, since this adds DWARF support for a feature that is currently only present in swift-clang but has been proposed for upstreaming (http://lists.llvm.org/pipermail/cfe-dev/2015-December/046335.html) multiple times (http://lists.llvm.org/pipermail/cfe-dev/2017-May/053860.html) and will — I believe — eventually find tis way into upstream clang.

In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF.

Diff Detail

Event Timeline

aprantl created this revision.Mar 3 2020, 4:37 PM
Herald added a project: Restricted Project. · View Herald Transcript
aprantl marked an inline comment as done.Mar 3 2020, 4:38 PM
aprantl added inline comments.
llvm/include/llvm/BinaryFormat/Dwarf.def
419

I'm fine with calling this DW_AT_LLVM_apinotes, too.

Disclaimer: This is a little awkward, since this adds DWARF support for a feature that is currently only present in swift-clang but has been proposed for upstreaming (http://lists.llvm.org/pipermail/cfe-dev/2015-December/046335.html) multiple times (http://lists.llvm.org/pipermail/cfe-dev/2017-May/053860.html) and will — I believe — eventually find tis way into upstream clang.

Those threads are fairly old (2015 and 2017 respectively). If this is a relevant goal/justiifcation, perhaps it should be moved forward a bit more along with this proposal. Though I'm not 100% sure it's necessary for this feature to be used in clang for it to be added to LLVM - Julia and other non-clang/non-llvm-project frontends have, to my knowledge, had features added to upstream LLVM for their use cases that were never generated by any frontend in llvm-project. (stackmap/patchpoint comes to mind?)

Just to clarify the purpose of the feature in this thread/review: apinotes allow externally imposing attributes on functions/entities you don't own, right? (sort of the same reason as ADL in C++ - I want to say something about "swap" but I don't own the canonical swap, so I need some mechanism to do that)

In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF.

& this applies not just to dsymutil, right? Presumably it's used by debuggers when debugging using unlinked debug info, or using non-.o+dsymutil style debug info, such as when using Swift on Linux/ELF?

Is it worth discussing whether these apinotes should be embedded directly into the debug info, rather than referenced by file name? I guess they're /sort/ of like source files (they likely have the same distribution model - so if the source is accessible, so are the apinotes), so referencing by file name makes sense?

Disclaimer: This is a little awkward, since this adds DWARF support for a feature that is currently only present in swift-clang but has been proposed for upstreaming (http://lists.llvm.org/pipermail/cfe-dev/2015-December/046335.html) multiple times (http://lists.llvm.org/pipermail/cfe-dev/2017-May/053860.html) and will — I believe — eventually find tis way into upstream clang.

Those threads are fairly old (2015 and 2017 respectively). If this is a relevant goal/justiifcation, perhaps it should be moved forward a bit more along with this proposal. Though I'm not 100% sure it's necessary for this feature to be used in clang for it to be added to LLVM - Julia and other non-clang/non-llvm-project frontends have, to my knowledge, had features added to upstream LLVM for their use cases that were never generated by any frontend in llvm-project. (stackmap/patchpoint comes to mind?)

Correct, we have language support in LLVM for languages not in the LLVM project. But since this is a Clang feature and it has been discussed previously, I thought I should link this for context.

Just to clarify the purpose of the feature in this thread/review: apinotes allow externally imposing attributes on functions/entities you don't own, right? (sort of the same reason as ADL in C++ - I want to say something about "swap" but I don't own the canonical swap, so I need some mechanism to do that)

Yes, apinotes are a YAML file that allows adding extra attributes to declarations in system header files without modifying those header files themselves.

In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF.

& this applies not just to dsymutil, right? Presumably it's used by debuggers when debugging using unlinked debug info, or using non-.o+dsymutil style debug info, such as when using Swift on Linux/ELF?

Absolutely. The end goal is for LLDB to import the APINotes, so it can access the attributes. What I'm particularly interested in right now is the Swift name attribute, which defines the name under which an (Objective-)C declaration should be imported into Swift, so LLDB can import Clang types from DWARF into Swift under the correct name.

Is it worth discussing whether these apinotes should be embedded directly into the debug info, rather than referenced by file name? I guess they're /sort/ of like source files (they likely have the same distribution model - so if the source is accessible, so are the apinotes), so referencing by file name makes sense?

The alternative I considered was to invent explicit DWARF for all of the attribute kinds and put them on the decl in CGDebugInfo and deserialize them in LLDB and put them as attributes on the Decl. However, compared to just passing the apinotes file verbatim to Swift's ClangImporter this would be a lot more churn in upstream LLVM, as we'd need to add several new attributes to almost every type and decl kind.

In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF.

& this applies not just to dsymutil, right? Presumably it's used by debuggers when debugging using unlinked debug info, or using non-.o+dsymutil style debug info, such as when using Swift on Linux/ELF?

Absolutely. The end goal is for LLDB to import the APINotes, so it can access the attributes. What I'm particularly interested in right now is the Swift name attribute, which defines the name under which an (Objective-)C declaration should be imported into Swift, so LLDB can import Clang types from DWARF into Swift under the correct name.

Is it worth discussing whether these apinotes should be embedded directly into the debug info, rather than referenced by file name? I guess they're /sort/ of like source files (they likely have the same distribution model - so if the source is accessible, so are the apinotes), so referencing by file name makes sense?

The alternative I considered was to invent explicit DWARF for all of the attribute kinds and put them on the decl in CGDebugInfo and deserialize them in LLDB and put them as attributes on the Decl. However, compared to just passing the apinotes file verbatim to Swift's ClangImporter this would be a lot more churn in upstream LLVM, as we'd need to add several new attributes to almost every type and decl kind.

oh, yeah, I wasn't thinking that level of fidelity - just the whole file embedded as a string or something. But yeah, doesn't seem like that'd be especially better. If APINotes is like a source/header file type thing (some could be user authored, others could be generated from something else, same way header/source might be generated) then it seems fine to reference it on disk.

I guess the only thing that makes me vascillate a bit is that debuggers don't usually process the source in any way - they just render it to the user. APINotes are meant to be semantically consumed by the debugger - that's where the analogy breaks down a bit for me, but not necessarily in a way that invalidates this approach. APINotes seem to be much more readily machine readable (JSon, etc) than, say, C++ source (reason debuggers don't just reparse the source to get type information, etc).

In order for dsymutil to collect .apinotes files (which capture attributes such as nullability, Swift import names, and availability), I want to propose adding an apinotes: field to DIModule that gets translated into a DW_AT_LLVM_apinotes (path) nested inside DW_TAG_module. This will be primarily used by LLDB to indirectly extract the Swift names of Clang declarations that were deserialized from DWARF.

& this applies not just to dsymutil, right? Presumably it's used by debuggers when debugging using unlinked debug info, or using non-.o+dsymutil style debug info, such as when using Swift on Linux/ELF?

Absolutely. The end goal is for LLDB to import the APINotes, so it can access the attributes. What I'm particularly interested in right now is the Swift name attribute, which defines the name under which an (Objective-)C declaration should be imported into Swift, so LLDB can import Clang types from DWARF into Swift under the correct name.

Is it worth discussing whether these apinotes should be embedded directly into the debug info, rather than referenced by file name? I guess they're /sort/ of like source files (they likely have the same distribution model - so if the source is accessible, so are the apinotes), so referencing by file name makes sense?

The alternative I considered was to invent explicit DWARF for all of the attribute kinds and put them on the decl in CGDebugInfo and deserialize them in LLDB and put them as attributes on the Decl. However, compared to just passing the apinotes file verbatim to Swift's ClangImporter this would be a lot more churn in upstream LLVM, as we'd need to add several new attributes to almost every type and decl kind.

oh, yeah, I wasn't thinking that level of fidelity - just the whole file embedded as a string or something. But yeah, doesn't seem like that'd be especially better.

No, I don't think that would be bette, either.

If APINotes is like a source/header file type thing (some could be user authored, others could be generated from something else, same way header/source might be generated) then it seems fine to reference it on disk.

I guess the only thing that makes me vascillate a bit is that debuggers don't usually process the source in any way - they just render it to the user. APINotes are meant to be semantically consumed by the debugger - that's where the analogy breaks down a bit for me, but not necessarily in a way that invalidates this approach.

What blurs the line here is that LLDB is not a traditional debugger in this sense. It can load clang modules and thus also processes source (or at least header files).

APINotes seem to be much more readily machine readable (JSon, etc) than, say, C++ source (reason debuggers don't just reparse the source to get type information, etc).

Yes, though LLDB *can* parse type information from Clang modules to get at templates in the STL, etc. But it is a secondary source used for the expression evaluator, and not needed to just inspect the state of the program by looking at variables in the traditional debugger sense.

Quoting myself from another review:

Mechanically seems fine (though probably/I think these sort of things should be committed in three parts - IR support, frontend usage, backend usage - but nice to review in one piece like this). I'll defer actual detailed review/sign-off to at least someone else with a vested interest in this features usage just to make sure it's got some (more) in-review... review :)

davide accepted this revision.Mar 11 2020, 10:58 AM

This is all fairly mechanical plumbing

llvm/include/llvm/BinaryFormat/Dwarf.def
419

I'd rather call them LLVM_apinotes, but I don't have a strong opinion about it.

This revision is now accepted and ready to land.Mar 11 2020, 10:58 AM
aprantl marked an inline comment as done.Mar 11 2020, 5:48 PM
aprantl added inline comments.
llvm/include/llvm/BinaryFormat/Dwarf.def
419

I carefully chose the number out of the LLVM space already, so the constant could be renamed once it's upstream. But I'm fine with naming it LLVM from the start.

This revision was automatically updated to reflect the committed changes.