This is an archive of the discontinued LLVM Phabricator instance.

Add a DIModule debug info metadata node to the IR.
ClosedPublic

Authored by aprantl on May 8 2015, 11:37 AM.

Details

Summary

This patch adds a DIModule debug info metadata node to the IR. It is meant to be used to record modules @imported by the current compile unit, so a debugger can import the same modules to replicate this environment before dropping into the expression evaluator.

DIModule is a sibling to DINamespace and behaves quite similarly. In addition to the name of the module it also records the module configuration details that are necessary to uniquely identify the module. This includes the configuration macros (e.g., -DNDEBUG), the include path where the module.map file is to be found, and the isysroot.
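
As a rough illustration of why the configuration details are part of a module's identity, the following standalone C++ sketch models a record with the four fields named above and compares records field by field. The struct `ModuleKey` and its member names are hypothetical and chosen for this example only; this is not LLVM's DIModule class or its uniquing code.

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for the identifying fields of a DIModule.
// Names are illustrative only; this is not the LLVM API.
struct ModuleKey {
  std::string Name;         // name of the module
  std::string ConfigMacros; // configuration macros, e.g. "-DNDEBUG"
  std::string IncludePath;  // include path where module.map is found
  std::string ISysRoot;     // the isysroot

  // Two records denote the same module only if every field matches.
  bool operator==(const ModuleKey &Other) const {
    return std::tie(Name, ConfigMacros, IncludePath, ISysRoot) ==
           std::tie(Other.Name, Other.ConfigMacros, Other.IncludePath,
                    Other.ISysRoot);
  }
  bool operator!=(const ModuleKey &Other) const { return !(*this == Other); }
};
```

Under this model, two records that share a name but were built with different configuration macros compare unequal, which is the distinction the extra fields are meant to capture.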

The idea is that the backend will turn this into a DW_TAG_module. The DW_AT_LLVM_* attributes holding the configuration strings can either be emitted into the DW_TAG_module or as part of the skeleton CU depending on how we decide to move forward on this.

Diff Detail

Repository
rL LLVM

Event Timeline

aprantl updated this revision to Diff 25355. May 8 2015, 11:37 AM
aprantl retitled this revision from to Add a DIModule debug info metadata node to the IR..
aprantl updated this object.
aprantl edited the test plan for this revision.
aprantl set the repository for this revision to rL LLVM.
aprantl added a subscriber: Unknown Object (MLST).

Pinging dblaikie.

Context:
This patch adds an IR node for the dwarf tag DW_TAG_module. A rather lengthy thread on this can be found on cfe-commits (http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150316/125724.html).

The primary use-case for DW_TAG_module is to record modules @imported by the current compile unit, so a clang-based debugger (like LLDB) can import the same modules to replicate the environment of the current CU before dropping into the expression evaluator. This way the debugger can import types directly from the module and has access to types that are not representable in DWARF (uninstantiated templates). The debugger needs an accurate list of the (sub-)modules imported by any CU, since any two (sub-)modules may conflict.

A concern was that using DW_TAG_imported_module for this might confuse debuggers, since C++ namespaces are represented similarly. That said, a DW_TAG_imported_module that refers to a DW_TAG_module is exactly how the DWARF standard recommends representing modules. I also brought this up on the dwarf-discuss mailing list, and the general consensus was to follow the standard.

[Included here for completeness, since it's been discussed at length on cfe-commits, a potential secondary use-case for this is for module debugging debug info:
For non-ODR languages like C and Objective-C the mangled type names produced by the clang indexer are only unique within a module. LLVM computes type signatures by hashing the mangled type name, which doesn't work in this case. One way of disambiguating the resulting type signatures is to put forward declarations of the types and their declcontext inside the DW_TAG_module they are defined in and perform a complete DWARF-style hash of the type+context.
Since there are also other ways of solving this problem, such as introducing a stable clang module hash and hashing it together with the mangled name, I'd like to postpone this discussion and revisit it when it comes up.]
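
A minimal sketch of the second alternative mentioned above (hashing a module hash together with the mangled name), written as standalone C++. Everything here is an assumption made for illustration: the helper names `fnv1a64` and `typeSignature` are hypothetical, the module hash is modeled as an opaque string, and 64-bit FNV-1a is used only as a simple deterministic stand-in, not as the hash LLVM actually uses for type signatures.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a, used here only as a simple deterministic stand-in.
// It is not meant to match the hash LLVM uses for type signatures.
std::uint64_t fnv1a64(const std::string &Data) {
  std::uint64_t Hash = 0xcbf29ce484222325ULL; // FNV-1a offset basis
  for (unsigned char Byte : Data) {
    Hash ^= Byte;
    Hash *= 0x100000001b3ULL; // FNV-1a prime
  }
  return Hash;
}

// Hypothetical helper: mix a per-module hash string into the signature so
// that equal mangled names coming from different modules hash differently.
std::uint64_t typeSignature(const std::string &ModuleHash,
                            const std::string &MangledName) {
  // The '\0' separator keeps the hashed input for ("ab", "c") distinct
  // from the hashed input for ("a", "bc").
  return fnv1a64(ModuleHash + '\0' + MangledName);
}
```

With this sketch, a call such as `typeSignature("modA", "c:@S@Foo")` is stable across runs, while the same type name under a different module hash such as `"modB"` yields a different signature. Both strings are made-up example inputs.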

dblaikie added inline comments. Jun 25 2015, 2:32 PM
include/llvm/IR/DIBuilder.h
585

Would it be better to make the contents of the module more opaque/generic? I'm not sure. I don't know how authoritative these 4 properties {name, config macros, include path(s?), isysroot} are, or whether we're likely to need other things later. I know the schema is more forwards-compatible if new fields are added, but equally it's a lot of churn to change the schema now, so if these are just an opaque bundle of things that only have meaning to the frontend anyway, I'm wondering if they should be handled as such.

These four properties are certainly very specific to clang modules.
Since they need to get translated into custom DW_AT_LLVM_key("value") by the backend, we could implement a generic mechanism like:

// assuming that DW_AT_LLVM_config_macros = 0x2500 (9472), a value in the DW_AT_lo_user..DW_AT_hi_user vendor range
!2 = !DIModule(scope: !0, name: "Module", stringAttrs: !{ i32 9472, !"-DNDEBUG", ...})

... which the backend then interprets as a list of DW_FORM_strp attributes.
But since we don't have named constants in the IR to make this more readable, I'm not convinced that this is preferable at the moment.
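
To make the generic mechanism above more concrete, here is a small standalone C++ sketch of how a consumer might decode such a list. It assumes that the operands of the hypothetical `stringAttrs` list strictly alternate between an i32 attribute code and a string value, which is one reading of the example rather than anything the patch implements. The names `Operand`, `StringAttr`, and `decodeStringAttrs` are invented for this illustration and are not LLVM APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// One operand of the hypothetical stringAttrs list: either an i32
// attribute code or a string value.
using Operand = std::variant<std::uint32_t, std::string>;

// A decoded (attribute code, string value) pair.
using StringAttr = std::pair<std::uint32_t, std::string>;

// Decode an alternating code/value list. Returns std::nullopt if the list
// is malformed (odd length, or an operand of the wrong kind).
std::optional<std::vector<StringAttr>>
decodeStringAttrs(const std::vector<Operand> &Ops) {
  if (Ops.size() % 2 != 0)
    return std::nullopt;
  std::vector<StringAttr> Result;
  for (std::size_t I = 0; I < Ops.size(); I += 2) {
    const auto *Code = std::get_if<std::uint32_t>(&Ops[I]);
    const auto *Value = std::get_if<std::string>(&Ops[I + 1]);
    if (!Code || !Value)
      return std::nullopt;
    Result.emplace_back(*Code, *Value);
  }
  return Result;
}
```

Decoding the list from the example would yield the single pair (9472, "-DNDEBUG"). The bare number in that pair also shows the readability problem noted above: without named constants, nothing in the list says which attribute 9472 stands for.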

aprantl accepted this revision. Jun 26 2015, 4:44 PM
aprantl added a reviewer: aprantl.
This revision is now accepted and ready to land. Jun 26 2015, 4:44 PM
aprantl closed this revision. Aug 18 2015, 10:40 AM

This was r241017.