This is an archive of the discontinued LLVM Phabricator instance.

Implement DW_{OP,AT}_LLVM_* for Heterogeneous Debugging
AbandonedPublic

Authored by scott.linder on Mar 26 2020, 12:17 PM.

Details

Summary

The extensions are gaurded by the flag SupportsHeterogeneousDebuggingExtensions
which is disabled by default, except for AMDGPU.

Though they are several extensions, the encodings are chosen such that they don't
occupy the entire available space and doesn't have collisions. If at all we land
into encoding collisions due to future extensions, the ambiguity can be resolved
by using the proposed new augmentation attribute on the compilation unit and in
unwind information CIEs for any target which wishes to use the operations
defined by the extensions.

Diff Detail

Event Timeline

scott.linder created this revision.Mar 26 2020, 12:17 PM

I've added a few reviewers who are probably better placed to review things from the DWARF/DW_OP perspective than me.

Please add the debug-info project in all the patches.

I don't see a verifier for these operations, please add that.

llvm/include/llvm/BinaryFormat/Dwarf.def
686

Should the Vendor be AMD (or AMDGPU) instead of LLVM? Or this is only LLVM related.

What about other tools, such as GDB, do they have the support for this?

dstenb added a subscriber: dstenb.Mar 27 2020, 6:46 AM
dstenb added inline comments.
llvm/test/tools/llvm-dwarfdump/X86/amdgpu_proposal.s
26–27 ↗(On Diff #252938)

In which ways does this differ from DW_CFA_undefined reg0?

aprantl added inline comments.Mar 27 2020, 8:38 AM
llvm/include/llvm/BinaryFormat/Dwarf.def
686

According to libdwarf these numbers are already taken by HP extensions. I don't know how much this practically matters, but it would be good if we imported all the various vendor extension ranges from libdwarf and binutils...

686

Should the Vendor be AMD (or AMDGPU) instead of LLVM? Or this is only LLVM related.

If the extensions are only useful for AMD hardware then that would be appropriate. If they are generally useful the LLVM vendor might make more sense.

scott.linder marked 2 inline comments as done.Mar 27 2020, 9:24 AM

Thank you for the review, and sorry for neglecting to include any context. The proposal which defines these new operations is at https://llvm.org/docs/AMDGPUUsage.html#dwarf. @t-tye is working on formatting the proposal to be sent to upstream DWARF as an RFC, and I am working to land these patches as an initial implementation.

llvm/include/llvm/BinaryFormat/Dwarf.def
686

These extensions are generally useful, and we are working on implementing them in GDB (currently in an open-source fork rather than directly upstream, although our intention is to upstream as feasible) in parallel with implementing them in LLVM so we have an initial implementation to prove the concept.

The DW_OP_ user encodings have become pretty cramped, as operations are added but never removed, even when they are eventually upstreamed into a version of DWARF. I know that e.g. AArch64 has elected to reserve one encoding to represent a "family" of instructions, where the first operand is the true "AArch64 opcode", but that comes with a lot of infrastructure work and either means we have to break the invariant that operations have a fixed number of opcodes or we have to waste a lot of space in the final encoding with un-used operands.

Our current intention is to instead implement the DW_AT_LLVM_augmentation described at https://llvm.org/docs/AMDGPUUsage.html#debugging-information-entry-attributes as a temporary way to select which of the overloaded versions of these user opcodes should be used for a given DIE.

llvm/test/tools/llvm-dwarfdump/X86/amdgpu_proposal.s
26–27 ↗(On Diff #252938)

It is not a CFA-only rule, it is just a new dwarf expression operation. DW_OP_LLVM_undefined pushes an "undefined location description" on the stack, which is a new concept that was needed in order to generalize the DWARF stack to support operating on location descriptions. It is a way to explicitly capture the existing rule where two adjacent DW_OP_piece-like operations result in some undefined bits in the final composite location description. @t-tye can probably do a better job describing how it relates to the undefined CFA rule in the new proposal; I think in this case the two forms are equivalent, as they will both result in the unwound register location description being an undefined location description.

I am only using .cfi_escape here as a hacky way to generate some DWARF expressions in a way that dwarfdump will parse, i.e. none of these expressions have to be reasonable things one would want to emit.

Please add the debug-info project in all the patches.

I don't see a verifier for these operations, please add that.

I'm not sure what verifier you're referring to; is this a part of CodeGen or DebugInfo?

Please add the debug-info project in all the patches.

I don't see a verifier for these operations, please add that.

I'm not sure what verifier you're referring to; is this a part of CodeGen or DebugInfo?

The Debug Info.
Depending on the intention of using this we should add verifiers for the levels of representation. E.g. if we use it on LLVM IR, we should add a verifier in lib/IR/Verifier. The verifier for operation from the final DWARF context is stored within lib/DebugInfo/DWARF/DWARFVerifier.

Anyhow, I think that having an RFC would make this easier for the review. :)

Please add the debug-info project in all the patches.

I don't see a verifier for these operations, please add that.

I'm not sure what verifier you're referring to; is this a part of CodeGen or DebugInfo?

The Debug Info.
Depending on the intention of using this we should add verifiers for the levels of representation. E.g. if we use it on LLVM IR, we should add a verifier in lib/IR/Verifier. The verifier for operation from the final DWARF context is stored within lib/DebugInfo/DWARF/DWARFVerifier.

Anyhow, I think that having an RFC would make this easier for the review. :)

The diff at https://reviews.llvm.org/D70523 served as the initial RFC, and @t-tye is still working diligently to update it and reformat it in preparation for presenting it to DWARF proper. In the mean time any review of that that section of AMDGPUUsage.rst is welcome. We hope to be able to implement as much as feasible in a compiler (LLVM) and debugger (GDB) in parallel with working to propose it to DWARF as a way to prove an implementation is feasible.

I will look into extending DWARFVerifier further, but this patch does not include any more fundamental changes that would be needed to implement the entire proposal, it implements just enough of it to emit unwind information needed for AMDGPU. I was hoping to review and land this series of patches before starting the remaining work which will involve changes in IR/MIR/DebugInfo.

The diff at https://reviews.llvm.org/D70523 served as the initial RFC

That's not what we mean by an RFC; what we mean is a narrative writeup that is then posted to the appropriate -dev list (llvm-dev in this case).
Reviews do not get the visibility you want for a significant change or enhancement to how we use things like DWARF. I can't say I noticed that previous review.

Given you've already committed the writeup in the previous review, the post to llvm-dev could be short with a pointer to that doc. But that way at least more of the people who care about DWARF will be aware of it.

The diff at https://reviews.llvm.org/D70523 served as the initial RFC

That's not what we mean by an RFC; what we mean is a narrative writeup that is then posted to the appropriate -dev list (llvm-dev in this case).
Reviews do not get the visibility you want for a significant change or enhancement to how we use things like DWARF. I can't say I noticed that previous review.

Given you've already committed the writeup in the previous review, the post to llvm-dev could be short with a pointer to that doc. But that way at least more of the people who care about DWARF will be aware of it.

That makes sense, I'll post a true RFC to the lists soon and we can go from there. I talked with Tony and he has nearly finished updating the proposal to be in the style of a typical DWARF issue, so I am going to hold off on posting to the list until that is done as it will change the structure of the document quite a bit.

In the mean time, I would like to ask if any of the reviewers have thoughts on how to reconcile the need for more new DW_OP_ extensions than there is encoding space for. Currently we are proposing adding a DW_AT_augmentation which could be applied to DW_TAG_compile_unit which will essentially allow for us to say "this compile unit uses extension X", and as part of our definition of the extension we could indicate that we are using certain encodings that may already be in use elsewhere. Essentially this would be a way for extension vendors to multiplex over the same limited encoding space. I imagine we would carry this augmentation string in the DICompileUnit metadata.

In the mean time, I would like to ask if any of the reviewers have thoughts on how to reconcile the need for more new DW_OP_ extensions than there is encoding space for. Currently we are proposing adding a DW_AT_augmentation which could be applied to DW_TAG_compile_unit which will essentially allow for us to say "this compile unit uses extension X", and as part of our definition of the extension we could indicate that we are using certain encodings that may already be in use elsewhere. Essentially this would be a way for extension vendors to multiplex over the same limited encoding space. I imagine we would carry this augmentation string in the DICompileUnit metadata.

This is very similar in concept to what ELF does with SHT_NOTE sections - individual elements in this section use a vendor field, which is a string, and then also have types, but the types are specific to each vendor, thus e.g. GNU type 1 and LLVM type 1 could mean two completely different things. I believe consumers are intended to ignore unknown vendors.

Whatever approach is taken will presumably need adopting in the next DWARF standard. An obvious straightforward alternative would be to increase the range available for DW_OP_ vendor extensions. It might be worth at the same time considering changing to using ULEB128s for DW_OP encodings.

In the mean time, I would like to ask if any of the reviewers have thoughts on how to reconcile the need for more new DW_OP_ extensions than there is encoding space for. Currently we are proposing adding a DW_AT_augmentation which could be applied to DW_TAG_compile_unit which will essentially allow for us to say "this compile unit uses extension X", and as part of our definition of the extension we could indicate that we are using certain encodings that may already be in use elsewhere. Essentially this would be a way for extension vendors to multiplex over the same limited encoding space. I imagine we would carry this augmentation string in the DICompileUnit metadata.

This is very similar in concept to what ELF does with SHT_NOTE sections - individual elements in this section use a vendor field, which is a string, and then also have types, but the types are specific to each vendor, thus e.g. GNU type 1 and LLVM type 1 could mean two completely different things. I believe consumers are intended to ignore unknown vendors.

Yes, I think our proposal is in the same vein, although it is complicated slightly by a desire to support multiple extensions being present at the same time. For example, AMDGPU wants to say that we need these new operations while still supporting GNU and LLVM extensions like GNU_push_tls_address when we know the consumer prefers them.

Whatever approach is taken will presumably need adopting in the next DWARF standard. An obvious straightforward alternative would be to increase the range available for DW_OP_ vendor extensions. It might be worth at the same time considering changing to using ULEB128s for DW_OP encodings.

Right, I think with so few vendor encodings left there has to be some mechanism for reusing them or opening up the encoding space. Just increasing the vendor extension range will eat into the already dwindling standard encoding space. We also tried really hard to make the entire proposal backwards compatible, and switching to ULEB128 seems like a hard break in compatibility as the 7th bit is already active in DW_OP_'s even in the non-user range. Maybe reserving a single DW_OP_ to say "what follows is a ULEB128 extended-opcode" would be feasible, although it would still have a pretty far reaching impact.

Right, I think with so few vendor encodings left there has to be some mechanism for reusing them or opening up the encoding space. Just increasing the vendor extension range will eat into the already dwindling standard encoding space. We also tried really hard to make the entire proposal backwards compatible, and switching to ULEB128 seems like a hard break in compatibility as the 7th bit is already active in DW_OP_'s even in the non-user range. Maybe reserving a single DW_OP_ to say "what follows is a ULEB128 extended-opcode" would be feasible, although it would still have a pretty far reaching impact.

That's what DWARF does for line-number program opcodes; opcode 0 indicates an "extended" opcode, with a ULEB guarding the whole thing so it can be skipped by consumers who don't recognize it. If the expression opcode space is feeling the squeeze (which I can easily imagine, as there are only 32 opcodes in that space) it would seem like a reasonable extension to do the same thing in the DWARF expression space.

For your implementation, you would want to use one of the existing user-range opcodes, but you could define that as introducing the extended-opcode encoding. Please model it on how line-number extended opcodes work.

Right, I think with so few vendor encodings left there has to be some mechanism for reusing them or opening up the encoding space. Just increasing the vendor extension range will eat into the already dwindling standard encoding space. We also tried really hard to make the entire proposal backwards compatible, and switching to ULEB128 seems like a hard break in compatibility as the 7th bit is already active in DW_OP_'s even in the non-user range. Maybe reserving a single DW_OP_ to say "what follows is a ULEB128 extended-opcode" would be feasible, although it would still have a pretty far reaching impact.

That's what DWARF does for line-number program opcodes; opcode 0 indicates an "extended" opcode, with a ULEB guarding the whole thing so it can be skipped by consumers who don't recognize it. If the expression opcode space is feeling the squeeze (which I can easily imagine, as there are only 32 opcodes in that space) it would seem like a reasonable extension to do the same thing in the DWARF expression space.

For your implementation, you would want to use one of the existing user-range opcodes, but you could define that as introducing the extended-opcode encoding. Please model it on how line-number extended opcodes work.

I'm still hesitant to include this in the current proposal because it seems out of scope. The augmentation approach satisfies the need to be able to overlap with existing extensions, and is modeled on the notion of augmentation in the CFI. It does this without structural changes that will ripple through every tool interacting with DWARF expressions, which is an important property for us because we are striving to upstream these new ops into a future DWARF standard and delete these temporary ops anyway. I agree that a separate proposal to resolve the issue in a more general way for both standard and user ranges would be a good idea, but it isn't what we are proposing here.

In the context of LLVM as a generator and consumer of these user DW_OPs nothing is actually required as the overlapping extensions aren't even supported, and in the context of consumers like GDB the augmentation will be sufficient to differentiate between supported extensions. Even there, I don't think they currently implement the overlapping extensions beyond being able to print them. Would that be sufficient to get this landed in LLVM, with the understanding that these are not intended to be extensions permanently, and that solving the encoding exhaustion issue could still be proposed and implemented another way?

Add appropriate augmentation strings to CU and CIE

scott.linder retitled this revision from Implement new DW_OP_LLVM_* operations to Implement DW_{OP,AT}_LLVM_* for Heterogeneous Debugging.
scott.linder edited the summary of this revision. (Show Details)
scott.linder removed a project: debug-info.

Update title and summary.

scott.linder added a reviewer: espindola.

Rebase and move tests to .debug_info as some new operations are not legal in
.debug_frame.

Rebase onto LLVM master

Rebase onto LLVM master

scott.linder added inline comments.Oct 27 2020, 10:33 AM
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp
47

I think I missed actually adding a test for this, i.e. that we do emit the augmentation string for AMDGPU.

Rebase and ping.

clayborg added inline comments.
llvm/lib/MC/MCDwarf.cpp
1667

Should we use a single character here? If we don't, unwinders might end up parsing each character individually and that might make unwinders fail as they parse each character. Or does the unwind spec state that everything between square brackets is a single entry? This seems dangerous unless the spec has some rule like this.

Rebase on to the ToT.

Taking a relook at the vendor extension range, there are enough non-conflicting encodings available for the DWARF operators being proposed here and also leaves few more encodings available for other future extensions LLVM might see.

Does this look good now to go in?

Rebase and ping.

RamNalamothu edited the summary of this revision. (Show Details)

Rebase onto ToT.

Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2022, 2:04 AM
RamNalamothu edited the summary of this revision. (Show Details)Mar 22 2022, 2:05 AM

Could someone review/comment/approve this revision?

This revision is under review for the last two years. I would appreciate any feedback which helps to make progress on this.

djtodoro added inline comments.Mar 22 2022, 2:29 AM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1019

no need for the curly brackets here

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
713

I'd say that the link was mentioned too many times.

llvm/lib/MC/MCDwarf.cpp
1667

@RamNalamothu have you taken a look into this?

Address feedback.

@djtodoro thanks for the quick response

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1019

Done.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
713

I got rid of the redundant ones. And the rest, unless they are big annoyance, I feel they provide ready access and inform what is being supported.

llvm/lib/MC/MCDwarf.cpp
1667

The thinking behind the multi-character versioning is:

  • we should be able to distinguish multiple versions from multiple vendors because
    1. some encoding spaces, e.g. DW_OP_*, are too short and the available encodings are quickly occupied leading to collisions
    2. many historic encodings are not being removed from the code base even when they are not being generated from the compiler
  • it's better to follow a standard versioning such as https://semver.org

I don't think all of the above could be addressed using a single character versioning in a meaningful clean manner, if at all it can be done. And the spec can enforce what is needed from all the interfacing tools.

arsenm added a reviewer: debug-info.
djtodoro accepted this revision.Jun 5 2022, 11:40 PM

LGTM from my side.

This revision is now accepted and ready to land.Jun 5 2022, 11:40 PM

Thank you very much @djtodoro.

Since many people have commented on this revision in the past, I would wait for few days before landing the changes.

@jmorse @aprantl wouldn't mind if one of you could sign off on this, as folks with more context for the location expression side of DWARF than I have these days.

aprantl accepted this revision.Jun 7 2022, 2:13 PM
scott.linder abandoned this revision.Mar 30 2023, 3:39 PM

Abandoned in favor of D147279 which avoids using most/all of the remaining user opcode