Page MenuHomePhabricator

[DebugInfo] Add DWARF emission for DBG_VALUE_LIST
ClosedPublic

Authored by StephenTozer on Jul 9 2020, 11:09 AM.

Details

Summary

Continuing the work discussed in the RFC[0], this patch implements the actual emission of DWARF from a DBG_VALUE_LIST instruction.

The logic for handling the new instruction is simple in most places; DbgEntityHistoryCalculator has a more complex set of changes since it's more involved with register tracking, and the code for producing DW_AT_location in both DwarfDebug and DwarfExpression also required some heftier work.

Previously, the code in emitDebugLocEntry functioned along the lines of:

  1. Emit any fragment info
  2. Emit any entry value info
  3. Emit the location specified in the DBG_VALUE, e.g. DW_OP_reg X or DW_OP_constu X
  4. Finally call DwarfExpression::addExpression(), which handles the DIExpression (except fragments)

Since there may now be multiple locations scattered throughout the expression, rather than a single location at the front, addExpression has been modified to optionally take a lambda that is used to handle DW_OP_LLVM_arg N; the lambda is passed in from emitDebugLocEntry, and performs step 3 using the Nth debug operand. Non-list debug values follow the same behaviour as before. DwarfCompileUnit::constructVariableDIEImpl is similar, but simpler.

The alternative to using the lambda would be to move some of the code in DwarfDebug::emitDebugLocEntry directly into DwarfExpr, and passing a list of locations to addExpression. The hard part with this is that DwarfDebug and DwarfCompileUnit perform step 3 differently, although it's possible their behaviour can be merged. The purpose of choosing the lambda was to minimize the amount of actual change made, but if the alternative option seems like an objectively good refactor then I'm happy to adjust.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html

Diff Detail

Event Timeline

StephenTozer created this revision.Jul 9 2020, 11:09 AM

Hey, just noticed a couple of comments to remove from the tests.

llvm/test/DebugInfo/X86/dbg_value_list_clobbers.mir
42

You can remove this XXX note.

68

I assume this no longer fails?

llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
64

Can remove this XXX note. As you mentioned offline, having multiple references to the same arg (i.e. multiple DW_OP_LLVM_arg, 0 in the expr) is never a problem.

Though, slightly tangentially, I'm still a little unclear on what the final decision was on how to handle duplicate register arg operands. In D82363 you said 'always treat DBG_VALUE_LISTs as potentially having them'. Please could you explain a little further? (i.e. is it an error state, do we need to add extra checks when dealing with DBG_VALUE_LISTs etc).

StephenTozer marked an inline comment as done.Jul 10 2020, 4:28 AM
StephenTozer added inline comments.
llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
64

It is not an error state, just a slightly more inconvenient form than one without duplicates. It requires some extra work in a few places (operating on a vector instead of a single pointer), but there is no reason for it to be invalid.

Remove old comments from tests.

Could we reduce complexity by entirely replacing DBG_VALUE with DBG_VALUE_LIST, *, DW_OP_arg 0, *?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
15

Out of curiosity: Is there an operand kind that we could switch() over?

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
25

By using a callback here the callee cannot use the advanced functionality of addMachineRegExpression for any but a leading DW_OP_LLVM_arg. Do you see a way of either generalizing addMachineRegExpression or otherwise reorganizing this so the addMachineRegExpression functionality becomes available to DBG_VALUE_LIST?

StephenTozer marked 2 inline comments as done.Jul 17 2020, 1:43 PM

Could we reduce complexity by entirely replacing DBG_VALUE with DBG_VALUE_LIST, *, DW_OP_arg 0, *?

Yes, and in the long run I think that should be the goal; it would also be nice to remove the "indirect" operand and all the code paths that use it. So far it has been easier to create this as a separate instruction, but if the debug info cabal as a whole is positive on the replacement then there's no harm in bringing the replacement into these patches (or more likely, adding an extra patch to do so on top of the ones that are already being reviewed).

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
15

It would be useful to have one if there isn't; I didn't want to fold a change like that into this work, but if it exists I can use it (and if not it'd be nice to add in another patch).

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
25

It actually can use that functionality - in this case, all of the functionality that would normally be applied to the location in the DBG_VALUE is applied by this callback. The callback in this case can advance the ExprCursor, so there are no issues with using addMachineRegExpression normally at any point in a DBG_VALUE_LIST.

Add test to confirm that addMachineRegExpression is being correctly used for DBG_VALUE_LIST regs; replace if-else chain with switch in AsmPrinter::emitDebugValueComment.

Rebased onto recent master; no functional changes, minor adjustments in DwarfDebug::emitDebugLocValue.

scott.linder added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
20–41

If you are changing this anyway, I'd vote to just split up these two cases, they don't actually seem related and the existing code reads worse to me.

llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
11–1

Would it make sense to still try applying the heuristic if size(Instruction.debug_values()) == 1 && Instruction.getDebugOperand(0).isReg() && DIExpr->expr_op_begin()->getOp == dwarf::DW_OP_LLVM_arg?

I think this only really makes a difference when we eliminate the old version, but I assume we will still want this thing to work for the cases it can?

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
83

I think you can wrap this constructor body in #ifndef NDEBUG

97

Same here, the whole constructor should be #ifndef NDEBUG

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
67

Was changing this condition intentional? If so can it be in a separate patch?

StephenTozer added inline comments.Jan 4 2021, 6:01 AM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
67

Unintentional, the change here got missed during a rebase (but is fixed in my local copy, which I'll push up shortly).

StephenTozer added inline comments.Jan 7 2021, 9:07 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
20–41

The similarity here is that even when Op is a register, it can still have an offset - it's just calculated above, outside of this loop, by the debug offset operand of MI. I do think this looks quite messy; it'd probably be better to move the offset calculation down into this loop, so that it can be understood for both of them as a local variable.

scott.linder added inline comments.Jan 13 2021, 9:51 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
20–41

+1

I think I completely missed the connection to the offset calculated above, thank you for clarifying.

Rebased onto recent master, fixed some test failures.

scott.linder added inline comments.Jan 13 2021, 12:31 PM
llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
52–56

Why does this behave differently than (what I understand to be) the equivalent DBG_VALUE ?

DBG_VALUE $eax, $noreg, !12, !DIExpression(DW_OP_stack_value), debug-location !15
; becomes: DW_AT_location (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)

It seems like the DW_OP_and is there to select a subregister (I assume EAX), but oddly it comes after the value of the register is already read (i.e. after the DW_OP_breg). I'm lost on what the intended behavior is, and why it differs between DBG_VALUE and DBG_VALUE_LIST.

There is also the existing confusion around the "isIndirect" flag in DBG_VALUE which makes these two equivalent (and both seemingly wrong):

DBG_VALUE $eax, $noreg, !25, !DIExpression(DW_OP_stack_value), debug-location !15
; becomes: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)
  
DBG_VALUE $eax, 0, !26, !DIExpression(DW_OP_stack_value), debug-location !15
; becomes: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)

which makes it harder still to compare.

Would it be more straightforward to always be explicit about indirection in the new form? Why does DW_OP_stack_value imply a DW_OP_deref at all? I.e. why do we not get:

DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_stack_value), $eax, debug-location !15
    ; CHECK:      DW_TAG_variable
    ;  CHECK-NEXT:   (DW_OP_reg RAX, DW_OP_stack_value)
    ;  CHECK-NEXT:   DW_AT_name ("locala")

which in this case I imagine would just be an error. I would expect the correct expression to generate the DW_OP_breg would be something like:

DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref, DW_OP_stack_value), $eax, debug-location !15
    ; CHECK:      DW_TAG_variable
    ;  CHECK-NEXT:   (DW_OP_breg0 RAX+0, DW_OP_stack_value)
    ;  CHECK-NEXT:   DW_AT_name ("locala")

If we don't do this, we seem to retain some of the same ambiguity that makes the old "isIndirect" field so confusing.

StephenTozer added inline comments.Jan 15 2021, 9:30 AM
llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
52–56

To the first point: I'm looking into it now; I noticed the DW_OP_and before, but I'm not sure where it's coming from myself yet - the DBG_VALUE_LIST handling should be following essentially the same code path as DBG_VALUE, so this is a bug one way or another.

To the second point, I think your examples are slightly incorrect:

The isIndirect flag in DBG_VALUE is confusing and inconsistent, what it actually does is dependent on the DIExpression and not well explained. The DBG_VALUE_LIST implementation has no such inconsistencies however (I hope). The problem with your examples is that I think you're using DW_OP_reg to mean a register's literal value, and DW_OP_breg to mean the address pointed to by a register. This isn't quite correct, although they do act like that for most variable locations.

The actual meanings are slightly more complicated; the short answer is that DW_OP_reg is a Register location description: it refers to the register itself, not to the value of that register. DW_OP_breg on the other hand does refer to the literal value of a register; it's generally used with an offset as part of a Memory location expression, but if combined with DW_OP_stack_value then it gives the value in the register as the variable's value (albeit as an Implicit location rather than a Register location).

So with all of that said, the meaning of (DW_OP_breg0 RAX+0, DW_OP_stack_value) is that the variable's value can be found in $rax, but the variable should be read-only, which matches the meaning of the DBG_VALUE_LIST.

scott.linder added inline comments.Jan 17 2021, 12:18 PM
llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
52–56

The isIndirect flag in DBG_VALUE is confusing and inconsistent, what it actually does is dependent on the DIExpression and not well explained.

Agreed, and my point is that a similar issue applies to the interpretation of DW_OP_LLVM_arg in your patch, even with isIndirect gone (modulo the assert mentioned below, which only side-steps the issue).

The problem with your examples is that I think you're using DW_OP_reg to mean a register's literal value, and DW_OP_breg to mean the address pointed to by a register.

That wasn't my intention in the examples I gave; in fact in the unambiguous model we tried to extract from the DWARF spec (https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html) we define the DW_OP_reg as pushing a register location description onto the stack, and DW_OP_breg as pushing a memory location description onto the stack. The interaction which makes the location description pushed onto the stack by DW_OP_breg behave like a value in some contexts (i.e. behave like the offset contents of the register) we begrudgingly capture with an implicit conversion to make our description backwards-compatible with DWARF 4 and 5, but *even if* you want to just define it as pushing a value onto the stack, at the very least the breg must represent reading the value of a reg register location. My question is then: why does adding DW_OP_stack_value implicitly cause the *value* of the register to end up on the stack, with no intervening operation? I.e. why is this the case:

; Sure, seems reasonable: `DW_OP_LLVM_arg` when referring to a register describes the register itself, not the value of the register.
DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0), $eax, debug-location !11
; CHECK: DW_AT_location    (DW_OP_reg0 RAX)

; This follows too: if you do want the value of the register, you can read it explicitly with e.g. `DW_OP_deref`. DWARF actually *requires* this be collapsed into something like `DW_OP_breg` or `DW_OP_regval_type`,
; as `DW_OP_reg RAX, DW_OP_deref` is not a valid location description of any kind. This is an artificial constraint of the standard, and in any consistent view of the spec the two forms would have to otherwise be equivalent.
DBG_VALUE_LIST !13, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref), $eax, debug-location !11
; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0)

; Hmm, this doesn't seem right though: why are there now two indirections? Does `DW_OP_stack_value` imply one for some reason?
DBG_VALUE_LIST !14, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref, DW_OP_stack_value), $eax, debug-location !11
; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_deref, DW_OP_stack_value)

; If you actually want the singly-indirect output you *have* to omit the semantically consistent `DW_OP_deref`. I would have expected this to just be an invalid DIExpression:
DBG_VALUE_LIST !15, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_stack_value), $eax, debug-location !11
; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_stack_value)

Note that I removed the asserts in your patch which seem to artificially require DW_OP_stack_value for DBG_VALUE_LIST. I didn't understand the purpose of them before, but perhaps this issue is one reason for them to be present?

My fundamental argument is that this context-dependent interpretation of DW_OP_LLVM_arg is another source of confusion, just like isIndirect. I think this stems from the fact that DWARF as it is defined today is not general/composable enough to avoid this, but I don't think that should bleed into the internal representation used by LLVM: we can make a sensible choice up until we get into the DWARF backend, where certain expressions will have to be converted into a different form to be legal. Instead, what we have now is a situation where adding operations to the expression changes fundamentally how you are supposed to interpret the DW_OP_LLVM_arg. This is a pre-existing shortcoming, just like isIndirect, so saddling you with the burden of correcting it doesn't seem reasonable, but I think it is important to discuss. This is also important in the context of replacing DBG_VALUE entirely, as the assert will obviously need to go away.

StephenTozer added inline comments.Jan 18 2021, 2:42 AM
llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
52–56

You are correct - currently, DBG_VALUE_LIST is consistent when we only represent expressions with a DW_OP_stack_value; there will need to be an additional flag to correctly represent the full set of DWARF expressions. Of the examples you've given however, I would say that it is the first two that are incorrect, and the latter two that are correct. The expression !DIExpression(DW_OP_LLVM_arg, 0) should, in the absence of a flag declaring it to be a direct/register location, mean that the variable is at the address given by the first argument, so the correct DWARF translation would be DW_AT_location (DW_OP_breg0 RAX+0).

This topic was discussed on the mailing list a while back, starting around here, concluding that we need an extra flag (with a different semantic meaning to the current IsIndirect that avoids the inconsistencies) to accurately represent all DWARF expressions.

The reason why this hasn't been added as part of this patch is that this patch isn't replacing DBG_VALUE with DBG_VALUE_LIST yet; the only place where DBG_VALUE_LIST is used is in salvaging dbg.values, where it will necessarily use DW_OP_stack_value.

Fixed issue in which DBG_VALUE_LIST expressions were being emitted without subregister masking.

jmorse added a subscriber: jmorse.Feb 4 2021, 6:44 AM

Just chiming in with Scott here:

My fundamental argument is that this context-dependent interpretation of DW_OP_LLVM_arg is another source of confusion, just like isIndirect. I think this stems from the fact that DWARF as it is defined today is not general/composable enough to avoid this, but I don't think that should bleed into the internal representation used by LLVM: we can make a sensible choice up until we get into the DWARF backend, where certain expressions will have to be converted into a different form to be legal. Instead, what we have now is a situation where adding operations to the expression changes fundamentally how you are supposed to interpret the DW_OP_LLVM_arg. This is a pre-existing shortcoming, just like isIndirect, so saddling you with the burden of correcting it doesn't seem reasonable, but I think it is important to discuss. This is also important in the context of replacing DBG_VALUE entirely, as the assert will obviously need to go away.

I agree -- there's some existing misery here [0] where a pass has to be aware that modifying an expression might change the context it's interpreted in, which is an un-necessary complexity.

[0] https://github.com/llvm/llvm-project/blob/d06e94031bcdfa43512bf7b0cdfd4b4bad3ca4e1/llvm/lib/CodeGen/PrologEpilogInserter.cpp#L1238

I agree -- there's some existing misery here [0] where a pass has to be aware that modifying an expression might change the context it's interpreted in, which is an un-necessary complexity.

This is true, and it is something that I intend to change in the next patch after this stack, which will be the replacement of the existing DBG_VALUE with the DBG_VALUE_LIST. The discussion about this topic on the mailing list was long, but it covers this issue reasonably well I think. Most of the existing ambiguity comes from the fact that DWARF has two ways to reference a register: DW_OP_reg, which always means "the variable lives in this register", and DW_OP_breg which pushes the value contained in the register onto the expression stack (usually to be interpreted as the address of the live variable), and our method for distinguishing the two is the Directness/Offset flag, which is context-dependent: it reads both the DIExpression and the location operand to see if it should really be indirect (and at different points in the program it does this check differently!). The replacement for that flag, that will be added in the aforementioned replacement patch, will not be context-dependent; if it is set to indirect, it will always mean that the result of the DIExpression is a memory address that the variable lives at, regardless of any location operand or the contents of the DIExpression.

Just to jump back to the example that Scott gave as well, the reason why DW_OP_LLVM_arg is currently context-dependent is because it follows the existing machine register pipeline. This is done for simplicity's sake, but it is worth noting that it is not currently intended to exist in all contexts. A DW_OP_LLVM_arg operator will only currently be produced as part of a DIExpression that ends with DW_OP_LLVM_stack_value, which fixes it to always be emitted as DW_OP_breg regardless of any other context. The replacement patch will simultaneously enable LLVM_arg operators to exist in non-stack_value expressions, and remove the existing inconsistencies from these cases.

This tentatively LsGTM, modulo some comments. Most of the discussion has been about the context-sensitivity of how the expression stack has evolved, which is awkward but pre-existing. Just to check with @scott.linder, you're alright with the patch going in, with improving expression handling a problem for another day?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
31–32

This over-writes Offset from getFrameIndexReference if the DBG_VALUE is indirect -- which I don't believe is a behaviour in the old code.

AFAIUI the "debug offset" should nowadays only be used as a flag indicating indirectness, not the actual numerical offset. There's at least one assertion out the that the offset is zero. (Obviously none of this is ideal).

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
61–1

It'd be good to elaborate on this -- a single machine location within a variable location description, one of possibly many?

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
84–85

To check my understanding: for non-variadic variables emission of integer operands might be signed or unsigned, depending on the type of the variable, we can see that earlier in this function and down in DwarfDebug::emitDebugLocValue. For a variadic expression, however, it's expected that the value gets "compiled in" unsigned, hence we add as unsigned here, yes?

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
27–30

What about if the next opcode is stack_value, do we need to mask in that scenario?

llvm/test/DebugInfo/X86/dbg_value_list_clobbers.mir
2–3

Does not call FileCheck

llvm/test/DebugInfo/X86/dbg_value_list_emission.mir
6

Mega-nit: "good" suggests there's a subjective difference, could I suggest "correct"

95

Could I request a test that a DBG_VALUE_LIST with $noreg somewhere in it does not lead to a location-list entry -- I've been bitten by $noregs not terminating things in the past.

StephenTozer added inline comments.Feb 16 2021, 7:19 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
31–32

This over-writes Offset from getFrameIndexReference if the DBG_VALUE is indirect -- which I don't believe is a behaviour in the old code.

That's because the old code is somewhat misleading; Offset is declared at a single point above as positive iff MI is an indirect debug value; otherwise it is 0, and gets incremented if the location operand is not a register. These two conditions are mutually exclusive, so ultimately it's just being optionally assigned one of two values; I've just made that more explicit here, but the behaviour should be unchanged.

AFAIUI the "debug offset" should nowadays only be used as a flag indicating indirectness, not the actual numerical offset. There's at least one assertion out the that the offset is zero. (Obviously none of this is ideal).

That assertion does exist in some places, but I'm not sure if it's everywhere. I could be wrong, but I believe the assertions that the offset is zero are all at points prior to the stack layout being finalized; at the time we emit DWARF, I think we may still have non-zero offsets for some DBG_VALUEs.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
84–85

This is really just a "dumb" carry-over of the logic above; when there is a non-trivial expression for a non-variadic debug value with an integer location in this function, it looks as though we just add the unsigned bytes and call it a day. I think the intent is that the signedness is ultimately interpreted according to the variable type? But in all existing cases I think the approach is flawed; unless you have an empty DIExpression, there's no guarantee that the signedness of the variable matches the signedness of the location operand.

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
27–30

I think so? To be honest, it looks to me like the only time we can use a subregister and don't apply a mask is when describing either a Register location (where we cannot use subreg masking, we use DW_OP_piece instead), or when describing a simple memory location, i.e. a single DW_OP_breg with 0 offset.

Although the latter case seems like a valid argument for "we don't always need to mask the subregister", I suspect that it's actually the case that we just never use a subregister for those locations. The full size of a register for a given architecture will also generally be the size of a memory address, so it may just be assumed that we don't need to check for subreg masking in that case.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 10 2021, 5:47 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.