This is an archive of the discontinued LLVM Phabricator instance.

[LiveDebugValues] Omit entry values for DBG_VALUEs with pre-existing expressions
AbandonedPublic

Authored by dstenb on Aug 26 2019, 8:14 AM.

Details

Summary

Entry values are currently only supported for register DBG_VALUEs
with empty debug expressions, as the DW_OP_entry_value operation can
only wrap one byte. Creating an entry value for a DBG_VALUE with a
non-empty debug expression would result in an invalid expression.
This occurred for the parameter in the attached test case,
triggering an assert later on.

Diff Detail

Event Timeline

dstenb created this revision.Aug 26 2019, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2019, 8:14 AM
vsk added a comment.Aug 26 2019, 1:14 PM

Thanks for the patch. Could you elaborate on what you mean by OP_entry_value only being able to wrap a single byte? Skimming the implementation, the impression I get is that a full machine-reg-expression can be attached to an OP_entry_value (see: DwarfCompileUnit::addComplexAddress).

llvm/test/DebugInfo/Mips/entry-value-non-empty-expr.ll
25

Not sure if it's worth it here, but in general I find FileCheck's '-implicit-check-not=<pattern>' feature useful in situations like this.

Thanks for the patch.
We wanted to avoid any complex DIExpression here, so this looks good. In order to support it here, we need to add fully support within the AsmPrinter (DwarfDebug, DwarfExpression, DwarfCompileUnit) and then enable it here.

llvm/test/DebugInfo/Mips/entry-value-non-empty-expr.ll
50

I think we don't need the attributes.

In D66746#1645802, @vsk wrote:

Thanks for the patch. Could you elaborate on what you mean by OP_entry_value only being able to wrap a single byte? Skimming the implementation, the impression I get is that a full machine-reg-expression can be attached to an OP_entry_value (see: DwarfCompileUnit::addComplexAddress).

The lang ref describes the DW_OP_entry_value operation in DIExpression as:

If an expression is marked with DW_OP_entry_value all register and memory read operations refer to the respective value at the function entry. The first operand of DW_OP_entry_value is the size of following DWARF expression. DW_OP_entry_value may appear after the LiveDebugValues pass. LLVM only supports entry values for function parameters that are unmodified throughout a function and that are described as simple register location descriptions. DW_OP_entry_value may also appear after the AsmPrinter pass when a call site parameter value (DW_AT_call_site_parameter_value) is represented as entry value of the parameter.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

However, I now got uncertain when looking at prependOpcodes() which is used to add the operation to the DIExpression:

Ops.push_back(dwarf::DW_OP_entry_value);
// Add size info needed for entry value expression.
// Add plus one for target register operand.
Ops.push_back(Expr->getNumElements() + 1);

As seen, there the number of pre-existing elements plus one is used. I don't think the number of elements does not map one-to-one with the byte size of the DWARF block, so I'm not sure how to interpret that. Can you help me understand what the operand in the DIExpression world indicates, @djtodoro?

As far as I understand, we now emit the operation from the DIExpression as-is in the DWARF. That means that if we have a register that turns into a complex expression we will still say that the size of that expression is one byte. I have seen such cases with our downstream target, but I'll see if I can trigger that behavior with an upstream target with a source level reproducer.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

However, I now got uncertain when looking at prependOpcodes() which is used to add the operation to the DIExpression:

Ops.push_back(dwarf::DW_OP_entry_value);
Add size info needed for entry value expression.
Add plus one for target register operand.
Ops.push_back(Expr->getNumElements() + 1);

As seen, there the number of pre-existing elements plus one is used. I don't think the number of elements does not map one-to-one with the byte size of the DWARF block, so I'm not sure how to interpret that. Can you help me understand what the operand in the DIExpression world indicates, @djtodoro?

I think your point is right. We wanted to have there hard-coded value 1 for the size of following expression. Except if we did not cover all the cases where we should avoid complex expressions, we always generate an entry value expression with an empty pre-existing DIExpression, so we assumed that this code will cover current situation and may be extended to support complex debug expressions as well. But, I also think it is hard to distinguish the size of a complex DIExpression until DWARF being printed, so maybe we can change the code to Ops.push_back(1); and put some kind of assertion there.

As far as I understand, we now emit the operation from the DIExpression as-is in the DWARF. That means that if we have a register that turns into a complex expression we will still say that the size of that expression is one byte. I have seen such cases with our downstream target, but I'll see if I can trigger that behavior with an upstream target with a source level reproducer.

If you can produce such scenario it will be desirable. We enabled the feature in this initial stage only for x86 targets, and tried to cover all the situations found for the target. We meant to cover all the places where we should avoid generation of debug entry values with complex expressions (now). Eventually, we should handle all types of expressions.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

I think it would be reasonable to use the number of opcodes in the DIExpression in LLVM IR and only substitute the number of bytes in AsmPrinter.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

I think it would be reasonable to use the number of opcodes in the DIExpression in LLVM IR and only substitute the number of bytes in AsmPrinter.

+1. The DIExpression is more of an abstracted expression than a true DWARF expression, and even supports non-DWARF opcodes e.g. DW_OP_LLVM_convert. We can't be treating this as the size of the final expression.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

I think it would be reasonable to use the number of opcodes in the DIExpression in LLVM IR and only substitute the number of bytes in AsmPrinter.

+1. The DIExpression is more of an abstracted expression than a true DWARF expression, and even supports non-DWARF opcodes e.g. DW_OP_LLVM_convert. We can't be treating this as the size of the final expression.

That sounds like a reasonable approach. I can start looking on some patches for that, if no one objects.

Even if the entry value could wrap multiple operations, we would still not want LDV to emit an entry value for the parameter in the attached test case, since the parameter value has been propagated to the callee, but I guess that's another question.

I have interpreted the size as meaning the byte size of the DWARF block that the operation will cover. Assuming that, at the time of running LiveDebugValues I don't think there is a good way to query the size of the block that the entry value will cover; we don't know that until we actually emit the DWARF, as far as I can tell. That is why I have assumed that a hard coded operand of 1 is emitted there, with the assumption that only simple register location descriptions are supported.

However, I now got uncertain when looking at prependOpcodes() which is used to add the operation to the DIExpression:

Ops.push_back(dwarf::DW_OP_entry_value);
Add size info needed for entry value expression.
Add plus one for target register operand.
Ops.push_back(Expr->getNumElements() + 1);

As seen, there the number of pre-existing elements plus one is used. I don't think the number of elements does not map one-to-one with the byte size of the DWARF block, so I'm not sure how to interpret that. Can you help me understand what the operand in the DIExpression world indicates, @djtodoro?

I think your point is right. We wanted to have there hard-coded value 1 for the size of following expression. Except if we did not cover all the cases where we should avoid complex expressions, we always generate an entry value expression with an empty pre-existing DIExpression, so we assumed that this code will cover current situation and may be extended to support complex debug expressions as well. But, I also think it is hard to distinguish the size of a complex DIExpression until DWARF being printed, so maybe we can change the code to Ops.push_back(1); and put some kind of assertion there.

As far as I understand, we now emit the operation from the DIExpression as-is in the DWARF. That means that if we have a register that turns into a complex expression we will still say that the size of that expression is one byte. I have seen such cases with our downstream target, but I'll see if I can trigger that behavior with an upstream target with a source level reproducer.

If you can produce such scenario it will be desirable. We enabled the feature in this initial stage only for x86 targets, and tried to cover all the situations found for the target. We meant to cover all the places where we should avoid generation of debug entry values with complex expressions (now). Eventually, we should handle all types of expressions.

It can be reproduced with the following C file compiled for sparc64:

volatile long double global;
extern void clobber();
int foo(long double p) {
  global = p;
  clobber();
  return 123;
}

compiled using:

clang --target=sparc64 -g -O2 -Xclang -femit-debug-entry-values -c -integrated-as sparc.c

yields the following entry value after LiveDebugValues:

CALL @clobber, csr, implicit $o6, implicit-def $o6, debug-location !24 {
  STDFri killed renamable $i0, 8, renamable $d1, implicit killed $q0, debug-location !19 :: (store 8)
}
DBG_VALUE $q0, $noreg, !17, !DIExpression(DW_OP_entry_value, 1), debug-location !18

which currently results in the following bad location list entry:

[0x0000000000000020,  0x0000000000000028): DW_OP_GNU_entry_value(DW_OP_regx D0), DW_OP_piece 0x8, DW_OP_regx D1, DW_OP_piece 0x8, DW_OP_stack_value)

Please note that you have to apply D66888, and also add sparc to the list of allowed targets in ParseCodeGenArgs(), to get that reproducer working.

Another comment I made in an earlier review is that it is also reasonable to have completely different semantics for DW_OP_entry_value in LLVM IR; in which case it would be best to introduce an IR-only DW_OP_LLVM_entry_value (like DW_OP_LLVM_fragment) to avoid confusion and lower that to a real DWARF opcode in AsmPrinter.

@dstenb Thanks a lot for the test case, I will take a look into that.

Another comment I made in an earlier review is that it is also reasonable to have completely different semantics for DW_OP_entry_value in LLVM IR; in which case it would be best to introduce an IR-only DW_OP_LLVM_entry_value (like DW_OP_LLVM_fragment) to avoid confusion and lower that to a real DWARF opcode in AsmPrinter.

@aprantl I agree with this... That will simplify things a lot, especially future work.

dstenb abandoned this revision.Nov 7 2019, 10:13 AM

Sorry for the delay!

The DW_OP_LLVM_entry_value operation has now been introduced, and that allows for DW_OP_entry_value operations with blocks larger than one byte to be expressed. However, the DW_OP_LLVM_entry_value operation can currently only wrap one operation in the DIExpression (there are some things missing in DwarfDebug and DwarfExpression for allowing more), so we'd still see the crash that this patch addresses.

However, we do not really want to emit an entry value here, as the caller's parameter value has been propagated into the callee. I think that "omit entry values if the caller's parameter value has been propagated into the callee" is the condition that I really want check here. I have created a new patch that attempts to address that: D69889. I will abandon this revision, and we can then revisit it later on if needed.