This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Support for DW_OP_implicit_pointer (IR Verifier and Bitcode)
Needs ReviewPublic

Authored by alok on Nov 8 2019, 2:37 AM.

Details

Summary

This patch (4/N) stems from D69787 as suggested by @aprantl (Adrian Prantl)

Summary:
This patch adds support in IR verifier and Bitcode for DW_OP_implicit_pointer

The test case having IR with DW_OP_LLVM_implicit_pointer, will be compiled successfully

Class DwarfCompileUnit has been modified to add member ImplicitVars of vector type. Below is the usage of it.

First operand of DW_OP_implicit_pointer operation is a reference to a debugging information entry that describes the dereferenced object’s value. when this operation is created, The debugging information entry is not yet formed. To solve this problem, we maintain a vector of dereferenced objects. We keep index of dereferenced object's at first operand of DW_OP_implicit_pointer operation temporarily. This temporary value is later replaced by actual value when available.

If a variable has only single DBG_VALUE then currently location list is not created instead value is used to initialized the var. In case of DW_OP_implicit_pointer, now location list will still be created.

Testing:

  • Added unit test for validation thru llvm-dwarfdump
  • check-llvm
  • check-debuginfo (the debug info integration tests)

Diff Detail

Event Timeline

alok created this revision.Nov 8 2019, 2:37 AM
  • Can you update LangRef.rst / SourceLevelDebugging.rst to document the semantics of the DW_OP_implicit_pointer in LLVM IR. Specifically what the two arguments mean while in LLVM IR?
  • You'll need to add a Verifier check that rejects DW_OP_implicit_pointer in other positions than the first one.
llvm/include/llvm/CodeGen/MachineInstr.h
484 ↗(On Diff #228391)

I probably wouldn't surface this in MachineInstr at all and just write getDebugExpression()->isImplicitPointer(); for clarity.

llvm/include/llvm/CodeGen/MachineInstrBuilder.h
228 ↗(On Diff #228391)

I'm not sure this should be exposed here either.

llvm/include/llvm/IR/IntrinsicInst.h
127

Why is this necessary?

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
583

What does "later" mean in this context?
Also, please make sure to have all comments be full sentences, starting with an uppercase character and ending with a ..

alok added a comment.Nov 10 2019, 10:40 AM
  • Can you update LangRef.rst / SourceLevelDebugging.rst to document the semantics of the DW_OP_implicit_pointer in LLVM IR. Specifically what the two arguments mean while in LLVM IR?

Thanks for your review. I shall update it in next revision.

  • You'll need to add a Verifier check that rejects DW_OP_implicit_pointer in other positions than the first one.

I shall update it in next revision.

alok marked 5 inline comments as done.Nov 10 2019, 10:51 AM
alok added inline comments.
llvm/include/llvm/CodeGen/MachineInstr.h
484 ↗(On Diff #228391)

I shall update it in next revision.

llvm/include/llvm/CodeGen/MachineInstrBuilder.h
228 ↗(On Diff #228391)

Other option is to duplicate this function for implicit pointer only difference being the absence of first assert. Please let me know your suggestion?

llvm/include/llvm/IR/IntrinsicInst.h
127

It has call from other functions. Please let me know if i should inline the content at call-sites?

llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
583

By later i meant function DwarfDebug::emitDebugLocEntry.
I shall update it in next revision.

alok updated this revision to Diff 228612.Nov 10 2019, 10:53 AM
alok marked an inline comment as done.

Updated for comments from @aprantl (Adrian Prantl)

My first thought was that this might be an opportunity to finally implement explicit dbg.value arguments. (There is prior discussion of this online somewhere but I couldn't find it in my mailbox).

The idea behind a explicit dbg.value arguments is to move away from the implicit first argument of dbg.value to a format where that makes them explicit and thus more flexible.

For example, currently we would write:

call @llvm.dbg.value(%1, !DILocalVariable(name: "x"), !DIExpression(DW_OP_uconst, 4, DW_OP_plus)

which (assuming %1 compiles to reg0) may generate something like

DW_TAG_variable
  DW_AT_name("x")
  DW_AT_location(DW_OP_reg0, DW_OP_uconst, 4, DW_OP_plus)

Note the implicit first element in the DIExpression.

An explicit notation would be something like:

call @llvm.dbg.value(%1, !DILocalVariable(name: "x"), !DIExpression(DW_OP_LLVM_arg0, DW_OP_uconst, 4, DW_OP_plus)

which has the advantage that we could put DW_OP_LLVM_arg0 into places other than just the beginning of the expression.

Similarly, we could then further extend it to allow referring to more than one LLVM SSA value within an expression, using a variadic dbg.value. For example:

call @llvm.dbg.value(%1, %2, !DILocalVariable(name: "x"), !DIExpression(DW_OP_LLVM_arg0 /* %1 */ , DW_OP_uconst, DW_OP_LLVM_arg1 /* %2 */, DW_OP_plus)

This mechanism seems to apply nicely to DW_OP_implicit_pointer — applied to your testcase we'd write:

call @llvm.dbg.value(!DILocalVariable(name: "arr"), !DILocalVariable(name: "ptr"), !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_arg0, 4)

I think this would make the handling of DW_OP_LLVM_implicit_pointer less magic in LLVM IR and would open the door to future features, such as the variadic dbg.value.
What do you think?
What does the debug info cabal think?

llvm/docs/LangRef.rst
4818 ↗(On Diff #228612)

+4 is abit ambiguous. Perhaps first do a generic description, followed by an example?

4819 ↗(On Diff #228612)

that that

4820 ↗(On Diff #228612)

Is that an accurate description? Per DWARF5:

The DW_OP_implicit_pointer operation specifies that the object is a pointer that cannot be represented as a real pointer, even though the value it would point to can be described

llvm/docs/SourceLevelDebugging.rst
248 ↗(On Diff #228612)

Let's move this into the paragraph in LangRef, so it's all in one place.

dstenb added a subscriber: dstenb.Nov 11 2019, 2:29 PM

My first thought was that this might be an opportunity to finally implement explicit dbg.value arguments. (There is prior discussion of this online somewhere but I couldn't find it in my mailbox).

Perhaps https://bugs.llvm.org/show_bug.cgi?id=39141?

My first thought was that this might be an opportunity to finally implement explicit dbg.value arguments. (There is prior discussion of this online somewhere but I couldn't find it in my mailbox).

Perhaps https://bugs.llvm.org/show_bug.cgi?id=39141?

Yes, thanks! That also explains why I didn't find it in my mailbox :-)

alok marked 4 inline comments as done.Nov 12 2019, 10:51 PM

My first thought was that this might be an opportunity to finally implement explicit dbg.value arguments. (There is prior discussion of this online somewhere but I couldn't find it in my mailbox).

The idea behind a explicit dbg.value arguments is to move away from the implicit first argument of dbg.value to a format where that makes them explicit and thus more flexible.

For example, currently we would write:

call @llvm.dbg.value(%1, !DILocalVariable(name: "x"), !DIExpression(DW_OP_uconst, 4, DW_OP_plus)

which (assuming %1 compiles to reg0) may generate something like

DW_TAG_variable
  DW_AT_name("x")
  DW_AT_location(DW_OP_reg0, DW_OP_uconst, 4, DW_OP_plus)

Note the implicit first element in the DIExpression.

An explicit notation would be something like:

call @llvm.dbg.value(%1, !DILocalVariable(name: "x"), !DIExpression(DW_OP_LLVM_arg0, DW_OP_uconst, 4, DW_OP_plus)

which has the advantage that we could put DW_OP_LLVM_arg0 into places other than just the beginning of the expression.

Similarly, we could then further extend it to allow referring to more than one LLVM SSA value within an expression, using a variadic dbg.value. For example:

call @llvm.dbg.value(%1, %2, !DILocalVariable(name: "x"), !DIExpression(DW_OP_LLVM_arg0 /* %1 */ , DW_OP_uconst, DW_OP_LLVM_arg1 /* %2 */, DW_OP_plus)

This mechanism seems to apply nicely to DW_OP_implicit_pointer — applied to your testcase we'd write:

call @llvm.dbg.value(!DILocalVariable(name: "arr"), !DILocalVariable(name: "ptr"), !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_arg0, 4)

I think this would make the handling of DW_OP_LLVM_implicit_pointer less magic in LLVM IR and would open the door to future features, such as the variadic dbg.value.
What do you think?

Your suggestion looks great. I shall include that in next revision.

What does the debug info cabal think?

llvm/docs/LangRef.rst
4818 ↗(On Diff #228612)

Thanks, it will be updated in next revision.

4819 ↗(On Diff #228612)

Thanks, it will be updated in next revision.

4820 ↗(On Diff #228612)

It will be updated to incorporate your other comment to introduce DW_OP_LLVM_arg0 / DW_OP_LLVM_implicit_pointer

llvm/docs/SourceLevelDebugging.rst
248 ↗(On Diff #228612)

It will be updated in next revision.

alok updated this revision to Diff 229014.Nov 12 2019, 11:09 PM
alok edited the summary of this revision. (Show Details)

Review comments from @aprantl have been incorporated.

alok marked 7 inline comments as done.Nov 12 2019, 11:12 PM
alok added a comment.Nov 14 2019, 11:25 AM

Review comments from @aprantl have been incorporated.

@aprantl, do you have any more comments ?

Thanks! Let's cycle back to this after the llvm-dev thread that @dblaikie started has been resolved.

alok updated this revision to Diff 229790.Nov 18 2019, 3:29 AM

Updated to

  • Rebase
  • To remove trailing space
  • typecasted return value of getData to uint8_t
alok updated this revision to Diff 230802.Nov 24 2019, 10:04 AM
alok edited the summary of this revision. (Show Details)

Updated after splitting out versions for DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_argN and llvm.dbg.derefval intrinsic.

Hi @alok,

Adrian said:

Thanks! Let's cycle back to this after the llvm-dev thread that @dblaikie started has been resolved.

Reading the thread I'm not sure if it counts as having been 'resolved', so hopefully these comments are not out of place.
To help keep things running smoothly please make sure:
a) variable names are capitalised, and
b) comments are written as sentences (capitals, punctuation),

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
79

when -> When

81–82

I get what this is saying but I found it a little hard to parse, and as Adrian pointed out elsewhere 'later' is a bit vague:

We keep index of dereferenced object's at first operand of DW_OP_implicit_pointer operation temporarily. This temporary value is later replaced by actual value when available.

84

Could std::vector be llvm::SmallVector [0] here?

[0] http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h

alok marked 6 inline comments as done.Nov 26 2019, 1:32 AM

Hi @alok,

Adrian said:

Thanks! Let's cycle back to this after the llvm-dev thread that @dblaikie started has been resolved.

Reading the thread I'm not sure if it counts as having been 'resolved', so hopefully these comments are not out of place.
To help keep things running smoothly please make sure:
a) variable names are capitalised, and
b) comments are written as sentences (capitals, punctuation),

Thanks for your comments. I will update it in next version.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
79

Thanks for your comment. It will be updated in next version.

81–82

Thanks for your comment. It will be updated in next version.

84

Thanks for your comment. It will be updated in next version.

alok updated this revision to Diff 231022.Nov 26 2019, 1:40 AM
alok marked 3 inline comments as done.

This is updated to incorporate review comments from @Orlando .

StephenTozer added inline comments.
llvm/include/llvm/IR/IntrinsicInst.h
128

I think this check is unnecessary - it seems like this is expected to be true for every DbgVariableIntrinsic, and this check isn't used anywhere else prior to calling getExpression().

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
45

Nit: the full stop should be replaced by a comma, and the comment immediately below should end with a full stop.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
138

The "else" can be removed, since the if block ends with a return statement.

alok marked 6 inline comments as done.Dec 18 2019, 3:24 AM
alok added inline comments.
llvm/include/llvm/IR/IntrinsicInst.h
128

Thanks for your comment. It will be incorporated in next version.

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
45

Thanks for your comment. It will be incorporated in next version.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
138

Thanks for your comment. It will be incorporated in next release.

alok updated this revision to Diff 234503.Dec 18 2019, 4:17 AM
alok marked 3 inline comments as done.
alok edited the summary of this revision. (Show Details)

Updated to re-base and incorporated comments from @StephenTozer