Page MenuHomePhabricator

[Debuginfo][Codegen] (2/7) Support for DW_OP_implicit_pointer for named and unnamed variables (second strategy).
Needs ReviewPublic

Authored by alok on Jul 18 2020, 5:02 PM.

Details

Summary
This patch stems from D84112.
This patch includes changes for CodeGen phase.

Diff Detail

Event Timeline

alok created this revision.Jul 18 2020, 5:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2020, 5:02 PM
jmorse added a subscriber: jmorse.Jul 30 2020, 5:46 AM

I sense this is the bulk of the work; and it looks good. I've posted on llvm-dev asking whether the creation of artificial variables could instead be deferred til DWARF emission. That'll avoid anything having to deal with a new form of DBG_VALUE, and us having to think about their lifetimes. Plus, there's greater scope for de-duplication, as I understand it this patch will generate a new artificial variable for each dbg.value, for each level of indirection.

On that topic -- do we need to give the artificial variable a name? I'm no DWARF expert, but can we just not attach a DW_AT_name to the DIE?

llvm/include/llvm/CodeGen/MachineInstrBuilder.h
250–252

I think the preferred form is !MI->isDebugValue() || static_cast...., i.e., the assertion always passes if the MI isn't a DBG_VALUE, or if it is, the getDebugImplicitVariable() part is tested. This avoids the reader having to consider an un-necessary ternary operator.

254–256

(Same as above)

llvm/include/llvm/IR/DIBuilder.h
647–654

This'll want a docu-comment as to createImplicitTempVariable's purpose.

llvm/include/llvm/IR/DebugInfoMetadata.h
2705

nit: Full stops at ends of comments, here and elsewhere.

2711

"indirection"

llvm/include/llvm/IR/IntrinsicInst.h
131

Doc comment needed too please.

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
69–71

Shouldn't this be in the union?

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
83–85
86

Would it be possible to use a subtype of DINode instead, such as DIVariable? That'll increase type safety and make it clearer what's going on.

135

I'd suggest the return type can be auto here.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2472

I reckon it's better to assert(Die);, there should be no scenarios where the implicit pointer DIE isn't emitted, right? Otherwise, if this if condition is false, you'll write the unmodified ValOffset into the output dwarf, which could be misleading.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
1260–1261

NB: this function definitely needs some comments in it as to what's going on.

1268

Better to test for Level==0 and have the normal-dbg-value case return early; then the rest of this function doesn't need an extra level of indentation / scope.

1334

Correct me if I'm wrong, but isn't generation of DW_OP_LLVM_implicit_pointer only done during instruction selection? If so, can we not reject implicit_pointer expressions at this stage as being illegal IR?

llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
114–121

It looks like this contradicts the immediately preceeding comment, could you update the comment please?

llvm/test/DebugInfo/X86/dwarfdump-implicit_pointer_instcomb.ll
5–13

Don't both ptr and ptrVar become implicit pointers? Does this lead to two implicit_ptr_tmp variables being created?

It'd also be good to capture/compare the operand to DW_OP_implicit_pointer to the DIE address that it refers to, to confirm that it points to the correct place.

8

CHECK-NEXT: DW_AT_name "ptrVar"?

llvm/test/DebugInfo/X86/dwarfdump-implicit_pointer_mem2reg.ll
5 ↗(On Diff #279040)

Again, in this test it would be good to ensure that DW_OP_implicit_pointer points at the correct DIE. If the intention here is to only test for multiple levels of implicit pointer, it's probably sufficient to only check one variable.

llvm/test/DebugInfo/X86/dwarfdump-implicit_pointer_sroa.ll
4

It'd be good to test for non-zero offsets into an implicit variable -- i.e., in the original C, have ptr3 something like:

ptr3 = &arr1[1];
llvm/test/DebugInfo/X86/dwarfdump-implicit_pointer_sroa_inline.ll
8–12

So, there'll be a DW_OP_implicit_pointer for each value a pointer-variable can point at, rather than one DW_OP_implicit_pointer and two entries in the variable/location-list it refers to?

(This might not be the most efficient way of doing it; maybe it's sufficient for now though).

alok retitled this revision from [Debuginfo] (3/N) Support for DW_OP_implicit_pointer for named and unnamed variables (second strategy). to [Debuginfo][Codegen] (2/7) Support for DW_OP_implicit_pointer for named and unnamed variables (second strategy)..Nov 12 2020, 1:42 AM
alok updated this revision to Diff 304749.Nov 12 2020, 1:46 AM

Updated to re-base and include many comments from @jmorse .

djtodoro added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp
680

Can we add support for GNU extension as well in getDwarf5OrGNULocationAtom() ?

djtodoro added inline comments.Nov 13 2020, 6:39 AM
llvm/include/llvm/IR/DebugInfoMetadata.h
2715

early continue?

llvm/include/llvm/IR/IntrinsicInst.h
163

true/false but the return val is unsigned?

llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h
53

intentional change?

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
142

static_cast might be better

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1830

missing . at the end

2464

same here

2466

please refactor these comments

alok marked 21 inline comments as done.Nov 24 2020, 10:25 PM
alok updated this revision to Diff 307830.Nov 26 2020, 4:29 AM
alok added a reviewer: djtodoro.

Updated to incorporate comments from @djtodoro .

A few more comments inline. I think everything in this patch works; there's still an open question in my mind of why we need to create the implicit variables so early, during instruction selection. Isn't it sufficient to leave a:

dbg.value(0, !123, !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer)

In exactly the same form through the MIR phases:

DBG_VALUE 0, $noreg, !123, !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer)

And then create the implicit variables at a later stage? That would avoids a number of SelectionDAG changes, plus global and fast isel too.

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
86

Thinking about it, would UniqueVector be a more appropriate data structure? It's more heavy weight, but contains a map for quick lookup of elements. findOrInsertImplicitVar below uses std::find which is O(n), which isn't going to scale well.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
260–261

getDebugImplicitVariable?

llvm/lib/CodeGen/CodeGenPrepare.cpp
1223

Not clear why this is here -- slipped in by accident? If not, is there a reason why we shouldn't be salvaging these casts? Same below.

llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h
49

Doxygen for this field please

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
1260–1261

Comment-ping on this :)

1287–1295

An illegal input to this loop would be a multi-level implicit pointer expression, and a variable type that has less indirection. IMO, there should be an assertion that this doesn't happen, to avoid it being silently accepted. Not sure where to put it though -- assert that Type is a DIDerivedType each time?

I sense this is the bulk of the work; and it looks good. I've posted on llvm-dev asking whether the creation of artificial variables could instead be deferred til DWARF emission. That'll avoid anything having to deal with a new form of DBG_VALUE, and us having to think about their lifetimes. Plus, there's greater scope for de-duplication, as I understand it this patch will generate a new artificial variable for each dbg.value, for each level of indirection.

On that topic -- do we need to give the artificial variable a name? I'm no DWARF expert, but can we just not attach a DW_AT_name to the DIE?

I haven't looked closely at the code - but "do it later" sounds like the right theory to me. I believe the absence of a name is probably OK - there are unnamed types, for instance, so unnamed variables seem fine too. But some testing with real debuggers would be useful.