This patch stems from D84112. This patch includes changes for CodeGen phase.
I sense this is the bulk of the work; and it looks good. I've posted on llvm-dev asking whether the creation of artificial variables could instead be deferred til DWARF emission. That'll avoid anything having to deal with a new form of DBG_VALUE, and us having to think about their lifetimes. Plus, there's greater scope for de-duplication, as I understand it this patch will generate a new artificial variable for each dbg.value, for each level of indirection.
On that topic -- do we need to give the artificial variable a name? I'm no DWARF expert, but can we just not attach a DW_AT_name to the DIE?
I think the preferred form is !MI->isDebugValue() || static_cast...., i.e., the assertion always passes if the MI isn't a DBG_VALUE, or if it is, the getDebugImplicitVariable() part is tested. This avoids the reader having to consider an un-necessary ternary operator.
(Same as above)
This'll want a docu-comment as to createImplicitTempVariable's purpose.
nit: Full stops at ends of comments, here and elsewhere.
Doc comment needed too please.
Shouldn't this be in the union?
Would it be possible to use a subtype of DINode instead, such as DIVariable? That'll increase type safety and make it clearer what's going on.
I'd suggest the return type can be auto here.
I reckon it's better to assert(Die);, there should be no scenarios where the implicit pointer DIE isn't emitted, right? Otherwise, if this if condition is false, you'll write the unmodified ValOffset into the output dwarf, which could be misleading.
NB: this function definitely needs some comments in it as to what's going on.
Better to test for Level==0 and have the normal-dbg-value case return early; then the rest of this function doesn't need an extra level of indentation / scope.
Correct me if I'm wrong, but isn't generation of DW_OP_LLVM_implicit_pointer only done during instruction selection? If so, can we not reject implicit_pointer expressions at this stage as being illegal IR?
It looks like this contradicts the immediately preceeding comment, could you update the comment please?
Don't both ptr and ptrVar become implicit pointers? Does this lead to two implicit_ptr_tmp variables being created?
It'd also be good to capture/compare the operand to DW_OP_implicit_pointer to the DIE address that it refers to, to confirm that it points to the correct place.
CHECK-NEXT: DW_AT_name "ptrVar"?
Again, in this test it would be good to ensure that DW_OP_implicit_pointer points at the correct DIE. If the intention here is to only test for multiple levels of implicit pointer, it's probably sufficient to only check one variable.
It'd be good to test for non-zero offsets into an implicit variable -- i.e., in the original C, have ptr3 something like:
ptr3 = &arr1;
So, there'll be a DW_OP_implicit_pointer for each value a pointer-variable can point at, rather than one DW_OP_implicit_pointer and two entries in the variable/location-list it refers to?
(This might not be the most efficient way of doing it; maybe it's sufficient for now though).
true/false but the return val is unsigned?
static_cast might be better
missing . at the end
please refactor these comments
A few more comments inline. I think everything in this patch works; there's still an open question in my mind of why we need to create the implicit variables so early, during instruction selection. Isn't it sufficient to leave a:
dbg.value(0, !123, !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer)
In exactly the same form through the MIR phases:
DBG_VALUE 0, $noreg, !123, !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer)
And then create the implicit variables at a later stage? That would avoids a number of SelectionDAG changes, plus global and fast isel too.
Thinking about it, would UniqueVector be a more appropriate data structure? It's more heavy weight, but contains a map for quick lookup of elements. findOrInsertImplicitVar below uses std::find which is O(n), which isn't going to scale well.
Not clear why this is here -- slipped in by accident? If not, is there a reason why we shouldn't be salvaging these casts? Same below.
Doxygen for this field please
Comment-ping on this :)
An illegal input to this loop would be a multi-level implicit pointer expression, and a variable type that has less indirection. IMO, there should be an assertion that this doesn't happen, to avoid it being silently accepted. Not sure where to put it though -- assert that Type is a DIDerivedType each time?
I haven't looked closely at the code - but "do it later" sounds like the right theory to me. I believe the absence of a name is probably OK - there are unnamed types, for instance, so unnamed variables seem fine too. But some testing with real debuggers would be useful.