Page MenuHomePhabricator

[Salvage] Change salvage debug info implementation to use new DW_OP_LLVM_convert where needed

Authored by StephenTozer on Apr 26 2019, 5:40 AM.



Fixes issue:

Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator.

Diff Detail

Event Timeline

StephenTozer created this revision.Apr 26 2019, 5:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2019, 5:40 AM
dstenb added a subscriber: dstenb.Apr 26 2019, 5:51 AM
dstenb added inline comments.

There is a early bail-out for ZExtInst above:

if (CI->isNoopCast(DL) || isa<ZExtInst>(&I))
  return SrcDIExpr;

so that condition in this if statement appears to be dead code as it is now.

I assume that we no longer should bail out for zexts (?).

As per the previous comment, ZExt is equivalent to a Noop on the DWARF expression stack, so can be caught by the same early exit.

StephenTozer edited the summary of this revision. (Show Details)Apr 26 2019, 6:12 AM

Have you verified that this results in the desired DW_OPs being emitted in the end and that the (or a) debugger makes sense out of it?

Since the expression stack is a bit sloppy about types it defaults to the generic type (size of address on the target machine), so without having previously set the type to something else a single DW_OP_LLVM_convert does not mean much. This is the reason that llvm::replaceAllDbgUsesWith in lib/Transforms/Utils/Local.cpp inserts a pair (the first one is simply used to set the type of the current stack element and the second one performs the actual conversion).

For now inserting a pair would probably be the easiest way forward. A bit prettier would be to introduce a DW_OP_LLVM_reinterpret op to do that job. Longer term it would be helpful if all values had a specified type immediately when they are put on the stack.

dstenb added inline comments.Apr 26 2019, 6:29 AM

Also, perhaps this should be inverted to an early exit instead?

A bit prettier would be to introduce a DW_OP_LLVM_reinterpret op to do that job.

I am preparing a patch for that right now. Regardless of this review it should make the existing DW_OP_LLVM_convert stuff a bit cleaner.

A bit prettier would be to introduce a DW_OP_LLVM_reinterpret op to do that job.

I am preparing a patch for that right now. Regardless of this review it should make the existing DW_OP_LLVM_convert stuff a bit cleaner.

Actually forget all I said about DW_OP_reinterpret, it turns out that I did not read the spec properly and did not pay attention to the last sentence saying that: The type of the operand and result type must have the same size in bits.

So clearly it cannot be used like I suggested.

What I said about using two DW_OP_LLVM_convert in sequence still holds though.

Updated to use convert OP correctly as suggested by markus.

aprantl added inline comments.May 2 2019, 11:26 AM


Remove incorrectly included lines.

The DW_OP_LLVM_convert stuff looks-good-to-me but I don't feel that I have authority to approve for the rest so please don't wait for any input from my part.

probinson added inline comments.May 13 2019, 10:35 AM

Looks to me like this 'if' does more than check for vectors?

Update comment

probinson accepted this revision.May 14 2019, 6:49 AM

LGTM after you add the full-stop noted below.


Needs a full-stop at the end of the sentence.

This revision is now accepted and ready to land.May 14 2019, 6:49 AM
This revision was automatically updated to reflect the committed changes.

This broke compilation of libtiff for arm, see for details.

Thanks for working on this, it seems very useful! Any updates?

StephenTozer reopened this revision.Aug 16 2019, 9:35 AM

Reopening, as the merged patch caused a regression issue and needed to be reverted; this update should fix said regression issue.

This revision is now accepted and ready to land.Aug 16 2019, 9:35 AM

What was the fix for the regression and is it also tested?

The regression issue is documented in the bug above:
The main issue can be summarised briefly as "splitting a fragment DIExpression across multiple registers is currently broken", which is what I've submitted a fix for (see comments in SelectionDAGBuilder.cpp); the test for the new behaviour is in fragmented-args-multiple-regs.ll

Thanks for this (a few comments added).


I guess this is unwanted.


We can delete this.


And this as well.


We can also update the version to 10.0.0.


Here as well.


Here as well.

Looks like this was accepted (with minor nitpicks) long time ago. Please let us know if you have any obstacle to commit it.

aprantl added inline comments.Oct 29 2019, 9:18 AM

Should this be a separate commit? This seems unrelated to the SDag change?


Still needs to happen.

djtodoro added inline comments.Nov 1 2019, 6:22 AM

I think as well. This needs to be different patch.

StephenTozer marked an inline comment as done.Nov 5 2019, 1:49 AM
StephenTozer added inline comments.

I'm currently investigating whether this is necessary for the bug fix or not - in this case, immediately prior to CodeGen we have the variable i64 %j, which represents two source variables:

define dso_local i32 @h(i64 %j) local_unnamed_addr #0 !dbg !8 {
  call void @llvm.dbg.value(metadata i64 %j, metadata !14, metadata !DIExpression()), !dbg !29
  call void @llvm.dbg.value(metadata i64 %j, metadata !15, metadata !DIExpression(DW_OP_LLVM_convert, 64, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value, DW_OP_LLVM_fragment, 0, 32)), !dbg !29
  call void @llvm.dbg.value(metadata i64 %j, metadata !15, metadata !DIExpression(DW_OP_constu, 32, DW_OP_shr, DW_OP_LLVM_convert, 64, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value, DW_OP_LLVM_fragment, 32, 32)), !dbg !29

At the ISel stage, the target architecture requires the i64 variable to be split across two 32 bit registers. We now have two 32 bit registers (one "low", one "high") which map exactly to two 32 bit source variables, but due to the previous transforms the finalized MIR looks like:

%1:gpr = COPY $r1
DBG_VALUE %1, $noreg, !14, !DIExpression(DW_OP_LLVM_fragment, 32, 32), debug-location !29
%0:gpr = COPY $r0
DBG_VALUE %0, $noreg, !14, !DIExpression(DW_OP_LLVM_fragment, 0, 32), debug-location !29
DBG_VALUE %0, $noreg, !15, !DIExpression(DW_OP_LLVM_convert, 64, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value, DW_OP_LLVM_fragment, 0, 32), debug-location !29
DBG_VALUE %0, $noreg, !15, !DIExpression(DW_OP_constu, 32, DW_OP_shr, DW_OP_LLVM_convert, 64, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_unsigned, DW_OP_stack_value, DW_OP_LLVM_fragment, 32, 32), debug-location !29

As you can see the 32 bit DW_OP_shr is now being applied to the low 32 bit register in order to get the value in the high register, which won't work. Any operation which expects any kind of "carry over" will fail in the same way when a DBG_VALUE is split. I'm currently looking into whether there's a better way to specifically handle shift operations, since adding DW_OP_shr here still results in a variable being set as "undef" when it could potentially be recoverable.

aprantl added inline comments.Nov 5 2019, 8:54 AM

I'm not opposed to supporting shr here, I just think it should be a separately tested patch that can land in preparation of this one.

I think this one is almost (with the comments addressed) ready to go, but I think that the D70248 is preparation for this one? Should we create a stack of patches for those two patches?

If D70498 is landed, please make sure to use the API here.

Rebased and updated for latest master, fixed clang version and unused attributes in tests.

vsk added a comment.Dec 12 2019, 10:33 AM

This seems to be in good shape overall and the SelectionDAG changes look reasonable. I don't intend my comments here to block committing.


This looks like it would have matched before this patch. Consider ; CHECK: @llvm.dbg.value(metadata i32 %{{.*}},?


I'm not sure I understand this change. Why is the dbg.value for %conv deleted?

Please address my comments and try using the DIExpression::appendExt(). Thanks.

djtodoro added inline comments.Dec 12 2019, 11:23 PM

More precisely, the DIExpression::appendExt() should be used here.

StephenTozer marked an inline comment as done.Dec 13 2019, 1:25 AM
StephenTozer added inline comments.

The purpose of this test as stated in the comment at the top was to check that, when the trivially deletable %conv was deleted, metadata that depended on it was set to undef. Reading that comment it seemed that the important component of the test wasn't the existence of the dbg.value itself, but confirming that it would be set undef rather than being given an incorrect value. As of this patch however, %conv can be salvaged, which seems to defeat the point of the test. As %udiv cannot be salvaged, it seemed like an appropriate stand-in.

Use shared functionality for zero/sign extension ops, fix tests

vsk added inline comments.Dec 13 2019, 8:53 AM

Ah, got it, sgtm.

@djtodoro Does this latest change look good to you?

Since the only other outstanding comments were not intended to block committing, if this is considered resolved and there are no further comments I'll consider this implicitly LGTM'd; although an explicit response as of the latest changes would be preferred.

LGTM, thanks!

Sorry, one more nit included.


Have you considered using the using for declaration of the std::array<uint64_t, 6 return type?

Format, addressing review comments

Looks good to me! Thanks!

StephenTozer closed this revision.Dec 19 2019, 2:11 AM

Merged in commit: llvmorg-10-init-12272-g89d19d60adb9