Page MenuHomePhabricator

[DebugInfo] Discard invalid DBG_VALUE instructions in LiveDebugVariables
ClosedPublic

Authored by bjope on Mar 1 2018, 10:13 AM.

Details

Summary

This is a workaround for pr36417
https://bugs.llvm.org/show_bug.cgi?id=36417

LiveDebugVariables will now verify that the DBG_VALUE instructions
are sane (prior to register allocation) by asking LIS if a virtual
register used in the DBG_VALUE is live (or dead def) in the slot
index before the DBG_VALUE. If it isn't sane the DBG_VALUE is
discarded.

One pass that was identified as introducing non-sane DBG_VALUE
instructtons, when analysing pr36417, was the DAG->DAG Instruction
Selection. It sometimes inserts DBG_VALUE instructions referring to
a virtual register that is defined later in the same basic block.
So it is a use before def kind of problem. The DBG_VALUE is
typically inserted in the beginning of a basic block when this
happens. The problem can be seen in the test case
test/DebugInfo/X86/dbg-value-inlined-parameter.ll

Change-Id: Ia55df65fbfb6d3bc561df4c1e27fd36d0e61e505

Diff Detail

Repository
rL LLVM

Event Timeline

bjope created this revision.Mar 1 2018, 10:13 AM
bjope added a subscriber: Ka-Ka.
aprantl requested changes to this revision.Mar 1 2018, 10:27 AM

Thanks! Two comments:

  • How are these invalid DBG_VALUEs generated? Are they happening during or after instruction selection? If they are introduced earlier by IR passes, it would be better to catch them in the Verifier and fix the bugs in the passes instead.
  • Dropping invalid DBG_VALUE instructions is not correct. What you want to do is insert a DBG_VALUE undef, .... For example:
    ; int x = 0;
    ; ...
    ; x = foo();
    ; ... // x is never used, but not necessarily zero

-->

    DBG_VALUE, 0, "x"
    ... 
    DBG_VALUE, noreg, "x"
    ...
    // if you delete the second DBG_VALUE, then "x" will be represented as still being constant 0 here. Inserting a DBG_VALUE undef will correctly terminate the live range of the constant above.
This revision now requires changes to proceed.Mar 1 2018, 10:27 AM
davide added a subscriber: davide.Mar 1 2018, 10:53 AM
bjope added a comment.Mar 1 2018, 11:18 AM

What I see now in test/DebugInfo/X86/dbg-value-inlined-parameter.ll

Input to ISel is OK as far as I can see:

define void @foobar() #0 !dbg !28 {
entry:
  tail call void @llvm.dbg.value(metadata %struct.S1* @p, metadata !19, metadata !DIExpression()) #3, !dbg !31
  tail call void @llvm.dbg.value(metadata i32 1, metadata !21, metadata !DIExpression()) #3, !dbg !34
  store i32 1, i32* getelementptr inbounds (%struct.S1, %struct.S1* @p, i64 0, i32 1), align 8, !dbg !35
  %call.i = tail call float* @bar(i32 1) #4, !dbg !36
  store float* %call.i, float** getelementptr inbounds (%struct.S1, %struct.S1* @p, i64 0, i32 0), align 8, !dbg !36
  ret void, !dbg !37
}

but then after ISel we get

*** MachineFunction at end of ISel ***
# Machine code for function foobar: IsSSA, TracksLiveness

bb.0.entry:
  DBG_VALUE 1, debug-use $noreg, !"nums", !DIExpression(), debug-location !34; line no:7
  DBG_VALUE debug-use %0:gr64, debug-use $noreg, !"sp", !DIExpression(), debug-location !31; line no:7
  %0:gr64 = MOV64rm $rip, 1, $noreg, target-flags(x86-gotpcrel) @p, $noreg, debug-location !35; mem:LD8[GOT]
  MOV32mi %0:gr64, 1, $noreg, 8, $noreg, 1, debug-location !35; mem:ST4[getelementptr inbounds (%struct.S1, %struct.S1* @p, i64 0, i32 1)](align=8)
  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !36
  %1:gr32 = MOV32ri 1, debug-location !35
  $edi = COPY %1:gr32, debug-location !36
  CALL64pcrel32 @bar, <regmask $bh $bl $bp $bpl $bx $ebp $ebx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w>, implicit $rsp, implicit $ssp, implic
it $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $rax, debug-location !36
  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !36
  %2:gr64 = COPY $rax, debug-location !36
  MOV64mr %0:gr64, 1, $noreg, 0, $noreg, %2:gr64, debug-location !36; mem:ST8[getelementptr inbounds (%struct.S1, %struct.S1* @p, i64 0, i32 0)]
  RET 0, debug-location !37

So for some reason the

DBG_VALUE debug-use %0:gr64

ends up before

%0:gr64 = MOV64rm $rip, 1, $noreg, target-flags(x86-gotpcrel) @p, $noreg, debug-location !35; mem:LD8[GOT]

As far as I can see this is a ISel problem. I haven't tracked down exactly how this happens.

Historical debugging

I was actually hunting similar problems to this, with ISel inserting DBG_VALUE in the wrong location, a few months ago (unfortunately haven't had time to finalize the work...).
If I remember correctly there are some problems in ScheduleDAGSDNodes.cpp (methods ScheduleDAGSDNodes::EmitSchedule, ProcessSourceNode, ProcessSDDbgValues).

For example ProcessSourceNode is selecting between three different choices:
a) process SDDbgValue even if the node does not have any order (use IROrder 0)
b) save it in Orders and associate with nullptr
c) save it in Orders and associate with prev, and process SDDbgValue (using given IROrder)

I think the conditions for the (b) choice could cause problems sometimes, as it looks for PHI-nodes, but does not take any already inserted DBG_VALUES into consideration (it should skip past debug values somehow).
And I think processing the SDDbgValue also in situation (b) could improve things.

The ProcessSDDbgValues is having a condition like this

if (!Order || DVOrder == Order)

but we could get there also with DVOrder < Order (don't remember the exact situation, but I suspected that

if (!Order || DVOrder <= Order)

would be better.

I think that some of the problems might originate from situations where SelectionDAGBuilder::visitIntrinsicCall creates DanglingDebugInfo and then the IROrder is weird or missing (don't remember exactly).

Summary

Code like the one in test/DebugInfo/X86/dbg-value-inlined-parameter.ll (I actually made a C-reproducer ending up in such IR as well) seems to be lowered incorrectly by ISel.
I was analysing problem were ISel lowering caused similar problems a few month ago.
And the RUST-based code in https://bugs.llvm.org/show_bug.cgi?id=36417 also suffered from ISel creating "use before def" situations when inserting DBG_VALUE nodes.

So I guess I need to dig up that old patch I was working on in ScheduleDAGSDNodes.cpp to see if it helps...

bjope added a comment.Mar 1 2018, 11:22 AM
  • Dropping invalid DBG_VALUE instructions is not correct. What you want to do is insert a DBG_VALUE undef, ....

Yes, that should be more correct. I'll look into such a solution tomorrow.

aprantl added a subscriber: vsk.Mar 1 2018, 11:25 AM

Thanks for the write-up! Do we have a bugreport for that issue?

bjope added a comment.Mar 1 2018, 11:32 AM
  • Dropping invalid DBG_VALUE instructions is not correct. What you want to do is insert a DBG_VALUE undef, ....

Yes, that should be more correct. I'll look into such a solution tomorrow.

Although, since the DBG_VALUE is misplaced from the start, then we do not know if it is correct to insert a DBG_VALUE undef, ..., !"x" in that position in the code. Perhaps it could be in a scope where "x" doesn't exist?

bjope added a comment.Mar 1 2018, 11:35 AM

Thanks for the write-up! Do we have a bugreport for that issue?

The historical debugging thing was for our out-of-tree-target. I never got to the point of publishing anything in Phabricator, maybe due to problems with getting any good reproducer for an in-tree-target. I need to do some digging here.

bjope added a comment.Mar 1 2018, 11:40 AM

Thanks for the write-up! Do we have a bugreport for that issue?

The historical debugging thing was for our out-of-tree-target. I never got to the point of publishing anything in Phabricator, maybe due to problems with getting any good reproducer for an in-tree-target. I need to do some digging here.

Maybe this one is related: https://bugs.llvm.org/show_bug.cgi?id=34477
I also found a note that this patch https://reviews.llvm.org/D38197 resolved some problems (maybe that is why I started prioritizing other tasks...).

bjope added a comment.Mar 2 2018, 9:17 AM

I'm having some problems with replacing the location for the DBG_VALUE that should be discarded by %noreg.
If I do that, then the UserValue created in LDVImpl::handleDebugValue will get a location that has the UndefLocNo.
However, after collectDebugValues we do computeIntervals, and in UserValue::computeIntervals there is a loop that discards all undefs. So there is really no point in adding them? The result will be the same as in my original patch.

But maybe it is a bug that LiveDebugVariables removes all DBG_VALUE undef, ...?

If I simply replace the DBG_VALUE by DBG_VALUE %noreg, ... and leave it in the code during register allocation, then it has an impact codegen (might be another bug in the register allocator?), but it is also different from having such DBG_VALUE instruction in the input to LiveDebugVariables (the remove before regalloc phase).

aprantl added a comment.EditedMar 2 2018, 9:37 AM

But maybe it is a bug that LiveDebugVariables removes all DBG_VALUE undef, ...?

I think it is.

If you compile the function at -O1 (to force the constant to be described with a dbg.value)

void foo() {
  // single basic block
  int x = 42;
  bar();
  x = 23;
  bar();
}

and replace the second dbg.value with a dbg.value(undef) in the LLVM IR and run it through llc, do we get the range of the constant 42 correct in the resulting location list for x? Does it end before the second call to bar()?

bjope added a comment.Mar 2 2018, 9:47 AM

But maybe it is a bug that LiveDebugVariables removes all DBG_VALUE undef, ...?

I think it is.

If you compile the function at -O1 (to force the constant to be described with a dbg.value)

void foo() {
  // single basic block
  int x = 42;
  bar();
  x = 23;
  bar();
}

and replace the second dbg.value with a dbg.value(undef) in the LLVM IR and run it through llc, do we get the range of the constant 42 correct in the resulting location list for x? Does it end before the second call to bar()?

Nope, if I do that the DBG_VALUE is removed by LiveDebugVariables. And we no longer get a .debug_loc range for "x". Instead we get

	.byte	3                       # Abbrev [3] 0x3f:0xc DW_TAG_variable
	.byte	42                      # DW_AT_const_value
	.long	.Linfo_string4          # DW_AT_name
	.byte	1                       # DW_AT_decl_file
	.byte	5                       # DW_AT_decl_line
	.long	76                      # DW_AT_type

So "x" will appear as being 42 throughout the function.

Congratulations, you found a bug! :-)
I think the right way forward would be to fix LiveDebugVariables to not drop undef DBG_VALUEs. This will be slightly more expensive in terms of compile time and size of the debug info (since we now need to use location lists for cases like the one you found), but it will be far more accurate. Let me know if you would be interested in working on that otherwise we can also just file a PR as a plan of record.

bjope added a comment.Mar 2 2018, 10:07 AM

Ok, but then I will update this patch with the one that adds and undef DBG_VALUE instead of just discarding it. And then things should work as expected when we solve that other bug.

I think you should file a PR for the other problem. I have some other things to look into, for example trying to hunt down why ISel is inserting these DBG_VALUEs outside of the live range for the virtual reg in the first place.

bjope updated this revision to Diff 136929.Mar 3 2018, 1:42 PM

Updated to replace the faulty DBG_VALUE by 'DBG_VALUE %noreg' instead of
discarding it completely (as suggested by Adrian).

Currently this has little impact, as computeIntervals will discard such
DBG_VALUE instructions anyway. But doing it like this will give the wanted
result when https://bugs.llvm.org/show_bug.cgi?id=36579 has been solved.

bjope added inline comments.Mar 3 2018, 3:11 PM
lib/CodeGen/LiveDebugVariables.cpp
762 ↗(On Diff #136929)

Oops. I did not intend to include this in the patch.

bjope updated this revision to Diff 136935.Mar 4 2018, 12:26 AM

Remove accidentally published debug printout in computeIntervals.

aprantl accepted this revision.Mar 5 2018, 12:13 PM
This revision is now accepted and ready to land.Mar 5 2018, 12:13 PM
This revision was automatically updated to reflect the committed changes.