This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Fix bug in LiveDebugVariables.
ClosedPublic

Authored by HsiangKai on Aug 12 2018, 11:40 PM.

Details

Summary

In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query SlotIndex for following debug instructions. Only handle DBG_VALUE in handleDebugValue().

Diff Detail

Repository
rL LLVM

Event Timeline

HsiangKai created this revision.Aug 12 2018, 11:40 PM

Do you have a testcase for this? You should be able to construct an artificial MIR testcase.

lib/CodeGen/LiveDebugVariables.cpp
590 ↗(On Diff #160291)

why not use a skipDebug... mechanism here, too?

HsiangKai updated this revision to Diff 161158.Aug 16 2018, 8:25 PM

Update according to comments.
Add test case.

HsiangKai marked an inline comment as done.Aug 16 2018, 8:27 PM
aprantl added inline comments.Aug 17 2018, 8:34 AM
test/DebugInfo/Generic/debug-var-slot.ll
2 ↗(On Diff #161158)

Please add a comment that explains what is being tested here.

Also, it would probably be better to run llc with -stop-after-livedebugvariables or something similar and check the MIR output, that will make it less likely that the test bitrots.

HsiangKai updated this revision to Diff 161414.Aug 19 2018, 8:04 PM

Update test case.

HsiangKai marked an inline comment as done.Aug 19 2018, 8:06 PM
bjope added a subscriber: bjope.Aug 20 2018, 12:29 AM
bjope added inline comments.
lib/CodeGen/LiveDebugVariables.cpp
601 ↗(On Diff #161414)

Is the DBG_LABEL always placed before a set of DBG_VALUE instructions, or can they be interleaved?

Earlier this loop was needed since we did not skip debug instructions when searching backwards for the SlotIndex. With the change above I guess the result would be the same even without the loop. If we for example can have

DBG_VALUE
DBG_VALUE
DBG_LABEL
DBG_VALUE
DBG_VALUE

nowadays, then I think that we

  1. either should not have a loop here
  1. or we should skip DBG_LABEL instructions in the inner loop
  1. or we should add some code comments that describes why we need this nested loop (is it still needed as an optimization to avoid doing the SlotIndex lookup multiple times for consecutive DGB_VALUE instructions?)
  1. alternative (2) plus, the condition on line 581 should be changed from checking isDebugValue() to checking isDebugInstr(). Then the new code that you have added can be removed, and instead the inner loop should call handleDebugValue only for DBG_VALUE instructions, while simply skipping past DBG_LABEL instructions. That way we let the inner loop handle all kinds of consecutive debug instructions that share the same SlotIndex.

I put my vote on option (4) as I think it is simpler to understand, we do not need backward skipping when having interleaved DBG_LABEL/DBG_VALUE, and we do not need to benchmark to understand if the inner loop is needed just as an optimization or not.

HsiangKai updated this revision to Diff 161640.Aug 20 2018, 8:57 PM
HsiangKai edited the summary of this revision. (Show Details)
HsiangKai marked an inline comment as done.Aug 20 2018, 9:02 PM
HsiangKai added inline comments.
lib/CodeGen/LiveDebugVariables.cpp
601 ↗(On Diff #161414)

I agree with you. I modified my patch according to your suggestions. It is more clear and easy to understand. I add more comments in the function to show the intension.

HsiangKai marked an inline comment as done.Aug 20 2018, 9:03 PM
bjope added inline comments.Aug 20 2018, 11:04 PM
lib/CodeGen/LiveDebugVariables.cpp
597 ↗(On Diff #161640)

I think this should be (MBBI->isDebugValue() && handleDebugValue(*MBBI, Idx)), or we need to explicitly check for isDebugValue() at the beginning of handleDebugValue. I don't think we want to see the "Can't handle" debug printouts from handleDebugValue due to calling the function with non DBG_VALUE instructions.

HsiangKai marked an inline comment as done.
bjope accepted this revision.Aug 21 2018, 1:03 AM

LGTM!

Just a small nit-pick in the summary text: s/hanlde/handle/

This revision is now accepted and ready to land.Aug 21 2018, 1:03 AM
HsiangKai edited the summary of this revision. (Show Details)Aug 21 2018, 1:27 AM
This revision was automatically updated to reflect the committed changes.
chandlerc added inline comments.
llvm/trunk/test/DebugInfo/Generic/debug-var-slot.ll
8 ↗(On Diff #162128)

You cannot use -debug in test cases reliably.

Adrian actually specifically suggested a way of testing this, please follow up there before submitting (and for the reviewer, please make sure that other commenters on a patch are OK with proceeding).

I'm going to revert for now as it is breaking bots and any user not building with asserts.

bjope added inline comments.Aug 23 2018, 10:04 AM
llvm/trunk/test/DebugInfo/Generic/debug-var-slot.ll
8 ↗(On Diff #162128)

Sorry @chandlerc . I actually read Adrians comments, but must have missed that last part about doing the CHECKs on the MIR-output (the other requests from Adrian had been fixed and the comment was marked as "Done").

@HsiangKai , if I write a test case like this it asserts without your patch, so maybe this is a more narrow test case (I haven't tried it with your patch so please look at the CHECK lines to see if this matches your problem):

# RUN: llc -run-pass=livedebugvars -o - %s | FileCheck %s

--- |
  define i32 @foo(i32 %a, i32 %b) !dbg !4 {
  entry:
    ret i32 0, !dbg !12
  }

  !llvm.dbg.cu = !{!0}
  !llvm.module.flags = !{!3}

  !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
  !1 = !DIFile(filename: "debug-var-slot.c", directory: "./")
  !2 = !{}
  !3 = !{i32 2, !"Debug Info Version", i32 3}
  !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true, scopeLine: 2, isOptimized: true, unit: !0, retainedNodes: !2)
  !5 = !DISubroutineType(types: !6)
  !6 = !{!7, !7, !7}
  !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
  !8 = !DILabel(scope: !4, name: "top", file: !1, line: 4)
  !9 = !DILocation(line: 4, column: 1, scope: !4)
  !10 = !DILocalVariable(name: "local_var", scope: !4, file: !1, line: 7, type: !7)
  !11 = !DILocation(line: 7, column: 1, scope: !4)
  !12 = !DILocation(line: 8, column: 3, scope: !4)
  !13 = !DILocalVariable(name: "local_var2", scope: !4, file: !1, line: 7, type: !7)

...
---
name:            foo
tracksRegLiveness: true
body:             |
  bb.0:

    DBG_VALUE debug-use $esi, debug-use $noreg, !13, !DIExpression(), debug-location !11
    DBG_LABEL !8, debug-location !9
    DBG_VALUE debug-use $edi, debug-use $noreg, !10, !DIExpression(), debug-location !11
    RET 0, undef $eax, debug-location !12
...

# CHECK-LABEL: name:            foo
# CHECK: bb.0:
# CHECK-DAG: DBG_LABEL
# CHECK-DAG: DBG_VALUE debug-use $esi
# CHECK-DAG: DBG_VALUE debug-use $edi
HsiangKai reopened this revision.Aug 28 2018, 12:08 AM

Test case needs to be reviewed.

This revision is now accepted and ready to land.Aug 28 2018, 12:08 AM
bjope requested changes to this revision.Aug 28 2018, 12:15 AM

Revoking the ready-to-land since the test case needs to be updated.

This revision now requires changes to proceed.Aug 28 2018, 12:15 AM

Thanks @bjope for providing a good test case for the patch.
It will trigger another assertion due to no emitDebugValues() after livedebugvars.
I found it depends on VirtRegRewriter. So, I will change -run-pass to -stop-after.
Besides this, the test case works after applying the patch.

Update the test case.
@bjope and @aprantl, please help me to review the patch. Thanks a lot.

bjope added a comment.Aug 28 2018, 8:25 AM

Thanks @bjope for providing a good test case for the patch.
It will trigger another assertion due to no emitDebugValues() after livedebugvars.
I found it depends on VirtRegRewriter. So, I will change -run-pass to -stop-after.
Besides this, the test case works after applying the patch.

Not sure I understand the above. My suggestion was to only run the livedebugvars pass (making the test case specific to that pass).
Now you changed the -run-pass to -stop-after, but in which pass does it start then? Don't you need a start-before/-start-after as well to indicate where to start?
AFAIK "-run-pass=livedebugvars" is the same as "-start-before=livedebugvars -stop-after=livedebugvars", so we do not run VirtRegRewriter when using a single -run-pass like that.

Note that a problem with running these register allocation passes standalone is that handwritten input could be abnormal. Lot's of DBG_VALUE instructions are normally removed before we get to livedebugvars, and then they are added back to the IR by VirtRegRewriter. And we do not include information about the cached DBG_VALUE instructions in the MIR-format.

So the test case I provided does not necessarily imitate a normal situation when we get to livedebugvars (I just made up something to trigger the problem). Perhaps DBG_VALUE instructions referring to physical registers are among the cached DBG_VALUE instructions that will be reinserted by virtregrewriter. Do you know what the DBG_VALUE instructions looked like in your original TR? Was it perhaps DBG_VALUE instructions where the first operand was a constant value, I think such DBG_VALUE instructions stay in the IR during register allocation.

I don't know if it really is important how we model the DBG_VALUE instructions here, as long as they expose the problem. It is of course better to have the test case as realistic as possible.

Thanks @bjope for providing a good test case for the patch.
It will trigger another assertion due to no emitDebugValues() after livedebugvars.
I found it depends on VirtRegRewriter. So, I will change -run-pass to -stop-after.
Besides this, the test case works after applying the patch.

Not sure I understand the above. My suggestion was to only run the livedebugvars pass (making the test case specific to that pass).
Now you changed the -run-pass to -stop-after, but in which pass does it start then? Don't you need a start-before/-start-after as well to indicate where to start?
AFAIK "-run-pass=livedebugvars" is the same as "-start-before=livedebugvars -stop-after=livedebugvars", so we do not run VirtRegRewriter when using a single -run-pass like that.

Note that a problem with running these register allocation passes standalone is that handwritten input could be abnormal. Lot's of DBG_VALUE instructions are normally removed before we get to livedebugvars, and then they are added back to the IR by VirtRegRewriter. And we do not include information about the cached DBG_VALUE instructions in the MIR-format.

So the test case I provided does not necessarily imitate a normal situation when we get to livedebugvars (I just made up something to trigger the problem). Perhaps DBG_VALUE instructions referring to physical registers are among the cached DBG_VALUE instructions that will be reinserted by virtregrewriter. Do you know what the DBG_VALUE instructions looked like in your original TR? Was it perhaps DBG_VALUE instructions where the first operand was a constant value, I think such DBG_VALUE instructions stay in the IR during register allocation.

I don't know if it really is important how we model the DBG_VALUE instructions here, as long as they expose the problem. It is of course better to have the test case as realistic as possible.

If there is only -run-pass=livedebugvars in the llc command, it will trigger following assertion.

lib/CodeGen/LiveDebugVariables.cpp:391: void {anonymous}::LDVImpl::clear(): Assertion `(!ModifiedMF || EmitDone) && "Dbg values are not emitted in LDV"' failed.

After tracing the source code, I found that if the MF is modified in livedebugvars, it will check EmitDone to ensure the modification is emitted. In addition, virtregrewriter is the only one pass to emit these modifications. So, if the MF is modified in livedebugvars, it must run virtregrewriter pass.

I think we could add -run-pass=virtregrewriter in the llc command. It will bring three additional passes after livedebugvars pass as follow.

Live Stack Slot Analysis
Virtual Register Map
Virtual Register Rewriter

What do you think?

HsiangKai updated this revision to Diff 163077.Aug 29 2018, 7:00 AM

Add -run-pass=virtregrewriter to the test case.

Add -run-pass=virtregrewriter to the test case.

Thanks for the earlier explanation about the assert. I was probably saved by the !ModifiedMF clause in the assert when experimenting with -run-pass=livedebugvars.

I'm fine with this solution about also running virtregrewriter.
@aprantl, are you OK with this updated test case?

aprantl accepted this revision.Aug 29 2018, 1:02 PM
This revision was not accepted when it landed; it landed in state Needs Review.Sep 2 2018, 8:58 AM
This revision was automatically updated to reflect the committed changes.

@bjope: After I committed the patch to LLVM repo, it still failed in clang-ppc64be-linux-lnt buildbot.

I guess the test case should contain target triple to be filtered out by llvm-lit in other architectures.
Is it correct to add "target triple = 'x86_64-unknown-linux-gnu' " in the test case?

bjope added a comment.Sep 4 2018, 1:28 AM

@bjope: After I committed the patch to LLVM repo, it still failed in clang-ppc64be-linux-lnt buildbot.

I guess the test case should contain target triple to be filtered out by llvm-lit in other architectures.
Is it correct to add "target triple = 'x86_64-unknown-linux-gnu' " in the test case?

That buildbot is probably building the compiler with support for both ppc and x86, but with ppc as default target.
So the check in test/DebugInfo/X86/lit.local.cfg that X86 is a configured target will pass, i.e. the test case will be executed even when the default target isn't X86.

Without explicitly specifying the triple (either by a "target triple = ..." statement, or by using -mtriple in the RUN-line) the test case will be executed using the default target.

In this case it might be slightly easier to add -mtriple=x86_64-unknown-linux-gnu to the RUN-line. Adding target triple = 'x86_64-unknown-linux-gnu should work as well, but then you need to put in inside the module definition (after line 10).

@bjope: After I committed the patch to LLVM repo, it still failed in clang-ppc64be-linux-lnt buildbot.

I guess the test case should contain target triple to be filtered out by llvm-lit in other architectures.
Is it correct to add "target triple = 'x86_64-unknown-linux-gnu' " in the test case?

That buildbot is probably building the compiler with support for both ppc and x86, but with ppc as default target.
So the check in test/DebugInfo/X86/lit.local.cfg that X86 is a configured target will pass, i.e. the test case will be executed even when the default target isn't X86.

Without explicitly specifying the triple (either by a "target triple = ..." statement, or by using -mtriple in the RUN-line) the test case will be executed using the default target.

In this case it might be slightly easier to add -mtriple=x86_64-unknown-linux-gnu to the RUN-line. Adding target triple = 'x86_64-unknown-linux-gnu should work as well, but then you need to put in inside the module definition (after line 10).

I got it. Thanks a lot. :)