This is an archive of the discontinued LLVM Phabricator instance.

Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410)
ClosedPublic

Authored by uweigand on Mar 9 2018, 9:54 AM.

Details

Summary

PR 36410 describes a problem where as a result of hoisting common code from "then" and "else" branches of a condition to before the "if", we get a llvm.dbg.value intrinsic with an invalid location. For those intrinsics, the scope indicated in the !dbg location must be identical with the scope of the variable tracked by the intrinsics, or else the module verifier will abort. This constraint can be violated by the current algorithm of getMergedLocation.

Following discussion in https://reviews.llvm.org/D43687, this patch attempts to implement a different approach to fixing this problem. Instead of attempting to merge the locations of two debug intrinsics, we now simply always keep all debug intrinsics from both sides of the "if" during HoistThenElseCodeToIf. This has the additional benefit that we no longer throw away still valid information to help generate better debug data.

Diff Detail

Repository
rL LLVM

Event Timeline

uweigand created this revision.Mar 9 2018, 9:54 AM
vsk added a comment.Mar 9 2018, 11:24 AM

This looks like a nice improvement. I think the test could be tighter/simpler though, e.g:

; RUN: opt -debugify -simplifycfg -S < %s | FileCheck %s

define i64 @caller(i64* %ptr, i64 %flag) {
init:
  %v9 = icmp eq i64 %flag, 0
  br i1 %v9, label %a, label %b

; CHECK:  %vala = load i64, i64* %ptr
; CHECK-NEXT:  call void @llvm.dbg.value(metadata i64 %vala
; CHECK-NEXT:  %valbmasked = and i64 %vala, 1
; CHECK-NEXT:  call void @llvm.dbg.value(metadata i64 %vala

a:                                              ; preds = %init
  %vala = load i64, i64* %ptr, align 8
  br label %test.exit

b:                                              ; preds = %init
  %valb = load i64, i64* %ptr, align 8
  %valbmasked = and i64 %valb, 1
  br label %test.exit

test.exit:                                      ; preds = %a, %b
  %retv = phi i64 [ %vala, %a ], [ %valbmasked, %b ]
  ret i64 %retv
}
uweigand updated this revision to Diff 137819.Mar 9 2018, 12:54 PM

Updated test case as requested.

vsk accepted this revision.Mar 9 2018, 12:57 PM

Thanks, lgtm. I tested this out with a stage2 build locally without issue.

This revision is now accepted and ready to land.Mar 9 2018, 12:57 PM
efriedma added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
1261 ↗(On Diff #137819)

If I'm following this correctly, you're hoisting all dbg.value intrinsics from both sides to the predecessor. So given code like the following:

int foo(bool b) {
  int a;
  if (b) {
    a = 10;
    bar();
  } else {
    a = 20;
    baz();
  }
  return a;
}

You'll end up with two dbg.value intrinsics in the entry block for the variable "a", so the debugger will choose randomly (?) whether to print a=10 or a=20. That seems wrong.

This revision was automatically updated to reflect the committed changes.
uweigand reopened this revision.Mar 9 2018, 2:03 PM

Sorry, I didn't see your comment before checking this in. I believe you're right, and this can cause problems --- I've now reverted and will have another look.

This revision is now accepted and ready to land.Mar 9 2018, 2:03 PM

I think you get approximately the right behavior if you just unconditionally ignore the debug info intrinsics (and never hoist them). Of course, the debug info gets dropped if the terminator is hoisted, but that seems okay; it's hard to handle that well in general anyway.

vsk added a comment.Mar 9 2018, 2:48 PM

In order for both dbg.values for "a" to be hoisted in Eli's example, wouldn't the stores into "a" need to be identical?

In order for both dbg.values for "a" to be hoisted in Eli's example, wouldn't the stores into "a" need to be identical?

After mem2reg runs, there is no store instruction, just a PHI in the return block.

I think you get approximately the right behavior if you just unconditionally ignore the debug info intrinsics (and never hoist them). Of course, the debug info gets dropped if the terminator is hoisted, but that seems okay; it's hard to handle that well in general anyway.

This is of course an option, but it would break an existing test case (test/Transforms/SimplifyCFG/hoist-dbgvalue.ll) that was added by dpatel in 2011 to specifically verify that the following case is handled: the terminator is hoisted, but both "then" and "else" branch contain a dbg.value statement declaring the same value for the same variable -- in this case, the test case wants to verify that we still retain a dbg.value for that value in that variable.

The problem is that the code that does this hoisting of dbg.value statements for the same value/variable can still generate invalid results due to a mismatching !dbg location metadata field, that's the root cause of PR 36410. So far, a number of different potential solutions have been proposed:

  • make sure the merged !dbg location matches the debug value -- this is https://reviews.llvm.org/D43687, but that hasn't been approved yet since it may cause creation of "strange" locations (e.g. using inlined-to fields pointing to places where there is no inlining in the source code)
  • do not merge debug values, but hoist them both -- this is the current Phabricator, which has the problem you pointed out
  • do not hoist debug values at all -- your latest suggestion, but this would break the existing test case

I'm a bit unsure now what the best way to proceed is at this point. My main goal right now is simply to fix the internal compiler errors in PR 36410 ...

Would it be an option to only hoist debug info instructions where both variable and value match (just like the existing code), but then not attempt to merge the !dbg locations, but instead just hoist both copies of the original instructions with both the original locations (like in this patch)?

uweigand updated this revision to Diff 137996.Mar 12 2018, 6:53 AM

Would it be an option to only hoist debug info instructions where both variable and value match (just like the existing code), but then not attempt to merge the !dbg locations, but instead just hoist both copies of the original instructions with both the original locations (like in this patch)?

Updated the patch to implement the above option, just to have a better basis for discussion ...

uweigand requested review of this revision.Mar 12 2018, 6:53 AM
aprantl added inline comments.Mar 12 2018, 9:11 AM
lib/Transforms/Utils/SimplifyCFG.cpp
1295 ↗(On Diff #137996)

Maybe formulate this even stronger:
"The debug location is an integral part of a debug info intrinsic and can't be separated from it or replaced."

uweigand updated this revision to Diff 138378.Mar 14 2018, 8:50 AM

Updated comment as requested.

uweigand marked an inline comment as done.Mar 14 2018, 8:51 AM
aprantl accepted this revision.Mar 14 2018, 8:59 AM
This revision is now accepted and ready to land.Mar 14 2018, 8:59 AM
This revision was automatically updated to reflect the committed changes.