This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstCombine] Preserve DI after combining zext instructions
ClosedPublic

Authored by gramanas on Jun 19 2018, 12:06 PM.

Details

Summary

When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction.

The current implementation works only for integer
non-vector variables, but this should get extendet to work for
every case.

Diff Detail

Event Timeline

gramanas created this revision.Jun 19 2018, 12:06 PM
gramanas edited the summary of this revision. (Show Details)Jun 19 2018, 12:07 PM
gramanas added inline comments.
lib/Transforms/InstCombine/InstCombineCasts.cpp
1102

This and the following were changed by clang-format, I hope there is no problem formating those, since my work is done nearby.

lebedev.ri added inline comments.
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

This does not actually check anything, there is no RUN line.

gramanas updated this revision to Diff 151964.Jun 19 2018, 12:20 PM

Add RUN: clause to the test

gramanas marked an inline comment as done.Jun 19 2018, 12:26 PM
lebedev.ri added inline comments.Jun 19 2018, 12:29 PM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
13–15 ↗(On Diff #151964)

This only check for the existence of three dbginfo's.
I wonder how much noise would the utils/update_test_checks.py output contain.

vsk added inline comments.
lib/Transforms/InstCombine/InstCombineCasts.cpp
268–269

There should be a helper function that cuts down on the boilerplate needed to insert fresh debug values.

Here's the usage I envision:

InsertReplacementDebugValues(CSrc, Res, &*std::next(CI.getIterator()), [](DbgInfoIntrinsic &OldDII) -> DIExpression * {
  return OldDII.getExpression();
});

@aprantl, @mattd and others: wdyt?

1097

Could you make the comment more specific? Example: "Preserve any debug values referring to the zext by ...".

1102

This does pose a problem. Please avoid re-formatting code unrelated to your patch. This makes diffs harder to read, and makes attribution harder when looking at git history. If you'd like, you can use the clang-format-diff git hook to only reformat that changes you're making.

A simple way to remove these changes from your patch is to upstage your changes, then use git add -p to re-stage the chunks you want to keep.

1104

+ @aprantl and @probinson for dwarf expertise.

There are two cases: a zext can either operate on integers or on vectors of integers. I think the fragment you're creating here is correct in the first case, but not the second. For the second case, I think you'd to emit multiple dbg.values with distinct fragments describing each component of the original vector.

1104

With the helper I suggested, the zext-of-integer case could be simplified to:

InsertReplacementDebugValues(Src, Res, &*std::next(CI.getIterator()), [&](DbgInfoIntrinsic &OldDII) -> DIExpression * {
  return DIExpression::createFragmentExpression(OldDII.getExpression(), SrcBitsKept, DestBitSize);
});
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

Right. When you add a run line, please reduce this test by invoking opt -debugify, instead of checking in unnecessary metadata lines.

gramanas updated this revision to Diff 152040.Jun 20 2018, 2:27 AM

Clean up code and test

gramanas marked an inline comment as done.Jun 20 2018, 2:29 AM
probinson added inline comments.Jun 20 2018, 7:17 AM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1097

Comment should end with a full-stop. Also it's should be its here.

1104

Emitting one fragment per element sounds plausible, but Adrian is really the fragment expert.

gramanas updated this revision to Diff 152155.Jun 20 2018, 1:59 PM

Make use of rL335144 utility.

Add non-vector test.

gramanas retitled this revision from [DebugInfo][InstCombine] Preserve DI after merging instructions to [WIP][DebugInfo][InstCombine] Preserve DI after merging zext instructions.Jun 20 2018, 2:11 PM
gramanas edited the summary of this revision. (Show Details)
gramanas updated this revision to Diff 152170.Jun 20 2018, 2:29 PM
gramanas edited the summary of this revision. (Show Details)

Update lamda to return nullptr if it fails to create the fragment.

gramanas updated this revision to Diff 152244.Jun 21 2018, 3:06 AM

Fragments don't seem to be needed here

gramanas updated this revision to Diff 152690.Jun 25 2018, 7:33 AM
  • Focus on the scalar test case
  • Use fragments since the width of the Src and Res instructions change
gramanas edited the summary of this revision. (Show Details)Jun 25 2018, 7:35 AM
aprantl added inline comments.Jun 25 2018, 8:14 AM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1102

If you are going to make a lot of changes to the same file, you can also do a separate commit that is just a clang-format of this file ahead of time.

vsk added inline comments.Jun 25 2018, 12:25 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1103

DIExpression doesn't need to be null-checked here because DbgValueInst::getExpression() is always nonnull.

1104

You mentioned offline that this causes a verifier failure. If we can't use fragments, consider using a simpler approach for scalars: using an empty expression. It won't work for vectors, but in the scalar case, the high bits should already be cleared. We can relax the check in https://reviews.llvm.org/D48408 to allow this.

1107

This unwraps an Optional without a check. If you're sure this can't fail, please add a comment explaining why.

gramanas added inline comments.Jun 26 2018, 6:07 AM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

If I use an empty DIExpression it doesn't change the fact that the dbg.value call points to a DILocalVariable that after instcombine has a different DIBasicType than what it used to.

Basiacly the dbg.value call says that the value was changed to an i32 but the it's type is still i8 in the debug info. If there is a way to explain this to the debugger using DIExpressions, I can't find it.

aprantl added inline comments.Jun 26 2018, 8:53 AM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

(Not sure if that is your question) We don't have any means to express that the SSA value used by a dbg.value instrinsic is larger (e.g., because of s/zext) than the size of the fragment being described.

For example, if the size of the variable is 8 bits and the value is 32 bits then implicitly the lower 8 bits are used. You could make this explicit by masking out everything but the lower out bits in the DIExpression (DW_OP_constu 0xff DW_OP_and) but this is redundant so we don't do it.

Why would you like to express this?

gramanas added inline comments.Jun 26 2018, 12:33 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

I thought the fact that a 32 bit variable has DIBasicType of size 8 is wrong. But modifying the DI would interfere with the way the front-end described the program at the first place, thus I was thinking of a way to describe it with a DIExpression. Since the correct value is being displayed implicitly, I guess it's not a problem.

aprantl added inline comments.Jun 26 2018, 12:57 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

When you say "32 bit variable" are you referring to the source variable or the SSA value that is used by the dbg.value intrinsic?

gramanas added inline comments.Jun 26 2018, 1:39 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

When running opt -debugify -instcombine to the scalar test below, the zext is replaced with a wider mul (i32 from i8) and the resulting dbg.value from this patch points to a DILocalVariable that has DIBasicType of size 8, while its an i32 value.

I think I mean the SSA value.

vsk added inline comments.Jun 26 2018, 1:50 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

It looks like it's OK for a integer dbg.value operand to be wider than the type of its associated DILocalVariable. Adrian's second-to-last comment states that the debugger will handle this by displaying the lower bits.

aprantl added inline comments.Jun 26 2018, 2:02 PM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1104

Correct. I should also point out that the following is entirely legal and actually gets emitted by LLVM on i386 for an 8-bit variable in the ah register:

dbg.value(!metadata i32 %value, !metadata "my8bit-variable", !metadata !DIExpression(DW_OP_shr 8)) // = %value >> 8

So we describe an 8-bit variable in bits 8-15 of a. 32-bit value.

gramanas updated this revision to Diff 153040.Jun 27 2018, 4:11 AM
  • Update the scalar case according to the comments
  • Postpone vector case for a later patch
gramanas marked 4 inline comments as done.Jun 27 2018, 4:12 AM
gramanas retitled this revision from [WIP][DebugInfo][InstCombine] Preserve DI after merging zext instructions to [[DebugInfo][InstCombine] Preserve DI after combining zext instructions.Jun 27 2018, 4:34 AM
gramanas edited the summary of this revision. (Show Details)
gramanas retitled this revision from [[DebugInfo][InstCombine] Preserve DI after combining zext instructions to [DebugInfo][InstCombine] Preserve DI after combining zext instructions.
gramanas edited the summary of this revision. (Show Details)
vsk added inline comments.Jun 27 2018, 8:08 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
17 ↗(On Diff #153040)

Please move the check line closer to the associated IR. You can pick whether to add the checks after the associated IR lines or before. So:

; CHECK-LABEL: define i32 @test-scalar
define i32 @test-scalar(i32 %x, i32 %y) {

etc.

18 ↗(On Diff #153040)

This just tests that there are some dbg.values. Please make the test more specific by checking that the operand of the dbg.value is correct.

bjope added a subscriber: bjope.Jun 27 2018, 10:02 AM
bjope added inline comments.
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

Is that really a good idea?

  1. It is really hard to review this without knowing what the input looks like.
  2. As this will be a regression test, isn't it better to have the input fixated. Who know what -debugify will generate for this input in the future.
  3. There might be lots of redundant debug info generated by -debugify, making it hard to understand the test case if it fails in the future (specially without knowing if -debugify generated something differently originally).
vsk added inline comments.Jun 27 2018, 10:56 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

The basic requirement of -debugify output is that it must be stable enough to use in regression tests. Just to recap, a dbg.value is inserted after each non-void-valued instruction if the insertion point is valid, and the variable names are sequential integers. Occasionally -debugify requires bug fixes, but these changes aren't allowed to break existing tests or to change their meaning. I hope that addresses your first and second concerns? It does take some extra effort initially to understand this style of test, but I think that effort pays off.

Concretely, the advantage of using -debugify in regression tests is that it lets us test an optimization in the exact same place as the optimization's ability to preserve debug info. This makes it easier to keep debug info tests in sync with the behavior of an optimization. And it makes it much easier to add tests by lessening the need for manual reduction, which is a barrier to contribution.

To your third point, you're right that -debugify inserts more debug info than strictly needed, but we can avoid confusion here by using strict CHECK lines.

bjope added inline comments.Jun 27 2018, 11:16 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

So the size of the variable always matches the value exactly?
And it will never happen that -debugify will simulate that SROA has splitted an aggregate into several smaller pieces?
And we will never let -debugify generate DIExpressions?

Anyway, my main concern is that it will be really hard to review patches like this one. without knowing what the input looks like. But maybe that is obvious after playing around with debugify for some hours.

bjope added inline comments.Jun 27 2018, 11:26 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

Concretely, the advantage of using -debugify in regression tests is that it lets us test an optimization in the exact same place as the optimization's ability to preserve debug info.

I guess there is a test case for the optimization somewhere, is the idea that this test shouuld be added as an extra RUN line in that test case instead of adding a new file?

vsk added inline comments.Jun 27 2018, 11:44 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

So the size of the variable always matches the value exactly?

Yes, specifically, DataLayout::getTypeAllocSizeInBits is used for sized types.

And it will never happen that -debugify will simulate that SROA has splitted an aggregate into several smaller pieces?
And we will never let -debugify generate DIExpressions?

I view these specific changes as a bit out of scope for -debugify, but regardless, it's possible to write regression tests which don't depend on these details.

Anyway, my main concern is that it will be really hard to review patches like this one. without knowing what the input looks like. But maybe that is obvious after playing around with debugify for some hours.

I'm.. growing increasingly concerned about this as well. I'd like our tests to be reasonably simple to understand. My view is that these tests already require understanding of at least one pass (in this case -instcombine), so it's reasonable to depend on a stable testing pass on top of that. That could well be too optimistic. How much extra effort is involved in parsing this test? Would more documentation help? (Anyone else have thoughts about this?)

Concretely, the advantage of using -debugify in regression tests is that it lets us test an optimization in the exact same place as the optimization's ability to preserve debug info.

I guess there is a test case for the optimization somewhere, is the idea that this test shouuld be added as an extra RUN line in that test case instead of adding a new file?

Ideally yes. If you grep in tests/Transforms you'll see this pattern used to test gvn, *dce, sccp, licm etc. I would like this patch to adopt the same approach, similar to https://reviews.llvm.org/D48640.

bjope added inline comments.Jun 27 2018, 11:56 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

Ok, I think I got it. Thanks to @vsk for the explanations! And sorry @gramanas for having this discussion as part of reviewing your patch.

gramanas added inline comments.Jun 28 2018, 5:00 AM
test/Transforms/InstCombine/debuginfo-trunc-and-zext.ll
1 ↗(On Diff #151962)

It's not a problem at all. Thanks for your comments, any feedback is welcomed.

I would like this patch to adopt the same approach, similar to https://reviews.llvm.org/D48640.

Indeed the test should get incorporated with already existing tests. I will work on that.

Anyway, my main concern is that it will be really hard to review patches like this one. without knowing what the input looks like. But maybe that is obvious after playing around with debugify for some hours.

I'm.. growing increasingly concerned about this as well. I'd like our tests to be reasonably simple to understand. My view is that these tests already require understanding of at least one pass (in this case -instcombine), so it's reasonable to depend on a stable testing pass on top of that. That could well be too optimistic. How much extra effort is involved in parsing this test? Would more documentation help? (Anyone else have thoughts about this?)

I think the -debugify pass is pretty simple to understand, but some documentation would definitely help future devs who come across this kind of tests, or would like to add similar tests to other passes. Since getting the -debugify output is just an opt invocation away, I think it's better to write tests like this instead of incorporating the debug info in the .ll file which results in clutter.

gramanas updated this revision to Diff 153303.Jun 28 2018, 6:03 AM

Incorporate the -debugify test into the existing test files.

gramanas added inline comments.Jun 28 2018, 6:05 AM
test/Transforms/InstCombine/cast-mul-select.ll
3

Should I add a note saying that the DBGINFO checks are not autogenerated?

bjope added a comment.Jun 28 2018, 8:20 AM

Incorporate the -debugify test into the existing test files.

Nice! I was a little bit confused the other day, but right now I got a very good feeling about this.

vsk added inline comments.Jun 29 2018, 10:05 AM
lib/Transforms/InstCombine/InstCombineCasts.cpp
1084

Nit -- please capitalize this and use the same line wrapping as the surrounding code.

test/Transforms/InstCombine/cast-mul-select.ll
3

No, but please remove the note about the checks being autogenerated, because with this change they no longer are.

20

We don't gain much by capturing DILocVar{C,D} here. It's fine to not do this, because we don't need to check what's inside the variables.

gramanas marked 3 inline comments as done.Jun 30 2018, 5:45 AM
gramanas updated this revision to Diff 153627.Jun 30 2018, 5:45 AM

Addressing inline comments.

vsk accepted this revision.Jul 2 2018, 10:23 AM

LGTM, thanks! We can simplify this a bit with D48676 but it doesn't need to be a blocker.

This revision is now accepted and ready to land.Jul 2 2018, 10:23 AM
This revision was automatically updated to reflect the committed changes.