[DebugInfo] Ignore DBG_VALUE instructions in PostRA Machine Sink
Needs ReviewPublic

Authored by mattd on Apr 13 2018, 2:28 PM.

Details

Summary

The logic for handling the sinking of COPY instructions was generating
different code when building with debug flags.

The original code did not take into consideration debug instructions. This
resulted in the registers in the DBG_VALUE instructions being treated as used,
and prevented the COPY from being sunk. This patch avoids analyzing debug
instructions when trying to sink COPY instructions.

This patch also creates a routine from the code in MachineSinking::SinkInstruction to
perform the logic of sinking an instruction along with its debug instructions.
This functionality is used in multiple places, including the code for sinking COPY instrs.

Diff Detail

mattd created this revision.Apr 13 2018, 2:28 PM
vsk added a subscriber: vsk.Apr 13 2018, 4:58 PM

Can this introduce a use-before-def scenario, where the DBG_VALUE for $eax occurs before the copy?

bjope added a subscriber: bjope.Apr 14 2018, 5:26 AM
In D45637#1067785, @vsk wrote:

Can this introduce a use-before-def scenario, where the DBG_VALUE for $eax occurs before the copy?

I think it is the same kind of problem with sinking as I'm trying to solve (for instcombine) here: https://reviews.llvm.org/D45425
I suppose a similar solution would be OK, even if we are beyond SSA form here. So we should sink any DBG_VALUE with a debug-use of the register being defined by the sunken COPY.

In D45637#1067785, @vsk wrote:

Can this introduce a use-before-def scenario, where the DBG_VALUE for $eax occurs before the copy?

I think it is the same kind of problem with sinking as I'm trying to solve (for instcombine) here: https://reviews.llvm.org/D45425
I suppose a similar solution would be OK, even if we are beyond SSA form here. So we should sink any DBG_VALUE with a debug-use of the register being defined by the sunken COPY.

I think you both are correct, thanks for the feedback!

mattd updated this revision to Diff 143663.Apr 23 2018, 3:19 PM
  • Create the performSink routine from code already used to sink an instruction and its associated debug information.
  • Use this routine to also sink debug information associated to COPY instructions.
  • Update the test to check that the DBG_VALUE instruction moves/sinks.
vsk added a comment.Apr 25 2018, 6:57 PM

This looks good to me. It'd be great to have a +1 from someone who's contributed to this pass.

mattd added a reviewer: jonpa.Wed, May 23, 5:19 PM
mattd removed a reviewer: jonpa.
mattd added a subscriber: jonpa.
mattd added a comment.Fri, Jun 8, 5:23 PM

I'm curious, is this issue on anyone else's radar? I know @jonpa was working on solving this, including a handful of other similar cases. If this patch isn't needed anymore then I'm happy to abandon it. If we do want it, then I am happy to rebase, since it's a bit stale now :)

vsk added a comment.Fri, Jun 8, 5:25 PM

I'd like to see this move forward. I'll be out of the office over the next week. I can try to help find another reviewer when I'm back.

bjope added a comment.Mon, Jun 11, 2:45 AM

This comment in the description does not make sense to me:

The original logic intentionally ignores all call instructions. Since debug intrinsics
are represented as calls, the pass was skipping that block and not
performing the sink.

The debug intrinsic is no longer a call after being lowered to DBG_VALUE. So it must have been something else that caused the debug invariant behavior, right?
I guess it was the call to TII->trackRegDefsUses (on trunk it is LiveRegUnits::accumulateUsedDefed), done for non COPY instructions, that would track register uses in debug instructions unless we avoid that somehow.

Could you update the description to make it less confusing?

lib/CodeGen/MachineSink.cpp
1127

Nowadays we have isDebugInstr() to cover all kinds of debug instructions. I suggest using that instead here, to make the check more general.

mattd updated this revision to Diff 150789.Mon, Jun 11, 10:46 AM
mattd edited the summary of this revision. (Show Details)

Thanks for the review @bjope!

My apologies, you are correct, the registers in the DBG_VALUE were getting
captured in LiveRegUnits::accumulateUsedDefed I fixed up the summary
and a few comments in the code/test as well.

  • Updated the patch to use isDebugInstr().
  • Cor rected some comments and renamed the test to reflect that the functionality is trying to avoid processing debug instructions.
mattd marked an inline comment as done.Mon, Jun 11, 10:46 AM
bjope added a comment.Tue, Jun 12, 6:07 AM

Basically looks good to me now, as I think that getting rid of the debug-invariant problem is important.
When it comes to the sinking of DBG_VALUE instructions I believe that will be as good as the pre-RA version of MachineSinking. I think that could be enough to land this (as getting rid of the debug-invariance probably is more important).

It is probably easy, at least by hand, to write some MIR-tests that show that the sinking of DBG_VALUE instruction could be more complicated in general, compared to what performSink currently can handle.
I'm thinking about something like this:

...
renamable $eax = COPY $edi
DBG_VALUE debug-use $eax, debug-use $noreg, !14, !DIExpression(), debug-location !16
$esi = ADD $edi, 7
DBG_VALUE debug-use $esi, debug-use $noreg, !14, !DIExpression(), debug-location !17
...

In the above code code the variable !14 is given a new value after the ADD. Afaict the performSink method (or actually collectDebugValues) simply decides to sink all DBG_VALUE instructions that is using $eax when sinking the COPY defining $eax. But if we sink the first DBG_VALUE past the second DBG_VALUE it will look as if the variable will get the old value when single-stepping past the sunken COPY. I'm not sure, but I don't think that it should be like that.

I think that performSink should look at other DBG_VALUE instructions to see if they refer to the same variable, and if so probably drop the DBG_VALUE (unless it can be salvaged in some way) instead of sinking it. Although I'm not sure how common code patterns like the one above are in reality. Maybe we can improve performSink in a later patch.

thegameg removed a reviewer: aprantl.
thegameg added a subscriber: thegameg.
mattd added a comment.Tue, Jun 12, 3:03 PM

@bjope , Thanks for your review and example! I agree that any changes to improve performSink/collectDebugValues should probably be in another patch.

I agree, after spending some time with your hypothetical case, I can imagine there are some other cases that might result in old/intermediate values in the user's variable, at least for a short period of time.

...
renamable $eax = COPY $edi
DBG_VALUE debug-use $eax, debug-use $noreg, !14, !DIExpression(), debug-location !16
$esi = ADD $edi, 7
DBG_VALUE debug-use $esi, debug-use $noreg, !14, !DIExpression(), debug-location !17
...

Excuse me if I am getting this wrong, the following is my interpretation of your example. Let's assume this example lives in bb0. In the example, sticking with x86_64, the ADD (or ADD32ri) instruction expects tied physical registers, something like $edi = ADD32ri $edi, 7 However, that case would prevent the COPY from sinking, I'm guessing due to a register def. If we were to make this $esi = ADD32ri $esi, 7, that would allow for the COPY to be sunk. Variable !14 would truly be $esi +7 at that program point in bb0, because the DBG_VALUE you introduced will associate $esi as the value for !14. The sunken copy, presumably in bb1, would be the value copied from $edi, and probably the value that a user would expect. Single stepping from the ADD to the sunken COPY would show !14 as being an intermediate or 'old' value until the sunken copy is reached. This doesn't seem wrong to me. It just could present unexpected values to the user, when inspecting !14 between the ADD and sunken COPY. Perhaps, if we were to delete just the DBG_VALUE associated to the ADD in bb0 , perhaps that would encourage a debugger to present 'optimized-out' in this case. I agree with you, any refinements to performSink or collectDebugValues should probably be for another patch. Thanks again for your review.

@bjope , Thanks for your review and example! I agree that any changes to improve performSink/collectDebugValues should probably be in another patch.

I agree, after spending some time with your hypothetical case, I can imagine there are some other cases that might result in old/intermediate values in the user's variable, at least for a short period of time.

...
renamable $eax = COPY $edi
DBG_VALUE debug-use $eax, debug-use $noreg, !14, !DIExpression(), debug-location !16
$esi = ADD $edi, 7
DBG_VALUE debug-use $esi, debug-use $noreg, !14, !DIExpression(), debug-location !17
...

Excuse me if I am getting this wrong, the following is my interpretation of your example. Let's assume this example lives in bb0. In the example, sticking with x86_64, the ADD (or ADD32ri) instruction expects tied physical registers, something like $edi = ADD32ri $edi, 7 However, that case would prevent the COPY from sinking, I'm guessing due to a register def. If we were to make this $esi = ADD32ri $esi, 7, that would allow for the COPY to be sunk. Variable !14 would truly be $esi +7 at that program point in bb0, because the DBG_VALUE you introduced will associate $esi as the value for !14. The sunken copy, presumably in bb1, would be the value copied from $edi, and probably the value that a user would expect. Single stepping from the ADD to the sunken COPY would show !14 as being an intermediate or 'old' value until the sunken copy is reached. This doesn't seem wrong to me. It just could present unexpected values to the user, when inspecting !14 between the ADD and sunken COPY. Perhaps, if we were to delete just the DBG_VALUE associated to the ADD in bb0 , perhaps that would encourage a debugger to present 'optimized-out' in this case. I agree with you, any refinements to performSink or collectDebugValues should probably be for another patch. Thanks again for your review.

I think you got it (even though I see the value from "debug location !16" as the old value, so by reordering the DBG_VALUE statements it will look like the variable first gets the value from "debug location !17" and then the value from "debug location !16", which could be the opposite from what is expressed in the source code that we debug).

Btw, the "ADD" was just an example. We can even skip the "ADD" and make the DBG_VALUE refer to a constant value:

  ...
bb.0:
  $edi = "load variabel !14"
  renamable $eax = COPY $edi
  DBG_VALUE debug-use $eax, debug-use $noreg, !14, !DIExpression(), debug-location !16
  <some code allowing the COPY to be sunk past this>
  DBG_VALUE i32 555, debug-use $noreg, !14, !DIExpression(), debug-location !17
  ...
bb.1:
  ....

let's say this origins from C code like this

int var14 = X;           // Some stack allocated variable
...
int tmp = Var14;     // location !16
...
Var14 = 555;           // location !17
if (<some cond>) {
  bar = tmp;
  <code block A>
  return;
}
...

My concern was more about what value var14 (that is !14) will appear as having in <code block A>. If we reorder the DBG_VALUE instructions, then !14 might appear as having the old value X in <code block A> (even if we have stepped past both location !16 and !17).

mattd added a comment.Wed, Jun 13, 9:37 AM

That's a great explanation, thank you!