Page MenuHomePhabricator

[DebugInfo] PR40010: Avoid register coalesing altering DBG_VALUE valuations
Needs ReviewPublic

Authored by jmorse on Dec 29 2018, 10:12 AM.



This is a fix for PR40010 [0]. DBG_VALUE instructions do not contribute to the liveness of virtual registers, making them invisible to the register coaleser, which may merge new register valuations over the top of DBG_VALUEs. When this happens, new valuations can appear to debuggers that were never in the source program, which is misleading.

To avoid this, examine the debug users of the register being eliminated, and pick out any that:

  1. Lie in a location where the register isn't live, and
  2. Where the destination register _is_ live at that location, and has a different def.

Which means the DBG_VALUE definitely gets resurrected, with a different value. This patch can leave DBG_VALUEs of non-live registers so long as the register isn't written to.

Note this also marks use-before-any-def's as undef, if they're coalesced to a different register, as it's unclear to me how to establish which def a DBG_VALUE is supposed to refer to in that case.


Diff Detail

Event Timeline

jmorse created this revision.Dec 29 2018, 10:12 AM
jmorse planned changes to this revision.Dec 29 2018, 12:08 PM

Hmmm. In retrospect, aiming to cover all cases where there's a def of the dest-reg in between the last src-def and the DBG_VALUE means considering control flow, which I haven't done.

It's probably better for now to only account for circumstances where reg-coalescing makes the DBG_VALUE live again, and thus we can be 100% confident it does the wrong thing. That more closely matches the test case uploaded.

jmorse updated this revision to Diff 179705.Dec 29 2018, 12:52 PM
jmorse edited the summary of this revision. (Show Details)

Target only DBG_VALUEs that are definitely resurrected with a different def by register coalescing.

vsk added a comment.Dec 31 2018, 2:32 PM

Thanks for working on this.

As you pointed out in, we do have a general problem of leaving around DBG_VALUE instructions which should be dead/undef but aren't. Do you think it would be possible/worthwhile to diagnose this issue in MachineVerifier?


Using MachineRegisterInfo::reg_instructions might aid readability here. Any reason not to?


Could you explain why it's sufficient to fix up coalescer pairs in which the destination is a vreg?

jmorse added a comment.Jan 3 2019, 3:28 AM

Thanks for the review,

In D56151#1343002, @vsk wrote:

As you pointed out in, we do have a general problem of leaving around DBG_VALUE instructions which should be dead/undef but aren't. Do you think it would be possible/worthwhile to diagnose this issue in MachineVerifier?

IMHO that's a definite yes, eventually. It's non-trivial for a human to see where liveness starts and ends for DBG_VALUEs, but a significant liability for optimisations as they can't "see" the DBG_VALUEs they're messing with. A truly dead DBG_VALUE will be dropped anyway, because it has no meaning, so it's not a useful feature to keep.

I'm not familiar with the machine verifier, but it appears to check liveness for things like kill flags, and just ignores debug users, so enabling checking of DBG_VALUEs should be easy to implement.

(I'm preparing a patch that recovers dead DBG_VALUEs from still-live copies, which would have to change if we made dead DBG_VALUEs illegal, but that's a different matter).


No reason (lack of familiarity), I'll switch to that,


Hmmm. I've been operating on the assumption that the coalescer only operates on vregs, but I now see that isn't necessarily the case... I'll investigate this one.

jmorse planned changes to this revision.Jan 3 2019, 8:08 AM

On the topic of coalescing physical registers, it appears the liveness test is well defined in joinReservedPhysReg. However while absorbing all of this, I've realised that dead DBG_VALUEs of both src and dst registers can be resurrected when their intervals are joined. I'll attempt to generalise further...

jmorse updated this revision to Diff 180216.Jan 4 2019, 3:30 AM

Here's a further revision. Two significant changes:

  • Because DBG_VALUEs of either the source or destination register can be resurrected by coalescing, check both,
  • There's now no attempt to check whether the two registers refer to the same value number.

Examining physical registers is only performed for the destination regnum as the CoalescerPair class cannonicalises to make any physreg the destination.

LiveIntervals isn't as strong an analysis as I'd thought, and it looks like my attempt in the previous revision to find the value the DBG_VALUE referred to before it was killed, will break in the presence of any control flow. The patch as it is will now make DBG_VALUEs undef even if the coalescing by coincidence resurrects it to the same value. IMHO this is a reasonable trade-off when we can't prove the resurrection is correct.

On a stage2 build of llvm/clang r349779 a trivial fraction of variable locations are lost by this (less than 0.01%).

It'd be great to revert this patch if we reach a state where dead DBG_VALUEs can't make it as far as simple-register-coalescing in the future.

aprantl added inline comments.Jan 4 2019, 7:29 AM

valuation is not a term we typically use in this context? Perhaps should the operands of the DBG_VALUE be updated?

jmorse updated this revision to Diff 181530.Jan 14 2019, 4:29 AM

Revise comment for mergeChangesDbgValue

jmorse marked an inline comment as done.Jan 14 2019, 4:32 AM
jmorse added inline comments.

Updated with some new text -- I've gone for "would the def that a DBG_VALUE refers to change?", which should be precise about what the function tests for.

bjope added inline comments.Jan 14 2019, 12:10 PM

I think you can remove all the false initializations here (keep tracksRegLiveness: true).



I think the registers: section can be removed (these mapping are given by the MIR below, right?
And liveins, fixedStack, stack, constants can also be removed here, right?



Maybe easier to read if you put all CHECK:s for the basic block here (consecutive lines)?
Or all checks for the function just before/after the function (might need to use # CHECK: instead of ; CHECK: depending on context.

(I've actually never used ; CHECK: within the body like this so I did not know that it works like that.)

I would probably have used something like this:

# <description of what the test is supposed to verify>
# CHECK-LABEL: name:            test1
# CHECK: bb.1:
# CHECK:   %7:gr32 = ADD32rr %7
# CHECK:   DBG_VALUE $noreg
# CHECK: bb.2:

Where the CHECK-LABEL is supposed to make sure that each subtest is self-contained (do not match with anything from another subtest). This also makes it possible to skip some of the IR (since you do not need the symbolic names for the basic blocks.

jmorse updated this revision to Diff 181795.Jan 15 2019, 8:40 AM

Trim and revise MIR test as per Björn's comments.

I've put the block of CHECKs together, which looks a lot better now.

aprantl added inline comments.Jan 15 2019, 8:52 AM


jmorse updated this revision to Diff 181804.Jan 15 2019, 9:17 AM
jmorse marked an inline comment as done.

clang-format-diff, whoops.

TWeaver marked an inline comment as done.Jan 21 2019, 9:35 AM
TWeaver added a subscriber: TWeaver.
TWeaver added inline comments.

I believe theres a typo in this comment

We now "have" whether...

should be

We now "know" whether...

jmorse updated this revision to Diff 183526.Fri, Jan 25, 5:55 AM

Fix a comment wording.