Page MenuHomePhabricator

[DebugInfo] Handle complex expressions of spilt locations in LiveDebugValues
ClosedPublic

Authored by jmorse on Jul 27 2019, 6:29 AM.

Details

Summary

D66284 fixes a crash in LiveDebugValues, but in doing so disallows the spilling of complex variable locations, as they cannot be easily restored.

This [revised] patch slightly adjusts the DBG_VALUE insts that LiveDebugValues tracks: instead of tracking the last DBG_VALUE for a variable, it tracks the last _unspilt_ DBG_VALUE. The spill-restore code is then able to access and copy the original complex expression; but the rest of LiveDebugValues has to be aware of the slight semantic shift, and produce a new spilt location if a spilt location is propagated between blocks.

This avoids the crash and allows us to spill/restore all DBG_VALUEs.

Diff Detail

Event Timeline

jmorse created this revision.Jul 27 2019, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2019, 6:29 AM
vsk added a comment.Jul 27 2019, 12:11 PM

Thanks for working on this. Do I understand correctly that the fragment expression introduced in the restorer may not be accurate when the spilled value has a complex description?

If so, my two cents is that it’d be preferable to either:

a) Entirely drop the dbg_value/location for complex fragments during a spill (as a safer crash mitigation), or

b) Find some way to teach the restorer how to build correct descriptions for fragments (at first glance it seemed like this would be as simple as stripping off the offset applied in the spiller, but I’m probably missing something)

jmorse updated this revision to Diff 213253.Aug 4 2019, 10:16 AM

Revise patch to not spill variable locations with complex (non-empty non-fragment) expressions, as we currently can't recover the original expression on restore.

Vedant wrote:

Thanks for working on this. Do I understand correctly that the fragment expression introduced in the restorer may not be accurate when the spilled value has a complex description?

That's correct,

If so, my two cents is that it’d be preferable to either:

a) Entirely drop the dbg_value/location for complex fragments during a spill (as a safer crash mitigation), or

That sounds a lot safer, done with this update,

b) Find some way to teach the restorer how to build correct descriptions for fragments (at first glance it seemed like this would be as simple as stripping off the offset applied in the spiller, but I’m probably missing something)

I think that would work within LiveDebugValues, but we can also get spill locations from LiveDebugVariables too, as well as stack-locations created in LowerDbgDeclare [0] that aren't actually spills but might present themselves identically. I'm currently building an argument that DIExpressions signal too much information "in-band" as it were, there's also D64595 where digging through the insides of a DIExpression might be appropriate.

~

NB: I'm out-of-office and will be away from SVN credentials for a while, if this is approved, would the approver please commit it too. I can probably file the ticket for merging this into the llvm-9 branch, but it'd be great if that was handled as well.

[0] https://github.com/llvm/llvm-project/blob/44b16bd4a5b20ce8b4e9ef56836123a2038f3670/llvm/lib/Transforms/Utils/Local.cpp#L1427

Thanks. I think this looks reasonable, but don't have the bandwidth right now to test+merge a fix for llvm-9. + debug-info for more feedback/visibility.

llvm/test/DebugInfo/MIR/X86/live-debug-values-restore.mir
38

It's be great to move the checks closer to the MIR code they refer to. Also, as this patch includes a fix for fragment expressions, I think the check should validate the fragment.

aprantl added inline comments.Aug 5 2019, 2:44 PM
llvm/lib/CodeGen/LiveDebugValues.cpp
696

The new case makes sense, but I find the original code quite unintuitive... why would be create an empty expression for the restored value rather than copy the original expression?

850

Ah.. that's why?

Shouldn't we then just copy the original expression? That would be less code and more obvious.

jmorse updated this revision to Diff 213636.Aug 6 2019, 9:12 AM

Adjust test to positively check for correct fragment expressions after a spill and restore

jmorse updated this revision to Diff 213638.Aug 6 2019, 9:20 AM
jmorse marked an inline comment as done.

Curses, previous update spliced in an old LiveDebugValues.cpp diff, this one should contain the revised LiveDebugValues.cpp, and the revised test modification, apologies for confusion.

jmorse marked an inline comment as done.Aug 6 2019, 9:39 AM
In D65368#1615530, @vsk wrote:

Thanks. I think this looks reasonable, but don't have the bandwidth right now to test+merge a fix for llvm-9. + debug-info for more feedback/visibility.

Thank you for reviewing! I should be back before rc3, but probably not before rc2, so this *shouldn't* miss the boat.

llvm/lib/CodeGen/LiveDebugValues.cpp
696

The original came in in D57271 -- at the time, I had a look and didn't feel it was wrong. Perhaps, seeing how LiveDebugValues wasn't handling fragments at all at the time, everyone assumed Something Else (TM) took care of fragments?

850

IIRC, when restored, the only DBG_VALUE available to the restore-code is the spilt-location expression rather than the original. We'd have to track additional information about the original expression in the LiveDebugValues data structures -- which is fine, but IMHO too much new material to pull into LLVM-9.

We could also strip the additional-offset from the expression, but I worry about that too, see main comment from 2019-08-04 above.

(Looks like I missed saying "crash mitigation" in the review description, curses, I'll edit that in)

jmorse edited the summary of this revision. (Show Details)Aug 6 2019, 9:40 AM

(ping -- back in office now, so can land this myself).

aprantl added inline comments.Aug 12 2019, 9:11 AM
llvm/lib/CodeGen/LiveDebugValues.cpp
850

I understand the constraints now. This is a new expression that just describes the spill. Due to the concatenative nature of DWARF expressions, I believe that you still should be able to just append the operations describing the spill at the end of the original expression (but before DW_OP_llvm_fragment/DW_OP_stack_value) and be safe. That's how we do it in salvageDebugInfo for example. Is this situation any different?

jmorse marked an inline comment as done.Aug 12 2019, 9:28 AM
jmorse added inline comments.
llvm/lib/CodeGen/LiveDebugValues.cpp
850

Aha, there might be some wire crossing going on -- interpreting this:

This is a new expression that just describes the spill [...] you still should be able to just append the operations

Creating the spill expression isn't the issue, it's the eventual restore expression that's difficult to create. However Vedant suggested in the first comment that a simpler solution may be to just not create a spill transfer of anything that we can't restore as a broader mitigation (which sounds good to me).

An alternative would be the original patch I uploaded: that preserves fragment information correctly on restore, at the cost of potentially dropping portions of complex expressions.

at the cost of potentially dropping portions of complex expressions.

llvm/lib/CodeGen/LiveDebugValues.cpp
850

at the cost of potentially dropping portions of complex expressions.

This meaning, we're loosing those DBG_VALUEs, not that they are incorrect, right?

jmorse marked an inline comment as done.Aug 12 2019, 1:57 PM

This meaning, we're loosing those DBG_VALUEs, not that they are incorrect, right?

They would be incorrect, as extra operations used to calculate the location would be dropped -- but on the plus-side, the compiler wouldn't crash in the process. Here's an illustrative example to aid discussion, first how this is all supposed to work with a sample input using an empty DIExpression (in-pseudo-mir, destination regs on left hand side):

DBG_VALUE $rax, !1, !DIExpression()
MOV64rm $rsp(-8), $rax
call @somesymbol
MOV64mr $rax, $rsp(-8)

The existing spill-and-restore code will recognise the store/load to/from stack and produce:

DBG_VALUE $rax, !1, !DIExpression()
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus)
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

inserting a DBG_VALUE after the spill and after the restore. The problem is that if the original DBG_VALUE has a fragment, then with no patch we get:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_LLVM_fragment, 0, 32)
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

i.e., the fragment gets dropped on the last DBG_VALUE. This currently fires an assertion in LiveDebugValues, I'm pretty certain the location-list builder would fire a few assertions if it saw this too.

The patch as it stands right now would get the fragment correct on the previous example, but would refuse to spill the following:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
call @somesymbol
MOV64mr $rax, $rsp(-8)

Because of the complex expression on the DBG_VALUE (i.e. the DW_OP_plus_ucons).

The initial version of this patch would have produced:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_deref, DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)   ; illustrative, not sure if deref is in right place
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)

i.e.: the DW_OP_plus_uconst would have been preserved in the spill, but not on the restore. That would give us a bad location after the restore, but on the other hand, no crash.

Obviously these are all unfortunate options. There are a few more: we could just never restore, but this can artificially extend location lifetimes [0]. We could emit DBG_VALUE $noreg instead of restoring. We could also try pattern-matching the portion of the DIExpression that's the spill, and the part that isn't, and try and cut out the spill part -- however my past experiences with manipulating DIExpression opcodes have been error prone.

My (IMHO) preference is the current patch, i.e. don't spill expressions with complex locations. It chops out some functionality (that hasn't been released before) but produces neither crashes nor invalid locations.

[0] https://bugs.llvm.org/show_bug.cgi?id=42772

llvm/lib/CodeGen/LiveDebugValues.cpp
850

(Producing an example which'll display better with more hspace in the main comment thread, so moving there),

This meaning, we're loosing those DBG_VALUEs, not that they are incorrect, right?

They would be incorrect, as extra operations used to calculate the location would be dropped -- but on the plus-side, the compiler wouldn't crash in the process. Here's an illustrative example to aid discussion, first how this is all supposed to work with a sample input using an empty DIExpression (in-pseudo-mir, destination regs on left hand side):

DBG_VALUE $rax, !1, !DIExpression()
MOV64rm $rsp(-8), $rax
call @somesymbol
MOV64mr $rax, $rsp(-8)

The existing spill-and-restore code will recognise the store/load to/from stack and produce:

DBG_VALUE $rax, !1, !DIExpression()
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus)
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

inserting a DBG_VALUE after the spill and after the restore. The problem is that if the original DBG_VALUE has a fragment, then with no patch we get:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_LLVM_fragment, 0, 32)
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

i.e., the fragment gets dropped on the last DBG_VALUE. This currently fires an assertion in LiveDebugValues, I'm pretty certain the location-list builder would fire a few assertions if it saw this too.

The patch as it stands right now would get the fragment correct on the previous example, but would refuse to spill the following:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
call @somesymbol
MOV64mr $rax, $rsp(-8)

Because of the complex expression on the DBG_VALUE (i.e. the DW_OP_plus_ucons).

I think up until here this makes perfect sense to me. Thanks again for the explanation!

The initial version of this patch would have produced:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_deref, DW_OP_plus_uconst, 1, DW_OP_LLVM_fragment, 0, 32)   ; illustrative, not sure if deref is in right place
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)

i.e.: the DW_OP_plus_uconst would have been preserved in the spill, but not on the restore. That would give us a bad location after the restore, but on the other hand, no crash.

Naive question: Why doesn't the restore recognizer insert the exact same expression that was found by the spill recongizer, regardless of what expression is used to describe the spill? Or: what happens on this example with the +1 expression without this patch?

Obviously these are all unfortunate options. There are a few more: we could just never restore, but this can artificially extend location lifetimes [0]. We could emit DBG_VALUE $noreg instead of restoring. We could also try pattern-matching the portion of the DIExpression that's the spill, and the part that isn't, and try and cut out the spill part -- however my past experiences with manipulating DIExpression opcodes have been error prone.

My (IMHO) preference is the current patch, i.e. don't spill expressions with complex locations. It chops out some functionality (that hasn't been released before) but produces neither crashes nor invalid locations.

[0] https://bugs.llvm.org/show_bug.cgi?id=42772

Naive question: Why doesn't the restore recognizer insert the exact same expression that was found by the spill recongizer, regardless of what expression is used to describe the spill? Or: what happens on this example with the +1 expression without this patch?

Great question; each "open"/valid variable location is maintained as a VarLoc object [0], which stores the machine location and variable identity of the DBG_VALUE being tracked. That doesn't include the expression, instead that resides in the tracked DBG_VALUE inst [1]. After a spill, the DBG_VALUE being tracked and corresponding VarLoc object describe the spill location, and there's no connection back to the unspilt DBG_VALUE, illustrated here:

DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_LLVM_fragment, 0, 32) ; <--- VarLoc and its MI field point here, no link to earlier DBG_VALUE
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

In the +1 example without this patch, an empty expression is created on restore, as there's no attempt to recover the original expression.

This could be fixed, possibly easily, by adding a DIExpression field to VarLoc and using that expression on restore. I've shied away from this as it could reduce performance, and the risk that adding new features/facilities won't be acceptable to merge into the LLVM-9 branch to fix the upcoming release. (I've no experience with what is or isn't seen as acceptable).

I suppose another alternative is, rather than having the VarLoc for a spill hold a reference to the DBG_VALUE for the spilt location, we could instead point at the unspilt DBG_VALUE, i.e. instead of the earlier illustration we would instead have

DBG_VALUE $rax, !1, !DIExpression(DW_OP_LLVM_fragment, 0, 32)                               ; <--- VarLoc.MI points here
MOV64rm $rsp(-8), $rax
DBG_VALUE $rsp, !1, !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_LLVM_fragment, 0, 32) ; <--- VarLocs other fields record this variable location
call @somesymbol
MOV64mr $rax, $rsp(-8)
DBG_VALUE $rax, !1, !DIExpression()

which would give enough information to recover the original expression. Taking this option would require other bits of LiveDebugValues to recognise when they're manipulating a spilt location, and manually add the spill offset to the expression. The code the propagates locations between blocks [2] would need that, probably nothing else. There's also the risk that there's some other subtle detail that I haven't accounted for -- but it does mean no crash, no reduction in locations, and no digging through DIExpressions. I'll take a shot at prototyping this.

[0] https://github.com/llvm/llvm-project/blob/2bea69bf6503ffc9f3cde9a52b5dac1a25e94e1c/llvm/lib/CodeGen/LiveDebugValues.cpp#L171
[1] https://github.com/llvm/llvm-project/blob/2bea69bf6503ffc9f3cde9a52b5dac1a25e94e1c/llvm/lib/CodeGen/LiveDebugValues.cpp#L656 <--- line 629 produces the DebugInstr pointer from a VarLoc object
[2] https://github.com/llvm/llvm-project/blob/2bea69bf6503ffc9f3cde9a52b5dac1a25e94e1c/llvm/lib/CodeGen/LiveDebugValues.cpp#L1111

jmorse updated this revision to Diff 214823.Aug 13 2019, 6:43 AM

This update demonstrates one of the alternatives -- pointing the VarLoc object's DBG_VALUE at the unspilt DBG_VALUE allows the spill-restore code to access the correct expression upon restore. This requires slightly more logic when propagating locations, as we can't just blindly clone DBG_VALUEs any more.

I've added a new test function ('h') to check that complex expressions are preserved, which they are. However: it looks like the spiller is getting the spilt expression wrong anyway, as the spilt location DBG_VALUE is:

DBG_VALUE $rsp, 0, !123,  DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_plus_uconst, 1)

both with and without patching. I believe it's missing a DW_OP_deref in the middle, between the plus-part and the stack-offset part.

IMO: this shows spill/restore of complex expressions is all-round broken right now, and the best solution is probably to not spill them at all (Diff 4, numbered 213638 on this review).

This could be fixed, possibly easily, by adding a DIExpression field to VarLoc and using that expression on restore. I've shied away from this as it could reduce performance, and the risk that adding new features/facilities won't be acceptable to merge into the LLVM-9 branch to fix the upcoming release. (I've no experience with what is or isn't seen as acceptable).

As a general point, we should avoid discussing the merit of a patch based on LLVM's release schedule. If we are concerned about this patch it's best to split it into two parts, an uncontroversial one that is an obvious improvement that then can get cherry-picked without being afraid and a follow-up patch that does things right (but at potentially a performance cost that may need to be evaluated). But as far as trunk is concerned, our priority should always be what's right in the long term.

This update demonstrates one of the alternatives -- pointing the VarLoc object's DBG_VALUE at the unspilt DBG_VALUE allows the spill-restore code to access the correct expression upon restore. This requires slightly more logic when propagating locations, as we can't just blindly clone DBG_VALUEs any more.

I like the new patch! It looks like you discovered a few more pre-existing bugs though, so I'd suggest to do the following:

  1. Create a patch that disables spill-recognition for complex parameters, cherry-pick it to the release. Should be obviously correct.
  2. Create a patch that implements *restoring* of complex expressions (e.g., by pointing to the unspilt DBG_VALUE) and potentially cherry-pick. (Should also be obviously correct.)
  3. Create a patch to enable describing the spilt location of complex expressions and iterate until all bugs are fixed (possibly lands after the release).

Let me know what you think!

lib/CodeGen/LiveDebugValues.cpp
694 ↗(On Diff #214823)

So here we are restoring the original expression since DebugInstr points to the pre-spill DBG_VALUE.

1121 ↗(On Diff #214823)

Out of curiosity: What happens if IsIndirect was already true? Are we loosing one level of indirection or does this work out?

Once wrinkle: patch 1 or 2 would also need to create undef DEBUG_VALUEs at the spill location for the values we are skipping to terminate the ranges.

jmorse marked 2 inline comments as done.Aug 14 2019, 9:45 AM

I like the new patch! It looks like you discovered a few more pre-existing bugs though, so I'd suggest to do the following:

Sounds like a plan; I'll open another phab review for patch 1, which should be straight-forwards; the current patch on this review should be sufficient for patch 2; and I'll think about patch 3,

Once wrinkle: patch 1 or 2 would also need to create undef DEBUG_VALUEs at the spill location for the values we are skipping to terminate the ranges.

Hmn -- is this fully necessary? As it stands, if a spill isn't recognised then the (register) variable lifetime will end in the usual way, when it's clobbered. Recognising spills allows us to extend the variable lifetime past register-clobbering, but it should (AFAIUI) be safe to ignore a spill and let the variable location be clobbered.

lib/CodeGen/LiveDebugValues.cpp
694 ↗(On Diff #214823)

Indeed,

1121 ↗(On Diff #214823)

It looks like the extra indirection is dropped -- line 674 of this patch just passes "true" as the indirect flag for the initial spill, and doesn't inspect the old DBG_VALUEs indirectness.

I like the new patch! It looks like you discovered a few more pre-existing bugs though, so I'd suggest to do the following:

  1. Create a patch that disables spill-recognition for complex parameters, cherry-pick it to the release. Should be obviously correct.
  2. Create a patch that implements *restoring* of complex expressions (e.g., by pointing to the unspilt DBG_VALUE) and potentially cherry-pick. (Should also be obviously correct.)
  3. Create a patch to enable describing the spilt location of complex expressions and iterate until all bugs are fixed (possibly lands after the release).

Sounds like a plan; I'll open another phab review for patch 1, which should be straight-forwards; the current patch on this review should be sufficient for patch 2; and I'll think about patch 3,

Great!

Once wrinkle: patch 1 or 2 would also need to create undef DEBUG_VALUEs at the spill location for the values we are skipping to terminate the ranges.

Hmn -- is this fully necessary? As it stands, if a spill isn't recognised then the (register) variable lifetime will end in the usual way, when it's clobbered. Recognising spills allows us to extend the variable lifetime past register-clobbering, but it should (AFAIUI) be safe to ignore a spill and let the variable location be clobbered.

Let me prefix this by saying that I didn't fully think this through when I wrote the comment :-)
I think you are right in that patch 1 is safe standalone, since the lifetime will end naturally because of the clobber. Patch 2 introduces a new DBG_VALUE, clobbering any previous DBG_VALUE so it is naturally safe, too. And I think the same argument even holds for patch 3.

Sorry for the noise!

jmorse retitled this revision from [DebugInfo] LiveDebugValues: Don't drop fragment information when restoring spills to [DebugInfo] Handle complex expressions of spilt locations in LiveDebugValues.Aug 15 2019, 3:15 AM
jmorse edited the summary of this revision. (Show Details)
jmorse updated this revision to Diff 215363.Aug 15 2019, 3:28 AM

Rebase onto the crash-fixing patch in D66284. This patch now removes the crash-fixing workaround that salvages fragments from spilt expressions, in favour of a more generalised method to access the original unspilt expression. I've edited the details of that into the patch summary at the top.

Adrian wrote:

I think you are right in that patch 1 is safe standalone, since the lifetime will end naturally because of the clobber. Patch 2 introduces a new DBG_VALUE, clobbering any previous DBG_VALUE so it is naturally safe, too. And I think the same argument even holds for patch 3.

Cool, sounds like we're all in agreement, patches 1 and 2 are now arranged in phabricator.

aprantl accepted this revision.Aug 15 2019, 9:41 AM

This is patch 2. LGTM

This revision is now accepted and ready to land.Aug 15 2019, 9:41 AM
This revision was automatically updated to reflect the committed changes.

Thanks for all the reviews!