This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][LoopStrengthReduction] Ensure restoration of cached DIExpression when using SCEV-based salvaging
AbandonedPublic

Authored by chrisjackson on Jul 28 2021, 5:31 PM.

Details

Summary

Currently only those dbg.value that have are modified from single location-op to multi location-op by LSR have their DIExpression restored when using SCEV-based salvaging. This patch ensures that dbg.value have their cached DIExpressions restored regardless of the number of location ops. I believe this DIExpression restoration was present in Markus' original LSR salvaging method.

Diff Detail

Event Timeline

chrisjackson created this revision.Jul 28 2021, 5:31 PM
chrisjackson requested review of this revision.Jul 28 2021, 5:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2021, 5:31 PM

should we add a test case?

chrisjackson added a comment.EditedAug 4 2021, 5:29 AM

should we add a test case?

I believe this is already covered somewhat by /test/Transforms/LoopStrengthReduce/dbg-preserve-2.ll.

Furthermore, if LSR has not modified the DIExpression, then restoring it to the cached DIExpression is a NOP.

should we add a test case?

I believe this is already covered somewhat by /test/Transforms/LoopStrengthReduce/dbg-preserve-2.ll.

Does that test fail without this patch and pass with it applied? If that is the case, and you don't feel that a separate test is necessary, please can you add a comment explaining that it checks for the failure that this patch addresses too?

Here's the current comment in llvm/llvm-project/blame/main/llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-2.ll for reference:

; Test that LSR does not produce invalid debug info when a debug value is
; salvaged during LSR by adding additional location operands, then becomes
; undef, and finally recovered by SCEV salvaging.
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
6190
StephenTozer added a comment.EditedAug 10 2021, 2:03 AM

I believe this is already covered somewhat by /test/Transforms/LoopStrengthReduce/dbg-preserve-2.ll.

Just to confirm, was dbg-preserve-2.ll failing before this patch?

Edit: I have been beaten to the punch.

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
6190

Nit, typo

I believe this is already covered somewhat by /test/Transforms/LoopStrengthReduce/dbg-preserve-2.ll.

I think the usual rules apply: if there's an observable change in output from the compiler then there needs to be a test that fails if the modification isn't applied. As that test isn't currently failing, it doesn't cover the behaviour this patch adds. Otherwise LGTM

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
6189

"updated" is a little vague, could we go for "attempted to salvage" or otherwise indicate that normal debug-info mechanisms might have duly modified the expression.

Thanks for the comments. I think the test i mentioned is restricted to internal LSR salvages that modify the number of location operands as well as the the DIExpression. This patch adds restoration of the DIExpression when the number of location operands has not been modified, but the single operand and the DIExpression have been modified,. I'm not sure how easy it is to craft a test that elicits that behaviour from LSR. To me this patch is defence against such modifications of the dbg.value by LSR.
I think this patch introduces safer behaviour, as although LSR may not modify DIExpressions of single location operand dbg.value at the moment, it may do in future. I hope I'm making sense.

Addressed reviewer comments.

chrisjackson marked 3 inline comments as done.Aug 16 2021, 6:49 AM

To be clear why I don't currently have a test, this change resulted from inspection of the source and I'm confident the scenario (LSR modifying the DIExpression alone and not the number of location operands) can occur and the patched behaviour is correct. So, currently I have no reproducer. However, I'm prepared to try and generate one. But, as there are no restrictions on the salvaging behvaiour of LSR, even if the scenario does not occur now, it may in future.

Hmm, actually I hate to suddenly jump in with a broader objection (especially after I already reviewed this earlier), but is this change needed? I think the reason that the cached DIExpression restoration is only needed for arglists is because it will cause an error if we have a mismatch between the number of ArgList elements and the number of elements referenced in the DIExpression; we don't (I think) otherwise care about whether the expression is restored or not, because it should be undef regardless (otherwise it wouldn't have been salvaged).

Hmm, actually I hate to suddenly jump in with a broader objection (especially after I already reviewed this earlier), but is this change needed? I think the reason that the cached DIExpression restoration is only needed for arglists is because it will cause an error if we have a mismatch between the number of ArgList elements and the number of elements referenced in the DIExpression; we don't (I think) otherwise care about whether the expression is restored or not, because it should be undef regardless (otherwise it wouldn't have been salvaged).

Let's assume a single location dbg.value and that LSR sets the location undef. When the dbg.value is cached it may have a DIExpression already. The scev-salvage method will write an expression that restores the location value. This new expression is combined with the old using prependOpCodes() in e.g. setFinalExpression(), in order to fully restore the value that the pre-LSR dbg.value computed. However, it is possible LSR may change the DIExpression as well as make the location undef. If this is the case, the DIExpression created by concatenating the expressions will not be correct.

Let's assume a single location dbg.value and that LSR sets the location undef. When the dbg.value is cached it may have a DIExpression already. The scev-salvage method will write an expression that restores the location value. This new expression is combined with the old using prependOpCodes() in e.g. setFinalExpression(), in order to fully restore the value that the pre-LSR dbg.value computed. However, it is possible LSR may change the DIExpression as well as make the location undef. If this is the case, the DIExpression created by concatenating the expressions will not be correct.

Ah, so the second salvaging may succeed where the first has failed - in that case it makes sense, since the previous logic assumed that attempts to salvage had already failed. Personally, I think this is a fairly trivial fix, but I won't overrule others asking for a test; I wouldn't expect it to be too hard to write a test in that case though, as the only thing you'd need would be a single-value expression that fails the first salvage but succeeds the second, is that right?

Let's assume a single location dbg.value and that LSR sets the location undef. When the dbg.value is cached it may have a DIExpression already. The scev-salvage method will write an expression that restores the location value. This new expression is combined with the old using prependOpCodes() in e.g. setFinalExpression(), in order to fully restore the value that the pre-LSR dbg.value computed. However, it is possible LSR may change the DIExpression as well as make the location undef. If this is the case, the DIExpression created by concatenating the expressions will not be correct.

Ah, so the second salvaging may succeed where the first has failed - in that case it makes sense, since the previous logic assumed that attempts to salvage had already failed. Personally, I think this is a fairly trivial fix, but I won't overrule others asking for a test; I wouldn't expect it to be too hard to write a test in that case though, as the only thing you'd need would be a single-value expression that fails the first salvage but succeeds the second, is that right?

Yes, that's correct.

Update the text for the test that already covers restoration of the dbg.value arguments from the cache. During LSR a failed salvage attempt is made that modifies the dbg.value arguments. These arguments must be restored to their pre-LSR state in order for scev-based salvaging to succeed.

Added a test in response to reviewer comments. The test demonstrates the potential to create an incorrect DIExpression if the pre-LSR cached locations and expressions are not restored.

There seems to be quite a lot of metadata in the new test so I had a look around for quick ways to reduce it. One option could be:

opt dbg-preserve-3.ll --strip-nonlinetable-debuginfo -o stripped.ll -S

Then copy over the dbg.values and requisite metadata over from the original. But that is quite a lot of busywork. -strip-dead-debug-info sounds good but it doesn't appear to work on local variables. If you can reduce it further that would be good imo, but LGTM with test.

@StephenTozer seems happy already and I'm now happy (assuming test nits fixed). That just leaves @jmorse and @djtodoro who were both asking for a test too; I'll accept this tomorrow if there are no further comments.

llvm/test/Transforms/LoopStrengthReduce/dbg-preserve-3.ll
5 ↗(On Diff #372077)

oorrect -> correct

47 ↗(On Diff #372077)

Is FileCheck happy for you to use a variable like this, before it's been initialised ([[var_c]])?

Agreed with the above - if the metadata turns out to be a huge pain to strip then I don't think it should be a blocker, but if Orlando's suggestions work (or anyone else has a good idea of how to do so) then we should use them. No further complaints from me (Orlando's comments aside), so LGTM once the others have had a fair chance to check the review.

Orlando accepted this revision.Sep 13 2021, 8:33 AM

Accepting this now as promised. Before landing, please address the nits and try to reduce the test metadata further (see previous couple of comments) if possible.

This revision is now accepted and ready to land.Sep 13 2021, 8:33 AM
chrisjackson abandoned this revision.Sep 19 2021, 5:15 AM

:Aah, so on closer inspection and attempting to fix the test, I observed the calls to applyExprToDbgValues() and found:

RecoverValue.applyExprToDbgValue(*CachedDVI.DVI, CachedDVI.Expr);

which means that in fact it is not necessary to reapply the cached DIExpression, as any expression created by LSR is not used at all to construct the final expression. The setFinalExpression() functions prepend the new expression to the pre-LSR cached expression and this is applied to the dbg.value directly, without considering what the dbg.value's current expression is.
I will abandon this revision as unnecessary. Thanks for the reviews and apologies that this turned out to be an unnecessary change. However, not all effort was wasted as I did identify that sometimes undef values in SCEVs are being used to construct dbg.values that will be alter stripped, so the work on this patch will result in a change.