This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Add function to test debug values for equivalence
ClosedPublic

Authored by StephenTozer on Oct 18 2022, 7:15 AM.

Details

Summary

Debug values in both IR and MIR have 3 distinct properties that are used to compute the underlying variable value; the DIExpression, a list of one or more machine value(s), and a "directness" flag, given by either the choice of intrinsic in IR (dbg.declare/dbg.addr being indirect, dbg.value being direct), or directly specified as an operand in MIR (the second operand to a DBG_VALUE instruction). The expressions themselves may optionally be variadic, which allows for more than one machine value to be used in the debug value, and changes the expression syntax to explicitly specify where in the expression those machine values appear, instead of implicitly inserting the sole machine value at the front of the expression.

This patch adds a new function that can be used to check all the properties, other than the machine values, of a pair of debug values for equivalence. This is done by folding the "directness" into the expression, converting the expression to variadic form if it is not already in that form, and then comparing directly. In a few places which check whether two debug values are identical to see if their ranges can be merged, this function will correctly identify cases where two debug values are expressed differently but have the same meaning, allowing those ranges to be correctly merged.

Diff Detail

Event Timeline

StephenTozer created this revision.Oct 18 2022, 7:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2022, 7:15 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
StephenTozer requested review of this revision.Oct 18 2022, 7:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2022, 7:15 AM

Add some unit tests for the new function.

Add a lit test.

On the whole this looks good to me but I do have a few nits and questions.

  1. Could we do the comparison in a way that doesn't involve generating new unused metadata? For example, by just constructing an array of the DIExpression elements for each (no call to DIExpression::get) and comparing them elementwise?
  1. I think the change to DIExpression::isValid() could come as a separate patch - and possibly the change to DbgValueLocEntry too. wdyt?
llvm/include/llvm/IR/DebugInfoMetadata.h
2797–2799

nit: suggestion above or even "If \p Expr is non-variadic (i.e. ...."

2814

Since this function doesn't take into account the value or DIArgList contents I think it should be called isEqualExpression or something. Wdyt?

llvm/lib/CodeGen/MachineInstr.cpp
687–688

Could be beneficial to hoist this up to come after isIdentical check as it is a low-cost check?

llvm/test/DebugInfo/Generic/merge-equivalent-ranges.ll
22

What happens to the dbg.value above?

llvm/unittests/IR/MetadataTest.cpp
3131–3136

Maybe worth adding a comment pointing out that the expressions above are equivalent, we just don't cover this case at the moment?

On the whole this looks good to me but I do have a few nits and questions.

There's some justification to use DIExpressions in this function; namely that the DIExpression::append logic is slightly non-trivial, and would be a pain if errors started emerging because the append logic gets changed in one place later on - but with that said, I think the performance benefits are justifiable (or perhaps the append logic could be partially extracted to a new function).. Also w.r.t. point 2, that change is necessary as long as we construct the DIExpressions as part of this function (because it will at some point be used to compare entry value expressions, so it will automatically create DW_OP_LLVM_arg, 0, DW_OP_entry_value, 1 expressions), but can be taken out if we just work with the lists.

llvm/include/llvm/IR/DebugInfoMetadata.h
2814

Reasonable, when I first wrote the function it did compare values, but now that it doesn't a rename is justified.

llvm/test/DebugInfo/Generic/merge-equivalent-ranges.ll
22

All 3 of these dbg.values are given separate entries in the output loclist without this change.

llvm/unittests/IR/MetadataTest.cpp
3131–3136

SGTM.

Address review comments, including removing the unnecessary call to MachineInstr::isIdenticalTo in MachineInstr::isEquivalentDbgInstr and also adding a missing check for getDebugLoc equality.

Orlando accepted this revision.Nov 7 2022, 6:23 AM

LGTM

llvm/lib/CodeGen/MachineInstr.cpp
677

Should this function also check that the DILocalVariable field matches?

llvm/lib/IR/DebugInfoMetadata.cpp
1459–1461

Have you seen (/got an example of) an indirect debug intrinsic using stack_value "naturally" in IR? I feel like that's something that shouldn't exist, but maybe I'm missing something / not being creative enough.

This revision is now accepted and ready to land.Nov 7 2022, 6:23 AM

Rewrote the test to remove some incorrect behaviour; new test uses two different variables to compare direct/indirect and non-variadic/variadic separately. Also, add the missing variable equality check in the added MachineInstr function (the only caller already checks for variable equivalence, but the function would be misleading if it didn't do so as well!)

StephenTozer added inline comments.Nov 10 2022, 12:39 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1459–1461

I do not, and it shouldn't happen as things currently are - this code is just following the general rules for correctly constructed DIExpressions (essentially copied functionality from DIExpression::append. I could remove the condition and add an assert instead though, since I think it's very distinctly not possible to have an indirect stack_value expression right now.

This revision was landed with ongoing or failed builds.Dec 19 2022, 9:18 AM
This revision was automatically updated to reflect the committed changes.