This is an archive of the discontinued LLVM Phabricator instance.

[RemoveRedundantDebugValues] Introduce a MIR pass that removes redundant DBG_VALUEs
ClosedPublic

Authored by djtodoro on Jul 1 2021, 6:46 AM.

Details

Summary

This new MIR pass removes redundant DBG_VALUEs.

After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) variables assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having redundant DBG_VALUE.

This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with the respect to that.

This first patch performs bacward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one:

For example:

(1) DBG_VALUE $edi, !"var1", ...
(2) DBG_VALUE $esi, !"var2", ...
(3) DBG_VALUE $edi, !"var1", ...
 ...

in this case, we can remove (1).

By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed.

Diff Detail

Event Timeline

djtodoro created this revision.Jul 1 2021, 6:46 AM
djtodoro requested review of this revision.Jul 1 2021, 6:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2021, 6:46 AM
djtodoro updated this revision to Diff 355885.Jul 1 2021, 8:05 AM
  • remove some unwanted newlines
xgupta added a subscriber: xgupta.Jul 1 2021, 8:12 AM
NOTE: This pass can be invoked multiple times in the pipeline, in the case we find some other place it is worth using.

LGTM with a few inline nits (mostly about comments). I'd prefer to wait a little and see if anyone else has comments before hitting Accept. Note to other reviewers: there is a discussion about the stats/impact of this patch-set on D105280.

llvm/docs/SourceLevelDebugging.rst
773–776

nit: this feels maybe a bit clunky? Here is a possible alternative.

llvm/lib/CodeGen/RemoveRedundantDebugValues.cpp
23
70–73

nit/suggestion: possible rewording

// This analysis aims to remove redundant DBG_VALUEs by going backward
// in the basic block and removing all but the last DBG_VALUE for any
// given variable in a set of consecutive DBG_VALUE instructions.
llvm/test/DebugInfo/X86/livedebugvars-avoid-duplicates.ll
4 ↗(On Diff #355885)

You could add --implicit-check-not=DBG_VALUE to FileCheck so that the test fails if there are unexpected DBG_VALUEs?

@Orlando Thanks!

llvm/test/DebugInfo/X86/livedebugvars-avoid-duplicates.ll
4 ↗(On Diff #355885)

I should have deleted this test now. We don’t need it any more.

djtodoro updated this revision to Diff 356175.Jul 2 2021, 8:14 AM
  • addressing comments
StephenTozer added inline comments.
llvm/lib/CodeGen/RemoveRedundantDebugValues.cpp
113

DBG_VALUE_LIST looks like it's already supported by this pass; since you aren't touching the actual debug operands, the interface between the two is identical, and the call to isDebugValue() is true for DBG_VALUE_LIST as well.

djtodoro added inline comments.Jul 7 2021, 6:04 AM
llvm/lib/CodeGen/RemoveRedundantDebugValues.cpp
113

Oh, I see... This should have been added in the next patch from the stack -- we should prevent calling the reduceDbgValsForwardScan() for the DBG_VALUE_LIST with the current implementation.

djtodoro updated this revision to Diff 356945.EditedJul 7 2021, 7:09 AM
  • removing the TODO marker for the DBG_VALUE_LIST, since it is safe for the backward scan

You probably also considered adding this functionality to MIBuilder or LiveDebugVariables itself, to avoid inserting duplicates in the first place. Would that be possible, or is there no good place in the existing code where we could add this to?

llvm/test/DebugInfo/NVPTX/debug-loc-offset.ll
250 ↗(On Diff #356945)

Why did this change, is it because the order in which the DBG_VALUEs are processed is different now?

You probably also considered adding this functionality to MIBuilder or LiveDebugVariables itself, to avoid inserting duplicates in the first place. Would that be possible, or is there no good place in the existing code where we could add this to?

There was a previous patch up to this effect (D105025) that was abandoned due to performance concerns; scanning (potentially) the full basic block at insertion for every DBG_VALUE would be a quadratic solution, while this pass is linear. It would be possible to produce an optimized version using caching in LiveDebugVariables/LiveDebugValues (although probably not MIBuilder), but since they would all be linear time anyway I'm happier with the solution that has less complexity and doesn't insert more into two passes that are already quite dense.

You probably also considered adding this functionality to MIBuilder or LiveDebugVariables itself, to avoid inserting duplicates in the first place. Would that be possible, or is there no good place in the existing code where we could add this to?

There was a previous patch up to this effect (D105025) that was abandoned due to performance concerns; scanning (potentially) the full basic block at insertion for every DBG_VALUE would be a quadratic solution, while this pass is linear. It would be possible to produce an optimized version using caching in LiveDebugVariables/LiveDebugValues (although probably not MIBuilder), but since they would all be linear time anyway I'm happier with the solution that has less complexity and doesn't insert more into two passes that are already quite dense.

Yep, +1.

llvm/test/DebugInfo/NVPTX/debug-loc-offset.ll
250 ↗(On Diff #356945)

Now we have a new abbrev since the parameter becomes DW_AT_const_value:

...
  (1)  DBG_VALUE 0, $noreg, !13, !DIExpression(), debug-location !14
  (2) DBG_VALUE 1, $noreg, !13, !DIExpression(), debug-location !14
...

This will consider the (1) as redundant...is this OK?

djtodoro added inline comments.Jul 9 2021, 1:21 PM
llvm/test/DebugInfo/NVPTX/debug-loc-offset.ll
250 ↗(On Diff #356945)

I’d say it isn’t good…We need to take into account the DBG_VALUEs with a constant as the first operand as a special case…

djtodoro updated this revision to Diff 357898.Jul 12 2021, 4:41 AM
  • handle constants in DBG_VALUEs, since it can be there due to a dead instruction that has been deleted due to some optimizations
  • added a test for it

I'm not sure whether the most recent change is necessary; on the one hand, I understand that this change is meant to NFC, so affecting the actual debug output feels like a mistake, and I can see this change as being justified solely for the sake of keeping this patch NFC. On the other hand, the fact that removing redundant debug values results in a change in the DWARF is more a fault of the DWARF emission step than this pass I think. If we don't need to use a constant DW_AT_location instead of a DW_AT_const_value, then a debug value that is never live shouldn't cause us to emit the former rather than the latter, and if there is a reason to pick one over the other then the distinguishing factor should be something more explicit than the presence of an otherwise unused debug instruction. But this patch shouldn't need to be held up by that concern, since it works fine as is.

LGTM, but preferably leave it open for a day to see if anyone else has thoughts about this change.

@StephenTozer Thanks!

I'd be happy if someone officially accepts the change.

StephenTozer accepted this revision.Jul 14 2021, 3:06 AM

Looks like there's no further comments, LGTM.

This revision is now accepted and ready to land.Jul 14 2021, 3:06 AM
This revision was landed with ongoing or failed builds.Jul 14 2021, 4:30 AM
This revision was automatically updated to reflect the committed changes.
foad added a subscriber: foad.Sep 29 2021, 2:06 AM
foad added inline comments.
llvm/lib/CodeGen/TargetPassConfig.cpp
1126

Is the false here just a copy-and-paste? I assume there's no reason why machine verification should not work after this pass?

djtodoro added inline comments.Sep 29 2021, 2:16 AM
llvm/lib/CodeGen/TargetPassConfig.cpp
1126

oh, yes. Nice catch.