Page MenuHomePhabricator

[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands
ClosedPublic

Authored by StephenTozer on Nov 18 2020, 8:50 AM.

Details

Summary

This patch follows the complete implementation of variadic debug values, making use of them to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions.

The majority of the real work in this patch is in Local.cpp, where the salvage function has been expanded in complexity for GEP and BinOp instructions, which have thus been refactored out into separate functions. The GEP implementation required the creation of a new function collectOffset which behaves similarly to accumulateConstantOffset, except that in addition to any constant offset it also fetches any non-constant offsets and returns them in a map of [SSA Value -> Factor].

Outside of these larger changes, the remainder of this patch consists of small utility functions used in salvaging, and updating uses of the salvage functions. Not all places that use salvaging have been updated to make use of variadic salvaging yet (such as salvaging in ISel), as additional changes need to be made in those areas to support this form of salvaging. They should not be difficult to implement in future, however.

Diff Detail

Event Timeline

StephenTozer created this revision.Nov 18 2020, 8:50 AM
StephenTozer requested review of this revision.Nov 18 2020, 8:50 AM
StephenTozer added inline comments.Nov 18 2020, 8:55 AM
llvm/test/Transforms/Reassociate/undef_intrinsics_when_deleting_instructions.ll
-22–25

A note for reviewers on the changes made to this test: this test has been modified to prevent it failing after this patch is applied. The failure introduced by this patch is not an error, but the opposite: the test relies on intrinsics being made undef when the instruction they relied on is deleted, but as of this patch we are able to salvage them so they will not be set undef. In order to preserve the spirit of this test, I changed it to instead reassociate float arithmetic instructions, which cannot currently be salvaged and so results in the intrinsic being undef'd like before.

dstenb added a subscriber: dstenb.Nov 18 2020, 9:24 AM

Exciting!

I took the liberty to add some review comments whilst familiarizing myself with the code.

llvm/lib/IR/DebugInfoMetadata.cpp
26

Nit: std::max(Result, ExprOp.getArg(0) + 1) ?

llvm/lib/Transforms/Utils/Local.cpp
38–1

Could the creation of this utility function and getSalvageOpsForBinOp be done as a preceding NFC refactoring commit? Just to make it slightly easier to see what the difference in behavior is.

70–71

Should this say DIExpression perhaps? DWARFv5 supports binary operations on base types wider than 64 bits, but LLVM does not support such expressions.

(minor drive-by comment, but I'm not the right person to do more in depth review of this patch, unfortunately)

llvm/test/Transforms/Reassociate/undef_intrinsics_when_deleting_instructions.ll
-22–25

Any chance of using something that'd be even more robust by using, for instance, a call to an external 'pure' function where the return value is used to define the debug value, but is otherwise unused - there would the instruction could be deleted but there would be no way the value could be reconstituted no matter how much we improve debug info in the future

Thanks!! So, this should be the most important one :) I would really like to see some measurements (e.g. llvm-locstats) before/after this patch.
Have you tried building llvm-project itself with this?

(Neither arc patch D91722 nor curl -L 'https://reviews.llvm.org/D91722?download=1' | patch -p1 can apply the patch locally. Can you upload via arc diff or git format-patch?)

This is nice!

llvm/lib/Transforms/Utils/Local.cpp
101

Nit: Instead of having 20 calls to push_back, would it make sense to have a helper function that just does switch(opcode) { case Instruction::Add: return dwarf::DW_OP_plus; ...}?

StephenTozer added inline comments.Dec 3 2020, 6:17 AM
llvm/test/Transforms/Reassociate/undef_intrinsics_when_deleting_instructions.ll
-22–25

In a more general case I'd say so, but this behaviour in the reassociate pass is stimulated by arithmetic instructions, rather than just dead code in general; making use of float arithmetic in this case seemed to be the shortest path.

Thanks!! So, this should be the most important one :) I would really like to see some measurements (e.g. llvm-locstats) before/after this patch.
Have you tried building llvm-project itself with this?

So far I've tested this with the benchmarks from the llvm test suite (specifically the CTMark projects, since we're looking for compilation metrics only). I'll post a more complete set of stats soon, but the summary is that there is a small increase in compile time for most projects; <1% or outright negligible in most cases. In terms of locstats, the change looks to add a modest 1-2% increase in overall variable coverage for most projects, although a notable exception is tramp3d-v4 which saw variable coverage change from 53%=>66% - a fairly substantial improvement.

I'm optimistic about further improvements to the locstats following from this work; the current patch doesn't update any of the existing salvaging in Instruction Selection to support variadic locations (they require some specific changes which aren't complicated but require a separate patch), and also enable some new salvaging opportunities in Instruction Selection and the loop passes (such as Loop Strength Reduce).

Rebased onto recent master, no functional changes; moved the salvageDebugInfoImpl refactoring into a separate NFC patch; addressed outstanding review comments.

aprantl added inline comments.Dec 10 2020, 8:48 AM
llvm/include/llvm/IR/DebugInfoMetadata.h
23

to be in line with the other method names: getNumLocationOperands() or getNumLocationRefs() or getNumArgOps()...?
If wee get super nerdy, I think technically these are operators (that operate on the DWARF stack) and not operands :-)

StephenTozer added inline comments.Dec 10 2020, 9:50 AM
llvm/include/llvm/IR/DebugInfoMetadata.h
23

Sounds good to me! And wrt the nerdy point, I think that's actually an advantageous semantic point here: the intent of this function is actually to get the number of location operands that the DIExpression refers to, as opposed to the number of actual operators of a given type (i.e. for the nonsensical expression DIExpression(DW_OP_LLVM_arg 0, DW_OP_LLVM_arg 0, DW_OP_LLVM_arg 0) we still only refer to one location operand).

Thanks!! So, this should be the most important one :) I would really like to see some measurements (e.g. llvm-locstats) before/after this patch.
Have you tried building llvm-project itself with this?

So far I've tested this with the benchmarks from the llvm test suite (specifically the CTMark projects, since we're looking for compilation metrics only). I'll post a more complete set of stats soon, but the summary is that there is a small increase in compile time for most projects; <1% or outright negligible in most cases. In terms of locstats, the change looks to add a modest 1-2% increase in overall variable coverage for most projects, although a notable exception is tramp3d-v4 which saw variable coverage change from 53%=>66% - a fairly substantial improvement.

I'm optimistic about further improvements to the locstats following from this work; the current patch doesn't update any of the existing salvaging in Instruction Selection to support variadic locations (they require some specific changes which aren't complicated but require a separate patch), and also enable some new salvaging opportunities in Instruction Selection and the loop passes (such as Loop Strength Reduce).

Very cool! Thanks!

Rebased onto recent master, fixed test failures.

Ping - any further comments?

aprantl accepted this revision.Feb 1 2021, 5:33 PM

I think this looks pretty good. I found on last comment about the function signature of collectOffset, but otherwise this is fine.

llvm/include/llvm/IR/Instructions.h
4

Any reason why this doesn't just return an Optional<APInt>?

This revision is now accepted and ready to land.Feb 1 2021, 5:33 PM

Rebased onto latest master.

Modified salvageDebugInfoForDbgValues to only produce DIArgLists for dbg.values, and not dbg.addr or dbg.declare. This is necessary for the time being, as the current implementation of variadic debug values only works with DW_OP_stack_value for reasons previously discussed on the mailing list and in the root patch of this stack. In the next patch to follow, which will make the current DBG_VALUE_LIST into the default debug value instruction, the DIExpression representation will be changed to support non-stack_value variadic debug values, and this change can be reverted.

Minor update to fix an error in a comment (dbg.value -> dbg.declare).

StephenTozer added inline comments.Mar 10 2021, 2:32 PM
llvm/lib/Transforms/Utils/Local.cpp
1830

This else block will be removed in the near future, so I've tried to minimize the amount of change to the function, rather than rewriting a greater portion around this.

jmorse added a subscriber: jmorse.Mar 11 2021, 4:46 AM

Modified salvageDebugInfoForDbgValues to only produce DIArgLists for dbg.values, and not dbg.addr or dbg.declare. This is necessary for the time being, as the current implementation of variadic debug values only works with DW_OP_stack_value for reasons previously discussed on the mailing list and in the root patch of this stack. In the next patch to follow, which will make the current DBG_VALUE_LIST into the default debug value instruction, the DIExpression representation will be changed to support non-stack_value variadic debug values, and this change can be reverted.

SGTM

This revision was landed with ongoing or failed builds.Mar 11 2021, 5:34 AM
This revision was automatically updated to reflect the committed changes.

Fixes two issues with handling of variadic debug values:

  1. SelectionDAG::transferDebugValues incorrectly updating its dependencies list.
  2. LoopStrengthReduce, when trying to "repair" undef debug values by replacing them with equal values, must first restore the original location metadata to prevent it from being incorrectly updated if the number of location operands changed as a result of salvaging by LSR.