This is an archive of the discontinued LLVM Phabricator instance.

[Debuginfo][LSR] Add support for salvaging variadic dbg.value intrinsics [1/2] [NFC]
ClosedPublic

Authored by chrisjackson on Feb 18 2022, 3:35 PM.

Details

Summary

First of two patches that will enable salvaging of dbg.value instrinsics that
have multiple locations ops before the Loop Strength Reduction pass.

The existing single-op SCEV-based salvaging can generate variadic dbg.value
intrinsics in order to salvage a dbg.value that has a single location op.
If a dbg.value has multiple location ops before LSR, and LSR optimises
away one or more of the location operands, then no salvaging will be attempted.

Salvaging can now be added, but first this patch cleans up consistency
in both the code and comments, and applies some refactoring to make application
of the new salvaging implementation more straightforward.

  • Use SCEVDbgValueBuilder for both types of recovery expressions: IV-offset based and iteration count based.
  • Combine the functions that write the final DIExpression.
  • Move some static functions into member functions.

Diff Detail

Event Timeline

chrisjackson created this revision.Feb 18 2022, 3:35 PM
chrisjackson requested review of this revision.Feb 18 2022, 3:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 18 2022, 3:35 PM

Applied clang-format.

chrisjackson edited the summary of this revision. (Show Details)Feb 19 2022, 1:05 AM
chrisjackson edited the summary of this revision. (Show Details)Feb 19 2022, 1:25 PM
chrisjackson edited the summary of this revision. (Show Details)
chrisjackson edited the summary of this revision. (Show Details)Feb 19 2022, 2:45 PM
chrisjackson edited the summary of this revision. (Show Details)Feb 21 2022, 2:46 AM
chrisjackson edited the summary of this revision. (Show Details)
chrisjackson edited the summary of this revision. (Show Details)Feb 21 2022, 2:49 AM
chrisjackson edited the summary of this revision. (Show Details)Feb 21 2022, 3:20 AM
chrisjackson edited the summary of this revision. (Show Details)
davidb added a subscriber: davidb.Feb 21 2022, 5:18 AM
chrisjackson edited the summary of this revision. (Show Details)Feb 25 2022, 3:37 AM

Can this be labelled NFC or is there a functional change here that I've missed?

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
6063

Should this be ...from a given value... rather than ...for a given value...?

6064

nit: I think these function names should all start with lower case.

6072

nit: the function names in this file appear to have a mix of capitalization style -- since it's already inconsistent it might be better to follow the standard (camelCase)? Not a strong opinion, feel free to ignore.

6165–6166

Please can you explain why this is necessary (on phab or in the comment or both)?

chrisjackson updated this revision to Diff 412150.EditedMar 1 2022, 10:04 AM

Responded to review from @Orlando (Thanks!).

  • Improved the comments with respect to writing shorter dbg.value. Its quite hard to explain concisely that DIArgList and {DW_OP_LLVM_arg, 0} aren't always necessary, but I tried.
  • Renamed functions that had been moved inside SCEVDbgValueBuilder and were incorrectly cased.
  • Yes, this patch is NFC, but I wasn't sure of the benefits / rules for labeling it as such. But I'm happy to do so.
chrisjackson marked 3 inline comments as done.Mar 1 2022, 10:05 AM

I don't think there are "rules" about tagging patches as NFC (just put NFC somewhere in the title). It's definitely helpful to be able to see at a glance though - two examples: if an NFC patch is causing changes in output there's a bug, and if a patch is NFC it probably doesn't need any tests.

Nearly ready to approve but I'm still stuck on one thing (sorry if I'm just missing something obvious!).

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
6147

My previous inline comment seems to have sunk further down, so I'm making a new one here. Sorry for holding things up but I still don't really understand what these few lines are for - I tried looking in the pre-patched code but couldn't find any similar code. What is the intended condition here and why does a DW_OP_stack_value need to be added under that condition?

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 3:46 AM

I don't think there are "rules" about tagging patches as NFC (just put NFC somewhere in the title). It's definitely helpful to be able to see at a glance though - two examples: if an NFC patch is causing changes in output there's a bug, and if a patch is NFC it probably doesn't need any tests.

Nearly ready to approve but I'm still stuck on one thing (sorry if I'm just missing something obvious!).

I can clarify the comment. It's my understanding that non-empty DIExpressions must have a DW_OP_stack_value terminator. The expressions generated by the SCEVDbgValueBuilder objects do not have these terminators (as the expressions may be inserted into other expressions). If the pre-LSR expression was empty, once the complete salvaging expression has been generated, then the terminator must be appended.

chrisjackson retitled this revision from [Debuginfo][LSR} Add support for salvaging variadic dbg.value intrinsics [1/2] to [Debuginfo][LSR][NFC] Add support for salvaging variadic dbg.value intrinsics [1/2] .Mar 2 2022, 7:33 AM

Correction to failed function renaming.

Attempt to make UpdateDbgValueInst() more clear when adding the DW_OP_stack_value terminator.

I can clarify the comment. It's my understanding that non-empty DIExpressions must have a DW_OP_stack_value terminator. The expressions generated by the SCEVDbgValueBuilder objects do not have these terminators (as the expressions may be inserted into other expressions). If the pre-LSR expression was empty, once the complete salvaging expression has been generated, then the terminator must be appended.

My main confusion was in why the addition of stack_value was necessary in the patch but wasn't added before. I'd missed that you were previously using prependOpcodes with param stackValue=true, but are now adding the opcode yourself. And I agree that adding a stack_value is correct in this instance because you're generating implicit locations (though I think "non-empty DIExpressions must have a DW_OP_stack_value terminator" is not true in general?). In any case, sorry for the protracted review as a result of my misunderstanding!

prependOpcodes also accounts for inserting a stack_value opcode before fragment opcodes, which your implementation doesn't appear to handle. Would it be possible to rearrange the code to use prependOpcodes so that all cases are accounted for?

chrisjackson added a comment.EditedMar 4 2022, 1:28 PM

I can clarify the comment. It's my understanding that non-empty DIExpressions must have a DW_OP_stack_value terminator. The expressions generated by the SCEVDbgValueBuilder objects do not have these terminators (as the expressions may be inserted into other expressions). If the pre-LSR expression was empty, once the complete salvaging expression has been generated, then the terminator must be appended.

My main confusion was in why the addition of stack_value was necessary in the patch but wasn't added before. I'd missed that you were previously using prependOpcodes with param stackValue=true, but are now adding the opcode yourself. And I agree that adding a stack_value is correct in this instance because you're generating implicit locations (though I think "non-empty DIExpressions must have a DW_OP_stack_value terminator" is not true in general?). In any case, sorry for the protracted review as a result of my misunderstanding!

prependOpcodes also accounts for inserting a stack_value opcode before fragment opcodes, which your implementation doesn't appear to handle. Would it be possible to rearrange the code to use prependOpcodes so that all cases are accounted for?

I understand your query now.

EDIT: This was good advice and I've changed the code accordingly.

Removed the naive code to check and add dwarf::DW_OP_stack_value in UpdateDbgValueInst() following @Orlando 's helpful advice and reverted to using DIExpression::prependOpcodes().

My previous reason for not following the advice didn't actually make sense for this first patch in the series.

The change SGTM but I found a reproducer that causes clang to crash when this patch is applied:

$ cat reduce.cpp
#include <utility>
template <typename a> class Vec {
public:
  void push(a);
};
void d(int argc, char **argv) {
  Vec<std::pair<char *, unsigned>> e;
  for (int f = 0;;)
    for (; f < argc; ++f)
      e.push(std::make_pair(argv[f], f));
}

$ clang++  -O2 -g reduce.cpp

/home/och/dev/llvm/llvm-project/llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h:139: llvm::DbgValueLoc::DbgValueLoc(const llvm::DIExpression*, llvm::ArrayRef<llvm::DbgValueLocEntry>, bool): Assertion `Expr && Expr->isValid() && is_contained(Expr->getElements(), dwarf::DW_OP_stack_value)' failed.
...
chrisjackson added a comment.EditedMar 7 2022, 5:59 AM

The change SGTM but I found a reproducer that causes clang to crash when this patch is applied:

$ cat reduce.cpp
#include <utility>
template <typename a> class Vec {
public:
  void push(a);
};
void d(int argc, char **argv) {
  Vec<std::pair<char *, unsigned>> e;
  for (int f = 0;;)
    for (; f < argc; ++f)
      e.push(std::make_pair(argv[f], f));
}

$ clang++  -O2 -g reduce.cpp

/home/och/dev/llvm/llvm-project/llvm/lib/CodeGen/AsmPrinter/DebugLocEntry.h:139: llvm::DbgValueLoc::DbgValueLoc(const llvm::DIExpression*, llvm::ArrayRef<llvm::DbgValueLocEntry>, bool): Assertion `Expr && Expr->isValid() && is_contained(Expr->getElements(), dwarf::DW_OP_stack_value)' failed.
...

Oops, looks like when I split my patch into two I failed to keep some critical code in this patch. Sometimes LSR itself modifies the number of location ops in a dbg.value. If scev-based salvaging is to attempt recovery it must first restore the dbg.value to a single-op state by removing the DIArglist in the first argument. Without this restoration the salvage attempts will create invalid dbg.value, hence the crash you observed (thanks again!).

I will readd the code that removes DIArglists added by LSR.

chrisjackson retitled this revision from [Debuginfo][LSR][NFC] Add support for salvaging variadic dbg.value intrinsics [1/2] to [Debuginfo][LSR] Add support for salvaging variadic dbg.value intrinsics [1/2] .

Reinstate the check for dbg.value that LSR has changed from single location-op to multi location-op. When LSR makes this change, the single location argument is changed to a DIArglist. This patch now replaces the DIArglist with a single 'undef'. This will prevent the generation of invalid dbg.value that were causing the crash as reported by @Orlando .

Remove accidental double application of dbg.value expression restoration.

Reinstate the check for dbg.value that LSR has changed from single location-op to multi location-op. When LSR makes this change, the single location argument is changed to a DIArglist. This patch now replaces the DIArglist with a single 'undef'. This will prevent the generation of invalid dbg.value that were causing the crash as reported by @Orlando .

If that change makes the patch not-NFC, please could you add a test for this and mention it in the commit message? Lastly, please can you appease the clang-format bot?

chrisjackson added a comment.EditedMar 9 2022, 1:40 PM

Reinstate the check for dbg.value that LSR has changed from single location-op to multi location-op. When LSR makes this change, the single location argument is changed to a DIArglist. This patch now replaces the DIArglist with a single 'undef'. This will prevent the generation of invalid dbg.value that were causing the crash as reported by @Orlando .

If that change makes the patch not-NFC, please could you add a test for this and mention it in the commit message? Lastly, please can you appease the clang-format bot?

Thinking about it, the codeblock in which this improvement exists is removed entirely in patch [2/2]. So I will revert to the old method of simply giving up on the salvage if the undef location is now nullptr (and thus the type is unavailable) . This will reinstate the NFC status of this patch.

The original purpose of the patch was to refactor and clean things up for the new implementation in patch [2/2], so i think leaving out this improvement is the most sensible course of action. I could generate a test but it doesn't seem worth it for code that would be removed in the near future.

Orlando accepted this revision.Mar 10 2022, 12:26 AM

Makes sense to restore NFC-ness of the patch, LGTM with that change.

This revision is now accepted and ready to land.Mar 10 2022, 12:26 AM

Restore the the old behaviour for transforming DIArglist location arguments to single- location op:. When trying to determine the type of an undef Value that has become a nullptr, simply abandon the salvage attempt. This restores the NFC status of this patch.

In the second patch in this series, this code block is removed.

chrisjackson retitled this revision from [Debuginfo][LSR] Add support for salvaging variadic dbg.value intrinsics [1/2] to [Debuginfo][LSR] Add support for salvaging variadic dbg.value intrinsics [1/2] [NFC].Mar 10 2022, 6:17 AM

Add check for empty arglist that was missing from check for nullptr arglist member

This revision was landed with ongoing or failed builds.Apr 27 2022, 3:45 AM
This revision was automatically updated to reflect the committed changes.