This is an archive of the discontinued LLVM Phabricator instance.

[Local] replaceAllDbgUsesWith: Update debug values before RAUW
ClosedPublic

Authored by vsk on Jun 27 2018, 2:59 PM.

Details

Summary

The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.

This improves upon the existing insertReplacementDbgValues API by:

  • Updating debug intrinsics in-place, while preventing use-before-def of the replacement value.
  • Falling back to salvageDebugInfo when a replacement can't be made.
  • Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into common utility code.

Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:

  • The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations.
  • Truncation. Applies when the new value is wider than the old one.
  • Zero/sign extension. Applies when the new value is narrower than the old one.

Testing:

  • check-llvm, check-clang, a stage2 -g -O3 build of clang, regression/unit testing.
  • This resolves a number of mis-sized dbg.value diagnostics from Debugify.

Diff Detail

Event Timeline

vsk created this revision.Jun 27 2018, 2:59 PM
aprantl added inline comments.Jun 27 2018, 3:24 PM
include/llvm/IR/DebugInfoMetadata.h
785

Why not make an enum { Unknown, Signed, Unsigned } ?

test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

wouldn't DW_OP_lit0 DW_OP_not be shorter than DW_OP_constu 0xfffffff ?

aprantl added inline comments.Jun 27 2018, 3:26 PM
test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

To keep the IR simple we could also recognize and normalize this sequence in DwarfExpression.cpp

vsk updated this revision to Diff 153203.Jun 27 2018, 4:42 PM

I've incorporated some offline review feedback from @aprantl. We can assume that a debugger will set the high bits of a value to 0, so we can make zero-extension a no-op.

include/llvm/IR/DebugInfoMetadata.h
785

It makes it a bit too easy to make this mistake:

bool IsUnsigned = T.getSignedness() != Signedness::Signed;
test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

Sounds good, thanks! I'll save the normalizing change for later.

vsk marked an inline comment as done.Jun 27 2018, 5:04 PM

I'm double-checking the DWARF for sign extension. Here is what I see in the stage2 llc binary:

0x00754923:                     DW_TAG_inlined_subroutine
                                  DW_AT_abstract_origin	(0x0000000000751e3a "_ZL28isPotentialBlockedMemCpyPairii")
                                  DW_AT_ranges [... snip]
                                  DW_AT_call_file	("/Users/vsk/src/llvm.org-master/llvm/lib/Target/X86/X86AvoidStoreForwardingBlocks.cpp")
                                  DW_AT_call_line	(546)

0x00754930:                       DW_TAG_formal_parameter
                                    DW_AT_location	(0x00186529
                                       [0x00000000000008a4,  0x00000000000008bb): DW_OP_reg2 RCX, DW_OP_piece 0x2, DW_OP_breg2 RCX+0, DW_OP_constu 0xffff, DW_OP_and, DW_OP_constu 0xf, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_stack_value, DW_OP_piece 0x2
                                       [0x00000000000008fe,  0x0000000000000906): DW_OP_reg2 RCX, DW_OP_piece 0x2, DW_OP_breg2 RCX+0, DW_OP_constu 0xffff, DW_OP_and, DW_OP_constu 0xf, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_stack_value, DW_OP_piece 0x2
                                       [0x000000000000096e,  0x0000000000000971): DW_OP_reg2 RCX, DW_OP_piece 0x2, DW_OP_breg2 RCX+0, DW_OP_constu 0xffff, DW_OP_and, DW_OP_constu 0xf, DW_OP_shr, DW_OP_lit0, DW_OP_not, DW_OP_mul, DW_OP_stack_value, DW_OP_piece 0x2)
                                    DW_AT_abstract_origin	(0x0000000000751e4b "LdOpcode")

This shows that the DIExpression is lowered correctly, I think. The variable "LdOpcode" is an "int" in the source, but some optimization has us using only its lower 16 bits. Unfortunately, lldb doesn't accept this variable description:

~/src/builds/llvm.org-master-RA-stage2 (0) $ lldb -- ./bin/llc /Users/vsk/src/llvm.org-master/llvm/test/CodeGen/X86/avoid-sfb-overlaps.ll -mtriple=x86_64-linux -o -
(lldb) b isPotentialBlockedMemCpyPair
Breakpoint 1: where = llc`(anonymous namespace)::X86AvoidSFBPass::runOnMachineFunction(llvm::MachineFunction&) + 1188 [inlined] isPotentialBlockedMemCpyPair(int, int) at X86AvoidStoreForwardingBlocks.cpp:546, address = 0x0000000100078cb4
...
Process 24842 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100078cb4 llc`(anonymous namespace)::X86AvoidSFBPass::runOnMachineFunction(llvm::MachineFunction&) [inlined] isPotentialBlockedMemCpyPair(LdOpcode=<unavailable>, StOpcode=<unavailable>) at X86AvoidStoreForwardingBlocks.cpp:162 [opt]
   159  }
   160
   161  static bool isPotentialBlockedMemCpyPair(int LdOpcode, int StOpcode) {
-> 162    switch (LdOpcode) {
 
(lldb) p LdOpcode
error: Couldn't materialize: couldn't get the value of variable LdOpcode: DW_OP_piece for offset 2 but top of stack is of size 8
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression

Here is what the DWARF4 standard says:

The DW_OP_piece operation takes a single operand, which is an unsigned LEB128 number. The number describes the size in bytes of the piece of the object referenced by the preceding simple location description. If the piece is located in a register, but does not occupy the entire register, the placement of the piece within that register is defined by the ABI.

With this patch, LLVM emits DW_OP_reg2 RCX, DW_OP_piece 0x2. LLDB should know to interpret this as the low 16 bits of %rcx. We'll have to teach it.

aprantl added inline comments.Jun 27 2018, 5:08 PM
test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

While you're at it:

DW_OP_constu, 15 could be shortened to DW_OP_lit15 :-)

It should also be more efficient to join the two fragments <lower> <upper> DW_OP_lit16 DW_OP_shl DW_OP_or
That saves two DW_OP_LLVM_fragment expressions and one dbg.value.

vsk added inline comments.Jun 27 2018, 5:27 PM
test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

Nice, I've made a note of the DW_OP_litX (X < 32) normalization :).

I don't see how to use a single dbg.value to describe a sign-extended variable. You need to refer to the value operand of the dbg.value twice: once for the initial SHR, and again at the end for the OR. Are you suggesting making this work using DW_OP_dup? So, for this example:

DW_OP_dup /* Duplicate the 16-bit and */, lit15, SHR, lit0, NOT, MUL /* Calculate the high bits */, OR /* Combine the high bits with the duplicated low bits */?

vsk updated this revision to Diff 153226.Jun 27 2018, 5:52 PM
vsk marked an inline comment as done.

In this update I've taken @aprantl's suggestion and used DW_OP_dup to get the sign-extension down to one dbg.value. This is a neat simplification. One benefit is that LLDB actually understands DW_OP_dup :). The failing "LdOpcode" example now works:

    frame #0: 0x0000000100077f24 llc`(anonymous namespace)::X86AvoidSFBPass::runOnMachineFunction(llvm::MachineFunction&) [inlined] isPotentialBlockedMemCpyPair(LdOpcode=1899, StOpcode=1898) at X86AvoidStoreForwardingBlocks.cpp:162 [opt]
   159  }
   160 
   161  static bool isPotentialBlockedMemCpyPair(int LdOpcode, int StOpcode) {
-> 162    switch (LdOpcode) {
   163    case X86::MOVUPSrm:
   164    case X86::MOVAPSrm:
   165      return StOpcode == X86::MOVUPSmr || StOpcode == X86::MOVAPSmr;
Target 0: (llc) stopped.
(lldb) p LdOpcode
(int) $0 = 1899
vsk updated this revision to Diff 153249.Jun 27 2018, 8:52 PM
vsk edited the summary of this revision. (Show Details)
  • Remove some defensive code meant to deal with failed fragment creation. This patch doesn't rely on creating fragments any more.
bjope added inline comments.Jun 28 2018, 7:39 AM
lib/Transforms/Utils/Local.cpp
1810

Not sure that I understand this. You say that if at least one of the types is a pointer, and neither is a non-integral pointer type, then they are compatible.
Don't you need to check that the sizes match here? Or in what sense are they compatible?

1862

Some day I think we need to implement support for the types expression stack from DWARF 5. Currently I think LLVM assumes that the DWARF expression stack is 64 bits wide (or in LLVM IR we maybe assume that we have infinite bits and can do DW_OP_shr etc on any type).

For our target the expression stack is 32 bits. And when legalizing larger types like i64 at ISel we end up trying to split these dbg.intrinsics into several DBG_VALUE instruction, using DW_OP_LLVM_fragment for to identify the different parts. Right now we have no simple way in rewriting a complex DIExpression (using 64-bit expressions) into smaller pieces, so we need to discard the debug info.

1867

I don't think WithStackValue is correct if there could be users from findDbgUsers that is a dbg.declare (because dbg.declare is indirect).
Maybe unlikely to happen here, but I don't know. The same kind of fault exist in salvageDebugInfo, and I've been trying to understand if we ever expect to have non-empty DIExpressions in dbg.declare (except for DW_OP_LLVM_fragment). For example having a DW_OP_stack_value in dbg.declare seems weird, right?

And what happens if we have a dbg.value and the DIExpression already got a non-empty DIExpression, ending without a DW_OP_stack_value. Then it is indirect, and adding DW_OP_stack_value would turn it into being direct. Or should it perhaps be illegal to have an indirect dbg.value like that?

aprantl added inline comments.Jun 28 2018, 8:09 AM
test/Transforms/InstCombine/cast-set-preserve-signed-dbg-val.ll
18

You are correct, DW_OP_dup exists exactly for this kind of use-case.

vsk planned changes to this revision.Jun 28 2018, 1:36 PM
vsk added inline comments.
lib/Transforms/Utils/Local.cpp
1810

I'd like this to be a target-aware way to identify semantically-equivalent bitcasts, inttoptr, and ptrtoint conversions. I'm using isNonIntegralPointerType because, AFAIK, inttoptr/ptrtoint are not lossy on targets with integral pointers. I assumed that the sizes of FromTy and ToTy are equal when at least one of the types is a pointer type, but I'm no longer sure this is a valid assumption. I'll rewrite this to be cleaner and add an explicit check.

1862

I think LLVM does assume a 64-bit width for the expression stack (at least, we can't represent any DW_OP_constu wider than 64-bits in the IR).

The scheme you describe of splitting up the dbg intrinsics for i64 values and rewriting their expressions sounds very difficult. I'm not familiar with any backends, but IIRC legalization occurs early, so a split i32 may be optimized out leaving you with an incomplete (actually, incorrect) variable description? Does DWARF5 alleviate the issue at all? Maybe you could introduce a vendor extension to apply DW_OP_convert to a subset of a source variable's bits?

1867

You're correct to point out that the operand of a dbg.{addr,declare} is the address of a variable, not the variable itself. But I don't see how we'd take the sign-extension code path here when the dbg intrinsic operand is a pointer. Actually, allowing this utility and salvageDI to operate on dbg.addr seems useful, because they can preserve debug info across no-op pointer bitcasts.

I don't think we have any code that would add a DW_OP_stack_value to a dbg.{addr, declare}, but if we did, I don't think it would be weird. IIUC, it would just mean the debugger needs to do pointer arithmetic before loading a variable. That seems just like having a dbg.value for a pointer operand and adding a DW_OP_deref at the end (before DW_OP_stack_value).

I don't understand your last question about dbg.values with non-empty, non-stack-value expressions. It seems reasonable to create these (say, by using DW_OP_deref), and I don't see why they can't also be treated as stack values.

Side note: please don't depend on what type a pointer points to, if
possible (& it should be possible everywhere except where necessary to
match the IR constraints - eg: you may need to check a pointer's pointee
type to ensure a use of a value of that type matches up with its
definition). Eventually pointee types will go away & someone will have to
fix up that code... best to avoid that if we can.

bjope added inline comments.Jun 28 2018, 3:12 PM
lib/Transforms/Utils/Local.cpp
1867

I don't understand your last question about dbg.values with non-empty, non-stack-value expressions. It seems reasonable to create these (say, by using DW_OP_deref), and I don't see why they can't also be treated as stack values.

Let's say that we have

dbg.value  i32 %x, !y, DIExpression(DW_OP_lit1, DW_OP_add)

then it means that the value of !y is given by dereferencing (%x+1).
If %x can be rewritten as a sign-extend of some value %z, and you use

prependOpcodes(OldDII.getExpression(), Ops, DIExpression::WithStackValue);

then, afaik, the DW_OP_stack_value is added at the end of the expression, while the Ops are prepended. So you end up with

dbg.value  i32 %z, !y, DIExpression(Ops, DW_OP_lit1, DW_OP_add, DW_OP_stack_value)

So we will end up saying that !y is equal to (%x + 1) instead of that (%x + 1) points to the value.

This could be an unexpected situation here, never supposed to happen. Then some asserts showing those assumptions might be good. Maybe such asserts could be put inside prependOpcodes.

vsk added inline comments.Jun 28 2018, 4:14 PM
lib/Transforms/Utils/Local.cpp
1867

Now I understand, thanks. The key language from the docs I was missing is:
"DIExpressions also follow this model: A DIExpression that doesn’t have a trailing DW_OP_stack_value will describe an address when combined with a concrete location."

This means the current patch does not handle any dbg.{addr,declare} or non-empty DIExpressions properly. It would mess up the variable's address calculation and/or pre-existing stack_value computations.

I think we should handle this case. I'll try to find a way to normalize expressions as stack values before appending any DW_OPs to them.

vsk updated this revision to Diff 154033.Jul 3 2018, 6:08 PM
vsk retitled this revision from [Local] Teach insertReplacementDbgValues basic integer/pointer conversions to [Local] replaceAllDbgUsesWith: Update debug values before RAUW.
vsk edited the summary of this revision. (Show Details)
  • Introduce and use DIExpression::appendToStack() to create sign extension stack values. It takes care of dereferencing indirect values when needed.
  • Rename insertReplacementDbgValues to replaceAllDbgUsesWith. The main change is that debug intrinsics are now updated in-place. The previous behavior of inserting new debug values could reorder the way variable updates are seen in the debugger.
gramanas added inline comments.
lib/Transforms/Utils/Local.cpp
1844

Could FromBits be equal to ToBits? In that case neither zext nor sext is needed. Could this be if (FromBits <= ToBits)?

aprantl added inline comments.Jul 5 2018, 8:25 AM
include/llvm/IR/DebugInfoMetadata.h
2332

This looks very dangerous. DIExpressions are not distinct so they may be shared by many users. Shouldn't we create a new DIExpression instead of modifying the old one?

vsk added inline comments.Jul 5 2018, 10:37 AM
include/llvm/IR/DebugInfoMetadata.h
2332

Why does this modify an existing expression? I think I'm only using it to create a new expression (in DIExpression::appendToStack, appendToVector(NewOps)).

lib/Transforms/Utils/Local.cpp
1844

No, that case should be handled earlier, in isBitCastSemanticsPreserving().

aprantl added inline comments.Jul 5 2018, 11:00 AM
include/llvm/IR/DebugInfoMetadata.h
2332

My bad. I overlooked that this is a const member and that V is the one being modified. Sorry for the noise!

bjope added a comment.Jul 5 2018, 11:28 AM

FYI: Looks like you have address my earlier comments. So I have no blocking remarks.

The addition of DomPoint / DT was interesting (not sure that I have figured out exactly which scenarios that are handled)
I think that we still have the problem with only checking dominance of the moved/replaced value though.
As was discussed earlier somewhere (a patch in post RA MachineSink) we also might have the problem that there are other dbg.value intrinsics, referring to the same variable (or an overlapping fragment). And the question is if it is correct to reorder such dbg.value intrinsics, or what to do. Matt wrote a TR about it here https://bugs.llvm.org/show_bug.cgi?id=37897

lib/Transforms/Utils/Local.cpp
1844

It would still be more correct to enter the if in case FromBits==ToBits. So I don't think it hurts to use <= here.

vsk updated this revision to Diff 154291.Jul 5 2018, 12:49 PM
vsk marked 16 inline comments as done.
  • Rebase.
  • Address the FromBits == ToBits case.
  • Extend the unit test. There are two new values, %f and %g, which are used to demonstrate what happens in the two use-before-def scenarios: either we're able to salvage the debug user, or we need to delete it.
aprantl added inline comments.Jul 5 2018, 1:17 PM
lib/Transforms/Utils/Local.cpp
1853

There is no way to sign-extend, that's right, but zero-extension as a concept doesn't make much sense in the untyped DWARF (4) stack, since we don't know how large the stack values are anyway. Looks like the code knows this but the comment hasn't been updated yet.

aprantl accepted this revision.Jul 5 2018, 1:17 PM

Thanks, I think this looks good now!

This revision is now accepted and ready to land.Jul 5 2018, 1:17 PM
vsk added a comment.Jul 5 2018, 1:25 PM

FYI: Looks like you have address my earlier comments. So I have no blocking remarks.

Thanks for taking a look, I really appreciate your detailed feedback.

The addition of DomPoint / DT was interesting (not sure that I have figured out exactly which scenarios that are handled)

The two scenarios are:

  1. The replacement value is a Constant, will be inserted at a position which dominates the original instruction, or will be inserted immediately after the original instruction (which is deleted later). In this scenario there's no risk of introducing debug value use-before-def.
  2. The original value dominates the position where the replacement value will be inserted. A good example of this is the 'alloca-cast-debuginfo.ll' test. In an intermediate step, InstCombine inserts a temporary cast of an alloca:
%local = alloca %struct.Foo
%tmpcast = bitcast %local
call dbg.declare(%tmpcast, "local")
<some instructions follow>
%castOfTmpcast = bitcast %tmpcast

InstCombine recognizes that it can remove 'tmpcast' and simplify 'castOfTmpcast' if it promotes the type of the alloca. Since 'tmpcast' has only one use, we try to save its debug users, anticipating that 'tmpcast' will be deleted. The DomPoint / DT machinery gives us a way to prevent this use-before-def for the unlinked 'simplifiedCast':

%local = alloca i64
call dbg.declare(%simplifiedCast, "local")
<some instructions follow>
%simplifiedCast = bitcast %local

We either salvage the debug user or delete it. The unit test has examples of both situations.

I think that we still have the problem with only checking dominance of the moved/replaced value though.
As was discussed earlier somewhere (a patch in post RA MachineSink) we also might have the problem that there are other dbg.value intrinsics, referring to the same variable (or an overlapping fragment). And the question is if it is correct to reorder such dbg.value intrinsics, or what to do. Matt wrote a TR about it here https://bugs.llvm.org/show_bug.cgi?id=37897

Doesn't this patch sidestep issues with reordering variable updates, because the updates are now all done in-place?

bjope added a comment.Jul 6 2018, 12:47 AM
In D48676#1153642, @vsk wrote:

FYI: Looks like you have address my earlier comments. So I have no blocking remarks.

Thanks for taking a look, I really appreciate your detailed feedback.

The addition of DomPoint / DT was interesting (not sure that I have figured out exactly which scenarios that are handled)

The two scenarios are:

  1. The replacement value is a Constant, will be inserted at a position which dominates the original instruction, or will be inserted immediately after the original instruction (which is deleted later). In this scenario there's no risk of introducing debug value use-before-def.
  2. The original value dominates the position where the replacement value will be inserted. A good example of this is the 'alloca-cast-debuginfo.ll' test. In an intermediate step, InstCombine inserts a temporary cast of an alloca:
%local = alloca %struct.Foo
%tmpcast = bitcast %local
call dbg.declare(%tmpcast, "local")
<some instructions follow>
%castOfTmpcast = bitcast %tmpcast

InstCombine recognizes that it can remove 'tmpcast' and simplify 'castOfTmpcast' if it promotes the type of the alloca. Since 'tmpcast' has only one use, we try to save its debug users, anticipating that 'tmpcast' will be deleted. The DomPoint / DT machinery gives us a way to prevent this use-before-def for the unlinked 'simplifiedCast':

%local = alloca i64
call dbg.declare(%simplifiedCast, "local")
<some instructions follow>
%simplifiedCast = bitcast %local

We either salvage the debug user or delete it. The unit test has examples of both situations.

I think that we still have the problem with only checking dominance of the moved/replaced value though.
As was discussed earlier somewhere (a patch in post RA MachineSink) we also might have the problem that there are other dbg.value intrinsics, referring to the same variable (or an overlapping fragment). And the question is if it is correct to reorder such dbg.value intrinsics, or what to do. Matt wrote a TR about it here https://bugs.llvm.org/show_bug.cgi?id=37897

Doesn't this patch sidestep issues with reordering variable updates, because the updates are now all done in-place?

I thought that

DII->moveAfter(&DomPoint);

still could move the DII without really taking other dbg.value intrinsics, potentially describing the same variable, into account.
But I see now that the moveAfter only happens when

DomPointAfterFrom && DII->getNextNonDebugInstruction() == &DomPoint

which means that we do not move the DII arbitrarely (it's just moving to the other side of the DomPoint). So it should be ok.

This revision was automatically updated to reflect the committed changes.