This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Prologue inserter need to insert DW_OP_deref_size
ClosedPublic

Authored by markus on Mar 22 2019, 3:32 AM.

Details

Summary

The PrologEpilogInserter need to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it.

The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address).

Diff Detail

Repository
rL LLVM

Event Timeline

markus created this revision.Mar 22 2019, 3:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2019, 3:33 AM
aprantl added inline comments.Mar 22 2019, 8:45 AM
lib/CodeGen/PrologEpilogInserter.cpp
1156 ↗(On Diff #191841)

This would make more sense as a member function of DIExpression?

ormris removed a subscriber: ormris.Mar 22 2019, 8:58 AM
markus marked an inline comment as done.Mar 25 2019, 7:18 AM

I am a bit worried about the other DW_OP_deref used in the compiler and how they behave wrt big-endian targets when loading something that is smaller than the size of an address on the target machine. I do not have any concrete example of where it goes wrong but I have not actively looked for one either.

Do we all agree that using plain DW_OP_deref is potentially problematic?

lib/CodeGen/PrologEpilogInserter.cpp
1156 ↗(On Diff #191841)

Agree. Will move it.

markus marked an inline comment as not done.Mar 25 2019, 7:19 AM
probinson added inline comments.
test/CodeGen/X86/prologepilog_deref_size.mir
58 ↗(On Diff #191841)

Not sure I follow the purpose of these conversions. DW_OP_deref_size is specified to zero-extend the fetched bytes to the size of an address. The convert operators look like they are just doing the same thing over again?

Again (and sorry for possibly over emphasizing this) the purpose of this patch is that a DW_OP_deref needs to be inserted before a memory location description can be prepended to an already existing implicit location description as otherwise the latter expression would act on the former address and not the actual value in memory.

The reasoning behind the DW_OP_deref_size is probably best explained with an example. Consider a machine with 32 bit addresses and assume that our location description referes to a 8 bit value (so that is really the size of the value we want to load).

  • If our machine is little endian then a DW_OP_deref from address A will load (from most significant to least significant byte) {A[3],A[2],A[1],A[0]} onto the stack. If we truncate this 32 bit value to 8 bits then we are left with {A[0]} which exactly corresponds to a 8 bit value at address A, so no problem here.
  • On the other hand if our machine is big endian then a DW_OP_deref from address A will load (from most significant to least significant byte) {A[0],A[1],A[2],A[3]} onto the stack. If we truncate this 32 bit value to 8 bits then we are left with {A[3]} which is clearly not the 8 bit value at address A.

So for big endian we need to load the right size from the start.

test/CodeGen/X86/prologepilog_deref_size.mir
58 ↗(On Diff #191841)

Yes, perhaps this example is a bit ill formed in that sense and would be better of performing sext instead of zext so that the DW_OP_convert ops actually did something useful. I will change that.

Thanks for the detailed explanation. The deref_size is clearly necessary in this example.
Most DWARF expressions are location expressions, not value expressions, so most deref operations are actually fetching an address. As we start doing more with salvaging values from optimizations, we might find ourselves doing more value expressions, so it's worth keeping an eye out for those. The call-site parameter stuff might be a place to think about in those terms.

@aprantl @dblaikie it bugs me that MIParser and LLParser each have their own DIExpression parser. As this patch shows, they can get out of sync. Is it worth trying to do something about that?

@aprantl @dblaikie it bugs me that MIParser and LLParser each have their own DIExpression parser. As this patch shows, they can get out of sync. Is it worth trying to do something about that?

We discussed this in the patch that introduced MIParser support for DIExpressions and concluded that it would be difficult to factor this out. I'm open to suggestions though.

markus updated this revision to Diff 192580.Mar 28 2019, 2:01 AM

Moved isImplicit to DIExpression class and updated test to do sext instead of zext.

Aside from one style nit, I'm okay with this if Adrian is.

lib/CodeGen/PrologEpilogInserter.cpp
1148 ↗(On Diff #192580)

This is the only use of MFI? I don't see any others...
If so, don't bother with a local variable MFI.

I know it's really picky... it's a fuzzy line, at what point do chains of calls like MF.getFrameInfo().getObjectSize(FrameIdx) become too unwieldy, but I think in this case it isn't.

markus updated this revision to Diff 192616.Mar 28 2019, 5:41 AM

Rebased and fixed style nit.

markus marked 5 inline comments as done.Mar 28 2019, 5:43 AM
markus added a comment.Apr 5 2019, 1:07 AM

Ping @aprantl, just a friendly reminder.

@aprantl @dblaikie it bugs me that MIParser and LLParser each have their own DIExpression parser. As this patch shows, they can get out of sync. Is it worth trying to do something about that?

We discussed this in the patch that introduced MIParser support for DIExpressions and concluded that it would be difficult to factor this out. I'm open to suggestions though.

include/llvm/IR/DebugInfoMetadata.h
2514 ↗(On Diff #192616)

Wouldn't it be much faster to only check the last 2 operands? If it is a stack_value, we're good, if it is a DW_OP_LLVM_fragment, we need to check the one before it.

markus updated this revision to Diff 194312.Apr 9 2019, 6:43 AM

Rebased and updated isImplicit() to only check last elements.

aprantl accepted this revision.Apr 9 2019, 10:00 AM
aprantl added inline comments.
include/llvm/IR/DebugInfoMetadata.h
2512 ↗(On Diff #194312)

Thanks! I think the function is too long for an inline definition now, so let's move it into the .cpp file.

2517 ↗(On Diff #194312)

the else is redundant

Is this better? I'm not actually sure...

if (N == 0)
  return false;
switch (getElement(N-1)) {
  case dwarf::DW_OP_stack_value: return true;
  case dwarf::DW_OP_LLVM_fragment:
     return N > 1 && getElement(N-2) == dwarf::DW_OP_stack_value;
  default: return true;
}
This revision is now accepted and ready to land.Apr 9 2019, 10:00 AM
markus updated this revision to Diff 194640.Apr 11 2019, 12:19 AM

Moved isImplicit() to .cpp file.

markus marked an inline comment as done.Apr 11 2019, 12:20 AM
markus added inline comments.
lib/IR/DebugInfoMetadata.cpp
888 ↗(On Diff #194640)

Like this maybe?

aprantl accepted this revision.Apr 11 2019, 8:20 AM

Great, thanks!

This revision was automatically updated to reflect the committed changes.
markus reopened this revision.Apr 13 2019, 12:54 AM

Has been reverted in

commit 0366e3e18142466e4dd99d3109d8facd093cedc8 (llvm.org/master, master)
Author: Hans Wennborg <hans@hanshq.net>
Date:   Fri Apr 12 12:54:52 2019 +0000

    Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter."
    
    It causes clang to crash while building Chromium. See https://crbug.com/952230
    for reproducer.

Will analyze the problem and put up a new revision for review shortly.

This revision is now accepted and ready to land.Apr 13 2019, 12:54 AM
markus updated this revision to Diff 195363.Apr 16 2019, 6:55 AM

The Chromium compilation segfault comes down to two issues:

  1. DW_OP_deref_size is supposed to take a single byte argument (and not a ULEB128). If a size argument larger than 127 was used ULEB128 does not encode the same way as data1 so the extractor in DwarfDebug::emitDebugLocEntry looses track and that results in the crash.
  1. When compiling the provoking file much larger than expected objects show up in PEI::replaceFrameIndices (I was expecting that it was just primitive parameter values passed on the stack) but instead we got
(gdb) p MI.getDebugVariable()->isParameter()
$20 = true
(gdb) p MI.dump()
DBG_VALUE $rbp, $noreg, !"this", !DIExpression(DW_OP_plus_uconst, 56, DW_OP_stack_value), debug-location !20746; ../../third_party/SPIRV-Tools/src/test/unit_spirv.h:0 @[ ../../third_party/SPIRV-Tools/src/test/text_to_binary.annotation_test.cpp:64:1 ] line no:0
(gdb) p Size
$23 = 2128

This patch addresses 1) but I am a bit confused about what to do with 2) so I will need to investigate that further.

The Chromium compilation segfault comes down to two issues:

  1. DW_OP_deref_size is supposed to take a single byte argument (and not a ULEB128). If a size argument larger than 127 was used ULEB128 does not encode the same way as data1 so the extractor in DwarfDebug::emitDebugLocEntry looses track and that results in the crash.

DWARF says that's illegal. The deref is fetching a value to push onto the expression stack. "This operand is a 1-byte unsigned integral constant whose value may not be larger than the size of the generic type."

It sounds like you're trying to do this with aggregate types, and that won't work. Only base types (for DWARF 5) or address-like types (for DWARF <= 4) should end up on the stack.

! In D59687#1468571, @probinson wrote:
It sounds like you're trying to do this with aggregate types, and that won't work. Only base types (for DWARF 5) or address-like types (for DWARF <= 4) should end up on the stack.

Yes, but it is not intentional.

I guess that I did not consider that we could enter this place with a aggregate type object and an expression like !DIExpression(DW_OP_plus_uconst, 56, DW_OP_stack_value) ending with a DW_OP_stack_value.
I wonder what kind of source code that corresponds to, something that takes the address 56 bytes into a aggregate? So maybe something like this.

struct S {
  char bar[1024];
}
void foo(S s) {
  char *p = &s.bar[56];
  ...
}

Having a bit of hard time minimizing the Chromium failure to something useful.

Having a bit of hard time minimizing the Chromium failure to something useful.

You may already know this, but just in case: I'm using results delta (http://delta.tigris.org) for these kind of jobs with great results. There is also creduce, but in my personal experience it is slower than multidelta and more easily tripped by unusual inputs.

markus added a subscriber: hans.Apr 25 2019, 5:16 AM

You may already know this, but just in case: I'm using results delta (http://delta.tigris.org) for these kind of jobs with great results. There is also creduce, but in my personal experience it is slower than multidelta and more easily tripped by unusual inputs.

I did try but I had no luck with multidelta but anyway I realized much later that @hans had already been helpful and minimized the reproducer in https://crbug.com/952230 so I managed to do some further analysis using that.

markus updated this revision to Diff 196615.Apr 25 2019, 5:29 AM

After studying the reproducer and reading some in https://llvm.org/docs/SourceLevelDebugging.html I realize that maybe isImplicit() is not a sufficient condition and that we also need to include the second operand of the DBG_VALUE in the test.

Cant say that I am very sure about this though but at least it does solve the reported problem..

Any feedback is welcome!

bjope added inline comments.Apr 25 2019, 6:08 AM
lib/CodeGen/PrologEpilogInserter.cpp
1194 ↗(On Diff #196615)

MI.getOperand(1).isImm()

aprantl added inline comments.Apr 25 2019, 8:30 AM
lib/CodeGen/PrologEpilogInserter.cpp
1194 ↗(On Diff #196615)

Perhaps MachineInstr::isIndirectDebugValue() ?

markus updated this revision to Diff 196809.Apr 26 2019, 12:47 AM

Use MI.isIndirectDebugValue() and improved the test a bit.

Can (or should) I proceed to push this again? I am a bit confused since the accepted status remains even though I upload new patches.

Just land the patch and watch the bots :-)

There's no hard and fast rules about this, but usually, tiny adjustments to fix issues found by bots can be landed without requiring a separate review.

This revision was automatically updated to reflect the committed changes.