This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstrRef] Don't produce over-sized DW_OP_deref_size expressions for very wide stack spills
Changes PlannedPublic

Authored by jmorse on Jul 28 2023, 7:59 AM.

Details

Summary

Addresses https://github.com/llvm/llvm-project/issues/64093 , where instruction referencing is producing an over-sized DW_OP_deref_size expression.

When we describe the locations of variables that have been spilt to the stack, there are occasions where we need to use DW_OP_deref_size, to indicate "this 32 bit variable is the lower order bits of the 64 bit value at this location". This has a meaningful difference on big endian systems, because the consumer can't necessarily know that the smaller value it's looking for is at a different position in a large spill slot.

It's not completely clear to me that instruction referencing is doing this correctly; it also probably isn't useful on a little endian system like x86. Fix the most obvious problem first: the DWARF spec says that the size of DW_OP_deref_size must not exceed the natural address size of the machine, so don't try to produce a DW_OP_deref_size if the operand would be bigger than the pointer size. Test comes from the github bug. While we're at it, refactor the logic into a utility function.

Diff Detail

Event Timeline

jmorse created this revision.Jul 28 2023, 7:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2023, 7:59 AM
jmorse requested review of this revision.Jul 28 2023, 7:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2023, 7:59 AM

Just a couple of inline nits/questions from me.

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1163

Does a test for this part already exist? I assume so, since this patch is just narrowing the use cases rather than adding the functionally, but wanted to check to be sure.

1175

Can ValueSizeInBits ever be a value that doesn't divide cleanly by 8? e.g. 1 from an i1 type (I haven't looked at how getLocSizeInBits works).

If so then I think we might need to use divideCeil from MathExtras.h here. Otherwise, with the code as is, a 1 would result in DerefSizeInBytes of 0.

1181

left over dev comment?

1300

Same question about bit sizes and divisions here.

aprantl added inline comments.
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1175

Would using a PointerSizeInBits instead help?

dblaikie added inline comments.Jul 31 2023, 9:51 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1177–1178

Could you clarify/rephrase the "output will identify a flawed location" - I don't understand what you mean by that & I'd have expected the failure here to be "absence of a location" rather than possibly incorrect locations ("flawed" I assume means "incorrect"?)

llvm/test/DebugInfo/MIR/InstrRef/livedebugvalues_deref_size_too_big.mir
6

How would a plain DW_OP_deref work here - it'd still only be word-wide, right - so 64 bits, but the low 64 bit of a 128bit floating point value presumably isn't a valid floating point value, right?

jmorse planned changes to this revision.Sep 1 2023, 8:00 AM

Sorry for the month delay, I was wandering up some mountains,

Coming back to this with fresh eyes, I missed something in the reproducer from GH ticket -- I thought it was spilling an entire vector register to the stack, however...

push    rbp
mov     rbp, rsp
sub     rsp, 16
movsd   qword ptr [rbp - 8], xmm0
call    c()@PLT
movsd   xmm0, qword ptr [rbp - 8]       # xmm0 = mem[0],zero
call    b(double)@PLT
add     rsp, 16
pop     rbp
ret

... it's actually performing a partial spill of the lower parts of xmm0, a fact that is encoded in the opcode (movsd not mov) but missed by InstrRefBasedLDV. So there's an extra factor of "where does that information go missing" to build into a different patch.

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1163

Yup, it's llvm/test/DebugInfo/MIR/InstrRef/deref-spills-with-size.mir and tests pretty much every combination of sizes and derefs.

1175

I suppose switching to using bits instead of bytes will remove a category of weirdness that can occur, will implement that,

1177–1178

Improved wording would be "There's a risk this is incorrect on big endian systems", however I'm going to back off this for a moment, see main comment.

llvm/test/DebugInfo/MIR/InstrRef/livedebugvalues_deref_size_too_big.mir
6

Hrrm. It's technically valid for x86 as the 128-bit-and-higher registers are all vectors, but baking that assumption in here guarantees it'll break something else. I'm going to back off from this for a moment (see main comment) and come back with a re-rationalised approach.