
[DEBUGINFO] Fix debug info for packed bitfields.
Accepted · Public

Authored by ABataev on Jun 30 2020, 8:12 AM.

Details

Summary

For packed bitfields, the compiler emits not-quite-correct debug info: a
packed bitfield may occupy more bytes than its base type, in which case
the DW_AT_byte_size and DW_AT_bit_offset attributes may have incorrect
values.

Diff Detail

Event Timeline

ABataev created this revision.Jun 30 2020, 8:12 AM
Herald added a project: Restricted Project.Jun 30 2020, 8:12 AM
Herald added a subscriber: hiraditya.
aprantl added a project: debug-info.

I see that you are modifying the DWARF v2 codepath. The output emitted in DWARF v2 for bitfields on x86 is peculiarly unintuitive because the spec was written with big-endian machines in mind. Have you verified (e.g., on https://godbolt.org) that GCC does the same thing?

aprantl added inline comments.Jun 30 2020, 2:23 PM
llvm/test/DebugInfo/X86/packed_bitfields.ll:13

Actually, this is weird. The comment mentions DWARF 2, but the RUN line seems to use the default, which is 4 at the moment. We might want to make sure to test both.
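A sketch of what testing both versions could look like (assumption: the exact RUN lines in the final test may differ, and `-dwarf-version` is the AsmPrinter override flag):

```llvm
; RUN: llc -dwarf-version=2 -filetype=obj -o - %s | llvm-dwarfdump - | FileCheck %s --check-prefixes=CHECK,DWARF2
; RUN: llc -dwarf-version=4 -filetype=obj -o - %s | llvm-dwarfdump - | FileCheck %s --check-prefixes=CHECK,DWARF4
```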

ABataev marked an inline comment as done.Jul 1 2020, 5:48 AM

> I see that you are modifying the DWARF v2 codepath. The output emitted in DWARF v2 for bitfields on x86 is peculiarly unintuitive because the spec was written with big-endian machines in mind. Have you verified (e.g., on https://godbolt.org) that GCC does the same thing?

GCC acts the same way LLVM does, but it seems incorrect to me. In LLVM, the bitfield consumes 2 bytes, but the debug info says it consumes only one byte and emits a negative offset. That may break debug info for some targets, like NVPTX.
I checked the new debug info; it works correctly with gdb.

llvm/test/DebugInfo/X86/packed_bitfields.ll:13

I will add RUN lines for both.

> GCC acts the same way LLVM does, but it seems incorrect to me.

I would be very careful about producing different DWARF than GCC. Chances are that most debuggers will expect this format, too. It is of course possible that both clang and GCC exhibit the same bug, but it might be worth double-checking with the dwarf-discuss mailing list whether there isn't a misunderstanding of the DWARF spec here.

Could you perhaps quote the DWARF spec to show how the existing code is misinterpreting it? This might help figure out whether Clang and GCC are behaving wrongly here.

> I would be very careful about producing different DWARF than GCC. Chances are that most debuggers will expect this format, too.
>
> Could you perhaps quote the DWARF spec to show how the existing code is misinterpreting it?

Here is a statement from the standard:
> A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.

For example, in the provided test, the actual storage size is 2 bytes because of the packed struct, but the debug info currently states that the storage size is 1 byte.

So with the debug info before this fix, what behavior do gdb and lldb have?

> So with the debug info before this fix, what behavior do gdb and lldb have?

They processed such debug info correctly. The debuggers have no problem handling it; it just does not meet the standard. I assume the debuggers simply read beyond the specified boundaries, but that causes no problems since those bytes are still available. It does, however, break compilation for the NVPTX target.

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?
Also, what happens to the next bitfield or variable right after the bitfield with the now-larger container? Is that affected by the patch?

> And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. I did not check with lldb, but it should also work. The result is similar to the debug info produced for the following code:

struct {
  short : 3;
  short : 6;
} a;

But the code produced by the compiler is the same, so I think the debug info should also be the same.

> Also, what happens to the next bitfield or variable right after the bitfield with the now-larger container? Is that affected by the patch?

It does not affect the following fields. We point exactly to the bytes allocated for this particular bitfield.

aprantl accepted this revision.Jul 6 2020, 4:36 PM

Thank you! I was worried that we would be breaking compatibility with consumers for no compelling reason. If the consumers don't mind, then neither do I :-)
I would appreciate it if you could follow up by adding both DWARF 2 & 4 variants to packed_bitfields.ll before landing this.

This revision is now accepted and ready to land.Jul 6 2020, 4:36 PM
aprantl added inline comments.Jul 6 2020, 4:37 PM
llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1546

Do we have something in https://llvm.org/doxygen/MathExtras_8h_source.html to make this more readable?

aprantl added inline comments.Jul 6 2020, 4:39 PM
llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1546

alignTo() perhaps?

>> And conversely, with this patch applied, do GDB and LLDB still produce the expected result?
>
> GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the following code:
>
>   struct {
>     short : 3;
>     short : 6;
>   } a;

Similar, but it seems different in a critical way: in that code, the type of the field is short, which has size 2 - which matches the size of the field.

I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?

That said, I don't have great suggestions for how the DWARF should communicate this packed situation where a bitfield crosses a byte boundary either.


> Similar, but seems different in a critical way - in that code, the type of the field is short, which has size 2. Which matches the size of the field.
>
> I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?

The standard clearly says:
> A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.
In our example, the storage size is the same as for short. The standard does not say anything about the size of the base type. The real size of the bitfield is passed in the DW_AT_bit_size attribute (in bits).


ABataev updated this revision to Diff 276057.Jul 7 2020, 7:23 AM

Rebase + used align... functions.

> The standard clearly says:
>
> > A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.
>
> In our example, the storage size is the same, just like for short.

The storage size is the same as what?

It looks like (my understanding, anyway) this patch produces a field of type 'char' with size 2 bytes, which seems surprisingly inconsistent, at least. If there are other pre-existing examples of fields having larger sizes than their types, that might be useful to draw analogy and confidence from.

> The storage size is the same as what?
>
> It looks like/my understanding is this patch produces a field of type 'char' with size 2 bytes - which seems surprisingly inconsistent, at least.

I found a discussion on the dwarf-discuss mailing list: http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2011-September/003881.html. It looks like signed offsets are allowed in DWARF 2.
The bug I'm trying to fix is an incompatibility with the NVPTX ptxas compiler, which does not allow signed integers in debug sections. Would it be good to emit bit_offset as DW_FORM_udata for the NVPTX target to fix the incompatibility? I checked that this works with ptxas.

Yes, my BLISS language allows structures to have "negative" offsets. You end up passing around a pointer to the "middle" and have some fields in either direction (often private in one direction, public in the other). It was also used as a crude form of polymorphism back in the day, before the word existed.

And my BLISS and Pascal let you provide explicit field offsets, which end up with alignment holes scattered around, along with alignment attributes on fields as well as on the overall structure itself.

For Pascal's PACKED RECORDs, subrange values get packed into a small set of bit positions and, unlike C, their alignment isn't derived from the underlying base type.

PACKED RECORD
  F1 : 1..10;
  F2 : 'A'..'L';
  F3 : (RED, WHITE, BLUE);
END;

Ends up with a size of 2 bytes for the overall type (and any variables of that type): F1 is 4 bits wide at bit offset 0, F2 is 7 bits wide at bit offset 4, and F3 is 2 bits wide at bit offset 11. Of course, removing PACKED ends up with a much more pleasant layout, but I have a legacy PACKED record that started back in the VAX days, when bytes were precious, and I need to describe it. (The first VAX shipped with a base of 256KB [yes, 'K'], but most customers splurged for an entire 1MB; the system maxed out at 8MB.)

tra added a subscriber: tra.Jul 8 2020, 10:36 AM

> The bug I'm trying to fix is the incompatibility with NVPTX ptxas compiler. It does not allow signed integers in debug sections. Would it be good to emit bit_offset as DW_FORM_udata for NVPTX target to fix incompatibility? Checked that it works with ptxas.

Looks like we'll need to teach MCAsmStreamer to handle 'unsigned-only' data directives. Right now it always prints a signed value. Making dwarf use unsigned values is just one of the ways to trigger the issue.

I'll get it fixed.

In D82881#2139589, @tra wrote:

> Looks like we'll need to teach MCAsmStreamer to handle 'unsigned-only' data directives. Right now it always prints a signed value. Making dwarf use unsigned values is just one of the ways to trigger the issue.
>
> I'll get it fixed.

I already have a patch for it. It is quite simple: just set the Form to DW_FORM_udata and everything works. I can update this patch if you want to try the fix.

tra added a comment.Jul 8 2020, 10:53 AM

> I already have a patch for it. It is quite simple: just set the Form to DW_FORM_udata and everything works. I can update this patch if you want to try the fix.

If negative values are allowed by DWARF, then the DWARF emitter works as intended. No need to fix it if it's not broken. :-)
I appreciate the effort you've put into investigating the problem.

tra added a comment.Jul 8 2020, 1:47 PM

I've sent D83423 to make sure NVPTX can handle negative values.