This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/CodeGen/AsmPrinter/
-
CodeGen/
-
AsmPrinter/
2
DwarfUnit.cpp
-
test/DebugInfo/X86/
-
DebugInfo/
-
X86/
1/2
packed_bitfields.ll

Differential D82881

[DEBUGINFO]Fix debug info for packed bitfields.
AbandonedPublic

Authored by ABataev on Jun 30 2020, 8:12 AM.

Download Raw Diff

Details

Reviewers

aprantl
probinson
shafik

Summary

For the packed bitfields, the compiler emits not quite correct debug
info. Packed bitfield may use more bytes than the base type of the
bitfield. In this case, DW_AT_byte_size and DW_AT_bit_offset attributes
may have incorrect values.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit TestsFailed

	Time	Test
	9,980 ms	linux > libomp.env::Unknown Unit Message ("")
	1,490 ms	linux > libomp.worksharing/for::Unknown Unit Message ("")

Event Timeline

ABataev created this revision.Jun 30 2020, 8:12 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2020, 8:12 AM

Herald added a subscriber: hiraditya. · View Herald Transcript

Harbormaster failed remote builds in B62334: Diff 274487!Jun 30 2020, 10:18 AM

aprantl added a reviewer: shafik.Jun 30 2020, 2:14 PM

aprantl added a project: debug-info.

I see that you are modifying the DWARF v2 codepath. The output emitted in DWARF v2 for bitfields on x86 is peculiar unintuitive because the spec was written with big-endian machines in mind. Have you verified (e.g., on https://godbolt.org) that GCC does the same thing?

aprantl added inline comments.Jun 30 2020, 2:23 PM

llvm/test/DebugInfo/X86/packed_bitfields.ll
11	Actually, this is weird. The comment mentions DWARF 2, but the RUN line seems to use the default, which is 4 at the moment. We might want to make sure to test both.

In D82881#2123970, @aprantl wrote:

I see that you are modifying the DWARF v2 codepath. The output emitted in DWARF v2 for bitfields on x86 is peculiar unintuitive because the spec was written with big-endian machines in mind. Have you verified (e.g., on https://godbolt.org) that GCC does the same thing?

gcc acts the same way just like llvm does. But seems to me, it is incorrect. In llvm, the bitfield consumes 2 bytes, but debug info says that it consumes only one bytes and emits negative offset. It may break debug info for some targets, like NVPTX.
I checked the new debug info, it works correctly with gdb.

llvm/test/DebugInfo/X86/packed_bitfields.ll
11	I will add the run line for both.

gcc acts the same way just like llvm does. But seems to me, it is incorrect.

I would be very careful about producing different DWARF than GCC. Chances are that most debuggers will expect this format, too. It is of course possible that both clang and GCC exhibit the same bug, but it might be worth double-checking with the dwarf-discuss mailing list if there isn't a misunderstanding of the DWARF spec here.

Could you perhaps quote the DWARF spec to show how the existing code is misinterpreting it? This might help figuring out whether Clang and GCC are behaving wrong here.

In D82881#2126698, @aprantl wrote:

gcc acts the same way just like llvm does. But seems to me, it is incorrect.

I would be very careful about producing different DWARF than GCC. Chances are that most debuggers will expect this format, too. It is of course possible that both clang and GCC exhibit the same bug, but it might be worth double-checking with the dwarf-discuss mailing list if there isn't a misunderstanding of the DWARF spec here.

Could you perhaps quote the DWARF spec to show how the existing code is misinterpreting it? This might help figuring out whether Clang and GCC are behaving wrong here.

Here is a statement from the standard:
A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.

Currently, for example, in the provided test, the actual storage size is 2 bytes because of the packed struct, but debug info states that the storage size is 1 byte.

So with the debug info before this fix what behavior does gdb and lldb have?

In D82881#2128641, @shafik wrote:

So with the debug info before this fix what behavior does gdb and lldb have?

They correctly processed such debug info. There is no problem with handling this debug info in the debuggers, it just does not meet the standard. I assume, the debuggers just read beyond the specified boundaries, but it does not cause any problems since these bytes are still available. But it breaks the compilation for NVPTX target.

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?
Also, what happens to the next bit field or variable right after the bit-filed with the now larger container? Is that affected by the patch?

In D82881#2133511, @aprantl wrote:

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the next code:

struct {
short : 3;
short : 6;
} a;

But the code, produced by the compiler, is also the same. So, I think, the debug info also should be the same.

Also, what happens to the next bit field or variable right after the bit-filed with the now larger container? Is that affected by the patch?

It does not affect the next fields. We point exactly to the bytes, allocated for this particular bitfield only.

Thank you! I was worried that we would be breaking compatibility with consumers for now compelling reason. If the consumers don't mind, then neither do I :-)
I would appreciate if you could follow up with adding both DWARF 2 & 4 variants to packed_bitfields.ll before landing this.

This revision is now accepted and ready to land.Jul 6 2020, 4:36 PM

aprantl added inline comments.Jul 6 2020, 4:37 PM

llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
1546	Do we have something in https://llvm.org/doxygen/MathExtras_8h_source.html to make this more readable?

aprantl added inline comments.Jul 6 2020, 4:39 PM

llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
1546	alignTo() perhaps?

In D82881#2133548, @ABataev wrote:
In D82881#2133511, @aprantl wrote:

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the next code:
struct {
short : 3;
short : 6;
} a;

Similar, but seems different in a critical way - in that code, the type of the field is short, which has size 2. Which matches the size of the field.

I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?

That said, I don't have great suggestions for how the DWARF should communicate this packed situation where a bitfield crosses a byte boundary either.

But the code, produced by the compiler, is also the same. So, I think, the debug info also should be the same.

Also, what happens to the next bit field or variable right after the bit-filed with the now larger container? Is that affected by the patch?

It does not affect the next fields. We point exactly to the bytes, allocated for this particular bitfield only.

In D82881#2134913, @dblaikie wrote:
In D82881#2133548, @ABataev wrote:
In D82881#2133511, @aprantl wrote:

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the next code:
struct {
short : 3;
short : 6;
} a;
Similar, but seems different in a critical way - in that code, the type of the field is short, which has size 2. Which matches the size of the field.

I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?

The standard clearly says:
A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.
In our example, the storage size is the same, just like for short. The standard does not say anything about the size of the base type. The real size of the bitfield is passed in DW_AT_bit_size attribute (in bits).

That said, I don't have great suggestions for how the DWARF should communicate this packed situation where a bitfield crosses a byte boundary either.

But the code, produced by the compiler, is also the same. So, I think, the debug info also should be the same.

Also, what happens to the next bit field or variable right after the bit-filed with the now larger container? Is that affected by the patch?

It does not affect the next fields. We point exactly to the bytes, allocated for this particular bitfield only.

Rebase + used align... functions.

In D82881#2135641, @ABataev wrote:
In D82881#2134913, @dblaikie wrote:
In D82881#2133548, @ABataev wrote:
In D82881#2133511, @aprantl wrote:

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the next code:
struct {
short : 3;
short : 6;
} a;
Similar, but seems different in a critical way - in that code, the type of the field is short, which has size 2. Which matches the size of the field.

I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?
The standard clearly says:
A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.
In our example, the storage size is the same, just like for short.

The storage size is the same as what?

It looks like/my understanding is this patch produces a field of type 'char' with size 2 bytes - which seems surprisingly inconsistent, at least. If there are other pre-existing examples of fields having larger sizes than their types, that might be useful to draw analogy and confidence from.

In D82881#2137938, @dblaikie wrote:
In D82881#2135641, @ABataev wrote:
In D82881#2134913, @dblaikie wrote:
In D82881#2133548, @ABataev wrote:
In D82881#2133511, @aprantl wrote:

And conversely, with this patch applied, do GDB and LLDB still produce the expected result?

GDB works correctly. Did not check with lldb, but it also should work. The result is similar to the debug info, produced for the next code:
struct {
short : 3;
short : 6;
} a;
Similar, but seems different in a critical way - in that code, the type of the field is short, which has size 2. Which matches the size of the field.

I think it would be pretty surprising to handle DWARF where the size of a field is different from the size of the type of that field?
The standard clearly says:
A base type entry has a DW_AT_byte_size attribute, whose value is a constant, describing the size in bytes of the storage unit used to represent an object of the given type.
In our example, the storage size is the same, just like for short.
The storage size is the same as what?

It looks like/my understanding is this patch produces a field of type 'char' with size 2 bytes - which seems surprisingly inconsistent, at least. If there are other pre-existing examples of fields having larger sizes than their types, that might be useful to draw analogy and confidence from.

Found a discussion in the dwarf-discuss mailing list http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2011-September/003881.html. Looks like signed offsets are allowed in DWARF2.
The bug I'm trying to fix is the incompatibility with NVPTX ptxas compiler. It does not allow signed integers in debug sections. Would it be good to emit bit_offset as DW_FORM_udata for NVPTX target to fix incompatibility? Checked that it works with ptxas.

Yes, my BLISS language allows structures to have "negative" offsets. You end up passing around a pointer to the "middle" and have some fields in either direction (often private in one direction, public the other). It is also used as crude form of polymorphism back in the day before the word existed.

And my BLISS/Pascal let you provide explicit field offsets which then end up with alignment holes scattered around. Along with alignment attributes on fields as well as on the overall structure itself.

For Pascal's PACKED RECORDs, subrange values get packed into a small set of bit positions and, unlike C, their alignment isn't derived from the underlying base type.

PACKED RECORD

F1 : 1..10;
F2 : 'A'..'L';
F3 : (RED, WHITE, BLUE);

END;

Ends up with a size of 2 bytes for the overall type (and any variables of that type), F1 is 4 bits big at offset 0 bits from the start, F2 is 7 bits big at offset 4 bits from the start, F3 is 2 bits big at offset 11 bits from the start. Of course, removing PACKED ends up with a much more pleasant layout but I have a legacy PACKED that started back in the VAX days when bytes were precious and I need to describe it. (The first VAX shipped with a base of 256Kbytes [yes 'K'] but most customers splurged for an entire 1MB - the system max'd out at 8MB)

In D82881#2138942, @ABataev wrote:

The bug I'm trying to fix is the incompatibility with NVPTX ptxas compiler. It does not allow signed integers in debug sections. Would it be good to emit bit_offset as DW_FORM_udata for NVPTX target to fix incompatibility? Checked that it works with ptxas.

Looks like we'll need to teach MCAsmStreamer to handle 'unsigned-only' data directives. Right now it always prints a signed value. Making dwarf use unsigned values is just one of the ways to trigger the issue.

I'll get it fixed.

In D82881#2139589, @tra wrote:

In D82881#2138942, @ABataev wrote:

The bug I'm trying to fix is the incompatibility with NVPTX ptxas compiler. It does not allow signed integers in debug sections. Would it be good to emit bit_offset as DW_FORM_udata for NVPTX target to fix incompatibility? Checked that it works with ptxas.

Looks like we'll need to teach MCAsmStreamer to handle 'unsigned-only' data directives. Right now it always prints a signed value. Making dwarf use unsigned values is just one of the ways to trigger the issue.

I'll get it fixed.

I have a patch for it already. It is quite simple, just need to set the Form to DW_FORM_udata and everything work. I can update this patch, if you want to try the fix.

In D82881#2139596, @ABataev wrote:

I have a patch for it already. It is quite simple, just need to set the Form to DW_FORM_udata and everything work. I can update this patch, if you want to try the fix.

If negative values are allowed by dwarf, then dwarf emitter works as intended. No need to fix it if it's not broken. :-)
I appreciate the effort you've put into investigating the problem.

I've sent D83423 to make sure NVPTX can handle negative values.

ABataev abandoned this revision.Sep 16 2020, 5:11 AM

Revision Contents

Path

Size

llvm/

lib/

CodeGen/

AsmPrinter/

DwarfUnit.cpp

5 lines

test/

DebugInfo/

X86/

packed_bitfields.ll

4 lines

Diff 274487

llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp

Show First 20 Lines • Show All 1,535 Lines • ▼ Show 20 Lines	DIE &DwarfUnit::constructMemberDIE(DIE &Buffer, const DIDerivedType *DT) {
} else {		} else {
uint64_t Size = DT->getSizeInBits();		uint64_t Size = DT->getSizeInBits();
uint64_t FieldSize = DD->getBaseTypeSize(DT);		uint64_t FieldSize = DD->getBaseTypeSize(DT);
uint32_t AlignInBytes = DT->getAlignInBytes();		uint32_t AlignInBytes = DT->getAlignInBytes();
uint64_t OffsetInBytes;		uint64_t OffsetInBytes;

bool IsBitfield = FieldSize && Size != FieldSize;		bool IsBitfield = FieldSize && Size != FieldSize;
if (IsBitfield) {		if (IsBitfield) {
		uint64_t Offset = DT->getOffsetInBits();
// Handle bitfield, assume bytes are 8 bits.		// Handle bitfield, assume bytes are 8 bits.
		uint64_t StorageSize = ((Offset + Size + 7) / 8 - Offset / 8) * 8;
		aprantlUnsubmitted Not Done Reply Inline Actions Do we have something in https://llvm.org/doxygen/MathExtras_8h_source.html to make this more readable? aprantl: Do we have something in https://llvm.org/doxygen/MathExtras_8h_source.html to make this more…
		aprantlUnsubmitted Not Done Reply Inline Actions alignTo() perhaps? aprantl: alignTo() perhaps?
		if (StorageSize > FieldSize)
		FieldSize = StorageSize;
if (DD->useDWARF2Bitfields())		if (DD->useDWARF2Bitfields())
addUInt(MemberDie, dwarf::DW_AT_byte_size, None, FieldSize/8);		addUInt(MemberDie, dwarf::DW_AT_byte_size, None, FieldSize/8);
addUInt(MemberDie, dwarf::DW_AT_bit_size, None, Size);		addUInt(MemberDie, dwarf::DW_AT_bit_size, None, Size);

uint64_t Offset = DT->getOffsetInBits();
// We can't use DT->getAlignInBits() here: AlignInBits for member type		// We can't use DT->getAlignInBits() here: AlignInBits for member type
// is non-zero if and only if alignment was forced (e.g. _Alignas()),		// is non-zero if and only if alignment was forced (e.g. _Alignas()),
// which can't be done with bitfields. Thus we use FieldSize here.		// which can't be done with bitfields. Thus we use FieldSize here.
uint32_t AlignInBits = FieldSize;		uint32_t AlignInBits = FieldSize;
uint32_t AlignMask = ~(AlignInBits - 1);		uint32_t AlignMask = ~(AlignInBits - 1);
// The bits from the start of the storage unit to the start of the field.		// The bits from the start of the storage unit to the start of the field.
uint64_t StartBitOffset = Offset - (Offset & AlignMask);		uint64_t StartBitOffset = Offset - (Offset & AlignMask);
// The byte offset of the field's aligned storage unit inside the struct.		// The byte offset of the field's aligned storage unit inside the struct.
▲ Show 20 Lines • Show All 227 Lines • Show Last 20 Lines

llvm/test/DebugInfo/X86/packed_bitfields.ll

	; RUN: llc -mtriple x86_64-apple-macosx -O0 -filetype=obj -o %t_le.o %s			; RUN: llc -mtriple x86_64-apple-macosx -O0 -filetype=obj -o %t_le.o %s
	; RUN: llvm-dwarfdump -v -debug-info %t_le.o \| FileCheck %s			; RUN: llvm-dwarfdump -v -debug-info %t_le.o \| FileCheck %s

	; Produced at -O0 from:			; Produced at -O0 from:
	; struct {			; struct {
	; char : 3;			; char : 3;
	; char a : 6;			; char a : 6;
	; } __attribute__((__packed__)) b;			; } __attribute__((__packed__)) b;

	; Note that DWARF 2 counts bit offsets backwards from the high end of			; Note that DWARF 2 counts bit offsets backwards from the high end of
	; the storage unit to the high end of the bit field.			; the storage unit to the high end of the bit field.
				aprantlUnsubmitted Not Done Reply Inline Actions Actually, this is weird. The comment mentions DWARF 2, but the RUN line seems to use the default, which is 4 at the moment. We might want to make sure to test both. aprantl: Actually, this is weird. The comment mentions DWARF 2, but the RUN line seems to use the…
				ABataevAuthorUnsubmitted Done Reply Inline Actions I will add the run line for both. ABataev: I will add the run line for both.

	; CHECK: DW_TAG_member			; CHECK: DW_TAG_member
	; CHECK-NEXT: DW_AT_name{{.*}}"a"			; CHECK-NEXT: DW_AT_name{{.*}}"a"
	; CHECK-NOT: DW_TAG_member			; CHECK-NOT: DW_TAG_member
	; CHECK: DW_AT_byte_size {{.*}} (0x01)			; CHECK: DW_AT_byte_size {{.*}} (0x02)
	; CHECK-NEXT: DW_AT_bit_size {{.*}} (0x06)			; CHECK-NEXT: DW_AT_bit_size {{.*}} (0x06)
	; CHECK-NEXT: DW_AT_bit_offset {{.*}} (0xffffffffffffffff)			; CHECK-NEXT: DW_AT_bit_offset {{.*}} (0x07)
	; CHECK-NEXT: DW_AT_data_member_location {{.*}} (DW_OP_plus_uconst 0x0)			; CHECK-NEXT: DW_AT_data_member_location {{.*}} (DW_OP_plus_uconst 0x0)

	; ModuleID = 'repro.c'			; ModuleID = 'repro.c'
	source_filename = "repro.c"			source_filename = "repro.c"
	target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-apple-macosx10.4.0"			target triple = "x86_64-apple-macosx10.4.0"

	%struct.anon = type { i16 }			%struct.anon = type { i16 }
	Show All 22 Lines