This is an archive of the discontinued LLVM Phabricator instance.

[LLVM] Use DWARFv4 bitfields when tuning for GDB
Closed, Public

Authored by DavidSpickett on Oct 10 2022, 7:36 AM.

Details

Summary

GDB implemented support for DW_AT_data_bit_offset in https://sourceware.org/bugzilla/show_bug.cgi?id=12616,
so it has been present since GDB 8.0.

GCC started emitting it in GCC 11.
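
Since the diff itself is not reproduced in this archive, the change is roughly of the following shape. This is only an illustrative sketch in C: the function and enum names below are made up for this note and are not the actual identifiers in llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp.

#include <stdbool.h>

typedef enum { TUNE_GDB, TUNE_LLDB, TUNE_SCE } DebuggerTuning; /* hypothetical names */

/* Before this patch: tuning for GDB forced the old (DWARF v2 style) bitfield
   attributes even when emitting DWARF v4 or later. */
static bool useDWARF2BitfieldsBefore(int DwarfVersion, DebuggerTuning Tuning) {
  return DwarfVersion < 4 || Tuning == TUNE_GDB;
}

/* After this patch: only pre-v4 DWARF falls back to the old encoding, since
   GDB >= 8.0 understands DW_AT_data_bit_offset. */
static bool useDWARF2BitfieldsAfter(int DwarfVersion, DebuggerTuning Tuning) {
  (void)Tuning; /* the debugger tuning no longer affects this decision */
  return DwarfVersion < 4;
}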

Diff Detail

Event Timeline

DavidSpickett created this revision. Oct 10 2022, 7:36 AM
Herald added a project: Restricted Project. · View Herald Transcript Oct 10 2022, 7:36 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
DavidSpickett requested review of this revision. Oct 10 2022, 7:36 AM
Herald added a project: Restricted Project. · View Herald Transcript Oct 10 2022, 7:36 AM

Remove commented code.

https://bugs.llvm.org/show_bug.cgi?id=27758

Adrian Prantl 2016-05-17 15:13:46 PDT
<...>
Let's hope we get GDB support soon!

Turns out it didn't take that long, we just didn't notice.

Does this warrant a release note, do you think?

Does this warrant a release note, do you think?

It's essentially changing the oldest supported gdb, so probably yes.

LGTM but I'd prefer to have a second opinion about the version dependency

What's the difference in representation? Is it more expressive? Does the old version use a GNU extension or anything like that? (a little diff snippet of the change in DWARF would be handy to review/assess the impact)

LGTM but I'd prefer to have a second opinion about the version dependency

I'll find the exact release where support first appeared.

What's the difference in representation?

Test case:

/* Two 4-bit bitfields packed into a single unsigned int storage unit. */
struct fields
{
  unsigned a : 4;
  unsigned b : 4;
} flags;

int main() {
  flags.a = 1;
  flags.b = 2;
  return flags.a + flags.b;
}

When tuning for GDB we emit DW_AT_bit_offset and DW_AT_data_member_location for each field:

$ ./bin/clang /tmp/test.c -o /tmp/test.o -ggdb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o
0x00000047:     DW_TAG_member
                  DW_AT_name    ("a")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_byte_size       (0x04)
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x1c)
                  DW_AT_data_member_location    (0x00)

0x00000056:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (4)
                  DW_AT_byte_size       (0x04)
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x18)
                  DW_AT_data_member_location    (0x00)

When tuning for lldb we emit only DW_AT_data_bit_offset:

$ ./bin/clang /tmp/test.c -o /tmp/test.o -glldb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o
0x0000004b:     DW_TAG_member
                  DW_AT_name    ("a")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x00)

0x00000058:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (4)
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x04)

As I understand it, DW_AT_bit_offset has to be emitted in a different ordering depending on endianness (essentially, you offset from the other end of the storage unit). See https://github.com/llvm/llvm-project/blob/0577a9f8d06b97bf8b4d9aa171096e0797e4f5d1/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp#L1649. This has to be done by the producer of the DWARF.

DW_AT_data_bit_offset is meant to be endian-agnostic; the tool interpreting the DWARF applies what it knows about the target's endianness to work out where the bit offset starts from (so in the link above, we just emit it without doing any swapping).

From the DWARF 4 spec:

For a DW_AT_data_bit_offset attribute, the value is an integer constant (see Section 2.19) that
specifies the number of bits from the beginning of the containing entity to the beginning of the
data member. This value must be greater than or equal to zero, but is not limited to less than the
number of bits per byte.
For big-endian architectures, bit offsets are counted from high-order to low-order bits
within a byte (or larger storage unit); in this case, the bit offset identifies the high-order
bit of the object.

For little-endian architectures, bit offsets are counted from low-order to high-order bits
within a byte (or larger storage unit); in this case, the bit offset identifies the low-order
bit of the object.
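
To make the relationship concrete, here is a small worked example (my own illustration, not code from LLVM or from the thread) using field "a" from the struct above on a little-endian target:

#include <stdio.h>

int main(void) {
  const unsigned storage_bits = 32;   /* DW_AT_byte_size (0x04) * 8 */
  const unsigned bit_size = 4;        /* DW_AT_bit_size for "a" */
  const unsigned data_bit_offset = 0; /* DWARF v4: bits from the start of the struct */

  /* The old encoding on little-endian counts from the other end of the
     storage unit: 32 - 0 - 4 = 28 = 0x1c, matching DW_AT_bit_offset in the
     GDB-tuned dump above. */
  const unsigned bit_offset = storage_bits - data_bit_offset - bit_size;
  printf("DW_AT_bit_offset = 0x%x\n", bit_offset); /* prints 0x1c */
  return 0;
}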

I'm not really arguing that one is inherently better; I just figured I'd make the code up to date with what gdb has been doing (I've been digging into bitfields for other reasons).

Alternatively, I can just update the original comment with "GDB >= <version> does support DW_AT_data_bit_offset, but we choose not to emit it to maintain compatibility.".

Once I can cite a specific GDB release, let's see how new it is and go from there?

Adding a TODO release note so I don't forget it if one is needed.

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

GDB 8 was the first version to support DW_AT_data_bit_offset.

Given that we expect GCC 7, which would probably come with a GDB 7, the likely best action is to just update the comment, and I'll come back to this if/when the minimum moves up.
(https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library)

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

I will check this out. (It might be that the support in GDB 8 had some holes that GCC worked around for a while after that.)

DavidSpickett edited the summary of this revision. (Show Details) Oct 13 2022, 3:32 AM
DavidSpickett edited the summary of this revision. (Show Details)

GDB 8 was the first version to support DW_AT_data_bit_offset.

Given that we expect GCC 7, which would probably come with a GDB 7, the likely best action is to just update the comment, and I'll come back to this if/when the minimum moves up.
(https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library)

What minimum compiler we require for building clang is a pretty different question than the one we're dealing with here.

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

I will check this out. (It might be that the support in GDB 8 had some holes that GCC worked around for a while after that.)

Possible - if there's evidence of such holes, that's useful, but I don't think we'd have to look too hard - the feature seems fairly isolated, and probably whatever version GCC started emitting it in is a good enough marker for "it's out there and GDB versions expect it/handle it".

What minimum compiler we require for building clang is a pretty different question than the one we're dealing with here.

Maybe I can clarify. I was thinking that if someone's gcc is 7.x then their gdb is likely 7.x as well. If they try to debug a binary produced with the clang they just built, they will likely reach for gdb, not lldb. So that's why there's a link there.

But anyway you are correct it's not the main concern here. I will check the DWARF gcc produces across versions.

GCC 11 is where DW_AT_data_bit_offset is first used.

$ ./bin/aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-10.343) 10.4.1 20220909
<...>
$ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/test.o
<...>
0x00000026:     DW_TAG_member
<...>
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x1c)
                  DW_AT_data_member_location    (0x00)
$ ./bin/aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-11.284) 11.3.1 20221005
<...>
$ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/
<...>
0x00000027:     DW_TAG_member
<...>
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x00)

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

With the feature being enabled in GCC 11, it seems OK for what will be clang 16 to enable it. For example, the most recent Ubuntu LTS "Jammy" ships with GCC 11 and clang 14. So we're not going to see GCC <= 10 paired with clang >= 16, at least from them.

(hopefully I understood the question correctly)

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

With the feature being enabled in GCC 11, it seems OK for what will be clang 16 to enable it. For example, the most recent Ubuntu LTS "Jammy" ships with GCC 11 and clang 14. So we're not going to see GCC <= 10 paired with clang >= 16, at least from them.

(hopefully I understood the question correctly)

Yep, that covers it & works for me, thanks!

This patch probably needs some positive testing that the change has an effect? (otherwise as it stands, the patch removes testing & would pass without the functional change applied at all)

& the TODO release note sounds good to add - something like the minimum GDB version required to support the newer bitfield encoding, and maybe a note about what to do for older gdb? (use an older DWARF version with -gdwarf-N for some value of N?)
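
For illustration (my addition, not part of the comment above): since DW_AT_data_bit_offset only exists from DWARF v4 onwards, a user stuck on a pre-8.0 gdb should be able to keep the old encoding by lowering the DWARF version, e.g. reusing the test case from earlier in the thread:

$ clang /tmp/test.c -o /tmp/test.o -ggdb -gdwarf-3
$ llvm-dwarfdump /tmp/test.o | grep -E 'bit_size|bit_offset|data_member_location'

With -gdwarf-3 the members should still carry DW_AT_bit_offset and DW_AT_data_member_location rather than DW_AT_data_bit_offset.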

  • Test that linux DWARFv4 uses data_bit_offset and that adding GDB tuning does not change that.
  • Add release notes for clang and llvm.
Herald added a project: Restricted Project. · View Herald Transcript Oct 26 2022, 3:12 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
DavidSpickett edited the summary of this revision. (Show Details) Oct 26 2022, 3:13 AM
dblaikie accepted this revision. Oct 26 2022, 9:56 AM

Sounds good, thanks!

This revision is now accepted and ready to land. Oct 26 2022, 9:56 AM
MaskRay added inline comments.
clang/docs/ReleaseNotes.rst
627

"DWARF v4" is more common than "DWARFv4". ditto below

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
431
  • DWARFvN -> DWARF vN
  • Remove superfluous brackets.
DavidSpickett marked 2 inline comments as done. Oct 27 2022, 1:39 AM
This revision was landed with ongoing or failed builds. Oct 27 2022, 1:51 AM
This revision was automatically updated to reflect the committed changes.