GDB implemented data_bit_offset in https://sourceware.org/bugzilla/show_bug.cgi?id=12616
which has been present since GDB 8.0.
GCC started using it at GCC 11.
Differential D135583
[LLVM] Use DWARFv4 bitfields when tuning for GDB DavidSpickett on Oct 10 2022, 7:36 AM. Authored by
Details GDB implemented data_bit_offset in https://sourceware.org/bugzilla/show_bug.cgi?id=12616 GCC started using it at GCC 11.
Diff Detail
Event TimelineComment Actions https://bugs.llvm.org/show_bug.cgi?id=27758
Turns out it didn't take that long, we just didn't notice. Does this warrant a release note do you think? Comment Actions
It's essentially changing the oldest supported gdb, so probably yes. LGTM but I'd prefer to have a second opinion about the version dependency Comment Actions What's the difference in representation? Is it more expressive? Does the old version use a GNU extension or anything like that? (a little diff snippet of the change in DWARF would be handy to review/assess the impact) Comment Actions
I'll find the exact release where support first appeared. Comment Actions
Test case: struct fields { unsigned a : 4; unsigned b : 4; } flags; int main() { flags.a = 1; flags.b = 2; return flags.a + flags.b; } When tuning for GDB we emit DW_AT_bit_offset and DW_AT_data_member_location for each field: $ ./bin/clang /tmp/test.c -o /tmp/test.o -ggdb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o 0x00000047: DW_TAG_member DW_AT_name ("a") DW_AT_type (0x00000066 "unsigned int") DW_AT_decl_file ("/tmp/test.c") DW_AT_decl_line (3) DW_AT_byte_size (0x04) DW_AT_bit_size (0x04) DW_AT_bit_offset (0x1c) DW_AT_data_member_location (0x00) 0x00000056: DW_TAG_member DW_AT_name ("b") DW_AT_type (0x00000066 "unsigned int") DW_AT_decl_file ("/tmp/test.c") DW_AT_decl_line (4) DW_AT_byte_size (0x04) DW_AT_bit_size (0x04) DW_AT_bit_offset (0x18) DW_AT_data_member_location (0x00) When tuning for lldb we emit only DW_AT_data_bit_offset: $ ./bin/clang /tmp/test.c -o /tmp/test.o -glldb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o 0x0000004b: DW_TAG_member DW_AT_name ("a") DW_AT_type (0x00000066 "unsigned int") DW_AT_decl_file ("/tmp/test.c") DW_AT_decl_line (3) DW_AT_bit_size (0x04) DW_AT_data_bit_offset (0x00) 0x00000058: DW_TAG_member DW_AT_name ("b") DW_AT_type (0x00000066 "unsigned int") DW_AT_decl_file ("/tmp/test.c") DW_AT_decl_line (4) DW_AT_bit_size (0x04) DW_AT_data_bit_offset (0x04) As I understand it, DW_AT_bit_offset has to be emitted in a different ordering depending on endian (well, you offset from the other end of the storage). See https://github.com/llvm/llvm-project/blob/0577a9f8d06b97bf8b4d9aa171096e0797e4f5d1/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp#L1649. This has to be done by the producer of the DWARF. DW_AT_data_bit_offset is meant to be endian agnostic and the tool interpreting the dwarf applies what it knows to be the endian to work out where the bit offset starts from (so in the link above, we just emit it without checking for any swapping). From the DWARF 4 spec: For a DW_AT_data_bit_offset attribute, the value is an integer constant (see Section 2.19) that specifies the number of bits from the beginning of the containing entity to the beginning of the data member. This value must be greater than or equal to zero, but is not limited to less than the number of bits per byte. For big-endian architectures, bit offsets are counted from high-order to low-order bits within a byte (or larger storage unit); in this case, the bit offset identifies the high-order bit of the object. For little-endian architectures, bit offsets are counted from low-order to high-order bits within a byte (or larger storage unit); in this case, the bit offset identifies the low-order bit of the object. I'm not really arguing that one is inherently better, I just figured I'd make the code up to date with what gdb has been doing (I've been digging into bitfields for other reasons). Alternatively I can just update the original comment with "GDB >= <version> does support DW_AT_data_bit_offset but we choose not to emit it to maintain compatibility.". Once I can cite a specific GDB release, let's see how new it is and go from there? Comment Actions Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least. Comment Actions GDB 8 was the first version to support DW_AT_data_bit_offset. Given that we expect GCC 7, which would probably come with a GDB 7, likely best action is to just update the comment and I'll come back to this if/when the minimum moves up.
I will check this out. (it might be that the support in GDB 8 has some holes that GCC accounted for for a while after that) Comment Actions What minimum compiler we require for building clang is a pretty different question than the one we're dealing with here. The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.
Possible - if there's evidence of such holes, that's useful, but I don't think we'd have to look too hard - the feature seems fairly isolated and probably whatever version GCC started emitting it is a good enough marker for "it's out there and GDB versions expect it/handle it". Comment Actions
Maybe I can clarify. I was thinking that if someone's gcc is 7.x then their gdb is likely 7.x as well. If they try to debug a binary produced with the clang they just built, they will likely reach for gdb not lldb. So that's why there's a link there. But anyway you are correct it's not the main concern here. I will check the DWARF gcc produces across versions. Comment Actions GCC 11 is where DW_AT_data_bit_offset is first used. $ ./bin/aarch64-none-linux-gnu-gcc --version aarch64-none-linux-gnu-gcc (fsf-10.343) 10.4.1 20220909 <...> $ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/test.o <...> 0x00000026: DW_TAG_member <...> DW_AT_bit_size (0x04) DW_AT_bit_offset (0x1c) DW_AT_data_member_location (0x00) $ ./bin/aarch64-none-linux-gnu-gcc --version aarch64-none-linux-gnu-gcc (fsf-11.284) 11.3.1 20221005 <...> $ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/ <...> 0x00000027: DW_TAG_member <...> DW_AT_bit_size (0x04) DW_AT_data_bit_offset (0x00)
With the feature being enabled in GCC 11, it seems to be ok for what will be clang 16 to enable it. For example the most recent Ubuntu LTS "Jammy", ships with GCC 11 and clang 14. So we're not going to see a gcc <=10 plus clang >=16 at least from them. (hopefully I understood the question correctly) Comment Actions
Yep, that covers it & works for me, thanks! This patch probably needs some positive testing that the change has an effect? (otherwise as it stands, the patch removes testing & would pass without the functional change applied at all) & the TODO patch notes sounds good to add - something like the minimum GDB version required to support the newer bitfield encoding and maybe a note about what to do for older gdb? (use an older DWARF version with -gdwarf-N for some value of N?) Comment Actions
|
"DWARF v4" is more common than "DWARFv4". ditto below