This is an archive of the discontinued LLVM Phabricator instance.

[LLVM] Use DWARFv4 bitfields when tuning for GDB
Closed, Public

Authored by DavidSpickett on Oct 10 2022, 7:36 AM.

Details

Summary

GDB implemented support for DW_AT_data_bit_offset in https://sourceware.org/bugzilla/show_bug.cgi?id=12616,
so it has been present since GDB 8.0.

GCC started emitting it in GCC 11.
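
Since the diff itself is not reproduced in this archive, the change is roughly of the following shape. This is only an illustrative sketch in C: the function and enum names below are made up for this note and are not the actual identifiers in llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp.

#include <stdbool.h>

typedef enum { TUNE_GDB, TUNE_LLDB, TUNE_SCE } DebuggerTuning; /* hypothetical names */

/* Before this patch: tuning for GDB forced the old (DWARF v2 style) bitfield
   attributes even when emitting DWARF v4 or later. */
static bool useDWARF2BitfieldsBefore(int DwarfVersion, DebuggerTuning Tuning) {
  return DwarfVersion < 4 || Tuning == TUNE_GDB;
}

/* After this patch: only pre-v4 DWARF falls back to the old encoding, since
   GDB >= 8.0 understands DW_AT_data_bit_offset. */
static bool useDWARF2BitfieldsAfter(int DwarfVersion, DebuggerTuning Tuning) {
  (void)Tuning; /* the debugger tuning no longer affects this decision */
  return DwarfVersion < 4;
}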

Diff Detail

Event Timeline

DavidSpickett created this revision. Oct 10 2022, 7:36 AM
Herald added a project: Restricted Project. · View Herald Transcript Oct 10 2022, 7:36 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
DavidSpickett requested review of this revision. Oct 10 2022, 7:36 AM
Herald added a project: Restricted Project. · View Herald Transcript Oct 10 2022, 7:36 AM

Remove commented code.

https://bugs.llvm.org/show_bug.cgi?id=27758

Adrian Prantl 2016-05-17 15:13:46 PDT
<...>
Let's hope we get GDB support soon!

Turns out it didn't take that long, we just didn't notice.

Does this warrant a release note, do you think?

Does this warrant a release note, do you think?

It's essentially changing the oldest supported gdb, so probably yes.

LGTM but I'd prefer to have a second opinion about the version dependency

What's the difference in representation? Is it more expressive? Does the old version use a GNU extension or anything like that? (a little diff snippet of the change in DWARF would be handy to review/assess the impact)

LGTM but I'd prefer to have a second opinion about the version dependency

I'll find the exact release where support first appeared.

What's the difference in representation?

Test case:

/* Two 4-bit bitfields packed into a single unsigned int storage unit. */
struct fields
{
  unsigned a : 4;
  unsigned b : 4;
} flags;

int main() {
  flags.a = 1;
  flags.b = 2;
  return flags.a + flags.b;
}

When tuning for GDB we emit DW_AT_bit_offset and DW_AT_data_member_location for each field:

$ ./bin/clang /tmp/test.c -o /tmp/test.o -ggdb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o
0x00000047:     DW_TAG_member
                  DW_AT_name    ("a")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_byte_size       (0x04)
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x1c)
                  DW_AT_data_member_location    (0x00)

0x00000056:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (4)
                  DW_AT_byte_size       (0x04)
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x18)
                  DW_AT_data_member_location    (0x00)

When tuning for lldb we emit only DW_AT_data_bit_offset:

$ ./bin/clang /tmp/test.c -o /tmp/test.o -glldb -gdwarf-4 && ./bin/llvm-dwarfdump /tmp/test.o
0x0000004b:     DW_TAG_member
                  DW_AT_name    ("a")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (3)
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x00)

0x00000058:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_type    (0x00000066 "unsigned int")
                  DW_AT_decl_file       ("/tmp/test.c")
                  DW_AT_decl_line       (4)
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x04)

As I understand it, DW_AT_bit_offset has to be emitted in a different ordering depending on endianness (essentially, you offset from the other end of the storage unit). See https://github.com/llvm/llvm-project/blob/0577a9f8d06b97bf8b4d9aa171096e0797e4f5d1/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp#L1649. This has to be done by the producer of the DWARF.

DW_AT_data_bit_offset is meant to be endian-agnostic; the tool interpreting the DWARF applies what it knows about the target's endianness to work out where the bit offset starts from (so in the link above, we just emit it without doing any swapping).

From the DWARF 4 spec:

For a DW_AT_data_bit_offset attribute, the value is an integer constant (see Section 2.19) that
specifies the number of bits from the beginning of the containing entity to the beginning of the
data member. This value must be greater than or equal to zero, but is not limited to less than the
number of bits per byte.
For big-endian architectures, bit offsets are counted from high-order to low-order bits
within a byte (or larger storage unit); in this case, the bit offset identifies the high-order
bit of the object.

For little-endian architectures, bit offsets are counted from low-order to high-order bits
within a byte (or larger storage unit); in this case, the bit offset identifies the low-order
bit of the object.
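
To make the relationship concrete, here is a small worked example (my own illustration, not code from LLVM or from the thread) using field "a" from the struct above on a little-endian target:

#include <stdio.h>

int main(void) {
  const unsigned storage_bits = 32;   /* DW_AT_byte_size (0x04) * 8 */
  const unsigned bit_size = 4;        /* DW_AT_bit_size for "a" */
  const unsigned data_bit_offset = 0; /* DWARF v4: bits from the start of the struct */

  /* The old encoding on little-endian counts from the other end of the
     storage unit: 32 - 0 - 4 = 28 = 0x1c, matching DW_AT_bit_offset in the
     GDB-tuned dump above. */
  const unsigned bit_offset = storage_bits - data_bit_offset - bit_size;
  printf("DW_AT_bit_offset = 0x%x\n", bit_offset); /* prints 0x1c */
  return 0;
}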

I'm not really arguing that one is inherently better; I just figured I'd make the code up to date with what gdb has been doing (I've been digging into bitfields for other reasons).

Alternatively, I can just update the original comment with "GDB >= <version> does support DW_AT_data_bit_offset, but we choose not to emit it to maintain compatibility.".

Once I can cite a specific GDB release, let's see how new it is and go from there?

Adding a TODO release note so I don't forget it if one is needed.

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

GDB 8 was the first version to support DW_AT_data_bit_offset.

Given that we expect GCC 7, which would probably come with a GDB 7, the likely best action is to just update the comment, and I'll come back to this if/when the minimum moves up.
(https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library)

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

I will check this out. (It might be that the support in GDB 8 had some holes that GCC worked around for a while after that.)

DavidSpickett edited the summary of this revision. (Show Details) Oct 13 2022, 3:32 AM
DavidSpickett edited the summary of this revision. (Show Details)

GDB 8 was the first version to support DW_AT_data_bit_offset.

Given that we expect GCC 7, which would probably come with a GDB 7, the likely best action is to just update the comment, and I'll come back to this if/when the minimum moves up.
(https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library)

What minimum compiler we require for building clang is a pretty different question than the one we're dealing with here.

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

Got a sense of what GCC does in what versions? Might be a good argument for when/if/how we should use this newer feature... if they're already using it at least.

I will check this out. (It might be that the support in GDB 8 had some holes that GCC worked around for a while after that.)

Possible - if there's evidence of such holes, that's useful, but I don't think we'd have to look too hard - the feature seems fairly isolated, and probably whatever version GCC started emitting it in is a good enough marker for "it's out there and GDB versions expect it/handle it".

What minimum compiler we require for building clang is a pretty different question than the one we're dealing with here.

Maybe I can clarify. I was thinking that if someone's gcc is 7.x then their gdb is likely 7.x as well. If they try to debug a binary produced with the clang they just built, they will likely reach for gdb, not lldb. So that's why there's a link there.

But anyway you are correct it's not the main concern here. I will check the DWARF gcc produces across versions.

GCC 11 is where DW_AT_data_bit_offset is first used.

$ ./bin/aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-10.343) 10.4.1 20220909
<...>
$ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/test.o
<...>
0x00000026:     DW_TAG_member
<...>
                  DW_AT_bit_size        (0x04)
                  DW_AT_bit_offset      (0x1c)
                  DW_AT_data_member_location    (0x00)
$ ./bin/aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-11.284) 11.3.1 20221005
<...>
$ ./bin/aarch64-none-linux-gnu-gcc /tmp/test.c -o /tmp/test.o -c -g && llvm-dwarfdump-11 /tmp/
<...>
0x00000027:     DW_TAG_member
<...>
                  DW_AT_bit_size        (0x04)
                  DW_AT_data_bit_offset (0x00)

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

With the feature being enabled in GCC 11, it seems OK for what will be clang 16 to enable it. For example, the most recent Ubuntu LTS "Jammy" ships with GCC 11 and clang 14. So we're not going to see GCC <= 10 paired with clang >= 16, at least from them.

(hopefully I understood the question correctly)

The question I'm roughly trying to answer is - has GCC shipped this feature? If so, we probably can too - because any system shipping older GCC versions is unlikely to be shipping a newer (as yet unreleased, in this case, but when it does release) clang.

With the feature being enabled in GCC 11, it seems OK for what will be clang 16 to enable it. For example, the most recent Ubuntu LTS "Jammy" ships with GCC 11 and clang 14. So we're not going to see GCC <= 10 paired with clang >= 16, at least from them.

(hopefully I understood the question correctly)

Yep, that covers it & works for me, thanks!

This patch probably needs some positive testing that the change has an effect? (otherwise as it stands, the patch removes testing & would pass without the functional change applied at all)

& the TODO release note sounds good to add - something like the minimum GDB version required to support the newer bitfield encoding, and maybe a note about what to do for older gdb? (use an older DWARF version with -gdwarf-N for some value of N?)
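
For illustration (my addition, not part of the comment above): since DW_AT_data_bit_offset only exists from DWARF v4 onwards, a user stuck on a pre-8.0 gdb should be able to keep the old encoding by lowering the DWARF version, e.g. reusing the test case from earlier in the thread:

$ clang /tmp/test.c -o /tmp/test.o -ggdb -gdwarf-3
$ llvm-dwarfdump /tmp/test.o | grep -E 'bit_size|bit_offset|data_member_location'

With -gdwarf-3 the members should still carry DW_AT_bit_offset and DW_AT_data_member_location rather than DW_AT_data_bit_offset.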

  • Test that linux DWARFv4 uses data_bit_offset and that adding GDB tuning does not change that.
  • Add release notes for clang and llvm.
Herald added a project: Restricted Project. · View Herald Transcript Oct 26 2022, 3:12 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
DavidSpickett edited the summary of this revision. (Show Details) Oct 26 2022, 3:13 AM
dblaikie accepted this revision. Oct 26 2022, 9:56 AM

Sounds good, thanks!

This revision is now accepted and ready to land. Oct 26 2022, 9:56 AM
MaskRay added inline comments.
clang/docs/ReleaseNotes.rst
627

"DWARF v4" is more common than "DWARFv4". ditto below

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
431
  • DWARFvN -> DWARF vN
  • Remove superfluous brackets.
DavidSpickett marked 2 inline comments as done. Oct 27 2022, 1:39 AM
This revision was landed with ongoing or failed builds. Oct 27 2022, 1:51 AM
This revision was automatically updated to reflect the committed changes.