This is an archive of the discontinued LLVM Phabricator instance.


Differential D87011

[DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19).
ClosedPublic

Authored by ikudrin on Sep 2 2020, 6:22 AM.


Details

Reviewers

dblaikie
jhenderson
probinson
aprantl

Commits

rG982b31fad298: [DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19).

Summary

The patch adds a switch to enable emitting debug info in the 64-bit DWARF format. Most section emitters will be updated in subsequent patches, whereas for .debug_line and .debug_frame the emitters are in the MC library, which is already updated.

For now, the switch is enabled only for 64-bit ELF targets.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ikudrin created this revision. Sep 2 2020, 6:22 AM

Herald added subscribers: ormris, hiraditya. Sep 2 2020, 6:22 AM

ikudrin requested review of this revision. Sep 2 2020, 6:22 AM

ikudrin added a parent revision: D87010: [DebugInfo] Add new emitting methods for values which depend on the DWARF format (3/19). Sep 2 2020, 6:24 AM

ikudrin added a child revision: D87012: [DebugInfo] Fix emitting DWARF64 .debug_aranges sections (11/19).

Harbormaster completed remote builds in B70375: Diff 289414. Sep 2 2020, 7:38 AM

Looks good - might use a bit more testing if some of the prior unit-tested patches end up rolled into this one (if this is the first test that makes the codepaths in those patches "live"/testable)

Worth testing that the length field or other parts of the line table were written correctly per the format? Or is that already/going to be tested elsewhere? (if this is the first patch that makes debug_line able to be written in DWARF64, then I'd expect it to be testing any parts of the line table that are different/noteworthy in DWARF64 - possibly using assembly testing, but if llvm-dwarfdump's DWARF64 parsing support is well tested independently, then I guess it's good to just use it here too, like for any other tests)

This revision is now accepted and ready to land. Sep 2 2020, 10:11 AM

In D87011#2252628, @dblaikie wrote:

Worth testing that the length field or other parts of the line table were written correctly per the format? Or is that already/going to be tested elsewhere? (if this is the first patch that makes debug_line able to be written in DWARF64, then I'd expect it to be testing any parts of the line table that are different/noteworthy in DWARF64 - possibly using assembly testing, but if llvm-dwarfdump's DWARF64 parsing support is well tested independently, then I guess it's good to just use it here too, like for any other tests)

The code that emits these tables is already tested with llvm/test/MC/ELF/gen-dwarf64.s. llvm-dwarfdump was updated in advance and ready to parse DWARF64 debug info, so we can trust it in the tests and do not need to overcomplicate them.
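
As background for the question about the length field, here is a minimal C++ sketch of how the initial length of a unit is encoded in each format according to the DWARF standard. The names (`DwarfFormat`, `appendLE`, `encodeInitialLength`, `offsetByteSize`) are illustrative stand-ins and not the MC library's API; little-endian output is assumed, and the sketch does not reject DWARF32 lengths that fall into the reserved range.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative stand-in; not LLVM's own enum.
enum class DwarfFormat { DWARF32, DWARF64 };

// Append Value to Out as Size little-endian bytes.
inline void appendLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned Size) {
  assert(Size <= 8 && "at most 8 bytes fit in a uint64_t");
  for (unsigned I = 0; I < Size; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Encode the initial length field that starts a unit.
// DWARF32: a 4-byte length.
// DWARF64: the 4-byte escape 0xffffffff followed by an 8-byte length.
inline std::vector<uint8_t> encodeInitialLength(DwarfFormat Format,
                                                uint64_t Length) {
  std::vector<uint8_t> Out;
  if (Format == DwarfFormat::DWARF64) {
    appendLE(Out, 0xffffffffu, 4);
    appendLE(Out, Length, 8);
  } else {
    appendLE(Out, Length, 4);
  }
  return Out;
}

// Size in bytes of section offsets (and of header_length in .debug_line).
inline unsigned offsetByteSize(DwarfFormat Format) {
  return Format == DwarfFormat::DWARF64 ? 8 : 4;
}
```

Under these assumptions a DWARF64 unit header starts with 12 bytes of length information instead of 4, which is the kind of difference a line-table test would observe.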

ikudrin edited the summary of this revision. Sep 2 2020, 11:18 PM

ikudrin added a child revision: D87014: [DebugInfo] Fix emitting DWARF64 compilation units (5/19). Sep 3 2020, 7:29 AM

ikudrin removed a child revision: D87012: [DebugInfo] Fix emitting DWARF64 .debug_aranges sections (11/19). Sep 3 2020, 7:45 AM

MaskRay added a subscriber: MaskRay. Sep 3 2020, 8:53 AM

MaskRay added inline comments.

llvm/test/DebugInfo/X86/dwarf64-support.ll
7	If `-dwarf-version=2 -dwarf64` does not make sense, shouldn't the combo be errored to prevent misuse?

ikudrin added inline comments. Sep 3 2020, 9:20 AM

llvm/test/DebugInfo/X86/dwarf64-support.ll
7	I am not sure where to add that check and reporting. It looks like for internal tools erroneous combinations are just ignored. For example, for NVPTX, setting the DWARF version is silently ignored (see lines 372-374 in `DwarfDebug.cpp`). Thus, my change just follows the crowd.

MaskRay added inline comments. Sep 3 2020, 11:39 AM

llvm/test/DebugInfo/X86/dwarf64-support.ll
7	I guess NVPTX does so for a quick MVP prototype ("[DEBUG] Initial adaptation of NVPTX target for debug info emission."). Downgrading the DWARF version this way allows them to use -gdwarf-* quickly with their auxiliary target triples (`"-triple" "x86_64-unknown-linux-gnu" "-aux-triple" "nvptx64-nvidia-cuda"`). For clang -gdwarf-4, I think it probably does not hurt when an auxiliary target triple does support DWARF v4 and downgrades to v2. For the testing tool llc, we probably should emit a better diagnostic to remind the user.

dblaikie added inline comments. Sep 3 2020, 12:00 PM

llvm/test/DebugInfo/X86/dwarf64-support.ll
7	Eh, there are a fair few flags I think at the llc level we don't bother to provide error messages about. One that comes to mind would be type units - the flag silently does nothing on non-ELF targets, for instance. Enabling debug-macro does nothing when using v4 Split DWARF. Might be nice, but I wouldn't say it's necessary - llc's a tool for us to test LLVM with & some flags have no effect in certain circumstances. (heck, even at the clang driver level that's true - lots of flags are just no-ops in certain situations, rather than providing error messages about their incompatibility)

ikudrin mentioned this in D87026: [DebugInfo] Make offsets of dwarf units 64-bit (19/19). Sep 11 2020, 5:40 AM

This revision was landed with ongoing or failed builds. Sep 14 2020, 10:24 PM

Closed by commit rG982b31fad298: [DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19). (authored by ikudrin).

This revision was automatically updated to reflect the committed changes.

ikudrin added a commit: rG982b31fad298: [DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19).

@ikudrin To clarify, this will emit R_X86_64_64 relocations for .debug_info on 64-bit platforms, correct?

In D87011#2357977, @ayermolo wrote:

@ikudrin To clarify this will emit R_X86_64_64 bit relocations for .debug_info on 64 bit platform, correct?

@ayermolo The patch series added support for the 64-bit DWARF format. You are right that many relocations in .debug_* will change from R_X86_64_32 to R_X86_64_64 with the option.

In D87011#2357977, @ayermolo wrote:

@ikudrin To clarify this will emit R_X86_64_64 bit relocations for .debug_info on 64 bit platform, correct?

Right, but only if producing 64-bit DWARF info is directly requested. Without specifying -dwarf64, nothing is changed, and the debugging info will be generated in the DWARF32 format.

Awesome, thanks.
I was trying to pass -dwarf64 through our build system, but was still seeing 32-bit relocations. Will dig further on my end.
Thanks for working on this.

The switch is implemented only internally in LLVM. There is still some work to be done to enable producing 64-bit debugging info in clang, but I got sidetracked by another task. I hope to come back to it later this year.

ayermolo mentioned this in D90507: [Driver] Add DWARF64 flag: -gdwarf64. Oct 30 2020, 3:26 PM

@ikudrin What else is left on the clang side? I added a diff for passing the flag from clang to the backend and need to see what's up with the failures; is there anything else that needs to be done?
Testing it locally, I was able to generate a binary with DWARF64 that llvm-dwarfdump is able to parse. It looks like gdb and lldb support varies.

To generate 64-bit debugging info, it should be enough to pass the switch through Clang, right. Apart from that, we will probably need some compatibility checks so that using the switch in unsupported cases prints out diagnostics. There are also some improvements on the LLD side that would be worth making to support extremely large debugging information. Not all our tools fully support DWARF64 yet, etc.

@ikudrin Can you elaborate on the LLD changes? I recently started to look into it. The reason I am interested is that internally we are looking into using DWARF64, so I have the bandwidth, and the incentive, to help with its implementation and adoption.

I suppose that it would be helpful to arrange debugging information sections so that DWARF64 comes after DWARF32, otherwise, some 32-bit relocations in the 32-bit info could not be resolved. But that idea might be a bit controversial because usually debugging information is expected to have the same order as the sections it refers to.
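
The ordering idea above can be illustrated with a small C++ sketch. It models an output section as a sequence of input contributions, checks whether every DWARF32 contribution stays below the 4 GiB boundary, and shows a stable reordering that places DWARF32 contributions first. `Contribution`, `dwarf32Reachable`, and `placeDwarf32First` are hypothetical names introduced for illustration; this is not LLD code, and it simplifies the reachability condition to "the DWARF32 contribution ends at or below 4 GiB".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative stand-in; not LLVM's own enum.
enum class DwarfFormat { DWARF32, DWARF64 };

// One input file's piece of a debug output section (hypothetical model).
struct Contribution {
  DwarfFormat Format;
  uint64_t Size; // size in bytes
};

// Returns true if every DWARF32 contribution ends at or below 4 GiB, so that
// 32-bit offsets into it cannot overflow in this simplified model.
inline bool dwarf32Reachable(const std::vector<Contribution> &Layout) {
  const uint64_t Limit = uint64_t(1) << 32;
  uint64_t Offset = 0;
  for (const Contribution &C : Layout) {
    assert(C.Size <= UINT64_MAX - Offset && "layout size overflows uint64_t");
    Offset += C.Size;
    if (C.Format == DwarfFormat::DWARF32 && Offset > Limit)
      return false;
  }
  return true;
}

// Moves DWARF32 contributions in front of DWARF64 ones while keeping the
// relative order inside each group unchanged.
inline std::vector<Contribution>
placeDwarf32First(std::vector<Contribution> Layout) {
  std::stable_partition(
      Layout.begin(), Layout.end(),
      [](const Contribution &C) { return C.Format == DwarfFormat::DWARF32; });
  return Layout;
}
```

In this model the reordering only helps while the DWARF32 contributions together stay under 4 GiB; beyond that, no ordering makes them reachable.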

In D87011#2376709, @ayermolo wrote:

@ikudrin Can you elaborate on the LLD changes? I recently started to look into it. The reason I am interested is that internally we are looking into using DWARF64, so I have the bandwidth, and the incentive, to help with its implementation and adoption.

I wouldn't expect LLD to need to do anything specific for DWARF64 support. It should generally speaking be treating the sections as opaque, in my opinion, and treat them no differently to other sections. If a user is mixing DWARF32 and DWARF64, then I'd say it's on their heads if relocations can't reach (just the same as it is if they're using DWARF32 but really need DWARF64). Any interactions LLD does have with the contents of the DWARF sections should be controlled via the DebugInfo library, and therefore if that library works for DWARF64, LLD doesn't need any special handling.

and adoption.

We don't want to encourage unnecessary adoption of DWARF64 as it bloats debug data sizes unnecessarily in cases when the total debug data size is less than the DWARF32 limitations. DWARF64 should always be opt-in in my opinion, for the general purpose user (there might be some large code-bases out there where enabling it by default in their build system makes sense, but they're currently the exception, not the rule).

In D87011#2378224, @jhenderson wrote:

I wouldn't expect LLD to need to do anything specific for DWARF64 support. It should generally speaking be treating the sections as opaque, in my opinion, and treat them no differently to other sections. If a user is mixing DWARF32 and DWARF64, then I'd say it's on their heads if relocations can't reach (just the same as it is if they're using DWARF32 but really need DWARF64). Any interactions LLD does have with the contents of the DWARF sections should be controlled via the DebugInfo library, and therefore if that library works for DWARF64, LLD doesn't need any special handling.

That would be true if there were no third-party libraries a project might depend on. These libraries would have 32-bit debug info, which is the recommended option from the DWARF standard, as far as I can remember. And, in general, these libraries would be added to the link after the user's code, to satisfy dependencies.

In D87011#2378278, @ikudrin wrote:

That would be true if there were no third-party libraries a project might depend on. These libraries would have 32-bit debug info, which is the recommended option from the DWARF standard, as far as I can remember. And, in general, these libraries would be added to the link after the user's code, to satisfy dependencies.

At least in LLD, it's not quite as simple as being added after the user's code: if a library appears on the link line, it will be included in the output order as soon as it is determined that it is needed. Thus if you have three modules 1.o, 2.o, and 3.o, with 3.o in an archive 3.a and 1.o requiring 3.o, you end up with an output order of 1.o 3.o 2.o if the input order was 1.o 3.a 2.o or 3.a 1.o 2.o, or an output order of 1.o 2.o 3.o if the input order was 1.o 2.o 3.a. In fact, with use of the --undefined linker switch, you can even force 3.o to appear first.

I accept using --undefined or rearranging the command-line order is less than ideal, but I'm really not convinced LLD should have any place in parsing the DWARF to determine output order. Furthermore, it's not even a reliable solution - if the objects built with DWARF32 (potentially all of which might have come from libraries) are large enough, no amount of reordering will fix the behaviour. I think users who need DWARF64 in their libraries are just going to have to request DWARF64 versions of the libraries, if the --undefined and reordering command line options are insufficient.

By the way, from a semantic point of view, I don't think it matters if the DWARF is in a different order to the data it represents - I'm just concerned about the maintenance and performance burden of having to parse the DWARF to achieve this reordering.
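
The archive ordering described in this comment can be reproduced with a toy C++ model of lazy archive members. It is only a sketch under simplifying assumptions (one pass over the inputs, no duplicate-symbol or weak-symbol handling); `Object`, `Input`, and `LinkModel` are hypothetical names and do not correspond to LLD's classes. The optional constructor argument plays the role of symbols named with --undefined.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Object {
  std::string Name;
  std::vector<std::string> Defines;    // symbols this object defines
  std::vector<std::string> References; // symbols this object needs
};

struct Input {
  bool IsArchive;
  std::vector<Object> Members; // exactly one member for a plain object file
};

class LinkModel {
public:
  // ForcedUndefined models symbols passed via --undefined.
  explicit LinkModel(std::set<std::string> ForcedUndefined = {})
      : Undefined(std::move(ForcedUndefined)) {}

  // Processes inputs left to right and returns the resulting object order.
  std::vector<std::string> link(const std::vector<Input> &Inputs) {
    for (const Input &In : Inputs) {
      if (In.IsArchive) {
        addArchive(In);
      } else {
        assert(In.Members.size() == 1 && "a plain object has one member");
        load(In.Members.front());
      }
    }
    return Order;
  }

private:
  // Adds an object to the output and resolves its references, pulling in
  // lazy archive members as soon as they turn out to be needed.
  void load(const Object &O) {
    Order.push_back(O.Name);
    for (const std::string &S : O.Defines) {
      Defined.insert(S);
      Undefined.erase(S);
      Lazy.erase(S);
    }
    for (const std::string &S : O.References) {
      if (Defined.count(S))
        continue;
      auto It = Lazy.find(S);
      if (It == Lazy.end()) {
        Undefined.insert(S);
        continue;
      }
      Object Member = It->second; // copy: load() erases the map entry
      load(Member);
    }
  }

  // A member is loaded immediately if it satisfies a pending undefined
  // symbol; otherwise its symbols are only remembered as lazy.
  void addArchive(const Input &Archive) {
    for (const Object &M : Archive.Members) {
      bool Needed = false;
      for (const std::string &S : M.Defines)
        Needed |= Undefined.count(S) != 0;
      if (Needed) {
        load(M);
        continue;
      }
      for (const std::string &S : M.Defines)
        if (!Defined.count(S))
          Lazy.emplace(S, M);
    }
  }

  std::set<std::string> Defined, Undefined;
  std::map<std::string, Object> Lazy;
  std::vector<std::string> Order;
};
```

With 1.o referencing a symbol defined by 3.o inside 3.a, this model yields the three orders listed above, and seeding the undefined set with that symbol puts 3.o first when 3.a leads the link line.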

In D87011#2378347, @jhenderson wrote:

At least in LLD, it's not quite as simple as being added after the user's code: if a library appears on the link line, it will be included in the output order as soon as it is determined that it is needed. Thus if you have three modules 1.o, 2.o, and 3.o, with 3.o in an archive 3.a and 1.o requiring 3.o, you end up with an output order of 1.o 3.o 2.o if the input order was 1.o 3.a 2.o or 3.a 1.o 2.o, or an output order of 1.o 2.o 3.o if the input order was 1.o 2.o 3.a. In fact, with use of the --undefined linker switch, you can even force 3.o to appear first.

I accept using --undefined or rearranging the command-line order is less than ideal, but I'm really not convinced LLD should have any place in parsing the DWARF to determine output order. Furthermore, it's not even a reliable solution - if the objects built with DWARF32 (potentially all of which might have come from libraries) are large enough, no amount of reordering will fix the behaviour. I think users who need DWARF64 in their libraries are just going to have to request DWARF64 versions of the libraries, if the --undefined and reordering command line options are insufficient.

I'd guess that for a large-scale project the recommendation to use -u would be unrealistic. We are talking about projects where debugging information in a single section can easily go beyond the 4GiB limit; it is impossible for the developer to adjust the command line manually.

By the way, from a semantic point of view, I don't think it matters if the DWARF is in a different order to the data it represents - I'm just concerned about the maintenance and performance burden of having to parse the DWARF to achieve this reordering.

There is no need to parse the debug info sections. Reading only the first 4 bytes of .debug_info is enough to assess the format (there might be input files that mix formats, but we can ignore them for the sake of simplicity). And we do not need any automatic sorting if the size of an output section is less than 4GiB.
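
The 4-byte check mentioned above could look roughly like the following C++ sketch. It relies on the DWARF standard's initial length encoding, where 0xffffffff marks a DWARF64 unit and the values 0xfffffff0 through 0xfffffffe are reserved. `detectFormat` is a hypothetical helper rather than an existing LLD or DebugInfo function, and little-endian input is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Illustrative stand-in; not LLVM's own enum.
enum class DwarfFormat { DWARF32, DWARF64 };

// Inspects the first four bytes of a unit header and reports its format.
// Returns std::nullopt if the data is too short or the value is reserved.
inline std::optional<DwarfFormat> detectFormat(const uint8_t *Data,
                                               size_t Size) {
  assert((Data != nullptr || Size == 0) && "null buffer with non-zero size");
  if (Size < 4)
    return std::nullopt;
  uint32_t Initial = uint32_t(Data[0]) | (uint32_t(Data[1]) << 8) |
                     (uint32_t(Data[2]) << 16) | (uint32_t(Data[3]) << 24);
  if (Initial == 0xffffffffu)
    return DwarfFormat::DWARF64; // escape value: a 64-bit length follows
  if (Initial >= 0xfffffff0u)
    return std::nullopt; // reserved by the DWARF standard
  return DwarfFormat::DWARF32; // the value itself is the 32-bit length
}
```

Only the first unit of an input section is examined here, which matches the simplification in the comment that inputs mixing both formats are ignored.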

In D87011#2378533, @ikudrin wrote:

I'd guess that for a large-scale project the recommendation to use -u would be unrealistic. We are talking about projects where debugging information in a single section can easily go beyond the 4GiB limit; it is impossible for the developer to adjust the command line manually.

There is no need to parse the debug info sections. Reading only the first 4 bytes of .debug_info is enough to assess the format (there might be input files that mix formats, but we can ignore them for the sake of simplicity). And we do not need any automatic sorting if the size of an output section is less than 4GiB.

Exactly. Not to mention, I think users who actually worry about the 4 GiB limit have pretty complex build systems that would need to be modified to get the build order right. Probably doable, but looking at the overall compilation pipeline, is it really the best approach? Within LLD we don't have to parse the entire debug section, just read a few bytes in each CU to determine whether it's 32-bit or 64-bit.
Yes, theoretically it is possible that there are so many third-party libraries that they overflow 4 GiB by themselves, but I think in the common case they will be under 4 GiB.

In D87011#2379603, @ayermolo wrote:

Exactly. Not to mention, I think users who actually worry about the 4 GiB limit have pretty complex build systems that would need to be modified to get the build order right. Probably doable, but looking at the overall compilation pipeline, is it really the best approach? Within LLD we don't have to parse the entire debug section, just read a few bytes in each CU to determine whether it's 32-bit or 64-bit.
Yes, theoretically it is possible that there are so many third-party libraries that they overflow 4 GiB by themselves, but I think in the common case they will be under 4 GiB.

FWIW, this is probably a big enough discussion to deserve its own review, probably even its own llvm-dev thread. My personal take would be: Unless there's a specific user who needs this, probably not worth building it. If you personally have a need or support users who need it, that swings the discussion a fair bit into "what's the best way we can help these users".

In D87011#2379609, @dblaikie wrote:

FWIW, this is probably a big enough discussion to deserve its own review, probably even its own llvm-dev thread. My personal take would be: Unless there's a specific user who needs this, probably not worth building it. If you personally have a need or support users who need it, that swings the discussion a fair bit into "what's the best way we can help these users".

+1 for what @ayermolo and @ikudrin said. To me, using -u to force order is not only unrealistic for a large code base, but also a bit hacky (-u has implications not intended for this use case). As for potential users, we're considering adopting DWARF64 for some large internal workloads which go over the DWARF32 size limit, with some libraries still built with DWARF32 linked in.

I think users who need DWARF64 in their libraries are just going to have to request DWARF64 versions of the libraries, if the --undefined and reordering command line options are insufficient.

Even if eventually most will move to DWARF64, it will take a long time considering all the libraries out there. Having good support for mixed use would make DWARF64 adoption and transition much more feasible.

Fair point on moving the discussion to llvm-dev, and happy to learn alternative ways for good support for mixed use.

In D87011#2379609, @dblaikie wrote:

FWIW, this is probably a big enough discussion to deserve its own review, probably even its own llvm-dev thread. My personal take would be: Unless there's a specific user who needs this, probably not worth building it. If you personally have a need or support users who need it, that swings the discussion a fair bit into "what's the best way we can help these users".

In D87011#2383469, @wenlei wrote:

+1 for what @ayermolo and @ikudrin said. To me, using -u to force order is not only unrealistic for a large code base, but also a bit hacky (-u has implications not intended for this use case). As for potential users, we're considering adopting DWARF64 for some large internal workloads which go over the DWARF32 size limit, with some libraries still built with DWARF32 linked in.

I think users who need DWARF64 in their libraries are just going to have to request DWARF64 versions of the libraries, if the --undefined and reordering command line options are insufficient.

Even if eventually most will move to DWARF64, it will take a long time considering all the libraries out there. Having good support for mixed use would make DWARF64 adoption and transition much more feasible.

Fair point on moving the discussion to llvm-dev, and happy to learn alternative ways for good support for mixed use.

I'll post later today on llvm-dev to open for broader discussion.

hoy mentioned this in rG0e23fd676c39: [Driver] Add DWARF64 flag: -gdwarf64. Jan 8 2021, 12:59 PM

Revision Contents

Path                                              Size
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp        7 lines
llvm/test/DebugInfo/X86/debug-frame-dwarf64.ll    37 lines
llvm/test/DebugInfo/X86/debug-line-dwarf64.ll     35 lines
llvm/test/DebugInfo/X86/dwarf64-support.ll        59 lines

Diff 291783

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp

[367 lines not shown] DwarfDebug::DwarfDebug(AsmPrinter *A, Module *M)

   unsigned DwarfVersionNumber = Asm->TM.Options.MCOptions.DwarfVersion;
   unsigned DwarfVersion = DwarfVersionNumber ? DwarfVersionNumber
                                     : MMI->getModule()->getDwarfVersion();
   // Use dwarf 4 by default if nothing is requested. For NVPTX, use dwarf 2.
   DwarfVersion =
       TT.isNVPTX() ? 2 : (DwarfVersion ? DwarfVersion : dwarf::DWARF_VERSION);

+  bool Dwarf64 = Asm->TM.Options.MCOptions.Dwarf64 &&
+                 DwarfVersion >= 3 &&   // DWARF64 was introduced in DWARFv3.
+                 TT.isArch64Bit() &&    // DWARF64 requires 64-bit relocations.
+                 TT.isOSBinFormatELF(); // Support only ELF for now.
+
   UseRangesSection = !NoDwarfRangesSection && !TT.isNVPTX();

   // Use sections as references. Force for NVPTX.
   if (DwarfSectionsAsReferences == Default)
     UseSectionsAsReferences = TT.isNVPTX();
   else
     UseSectionsAsReferences = DwarfSectionsAsReferences == Enable;

[25 lines not shown] DwarfDebug::DwarfDebug(AsmPrinter *A, Module *M)

   EmitDebugEntryValues = Asm->TM.Options.ShouldEmitDebugEntryValues();

   // It is unclear if the GCC .debug_macro extension is well-specified
   // for split DWARF. For now, do not allow LLVM to emit it.
   UseDebugMacroSection =
       DwarfVersion >= 5 || (UseGNUDebugMacro && !useSplitDwarf());

   Asm->OutStreamer->getContext().setDwarfVersion(DwarfVersion);
+  Asm->OutStreamer->getContext().setDwarfFormat(Dwarf64 ? dwarf::DWARF64
+                                                        : dwarf::DWARF32);
 }

 // Define out of line so we don't have to include DwarfUnit.h in DwarfDebug.h.
 DwarfDebug::~DwarfDebug() = default;

 static bool isObjCClass(StringRef Name) {
   return Name.startswith("+") || Name.startswith("-");
 }

[2,954 lines not shown]
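
The condition added to the constructor can be read as a small predicate over the request, the DWARF version and two target properties. The sketch below restates it outside of LLVM so it can be run on its own; `TargetDesc`, `Format` and `chooseDwarfFormat` are hypothetical names, and the two booleans stand in for `TT.isArch64Bit()` and `TT.isOSBinFormatELF()`.

```cpp
#include <cassert>

// Hypothetical stand-in for the two llvm::Triple queries used by the patch.
struct TargetDesc {
  bool Is64Bit; // models TT.isArch64Bit()
  bool IsELF;   // models TT.isOSBinFormatELF()
};

enum class Format { DWARF32, DWARF64 };

// Mirrors the gating logic: the request is honoured only when every
// precondition holds; otherwise it silently falls back to DWARF32.
Format chooseDwarfFormat(bool Dwarf64Requested, unsigned DwarfVersion,
                         const TargetDesc &T) {
  bool UseDwarf64 = Dwarf64Requested &&
                    DwarfVersion >= 3 && // DWARF64 was introduced in DWARFv3.
                    T.Is64Bit &&         // DWARF64 requires 64-bit relocations.
                    T.IsELF;             // Only ELF is supported for now.
  return UseDwarf64 ? Format::DWARF64 : Format::DWARF32;
}
```

The fallback cases correspond to the four RUN lines in dwarf64-support.ll: DWARF v2 on x86_64 ELF, a 32-bit ELF target, COFF, and Mach-O all stay on DWARF32.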

llvm/test/DebugInfo/X86/debug-frame-dwarf64.ll

This file was added.

				; This checks that .debug_frame can be generated in the DWARF64 format.

				; RUN: llc -mtriple=x86_64 -dwarf64 -force-dwarf-frame-section -filetype=obj %s -o %t
				; RUN: llvm-dwarfdump -debug-frame %t | FileCheck %s

				; CHECK: .debug_frame contents:
				; CHECK: 00000000 {{.+}} ffffffffffffffff CIE
				; CHECK-NEXT: Format: DWARF64
				; CHECK: {{.+}} 0000000000000000 FDE cie=00000000 pc=
				; CHECK-NEXT: Format: DWARF64

				; IR generated and reduced from:
				; $ cat foo.c
				; void foo() { }
				; $ clang -g -S -emit-llvm foo.c -o foo.ll

				target triple = "x86_64-unknown-linux-gnu"

				define dso_local void @foo() #0 !dbg !7 {
				ret void, !dbg !10
				}

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!3, !4, !5}
				!llvm.ident = !{!6}

				!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 12.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "foo.c", directory: "/tmp")
				!2 = !{}
				!3 = !{i32 7, !"Dwarf Version", i32 4}
				!4 = !{i32 2, !"Debug Info Version", i32 3}
				!5 = !{i32 1, !"wchar_size", i32 4}
				!6 = !{!"clang version 12.0.0"}
				!7 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !8, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
				!8 = !DISubroutineType(types: !9)
				!9 = !{null}
				!10 = !DILocation(line: 1, column: 14, scope: !7)
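
The CHECK lines in debug-frame-dwarf64.ll match a CIE id of ffffffffffffffff and an FDE whose CIE pointer is 0000000000000000. As a rough illustration of what distinguishes those entries, the sketch below classifies a .debug_frame entry from its header alone. It is not the llvm-dwarfdump implementation: `FrameEntry` and `classifyFrameEntry` are hypothetical names, and little-endian input is assumed. In the 32-bit format the CIE id is 0xffffffff; in the 64-bit format it is 0xffffffffffffffff; any other value is an FDE's pointer to its CIE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class FrameEntry { CIE32, FDE32, CIE64, FDE64, Invalid };

// Little-endian readers (assumption; real tools follow the object's byte order).
static std::uint32_t readU32LE(const std::uint8_t *P) {
  return std::uint32_t(P[0]) | (std::uint32_t(P[1]) << 8) |
         (std::uint32_t(P[2]) << 16) | (std::uint32_t(P[3]) << 24);
}

static std::uint64_t readU64LE(const std::uint8_t *P) {
  return std::uint64_t(readU32LE(P)) | (std::uint64_t(readU32LE(P + 4)) << 32);
}

// Classify the .debug_frame entry starting at Off using only its header.
FrameEntry classifyFrameEntry(const std::vector<std::uint8_t> &Sec,
                              std::size_t Off) {
  // A 32-bit entry needs at least a 4-byte length and a 4-byte id.
  if (Off > Sec.size() || Sec.size() - Off < 8)
    return FrameEntry::Invalid;
  std::uint32_t Len = readU32LE(Sec.data() + Off);
  if (Len == 0xffffffffu) {
    // DWARF64: 4-byte escape, 8-byte length, then an 8-byte id or pointer.
    if (Sec.size() - Off < 20)
      return FrameEntry::Invalid;
    std::uint64_t Id = readU64LE(Sec.data() + Off + 12);
    return Id == 0xffffffffffffffffull ? FrameEntry::CIE64 : FrameEntry::FDE64;
  }
  if (Len >= 0xfffffff0u)
    return FrameEntry::Invalid; // reserved initial-length values
  std::uint32_t Id = readU32LE(Sec.data() + Off + 4);
  return Id == 0xffffffffu ? FrameEntry::CIE32 : FrameEntry::FDE32;
}
```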

llvm/test/DebugInfo/X86/debug-line-dwarf64.ll

This file was added.

				; This checks that .debug_line can be generated in the DWARF64 format.

				; RUN: llc -mtriple=x86_64 -dwarf-version=3 -dwarf64 -filetype=obj %s -o %t3
				; RUN: llvm-dwarfdump -debug-line %t3 | FileCheck %s

				; CHECK: .debug_line contents:
				; CHECK-NEXT: debug_line[0x00000000]
				; CHECK-NEXT: Line table prologue:
				; CHECK-NEXT: total_length:
				; CHECK-NEXT: format: DWARF64

				; IR generated and reduced from:
				; $ cat foo.c
				; int foo;
				; $ clang -g -S -emit-llvm foo.c -o foo.ll

				target triple = "x86_64-unknown-linux-gnu"

				@foo = dso_local global i32 0, align 4, !dbg !0

				!llvm.dbg.cu = !{!2}
				!llvm.module.flags = !{!7, !8, !9}
				!llvm.ident = !{!10}

				!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
				!1 = distinct !DIGlobalVariable(name: "foo", scope: !2, file: !3, line: 1, type: !6, isLocal: false, isDefinition: true)
				!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 12.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, splitDebugInlining: false, nameTableKind: None)
				!3 = !DIFile(filename: "foo.c", directory: "/tmp")
				!4 = !{}
				!5 = !{!0}
				!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!7 = !{i32 7, !"Dwarf Version", i32 4}
				!8 = !{i32 2, !"Debug Info Version", i32 3}
				!9 = !{i32 1, !"wchar_size", i32 4}
				!10 = !{!"clang version 12.0.0"}

llvm/test/DebugInfo/X86/dwarf64-support.ll

This file was added.

				; This checks cases when the 64-bit DWARF debug info should not be generated
				; even if '-dwarf64' is specified.

				; The 64-bit DWARF format was introduced in DWARFv3, so the '-dwarf64' switch
				; should be ignored for earlier versions.
				; RUN: llc -mtriple=x86_64 -dwarf-version=2 -dwarf64 -filetype=obj %s -o - | \
				; RUN: llvm-dwarfdump -debug-line - | \
				; RUN: FileCheck %s --check-prefixes=ELF64,CHECK

Inline comments on the RUN command above:

MaskRay: If `-dwarf-version=2 -dwarf64` does not make sense, shouldn't the combination be rejected with an error to prevent misuse?

ikudrin (author): I am not sure where to add that check and reporting. It looks like erroneous combinations are just ignored for internal tools. For example, for NVPTX, setting the DWARF version is silently ignored (see lines 372-374 in `DwarfDebug.cpp`). Thus, my change just follows the crowd.

MaskRay: I guess NVPTX does so as a quick MVP prototype ("[DEBUG] Initial adaptation of NVPTX target for debug info emission."). Downgrading the DWARF version this way allows them to use -gdwarf-* quickly with their auxiliary target triples (`"-triple" "x86_64-unknown-linux-gnu" "-aux-triple" "nvptx64-nvidia-cuda"`). For clang -gdwarf-4, I think it probably does not hurt when an auxiliary target triple does support DWARF v4 and downgrades to v2. For the testing tool llc, we probably should emit a better diagnostic to remind the user.

dblaikie: Eh, there are a fair few flags at the llc level that I think we don't bother to provide error messages about. One that comes to mind would be type units: the flag silently does nothing on non-ELF targets, for instance. Enabling debug-macro does nothing when using v4 Split DWARF. Might be nice, but I wouldn't say it's necessary. llc is a tool for us to test LLVM with, and some flags have no effect in certain circumstances. (Heck, even at the clang driver level that's true: lots of flags are just no-ops in certain situations, rather than providing error messages about their incompatibility.)

				; DWARF64 requires 64-bit relocations, so it is not produced for 32-bit targets.
				; RUN: llc -mtriple=i386 -dwarf-version=5 -dwarf64 -filetype=obj %s -o - | \
				; RUN: llvm-dwarfdump -debug-line - | \
				; RUN: FileCheck %s --check-prefixes=ELF32,CHECK

				; DWARF64 is enabled only for ELF targets. The switch should be ignored for COFF.
				; RUN: llc -mtriple=x86_64-windows-gnu -dwarf-version=5 -dwarf64 -filetype=obj %s -o - | \
				; RUN: llvm-dwarfdump -debug-line - | \
				; RUN: FileCheck %s --check-prefixes=COFF,CHECK

				; DWARF64 is enabled only for ELF targets. The switch should be ignored for Mach-O.
				; RUN: llc -mtriple=x86_64-apple-darwin -dwarf-version=5 -dwarf64 -filetype=obj %s -o - | \
				; RUN: llvm-dwarfdump -debug-line - | \
				; RUN: FileCheck %s --check-prefixes=MACHO,CHECK

				; ELF64: file format elf64-x86-64
				; ELF32: file format elf32-i386
				; COFF: file format COFF-x86-64
				; MACHO: file format Mach-O 64-bit x86-64

				; CHECK: .debug_line contents:
				; CHECK-NEXT: debug_line[0x00000000]
				; CHECK-NEXT: Line table prologue:
				; CHECK-NEXT: total_length:
				; CHECK-NEXT: format: DWARF32

				; IR generated and reduced from:
				; $ cat foo.c
				; int foo;
				; $ clang -g -S -emit-llvm foo.c -o foo.ll

				target triple = "x86_64-unknown-linux-gnu"

				@foo = dso_local global i32 0, align 4, !dbg !0

				!llvm.dbg.cu = !{!2}
				!llvm.module.flags = !{!7, !8, !9}
				!llvm.ident = !{!10}

				!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
				!1 = distinct !DIGlobalVariable(name: "foo", scope: !2, file: !3, line: 1, type: !6, isLocal: false, isDefinition: true)
				!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 12.0.0", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, splitDebugInlining: false, nameTableKind: None)
				!3 = !DIFile(filename: "foo.c", directory: "/tmp")
				!4 = !{}
				!5 = !{!0}
				!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!7 = !{i32 7, !"Dwarf Version", i32 4}
				!8 = !{i32 2, !"Debug Info Version", i32 3}
				!9 = !{i32 1, !"wchar_size", i32 4}
				!10 = !{!"clang version 12.0.0"}