This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Always emit `.debug_names` with dwarf 5 for Apple platforms
ClosedPublic

Authored by JDevlieghere on Feb 1 2022, 8:59 PM.

Details

Summary

On Apple platforms, we generate .apple_names, .apple_types, .apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4 and earlier. For DWARF 5, I would expect us to generate .debug_names, but instead we get no accelerator tables at all.

In the backend we are correctly determining that we should be emitting .debug_names instead of .apple_names. However, when we get to the point of emitting the section, CU debug name table kind is not "default" are skipped. This property is controlled by the -gpubnames argument. Indeed, passing -gdwarf-5 -gpubnames does generate .debug_names section, on all platforms.

On Apple platforms, the .debug_names section should be emitted by default. There's a few ways to address this issue:

  1. We could always pass -gpubnames to CC1 for MachO. My concerns with this approach that it's lossy in the sense that you can't tell if -gpubnames was passed explicitly or because we're targeting an Apple platform. If you are not emitting apple accelerator tables or debug names this will results in an actual pubnames section, which we don't want.
  1. We could pass a new flag -gapplenames to CC1 and set DebugNameTableKind to Default. There are a few places that assume that DebugNameTableKind::Default == pubnames but I don't think that assumption actually hurts.
  1. We could pass a new flag -gapplenames to CC1 and set DebugNameTableKind to Apple. That way we know that the CU was compiled with the intent of always emitting accelerator tables. For dwarf 4 and earlier, that means Apple accelerator tables. For dwarf 5 and later, that means .debug names.

The third approach seemed the most principled and is what this patch implements, but I'm happy to discuss alternatives.

https://github.com/llvm/llvm-project/issues/53532

Diff Detail

Event Timeline

JDevlieghere created this revision.Feb 1 2022, 8:59 PM
JDevlieghere requested review of this revision.Feb 1 2022, 8:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2022, 8:59 PM
JDevlieghere retitled this revision from [DebugInfo] Add new DebugNameTableKind::Apple to emit `.debug_names` with to [DebugInfo] Always emit `.debug_names` with dwarf 5 for Apple platforms.

Hmm - maybe there's another option? What if this was hardcoded down in CLang proper, not in the driver? (ie: doesn't matter what flags cc1 was passed, on Apple platforms the names would be requested when the DICompileUnit is constructed)

Hmm - maybe there's another option? What if this was hardcoded down in CLang proper, not in the driver? (ie: doesn't matter what flags cc1 was passed, on Apple platforms the names would be requested when the DICompileUnit is constructed)

Agreed. Options carry an implication that something might not happen, and it sounds like Apple unconditionally (well, as long as there is debug info at all) wants accelerator tables.

Remove CC1 flag and instead hardcode the name table kind in clang.

Rather than having to change the LLVM code in this patch - what if the frontend change in CGDebugInfo.cpp hardcoded Apple for MachO < DWARFv5, and Default for MachO >= DWARFv5? I don't mind /too/ much either way, but that seems like it'd be more accurately expressive in the IR as to what's going on. Remove the Apple special case, etc.

Oh, hmm, Apple isn't a kind at the moment at all? How is this encoded today - no name table kind is specified (None) for Apple, and the backend overrides and produces the apple accelerator tables? Except in DWARFv5 where the apple names are disabled and the DWARFv5 names are meant to be enabled but aren't because the nameTableKind is still None?

Rather than having to change the LLVM code in this patch - what if the frontend change in CGDebugInfo.cpp hardcoded Apple for MachO < DWARFv5, and Default for MachO >= DWARFv5? I don't mind /too/ much either way, but that seems like it'd be more accurately expressive in the IR as to what's going on. Remove the Apple special case, etc.

Oh, hmm, Apple isn't a kind at the moment at all? How is this encoded today - no name table kind is specified (None) for Apple, and the backend overrides and produces the apple accelerator tables? Except in DWARFv5 where the apple names are disabled and the DWARFv5 names are meant to be enabled but aren't because the nameTableKind is still None?

Exactly!

If we wanted we could be more consistent about this in the backend as well and also check that the table kind is "Apple" before emitting our accelerator tables with dwarf <5.

probinson added inline comments.Feb 3 2022, 1:54 PM
clang/test/CodeGen/debug-info-names.c
2

I thought -gapplenames went away?

Rather than having to change the LLVM code in this patch - what if the frontend change in CGDebugInfo.cpp hardcoded Apple for MachO < DWARFv5, and Default for MachO >= DWARFv5? I don't mind /too/ much either way, but that seems like it'd be more accurately expressive in the IR as to what's going on. Remove the Apple special case, etc.

Oh, hmm, Apple isn't a kind at the moment at all? How is this encoded today - no name table kind is specified (None) for Apple, and the backend overrides and produces the apple accelerator tables? Except in DWARFv5 where the apple names are disabled and the DWARFv5 names are meant to be enabled but aren't because the nameTableKind is still None?

Exactly!

OK - then maybe this is a backend only change? You've pointed to the code that opts debug_names out if NameTableKind is not Default, but could you point to the code where the apple names /aren't/ emitted in DWARFv5? Perhaps it's just that that change is missing the symmetric change to always emit debug_names?

Rather than having to change the LLVM code in this patch - what if the frontend change in CGDebugInfo.cpp hardcoded Apple for MachO < DWARFv5, and Default for MachO >= DWARFv5? I don't mind /too/ much either way, but that seems like it'd be more accurately expressive in the IR as to what's going on. Remove the Apple special case, etc.

Oh, hmm, Apple isn't a kind at the moment at all? How is this encoded today - no name table kind is specified (None) for Apple, and the backend overrides and produces the apple accelerator tables? Except in DWARFv5 where the apple names are disabled and the DWARFv5 names are meant to be enabled but aren't because the nameTableKind is still None?

Exactly!

OK - then maybe this is a backend only change? You've pointed to the code that opts debug_names out if NameTableKind is not Default, but could you point to the code where the apple names /aren't/ emitted in DWARFv5? Perhaps it's just that that change is missing the symmetric change to always emit debug_names?

In the frontend, NameTableKind is only set to "Default" if you enable pubnames. In the backend, dwarf debug names are only emitted if NameTableKind is set to "Default". Also in the backend, on an Apple platform, we emit Apple accelerator tables with <DWARFv4 and debug names with >=DWARFv5.

So on an Apple platform, regardless of the dwarf version, if you don't pass pubnames, we set NameTableKind to None in the frontend. If we're compiling with DWARFv4, then we emit the Apple accelerator tables unconditionally, without checking the NameTableKind. If we're compiling with DWARFv5, then in the backend we correctly decide to emit the debug name accelerator tables, but the debug names implementation does check the NameTableKind. It skips every CU that has NameTableKind set to anything different than "Default" and we end up without any accelerator tables at all.

clang/test/CodeGen/debug-info-names.c
2

Yup, I forgot to unstage this test.

So on an Apple platform, regardless of the dwarf version, if you don't pass pubnames, we set NameTableKind to None in the frontend. If we're compiling with DWARFv4, then we emit the Apple accelerator tables unconditionally, without checking the NameTableKind. If we're compiling with DWARFv5, then in the backend we correctly decide to emit the debug name accelerator tables, but the debug names implementation does check the NameTableKind. It skips every CU that has NameTableKind set to anything different than "Default" and we end up without any accelerator tables at all.

I guess we could say "don't check the name table kind" on Apple platforms, but then we lose the ability to disable accelerator tables at all.

So on an Apple platform, regardless of the dwarf version, if you don't pass pubnames, we set NameTableKind to None in the frontend. If we're compiling with DWARFv4, then we emit the Apple accelerator tables unconditionally, without checking the NameTableKind. If we're compiling with DWARFv5, then in the backend we correctly decide to emit the debug name accelerator tables, but the debug names implementation does check the NameTableKind. It skips every CU that has NameTableKind set to anything different than "Default" and we end up without any accelerator tables at all.

I guess we could say "don't check the name table kind" on Apple platforms, but then we lose the ability to disable accelerator tables at all.

Ah, I thought there was no such ability already? Were you wanting to add such an ability? Maybe good to consider that somewhat orthogonally to the issue of "need some kind of accelerator tables for MacOS in DWARFv5".

Would it work to change the driver handling of -gpubnames to default on (instead of off) for Apple targets? Then you'd be able to turn them off with the expected -gno-pubnames and the rest of it ought to fall out naturally.
I admit the handling of the -g* options in Driver is way more intricate than anyone would like, but tweaking the defaulting to depend on target triple is something that happens in a number of places, so it shouldn't be too awful.

Both of you seem to prefer not to add a new DebugNameTableKind which makes me curious as to why. I'm totally fine with the suggested alternatives (I considered them myself) but having this encoded in the NameTableKind means we have more information in both the front and the backend so I'm interested in understanding why we would pick something that doesn't do that.

Both of you seem to prefer not to add a new DebugNameTableKind which makes me curious as to why. I'm totally fine with the suggested alternatives (I considered them myself) but having this encoded in the NameTableKind means we have more information in both the front and the backend so I'm interested in understanding why we would pick something that doesn't do that.

Mostly for me I don't object to changing it - but that it seems like a separate thing than the DWARFv5 debug_names issue. Since it enables a bunch of functionality we didn't have before - whereas just fixing the debug_names issue in a way that's consistent with the existing MachO hardcoded apple names functionality seems like the narrower/status quo fix.

If there's some desire to be able to configure debug names/accelerators on MachO, sure, happy to discuss how that might look - and yeah, having a name table kind seems plausible. (so default == debug_pubnames/pubtypes in v4, debug_names in v5, gnu == debug_gnu_pubnames/pubtypes, apple == apple_names, none == none) then changing the frontend to default to Apple on apple platforms in v4, and default on apple platforms in v5, sure.

But that's a lot of changes if the only issue is missing debug_names in v5 - unless the extra flexibility is desirable, or if the existing hardcoded apple names is considered hacky/technical debt, etc.

Both of you seem to prefer not to add a new DebugNameTableKind which makes me curious as to why. I'm totally fine with the suggested alternatives (I considered them myself) but having this encoded in the NameTableKind means we have more information in both the front and the backend so I'm interested in understanding why we would pick something that doesn't do that.

Mostly for me I don't object to changing it - but that it seems like a separate thing than the DWARFv5 debug_names issue. Since it enables a bunch of functionality we didn't have before - whereas just fixing the debug_names issue in a way that's consistent with the existing MachO hardcoded apple names functionality seems like the narrower/status quo fix.

What additional functionality are you thinking of?

If there's some desire to be able to configure debug names/accelerators on MachO, sure, happy to discuss how that might look - and yeah, having a name table kind seems plausible. (so default == debug_pubnames/pubtypes in v4, debug_names in v5, gnu == debug_gnu_pubnames/pubtypes, apple == apple_names, none == none) then changing the frontend to default to Apple on apple platforms in v4, and default on apple platforms in v5, sure.

But that's a lot of changes if the only issue is missing debug_names in v5 - unless the extra flexibility is desirable, or if the existing hardcoded apple names is considered hacky/technical debt, etc.

I personally consider it to be that yes. The fact that this is broken is the result of implicit assumptions that held before we had debug_names support and then nobody noticed the discrepancy between the front and the backend. I guess subconsciously I'm worried about this breaking again which is why I'm leaning towards being more explicit.

Both of you seem to prefer not to add a new DebugNameTableKind which makes me curious as to why. I'm totally fine with the suggested alternatives (I considered them myself) but having this encoded in the NameTableKind means we have more information in both the front and the backend so I'm interested in understanding why we would pick something that doesn't do that.

Mostly for me I don't object to changing it - but that it seems like a separate thing than the DWARFv5 debug_names issue. Since it enables a bunch of functionality we didn't have before - whereas just fixing the debug_names issue in a way that's consistent with the existing MachO hardcoded apple names functionality seems like the narrower/status quo fix.

What additional functionality are you thinking of?

The ability to enable/disable accelerators on MachO (not currently possible - since it's hardcoded in the backend) - eg: someone adding -gno-pubnames to reduce debug info size, but in the process breaking lldb's behavior due to missing accelerator tables. (or now being able to request non-accelerator tables pubnames (debug_{gnu_,}pub{names,types}) which would be similarly broken.

I guess even if it's represented in the IR level, it isn't necessarily/doesn't have to be exposed on the command line - though, would still mean some more possibility of failure if the IR happened to contain the wrong pubnames kind for the platform, etc.

If there's some desire to be able to configure debug names/accelerators on MachO, sure, happy to discuss how that might look - and yeah, having a name table kind seems plausible. (so default == debug_pubnames/pubtypes in v4, debug_names in v5, gnu == debug_gnu_pubnames/pubtypes, apple == apple_names, none == none) then changing the frontend to default to Apple on apple platforms in v4, and default on apple platforms in v5, sure.

But that's a lot of changes if the only issue is missing debug_names in v5 - unless the extra flexibility is desirable, or if the existing hardcoded apple names is considered hacky/technical debt, etc.

I personally consider it to be that yes. The fact that this is broken is the result of implicit assumptions that held before we had debug_names support and then nobody noticed the discrepancy between the front and the backend. I guess subconsciously I'm worried about this breaking again which is why I'm leaning towards being more explicit.

Fair enough. I'm OK with it if that's the goal.

Though I don't know that nameTableKind = Apple should mean debug_names in DWARFv5 - I'm not sure why it's even partly implemented that way, presumably apple names is compatible with DWARfv5 (I don't see any reason it wouldn't be) - though also unnecessary given how close apple names is to debug_names? I guess... (sorry, I'm rambling, in short, it seems like nameTableKind = Apple should mean apple_names in any version and if you want DWARFv5 to use debug_names on Apple platforms, that should be requested with nameTableKind = Default?)

But then again, things do get complicated when the DWARF version is a module flag but the name table kind is per-CU? If they do happen to be inconsistent not sure what the outcome is - whether we build separate name tables covering just the CUs that asked for that kind of name table... I think /maybe/ that works.

The ability to enable/disable accelerators on MachO (not currently possible - since it's hardcoded in the backend) - eg: someone adding -gno-pubnames to reduce debug info size, but in the process breaking lldb's behavior due to missing accelerator tables.

My memory of when Pavel added .debug_names is that lldb knew how to construct it (e.g. for v4 input) and having it present in the object was a load-time improvement, not a functional requirement. So, the scenario of wanting to turn .debug_names off does seem plausible to me, and not something to be forbidden.

The ability to enable/disable accelerators on MachO (not currently possible - since it's hardcoded in the backend) - eg: someone adding -gno-pubnames to reduce debug info size, but in the process breaking lldb's behavior due to missing accelerator tables.

My memory of when Pavel added .debug_names is that lldb knew how to construct it (e.g. for v4 input) and having it present in the object was a load-time improvement, not a functional requirement. So, the scenario of wanting to turn .debug_names off does seem plausible to me, and not something to be forbidden.

It doesn't actually construct *it* (the debug_names table), but it constructs an internal index with equivalent functionality. The end result is the same though -- lldb should be able to handle the situation where the debug_names section is missing.

(of course, there's always the possibility of bugs, as that code path is not exercised that often, but the fallback logic is pretty simple..)

Ping, I re-read through this tread and I'm not sure if there are still any outstanding concerns? We'd like to pick this up.

Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2023, 5:36 PM

No tests (assuming the existing clang test goes away).

llvm/include/llvm/IR/DebugInfoMetadata.h
1380

Does this have any effect on bitcode? Don't want to redefine a value that's already in use.

JDevlieghere marked an inline comment as done.

Address @probinson's feedback:

  • Don't reuse existing value so we don't change the meaning of old bitcode input.
  • Add more tests
aprantl accepted this revision.Jun 14 2023, 9:47 AM

LGTM! This should make a couple more hundred tests pass on the green dragon matrix bot :-)

This revision is now accepted and ready to land.Jun 14 2023, 9:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 1:03 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript