This is an archive of the discontinued LLVM Phabricator instance.

[DWARFv5] Emit normal type units in .debug_info comdats.
ClosedPublic

Authored by probinson on Nov 8 2018, 3:50 PM.

Details

Summary

This should make non-split type units fully conform to DWARF v5.

An artifact of the LLVM implementation means that the type-unit sections
will appear before the compile-unit section in the object file. So, readers
making assumptions about ordering will need to learn better.
(I'm looking at you, LLD.)

I was going to do split units at the same time, but that's going to be trickier,
because it looks like for v5, the split compile and type units should all coexist
in the same section. So, leaving split type-units for another day.

Diff Detail

Repository
rL LLVM

Event Timeline

probinson created this revision.Nov 8 2018, 3:50 PM
dblaikie added inline comments.Nov 8 2018, 4:25 PM
llvm/lib/MC/MCObjectFileInfo.cpp
822 ↗(On Diff #173239)

Seems odd to have a function for "DwarfTypesSection" return the debug_info section. I'd tend towards having MCObjectFileInfo be more literal than this - getDwarfTypesSection(Hash) and getDwarfInfoSection(Hash), etc.

(a very casual/far-off motivation, is that it'd be great if, eventually, debug_info came to allowing various different entities in comdat'd sections - not just types - this is what some of us have thrown around as "bag of DWARF")

probinson added inline comments.Nov 9 2018, 10:27 AM
llvm/lib/MC/MCObjectFileInfo.cpp
822 ↗(On Diff #173239)

Excellent point, that would be a better division of responsibility.

i've had some discussions with Sony linker people about different tactics for subdividing DWARF, using comdats or other ELF tricks. Cost/benefit tradeoffs have not been clear. But worth pursuing.

dblaikie added inline comments.Nov 9 2018, 10:38 AM
llvm/lib/MC/MCObjectFileInfo.cpp
822 ↗(On Diff #173239)

Yeah, stuff Adrian, Eric, and I have thrown around goes in a few directions:

  1. type units aren't super efficient (for GCC they were a 2x growth, roughly - for every byte that was removed from the CU, you got two bytes of TU - for Clang (only because it's a bit more conservative about which types to put in type units) it's about 1.5x) one major reason is that the CU can only reference the type itself - not all the other things, mostly the member function declarations would be great to reference ratehr than as it is today, where any member function definition in the CU references a declaration of that function in the stub type description in the CU, which is a lot of duplicate bytes.

-> So it'd be great if a type unit were enhanced to allow more than one entity to be referenced (have the TU header include a count N, then N x {hash, offset} pairs rather than the current header which has exactly one {hash, offset} pair)

  1. Some data doesn't need to go in a separate unit - sometimes you know you're only emitting a type once, so would be cool to be able to emit it into the CU. But what about types that need to be in a type unit? I guess they could refer to the type by declaration - in which case there'd be no reason to refer back into the CU. But alternatively it could be done by having even the CU have a list of referencable entities... eh, maybe that's less useful.

But in any case, once you do (1) it's perhaps less important to refer to this thing as a type unit, but instead as a bag of referencable entities, whatever the producer wants to group up like that. (& yeah, might work welrl for inline function declarations, etc)

Aside: I actually might try to do the "types referenced from type units are referenced as declarations instead of using any signatures" because with modular debug info there's more chance you'll have types you know are only going in one place (even today we have some of those - with strong vtables you can know you won't put the type anywhere but there) so you'd want to avoid type unit overhead. But some type unit might want to refer to that type - obviously that type unit is going to be in multiple object files and only one of those object files has the first vtable-bound type, so the other copies of the type unit must be using declarations to refer to the vtable-bound type anyway, seems fine to do it in this case where the vtable-bound type is available anyway.

</ramble>

probinson updated this revision to Diff 173381.Nov 9 2018, 10:53 AM

Make DwarfDebug responsible for picking which section to use.

probinson marked an inline comment as done.Nov 9 2018, 10:53 AM
dblaikie accepted this revision.Nov 9 2018, 10:57 AM

Looks great!

This revision is now accepted and ready to land.Nov 9 2018, 10:57 AM
This revision was automatically updated to reflect the committed changes.