This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Defer emitting type units until the compile units are done.
AbandonedPublic

Authored by probinson on Apr 12 2018, 1:29 PM.

Details

Reviewers
dblaikie
aprantl
Summary

This is NFC at the moment but is a prerequisite for emitting type units into the .debug_info[.dwo] section.
Emitting a type unit while in the middle of emitting a compile unit will Just Not Work in DWARF v5.

Diff Detail

Event Timeline

probinson created this revision.Apr 12 2018, 1:29 PM
aprantl accepted this revision.Apr 12 2018, 1:33 PM
aprantl added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
256

Comment?

This revision is now accepted and ready to land.Apr 12 2018, 1:33 PM

This would regress memory usage significantly - see r260578 / D17118 for more details. (17% decrease in memory usage when this was implemented)

Could you explain more what would be problematic about emitting type units immediately in DWARF5?

This would regress memory usage significantly - see r260578 / D17118 for more details. (17% decrease in memory usage when this was implemented)

Could you explain more what would be problematic about emitting type units immediately in DWARF5?

Because compile units and type units go in the same section in DWARF5. We can't serialize a type unit into the middle of a compile unit. One or the other has to be deferred, and it was obviously simple to defer type units. We could conditionalize the deferral on DWARF version but that just kicks the can down the road.

It might be interesting to see what happens if we erase the DwarfTypeUnit immediately after emitting it to MC in emitDebugTypes(), instead of letting it hang around. That way we don't have two complete copies of all type units taking up redundant space.

Another potential solution would be to continue dumping the type units to what MC initially treats as separate sections, but then paste the types onto the end of the .debug_info instead of actually emitting separate sections. That's monkeying with relatively low-level details of MC, though, and I was not sanguine about being able to do that smoothly. I am pretty sure there are never references *into* the type units so we shouldn't have offsets/relocations to fix up, except maybe if the length computation is done using that sort of trickery. Then the section-pasting code would have to DTRT to make those go away.

Other ideas?

But I don't think we start emitting the CU as we are going, right? I
thought it was handled much like the TUs would be by this patch - built...
And then emitted once it was fully built. (Indeed because new attributes
can be added to DIEs, forward references to future DIEs can exist, etc - we
currently have no support for streaming dwarf emission - the entire DIE
tree is built in memory, then emitted)

So I would've expected we could produce a bunch of TUs and emit them while
we are building the CU DIE trees, then emit the CUs.

What specifically has been breaking that you're seeing?

CUs don't come out right away... Huh. In my first attempt at emitting TUs to .debug_info, I just had MC reuse the .debug_info section when it was asked about .debug_types. And I was definitely seeing intermixed headers, and otherwise unparseable sections.

I think intra-unit references are using section-relative offsets, or something, and that doesn't work well when "somebody else" is trying to stuff data into the same section. I didn't look into it too deeply because my immediate thought was, duh, we have effectively two streams pointing to the same section, so just defer one until we know the other one is done.

I'll go back to my first attempt tomorrow and see if it might be as simple as needing to use an actual label instead of the section-symbol as the base for the offsets. It means in DWARF 5, we'd see type units first in the .debug_info section, followed by the CU, which might break somebody's assumptions; but it's legal DWARF, so probably worth finding.

probinson abandoned this revision.Apr 13 2018, 6:41 AM

So, it turns out DebugInfo/DWARF needs to start looking for multiple .debug_info sections, just like it looks for multiple .debug_types sections, because the type units are still in COMDATs. And I think I found the weird intermixing problem, that was a different bug.

So, no need for this patch after all.