This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Add support for DWARF type units
ClosedPublic

Authored by dschuff on Sep 30 2020, 11:59 AM.

Details

Summary

Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.

Diff Detail

Event Timeline

dschuff created this revision.Sep 30 2020, 11:59 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 30 2020, 11:59 AM
dschuff requested review of this revision.Sep 30 2020, 12:00 PM
dschuff added inline comments.Sep 30 2020, 12:01 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

I may add a couple more tests to this, but I did want to ask @sbc100 about this, since I'm not 100% sure at the uniqueID field is for.

sbc100 accepted this revision.Sep 30 2020, 12:07 PM
sbc100 added inline comments.
llvm/lib/MC/WasmObjectWriter.cpp
1380

Just checking in the case where you were hitting this error the SectionName was duplicate but the Begin is uniquified ?

This revision is now accepted and ready to land.Sep 30 2020, 12:07 PM
dschuff added inline comments.Sep 30 2020, 2:43 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

also let me be more clear about the question here: what is UniqueID for, and is it bad that I'm just passing it a number that is totally not unique?

llvm/lib/MC/WasmObjectWriter.cpp
1380

right. There's a second .debug_info section, so the name of Begin gets uniquified.

aardappel added inline comments.Oct 1 2020, 12:58 PM
llvm/test/DebugInfo/WebAssembly/dwarf-headers.ll
48

I guess this doesn't actually test that only one copy of these remain after linking? That's already tested in LLD?

dblaikie added inline comments.
llvm/lib/MC/MCObjectFileInfo.cpp
967

For ELF, at least, I believe the unique ID is used to know which elements are to be treated as part of the same deduplication set.

If Wasm support in lld does the same thing, then using the same number for every type unit would mean the linked binary would end up with only one type definition - even when the input has many varied/independent type definitions. Likely not what's intended.

llvm/test/DebugInfo/WebAssembly/dwarf-headers.ll
25–31

Given that S.plit DWARF type units don't require comdat support (the dwp tool will be aware of the type units and parse their boundaries, read their type hash from the TU Header, etc). /may/ be worth separating that functionality/testing from the comdat support/testing - but maybe not all that interesting to separate it into two separate patches. (& if you find the test coverage works better in one test, that's OK - just a thought)

48

Certainly lld shouldn't be tested here, no (llvm tests can't depend on other subprojects like lld) - but yeah, some kind of comdat deduplication tests should exist in lld (they don't have to be DWARF specific - the DWARF type deduplication is meant to piggy-back on the existing comdat deduplication infrastructure in thel inker)

dschuff added inline comments.Oct 7 2020, 1:34 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

For wasm I had thought that was what the 3rd argument (Group) was for. So if that's what UniqueID is for, then I have the same question about Group :)

llvm/test/DebugInfo/WebAssembly/dwarf-headers.ll
25–31

I copied this test more-or-less directly from X86 :)
Although I will say it's kind of nice to test all the header types in one go like this just because it's slightly annoying to construct the LLVM metadata for debuginfo tests, so it's nice to be able to share that as much as possible. Previously we just didn't have enough coverage for split-dwarf anyway.

48

@aardappel yes, pretty much what @dblaikie said :)

dblaikie added inline comments.Oct 7 2020, 3:11 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

Oh, fair enough - I hadn't read closely. Yeah, guess it's up to you folks/how the wasm object format works... - so I'm with you on the "what is the uniqueID field for" (& what's the other field that's taking the hash?) & I'll leave it to you folks to... hash out.

llvm/test/DebugInfo/WebAssembly/dwarf-headers.ll
25–31

Ah, fair enough - symmetry sounds good :)

dschuff added inline comments.Oct 14 2020, 2:45 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

@sbc100 ping, is this what we want here?

This comment was removed by sbc100.
sbc100 added inline comments.Oct 26 2020, 6:06 PM
llvm/lib/MC/MCObjectFileInfo.cpp
967

It looks like you can use GenericSectionID rather and ~0 here (and we should change that elsewhere too).

Other than that lgtm

dschuff updated this revision to Diff 301133.Oct 27 2020, 5:07 PM

use GenericSectionID instead of ~0

This revision was landed with ongoing or failed builds.Oct 27 2020, 5:14 PM
This revision was automatically updated to reflect the committed changes.

This broke the bots for some strange reason that didn't reproduce locally. But because it was ~all of them, I probably just did something stupid.

This broke the bots for some strange reason that didn't reproduce locally. But because it was ~all of them, I probably just did something stupid.

Good to have links to buildbots and/or quotes from buildbots (incase the logs get retired) about the specifics so other folks can chip in/know what went wrong, etc.

This broke the bots for some strange reason that didn't reproduce locally. But because it was ~all of them, I probably just did something stupid.

Good to have links to buildbots and/or quotes from buildbots (incase the logs get retired) about the specifics so other folks can chip in/know what went wrong, etc.

Example link is http://lab.llvm.org:8011/#/builders/109/builds/1402
But, as I suspected, I just did something stupid (namely, I thought I had assertions enabled, but I didn't)

dschuff reopened this revision.Oct 28 2020, 5:17 PM
This revision is now accepted and ready to land.Oct 28 2020, 5:17 PM
dschuff updated this revision to Diff 301486.Oct 28 2020, 5:17 PM

use getOrCreate

dschuff updated this revision to Diff 301488.Oct 28 2020, 5:34 PM

can't just getOrCreate symbol without dealing with the section
just disable the test for now as we don't need dw5

dschuff updated this revision to Diff 301490.Oct 28 2020, 5:35 PM

fix diff; it should be against master

@sbc100 I found that the cause of the assertion is that
in dwarf 5, the type units apparently go in the .debug_info section (instead of the .debug_type section), and this section already exists (but it was created as a non-comdat).
So when getWasmSection tries to look up an existing section it fails because the group is part of the key. Then it tries to create a new section but that fails because the name is duplicate.

For some reason this doesn't happen or is not a problem with ELF but I haven't looked up why yet.
For now I just disabled the dwarf5+TU tests since we don't really use dwarf5 yet anyway.

@sbc100 I found that the cause of the assertion is that
in dwarf 5, the type units apparently go in the .debug_info section (instead of the .debug_type section), and this section already exists (but it was created as a non-comdat).
So when getWasmSection tries to look up an existing section it fails because the group is part of the key. Then it tries to create a new section but that fails because the name is duplicate.

For some reason this doesn't happen or is not a problem with ELF but I haven't looked up why yet.
For now I just disabled the dwarf5+TU tests since we don't really use dwarf5 yet anyway.

for ELF, each {section name, hash} pair denotes a distinct section (when using SHF_GROUP). Linker's deduplicate the group based on the hash, then squish all the distinct sections with the same name together. (same system that powers -ffunction-sections -fno-unique-section-names - even though each function section is called ".text", because they have different hashes and use SHF_GROUP, they are distinct sections in the object file)

@sbc100 I found that the cause of the assertion is that
in dwarf 5, the type units apparently go in the .debug_info section (instead of the .debug_type section), and this section already exists (but it was created as a non-comdat).
So when getWasmSection tries to look up an existing section it fails because the group is part of the key. Then it tries to create a new section but that fails because the name is duplicate.

For some reason this doesn't happen or is not a problem with ELF but I haven't looked up why yet.
For now I just disabled the dwarf5+TU tests since we don't really use dwarf5 yet anyway.

for ELF, each {section name, hash} pair denotes a distinct section (when using SHF_GROUP). Linker's deduplicate the group based on the hash, then squish all the distinct sections with the same name together. (same system that powers -ffunction-sections -fno-unique-section-names - even though each function section is called ".text", because they have different hashes and use SHF_GROUP, they are distinct sections in the object file)

Without similar behavior for wasm, how does wasm deduplicate these type units, if they're all in one section together? (have you tested this with two types? I'd expect that to hit the same problem - of attempting to create a section that already exists when it went to create the second type unit)

dschuff closed this revision.Oct 30 2020, 4:38 PM