This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Linking debug (DWARF) sections from WebAssembly object files
ClosedPublic

Authored by yurydelendik on Mar 30 2018, 5:57 PM.

Details

Reviewers
sbc100
Summary

The debug (custom) sections can be generated using D44184 logic. The linker needs to combine/concat these sections and re-adjust relocation records. (The documentation for relocation records can be found at https://github.com/WebAssembly/tool-conventions/pull/50)

Event Timeline

yurydelendik created this revision.Mar 30 2018, 5:57 PM

extract getFunctionSizeLength

ruiu added a subscriber: ruiu.Apr 2 2018, 10:34 AM

For a patch like this, I'd like to know the overview of the debug story in the wasm world first, so that I can review this patch in that story line. I can imagine a lot of different strategies to enable debug info for wasm. Is debug info embedded to wasm file or handled as a separate file? Will it be in DWARF format (looks like so)? Can DWARF really support wasm-style memory abstraction? I'd be really helpful if you share your plan.

For a patch like this, I'd like to know the overview of the debug story in the wasm world first, so that I can review this patch in that story line. I can imagine a lot of different strategies to enable debug info for wasm... I'd be really helpful if you share your plan.

As now, there is no concrete proposal/specs on the table for wasm debug information and support yet. The D44184 and this patch will export DWARF information as it is generated by LLVM, which is more than enough to export source maps for WebAssembly. At the moment there is some information missing, such as DebugValue, but that's addressable in the future. While this entire work appears to be too experimental, it will help to prototype and identify future debug information format for WebAssembly and start experiments with debuggers earlier.

Is debug info embedded to wasm file or handled as a separate file? Will it be in DWARF format (looks like so)?

The generated by LLVM wasm files will have custom sections with DWARF (?) information in them as it is generated by LLVM. But it is not necessarily what will be published on the web -- the non-functional requirement for the wasm is still to keep binary as small as possible, so the custom sections (including debug one) will not be present in production builds.

Can DWARF really support wasm-style memory abstraction?

We will try to find the way to express that in DWARF structures, if it's not already happening. I'm more concerned about expressing stack machine registers (in DebugValue expressions) via DWARF -- in a worst case, we will need to extend the format to address that.

If its OK with you I'm going to try and split off the generic custom section handling and land that first. See https://reviews.llvm.org/D45297

If its OK with you I'm going to try and split off the generic custom section handling and land that first. See https://reviews.llvm.org/D45297

It is okay with me.

Don't export DefinedSection symbols.

ruiu added a comment.Apr 6 2018, 5:08 PM

Since this looks like a straightforward implementation to copy DWARF sections to an output, I think I'm fine with this change overall as long as wasm guys are fine. But I feel more comfortable if you document the debug story somewhere (as a separate doc or as a comment). Currently it is too terse that readers of the code can't tell what is decided and what's not.

ruiu added inline comments.Apr 6 2018, 5:08 PM
wasm/InputFiles.cpp
309

Is this a new enum? Looks like it doesn't exist at the moment.

sbc100 added a comment.Apr 6 2018, 7:23 PM

General support for custom sections is here: https://reviews.llvm.org/D45340

Once that lands this change should get a lot smaller I hope.

Rebase on top of D45340

ruiu added a comment.Apr 12 2018, 2:26 PM

I don't think I understand what you are trying to achieve with this patch in the first place, as you didn't explain it anywhere. Can you write a patch description properly so that we can understand the intention of the change and what the change is?

wasm/InputChunks.h
189

This function looks a bit odd. This is "InputSection", so an instance of this class is a section. And does a section has "getSection" method? It doesn't sound quite right.

190

We don't use the term "Payload" in lld. We simply call it Data when we have to name it.

Looks like this function is unused? Can you remove this?

wasm/InputFiles.cpp
46

getSizeLength sounds a odd name. Aren't size and length synonyms?

47

Let's be consistent. Since this function returns size_t, it should be size_t.

But I think it is better to use uint64_t when something can be larger than uint32_t.

134

Are you sure that uint8_t is enough?

We don't normally add const when it is obvious.

141

We don't use auto unless it's type is obvious.

187

When you use {}, add braces to all if and else ifs.

I.e.

if (...) {
} else if (...) {
} else {
}

or

if (...) ...;
else if (...) ...;
else ...;

are OK, but

if (...)
else if (...)
else if (...) {
} else ...

isn't.

yurydelendik retitled this revision from Initial propototype for mergin (custom) debug sections with DWARF format. to Linking debug (DWARF) sections from the WebAssembly object files.Apr 12 2018, 2:40 PM
yurydelendik edited the summary of this revision. (Show Details)
yurydelendik edited the summary of this revision. (Show Details)

Rebase at D45579.
Address review comments.

yurydelendik marked 8 inline comments as done.Apr 16 2018, 2:12 PM
yurydelendik added inline comments.
wasm/InputFiles.cpp
46

Renamed to getFunctionCodeOffset

309

Yes, it is. It will be defined in D44184

yurydelendik marked 2 inline comments as done.

Rebase on top of D45795

sbc100 added inline comments.Apr 26 2018, 1:18 PM
wasm/InputFiles.cpp
96

Can you use Sym->Function->getFunctionBody() here like you do below?

101

Can you remove this TODO? Shouldn't the value in the file always be section-relative?

331

I don't think undefined section symbols should be possible.

In fact, so we even want defined section symbols in the symbol table? They don't have unique names after all. The way ELF handles this is that all section symbols have local binding so should never get added to symboltable.

wasm/SymbolTable.h
42

We already type with this name in OutputSections.h.

wasm/Symbols.h
191

I don't think we can have undefined section symbols, so maybe we only need one type here?

Rename SyntheticSection
Remove UndefinedSection
Add getFunctionInputOffset
Rebase

yurydelendik marked 5 inline comments as done.Apr 26 2018, 4:45 PM
yurydelendik added inline comments.
wasm/SymbolTable.h
42

Renamed to InputFilesSectionsSet (not sure how to give a better name for structure that holds and maps InputSections to their InputFiles)

sbc100 added inline comments.Apr 26 2018, 5:00 PM
wasm/InputChunks.h
28

No longer needed here? maybe in the cpp file?

wasm/InputFiles.cpp
141

Maybe I'm missing something but SymbolTable seems like the wrong place to store this.
What do you think about my last comments about how these symbols should be local and therefore not be added to symbol table at all?

Why can't you just replace this code with ?

InputSection* IS = CustomSectionsByIndex[Sym->SectionIndex];
return IS->OutputOffset + Reloc.Addend;

I'm probably missing something here.

yurydelendik marked an inline comment as done.Apr 27 2018, 11:37 AM
yurydelendik added inline comments.
wasm/InputFiles.cpp
141

What do you think about my last comments about how these symbols should be local and therefore not be added to symbol table at all?

Yeah, I'm trying to figure this out atm

Why can't you just replace this code with ?

Index CustomSectionsByIndex in expectes InputFile section index, but Sym->SectionIndex just contains index for symbol data (not associated with output or input section index).

Remove tracking of InputSection from Symtab.

yurydelendik marked an inline comment as done.May 1 2018, 6:07 AM
sbc100 added a comment.May 2 2018, 3:31 PM

Looking much better in general

wasm/Symbols.h
178

This extra symbol type seems a little strange to me...

Could we put the getOutputSectionIndex/setOutputSectionIndex in the base SectionSymbol class so we don't need a separate CombinedSection?

Then we could want to call getOutputSectionIndex on all of the DefinedSection object too.

wasm/Writer.cpp
777

You can just use S->getName() I think

789

I don't see how you can use SymbolIndex here without incrementing it.

793

This loop seems like its doing some different things.

I wonder if we can find a way to build CombinedSectionSymbols in some other phase? For example perhaps in createCustomSections? Since for each custom section we need such a symbol?

798

Can you replace these three lines with Sym = &P.first->second and avoid the continue?

Simplify assignSymtab for CombinedSectionSymbols
Fix createCustomSections for non-referred sections

yurydelendik marked 3 inline comments as done.May 2 2018, 5:37 PM
yurydelendik added inline comments.
wasm/Symbols.h
178

The CombinedSection comprised from InputChunks of one or more DefinedSections and only CombinedSection will be needed getOutputSectionIndex() for linking section. We have getOutputSymbolIndex() for reloc entries, that is assigned in the assignSymtab.

wasm/Writer.cpp
789

It is incremented at OutputSym->setOutputSymbolIndex(SymbolIndex++); below

yurydelendik marked an inline comment as done.

Move CombinedSectionSymbols creation
Add test

yurydelendik marked an inline comment as done.May 3 2018, 10:52 AM
sbc100 added inline comments.May 3 2018, 12:05 PM
wasm/InputFiles.cpp
47

I don't think you need "const" with ArrayRef as Refs are immutable already.

83

Section symbols are always defined. So you can just cast here and never return 0.

Also, perhaps getSectionSymbol should just return DefinedSection instead?

331

The error message is little confusing. How about "section symbols cannot be undefined"

wasm/Symbols.h
191

Hot about this: We can have single SectionSymbol which we know is always defined. It can have getOutputSectionIndex() / setOutputSectionIndex() as well as const InputSection *Section;.

The Writer could then just pick the first SectionSymbol with a given name to be the in the symbol table? Then we wouldn't need to create any new symbols and we wouldn't need CombinedSection class at all.

Would that work? I could be missing something.

wasm/Writer.cpp
392

Is this just to save the Strings? If so does Saver.save() work?

490

A pattern use elsewhere is to simply to assume you can cast to the ramaining type in the final else.

e.g:

} else {
   // At this point we know we have CombinedSection 
   auto *S = cast<CombinedSection>(Sym));
   ...
}

That way the cast will fail at runtime if anything goes wrong and you don't need the assert case

Refactor SectionSymbol (remove CombinedSection and DefinedSection)
Remove InternedStrings
Review items

yurydelendik marked 6 inline comments as done.May 3 2018, 1:58 PM
sbc100 added a comment.May 3 2018, 2:47 PM

I like this better. Almost ready to land I think.

I'm mentioned that we can/should remove as much of the checking for ".debug_" as we can since I don't think change is specific the debug sections really, it should work for any custom section that want to include relocations.

@ruiu do you understand what are trying to do here?

wasm/Writer.cpp
391

I think we can drop the second part of this condition perhaps?

394

now I guess it should be "code, data, or custom sections"

759

SectionSymbolIndices?

778

Can we drop the first clause of this condition?

779

.count(Name) == 0?

Remove startswith(".debug") checks
Rename SectionSymbolIndices

yurydelendik marked 7 inline comments as done.May 3 2018, 3:22 PM
sbc100 accepted this revision.May 4 2018, 12:04 PM
sbc100 added inline comments.
wasm/InputFiles.cpp
80

No need for this cast anymore

140

No need for this cast anymore, or the if I guess

This revision is now accepted and ready to land.May 4 2018, 12:04 PM

We normally use [WebAssembly] at the start of the CL description.

yurydelendik retitled this revision from Linking debug (DWARF) sections from the WebAssembly object files to [WebAssembly] Linking debug (DWARF) sections from WebAssembly object files.May 4 2018, 1:49 PM

Remove cast<SectionSymbol>

yurydelendik marked 2 inline comments as done.May 4 2018, 1:52 PM
sbc100 added inline comments.May 4 2018, 3:28 PM
wasm/Writer.cpp
744

I don't think you need this since sections symbols are always local right?

Remove isa<SectionSymbol> check