Page MenuHomePhabricator

[WebAssembly] Emit PCH __clang_ast in custom section
Needs ReviewPublic

Authored by kateinoigakukun on Feb 13 2020, 1:37 AM.

Details

Reviewers
dschuff
uweigand
Summary

Resolve https://bugs.llvm.org/show_bug.cgi?id=35928 to promote Swift for WebAssembly

__clang_ast section should be aligned by 4 bytes, so I added padding after the size of custom section name.

Diff Detail

Event Timeline

sbc100 added inline comments.Feb 13 2020, 9:02 AM
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
1701

Hmm.. it looks like this function isn't actually needed: https://reviews.llvm.org/D74565.

Does this change work if you simply return K here? I guess/hope it does.

llvm/lib/MC/WasmObjectWriter.cpp
383

I general this seems like trick that could work. However, I think this value is encoded as an LEB32 which means its max length is technically 5. I don't think a 6 byte LEB is valid .

1113

Why this part needed ? What happens if you don't specify this? I looks like removes section symbol from the symtab?

Can we get a test to see the effect of this patch?

sbc100 added inline comments.Feb 14 2020, 3:34 PM
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
1701

Let me know if this change is uneeded with https://reviews.llvm.org/D74565.? I'll hold of landing that until I hear back. Hopefully it will just work and simplify this patch once it lands.

kateinoigakukun marked 3 inline comments as done.Feb 16 2020, 8:42 PM
kateinoigakukun added inline comments.
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
1701

Thanks for suggestion. But this special treatment is necessary to put the clangast in custom section.
The section kind K come from TargetLoweringObjectFile::getKindForGlobal which is called from AsmPrinter::EmitGlobalVariable, but the getKindForGlobal infers that __clangast global variable should be put in readonly section.

Then readonly section is put in data segment for WebAssembly. But what we really want in WebAssembly context is to put clangast metadata in custom section.

Should I change getKindForGlobal to return metadata section kind for clangast? But I'm not sure how the change effects to other object formats.

llvm/lib/MC/WasmObjectWriter.cpp
383

Thanks. I see, I confirmed that you are right https://webassembly.github.io/spec/core/binary/values.html#integers

Then how should I pad section start? Should I pad with empty data segment or something else?

1113

This part prevents the clangast section symbol from putting it in symbol table.
This part is necessary to put clangast in custom section because the sym is treated as data type but the actual content is not in data section, so WasmObjectWriter raised abortion with "data symbols must live in a data section: __clang_ast" when it tries to write the sym in symtab.

Could you take a look again?

sbc100 added inline comments.Feb 20 2020, 4:08 PM
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
1701

I see. I'd still rather get rid of this function completely in the long run, so if I land my change first you can just do this check directly in TargetLoweringObjectFileWasm::getExplicitSectionGlobal. Is there some other way we can identify __clang_ast here other than by section name? Is it a special kind of symbol?

In the long run I think this problem will be solved by putting each symbol in its own wasm section but to do that we need to diverge from the spec a little. Would you mind added a TODO here and referencing this bug: https://github.com/WebAssembly/tool-conventions/issues/138.

I think it we fix this bug we won't need this change.

It won't solve the alignment issue though :(

llvm/lib/MC/WasmObjectWriter.cpp
383

I think the problem is that the wasm file itself doesn't have the concept of alignment of sections, since its not designed to memory mapped in that way. Its designed to be parsed. Is the loading code expecting the file offset of the section to be aligned? I'm not sure how we can work around this.

llvm/test/MC/WebAssembly/clangast.ll
30

I'm confused by this size value. Shouldn't it be at least 27?

My colleague pointed out that one way to archive padding in the binary format would be to create a synthetic custom section that precedes that one you want to align. Its kind of horrible, but wasm files were not designed to used memory mapped like this.

kateinoigakukun marked an inline comment as done.Feb 20 2020, 7:33 PM

My colleague pointed out that one way to archive padding in the binary format would be to create a synthetic custom section that precedes that one you want to align. Its kind of horrible, but wasm files were not designed to used memory mapped like this.

Thanks! I'll try this way. My only concern is whether the order of custom sections are guaranteed.

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
1701

OK, I'll update after https://reviews.llvm.org/D74565 merged.

Is there some other way we can identify __clang_ast here other than by section name? Is it a special kind of symbol?

I couldn't find any other way to identify the section. I think it's a really special case to support PCH.

In the long run I think this problem will be solved by putting each symbol in its own wasm section but to do that we need to diverge from the spec a little. Would you mind added a TODO here and referencing this bug: https://github.com/WebAssembly/tool-conventions/issues/138.

I see. I agree with you. I'll add TODO comment here.

Thanks! I'll try this way. My only concern is whether the order of custom sections are guaranteed.

If you simple create the extra section right there in the WasmObjectFileWriter it will be in the right order. I wasn't recommending creating an extra section at the llvm level.