This is an archive of the discontinued LLVM Phabricator instance.

Add initial support for WebAssembly binary format
AbandonedPublic

Authored by sbc100 on Oct 13 2016, 7:20 PM.

Details

Summary

Initial support for wasm object file format

Support for the wasm binary format in the following tools:

  • obj2yaml
  • yaml2obj
  • llvm-readobj
  • llvm-objdump
  • llvm-size

Event Timeline

sbc100 updated this revision to Diff 74611.Oct 13 2016, 7:20 PM
sbc100 retitled this revision from to Add initial support for WebAssembly binary format.
sbc100 updated this object.
sbc100 updated this object.Oct 13 2016, 7:24 PM
sbc100 added reviewers: dschuff, sunfish.
dschuff added inline comments.Oct 14 2016, 2:28 PM
include/llvm/Object/Wasm.h
11

implement->implements

12

Probably worth putting a link here to the design doc for the wasm binary format: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md (and later the text spec/standard once we have it)

36

We should implement classof so that isa/cast etc will work.

42

I guess most of the things in protected are the necessary overrides to implement ObjectFile. But do we need e.g. these data members there? Should they be private?

60

commented-out code.

70

These comments don't make sense without also having the one from SymbolRef

include/llvm/ObjectYAML/WasmYAML.h
42

WasmYAML

58

Commented.

include/llvm/Support/FileSystem.h
265

Do you know what the difference is between the various object vs executable file types is in this context? To the extent that it matters, we should probably start with focusing on executables, since that's well-specified, whereas we haven't got the details of what we want for object files nailed down (e.g. symbols, relocations, etc, including even how we will differentiate an object file from an executable).

lib/Object/WasmObjectFile.cpp
12

alphabetize includes.

23

What does R stand for? return value?

27

should this be return std::move(Err)?

32

Can any of these be static or anonymous namespace?

47

Does this function have any way to fail?

58

How do we handle running out of bytes/off the end of the buffer early? Or bogus data in section header? For that matter, how does decodeULEB128() fail?

81

use llvm_unreachable("message, e.g. not yet implemented") instead of assert(0)

263

I don't know if this string has to match an arch name (e.g. "wasm32" or "wasm64") but it may well end up being the case for wasm that we don't have different file formats for wasm32 vs wasm64. So if this is just arbitrary, for now it could just be "wasm".

268

This can be Triple::wasm32 for now.

dschuff edited edge metadata.Oct 14 2016, 2:29 PM

(in general i do think it's going in the right direction :)

sbc100 updated this revision to Diff 75094.Oct 18 2016, 4:58 PM
sbc100 marked 15 inline comments as done.
sbc100 edited edge metadata.

Add objdump support

sbc100 updated this revision to Diff 75096.Oct 18 2016, 5:03 PM

Remove some debugging

sunfish accepted this revision.Oct 20 2016, 12:11 PM
sunfish edited edge metadata.

Cool! This looks like a great start!

include/llvm/Object/Wasm.h
30

The latest spec changes call this "user defined".

This revision is now accepted and ready to land.Oct 20 2016, 12:11 PM
sunfish added inline comments.Oct 20 2016, 1:01 PM
include/llvm/Object/Wasm.h
42

It might be a little more consistent with the code for other container formats to move these enums and magic numbers toinclude/llvm/Support/Wasm.h.

sunfish added inline comments.Oct 25 2016, 11:24 AM
include/llvm/Object/Wasm.h
27

Let's update this to 0xd too.

sbc100 updated this revision to Diff 75902.Oct 26 2016, 9:17 AM
sbc100 marked 2 inline comments as done.
sbc100 edited edge metadata.

Parse all non-code sections.

sbc100 added inline comments.Oct 26 2016, 9:20 AM
include/llvm/Object/Wasm.h
27

I will probably wait until 0xd branched get merged then update this patch. Since this patch has now grown significantly do you think I should split it into smaller chunks? Perhaps I should land just the YAML<->wasm stuff with tests.. and leave the readobj stuff for a second patch?

include/llvm/Support/FileSystem.h
265

This is just an enumeration of different magic number types. I don't think we know at this point if we want more than one magic number.

lib/Object/WasmObjectFile.cpp
23

I guess so. Cargo culted. I gave it a more meaningful name.

47

Yes, it would be nice to not read past EOF in the cast of malformed file. I'm not sure how to do this since we don't konw how many bytes will be consumbed by decodeULEB128 but I'll figure it out. Added a TODO for now.

sbc100 updated this revision to Diff 75959.Oct 26 2016, 3:34 PM

Add data section handling
Remove un-needed opcodes

sbc100 updated this revision to Diff 75968.Oct 26 2016, 4:49 PM
  • Update to 0xd
sbc100 updated this revision to Diff 75974.Oct 26 2016, 7:40 PM
  • Add objdump support
sbc100 updated this object.Oct 26 2016, 7:49 PM
sbc100 updated this object.
sbc100 updated this object.Oct 26 2016, 7:52 PM
sbc100 updated this revision to Diff 75975.Oct 26 2016, 8:23 PM
sbc100 updated this object.
  • Add support for global section
sbc100 updated this revision to Diff 76164.Oct 27 2016, 9:28 PM
  • Add support for START and GLOBAL sections and add tests

All spec tests can now be round-tripped to YAML

dschuff added inline comments.Oct 28 2016, 3:00 PM
lib/Object/WasmObjectFile.cpp
441

#undef ECase?

test/tools/llvm-readobj/file-headers.test
374

Could be WASM-NEXT:

test/tools/llvm-readobj/sections.test
497

also WASM-NEXT

tools/llvm-readobj/WasmDumper.cpp
63

llvm_unreachable()

tools/obj2yaml/wasm2yaml.cpp
41

these new foo lines could be use auto*

tools/yaml2obj/yaml2wasm.cpp
44

could writeUint{32,64} be a template?

46

could you just use endian::byte_swap<uint64_t, little>(Value)?

Bigcheese requested changes to this revision.Oct 29 2016, 8:24 PM
Bigcheese edited edge metadata.

This patch is a bit too big. I'd prefer if you picked one of the tools and did that first. llvm-objdump is a good place to start.

lib/Object/WasmObjectFile.cpp
38–45

We already have these in Support/Endian.h

tools/llvm-readobj/MachODumper.cpp
1 ↗(On Diff #76164)

This is an unrelated change.

tools/obj2yaml/coff2yaml.cpp
1 ↗(On Diff #76164)

Unrelated change.

tools/obj2yaml/wasm2yaml.cpp
41

This, and the others, would be better as:

auto TypeSec = std::make_unique<WasmYAML::TypeSection>();
// Fill them up
S = std::move(TypeSec);
tools/yaml2obj/yaml2wasm.cpp
44

Support/Endian.h has write64le.

This revision now requires changes to proceed.Oct 29 2016, 8:24 PM
sbc100 updated this revision to Diff 76487.Oct 31 2016, 3:00 PM
sbc100 edited edge metadata.
sbc100 marked 10 inline comments as done.
  • Fixes based on code review
sbc100 added inline comments.Oct 31 2016, 3:02 PM
tools/yaml2obj/yaml2wasm.cpp
44

Switched to write64le / write32le

sunfish added inline comments.Nov 2 2016, 5:02 PM
lib/Object/WasmObjectFile.cpp
199

It'd be good to handle WASM_SEC_USER here too. If I understand the code here, we don't need to actually parse it, for now, similar to WASM_SEC_CODE above.

sunfish added inline comments.Nov 11 2016, 3:12 PM
lib/Object/WasmObjectFile.cpp
92

Here and at similar places elsewhere in this patch, object_error::parse_error would be more descriptive than object::invalid_file_type.

Oops, that's parse_failed, rather than parse_error.

lib/Object/WasmObjectFile.cpp
92

Fixed in the smaller PR: https://reviews.llvm.org/D26172

199

Fixed in smaller the PR: https://reviews.llvm.org/D26172

sbc100 abandoned this revision.Nov 14 2016, 4:11 PM