This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Add llvm-objdump support for wasm file format
ClosedPublic

Authored by sbc100 on Oct 31 2016, 4:27 PM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

sbc100 updated this revision to Diff 76503.Oct 31 2016, 4:27 PM
sbc100 retitled this revision from to Add llvm-objdump support for wasm file format.
sbc100 updated this object.
sbc100 added a subscriber: llvm-commits.
sbc100 updated this revision to Diff 77910.Nov 14 2016, 4:08 PM

Handle USER sections

sbc100 updated this revision to Diff 78034.Nov 15 2016, 10:54 AM
  • Better support for user sections

@Bigcheese would you mind taking a look? I think this is the smallest testable unit that makes sense.

sbc100 updated this revision to Diff 78048.Nov 15 2016, 12:03 PM
  • Update identify_magic change to match recent changes
Bigcheese edited edge metadata.Nov 15 2016, 2:22 PM

I'm taking a look. First thing I noticed is that it parses up front on creation. I'm not familiar with the wasm binary format, but generally we try to be as lazy as possible with this API.

lib/Object/WasmObjectFile.cpp
22–24 ↗(On Diff #77910)
auto ObjectFile = llvm::make_unique<WasmObjectFile>(Buffer, Err));

Also, can you clang-format this patch?

I could probably remove a all/most of the parseXXXSection functions from the PR actually since they are not used by objdump directly. Would make the PR smaller and mean less up-front parsing in the WasmObjectFile.cpp.

There's a lot more code than needed here to pass the tests. Most of the section parsing code isn't actually used.

I'm wondering how you envision mapping between the binary tools and wasm. What should each of the major options of objdump print? Should ar create archives with symbol tables? Is wasm an object file format, executable format, or both?

I'd like to know what the plan is so I can review the design with that in mind.

lib/Object/ObjectFile.cpp
16 ↗(On Diff #78048)

Sort alphabetically.

lib/Object/WasmObjectFile.cpp
45 ↗(On Diff #78048)

This is ignoring the endianess.

52 ↗(On Diff #78048)

This is ignoring the endianess.

119 ↗(On Diff #78048)

Why not return WasmSection by value?

434 ↗(On Diff #78048)

Isn't the name of the section for USER sections a string from the file?

tools/llvm-objdump/WasmDump.cpp
30 ↗(On Diff #78048)

The section table for other formats isn't considered part of the file header.

sbc100 updated this revision to Diff 78113.Nov 15 2016, 5:41 PM
sbc100 marked 5 inline comments as done.
sbc100 edited edge metadata.
  • Better support for user sections
  • Update identify_magic change to match recent changes
  • Remove unused parse functions
  • Run clang-format
  • Cleanup program header

I removed a lot of the code that is not needed for this initial PR.

Added tests for objdump -s, -p and -h.

The wasm file format as it stands today is designed as an executable file format. We plan to use it for object files down the line but our initial target is to use bitcode linking and only emit the wasm file for the final binary. The next phase is to design some extensions to the file format to allow for it to be used for object files also.

The binary format is described here: http://webassembly.org/docs/binary-encoding/

And the initial plan for extending the format for linkable object file is here: https://github.com/WebAssembly/tool-conventions/pull/3

@sunfish please correct me if I'm wrong here.

davide added a subscriber: davide.Nov 15 2016, 5:51 PM
davide added inline comments.
include/llvm/Object/Wasm.h
88 ↗(On Diff #78113)

Any reason why you're inlining this method here but not all the other trivial ones? (getArch() et similia)

include/llvm/Support/Wasm.h
38–49 ↗(On Diff #78113)

Adding comment about what these variables represent (as we do, e.g. in ELF.h) would be nice.

lib/Object/WasmObjectFile.cpp
117–119 ↗(On Diff #78113)

Can any of these unreachable be triggered unvoluntarily by the user? (indirectly, e.g. via objdump?). If so, they should be converted to report_fatal_error. Otherwise it's fine, but please add the name of the function to the unreachable string.

210 ↗(On Diff #78113)

Can this fail? e.g. empty section contents?

sbc100 retitled this revision from Add llvm-objdump support for wasm file format to [WebAssembly] Add llvm-objdump support for wasm file format.Nov 15 2016, 6:06 PM
sbc100 updated this revision to Diff 78219.Nov 16 2016, 10:35 AM
sbc100 marked 2 inline comments as done.
  • Better support for user sections
  • Update identify_magic change to match recent changes
  • Remove unused parse functions
  • Run clang-format
  • Cleanup program header
  • Add comments to Wasm.h
  • Add check for empty section
sbc100 added inline comments.Nov 16 2016, 10:38 AM
include/llvm/Object/Wasm.h
88 ↗(On Diff #78113)

Looks like I cargo-culted this from MachO.h and COFF.h which both have this strangeness. Will move to cpp file.

lib/Object/WasmObjectFile.cpp
117–119 ↗(On Diff #78113)

None of the objdump flags seem to tickle any of these unreachables.

I don't see any precedent for including the name of the unreachable function for unimplemented functions (See git grep -i "llvm_unreachable.*unimp". Want me to go ahead and do this anyway?

210 ↗(On Diff #78113)

No, sections cannot be empty.

Anything else blocking this CL? I think I addressed all the feedback so far.

davide added inline comments.Nov 21 2016, 10:49 AM
include/llvm/Object/Wasm.h
31 ↗(On Diff #78219)

I think inline inside class definition is not necessary (check elsewhere).

34 ↗(On Diff #78219)

This comment is too trivial to be useful. Please remove.

lib/Object/WasmObjectFile.cpp
71 ↗(On Diff #78219)

Can't this just be a StringError instead of a GenericBinaryError?

117–119 ↗(On Diff #78113)

No, leave it as is. Hopefully they won't be unreachable for long.

210 ↗(On Diff #78113)

Please add a comment explaining why this can't fail.

sbc100 updated this revision to Diff 78808.Nov 21 2016, 5:25 PM
sbc100 marked 2 inline comments as done.
  • Better support for user sections
  • Update identify_magic change to match recent changes
  • Remove unused parse functions
  • Run clang-format
  • Cleanup program header
  • Add comments to Wasm.h
  • Add check for empty section
  • Add comment and remove unneeded friend declartions
include/llvm/Object/Wasm.h
31 ↗(On Diff #78219)

All the other headers in include/llvm/Object do it this way

lib/Object/WasmObjectFile.cpp
71 ↗(On Diff #78219)

GenericBinaryError seems to make sense since this is an problem with binary file. Its a subclass of BinaryError which I think is that kind of error we want here.

From the docs from GenericBinaryError:

/ For errors that don't require their own specific sub-error (most errors)
/ this class can be used to describe the error via a string message.

MachOObjectFile.cpp uses it too.

davide added inline comments.Nov 21 2016, 5:28 PM
include/llvm/Object/Wasm.h
31 ↗(On Diff #78219)

This doesn't necessarily mean they're right.

lib/Object/WasmObjectFile.cpp
71 ↗(On Diff #78219)

ehm, feels quite a bit overabstraction to me.
In this case, you don't need anything more than a StringError (as we do for ELF).

sbc100 updated this revision to Diff 78811.Nov 21 2016, 5:41 PM
sbc100 marked an inline comment as done.
  • Remove inline from classof and use StringError
include/llvm/Object/Wasm.h
31 ↗(On Diff #78219)

Hmm, seems like the codebase is quite divided on this one:

$ git grep classof ./include | grep inline | wc -l
230
$ git grep classof ./include | grep -v inline | wc -l
257

Perhaps I should propose a cleanup CL?

In the mean time I'll drop inline here I guess?

lib/Object/WasmObjectFile.cpp
71 ↗(On Diff #78219)

OK.. will change.

davide accepted this revision.Nov 21 2016, 5:55 PM
davide added a reviewer: davide.

This LGTM. Michael might have some extra comment.

include/llvm/Object/Wasm.h
31 ↗(On Diff #78219)

That's a great idea if you feel like it =)
I'm not sure if it's possible in the general case but it would be nice to think if it can be automated using a clang-tidy check to be run on the whole codebase (instead of you changing every definition by hand)

This revision is now accepted and ready to land.Nov 21 2016, 5:55 PM
dschuff accepted this revision.Nov 29 2016, 5:33 PM
dschuff added a reviewer: dschuff.

This LGTM too. @Bigcheese, any further comments?

Bigcheese accepted this revision.Nov 29 2016, 6:06 PM
Bigcheese edited edge metadata.

lgtm

This revision was automatically updated to reflect the committed changes.