This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Better support for WASM Object format
Needs ReviewPublic

Authored by patcheng on Jan 18 2018, 12:51 AM.
This revision needs review, but all reviewers have resigned.

Details

Reviewers
sbc100
Summary

Addressed https://bugs.llvm.org/show_bug.cgi?id=35928

WasmObjectFile::section() now exposes WasmDataSegments as sections
so that when Clang iterates through the sections, it wil find the
WasmDataSegments.

When writing WasmDataSegments, if the segment name is "__clangast", now make sure the WasmDataSegment is aligned by adding padding to the integer encoding.

Diff Detail

Event Timeline

patcheng created this revision.Jan 18 2018, 12:51 AM
patcheng retitled this revision from Better support for WASM Object format to [WebAssembly] Better support for WASM Object format.Jan 18 2018, 12:52 AM
patcheng added a reviewer: sbc100.

This commit is not ready. wasm2yaml is relying on WasmObjectFile::sections() to return only 1 Data Section. Now, we are returning each DataSegment. As a result, some of the tests are failing currently.

sbc100 added inline comments.Jan 18 2018, 10:01 AM
lib/Object/WasmObjectFile.cpp
1058

This seems reasonable, and could even be a separate change that could land right away if your prefer.

1149

Should do this for section_end() too? and symbol_begin()/end() too?

Is there a particular motivation for this? I someone reading this value somewhere?

re: wasm2yaml

Is it only used for unit tests?
Before I process, just want to make sure I am causing issues. Basically, I am thinking changing the output from one block for entire Data section

- Type:            DATA
  Segments:        
    - SectionOffset:   6
      MemoryIndex:     0
      Offset:          
        Opcode:          I32_CONS
        Value:           0
      Content:         '00000000'
...

to a Data Segment.

- Type:            DATA
  SectionOffset:   6
  MemoryIndex:     0
  Offset:          
    Opcode:          I32_CONS
    Value:           0
  Content:         '00000000'
...
lib/Object/WasmObjectFile.cpp
1058

Looks like it's used by clang when -fembed-bitcode is specified:

https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/BackendUtil.cpp#L1246

Probably makes more sense to separate it.

1149

See changes to moveSectionNext() above.

Now when iterating through sections(), we are using Ref.d.a and Ref.d.b.

Don't think it's needed in other methods.

Actually, looking into a bit more, looks like DataRefImpl has a constructor that memset() everything. Maybe it's not strictly necessary (?)

patcheng updated this revision to Diff 130773.Jan 20 2018, 11:09 AM

Updated llvm-readobj and obj2yaml so that output in similar format as previously.

As far as I know yes the yaml<->obj tools are mostly used for testing.

sbc100 added inline comments.Jan 20 2018, 9:59 PM
include/llvm/Object/Wasm.h
213

This method seems like it could use a better name since "DetaRef" is can refer to many things, not just segment. Also, what is wrong with using the existing getSectionContents method?

lib/MC/WasmObjectWriter.cpp
856

Unlike other formats that wasm file formats is not designed to be mapped into memory, so the file alignment should not be relevent. If it is particularly important for bitcode data perhaps you could just align those segments, but I don't think we want padding for data segments in general.

patcheng updated this revision to Diff 131542.Jan 25 2018, 11:37 PM
patcheng edited the summary of this revision. (Show Details)

Now only align when the segment name is "__clangast"

sbc100 added inline comments.Jan 28 2018, 11:11 AM
lib/MC/WasmObjectWriter.cpp
848

drop the the { } ? (and maybe move the comment out side the if if it makes it more readable?)

867

The indentation here looks funny. Can you run clang-format on this patch ?

lib/Object/WasmObjectFile.cpp
1121

Can this line move to the else of the if below?

test/MC/WebAssembly/bss.ll
72 ↗(On Diff #131542)

Revert this change?

test/tools/llvm-objdump/wasm.txt
14 ↗(On Diff #131542)

We should probably give this section (actually its the one and only segment, right?) a name.

Also, perhaps we should omit the data section itself from the list, now that its sub-segments are themselves lists?

patcheng added inline comments.Jan 28 2018, 1:52 PM
test/tools/llvm-objdump/wasm.txt
13 ↗(On Diff #130773)

Now Section.Name comes from DataSegment's Name, which is comes from linking.

In trivial.obj.wasm, there is no names for the DataSegment, so it is currently returning nothing.

Should it default to "DATA"?

sbc100 added inline comments.Jan 28 2018, 2:02 PM
test/tools/llvm-objdump/wasm.txt
13 ↗(On Diff #130773)

Maybe we should regenerate trivial.obj.wasm? See test/tools/llvm-readobj/Inputs/trivial.ll for how to do this

patcheng updated this revision to Diff 131732.Jan 28 2018, 4:48 PM
patcheng marked 4 inline comments as done.

Changes suggested by sbc100

sbc100 added inline comments.Jan 29 2018, 9:40 AM
lib/Object/WasmObjectFile.cpp
995

Remove extra brackets around DataSegments.size()

1025

Use if/else here

1046

Just use getSectionContents() here?

1052

No need for getSectionContentArrayRef method I think, just put that logic right here.

1129

remove extra newline

tools/llvm-readobj/WasmDumper.cpp
148 ↗(On Diff #131732)

Move this up before the declaration of WasmSec

patcheng updated this revision to Diff 132547.Feb 2 2018, 1:21 AM
patcheng marked 2 inline comments as done.
patcheng marked 5 inline comments as done.Feb 2 2018, 1:30 AM

Hopefully, the code should be good now.

Unfortunately, I need help adding the newly generated trivial.obj.wasm here

to the diff.

Thanks

test/tools/llvm-objdump/wasm.txt
13 ↗(On Diff #130773)

Thanks for the tip.

I was able to regenerate trivial.obj.wasm file, and updated the test result, but I was not able to include the new .wasm as part of the patch without getting an error from Phabricator.

Sorry to flip flop on this but I think we are going in the direction of using custom sections for this kind of this (debug info, lto stuff, etc). Anything that doesn't need to be in the address space at runtime.

Hopefully this should simply this change.

Are you able to generate and consume the __clangast now? It should show up as a custom section

patcheng marked an inline comment as done.May 10 2019, 9:48 PM

I haven't had the time to working on this project recently.

But looks like zhuowei was able to access "_clangast" without my changes. Here are the changes that was needed: https://github.com/apple/swift-llvm/pull/155

Maybe this can be closed?

sbc100 resigned from this revision.Aug 30 2019, 4:48 PM

I think this can be closed?