Page MenuHomePhabricator

[WebAssembly] Better support for WASM Object format
Needs ReviewPublic

Authored by patcheng on Jan 18 2018, 12:51 AM.




WasmObjectFile::section() now exposes WasmDataSegments as sections
so that when Clang iterates through the sections, it wil find the

When writing WasmDataSegments, if the segment name is "__clangast", now make sure the WasmDataSegment is aligned by adding padding to the integer encoding.

Diff Detail

Event Timeline

patcheng created this revision.Jan 18 2018, 12:51 AM
patcheng retitled this revision from Better support for WASM Object format to [WebAssembly] Better support for WASM Object format.Jan 18 2018, 12:52 AM
patcheng added a reviewer: sbc100.

This commit is not ready. wasm2yaml is relying on WasmObjectFile::sections() to return only 1 Data Section. Now, we are returning each DataSegment. As a result, some of the tests are failing currently.

sbc100 added inline comments.Jan 18 2018, 10:01 AM

This seems reasonable, and could even be a separate change that could land right away if your prefer.


Should do this for section_end() too? and symbol_begin()/end() too?

Is there a particular motivation for this? I someone reading this value somewhere?

re: wasm2yaml

Is it only used for unit tests?
Before I process, just want to make sure I am causing issues. Basically, I am thinking changing the output from one block for entire Data section

- Type:            DATA
    - SectionOffset:   6
      MemoryIndex:     0
        Opcode:          I32_CONS
        Value:           0
      Content:         '00000000'

to a Data Segment.

- Type:            DATA
  SectionOffset:   6
  MemoryIndex:     0
    Opcode:          I32_CONS
    Value:           0
  Content:         '00000000'

Looks like it's used by clang when -fembed-bitcode is specified:

Probably makes more sense to separate it.


See changes to moveSectionNext() above.

Now when iterating through sections(), we are using Ref.d.a and Ref.d.b.

Don't think it's needed in other methods.

Actually, looking into a bit more, looks like DataRefImpl has a constructor that memset() everything. Maybe it's not strictly necessary (?)

patcheng updated this revision to Diff 130773.Jan 20 2018, 11:09 AM

Updated llvm-readobj and obj2yaml so that output in similar format as previously.

As far as I know yes the yaml<->obj tools are mostly used for testing.

sbc100 added inline comments.Jan 20 2018, 9:59 PM

This method seems like it could use a better name since "DetaRef" is can refer to many things, not just segment. Also, what is wrong with using the existing getSectionContents method?


Unlike other formats that wasm file formats is not designed to be mapped into memory, so the file alignment should not be relevent. If it is particularly important for bitcode data perhaps you could just align those segments, but I don't think we want padding for data segments in general.

patcheng updated this revision to Diff 131542.Jan 25 2018, 11:37 PM
patcheng edited the summary of this revision. (Show Details)

Now only align when the segment name is "__clangast"

sbc100 added inline comments.Jan 28 2018, 11:11 AM

drop the the { } ? (and maybe move the comment out side the if if it makes it more readable?)


The indentation here looks funny. Can you run clang-format on this patch ?


Can this line move to the else of the if below?

72 ↗(On Diff #131542)

Revert this change?


We should probably give this section (actually its the one and only segment, right?) a name.

Also, perhaps we should omit the data section itself from the list, now that its sub-segments are themselves lists?

patcheng added inline comments.Jan 28 2018, 1:52 PM

Now Section.Name comes from DataSegment's Name, which is comes from linking.

In trivial.obj.wasm, there is no names for the DataSegment, so it is currently returning nothing.

Should it default to "DATA"?

sbc100 added inline comments.Jan 28 2018, 2:02 PM

Maybe we should regenerate trivial.obj.wasm? See test/tools/llvm-readobj/Inputs/trivial.ll for how to do this

patcheng updated this revision to Diff 131732.Jan 28 2018, 4:48 PM
patcheng marked 4 inline comments as done.

Changes suggested by sbc100

sbc100 added inline comments.Jan 29 2018, 9:40 AM

Remove extra brackets around DataSegments.size()


Use if/else here


Just use getSectionContents() here?


No need for getSectionContentArrayRef method I think, just put that logic right here.


remove extra newline


Move this up before the declaration of WasmSec

patcheng updated this revision to Diff 132547.Feb 2 2018, 1:21 AM
patcheng marked 2 inline comments as done.
patcheng marked 5 inline comments as done.Feb 2 2018, 1:30 AM

Hopefully, the code should be good now.

Unfortunately, I need help adding the newly generated trivial.obj.wasm here

to the diff.



Thanks for the tip.

I was able to regenerate trivial.obj.wasm file, and updated the test result, but I was not able to include the new .wasm as part of the patch without getting an error from Phabricator.

Sorry to flip flop on this but I think we are going in the direction of using custom sections for this kind of this (debug info, lto stuff, etc). Anything that doesn't need to be in the address space at runtime.

Hopefully this should simply this change.

Are you able to generate and consume the __clangast now? It should show up as a custom section

patcheng marked an inline comment as done.May 10 2019, 9:48 PM

I haven't had the time to working on this project recently.

But looks like zhuowei was able to access "_clangast" without my changes. Here are the changes that was needed:

Maybe this can be closed?

sbc100 resigned from this revision.Aug 30 2019, 4:48 PM

I think this can be closed?