This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Second phase of implementing extended const proposal.
ClosedPublic

Authored by sbc100 on Mar 9 2022, 6:22 PM.

Details

Summary

This change continues to lay the ground work for supporting extended
const expressions in the linker.

The included test covers object file reading and writing and the YAML
representation.

I have a followup PR that actually adds support for using these extended
expression in the linker.

Diff Detail

Unit TestsFailed

Event Timeline

sbc100 created this revision.Mar 9 2022, 6:22 PM
Herald added a project: Restricted Project. · View Herald Transcript
sbc100 requested review of this revision.Mar 9 2022, 6:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 6:22 PM
sbc100 edited the summary of this revision. (Show Details)Mar 9 2022, 6:26 PM
sbc100 retitled this revision from [WebAssembly] Second phase of implemented extended const proposal. to [WebAssembly] Second phase of implementing extended const proposal..Mar 9 2022, 10:23 PM

I assume these llvm_unreachables your're adding in this CL are the places that will be filled in in subsequent CLs?

llvm/include/llvm/ObjectYAML/WasmYAML.h
67

Should these be a union?
I wonder when we can go to C++17 variant...

llvm/lib/MC/WasmObjectWriter.cpp
935
llvm/lib/Object/WasmObjectFile.cpp
200

this is a bit confusing. If there's just a const/global.get/ref.null followed by an end, it's MVP, otherwise it's extended. But can there be 2 consts in a row? Seems like that would also be accepted here? and then line 208 would read a 3rd opcode?

I assume these llvm_unreachables your're adding in this CL are the places that will be filled in in subsequent CLs?

Yup.. they represent places where we don't currently handle extended const.. but we will do soon.

llvm/lib/Object/WasmObjectFile.cpp
200

Line 206 resets the the read pointer back to the start of the init expr.

Its kind of sad we have to actually parse of the initexpr here. Turns out that because the initexpr doesn't encode its own length (unlike a function) it cannot be skipped over which means that is not possible to write a file decoder that doesn't also have an instruction decodeer :(

sbc100 updated this revision to Diff 414474.Mar 10 2022, 1:03 PM
  • feedback
sbc100 marked an inline comment as done.Mar 10 2022, 2:12 PM
dschuff added inline comments.Mar 10 2022, 5:32 PM
llvm/lib/Object/WasmObjectFile.cpp
200

I guess you can skip over the whole section, but not each individual init expr. Is there any use case for wanting to skip over or avoid parsing some subset of initexprs? Even our file decoders in LLVM still parsed the const exprs of MVP.
... I guess this file itself is an example of where we want to know the location/size of the initexpr but don't care about evaluating it.
I wonder if it would be worth adding an extra byte to every initexpr to encode the size. Seems like a lot when each expr is still just a few bytes.

211

This loop could at least be simplified if we just used readULEB128 instead of all these typed versions e.g. readVarint32 since we aren't using the value and maybe don't care if the encoded value overflows.

sbc100 added inline comments.Mar 10 2022, 6:11 PM
llvm/lib/Object/WasmObjectFile.cpp
200

I actually already opened a bug to discuss the possibility of adding a size so we could avoid writing and instruction decoder here: https://github.com/WebAssembly/design/issues/1443.

I'm guessing it would be tricky to add at this point.

sbc100 updated this revision to Diff 414541.Mar 10 2022, 6:13 PM
  • feedback
sbc100 marked an inline comment as done.Mar 10 2022, 6:14 PM
dschuff accepted this revision.Mar 14 2022, 8:38 AM

This code is fine as a reflection of the current state of the proposal though.

llvm/lib/Object/WasmObjectFile.cpp
200

It's certainly not too late to change the extended-const proposal. We can follow up there.

llvm/test/ObjectYAML/wasm/extended_const_expressions.yaml
20

Maybe it's worth checking an invalid constexpr (to test that we either do or don't care about validity in this particular code)?

This revision is now accepted and ready to land.Mar 14 2022, 8:38 AM
This revision was landed with ongoing or failed builds.Mar 14 2022, 8:56 AM
This revision was automatically updated to reflect the committed changes.

This patch triggers a bunch of uninitialized memory under msan errors when running llvm-project/lld/test/wasm:data-layout.s (and several others) as below:

exit status 2
==1915==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55e0ff9ca683 in (anonymous namespace)::WasmDumper::dump() /llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:374:13
    #1 0x55e0ff9c1b87 in wasm2yaml(llvm::raw_ostream&, llvm::object::WasmObjectFile const&) llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:411:50
    #2 0x55e0ff9be3c9 in dumpObject llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:40:29
    #3 0x55e0ff9be3c9 in dumpInput llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:67:12
    #4 0x55e0ff9be3c9 in main llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:89:19
    #5 0x7f7a5befc8d2 in __libc_start_main (/usr/grte/v5/lib64/libc.so.6+0x628d2) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)
    #6 0x55e0ff798429 in _start /usr/grte/v5/debug-src/src/csu/../sysdeps/x86_64/start.S:120

SUMMARY: MemorySanitizer: use-of-uninitialized-value llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:374:13 in (anonymous namespace)::WasmDumper::dump()
Exiting
FileCheck error: '<stdin>' is empty.
FileCheck command line:  llvm-project/llvm/FileCheck --allow-unused-prefixes llvm-project/lld/test/wasm/data-layout.s -check-prefix=CHECK-SHARED

This patch triggers a bunch of uninitialized memory under msan errors when running llvm-project/lld/test/wasm:data-layout.s (and several others) as below:

exit status 2
==1915==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55e0ff9ca683 in (anonymous namespace)::WasmDumper::dump() /llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:374:13
    #1 0x55e0ff9c1b87 in wasm2yaml(llvm::raw_ostream&, llvm::object::WasmObjectFile const&) llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:411:50
    #2 0x55e0ff9be3c9 in dumpObject llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:40:29
    #3 0x55e0ff9be3c9 in dumpInput llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:67:12
    #4 0x55e0ff9be3c9 in main llvm-project/llvm/tools/obj2yaml/obj2yaml.cpp:89:19
    #5 0x7f7a5befc8d2 in __libc_start_main (/usr/grte/v5/lib64/libc.so.6+0x628d2) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)
    #6 0x55e0ff798429 in _start /usr/grte/v5/debug-src/src/csu/../sysdeps/x86_64/start.S:120

SUMMARY: MemorySanitizer: use-of-uninitialized-value llvm-project/llvm/tools/obj2yaml/wasm2yaml.cpp:374:13 in (anonymous namespace)::WasmDumper::dump()
Exiting
FileCheck error: '<stdin>' is empty.
FileCheck command line:  llvm-project/llvm/FileCheck --allow-unused-prefixes llvm-project/lld/test/wasm/data-layout.s -check-prefix=CHECK-SHARED

Thanks, investigating.

Attempted fix for ASAN issue: https://github.com/llvm/llvm-project/commit/2481adb59cb63c90879f4537ad36339da3ad61a6

I was unable to repro the ASAN issue locally, so please let me know if this fix is effective. I've fairly confident it will me.