This is an archive of the discontinued LLVM Phabricator instance.

[wasm-ld] Allow importing/exporting the output module's memory with arbitrary names
Needs ReviewPublic

Authored by Liamolucko on Aug 7 2022, 11:03 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Adds an --export-memory option to wasm-ld which allows passing a name to give to the exported memory, and extends --import-memory to allow passing a <module>,<name> pair specifying where the memory should be imported from.

This is to support the newly-defined 'Canonical ABI' for modules (https://github.com/WebAssembly/component-model/pull/71), which requires that the memory be exported as "cabi_memory".

Diff Detail

Event Timeline

Liamolucko created this revision.Aug 7 2022, 11:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2022, 11:03 PM
Liamolucko requested review of this revision.Aug 7 2022, 11:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2022, 11:03 PM
sbc100 added a comment.Aug 8 2022, 9:24 AM

I think we maybe want to think this over a little more. Clearly there is a need to be able to control the import/export of memory 0 but I'd like to make sure we do it in a forward thinking way.

Some considerations.

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?
  2. When we went to multi-table we made table 0 into a first class symbol.. then it could be reasonable like any other symbol. Perhaps we should do this for memory 0 too? See https://reviews.llvm.org/D96001
  3. It would be to be able to control the default "env" name from which all default import come. This could remove the needs for the comma separated pair parsing.

If we made __linear_memory into first class symbol like __indirection_function_table then perhaps we could allow the toolchain to control its import/export name directly using the import_name/export_name attributes in assembly and avoid the need for any new command line flag and re-using the existing machinery?

sbc100 added inline comments.Aug 8 2022, 9:29 AM
lld/wasm/Driver.cpp
379

Why allow the same memory to be exported multiple times under different names?

I guess it could be useful but I'd like to see it done in a more standard way that could be re-used for all symbols types (and looking forward to multiple memories).

We have support for symbols alias in the assembly format and object format so I would hope would achieve multiple names that way if possible.

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?

Yeah, the linker will also have to merge together cabi_start definitions, so using a unified flag makes more sense. I'll make a new patch to do it that way. Does Phabricator have a way to close this one?

If we made __linear_memory into first class symbol like __indirection_function_table then perhaps we could allow the toolchain to control its import/export name directly using the import_name/export_name attributes in assembly and avoid the need for any new command line flag and re-using the existing machinery?

Yeah, that seems like the proper way to allow renaming. That would require adding a new SYMTAB_MEMORY to https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md#linking-metadata-section, right?

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?

Yeah, the linker will also have to merge together cabi_start definitions, so using a unified flag makes more sense. I'll make a new patch to do it that way. Does Phabricator have a way to close this one?

If we made __linear_memory into first class symbol like __indirection_function_table then perhaps we could allow the toolchain to control its import/export name directly using the import_name/export_name attributes in assembly and avoid the need for any new command line flag and re-using the existing machinery?

Yeah, that seems like the proper way to allow renaming. That would require adding a new SYMTAB_MEMORY to https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md#linking-metadata-section, right?

Yes, I think so, plus a bunch of work in MC and wasm-ld layers to support multiple memories. This work was already done for the multi-table use case, so we should be able to follow that pattern.

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?

Yeah, the linker will also have to merge together cabi_start definitions, so using a unified flag makes more sense. I'll make a new patch to do it that way. Does Phabricator have a way to close this one?

That does cabi_start represent? If its the equivalent of main we probably don't want the linker to merge them but instead reject programs that contain more than one (just like any other symbol).

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?

Yeah, the linker will also have to merge together cabi_start definitions, so using a unified flag makes more sense. I'll make a new patch to do it that way. Does Phabricator have a way to close this one?

That does cabi_start represent? If its the equivalent of main we probably don't want the linker to merge them but instead reject programs that contain more than one (just like any other symbol).

It's the component-model equivalent of WebAssembly start functions, which can be used for main but are mainly meant for initialization. Fully-fledged components are explicitly allowed to have multiple, so I think it makes sense to merge them.

In curious why memorytExports is a set of strings rather than an optional string. Is there a need for exporting the memory with multiple names?

I can envision a possible need, which is to build modules that export both "memory" and "cabi_memory". I'm open to supporting something like that if there's an acute need for it, but I'm also hesitant, because

  1. Are there going to be other cabi-relatated things, that warrent just putting this behind a single "cabi" flag rather than requiring N different flags for this?

Yeah, the linker will also have to merge together cabi_start definitions, so using a unified flag makes more sense. I'll make a new patch to do it that way. Does Phabricator have a way to close this one?

That does cabi_start represent? If its the equivalent of main we probably don't want the linker to merge them but instead reject programs that contain more than one (just like any other symbol).

It's the component-model equivalent of WebAssembly start functions, which can be used for main but are mainly meant for initialization. Fully-fledged components are explicitly allowed to have multiple, so I think it makes sense to merge them.

I think the canonical ABI will need support for multiple cabi_start functions, so that we don't have to merge them. I filled https://github.com/WebAssembly/component-model/issues/88 to track this.

In curious why memorytExports is a set of strings rather than an optional string. Is there a need for exporting the memory with multiple names?

Not really; I just thought that I would allow for arbitrary combinations while I was at it.

I think the canonical ABI will need support for multiple cabi_start functions, so that we don't have to merge them. I filled https://github.com/WebAssembly/component-model/issues/88 to track this.

Now that it seems like that's not going to be supported natively in the Canonical ABI, should the linker merge together cabi_start functions? They shouldn't be able to depend on one another, but it seems like it could be useful to define cabi_start functions across different files that work with independent values.

In curious why memorytExports is a set of strings rather than an optional string. Is there a need for exporting the memory with multiple names?

Not really; I just thought that I would allow for arbitrary combinations while I was at it.

I think the canonical ABI will need support for multiple cabi_start functions, so that we don't have to merge them. I filled https://github.com/WebAssembly/component-model/issues/88 to track this.

Now that it seems like that's not going to be supported natively in the Canonical ABI, should the linker merge together cabi_start functions? They shouldn't be able to depend on one another, but it seems like it could be useful to define cabi_start functions across different files that work with independent values.

I think its probably simpler to only allow a single start cabi_start definition.. just like any other symbol. It sounds like it should be a linker error to have more than one of them.

I think its probably simpler to only allow a single start cabi_start definition.. just like any other symbol. It sounds like it should be a linker error to have more than one of them.

This sounds right to me. We already have the .init_array mechanism for registering constructors at link time, and it sounds like we can continue to use that with cabi_start.

sbc100 added a comment.EditedOct 27 2022, 4:31 PM

Should this be closed in favor of https://reviews.llvm.org/D135898?