Page MenuHomePhabricator

[WebAssembly] Change size_t to `unsigned long`
ClosedPublic

Authored by sunfish on Nov 27 2017, 3:09 PM.

Details

Summary

Currently wasm's size_t is unsigned int for compatibility with asm.js and Emscripten's library code; see the patch for details.

Changing it to unsigned long (which is also 32-bit on wasm32) would make it the same type as wasm64 (where unsigned long is 64-bit), which would eliminate the most common cause for mangled names being different between wasm32 and wasm64. For example, export lists containing symbol names could often be the same between wasm32 and wasm64.

Even though the ABI won't be officially stable for a while, increasing numbers of tools are looking to interoperate with LLVM's output, and core details like the size_t type may become inconvenient to change in the future.

This patch aims to remain compatible with current Emscripten integration by making the change only in the wasm32-unknown-unknown-wasm triple, so it won't affect the s2wasm path, which uses the wasm32-unknown-unknown-elf triple. It would be something that would need to be handled if Emscripten wants to start using the wasm32-unknown-unknown-wasm triple through.

Diff Detail

Repository
rC Clang

Event Timeline

sunfish created this revision.Nov 27 2017, 3:09 PM
sunfish updated this revision to Diff 126043.Dec 7 2017, 2:17 PM

This updates the patch to make size_t be unsigned long unconditionally, corresponding with the proposed patch to Emscripten here which changes it for asm.js as well.

sunfish updated this revision to Diff 131141.Jan 23 2018, 1:41 PM

Rebase for upstream changes.

dschuff accepted this revision.Jan 23 2018, 2:45 PM

LGTM FTR, as long as we coordinate when this lands with emscripten.

This revision is now accepted and ready to land.Jan 23 2018, 2:45 PM

Is there anything stopping this from landing in svn?

We want to coordinate landing this with corresponding changes in Emscripten. Since this is an ABI change, Emscripten needs to figure out how to manage it. See here for ongoing discussion.

I've now refreshed the emscripten patches and am hoping they can land soon.

This revision was automatically updated to reflect the committed changes.
sbc100 added a comment.Aug 8 2018, 9:39 AM

I'm running into an issue with an internal codebase that assumes that size_t is equivalent to either int32_t or int64_t which this change invalidates.

After this change we have:
size_t -> long
int32_t -> int
int64_t -> ling long int.

e.g.:

#include <inttypes.h>
#include <stdlib.h>

int func(int32_t* f) {
  return 1;
}

int func(int64_t* f) {
  return 2;
}

int main() {
  size_t a = 0 ;
  return func(&a);
}

This generates the following error after this change:

./bin/clang --target=wasm32 -c ./test.cpp 
./test.cpp:14:10: error: no matching function for call to 'func'
  return func(&a);
         ^~~~
./test.cpp:4:5: note: candidate function not viable: no known conversion from 'size_t *' (aka 'unsigned long *') to 'int32_t *'
      (aka 'int *') for 1st argument
int func(int32_t* f) {
    ^
./test.cpp:8:5: note: candidate function not viable: no known conversion from 'size_t *' (aka 'unsigned long *') to 'int64_t *'
      (aka 'long long *') for 1st argument
int func(int64_t* f) {
    ^
1 error generated.
sbc100 added a comment.Aug 8 2018, 9:41 AM

Should we change int32_t as well? Or does the above code just need fixing? I'm a little concerned because this code is very portable which implies that WebAssembly might be doing something unusual here.

Is this related to the issue reported in the thread here?

Is this related to the issue reported in the thread here?

Ah yes, I'll follow up there.

This has also shown up in a game engine middleware codebase, so it may be a
broader issue - people seem to assume size_t is one of the int*_t types.