This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Deduplicate imports of the same module name, field name, and type
ClosedPublic

Authored by fitzgen on Jul 6 2021, 5:54 PM.

Diff Detail

Event Timeline

fitzgen created this revision.Jul 6 2021, 5:54 PM
fitzgen requested review of this revision.Jul 6 2021, 5:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2021, 5:54 PM

Nice! I think I'm sold on the idea that this a good think. I'm curious what @dschuff and @sunfish think? Is wasm-ld the right place to be doing this?

lld/test/wasm/duplicate-function-imports.ll
1 ↗(On Diff #356845)

Why .ll and not .s here?

lld/test/wasm/duplicate-global-imports.s
31

Maybe we should do a global.get on each of these globals so we can see that the de-duplication is working in the relocations.

Sadly obj2yaml doesn't do disassembly so the test will be a little opaque, at least until https://reviews.llvm.org/D105539 lands

lld/test/wasm/duplicate-table-imports.s
41

Lets do a .get on each of the tables?

lld/wasm/SyntheticSections.h
198

Maybe just importedGlobals, importedFunctions and importedTables?

sbc100 added inline comments.Jul 7 2021, 10:58 AM
lld/test/wasm/duplicate-global-imports.s
3

I think we can remove --no-gc-sections (if you use each on in a global.get) and --allow-undefined.

lld/test/wasm/duplicate-table-imports.s
3

Can you remove --allow-undefined ? And maybe you can --export-all rather then exporting each one ?

fitzgen added inline comments.Jul 7 2021, 1:40 PM
lld/test/wasm/duplicate-function-imports.ll
1 ↗(On Diff #356845)

I used .ll because it was the first test I wrote and it was derived from the C snippet. I'm happy to change it to .s if you prefer.

lld/test/wasm/duplicate-global-imports.s
31

Regarding this and removing --no-gc-sections: is it generally preferred to create uses of things rather than to pass --no-gc-sections in tests?

Happy to update this either way, just curious.

lld/test/wasm/duplicate-table-imports.s
3

Will do.

41

There is a table.get for each table:

  • read_funcref_1 does it for t1
  • read_funcref_2 does it for t2
  • read_externref does it for t3
lld/wasm/SyntheticSections.h
198

Sure thing, will change to these names.

fitzgen marked 5 inline comments as done.Jul 7 2021, 2:45 PM
fitzgen added inline comments.
lld/test/wasm/duplicate-table-imports.s
3

I switched to --export-all rather than explicitly enumerating each export, but removing --allow-undefined makes it error out because the t{1,2,3} symbols are undefined, and I don't know how to make a defined, imported table in a .s.

sbc100 added inline comments.Jul 7 2021, 2:47 PM
lld/test/wasm/duplicate-function-imports.ll
1 ↗(On Diff #356845)

Yes please, I've been trying to covert all the existing .ll tests .s for a while so would rather not any any new .ll if possible.

lld/test/wasm/duplicate-global-imports.s
31

My general philosophy is to try to minimize the number of flags passed in a given test. This tends to make to more obvious what is being tested (less noise) and makes makes it easier to find tests for a given open by simply grepping for it. There could also be an argument that we want to do most of our testing on the default path since that is the most well trodden one... maybe?

Of course its a balance and we don't want to be jumping through too many hoops to avoid extra flags.

fitzgen updated this revision to Diff 357083.Jul 7 2021, 2:48 PM

Updated based on review feedback:

  • Removed --allow-undefined and --no-gc-sections from duplicate-global-imports.s
  • Added global.gets for each global in duplicate-global-imports.s
  • Switched to --export-all instead of explicitly listing each export in duplicate-table-imports.s
  • Renamed alreadyImportedStuff to importedStuff
fitzgen updated this revision to Diff 357088.Jul 7 2021, 3:00 PM
fitzgen marked an inline comment as done.

Rewrote duplicate-function-imports.ll into duplicate-function-imports.s.

fitzgen marked an inline comment as done.Jul 7 2021, 3:02 PM

Given that we're doing things like GC in the linker, and that the linker is already responsible for creating the imports in the first place (and already has several different options affecting that) I think this makes sense for the linker too.

fitzgen updated this revision to Diff 357249.Jul 8 2021, 9:18 AM

Ran clang-format.

sbc100 added inline comments.Jul 12 2021, 1:48 PM
lld/wasm/SyntheticSections.cpp
113

Do these need to be optional? Doesn't getImportModuleAndName unconditionally set them?

fitzgen updated this revision to Diff 358759.Jul 14 2021, 2:43 PM

Make it so getImportModuleAndName does not take optional out parameters.

fitzgen marked an inline comment as done.Jul 14 2021, 2:43 PM
sbc100 accepted this revision.Jul 14 2021, 5:10 PM
sbc100 added inline comments.
lld/wasm/SyntheticSections.cpp
113

The fact that we need template here make me thing maybe we should just move importModule/importName into the symbol base class and we could then just do isUndefined() here... but that is a bigger change.

lld/wasm/SyntheticSections.h
156

What about putting ImportKey and this DenseMapInfo in that anonymous namespace since they should not be needed outside this file? Then we could avoid the prefix in lld::wasm::ImportKey maybe?

This revision is now accepted and ready to land.Jul 14 2021, 5:10 PM
sbc100 added inline comments.Jul 14 2021, 5:20 PM
lld/wasm/SyntheticSections.cpp
113

I created https://reviews.llvm.org/D106026.. I don't mind which one lands first.

fitzgen added inline comments.Jul 15 2021, 1:46 PM
lld/wasm/SyntheticSections.h
156

When I do this, I get the following warning a few times:

/home/nick/llvm-project/lld/wasm/SyntheticSections.h:165:7: warning: ‘lld::wasm::ImportSection’ has a field ‘lld::wasm::ImportSection::importedGlobals’ whose type uses the anonymous namespace [-Wsubobject-linkage]
  165 | class ImportSection : public SyntheticSection {
      |       ^~~~~~~~~~~~~

Do you still prefer this approach?

Do you have any idea what is up with https://reviews.llvm.org/harbormaster/unit/view/870129/ ? Locally the test passes just fine, and the types certainly look correct to me. It almost looks like whatever version of llvm-mc running in CI doesn't understand ref types properly? Pretty confused here.

sbc100 added inline comments.Jul 16 2021, 10:25 AM
lld/wasm/SyntheticSections.h
156

Oh sorry, I didn't realize this was in a header. Ignore my previous comment.

Do you have any idea what is up with https://reviews.llvm.org/harbormaster/unit/view/870129/ ? Locally the test passes just fine, and the types certainly look correct to me. It almost looks like whatever version of llvm-mc running in CI doesn't understand ref types properly? Pretty confused here.

@aardappel just added the type checker vary recently (https://reviews.llvm.org/D104945). So maybe you need to update your local branch?

fitzgen marked 2 inline comments as done.Jul 16 2021, 10:28 AM

Do you have any idea what is up with https://reviews.llvm.org/harbormaster/unit/view/870129/ ? Locally the test passes just fine, and the types certainly look correct to me. It almost looks like whatever version of llvm-mc running in CI doesn't understand ref types properly? Pretty confused here.

@aardappel just added the type checker vary recently (https://reviews.llvm.org/D104945). So maybe you need to update your local branch?

Ah yes, I think this is it. That patch mentions that .globaltype etc must come *before* the entity now, and I believe that is what is going wrong in this test case. Will update the patch in one sec.

What do you think about landing https://reviews.llvm.org/D106026 first? (would avoid the use of template <typename UndefinedT> in this patch).

What do you think about landing https://reviews.llvm.org/D106026 first? (would avoid the use of template <typename UndefinedT> in this patch).

I don't really mind either way.

Obviously it is one tiny rebase less work for me if this lands first, but that's not much work and if that patch is green and ready to land, then I say go for it.

By the way, I am not LLVM committer, so I'll need your help actually landing this patch when the time comes.

fitzgen updated this revision to Diff 359399.Jul 16 2021, 11:30 AM

Fixed failing tests.

sbc100 accepted this revision.Jul 16 2021, 12:41 PM

I'm OK landing this first. Just say when you ready. Still lgtm.

I'm OK landing this first. Just say when you ready. Still lgtm.

Thanks. This *should* be ready, except that there seems to be a few failures in CI: https://reviews.llvm.org/harbormaster/unit/114556/

AFAICT, they are all unrelated. Not sure whether these are known-flaky tests or how to retry them or what. Any ideas?

I'm OK landing this first. Just say when you ready. Still lgtm.

Thanks. This *should* be ready, except that there seems to be a few failures in CI: https://reviews.llvm.org/harbormaster/unit/114556/

AFAICT, they are all unrelated. Not sure whether these are known-flaky tests or how to retry them or what. Any ideas?

They look unrelated. Can you try re-basing and re-uploading?

This revision was landed with ongoing or failed builds.Jul 19 2021, 1:59 PM
This revision was automatically updated to reflect the committed changes.
MaskRay reopened this revision.EditedJul 19 2021, 5:10 PM
MaskRay added a subscriber: MaskRay.

Reverted by 16aac493e59519377071e900d119ba2e7e5b525d due to an use-of-uninitialized-value bug.

lld/test/wasm/duplicate-table-imports.s
44

Newer tests in other lld ports use ## for non-CHECK-non-RUN comments.

lld/wasm/SyntheticSections.h
99

The coding standard doesn't use /*

This revision is now accepted and ready to land.Jul 19 2021, 5:10 PM
MaskRay requested changes to this revision.Jul 19 2021, 5:10 PM
This revision now requires changes to proceed.Jul 19 2021, 5:10 PM

Reverted by 16aac493e59519377071e900d119ba2e7e5b525d due to an use-of-uninitialized-value bug.

Do you have a link to a CI run with sanitizers enabled, or something like that, that shows which value is uninitialized? Thanks!

Reverted by 16aac493e59519377071e900d119ba2e7e5b525d due to an use-of-uninitialized-value bug.

Do you have a link to a CI run with sanitizers enabled, or something like that, that shows which value is uninitialized? Thanks!

https://lab.llvm.org/buildbot/#/builders/74/builds/6017/steps/10/logs/stdio

Is there documentation anywhere on how to build wasm-ld with memory sanitizer enabled?

I've tried this:

$ cmake -DLLVM_USE_SANITIZER=MemoryWithOrigins -DLLVM_ENABLE_PROJECTS=lld path/to/llvm
$ make -j8

but then I get memory sanitizer errors while running make like this:

[  5%] Building Attributes.inc...
Uninitialized bytes in __interceptor_memchr at offset 6 inside [0x7020000006c0, 7)
==1094381==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x52842d in llvm::StringRef::find(char, unsigned long) const /home/nick/llvm-project/llvm/include/llvm/ADT/StringRef.h:319:29
    #1 0x11b0ccf in llvm::StringRef::split(llvm::SmallVectorImpl<llvm::StringRef>&, char, int, bool) const /home/nick/llvm-project/llvm/lib/Support/StringRef.cpp:341:20
    #2 0x11da056 in llvm::Triple::Triple(llvm::Twine const&) /home/nick/llvm-project/llvm/lib/Support/Triple.cpp:770:19
    #3 0x12f4e33 in llvm::sys::getProcessTriple[abi:cxx11]() /home/nick/llvm-project/llvm/lib/Support/Host.cpp:1727:10
    #4 0x10b1bd5 in (anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, bool) /home/nick/llvm-project/llvm/lib/Support/CommandLine.cpp:1345:17
    #5 0x10b1570 in llvm::cl::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, char const*, bool) /home/nick/llvm-project/llvm/lib/Support/CommandLine.cpp:1320:24
    #6 0xff18e5 in main /home/nick/llvm-project/llvm/utils/TableGen/TableGen.cpp:283:3
    #7 0x7f773ee7acb1 in __libc_start_main csu/../csu/libc-start.c:314:16
    #8 0x428a1d in _start (/home/nick/llvm-project/obj/bin/llvm-tblgen+0x428a1d)

  Uninitialized value was stored to memory at
    #0 0x42e7c9 in __msan_memcpy (/home/nick/llvm-project/obj/bin/llvm-tblgen+0x42e7c9)
    #1 0x527135 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/basic_string.tcc:225:6
    #2 0x11f977f in llvm::Twine::str[abi:cxx11]() const /home/nick/llvm-project/llvm/lib/Support/Twine.cpp:20:12
    #3 0x11d9f3d in llvm::Triple::Triple(llvm::Twine const&) /home/nick/llvm-project/llvm/lib/Support/Triple.cpp:765:16
    #4 0x12f4e33 in llvm::sys::getProcessTriple[abi:cxx11]() /home/nick/llvm-project/llvm/lib/Support/Host.cpp:1727:10
    #5 0x10b1bd5 in (anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, bool) /home/nick/llvm-project/llvm/lib/Support/CommandLine.cpp:1345:17
    #6 0x10b1570 in llvm::cl::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*, char const*, bool) /home/nick/llvm-project/llvm/lib/Support/CommandLine.cpp:1320:24
    #7 0xff18e5 in main /home/nick/llvm-project/llvm/utils/TableGen/TableGen.cpp:283:3
    #8 0x7f773ee7acb1 in __libc_start_main csu/../csu/libc-start.c:314:16

  Uninitialized value was created by a heap allocation
    #0 0x4a30a9 in operator new(unsigned long) (/home/nick/llvm-project/obj/bin/llvm-tblgen+0x4a30a9)
    #1 0x7f773f19c87f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) (/lib/x86_64-linux-gnu/libstdc++.so.6+0x14387f)
    #2 0xffffffffffffef9f  (<unknown module>)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/nick/llvm-project/llvm/include/llvm/ADT/StringRef.h:319:29 in llvm::StringRef::find(char, unsigned long) const
Exiting
make[2]: *** [include/llvm/IR/CMakeFiles/intrinsics_gen.dir/build.make:136: include/llvm/IR/Attributes.inc] Error 77
make[2]: Leaving directory '/home/nick/llvm-project/obj'
make[1]: *** [CMakeFiles/Makefile2:11564: include/llvm/IR/CMakeFiles/intrinsics_gen.dir/all] Error 2
make[1]: Leaving directory '/home/nick/llvm-project/obj'
make: *** [Makefile:152: all] Error 2

Is there documentation anywhere on how to build wasm-ld with memory sanitizer enabled?

https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

msan isn't usable without an instrumented C++ standard library.
I personally use simplified https://github.com/MaskRay/Config/wiki/LLVM out/msan

Reverted by 16aac493e59519377071e900d119ba2e7e5b525d due to an use-of-uninitialized-value bug.

Do you have a link to a CI run with sanitizers enabled, or something like that, that shows which value is uninitialized? Thanks!

If setting up a local asan build is painful (I've never don't it myself), and you think you know the source of the error you could also just re-land with the fix.

I'm pretty sure the issue was that I was comparing/hashing maximums in WasmLimits's operator== and hash traits unconditionally, rather than only if Flags & WASM_LIMITS_FLAG_HAS_MAX.

fitzgen updated this revision to Diff 360936.Jul 22 2021, 12:52 PM

Updated to only hash/compare WasmLimits's Maximum when Flags & WASM_LIMITS_FLAGS_HAS_MAX. This should fix the use of uninitialized data that led to reverting the original commit.

@sbc100, would you mind relanding this for me? Thank you!

sbc100 added inline comments.Jul 22 2021, 1:29 PM
lld/wasm/Writer.cpp
568 ↗(On Diff #360936)

Is this chunk intentional?

fitzgen added inline comments.Jul 22 2021, 1:37 PM
lld/wasm/Writer.cpp
568 ↗(On Diff #360936)

Whoops, I think this was a bad rebase. Apologies!

Updated patch incoming.

fitzgen updated this revision to Diff 360954.Jul 22 2021, 1:39 PM
fitzgen marked an inline comment as done.

Removed chunk that shouldn't have been in this patch.

This revision was not accepted when it landed; it landed in state Needs Review.Jul 22 2021, 2:17 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
TaoPan added a subscriber: TaoPan.Aug 2 2021, 3:33 AM
TaoPan added inline comments.
lld/wasm/SyntheticSections.h
134

Please be note this line introduced below build error:
FAILED: tools/lld/wasm/CMakeFiles/lldWasm.dir/MapFile.cpp.o
/usr/bin/c++ -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -DSTDC_CONSTANT_MACROS -DSTDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/lld/wasm -I/media/sdb/tao/llvm/llvm-project-debug/lld/wasm -I/media/sdb/tao/llvm/llvm-project-debug/lld/include -Itools/lld/include -Iinclude -I/media/sdb/tao/llvm/llvm-project-debug/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -fno-exceptions -fno-rtti -std=c++14 -MD -MT tools/lld/wasm/CMakeFiles/lldWasm.dir/MapFile.cpp.o -MF tools/lld/wasm/CMakeFiles/lldWasm.dir/MapFile.cpp.o.d -o tools/lld/wasm/CMakeFiles/lldWasm.dir/MapFile.cpp.o -c /media/sdb/tao/llvm/llvm-project-debug/lld/wasm/MapFile.cpp
In file included from /media/sdb/tao/llvm/llvm-project-debug/lld/wasm/MapFile.cpp:28:0:
/media/sdb/tao/llvm/llvm-project-debug/lld/wasm/SyntheticSections.h:134:36: error: specialization of ‘template<class T> struct llvm::DenseMapInfo’ in different namespace [-fpermissive]
template <typename T> struct llvm::DenseMapInfo<lld::wasm::ImportKey<T>> {

^

In file included from /media/sdb/tao/llvm/llvm-project-debug/lld/wasm/MapFile.h:12:0,

from /media/sdb/tao/llvm/llvm-project-debug/lld/wasm/MapFile.cpp:21:

/media/sdb/tao/llvm/llvm-project-debug/llvm/include/llvm/ADT/ArrayRef.h:29:31: error: from definition of ‘template<class T> struct llvm::DenseMapInfo’ [-fpermissive]

template<typename T> struct DenseMapInfo;
sbc100 added inline comments.Aug 4 2021, 11:15 AM
lld/wasm/SyntheticSections.h
134

What compiler are you using that gives this error? I'm guessing gcc or something that is not clang? (I'm surprised we didn't get any reports of bots failing? Any idea why your config might not be covered by any of the llvn bots?)

@fitzgen can you take a look?

sbc100 added inline comments.Aug 4 2021, 11:18 AM
lld/wasm/SyntheticSections.h
134

Looks like there is already a fix in flight: https://reviews.llvm.org/D107422

TaoPan added inline comments.Aug 4 2021, 6:33 PM
lld/wasm/SyntheticSections.h
134

Yes, it's gcc dependent error, I used gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12). My config command is

cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;compiler-rt;lld" ../llvm

Looks not very special. I know this error won't be triggered with "-D CMAKE_C_COMPILER=clang -D CMAKE_CXX_COMPILER=clang++".

sbc100 added inline comments.Aug 5 2021, 12:04 PM
lld/wasm/SyntheticSections.h
134

Do we not have any bots that build with a recent enough gcc version to report this failure? Seems odd.

TaoPan added inline comments.Aug 6 2021, 12:42 AM
lld/wasm/SyntheticSections.h
134

I checked recent enough gcc version (11.2.0), this error can't be reproduced, it's gcc old version dependent error.