This is an archive of the discontinued LLVM Phabricator instance.

Fix Wasm symbol name collisions
AbandonedPublic

Authored by ncw on Dec 1 2017, 4:57 AM.

Details

Reviewers
sbc100
Summary

If you link in two object files with duplicate local symbol names, the link should succeed - but the "names" section is populated with duplicate names. According to binaryen, this is invalid and wasm-dis will fail.

When linking ELF files, the symbol names are "budged" to avoid the collision in the final symbol table.

I've added budging and a test for the Wasm name table.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

ncw created this revision.Dec 1 2017, 4:57 AM
ncw added a comment.Dec 1 2017, 5:11 AM

PS. I'm adding Sam as reviewer on all these Phabricator reviews, because he's listed in the CODE_OWNERS.TXT document. I'm happy to add other people too though!

ruiu added a subscriber: ruiu.Dec 1 2017, 1:17 PM

When linking a large program, you have millions of symbols, and inserting all of them into a set or a hash table is too time-consuming. One of the most important design decision when I made lld is to do only one hash table lookup for each symbol read from a file. We have the main symbol table, so we have no room for other hash table or a set.

This patch inserts all symbols to other set, which violates the design decision. You need to come up with a different design to keep lld fast.

I wonder why wasm doesn't allow symbols with the same name. Non-external symbols are usually referenced not by name but by their indices, no? Symbol names for local symbols are mostly for debugging. No programs should refer them by name, at least in ELF.

For comparison, wabt is ok with duplicate names here (see here), so this may be a bug in binaryen.

sbc100 edited edge metadata.Dec 1 2017, 5:40 PM

Duplicate names is fine I think. Which means this is a binaryen bug I think.

sbc100 added a comment.Dec 1 2017, 5:45 PM

Also, in my experiments with gcc local symbols names can be duplicates of each other in the ELF world.

I linked two object with a static foo function and I get:

$ nm testelf  | grep foo
00000000004004ed t foo
000000000040050e t foo
ncw abandoned this revision.Dec 2 2017, 12:56 AM

That was silly. I also did a test with standard x86 ELF - and I thought I'd observed name mangling. But now I've just repeated the test and it doesn't happen.

Sorry! Closing this, I only submitted it under the mistaken impression that it was matching ELF behaviour.