This error case is described in Linking.md. The operand for call requires generation of a synthetic stub.
Add test for this case.
Differential D44028
[WebAssembly] Handle weak undefined functions with a synthetic stub ncw on Mar 2 2018, 9:58 AM. Authored by
Details This error case is described in Linking.md. The operand for call requires generation of a synthetic stub. Add test for this case.
Diff Detail
Event Timeline
Comment Actions You said "I'm OK with this for now" in the last comment - is the PR as a whole OK to merge then? Comment Actions How about this for a longer term solution: When a relocation like this is found the linker can create a linker synthetic function which will abort.. a simple one might be "call_indirect 0" which we know. For globals I think we can ban weak references, since there is no way to checking the address of a global before calling get_global in it. I'm happy with the code change for now, but can you write the test as minimal piece of bitcode rather than yaml. (Is there some reason you didn't do that already?) Comment Actions I thought about that too... the synthetic function would just be (func (type) unreachable) of course! It's messy as you need a separate one for each type signature. It adds quite a bit of "weight" and machinery to LLD. It basically comes down to whether you think this should be "fixed" in LLD or LLC; should we make a dummy unreachable function in LLD or bodge LLC to emit call_indirect for weak-undefined calls. This patch is currently operating on the basis that the fix will be in LLC... but actually maybe you could convince me otherwise. We shouldn't really use call_indirect for things that aren't truly indirect, so even though it's a nuisance, maybe the linker-synthetic functions are right after all.
Hm, on reflection, I agree. We should disallow undefined globals, even if weak.
I'm not sure I can, for the global case (there's no way to emit globals other than __stack_pointer currently). For the "call" case, I can change over to bitcode, OK. Comment Actions OK, here's the new approach! It's basically the complete opposite of the previous approach. Weak undefined globals are disallowed entirely, and weak undefined functions are allowed and supported via some new synthetic stubs. The tests have good coverage: they check that the synthetic stubs are eligible for GC, and check that the stub is emitted when needed. Comment Actions Thanks. Looks like a good approach. Good that we already had support for synthetic functions. Lets see if there is any feedback on https://github.com/WebAssembly/tool-conventions/issues/46 before we land this.
Comment Actions I have a few questions because I don't know much about this wasm feature...
Comment Actions The address of a function is its table index in the indirection function call table. For weak undefined functions (or null functions) this will be zero. However the following call is *direct* call, not a call_indirect. And direct calls are more efficient we do want continue to use direct calls wherever we can.
Yes, its for catching that specific case. In this case the linker needs insert some valid function index into the direct call instruction. It it doesn't insert a valid function index with the correct signature, the module will fail to validate (i.e. will fail to load). We could generate a better synthetic function with a nice error message. However, that would add complexity to the code generation, and I think we are already on shakey ground adding any kind of code generation to the linker. Also, simply crashing is what existing platforms do, so I think that can/should be expected. At least, I don't think we should block this change on making an nice error message, unless you want to push the code generation part into the compiler instead?
Comment Actions Good questions. So the call pattern we want to support is fairly common in libraries like libc - you have an optional function that's called only if the linker pulls it in. For C code looking like this: int foo() __attribute__((weak)); void callFooOrSkip() { if (&foo) foo(); } On x86, you get code like this: 00000000 <callFooOrSkip>: 0: b8 00 00 00 00 mov $0x0,%eax 1: R_386_32 foo 5: 85 c0 test %eax,%eax 7: 74 05 je e <callFooOrSkip+0xe> 9: e9 fc ff ff ff jmp a <callFooOrSkip+0xa> a: R_386_PC32 foo e: c3 ret There's a relocation for the function pointer, and a second one for the operand of the call. At link time, the linker can just put null in there for both values, or garbage - if the CPU ever were to take the "if" branch, or if the call to foo() were not guarded with an if, it wouldn't matter if process were to crash with SIGSEGV. The jmp doesn't need to legal - only if it's ever actually taken. On Wasm, you get code like this: block i32.const @R_WEBASSEMBLY_TABLE_INDEX(foo) ; push function index to stack i32.eqz ; pop value from stack, push bool to stack whether it's equal br_if ; pop value from stack, break out of block if it's true call @R_WEBASSEMBLY_FUNC_INDEX(foo) ; call and push rv to stack drop ; pop ignored rv from stack end Again there are two relocations. But here's the difference - you can't put null/zero as the operand to the call instruction. The Wasm validator is precisely designed to make it impossible to write code that fails with SIGSEGV! Regardless of whether the branch is ever taken, there has to be _something_ to go there. Hence the creation of the linker-synthetic functions. We don't expect the stubs to be called, they just have to exist to make the Wasm legal (and abort sensibly if they are somehow called). Comment Actions
I see. Thank you for the explanation!
It shouldn't block this change indeed. I was just trying to understand this patch. I'm not too worried that we generate wasm instructions in the linker. Unlike other file formats, wasm supports and will support only one machine instruction -- that's the wasm instructions. So, if something is easy or natural to implement in a linker instead of a compiler, I think we shouldn't hesitate to do that.
Comment Actions I hope this is OK, but when I committed I made one additional change, that I think makes sense. Rather than being stingy and emitting one stub per type, I'm now emitting one per function - the stubs are very small, and the advantage is that you'll now get a nice stack trace if these things ever should be called (since they can be named sensibly, to indicate which function was missing).
|
Is it OK that all these will have the same name?
Perhaps its worth adding a test for two function with the same sig? And two functions with different sigs? (Either in this test file or separate).