This patch adds support for multiple tables to WebAssembly, and updates the CFI implementation to place each disjoint call set into a unique WebAssembly indirect call table with homogeneous element type.
Unfortunately, the current approach is fairly hacky, and is unlikely to be ready to merge before I leave. One problem that I've encountered in general with CFI has been with pointer casts:
In C, pointers are often cast to implement features such as generic callbacks and object polymorphism. The function definition and the call site then carry different type metadata, so the type test fails to match them up. For example, the standard library qsort() declares its comparison callback with type int (*compare)(const void*, const void*). But an implementation of such a comparison function will typically have a more specific type, e.g. int compare_parse(Linkage_info* a, Linkage_info* b).
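A minimal sketch of the mismatch, for illustration only. The Linkage_info layout, compare_parse_generic() and sort_linkages() are hypothetical names invented here; only qsort() and the two function types come from the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Linkage_info type mentioned above. */
typedef struct { int cost; } Linkage_info;

/* Defined with a specific type: int (*)(Linkage_info *, Linkage_info *).
   Handing this to qsort() requires a cast, and qsort() then calls it
   through int (*)(const void *, const void *). The type at the definition
   and the type at the call site differ, so a type test on the call fails.
   (Calling through an incompatible function type is also undefined
   behavior in ISO C.) */
int compare_parse(Linkage_info *a, Linkage_info *b) {
    return (a->cost > b->cost) - (a->cost < b->cost);
}

/* Defined with exactly the type qsort() calls through; the pointer
   conversion happens inside the function instead of at the call. */
int compare_parse_generic(const void *a, const void *b) {
    const Linkage_info *x = a;
    const Linkage_info *y = b;
    return (x->cost > y->cost) - (x->cost < y->cost);
}

void sort_linkages(Linkage_info *v, size_t n) {
    /* The mismatched form would be:
       qsort(v, n, sizeof *v,
             (int (*)(const void *, const void *))compare_parse); */
    qsort(v, n, sizeof *v, compare_parse_generic);
}
```

Rewriting comparators this way sidesteps the problem for code you control, but it does not help with existing code bases that rely on the cast.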
With this specific implementation in particular, there are some additional problems:
- There is no direct link from the call to llvm.type.test() to the actual indirect call site. But I need to tag the indirect call so that it references the correct indirect call table in WebAssembly. The current approach therefore performs a restricted level-order (breadth-first) search that starts at the type test, walks up to each parent node, and examines the sibling uses of that parent. This finds the right call site in most cases, but it is fairly slow.
- Metadata is used to tag the call site with the destination indirect call table in WebAssembly. But this is very brittle: the program should still be correct if the metadata is removed, and that is currently not true. First, if there is no metadata present at an indirect call site, the backend assumes that the default indirect call table should be used, because indirect calls generated by the compiler for C++ vtable references have no call table tag. Additionally, if type metadata is present at an indirect call site but the program does not contain any functions with this type, then the call site will not be tagged; this is usually caused by unreachable/dead code. Second, if the indirect call site is tagged but subsequently modified by a compiler optimization/transform, then it may lose the call table tag and be dispatched through the default table, which is incorrect.
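The search described in the first bullet can be pictured with a toy model. This is a rough sketch under stated assumptions, not the code in this patch: struct node, add_use() and find_call_site() are hypothetical, the graph is a simplified use-def graph rather than LLVM IR, and "restricted" is modelled as a plain depth limit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EDGES 4
#define MAX_QUEUE 32

enum kind { VALUE, TYPE_TEST, INDIRECT_CALL };

/* Toy use-def node: operands are "parents", users are "uses". */
struct node {
    enum kind kind;
    struct node *operands[MAX_EDGES];
    size_t n_operands;
    struct node *users[MAX_EDGES];
    size_t n_users;
};

/* Record that `user` consumes `operand` (ignored if either list is full). */
void add_use(struct node *user, struct node *operand) {
    if (user->n_operands < MAX_EDGES && operand->n_users < MAX_EDGES) {
        user->operands[user->n_operands++] = operand;
        operand->users[operand->n_users++] = user;
    }
}

/* Level-order walk upward from the type test. At each ancestor, look at
   its users (the siblings on the path just climbed) for an indirect call.
   Gives up after max_depth levels and returns NULL. */
struct node *find_call_site(struct node *type_test, int max_depth) {
    struct node *queue[MAX_QUEUE];
    int depth[MAX_QUEUE];
    size_t head = 0, tail = 0;

    if (max_depth < 1)
        return NULL;
    for (size_t i = 0; i < type_test->n_operands && tail < MAX_QUEUE; i++) {
        queue[tail] = type_test->operands[i];
        depth[tail++] = 1;
    }
    while (head < tail) {
        struct node *cur = queue[head];
        int d = depth[head++];
        for (size_t i = 0; i < cur->n_users; i++)
            if (cur->users[i]->kind == INDIRECT_CALL)
                return cur->users[i];
        if (d < max_depth)
            for (size_t i = 0; i < cur->n_operands && tail < MAX_QUEUE; i++) {
                queue[tail] = cur->operands[i];
                depth[tail++] = d + 1;
            }
    }
    return NULL;
}
```

As a usage example, build a function pointer value, a cast of it that feeds the type test, and an indirect call that uses the function pointer directly. The call is then found two levels above the type test, and a depth limit of one misses it. Raising the limit finds more call sites but visits more nodes, which is the trade-off behind "mostly correct but fairly slow".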