Hi,
After spending some time understanding how "weak aliases" work on Windows with the undocumented linker option "alternatename", I came up with 2 general macros for defining weak functions: "WEAK()" and "WEAK_INTERFACE()".
I think this really simplifies the code. These changes are also required for my next diffs, which fix Sanitizer Coverage for Windows: this is not just a refactoring, it is a prerequisite for making Sanitizer Coverage work on Windows and for porting libFuzzer to Windows, which is my final goal.
I provide 2 macros:
+ WEAK(ReturnType, Name, Parameters): for declaring or defining weak functions.
+ WEAK_INTERFACE(ReturnType, Name, Parameters): for declaring or defining weak functions that should be exported, for example for the interface of a library.
So, for example:
+ Declaring a weak function:
WEAK(bool, compare, (int a, int b))
+ Defining a weak function:
WEAK(bool, compare, (int a, int b)) { return a > b; }
On Windows, we don't have a direct equivalent of weak symbols, but we can use the macro "WIN_WEAK_ALIAS()" (defined in https://reviews.llvm.org/D28525), which defines an alias to a default implementation (using the "alternatename" linker option through a pragma). This only works when linking statically.
To define a weak function "fun", we define a default implementation with a different name, "fun__def", and we create a "weak alias" fun = fun__def.
Then, users can override it just by defining "fun".
For example:
header.h:
WEAK(bool, compare, (int a, int b))
default.cc:
WEAK(bool, compare, (int a, int b)) { return 0; }
override.cc:
extern "C" bool compare (int a, int b) { return a >= b; }
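To make the mechanism above concrete, here is a rough sketch of how a macro like WEAK() could be written. This is not the definition from the diff: the helper names STRINGIFY and SYM_PREFIX are made up for illustration, and the Windows branch assumes the linker accepts "/alternatename:fun=fun__def" as described above. On other platforms it falls back to the GCC/Clang weak attribute.

```cpp
#include <cassert>

// Hypothetical helpers, for illustration only.
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

#if defined(_MSC_VER)
// 32-bit x86 decorates C symbols with a leading underscore; x64 does not.
# if defined(_M_IX86)
#  define SYM_PREFIX "_"
# else
#  define SYM_PREFIX ""
# endif
// Ask the linker to resolve Name to Name__def when nothing else defines Name,
// declare Name, and leave the declarator of Name__def open so that the macro
// can be followed by ";" (declaration) or by a body (default definition).
# define WEAK(ReturnType, Name, Parameters)                                 \
   __pragma(comment(linker, "/alternatename:" SYM_PREFIX STRINGIFY(Name)    \
                            "=" SYM_PREFIX STRINGIFY(Name##__def)))         \
   extern "C" ReturnType Name Parameters;                                   \
   extern "C" ReturnType Name##__def Parameters
#else
// GCC/Clang: a real weak symbol, no separate default name needed.
# define WEAK(ReturnType, Name, Parameters) \
   extern "C" __attribute__((weak)) ReturnType Name Parameters
#endif

// Declaration, as it would appear in a header.
WEAK(bool, compare, (int a, int b));

// Default definition, as it would appear in default.cc.
WEAK(bool, compare, (int a, int b)) { return a > b; }

// With no override linked in, calls reach the default implementation.
void check_default_compare() {
  assert(compare(2, 1));
  assert(!compare(1, 2));
}
```

With this shape, an override is just a plain extern "C" definition of compare() in another translation unit, on both platforms.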
So, up to this point, when linking statically, it works quite similarly to weak symbols on Linux, with the difference that we always need to provide a default implementation.
However, exposing weak functions in the interface of a shared library on Windows (dll) is a bit different. The library only provides the default implementation (fun__def()).
Clients of that library only need to include a header with the declaration of that function, which will define a "weak alias" fun = fun__def. So, by default, clients will be using the default implementation imported from the dll, and they can override it by redefining the function. For example:
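For contrast, this is roughly what "no default implementation" looks like with real weak symbols on Linux. It is only a sketch: user_compare is a hypothetical name, and the code assumes GCC or Clang on an ELF platform (it will not build with MSVC). An undefined weak function has a null address, so the caller has to test for it before calling. That pattern has no counterpart with "alternatename", which is why a default is always needed there.

```cpp
#include <cassert>

// Weak declaration only (hypothetical name): nothing in the program defines it.
extern "C" __attribute__((weak)) bool user_compare(int a, int b);

bool compare_or_fallback(int a, int b) {
  // The address of an undefined weak function is null.
  if (user_compare != nullptr)
    return user_compare(a, b);
  return a > b;  // built-in fallback used when the user supplied nothing
}
```

If some translation unit defines user_compare(), the same call site picks it up without any other change.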
libAheader.h:
WEAK_INTERFACE(bool, compare, (int a, int b))
libAdefault.cc:
WEAK(bool, compare, (int a, int b)) { return 0; }
client.cc:
#include "libAheader.h"
// We can use the default implementation from the library:
compare(1, 2);
// Or we can override it:
extern "C" bool compare (int a, int b) { return a >= b; }
So, when linking dynamically, it works differently from Linux.
If some unit overrides the function (by redefining fun()), the rest of the units (other dlls, or the main executable) don't have access to that override.
In the previous example, if the client redefines "compare()", the code of the library libA will continue using the default implementation.
When access to the implementation in a different unit is required, interception can be used (I use this for the asan dll in the next diffs).
So, in order to use weak functions in a portable way that works for Windows and Linux, we must:
+ Always provide a default implementation.
+ For dynamic libraries (dll), only provide the default implementation, and don't override it inside the library.
I think this doesn't impose any limitation on the current code for sanitizers. Generally, all we want to do is provide a default implementation that can be overridden by the user.
When providing static libraries, it works fine for both Linux and Windows. When users override the weak function, the code in the library is also updated to refer to that function (because everything is statically linked together).
When providing shared libraries, on Windows we need to use interception to get a pointer to the function defined by the user and override the default implementation in the dll.
For example, for asan on Windows, we provide 2 implementations: a static library for MT and a shared library for MD. Let's consider the case of the shared library (MD).
Client code is instrumented with calls to __sanitizer_cov_trace_pc_guard(). By default, those calls will be aliased to use the default implementation imported from the asan dll: __sanitizer_cov_trace_pc_guard__def().
But clients can override it. So, when the asan dll is initialized, it will check whether the main executable exports a definition of __sanitizer_cov_trace_pc_guard(). If we find that function, then we override the default function in the dll with that pointer. So, all of the client's dlls with instrumentation that import __sanitizer_cov_trace_pc_guard__def() from the asan dll will be using the function provided by the main executable.
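The initialization step described above could look roughly like the sketch below. It is a simplified stand-in, not the code from the next diffs: instead of real interception (patching the default function), it dispatches through a function pointer, and all names (my_hook__def, find_hook_in_main_executable, install_override, call_hook, user_hook) are hypothetical. The lookup uses GetModuleHandleA(nullptr) and GetProcAddress() on Windows. On other platforms it simply reports that nothing was found.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical hook signature, modelled on a coverage callback.
using HookFn = void (*)(unsigned *guard);

static int default_calls = 0;
static int user_calls = 0;

// Default implementation that lives in the dll.
extern "C" void my_hook__def(unsigned *guard) {
  (void)guard;
  ++default_calls;
}

// Stand-in for a definition provided by the main executable.
void user_hook(unsigned *guard) {
  (void)guard;
  ++user_calls;
}

// The dll always calls through this pointer.
static HookFn active_hook = my_hook__def;

// Look for a definition exported by the main executable.
HookFn find_hook_in_main_executable(const char *name) {
#ifdef _WIN32
  HMODULE exe = GetModuleHandleA(nullptr);  // handle of the main executable
  return reinterpret_cast<HookFn>(GetProcAddress(exe, name));
#else
  (void)name;
  return nullptr;  // no Windows loader here, so report "not exported"
#endif
}

// Called once when the dll is initialized: keep the default unless
// the main executable provided its own definition.
void install_override(HookFn found) {
  if (found != nullptr) active_hook = found;
}

void call_hook(unsigned *guard) { active_hook(guard); }
```

In this sketch, the dll would call install_override(find_hook_in_main_executable("my_hook")) during its initialization. After that, every instrumented dll that goes through the dll's entry point ends up in the function from the main executable.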
Thanks,
Marcos