This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt] General definition for weak functions.
ClosedPublic

Authored by mpividori on Jan 12 2017, 12:04 AM.

Details

Summary

Hi,

After spending some time to understand how "weak aliases" work on Windows, with the undocumented linker option: "alternatename", I came up with 2 general macros for defining weak functions: "WEAK()" and "WEAK_INTERFACE()".
I think this really simplifies the code and these changes are required for my next diffs that fix Sanitizer Coverage for Windows (I don't add these changes only for refactoring, I add these changes because they are necessary to make Sanitizer Coverage work on Windows, and port libFuzzer to Windows which is my final goal).

I provide 2 macros:

+ WEAK(ReturnType, Name, Parameters)  :  for declaring or defining weak functions.
+ WEAK_INTERFACE(ReturnType, Name, Parameters)  :  for declaring or defining weak functions that should be exported, for example for the interface of a library.

So, for example:
+ Declaring a weak function:

WEAK(bool, compare, (int a, int b))

+ Defining a weak function:

WEAK(bool, compare, (int a, int b)) { return a > b; }

In Windows, we don't have a direct equivalent of weak symbols, but we can use the macro "WIN_WEAK_ALIAS()" (defined in https://reviews.llvm.org/D28525) which defines an alias to a default implementation (using the pragma "alternatename"), and only works when linking statically.
To define a weak function "fun", we define a default implementation with a different name "fun__def" and we create a "weak alias" fun = fun__def.
Then, users can override it just defining "fun".
For example:
header.h:

WEAK(bool, compare, (int a, int b))

default.cc:

WEAK(bool, compare, (int a, int b)) { return 0; }

override.cc

extern "C" bool compare (int a, int b) { return a >= b; }

So, until this point, when linking statically, it works quite similar to weak symbols in linux, with the difference that we always need to provide a default implementation.

However, when exposing weak functions in the interface of a shared library on Windows (dll), it is a bit different. We only provide the default implementation (fun__def()).
Clients of that library, only need to include a header with the declaration of that function, which will define a "weak alias" fun = fun__def. So, by default clients will be using the default implementation imported from the dll, and they can override it by redefining the function. For example:
libAheader.h:

WEAK_INTERFACE(bool, compare, (int a, int b))

libAdefault.cc:

WEAK(bool, compare, (int a, int b)) { return 0; }

client.cc

#include "libAheader.h"
// We can use the default implementation from the library:
  compare(1, 2);
// Or we can override it:
  extern "C" bool compare (int a, int b) { return a >= b; }

So, when linking dynamically, it works different to linux.
If some unit overrides the function (redefining fun()), the rest of the units (other dlls, or main executable) don't have access to it.
So, for example, in the previous example. If the client redefines "compare()", the code of the library libA will continue using the default implementation.
When access to the implementation in a different unit is required, interception can be used. (I use this for asan dll, in next diffs)

So, in order to use weak functions in a portable way that works for Windows and Linux, we must:

+ Always provide a default implementation.
+ For dynamically libraries (dll), only provide the default implementation, and don't override it inside the library.

I think this doesn't impose any limitation to the current code for sanitizers. Generally, all we want to do is provide a default implementation that can be overrided by the user.
When providing static libraries, it works fine for both Linux and Windows. When users override the weak function, also the code in the library is updated to refer to that function (because all is statically linked together).
When providing shared libraries, for windows we need to use interception to get a pointer to the function defined by the user and override the default implementation in the dll.

For example, for asan in Windows, we provided 2 implementations, a static library for MT and a shared library for MD. Lets consider the case of shared library (MD).
Client's code is instrumented with some calls to __sanitizer_cov_trace_pc_guard(). By default, that calls will be aliased to use the imported default implementation fom asan dll: __sanitizer_cov_trace_pc_guard__def().
But clients can override it. So, when asan dll is initialized, it will check if the main executable exports the definition of __sanitizer_cov_trace_pc_guard(). If we find that function, then we override the default function in the dll with that pointer. So, all the client's dlls with instrumentation that import __sanitizer_cov_trace_pc_guard__def() from asan dll, will be using the function provided by the main executable.

Thanks,
Marcos

Diff Detail

Repository
rL LLVM

Event Timeline

mpividori updated this revision to Diff 84073.Jan 12 2017, 12:04 AM
mpividori retitled this revision from to [compiler-rt] General definition for weak functions..
mpividori updated this object.
mpividori added reviewers: kcc, rnk, aizatsky, zturner.
mpividori set the repository for this revision to rL LLVM.
mpividori added a subscriber: llvm-commits.
mpividori updated this object.Jan 12 2017, 12:37 AM
rnk accepted this revision.Jan 17 2017, 1:01 PM

Looks good to me. I think this is the best we can do to abstract over the differences between ELF weak functions and COFF weak functions. We should wait for @kcc to comment though, since this impacts them significantly.

This revision is now accepted and ready to land.Jan 17 2017, 1:01 PM
rnk added inline comments.Jan 17 2017, 1:05 PM
lib/sanitizer_common/sanitizer_internal_defs.h
74 ↗(On Diff #84073)

We should consider removing this macro eventually and migrate all users over to the new API.

mpividori added inline comments.Jan 17 2017, 1:10 PM
lib/sanitizer_common/sanitizer_internal_defs.h
74 ↗(On Diff #84073)

Exactly, but I was waiting for review on the new macro before migrating everything.

rnk requested changes to this revision.Jan 17 2017, 1:12 PM
rnk added inline comments.
lib/sanitizer_common/sanitizer_internal_defs.h
42 ↗(On Diff #84073)

After looking at D28597 and D28598, I realized how this usage of WIN_WEAK_ALIAS makes it so that every weak declaration now introduces a #pragma comment(linker, "/alternatename:..."). That seems bad. In particular, I don't like how including sanitizer_interface_internal.h in D28598 is used for the side effect of providing aliases for the sanitizer coverage callbacks.

It might be nicer to provide separate macros for declaring and defining weak functions. On Windows, weak function declarations wouldn't have anything special added. Then, at the definition, they would provide the pragma and the default definition.

60 ↗(On Diff #84073)

typo on "dynamically"

This revision now requires changes to proceed.Jan 17 2017, 1:12 PM
mpividori added inline comments.Jan 17 2017, 2:15 PM
lib/sanitizer_common/sanitizer_internal_defs.h
42 ↗(On Diff #84073)

@rnk, yes, it "looks bad". But works really well and I strongly think this is appropriate for this case, I will explain why:

"Windows "weak aliases" are not exposed on object files"

As I commented in the definition of WIN_WEAK_ALIAS(), if we define an alias: "fun = fun_def", the compiler doesn't include "fun" in the symbol table of the object file. (if we refer to "fun" inside that object file, it will include "fun" to the symbol table, but as UNDEF, which for the purpose of this explanation is the same)
So, when linking to an object file that includes the default definition ant the alias, it will works fine because the linker will consider that object file even if it doesn't resolve any symbol, and it will find the definition, but when the default definition is included in a static library, the linker could ignore the object file where "fun" is defined (because "fun" is not included in the symbol table, or is included as UNDEF) and result in unresolved symbol.
For example:

// definition.h
  int fun();
  int fun__def();
// definition.cc
extern "C" {
  WIN_WEAK_ALIAS(fun, fun__def)
  int fun__def() {
    return 1;
  }
}
// main.cc
#include "definition.h"
int main() {
  return fun();
}

This works fine:

clang-cl /c definition.cc
clang-cl main.cc definition.obj

This fails with unresolved symbol:

clang-cl /c definition.cc
lib definitions.obj
clang-cl main.cc definition.lib

So, I think it is OK to include the weak alias in the declaration, so everywhere you refer to "fun" you include the weak alias "fun = fun__def".
The linker should be able to remove that alias when "fun" is not used. And we avoid the problem of dealing with static libraries.
For example:

// definition.h
  int fun();
  int fun__def();
  WIN_WEAK_ALIAS(fun, fun__def)
// definition.cc
extern "C" {
  int fun__def() {
    return 1;
  }
}
// main.cc
#include "definition.h"
int main() {
  return fun();
}

This will works fine:

clang-cl /c definition.cc
lib definitions.obj
clang-cl main.cc definition.lib

Would you agree?

mpividori updated this revision to Diff 84761.Jan 17 2017, 4:03 PM
mpividori edited edge metadata.

@rnk I updated the diff to add suggested changes. Would you agree? (In future, if we have link errors with weak symbols we must remember to use WIN_FORCE_LINK() to force linking to the object defining a weak symbol, but now, fortunately, it is not necessary, since weak symbols are defined in modules that define other non-weak symbols, so they are always included by the linker).

aizatsky requested changes to this revision.Jan 17 2017, 4:06 PM
aizatsky added a comment.EditedJan 17 2017, 4:06 PM

I don't like that we introduce macros making function definition ugly and opaque. All this to implement one platform hacks. I would prefer to have current version cleaned up to eliminate ifdefs. Something like:

SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE
const char *__ubsan_default_options() { return ""; }

WIN_WEAK_FN(__ubsan_default_options);

Is such clenup possible? What properties does proposed solution have that cleaned up one doesn't?

This revision now requires changes to proceed.Jan 17 2017, 4:06 PM

@aizatsky I updated the code 2 minutes before your comment.

I think the two macros that I am defining: WEAK_DEF() and WEAK_INTERFACE_DECL() are very simple to understand and simplify the code removing redundant code.

The solution that you propose has some disadvantages:

  • For Linux, we don't need to define a different function for the default implementation. So, with your proposed solution we have 2 approaches:

+) or we define the default function and we create a weak alias in linux too:

  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options__def() { return ""; }
#ifdef SANITIZER_WINDOWS
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options();
  WIN_WEAK_ALIAS(__ubsan_options, __ubsan_options__def);
#else
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options()  __attribute__ ((weak, alias ("___ubsan_options__def")));;
#endif

(I guess weak aliases also work for APPLE)
We could generalize the weak alias in a macro to avoid the #ifdefs .

+) Or we can use some #ifdef to define the default function only for Windows:

#ifdef SANITIZER_WINDOWS
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options__def() { return ""; }
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options();
  WIN_WEAK_ALIAS(__ubsan_options, __ubsan_options__def);
#else
  SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE const char *__ubsan_options(){ return ""; }
#endif

In this case we have to duplicate the code for the default implementation.

With the macro WEAK_DEF() that I define, you can see that I simply the code, since we don't need any special code for different platforms:

WEAK_DEF(const char *, __ubsan_options, ()) { return ""; }

This works fine for Windows and other platforms.

  • Also, we need to export default functions but only on Windows, with your approach, this means I should write this code in the header file:
#ifdef SANITIZER_WINDOWS
  const char *__ubsan_options();
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options__def();
#else
  SANITIZER_INTERFACE_ATTRIBUTE const char *__ubsan_options();
#endif

Which I achieve with the macro WEAK_INTERFACE_DECL(), and simplifies that code to:

WEAK_INTERFACE_DECL(const char *, __ubsan_options, ());

I suggest to make WIN* macros no-op on all platforms that do not need it.

#if WINDOWS
  #define WIN_WEAK_FN(...) ...
#else
  #define WIN_WEAK_FN(...)
#end

And have the function definition code look like:

SANITIZER_WEAK_ATTRIBUTE const char *__ubsan_default_options() { return ""; }
WIN_WEAK_FN(__ubsan_default_options, ...)

Notice the absence of ifdefs in the client code and C++ syntax for simple cases.

How about your solution and other platforms? E.g. mac? If you want to push such a heavy interface change you at least should think through how it will solve other problems as well. Does it help other platforms? It wouldn't be great if we had to introduce another level of complexity for another platform.

Another wild thought: is there any template magic that we might invoke to automate creation of function aliases?

But that doesn't work, we shouldn't define __ubsan_default_options on Windows. It must be a "weak alias" to a default implementation "ubsan_default_optionsdef".
And "ubsan_default_optionsdef" shouldn't exist in non-windows systems, where you can define weak functions.

I also prefer to keep the function declarations as they are. Otherwise IDEs can't find the function. Another idea: Is there a way to rename the function with an attribute? Or e.g. with __asm ("another_name"); (this works on Darwin)?

zturner edited edge metadata.Jan 17 2017, 5:31 PM

Another wild thought: is there any template magic that we might invoke to automate creation of function aliases?

I don't think a template would work because of the attributes and the fact that the name has to be different. You couldn't propagate the name through a template, it has to be done via macro. I'm trying to think of alternatives, but nothing really comes to mind. Will give it some more thought.

Another option (as I mentioned in https://reviews.llvm.org/D28525#641954 ), that will also simplify the code, is to ALWAYS use weak aliases instead of weak functions.
So, always define a default implementation with "__def" suffix, and then define a weak alias: fun = fun__def
I am not sure if weak aliases work on Apple. If that is the case, this would STRONGLY SIMPLIFY the code since we take EXACTLY the same approach for all the platforms.
So, this is the idea:

  • we define a general macro for weak aliases:
#if SANITIZER_WINDOWS
#define WEAK_ALIAS(Name, Default)  \
  __pragma(comment(linker, "/alternatename:" #Name "=" #Default))
#else
#define STRINGIFY(A) #A
#define WEAK_ALIAS(Name, Default) \
  _Pragma( STRINGIFY(weak Name =  Default) )
#endif
  • We define a macro to represent the name of the default implemenation:
#define WEAK_DEFAULT_NAME(Name) Name##__def
  • For each weak function "fun" we provide a default implementation and weak alias:
void WEAK_DEFAULT_NAME(fun)(int a, int b) {
   // Default implementation .....
}
WEAK_ALIAS(fun, WEAK_DEFAULT_NAME(fun))

What do you think? I whink this is the best we can do, and, at the same time, keep the code similar for all platforms.

(I am using weak pragmas for linux: https://gcc.gnu.org/onlinedocs/gcc/Weak-Pragmas.html)

So, in Windows we only have "weak aliases". In Linux, we have both, weak functions and weak aliases.
I would like to know which options we have in Darwin. If we can use weak aliases for Darwin, it would be simple, since we can use weak aliases for all the platforms.
If not, we need different approaches.

(I am using weak pragmas for linux: https://gcc.gnu.org/onlinedocs/gcc/Weak-Pragmas.html)

So, in Windows we only have "weak aliases". In Linux, we have both, weak functions and weak aliases.
I would like to know which options we have in Darwin. If we can use weak aliases for Darwin, it would be simple, since we can use weak aliases for all the platforms.
If not, we need different approaches.

Weak aliases don't work on Darwin.

@kubabrecka @aizatsky Ok, thanks for your comments.

So, we can't take the same approach for all platforms because: Windows supports "weak aliases" , Linux support weak functions and weak aliases, Darwin support weak functions.
So, we need to continue with this approach:

  • Weak functions with default implementation for Linux and Darwin:
__attribute__ ((weak)) void fun () {
  // Default implementation
}
  • Use an auxiliar function for windows and define a weak alias:
void fun_def() {
  // Default implementation
}
WIN_WEAK_ALIAS(fun, fun__def)

Do you agree on this point? Would you do it differently?


So, if we include that code, we have some duplicated code:

#if ! SANITIZER_WINDOWS
 __attribute__ ((weak)) void fun () {
   // Default implementation
 }
#else
 void fun_def() {
   // Default implementation
 }
 WIN_WEAK_ALIAS(fun, fun__def)
#endif

To avoid duplicated code, I will define a macro WEAK_DEF(), so we can reduce that code to:

WEAK_DEF(void, fun, ()) {
   // Default implementation
 }

Would you agree on this point? Would you do it differently?


In the headers, I also need to export the default function on Windows ("fun__def").
So, we can use "#ifdef" and declare the functions as:

SANITIZER_INTERFACE_ATTRIBUTE void fun();
#ifdef SANITIZER_WINDOWS
SANITIZER_INTERFACE_ATTRIBUTE void fun__def();
#endif

To avoid that, I will define a macro WEAK_INTERFACE_DECL(), so we can reduce that code to:

WEAK_INTERFACE_DECL(void, fun, ());

Would you agree on this point? Would you do it differently?


As we declare many weak functions in different sanitizers, not only in sanitizer_common, I thought it would be appropriate to use the macros that I mention instead of using "#ifdefs".

I would appreciate your comment on this. I really want to finish with these changes to continue focused on libFuzzer.

Thanks.

kubamracek edited edge metadata.Jan 17 2017, 10:05 PM

To avoid duplicated code, I will define a macro WEAK_DEF(), so we can reduce that code to:

WEAK_DEF(void, fun, ()) {
  // Default implementation
}

Would you agree on this point? Would you do it differently?

I don't like that this is not a "standard" function declaration. It's not obvious that this declares a function. My editor/IDE will not recognize this as a function, and "Go to definition" and such will not work. Is there really no way of keeping the syntax of...:

WHATEVER_MACROS_NECESSARY
void fun() MAYBE_OTHER_MACROS {
  ...code...
}

?

In D28596#649214, @kubabrecka wrote:

To avoid duplicated code, I will define a macro WEAK_DEF(), so we can reduce that code to:

WEAK_DEF(void, fun, ()) {
  // Default implementation
}

Would you agree on this point? Would you do it differently?

I don't like that this is not a "standard" function declaration. It's not obvious that this declares a function. My editor/IDE will not recognize this as a function, and "Go to definition" and such will not work. Is there really no way of keeping the syntax of...:

WHATEVER_MACROS_NECESSARY
void fun() MAYBE_OTHER_MACROS {
  ...code...
}

?

Hi @kubabrecka ,
Thanks for your reply.
I think the macro name WEAK_DEF() makes it explicit that we are defining a function. Also, I am writing the components in the same order: return value, function name, parameters and function body. I think it is easy to understand by any user, so the only problem is the editor.
I can't keep the syntax as you suggest, because I need to define a function with a different name adding the suffix "__def" and also define a weak alias, and declare the original function. So, definitely this can not be accomplished in that way.
Also, is the same approach used for the definition of the macros: INTERCEPTOR() , DECLARE_REAL() (in: interception/interception.h), for RECOVERABLE() and UNRECOVERABLE() (in: ubsan/ubsan_handlers.h), LLVM_SYMBOLIZER_INTERCEPTOR*() (in: sanitizer_common/symbolizer/sanitizer_wrappers), etc. So, this is also used in other parts of compiler-rt repository.
As weak functions are a small portion of the definitions, I think this won't generate many problems to users working with an editor that doesn't expand the macros.
Also, I think the advantages are considerably more relevant than that disadvantage. We can define weak functions that will work in all platforms with a unique macro, without boilerplate code. If we want to continue supporting Linux, Darwin and Windows, I think this is useful.
Would you agree?
Thanks

We can define weak functions that will work in all platforms with a unique macro, without boilerplate code. If we want to continue supporting Linux, Darwin and Windows, I think this is useful.

Oh, I absolute agree with that.

The macros you mentioned (INTERCEPTOR, DECLARE_REAL, RECOVERABLE, and TEST from gtest) and a bit of mess that doesn't have a unified syntax of the user side. Gtest's TEST() macro sounds like a different story to me, because it doesn't have parameters nor return type -- we only have TEST(classname, testname) { code }, which is completely reasonable. The syntax of INTERCEPTOR() is very close to what you're proposing here, and frankly I don't like the syntax of INTERCEPTOR either.

I need to define a function with a different name adding the suffix "__def" and also define a weak alias, and declare the original function. So, definitely this can not be accomplished in that way.

Have you looked into e.g. changing the exported symbol names with /export:...=...? Couldn't this help? Could we use it to basically switch the two functions?

If there's really no other way, then I suggest either:

  1. Changing WEAK_DEF to drop the extra set of parentheses to have parity with INTERCEPTOR macro. I.e.: WEAK_DEF(void, OnPrint, const char *str) { code }.
  2. Having a macro only around the function name, i.e.: WEAK_ATTRIBUTE void WEAK_NAME(OnPrint)(const char *str) { code }.

Hi @kubabrecka ,
Thanks for your suggestions.

I need to define a function with a different name adding the suffix "__def" and also define a weak alias, and declare the original function. So, definitely this can not be accomplished in that way.

Have you looked into e.g. changing the exported symbol names with /export:...=...? Couldn't this help? Could we use it to basically switch the two functions?

Unfortunately, /export:...=... doesn't help.

If there's really no other way, then I suggest either:

  1. Changing WEAK_DEF to drop the extra set of parentheses to have parity with INTERCEPTOR macro. I.e.: WEAK_DEF(void, OnPrint, const char *str) { code }.

Ok. Yes, I agree. I will do that.

  1. Having a macro only around the function name, i.e.: WEAK_ATTRIBUTE void WEAK_NAME(OnPrint)(const char *str) { code }.

I can't do that because not only I need to change the name, I also need to add a linker pragma WIN_WEAK_ALIAS(OnPrint, OnPrint__def) and declare OnPrint.
I mean, for non-windows OnPrint definition will be:

void OnPrint(const char *str) {
  // Default implementation
}

And for Windows, it will be:

void OnPrint__def(const char *str) {
  // Default implementation
}
void OnPrint(const char *str);
WIN_WEAK_ALIAS(OnPrint, OnPrint__def)

Thanks.

  1. Having a macro only around the function name, i.e.: WEAK_ATTRIBUTE void WEAK_NAME(OnPrint)(const char *str) { code }.

I can't do that because not only I need to change the name, I also need to add a linker pragma WIN_WEAK_ALIAS(OnPrint, OnPrint__def) and declare OnPrint.
I mean, for non-windows OnPrint definition will be:

void OnPrint(const char *str) {
  // Default implementation
}

And for Windows, it will be:

void OnPrint__def(const char *str) {
  // Default implementation
}
void OnPrint(const char *str);
WIN_WEAK_ALIAS(OnPrint, OnPrint__def)

Thanks.

What if instead of getting it all into one line, we settle for this:

void WEAK_FN_DEF(OnPrint)(const char *str) {
}
WEAK_FN_ALIAS(OnPrint)

The macro WEAK_FN_ALIAS could look like this:

#define WEAK_FN_ALIAS(Fn) \
  decltype(Fn##__def) Fn; \
  WIN_WEAK_ALIAS(Fn, Fn##__def)

On non-Windows platforms, it would just do nothing. Apparently the crazy decltype magic works and should give you a definition named OnPrint__def and a forward declaration named OnPrint.

Yes, what @zturner proposes works. @kubabrecka Would that make your editor work fine?
I would prefer to use only one macro as I defined it before, because if we use 2 macros, we always need to remember to include both.
But just let me know what you prefer and I will do that.

The only way i can think to improve that is to make it so WEAK_ALIAS macro comes first, and make WEAK_DEF comes second and static_assert if WEAK_ALIAS has not been called (by using sizeof on the forward decl perhaps). This way you will get a compiler error if you don't do both.

I don't even know if this would work (and i also don't think it should hold up this change regardless since it's only a minor improvement).

That said, I'm fine with any approach that works, so I'll leave it up kuba@ and aizatsky@ to decide which they like better

Both solutions work by me.

What about FreeBSD? It hasn't been mentioned in this thread.

@kubabrecka I don't change the implementation for non-windows systems. I only modify the code to use a macro, but after the preprocessor the resulting code is nearly the same. I guess weak functions works for FreeBSD.

@kubabrecka according to the definition of SANITIZER_WEAK_ATTRIBUTE in sanitizer_common/sanitizer_internal_defs.h, for FreeBSD we are considering __attribute__((weak)). So, in this diff I don't change the behavior on FreeBSD.
Thanks.

rnk added a comment.Jan 19 2017, 10:54 AM

I suggest to make WIN* macros no-op on all platforms that do not need it.

#if WINDOWS
  #define WIN_WEAK_FN(...) ...
#else
  #define WIN_WEAK_FN(...)
#end

And have the function definition code look like:

SANITIZER_WEAK_ATTRIBUTE const char *__ubsan_default_options() { return ""; }
WIN_WEAK_FN(__ubsan_default_options, ...)

Notice the absence of ifdefs in the client code and C++ syntax for simple cases.

How about your solution and other platforms? E.g. mac? If you want to push such a heavy interface change you at least should think through how it will solve other problems as well. Does it help other platforms? It wouldn't be great if we had to introduce another level of complexity for another platform.

I don't want it to be possible for non-Windows developers to forget to use WIN_WEAK_FN. I would prefer it if the macro wrapped the function declaration so that it can't be forgotten. The Windows build is too often broken when adding weak functions.

I don't want it to be possible for non-Windows developers to forget to use WIN_WEAK_FN. I would prefer it if the macro wrapped the function declaration so that it can't be forgotten. The Windows build is too often broken when adding weak functions.

Makes sense. Marcos, can we get an updated patch? I'll test it on Darwin.

mpividori updated this revision to Diff 85159.EditedJan 20 2017, 11:18 AM
mpividori edited edge metadata.

Hi,
I updated the diff to include requested changes. I define 2 macros: WEAK_DEF() and WEAK_INTERFACE_DECL().

Tests pass on Linux.

Now, if we use WEAK_DEF() to define a weak function, it will work on Windows too, as long as:

  • We only define it once. This means, only one definition when linking. (This is different to weak symbols in linux, where we can define it many times in different TU, and the linker will choose the first that it finds). If this generates problems in the future (this doesn't happen now, but could happen when combining many sanitizers on Windows, providing the default weak implementation twice), this could be solved modifying the macro, WEAK_DEFAULT_NAME() to generate different default names for different sanitizers, for example, for asan, define fun__def__asan and for lsan, define fun__def__lsan.
  • We link statically.

When linking dynamically, weak functions are exported with the __dll suffix.
So, on the client side, we can define a weak alias: WIN_WEAK_ALIAS(fun , fun__dll) and use the default implementation or override it, redefining "fun". This is what I do in nexts diffs to fix Sanitizer Coverage for Windows. With the difference that I include the "windows weak aliases" in a static library which is included by clang driver (clang-rt_asan_dynamic_runtime_thunk-arch.lib), so they don't need to do anything different to make it work on Windows.

Thanks.

Can you rebase? The patch doesn't apply cleanly on SVN trunk.

Really? I rebased some hours ago. Ok I will rebase again.

mpividori updated this revision to Diff 85167.Jan 20 2017, 11:35 AM

@kubamracek I see the problem. You need to apply this patch before: https://reviews.llvm.org/D28525

@kubamracek I see the problem. You need to apply this patch before: https://reviews.llvm.org/D28525

That diff looks accepted by Reid. Can you land it first?

The patch passes all tests on Darwin.

aizatsky requested changes to this revision.Jan 20 2017, 5:19 PM

I think we can clean this up a bit.

lib/asan/asan_allocator.cc
960 ↗(On Diff #85181)

WEAK_* macros specify extern "C", right? Let's remove this and similar usages then.

lib/sanitizer_common/sanitizer_common.cc
508 ↗(On Diff #85181)

I'm not sure what's up with !SANITIZER_GO here, do you know?

lib/sanitizer_common/sanitizer_coverage_libcdep.cc
1021 ↗(On Diff #85181)

Can we get rid of void when defining a no-op weak function?

WEAK_INTERFACE_DECL(const char *, __ubsan_default_options);

It doesn't look like we'd need ##__VA_ARGS__.

lib/sanitizer_common/sanitizer_internal_defs.h
24 ↗(On Diff #85181)

Let's create a top-level weak symbols section documentation here that doesn't mention any platform specifics. It should document intent (weak symbols multi-platform support), top-level macros, their correct usage and give an example.

All platform-specific documentation should be next to implementation like now.

lib/sanitizer_common/sanitizer_printf.cc
218 ↗(On Diff #85181)

do you still neeed this #if?

218 ↗(On Diff #85181)

Now that you are introducing these macros, should we continue to have SANITIZER_SUPPORTS_WEAK_HOOKS visible to the code?

This revision now requires changes to proceed.Jan 20 2017, 5:19 PM
mpividori added inline comments.Jan 20 2017, 5:39 PM
lib/asan/asan_allocator.cc
960 ↗(On Diff #85181)

@aizatsky Ok, I can do that.

lib/sanitizer_common/sanitizer_common.cc
508 ↗(On Diff #85181)

@aizatsky I don't know, I think we can remove it.
As far as I understand, SANITIZER_GO is a flag that we provide from the command line.
For SANITIZER_GO we ignore weak symbols in sanitizer_internal_defs.h.

lib/sanitizer_common/sanitizer_coverage_libcdep.cc
1021 ↗(On Diff #85181)

@aizatsky Yes, I can do that. Do you think that way it is easier to understand? I can see for the INTERCEPTOR() macro, we always make it explicit and write void.

lib/sanitizer_common/sanitizer_internal_defs.h
24 ↗(On Diff #85181)

@aizatsky I agree, I will add more documentation.

lib/sanitizer_common/sanitizer_printf.cc
218 ↗(On Diff #85181)

@aizatsky as I don't understand the purpose of TSAN_EXTERNAL_HOOKS and SANITIZER_GO, I don't modify this implementation. If you prefer I can do that, maybe in a different diff? Also I want to add more changes, for example remove all the conditions like:

if (weak_function)
  weak_function()

Which is not necessary anymore, since we always provide a default implementation. Weak functions are always defined.

mpividori added inline comments.Jan 21 2017, 3:30 PM
lib/sanitizer_common/sanitizer_common.cc
508 ↗(On Diff #85181)

@aizatsky SANITIZER_GO flag is only used to build ThreadSanitizer runtime for Go. This condition was added in r285400: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161024/400704.html I think we can remove it, but to avoid problems, I don't.

mpividori added inline comments.Jan 21 2017, 3:34 PM
lib/sanitizer_common/sanitizer_coverage_libcdep.cc
1021 ↗(On Diff #85181)

@aizatsky If I do so, the compiler generates a lot of warnings: warning: must specify at least one argument for ... parameter of variadic macro

mpividori updated this revision to Diff 85251.Jan 21 2017, 4:00 PM
mpividori edited edge metadata.

Hi,
I add a new diff to include suggested changes:

  • I move Windows specific code to sanitizer_win_defs.h.
  • I include a general explanation on using WEAK_DEF() and WEAK_DECL() in sanitizer_internal_defs.h.
  • I remove redundant extern "c".
  • I clarify the purpose of SANITIZER_SUPPORT_WEAK_HOOKS and add some documentation.

So, to simplify the code, we always provide a default implementation for weak functions.

But, for some special cases, like weak hooks __sanitizer_[malloc|free]_hook() that will be called very often, we want to avoid the overhead of executing the default implementation when it is not necessary. (this overhead really matters? only the overhead of function call, because the default implmentation is empty... maybe...)
So, for weak hooks, I don't modify the code for platforms that can use real weak symbols that will evaluate to a null pointer when not defined. I only define the default implementation for platforms that doesn't support weak hooks, like:

#if !SANITIZER_SUPPORT_WEAK_HOOKS
WEAK_DEF(void, __sanitizer_malloc_hook, void *ptr, uptr size) {
   (void)ptr;
   (void)size;
}
#endif

Maybe this doesn't represent a relevant difference in practice and we can remove the #if #endif and always provide a default implementation for weak hooks too.

So, the final idea is this:

  • WEAK_DECL() for declaring a weak function.
  • WEAK_DEF() for defining a default weak implementation.
  • If we don't want to provide a a default implementation when not necessary and avoid the overhead:
#if !SANITIZER_SUPPORT_WEAK_HOOKS
`WEAK_DEF( ... weak_hook ...)` { // default}
#endif

@aizatsky Can me move forward?

After this diff, I would be grateful if you could consider these diffs in this order:

So we finish fixing Sanitizer Coverage for Windows and I can continue working on libFuzzer.
Thanks,

rnk added inline comments.Jan 23 2017, 3:47 PM
lib/sanitizer_common/sanitizer_win_defs.h
91 ↗(On Diff #85251)

Remind me why we need to declare the *__dll variants in headers? Can this be a .cc file implementation detail?

If we don't need to declare both foo and foo__dll, then we can get rid of WEAK_DECL and go back to something more like:

extern "C" void SANITIZER_WEAK_DECL sancov_blah(int params);
mpividori added inline comments.Jan 23 2017, 4:18 PM
lib/sanitizer_common/sanitizer_win_defs.h
91 ↗(On Diff #85251)

@rnk I think that macro is appropriate. Every time we declare a weak function in Windows, we are declaring 2 functions. So, when we include that header, we are also including both declarations, which is what is expected.
So, for example in: https://reviews.llvm.org/D28598 when I include the header: sanitizer_common/sanitizer_interface_internal.h , it include both declarations, and I don't need to add extra declarations.

mpividori updated this revision to Diff 85804.Jan 25 2017, 2:05 PM

@rnk Ok, I realized you are right, we can remove WEAK_DECL and continue using declarations like: void SANITIZER_WEAK_ATTRIBUTE ... , because we only need to refer to the exported function "fun__dll" on a linker pragma WEAK_ALIAS(...), so we don't need a declaration.
So, now, I only define a macro "WEAK_DEF()".
Thanks.

rnk accepted this revision.Jan 25 2017, 3:36 PM

lgtm Thanks!

+alekseyshl

Is there anything that remains to be done on this revision? IIUC all concerns have been addressed. kcc@, aizatsky@, alekseyshl@, kubamracek@: Do you guys have any further issues or is this ok to go in?

mpividori updated this revision to Diff 86093.Jan 27 2017, 12:26 PM

In this diff I modify the name of exported weak symbols, so I need to temporarily disable a win test which check the interfaces. I will fix that test in a different diff, after all the refactoring to windows sanitizer.

alekseyshl accepted this revision.Jan 27 2017, 1:40 PM

LGTM with a few minor comments.

lib/asan/asan_allocator.cc
961 ↗(On Diff #85804)

Since all we have now is this macro, can we be a bit more explicit about the fact that it's an interface function?
SANITIZER_INTERFACE_WEAK_DEF, for example?

lib/sanitizer_common/sanitizer_internal_defs.h
17 ↗(On Diff #85804)

It does not feel right, unconditionally including win specific header in the generic module. I realize that within sanitizer_win_defs.h there is a #if SANITIZER_WINDOWS guard, but still. Can you move it into #if SANITIZER_WINDOWS block, where it is actually needed?

lib/sanitizer_common/sanitizer_win_defs.h
62 ↗(On Diff #85804)

We cannot define weak functions on Windows,

93 ↗(On Diff #85804)

A default implementation must always be provided

99 ↗(On Diff #85804)

I think your summary explains it in a bit more coherent way, can you expand this comment?

mpividori added inline comments.Jan 27 2017, 1:52 PM
lib/asan/asan_allocator.cc
961 ↗(On Diff #85804)

@alekseyshl Ok, I agree. I will change the macro.

lib/sanitizer_common/sanitizer_internal_defs.h
17 ↗(On Diff #85804)

@alekseyshl Ok I can change that, but in https://reviews.llvm.org/D28525 they suggested that I simply include the header and only use #ifdef SANITIZER_WINDOWS inside the header, to simplify the code. Whatever you prefer is ok for me.

alekseyshl added inline comments.Jan 27 2017, 3:54 PM
lib/sanitizer_common/sanitizer_internal_defs.h
17 ↗(On Diff #85804)

To me, D28525 can be justified by the fact that it's a .cc file containing both win and non-win code, hence both types of headers are there (although I'd rather split it into platform specific files and have a check inside sanitizer_win_defs.h that SANITIZER_WINDOWS is defined, but anyway).

Here you're defining things, you already have those #if blocks, move it into the first one requiring definitions from this file.

Just and idea, you don't have to do anything about it: if we take a step back, whatever definitions use from sanitizer_win_defs.h, are they used anywhere else? My (limited) understandin so far, now the rest of the code is going to use WEAK_DEF macro, everything else is its implementation detail, right? If so, why not move all those definitions to sanitizer_internal_defs.h under #if SANITIZER_WINDOWS? Wouldn't it express better what we're trying to achieve?

@alekseyshl Thanks for your feedback.

Yes, I agree about the #ifdefs. After this diff I only include sanitizer_win_defs.h in files including windows code, except sanitizer_internal_def.h, so I think I can simply use an "#ifdef" there.

I created 2 macros: WIN_WEAK_IMPORT_DEF and WIN_WEAK_EXPORT_DEF and I included their definition in sanitizer_win_defs.h because this simplifies the code in sanitizer_internal_defs.h. Otherwise, I would be including a lot of Windows specific comments there which make the code confusing.

Also, I need WIN_WEAK_IMPORT_DEF for other windows specific files, in the next diffs.

mpividori updated this revision to Diff 86154.Jan 27 2017, 6:21 PM

@alekseyshl , I include requested changes.

LGTM

lib/asan/asan_win_dll_thunk.cc
337 ↗(On Diff #86154)

If you don't need those two functions, just delete them.

This revision was automatically updated to reflect the committed changes.

According to git bisect, this broke lib/tsan/go/build.bat, and it seems to be broken ever since.

rnk added a comment.May 11 2018, 1:27 PM

According to git bisect, this broke lib/tsan/go/build.bat, and it seems to be broken ever since.

I don't believe that runs continuously anywhere, so it doesn't surprise me that it's been broken for over a year. So far as I can tell, the tsan + Windows + go build is unmaintained. =/ It'd be nice to get it straightened out.

Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2019, 12:26 AM

Turns out this broke an important part of our internal integration because sanitizer::OnPrint become OnPrint, so existing hook stopped working. We are clearly missing some tests.
But exporting and calling a global extern "C" OnPrint does not look OK. All functions need to be either extern "C" and
sanitizer_ prefixed, or in __sanitizer:: namespace.

Now if you build:

#include <stdio.h>
void OnPrint() { printf("OnPrint\n"); }
int main() {}

It crashes with:

$ TSAN_OPTIONS=verbosity=1 ./a.out

103485==Installed the sigaction for signal 11

ThreadSanitizer:DEADLYSIGNAL
ThreadSanitizer:DEADLYSIGNAL
ThreadSanitizer: nested bug in the same thread, aborting.

rnk added a comment.Sep 24 2019, 9:05 AM

I see. I audited the other instances of SANITIZER_INTERFACE_WEAK_DEF, and they all seem to be properly namespaced into __asan_, __sanitizer_, etc.

Do you (or someone else from dynamic-tools) mind driving the fix for this, since you control the internal integration for this and would know where to add the missing test? Again, sorry for the breakage.

In D28596#1680946, @rnk wrote:

I see. I audited the other instances of SANITIZER_INTERFACE_WEAK_DEF, and they all seem to be properly namespaced into __asan_, __sanitizer_, etc.

Do you (or someone else from dynamic-tools) mind driving the fix for this, since you control the internal integration for this and would know where to add the missing test? Again, sorry for the breakage.

Proposed fix in https://reviews.llvm.org/D67987.