This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Provide -fvisibility-global-new-delete-hidden option
ClosedPublic

Authored by phosek on Oct 26 2018, 5:36 PM.

Details

Summary

When the global new and delete operators aren't declared, Clang
provides and implicit declaration, but this declaration currently
always uses the default visibility. This is a problem when the
C++ library itself is being built with non-default visibility because
the implicit declaration will force the new and delete operators to
have the default visibility unlike the rest of the library.

The existing workaround is to use assembly to enforce the visiblity:
https://fuchsia.googlesource.com/zircon/+/master/system/ulib/zxcpp/new.cpp#108
but that solution is not always available, e.g. in the case of of
libFuzzer which is using an internal version of libc++ that's also built
with -fvisibility=hidden where the existing behavior is causing issues.

This change introduces a new option -fvisibility-global-new-delete-hidden
which makes the implicit declaration of the global new and delete
operators hidden.

Diff Detail

Repository
rC Clang

Event Timeline

phosek created this revision.Oct 26 2018, 5:36 PM
rnk added a comment.Oct 29 2018, 5:41 PM

This seems pretty inconsistent with how -fvisibility=hidden behaves with normal, user written declarations. Consider this code:

void foo();
void bar() { foo(); }

We get this IR:

$ clang -S -emit-llvm --target=x86_64-linux t.c -o - -fvisibility=hidden
...
define hidden void @bar() #0 {
entry:
  call void @foo()
  ret void
}
declare dso_local void @foo() #1

Basically, declarations are never marked hidden, only definitions are. This is because system headers are not explicitly marked with default visibility annotations, and you still want to be able to call libc from code compiled with -fvisibility=hidden.

I'm... kind of surprised we have code to explicitly mark these declarations with default visibility attributes.

rsmith requested changes to this revision.Oct 29 2018, 7:30 PM

Replacing the global new and delete is supposed to be a whole-program operation (you only get one global allocator). Otherwise you couldn't allocate memory in one DSO and deallocate it in another. (And nor could inline functions etc.)

If you really want to do this weird thing, and you understand what you're getting yourself into, I don't see a problem with having a dedicated flag for it, but don't break all existing users of -fvisibility=.

This revision now requires changes to proceed.Oct 29 2018, 7:30 PM

Replacing the global new and delete is supposed to be a whole-program operation (you only get one global allocator). Otherwise you couldn't allocate memory in one DSO and deallocate it in another. (And nor could inline functions etc.)

If you really want to do this weird thing, and you understand what you're getting yourself into, I don't see a problem with having a dedicated flag for it, but don't break all existing users of -fvisibility=.

That sounds reasonable, do you have any suggestion for the name of such flag?

Replacing the global new and delete is supposed to be a whole-program operation (you only get one global allocator). Otherwise you couldn't allocate memory in one DSO and deallocate it in another. (And nor could inline functions etc.)

If you really want to do this weird thing, and you understand what you're getting yourself into, I don't see a problem with having a dedicated flag for it, but don't break all existing users of -fvisibility=.

That sounds reasonable, do you have any suggestion for the name of such flag?

-fglobal-new-delete-visibility= is the first thing that springs to mind, but maybe there's something better?

Replacing the global new and delete is supposed to be a whole-program operation (you only get one global allocator). Otherwise you couldn't allocate memory in one DSO and deallocate it in another. (And nor could inline functions etc.)

If you really want to do this weird thing, and you understand what you're getting yourself into, I don't see a problem with having a dedicated flag for it, but don't break all existing users of -fvisibility=.

I don't really understand how these functions are different from other functions. The language standards don't have anything to say about ELF visibility. What you say about "whole-program operation" is true of any global symbol. When we use visibility switches or annotations it's because we want to change how global symbols behave. I don't understand the rationale for treating these particular functions differently from all other functions. It is especially bizarre to me that explicit attributes on the definition sites are silently ignored for these functions and no others.

Few if any existing users of -fvisibility or visibility attributes use them on definitions of operator new and operator delete. The notion that existing users are expecting this bizarrely inconsistent behavior seems pretty questionable to me.

But indeed I do know what I'm doing and I am willing to tell the compiler even more explicitly if you insist that I should have to do that for some reason.
I don't care what the switch is called. This is only ever going to be used in basically one place in the world (the libc++ definitions when building it for hermetic static linking).

Replacing the global new and delete is supposed to be a whole-program operation (you only get one global allocator). Otherwise you couldn't allocate memory in one DSO and deallocate it in another. (And nor could inline functions etc.)

If you really want to do this weird thing, and you understand what you're getting yourself into, I don't see a problem with having a dedicated flag for it, but don't break all existing users of -fvisibility=.

I don't really understand how these functions are different from other functions. The language standards don't have anything to say about ELF visibility. What you say about "whole-program operation" is true of any global symbol.

These symbols really are special. Other symbols are introduced explicitly by a declaration, whereas these are declared implicitly by the compiler. Other symbols must have exactly one definition (modulo the permission for duplicate identical definitions for some cases), but these ones have a default definition that is designed to be overridable by a different definition appearing anywhere in the program. Other symbols are generally provided in one library and consumed by users of that library, whereas these symbols are typically provided by the main binary and consumed by the libraries that it uses. And so on.

It is especially bizarre to me that explicit attributes on the definition sites are silently ignored for these functions and no others.

Do you have a testcase? That's not the behavior I'm seeing. What I see is that we get a hard error for an explicit attribute on the definition site, because the prior compiler-generated declaration has default visibility. Eg:

$ echo '__attribute__((visibility("hidden"))) void *operator new(__SIZE_TYPE__) { return (void*)42; }' | clang -x c++ -
<stdin>:1:16: error: visibility does not match previous declaration
__attribute__((visibility("hidden"))) void *operator new(__SIZE_TYPE__) { return (void*)42; }
               ^
note: previous attribute is here

(That terrible note at the end is trying to point at the implicit compiler-generated declaration.)

These symbols really are special. Other symbols are introduced explicitly by a declaration, whereas these are declared implicitly by the compiler.

The implicit declaration is the only difference that actually makes sense to me.

Other symbols must have exactly one definition (modulo the permission for duplicate identical definitions for some cases), but these ones have a default definition that is designed to be overridable by a different definition appearing anywhere in the program.

I don't understand this claim. These are symbols like any others at link time. A single definition must be supplied as for any other function.

Other symbols are generally provided in one library and consumed by users of that library, whereas these symbols are typically provided by the main binary and consumed by the libraries that it uses. And so on.

I don't understand this claim. These symbols are normally defined in libc++.so and nowhere else.

It is especially bizarre to me that explicit attributes on the definition sites are silently ignored for these functions and no others.

Do you have a testcase? That's not the behavior I'm seeing. What I see is that we get a hard error for an explicit attribute on the definition site, because the prior compiler-generated declaration has default visibility. Eg:

You're right. It's been quite a while since I was fighting with this originally. It might have been GCC that ignored explicit attributes.

Other symbols must have exactly one definition (modulo the permission for duplicate identical definitions for some cases), but these ones have a default definition that is designed to be overridable by a different definition appearing anywhere in the program.

I don't understand this claim. These are symbols like any others at link time. A single definition must be supplied as for any other function.

The program has two choices: either it provides a definition, and that gets used everywhere, or it does not, and the default version (provided by the toolchain) gets used everywhere. This is what the language model requires, and it's so important that the entire program agrees on what the default heap is that we intentionally make this work even when using -fvisibility=hidden.

Other symbols are generally provided in one library and consumed by users of that library, whereas these symbols are typically provided by the main binary and consumed by the libraries that it uses. And so on.

I don't understand this claim. These symbols are normally defined in libc++.so and nowhere else.

That's not accurate. As noted above, programs can provide their own definitions, which are required to replace the version that would otherwise be implicitly provided.

phosek updated this revision to Diff 172048.Oct 31 2018, 4:32 PM
phosek retitled this revision from [Sema] Use proper visibility for global new and delete declarations to [Sema] Provide -fvisibility-global-new-delete-hidden option.
phosek edited the summary of this revision. (Show Details)

Following the earlier suggestion, I've modified to change to introduce a new -fvisibility-global-new-delete-hidden option that controls this behavior.

Other symbols must have exactly one definition (modulo the permission for duplicate identical definitions for some cases), but these ones have a default definition that is designed to be overridable by a different definition appearing anywhere in the program.

I don't understand this claim. These are symbols like any others at link time. A single definition must be supplied as for any other function.

The program has two choices: either it provides a definition, and that gets used everywhere, or it does not, and the default version (provided by the toolchain) gets used everywhere.

That's true of all global symbols normally provided by libraries.

This is what the language model requires, and it's so important that the entire program agrees on what the default heap is that we intentionally make this work even when using -fvisibility=hidden.

So you're concerned with people who use -fvisibility=hidden on the TUs that define some functions, and then use dynamic linking with the expectation that those definitions will be seen across shared library boundaries.
I don't see how people specifying visibility different than what they actually want is different for these particular functions than for any others. But I'm biased towards users who actually understand what they asked the compiler to do.

Other symbols are generally provided in one library and consumed by users of that library, whereas these symbols are typically provided by the main binary and consumed by the libraries that it uses. And so on.

I don't understand this claim. These symbols are normally defined in libc++.so and nowhere else.

That's not accurate. As noted above, programs can provide their own definitions, which are required to replace the version that would otherwise be implicitly provided.

So they are global symbols defined in libraries. That's all what you've said actually means to me.

I think it's clear that there is nothing different about these symbols than any others from the perspective of how linking works and how that relates to the language specification. The *only* actual difference is that the language specification says that you can override these standard library symbols and that you may not override other standard library symbols--a distinction that does not actually exist materially in linking, nor matter to a programmer following the language specification and not using the compiler in ways outside the language specification (such as controlling symbol visibility).

All of your explanations for how they are different have to do with saving people from themselves when they asked the compiler to do something non-default with all symbols but actually meant to treat these particular symbols differently.
AFAICT the rationale goes something like: linking is subtle and hard to understand; many people use -fvisibility=hidden and do not fully understand what it means; a likely error is failing to understand the need to define a function with explicit-default visibility when it should be visible across shared library boundaries; conforming C++ code may define its own operator new/operator delete functions to have those used by the standard C++ library code (which might be in a shared library) in preference to its own definitions, but may not define other standard functions and have the same expectation; ergo, people with a poor understanding of linking who have followed some advice they didn't fully understand to use -fvisibility=hidden and also might define their own operator new/operator delete functions are best served by the compiler doing what they meant rather than what they said.

I'm a person who understands linking thoroughly, so I'm not the target audience for the concerns you raise. That's fine. Since I do understand the issues, I have no trouble understanding when I need a special compiler option to get the link-time behavior that's correct for my needs.

phosek added a comment.Nov 5 2018, 7:09 PM

@rsmith does this look reasonable to you?

@rsmith friendly ping?

rsmith accepted this revision.Dec 3 2018, 4:53 PM

Looks good with minor changes, thanks.

clang/include/clang/Basic/LangOptions.def
231

This should just be LANGOPT, not BENIGN_LANGOPT, because it changes how we construct the AST (in an incompatible way).

Also, could you please remove the "default" from the description of this LANGOPT (and the previous one)? "default visibility" means something else, so this is confusing as written.

clang/include/clang/Driver/Options.td
1738

Remove the "by default" here: the program can't change the visibility to anything else via an attribute.

This revision is now accepted and ready to land.Dec 3 2018, 4:53 PM
phosek updated this revision to Diff 176533.Dec 3 2018, 6:58 PM
phosek marked 2 inline comments as done.
This revision was automatically updated to reflect the committed changes.