This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Introduce clang::lto_visibility_public attribute
Needs RevisionPublic

Authored by leonardchan on Apr 7 2023, 6:29 PM.

Details

Reviewers
pcc
ldionne
phosek
mcgrathr
philnik
Mordante
Group Reviewers
Restricted Project
Summary

In fuchsia, we build our shared libs with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS and -fvisibility=hidden to ensure all symbols manifested from classes in libc++ headers are hidden. But if we build with LTO, then then each of the classes included from libc++ headers have hidden LTO visibility (which is unique to LTO and separate from ELF visibility). This is an issue if we ship a libc++.so that was not built with LTO because those same libc++ classes will have default LTO visibility. This can lead to incorrect behavior because LTO will only work if classes used across multiple linkage units have default LTO visibility, which can be achieved by either compiling everything with LTO and default visibility, or explicitly adding [[clang::lto_visibility_public]] to those classes. To ensure that we can still use -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS and -fvisibility=hidden to make sure libc++ symbols have hidden visibility, we can apply [[clang::lto_visibility_public]] to libc++ classes which effectively prevents whole program devirtualization from operating on them.

Diff Detail

Event Timeline

leonardchan created this revision.Apr 7 2023, 6:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2023, 6:29 PM
leonardchan requested review of this revision.Apr 7 2023, 6:29 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptApr 7 2023, 6:29 PM
leonardchan edited the summary of this revision. (Show Details)Apr 7 2023, 6:29 PM
leonardchan edited the summary of this revision. (Show Details)
philnik requested changes to this revision.Apr 7 2023, 6:48 PM
philnik added a subscriber: philnik.

I've got a few problems with this patch (in no particular order):

  1. It makes our visibility annotations even more complicated. It is already way to complicated, we should avoid making it even worse.
  2. You say you build with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden which AFAICT defeats the purpose of a shared library, since it wouldn't export a single symbol.
  3. This is a completely untested configuration, which means it will regress (which is already a problem with the current visibility annotations)

My biggest concern right now is (2).

This revision now requires changes to proceed.Apr 7 2023, 6:48 PM
Mordante requested changes to this revision.Apr 8 2023, 4:03 AM
Mordante added a subscriber: Mordante.

Next to @philnik's issues I have a large issue with the lack of documentation. Our visibility macros are already not very well documented, adding more without any documentation seems like a bad idea.
Especially there is no explanation which friend declarations need this markup and which don't.

You say you build with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden which AFAICT defeats the purpose of a shared library, since it wouldn't export a single symbol.

My bad, I should've clarified on this. For fuchsia, we build everything with hidden visibility by default and individual binaries need to explicitly mark their symbols with default visibility if they want to be exported. This also means that when a fuchsia binary uses libc++, we need to make sure it doesn't accidentally export any libc++ symbols. Fuchsia binaries can consume libc++ as either a shared lib or a hermetic static lib, but we don't know at compile time how libc++ will eventually be consumed. If used as a shared lib, then symbols will be defined libc++.so. If linked in as a hermetic static lib, then we need to ensure those symbols have hidden visibility. To cover both cases and prevent us from having to compile twice, we just compile everything with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden.

It makes our visibility annotations even more complicated. It is already way to complicated, we should avoid making it even worse.

In terms of minimizing extra added visibility annotation complexity, what do you think of this alternative: update _LIBCPP_TEMPLATE_VIS and _LIBCPP_TYPE_VIS in <__config> such that it can be defined in the compilation step as an argument. If left undefined, the default behavior would what libcxx does now, which is defining the macro according to _LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS. This way, the change would be less intrusive to libcxx and downstream users can have more flexibility on libc++ class annotations.

This is a completely untested configuration, which means it will regress (which is already a problem with the current visibility annotations)
Next to @philnik's issues I have a large issue with the lack of documentation. Our visibility macros are already not very well documented, adding more without any documentation seems like a bad idea. Especially there is no explanation which friend declarations need this markup and which don't.

I'll also be sure to update the next revision with tests and docs. The initial reasoning for the the friend variants is because some attributes like [[clang::lto_visibility_public]] can only be attached to class definitions, but not friend declarations.

You say you build with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden which AFAICT defeats the purpose of a shared library, since it wouldn't export a single symbol.

My bad, I should've clarified on this. For fuchsia, we build everything with hidden visibility by default and individual binaries need to explicitly mark their symbols with default visibility if they want to be exported. This also means that when a fuchsia binary uses libc++, we need to make sure it doesn't accidentally export any libc++ symbols. Fuchsia binaries can consume libc++ as either a shared lib or a hermetic static lib, but we don't know at compile time how libc++ will eventually be consumed. If used as a shared lib, then symbols will be defined libc++.so. If linked in as a hermetic static lib, then we need to ensure those symbols have hidden visibility. To cover both cases and prevent us from having to compile twice, we just compile everything with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden.

I still don't understand your setup. So you compile the program using LTO and passing -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, right? That way the program itself doesn't export any libc++ symbols and everything in the program has hidden visibility AND hidden LTO visibility. But then you compile libc++.so with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, which causes everything in libc++.so to have hidden visibility. But since you compile libc++.so with LTO disabled, everything in libc++.so has default LTO visibility.

So for example, let's say you have a std::map<int, int> instantiated in your program and let's say there's also one instantiated inside libc++.so. On the program side, std::map<int, int> has hidden visibility because you passed -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, and it has hidden LTO visibility for the same reason. On the libc++.so side, it has hidden visibility because you passed -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden when building the .so, but it has default LTO visibility because you did not compile libc++.so with LTO enabled. That's an ODR violation. Is that correct?

You say you build with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden which AFAICT defeats the purpose of a shared library, since it wouldn't export a single symbol.

My bad, I should've clarified on this. For fuchsia, we build everything with hidden visibility by default and individual binaries need to explicitly mark their symbols with default visibility if they want to be exported. This also means that when a fuchsia binary uses libc++, we need to make sure it doesn't accidentally export any libc++ symbols. Fuchsia binaries can consume libc++ as either a shared lib or a hermetic static lib, but we don't know at compile time how libc++ will eventually be consumed. If used as a shared lib, then symbols will be defined libc++.so. If linked in as a hermetic static lib, then we need to ensure those symbols have hidden visibility. To cover both cases and prevent us from having to compile twice, we just compile everything with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden.

I still don't understand your setup. So you compile the program using LTO and passing -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, right? That way the program itself doesn't export any libc++ symbols and everything in the program has hidden visibility AND hidden LTO visibility. But then you compile libc++.so with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, which causes everything in libc++.so to have hidden visibility. But since you compile libc++.so with LTO disabled, everything in libc++.so has default LTO visibility.

The program is compiled with LTO and -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden but out libc++.so isn't compiled with -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS, -fvisibility=hidden, or LTO so all the symbols in our libc++.so have default visibility and default LTO visibility while an equivalent symbol in the program will have hidden elf and LTO visibility.

So for example, let's say you have a std::map<int, int> instantiated in your program and let's say there's also one instantiated inside libc++.so. On the program side, std::map<int, int> has hidden visibility because you passed -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden, and it has hidden LTO visibility for the same reason. On the libc++.so side, it has hidden visibility because you passed -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden when building the .so, but it has default LTO visibility because you did not compile libc++.so with LTO enabled. That's an ODR violation. Is that correct?

A slight difference in the example is that the symbol has default ELF visibility in the libc++.so side, but yes I believe this is an ODR violation since the map has differing LTO visibilities across different linkage units which is what this patch attempts to fix. We need to make sure the symbols on the program side are hidden ELF visibility and the symbols in libc++ are default ELF visibility, but to ensure LTO visibility is respected, we need a way to plumb [[clang::lto_visibility_public]] to all relevant libc++ classes.

After careful consideration, I am left wondering why -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden even makes sense. I understand that you want to minimize the number of symbols exported from your program and I think that's a desirable goal. However, let's say you take a typical program and just compile it normally without using -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden -- what's the problem? What symbols end up being exported that you wouldn't want to export?

We already mark the vast majority of symbols in libc++ as hidden (using _LIBCPP_HIDE_FROM_ABI), and the ones that are not marked as hidden basically need to be public for correctness purposes.

After careful consideration, I am left wondering why -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden even makes sense. I understand that you want to minimize the number of symbols exported from your program and I think that's a desirable goal. However, let's say you take a typical program and just compile it normally without using -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -fvisibility=hidden -- what's the problem? What symbols end up being exported that you wouldn't want to export?

We already mark the vast majority of symbols in libc++ as hidden (using _LIBCPP_HIDE_FROM_ABI), and the ones that are not marked as hidden basically need to be public for correctness purposes.

We also want to reduce size and startup time so marking them as hidden reduces dynamic relocations at least for ELF. Usage of classes included from headers can manifest public symbols that we don't need to be public. std::streambuf is one I've seen a lot and any usage of that also brings in the methods associated with it.