Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

Support: introduce public API annotation support
Needs ReviewPublic

Authored by tstellar on Sep 2 2021, 1:48 PM.

Details

Summary

This provides a model for annotating the "public" interfaces of a module
in LLVM. This is motivated by the desire to have the ability to build
LLVM libraries in the shared mode to enable extraction of the shared
content between the tools.

Unlike Unix platforms, Windows has a more stringent requirement of
annotating the "dll-interface" when building a shared library. All of
the public APIs should be marked by the appropriate *_API macro. The
annotation indicates what APIs are public but are expected to live in
the dynamic library (DLL/dylib/so) that the macro is associated with.

The longer term desire here is to evolve this to support creating a
small set of libraries. Because there is a cost associated with the
call across the image boundary (GOT+PLT/IAT), the intent is to minimise
the number of libraries. The separation of the libraries should prefer
to split along the following disjoint sets:

  • minimal set of interfaces required for llvm-ar (archiver/librarian)
  • minimal set of interfaces required for lld (linker)
  • minimal set of interfaces required for clang (compiler)

The reasoning behind this split set is that these tools are executed the
most, and minimising the penalty for these three sets is crucial for
keeping the penalty to dynamic linking (and LTO) low. Keeping the
number of libraries low reduces the complexity in distribution and
should also help reduce the overall size of the toolchain distribution.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
compnerd added inline comments.Sep 4 2021, 1:45 PM
llvm/include/llvm/Support/LLVMSupportExports.h
12 ↗(On Diff #370755)

Wait, we want to give the user the flexibility to *not* define that? I intend to make -fvisibility=hidden the only option once the annotations are complete. I want to simply globally set:

set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
set(CMAKE_C_VISIBILITY_INLINES_HIDDEN YES)
set(CMAKE_CXX_VISIBILITY_INLINES_HIDDEN YES)

Everything in LLVM should be internalized unless you are doing a shared library build, where the annotations will expose the necessary interfaces.

Is there a use case that I am not considering?

In D109192#2983824, @MaskRay wrote:

I think that BUILD_SHARED_LIBS is desirable, and LLVM_ENABLE_DYLIB should go away. That said, I think that the ideal thing here is to have a different configuration that is less granular than BUILD_SHARED_LIBS. That would correspond to LLVMTools, LLVMCompiler and depending on the delta, a potential third library. These three would give us a nice hybrid approach to dynamic and static linking. It would be dynamic for buckets of use (bintuils, clang, and possibly lld if it makes LLVMTools too large) but statically linking large components of LLVM to enable us to take advantage of LTO and DCE amongst that set. It would try to minimise the cost of the GOT/PLT and IAT so that we don't introduce an extremely heavy penalty for these libraries.

Why should LLVM_ENABLE_DYLIB go away? To reduce the number of configurations when the few-dylibs scheme is adopted more?
Other than that, I don't see any downside.

Because the LLVM build system is far too complex. Once we have the annotations, I don't see why we should continue to use custom build system logic versus the CMake standard BUILD_SHARED_LIBS. That would allow the few dylib scheme to be adopted and reduce the complexity in the build system and homogenise the distribution and development builds hopefully.

compnerd updated this revision to Diff 370763.Sep 4 2021, 1:59 PM
jrtc27 added a comment.Sep 4 2021, 2:20 PM

In D109192#2983824, @MaskRay wrote:

I think that BUILD_SHARED_LIBS is desirable, and LLVM_ENABLE_DYLIB should go away. That said, I think that the ideal thing here is to have a different configuration that is less granular than BUILD_SHARED_LIBS. That would correspond to LLVMTools, LLVMCompiler and depending on the delta, a potential third library. These three would give us a nice hybrid approach to dynamic and static linking. It would be dynamic for buckets of use (bintuils, clang, and possibly lld if it makes LLVMTools too large) but statically linking large components of LLVM to enable us to take advantage of LTO and DCE amongst that set. It would try to minimise the cost of the GOT/PLT and IAT so that we don't introduce an extremely heavy penalty for these libraries.

Why should LLVM_ENABLE_DYLIB go away? To reduce the number of configurations when the few-dylibs scheme is adopted more?
Other than that, I don't see any downside.

Because the LLVM build system is far too complex. Once we have the annotations, I don't see why we should continue to use custom build system logic versus the CMake standard BUILD_SHARED_LIBS. That would allow the few dylib scheme to be adopted and reduce the complexity in the build system and homogenise the distribution and development builds hopefully.

The whole point of the option though is to make incremental builds faster which is really helpful for development. I'd be against removing that as all my development builds use it.

compnerd updated this revision to Diff 370767.Sep 4 2021, 2:42 PM
  • correct order of LLVM_NODISCARD and LLVM_SUPPORT_ABI
  • annotate kInvalidFile
compnerd updated this revision to Diff 370770.Sep 4 2021, 3:25 PM
  • force a vtable instantiation to create an anchor for the for opt specializations
  • more tweaks for Windows path (clang-test and llvm-test now build)
compnerd updated this revision to Diff 370773.Sep 4 2021, 4:45 PM
  • correct declaration for cl::opt extern template instantiations
  • adjust ABI breaking check handling
compnerd updated this revision to Diff 370776.Sep 4 2021, 5:56 PM

correct typo in CommandLine.cpp

compnerd updated this revision to Diff 370779.Sep 4 2021, 8:05 PM

annotate some types for MLIR from flang

compnerd updated this revision to Diff 370815.Sep 5 2021, 9:58 AM

check-mlir should build now

As before: what's the strategy for downstreams that modify these libraries? Has this been done entirely by hand, or is there some kind of script they can use to automate annotating their own additions?

As before: what's the strategy for downstreams that modify these libraries? Has this been done entirely by hand, or is there some kind of script they can use to automate annotating their own additions?

Before?

Yes, this has been done by hand. I'm not sure if I can get a perfect result through automation (https://github.com/compnerd/libtool), but I'm looking into that as an option. At least for the standalone functions, its relatively easy to get the desired behaviour correct. But, in general, for downstream repositories, this will likely require the usual porting and an additional audit of their changes.

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

We have significant downstream changes, and we regularly build with shared libraries for development, only static linking in our release builds. This will affect us and having to chase down a bunch of functions with LLVM_SUPPORT_ABI does not fill me with joy. Yes, static linking is great for releases, but it sucks for development, full per-component shared libraries make things much faster, please don't just dismiss it.

compnerd updated this revision to Diff 370818.Sep 5 2021, 11:46 AM
  • annotate Errno.h functions
  • change StringRef to decorate explicit functions, fixing flang builds

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

We have significant downstream changes, and we regularly build with shared libraries for development, only static linking in our release builds. This will affect us and having to chase down a bunch of functions with LLVM_SUPPORT_ABI does not fill me with joy. Yes, static linking is great for releases, but it sucks for development, full per-component shared libraries make things much faster, please don't just dismiss it.

Well, per-component would be significantly worse - LLVM_SUPPORT_ABI marks what just LLVMSupport exports. You would need a separate macro per component if you do a shared library build with a per-component shared library. Doing a set 2-3 libraries means that there are only 2-3 macros that need to be contended with and its easier to identify which module the function resides in. That also means that it is more viable to find LTO/DCE optimizations.

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

We have significant downstream changes, and we regularly build with shared libraries for development, only static linking in our release builds. This will affect us and having to chase down a bunch of functions with LLVM_SUPPORT_ABI does not fill me with joy. Yes, static linking is great for releases, but it sucks for development, full per-component shared libraries make things much faster, please don't just dismiss it.

Well, per-component would be significantly worse - LLVM_SUPPORT_ABI marks what just LLVMSupport exports. You would need a separate macro per component if you do a shared library build with a per-component shared library. Doing a set 2-3 libraries means that there are only 2-3 macros that need to be contended with and its easier to identify which module the function resides in. That also means that it is more viable to find LTO/DCE optimizations.

I don't care about efficiency downstream, I just don't want it to be broken. If you break per-component shared libraries you force me to change my workflow and have slower incremental builds. If I have to go chasing down build errors because our additions need exporting in order for the monolithic shared build to work then I get grumpy at the busywork. If there's strong technical justification for the latter, fine, but please do it in as least disruptive a way as possible (i.e. with as much automation as possible, and in one go rather than spread across 10s of commits). I see no technical justification for the former, that would just be breaking workflows that you personally don't use.

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

We have significant downstream changes, and we regularly build with shared libraries for development, only static linking in our release builds. This will affect us and having to chase down a bunch of functions with LLVM_SUPPORT_ABI does not fill me with joy. Yes, static linking is great for releases, but it sucks for development, full per-component shared libraries make things much faster, please don't just dismiss it.

Well, per-component would be significantly worse - LLVM_SUPPORT_ABI marks what just LLVMSupport exports. You would need a separate macro per component if you do a shared library build with a per-component shared library. Doing a set 2-3 libraries means that there are only 2-3 macros that need to be contended with and its easier to identify which module the function resides in. That also means that it is more viable to find LTO/DCE optimizations.

I don't care about efficiency downstream, I just don't want it to be broken. If you break per-component shared libraries you force me to change my workflow and have slower incremental builds. If I have to go chasing down build errors because our additions need exporting in order for the monolithic shared build to work then I get grumpy at the busywork. If there's strong technical justification for the latter, fine, but please do it in as least disruptive a way as possible (i.e. with as much automation as possible, and in one go rather than spread across 10s of commits). I see no technical justification for the former, that would just be breaking workflows that you personally don't use.

Well, it may be that your downstream project doesn't care about efficiency, but every project has a differing set of needs and makes different tradeoffs. This is an issue for LLVM, and impacting performance of the compiler to make a particular workflow possible is not valid justification. There are technical grounds for preferring the semi-monolithic library for the tooling and the compiler.

compnerd updated this revision to Diff 370824.Sep 5 2021, 12:58 PM

Expose LLVMParseCommandLineOptions

jrtc27 added a comment.Sep 5 2021, 1:28 PM

I should mention that this doesn't matter to the downstream projects at the end of the day - they aren't compelled to enable the shared library mode - static linking will be unimpacted.

We have significant downstream changes, and we regularly build with shared libraries for development, only static linking in our release builds. This will affect us and having to chase down a bunch of functions with LLVM_SUPPORT_ABI does not fill me with joy. Yes, static linking is great for releases, but it sucks for development, full per-component shared libraries make things much faster, please don't just dismiss it.

Well, per-component would be significantly worse - LLVM_SUPPORT_ABI marks what just LLVMSupport exports. You would need a separate macro per component if you do a shared library build with a per-component shared library. Doing a set 2-3 libraries means that there are only 2-3 macros that need to be contended with and its easier to identify which module the function resides in. That also means that it is more viable to find LTO/DCE optimizations.

I don't care about efficiency downstream, I just don't want it to be broken. If you break per-component shared libraries you force me to change my workflow and have slower incremental builds. If I have to go chasing down build errors because our additions need exporting in order for the monolithic shared build to work then I get grumpy at the busywork. If there's strong technical justification for the latter, fine, but please do it in as least disruptive a way as possible (i.e. with as much automation as possible, and in one go rather than spread across 10s of commits). I see no technical justification for the former, that would just be breaking workflows that you personally don't use.

Well, it may be that your downstream project doesn't care about efficiency, but every project has a differing set of needs and makes different tradeoffs. This is an issue for LLVM, and impacting performance of the compiler to make a particular workflow possible is not valid justification. There are technical grounds for preferring the semi-monolithic library for the tooling and the compiler.

Preferring != requiring. Of course if there's sufficient technical justification for removing that option then fine, but that's for the community to decide, not you.

FWIW, i personally think that the current set of libs BUILD_SHARED_LIBS produces should not change, it's intentional that way, to both be fast and fine-grained.
And LLVM_BUILD_DYLIB is also intentionally producing a huge monolithic library, so that only a single library needs to be dlopened for speed reasons.
IIRC someone from MLIR also wanted to do something about library restructurization, i would suggest to talk to them.

All that being said, if either BUILD_SHARED_LIBS or LLVM_BUILD_DYLIB is being changed from what they are now,
i think this requires an RFC to llvm-dev with precise explanation of what why and how,
and why it's okay to regress existing granularity/speed/performance expectations.

FWIW, i personally think that the current set of libs BUILD_SHARED_LIBS produces should not change, it's intentional that way, to both be fast and fine-grained.
And LLVM_BUILD_DYLIB is also intentionally producing a huge monolithic library, so that only a single library needs to be dlopened for speed reasons.
IIRC someone from MLIR also wanted to do something about library restructurization, i would suggest to talk to them.

All that being said, if either BUILD_SHARED_LIBS or LLVM_BUILD_DYLIB is being changed from what they are now,
i think this requires an RFC to llvm-dev with precise explanation of what why and how,
and why it's okay to regress existing granularity/speed/performance expectations.

This change has no impact on what BUILD_SHARED_LIBS does. It does however fix the current problem that LLVMSupport on WIndows when built as a shared library has no public functions and on ELF and MachO targets exports everything. Instead, this will explicitly control the public surface of the library's interface when built as a shared library.

There are similar changes that will be needed to support building LLVM as a shared library, but that is a separate larger change and I have no intention of merging that change into this change.

MaskRay added a comment.EditedSep 5 2021, 1:47 PM

If I am to summarize the state. The currently ELF build models are:

  • default: libLLVMSupport.a
  • -DBUILD_SHARED_LIBS=on: libLLVMSupport.so
  • -DLLVM_LINK_LLVM_DYLIB=on: libLLVM-14git.so

MSVC dynamic linking doesn't work.

From

  • minimal set of interfaces required for llvm-ar (archiver/librarian)
  • minimal set of interfaces required for lld (linker)
  • minimal set of interfaces required for clang (compiler)

it seems that the intention is to have libLLVM{foo,bar,qux}.so as the end state? And that is a controversial point among at least @compnerd, @jrtc27, and me.

I think the main purpose of this patch is to introduce these source file annotations (LLVM_SUPPORT_ABI #include "llvm/Support/LLVMSupportExports.h") which will be the backbone of whatever dynamic linking mode we end up doing.
And we should ensure all parties are satisfied with the solution.
If you remove the behavior changing CMake changes, I will give a strong support.

Specifically, any risky CMake or visibility annotation changes should be dropped from this starting point patch. More concretely,

  • #define LLVM_SUPPORT_ABI is acceptable since it is clearly a no-op.
  • #define LLVM_SUPPORT_ABI __attribute__((visibility("protected"))) can be risky and is thus unacceptable
  • Changing -DBUILD_SHARED_LIBS=on behavior is unacceptable (this presumably doesn't work on MSVC, so changing MSVC is fine). MinGW seems to support this config, so don't change its behavior (CC @mstorsjo)
  • Changing -DLLVM_LINK_LLVM_DYLIB=on behavior is unacceptable (this doesn't work on MSVC, so changing MSVC is fine)

Currently there is a hacking change in lib/Support/CMakeLists.txt: add_llvm_component_library(LLVMSupport SHARED
This breaks the regular libLLVMSupport.a build, so it should be fixed.

The libLLVM{foo,bar,qux}.so design is related to the Windows DLL limitation about 64k exported symbols.
To reduce platform differences (ELF/Mach-O vs PE/COFF), @compnerd wants the libLLVM{foo,bar,qux}.so design to be the way forward and the monolithic libLLVM-14git.so scheme to be dropped.
I agree with @lebedev.ri that such change needs to discussed on llvm-dev.
The libLLVM-14git.so scheme is adopted by multiple major Linux distributions with many LLVM dependent packages.
Such a split should not be taken lightly.

FWIW For my https://github.com/MaskRay/ccls/blob/master/CMakeLists.txt#L95 , I'll need to adjust

if(LLVM_LINK_LLVM_DYLIB)
  target_link_libraries(ccls PRIVATE LLVM)

Other packages may need similar adaptation as well.
Personally I consider that ELF/Mach-O changing libLLVM-14git.so to cater to Windows DLL limitation unfortunate, but I can be convinced if the split argument (platform difference, maintenance cost consideration) is sufficiently strong.

compnerd updated this revision to Diff 370828.Sep 5 2021, 2:04 PM

If I am to summarize the state. The currently ELF build models are:

  • default: libLLVMSupport.a
  • -DBUILD_SHARED_LIBS=on: libLLVMSupport.so
  • -DLLVM_LINK_LLVM_DYLIB=on: libLLVM-14git.so

Windows dynamic linking doesn't work.

From

  • minimal set of interfaces required for llvm-ar (archiver/librarian)
  • minimal set of interfaces required for lld (linker)
  • minimal set of interfaces required for clang (compiler)

it seems that the intention is to have libLLVM{foo,bar,qux}.so as the end state? And that is a controversial point among at least @compnerd, @jrtc27, and me.

I think the main purpose of this patch is to introduce these source file annotations (LLVM_SUPPORT_ABI #include "llvm/Support/LLVMSupportExports.h") which will be the backbone of whatever dynamic linking mode we end up doing.
And we should ensure all parties are satisfied with the solution.
If you remove the behavior changing CMake changes, I will give a strong support.

More to introduce the model for the annotations. The annotations are module dependent, that is each module must introduce its own decoration and the function must have a clear owner defined, which is why LLVMSupport is a good basis for the conceptual change - the intent is that the functions in LLVMSupport are owned by LLVMSupport and thus will be annotated with the LLVM_SUPPORT_ABI.

Specifically, any risky CMake or visibility annotation should be dropped from this starting point.

I think that this we disagree on. I think that the visibility handling needs to be part of this - otherwise the behaviour becomes too complicated to reason about. The default beahviour on Windows is that the symbol is dso_local and the default behaviour on ELF and MachO targets is *not* dso_local. This means that we have different defaults and any future change will need to contend with this. By changing the visibility for LLVMSupport when building LLVMSupport as a shared library to hidden visibility, we should get consistent behaviour across the board and make it easier to reason through for changes.

E.g.

  • #define LLVM_SUPPORT_ABI is acceptable since it is clearly a no-op.
  • #define LLVM_SUPPORT_ABI __attribute__((visibility("protected"))) can be risky and is thus unacceptable
  • Changing -DBUILD_SHARED_LIBS=on behavior is unacceptable (this presumably doesn't work on Windows, so changing Windows is fine)
  • Changing -DLLVM_LINK_LLVM_DYLIB=on behavior is unacceptable (this doesn't work on Windows, so changing Windows is fine)

Currently there is a hacking change in lib/Support/CMakeLists.txt: add_llvm_component_library(LLVMSupport SHARED
This breaks the regular libLLVMSupport.a build, so it should be fixed.

Yes, and I even commented as such that it is unacceptable, but is done to allow for quick turnaround on testing. Once the pre-commit checks have verified that it is happy with the build, this will be changed to restore the build of LLVMSupport to the way that it is handled today.

The libLLVM{foo,bar,qux}.so design is related to the Windows DLL limitation about 64k exported symbols.
To reduce platform differences (ELF/Mach-O vs PE/COFF), @compnerd wants the libLLVM{foo,bar,qux}.so design to be the way forward and the monolithic libLLVM-14git.so scheme to be dropped.
I agree with @lebedev.ri that such change needs to discussed on llvm-dev.
The libLLVM-14git.so scheme is adopted by multiple major Linux distributions with many LLVM dependent packages.
Such a split should not be taken lightly.

Yes I do expect that we are going to hit the limit of the exported symbols and thus its very likely to require that split. However, I cannot say with certainty that will happen until we get to that point. Once LLVMSupport changes are baked enough to demonstrate the pattern, the changes to allow building LLVM as a library can be tested to see what the current limitation is. The more fragmented the build, the worse it is for optimizations and size reduction possibilities. It also makes distribution more complicated.

FWIW For my https://github.com/MaskRay/ccls/blob/master/CMakeLists.txt#L95 , I'll need to adjust

if(LLVM_LINK_LLVM_DYLIB)
  target_link_libraries(ccls PRIVATE LLVM)

Other packages may need similar adaptation as well.
Personally I consider that ELF/Mach-O changing libLLVM-14git.so to cater to Windows DLL limitation unfortunate, but I can be convinced if the split argument (platform difference, maintenance cost consideration) is sufficiently strong.

I don't think that it is "catering" nor "unfortunate". There are tradeoffs in software engineering, and this is one of them. There are similar issues where LLVM is doing things that are non-optimal and other platforms suffer from them. An example is the use of the anchor key functions which are meaningless on some platforms.

The current state of the shared library build is another example of the "unfortunate" results of supporting ELF targets (I say this as a counterpoint to your argument that it is unfortunate that ELF must do something different, I tend to care about ELF targets just as much as PE/COFF targets).

Platforms are different, and software engineering requires that you evaluate different tradeoffs. There is a rich design space for structuring binaries, and choices aren't inherently unfortunate. They are evaluating with different weights being ascribed to different features, and that results in different choices.

Keeping LLVM friendly to a variety of platforms with contributors developing and testing on a different set of platforms means that it is best to have it behave as consistently as possible. This reduces the likelihood of someone unintentionally regressing a system other than the one that they are developing on.

The c-index-test failure is interesting. libclang will link against LLVMPasses and have the enable-ml-inliner option and thus register it. c-index-tool itself links against LLVMPasses and thus registers the option again, resulting in the duplicate registration.

echristo requested changes to this revision.Sep 8 2021, 11:19 AM
echristo added a subscriber: echristo.

Hi Saleem,

I think I understand where you're going, but I think this deserves a writeup and consideration on llvm-dev with your desired endpoint for the source base and what changes to both current and future development will need to happen with some benefits and costs outlined. Might want to include a "support story" for the project along these lines as well - or at least be clear that this doesn't change or affect what long term APIs we are willing to support.

Thoughts?

This revision now requires changes to proceed.Sep 8 2021, 11:19 AM

Hi Saleem,

I think I understand where you're going, but I think this deserves a writeup and consideration on llvm-dev with your desired endpoint for the source base and what changes to both current and future development will need to happen with some benefits and costs outlined. Might want to include a "support story" for the project along these lines as well - or at least be clear that this doesn't change or affect what long term APIs we are willing to support.

Thoughts?

I think that is inline with what I had expected. I actually asked @rnk about his thoughts on this as well, and he was of the opinion that this wouldn't be a big surprise and to send out a proposed change first. I wanted to get through LLVMSupport to the point that it would be possible to show what an approach for enabling a shared library build that could support all the platforms would look like. I don't have any concerns with sending out a small write up of what the changes once at least the Windows pre-commit CI is green. I already had expected to have to do that subsequently as I don't expect to tackle LLVM in a single change. I had expected this to serve as a preparatory change to discuss the necessary changes for permitting LLVM_ENABLE_DYLIB builds on Windows.

arichardson added inline comments.
llvm/include/llvm/Support/LLVMSupportExports.h
12 ↗(On Diff #370755)

If we need to add this for every library, it seems like the header should be generated?

Maybe we could even re-use https://cmake.org/cmake/help/latest/module/GenerateExportHeader.html?

rnk added inline comments.Sep 10 2021, 12:25 PM
llvm/include/llvm/Support/LLVMSupportExports.h
12 ↗(On Diff #370755)

CMake is the official build system for LLVM, so maybe this should not weigh on the project priorities, but there are two unofficial build systems in-tree that would have to replicate this functionality (Bazel & gn).

I think we should strive to keep the build system as simple as possible, even if it has ways to help us reduce boilerplate code. Most LLVM developers are not CMake experts, but we can all read and reason about C++ and C pre-processor directives.

I believe there are ways to factor out the selection of the appropriate attribute using token pasting.

compnerd added inline comments.Sep 13 2021, 10:15 AM
llvm/include/llvm/Support/LLVMSupportExports.h
12 ↗(On Diff #370755)

We would in order to support BUILD_SHARED_LIBS universally. However, my goal is not to enable BUILD_SHARED_LIBS but rather allow LLVM_BUILD_LLVM_DYLIB which builds a single shared library. LLVMSupport is a good target to demonstrate what is needed without having to implement the full coverage for all of LLVM without actually reaching some consensus. That said, I would certainly be interested in having LLVMSupport as a shared library as well.

This comment was removed by fortsnek9348.
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2023, 7:00 AM
tstellar commandeered this revision.Jun 8 2023, 4:09 PM
tstellar edited reviewers, added: compnerd; removed: tstellar.
tstellar updated this revision to Diff 529776.EditedJun 8 2023, 4:37 PM

After talking with @compnerd on Discord, I'm planning on helping to push this
patch through. At this point my goal is just to annotate the ABI functions and
not make any changes to how -DBUILD_SHARED_LIBS=ON or -DLLVM_LINK_LLVM_DYLIB=ON
behave. We can make changes to those configurations in a future patch (if we
want to).

My plan is to:

  1. Annotate all the ABI functions in the LLVM support library with LLVM_SUPPORT_ABI macro (this patch), and use this an example for how to do the rest of the annotations.
  2. Annotate the rest of the LLVM ABI functions using a different macro (e.g. LLVM_ABI)
  3. Annotate the C ABI with a third macro (e.g. LLVM_C_ABI).
  4. Annotate the clang C++ ABI with another macro (e.g. CLANG_CPP_ABI).
  5. Do an analysis of which ABI functions usage inside and outside of the tree, and remove unused functions from the public ABI (i.e. mark them as visibility hidden).

For this patch, I've rebased it on main and made a few changes:

  • Add ABI annotations to classes and remove them from class member functions.

This will make the patch smaller and easier to maintain. We can always revisit
this in the future once we have the complete set of ABIs annotated.

  • Use default visibility instead of protected visibility on Linux.
  • Drop the add_llvm_component_library(LLVMSupport SHARED change.
tstellar retitled this revision from [WIP/DNM] Support: introduce public API annotation support to Support: introduce public API annotation support.Jun 8 2023, 4:42 PM
nikic added a subscriber: nikic.Jun 23 2023, 1:44 AM

After talking with @compnerd on Discord, I'm planning on helping to push this
patch through. At this point my goal is just to annotate the ABI functions and
not make any changes to how -DBUILD_SHARED_LIBS=ON or -DLLVM_LINK_LLVM_DYLIB=ON
behave. We can make changes to those configurations in a future patch (if we
want to).

My plan is to:

  1. Annotate all the ABI functions in the LLVM support library with LLVM_SUPPORT_ABI macro (this patch), and use this an example for how to do the rest of the annotations.
  2. Annotate the rest of the LLVM ABI functions using a different macro (e.g. LLVM_ABI)

Why do we need/want to annotate the support library with a different macro from everything else? Isn't LLVMSupport part of libLLVM just like the rest?

We already have the LLVM_EXTERNAL_VISIBILITY and LLVM_LIBRARY_VISIBILITY macros. LLVM_EXTERNAL_VISIBILITY is used in the places that already use the -fvisibility=hidden default, namely the backend. LLVM_LIBRARY_VISIBILTY is used in a few places that want to explicitly exclude symbols in code that still uses default visibilty.

I would have expected that this would annotate symbols using the existing LLVM_EXTERNAL_VISIBILITY macro (and LLVM_LIBRARY_VISIBILITY could be retired once everything uses -fvisibility=hidden).

llvm/include/llvm/Support/LLVMSupportExports.h
13 ↗(On Diff #529776)

What is the LLVM_SUPPORT_STATIC variable for? I don't see it defined anywhere.

After talking with @compnerd on Discord, I'm planning on helping to push this
patch through. At this point my goal is just to annotate the ABI functions and
not make any changes to how -DBUILD_SHARED_LIBS=ON or -DLLVM_LINK_LLVM_DYLIB=ON
behave. We can make changes to those configurations in a future patch (if we
want to).

My plan is to:

  1. Annotate all the ABI functions in the LLVM support library with LLVM_SUPPORT_ABI macro (this patch), and use this an example for how to do the rest of the annotations.
  2. Annotate the rest of the LLVM ABI functions using a different macro (e.g. LLVM_ABI)

Why do we need/want to annotate the support library with a different macro from everything else? Isn't LLVMSupport part of libLLVM just like the rest?

LLVMSupport is used very commonly as a support library outside of LLVM.

We already have the LLVM_EXTERNAL_VISIBILITY and LLVM_LIBRARY_VISIBILITY macros. LLVM_EXTERNAL_VISIBILITY is used in the places that already use the -fvisibility=hidden default, namely the backend. LLVM_LIBRARY_VISIBILTY is used in a few places that want to explicitly exclude symbols in code that still uses default visibilty.

I would have expected that this would annotate symbols using the existing LLVM_EXTERNAL_VISIBILITY macro (and LLVM_LIBRARY_VISIBILITY could be retired once everything uses -fvisibility=hidden).

Visibility and DLL Storage are technically different and provide subtly different things. LLVM_EXTERNAL_VISIBILITY does not take the library into account nor does it take into account the static vs dynamic linking nature and thus would not be sufficient for the purposes of DLL Storage.

rnk added a comment.Jun 23 2023, 9:45 AM

Why do we need/want to annotate the support library with a different macro from everything else? Isn't LLVMSupport part of libLLVM just like the rest?

We already have the LLVM_EXTERNAL_VISIBILITY and LLVM_LIBRARY_VISIBILITY macros. LLVM_EXTERNAL_VISIBILITY is used in the places that already use the -fvisibility=hidden default, namely the backend. LLVM_LIBRARY_VISIBILTY is used in a few places that want to explicitly exclude symbols in code that still uses default visibilty.

I would have expected that this would annotate symbols using the existing LLVM_EXTERNAL_VISIBILITY macro (and LLVM_LIBRARY_VISIBILITY could be retired once everything uses -fvisibility=hidden).

For DLL import/export annotations, we need to have distinct macros for every DLL we plan to make. The question then becomes, do we need libLLVMSupport.dll?

Building Support as a DLL is by far the most conventional and straightforward approach. There are many tools (FileCheck, llvm-tblgen) that use Support but not all of LLVM, and they need to be built first.

The alternative is that we place dllexport annotations across the Support APIs, but then we come up with some creative build rules to produce a special static Support library for all of the tools that don't or can't link to the main LLVM DSO. This approach has the risk of leading users into accidentally linking both libraries, which then results in linker errors or silent ODR violations with duplicated global state. We can avoid all this if we are willing to tolerate the complexity of using different ABI annotations in Support headers.

Personally, I think in the long run, having a separate support DLL will be the least confusing approach and will have the fewest surprises. However, I'm speaking as someone who contributed to the implementation of dllexport semantics in clang, so I am probably much more familiar with them than the average LLVM developer.

Given that my plan right now is just to annotate the ABI and not create any new shared objects, should we still use a separate annotation for Support or should we just use the same annotation for the whole library and then specialize it later?

Why do we need/want to annotate the support library with a different macro from everything else? Isn't LLVMSupport part of libLLVM just like the rest?

We already have the LLVM_EXTERNAL_VISIBILITY and LLVM_LIBRARY_VISIBILITY macros. LLVM_EXTERNAL_VISIBILITY is used in the places that already use the -fvisibility=hidden default, namely the backend. LLVM_LIBRARY_VISIBILTY is used in a few places that want to explicitly exclude symbols in code that still uses default visibilty.

I would have expected that this would annotate symbols using the existing LLVM_EXTERNAL_VISIBILITY macro (and LLVM_LIBRARY_VISIBILITY could be retired once everything uses -fvisibility=hidden).

For DLL import/export annotations, we need to have distinct macros for every DLL we plan to make. The question then becomes, do we need libLLVMSupport.dll?

Building Support as a DLL is by far the most conventional and straightforward approach. There are many tools (FileCheck, llvm-tblgen) that use Support but not all of LLVM, and they need to be built first.

The alternative is that we place dllexport annotations across the Support APIs, but then we come up with some creative build rules to produce a special static Support library for all of the tools that don't or can't link to the main LLVM DSO. This approach has the risk of leading users into accidentally linking both libraries, which then results in linker errors or silent ODR violations with duplicated global state. We can avoid all this if we are willing to tolerate the complexity of using different ABI annotations in Support headers.

Personally, I think in the long run, having a separate support DLL will be the least confusing approach and will have the fewest surprises. However, I'm speaking as someone who contributed to the implementation of dllexport semantics in clang, so I am probably much more familiar with them than the average LLVM developer.

So I think I understand why we need a separate ABI macro and can't reuse the existing VISIBILTY ones, but I'm still not clear on why Support is special. Currently (if we ignore BUILD_SHARED_LIBS, which is orthogonal to this) we only support a single libLLVM dylib. Cases that cannot link the dylib for some reason will link against static libraries instead. (Things like tablegen I assume, but also things like llvm-exegesis that use private symbols.)

Is this about some hypothetical future extension where we split up libLLVM.so into smaller libraries (something I am somewhat skeptical about) or is there some Windows idiosyncrasy that requires the special treatment of the Support library?

I'm leaning towards not having a separate macro for LLVMSupport and just using the LLVM_ABI macro for LLVMSupport and everything else.

nikic added a comment.Jul 6 2023, 12:49 AM

I'm leaning towards not having a separate macro for LLVMSupport and just using the LLVM_ABI macro for LLVMSupport and everything else.

I agree. We should have one macro for each currently existing dylib -- which is just the one for LLVM. If we want to introduce additional dylibs in the future, we should split the macro at that point in time.

tstellar updated this revision to Diff 538358.Jul 8 2023, 5:13 AM
  • Use generic LLVM_ABI macro instead of LLVM_SUPPORT_ABI
  • Rebase
rnk added a comment.Jul 10 2023, 11:35 AM

If we're not making a separate DLL for LLVM support, that sounds fine to me.

I'm worried that by limiting scope here, we're just deferring the discovery and resolution of problems that @compnerd already grappled with in his attempt to make LLVM DLLs, but it's also good to be incremental and get things moving.

If we're not making a separate DLL for LLVM support, that sounds fine to me.

I'm worried that by limiting scope here, we're just deferring the discovery and resolution of problems that @compnerd already grappled with in his attempt to make LLVM DLLs, but it's also good to be incremental and get things moving.

I've got a WIP patch to add annotations to the rest of the ABI, I can merge that into this one if you think that's better.

rnk accepted this revision.Jul 12 2023, 1:45 PM

If we're not making a separate DLL for LLVM support, that sounds fine to me.

I'm worried that by limiting scope here, we're just deferring the discovery and resolution of problems that @compnerd already grappled with in his attempt to make LLVM DLLs, but it's also good to be incremental and get things moving.

I've got a WIP patch to add annotations to the rest of the ABI, I can merge that into this one if you think that's better.

Either approach seems fine to me.

Even if this is approved, make sure that it actually successfully builds in Windows configs (last time I tried an earlier version of this patch it broke the build very severely). The premerge CI doesn't seem to have succeeded in applying the patch to even try building it.