This is an archive of the discontinued LLVM Phabricator instance.

Add Clang shared library with C++ exports
ClosedPublic

Authored by beanz on May 14 2019, 8:50 AM.

Details

Summary

This patch adds a libClang_shared library on *nix systems which exports the entire C++ API. In order to support this on Windows we should really refactor llvm-shlib and share code between the two.

This also uses a slightly different method for generating the shared library, which I should back-port to llvm-shlib. Instead of linking the static archives and passing linker flags to force loading the whole libraries, this patch creates object libraries for every library (which has no cost in the build system), and link the object libraries.

Event Timeline

beanz created this revision.May 14 2019, 8:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 14 2019, 8:50 AM
Herald added a subscriber: mgorny. · View Herald Transcript

Thanks for working on this, I have wanted something like this for a while.

It would also be nice to have a CLANG_LINK_CLANG_DYLIB option like we have for llvm, but this can be a follow on patch, and I would be happy to help with this if needed.

clang/tools/clang-shlib/CMakeLists.txt
8

Is this capital 'C' to distinguish it from the other libclang.so ? I would prfer to use lowercase 'c' here instead, so it fit the naming convention for the other libraries.

You mention that you're using OBJECT libraries so objects aren't built mutliples times
and in my current tests the number of steps increased by only 3, it went from 4353 to 4356,
when using this patch, which is great!

What I see in my testing of the patch is that cmake finds that the compilers support position independent code and use -fPIC as default:

-- LLVM default target triple: x86_64-unknown-linux-gnu
-- Performing Test C_SUPPORTS_FPIC
-- Performing Test C_SUPPORTS_FPIC - Success
-- Performing Test CXX_SUPPORTS_FPIC
-- Performing Test CXX_SUPPORTS_FPIC - Success
-- Building with -fPIC

And then, as expected, I see the compile command lines have -fPIC:

[1/4356] /usr/bin/c++  -DGTEST_HAS_RTTI=0 ... -fPIC ...

Its my understanding that when creating shared libraries position independent code generation is required.
Is there a problem if in some scenario the OBJECT libraries were not compiled with -fPIC?

beanz added a comment.May 14 2019, 1:47 PM

@winksaville whether or not PIC is required for shared libraries varies by platform. These days LLVM defaults to -fPIC, and I'm not sure we actually support running LLVM without -fPIC on systems that require shared libraries to be PIC.

beanz updated this revision to Diff 199508.May 14 2019, 1:48 PM

Changed to lowercase 'c' to match other clang libraries.

I did some quick testing.

I used cmake and ninja to build llvm and enabled clang;lld;compiler-rt subprojects:

$ cd build-beanz-clang-shlib-2-add-debug-BUILD-LINK_DYNLIB-ON
$ cmake ../llvm -G Ninja '-DLLVM_ENABLE_PROJECTS=clang;lld;compiler-rt' -DCMAKE_INSTALL_PREFIX=/home/wink/local-beanz-clang-shlib-2-add-debug-BUILD-LINK_DYNLIB-ON -DCMAKE_BUILD_TYPE=Release -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON
$ ninja

I then used bin/clang++ to create bin/clang-shared and verified it ran and looked at the sizes; bin/clang-shared is 187K while bin/clang-9 is 45M

$ bin/clang++  -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG  -Wl,-allow-shlib-undefined   -Wl,--export-dynamic  -Wl,-rpath-link,/home/wink/prgs/llvm/llvm-project/build-beanz-clang-shlib-2-add-debug-BUILD-LINK_DYNLIB-ON/./lib  -Wl,-O3 tools/clang/tools/driver/CMakeFiles/clang.dir/driver.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1as_main.cpp.o tools/clang/tools/driver/CMakeFiles/clang.dir/cc1gen_reproducer_main.cpp.o  -o bin/clang-shared  -Wl,-rpath,"\$ORIGIN/../lib" -lpthread lib/libclang_shared.so lib/libLLVM-9svn.so 
$ ./bin/clang-shared --version
clang version 9.0.0 (git@github.com:winksaville/llvm-project 2e41f82e327c9aace5b1367995357a787be77105)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/wink/prgs/llvm/llvm-project/build-beanz-clang-shlib-2-add-debug-BUILD-LINK_DYNLIB-ON/./bin
$ ls -hal bin/clang-9 bin/clang-shared
-rwxr-xr-x 1 wink users  45M May 14 17:57 bin/clang-9
-rwxr-xr-x 1 wink users 187K May 14 22:15 bin/clang-shared

And I'd previously built clang-9 with LLVM_BUILD_LLVM_DYLIB=OFF and it is 114M:

$ ls -hal ../build-beanz-clang-shlib-2-add-debug-DYNLIB-OFF/bin/clang-9
-rwxr-xr-x 1 wink users 114M May 14 11:31 ../build-beanz-clang-shlib-2-add-debug-DYNLIB-OFF/bin/clang-9

I then compilef hi.c with clang-shared and ran the result:

$ cat ../hi.c
#include <stdio.h>

int main(void) {
  printf("hi\n");
  return 0;
}
$ ./bin/clang-shared ../hi.c -o hi
$ ./hi
hi

LGTM

Questions:

  • Should we only build libclang_shared.so if LLVM_BUILD_LLVM_DYLIB is ON?
  • Should we use link clang-9 to libclang_shared.so when LLVM_LINK_LLVM_DYLIB is ON?
  • Or maybe we should define BUILD_CLANG_DYLIB and LINK_CLANG_DYLIB or ... ?

Questions:

  • Should we only build libclang_shared.so if LLVM_BUILD_LLVM_DYLIB is ON?

LLVM_BUILD_LLVM_DYLIB is actually not the important option to think about because it has no impact on this, rather LLVM_LINK_LLVM_DYLIB. That really depends on what tradeoffs you want. With that option off LLVM will be linked statically into the Clang library, which in the common case is fine because you can resolve all the LLVM & Clang APIs against it. That will be faster to load, but a larger binary distribution. With the option on, libclang_shared will link libLLVM, which will result in slower loading times, but smaller distributions. Both are workable situations.

  • Should we use link clang-9 to libclang_shared.so when LLVM_LINK_LLVM_DYLIB is ON?

No. If we want to support that it should be through its own option which we can add in a subsequent patch.

  • Or maybe we should define BUILD_CLANG_DYLIB and LINK_CLANG_DYLIB or ... ?

I'm not sure there is value in a BUILD_CLANG_DYLIB option (and frankly LLVM_BUILD_LLVM_DYLIB should probably go too). Between when I added the support for building libLLVM and now I've shifted my mindset about build system functionality. Optimizing the build time of the all target seems way less worthwhile to me than reducing complication in the build system. Most developers iterate by running check* targets over and over again rather than building all, and as long as the dependencies are properly set up adding additional targets has little impact on iteration time or perceived build time.

A CLANG_LINK_CLANG_DYLIB option probably does make sense to add in order to support smaller binary distributions, although I am concerned about communicating the impact of that and the LLVM_LINK_LLVM_DYLIB options. In the past I've seen people do compile-time performance comparisons between GCC and Clang using the BUILD_SHARED_LIBS option when building Clang, and it leads to really misleading results. I would be very frustrated to see a pile of blog posts saying that Clang is getting slower just because people are deciding to make clang dynamically link its components.

IMHO "BUILD_CLANG_DYLIB" is needed. As you have it now libclang_shared.so is always builds on UNIX systems, which I believe means that all linux distros would have both increasing their sizes. I think the default should be "always" build libclang*.a as it is now, and optionally build libclang_shared.so using some flag.

And adding "LINK_CLANG_DYLIB" in a future patch SGTM.

clang/tools/CMakeLists.txt
16

It seems to me we should put creating clang-shlib in the users control and default to off.

IMHO "BUILD_CLANG_DYLIB" is needed. As you have it now libclang_shared.so is always builds on UNIX systems, which I believe means that all linux distros would have both increasing their sizes.

Distributions shouldn't be installing all unless they really want everything and the kitchen sink. We have the LLVM_DISTRIBUTION_COMPONENTS so that distributions can decide which pieces of the LLVM & Clang distributions they want to install. That is the cleanest mechanism for curating an install.

IMHO "BUILD_CLANG_DYLIB" is needed. As you have it now libclang_shared.so is always builds on UNIX systems, which I believe means that all linux distros would have both increasing their sizes.

Distributions shouldn't be installing all unless they really want everything and the kitchen sink. We have the LLVM_DISTRIBUTION_COMPONENTS so that distributions can decide which pieces of the LLVM & Clang distributions they want to install. That is the cleanest mechanism for curating an install.

Sorry, maybe I didn't make myself clear. I didn't mean distros installing "all" of LLVM. What I meant was that this change currently uses if(UNIX) to generate libclang_shared.so, which means "all linux distors" will have both libclang*.a and libclang_shared.so. Therefore, I'm suggesting the need for something like "BUILD_CLANG_DYLIB" so that libclang_shared.so is only created when "BUILD_CLANG_DYLIB" is ON and the default is OFF.

beanz added a comment.May 15 2019, 3:10 PM

Sorry, maybe I didn't make myself clear.

I understood you fine. I don't think you understand the guidance for building distributions of LLVM.

Distributions only get libclang_shared if they run the install target which installs all of LLVM & Clang. The point of LLVM_DISTRIBUTION_COMPONENTS is to allow people constructing distributions to choose which pieces they want to install without introducing a whole lot of overhead in the build system. The old method of passing dozens of flags to enable and disable individual pieces of the build resulted in a combinatoric explosion in build settings which are confusing and ill understood. The new method is one setting that is much cleaner to use.

Anyone constructing a distribution should be specifying LLVM_DISTRIBUTION_COMPONENTS and running the install-distribution target.

Given that your comment:

What I meant was that this change currently uses if(UNIX) to generate libclang_shared.so, which means "all linux distors" will have both libclang*.a and libclang_shared.so.

Is only true if the distribution is choosing to run the install target, which doesn't grant control over all the individual pieces of LLVM & Clang to install, and is not the recommended way to construct a distribution.

Distributions only get libclang_shared if they run the install target which installs all of LLVM & Clang. The point of LLVM_DISTRIBUTION_COMPONENTS is to allow people constructing distributions to choose which pieces they want to install without introducing a whole lot of overhead in the build system. The old method of passing dozens of flags to enable and disable individual pieces of the build resulted in a combinatoric explosion in build settings which are confusing and ill understood. The new method is one setting that is much cleaner to use.

Anyone constructing a distribution should be specifying LLVM_DISTRIBUTION_COMPONENTS and running the install-distribution target.

Is there any documentation for LLVM_DISTRIBUTION_COMPONENTS? I searched llvm-project and didn't find any.
The only thing I found on the internet was this LLVM weekly post, which pointed me to your llvm-dev post which says look at Apple-stage2.cmake.
I looked at it and there are a bunch of set commands and the one occurrance of set(LLVM_DISTRIBUTION_COMPONENTS with a list of components. So I thought I'd try the simplest thing possible, pass -DLLVM_DISTRIBUTION_COMPONENTS to cmake:

$ cmake ../llvm -G Ninja -DLLVM_DISTRIBUTION_COMPONENTS=clang

And it failed with:

CMake Error at CMakeLists.txt:1108 (message):
  Specified distribution component 'clang' doesn't have a target


CMake Error at CMakeLists.txt:1114 (message):
  Specified distribution component 'clang' doesn't have an install target


CMake Error at CMakeLists.txt:1120 (message):
  Specified distribution component 'clang' doesn't have an install-stripped
  target.  Its installation target creation should be changed to use
  add_llvm_install_targets, or you should manually create the
  'install-clang-stripped' target.

So obviously, that was wrong.

There is a simpler example distribution configuration, but sadly there isn't documentation. That is something I can fix.

I should note, this and many more aspects of the build system were topics I discussed in a talk at the 2016 LLVM Dev meeting when we switched off autoconf:
https://www.youtube.com/watch?v=StF77Cx7pz8

There is a simpler example distribution configuration, but sadly there isn't documentation. That is something I can fix.

Please add documentation if you want people to use it :)
In the mean time I'll watch the the talk you mentioned.

Please add documentation if you want people to use it :)

I won't disagree that it should be documented, and I already started. I do want to point out though that your motivation starting this was Arch linux where they are doing exactly what our documentation says not to do. So...

tstellar accepted this revision.May 16 2019, 11:14 AM

LGTM. We can add CLANG_LINK_CLANG_DYLIB as a follow-up patch.

This revision is now accepted and ready to land.May 16 2019, 11:14 AM

Just to be clear, I have nothing to do with any distribution except as a user (Arch Linux) so please take
what I say and request with a huge grain of salt. As mentioned I have filed a bug
against the clang package in Arch Linux so hopefully we'll be able to get them going in the right direction.

I do have another use case that I've done a little work on. For Pony they have a problem tracking
the various versions of LLVM that distros use so I created a git submodule which allows a ponyc
developer to work with multiple versions or variations of LLVM simultaneously and switch between them
relatively easily. Also, they will eventually switch to bundling libLLVM instead of using the system installed
version which could be any version. I have a standalone version of that at mkllvm-tool-chain.

Anyway, I'm obviously using "intall" style and I suggest having a BUILD_CLANG_DYLIB and LINK_CLANG_DYLIB would
be useful to more precisely control how a compiler might use the libclang*. Some users of LLVM might have a bunch of tools
that use various subparts of clang and they might be resident all the time. So startup performance isn't a problem, but
reducing the in memory foot print using shared libraries could be useful. Or not building the them at all if they are not
using them in anyway. So you're probably correct that for distros these flags shouldn't be used, but for other use cases
of LLVM I think there are.

In anycase, I think this is a worth while change as it and If you're still not convinced I like to see this in sooner rather than later.

beanz added a comment.May 16 2019, 2:56 PM

I will land this patch as-is today. I'll try and get a doc patch out for review today or tomorrow. @winksaville, I'm going to add a section to the document about building LLVM as a shared library for inclusion with tool distributions.

That still shouldn't be done with BUILD_SHARED_LIBS, we do have mechanisms in LLVM to allow people to control which parts of LLVM get included in libLLVM, and we should add similar mechanisms to this new functionality if needed.

This revision was automatically updated to reflect the committed changes.

@beanz @tstellar
I am wondering what to do wrt apt.llvm.org
should it be part of libclang or create a libclang++ package
what do you think?

I would leave it out of any distribution (at least for now).

ok, it should probably be installed in that case (it is currently in usr/lib/llvm-9/lib/libclang_shared.so.9 )
by the way, the name of lib isn't super explicit :/

@sylvestre.ledru, I’ve added you on D62040, which is an attempt to document best practices for generating releases.

@beanz Great doc, thanks! I will see what I could use for Debian/Ubuntu packages (as we have a lot users and packages organized in a specific way, it isn't always easy to make huge changes)

E5ten added a subscriber: E5ten.May 21 2019, 6:11 AM

I might be doing something wrong but this seems to have broken BUILD_SHARED_LIBS for me in that even with that enabled clang is built as a bunch of static libraries linked into a shared one like this patch is supposed to make it do, while I thought that BUILD_SHARED_LIBS was supposed to make it create a bunch of shared libraries as it did before with libclang and still does with libLLVM.

beanz added a comment.May 21 2019, 8:34 AM

@E5ten, I see what is going on. Give me 30 minutes and I’ll have a fix pushed.

beanz added a comment.May 21 2019, 8:54 AM

@E5ten, I just pushed r361271, which should resolve your issue.

I have a better longer-term fix in mind that I'll put up for review this week.

@beanz Wouldn't fixing this by adding OR BUILD_SHARED_LIBS to if(ARG_SHARED) in AddClang.cmake and to if (NOT LLVM_ENABLE_PIC) in clang-shlib/CMakeLists.txt to prevent making libclang_shared when BUILD_SHARED_LIBS is enabled make more sense?

E5ten added a comment.EditedMay 21 2019, 2:22 PM

@beanz Well I took libclang_shared as effectively an equivalent to the libLLVM.so that's created with that dylib option, and when BUILD_SHARED_LIBS is enabled that library is not created, in fact the option to create that library conflicts with BUILD_SHARED_LIBS. Also when the libs are built shared and also as object libraries that are linked into libclang_shared would that not cause the same libraries to be linked into executables twice, once through the shared libraries and once through libclang_shared?
EDIT: This is responding to an email that I thought was a comment on phabricator sorry

beanz added a comment.May 21 2019, 2:31 PM

@beanz Well I took libclang_shared as effectively an equivalent to the libLLVM.so that's created with that dylib option, and when BUILD_SHARED_LIBS is enabled that library is not created, in fact the option to create that library conflicts with BUILD_SHARED_LIBS.

That conflict is because of how I implemented libLLVM, and is something I'm looking to change. I'm hoping to put up a patch this week that will update how we generate libLLVM to be more like how libclang_shared is built.

Also when the libs are built shared and also as object libraries that are linked into libclang_shared would that not cause the same libraries to be linked into executables twice, once through the shared libraries and once through libclang_shared?

Object libraries are just collections of object files without linkage dependencies. The whole point of implementing libclang_shared in terms of object files is to allow constructing libclang_shared even if BUILD_SHARED_LIBS is enabled, and without using extra linker flags to force the linker to treat all the clang static archives as collections of object files (we do that in libLLVM using --all_load and --whole-archive flags).

It shouldn't result in duplicating the linkage because libclang contains all the code and shouldn't have any external resolutions against the libclang* component libraries.

One big reason that I don't want to disable generating libclang_shared (or libLLVM) if BUILD_SHARED_LIBS=On is because libclang_shared is intended to be a shippable binary unlike the BUILD_SHARED_LIBS libraries, so it really should be enabled in every possible build configuration that we can make it work.

E5ten added a comment.May 21 2019, 2:40 PM

@beanz But if libclang_shared is intended to be a shippable binary and BUILD_SHARED_LIBS is only intended to be an option used in developer builds, and libclang_shared while not causing conflicts (thanks for the info on how that works by the way) would be redundant in a local developer build one would see BUILD_SHARED_LIBS enabled in wouldn't it? And also to me it doesn't really make sense to enable the intended shippable binary in a build that is specifically enabling a developer option, and so is obviously not intended for shipment (I get that in the arch case it is intended for shipment but that case is them using an option they shouldn't not the option being one that should be used when the built products are intended for redistribution).

beanz added a comment.May 21 2019, 4:34 PM

@beanz But if libclang_shared is intended to be a shippable binary and BUILD_SHARED_LIBS is only intended to be an option used in developer builds, and libclang_shared while not causing conflicts (thanks for the info on how that works by the way) would be redundant in a local developer build one would see BUILD_SHARED_LIBS enabled in wouldn't it? And also to me it doesn't really make sense to enable the intended shippable binary in a build that is specifically enabling a developer option, and so is obviously not intended for shipment (I get that in the arch case it is intended for shipment but that case is them using an option they shouldn't not the option being one that should be used when the built products are intended for redistribution).

Best way to make sure a shippable binary builds and works is for developers to be building it. In general we should avoid developers doing things that aren't representative of what we expect to be in the hands of users. BUILD_SHARED_LIBS is an unusual exception to this that I wish would go away. The reason we keep it around is because of how significant of a workflow improvement it provides.

tools/clang-shlib/CMakeLists.txt
8 ↗(On Diff #199912)

can we agree on a different name?
we already have licbclang which is a shared library.

Maybe libclangcpp.so ?

kimgr added a subscriber: kimgr.Jun 30 2019, 7:44 AM

@sylvestre.ledru @beanz After this change, the Debian packaging on apt.llvm.org is basically broken. See https://bugs.llvm.org/show_bug.cgi?id=42432. I'm sure this can be fixed in packaging, but I don't know enough about it to know exactly what to do. At any rate I think the idea that libclang_shared.so.* can stay out of the published package is wrong. At least unless there's also a way to build the Clang tree withouth the clang_shared target.

Would be grateful for any ideas on how to fix this.

For now, it isn't part of the debian packaging.
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/blob/snapshot/debian/rules#L563
it is removed as packaging phase as I have been told it isn't ready.

Anyway, the lib should not keep this name.
By definition, on linux, .so means shared. It should have a more explicit name.

@beanz Can you please rename it? thanks

For now, it isn't part of the debian packaging.
https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/blob/snapshot/debian/rules#L563
it is removed as packaging phase as I have been told it isn't ready.

Anyway, the lib should not keep this name.
By definition, on linux, .so means shared. It should have a more explicit name.

@beanz Can you please rename it? thanks

I've filed a bug for this and marked it as a blocker for 9.0.0, because once we ship a release with this name, it will be harder to change: https://bugs.llvm.org/show_bug.cgi?id=42475

CLANG_LINK_CLANG_DYLIB=ON does not seem to work with LLVM_INSTALL_TOOLCHAIN_ONLY=ON. I filed a bug report: https://bugs.llvm.org/show_bug.cgi?id=42575