Page MenuHomePhabricator

[CMake][ELF] Link libLLVM.so and libclang-cpp.so with -Bsymbolic-functions
ClosedPublic

Authored by MaskRay on May 7 2021, 1:06 PM.

Details

Summary

llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html

In an ELF shared object, a default visibility defined symbol is preemptible by
default. This creates some missed optimization opportunities.
-Bsymbolic-functions is more aggressive than our current -fvisibility-inlines-hidden
(present since 2012) as it applies to all function definitions. It can

  • avoid PLT for cross-TU function calls && reduce dynamic symbol lookup
  • reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64

In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313
The built clang with -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on is significantly faster.
See the Linux kernel build result https://bugs.archlinux.org/task/70697

Note: the performance of -fno-semantic-interposition -Bsymbolic-functions
libLLVM.so and libclang-cpp.so is close to a PIE binary linking against
libLLVM*.a and libclang*.a. -Bsymbolic-functions is the major contributor.
On x86-64 (with GOTPCRELX) and ppc64 ELFv2, the GOT/TOC relocations can be
optimized.

Some implication:

Interposing a subset of functions is no longer supported.
(This is fragile on ELF and unsupported on Mach-O at all. For Mach-O we don't
use ld -interpose or -flat_namespace)

Compiling a program which takes the address of any LLVM function with
{gcc,clang} -fno-pic and expects the address to equal to the address taken
from libLLVM.so or libclang-cpp.so is unsupported. I am fairly confident that
llvm-project shouldn't have different behaviors depending on such pointer
equality (as we've been using -fvisibility-inlines-hidden which applies to
inline functions for a long time), but if we accidentally do, users should be
aware that they should not make assumption on pointer equality in -fno-pic
mode.

See more on https://maskray.me/blog/2021-05-09-fno-semantic-interposition

Diff Detail

Event Timeline

MaskRay created this revision.May 7 2021, 1:06 PM
MaskRay requested review of this revision.May 7 2021, 1:06 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 7 2021, 1:06 PM
MaskRay edited the summary of this revision. (Show Details)May 7 2021, 1:12 PM
MaskRay edited the summary of this revision. (Show Details)May 7 2021, 1:18 PM
MaskRay edited the summary of this revision. (Show Details)May 7 2021, 1:30 PM
MaskRay edited the summary of this revision. (Show Details)
phosek accepted this revision.May 7 2021, 1:39 PM

LGTM

This revision is now accepted and ready to land.May 7 2021, 1:39 PM

Does this change mean that LD_PRELOAD will no longer work? Are there any other downsides to adding these flags?

MaskRay updated this revision to Diff 343755.May 7 2021, 1:52 PM

Set global CMAKE_SHARED_LINKER_FLAGS instead.

This additional makes -DBUILD_SHARED_LIBS=on builds (dev only; not recommended for packagers) faster.

phosek added inline comments.May 7 2021, 1:55 PM
llvm/cmake/modules/HandleLLVMOptions.cmake
243

I'd really like to avoid modifying CMAKE_SHARED_LINKER_FLAGS, modifying these global variables is an antipattern in modern CMake. We should always set these directly on the targets that need them. What's the problem with the approach you used in the first patch?

MaskRay added a comment.EditedMay 7 2021, 1:59 PM

Does this change mean that LD_PRELOAD will no longer work? Are there any other downsides to adding these flags?

The downside is that it will behave more like Windows dll and Mach-O dylib using two-level namespace lookup.
No runtime preemption.

There are several types of LD_PRELOAD usage.

First, use LD_PRELOAD=same_soname.so to replace a DT_NEEDED entry with the same SONAME. Both -fno-semantic-interposition and -Bsymbolic are compatible with such usage.

Second, use LD_PRELOAD=malloc.so to intercept some functions not defined in the application or any of its shared object dependencies. Both -fno-semantic-interposition and -Bsymbolic are compatible.

c
void *f() { return malloc(0xb612); }

Third, use LD_PRELOAD=different_soname.so to replace a function defined in a shared object dependency and the SONAME is different.
Such usage is incompatible with -Bsymbolic.
If the function is referenced in its definiting translation unit, the call sites are statically bound with -fno-semantic-interposition;
otherwise the usage is still compatible.

MaskRay added inline comments.May 7 2021, 2:00 PM
llvm/cmake/modules/HandleLLVMOptions.cmake
243

With this, we can make libLLVM*.so libclang*.so (-DBUILD_SHARED_LIBS=on) and other lib*.so faster as well.

phosek requested changes to this revision.May 7 2021, 2:08 PM
phosek added inline comments.
llvm/cmake/modules/HandleLLVMOptions.cmake
243

I'd prefer to setting that flag inside add_llvm_library and add_clang_library when SHARED is set. HandleLLVMOptions is also used from runtimes where setting this flag is really undesirable (that's why using global variables is really problematic).

This revision now requires changes to proceed.May 7 2021, 2:08 PM
MaskRay updated this revision to Diff 343758.May 7 2021, 2:08 PM

Don't affect -DBUILD_SHARED_LIBS=on

MaskRay marked an inline comment as done.May 7 2021, 2:10 PM
MaskRay added inline comments.
llvm/cmake/modules/HandleLLVMOptions.cmake
243

add_llvm_library and add_clang_library may be used by downstream projects which don't want -Bsymbolic-functions.

Switched back. I'll just not touch -DBUILD_SHARED_LIBS=on for now.

phosek accepted this revision.May 7 2021, 2:42 PM
This revision is now accepted and ready to land.May 7 2021, 2:42 PM
bcain added a subscriber: bcain.May 10 2021, 11:32 AM
rnk added a subscriber: rnk.May 10 2021, 6:12 PM

+1 to the idea, but I have no idea if this is the right cmake spot

If we've decided to actually care about the shared library build, should we also consider using -fvisibility-inlines-hidden?

MaskRay marked an inline comment as done.May 11 2021, 1:01 AM

+1 to the idea, but I have no idea if this is the right cmake spot

If we've decided to actually care about the shared library build, should we also consider using -fvisibility-inlines-hidden?

Line 329: check_cxx_compiler_flag("-fvisibility-inlines-hidden" SUPPORTS_FVISIBILITY_INLINES_HIDDEN_FLAG).
llvm-project has been using this since 2012. With this flag, taking the address of an inline function may be different in different modules.
So -fno-semantic-interposition and -Bsymbolic-functions will now complement the existing option:)

I'll push this tomorrow.

MaskRay edited the summary of this revision. (Show Details)May 11 2021, 5:37 PM
MaskRay edited the summary of this revision. (Show Details)

I was originally skeptical of this change, because of some negative feedback about these flags in the past on the Fedora development list. However, the people I've talked to more recently seem to be in favor of this change, so no objection from me.

I tried to build the libc++ benchmarks locally and the benchmark with target to_chars_libcxx fails to properly execute. Bi-section let to this commit. Can the change in the linker flags affect libc++? I also see the scheduled builds of libc++ fail at Buildkite https://buildkite.com/llvm-project/libcxx-ci.
Can you have a look at the issue?

There is also a test failure on the aarch64 2-stage bot (https://lab.llvm.org/buildbot/#/builders/7/builds/2720) which I've bisected to this change, so I'll revert it for now.

MaskRay added a subscriber: marco-c.EditedMay 13 2021, 10:15 AM

I tried to build the libc++ benchmarks locally and the benchmark with target to_chars_libcxx fails to properly execute. Bi-section let to this commit. Can the change in the linker flags affect libc++? I also see the scheduled builds of libc++ fail at Buildkite https://buildkite.com/llvm-project/libcxx-ci.
Can you have a look at the issue?

libcxx/src/locale.cpp calls install(&make<_VSTD::ctype<char> >(nullptr, false, 1u)); to initialize std::__1::ctype<char>::id.
In libcxx/include/__locale (which may be transitively included by the executable), the std::__1::ctype<char>::id reference is expected to be bound to std::__1::ctype<char>::id defined in libc++.so.1.
If std::__1::ctype<char>::id is copy relocated by the executable (gcc/clang -fno-pic), the std::__1::ctype<char>::id reference from the executable will be bound to the uninitialized copy in the executable, causing a no-facet failure.

cpp
// _Facet is std::__1::ctype<char>
template <class _Facet>
inline _LIBCPP_INLINE_VISIBILITY
const _Facet&
use_facet(const locale& __l)
{
    return static_cast<const _Facet&>(*__l.use_facet(_Facet::id));
}

locale.cpp thus cannot be compiled with -fno-semantic-interposition

There is also a test failure on the aarch64 2-stage bot (https://lab.llvm.org/buildbot/#/builders/7/builds/2720) which I've bisected to this change, so I'll revert it for now.

The way @marco-c made compiler-rt/test/profile/Posix/gcov-shared-flush.c work on Linux in rL336678 is to use interposition of the global variables writeout_fn_list/etc.
This doesn't work on Darwin, thus https://bugs.llvm.org/show_bug.cgi?id=38134
-fno-semantic-interposition disables variable interposition on Linux as well, therefore a similar failure.

I think -fno-semantic-interposition should be specific to llvm/clang for now. We should ensure they work even under copy relocations.

MaskRay reopened this revision.EditedMay 13 2021, 1:23 PM

I'll go simple - just drop -fno-semantic-interposition from this patch, as I cannot figure out how to make -fno-semantic-interposition specific to llvm/ and clang/.
-Bsymbolic-functions claims most of the optimizations anyway (and more so on x86-64 with GOTPCRELX) for Clang built clang.

Unfortunately gcc -fpic -fsemantic-interposition disables IPO so we should figure out how to reinstate -fno-semantic-interposition to make distribution clang faster.

This revision is now accepted and ready to land.May 13 2021, 1:23 PM
MaskRay updated this revision to Diff 345264.May 13 2021, 1:26 PM
MaskRay retitled this revision from [CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions to [CMake][ELF] Link libLLVM.so and libclang-cpp.so with -Bsymbolic-functions.
MaskRay edited the summary of this revision. (Show Details)

Drop -fno-semantic-interposition

MaskRay updated this revision to Diff 345268.May 13 2021, 1:36 PM
MaskRay edited the summary of this revision. (Show Details)

Fix comment

This revision was landed with ongoing or failed builds.May 13 2021, 1:45 PM
This revision was automatically updated to reflect the committed changes.

I cannot figure out how to make -fno-semantic-interposition specific to llvm/ and clang/.

Maybe @nikic knows? I think he is interested in this work, since the base (reverted) commit introduced 4% compile time improvement - really nice.

I cannot figure out how to make -fno-semantic-interposition specific to llvm/ and clang/.

Maybe @nikic knows? I think he is interested in this work, since the base (reverted) commit introduced 4% compile time improvement - really nice.

Thanks for showing interesting:) (At least let me know I am not wasting my energy.)
Posted D102453 for the GCC optimization.

Personally I always build llvm-project with a stable Clang so D102453 doesn't really make my life better.
D102453 is for those distributions which build llvm-project with GCC...

When building Linux kernel, I don't see additional improvement by adding -fno-semantic-interposition,
but it does make libLLVM.so 2% smaller (notably, GCC -fPIC -fno-semantic-interposition can optimize some
functions which call other non-inline functions so in various places the function body can be smaller.
Some vtable can be optimized out as well.)