This is an archive of the discontinued LLVM Phabricator instance.

Fix RUNPATH not accounting for LLVM_ENABLE_PER_TARGET_RUNTIME_DIR
Needs ReviewPublic

Authored by arcivanov on May 21 2022, 11:39 AM.

Details

Reviewers
EricWF
lei
daltenty
phosek
Group Reviewers
Restricted Project
Restricted Project
Restricted Project
Summary

I'm incorporating by reference the data from this issue: https://github.com/llvm/llvm-project/issues/54955
This commit fixes this issue.

Furthermore the issue always occurs when building runtimes as part of the two-stage distro build as LLVM_ENABLE_PER_TARGET_RUNTIME_DIR is always ON for all architectures except AIX for any and all selected runtimes.

Diff Detail

Event Timeline

arcivanov created this revision.May 21 2022, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2022, 11:39 AM
Herald added a subscriber: mgorny. · View Herald Transcript
arcivanov requested review of this revision.May 21 2022, 11:39 AM
EricWF requested changes to this revision.May 21 2022, 1:56 PM
EricWF added a subscriber: EricWF.

I think this changes necessary to fix libc++, assuming it actually uses this cmake infrastructure, which I'm not sure it does.

How is this been tested?

Also, I apologize I didn't mean to request changes but I can't unselect it on mobile

This revision now requires changes to proceed.May 21 2022, 1:56 PM

Well, this affects *at least* runtimes libunwind, libcxx and libcxxabi and at least on Linux.
The situation occurs when project are built depending on any of those runtimes being directly colocated, i.e. referenced by runpath.

The issue was discovered as part of a single-shot two-stage build with said runtimes.
More details: https://discourse.llvm.org/t/bootstrap-build-with-llvm-runtimes-clang-shared-libc-build-failure-stage2-bins-due-to-libc-so-not-found/60141/7?u=arcivanov

The saga continues. The libcxx/abi do not appear to have any rpath reference to the libunwind:

$ ldd bin/clang-14
        linux-vdso.so.1 (0x00007ffcfd7cc000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7970ad0000)
        libclang-cpp.so.14 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/libclang-cpp.so.14 (0x00007f796ce70000)
        libLLVM-14.so => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so (0x00007f796971b000)
        libc++.so.1 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/x86_64-unknown-linux-gnu/libc++.so.1 (0x00007f7969627000)
        libc++abi.so.1 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/x86_64-unknown-linux-gnu/libc++abi.so.1 (0x00007f79695ec000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f7969510000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f79694f5000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f79692ec000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f7970b02000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f79692e7000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f79692e0000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f79692c6000)
        libunwind.so.1 => not found
        libunwind.so.1 => not found

It looks like 2-stage builds with embedded shared library runtimes are fundamentally broken with respect to rpath/runpath.
Additional revision is incoming.

arcivanov updated this revision to Diff 431848.May 24 2022, 5:28 PM

When libcxx and libcxxabi depend on shared libunwind that is installed with them in the target-triple directory, both libraries also require an '$ORIGIN' runpath.

arcivanov added a comment.EditedMay 24 2022, 6:03 PM
$ ldd clang-14 
        linux-vdso.so.1 (0x00007ffeb49f7000)
        libclang-cpp.so.14 => /home/user/Documents/src/project/llvm.twostage.bin/bin/./../lib/libclang-cpp.so.14 (0x00007fc547ad7000)
        libLLVM-14.so => /home/user/Documents/src/project/llvm.twostage.bin/bin/./../lib/libLLVM-14.so (0x00007fc54437f000)
        libc++.so.1 => /home/user/Documents/src/project/llvm.twostage.bin/bin/./../lib/x86_64-unknown-linux-gnu/libc++.so.1 (0x00007fc54428b000)
        libc++abi.so.1 => /home/user/Documents/src/project/llvm.twostage.bin/bin/./../lib/x86_64-unknown-linux-gnu/libc++abi.so.1 (0x00007fc544252000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fc54414b000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fc54412e000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fc543f25000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc54b739000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fc543f0b000)
        libxml2.so.2 => /lib64/libxml2.so.2 (0x00007fc543d81000)
        libatomic.so.1 => /lib64/libatomic.so.1 (0x00007fc543d77000)
        libunwind.so.1 => /home/user/Documents/src/project/llvm.twostage.bin/bin/./../lib/x86_64-unknown-linux-gnu/libunwind.so.1 (0x00007fc543d69000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fc543d3d000)
$ ldd libc++.so.1.0 
        linux-vdso.so.1 (0x00007ffe0af16000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f823dd6b000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f823dc8f000)
        libatomic.so.1 => /lib64/libatomic.so.1 (0x00007f823dc85000)
        libc++abi.so.1 => /home/user/Documents/src/project/llvm.twostage.bin/lib/x86_64-unknown-linux-gnu/./libc++abi.so.1 (0x00007f823dc4c000)
        libunwind.so.1 => /home/user/Documents/src/project/llvm.twostage.bin/lib/x86_64-unknown-linux-gnu/./libunwind.so.1 (0x00007f823dc40000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f823e095000)

Windows doesn't need the rpath.

I'm not sure what to do here to validate this as builds keep randomly failing due to timeouts.

for what it's worth: As part of upgrading the Salesforce-internal clang toolchain to clang 15.0.0, I also ran into the RUNPATH issues. This patch here solved those issues for us

DoDoENT added a subscriber: DoDoENT.Sep 7 2022, 4:00 AM
lei accepted this revision.Jan 23 2023, 1:23 PM
lei added a subscriber: lei.

This patch fixes the exact problem I am seeing when building on PowerPC for -DLLVM_ENABLE_RUNTIMES=compiler-rt;libcxx;libcxxabi -DLLVM_ENABLE_LIBCXX:BOOL=ON.
I have applied this patch and tested it on the ppcle rhel/aix/flang/mlir bots and it does not cause any regressions so this LGTM.

@EricWF Does that address your testing concerns?

@EricWF any chance to get it in for 16?

lei added a comment.Jan 30 2023, 10:11 AM

It would be great if we can get this in 16.

lei added a subscriber: tstellar.Jan 30 2023, 2:40 PM

@tstellar PowerPC would need this patch enable to publish a build with libcxx. When does this patch need to land enable to get this into LLVM 16.0?

@tstellar for that, I presume, the commit needs to be in main already. Can anybody help moving this forward?

lei added a comment.Feb 2 2023, 8:47 AM

@arcivanov are you still waiting for Eric to respond or are you looking for help to commit this patch?

@lei I have no commit privileges so I can't really do anything apart from submitting the patch and waiting for someone to do something. I don't know if we need to wait for EricW or somebody else can help. I just would like the patch in main and 16.x. I'm ready to take the steps necessary to help with that.

lei added a comment.EditedFeb 2 2023, 9:30 AM

Since Eric mentioned above that he didn't mean to block this patch and he hasn't come back to it, I am going to assume it is okay to land this patch now. I can help to commit this tomorrow if we don't get any blocking comments by then.

When was this patch last rebased? Can you please rebase/test the patch again?

@lei long time ago. Let me rebase it on main.
I assume after that is committed you'll "cherry-pick" into 16.x via GH issue?

lei added a comment.Feb 2 2023, 9:35 AM

@lei long time ago. Let me rebase it on main.
I assume after that is committed you'll "cherry-pick" into 16.x via GH issue?

I will help to commit this issue, can you please deal with what's needed to get it cherry-pick'd into 16.x?

arcivanov updated this revision to Diff 494500.Feb 2 2023, 8:18 PM

Rebase on 17.x (main)

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 2 2023, 8:18 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript

I will help to commit this issue, can you please deal with what's needed to get it cherry-pick'd into 16.x?

As far as I understand once it's in main a person with commit authority can use the cherry-pick command from GH Issues per @tstellar 's guide: https://llvm.org/docs/GitHub.html#backporting-fixes-to-the-release-branches

phosek added a subscriber: phosek.Feb 2 2023, 11:42 PM
phosek added inline comments.
libcxx/src/CMakeLists.txt
184

Once we require CMake >=3.14, we should be able to use https://cmake.org/cmake/help/latest/prop_tgt/BUILD_RPATH_USE_ORIGIN.html. Can you leave a TODO comment here for that?

185–186

What's the motivation behind this condition? Why not set this unconditionally?

187

This flag may not be supported by all linkers so we need to check if it's supported first.

arcivanov added inline comments.Feb 3 2023, 2:04 PM
libcxx/src/CMakeLists.txt
184

I will

185–186

I don't know the side-effects of setting this unconditionally, i.e. I went with a narrowly-tailored solution and principle of least surprise, only capturing the condition where a hard runtime failure would occur without introducing the RPATH into $ORIGIN.

The motivation is to only introduce this when:

  1. LIBCXX is an SO; and
  2. There is an unwinder used that is NOT linked statically; and
  3. You actually build shared unwinder.
187

I would appreciate help here.
Would using logic similar to llvm_setup_rpath satisfy you or is there a better solution you see?
I could limit this further to only UNIX which is non-FreeBSD/DragonFly, which should cover all linkers?

The saga continues. The libcxx/abi do not appear to have any rpath reference to the libunwind:

$ ldd bin/clang-14
        linux-vdso.so.1 (0x00007ffcfd7cc000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7970ad0000)
        libclang-cpp.so.14 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/libclang-cpp.so.14 (0x00007f796ce70000)
        libLLVM-14.so => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so (0x00007f796971b000)
        libc++.so.1 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/x86_64-unknown-linux-gnu/libc++.so.1 (0x00007f7969627000)
        libc++abi.so.1 => /home/user/Documents/src/project/llvm.twostage.build.manylinux2014/tools/clang/stage2-bins/bin/../lib/x86_64-unknown-linux-gnu/libc++abi.so.1 (0x00007f79695ec000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f7969510000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f79694f5000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f79692ec000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f7970b02000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f79692e7000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f79692e0000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f79692c6000)
        libunwind.so.1 => not found
        libunwind.so.1 => not found

It looks like 2-stage builds with embedded shared library runtimes are fundamentally broken with respect to rpath/runpath.
Additional revision is incoming.

@phosek solving that specific problem in re motivation

@phosek @lei How are we proceeding further?

lei added a comment.Feb 21 2023, 7:47 AM

Since @phosek requested changes, this now require his review to see if his concerns are addressed.
@phosek can you please take a look at this patch to see if it addresses your concerns? PowerPC require this patch for our libcxx builds.

daltenty requested changes to this revision.Mar 6 2023, 11:08 AM
daltenty added a subscriber: daltenty.

This patch is failing in the libc++ precommit CI attached to this revision as the -rpath option is unsupported on AIX:

https://buildkite.com/llvm-project/libcxx-ci/builds/18515#01861582-a41c-42fc-abe0-07ba06ab071d

IIUC there should not be a need to insert this linker options manually as is happening here, CMake has specific variables for controlling the build and install rpaths:

https://cmake.org/cmake/help/v3.26/variable/CMAKE_BUILD_RPATH.html
https://cmake.org/cmake/help/latest/variable/CMAKE_INSTALL_RPATH.html

AFAIK these should generate the appropriate linker options automatically per-platform

(That said, relative $ORIGIN rpaths are not supported on some platforms like AIX, so this isn't going to work universally. But at least using the CMake features should make it generate the right flags and won't result in a linker failure)

This revision now requires changes to proceed.Mar 6 2023, 11:08 AM
arcivanov updated this revision to Diff 502761.Mar 6 2023, 12:52 PM

Limit the changes to UNIX and NOT APPLE and NOT AIX.

arcivanov updated this revision to Diff 502785.Mar 6 2023, 1:46 PM

Fix "not matches" evaluation priority.

@daltenty I applied the same exclusion logic llvm_setup_rpath in AddLLVM.cmake and the build is now fully passing.
I could take the same approach as llvm_setup_rpath with INSTALL_RPATH and BUILD_WITH_INSTALL_RPATH if you prefer.

phosek requested changes to this revision.Mar 6 2023, 11:14 PM
phosek added inline comments.
libcxx/src/CMakeLists.txt
185–186

This part of the change seems unrelated to LLVM_ENABLE_PER_TARGET_RUNTIME_DIR, only the llvm_setup_rpath change should be needed. If that's correct, could we split this part of the change into a separate patch?

187

Rather than manually setting the linker flag, could we instead set the BUILD_RPATH and INSTALL_RPATH property on the shared targets?

This revision now requires changes to proceed.Mar 6 2023, 11:14 PM
arcivanov added inline comments.Mar 6 2023, 11:25 PM
libcxx/src/CMakeLists.txt
185–186

I don't know if it makes sense. This part was found during the testing of the original problem: https://reviews.llvm.org/D126122#3530383
The issue is fundamentally the same - broken RPATHs and non-functional libraries.
Would you prefer I rather rename the patch to something like "Missing relative $ORIGIN RUNPATH in libcxx/-abi or with LLVM_ENABLE_PER_TARGET_RUNTIME_DIR"?

187

I could certainly try.

@daltenty I applied the same exclusion logic llvm_setup_rpath in AddLLVM.cmake and the build is now fully passing.

Thanks. I think the AIX exclude part should be fine, we can worry about fixing this if the AIX loader ever gets the required feature.

I could take the same approach as llvm_setup_rpath with INSTALL_RPATH and BUILD_WITH_INSTALL_RPATH if you prefer.

Yeah, I think as @phosek suggests, it would be better is we set the right BUILD_RPATH and INSTALL_RPATH properties for the shared library targets, so we avoid manually pushing the linker opts.

Seems like we might need to add something, as these properties don't seem like they're currently used in the runtime side. Maybe we could create LIBCXX_INSTALL_RPATH and LIBCXX_BUILD_RPATH and similar variables and use those to create the properties (something like that might be generally useful for manipulating the load paths).

arcivanov updated this revision to Diff 508448.Mar 26 2023, 4:18 PM

Introduced requested changes by adding LIBCXX_INSTALL_RPATH and LIBCXXABI_INSTALL_RPATH.
Eliminated add_link_flags use for RPATH.

TWIMC,

Unfortunately there is another build bug in the main now:

[1064/1064] cd /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/libcxx/src && /usr/local/bin/cmake -DCMAKE_INSTALL_COMPONENT=cxx -P /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/libcxx/cmake_install.cmake
FAILED: libcxx/src/CMakeFiles/install-cxx /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/libcxx/src/CMakeFiles/install-cxx
cd /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/libcxx/src && /usr/local/bin/cmake -DCMAKE_INSTALL_COMPONENT=cxx -P /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/libcxx/cmake_install.cmake
-- Install configuration: "RelWithDebInfo"
-- Installing: /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/install/lib/libc++.so.1.0
-- Installing: /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/install/lib/libc++.so.1
-- Set runtime path of "/home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/install/lib/libc++.so.1.0" to "$ORIGIN"
-- Installing: /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/install/lib/libc++.so
CMake Error at cmake_install.cmake:88 (file):
  file RPATH_CHANGE could not write new RPATH:
 
    $ORIGIN
 
  to the file:
 
    /home/tcwg-buildbot/worker/linaro-armv8-libcxx-03/llvm-project/libcxx-ci/build/armv8/install/lib/libc++.so
 
  The file format is not recognized.

This is because LIBCXX_ENABLE_ABI_LINKER_SCRIPT is ON by default, the file is a text script, not an ELF so setting ORIGIN fails.
This is not caused by my change but rather is exposed by it.
I will appreciate advice on this.

The issue is a bug in CMake that will be fixed in 3.27.0 only: https://gitlab.kitware.com/cmake/cmake/-/issues/24647
Using INSTALL_RPATH version of the patch while having LIBCXX_ENABLE_ABI_LINKER_SCRIPT activated will not work until CMake 3.27 at the earliest without a rewrite of the entire linker script logic as well.

I propose that the previous version of the patch (without INSTALL_RPATH) is accepted as is for the time being to make sure this issue is unblocked.

@phosek @daltenty @lei
I propose merging Diff 502785 (https://reviews.llvm.org/D126122?id=502785) if there are no further objections.
Otherwise we'll have to wait until LLVM requires minimum version of CMake 3.27+.

ldionne added a subscriber: ldionne.Sep 7 2023, 8:25 AM

I'm coming back to this in the context of the Github PR transition cleanup.

Is there a reason why you're not setting CMAKE_INSTALL_RPATH when building your runtimes instead? I am still trying to wrap my head around the actual issue reported in https://github.com/llvm/llvm-project/issues/54955.

I'm coming back to this in the context of the Github PR transition cleanup.

Is there a reason why you're not setting CMAKE_INSTALL_RPATH when building your runtimes instead? I am still trying to wrap my head around the actual issue reported in https://github.com/llvm/llvm-project/issues/54955.

I'm not building "runtimes". I'm building a two-stage llvm with runtimes with LLVM build relying on the runtimes for stage2 build.

https://github.com/karellen/karellen-llvm/blob/master/patches/001-runpath.patch
https://github.com/karellen/karellen-llvm/blob/master/build-twostage.sh#L68

If you go through the history of this patchset, you'll see it's exclusively my issue but other types of builds as well.
I'm also quite exasperated dealing with this and as you can see I and other people moved to our own patches to deal with this problem.