Page MenuHomePhabricator

[NFC][libomptarget] Reduce the dependency on libelf
ClosedPublic

Authored by vzakhari on Jun 2 2021, 11:41 AM.

Details

Summary

This change-set removes libelf usage from elf_common part of the plugins. libelf is still used in x86_64 generic plugin code and in some plugins (e.g. amdgpu) - these will have to be cleaned up in separate checkins.

Diff Detail

Event Timeline

vzakhari created this revision.Jun 2 2021, 11:41 AM
vzakhari requested review of this revision.Jun 2 2021, 11:41 AM
Herald added a project: Restricted Project. · View Herald Transcript

Looks functionally correct. Always in favour of moving stuff out of headers where we can, though as an unrelated change we might want to start building plugins with flto or similar to ensure that remains free.

Either myself or @pdhaliwal will do equivalent handling for the amdgpu plugin.

There was some discussion on the mailing list about failing to link against llvm libs but I didn't see the resolution to it. Suggest we continue as planned.

openmp/libomptarget/plugins/common/elf_common/elf_common.cpp
22

A bit surprised that GETNAME handles spaces. Have you done a debug build of this to check the DP() handling compiles cleanly?

65

Maybe a helper function for this setup code? It's very similar to the above function. Thinking something along the lines of:

ELFObjectFileBase * tbd(char * bytes, size_t size); // return nullptr if it can't make one

but it's possible memory management will thwart that, in which case it may be easier callback style:

template <typename F>
int withBytesAsElf(char * bytes, size_t size, F f)
{
 ///
int rc = f(Object);
///
return rc;
}
vzakhari added inline comments.Jun 4 2021, 11:09 AM
openmp/libomptarget/plugins/common/elf_common/elf_common.cpp
22

Sure, other plugins also use TARGET_NAME with spaces. And I did verify that LIBOMPTARGET_DEBUG output worked, e.g.:

TARGET ELF Common --> Unknown ELF format or not an ELF image!

vzakhari updated this revision to Diff 349924.Jun 4 2021, 11:09 AM
vzakhari marked an inline comment as done.
This revision is now accepted and ready to land.Jun 4 2021, 12:09 PM
vzakhari updated this revision to Diff 350438.Jun 7 2021, 3:35 PM
vzakhari updated this revision to Diff 352340.Jun 15 2021, 10:37 PM

Just a rebase.

Are you expecting to land this soon?

Are you expecting to land this soon?

Yes, I am going to merge it in 24 hours.

Great, thanks! I'll (try to) find some time to start cutting the dependency out of the amdgpu plugin next week.

This revision was landed with ongoing or failed builds.Jun 16 2021, 8:34 AM
This revision was automatically updated to reflect the committed changes.

This patch seems to break the link of libomptarget.rtl.cuda.so. Here is the error on my machine.

[46/48] Linking CXX shared library libomptarget/libomptarget.rtl.x86_64.so
FAILED: libomptarget/libomptarget.rtl.x86_64.so
: && /home/shiltian/Documents/deploy/llvm/release/bin/clang++ -fPIC -Wall -Wcast-qual -Wformat-pedantic -Wimplicit-fallthrough -Wsign-compare -Wno-extra -Wno-pedantic -std=gnu++14 -g   -shared -Wl,-soname,libomptarget.rtl.x86_64.so -o libomptarget/libomptarget.rtl.x86_64.so libomptarget/plugins/common/elf_common/CMakeFiles/elf_common.dir/elf_common.cpp.o libomptarget/plugins/x86_64/CMakeFiles/omptarget.rtl.x86_64.dir/__/generic-elf-64bit/src/rtl.cpp.o  /usr/lib64/libffi.so  /usr/lib64/libelf.so  -ldl  -lpthread  -Wl,--version-script=/home/shiltian/Documents/vscode/llvm-project/openmp/libomptarget/plugins/x86_64/../exports  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMObject.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMBitReader.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMCore.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMRemarks.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMBitstreamReader.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMMCParser.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMMC.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMDebugInfoCodeView.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMTextAPI.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMBinaryFormat.a  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMSupport.a  -lrt  -lm  /usr/lib64/libz.so  /usr/lib64/libtinfo.so  /home/shiltian/Documents/deploy/llvm/release/lib/libLLVMDemangle.a  -ldl  -lpthread && :
/usr/bin/ld: libomptarget/plugins/common/elf_common/CMakeFiles/elf_common.dir/elf_common.cpp.o: relocation R_X86_64_32S against symbol `stderr@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
[47/48] Linking CXX shared library libomptarget/libomptarget.rtl.cuda.so

I checked its parent commit that there is no problem.

[AMD Official Use Only]

I am seeing similar errors after merging down the latest llvm.
I had to add a cmake option to find the include files and I am now seeing the relocation issues

I am on ubuntu 18.04

I build openmp for target in a separate step after llvm is built.

It appears the static llvm libraries were not built with fPIC, thus can't be linked into a shared library. Forcing fPIC on everything would verify that (as in stuff would work, maybe slightly slower) if the cmake controls are opaque.

I think there's a linker bug/feature in this area too, I remember running into this on a previous toolchain where the objects had all been built with fPIC, but the relocation error remained. I think that was solved by passing a linker flag or using gold, but that is going back a few years and the details are hazy.

@tianshilei1992 what is your OS?

➜  lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: Fedora
Description:    Fedora release 32 (Thirty Two)
Release:        32
Codename:       ThirtyTwo

Can you please provide the CMake command to reproduce this? I cannot reproduce this with an ordinary in-tree build.

[AMD Official Use Only]

Trunk as of this morning
work/rlieberm/mono-repo/llvm-project/openmp/libomptarget/plugins/common/elf_common/elf_common.cpp
/work/rlieberm/mono-repo/llvm-project/openmp/libomptarget/plugins/common/elf_common/elf_common.cpp:15:10: fatal error: 'llvm/BinaryFormat/Magic.h' file not found
#include "llvm/BinaryFormat/Magic.h"

^~~~~~~~~~~~~~~~~~~~~~~~~~~

1 error generated.
[182/204] Building CXX object openmp/tools/archer/CMakeFiles/archer.dir/ompt-tsan.cpp.o

Cmake command, you can probably drop the CMAKE_INSTALL_PREFIX

cmake -DCMAKE_INSTALL_PREFIX=/home/rlieberm/rocm/trunk_1.0 \

-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
-DLLVM_ENABLE_ASSERTIONS=ON                        \
-DLLVM_ENABLE_RUNTIMES="openmp" \
-DCLANG_DEFAULT_LINKER=lld                         \
../llvm -GNinja

i forgot to mention that the first problem i encountered with this patch is that elf_common.cpp has difficulties finding its LLVM header files.
i am reporting that issue. For me, the fPIC relocatables might be a downstream issue, still looking into that one.

for the header file issue, i cobbled this up
add to end of file openmp/libomptarget/plugins/common/elf_common/CMakeLists.txt
include_directories(${LIBOMPTARGET_LLVM_INCLUDE_DIRS})

and added a cmake definition kind of like:
-DLIBOMPTARGET_LLVM_INCLUDE_DIRS=/<blah>/llvm-project/llvm/include

A person more versed in CMake might have a better solution, hope so.

[AMD Official Use Only]

Trunk as of this morning
work/rlieberm/mono-repo/llvm-project/openmp/libomptarget/plugins/common/elf_common/elf_common.cpp
/work/rlieberm/mono-repo/llvm-project/openmp/libomptarget/plugins/common/elf_common/elf_common.cpp:15:10: fatal error: 'llvm/BinaryFormat/Magic.h' file not found
#include "llvm/BinaryFormat/Magic.h"

^~~~~~~~~~~~~~~~~~~~~~~~~~~

1 error generated.
[182/204] Building CXX object openmp/tools/archer/CMakeFiles/archer.dir/ompt-tsan.cpp.o

Cmake command, you can probably drop the CMAKE_INSTALL_PREFIX

cmake -DCMAKE_INSTALL_PREFIX=/home/rlieberm/rocm/trunk_1.0 \

-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
-DLLVM_ENABLE_ASSERTIONS=ON                        \
-DLLVM_ENABLE_RUNTIMES="openmp" \
-DCLANG_DEFAULT_LINKER=lld                         \
../llvm -GNinja

Thank you, @ronlieb! I am able to reproduce it. It seems that LLVM_ENABLE_RUNTIMES behavior is quite different from the -DLLVM_ENABLE_PROJECTS="openmp" that I use (so I did not see this issue before). It seems to me that in LLVM_ENABLE_RUNTIMES mode, openmp is configured/built as a standalone project using clang built by the "main" cmake/build step, but for some reason OPENMP_STANDALONE_BUILD is false. I am not sure about the right solution yet. Can you please add openmp to your -DLLVM_ENABLE_PROJECTS="clang;lld" and remove "-DLLVM_ENABLE_RUNTIMES="openmp" as a temporary workaround?

i forgot to mention that the first problem i encountered with this patch is that elf_common.cpp has difficulties finding its LLVM header files.
i am reporting that issue. For me, the fPIC relocatables might be a downstream issue, still looking into that one.

for the header file issue, i cobbled this up
add to end of file openmp/libomptarget/plugins/common/elf_common/CMakeLists.txt
include_directories(${LIBOMPTARGET_LLVM_INCLUDE_DIRS})

and added a cmake definition kind of like:
-DLIBOMPTARGET_LLVM_INCLUDE_DIRS=/<blah>/llvm-project/llvm/include

A person more versed in CMake might have a better solution, hope so.

Please check if D104535 fixes the header files issue.
As to the PIC issue, I do not see it. @tianshilei1992 please provide cmake commands so that I can reproduce the issue.

I build my LLVM with the command:

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/Documents/deploy/llvm/release -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_INSTALL_UTILS=ON -DLLVM_INCLUDE_BENCHMARKS=OFF -DLLVM_INCLUDE_TESTS=ON -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_ENABLE_ASSERTIONS=ON $HOME/Documents/vscode/llvm-project/llvm

And then have a separate configure for OpenMP:

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/Documents/deploy/openmp/debug -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=$HOME/Documents/deploy/llvm/release/bin/clang -DCMAKE_CXX_COMPILER=$HOME/Documents/deploy/llvm/release/bin/clang++ -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=75 -DOPENMP_NOT_EXECUTABLE=$HOME/Documents/deploy/llvm/release/bin/not -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DLLVM_ROOT=$HOME/Documents/deploy/llvm/release -DLIBOMPTARGET_LLVM_INCLUDE_DIRS=$HOME/Documents/vscode/llvm-project/llvm/include $HOME/Documents/vscode/llvm-project/openm

Thank you, @ronlieb! I am able to reproduce it. It seems that LLVM_ENABLE_RUNTIMES behavior is quite different from the -DLLVM_ENABLE_PROJECTS="openmp" that I use (so I did not see this issue before). It seems to me that in LLVM_ENABLE_RUNTIMES mode, openmp is configured/built as a standalone project using clang built by the "main" cmake/build step, but for some reason OPENMP_STANDALONE_BUILD is false. I am not sure about the right solution yet. Can you please add openmp to your -DLLVM_ENABLE_PROJECTS="clang;lld" and remove "-DLLVM_ENABLE_RUNTIMES="openmp" as a temporary workaround?

i added openmp to PROJECTS, and removed -DLLVM_ENABLE_RUNTIMES="openmp"

results in error

CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:358 (add_custom_target):

add_custom_target cannot create target "check-openmp" because another
target with the same name already exists.  The existing target is a custom
target created in source directory
"/work/rlieberm/mono-repo/llvm-project/openmp".  See documentation for
policy CMP0002 for more details.

Call Stack (most recent call first):

runtimes/CMakeLists.txt:226 (llvm_ExternalProject_Add)
runtimes/CMakeLists.txt:353 (runtime_default_target)

Thank you, @ronlieb! I am able to reproduce it. It seems that LLVM_ENABLE_RUNTIMES behavior is quite different from the -DLLVM_ENABLE_PROJECTS="openmp" that I use (so I did not see this issue before). It seems to me that in LLVM_ENABLE_RUNTIMES mode, openmp is configured/built as a standalone project using clang built by the "main" cmake/build step, but for some reason OPENMP_STANDALONE_BUILD is false. I am not sure about the right solution yet. Can you please add openmp to your -DLLVM_ENABLE_PROJECTS="clang;lld" and remove "-DLLVM_ENABLE_RUNTIMES="openmp" as a temporary workaround?

i added openmp to PROJECTS, and removed -DLLVM_ENABLE_RUNTIMES="openmp"

results in error

CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:358 (add_custom_target):

add_custom_target cannot create target "check-openmp" because another
target with the same name already exists.  The existing target is a custom
target created in source directory
"/work/rlieberm/mono-repo/llvm-project/openmp".  See documentation for
policy CMP0002 for more details.

Call Stack (most recent call first):

runtimes/CMakeLists.txt:226 (llvm_ExternalProject_Add)
runtimes/CMakeLists.txt:353 (runtime_default_target)

This is weird. The following command works for me: cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld;openmp" -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" -DLLVM_ENABLE_ASSERTIONS=ON -DCLANG_DEFAULT_LINKER=lld ../llvm -GNinja
It looks like your cmake invocation still uses the LLVM_ENABLE_RUNTIMES path. Did you remove cmake cache before the reconfiguration? Anyway, I believe D104535 should fix the LLVM_ENABLE_RUNTIMES build.

I build my LLVM with the command:

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/Documents/deploy/llvm/release -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_INSTALL_UTILS=ON -DLLVM_INCLUDE_BENCHMARKS=OFF -DLLVM_INCLUDE_TESTS=ON -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_ENABLE_ASSERTIONS=ON $HOME/Documents/vscode/llvm-project/llvm

And then have a separate configure for OpenMP:

cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$HOME/Documents/deploy/openmp/debug -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=$HOME/Documents/deploy/llvm/release/bin/clang -DCMAKE_CXX_COMPILER=$HOME/Documents/deploy/llvm/release/bin/clang++ -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=75 -DOPENMP_NOT_EXECUTABLE=$HOME/Documents/deploy/llvm/release/bin/not -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DLLVM_ROOT=$HOME/Documents/deploy/llvm/release -DLIBOMPTARGET_LLVM_INCLUDE_DIRS=$HOME/Documents/vscode/llvm-project/llvm/include $HOME/Documents/vscode/llvm-project/openm

Thank you, @tianshilei1992. I believe D104545 should fix the issue.

Thank you, @tianshilei1992. I believe D104545 should fix the issue.

Confirmed. Works for me.