With D94745, we no longer use the CUDA SDK to compile deviceRTLs. Therefore,
much of the CMake code in the project is useless. This patch cleans up the
unnecessary code and also drops the CUDA requirement for building NVPTX
deviceRTLs. CUDA detection is still used, however, to determine whether the
tests should be included. Auto detection of compute capability is enabled by
default and can be disabled by setting the CMake variable
LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY=OFF.
If auto detection is enabled and a valid CUDA installation is found, only the
bitcode library for the detected version is built; otherwise, all supported
variants are generated. One drawback of this patch is that we now generate 96
variants of the bitcode library, for a total of 1485 files in a clean build on a
non-CUDA system. LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES="" can be used to
disable building NVPTX deviceRTLs.
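The selection rule described above can be sketched as a small CMake script. This is only an illustration, not the code in the patch: the two LIBOMPTARGET_* names come from the summary, while every EXAMPLE_* variable, the sample list of SM values, and the assumption that an explicitly set capability list takes precedence over auto detection are hypothetical.

```cmake
# Sketch only; run in script mode: cmake -P select_capabilities.cmake
cmake_minimum_required(VERSION 3.13)

# Option name taken from the patch summary; it defaults to ON.
if(NOT DEFINED LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY)
  set(LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY ON)
endif()

# Hypothetical stand-ins for the result of CUDA detection.
if(NOT DEFINED EXAMPLE_CUDA_FOUND)
  set(EXAMPLE_CUDA_FOUND FALSE)
endif()
if(NOT DEFINED EXAMPLE_DETECTED_SM)
  set(EXAMPLE_DETECTED_SM 70)
endif()

# Hypothetical, illustrative subset; not the real list of supported variants.
set(EXAMPLE_ALL_SUPPORTED_SMS 35 37 50 60 70 80)

if(DEFINED LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES)
  # Assumption: an explicit list (including the empty string) wins.
  set(selected_sms ${LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES})
elseif(LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY AND EXAMPLE_CUDA_FOUND)
  # Auto detection is on and CUDA is valid: build only the detected version.
  set(selected_sms ${EXAMPLE_DETECTED_SM})
else()
  # Otherwise build every supported variant.
  set(selected_sms ${EXAMPLE_ALL_SUPPORTED_SMS})
endif()

if(NOT selected_sms)
  message(STATUS "Not building NVPTX deviceRTLs")
else()
  message(STATUS "Building bitcode libraries for: ${selected_sms}")
endif()
```

Passing -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES="" before -P should take the first branch with an empty list and report that nothing is built.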
Repository: rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake, line 15:
My editor can avoid all trailing spaces.
Cool, thank you. If I'm reading this right, we use the same CMake variables to pick a compiler as before, and compile every file 96 times to create 96 libraries.
That seems OK, if a bit inefficient. I believe it's only target_impl.cu that cares about the PTX version, so we could reduce the build time by compiling everything else once per SM and using llvm-link to create each output library from the common base plus the PTX-specific target_impl.
Only target_impl.cu cares about the macro, but every time we invoke the compiler, we need to pass -target-cpu sm_xx. I'm not sure it is safe to assume that the other code can be compiled with an arbitrary SM number.
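To make the proposed split concrete, here is a rough CMake script that only prints the build plan: common sources once per SM, target_impl.cu once per (CUDA version, SM) pair, then one llvm-link step per output library. It is a sketch under assumptions. The SM and CUDA version lists, the common source names, and the output library name are hypothetical; only the target_impl.cu-cuda_XX-sm_YY.bc naming pattern is taken from the build log in this thread. It still compiles the common code separately for each SM, so it does not rely on an arbitrary SM number being safe.

```cmake
# Sketch only; run in script mode: cmake -P plan_split_build.cmake
cmake_minimum_required(VERSION 3.13)

# Hypothetical, illustrative values; not the real variant lists.
set(EXAMPLE_SMS 70 80)
set(EXAMPLE_CUDA_VERSIONS 110 112)
set(EXAMPLE_COMMON_SOURCES common_a.cu common_b.cu)

foreach(sm IN LISTS EXAMPLE_SMS)
  # Sources that do not depend on the PTX version: one object per SM.
  set(common_bcs "")
  foreach(src IN LISTS EXAMPLE_COMMON_SOURCES)
    list(APPEND common_bcs "${src}-sm_${sm}.bc")
  endforeach()
  message(STATUS "compile once for sm_${sm}: ${common_bcs}")

  foreach(cuda IN LISTS EXAMPLE_CUDA_VERSIONS)
    # target_impl.cu is rebuilt for every (CUDA version, SM) pair.
    set(impl_bc "target_impl.cu-cuda_${cuda}-sm_${sm}.bc")
    # Hypothetical output name for the linked library.
    set(out "example-lib-cuda_${cuda}-sm_${sm}.bc")
    string(REPLACE ";" " " inputs "${common_bcs};${impl_bc}")
    message(STATUS "llvm-link ${inputs} -o ${out}")
  endforeach()
endforeach()
```

In a real build the printed steps would presumably become custom commands; the script form is used here only so the plan can be inspected without a toolchain.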
Spent some time trying to build this without system headers and failed to get the printf->vprintf transform to fire. Installing gcc-multilib and using the glibc headers worked out of the box.
I don't have CUDA on my system, and now the build is broken:
[ 38%] Building LLVM bitcode target_impl.cu-cuda_80-sm_80.bc
In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:14:
In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:15:
In file included from /usr/include/assert.h:35:
/usr/include/features.h:424:12: fatal error: 'sys/cdefs.h' file not found
# include <sys/cdefs.h>
          ^~~~~~~~~~~~~
1 error generated.
I think this is because nvptx is a 32 bit platform and you're compiling on a 64 bit platform. Installing gcc-multilib will fix it.
In general I don't like the dependence on host libc when compiling for a gpu (as they don't have very much in common), but need to debug through printf handling to break that link.
The idea is to compile llvm on systems that don't have cuda installed, such that the toolchain can later compile openmp code that runs on systems that do have cuda + nvptx hardware.
In particular, so that the llvm compiled for linux distributions can be installed from a package manager onto a system that does have a gpu.
However, it's starting to look like some people who are building libomptarget don't have gcc-multilib installed and also don't care about nvptx offloading.
I think we therefore need to do one of the following:
- turn this build off by default if cuda is missing (what we used to have)
- turn this build off if compilation fails, e.g. by trying to detect multilibs, or otherwise make compilation failures non-fatal
- drop the dependency on the host, which is straightforward if we disable printf and otherwise awkward
> turn this build off by default if cuda is missing (what we used to have)
Yes, let's do this and ask the packagers for releases to enable it.
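A minimal sketch of that default, assuming a hypothetical opt-in switch that packagers could set for release builds. Neither EXAMPLE_CUDA_FOUND nor EXAMPLE_FORCE_NVPTX_BCLIB is a real libomptarget variable; they are placeholders for whatever names the follow-up patch ends up using.

```cmake
# Sketch only; run in script mode: cmake -P default_off_without_cuda.cmake
cmake_minimum_required(VERSION 3.13)

# Hypothetical stand-in for the result of CUDA detection.
if(NOT DEFINED EXAMPLE_CUDA_FOUND)
  set(EXAMPLE_CUDA_FOUND FALSE)
endif()

# Hypothetical opt-in for packagers who want the bitcode libraries
# built even on a machine without CUDA.
if(NOT DEFINED EXAMPLE_FORCE_NVPTX_BCLIB)
  set(EXAMPLE_FORCE_NVPTX_BCLIB OFF)
endif()

if(EXAMPLE_CUDA_FOUND OR EXAMPLE_FORCE_NVPTX_BCLIB)
  message(STATUS "NVPTX deviceRTL bitcode build: enabled")
else()
  message(STATUS "NVPTX deviceRTL bitcode build: skipped (no CUDA, not forced)")
endif()
```

With this shape, a developer without CUDA or gcc-multilib gets a working default build, and a packager adds -DEXAMPLE_FORCE_NVPTX_BCLIB=ON (or the real equivalent) to ship the libraries.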
Aside from potentially disabling this build, I actually have multilib installed:
ii gcc-7-multilib 7.5.0-3ubuntu1~18.04 amd64 GNU C compiler (multilib support)
ii gcc-multilib 4:7.4.0-1ubuntu2.3 amd64 GNU C compiler (multilib files)
Is there something missing in the cmake files that should make use of it?
Hmm, I installed both gcc-multilib and g++-multilib, but here we actually only include C headers.