This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][NVPTX] Drop dependence on CUDA to build NVPTX `deviceRTLs`
Closed · Public

Authored by tianshilei1992 on Jan 26 2021, 11:26 AM.

Details

Summary

With D94745, we no longer use the CUDA SDK to compile deviceRTLs, so much of
the CMake code in the project is no longer needed. This patch removes that code
and also drops the CUDA requirement for building the NVPTX deviceRTLs. CUDA
detection is still used, however, to determine whether the tests should be
enabled. Auto detection of compute capability is enabled by default and can be
disabled by setting the CMake variable
LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY=OFF. If auto detection is
enabled and CUDA is available, only the bitcode library for the detected
version is built; otherwise, all supported variants are generated. One drawback
of this patch is that we now generate 96 variants of the bitcode library, for a
total of 1485 files in a clean build on a non-CUDA system.
LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES="" can be used to disable building the
NVPTX deviceRTLs.
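
As a rough sketch of how the two variables from the summary might be passed at configure time: the source and build paths below are placeholders and are not taken from the patch, so the commands are printed rather than executed.

```shell
# Placeholders (assumptions): adjust to the local checkout and build directory.
SRC_DIR="path/to/llvm-project/openmp"
BUILD_DIR="build-openmp"

# Turn off auto detection so that all supported variants are generated.
ALL_VARIANTS="-DLIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY=OFF"
# An empty capability list disables building the NVPTX deviceRTLs.
NO_NVPTX="-DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES="

# Printed instead of run, because the paths above are placeholders.
echo cmake -S "$SRC_DIR" -B "$BUILD_DIR" "$ALL_VARIANTS"
echo cmake -S "$SRC_DIR" -B "$BUILD_DIR" "$NO_NVPTX"
```

Passing `-DVAR=` with nothing after the equals sign sets the CMake cache entry to an empty string, which is what the `=""` form in the summary amounts to once the shell has removed the quotes.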

Event Timeline

tianshilei1992 requested review of this revision. Jan 26 2021, 11:26 AM
openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake
15

My editor can avoid all trailing spaces.

Cool, thank you. If I'm reading this right, we use the same CMake variables to pick a compiler as before, and compile every file 96 times to create 96 libraries.

That seems OK, if a bit inefficient. I believe it's only target_impl.cpp that cares about the PTX version, so we could reduce the build time by compiling everything else once per SM and using llvm-link to create each output library from the common base plus the PTX-specific target_impl.

tianshilei1992 added a comment. Edited Jan 26 2021, 1:03 PM

> That seems OK, if a bit inefficient. I believe it's only target_impl.cpp that cares about ptx version, so we could reduce the build time by compiling everything else once per SM and using llvm-link to create each output library from the common base plus the ptx-specific target_impl

Only target_impl.cu cares about the macro, but every time we invoke the compiler, we need to pass -target-cpu sm_xx. I'm not sure it is safe to assume that the other code can be compiled with an arbitrary SM number.
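
To put rough numbers on the trade-off in this exchange, here is a small arithmetic sketch. The file count and the split of the 96 variants into 12 SMs and 8 CUDA versions are assumptions chosen for illustration; only the product, 96, comes from the summary.

```shell
# Hypothetical inputs; only SMS * CUDA_VERSIONS = 96 is taken from the summary.
FILES=15          # assumed number of deviceRTL source files
SMS=12            # assumed number of compute capabilities
CUDA_VERSIONS=8   # assumed number of CUDA versions

VARIANTS=$((SMS * CUDA_VERSIONS))

# Current scheme: every file is compiled for every (CUDA version, SM) pair.
CURRENT_COMPILES=$((FILES * VARIANTS))

# Suggested scheme: everything except target_impl is compiled once per SM,
# and target_impl is compiled once per (CUDA version, SM) pair.
SHARED_COMPILES=$(((FILES - 1) * SMS + VARIANTS))

echo "variants=$VARIANTS current=$CURRENT_COMPILES shared=$SHARED_COMPILES"
```

Under these assumed inputs the script prints `variants=96 current=1440 shared=264`. The suggested scheme would still need one llvm-link step per variant, and it compiles the shared files once per SM, so -target-cpu sm_xx is still passed on every compiler invocation.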

JonChesterfield accepted this revision. Jan 26 2021, 4:19 PM

Spent some time trying to build this without system headers and failed to get the printf->vprintf transform to fire. Installing gcc-multilib and using the glibc headers worked out of the box.

This revision is now accepted and ready to land. Jan 26 2021, 4:19 PM
This revision was landed with ongoing or failed builds. Jan 26 2021, 5:21 PM
This revision was automatically updated to reflect the committed changes.

I don't have CUDA on my system, and now the build is broken:

[ 38%] Building LLVM bitcode target_impl.cu-cuda_80-sm_80.bc
In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:14:
In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:15:
In file included from /usr/include/assert.h:35:
/usr/include/features.h:424:12: fatal error: 'sys/cdefs.h' file not found
#  include <sys/cdefs.h>
           ^~~~~~~~~~~~~
1 error generated.

> I don't have CUDA on my system, and now the build is broken:
>
> [ 38%] Building LLVM bitcode target_impl.cu-cuda_80-sm_80.bc
> In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:14:
> In file included from /w/src/llvm.org/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:15:
> In file included from /usr/include/assert.h:35:
> /usr/include/features.h:424:12: fatal error: 'sys/cdefs.h' file not found
> #  include <sys/cdefs.h>
>            ^~~~~~~~~~~~~
> 1 error generated.

gcc-multilib is needed.

I think this is because nvptx is a 32 bit platform and you're compiling on a 64 bit platform. gcc-multilib will fix.

In general I don't like the dependence on host libc when compiling for a gpu (as they don't have very much in common), but need to debug through printf handling to break that link.

I don't have CUDA, why is this being compiled on my system to begin with?

> I don't have CUDA, why is this being compiled on my system to begin with?

Because they were not compiled before.

JonChesterfield added a comment. Edited Jan 27 2021, 11:34 AM

> I don't have CUDA, why is this being compiled on my system to begin with?

The idea is to compile llvm on systems that don't have cuda installed, such that the toolchain can later compile openmp code that runs on systems that do have cuda + nvptx hardware.

In particular, so that the llvm compiled for linux distributions can be installed from a package manager onto a system that does have a gpu.

However, it's starting to look like some people who are building libomptarget don't have gcc-multilib installed and also don't care about nvptx offloading.

I think we therefore need to do one of the following:

  • turn this build off by default if cuda is missing (what we used to have)
  • turn this build off if compilation fails, e.g. by trying to detect multilibs, or otherwise make compilation failures non-fatal
  • drop the dependency on the host, which is straightforward if we disable printf and otherwise awkward
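
The second option could start from a pre-flight check along these lines. This is only a sketch: `have_cdefs` is a hypothetical helper, and it assumes the compiler looks for `sys/cdefs.h` directly under `/usr/include`, as the failing build log suggests. A real check would more likely live in the CMake files.

```shell
# Hypothetical helper: succeeds if DIR (default /usr/include) provides
# sys/cdefs.h, the header the failing build could not find.
have_cdefs() {
  [ -f "${1:-/usr/include}/sys/cdefs.h" ]
}

if have_cdefs; then
  echo "sys/cdefs.h found: the NVPTX deviceRTLs build can be attempted"
else
  echo "sys/cdefs.h missing: skip the NVPTX deviceRTLs build"
fi
```

The optional directory argument is there so the check can be pointed at a different header root than the assumed default.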

> turn this build off by default if cuda is missing (what we used to have)

Yes, let's do this and ask the packagers for releases to enable it.

Aside from potentially disabling this build, I actually have multilib installed:

ii  gcc-7-multilib                   7.5.0-3ubuntu1~18.04                amd64        GNU C compiler (multilib support)
ii  gcc-multilib                     4:7.4.0-1ubuntu2.3                  amd64        GNU C compiler (multilib files)

Is there something missing in the cmake files that should make use of it?

> Aside from potentially disabling this build, I actually have multilib installed:
>
> ii  gcc-7-multilib                   7.5.0-3ubuntu1~18.04                amd64        GNU C compiler (multilib support)
> ii  gcc-multilib                     4:7.4.0-1ubuntu2.3                  amd64        GNU C compiler (multilib files)
>
> Is there something missing in the cmake files that should make use of it?

Hmm, I installed both gcc-multilib and g++-multilib, but here we actually only include C headers.

> Aside from potentially disabling this build, I actually have multilib installed:
>
> ii  gcc-7-multilib                   7.5.0-3ubuntu1~18.04                amd64        GNU C compiler (multilib support)
> ii  gcc-multilib                     4:7.4.0-1ubuntu2.3                  amd64        GNU C compiler (multilib files)
>
> Is there something missing in the cmake files that should make use of it?

The package should be libc-dev on Ubuntu.
