Page MenuHomePhabricator

[OpenMP] Avoid unintentional use of host compiler as bclib compiler.
AbandonedPublic

Authored by Meinersbur on Apr 30 2021, 1:50 PM.

Details

Summary

Before this patch, libomptarget assumed that if the host compiler is clang, then it is also suitable for compile CUDA to LLVM-IR. The intention was that if libomptarget is compiled using LLVM_ENABLE_RUMTIMES=openmp, the host compiler will be clang from the same propository.

However, the host compiler may also be clang when compiling as a LLVM_ENABLE_PROJECTS=openmp or standalone build if the user explicitly set the tool chain or, clang is the system's default compiler (MacOS, most BSDs). Various vendor compilers such as Apple's clang, Intel's dpcpp/icx, IBM's xlclang, etc. may be identified as 'clang' by CMake. However, these potentially use an incompatible IR, or stripped CUDA support, with unpredictable results.

Fix by explicitly set LIBOMPTARGET_NVPTX_CUDA_COMPILER (and LIBOMPTARGET_NVPTX_BC_LINKER) when using LLVM_ENABLE_RUMTIMES=openmp and do not try to use the host compiler.

Implementation of discussion in D101265. Compilation result: http://meinersbur.de:8011/#/builders/143/builds/375

Diff Detail

Event Timeline

Meinersbur created this revision.Apr 30 2021, 1:50 PM
Meinersbur requested review of this revision.Apr 30 2021, 1:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2021, 1:50 PM
tianshilei1992 requested changes to this revision.Apr 30 2021, 1:58 PM
tianshilei1992 added inline comments.
llvm/runtimes/CMakeLists.txt
238

I don't think it's a good idea to "pollute" LLVM CMake files for this purpose. There are plenty of ways to tell whether OpenMP is built via LLVM_ENABLE_RUNTIMES. I'd set the two CMake variables in OpenMP by checking whether we're in LLVM_ENABLE_RUNTIMES.

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
36

Removing this can cause issue if I compile OpenMP standalone. We cannot assume people all compile OpenMP along with LLVM either with LLVM_ENABLE_RUNTIMES or LLVM_ENABLE_PROJECTS.
Like I said in your previous patch, we need a mechanism to check whether the provided clang is qualified.

This revision now requires changes to proceed.Apr 30 2021, 1:58 PM
Meinersbur added inline comments.May 3 2021, 7:52 AM
llvm/runtimes/CMakeLists.txt
238

Adding project-specific options is already done for COMPILER_RT. This seems to be the established approach.

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
36

As mentioned in the summary, automatically using the host compiler may result in unpredictable LLVM-IR that e.g. include vendor extensions. That is, the C/C++ to host-assembly compiler is just the wrong tool for CUDA-to-LLVM-IR compilation.

I think the only clang we would want to support is the clang from that same git commit.

Meinersbur added inline comments.May 3 2021, 8:04 AM
llvm/runtimes/CMakeLists.txt
238

Btw, PASSTHROUGH_PREFIXES, will pass all OPENMP_ options (even those that are left to their defaults) to the nested CMake configuration. It will not do so with LIBOMPTARGET_NVPTX_CUDA_COMPILER because the prefix for the openmp project is assumed to be OPENMP_ and LIBOMPTARGET_ is missing.

tianshilei1992 added inline comments.May 3 2021, 11:22 AM
llvm/runtimes/CMakeLists.txt
238

I mean, you could do something in CUDA's CMake file in the following way (pseudo code):

if (BUILD_VIA_RUNTIME)
set(LIBOMPTARGET_NVPTX_CUDA_COMPILER target clang)
set(BC_LINKER target llvm-link)
endif()

where BUILD_VIA_RUNTIME can be detected. You don't have to do that in LLVM's CMake file.

Btw, PASSTHROUGH_PREFIXES, will pass all OPENMP_ options (even those that are left to their defaults) to the nested CMake configuration. It will not do so with LIBOMPTARGET_NVPTX_CUDA_COMPILER because the prefix for the openmp project is assumed to be OPENMP_ and LIBOMPTARGET_ is missing.

This is not true. See line 178 at llvm/runtimes/CMakeLists.txt.

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
36

That doesn't make sense. It's fine that if building in-tree, use the one from same commit, but it should never be the only way. Again (I have said that for three times), we need to check if the clang is qualified. Users can of course set LIBOMPTARGET_NVPTX_CUDA_COMPILER to a random clang. This change doesn't solve the root problem: is the compiler qualified? What we really need is, no matter where the compiler is from (it can be the host compiler detected by CMake, it can also be the one specified by users), check its qualification before use it.

Meinersbur added inline comments.May 5 2021, 8:19 AM
llvm/runtimes/CMakeLists.txt
238

How would BUILD_VIA_RUNTIME be detected? target clang/target llvm-link are not targets in the runtimes build CMake configurations.

I think setting LIBOMPTARGET_NVPTX_CUDA_COMPILER is exactly the right option because this is the variable a user would need to set if they want a standalone build without using LLVM_ENABLE_RUNTIMES. Not additional magic needed.

This is not true. See line 178 at llvm/runtimes/CMakeLists.txt.

Correct, did not see that. Makes me wonder why this kind if pollution OK, but setting LIBOMPTARGET_NVPTX_CUDA_COMPILER is not.

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
36

The user sets LIBOMPTARGET_NVPTX_CUDA_COMPILER manually, it is their responsibility. If they do not specify LIBOMPTARGET_NVPTX_CUDA_COMPILER, but gets a broken build due to the default being inadequate we cannot blame the user.

The only adequacy test I can think of that cannot result in a broken build is to execute clang --version and compare it the commit hash.

tianshilei1992 added inline comments.May 5 2021, 10:26 AM
llvm/runtimes/CMakeLists.txt
238

How would BUILD_VIA_RUNTIME be detected? target clang/target llvm-link are not targets in the runtimes build CMake configurations.

You could refer to line 7 at openmp/CMakeLists.txt. That's what we used to determine if it is standalone build. If OPENMP_STANDALONE_BUILD is false, then we could set the two variables accordingly because in that case, it is either in runtime build, or project build, where in both cases you wanna use the in-tree build clang.

Makes me wonder why this kind if pollution OK, but setting LIBOMPTARGET_NVPTX_CUDA_COMPILER is not.

That changes apply for all OpenMP arguments, libomp, libomptarget, plugins, device runtimes, tests, you name it. It's not just for the one single NVIDIA device runtime. What's more important, we obviously have a more elegant way to do it in project's own directory to avoid leaking those project specific arguments all over the place.

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
36

The only adequacy test I can think of that cannot result in a broken build is to execute clang --version and compare it the commit hash.

That is not sufficient. What if users build LLVM w/o expected targets? You can still get a "valid" clang (in terms of version) and use it for NVPTX, but in fact it doesn't support it at all. We did encounter this once when AMD offloading was enabled by default previously.

We need to check the eligibility before including corresponding directories in CMake. That's the ultimate solution. We know our device runtime is using X, Y, Z features. We check whether the compiler can work properly. That's how autoconfig works. If in the future new features are being used, we simply update the checker.

josemonsalve2 added inline comments.
llvm/runtimes/CMakeLists.txt
238

I know this is been a while since this happened. But I was looking at this code and found this revision.

I just wanted to add. Even though they have compiler-rt specific code, they recognize it is something that would need to be fixed. see llvm-project/runtimes/CMakeLists.txt:110 or nearby. The comment says:

# TODO: compiler-rt has to use standalone build for now. We tried to remove
  # this in D57992 but this broke the build because compiler-rt assumes that
  # LLVM and Clang are configured in the same build to set up dependencies. We
  # should clean up the compiler-rt build and remove this eventually.
Meinersbur added inline comments.Jul 29 2021, 12:37 PM
llvm/runtimes/CMakeLists.txt
238

I still have the opinion that we should not assume that CMAKE_CXX_COMPILER (C++-to-.o) is also the compiler to cross-compile to .bc even more than why CMAKE_C_COMPILER is also different from CMAKE_CXX_COMPILER, CMAKE_CUDA_COMPILER or CMAKE_CUDA_HOST_COMPILER (even though they probably are all able to compile C-code to the host target), and you need a different CMAKE_C_COMPILER when cross-compiling.

What CMake variable specifies what compiler to use as "OpenMP/CUDA-to-bc" compiler is not libomptarget-specific. It could also be named CMAKE_OPENMP_<TARGET>_CROSSCOMPILER.

Alternatively, the deviceRTL could be built in another CMake build configuration that is configured to cross-compiler to NVPTX (in the same sense that compiler-rt is configured to cross-compile to the architecture(s) that just-built Clang targets), but this requires much more effort.

Meinersbur abandoned this revision.Oct 29 2021, 6:15 AM

Superseded by D111983 which stops using CMAKE_C_COMPILER as device compiler (using clang in LLVM_DIR instead)