Page MenuHomePhabricator

[OpenMP][CMake] Use in-project clang as CUDA->IR compiler.
ClosedPublic

Authored by Meinersbur on Apr 25 2021, 3:52 PM.

Details

Summary

If available, use the clang that is already built in the same project as CUDA compiler unless another executable is explicitly defined. This also ensures the generated deviceRTL IR will be consistent with the version of Clang.

The change in add_subdirectory order is required to ensure that if clang is part of the project build, its target exists before openmp is included. Alternatively, LLVM_TOOL_CLANG_BUILD can could be used to determine whether clang build is enabled (Not sure how reliable it is).

This patch is required to reliably test OpenMP offloading in a buildbot without either a two-stage build (e.g. with LLVM_ENABLE_RUNTIMES) or a separately installed clang on the worker that will eventually become outdated.

See the current builder and the build with this patch applied.

Diff Detail

Event Timeline

Meinersbur created this revision.Apr 25 2021, 3:52 PM
Meinersbur requested review of this revision.Apr 25 2021, 3:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2021, 3:52 PM

I didn't get in what scenario we need this patch. Could you please expatiate it?

  • Add context
  • Update README

I didn't get in what scenario we need this patch. Could you please expatiate it?

Not having to provide another sufficiently recent clang/llvm-link binary to pass with -DLIBOMPTARGET_NVPTX_CUDA_COMPILER/-DLIBOMPTARGET_NVPTX_BC_LINKER.

D95466 seems to have removed the checks for a working Clang at configure-time. If it was still there, this patch would have disabled them since the Clang in the repository with OpenMP would be the most recent and known to support all required features. This also makes it ideal default if available.

tianshilei1992 added a comment.EditedApr 25 2021, 6:01 PM

I didn't get in what scenario we need this patch. Could you please expatiate it?

Not having to provide another sufficiently recent clang/llvm-link binary to pass with -DLIBOMPTARGET_NVPTX_CUDA_COMPILER/-DLIBOMPTARGET_NVPTX_BC_LINKER.

If OpenMP is built via LLVM_ENABLE_RUNTIMES, the two variables you mentioned here are not needed. If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

D95466 seems to have removed the checks for a working Clang at configure-time. If it was still there, this patch would have disabled them since the Clang in the repository with OpenMP would be the most recent and known to support all required features. This also makes it ideal default if available.

The code removed only checked whether a compiler can compile CUDA code. Since we already move to OpenMP style device runtime, it is not needed anymore. The problem here is, we have a requirement of minimum version of clang to work, and we lack that check.

If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

Is this documented somewhere?

Would you require the buildbot to do a stage1 build first for a LLVM_ENABLE_PROJECTS=openmp build?
I don't see a reason for a stage1 build given that using the in-project clang is as simple. OPENMP_TEST_C_COMPILER/OPENMP_TEST_CXX_COMPILER uses it as well by default.

If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

Is this documented somewhere?

You could refer to https://llvm.org/docs/CMake.html for LLVM_ENABLE_PROJECTS and https://llvm.org/docs/BuildingADistribution.html for LLVM_ENABLE_RUNTIMES.

Would you require the buildbot to do a stage1 build first for a LLVM_ENABLE_PROJECTS=openmp build?

It depends. If you want offloading feature, then it requires. If you don't want that, it doesn't. I suppose you are building LLVM with GCC. If you're building LLVM with LLVM, offloading support will also be enabled. The idea here is, the OpenMP is built using the same compiler to build LLVM. Basically the OpenMP project will be built in a same way as others. For example, for LLVM_ENABLE_PROJECTS=clang, we don't expect to build clang with the "recent built" clang, right? If you need it to be built with the recent build Clang, then LLVM_ENABLE_RUNTIMES is for that purpose.

I don't see a reason for a stage1 build given that using the in-project clang is as simple. OPENMP_TEST_C_COMPILER/OPENMP_TEST_CXX_COMPILER uses it as well by default.

Why not go with LLVM_ENABLE_RUNTIMES? It will build everything needed "all at once" (internally it is not. It first builds all parts except projects in LLVM_ENABLE_RUNTIMES, and then automatically invokes CMake configuration of those projects, set corresponding environment variables, and starts the build).

However, another potential direction can be, if we find OpenMP is in LLVM_ENABLE_PROJECTS, we "move" it to LLVM_ENABLE_RUNTIMES. This could arguably make more sense.

If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

Is this documented somewhere?

You could refer to https://llvm.org/docs/CMake.html for LLVM_ENABLE_PROJECTS and https://llvm.org/docs/BuildingADistribution.html for LLVM_ENABLE_RUNTIMES.

The first link does not mention what is intended to build with what compiler.
The second link is for creating a distributable package, which does not apply here.

Would you require the buildbot to do a stage1 build first for a LLVM_ENABLE_PROJECTS=openmp build?

It depends. If you want offloading feature, then it requires. If you don't want that, it doesn't.

It is not required, it works without a stage1 build and this patch.

For example, for LLVM_ENABLE_PROJECTS=clang, we don't expect to build clang with the "recent built" clang, right?

We expect .td files to be built with recent built tablegen.

If you need it to be built with the recent build Clang, then LLVM_ENABLE_RUNTIMES is for that purpose.
Why not go with LLVM_ENABLE_RUNTIMES?

That is another configuration: http://meinersbur.de:8011/#/builders/143 (The LLVM_ENABLE_PROJECT configuration is http://meinersbur.de:8011/#/builders/142)

I would like to test both.

There are valid use cases for using a LLVM_ENABLE_PROJECT configuration. One is that it is a single build dir for using in an IDE or generating a single CMAKE_EXPORT_COMPILE_COMMANDS for use with tools such as clangd. Ninja can also better track dependencies. libomp(target?) is also used by non-Clang compilers such as icc and msvc.

I suppose you are building LLVM with GCC. If you're building LLVM with LLVM, offloading support will also be enabled. The idea here is, the OpenMP is built using the same compiler to build LLVM. Basically the OpenMP project will be built in a same way as others.

The OpenMP host code, but CMake does not configure a device RTL compiler. I would not expect it to use the host compiler for cross-compiling to the GPU. That's what a cross-compiler is for.

Another difference is while for assembly we have a well-defined ABI that ensures that outputs from different compilers are compatible, mixing BC files from different versions of LLVM might not be such a good idea. Sure, there is the AutoUpgrader, but it's probably one of the least tested components in LLVM. Then there are also future versions or third-party forks of clang who's BC files might not at all compatible with our clang, e.g. Apple's clang.

LLVM_ENABLE_RUNTIMES also seems badly documented. When it was first introduced, it did not work for a couple of years and I only found out relatively recently that it is supposed to work. And I am not the only one. A newcomer will first try LLVM_ENABLE_PROJECT=clang;openmp and be frustrated to see that it will not work with a cryptic error message

No library 'libomptarget-nvptx-sm_61.bc' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-nvptx-bc-path to specify nvptx bitcode library.

These problems can all be avoided by using the "recent built" clang as a sensible default. You can still use another by explicitly setting LIBOMPTARGET_NVPTX_CUDA_COMPILER.

jdoerfert added inline comments.Apr 28 2021, 11:19 AM
openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
43

LLVM_ENABLE_RUNTIMES=openmp should itself set LIBOMPTARGET_NVPTX_CUDA_COMPILER instead.

That seems sensible. And then we could emit a warning/note here and above.

If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

Is this documented somewhere?

You could refer to https://llvm.org/docs/CMake.html for LLVM_ENABLE_PROJECTS and https://llvm.org/docs/BuildingADistribution.html for LLVM_ENABLE_RUNTIMES.

The first link does not mention what is intended to build with what compiler.
The second link is for creating a distributable package, which does not apply here.

The two links show the difference between LLVM_ENABLE_RUNTIMES and LLVM_ENABLE_PROJECTS. It does mention that projects in LLVM_ENABLE_RUNTIMES are built by the recent built clang.

Would you require the buildbot to do a stage1 build first for a LLVM_ENABLE_PROJECTS=openmp build?

It depends. If you want offloading feature, then it requires. If you don't want that, it doesn't.

It is not required, it works without a stage1 build and this patch.

If you need offloading feature, it is required because OpenMP offloading features requires clang to be the compiler. Yes, libomptarget.so will be generated, but no deviceRTLs will be generated so the offloading is still unusable.

For example, for LLVM_ENABLE_PROJECTS=clang, we don't expect to build clang with the "recent built" clang, right?

We expect .td files to be built with recent built tablegen.

It's chicken egg problem. Let's say we have LLVM_ENABLE_PROJECTS=clang;openmp, and we are using GCC to build the whole LLVM. Let's call the clang generated here "the clang". If you expect OpenMP to be built by "the clang", what do you expect to build "the clang"? If GCC, why is OpenMP built by "the clang"? If "the clang", where is "the clang" at the first place? And even if we only have`LLVM_ENABLE_PROJECTS=openmp, why is OpenMP in this case built by GCC?

From my perspective, to keep the semantics consistent is important, and that's the only reason that I think all projects in LLVM_ENABLE_PROJECTS should be built by the same compiler as the one to build LLVM.

There are valid use cases for using a LLVM_ENABLE_PROJECT configuration. One is that it is a single build dir for using in an IDE or generating a single CMAKE_EXPORT_COMPILE_COMMANDS for use with tools such as clangd. Ninja can also better track dependencies. libomp(target?) is also used by non-Clang compilers such as icc and msvc.

I don't doubt that. To have LLVM_ENABLE_PROJECTS for IDE is totally fine, and I think even now w/o your patch, libomp and libomptarget can work with clangd except those plugins and deviceRTLs. Plugins and deviceRTLs will still be missing in the compilation database even with your patch if I understand correctly.

I suppose you are building LLVM with GCC. If you're building LLVM with LLVM, offloading support will also be enabled. The idea here is, the OpenMP is built using the same compiler to build LLVM. Basically the OpenMP project will be built in a same way as others.

The OpenMP host code, but CMake does not configure a device RTL compiler. I would not expect it to use the host compiler for cross-compiling to the GPU. That's what a cross-compiler is for.

Sure, you can, but by default it will use what CMake detects. If you don't want it, just specify the one.

Another difference is while for assembly we have a well-defined ABI that ensures that outputs from different compilers are compatible, mixing BC files from different versions of LLVM might not be such a good idea. Sure, there is the AutoUpgrader, but it's probably one of the least tested components in LLVM. Then there are also future versions or third-party forks of clang who's BC files might not at all compatible with our clang, e.g. Apple's clang.

That's unrelated to the problem we discuss here. The key point is, you propose to let LLVM_ENABLE_PROJECTS behave same as LLVM_ENABLE_RUNTIMES, and solely for OpenMP. If so, why do we even have this two arguments at the first place? Or let me put it in this way, is there any other project (such as lld) that when it is in LLVM_ENABLE_PROJECTS along with clang, it is actually built by the clang just built? If yes, I'll be totally fine with your change.

And BTW, using LLVM_ENABLE_RUNTIMES can already avoid all problems you mentioned here.

LLVM_ENABLE_RUNTIMES also seems badly documented. When it was first introduced, it did not work for a couple of years and I only found out relatively recently that it is supposed to work. And I am not the only one. A newcomer will first try LLVM_ENABLE_PROJECT=clang;openmp and be frustrated to see that it will not work with a cryptic error message

No library 'libomptarget-nvptx-sm_61.bc' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-nvptx-bc-path to specify nvptx bitcode library.

These problems can all be avoided by using the "recent built" clang as a sensible default. You can still use another by explicitly setting LIBOMPTARGET_NVPTX_CUDA_COMPILER.

I agree with you that LLVM_ENABLE_RUNTIMES was broken before, but now it works, and it is in OpenMP Q&A (https://openmp.llvm.org/docs/SupportAndFAQ.html).

FWIW, the only problem for LLVM_ENABLE_RUNTIMES now is, if we run check-all, and if any test case in libomptarget fails, other checks will not be run. Since currently offloading x86-64 is broken, test cases for libomptarget never all pass, but that's another story and we need to fix the lit for libomptarget.

protze.joachim added inline comments.
llvm/CMakeLists.txt
971 ↗(On Diff #340408)

Are you sure that this doesn't break cmake dependencies for compiler-rt?

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
35

I think, you want to have:

set(cuda_compiler $<TARGET_FILE:clang>)

If OpenMP is built via LLVM_ENABLE_PROJECTS, it is *intentional* to build OpenMP w/o using the clang in recent build.

Is this documented somewhere?

You could refer to https://llvm.org/docs/CMake.html for LLVM_ENABLE_PROJECTS and https://llvm.org/docs/BuildingADistribution.html for LLVM_ENABLE_RUNTIMES.

The first link does not mention what is intended to build with what compiler.
The second link is for creating a distributable package, which does not apply here.

The two links show the difference between LLVM_ENABLE_RUNTIMES and LLVM_ENABLE_PROJECTS. It does mention that projects in LLVM_ENABLE_RUNTIMES are built by the recent built clang.

I think, what Michael really meant with "Is this documented somewhere?":
Is there user documentation, that LLVM_ENABLE_PROJECTS=openmp was bricked by llvm-12 and you shouldn't expect libomptarget to work properly unless moving to LLVM_ENABLE_RUNTIMES=openmp or doing two-stage build? Since this is crucial information, this should be BOLD in the OpenMP release notes.

Building LLVM_ENABLE_PROJECTS=openmp with the previous clang release used to work fine with single-stage build.

That's unrelated to the problem we discuss here. The key point is, you propose to let LLVM_ENABLE_PROJECTS behave same as LLVM_ENABLE_RUNTIMES, and solely for OpenMP. If so, why do we even have this two arguments at the first place? Or let me put it in this way, is there any other project (such as lld) that when it is in LLVM_ENABLE_PROJECTS along with clang, it is actually built by the clang just built? If yes, I'll be totally fine with your change.

OK, there seems to be a misconception. With this patch LLVM_ENABLE_PROJECTS and LLVM_ENABLE_RUNTIMES will still be different.
LLVM_ENABLE_PROJECTS will continue to compile all host code (libomp.so, libomptarget.so) with the compiler selected by CMake.
This only difference is that by default and if available, the LLVM_ENABLE_PROJECTS configuration also uses the just-built clang to cross-compile the deviceRTL to LLVM-IR. The host-compiler selected by CMake is for compiling to host-compatible assembly and unsuitable for cross-compilation for generation of LLVM bitcode.

Or let me put it in this way, is there any other project (such as lld) that when it is in LLVM_ENABLE_PROJECTS along with clang, it is actually built by the clang just built? If yes, I'll be totally fine with your change.

  1. Thread-sanitizer (compiler-rt) compiles libcxx with $<TARGET_FILE:clang> to get a thread-sanitized C++ standard library.
  2. libtooling uses just-built clang for AST introspection
  3. Polly uses just-built clang-tidy to check whether its correctly formatted.
  4. libc uses just-built clang-tidy for some linting.
llvm/CMakeLists.txt
971 ↗(On Diff #340408)

I am not (for compiler-rt or other uses such as those mentioned in the response); due to its global consequences, this change might indeed be risky. Logically, compiler-rt itself is also a runtime and hence should logically be ordered the same way. I would check this further before committing to verify.

In the summary I mentioned an alternative that would work without this change (checking LLVM_TOOL_CLANG_BUILD instead, like Polly).

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
35

Having it named clang will make CMake assume it is the clang target, and add dependency to the target and replace clang with that path of the output executable.
See https://cmake.org/cmake/help/latest/command/add_custom_target.html

If COMMAND specifies an executable target name (created by the add_executable() command), it will automatically be replaced by the location of the executable created at build time

With specifying a full path, CMake may not add the dependency to the clang target anymore (maybe it does, haven't checked) and would have to be defined manually.

However:

This target-level dependency does NOT add a file-level dependency that would cause the custom command to re-run whenever the executable is recompiled. List target names with the DEPENDS option to add such file-level dependencies.

Maybe we do want such a dependency to guarantee that the bitcode is always the latest
(But then it seems to be illogical that we allow this with pre-compiled clangs via LIBOMPTARGET_NVPTX_CUDA_COMPILER).

  • Rebase
  • Fix typos, remove risky change, fix cross-compiling
protze.joachim added inline comments.Apr 30 2021, 6:48 AM
openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
35

Thanks for the clarification regarding the add_custom_command behavior

The only case I see where this really matters is, when you recompile in an existing build dir after switching branches.
In that case it makes sense to recompile the BC files, if the compiler is newer than the existing file.
So, I'd add cuda_compiler as an explicit dependency in the custom target below.

tianshilei1992 accepted this revision.Apr 30 2021, 8:48 AM

Or let me put it in this way, is there any other project (such as lld) that when it is in LLVM_ENABLE_PROJECTS along with clang, it is actually built by the clang just built? If yes, I'll be totally fine with your change.

  1. Thread-sanitizer (compiler-rt) compiles libcxx with $<TARGET_FILE:clang> to get a thread-sanitized C++ standard library.
  2. libtooling uses just-built clang for AST introspection
  3. Polly uses just-built clang-tidy to check whether its correctly formatted.
  4. libc uses just-built clang-tidy for some linting.

Thanks for the information. Now I'm okay with the change. @protze.joachim 's comment sounds reasonable. LIBOMPTARGET_NVPTX_CUDA_COMPILER is a path to clang so it's different from the target clang. llvm-link as well. Others LGTM.

This revision is now accepted and ready to land.Apr 30 2021, 8:48 AM
  • Add file-level dependency to ensure most recent clang ("Joachim's suggestion")
  • Use $<TARGET_FILE:clang> for just-built clang. The reason is that otherwise if the user specifies LIBOMPTARGET_NVPTX_CUDA_COMPILER=clang (could mean the clang in $PATH; we cannot avoid that CMake rewrites it to in-tree clang), CMake would error out saying that a target "clang" does not exist when adding the dependency.
  • Add output message to notify user which clang is used. To make it clear to users that it might not be the host compiler determined by CMake.
This revision was landed with ongoing or failed builds.Apr 30 2021, 10:47 AM
This revision was automatically updated to reflect the committed changes.