
[driver][CUDA] Use CMake's FindCUDA as default --cuda-path.
Needs Review · Public

Authored by Meinersbur on Oct 22 2020, 10:27 AM.

Details

Summary

Add the introspection result from find_package(CUDA) as the first candidate when looking for a CUDA installation. This allows specifying the path to the CUDA installation when configuring LLVM instead of in every invocation of clang. Clang and libomptarget both use CMake's introspection to determine the CUDA environment, in particular to determine the default and supported architectures for OpenMP. libomptarget's regression tests assume that clang will find the CUDA installation itself without passing --cuda-path. Ideally, these are consistent. Making Clang also use the CMake introspection was the preferred solution from the OpenMP Multi-Company conference call over adding --cuda-path to every regression test.

To avoid making the driver regression tests dependent on the build configuration, a new flag --cuda-path-ignore-cmake is added which prevents CUDA_TOOLKIT_ROOT_DIR from being added to the list of installation candidates.
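As an illustration (the flag name is the one proposed in this patch; the paths and file name are made up), a driver test could then behave identically on any build host:

```shell
# Hypothetical invocation: drop the CMake-derived candidate so the result
# does not depend on the machine clang was configured on.
clang -v --cuda-path-ignore-cmake --cuda-path=/usr/local/cuda -c axpy.cu
```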

Some notes:

  • Both libomptarget and Clang carry a workaround for http://bugs.debian.org/882505 (CUDA on Debian and Ubuntu installed under the /usr/lib/cuda prefix instead of /usr), but implemented very differently (also see D40453 and D55588). On my Ubuntu LTS versions 18.04 and 20.04 this seems to be fixed: /usr/lib/cuda still exists, but only contains the files version.txt and libdevice.10.bc (the latter behind a symbolic-link directory), which happen to be exactly the files Clang looks for when probing a CUDA installation. I assume this is a compatibility workaround to avoid breaking Clang's own compatibility workaround. libomptarget's implementation was to overwrite CUDA_TOOLKIT_ROOT_DIR after invoking FindCUDA, which is an evil thing to do. I replaced it with a PATHS "/usr/lib/cuda" hint so FindCUDA can check that path itself on older distributions that indeed have CUDA installed there.
  • find_package(CUDA) is deprecated in favor of find_package(CUDAToolkit) since CMake 3.17. LLVM's currently required minimum is CMake 3.13.4.
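As a sketch of what this looks like at configure time (project list and path are illustrative): CUDA_TOOLKIT_ROOT_DIR is the cache variable consumed by the deprecated FindCUDA module, so a builder can point LLVM at a specific toolkit once instead of on every clang invocation.

```shell
# LLVM's minimum CMake (3.13.4) only ships the deprecated module, so the
# patch has to use find_package(CUDA); with CMake >= 3.17 the build could
# switch to find_package(CUDAToolkit) instead.
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="clang;openmp" \
  -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.2   # consumed by FindCUDA
```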

Diff Detail

Unit Tests: Failed

Time | Test
380 ms | linux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp

Event Timeline

Meinersbur created this revision. · Oct 22 2020, 10:27 AM
Herald added a project: Restricted Project. · View Herald Transcript · Oct 22 2020, 10:27 AM
Meinersbur requested review of this revision. · Oct 22 2020, 10:27 AM
tra requested changes to this revision. · Oct 22 2020, 10:41 AM

One concern I have is that the path we configure during clang's build is not necessarily the right choice for the users of the clang we build. It's likely that the resulting clang will be used on a completely different machine.
E.g. official clang builds can never provide the same CUDA path for *all* users who end up using them. Requiring the rest to use a special option to make clang work again looks like an overall usability regression to me.

I think the default should still let clang search for CUDA or require the user to provide correct CUDA path. "Use CUDA path discovered by CMake at build time" should be a non-default configuration option if/when it's needed and appropriate.

Alternatively, I would not object to appending cmake-discovered path to the default search list, or enforcing it for OpenMP builds only if that's what OpenMP needs (though I still don't see how that's going to be a good default for all/most users).

This revision now requires changes to proceed. · Oct 22 2020, 10:41 AM
In D89974#2347938, @tra wrote:

One concern I have is that the path we configure during clang's build is not necessarily the right choice for the users of the clang we build. It's likely that the resulting clang will be used on a completely different machine.
E.g. official clang builds can never provide the same CUDA path for *all* users who end up using them. Requiring the rest to use a special option to make clang work again looks like an overall usability regression to me.

Not quite; as far as I understand the change, it's only one of the searched paths.

I think the default should still let clang search for CUDA or require the user to provide correct CUDA path. "Use CUDA path discovered by CMake at build time" should be a non-default configuration option if/when it's needed and appropriate.

I agree here. It's definitely surprising to make it the *first* path, because loading another CUDA version via `module` and putting it into PATH would no longer be recognized.

clang/test/Driver/cuda-detect.cu:9

There's now a space missing before --cuda-path-ignore-env.

In D89974#2347938, @tra wrote:

I think the default should still let clang search for CUDA or require the user to provide correct CUDA path. "Use CUDA path discovered by CMake at build time" should be a non-default configuration option if/when it's needed and appropriate.

I agree here. It's definitely surprising to make it the *first* path, because loading another CUDA version via `module` and putting it into PATH would no longer be recognized.

Also note that GCC_INSTALL_PREFIX works differently (empty by default) and must be set explicitly during configure to make the build embed a path into the final executable. Maybe that's the better way to go?
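For comparison, a sketch of the GCC_INSTALL_PREFIX model under that suggestion (the path is illustrative): nothing is embedded unless the builder opts in explicitly.

```shell
# Empty by default; only an explicit setting at configure time bakes a
# path into the resulting clang binary.
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DGCC_INSTALL_PREFIX=/opt/gcc-10
```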

tra added a comment. · Oct 22 2020, 11:00 AM
In D89974#2347938, @tra wrote:

One concern I have is that the path we configure during clang's build is not necessarily the right choice for the users of the clang we build. It's likely that the resulting clang will be used on a completely different machine.
E.g. official clang builds can never provide the same CUDA path for *all* users who end up using them. Requiring the rest to use a special option to make clang work again looks like an overall usability regression to me.

Not quite; as far as I understand the change, it's only one of the searched paths.

Right. My mistake. I misread the code and the excess of --cuda-path-ignore-cmake in the tests convinced me that it replaced the default search paths.

The point that the path is unlikely to be a good default guess still stands. It will also introduce surprising behavior: clang built on a machine with CUDA installed will behave differently from clang built on a machine without CUDA. IMO, clang built from the same sources with the same build configuration options should behave the same way, regardless of what happened to be installed on the build host at the time.

Meinersbur edited the summary of this revision.

Re-add accidentally removed space

I echo @tra's concerns.
Having an option for a vendor to append/prepend a toolkit search location seems useful, but currently this seems more like a workaround for an explicit CUDA toolkit path not being passed to the testsuite.
Shortcomings in the testrunner don't seem like a reason to introduce new build-time configured default search paths into the driver.

Aside, here's an overview of CMake+CUDA:

  • FindCUDA is deprecated since 3.10 and obsolete since 3.17. Replaced by native language support and FindCUDAToolkit.
  • CMake supports Clang as a CUDA compiler since 3.18.
  • CMake supports separable compilation with Clang since 3.19.
  • CMake doesn't support CUDA with Clang on Windows. I hope to work on it for 3.20.

Ideally LLVM would use the native CMake CUDA support and the testsuite would use the CUDA toolkit found by CMake where necessary.
Users would be able to override the selected toolkit by setting CUDAToolkit_ROOT when configuring Clang.
Unfortunately this would require CMake 3.18, so this'll have to wait a bit.

If there are any questions/issues on the CMake side let me know. I maintain Clang and CUDA support in CMake.

The host's build environment already influences the compiler's defaults, e.g. D88929. libomptarget in particular uses it heavily, and I would argue that these should be consistent unless explicitly overridden.

However, I see that OpenMP offloading is not the only use case.

I think the default should still let clang search for CUDA or require the user to provide correct CUDA path. "Use CUDA path discovered by CMake at build time" should be a non-default configuration option if/when it's needed and appropriate.

Having an option for a vendor to append/prepend a toolkit search location seems useful, but currently this seems more like a workaround for an explicit CUDA toolkit path not being passed to the testsuite.
Shortcomings in the testrunner don't seem like a reason to introduce new build-time configured default search paths into the driver.

I thought that, right now, we would configure clang with a CUDA path XYZ and the clang we build would still look up the default paths instead. This requires specifying XYZ when compiling/linking as well, in order to not accidentally pick up a different CUDA installation (or none at all). Maybe I misunderstood, but if I didn't, the behavior we have right now seems wrong, or at least incomplete. I should be able to build a clang on a system with a non-default CUDA installation that works "out-of-the-box" for users because it has the CUDA path baked in. Again, I might misunderstand all this.

I thought that, right now, we would configure clang with a CUDA path XYZ

If you speak of the current state in the repository: AFAICT, Clang doesn't know about a CUDA installation that CMake may have found. There's just a list of default paths that are searched at runtime, dynamically extended by an entry derived from searching for ptxas in PATH.
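The lookup described above can be sketched roughly like this (a simplification in shell, not clang's actual C++ logic; the default prefixes and the stub ptxas are illustrative):

```shell
# Sketch of how the driver assembles its CUDA candidate list: a fixed set
# of default prefixes, dynamically extended by the installation prefix of
# whichever ptxas is found in PATH.
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-env/bin"
printf '#!/bin/sh\n' > "$tmp/cuda-env/bin/ptxas"   # stub standing in for a real ptxas
chmod +x "$tmp/cuda-env/bin/ptxas"
PATH="$tmp/cuda-env/bin:$PATH"

candidates="/usr/local/cuda /usr/local/cuda-10.2"  # illustrative defaults
ptxas=$(command -v ptxas)
# ptxas lives in <prefix>/bin, so its prefix is two directory levels up.
candidates="$(dirname "$(dirname "$ptxas")") $candidates"
echo "$candidates"
```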

tra added a comment. · Oct 22 2020, 12:15 PM

Shortcomings in the testrunner don't seem like a reason to introduce new build-time configured default search paths into the driver.

+1

The root of the problem is that we rely on the CUDA SDK as an external dependency, and we have no way to reliably predict where the user will put it.
Generally speaking, the location of that dependency is specific to a particular machine: I can have CUDA installed pretty much anywhere I want, different distributions may install it anywhere they want, and there may be multiple versions installed, leaving clang no basis to guess which one the user intends to use.

I do not see a way to make it work reliably without explicit input from the end user. Making the heuristic dependent on clang's build environment is not improving the situation, IMO. It may help some users, but it will disrupt others and it will introduce more confusion about where specific clang binary searches for CUDA installations.

CUDA path is sort of a global configuration parameter for all CUDA compilations. Perhaps we should consider allowing the user to specify a CUDA search path candidate via environment variable. This should allow transparently overriding preferred CUDA path without having to adjust all builds. I can't say I like it, but it seems to be the least bad way (that I can think of ATM) to address the dependency on something that only the end user would know for sure.

In D89974#2348176, @tra wrote:

CUDA path is sort of a global configuration parameter for all CUDA compilations. Perhaps we should consider allowing the user to specify a CUDA search path candidate via environment variable. This should allow transparently overriding preferred CUDA path without having to adjust all builds. I can't say I like it, but it seems to be the least bad way (that I can think of ATM) to address the dependency on something that only the end user would know for sure.

Don't we already have this via PATH? At least that was my motivation back then and it worked without problems.

tra added a comment. · Oct 22 2020, 12:35 PM

Don't we already have this via PATH? At least that was my motivation back then and it worked without problems.

We do, and it should indeed work for the tests. Setting an environment variable should be a reasonably low burden if/when the default path is not good enough.

PATH does suffer from the problem that the ptxas location is not necessarily the right CUDA location. I.e., sometimes it's found via /usr/bin, which I cannot drop from PATH, so one should be careful to *prepend* the desired ptxas location.
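A minimal demonstration of that caveat, with stub ptxas scripts standing in for real installations: the copy whose bin/ directory comes first in PATH wins, so the desired one must be prepended.

```shell
# Two fake CUDA installs; each stub ptxas just reports its version.
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-10.1/bin" "$tmp/cuda-9.0/bin"
printf '#!/bin/sh\necho 10.1\n' > "$tmp/cuda-10.1/bin/ptxas"
printf '#!/bin/sh\necho 9.0\n'  > "$tmp/cuda-9.0/bin/ptxas"
chmod +x "$tmp/cuda-10.1/bin/ptxas" "$tmp/cuda-9.0/bin/ptxas"

PATH="$tmp/cuda-9.0/bin:$PATH"    # e.g. the system-provided copy
PATH="$tmp/cuda-10.1/bin:$PATH"   # prepended last, so searched first
found="$(ptxas)"                  # resolves to the 10.1 stub
echo "$found"
```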

rnk added a comment. · Oct 22 2020, 1:05 PM

My opinion carries less weight since I don't use CUDA, but I agree with everything Art said. Here's some input, if it helps.

I like the PATH search for ptxas as a way to make things work out of the box as often as possible.

I don't like the idea of CMake auto-detecting CUDA for the reasons Art listed. It's a great way to implicitly leak more details about the build environment to the user.

An explicit CMake option to override the CUDA installation path seems fine. A distribution could use this if they vend the CUDA SDK somewhere standard, for example.

Lastly, if those things don't work, there is the CCC_OVERRIDE_OPTIONS environment variable, so the user always has that escape hatch. A separate environment variable would also be fine. It reminds me of PYTHON_HOME, with similar drawbacks.

tra added a comment. · Oct 22 2020, 1:23 PM
In D89974#2348310, @rnk wrote:

Lastly, if those things don't work, there is the CCC_OVERRIDE_OPTIONS environment variable, so the user always has that escape hatch. A separate environment variable would also be fine. It reminds me of PYTHON_HOME, with similar drawbacks.

Oh. I didn't know about CCC_OVERRIDE_OPTIONS. It appears it does a bit more than just adding extra options. Is it documented somewhere?

Also, clang appears to use the SDKROOT environment variable for specifying the Xcode path on macOS, presumably for reasons similar to what we're dealing with here with CUDA.
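For reference, a sketch of the CCC_OVERRIDE_OPTIONS escape hatch mentioned above (my understanding of the prefix syntax; the path is made up):

```shell
# '^' inserts the option at the start of every driver invocation
# ('+' would append instead); no rebuild or wrapper script needed.
export CCC_OVERRIDE_OPTIONS="^--cuda-path=/opt/cuda-11.0"
clang -c kernel.cu   # effectively: clang --cuda-path=/opt/cuda-11.0 -c kernel.cu
```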

sylvestre.ledru resigned from this revision. · Oct 31 2020, 4:03 AM