This is an archive of the discontinued LLVM Phabricator instance.

[External][CUDA] Add option to test with the new driver
Closed, Public

Authored by jhuber6 on May 23 2022, 11:53 AM.

Details

Summary

Upstream clang supports RDC-mode compilation through the new driver.
This patch adds an option that allows the user to configure the compilation
job to use the new driver for testing. This will allow us to optionally test the
new driver in the buildbots to track changes as we develop.

Event Timeline

jhuber6 created this revision. May 23 2022, 11:53 AM
Herald added a project: Restricted Project. May 23 2022, 11:53 AM
Herald added subscribers: mattd, mgorny.
jhuber6 requested review of this revision. May 23 2022, 11:53 AM
jhuber6 edited the summary of this revision. May 23 2022, 12:05 PM
tra accepted this revision. May 23 2022, 2:06 PM
tra added inline comments.
External/CUDA/CMakeLists.txt
41

You may as well enable it by default, if you want it tested by the CUDA bots. It's easier than updating the bots.

This revision is now accepted and ready to land. May 23 2022, 2:06 PM
jhuber6 added inline comments. May 23 2022, 2:11 PM
External/CUDA/CMakeLists.txt
41

True. In the worst case I break them and need to revert. I'll probably go ahead and do that.

This revision was automatically updated to reflect the committed changes.
tra added inline comments. Nov 30 2022, 4:37 PM
External/CUDA/CMakeLists.txt
302

Where does clang-linker-wrapper look for nvlink? It looks like it may be picking it up from the wrong place. I've just enabled CUDA-11.8 on the build bot and ran into this failure:

https://lab.llvm.org/buildbot/#/builders/46/builds/39560

nvlink fatal   : Input file '/tmp/complex.cu-nvptx64-nvidia-cuda-sm_75-82422d.cubin' newer than toolkit (118 vs 102)
/buildbot/cuda-t4-0/work/clang-cuda-t4/clang/bin/clang-linker-wrapper: error: 'nvlink' failed

It seems to be picking it up from $PATH, and ends up with the wrong one.

If I remove nvlink from PATH, then the build fails with clang-linker-wrapper: error: Unable to find 'nvlink' in path, even though nvlink is present in the CUDA/bin directories for all CUDA versions on the bot.

I'll disable the new driver test on CUDA bots for now, until this is sorted out.

tra added inline comments. Nov 30 2022, 4:53 PM
External/CUDA/CMakeLists.txt
301

Oh. I've just realized that this has effectively switched all tests to use the new driver. The good news is that it worked, though I don't understand how. Perhaps we were previously picking up a more recent nvlink than we do now.

tra added inline comments. Nov 30 2022, 4:59 PM
External/CUDA/CMakeLists.txt
301

Yup. Due to a quirk of CUDA SDK installation, we just happened to have the latest version sticking its binaries in the PATH and that worked.

For now I'll leave the CUDA bots running with the old driver, until we figure out what to do about picking the right nvlink. The bots test with multiple CUDA versions, and I do want to use the matching nvlink for tools built with a particular CUDA version.

jhuber6 added inline comments. Nov 30 2022, 5:27 PM
External/CUDA/CMakeLists.txt
302

So, if it goes through clang, it will use the nvlink that Clang knows about, which is found through the CudaInstallationDetector when --cuda-path= is passed on the command line. If that was not provided, it searches for the binaries directly. Maybe we are using a two-step compilation, and only the first step is passed the CUDA path?
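
Editorial illustration of the lookup order described in this comment. This is a minimal Python sketch, not the clang-linker-wrapper implementation: the function name find_nvlink and its parameters are hypothetical, and it only assumes the behaviour stated in the thread (use the nvlink from an explicitly supplied CUDA installation if there is one, otherwise search PATH).

```python
import os
import shutil
from typing import Optional


def find_nvlink(cuda_path: Optional[str] = None,
                env_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: locate nvlink, preferring an explicit CUDA root.

    cuda_path stands in for the value of --cuda-path=; env_path stands in for
    the PATH search string (None means "use the process's PATH").
    """
    exe = "nvlink.exe" if os.name == "nt" else "nvlink"

    # Step 1: an explicitly supplied CUDA installation wins.
    if cuda_path:
        candidate = os.path.join(cuda_path, "bin", exe)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate

    # Step 2: fall back to a plain PATH search. This is the step that can
    # return an nvlink from a different CUDA version than the one used to
    # build the inputs.
    return shutil.which("nvlink", path=env_path)
```

Under these assumptions, a link step invoked without the CUDA path falls through to step 2 and takes whichever nvlink comes first on PATH, which would be consistent with the "118 vs 102" mismatch reported above on a bot with several CUDA versions installed.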