This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Driver changes to support CUDA compilation on MacOS.
ClosedPublic

Authored by jlebar on Nov 16 2016, 4:21 PM.

Details

Summary

Compiling CUDA device code requires us to know the host toolchain,
because CUDA device-side compiles pull in e.g. host headers.

When we only supported Linux compilation, this worked because
CudaToolChain, which is responsible for device-side CUDA compilation,
inherited from the Linux toolchain. But in order to support MacOS,
CudaToolChain needs to take a HostToolChain pointer.
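
As a rough illustration of moving from inheritance to a host pointer, here is a minimal sketch. It is not the patch's code: every class name below (ToolChainSketch, LinuxHostSketch, DarwinHostSketch, CudaDeviceSketch) and every include path string is a hypothetical placeholder. The device toolchain borrows whichever host toolchain it is given and forwards the host's include directories.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a toolchain; NOT clang's real ToolChain class.
struct ToolChainSketch {
  virtual ~ToolChainSketch() = default;
  virtual std::vector<std::string> systemIncludeDirs() const = 0;
};

// Placeholder host toolchains; the paths are made up for illustration.
struct LinuxHostSketch : ToolChainSketch {
  std::vector<std::string> systemIncludeDirs() const override {
    return {"linux/include"};
  }
};

struct DarwinHostSketch : ToolChainSketch {
  std::vector<std::string> systemIncludeDirs() const override {
    return {"macos/include"};
  }
};

// Device-side toolchain that holds a pointer to its host instead of
// inheriting from one particular host toolchain.
class CudaDeviceSketch : public ToolChainSketch {
public:
  explicit CudaDeviceSketch(const ToolChainSketch *Host) : Host(Host) {
    assert(Host && "a CUDA device toolchain needs a host toolchain");
  }

  // Device compiles see the (placeholder) CUDA headers first, followed by
  // whatever headers the host toolchain provides.
  std::vector<std::string> systemIncludeDirs() const override {
    std::vector<std::string> Dirs = {"cuda/include"};
    std::vector<std::string> HostDirs = Host->systemIncludeDirs();
    Dirs.insert(Dirs.end(), HostDirs.begin(), HostDirs.end());
    return Dirs;
  }

private:
  const ToolChainSketch *Host; // Borrowed, not owned.
};
```

With this shape, the same device class works for any host: constructing CudaDeviceSketch with a DarwinHostSketch yields the macOS placeholder headers, and with a LinuxHostSketch the Linux ones.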

Because a CUDA toolchain now requires a host TC, we will no longer
create a CUDA toolchain from Driver::getToolChain -- you have to go
through CreateOffloadingDeviceToolChains. I am *pretty* sure this is
correct, and that previously any attempt to create a CUDA toolchain
through getToolChain() would eventually have resulted in us throwing
"error: unsupported use of NVPTX for host compilation".

In any case, hacking getToolChain to create a CUDA+host toolchain would
be wrong, because a Driver can be reused for multiple compilations,
potentially with different host TCs, and getToolChain will cache the
result, causing us to potentially use a stale host TC.
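
The stale-cache concern can be sketched as follows. This is an illustration under assumptions, not the Driver's actual code: DriverSketch, DeviceSketch, HostSketch and both lookup functions are hypothetical names. One cache is keyed on the device triple alone, the other on the (device triple, host) pair.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are real clang classes.
struct HostSketch {
  std::string Name;
};

struct DeviceSketch {
  const HostSketch *Host; // The host this device toolchain was built for.
};

class DriverSketch {
public:
  // Keyed on the device triple alone: the first host seen is cached, and
  // later requests with a different host get the stale entry back.
  const DeviceSketch &getCachedByTriple(const std::string &Triple,
                                        const HostSketch &Host) {
    auto It = ByTriple.find(Triple);
    if (It == ByTriple.end())
      It = ByTriple.emplace(Triple, DeviceSketch{&Host}).first;
    return It->second;
  }

  // Keyed on (device triple, host): each host gets its own device entry.
  const DeviceSketch &getCachedByTripleAndHost(const std::string &Triple,
                                               const HostSketch &Host) {
    auto Key = std::make_pair(Triple, &Host);
    auto It = ByTripleAndHost.find(Key);
    if (It == ByTripleAndHost.end())
      It = ByTripleAndHost.emplace(Key, DeviceSketch{&Host}).first;
    return It->second;
  }

private:
  std::map<std::string, DeviceSketch> ByTriple;
  std::map<std::pair<std::string, const HostSketch *>, DeviceSketch>
      ByTripleAndHost;
};
```

If one DriverSketch is asked for the same device triple first with a Linux host and then with a Darwin host, the triple-only cache keeps returning the Linux-bound entry, while the pair-keyed cache returns an entry bound to the Darwin host.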

So that's the main change in this patch.

In addition, we have to pull CudaInstallationDetector out of Generic_GCC
and into a top-level class. It's now used by the Generic_GCC and MachO
toolchains.

Diff Detail

Repository
rL LLVM

Event Timeline

jlebar updated this revision to Diff 78286. Nov 16 2016, 4:21 PM
jlebar retitled this revision to [CUDA] Driver changes to support CUDA compilation on MacOS.
jlebar updated this object.
jlebar added a reviewer: tra.
jlebar added subscribers: sfantao, hfinkel, rryan.

Hi Justin,

Thanks for the patch.

clang/lib/Driver/Driver.cpp
479 ↗(On Diff #78286)

I am not sure I understand why we pair the host and device toolchains in the map. The driver can be used for several compilations, but how do these compilations use different host toolchains? Can you give an example of an invocation? Maybe add it to the regression tests below.

jlebar added inline comments. Nov 17 2016, 10:52 AM
clang/lib/Driver/Driver.cpp
479 ↗(On Diff #78286)

The driver can be used for several compilations, but how do these compilations use different host toolchains?

I don't know if it's possible to do so when compiling through the command line. But if using clang as a library, you can create a Driver and use it for multiple compilations with arbitrary targets.

I am not certain we do this inside of the tree, although there are a few places where we create Driver objects, such as lib/Tooling/CompilationDatabase.cpp and lib/Tooling/Tooling.cpp. But also anyone downstream can presumably use clang this way.

tra accepted this revision. Nov 17 2016, 1:31 PM
tra edited edge metadata.

LGTM, with a couple of minor nits.

clang/lib/Driver/Driver.cpp
3650–3654 ↗(On Diff #78286)

Should there be an assert() or llvm_unreachable() to ensure that? Right now we'll happily return the default toolchain.

clang/test/Driver/cuda-detect.cu
67 ↗(On Diff #78286)

Should that be --target=i386-apple-macosx ?

This revision is now accepted and ready to land. Nov 17 2016, 1:31 PM
jlebar marked 2 inline comments as done. Nov 17 2016, 4:45 PM
jlebar added inline comments.
clang/lib/Driver/Driver.cpp
3650–3654 ↗(On Diff #78286)

Unfortunately no -- the way the code is structured now, we get the toolchain before we have a chance to raise an error.

I agree that's pretty broken...

clang/test/Driver/cuda-detect.cu
67 ↗(On Diff #78286)

Wow, good eye.

This revision was automatically updated to reflect the committed changes.
jlebar marked 2 inline comments as done.