Download Raw Diff

Details

Reviewers

jholewinski
eliben
echristo

Commits

rGc29db8441959: [CUDA] Added a wrapper header for inclusion of stock CUDA headers.
rC253388: [CUDA] Added a wrapper header for inclusion of stock CUDA headers.
rL253388: [CUDA] Added a wrapper header for inclusion of stock CUDA headers.

Summary

Header files that come with CUDA are assuming split host/device compilation and are not usable by clang out of the box.
With a bit of preprocessor magic it's possible to twist them into a form usable by clang after D13170 and D13144 land.

Diff Detail

Event Timeline

tra updated this revision to Diff 35739.Sep 25 2015, 10:58 AM

tra retitled this revision from to [CUDA] Added a wrapper header for inclusion of stock CUDA headers..

tra updated this object.

tra added reviewers: echristo, eliben, jholewinski.

tra added a subscriber: cfe-commits.

Fixed typos and whitespace nits.
use #pragma push_macro for CUDACC_RTC, too.

eliben added inline comments.Sep 25 2015, 12:14 PM

lib/Headers/clang_cuda_support.h
54	If this includes CUDA headers, maybe you can #error out if the CUDA version isn't good?

Include cuda.h and #error if we see wrong CUDA_VERSION.

lgtm

This revision is now accepted and ready to land.Sep 25 2015, 1:23 PM

Bikeshed: it's part of the clang headers, do we really need "clang" in the header name?

-eric

The (vague) idea was to make clear that the header is *not* part of cuda
distribution.

That said, the file could use a better name.

Do any of these sound better?

fix_cuda_headers.h
adapt_cuda_headers.h
cuda_shim.h

--Artem

Why not just call it cuda.h and use #include next for it and then fix it up?

-eric

cuda_runtime.h may be a better choice. nvcc -includes it during both host
and device compilation and this wrapper file is intended to serve a similar
purpose and will probably end up being -included by cc1 in the end.

--Artem

Renamed wrapper to cuda_runtime.h
Similarly to nvcc, automatically add "-include cuda_runtime.h" to CC1 invocations unless -nocudainc is specified.

Changed header wrapping strategy. Previous version was attempting to
make CUDA headers work for host/device compilations separately. In the
end host and device compilations ended up with different view of
CUDA-provided functions. While it mostly worked, that is not what we
really want. What we want is to have identical view of device-specific
functions in both cases and let function overloading handle name clashes
between host and device functions.

This wrapper now always includes CUDA headers exactly the same way during
host and device compilation passes and produces identical preprocessed
content during host and device side compilation for sm_35 GPUs. Device
compilation passes for older GPUs will see a smaller subset of device
functions supported by particular GPU.

As a bonus this wrapper works with CUDA 7.5 now.

Herald added a subscriber: klimek. · View Herald TranscriptOct 20 2015, 12:44 PM

I'm ignoring the content of the header, but this seems to be a not terrible way to do things. I gather that cuda_runtime.h is something that's typically included by the driver by nvidia and not the client?

Also, tests?

-eric

In D13171#272397, @echristo wrote:

I'm ignoring the content of the header, but this seems to be a not terrible way to do things. I gather that cuda_runtime.h is something that's typically included by the driver by nvidia and not the client?

Correct. cuda_runtime.h (and all it pulls in) is -include'd under the hood by nvcc.

Also, tests?

I'll add a test to verify that "-include cuda_runtime.h" shows up on cc1 command line where/when it's expected.

What would be a good way to test the wrapper itself within clang tree without real CUDA headers?

I've done fair amount of manual testing outside of clang source tree.

manual comparison of preprocessed output from cuda_runtime.h between host and device passes.
compiled 39 out of 46 thrust examples and verified that they produce output identical to nvcc-compiled binaries.

In D13171#272441, @tra wrote:

In D13171#272397, @echristo wrote:

I'm ignoring the content of the header, but this seems to be a not terrible way to do things. I gather that cuda_runtime.h is something that's typically included by the driver by nvidia and not the client?

Correct. cuda_runtime.h (and all it pulls in) is -include'd under the hood by nvcc.

Also, tests?

I'll add a test to verify that "-include cuda_runtime.h" shows up on cc1 command line where/when it's expected.

Ick.

What would be a good way to test the wrapper itself within clang tree without real CUDA headers?

Hrm. Maybe a set of inputs that stub out things? Hard really.

I've done fair amount of manual testing outside of clang source tree.

manual comparison of preprocessed output from cuda_runtime.h between host and device passes.

compiled 39 out of 46 thrust examples and verified that they produce output identical to nvcc-compiled binaries.

Cool.

LGTM with those changes and give a thought at how to test this in tree better.

Thanks!

-eric

Added test cases for force-including of cuda_runtime.h
Tweaked inclusion of one header due to use of default arguments.

tra mentioned this in D14370: [doc] Compile CUDA with LLVM.Nov 5 2015, 11:39 AM

jingyue added a subscriber: jingyue.Nov 6 2015, 8:58 AM

Closed by commit rL253388: [CUDA] Added a wrapper header for inclusion of stock CUDA headers. (authored by tra). · Explain WhyNov 17 2015, 2:31 PM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Added a wrapper header for inclusion of stock CUDA headers.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 35760

lib/Headers/CMakeLists.txt

lib/Headers/clang_cuda_support.h

This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Added a wrapper header for inclusion of stock CUDA headers.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 35760

lib/Headers/CMakeLists.txt

lib/Headers/clang_cuda_support.h

[CUDA] Added a wrapper header for inclusion of stock CUDA headers.
ClosedPublic