Clang can use CUDA-9.1 now, though new builtins (__hmma_m32n8k16*) are not implemented yet.
The major change is that the CUDA headers were substantially reworked,
starting in CUDA-9.0 and continuing in CUDA-9.1, which in turn required
substantial changes in the CUDA compatibility headers provided by clang.
There are two major issues:
- The CUDA SDK no longer provides declarations for libdevice functions.
- Many device-side functions have become nvcc builtins, and the CUDA headers no longer contain their implementations.
This patch changes the way CUDA headers are handled if we compile
with CUDA 9.x. Both 9.0 and 9.1 are affected.
- Clang provides its own declarations of libdevice functions.
- For CUDA-9.x, clang now provides implementations of device-side 'standard library' functions using libdevice.
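
The two points above can be illustrated with a minimal sketch. It is not
the contents of clang's headers: every name starting with demo_ or DEMO_
is hypothetical, the device qualifier is stubbed out so the code builds
as ordinary host C++, and std::sin stands in for the libdevice bitcode.
The version threshold assumes the CUDA_VERSION encoding (9000 for 9.0,
9010 for 9.1).

```cpp
#include <cmath>

// Assumption: in a real wrapper header this would expand to `__device__`.
// It is left empty here so the sketch compiles as ordinary host C++.
#define DEMO_DEVICE

// Hypothetical SDK version, using the CUDA_VERSION encoding (9010 = CUDA 9.1).
#define DEMO_CUDA_VERSION 9010

// True when the compiler-provided wrappers would be used (CUDA 9.x and later).
constexpr bool demo_uses_libdevice_wrappers(int cuda_version) {
  return cuda_version >= 9000;
}

// (1) A declaration supplied by the compiler's own headers, because the SDK
// no longer declares libdevice functions. The name is hypothetical; real
// libdevice entry points use the `__nv_` prefix.
extern "C" DEMO_DEVICE float demo_nv_sinf(float x);

// Host stand-in for the libdevice implementation that would be linked in.
extern "C" DEMO_DEVICE float demo_nv_sinf(float x) { return std::sin(x); }

#if DEMO_CUDA_VERSION >= 9000
// (2) With CUDA 9.x the 'standard library' function is implemented by the
// compiler's headers as a thin wrapper over the libdevice function.
static inline DEMO_DEVICE float demo_sinf(float x) { return demo_nv_sinf(x); }
#else
// Before CUDA 9.x the SDK headers supplied the implementation themselves;
// that case is modeled here by calling the math library directly.
static inline DEMO_DEVICE float demo_sinf(float x) { return std::sin(x); }
#endif
```

Keeping the wrapper behind a version check mirrors the intent that
compilation with CUDA-8 and earlier stays on its existing path.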
This patch should not affect compilation with CUDA-8. There may be
some observable differences for CUDA-9.0, though they are not expected
to affect functionality.
Tested: CUDA test-suite tests for all supported combinations of:
  CUDA: 7.0, 7.5, 8.0, 9.0, 9.1
  GPU: sm_20, sm_35, sm_60, sm_70
Perhaps update to CUDA 9+