Page MenuHomePhabricator

[CUDA][Windows] Final fix for bug 38811 (Step 3 of 3)
ClosedPublic

Authored by emankov on Apr 3 2019, 11:03 AM.

Details

Summary

Last fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows".

[IMPORTANT]
With that last fix, CUDA has just started being compiling by clang on Windows after nearly a year and two clang’s major releases (7 and 8).
As long as the last LLVM release, in which clang was compiling CUDA on Windows successfully, was 6.0.1, this fix and two previous have to be included into upcoming 7.1.0 and 8.0.1 releases.

[How to repro]

clang++.exe -x cuda "c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\0_Simple\simplePrintf\simplePrintf.cu" -I"c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\inc" --cuda-gpu-arch=sm_50 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0" -L"c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64" -lcudart.lib  -v

[Output]

In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:390:11: error: no matching function for call to '__isinfl'
  return (__isinfl(a) != 0);
          ^~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2662:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __isinfl(long double a))
             ^
In file included from <built-in>:1:
In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:438:11: error: no matching function for call to '__isnanl'
  return (__isnanl(a) != 0);
          ^~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2672:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __isnanl(long double a))
             ^
In file included from <built-in>:1:
In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:486:11: error: no matching function for call to '__finitel'
  return (__finitel(a) != 0);
          ^~~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2652:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __finitel(long double a))
             ^
3 errors generated when compiling for sm_50.

[Solution]
Add missing device functions' declarations and definitions.

Diff Detail

Repository
rC Clang

Event Timeline

emankov created this revision.Apr 3 2019, 11:03 AM
tra added inline comments.Apr 3 2019, 11:27 AM
clang/lib/Headers/__clang_cuda_cmath.h
81–90 ↗(On Diff #193538)

CUDA does not support long double on device side, so I would rather see a compilation error when someone attempts to use a long double, as opposed to providing inconsistent support for it.

I wonder if we could just provide declarations for these functions. This should allow math_functions.hpp to compile, but would still prevent any use of long double on device side.

emankov updated this revision to Diff 193874.Apr 5 2019, 8:11 AM

Provide only declarations for missing long double device functions to prevent any use of long double on the device side, because CUDA does not support long double on the device side.

[Testing]
{Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.9, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1}

tra accepted this revision.Apr 5 2019, 9:30 AM

Thank you for fixing this!

This revision is now accepted and ready to land.Apr 5 2019, 9:30 AM
tra added a comment.Apr 5 2019, 9:40 AM

One more thing -- perhaps the long double declarations should be put under #ifndef _MSC_VER in all the files to make the change unobservable on non-windows platforms.

Adding a comment why we only have declarations for these functions would also be helpful.

This revision was automatically updated to reflect the committed changes.

Oooh, sorry, but I've just pushed the fix. But with the following words: "Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side."

tra added a comment.Apr 5 2019, 10:07 AM

Oooh, sorry, but I've just pushed the fix. But with the following words: "Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side."

It's not a big deal at the moment -- there are no long double users in CUDA on linux yet. You can clean up in another commit.
BTW, you may want to make commit description somewhat more concise than rC357779. Including all details of reproduction, etc is way too much info which is better suited for the bug report or review.

In D60220#1456449, @tra wrote:

It's not a big deal at the moment -- there are no long double users in CUDA on linux yet. You can clean up in another commit.
BTW, you may want to make commit description somewhat more concise than rC357779. Including all details of reproduction, etc is way too much info which is better suited for the bug report or review.

Ok, thanks, will do it.