This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][Windows] Final fix for bug 38811 (Step 3 of 3)
ClosedPublic

Authored by emankov on Apr 3 2019, 11:03 AM.

Details

Summary

Last fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows".

[IMPORTANT]
With that last fix, CUDA has just started being compiling by clang on Windows after nearly a year and two clang’s major releases (7 and 8).
As long as the last LLVM release, in which clang was compiling CUDA on Windows successfully, was 6.0.1, this fix and two previous have to be included into upcoming 7.1.0 and 8.0.1 releases.

[How to repro]

clang++.exe -x cuda "c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\0_Simple\simplePrintf\simplePrintf.cu" -I"c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\inc" --cuda-gpu-arch=sm_50 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0" -L"c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64" -lcudart.lib  -v

[Output]

In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:390:11: error: no matching function for call to '__isinfl'
  return (__isinfl(a) != 0);
          ^~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2662:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __isinfl(long double a))
             ^
In file included from <built-in>:1:
In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:438:11: error: no matching function for call to '__isnanl'
  return (__isnanl(a) != 0);
          ^~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2672:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __isnanl(long double a))
             ^
In file included from <built-in>:1:
In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:486:11: error: no matching function for call to '__finitel'
  return (__finitel(a) != 0);
          ^~~~~~~~~
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2652:14: note: candidate function not viable: call to __host__ function from __device__ function
__func__(int __finitel(long double a))
             ^
3 errors generated when compiling for sm_50.

[Solution]
Add missing device functions' declarations and definitions.

Diff Detail

Repository
rC Clang

Event Timeline

emankov created this revision.Apr 3 2019, 11:03 AM
tra added inline comments.Apr 3 2019, 11:27 AM
clang/lib/Headers/__clang_cuda_cmath.h
81–90 ↗(On Diff #193538)

CUDA does not support long double on device side, so I would rather see a compilation error when someone attempts to use a long double, as opposed to providing inconsistent support for it.

I wonder if we could just provide declarations for these functions. This should allow math_functions.hpp to compile, but would still prevent any use of long double on device side.

emankov updated this revision to Diff 193874.Apr 5 2019, 8:11 AM

Provide only declarations for missing long double device functions to prevent any use of long double on the device side, because CUDA does not support long double on the device side.

[Testing]
{Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.9, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1}

tra accepted this revision.Apr 5 2019, 9:30 AM

Thank you for fixing this!

This revision is now accepted and ready to land.Apr 5 2019, 9:30 AM
tra added a comment.Apr 5 2019, 9:40 AM

One more thing -- perhaps the long double declarations should be put under #ifndef _MSC_VER in all the files to make the change unobservable on non-windows platforms.

Adding a comment why we only have declarations for these functions would also be helpful.

This revision was automatically updated to reflect the committed changes.

Oooh, sorry, but I've just pushed the fix. But with the following words: "Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side."

tra added a comment.Apr 5 2019, 10:07 AM

Oooh, sorry, but I've just pushed the fix. But with the following words: "Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side."

It's not a big deal at the moment -- there are no long double users in CUDA on linux yet. You can clean up in another commit.
BTW, you may want to make commit description somewhat more concise than rC357779. Including all details of reproduction, etc is way too much info which is better suited for the bug report or review.

In D60220#1456449, @tra wrote:

It's not a big deal at the moment -- there are no long double users in CUDA on linux yet. You can clean up in another commit.
BTW, you may want to make commit description somewhat more concise than rC357779. Including all details of reproduction, etc is way too much info which is better suited for the bug report or review.

Ok, thanks, will do it.

I'm seeing quite similar errors on FreeBSD with Clang 8 and 9:

In file included from <built-in>:1:
In file included from /usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:2910:7: error: no matching function for call to '__isnan'
  if (__isnan(a)) {
      ^~~~~~~
/usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
               ^
In file included from <built-in>:1:
In file included from /usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:2934:7: error: no matching function for call to '__isnan'
  if (__isnan(a)) {
      ^~~~~~~
/usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
               ^
In file included from <built-in>:1:
In file included from /usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:2960:7: error: no matching function for call to '__isnan'
  if (__isnan(a)) {
      ^~~~~~~
/usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
               ^
In file included from <built-in>:1:
In file included from /usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:3090:7: error: no matching function for call to '__isnan'
  if (__isnan(a)) {
      ^~~~~~~
/usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
               ^
In file included from <built-in>:1:
In file included from /usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:3196:7: error: no matching function for call to '__isnan'
  if (__isnan(a)) {
      ^~~~~~~
/usr/local/llvm90/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
               ^

Any idea how to fix this?

I'm seeing quite similar errors on FreeBSD with Clang 8 and 9:
Any idea how to fix this?

It looks like CUDA doesn't support double argument for device function __isnan on FreeBSD.

  1. I'd look at LLVM trunk (10.0.0svn).
  2. If the issue were not eliminated in the trunk, I'd make a change for FreeBSD similar to https://reviews.llvm.org/rL358654 to provide declarations for that function, allowing math_functions.hpp to compile, but with preventing from any use of it on the device side.
tra added a comment.Oct 28 2019, 10:17 AM

I'm seeing quite similar errors on FreeBSD with Clang 8 and 9:
Any idea how to fix this?

It looks like CUDA doesn't support double argument for device function __isnan on FreeBSD.

It's actually the opposite -- FreeBSD does not provide *host*-side __isnan(double) -- the error complains that it's the host code that tried to use __isnan and failed when overload resolution produced a device variant.

FreeBSD does have host-side implementation of isnan, only it's apparently called __inline_isnan() https://github.com/freebsd/freebsd/blob/master/lib/msun/src/math.h#L197
I think the right thing to do here would probably be to define a wrapper __isnan(double) which would call it.

  1. I'd look at LLVM trunk (10.0.0svn).
  2. If the issue were not eliminated in the trunk, I'd make a change for FreeBSD similar to https://reviews.llvm.org/rL358654 to provide declarations for that function, allowing math_functions.hpp to compile, but with preventing from any use of it on the device side.

It looks like CUDA doesn't support double argument for device function __isnan on FreeBSD.

It's actually the opposite -- FreeBSD does not provide *host*-side __isnan(double) -- the error complains that it's the host code that tried to use __isnan and failed when overload resolution produced a device variant.

Sure, you right, quite opposite: call to __device__ function from __host__ function. I agree: the wrapper for the existing __inline_isnan() is the solution.

dim added a subscriber: dim.Oct 29 2019, 10:14 AM

Hm, I would really say that __isnan and the other __ prefixed functions are Linuxisms, or more accurately, glibc-isms. They also don't exist on e.g. macOS:

$ cat check-isnan.cpp
#include <math.h>

int check_isnan(double d)
{
  return ::__isnan(d);
}

$ clang -v
Apple clang version 11.0.0 (clang-1100.0.33.8)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

$ clang -c check-isnan.cpp
check-isnan.cpp:5:12: error: no member named '__isnan' in the global namespace; did you mean 'isnan'?
  return ::__isnan(d);
         ~~^~~~~~~
           isnan
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/math.h:519:1: note: 'isnan' declared here
isnan(_A1 __lcpp_x) _NOEXCEPT
^
1 error generated.

Why can't the regular isnan be used instead? Or is this a CUDA-specific requirement? (Apologies, but I know next to nothing about CUDA :) )

tra added a comment.Oct 29 2019, 10:30 AM
In D60220#1725633, @dim wrote:

Hm, I would really say that __isnan and the other __ prefixed functions are Linuxisms, or more accurately, glibc-isms. They also don't exist on e.g. macOS:

Why can't the regular isnan be used instead? Or is this a CUDA-specific requirement? (Apologies, but I know next to nothing about CUDA :) )

Well, that's what *CUDA* headers use, as written by NVIDIA. We have no control over them. :-(

In D60220#1725633, @dim wrote:
$ cat check-isnan.cpp
#include <math.h>

int check_isnan(double d)
{
  return ::__isnan(d);
}
$ clang -c check-isnan.cpp
Why can't the regular `isnan` be used instead?  Or is this a CUDA-specific requirement?  (Apologies, but I know next to nothing about CUDA :) )
  1. #include "cuda_runtime.h"
  2. as long as __isnan is a devide function, it should be called from __devide__ or __global__ function
  3. clang -c check-isnan.cpp -x cuda