This is an archive of the discontinued LLVM Phabricator instance.

[openmp][nfc] Simplify macros guarding math complex headers
ClosedPublic

Authored by JonChesterfield on Jun 30 2021, 11:51 AM.

Details

Summary

The __CUDA__ macro is already defined for openmp/nvptx and is not used by
__clang_cuda_complex_builtins.h, so dropping that macro slightly simplifies
nvptx and avoids defining it on amdgcn (where it is likely to be harmful).

Also dropped a cplusplus test from a C++ header as compilation will have
failed on cmath earlier if it was included from C.

Diff Detail

Event Timeline

JonChesterfield requested review of this revision.Jun 30 2021, 11:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2021, 11:51 AM
JonChesterfield edited the summary of this revision. (Show Details)Jun 30 2021, 11:52 AM
JonChesterfield added a reviewer: ronlieb.

this unblocks the hazard I am concerned about for D104904, namely it stops us defining __CUDA__ when compiling amdgcn code that includes complex.h

clang/lib/Headers/__clang_cuda_complex_builtins.h
21

bit weird that these are weak, but not changing that here

Should the name of file be changed as well?

Yeah, it probably should be. I should also check the blame list for the file to see who else should be on the reviewer list.

Looks pretty much like a revert of https://reviews.llvm.org/D90415 which was necessary to allow building with -x cuda -fopenmp.
Won't this break that again?

I fear there's no test covering that case and I either wasn't sure where to add such a test.. (also -x hip -fopenmp?)

That's interesting. I don't see how there is a semantic change here - _openmp is defined already and the builtins file ignores the cuda define - but I also haven't tried openmp+cuda in combination.

fodinabor requested changes to this revision.Jul 1 2021, 8:35 AM

citing from https://reviews.llvm.org/rG7f1e6fcff9427adfa8efa3bfeeeac801da788b87:

Due to recent changes we cannot use OpenMP in CUDA files anymore (PR45533) as the math handling of CUDA is different when _OPENMP is defined. We actually want this different behavior only if we are offloading with OpenMP to NVIDIA, thus generating NVPTX.

_OPENMP is defined even when only the CPU backend is targeted, when using -fopenmp. But then e.g. the OpenMP __nv_isnand variant is chosen for _ISNANd which is not defined if using CPU OpenMP and CUDA.

Applying this patch thus leads to this bunch of errors for clang -x cuda -fopenmp /dev/null -o /dev/null --cuda-gpu-arch=sm_70

In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:98:7: error: no matching function for call to '__nv_isnand'
  if (_ISNANd(__real__(z)) && _ISNANd(__imag__(z))) {
      ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:98:31: error: no matching function for call to '__nv_isnand'
  if (_ISNANd(__real__(z)) && _ISNANd(__imag__(z))) {
                              ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:100:9: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__a) || _ISINFd(__b)) {
        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:100:25: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__a) || _ISINFd(__b)) {
                        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:101:24: error: no matching function for call to '__nv_isinfd'
      __a = _COPYSIGNd(_ISINFd(__a) ? 1 : 0, __a);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:101:13: error: no matching function for call to '__nv_copysign'
      __a = _COPYSIGNd(_ISINFd(__a) ? 1 : 0, __a);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:102:24: error: no matching function for call to '__nv_isinfd'
      __b = _COPYSIGNd(_ISINFd(__b) ? 1 : 0, __b);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:102:13: error: no matching function for call to '__nv_copysign'
      __b = _COPYSIGNd(_ISINFd(__b) ? 1 : 0, __b);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:103:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__c))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:104:15: error: no matching function for call to '__nv_copysign'
        __c = _COPYSIGNd(0, __c);
              ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:105:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__d))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:106:15: error: no matching function for call to '__nv_copysign'
        __d = _COPYSIGNd(0, __d);
              ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:109:9: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__c) || _ISINFd(__d)) {
        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:109:25: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__c) || _ISINFd(__d)) {
                        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:110:24: error: no matching function for call to '__nv_isinfd'
      __c = _COPYSIGNd(_ISINFd(__c) ? 1 : 0, __c);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:110:13: error: no matching function for call to '__nv_copysign'
      __c = _COPYSIGNd(_ISINFd(__c) ? 1 : 0, __c);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:111:24: error: no matching function for call to '__nv_isinfd'
      __d = _COPYSIGNd(_ISINFd(__d) ? 1 : 0, __d);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:111:13: error: no matching function for call to '__nv_copysign'
      __d = _COPYSIGNd(_ISINFd(__d) ? 1 : 0, __d);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:112:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__a))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for host.

I will try to bring up a patch with a regression test for this.

This revision now requires changes to proceed.Jul 1 2021, 8:35 AM

I added a pretty simple regression that should make testing this -x cuda -fopenmp issue simpler: https://reviews.llvm.org/D105322
I guess a similar test for -x hip -fopenmp could be added, but it hasn't been an issue so far as HIP and OpenMP AMDGCN seem to use the same builtins?

jdoerfert requested changes to this revision.Jul 2 2021, 9:43 AM

We need a macro for OPENMP and one for OPENMP_OFFLOAD, we can use a single one for the latter and avoid _NVPTX, _AMDGCN, ... but we need both as described by @fodinabor.

I think the openmp_wrappers are only used when compiling device code, which would explain why setting a macro in one of them is a proxy for detecting compilation for the device.

Attempting to verify that, it looks like:
trunk-nvptx includes openmp_wrappers on device code only
trunk-amdgcn never includes openmp_wrappers
aomp-amdgcn includes openmp_wrappers on device code and cuda_wrappers on host code

in which case #define __OPENMP_NVPTX from an openmp_wrapper is equivalent to defining __OPENMP_NVPTX when compiling for the target and not for the host.

This seems fragile. How about we #define _OPENMP_HOST when compiling openmp for the host, and _OPENMP_TARGET when compiling openmp for the device? Do that from clang directly, not from a header which is only sometimes included. For one thing, we may want wrapper headers like these for the openmp host at some point.

  • reduce patch to only dropping cuda define
  • reduce patch to only dropping cuda define v2, now with missing save
JonChesterfield edited the summary of this revision. (Show Details)EditedJul 13 2021, 6:39 AM

Cut down to only dropping the cuda define, which is sufficient to resolve D104904. Haven't built/tested this diff yet.

Looks ok to me. Regression tests and runtime tests went fine. Tested a simple cuda and openmp kernel with sin function on sm_61, didn't see any issue.

clang/lib/Headers/openmp_wrappers/complex
21

^ this header does not look for a macro called CUDA or include any other headers so I believe dropping the macro can make no change to that header.

It might affect other things that happen to be included after this header, but iiuc cuda and openmp-nvptx both define __CUDA__ anyway, so that could only break amdgpu applications that were erroneously looking for a cuda macro.

fodinabor accepted this revision.Jul 18 2021, 11:39 AM

LGTM as well :)

This revision is now accepted and ready to land.Jul 18 2021, 11:39 AM
This revision was landed with ongoing or failed builds.Jul 18 2021, 3:31 PM
This revision was automatically updated to reflect the committed changes.