This is an archive of the discontinued LLVM Phabricator instance.

Differential D105221

[openmp][nfc] Simplify macros guarding math complex headers
ClosedPublic

Authored by JonChesterfield on Jun 30 2021, 11:51 AM.

Details

Reviewers

jdoerfert
tianshilei1992
pdhaliwal
ronlieb
jlebar
ashi1
fodinabor

Commits

rG3e649f8ef187: [openmp][nfc] Simplify macros guarding math complex headers

Summary

The __CUDA__ macro is already defined for openmp/nvptx and is not used by
__clang_cuda_complex_builtins.h, so dropping that macro slightly simplifies
nvptx and avoids defining it on amdgcn (where it is likely to be harmful).

Also dropped a cplusplus test from a C++ header as compilation will have
failed on cmath earlier if it was included from C.
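A minimal sketch of why that test was redundant, assuming only that the header has already included `<cmath>`: any translation unit that gets past that include is a C++ translation unit, so a later `#ifdef __cplusplus` is always taken. The helper name `guard_was_taken` is hypothetical and not part of the patch.

```cpp
#include <cmath> // C++-only header: a C compile fails here, before any guard

// Hypothetical helper for illustration only; not part of the patch.
// Reports whether the region guarded by __cplusplus was compiled in.
inline bool guard_was_taken() {
#ifdef __cplusplus
  return true;  // always the case once <cmath> has been included successfully
#else
  return false; // not reachable in practice: <cmath> would have failed first
#endif
}
```

In a C++ compile the function always reports that the guarded region was kept, which is why removing the guard should not change behaviour.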

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

JonChesterfield created this revision. Jun 30 2021, 11:51 AM

Herald added subscribers: guansong, kristof.beyls, tpr, yaxunl. Jun 30 2021, 11:51 AM

JonChesterfield requested review of this revision. Jun 30 2021, 11:51 AM

Herald added a project: Restricted Project. Jun 30 2021, 11:51 AM

Herald added subscribers: cfe-commits, sstefan1.

This removes the hazard I am concerned about for D104904: it stops us defining __CUDA__ when compiling amdgcn code that includes complex.h.

clang/lib/Headers/__clang_cuda_complex_builtins.h
21	bit weird that these are weak, but not changing that here

Harbormaster completed remote builds in B111824: Diff 355652. Jun 30 2021, 1:31 PM

Should the name of the file be changed as well?

Yeah, it probably should be. I should also check the blame list for the file to see who else should be on the reviewer list.

JonChesterfield added reviewers: jlebar, ashi1, fodinabor. Jul 1 2021, 5:08 AM

Looks pretty much like a revert of https://reviews.llvm.org/D90415 which was necessary to allow building with -x cuda -fopenmp.
Won't this break that again?

I fear there's no test covering that case, and I wasn't sure where to add such a test either (also for -x hip -fopenmp?).

That's interesting. I don't see how there is a semantic change here, since _OPENMP is already defined and the builtins file ignores the __CUDA__ define, but I also haven't tried OpenMP and CUDA in combination.

Quoting from https://reviews.llvm.org/rG7f1e6fcff9427adfa8efa3bfeeeac801da788b87:

Due to recent changes we cannot use OpenMP in CUDA files anymore (PR45533) as the math handling of CUDA is different when _OPENMP is defined. We actually want this different behavior only if we are offloading with OpenMP to NVIDIA, thus generating NVPTX.

With -fopenmp, _OPENMP is defined even when only the CPU backend is targeted. In that case the OpenMP variant, e.g. __nv_isnand, is chosen for _ISNANd, but it is not defined when using CPU OpenMP together with CUDA.
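The selection problem described above can be sketched as follows. This is an illustration under stated assumptions, not the header's actual code: `DEMO_OPENMP_OFFLOAD`, `DEMO_ISNAN`, and `demo_device_isnan` are hypothetical names standing in for a narrow offload macro, `_ISNANd`, and a device-only builtin such as `__nv_isnand`.

```cpp
#include <cmath>

// Hypothetical stand-in for a device-only builtin such as __nv_isnand.
// It is an ordinary function here so the sketch compiles on any host.
inline bool demo_device_isnan(double x) { return x != x; }

// Select the predicate with a narrow, offload-specific macro rather than a
// broad one. A broad macro (like _OPENMP) is also set in host-only compiles,
// where the device builtin must not be selected.
#if defined(DEMO_OPENMP_OFFLOAD)
#define DEMO_ISNAN demo_device_isnan
#define DEMO_USES_DEVICE_VARIANT 1
#else
#define DEMO_ISNAN std::isnan
#define DEMO_USES_DEVICE_VARIANT 0
#endif

// Reports which branch of the guard was compiled in.
inline bool uses_device_variant() { return DEMO_USES_DEVICE_VARIANT != 0; }

// Calls whichever predicate the guard selected.
inline bool is_nan(double x) { return DEMO_ISNAN(x); }
```

When `DEMO_OPENMP_OFFLOAD` is not defined, as in an ordinary host compile, the sketch falls back to `std::isnan`, which is the behaviour the quoted commit wanted for CUDA files built with -fopenmp.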

Applying this patch thus leads to the following errors for clang -x cuda -fopenmp /dev/null -o /dev/null --cuda-gpu-arch=sm_70:

In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:98:7: error: no matching function for call to '__nv_isnand'
  if (_ISNANd(__real__(z)) && _ISNANd(__imag__(z))) {
      ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:98:31: error: no matching function for call to '__nv_isnand'
  if (_ISNANd(__real__(z)) && _ISNANd(__imag__(z))) {
                              ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:100:9: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__a) || _ISINFd(__b)) {
        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:100:25: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__a) || _ISINFd(__b)) {
                        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:101:24: error: no matching function for call to '__nv_isinfd'
      __a = _COPYSIGNd(_ISINFd(__a) ? 1 : 0, __a);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:101:13: error: no matching function for call to '__nv_copysign'
      __a = _COPYSIGNd(_ISINFd(__a) ? 1 : 0, __a);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:102:24: error: no matching function for call to '__nv_isinfd'
      __b = _COPYSIGNd(_ISINFd(__b) ? 1 : 0, __b);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:102:13: error: no matching function for call to '__nv_copysign'
      __b = _COPYSIGNd(_ISINFd(__b) ? 1 : 0, __b);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:103:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__c))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:104:15: error: no matching function for call to '__nv_copysign'
        __c = _COPYSIGNd(0, __c);
              ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:105:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__d))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:106:15: error: no matching function for call to '__nv_copysign'
        __d = _COPYSIGNd(0, __d);
              ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:109:9: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__c) || _ISINFd(__d)) {
        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:109:25: error: no matching function for call to '__nv_isinfd'
    if (_ISINFd(__c) || _ISINFd(__d)) {
                        ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:110:24: error: no matching function for call to '__nv_isinfd'
      __c = _COPYSIGNd(_ISINFd(__c) ? 1 : 0, __c);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:110:13: error: no matching function for call to '__nv_copysign'
      __c = _COPYSIGNd(_ISINFd(__c) ? 1 : 0, __c);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:111:24: error: no matching function for call to '__nv_isinfd'
      __d = _COPYSIGNd(_ISINFd(__d) ? 1 : 0, __d);
                       ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:68:17: note: expanded from macro '_ISINFd'
#define _ISINFd __nv_isinfd
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:224:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isinfd(double __a);
               ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:111:13: error: no matching function for call to '__nv_copysign'
      __d = _COPYSIGNd(_ISINFd(__d) ? 1 : 0, __d);
            ^~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:72:20: note: expanded from macro '_COPYSIGNd'
#define _COPYSIGNd __nv_copysign
                   ^~~~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:47:19: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ double __nv_copysign(double __a, double __b);
                  ^
In file included from <built-in>:1:
In file included from /home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_runtime_wrapper.h:419:
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:112:11: error: no matching function for call to '__nv_isnand'
      if (_ISNANd(__a))
          ^~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_complex_builtins.h:66:17: note: expanded from macro '_ISNANd'
#define _ISNANd __nv_isnand
                ^~~~~~~~~~~
/home/joachim/Projekte/install/lib/clang/13.0.0/include/__clang_cuda_libdevice_declares.h:226:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __nv_isnand(double __a);
               ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for host.

I will try to bring up a patch with a regression test for this.

This revision now requires changes to proceed. Jul 1 2021, 8:35 AM

I added a pretty simple regression test that should make testing this -x cuda -fopenmp issue simpler: https://reviews.llvm.org/D105322
I guess a similar test for -x hip -fopenmp could be added, but it hasn't been an issue so far, as HIP and OpenMP AMDGCN seem to use the same builtins?

We need one macro for OPENMP and one for OPENMP_OFFLOAD. We can use a single one for the latter and avoid _NVPTX, _AMDGCN, ..., but we need both, as described by @fodinabor.

I think the openmp_wrappers are only used when compiling device code, which would explain why setting a macro in one of them is a proxy for detecting compilation for the device.

Attempting to verify that, it looks like:
trunk-nvptx includes openmp_wrappers on device code only
trunk-amdgcn never includes openmp_wrappers
aomp-amdgcn includes openmp_wrappers on device code and cuda_wrappers on host code

In that case, #define __OPENMP_NVPTX__ from an openmp_wrapper is equivalent to defining __OPENMP_NVPTX__ when compiling for the target and not for the host.

This seems fragile. How about we #define _OPENMP_HOST when compiling openmp for the host, and _OPENMP_TARGET when compiling openmp for the device? Do that from clang directly, not from a header which is only sometimes included. For one thing, we may want wrapper headers like these for the openmp host at some point.
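The two approaches contrasted above can be sketched side by side. This is only an illustration: `DEMO_WRAPPER_MARKER` is a hypothetical stand-in for a marker set by a wrapper header, and `_OPENMP_HOST` / `_OPENMP_TARGET` are the names proposed in the comment, which the sketch assumes a compiler would predefine. With a compiler that does not define them, `omp_side()` reports `None`.

```cpp
// Part 1: a marker that exists only while a wrapper header is active.
// DEMO_WRAPPER_MARKER is hypothetical; it plays the role that a macro
// defined around an #include in a wrapper header would play.
#define DEMO_WRAPPER_MARKER
inline bool seen_inside_wrapper() {
#ifdef DEMO_WRAPPER_MARKER
  return true;
#else
  return false;
#endif
}
#undef DEMO_WRAPPER_MARKER

// Code that is not reached through the wrapper never observes the marker,
// which is what makes the marker an indirect proxy for "device compile".
inline bool seen_outside_wrapper() {
#ifdef DEMO_WRAPPER_MARKER
  return true;
#else
  return false;
#endif
}

// Part 2: the proposed alternative, assuming the compiler itself predefines
// one macro per compilation side. These macro names come from the proposal
// and are an assumption here, not an existing compiler feature.
enum class OmpSide { None, Host, Target };

constexpr OmpSide omp_side() {
#if defined(_OPENMP_TARGET)
  return OmpSide::Target;
#elif defined(_OPENMP_HOST)
  return OmpSide::Host;
#else
  return OmpSide::None;
#endif
}
```

The second form does not depend on which headers happen to be included, so it would also remain usable if wrapper headers were later added for the OpenMP host.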

reduce patch to only dropping cuda define

reduce patch to only dropping cuda define v2, now with missing save

Cut down to only dropping the cuda define, which is sufficient to resolve D104904. Haven't built/tested this diff yet.

Harbormaster completed remote builds in B113725: Diff 358250. Jul 13 2021, 7:41 AM

Looks ok to me. Regression tests and runtime tests went fine. Tested a simple cuda and openmp kernel with sin function on sm_61, didn't see any issue.

JonChesterfield added inline comments. Jul 14 2021, 6:18 AM

clang/lib/Headers/openmp_wrappers/complex
21	^ this header does not look for a macro called `__CUDA__` or include any other headers, so I believe dropping the macro can make no change to that header. It might affect other things that happen to be included after this header, but iiuc cuda and openmp-nvptx both define `__CUDA__` anyway, so that could only break amdgpu applications that were erroneously looking for a cuda macro.

JonChesterfield mentioned this in D104904: [OpenMP][AMDGCN] Initial math headers support. Jul 14 2021, 7:45 AM

@fodinabor?

LGTM as well :)

This revision is now accepted and ready to land. Jul 18 2021, 11:39 AM

This revision was landed with ongoing or failed builds. Jul 18 2021, 3:31 PM

Closed by commit rG3e649f8ef187: [openmp][nfc] Simplify macros guarding math complex headers (authored by JonChesterfield).

This revision was automatically updated to reflect the committed changes.

JonChesterfield added a commit: rG3e649f8ef187: [openmp][nfc] Simplify macros guarding math complex headers.

Revision Contents

Path	Size
clang/lib/Headers/__clang_cuda_complex_builtins.h	6 lines
clang/lib/Headers/openmp_wrappers/complex	6 lines
clang/lib/Headers/openmp_wrappers/complex.h	1 line

Diff 358248

clang/lib/Headers/__clang_cuda_complex_builtins.h

[10 lines not shown]
 #define __CLANG_CUDA_COMPLEX_BUILTINS

 // This header defines __muldc3, __mulsc3, __divdc3, and __divsc3. These are
 // libgcc functions that clang assumes are available when compiling c99 complex
 // operations. (These implementations come from libc++, and have been modified
 // to work with CUDA and OpenMP target offloading [in C and C++ mode].)

 #pragma push_macro("__DEVICE__")
-#ifdef __OPENMP_NVPTX__
+#ifdef _OPENMP
 #pragma omp declare target
 #define __DEVICE__ __attribute__((noinline, nothrow, cold, weak))
 #else
 #define __DEVICE__ __device__ inline
 #endif

 // To make the algorithms available for C and C++ in CUDA and OpenMP we select
 // different but equivalent function versions. TODO: For OpenMP we currently
 // select the native builtins as the overload support for templates is lacking.
-#if !defined(__OPENMP_NVPTX__)
+#if !defined(_OPENMP)
 #define _ISNANd std::isnan
 #define _ISNANf std::isnan
 #define _ISINFd std::isinf
 #define _ISINFf std::isinf
 #define _ISFINITEd std::isfinite
 #define _ISFINITEf std::isfinite
 #define _COPYSIGNd std::copysign
 #define _COPYSIGNf std::copysign
[233 lines not shown]
 #undef _SCALBNf
 #undef _ABSd
 #undef _ABSf
 #undef _LOGBd
 #undef _LOGBf
 #undef _fmaxd
 #undef _fmaxf

-#ifdef __OPENMP_NVPTX__
+#ifdef _OPENMP
 #pragma omp end declare target
 #endif

 #pragma pop_macro("__DEVICE__")

 #endif // __CLANG_CUDA_COMPLEX_BUILTINS

clang/lib/Headers/openmp_wrappers/complex

[11 lines not shown]

 #ifndef _OPENMP
 #error "This file is for OpenMP compilation only."
 #endif

 // We require std::math functions in the complex builtins below.
 #include <cmath>

-#define __CUDA__
 #define __OPENMP_NVPTX__
 #include <__clang_cuda_complex_builtins.h>
 #undef __OPENMP_NVPTX__
 #endif

 // Grab the host header too.
 #include_next <complex>


-#ifdef __cplusplus

 // If we are compiling against libc++, the macro _LIBCPP_STD_VER should be set
 // after including <cmath> above. Since the complex header we use is a
 // simplified version of the libc++, we don't need it in this case. If we
 // compile against libstdc++, or any other standard library, we will overload
 // the (hopefully template) functions in the <complex> header with the ones we
 // got from libc++ which decomposes math functions, like `std::sin`, into
 // arithmetic and calls to non-complex functions, all of which we can then
 // handle.
 #ifndef _LIBCPP_STD_VER

 #pragma omp begin declare variant match( \
 device = {arch(nvptx, nvptx64)}, \
 implementation = {extension(match_any, allow_templates)})

 #include <complex_cmath.h>

 #pragma omp end declare variant

 #endif

-#endif

clang/lib/Headers/openmp_wrappers/complex.h

[11 lines not shown]

 #ifndef _OPENMP
 #error "This file is for OpenMP compilation only."
 #endif

 // We require math functions in the complex builtins below.
 #include <math.h>

-#define __CUDA__
 #define __OPENMP_NVPTX__
 #include <__clang_cuda_complex_builtins.h>
 #undef __OPENMP_NVPTX__
 #endif

 // Grab the host header too.
 #include_next <complex.h>