This is an archive of the discontinued LLVM Phabricator instance.


Differential D90415

[OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers.
ClosedPublic

Authored by fodinabor on Oct 29 2020, 12:17 PM.


Details

Reviewers

jdoerfert
hfinkel
tra

Commits

rGeaee608448c8: [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers.

Summary

This is very similar to 7f1e6fcff942, just fixing a left-over.
With this, it should be possible to use both -x cuda and -fopenmp in the same invocation,
which allows combining OpenMP, targeting the CPU, with CUDA, targeting the GPU.
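To make the motivation concrete, here is a minimal sketch that models the guard in __clang_cuda_complex_builtins.h as a plain function. It is not code from the patch: the type and function names (Invocation, path_before_patch, path_after_patch) are hypothetical, and it assumes that in a combined -x cuda -fopenmp invocation the header is reached through the CUDA path rather than through the OpenMP wrapper headers.

```cpp
// Simplified model of the preprocessor guard in
// __clang_cuda_complex_builtins.h. All names here are hypothetical.
enum class DevicePath {
  CudaDevice,          // "#define __DEVICE__ __device__ inline"
  OpenMPDeclareTarget  // "#pragma omp declare target" branch
};

struct Invocation {
  bool openmp_defined;      // is _OPENMP defined (i.e. -fopenmp given)?
  bool via_openmp_wrapper;  // was the header included by openmp_wrappers/complex[.h]?
};

// Before the patch the header tested _OPENMP directly.
constexpr DevicePath path_before_patch(Invocation inv) {
  return inv.openmp_defined ? DevicePath::OpenMPDeclareTarget
                            : DevicePath::CudaDevice;
}

// After the patch it tests __OPENMP_NVPTX__, which only the OpenMP
// wrapper headers define (and undefine again after the include).
constexpr DevicePath path_after_patch(Invocation inv) {
  return inv.via_openmp_wrapper ? DevicePath::OpenMPDeclareTarget
                                : DevicePath::CudaDevice;
}

constexpr Invocation kPlainCuda{false, false};
constexpr Invocation kOpenMPOffload{true, true};
constexpr Invocation kCudaWithOpenMP{true, false};  // -x cuda -fopenmp (assumed)

// Plain CUDA and OpenMP offloading behave the same before and after.
static_assert(path_before_patch(kPlainCuda) == path_after_patch(kPlainCuda), "");
static_assert(path_before_patch(kOpenMPOffload) == path_after_patch(kOpenMPOffload), "");

// The combined invocation is the case that changes.
static_assert(path_before_patch(kCudaWithOpenMP) == DevicePath::OpenMPDeclareTarget, "");
static_assert(path_after_patch(kCudaWithOpenMP) == DevicePath::CudaDevice, "");
```

In this model only the combined invocation changes behavior, which matches the summary's description of the patch as fixing a left-over.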

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fodinabor created this revision. Oct 29 2020, 12:17 PM

Herald added a project: Restricted Project. Oct 29 2020, 12:17 PM

Herald added subscribers: cfe-commits, guansong, yaxunl.

fodinabor requested review of this revision. Oct 29 2020, 12:17 PM

Herald added a reviewer: jdoerfert. Oct 29 2020, 12:17 PM

Herald added a subscriber: sstefan1.

fodinabor edited the summary of this revision. Oct 29 2020, 12:19 PM

fodinabor added reviewers: hfinkel, tra.

Harbormaster completed remote builds in B76957: Diff 301707. Oct 29 2020, 12:50 PM

You need to define & undefine the macro around the includes of __clang_cuda_complex_builtins.h in clang/lib/Headers/openmp_wrappers/complex and clang/lib/Headers/openmp_wrappers/complex.h (see also rG7f1e6fcff942). That should fix the tests. Assuming the tests pass, LGTM.

For release 11, can you file a bug and prepare a patch that applies to the relevant branch?

This revision is now accepted and ready to land. Oct 29 2020, 1:06 PM

Add missing macro definitions.

Thanks, will land it later.

For the bug see: https://bugs.llvm.org/show_bug.cgi?id=48014
Do I have to create a new phabricator review, too?
I'm currently building the release/11.x branch with the patch cherry-picked (which worked flawlessly).

Harbormaster completed remote builds in B76965: Diff 301726. Oct 29 2020, 2:11 PM

This revision was landed with ongoing or failed builds. Oct 29 2020, 3:25 PM

Closed by commit rGeaee608448c8: [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers. (authored by fodinabor).

This revision was automatically updated to reflect the committed changes.

fodinabor added a commit: rGeaee608448c8: [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in complex wrapper headers.

In D90415#2363056, @fodinabor wrote:

Thanks, will land it later.

For the bug see: https://bugs.llvm.org/show_bug.cgi?id=48014
Do I have to create a new phabricator review, too?
I'm currently building the release/11.x branch with the patch cherry-picked (which worked flawlessly).

CC tstellar@redhat.com in the bug and ask for it to be merged. Attach the patch file to the bug. At least that is one way it works, I think.

fodinabor mentioned this in D105221: [openmp][nfc] Simplify macros guarding math complex headers. Jul 1 2021, 5:40 AM

fodinabor mentioned this in D105322: [NFC][OpenMP][CUDA] Add test for using `-x cuda -fopenmp`. Jul 1 2021, 4:22 PM

fodinabor mentioned this in rG75e941b05c78: [NFC][OpenMP][CUDA] Add test for using `-x cuda -fopenmp`. Jul 2 2021, 10:00 AM

Revision Contents

Path                                                 Size
clang/lib/Headers/__clang_cuda_complex_builtins.h    6 lines
clang/lib/Headers/openmp_wrappers/complex            2 lines
clang/lib/Headers/openmp_wrappers/complex.h          2 lines

Diff 301774

clang/lib/Headers/__clang_cuda_complex_builtins.h

	[10 lines not shown]
	  #define __CLANG_CUDA_COMPLEX_BUILTINS

	  // This header defines __muldc3, __mulsc3, __divdc3, and __divsc3. These are
	  // libgcc functions that clang assumes are available when compiling c99 complex
	  // operations. (These implementations come from libc++, and have been modified
	  // to work with CUDA and OpenMP target offloading [in C and C++ mode].)

	  #pragma push_macro("__DEVICE__")
	- #ifdef _OPENMP
	+ #ifdef __OPENMP_NVPTX__
	  #pragma omp declare target
	  #define __DEVICE__ __attribute__((noinline, nothrow, cold, weak))
	  #else
	  #define __DEVICE__ __device__ inline
	  #endif

	  // To make the algorithms available for C and C++ in CUDA and OpenMP we select
	  // different but equivalent function versions. TODO: For OpenMP we currently
	  // select the native builtins as the overload support for templates is lacking.
	- #if !defined(_OPENMP)
	+ #if !defined(__OPENMP_NVPTX__)
	  #define _ISNANd std::isnan
	  #define _ISNANf std::isnan
	  #define _ISINFd std::isinf
	  #define _ISINFf std::isinf
	  #define _ISFINITEd std::isfinite
	  #define _ISFINITEf std::isfinite
	  #define _COPYSIGNd std::copysign
	  #define _COPYSIGNf std::copysign
	[233 lines not shown]
	  #undef _SCALBNf
	  #undef _ABSd
	  #undef _ABSf
	  #undef _LOGBd
	  #undef _LOGBf
	  #undef _fmaxd
	  #undef _fmaxf

	- #ifdef _OPENMP
	+ #ifdef __OPENMP_NVPTX__
	  #pragma omp end declare target
	  #endif

	  #pragma pop_macro("__DEVICE__")

	  #endif // __CLANG_CUDA_COMPLEX_BUILTINS

clang/lib/Headers/openmp_wrappers/complex

	[12 lines not shown]
	  #ifndef _OPENMP
	  #error "This file is for OpenMP compilation only."
	  #endif

	  // We require std::math functions in the complex builtins below.
	  #include <cmath>

	  #define __CUDA__
	+ #define __OPENMP_NVPTX__
	  #include <__clang_cuda_complex_builtins.h>
	+ #undef __OPENMP_NVPTX__
	  #endif

	  // Grab the host header too.
	  #include_next <complex>


	  #ifdef __cplusplus

	[21 lines not shown]

clang/lib/Headers/openmp_wrappers/complex.h

	[12 lines not shown]
	  #ifndef _OPENMP
	  #error "This file is for OpenMP compilation only."
	  #endif

	  // We require math functions in the complex builtins below.
	  #include <math.h>

	  #define __CUDA__
	+ #define __OPENMP_NVPTX__
	  #include <__clang_cuda_complex_builtins.h>
	+ #undef __OPENMP_NVPTX__
	  #endif

	  // Grab the host header too.
	  #include_next <complex.h>