This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Provide math functions in OpenMP device code via OpenMP variants
ClosedPublic

Authored by jdoerfert on Mar 6 2020, 5:11 PM.

Details

Summary

For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions,
we include the appropriate definitions inside of an omp begin/end declare variant match(device={arch(nvptx)}) scope.
This way, the vendor specific math functions will become specialized versions of the system math functions.
When a system math function is called and specialized version is available the selection logic introduced in D75779
instead call the specialized version. In contrast to the code path we used so far, the system header is actually included.
This means functions without specialized versions are available and so are macro definitions.

This should address PR42061, PR42798, and PR42799.

Diff Detail

Event Timeline

jdoerfert created this revision.Mar 6 2020, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 6 2020, 5:11 PM

That's less invasive than I feared. Nicely done.

It may worth keeping the openmp header wrapper to do architecture dispatch. Something like:

#ifndef __CLANG_OPENMP_MATH_DECLARES_H__
#define __CLANG_OPENMP_MATH_DECLARES_H__

#ifndef _OPENMP
#error "This file is for OpenMP compilation only."
#endif

#if defined(__AMDGCN__)
#pragma omp begin declare variant match(device = {arch(amdgcn)})
#include "equivalent_header.h"
#pragma omp end declare variant
#endif // __AMDGCN__

#if defined(__NVPTX__)
#define __CUDA__
#pragma omp begin declare variant match(device = {arch(nvptx)})

#if defined(__cplusplus)
#include <__clang_cuda_math_forward_declares.h>
#endif

/// Include declarations for libdevice functions.
#include <__clang_cuda_libdevice_declares.h>
/// Provide definitions for these functions.
#include <__clang_cuda_device_functions.h>

#pragma omp end declare variant
#undef __CUDA__
#endif // __NVPTX__


#endif // __CLANG_OPENMP_MATH_DECLARES_H__
clang/lib/Headers/cuda_wrappers/new
36 ↗(On Diff #248864)

macros look off here - should it be #define DEVICE, or the following uses __DEVICE__?

jdoerfert marked an inline comment as done.Mar 6 2020, 9:55 PM

That's less invasive than I feared. Nicely done.

We need to run some more tests to make sure it works as expected but I hope we can completely piggy back on the underlying "language" support.

It may worth keeping the openmp header wrapper to do architecture dispatch. Something like:

We can do that or adjust the pipeline based on the target, either is fine with me.

clang/lib/Headers/cuda_wrappers/new
36 ↗(On Diff #248864)

Yes.

Furthermore I think I want to introduce the effect of __device__ as an attribute, basically match(device={arch(nvptx)} on a single function. That would make the declare variant go away and allow us to piggy back on the __DEVICE__ directly.

jdoerfert updated this revision to Diff 248937.Mar 7 2020, 8:58 AM

Add complex test case

tra added a subscriber: tra.Mar 9 2020, 8:53 AM

Couple of nits below. LGTM for CUDA headers otherwise.

clang/lib/Headers/__clang_cuda_math_forward_declares.h
42–43

Shouldn't float and double abs variants be preserved?

clang/lib/Headers/cuda_wrappers/new
107 ↗(On Diff #248937)

You may want to push/pop __DEVICE__ macro here and in other headers where it's not done.

jdoerfert updated this revision to Diff 250035.Mar 12 2020, 1:02 PM
jdoerfert marked an inline comment as done.

Adjust to new scheme, tested locally with some math functions, seems to work

jdoerfert marked 3 inline comments as done.Mar 12 2020, 1:06 PM
jdoerfert added inline comments.
clang/lib/AST/ASTContext.cpp
1621 ↗(On Diff #250035)

These AuxTarget changes can be split off. I observed a segfault while testing stuff.

clang/lib/Headers/CMakeLists.txt
150

the math_declares above will provide all C math functions so we do not need math.h anymore here.

clang/lib/Headers/__clang_cuda_cmath.h
312

when this is included with OpenMP it is actually following the system <cmath> and these using declarations are neither needed nor helpful (afaict)

clang/lib/Headers/__clang_cuda_math_forward_declares.h
42–43

yes, thx! I make so many errors modifying these files, it is ridicules...

jdoerfert updated this revision to Diff 253153.Mar 27 2020, 9:44 AM

Add and repair tests to show new features, e.g. math macro support

jdoerfert retitled this revision from [WIP][OpenMP] Reuse CUDA wrappers in `nvptx` target regions. to [OpenMP] Provide math functions in OpenMP device code via OpenMP variants.Mar 27 2020, 9:49 AM
jdoerfert edited the summary of this revision. (Show Details)
jdoerfert updated this revision to Diff 253154.Mar 27 2020, 9:53 AM

Remove OpenMP from clang/lib/Headers/__clang_cuda_math_forward_declares.h

jdoerfert updated this revision to Diff 254375.Apr 1 2020, 5:28 PM

Rewrite. Wrap math.h, time.h, and cmath. Preload only device functions.

Passes all 185 math c++11 tests from [0] which do not deal with long double.
[0] https://github.com/TApplencourt/OmpVal

ye-luo accepted this revision.Apr 2 2020, 9:12 AM
ye-luo added a subscriber: ye-luo.

Good work.
I verified that PR42798 and PR42799 are fixed by this.
Tests are completed on Ubuntu 18.04.
Clang now becomes usable for application developers.

There are still issues on RHEL that openmp_wrappers is not added before the system math searching path. Option -isystem openmp_wrappers can be used as a workaround.
This remaining issue should not be a blocker of accepting this patch. It can be dealt separately later without complicating the review of the current patch.

This revision is now accepted and ready to land.Apr 2 2020, 9:12 AM

My RHEL issue was caused by a CPLUS_INCLUDE_PATH environment variable. So this is feature not a bug. After removing it, everything works smoothly for me.

jdoerfert updated this revision to Diff 255050.Apr 4 2020, 9:23 AM

Cleanup and rebase

sammccall added inline comments.
clang/test/Headers/nvptx_device_math_macro.cpp
11

Hmm, this fails if the test is run from a directory containing "call".

; ModuleID = '/usr/local/google/home/sammccall/src/llvm-project/clang/test/Headers/nvptx_device_math_macro.cpp

I guess I can blame my parents :-)
Is CHECK-NOT: call double correct for both of these?

jdoerfert marked an inline comment as done.Apr 11 2020, 10:48 AM
jdoerfert added inline comments.
clang/test/Headers/nvptx_device_math_macro.cpp
11

Oh my, ... sorry.

I guess call{{.*}}@ should do the trick. No globals are referenced in a call

chapuni added a subscriber: chapuni.Jul 1 2022, 7:10 AM
chapuni added inline comments.
clang/lib/Headers/CMakeLists.txt
145

It doesn't contain <time.h>, intentional?

Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2022, 7:10 AM
jdoerfert added inline comments.Jul 8 2022, 9:09 AM
clang/lib/Headers/CMakeLists.txt
145

We don't wrap that yet. Do you need it?

clang/lib/Headers/openmp_wrappers/__clang_openmp_device_functions.h