Page MenuHomePhabricator

[OpenMP][CUDA] Add missing overload for `remquo(float,float,int*)`
ClosedPublic

Authored by jdoerfert on Oct 22 2020, 9:42 AM.

Details

Summary

Reported by Colleen Bertoni <bertoni@anl.gov> after running the OvO test
suite: https://github.com/TApplencourt/OvO/

The template overload is still hidden behind an ifdef for OpenMP. In the
future we probably want to remove the ifdef but that requires further
testing.

Diff Detail

Event Timeline

jdoerfert created this revision.Oct 22 2020, 9:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2020, 9:42 AM
jdoerfert requested review of this revision.Oct 22 2020, 9:42 AM

This should fix the last OvO math test (I think C++11) we fail. While those tests are "simple" they are fairly exhaustive and it's a good sign to pass them ;)

JonChesterfield accepted this revision.Oct 22 2020, 10:00 AM

Change obviously good. Am I right in reading this as all of OvO then passes for trunk, nvptx?

This revision is now accepted and ready to land.Oct 22 2020, 10:00 AM

Change obviously good. Am I right in reading this as all of OvO then passes for trunk, nvptx?

C++11 math tests on one system. These are system dependent to some degree so I would not claim it will always work, but it looks promising. and yes, nvptx, though nothing really cares about that, it just resolves down to nv calls instead of xyz calls at the end of the day.

tra added a comment.Oct 22 2020, 10:28 AM

The template overload

function overload.

is still hidden behind an ifdef for OpenMP. In the
future we probably want to remove the ifdef but that requires further
testing.

I don't think it's the case. I've just ran clang++ -x cuda /dev/null --cuda-host-only -dD -E and I do see the definitions for tanh just above your change.

In D89971#2347909, @tra wrote:

The template overload

function overload.

is still hidden behind an ifdef for OpenMP. In the
future we probably want to remove the ifdef but that requires further
testing.

I don't think it's the case. I've just ran clang++ -x cuda /dev/null --cuda-host-only -dD -E and I do see the definitions for tanh just above your change.

My comment was about the template I point out in the inline comment below. Also I was talking about the OpenMP declare variant overloads, not the cuda definitions.
TBH, I don't understand what you want to tell me with the remark about the cuda definitions and tanh. Nothing the OpenMP stuff does should impact cuda and tanh was there already, I'm confused :(

clang/lib/Headers/__clang_cuda_cmath.h
330

The template overload I was talking about is this one.

tra accepted this revision.Oct 22 2020, 12:23 PM
tra added inline comments.
clang/lib/Headers/__clang_cuda_cmath.h
330

Ah. It makes more sense now.

The patch description may need to be rephrased. You've added a function overload, but the description talks about the template which is *not* in the patch.
Something like Added function overload for remquo(float...)... the template function overload for other types... is still behind OPENMP_NVPTX

jdoerfert added inline comments.Oct 22 2020, 12:27 PM
clang/lib/Headers/__clang_cuda_cmath.h
330

Will do:

Not affected by this patch is the `remquo` template overload which is
still hidden behind an ifdef for OpenMP. In the future we probably want
to remove the ifdef but that requires further testing.
This revision was landed with ongoing or failed builds.Oct 27 2020, 5:13 PM
This revision was automatically updated to reflect the committed changes.
tra added a comment.Oct 28 2020, 9:45 AM

Looks like it breaks CUDA: http://lab.llvm.org:8011/#/builders/46/builds/304

[10/873] Building CXX object External/CUDA/CMakeFiles/printf-cuda-11.0-c++14-libstdc++-6.dir/printf.cu.o
FAILED: External/CUDA/CMakeFiles/printf-cuda-11.0-c++14-libstdc++-6.dir/printf.cu.o 
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/bin/clang++  -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time -UNDEBUG --cuda-path=/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0 -I/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include --cuda-gpu-arch=sm_75 -std=c++14 -stdlib=libstdc++ -gcc-toolchain /buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/gcc-6 -DSTDLIB_VERSION=2014 -MD -MT External/CUDA/CMakeFiles/printf-cuda-11.0-c++14-libstdc++-6.dir/printf.cu.o -MF External/CUDA/CMakeFiles/printf-cuda-11.0-c++14-libstdc++-6.dir/printf.cu.o.d -o External/CUDA/CMakeFiles/printf-cuda-11.0-c++14-libstdc++-6.dir/printf.cu.o -c /buildbot/cuda-t4-0/work/None_clang-cuda-t4/llvm-test-suite/External/CUDA/printf.cu
In file included from <built-in>:1:
In file included from /buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_runtime_wrapper.h:412:
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_cmath.h:177:18: error: redefinition of 'remquo'
__DEVICE__ float remquo(float __n, float __d, int *__q) {
                 ^
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include/crt/math_functions.hpp:685:31: note: previous definition is here
__MATH_FUNCTIONS_DECL__ float remquo(float a, float b, int *quo)
                              ^
1 error generated when compiling for sm_75.
[11/873] Building CXX object External/CUDA/CMakeFiles/builtin_var-cuda-11.0-c++14-libstdc++-7.dir/builtin_var.cu.o
FAILED: External/CUDA/CMakeFiles/builtin_var-cuda-11.0-c++14-libstdc++-7.dir/builtin_var.cu.o 
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/bin/clang++  -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time -UNDEBUG --cuda-path=/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0 -I/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include --cuda-gpu-arch=sm_75 -std=c++14 -stdlib=libstdc++ -gcc-toolchain /buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/gcc-7 -DSTDLIB_VERSION=2014 -MD -MT External/CUDA/CMakeFiles/builtin_var-cuda-11.0-c++14-libstdc++-7.dir/builtin_var.cu.o -MF External/CUDA/CMakeFiles/builtin_var-cuda-11.0-c++14-libstdc++-7.dir/builtin_var.cu.o.d -o External/CUDA/CMakeFiles/builtin_var-cuda-11.0-c++14-libstdc++-7.dir/builtin_var.cu.o -c /buildbot/cuda-t4-0/work/None_clang-cuda-t4/llvm-test-suite/External/CUDA/builtin_var.cu
In file included from <built-in>:1:
In file included from /buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_runtime_wrapper.h:412:
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_cmath.h:177:18: error: redefinition of 'remquo'
__DEVICE__ float remquo(float __n, float __d, int *__q) {
                 ^
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include/crt/math_functions.hpp:685:31: note: previous definition is here
__MATH_FUNCTIONS_DECL__ float remquo(float a, float b, int *quo)
                              ^
1 error generated when compiling for sm_75.
[12/873] Building CXX object External/CUDA/CMakeFiles/assert-cuda-11.0-c++14-libstdc++-7.dir/assert.cu.o
FAILED: External/CUDA/CMakeFiles/assert-cuda-11.0-c++14-libstdc++-7.dir/assert.cu.o 
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/bin/clang++  -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time -UNDEBUG --cuda-path=/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0 -I/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include --cuda-gpu-arch=sm_75 -std=c++14 -stdlib=libstdc++ -gcc-toolchain /buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/gcc-7 -DSTDLIB_VERSION=2014 -MD -MT External/CUDA/CMakeFiles/assert-cuda-11.0-c++14-libstdc++-7.dir/assert.cu.o -MF External/CUDA/CMakeFiles/assert-cuda-11.0-c++14-libstdc++-7.dir/assert.cu.o.d -o External/CUDA/CMakeFiles/assert-cuda-11.0-c++14-libstdc++-7.dir/assert.cu.o -c /buildbot/cuda-t4-0/work/None_clang-cuda-t4/llvm-test-suite/External/CUDA/assert.cu
In file included from <built-in>:1:
In file included from /buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_runtime_wrapper.h:412:
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/clang/lib/clang/12.0.0/include/__clang_cuda_cmath.h:177:18: error: redefinition of 'remquo'
__DEVICE__ float remquo(float __n, float __d, int *__q) {
                 ^
/buildbot/cuda-t4-0/work/None_clang-cuda-t4/external/cuda/cuda-11.0/include/crt/math_functions.hpp:685:31: note: previous definition is here
__MATH_FUNCTIONS_DECL__ float remquo(float a, float b, int *quo)
                              ^
1 error generated when compiling for sm_75.
tra added a comment.Oct 28 2020, 9:48 AM

Never mind. I see that you've fixed it in 17c8251bca83072d2f3e00f936d6ce24500e6b02

In D89971#2359398, @tra wrote:

Never mind. I see that you've fixed it in 17c8251bca83072d2f3e00f936d6ce24500e6b02

Yeah, as I write in the commit message, this confused me but the short term solution was simple so I went ahead.

tra added a comment.Oct 28 2020, 10:14 AM

I'll take a look at what broke the original patch.