This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Math Headers to use type promotion
ClosedPublic

Authored by ashi1 on Oct 29 2020, 10:23 AM.

Details

Summary

Similar to libcxx implementation of cmath function
overloads, use type promotion templates to determine
return types of multi-argument math functions.

Diff Detail

Event Timeline

ashi1 requested review of this revision.Oct 29 2020, 10:23 AM
ashi1 created this revision.
tra added a comment.Oct 29 2020, 10:45 AM

I'm not quite sure what is the problem this patch is intended to solve. Could you give me more details?

clang/lib/Headers/__clang_hip_cmath.h
291

Will automatically derived return type ever be different from the __retty we're explicitly specifying now?

  • If the answer is no, then what does the patch buy us here? It looks like a more complicated way to do what we're already doing.
  • If yes, this will potentially create observable differences in code behavior before/after -std=c++11.
ashi1 added a comment.Oct 29 2020, 1:15 PM
In D90409#2362554, @tra wrote:

I'm not quite sure what is the problem this patch is intended to solve. Could you give me more details?

@tra, a problem arose with the fma function. When given fma(float, float, char), it was returning a double type. Instead, we want to be more similar to C++ and return the promoted type which is float in this case.
This patch tries to fix a few failures I introduced with my recent HIP header refactoring patch.

Also, HOLD on this patch. I just found another bug. Will update soon.

clang/lib/Headers/__clang_hip_cmath.h
291

Yes, sometimes the return type should be float when called by float mixed with char/int/short args.

tra added a comment.Oct 29 2020, 2:02 PM
In D90409#2362554, @tra wrote:

I'm not quite sure what is the problem this patch is intended to solve. Could you give me more details?

@tra, a problem arose with the fma function. When given fma(float, float, char), it was returning a double type. Instead, we want to be more similar to C++ and return the promoted type which is float in this case.
This patch tries to fix a few failures I introduced with my recent HIP header refactoring patch.

That is odd. char should've been promoted to float and fma(flot, float, float) should've been called and this patch should not have been necessary. https://cppinsights.io/s/7cdd71b7
If that's the case, then this patch may not do the right thing either -- it would force the arguments to the derived result type, but if fma(double) is the only choice, the arguments will be implicitly converted to double and back which is probably not what you want.

Perhaps the problem is that fma(flot, float, float) is not visible at the point where the overload resolution happens.

Also, HOLD on this patch. I just found another bug. Will update soon.

ashi1 added a comment.Nov 3 2020, 8:39 AM
In D90409#2363044, @tra wrote:
In D90409#2362554, @tra wrote:

I'm not quite sure what is the problem this patch is intended to solve. Could you give me more details?

@tra, a problem arose with the fma function. When given fma(float, float, char), it was returning a double type. Instead, we want to be more similar to C++ and return the promoted type which is float in this case.
This patch tries to fix a few failures I introduced with my recent HIP header refactoring patch.

That is odd. char should've been promoted to float and fma(flot, float, float) should've been called and this patch should not have been necessary. https://cppinsights.io/s/7cdd71b7
If that's the case, then this patch may not do the right thing either -- it would force the arguments to the derived result type, but if fma(double) is the only choice, the arguments will be implicitly converted to double and back which is probably not what you want.

Perhaps the problem is that fma(flot, float, float) is not visible at the point where the overload resolution happens.

Sorry, the original HIP test may be invalid, it was comparing the result of (float, float X) with the float result, and the (double, double, X) with the double result, so it expected (float, float, char) to equal (float, float, float). However, when I am looking at clang's libcxx implementation, the return type of (float, float, char) should instead be promoted to double. I think HIP wants to follow closely with the libcxx implementation of type promotion:
https://github.com/llvm/llvm-project/blob/master/libcxx/include/math.h#L1200

libcxx will change the return type to double, if any of the arguments are of the type: char/int/unsigned/long/ulong/longlong/ulonglong/double:

static void __test(...);
static float __test(float);
static double __test(char);
static double __test(int);
static double __test(unsigned);
static double __test(long);
static double __test(unsigned long);
static double __test(long long);
static double __test(unsigned long long);
static double __test(double);
ashi1 added a comment.Nov 3 2020, 8:47 AM

I tried the previous HIP FMA test, and it looks like libcxx's cmath is expecting fma(float, float, char) to be promoted to (double, double, double) and return type double:
https://cppinsights.io/s/ee45a5ca

ashi1 updated this revision to Diff 302600.Nov 3 2020, 8:55 AM

Revised the patch to match libcxx, fixed a bug in return type resolution, and ran clang-format on this patch.

tra accepted this revision.Nov 3 2020, 9:42 AM
tra added a subscriber: jlebar.

I was confused about type conversion. Apparently the standard library says :

If any argument has integral type, it is cast to double

LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?

This revision is now accepted and ready to land.Nov 3 2020, 9:42 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2020, 10:41 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?

I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode!

But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or something? Like, "this code you wrote maybe for nvcc is going to do something different with clang."

nvcc does not support fma(float,float,char)

https://godbolt.org/z/zxbMhP

clang's behavior was different from nvcc already.

tra added a comment.Nov 3 2020, 12:18 PM

LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?

I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode!

But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or something? Like, "this code you wrote maybe for nvcc is going to do something different with clang."

Interestingly enough CUDA 10.1+ already promotes integer fma() arguments to double:
https://godbolt.org/z/crbqTe

I wonder what makes HIP different to require this change.

tra added a comment.Nov 3 2020, 12:21 PM

nvcc does not support fma(float,float,char)

It does, it just needs an explicit flag to match clang's treatment of constexpr functions as HD.

In D90409#2371987, @tra wrote:

nvcc does not support fma(float,float,char)

It does, it just needs an explicit flag to match clang's treatment of constexpr functions as HD.

In D90409#2371972, @tra wrote:

LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?

I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode!

But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or something? Like, "this code you wrote maybe for nvcc is going to do something different with clang."

Interestingly enough CUDA 10.1+ already promotes integer fma() arguments to double:
https://godbolt.org/z/crbqTe

I wonder what makes HIP different to require this change.

Practically the behavior is the same since they all promote integer types to double. This matches the C++ behavior. However the HIP change will make it conform to C++ for a target supporting long double whereas the previous header did not.

In D90409#2371987, @tra wrote:

nvcc does not support fma(float,float,char)

It does, it just needs an explicit flag to match clang's treatment of constexpr functions as HD.

In D90409#2371972, @tra wrote:

LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?

I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode!

But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or something? Like, "this code you wrote maybe for nvcc is going to do something different with clang."

Interestingly enough CUDA 10.1+ already promotes integer fma() arguments to double:
https://godbolt.org/z/crbqTe

I wonder what makes HIP different to require this change.

Practically the behavior is the same since they all promote integer types to double. This matches the C++ behavior. However the HIP change will make it conform to C++ for a target supporting long double whereas the previous header did not.

Sorry I mean the change can make the header extendable to long double easily although it does not yet. Another thing is that it allows resolution of mixed argument types with _Float16.

tra added a comment.Nov 3 2020, 1:27 PM

Practically the behavior is the same since they all promote integer types to double. This matches the C++ behavior. However the HIP change will make it conform to C++ for a target supporting long double whereas the previous header did not.

Sorry I mean the change can make the header extendable to long double easily although it does not yet. Another thing is that it allows resolution of mixed argument types with _Float16.

OK. This makes more sense now. Thank you for the explanation.

While this does solve one particular instance of the issue, we can't jsut copy/paste bits of the standard library forever. We need something more robust.
NVIDIA now has their own fork of the standard library https://github.com/NVIDIA/libcudacxx and that may be a good starting point.
I think at some point we (HIP & CUDA owners) need to talk to libc++ maintainers and see if we can find a better way to extend the standard library to CUDA/HIP.

yaxunl added a comment.Nov 4 2020, 7:37 AM
In D90409#2372183, @tra wrote:

Practically the behavior is the same since they all promote integer types to double. This matches the C++ behavior. However the HIP change will make it conform to C++ for a target supporting long double whereas the previous header did not.

Sorry I mean the change can make the header extendable to long double easily although it does not yet. Another thing is that it allows resolution of mixed argument types with _Float16.

OK. This makes more sense now. Thank you for the explanation.

While this does solve one particular instance of the issue, we can't jsut copy/paste bits of the standard library forever. We need something more robust.
NVIDIA now has their own fork of the standard library https://github.com/NVIDIA/libcudacxx and that may be a good starting point.
I think at some point we (HIP & CUDA owners) need to talk to libc++ maintainers and see if we can find a better way to extend the standard library to CUDA/HIP.

Agree. A seamless native libc++ support for CUDA/HIP is very attractive. Even if just partial support. At least math functions to start with.