Similar to libcxx implementation of cmath function
overloads, use type promotion templates to determine
return types of multi-argument math functions.
Details
- Reviewers
yaxunl tra - Commits
- rGca5b31502c82: [HIP] Math Headers to use type promotion
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm not quite sure what is the problem this patch is intended to solve. Could you give me more details?
clang/lib/Headers/__clang_hip_cmath.h | ||
---|---|---|
291 | Will automatically derived return type ever be different from the __retty we're explicitly specifying now?
|
@tra, a problem arose with the fma function. When given fma(float, float, char), it was returning a double type. Instead, we want to be more similar to C++ and return the promoted type which is float in this case.
This patch tries to fix a few failures I introduced with my recent HIP header refactoring patch.
Also, HOLD on this patch. I just found another bug. Will update soon.
clang/lib/Headers/__clang_hip_cmath.h | ||
---|---|---|
291 | Yes, sometimes the return type should be float when called by float mixed with char/int/short args. |
That is odd. char should've been promoted to float and fma(flot, float, float) should've been called and this patch should not have been necessary. https://cppinsights.io/s/7cdd71b7
If that's the case, then this patch may not do the right thing either -- it would force the arguments to the derived result type, but if fma(double) is the only choice, the arguments will be implicitly converted to double and back which is probably not what you want.
Perhaps the problem is that fma(flot, float, float) is not visible at the point where the overload resolution happens.
Also, HOLD on this patch. I just found another bug. Will update soon.
Sorry, the original HIP test may be invalid, it was comparing the result of (float, float X) with the float result, and the (double, double, X) with the double result, so it expected (float, float, char) to equal (float, float, float). However, when I am looking at clang's libcxx implementation, the return type of (float, float, char) should instead be promoted to double. I think HIP wants to follow closely with the libcxx implementation of type promotion:
https://github.com/llvm/llvm-project/blob/master/libcxx/include/math.h#L1200
libcxx will change the return type to double, if any of the arguments are of the type: char/int/unsigned/long/ulong/longlong/ulonglong/double:
static void __test(...); static float __test(float); static double __test(char); static double __test(int); static double __test(unsigned); static double __test(long); static double __test(unsigned long); static double __test(long long); static double __test(unsigned long long); static double __test(double);
I tried the previous HIP FMA test, and it looks like libcxx's cmath is expecting fma(float, float, char) to be promoted to (double, double, double) and return type double:
https://cppinsights.io/s/ee45a5ca
Revised the patch to match libcxx, fixed a bug in return type resolution, and ran clang-format on this patch.
LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT?
I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode!
But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or something? Like, "this code you wrote maybe for nvcc is going to do something different with clang."
nvcc does not support fma(float,float,char)
clang's behavior was different from nvcc already.
Interestingly enough CUDA 10.1+ already promotes integer fma() arguments to double:
https://godbolt.org/z/crbqTe
I wonder what makes HIP different to require this change.
It does, it just needs an explicit flag to match clang's treatment of constexpr functions as HD.
Practically the behavior is the same since they all promote integer types to double. This matches the C++ behavior. However the HIP change will make it conform to C++ for a target supporting long double whereas the previous header did not.
Sorry I mean the change can make the header extendable to long double easily although it does not yet. Another thing is that it allows resolution of mixed argument types with _Float16.
OK. This makes more sense now. Thank you for the explanation.
While this does solve one particular instance of the issue, we can't jsut copy/paste bits of the standard library forever. We need something more robust.
NVIDIA now has their own fork of the standard library https://github.com/NVIDIA/libcudacxx and that may be a good starting point.
I think at some point we (HIP & CUDA owners) need to talk to libc++ maintainers and see if we can find a better way to extend the standard library to CUDA/HIP.
Agree. A seamless native libc++ support for CUDA/HIP is very attractive. Even if just partial support. At least math functions to start with.
Will automatically derived return type ever be different from the __retty we're explicitly specifying now?