Page MenuHomePhabricator

[flang][runtime] Fixed identity value for REAL(16) == __float128.
ClosedPublic

Authored by vzakhari on Sep 22 2022, 3:48 PM.

Details

Summary

std::numeric_limits<__float128>::max/lowest return 0.0, so recreate
value of FLT128_MAX ourselves to avoid using quadmath.h's FLT128_MAX
that is currently causes warnings with GCC -Wpedantic.

Diff Detail

Event Timeline

vzakhari created this revision.Sep 22 2022, 3:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 3:48 PM
Herald added a subscriber: jdoerfert. · View Herald Transcript
vzakhari requested review of this revision.Sep 22 2022, 3:48 PM

I consider it a bug in the Fortran standard that the identity values for real minval/maxval are not signed infinities on IEEE-754 targets.

Perhaps you could compute the largest finite number with scalbn or ldexp applied to nexttoward((quad precision)1.0, 0.0) on hosts whose numeric_limits header is broken.

Thanks!

I will use ldexpf128(nextafterf128(1.0, 0.0), __FLT128_MAX_EXP__). These functions are defined by glibc headers for GCC compiler. I could have used GCC's nextafterq/ldexpq, but I think it is better to use glibc names so that when they add support for clang it will hopefully work as-is. In any case, I will have to check in CMake file that these entry points are available and only then enable the REAL(16) specialization. Runtime will not work correctly, when built with clang, and the unittest will fail, so I will probably have to make the same checks in CMake files for the unittests.

This will allow getting rid of these weird -Wpedantic workaround.

You can also just initialize the value with a union with a 128-bit integer, using 0x7ffeffffffffffff for the most significant 64 bits and 0xffffffffffffffff for the least significant 64 bits. This would be even better since that's a known permanent constant.

You can also just initialize the value with a union with a 128-bit integer, using 0x7ffeffffffffffff for the most significant 64 bits and 0xffffffffffffffff for the least significant 64 bits. This would be even better since that's a known permanent constant.

This is a possibility, but I am not sure if it is safe to rely on particular __float128 representation. Is it guaranteed to comply with IEEE binary128 format in all implementations? I know we only set HAS_FLOAT128 to true for clang and gcc and I guess it should be a safe assumption, but I am not quite comfortable with the hard-coded constant.

Also, can there be different endianness for integer and floating point numbers on a target?

vzakhari updated this revision to Diff 462387.Sep 22 2022, 8:49 PM

You can also just initialize the value with a union with a 128-bit integer, using 0x7ffeffffffffffff for the most significant 64 bits and 0xffffffffffffffff for the least significant 64 bits. This would be even better since that's a known permanent constant.

This is a possibility, but I am not sure if it is safe to rely on particular __float128 representation. Is it guaranteed to comply with IEEE binary128 format in all implementations? I know we only set HAS_FLOAT128 to true for clang and gcc and I guess it should be a safe assumption, but I am not quite comfortable with the hard-coded constant.

Also, can there be different endianness for integer and floating point numbers on a target?

Yes, there is only one IEEE-754 representation for each size of numbers; implementations may vary their encodings only for signaling vs quiet NaNs. Yes, the PDP-11 and successors had a byte order for floating-point numbers that was neither little-endian nor big-endian.

If you use an expression to compute a known constant value, please at least check the result to ensure that it equals the known constant value.

You can also just initialize the value with a union with a 128-bit integer, using 0x7ffeffffffffffff for the most significant 64 bits and 0xffffffffffffffff for the least significant 64 bits. This would be even better since that's a known permanent constant.

This is a possibility, but I am not sure if it is safe to rely on particular __float128 representation. Is it guaranteed to comply with IEEE binary128 format in all implementations? I know we only set HAS_FLOAT128 to true for clang and gcc and I guess it should be a safe assumption, but I am not quite comfortable with the hard-coded constant.

Also, can there be different endianness for integer and floating point numbers on a target?

Yes, there is only one IEEE-754 representation for each size of numbers; implementations may vary their encodings only for signaling vs quiet NaNs. Yes, the PDP-11 and successors had a byte order for floating-point numbers that was neither little-endian nor big-endian.

If you use an expression to compute a known constant value, please at least check the result to ensure that it equals the known constant value.

Yes, it produces expected result on x86/arm/power with gcc.

You can also just initialize the value with a union with a 128-bit integer, using 0x7ffeffffffffffff for the most significant 64 bits and 0xffffffffffffffff for the least significant 64 bits. This would be even better since that's a known permanent constant.

This is a possibility, but I am not sure if it is safe to rely on particular __float128 representation. Is it guaranteed to comply with IEEE binary128 format in all implementations? I know we only set HAS_FLOAT128 to true for clang and gcc and I guess it should be a safe assumption, but I am not quite comfortable with the hard-coded constant.

Also, can there be different endianness for integer and floating point numbers on a target?

Yes, there is only one IEEE-754 representation for each size of numbers; implementations may vary their encodings only for signaling vs quiet NaNs. Yes, the PDP-11 and successors had a byte order for floating-point numbers that was neither little-endian nor big-endian.

If you use an expression to compute a known constant value, please at least check the result to ensure that it equals the known constant value.

Yes, it produces expected result on x86/arm/power with gcc.

I mean that you should check it in code with an assert.

But then I will have to use FLT128_MAX and quadmath.h, and I am back at the weird -Wpedantic issue :) can you please clarify how you suggest doing this check?

But then I will have to use FLT128_MAX and quadmath.h, and I am back at the weird -Wpedantic issue :) can you please clarify how you suggest doing this check?

Use hex.

But then I will have to use FLT128_MAX and quadmath.h, and I am back at the weird -Wpedantic issue :) can you please clarify how you suggest doing this check?

Use hex.

Or verify that the value is finite and that adding one ULP to it makes it infinite (or equivalently raises an overflow).

vzakhari updated this revision to Diff 462690.Sep 24 2022, 4:01 PM

Ok, then I will just use hex with some sanity verification.

vzakhari edited the summary of this revision. (Show Details)Sep 24 2022, 4:03 PM
jeanPerier accepted this revision.Sep 28 2022, 1:04 AM
This revision is now accepted and ready to land.Sep 28 2022, 1:04 AM