This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Add `__truncating_cast` for safely casting float types to integers
ClosedPublic

Authored by ldionne on Aug 27 2019, 3:24 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

EricWF created this revision.Aug 27 2019, 3:24 PM

Seems like a very useful function. __max_representable_int_for_float also seems useful. Should this work in C++03? If so there are a few changes that need to be made. It would also be great if this could be a constexpr (but, obviously, not necessary).

include/math.h
1556 ↗(On Diff #217513)

Seems odd this is the only thing in this file inside the standard namespace. Are we moving towards writing std::__helper instead of __libcpp_helper? It seems like the other helper functions in this file use the __libcpp prefix and aren't in the standard namespace.

1558 ↗(On Diff #217513)

Nit: maybe qualify all the uses of numeric_limits and similar?

1572 ↗(On Diff #217513)

What is the enum providing for you? Couldn't this just be static const int _Bits = ...?

1573 ↗(On Diff #217513)

What's the reasoning behind shifting something forward and back? Shouldn't this always negate the other operation?

1579 ↗(On Diff #217513)

This will not work before C++11.

1582 ↗(On Diff #217513)

Maybe change INFINITY to std::numeric_limits< _RealT >::infinity()

test/libcxx/numerics/truncating_cast.pass.cpp
10 ↗(On Diff #217513)

Is this supposed to work in C++03? If so, update this test and __truncating_cast. Otherwise, add an #if and a // UNSUPPORTED: C++98, C++03

25 ↗(On Diff #217513)

Maybe test with more than just double. float, long double, others?

28 ↗(On Diff #217513)

C++03 will not like this :P

scanon added a subscriber: scanon.Aug 28 2019, 8:52 AM

I would tend to write this function in the following form:

// set up lower bound and upper bound
if (r > upperBound) r = upperBound;
if (!(r >= lowerBound)) r = lowerBound; // NaN is mapped to l.b.
return static_cast<IntType>(r);

I prefer to avoid the explicit trunc call, since that's the defined behavior of the static_cast once the value is in-range, anyway.

include/math.h
1573 ↗(On Diff #217513)

This function doesn't quite do what it says on the tin; it considers the number of significand bits used for the floating-point type, but not the exponent range. This doesn't matter for double, because double's exponent range is much, much larger than any integer type, but it does matter for types like float16 (largest representable value is 65504)--when it's added as a standard floating-point type at some future point, this will introduce subtle bugs.

You should be able to work around this by converting value to _FloatT, taking the minimum of the result and numeric_limits::max, and converting back.

This also assumes that _FloatT has radix == 2, which I do not believe is actually implied by is_floating_point == true. Please add a static assert for that so that future decimal types don't use this template.

1582 ↗(On Diff #217513)

Why isn't this just __trunc_r > _MaxVal?

1584 ↗(On Diff #217513)

This has a subtle assumption that _IntT is two's-complement and _FloatT has radix=2, so that the implicit conversion that occurs in the comparison is exact. The radix should be a static assert; does libc++ care about non-two's-complement at all?

Just from a clarity perspective, I would personally make the conversion explicit.

1586 ↗(On Diff #217513)

If I'm reading right, NaNs will fall through the above two comparisons and invoke UB on the static_cast below. I suspect that's not the desired behavior. What is the intended result for NaN?

scanon added inline comments.Aug 28 2019, 8:56 AM
test/libcxx/numerics/truncating_cast.pass.cpp
36 ↗(On Diff #217513)

Probably should test nextafter(static_cast<double>(Lim::max()), INFINITY) here instead.

EricWF marked 10 inline comments as done.Aug 28 2019, 12:20 PM
EricWF added inline comments.
include/math.h
1556 ↗(On Diff #217513)

We shouldn't put things in the global namespace generally. It's fine that this is the only thing in namespace std is this file.

1558 ↗(On Diff #217513)

No need to qualify types. The lookup will be unambiguous.

1572 ↗(On Diff #217513)

It subtly enforces that the argument is a compile time constant without introducing a global variable declaration.

1579 ↗(On Diff #217513)

Using statements? Yes it will, but only because we require Clang in C++03 now. And Clang allows using statements as an extension.

1582 ↗(On Diff #217513)

Consider long long and double. MaxVal - numeric_limits<long long>::max() == 1024, and we want values between MaxVal and ::max() to round down. So instead we essentially check for __r >= numeric_limits<long long>::max() + 1. This approach seems more accurate.

1584 ↗(On Diff #217513)

I'll static assert the radix, but I think it's safe to assume twos compliment. According to p0907, none of MSVC, GCC, or Clang support other representations [1].

How would you make this explicit?

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r0.html

1586 ↗(On Diff #217513)

I didn't want to treat NaN as a valid input, so I want to allow UBSAN to catch it rather than to provide a valid output.

test/libcxx/numerics/truncating_cast.pass.cpp
10 ↗(On Diff #217513)

The test compiles and passes with C++03.

EricWF updated this revision to Diff 217700.Aug 28 2019, 12:20 PM

Address review comments. I think this is good to go.

scanon added inline comments.Aug 28 2019, 12:36 PM
include/math.h
1582 ↗(On Diff #217513)

Consider long long and double. MaxVal - numeric_limits<long long>::max() == 1024, and we want values between MaxVal and ::max() to round down. So instead we essentially check for __r >= numeric_limits<long long>::max() + 1

Yes, but there are no values between MaxVal and ::max() in the floating-point format; if there were, they would be MaxVal instead. So you can ditch the nextafter and just use > static_cast<_FloatT>(MaxVal).

scanon requested changes to this revision.Aug 28 2019, 12:38 PM
scanon added inline comments.
include/math.h
1586 ↗(On Diff #217513)

Please document this clearly; otherwise someone will assume that this is a UB-free conversion and use it for that purpose.

This revision now requires changes to proceed.Aug 28 2019, 12:38 PM
EricWF marked an inline comment as done.Aug 28 2019, 1:17 PM
EricWF added inline comments.
include/math.h
1573 ↗(On Diff #217513)

Very good point. I've added the static assertions and limited the function to float, double, and long double so the fp16 case won't bite us anytime soon.

zoecarver added inline comments.Aug 28 2019, 1:45 PM
include/math.h
1582 ↗(On Diff #217513)

I thought the same thing, but that isn't necessarily true. Eric showed me this link https://godbolt.org/z/AjBHYqv which does a good job showing what happens when trying to compare an integer value and float value. See the precision loss:

warning: implicit conversion from 'long long' to 'double' changes value from 9223372036854775807 to 9223372036854775808

Eric showed me this link https://godbolt.org/z/AjBHYqv

Dead link.

Yes, conversion of numeric_limits<long long>::max to double rounds to a value out of range for long long. That's not what I'm talking about. Very specifically, in this line:

if (__r >= ::nextafter(static_cast<_RealT>(_MaxVal), INFINITY))

_MaxVal, by construction, is representable both as _RealT and as _IntT, so the static_cast does not change the value (so the rounding demonstrated in your godbolt link doesn't create a bug). a >= nextafter(b, INFINITY) is equivalent to a > b for any finite floating-point a and b. So this condition can simply be if (__r > static_cast<_RealT>(_MaxVal)).

Yes, that sounds right. I can't think of any reason that the condition couldn't be if (__r > static_cast<floatT>(numeric_limits<intT>::max())). The information lost from shifting the value around is never more than the information lost from static_casting the value (as far as I have been able to reason and test).

majnemer added inline comments.
test/libcxx/numerics/truncating_cast.pass.cpp
12 ↗(On Diff #217513)

closes -> closest

EricWF updated this revision to Diff 218149.Aug 30 2019, 12:11 PM
EricWF marked an inline comment as done.

Address review comments.

  • Document that NaN isn't a supported input.
  • Fix spelling mistake.
ldionne commandeered this revision.Sep 3 2019, 2:27 PM
ldionne edited reviewers, added: EricWF; removed: ldionne.

Taking over because Eric is on vacation. I think everything has been addressed at this point.

@scanon do you see anything else that needs to change?

zoecarver accepted this revision.Sep 3 2019, 4:57 PM
scanon accepted this revision.Sep 4 2019, 5:24 AM

I believe that the code can still be simplified somewhat, but that it's correct as-is for float, double, and long double. I'll take an AI to follow-up on future improvements, and let's get this in.

This revision is now accepted and ready to land.Sep 4 2019, 5:24 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2019, 5:48 AM