This is an archive of the discontinued LLVM Phabricator instance.

[libc] Add float type and flag for nearest_integer to enable SSE4.2.
ClosedPublic

Authored by lntue on Jul 15 2022, 8:05 PM.

Details

Summary

Add float type and flag for nearest integer to automatically test with
and without SSE4.2 flag.

Diff Detail

Event Timeline

lntue created this revision.Jul 15 2022, 8:05 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 15 2022, 8:05 PM
lntue requested review of this revision.Jul 15 2022, 8:05 PM
sivachandra added inline comments.Jul 17 2022, 11:32 PM
libc/src/__support/FPUtil/nearest_integer.h
29

Can you explain where this overload is to be used? I am asking because I see couple of problems here:

  1. The constants that you used are of double type. So, the RHS expression on line 33 will get evaluated as a double expression and will be equivalent to r = x for a large range of x values.
  2. Even if you add a suffix of f to the constants, the behavior is still dependent on the value of FLT_EVAL_METHOD: https://en.cppreference.com/w/c/types/limits/FLT_EVAL_METHOD. Perhaps this aspect is not of a major concern on platforms like x86_64 and aarch64 anyway.

Going by #1 at least, I am surprised that this fallback implementation is serving as a drop-in-replacement to the x86_64 and aarch64 specializations. If a drop-in-replacement is in fact the intention, then the testing strategy should be this:

  1. Unit test this fallback implementation separately - testing with the user of this fallback will amount to a kind of integration test.
  2. Unit test the users with either the fallback or the preferred path. There is not much gain in unit testing the users for both the preferred path and the fallback path.
  3. If testing of the plumbing of the fallback/preferred path is desired, then testing for that should be done with explicit intent.

One more point that strikes me here is the comment you have above, "... in case of a tie, might pick a random one among 2 closest integers when the rounding mode is not FE_TONEAREST." If the users' algorithms are really tolerant to such "randomness", then perhaps we should give this fallback a name which captures that randomness aspect. Also, if there is only one user of this fallback, then may be this fallback should live with that user and not as utility?

lntue added inline comments.Jul 18 2022, 7:31 AM
libc/src/__support/FPUtil/nearest_integer.h
29

Thanks for spotting the data type issue!

About the intention of the change, this is to be used for improving exp*f functions as experimenting in https://reviews.llvm.org/D130008. The main thing that these functions are used to to replace the idiom for rounding to nearest integer that we've been using:

int k = static_cast<int>(x < 0 ? x - 0.5f : x + 0.5f);
float kf = static_cast<float>(k);

This by itself also has all the problems that you mentioned in the comment, such as FLT_EVAL_METHOD, possible different answers for different rounding modes, etc. Maybe I shouldn't use the word random. It well-defined, it's just too wordy to describe its behavior explicitly in all cases.

About testing, this function is similar to fputil::multiply_add and fputil::polyeval that might give slightly different answers with/without special instructions, and that the fallback path is never taken on aarch64. So it's suitable for a flag control in the same way:

  • The end goal for each math function is that the final outputs are identical regardless of which path nearest_integer takes, so technically, we should require both preferred path and fallback path to be tested, unless explicitly stated otherwise.

So to answer to your questions: this function is kind of similar to polyeval / multiply_add, and will be used by several functions, hence being shared here.

lntue updated this revision to Diff 445494.Jul 18 2022, 7:37 AM

Update comments and float literals.

sivachandra added inline comments.Jul 18 2022, 9:49 AM
libc/src/__support/FPUtil/nearest_integer.h
29

Thanks for spotting the data type issue!

About the intention of the change, this is to be used for improving exp*f functions as experimenting in https://reviews.llvm.org/D130008. The main thing that these functions are used to to replace the idiom for rounding to nearest integer that we've been using:

int k = static_cast<int>(x < 0 ? x - 0.5f : x + 0.5f);
float kf = static_cast<float>(k);

For my knowledge, can you explain what the problem is with this idiom and how the proposed implementation is solving it?

This by itself also has all the problems that you mentioned in the comment, such as FLT_EVAL_METHOD, possible different answers for different rounding modes, etc. Maybe I shouldn't use the word random. It well-defined, it's just too wordy to describe its behavior explicitly in all cases.

Considering that this fallback is not equivalent to the behavior of the native instructions, and that it was previously erroneous for some inputs, can I conclude that exp*f you are experimenting with is tolerant to such incorrectness?

About testing, this function is similar to fputil::multiply_add and fputil::polyeval that might give slightly different answers with/without special instructions, and that the fallback path is never taken on aarch64. So it's suitable for a flag control in the same way:

  • The end goal for each math function is that the final outputs are identical regardless of which path nearest_integer takes, so technically, we should require both preferred path and fallback path to be tested, unless explicitly stated otherwise.

So to answer to your questions: this function is kind of similar to polyeval / multiply_add, and will be used by several functions, hence being shared here.

lntue added inline comments.Jul 18 2022, 1:14 PM
libc/src/__support/FPUtil/nearest_integer.h
29

Thanks for spotting the data type issue!

About the intention of the change, this is to be used for improving exp*f functions as experimenting in https://reviews.llvm.org/D130008. The main thing that these functions are used to to replace the idiom for rounding to nearest integer that we've been using:

int k = static_cast<int>(x < 0 ? x - 0.5f : x + 0.5f);
float kf = static_cast<float>(k);

For my knowledge, can you explain what the problem is with this idiom and how the proposed implementation is solving it?

This idiom first converting a floating point type to an integer type, and then converting back to a floating point. CPUs seem to hate this ping pong back and forth between floating point and integer registers, significantly reduce the throughput and increase latency with its on the critical path, which is mostly floating point computations. The key point that makes this implementation improves the performance is that the entire computation is within floating point type and not too many branches.

This by itself also has all the problems that you mentioned in the comment, such as FLT_EVAL_METHOD, possible different answers for different rounding modes, etc. Maybe I shouldn't use the word random. It well-defined, it's just too wordy to describe its behavior explicitly in all cases.

Considering that this fallback is not equivalent to the behavior of the native instructions, and that it was previously erroneous for some inputs, can I conclude that exp*f you are experimenting with is tolerant to such incorrectness?

Yes, it can kind of tolerate these different inputs (I wouldn't call it incorrect, since technically they are all correct, just different decisions / rounding modes when there is a tie), probably few exceptional values need to be updated similar to sinf.

sivachandra accepted this revision.Jul 22 2022, 12:10 AM
This revision is now accepted and ready to land.Jul 22 2022, 12:10 AM