Details

Reviewers

michaelrj
sivachandra
zimmermann6

Commits

rGaad04534c419: [libc] Implement correct rounding with all rounding modes for hypot functions.

Summary

Update the rounding logic for generic hypot function so that it will round correctly with all rounding modes.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lntue created this revision. · Jan 18 2022, 11:12 AM

Herald added a project: Restricted Project. · Jan 18 2022, 11:12 AM

Herald added subscribers: ecnelises, tschuett.

lntue requested review of this revision. · Jan 18 2022, 11:12 AM

Harbormaster completed remote builds in B144068: Diff 400919. · Jan 18 2022, 11:17 AM

LGTM from a formatting perspective

Dear Tue,

> Update the rounding logic for generic hypot function so that it will round correctly with all rounding modes.

I did stress this revision with my random testing code
(https://gitlab.inria.fr/zimmerma/math_accuracy) and did not find any
incorrectly rounded results, for all rounding modes, both for binary32
and binary64.

Instead of testing get_round() == FE_UPWARD to know the current rounding
mode, I wonder whether a test like 0x1p0f + 0x1p-24f != 0x1p0f would be
faster.

Paul
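
For illustration, a minimal sketch of the floating-point check suggested above. The helper name `rounding_is_upward` is hypothetical and not part of the patch. The sum 1 + 2^-24 lies exactly halfway between 1 and the next binary32 value, so it rounds back to 1 in every rounding mode except FE_UPWARD. The `volatile` qualifiers are an assumption of this sketch: they are one way to force the addition to happen at run time instead of being folded by the compiler.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not from the patch): detects FE_UPWARD with a single
// float addition instead of querying the floating-point environment.
inline bool rounding_is_upward() {
  // volatile keeps the compiler from evaluating the sum at compile time.
  volatile float one = 1.0f;
  volatile float tiny = std::ldexp(1.0f, -24); // 2^-24, half an ulp of 1.0f
  volatile float sum = one + tiny;             // rounded in the current mode
  // Only rounding upward moves the sum to the next float above 1.
  return sum != 1.0f;
}
```

Whether this is faster than a call such as get_round() depends on the target and the compiler, so it would need to be measured.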

Thanks Paul for testing the patch!

I tried using 0x1p0f + 0x1p-24f != 0x1p0f instead of get_round() == FE_UPWARD, and it does make the perf tests on the normal range about 5% faster.
But that is because the compiler optimized the expression away (folding it to always false), which in turn makes the function no longer correctly rounded for all rounding modes: https://godbolt.org/z/87z4bWE9P

And I feel that if we add extra code to prevent the compiler from optimizing the expression away, performance would go back to what we get with get_round() == FE_UPWARD.
This check also belongs to the exceptional cases where we short-circuit the result, so any change inside it would at least not affect the worst-case performance.

Dear Tue,

> I tried using 0x1p0f + 0x1p-24f != 0x1p0f instead of get_round() == FE_UPWARD, and it does make the perf tests on the normal range about 5% faster.
> But that is because the compiler optimized the expression away (folding it to always false), which in turn makes the function no longer correctly rounded for all rounding modes: https://godbolt.org/z/87z4bWE9P

Yes, you need to add -frounding-math to the compiler options.

> And I feel that if we add extra code to prevent the compiler from optimizing the expression away, performance would go back to what we get with get_round() == FE_UPWARD.
> This check also belongs to the exceptional cases where we short-circuit the result, so any change inside it would at least not affect the worst-case performance.

Please can you try with -frounding-math and check whether the perf tests are
faster or slower?

Paul

Use a faster check for FE_UPWARD rounding mode.

Herald added a subscriber: mgorny. · Jan 19 2022, 8:19 AM

Thanks Paul! I've added -frounding-math and it works correctly while keeping the 5% overall performance improvement compared to get_round().

I've updated the patch accordingly. You might have to rerun the stress tests to make sure everything still works properly.

Thanks!

Harbormaster completed remote builds in B144307: Diff 401251. · Jan 19 2022, 8:54 AM

Remove warnings for perf tests.

sivachandra added inline comments. · Jan 19 2022, 10:39 AM

libc/test/src/math/HypotTest.h
Line 77: Is the call to `func` happening with the intended rounding mode?

lntue added inline comments. · Jan 19 2022, 11:03 AM

libc/test/src/math/HypotTest.h
Line 77: Yes. Because the call is passed as a macro argument, the macro places it after the point where the rounding mode is set, so it is evaluated under the intended mode. If this did not work as intended, the assertions for the different rounding modes would definitely catch it: most of the outputs are not exactly representable in floating point, so rounding up and rounding down are guaranteed to give different results.
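
To make the macro argument point concrete, here is a small sketch under stated assumptions. `ScopedRounding`, `EVAL_WITH_ROUNDING` and `add_half_ulp_to_one` are hypothetical names for this illustration only; this is not how ASSERT_MPFR_MATCH_ALL_ROUNDING is implemented. The expression text is pasted after the guard has switched the rounding mode, so it runs under that mode, whereas a result computed into a variable beforehand would not.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical stand-in for the function under test: the exact sum is not
// representable in binary32, so the result depends on the rounding mode.
inline float add_half_ulp_to_one() {
  volatile float one = 1.0f;
  volatile float tiny = 5.9604644775390625e-8f; // exactly 2^-24
  volatile float sum = one + tiny;
  return sum;
}

// Hypothetical RAII guard: sets a rounding mode and restores the old one.
struct ScopedRounding {
  int saved;
  explicit ScopedRounding(int mode) : saved(std::fegetround()) {
    std::fesetround(mode);
  }
  ~ScopedRounding() { std::fesetround(saved); }
};

// The argument `expr` is substituted textually, so it is evaluated inside the
// lambda, after the guard has changed the rounding mode.
#define EVAL_WITH_ROUNDING(mode, expr)                                         \
  ([&]() {                                                                     \
    ScopedRounding guard(mode);                                                \
    return (expr);                                                             \
  }())
```

With this sketch, `EVAL_WITH_ROUNDING(FE_UPWARD, add_half_ulp_to_one())` and `EVAL_WITH_ROUNDING(FE_DOWNWARD, add_half_ulp_to_one())` give different values, which is the property the reply relies on.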

Please wait for @zimmermann6 to give his green light.

This revision is now accepted and ready to land. · Jan 19 2022, 11:07 AM

Harbormaster completed remote builds in B144360: Diff 401323. · Jan 19 2022, 12:21 PM

I still get warnings with the latest revision:

/localdisk/zimmerma/llvm-project/libc/src/__support/FPUtil/Hypot.h:149:22: warning: hexadecimal floating literals are a C++17 feature [-Wc++17-extensions]
    if ((y != 0) && (0x1p0f + 0x1p-24f != 0x1p0f)) {
                     ^

the stress tests were successful (for all four rounding modes, both in single and double precision).
Thus I am ok with this version, thanks!

after fixing my stress program I was able to find one value which does not seem to be correctly rounded (for binary32 and rounding up):

zimmerma@biscotte:~/svn/tbd/20/src/binary32$ CFLAGS=-DCHECK_CR LLVM=llvm-project VERBOSE=-v RND=rndu ./doitb.llvm hypot 1000
Checking hypot with llvm-project and rndu
Using seed 1076573
NEW hypot 0 -1 0x1.ffffecp-1,-0x1.000002p+27 [1.00] 1 1
libm gives 0x1.000002p+27
mpfr gives 0x1.000004p+27

Please can you confirm?
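
A short check of the arithmetic behind this report, followed by a sketch. Here |y| = 0x1.000002p+27 = 2^27 + 16 and x is just below 1, so the exact value is |y| * sqrt(1 + (x/y)^2), roughly |y| + x^2 / (2|y|), which is about |y| + 2^-28. One ulp of |y| in binary32 is 16, so the exact result sits barely above |y|: rounding to nearest, downward or toward zero gives 0x1.000002p+27, while rounding upward must give the next float, 0x1.000004p+27, as MPFR reports. The function below is a hypothetical model of the short-circuit branch for binary32, not the patch code; it uses `std::nextafter` where the patch increments the bit pattern.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of the short-circuit branch: `small` is so much smaller
// than `big` that it only matters when rounding upward.
inline float hypot_far_apart(float big, float small, bool round_up) {
  float result = std::fabs(big);
  // A nonzero `small` makes the exact value slightly larger than |big|,
  // so rounding upward has to return the next representable float.
  if (round_up && small != 0.0f)
    result = std::nextafter(result, std::numeric_limits<float>::infinity());
  return result;
}
```

For the reported inputs, written in decimal, `hypot_far_apart(-134217744.0f, 0.9999994f, true)` returns 134217760.0f, that is 0x1.000004p+27.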

This revision now requires changes to proceed. · Jan 20 2022, 2:52 AM

I got similar results with binary64:

Checking hypot with llvm-project and rndu
Using seed 1078001
NEW hypot 0 -1 0x1.ccbbbcfef3c02p-523,0x1.924bf639c1a94p+500 [1.00] 1 1
libm gives 0x1.924bf639c1a94p+500
mpfr gives 0x1.924bf639c1a95p+500

Fix compiler flags that were added to the wrong entrypoints.

Thanks Paul for catching the problem! For some reason, when I merged to head before my previous patch update, the -frounding-math compiler flags ended up on the wrong entrypoints and I didn't notice. After moving them back to hypotf and hypot, the results are correct on my side now.

Harbormaster completed remote builds in B144570: Diff 401620. · Jan 20 2022, 6:43 AM

I'm ok with the new revision. However I see there are still some calls to get_round(). Did you try to replace them by floating-point operations?

You might also want to add the following hard-to-round cases (for binary32) in your test cases:

/* the following are hard-to-round cases with many identical bits after       
   the round bit */
{0x1.900004p+34,0x1.400002p+23}, /* 45 identical bits */
{0x1.05555p+34,0x1.bffffep+23},  /* 44 identical bits */
{0x1.e5fffap+34,0x1.affffep+23}, /* 45 identical bits */
{0x1.260002p+34,0x1.500002p+23}, /* 45 identical bits */
{0x1.fffffap+34,0x1.fffffep+23}, /* 45 identical bits */
{0x1.8ffffap+34,0x1.3ffffep+23}, /* 45 identical bits */
{0x1.87fffcp+35,0x1.bffffep+23}, /* 47 identical bits */

By the way, none of the other libraries is correctly rounded for binary32, here are the corresponding worst cases:

/* hypot(x,y) */
{0x1.b8e50ap-52,-0x1.db1e78p-64},   /* GNU libc       0.500001 */
{0x1.03b54cp-33,0x1.6ca6bep-45},    /* icc            0.500001 */
{0x1.e2eff6p+97,-0x1.044cb2p+108},  /* AMD LibM       0.500001 */
{-0x1.6b05c4p-127,0x1.6b3146p-126}, /* Newlib         1.20805 */
{-0x1.6b05c4p-127,0x1.6b3146p-126}, /* OpenLibm       1.20805 */
{0x1.26b188p-127,-0x1.a4f2fp-128},  /* Musl           0.926707 */
{0x1.e2eff6p+97,-0x1.044cb2p+108},  /* Darwin 20.4.0  0.500001 */

This revision is now accepted and ready to land. · Jan 20 2022, 7:31 AM

wc.1231 KBDownload

attached is a file with 1200 binary32 exact cases with ulp(x)=2^12*ulp(y), x^2+y^2=z^2 having up to 72 bits. You might add them to your test cases.

Add more hard-to-round tests.

Add even more hard-to-round tests from Paul.

Thanks Paul for the test cases! I've added these and the other 1.2k cases in your attachment to the test.

I have also tried replacing the final get_round() with the corresponding floating-point operations, but for some reason -O3 always partially optimized them away, even with the -frounding-math flag, and that made the results incorrect.
If I change the flag to -O2, the results are correct, but I see virtually no performance improvement. So I decided to keep the get_round() at the end, so that the code is easier to read and less sensitive to optimization flags.
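
For reference, the decision made by that final get_round() switch can be sketched as a standalone function. `should_increment` is a hypothetical name, and the single `sticky` flag here stands for the patch's `sticky_bits || (r != 0)`. The default branch assumes, as in the diff, that FE_DOWNWARD and FE_TOWARDZERO simply truncate, which is valid because a hypot result is never negative.

```cpp
#include <cassert>
#include <cfenv>

// lsb:       lowest bit that is kept in the result
// round_bit: first bit that is discarded
// sticky:    true if any lower discarded bit (or the remainder) is nonzero
inline bool should_increment(int mode, bool lsb, bool round_bit, bool sticky) {
  switch (mode) {
  case FE_TONEAREST:
    // Above the halfway point, or exactly halfway with an odd kept value.
    return round_bit && (lsb || sticky);
  case FE_UPWARD:
    // Any discarded nonzero bit pushes a non-negative result up.
    return round_bit || sticky;
  default:
    // FE_DOWNWARD and FE_TOWARDZERO: truncate.
    return false;
  }
}
```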

Harbormaster completed remote builds in B144610: Diff 401673. · Jan 20 2022, 10:32 AM

Closed by commit rGaad04534c419: [libc] Implement correct rounding with all rounding modes for hypot functions. (authored by lntue). · Jan 20 2022, 10:33 AM

This revision was automatically updated to reflect the committed changes.

lntue added a commit: rGaad04534c419: [libc] Implement correct rounding with all rounding modes for hypot functions.

a performance graph is available at https://core-math.gitlabpages.inria.fr/graph_perf_hypotf.pdf

Dear Tue,

> I have also tried replacing the final get_round() with the corresponding floating-point operations, but for some reason -O3 always partially optimized them away, even with the -frounding-math flag, and that made the results incorrect.
> If I change the flag to -O2, the results are correct, but I see virtually no performance improvement. So I decided to keep the get_round() at the end, so that the code is easier to read and less sensitive to optimization flags.

this might be a compiler bug, since -frounding-math should be supported
whatever the optimization level (-O2 or -O3). You might want to investigate
further and report the bug if any.

Best regards,
Paul

> this might be a compiler bug, since -frounding-math should be supported whatever the optimization level (-O2 or -O3). You might want to investigate further and report the bug if any.

Hi, what is the target in this context? Or can you show the partial IR produced with -frounding-math? Currently only X86, SystemZ and PowerPC in LLVM have reasonably complete support for rounding-mode-aware floating-point operations.

These are the messages that we got on aarch64-ubuntu (buildbot log: https://lab.llvm.org/buildbot/#/builders/138/builds/16983/steps/4/logs/stdio):

[1/3] Building CXX object projects/libc/src/math/generic/CMakeFiles/libc.src.math.generic.hypotf.dir/hypotf.cpp.o
clang-9: warning: optimization flag '-frounding-math' is not supported [-Wignored-optimization-argument]
[2/3] Building CXX object projects/libc/src/math/generic/CMakeFiles/libc.src.math.generic.hypot.dir/hypot.cpp.o
clang-9: warning: optimization flag '-frounding-math' is not supported [-Wignored-optimization-argument]

It looks like our buildbot for aarch64 uses clang-9 and -frounding-math is not supported there?

> This is the error messages that we got on aarch64-ubuntu: Buildbot log https://lab.llvm.org/buildbot/#/builders/138/builds/16983/steps/4/logs/stdio

it would be better to print the inputs with %e, %g or %a in the checking code,
since with %f tiny inputs simply appear as 0:
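
A small sketch of the difference between these conversions, using only `std::snprintf`. The helper names `format_f`, `format_e` and `format_a` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// %f prints a fixed number of digits after the decimal point, so values
// far below 1e-6 are shown as zero.
inline std::string format_f(double v) {
  char buf[512]; // large enough for the widest finite double printed with %f
  std::snprintf(buf, sizeof buf, "%f", v);
  return buf;
}

// %e keeps the leading significant digits and prints the decimal exponent.
inline std::string format_e(double v) {
  char buf[64];
  std::snprintf(buf, sizeof buf, "%e", v);
  return buf;
}

// %a prints the value in hexadecimal floating-point notation, the same
// notation used for the test inputs quoted in this review.
inline std::string format_a(double v) {
  char buf[64];
  std::snprintf(buf, sizeof buf, "%a", v);
  return buf;
}
```

For example, `format_f(1e-300)` yields "0.000000", while `format_e(1e-300)` yields "1.000000e-300".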

Input decimal: x: 0.00000000000000000000000000000000000000000000000000 y: 1797693134862315708145274237317043567980705675258449965989174768031572607800285387605895586327668781715404589535143824642343213268894641827684675467035375169860499105765512820762454900903893289440758

Diff 401620

libc/src/__support/FPUtil/Hypot.h

 //===-- Implementation of hypotf function ---------------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_LIBC_SRC_SUPPORT_FPUTIL_HYPOT_H
 #define LLVM_LIBC_SRC_SUPPORT_FPUTIL_HYPOT_H

 #include "BasicOperations.h"
+#include "FEnvImpl.h"
 #include "FPBits.h"
 #include "src/__support/CPP/TypeTraits.h"

 namespace __llvm_libc {
 namespace fputil {

 namespace internal {

[117 unchanged lines hidden; enclosing scope: static inline T hypot(T x, T y) {]
   uint16_t a_exp, b_exp, out_exp;
   UIntType a_mant, b_mant;
   DUIntType a_mant_sq, b_mant_sq;
   bool sticky_bits;

   if ((x_bits.get_unbiased_exponent() >=
        y_bits.get_unbiased_exponent() + MantissaWidth<T>::VALUE + 2) ||
       (y == 0)) {
+    // Check if the rounding mode is FE_UPWARD, will need -frounding-math so
+    // that the compiler does not optimize it away.
+    if ((y != 0) && (0x1p0f + 0x1p-24f != 0x1p0f)) {
+      UIntType out_bits = FPBits_t(abs(x)).uintval();
+      return T(FPBits_t(++out_bits));
+    }
     return abs(x);
   } else if ((y_bits.get_unbiased_exponent() >=
               x_bits.get_unbiased_exponent() + MantissaWidth<T>::VALUE + 2) ||
              (x == 0)) {
-    y_bits.set_sign(0);
+    // Check if the rounding mode is FE_UPWARD, will need -frounding-math so
+    // that the compiler does not optimize it away.
+    if ((x != 0) && (0x1p0f + 0x1p-24f != 0x1p0f)) {
+      UIntType out_bits = FPBits_t(abs(y)).uintval();
+      return T(FPBits_t(++out_bits));
+    }
     return abs(y);
   }

   if (abs(x) >= abs(y)) {
     a_exp = x_bits.get_unbiased_exponent();
     a_mant = x_bits.get_mantissa();
     b_exp = y_bits.get_unbiased_exponent();
     b_mant = y_bits.get_mantissa();
[86 unchanged lines hidden; enclosing scope: if (y_new >= ONE) {]
     if (out_exp == 0) {
       out_exp = 1;
     }
   }

   y_new >>= 1;

   // Round to the nearest, tie to even.
-  if (round_bit && (lsb || sticky_bits || (r != 0))) {
+  switch (get_round()) {
+  case FE_TONEAREST:
+    // Round to nearest, ties to even
+    if (round_bit && (lsb || sticky_bits || (r != 0)))
+      ++y_new;
+    break;
+  case FE_UPWARD:
+    if (round_bit || sticky_bits || (r != 0))
       ++y_new;
+    break;
   }

   if (y_new >= (ONE >> 1)) {
     y_new -= ONE >> 1;
     ++out_exp;
     if (out_exp >= FPBits_t::MAX_EXPONENT) {
       return T(FPBits_t::inf());
     }
[10 unchanged lines hidden]

libc/src/math/generic/CMakeLists.txt

[948 unchanged lines hidden; enclosing scope: add_entrypoint_object(]
   hypotf
   SRCS
     hypotf.cpp
   HDRS
     ../hypotf.h
   DEPENDS
     libc.src.__support.FPUtil.fputil
   COMPILE_OPTIONS
-    -O2
+    -O3
+    -frounding-math
+    -Wno-c++17-extensions
 )

 add_entrypoint_object(
   fdim
   SRCS
     fdim.cpp
   HDRS
     ../fdim.h
[31 unchanged lines hidden; enclosing scope: add_entrypoint_object(]
   hypot
   SRCS
     hypot.cpp
   HDRS
     ../hypot.h
   DEPENDS
     libc.src.__support.FPUtil.fputil
   COMPILE_OPTIONS
-    -O2
+    -O3
+    -frounding-math
+    -Wno-c++17-extensions
 )

 add_entrypoint_object(
   nextafter
   SRCS
     nextafter.cpp
   HDRS
     ../nextafter.h
[41 unchanged lines hidden]

libc/test/src/math/CMakeLists.txt

[1,054 unchanged lines hidden; enclosing scope: add_fp_unittest(]
   SUITE
     libc_math_unittests
   SRCS
     hypotf_test.cpp
   DEPENDS
     libc.include.math
     libc.src.math.hypotf
     libc.src.__support.FPUtil.fputil
+  COMPILE_OPTIONS
+    -Wno-c++17-extensions
 )

 add_fp_unittest(
   hypot_test
   NEED_MPFR
   SUITE
     libc_math_unittests
   SRCS
     hypot_test.cpp
   DEPENDS
     libc.include.math
     libc.src.math.hypot
     libc.src.__support.FPUtil.fputil
+  COMPILE_OPTIONS
+    -Wno-c++17-extensions
 )

 add_fp_unittest(
   nextafter_test
   SUITE
     libc_math_unittests
   SRCS
     nextafter_test.cpp
[129 unchanged lines hidden]

libc/test/src/math/HypotTest.h

 //===-- Utility class to test different flavors of hypot ------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_LIBC_TEST_SRC_MATH_HYPOTTEST_H
 #define LLVM_LIBC_TEST_SRC_MATH_HYPOTTEST_H

 #include "src/__support/FPUtil/FPBits.h"
-#include "src/__support/FPUtil/Hypot.h"
 #include "utils/MPFRWrapper/MPFRUtils.h"
 #include "utils/UnitTest/FPMatcher.h"
 #include "utils/UnitTest/Test.h"

 #include <math.h>

 namespace mpfr = __llvm_libc::testing::mpfr;

[16 unchanged lines hidden; enclosing scope: void test_special_numbers(Func func) {]
     EXPECT_FP_EQ(func(zero, inf), inf);
     EXPECT_FP_EQ(func(neg_inf, neg_zero), inf);

     EXPECT_FP_EQ(func(nan, nan), nan);
     EXPECT_FP_EQ(func(nan, zero), nan);
     EXPECT_FP_EQ(func(neg_zero, nan), nan);

     EXPECT_FP_EQ(func(neg_zero, zero), zero);

+    T x = 0x1.ffffecp-1f;
+    T y = 0x1.000002p+27;
+    mpfr::BinaryInput<T> input{x, y};
+    ASSERT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Hypot, input, func(x, y),
+                                   0.5);
+    x = 0x1.ccbbbcfef3c02p-523;
+    y = 0x1.924bf639c1a94p+500;
+    input = mpfr::BinaryInput<T>{x, y};
+    ASSERT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Hypot, input, func(x, y),
+                                   0.5);
   }

   void test_subnormal_range(Func func) {
     constexpr UIntType COUNT = 1000001;
     for (unsigned scale = 0; scale < 4; ++scale) {
       UIntType max_value = FPBits::MAX_SUBNORMAL << scale;
       UIntType step = (max_value - FPBits::MIN_SUBNORMAL) / COUNT;
       for (int signs = 0; signs < 4; ++signs) {
         for (UIntType v = FPBits::MIN_SUBNORMAL, w = max_value;
              v <= max_value && w >= FPBits::MIN_SUBNORMAL;
              v += step, w -= step) {
           T x = T(FPBits(v)), y = T(FPBits(w));
           if (signs % 2 == 1) {
             x = -x;
           }
           if (signs >= 2) {
             y = -y;
           }

-          T result = func(x, y);
           mpfr::BinaryInput<T> input{x, y};
-          ASSERT_MPFR_MATCH(mpfr::Operation::Hypot, input, result, 0.5);
+          ASSERT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Hypot, input,
+                                         func(x, y), 0.5);
         }
       }
     }
   }

   void test_normal_range(Func func) {
     constexpr UIntType COUNT = 1000001;
     constexpr UIntType STEP = (FPBits::MAX_NORMAL - FPBits::MIN_NORMAL) / COUNT;
     for (int signs = 0; signs < 4; ++signs) {
       for (UIntType v = FPBits::MIN_NORMAL, w = FPBits::MAX_NORMAL;
            v <= FPBits::MAX_NORMAL && w >= FPBits::MIN_NORMAL;
            v += STEP, w -= STEP) {
         T x = T(FPBits(v)), y = T(FPBits(w));
         if (signs % 2 == 1) {
           x = -x;
         }
         if (signs >= 2) {
           y = -y;
         }

-        T result = func(x, y);
         mpfr::BinaryInput<T> input{x, y};
-        ASSERT_MPFR_MATCH(mpfr::Operation::Hypot, input, result, 0.5);
+        ASSERT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Hypot, input,
+                                       func(x, y), 0.5);
       }
     }
   }
 };

 #endif // LLVM_LIBC_TEST_SRC_MATH_HYPOTTEST_H

libc/test/src/math/differential_testing/CMakeLists.txt

[400 unchanged lines hidden; enclosing scope: add_diff_binary(]
   hypotf_perf
   SRCS
     hypotf_perf.cpp
   DEPENDS
     .binary_op_single_output_diff
     libc.src.math.hypotf
   COMPILE_OPTIONS
     -fno-builtin
+    -Wno-c++17-extensions
 )

 add_diff_binary(
   hypot_perf
   SRCS
     hypot_perf.cpp
   DEPENDS
     .binary_op_single_output_diff
     libc.src.math.hypot
   COMPILE_OPTIONS
     -fno-builtin
+    -Wno-c++17-extensions
 )

This is an archive of the discontinued LLVM Phabricator instance.

[libc] Implement correct rounding with all rounding modes for hypot functions.
ClosedPublic
