Implement log2f based on RLIBM library correctly rounded for all rounding modes.
Repository: rG LLVM Github Monorepo
Event Timeline
libc/src/math/generic/common_constants.h:16
is this intended to be a long term solution? It feels like there should be an easier way to handle the use of hexadecimal float constants.
libc/src/math/generic/common_constants.h:16
I think the long term solution is to build libc with the C++17 standard and get rid of these.
libc/src/math/generic/common_constants.h:16
Can you check if adding -std=c++17 to COMPILE_OPTIONS fixes the warning problem for you?
libc/src/math/generic/common_constants.h:16
I tried adding -std=c++17 to the CMakeLists.txt files in src/math, src/math/generic, and test/src/math, but the warnings still show up when the pragmas are removed.
this patch does not apply to the current main branch (db5aceb):
$ patch -p1 -i /tmp/D115828.diff
patching file libc/config/linux/aarch64/entrypoints.txt
Hunk #1 FAILED at 136.
1 out of 1 hunk FAILED -- saving rejects to file libc/config/linux/aarch64/entrypoints.txt.rej
It seems this patch was built on the branch which adds logf.
libc/src/math/generic/common_constants.h:16
I think we pass -std=c++14 explicitly also and that confuses the compiler. You should be able to add -Wno-c++17-extensions to COMPILE_OPTIONS.
libc/src/math/generic/common_constants.h:18
When you have large common global data, a better approach would be to put them in an object library like this: https://github.com/llvm/llvm-project/blob/main/libc/src/math/generic/CMakeLists.txt#L49. And then use it as a dep for all users.
I've rebased the patch. Can you check to see if it works now? You might need to patch this on top of the logf patch if possible.
Accepting the mechanics of the change. Please wait for Paul and others for acceptance of the math parts.
the new version applies cleanly to the main branch. I have tested it on x86_64 under Linux (Haswell). I confirm it is correctly rounded (CR) for rounding to nearest, and I get 3 failures if I disable the 3 exceptional cases. For the other rounding modes I get 8 failures for rounding towards zero (with the exceptional cases), 8 failures too for rounding towards -Inf, and 7 failures for rounding towards +Inf.
I tried with a polynomial generated by Sollya; with this polynomial we need no exceptional cases, and the routine is CR for all rounding modes (can someone please confirm?):
double r = __llvm_libc::fputil::polyeval( d, extra_factor, 0x1.71547652b7fefp+0, -0x1.715476500a42ep-1, 0x1.ec70917f77152p-2, -0x1.71482b204ea69p-2, 0x1.21da0eb07c659p-2);
For the record this polynomial was obtained with the following input file (then run sollya log2f.sollya):
n = 5; /* polynomial degree */
P = 53; /* precision of the coefficients */
pretty = proc(u) { return ~(floor(u*1000)/1000); };
d = [0, 1/2^7];
f = log2(1+x);
w = 1;
p = remez(f, n, d, w);
pf = fpminimax(log2(1+x), [|1,2,3,4,5|], [|P...|], d, absolute, floating, 0, p);
err_p = -log2(dirtyinfnorm(pf*w-f, d));
print (pf, pretty(err_p));
Sollya is available from https://www.sollya.org/. Would you consider using the Sollya polynomial?
RLIBM's polynomial should also have 0 special case inputs and produce correctly rounded results for all inputs and for all rounding modes for log2f.
I think the current polynomial has special-case inputs because it was generated assuming a different polynomial evaluation.
Here is a revised polynomial that uses the current polynomial evaluation and produces correct results for all rounding modes and all inputs with zero violated inputs:
Polynomial: y=-3.2945312494298684154217536821552091564491707891340621650044795387657359e-16 x^(0) + 1.4426950408890866217603843324468471109867095947265625000000000000000000e+00 x^(1) + -7.2134752022691861483849606884177774190902709960937500000000000000000000e-01 x^(2) + 4.8089833027421252653610395100258756428956985473632812500000000000000000e-01 x^(3) + -3.6069225263970772221711058591608889400959014892578125000000000000000000e-01 x^(4) + 2.8949201646411226729327381690382026135921478271484375000000000000000000e-01 x^(5)
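As an aside, an easy way to turn decimal coefficients like the ones above into the hexadecimal literals used in the polyeval calls elsewhere in this thread is printf's %a format. A tiny illustrative snippet (not part of the patch; the value is the degree-1 coefficient quoted above):

#include <stdio.h>

int main() {
  // Degree-1 coefficient from the polynomial above, printed as a hex-float literal.
  printf("%a\n", 1.4426950408890866217603843324468471109867095947265625e+00);
  return 0;
}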
Thanks Paul and Santosh for looking into this! It looks like there are many polynomials that will make it correctly rounded for the entire range, which is great news!
That also makes me think that the exceptional cases for logf (and maybe log10f in the near future) actually come from adding the extra factor: m*log(2) + log(f). So if we can make that addition more accurate, we shouldn't have any exceptional cases. Moreover, it's entirely possible that we can reduce the degree of our polynomials while maintaining the accuracy.
I've discussed this with Christoph and will see if he can get any updates in those directions.
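For readers following along, here is a quick numeric illustration of the identity behind that extra factor (my own sketch, not code from the patch): writing x = 2^m * (1 + f), we have log(x) = m*log(2) + log(1 + f), which is exactly the sum whose accuracy is being discussed.

#include <math.h>
#include <stdio.h>

int main() {
  int m = 5;
  double f = 0.3;                     // any reduced fraction in [0, 1)
  double x = ldexp(1.0 + f, m);       // x = 2^m * (1 + f)
  printf("log(x)              = %.17g\n", log(x));
  printf("m*log(2) + log(1+f) = %.17g\n", m * log(2.0) + log1p(f));
  return 0;
}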
Dear Santosh,
Here is a revised polynomial that uses the current polynomial evaluation and produces correct results for all rounding modes and all inputs with zero violated inputs:
Polynomial: y=-3.2945312494298684154217536821552091564491707891340621650044795387657359e-16 x^(0) + 1.4426950408890866217603843324468471109867095947265625000000000000000000e+00 x^(1) + -7.2134752022691861483849606884177774190902709960937500000000000000000000e-01 x^(2) + 4.8089833027421252653610395100258756428956985473632812500000000000000000e-01 x^(3) + -3.6069225263970772221711058591608889400959014892578125000000000000000000e-01 x^(4) + 2.8949201646411226729327381690382026135921478271484375000000000000000000e-01 x^(5)
this polynomial has a non-zero constant term, unlike the polynomial used in this patch, and the one produced by Sollya. How can it fit the current framework (see below)?
double r = __llvm_libc::fputil::polyeval( d, extra_factor, 0x1.71547652c2801p+0, -0x1.715476ec167eep-1, 0x1.ec72eb4428a1ap-2, -0x1.72fd9daa7714fp-2, 0x1.8be682a823a9bp-2);
Best regards,
Paul
Dear Paul,
Yes, it has a non-zero constant term because it generates the correctly rounded round-to-odd result for a 34-bit floating-point representation. When this is rounded to any FP representation with 8 exponent bits, and for any rounding mode in the standard, it produces the correctly rounded result.
Here is our paper describing the result: https://people.cs.rutgers.edu/~sn349/papers/rlibmall-popl-2022.pdf
We plugged this polynomial locally in our infrastructure, which uses the same range reduction and the same quintic polynomial evaluation, and it produces correctly rounded results for all inputs and all rounding modes. Are you observing that it is not producing the correct result for some inputs?
Thanks,
Santosh
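To make the round-to-odd argument above concrete, here is a minimal sketch (my own illustration, not RLIBM code) of rounding a double down to 34 significant bits with round-to-odd: truncate the extra mantissa bits and, if anything was discarded, force the lowest kept bit to 1. If the truncation was inexact, the forced odd bit guarantees the intermediate value is neither a float nor a float midpoint, so the final narrowing to float goes the right way in every rounding mode.

#include <stdint.h>
#include <string.h>

// Round a (normal, finite) double to 34 significant bits using round-to-odd.
static double round_to_odd_34(double x) {
  uint64_t bits;
  memcpy(&bits, &x, sizeof(bits));
  const int drop = 53 - 34;                      // 19 low mantissa bits are discarded
  const uint64_t discarded = (1ULL << drop) - 1;
  if (bits & discarded)                          // inexact: make the last kept bit odd
    bits = (bits & ~discarded) | (1ULL << drop);
  memcpy(&x, &bits, sizeof(bits));
  return x;
}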
Dear Santosh,
Yes, it has a non-zero constant term because it generates the correctly rounded round-to-odd result for a 34-bit floating-point representation. When this is rounded to any FP representation with 8 exponent bits, and for any rounding mode in the standard, it produces the correctly rounded result.
Here is our paper describing the result: https://people.cs.rutgers.edu/~sn349/papers/rlibmall-popl-2022.pdf
We plugged this polynomial locally in our infrastructure, which uses the same range reduction and the same quintic polynomial evaluation, and it produces correctly rounded results for all inputs and all rounding modes. Are you observing that it is not producing the correct result for some inputs?
sorry I was not clear enough: can you generate with RLIBM a degree-5 polynomial with double coefficients and a zero constant term that you can plug into the llvm-libc infrastructure (lines 131-132 of the current patch), and which produces correct rounding for all rounding modes (like the polynomial generated by Sollya which I gave above)?
Best regards,
Paul
Dear Paul,
Thanks. Here is a polynomial with a zero constant term that I regenerated with RLIBM. It produces correctly rounded results for all rounding modes and inputs in our infrastructure. It should produce correct results for all representations that have 8 exponent bits and precisions from 10 bits to 32 bits, for all rounding modes.
Polynomial: y=1.4426950408890186761112772728665731847286224365234375000000000000000000e+00 x^(1) + -7.2134752026802795299431636522058397531509399414062500000000000000000000e-01 x^(2) + 4.8089833837385143056053493637591600418090820312500000000000000000000000e-01 x^(3) + -3.6069150703943497759951242187526077032089233398437500000000000000000000e-01 x^(4) + 2.8934750971542422259830118491663597524166107177734375000000000000000000e-01 x^(5)
Dear Santosh,
Thanks. Here is a polynomial with a zero constant term that I regenerated with RLIBM. It produces correctly rounded results for all rounding modes and inputs in our infrastructure. It should produce correct results for all representations that have 8 exponent bits and precisions from 10 bits to 32 bits, for all rounding modes.
Polynomial: y=1.4426950408890186761112772728665731847286224365234375000000000000000000e+00 x^(1) + -7.2134752026802795299431636522058397531509399414062500000000000000000000e-01 x^(2) + 4.8089833837385143056053493637591600418090820312500000000000000000000000e-01 x^(3) + -3.6069150703943497759951242187526077032089233398437500000000000000000000e-01 x^(4) + 2.8934750971542422259830118491663597524166107177734375000000000000000000e-01 x^(5)
if I plug this polynomial into the llvm-libc framework as follows:
double r = __llvm_libc::fputil::polyeval( d, extra_factor, 0x1.71547652b83f7p+0, -0x1.7154765134294p-1, 0x1.ec709d3010d4p-2, -0x1.71591d4ab7a18p-2, 0x1.284ab6ada08d6p-2);
then I get 1 incorrect rounding for x=0x1.fc64e8p-1 and rounding to nearest, and 1 incorrect rounding for x=0x1.197472p+0 and rounding down/towards zero (for rounding up, all results are correctly rounded).
Best regards,
Paul
Paul,
Thanks. Let me investigate and test it out with the patch. I have been testing the polynomials with our framework. There seem to be subtle differences between the patch and our public RLIBM framework.
Santosh
Here is a new polynomial that is generated using the exact output compensation in the patch. Tue suggested using the FMA-based polynomial evaluation, as he was observing performance regressions with the SIMD instructions on x86-64.
Polynomial: y=1.4426950408936214387267682468518614768981933593750000000000000000000000e+00 x^(1) + -7.2134752892795794831926059487159363925457000732421875000000000000000000e-01 x^(2) + 4.8090233829603024062748772848863154649734497070312500000000000000000000e-01 x^(3) + -3.6137987525825709944626851211069151759147644042968750000000000000000000e-01 x^(4) + 3.2929554893140711158139311010017991065979003906250000000000000000000000e-01 x^(5)
Polynomial evaluation used is as follows:
double t1 = fma(x, a5, a4);
double t2 = fma(x, t1, a3);
double t3 = fma(x, t2, a2);
double t4 = fma(x, t3, a1);
double result = fma(d, t4, extra_factor);
Can you check if it produces correctly rounded results for all inputs and all rounding modes?
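For checking purposes, here is the same scheme stitched into one self-contained function, with the decimal coefficients above and the reduced argument written as d throughout (a sketch only; the range reduction producing d and extra_factor is the one in the patch and is not reproduced here):

#include <math.h>

static double eval_log2_poly(double d, double extra_factor) {
  const double a1 = 1.4426950408936214387267682468518614768981933593750000000000000000000000e+00;
  const double a2 = -7.2134752892795794831926059487159363925457000732421875000000000000000000e-01;
  const double a3 = 4.8090233829603024062748772848863154649734497070312500000000000000000000e-01;
  const double a4 = -3.6137987525825709944626851211069151759147644042968750000000000000000000e-01;
  const double a5 = 3.2929554893140711158139311010017991065979003906250000000000000000000000e-01;
  // Horner/FMA evaluation as described above.
  double t1 = fma(d, a5, a4);
  double t2 = fma(d, t1, a3);
  double t3 = fma(d, t2, a2);
  double t4 = fma(d, t3, a1);
  return fma(d, t4, extra_factor);
}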
Dear Santosh,
Here is a new polynomial that is generated using the exact output compensation in the patch. Tue suggested using the FMA-based polynomial evaluation, as he was observing performance regressions with the SIMD instructions on x86-64.
Polynomial: y=1.4426950408936214387267682468518614768981933593750000000000000000000000e+00 x^(1) + -7.2134752892795794831926059487159363925457000732421875000000000000000000e-01 x^(2) + 4.8090233829603024062748772848863154649734497070312500000000000000000000e-01 x^(3) + -3.6137987525825709944626851211069151759147644042968750000000000000000000e-01 x^(4) + 3.2929554893140711158139311010017991065979003906250000000000000000000000e-01 x^(5)
Polynomial evaluation used is as follows:
double t1 = fma(x, a5, a4);
double t2 = fma(x, t1, a3);
double t3 = fma(x, t2, a2);
double t4 = fma(x, t3, a1);
double result = fma(d, t4, extra_factor);
Can you check if it produces correctly rounded results for all inputs and all rounding modes?
sure. If I converted the coefficients properly to hexadecimal values, there is still one incorrectly rounded result for rounding towards zero or down (same input x):
libm wrong by up to 1.01e+00 ulp(s) [1] for x=0x1.03a16ap+0
log2 gives 0x1.4cdc4ap-6
mpfr_log2 gives 0x1.4cdc4cp-6
Best regards,
Paul
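For anyone who wants to reproduce this kind of check, here is a minimal sketch of comparing one input against MPFR (my own illustration, not Paul's actual test program): MPFR computes log2 correctly rounded at the requested precision, so evaluating it at 24 bits in the target rounding mode gives the reference float result directly.

#include <stdio.h>
#include <mpfr.h>

int main() {
  float x = 0x1.03a16ap+0f;                // the problematic input reported above
  mpfr_t mx, my;
  mpfr_init2(mx, 24);                      // float has a 24-bit significand
  mpfr_init2(my, 24);
  mpfr_set_flt(mx, x, MPFR_RNDN);          // exact: x already fits in 24 bits
  mpfr_log2(my, mx, MPFR_RNDZ);            // correctly rounded toward zero at 24 bits
  float ref = mpfr_get_flt(my, MPFR_RNDZ); // exact conversion: my already has 24 bits
  printf("reference log2f toward zero: %a\n", (double)ref);
  mpfr_clear(mx);
  mpfr_clear(my);
  return 0;
}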
Dear Paul,
Thanks for testing it out.
I am seeing the exact oracle result in the local build for input 0x1.03a16ap+0. The result produced by the implementation is 0x1.4cdc4cp-6.
Have you commented out lines 38-42 of src/__support/FPUtil/PolyEval.h?
If you have not, it is most likely using the x86-64 SIMD extensions that Tue suggested we not use. That could be one reason for the divergence we are seeing.
Thinking out loud: the round-to-nearest result for this input is exactly the same as the round-to-zero result, 0x1.4cdc4cp-6. Given that the implementation is producing the correct round-to-nearest result, and disagreeing only for round-to-zero, is it possible that there is a bug in the test harness that checks round-to-zero results?
Thanks,
Santosh
Dear Santosh,
Have you commented out lines 38-42 of src/__support/FPUtil/PolyEval.h?
yes:
--- a/libc/src/__support/FPUtil/PolyEval.h
+++ b/libc/src/__support/FPUtil/PolyEval.h
@@ -35,7 +35,7 @@ INLINE_FMA static inline T polyeval(T x, T a0, Ts... a) {
 } // namespace fputil
 } // namespace __llvm_libc
-#ifdef LLVM_LIBC_ARCH_X86_64
+#ifdef LLVM_LIBC_ARCH_X86_64_XXX
 #include "x86_64/PolyEval.h"
Please find attached the log2f.cpp file I am using.
Best regards,
Paul
Dear Paul,
I was wondering how you are testing rounding modes other than round-to-nearest-ties-to-even.
The coefficients that we are using are identical.
double r = __llvm_libc::fputil::polyeval(
d, extra_factor, 0x1.71547652bd4fp+0, -0x1.7154769b978c7p-1, 0x1.ec71a99e349c8p-2, -0x1.720d90e6aac6cp-2, 0x1.5132da3583dap-2);
Now this double value "r" needs to be rounded according to the target rounding mode.
The cast on line 142 in the current patch is doing static_cast<float>(r). I assume it is just rounding it to round-to-nearest ties to even.
I am seeing that these coefficients are producing correctly rounded results for the round-to-nearest-ties-to-even.
Further when the double value is specifically rounded to the target rounding mode, it is producing the correctly rounded results for them.
If possible, can you tell me the double value (r) returned by the libc polynomial with the above coefficients for various rounding modes corresponding to the input (x=0x1.03a16ap+0)?
It should produce the same double value "r" for all rounding modes.
In my build, it produces r = 0x1.4cdc4c8p-6, which when rounded to any rounding mode produces the correctly rounded result.
Thanks,
Santosh
Dear Santosh,
Date: Thu, 23 Dec 2021 17:37:50 +0000 (UTC)
From: Santosh Nagarakatte via Phabricator <reviews@reviews.llvm.org>

Dear Paul,
I was wondering how you are testing rounding modes other than round-to-nearest-ties-to-even.
as follows:
fesetround(...); y = log2f(x);
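Expanded into a minimal, self-contained check (illustrative only, not the actual harness; the program must be linked against the log2f under test):

#include <fenv.h>
#include <math.h>
#include <stdio.h>

int main() {
  const int modes[] = {FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD};
  const char *names[] = {"FE_TONEAREST", "FE_TOWARDZERO", "FE_UPWARD", "FE_DOWNWARD"};
  float x = 0x1.03a16ap+0f;              // the input discussed above
  for (int i = 0; i < 4; ++i) {
    fesetround(modes[i]);
    float y = log2f(x);                  // implementation under test
    printf("%-13s log2f(%a) = %a\n", names[i], (double)x, (double)y);
  }
  fesetround(FE_TONEAREST);
  return 0;
}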
The coefficients that we are using are identical.
double r = __llvm_libc::fputil::polyeval(
d, extra_factor, 0x1.71547652bd4fp+0, -0x1.7154769b978c7p-1, 0x1.ec71a99e349c8p-2, -0x1.720d90e6aac6cp-2, 0x1.5132da3583dap-2);

Now this double value "r" needs to be rounded according to the target rounding mode.
The cast on line 142 in the current patch is doing static_cast<float>(r). I assume it is just rounding it to round-to-nearest ties to even.
I am seeing that these coefficients are producing correctly rounded results for the round-to-nearest-ties-to-even.
Further when the double value is specifically rounded to the target rounding mode, it is producing the correctly rounded results for them.
If possible, can you tell me the double value (r) returned by the libc polynomial with the above coefficients for various rounding modes corresponding to the input (x=0x1.03a16ap+0)?
It should produce the same double value "r" for all rounding modes.
In my build, it produces r = 0x1.4cdc4c80p-6, which when rounded to any rounding mode produces the correctly rounded result.
here is what I get with the different rounding modes:
FE_TONEAREST:
r=0x1.4cdc4c0000001p-6
FE_TOWARDZERO:
r=0x1.4cdc4bfffffffp-6
FE_UPWARD:
r=0x1.4cdc4c0000001p-6
FE_DOWNWARD:
r=0x1.4cdc4bfffffffp-6
The double value r differs because the rounding mode is also used internally (for example for the polynomial evaluation).
The code could internally set the rounding mode to FE_TONEAREST and restore it before the last rounding; this would solve the issue (at least for that x-value), but I guess that would be slower than dealing with one exceptional case.
Best regards,
Paul
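Here is a sketch of the "set FE_TONEAREST internally, restore before the last rounding" idea Paul describes (my own illustration, not the code the patch ended up using; the coefficients are the hex ones quoted earlier in the thread, and eval_poly stands in for __llvm_libc::fputil::polyeval):

#include <fenv.h>
#include <math.h>

// Horner/FMA evaluation of the quintic polynomial discussed above.
static double eval_poly(double d, double extra_factor) {
  double t1 = fma(d, 0x1.5132da3583dap-2, -0x1.720d90e6aac6cp-2);
  double t2 = fma(d, t1, 0x1.ec71a99e349c8p-2);
  double t3 = fma(d, t2, -0x1.7154769b978c7p-1);
  double t4 = fma(d, t3, 0x1.71547652bd4fp+0);
  return fma(d, t4, extra_factor);
}

static float log2f_core_sketch(double d, double extra_factor) {
  int caller_mode = fegetround();
  fesetround(FE_TONEAREST);              // make the polynomial evaluation mode-independent
  double r = eval_poly(d, extra_factor);
  fesetround(caller_mode);               // only the final narrowing sees the caller's mode
  return (float)r;
}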
Thanks Paul and Santosh for getting to the bottom of this! I'm updating the testing infrastructure to fully support all rounding modes and once it completes, I'll update this patch accordingly.
@zimmermann6, @santoshn: I've updated the implementation to be correctly rounded for all rounding modes. I also added the exhaustive tests for all rounding modes. Thanks!
Dear Tue,
@zimmermann6, @santoshn: I've updated the implementation to be correctly rounded for all rounding modes. I also added the exhaustive tests for all rounding modes. Thanks!
I confirm the new function is correctly rounded for all rounding modes.
Great job!
Paul