
zimmermann6 (Paul Zimmermann)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 30 2021, 1:16 AM (77 w, 6 d)

Recent Activity

Tue, May 23

zimmermann6 accepted D151049: [libc][math] Implement double precision log1p correctly rounded to all rounding modes..

this function is faster than core-math, even for the reciprocal throughput, great work!

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 49.596 + 0.352 clc/call; Median-Min = 0.315 clc/call; Max = 50.216 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 26.743 + 0.357 clc/call; Median-Min = 0.333 clc/call; Max = 29.183 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 38.887 + 0.296 clc/call; Median-Min = 0.249 clc/call; Max = 41.140 clc/call;
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 94.356 + 0.365 clc/call; Median-Min = 0.288 clc/call; Max = 95.270 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 69.370 + 0.356 clc/call; Median-Min = 0.262 clc/call; Max = 70.010 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 67.833 + 0.346 clc/call; Median-Min = 0.292 clc/call; Max = 68.525 clc/call;
Tue, May 23, 8:01 AM · Restricted Project, Restricted Project
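To compare the result lines above programmatically, a minimal sketch like the following could extract the minimum cycle counts. It assumes the three measurements appear in the order core-math, GNU libc, llvm-libc, as stated in other comments in this feed; the names `LINE_RE`, `LABELS`, and `parse_perf` are hypothetical, and only the first number after `Min =` is kept because the meaning of the `+ 0.352` term is not stated here.

```python
import re

# Pattern for one result line of the perf output quoted in this feed.
LINE_RE = re.compile(
    r"Ntrial = (\d+) ; Min = ([\d.]+) \+ ([\d.]+) clc/call; "
    r"Median-Min = ([\d.]+) clc/call; Max = ([\d.]+) clc/call;"
)

# Assumed order of the three measurements (taken from other comments).
LABELS = ("core-math", "glibc", "llvm-libc")


def parse_perf(text):
    """Return {label: minimum cycles per call} for the result lines in text."""
    mins = [float(m.group(2)) for m in LINE_RE.finditer(text)]
    return dict(zip(LABELS, mins))


# First three result lines of the log1p comment above.
sample = """\
Ntrial = 20 ; Min = 49.596 + 0.352 clc/call; Median-Min = 0.315 clc/call; Max = 50.216 clc/call;
Ntrial = 20 ; Min = 26.743 + 0.357 clc/call; Median-Min = 0.333 clc/call; Max = 29.183 clc/call;
Ntrial = 20 ; Min = 38.887 + 0.296 clc/call; Median-Min = 0.249 clc/call; Max = 41.140 clc/call;
"""

result = parse_perf(sample)
print(result)
```

Under that ordering assumption, the llvm-libc minimum is below the core-math one, which matches the remark that the function is faster than core-math.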
zimmermann6 accepted D150374: [libc][math] Implement double precision log2 function correctly rounded to all rounding modes..

here is what I get on my AMD EPYC 7282 for the reciprocal throughput:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 21.110 + 0.272 clc/call; Median-Min = 0.280 clc/call; Max = 21.622 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 21.312 + 0.198 clc/call; Median-Min = 0.085 clc/call; Max = 23.254 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 31.690 + 0.395 clc/call; Median-Min = 0.345 clc/call; Max = 34.334 clc/call;

and for the latency:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 58.310 + 0.416 clc/call; Median-Min = 0.304 clc/call; Max = 59.411 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 55.688 + 0.187 clc/call; Median-Min = 0.093 clc/call; Max = 56.306 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 62.254 + 0.355 clc/call; Median-Min = 0.339 clc/call; Max = 62.856 clc/call;
Tue, May 23, 7:44 AM · Restricted Project, Restricted Project
zimmermann6 accepted D150131: [libc][math] Implement double precision log function correctly rounded to all rounding modes..

it works now, thanks. For the reciprocal throughput on an AMD EPYC 7282 I get:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 21.404 + 0.252 clc/call; Median-Min = 0.295 clc/call; Max = 23.841 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 13.188 + 0.181 clc/call; Median-Min = 0.036 clc/call; Max = 13.582 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 22.978 + 0.294 clc/call; Median-Min = 0.304 clc/call; Max = 23.516 clc/call;

and for the latency:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 57.524 + 0.353 clc/call; Median-Min = 0.298 clc/call; Max = 58.199 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 49.889 + 0.110 clc/call; Median-Min = 0.058 clc/call; Max = 50.229 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 51.938 + 0.328 clc/call; Median-Min = 0.307 clc/call; Max = 52.466 clc/call;
Tue, May 23, 7:30 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D151049: [libc][math] Implement double precision log1p correctly rounded to all rounding modes..

I get the same error as for log2:

CMake Error at /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:5 (get_target_property):
  get_target_property() called with non-existent target
  "libc.src.math.generic.log_range_reduction".
Call Stack (most recent call first):
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:35 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:35 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:82 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/lib/CMakeLists.txt:26 (add_entrypoint_library)
Tue, May 23, 6:35 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D150374: [libc][math] Implement double precision log2 function correctly rounded to all rounding modes..

I get an error while running ninja:

CMake Error at /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:5 (get_target_property):
  get_target_property() called with non-existent target
  "libc.src.math.generic.log_range_reduction".
Call Stack (most recent call first):
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:35 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:35 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/cmake/modules/LLVMLibCLibraryRules.cmake:82 (collect_object_file_deps)
  /localdisk/zimmerma/llvm-project/libc/lib/CMakeLists.txt:26 (add_entrypoint_library)
Tue, May 23, 6:33 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D150131: [libc][math] Implement double precision log function correctly rounded to all rounding modes..

I get failures when I try to apply this patch to head (revision 7489301):

patching file libc/config/darwin/arm/entrypoints.txt
patching file libc/config/linux/aarch64/entrypoints.txt
patching file libc/config/linux/x86_64/entrypoints.txt
patching file libc/config/windows/entrypoints.txt
patching file libc/spec/stdc.td
patching file libc/src/math/CMakeLists.txt
patching file libc/src/math/generic/CMakeLists.txt
Hunk #1 succeeded at 784 with fuzz 2 (offset 16 lines).
Hunk #2 FAILED at 785.
Hunk #3 succeeded at 846 (offset -3 lines).
1 out of 3 hunks FAILED -- saving rejects to file libc/src/math/generic/CMakeLists.txt.rej
patching file libc/src/math/generic/common_constants.h
Hunk #1 FAILED at 39.
1 out of 1 hunk FAILED -- saving rejects to file libc/src/math/generic/common_constants.h.rej
patching file libc/src/math/generic/common_constants.cpp
Hunk #1 succeeded at 196 with fuzz 2 (offset -169 lines).
patching file libc/src/math/generic/log.cpp
patching file libc/src/math/generic/log10.cpp
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 41.
Hunk #3 FAILED at 785.
Hunk #4 FAILED at 944.
4 out of 4 hunks FAILED -- saving rejects to file libc/src/math/generic/log10.cpp.rej
patching file libc/src/math/generic/log_range_reduction.h
patching file libc/src/math/log.h
patching file libc/test/src/math/CMakeLists.txt
patching file libc/test/src/math/log_test.cpp
Tue, May 23, 6:29 AM · Restricted Project, Restricted Project
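When a patch rejects hunks in several files, as in the output above, a short script can list which files need attention. This is only a sketch: `REJECT_RE` and `failed_files` are hypothetical names, and the pattern is written against the summary lines exactly as they appear in this comment.

```python
import re

# Matches the per-file summary lines that patch prints for rejected hunks.
REJECT_RE = re.compile(
    r"(\d+) out of (\d+) hunks? FAILED -- saving rejects to file (\S+)"
)


def failed_files(patch_output):
    """Return [(path, failed, total)] for each file with rejected hunks."""
    out = []
    for m in REJECT_RE.finditer(patch_output):
        rej = m.group(3)
        # Drop the ".rej" suffix to recover the path of the patched file.
        path = rej[:-4] if rej.endswith(".rej") else rej
        out.append((path, int(m.group(1)), int(m.group(2))))
    return out


# Summary lines copied from the output above.
log = """\
1 out of 3 hunks FAILED -- saving rejects to file libc/src/math/generic/CMakeLists.txt.rej
1 out of 1 hunk FAILED -- saving rejects to file libc/src/math/generic/common_constants.h.rej
4 out of 4 hunks FAILED -- saving rejects to file libc/src/math/generic/log10.cpp.rej
"""

summary = failed_files(log)
for path, failed, total in summary:
    print(f"{path}: {failed}/{total} hunks rejected")
```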
zimmermann6 accepted D150014: [libc][math] Make log10 correctly rounded for non-FMA targets and improve its performance..

all worst cases from core-math do pass. However, I get a larger reciprocal throughput on an AMD EPYC 7282:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 21.270 + 0.322 clc/call; Median-Min = 0.302 clc/call; Max = 23.839 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 26.172 + 0.346 clc/call; Median-Min = 0.310 clc/call; Max = 26.798 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 49.688 + 0.306 clc/call; Median-Min = 0.308 clc/call; Max = 50.288 clc/call;

For the latency I get similar values:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 65.915 + 0.316 clc/call; Median-Min = 0.282 clc/call; Max = 66.518 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 73.612 + 0.389 clc/call; Median-Min = 0.326 clc/call; Max = 74.422 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 66.387 + 0.276 clc/call; Median-Min = 0.303 clc/call; Max = 67.140 clc/call;
Tue, May 23, 4:51 AM · Restricted Project, Restricted Project

Apr 10 2023

zimmermann6 accepted D147759: [libc][math] Update range reduction step for log2f and improve its performance..

all tests are ok now. Reciprocal throughput:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 9.649 + 0.454 clc/call; Median-Min = 0.288 clc/call; Max = 11.526 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 7.224 + 0.310 clc/call; Median-Min = 0.320 clc/call; Max = 8.306 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 10.579 + 0.162 clc/call; Median-Min = 0.024 clc/call; Max = 11.528 clc/call;

Latency:

GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 42.395 + 0.340 clc/call; Median-Min = 0.322 clc/call; Max = 42.983 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 38.033 + 0.434 clc/call; Median-Min = 0.304 clc/call; Max = 39.897 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 41.699 + 0.343 clc/call; Median-Min = 0.327 clc/call; Max = 42.248 clc/call;
Apr 10 2023, 10:59 PM · Restricted Project, Restricted Project

Apr 7 2023

zimmermann6 accepted D147755: [libc][math] Update range reduction step for logf and reduce its latency..

thanks, all tests pass now. For the reciprocal throughput I get:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh logf
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 10.839 + 0.378 clc/call; Median-Min = 0.304 clc/call; Max = 13.593 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 7.240 + 0.351 clc/call; Median-Min = 0.307 clc/call; Max = 9.576 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 18.822 + 0.339 clc/call; Median-Min = 0.314 clc/call; Max = 19.396 clc/call;

and for the latency:

zimmerma@biscotte:~/svn/core-math$ PERF_ARGS=--latency LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh logf
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 46.968 + 0.321 clc/call; Median-Min = 0.301 clc/call; Max = 47.649 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 38.243 + 0.328 clc/call; Median-Min = 0.318 clc/call; Max = 38.928 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 54.652 + 0.404 clc/call; Median-Min = 0.329 clc/call; Max = 55.396 clc/call;
Apr 7 2023, 8:38 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D147759: [libc][math] Update range reduction step for log2f and improve its performance..

I get an error at compilation:

/localdisk/zimmerma/llvm-project/libc/src/math/generic/log2f.cpp:149:48: error: use of undeclared identifier 'R'
                           static_cast<double>(R[index]), -1.0); // Exact
                                               ^
Apr 7 2023, 12:08 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D147755: [libc][math] Update range reduction step for logf and reduce its latency..

the patch fails to apply to main (revision 10cff75):

$ patch -p1 -i /tmp/D147755.diff 
patching file libc/src/math/generic/common_constants.h
Hunk #1 succeeded at 17 with fuzz 2 (offset -3 lines).
patching file libc/src/math/generic/common_constants.cpp
Hunk #1 FAILED at 109.
Hunk #2 succeeded at 102 with fuzz 2 (offset -28 lines).
1 out of 2 hunks FAILED -- saving rejects to file libc/src/math/generic/common_constants.cpp.rej
patching file libc/src/math/generic/logf.cpp
patching file libc/test/src/math/logf_test.cpp
Apr 7 2023, 12:04 AM · Restricted Project, Restricted Project

Apr 6 2023

zimmermann6 accepted D147676: [libc][math] Update range reduction step for log10f and reduce its latency..

the failure is fixed now, and the performance is slightly better than the 'main' branch on my machine:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh log10f
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 10.523 + 0.392 clc/call; Median-Min = 0.332 clc/call; Max = 11.603 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 18.521 + 0.418 clc/call; Median-Min = 0.330 clc/call; Max = 19.653 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 12.929 + 0.626 clc/call; Median-Min = 0.297 clc/call; Max = 15.292 clc/call;
Apr 6 2023, 11:07 PM · Restricted Project, Restricted Project
zimmermann6 requested changes to D147676: [libc][math] Update range reduction step for log10f and reduce its latency..

I get a failure for rounding down:

zimmerma@biscotte:~/svn/core-math$ CORE_MATH_CHECK_STD=true LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./check.sh log10f
Running exhaustive check in --rndn mode...
all ok
Running exhaustive check in --rndz mode...
all ok
Running exhaustive check in --rndu mode...
all ok
Running exhaustive check in --rndd mode...
FAIL x=0x1p+0 ref=0x0p+0 y=-0x0p+0

Also, on an AMD EPYC 7282 I get a regression in speed. With master:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh log10f
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 10.531 + 0.273 clc/call; Median-Min = 0.281 clc/call; Max = 13.047 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 18.529 + 0.342 clc/call; Median-Min = 0.309 clc/call; Max = 19.811 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 13.059 + 0.526 clc/call; Median-Min = 0.290 clc/call; Max = 15.586 clc/call;

With this patch:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh log10f
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 10.534 + 0.297 clc/call; Median-Min = 0.303 clc/call; Max = 11.415 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 18.529 + 0.561 clc/call; Median-Min = 0.327 clc/call; Max = 20.729 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 19.791 + 0.313 clc/call; Median-Min = 0.338 clc/call; Max = 22.809 clc/call;
Apr 6 2023, 2:16 AM · Restricted Project, Restricted Project
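The FAIL line above reports a result that differs from the reference only in the sign of zero, so a plain equality test would not catch it. The sketch below shows this with the two values copied from that line. It also uses Python's `decimal` module as an analogy for the rounding rule (it is decimal arithmetic, not the binary float code under review): when rounding toward minus infinity, an exact cancellation such as `1 - 1` yields a negative zero. Whether that is what happened in the patch is an assumption; the comment does not say. `same_result` is a hypothetical helper.

```python
import math
from decimal import Decimal, localcontext, ROUND_FLOOR

# Values copied from the FAIL line: ref=0x0p+0, y=-0x0p+0.
ref = float.fromhex("0x0p+0")
y = float.fromhex("-0x0p+0")


def same_result(a, b):
    """Compare two floats, also requiring the same sign (so +0 != -0)."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


values_equal = (ref == y)          # +0 and -0 compare equal as values
signs_match = same_result(ref, y)  # but they are different results

# Analogy only: in decimal arithmetic with rounding toward minus infinity,
# an exact cancellation produces a zero with a negative sign.
with localcontext() as ctx:
    ctx.rounding = ROUND_FLOOR
    floor_diff = Decimal(1) - Decimal(1)

print(values_equal, signs_match, floor_diff)
```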

Jan 28 2023

zimmermann6 accepted D142781: [libc][math] Implement acoshf function correctly rounded to all rounding modes..

all exhaustive tests do pass. The performance is a little worse than CORE-MATH and glibc:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh acoshf
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 17.383 + 0.168 clc/call; Median-Min = 0.004 clc/call; Max = 17.769 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 21.634 + 0.150 clc/call; Median-Min = 0.059 clc/call; Max = 22.014 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 25.034 + 0.320 clc/call; Median-Min = 0.305 clc/call; Max = 25.602 clc/call;
Jan 28 2023, 11:54 PM · Restricted Project, Restricted Project
zimmermann6 requested changes to D142781: [libc][math] Implement acoshf function correctly rounded to all rounding modes..

I cannot apply to main (revision f7c1982), the patch fails.

Jan 28 2023, 12:00 AM · Restricted Project, Restricted Project

Jan 27 2023

zimmermann6 accepted D142681: [libc][math] Implement asinhf function correctly rounded for all rounding modes..

I confirm all exhaustive tests do pass, and the timings are similar to CORE-MATH:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./perf.sh asinhf
GNU libc version: 2.36
GNU libc release: stable
[####################] 100 %
Ntrial = 20 ; Min = 24.687 + 0.365 clc/call; Median-Min = 0.296 clc/call; Max = 26.051 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 37.687 + 0.295 clc/call; Median-Min = 0.287 clc/call; Max = 39.359 clc/call;
[####################] 100 %
Ntrial = 20 ; Min = 25.939 + 0.320 clc/call; Median-Min = 0.292 clc/call; Max = 26.596 clc/call;
Jan 27 2023, 1:28 AM · Restricted Project, Restricted Project

Dec 14 2022

zimmermann6 accepted D139846: [libc][math] Implement log10 function correctly rounded for all rounding modes.

I tried on 2096978 hard-to-round cases I have generated and all tests pass, with all four rounding modes. Great work!

zimmerma@biscotte:/tmp/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a ./check.sh --worst log10
Running worst cases check in --rndn mode...
2096978 tests passed, 0 failure(s)
Running worst cases check in --rndz mode...
2096978 tests passed, 0 failure(s)
Running worst cases check in --rndu mode...
2096978 tests passed, 0 failure(s)
Running worst cases check in --rndd mode...
2096978 tests passed, 0 failure(s)
Dec 14 2022, 1:02 AM · Restricted Project, Restricted Project

Sep 26 2022

zimmermann6 accepted D134575: [libc][math] Simplify tanf implementation and improve its performance..

I get similar timings:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanf
GNU libc version: 2.34
GNU libc release: stable
14.464
50.813
14.254
zimmerma@biscotte:~/svn/core-math$ PERF_ARGS=--latency LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanf
GNU libc version: 2.34
GNU libc release: stable
54.500
110.220
59.640

Good work!

Sep 26 2022, 10:36 AM · Restricted Project, Restricted Project

Sep 21 2022

zimmermann6 requested changes to D127951: [libc][math] New algorithm for expf/expm1f/exp2f and for new functions sinhf/coshf..

sorry for the delay. It seems this does not compile properly with current main:

/localdisk/zimmerma/llvm-project/libc/src/math/generic/expm1f.cpp:11:10: fatal error: 'expxf.h' file not found
#include "expxf.h"
         ^~~~~~~~~
Sep 21 2022, 12:09 AM · Restricted Project, Restricted Project

Sep 19 2022

zimmermann6 accepted D134104: [libc][math] Implement exp10f function correctly rounded to all rounding modes..

thank you, the new version is fine. Great work!

Sep 19 2022, 6:49 AM · Restricted Project, Restricted Project
zimmermann6 accepted D134002: [libc][math] Improve tanhf performance..

I confirm the function is still correctly rounded, and the timings improved.

Sep 19 2022, 1:34 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D134104: [libc][math] Implement exp10f function correctly rounded to all rounding modes..

I got one reject when applying this patch to main (revision 458598c):

patching file libc/src/math/generic/explogxf.h
Hunk #1 FAILED at 51.
Hunk #2 succeeded at 41 with fuzz 2 (offset -29 lines).
1 out of 2 hunks FAILED -- saving rejects to file libc/src/math/generic/explogxf.h.rej
Sep 19 2022, 1:23 AM · Restricted Project, Restricted Project

Sep 16 2022

zimmermann6 added a comment to D134002: [libc][math] Improve tanhf performance..

does this patch need to be applied on another one? Or rebased? It does not apply cleanly to main (71e52a1), unless I did something wrong.

Sep 16 2022, 12:08 AM · Restricted Project, Restricted Project

Sep 15 2022

zimmermann6 added a comment to D133870: [libc][math] Improve exp2f performance..

I confirm the improvement, and the function is still correctly rounded.

Sep 15 2022, 3:38 AM · Restricted Project, Restricted Project
zimmermann6 accepted D133913: [libc][math] Improve sinhf and coshf performance..

this is very clever! I confirm the speed improvement (and still correct rounding by exhaustive search).

Sep 15 2022, 1:40 AM · Restricted Project, Restricted Project

Sep 9 2022

zimmermann6 accepted D133550: [libc][math] Implement acosf function correctly rounded for all rounding modes..

great work! The reciprocal throughput is indeed slightly better than CORE-MATH, and the latency slightly worse:

# reciprocal throughput
GNU libc version: 2.34
GNU libc release: stable
33.819
37.064
29.462
# latency
GNU libc version: 2.34
GNU libc release: stable
54.951
80.046
62.001
Sep 9 2022, 1:33 AM · Restricted Project, Restricted Project

Sep 7 2022

zimmermann6 accepted D133370: [libc] Return correct values for hypot when overflowed..

ok for me. I have added more checks near the underflow and overflow boundaries in CORE-MATH check.sh, and it passes all tests.

Sep 7 2022, 3:19 AM · Restricted Project, Restricted Project
zimmermann6 accepted D133400: [libc][math] Implement asinf function correctly rounded for all rounding modes..

it works now, thanks. I confirm it is correctly rounded. I get similar figures on an AMD EPYC 7282 (glibc, core-math, llvm-libc):

# reciprocal throughput
GNU libc version: 2.34
GNU libc release: stable
26.722
31.752
27.841
# latency
GNU libc version: 2.34
GNU libc release: stable
56.310
64.140
61.051
Sep 7 2022, 12:02 AM · Restricted Project, Restricted Project

Sep 6 2022

zimmermann6 added a comment to D133400: [libc][math] Implement asinf function correctly rounded for all rounding modes..

the patch fails for me on top of main (revision ea953b9). Is there any other patch to apply first?

Sep 6 2022, 11:32 PM · Restricted Project, Restricted Project

Aug 29 2022

zimmermann6 accepted D132842: [libc][math] Added atanf function..

ok for me, I get slightly different figures on an AMD EPYC 7282:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh atanf
GNU libc version: 2.34
GNU libc release: stable
17.539
31.797
26.903
Aug 29 2022, 4:37 AM · Restricted Project, Restricted Project
zimmermann6 accepted D132811: [libc][math] Added atanhf function..

ok for me. I get slightly better figures for llvm-libc:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh atanhf
GNU libc version: 2.34
GNU libc release: stable
23.547
70.432
20.065

This is on an AMD EPYC 7282.

Aug 29 2022, 1:47 AM · Restricted Project, Restricted Project

Aug 22 2022

zimmermann6 added a comment to D130901: [libc] Implement sincosf function correctly rounded to all rounding modes..

are the performance numbers for sinf, for cosf, or for random calls?

Aug 22 2022, 1:48 AM · Restricted Project, Restricted Project

Jul 29 2022

zimmermann6 added a comment to D129275: [libc][math] Added coshf function..

I get slightly different figures on my machine:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh coshf
GNU libc version: 2.33
GNU libc release: release
17.730
19.322
22.815
zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc PERF_ARGS=--latency ./perf.sh coshf
GNU libc version: 2.33
GNU libc release: release
49.478
48.614
75.194
Jul 29 2022, 3:11 AM · Restricted Project, Restricted Project

Jul 28 2022

zimmermann6 accepted D130644: [libc] Implement cosf function that is correctly rounded to all rounding modes..

this looks all good to me:

GNU libc version: 2.33
GNU libc release: release
17.271
25.064
13.555
GNU libc version: 2.33
GNU libc release: release
48.048
58.428
54.403

The first figures are for reciprocal throughput (core-math, glibc, llvm-libc), the second ones are for the latency. Great work!

Jul 28 2022, 12:36 AM · Restricted Project, Restricted Project
zimmermann6 accepted D129005: [libc][math] Improved performance of exp2f function..

I confirm the reciprocal throughput decreased from 18 to 10 cycles (on the same machine as above):

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh exp2f
GNU libc version: 2.33
GNU libc release: release
9.728
7.085
10.040
zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc PERF_ARGS=--latency ./perf.sh exp2f
GNU libc version: 2.33
GNU libc release: release
37.106
29.520
48.515

Good work Kirill!

Jul 28 2022, 12:19 AM · Restricted Project, Restricted Project

Jul 27 2022

zimmermann6 accepted D130629: [libc] Change sinf range reduction to mod pi/16 to be shared with cosf..

here are the timings I get:

zimmerma@biscotte:~/svn/core-math$ !273
LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh sinf
GNU libc version: 2.33
GNU libc release: release
16.784
23.823
14.114
zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc PERF_ARGS=--latency ./perf.sh sinf
GNU libc version: 2.33
GNU libc release: release
47.889
57.795
52.998
Jul 27 2022, 6:02 AM · Restricted Project, Restricted Project

Jul 26 2022

zimmermann6 accepted D130502: [libc] Use nearest_integer instructions to improve expm1f performance..

I confirm that I get similar timings. Nice work!

Jul 26 2022, 1:36 AM · Restricted Project, Restricted Project
zimmermann6 accepted D130498: [libc] Use nearest_integer instructions to improve expf performance..

I confirm it is still correctly rounded, and now faster than CORE-MATH. Nice work!

Jul 26 2022, 1:09 AM · Restricted Project, Restricted Project

Jul 18 2022

zimmermann6 accepted D123154: [libc] Implement sinf function that is correctly rounded to all rounding modes..

I confirm the latest version is correctly rounded for all rounding modes on the machine I tried (AMD EPYC 7282 with gcc 10.2.1 and clang 11.0.1-2).
For the reciprocal throughput I get:

zimmerma@biscotte:/tmp/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_LAUNCHER="/localdisk/zimmerma/glibc-2.35/install/lib/ld-linux-x86-64.so.2 --library-path /localdisk/zimmerma/glibc-2.35/install/lib" CORE_MATH_PERF_MODE=rdtsc ./perf.sh sinf
GNU libc version: 2.35
GNU libc release: stable
16.705
23.636
13.989

i.e., 16.7 cycles for core-math, 23.6 cycles for glibc 2.35, and 14.0 cycles for llvm-libc. Good work!
For the latency the figures are worse than glibc:

zimmerma@biscotte:/tmp/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_LAUNCHER="/localdisk/zimmerma/glibc-2.35/install/lib/ld-linux-x86-64.so.2 --library-path /localdisk/zimmerma/glibc-2.35/install/lib" CORE_MATH_PERF_MODE=rdtsc PERF_ARGS=--latency ./perf.sh sinf
GNU libc version: 2.35
GNU libc release: stable
47.926
57.338
62.912
Jul 18 2022, 4:14 AM · Restricted Project, Restricted Project

Jul 12 2022

zimmermann6 added a comment to D129278: [libc][math] Added sinhf function..

The easiest way, from my point of view, will be to apply it on top of the working coshf branch, which you made and tested.

Jul 12 2022, 7:42 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D129278: [libc][math] Added sinhf function..

should I apply this patch also on top of revision 60d6be5 + D129005 + D129215 ?

Jul 12 2022, 6:42 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D129275: [libc][math] Added coshf function..

Can you try to put all the chain on top of the revision 60d6be5dd3f411cfe1b5392cbb... for now. I'll rebase the revisions to the last main tonight.

Jul 12 2022, 6:37 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D129275: [libc][math] Added coshf function..

Yes. You should first apply D129005, then D129215, and then this one. The initial revision was split into several to improve the review process and the deployment of new functions.

Jul 12 2022, 5:15 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D129275: [libc][math] Added coshf function..

I couldn't build this patch on top of 'main' (revision 81af344):

/localdisk/zimmerma/llvm-project/libc/src/math/generic/coshf.cpp:11:10: fatal error: 'src/math/generic/expxf.h' file not found
#include "src/math/generic/expxf.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~

Is there any dependency?

Jul 12 2022, 4:53 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D129005: [libc][math] Improved performance of exp2f function..

I confirm the new function is correctly rounded (for all rounding modes). As for efficiency, here is what I get on an AMD EPYC 7282 with gcc 10.2.1 and clang 11.0.1-2:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh exp2f
GNU libc version: 2.31
GNU libc release: stable
9.720
6.228
18.081
Jul 12 2022, 3:58 AM · Restricted Project, Restricted Project

Jul 11 2022

zimmermann6 accepted D123154: [libc] Implement sinf function that is correctly rounded to all rounding modes..

I confirm the new version is correctly rounded (on the machine I tried it on). However, I find different figures for the number of cycles:

zimmerma@biscotte:~/svn/core-math$ LIBM=/localdisk/zimmerma/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE_MATH_PERF_MODE=rdtsc ./perf.sh sinf
GNU libc version: 2.31
GNU libc release: stable
16.781
23.443
32.737

This is on an AMD EPYC 7282 with gcc version 10.2.1 and clang 11.0.1-2 (I guess llvm-libc is compiled with clang). This gives 17 cycles for the core-math routine, and 33 cycles for the llvm-libc one.

Jul 11 2022, 2:44 AM · Restricted Project, Restricted Project

May 2 2022

zimmermann6 resigned from D124495: [libc] Implement double precision FMA for targets without FMA instructions..

I'm sorry, I am not fluent enough in C++ to review this patch

May 2 2022, 7:54 AM · Restricted Project, Restricted Project

Apr 6 2022

zimmermann6 accepted D123154: [libc] Implement sinf function that is correctly rounded to all rounding modes..

all tests pass now, and I get the following figures (1st CORE-MATH, 2nd GNU libc, 3rd LLVM libc):

$ LIBM=/users/zimmerma/svn/core-math/libllvmlibc.a ./perf.sh sinf
38.997
26.503
33.990
Apr 6 2022, 7:51 AM · Restricted Project, Restricted Project
zimmermann6 requested changes to D123154: [libc] Implement sinf function that is correctly rounded to all rounding modes..

I get an error for rounding up:

Using llvm-libc
MPFR library: 4.1.0       
MPFR header:  4.1.0 (based on 4.1.0)
Checking function sinf with MPFR_RNDU
libm wrong by up to 3.40e-11 ulp(s) [1] for x=-0x1.47d0fep+34
sin      gives -0x1p+0
mpfr_sin gives -0x1.fffffep-1
Total: errors=1 (0.00%) errors2=0 maxerr=3.40e-11 ulp(s)
Apr 6 2022, 12:36 AM · Restricted Project, Restricted Project
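The two values in the error report above are neighbouring single-precision numbers, which can be checked from their bit patterns. With rounding toward plus infinity the correct result is the smallest float not below the exact value, so if the exact sine lies strictly between the two (which the tiny reported error suggests, though that reading is an assumption), the larger neighbour is the right answer. `f32_bits` is a hypothetical helper in this sketch.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE 754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


# Values copied from the report: "sin gives" and "mpfr_sin gives".
got = float.fromhex("-0x1p+0")
ref = float.fromhex("-0x1.fffffep-1")

# Both are negative, so adjacent floats differ by exactly 1 in their encoding.
adjacent = abs(f32_bits(got) - f32_bits(ref)) == 1

print(hex(f32_bits(got)), hex(f32_bits(ref)), adjacent, ref > got)
```

Here the returned value is the lower of the two neighbours, and the reference is the one just above it.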

Mar 29 2022

zimmermann6 accepted D122538: [libc] Improve the performance of expm1f..

all values are still correctly rounded, and I confirm the speed improvement:

zimmerma@tomate:/tmp/core-math$ LIBM=/users/zimmerma/svn/core-math/libllvmlibc.a ./perf.sh expm1f # previous code
22.692
54.039
53.218
zimmerma@tomate:/tmp/core-math$ LIBM=/tmp/libllvmlibc.a ./perf.sh expm1f # new code
22.698
54.037
17.240

The llvm-libc results are the 3rd ones (first CORE-MATH, second GNU libc).

Mar 29 2022, 6:31 AM · Restricted Project, Restricted Project

Mar 25 2022

zimmermann6 accepted D122418: [libc] Improve the performance of expf..

ok for me, all exhaustive tests pass, and the performance increased a lot:

zimmerma@tomate:~/svn/core-math$ LIBM=/users/zimmerma/svn/core-math/libllvmlibc.a ./perf.sh expf
21.594
10.968
51.286
zimmerma@tomate:~/svn/core-math$ LIBM=/tmp/libllvmlibc.a ./perf.sh expf
21.596
10.966
16.997

The first run is with the previous version, the second one with the new version. The last timing is the one for llvm-libc, the first one for core-math, and the 2nd one for the GNU libc (not CR).

Mar 25 2022, 4:35 AM · Restricted Project, Restricted Project

Mar 24 2022

zimmermann6 accepted D122346: [libc] Improve the performance of exp2f..

I confirm the results are still correctly rounded for all four rounding modes. For rounding to nearest the reciprocal throughput/latency decreased from 56/103.5 cycles to 26.6/76.0 cycles on a Core i5-4590.
As a comparison, the core-math code runs in 21.2/63.2 cycles.

Mar 24 2022, 2:39 AM · Restricted Project, Restricted Project

Mar 15 2022

zimmermann6 accepted D121574: [libc] Implement expm1f function that is correctly rounded for all rounding modes..

ok for me too, I confirm all exhaustive searches do pass. Great!

Mar 15 2022, 7:00 AM · Restricted Project, Restricted Project
zimmermann6 added a comment to D121463: [libc] Implement exp2f function that is correctly rounded for all rounding modes..

Do you have any performance numbers comparing before and after?

Mar 15 2022, 6:11 AM · Restricted Project, Restricted Project

Mar 14 2022

zimmermann6 accepted D121463: [libc] Implement exp2f function that is correctly rounded for all rounding modes..

ok for me too, all exhaustive tests do pass!

Mar 14 2022, 2:42 AM · Restricted Project, Restricted Project

Mar 11 2022

zimmermann6 accepted D121440: [libc] Implement expf function that is correctly rounded for all rounding modes..

my exhaustive search confirms it is correctly rounded for all four rounding modes, great!

Mar 11 2022, 1:28 AM · Restricted Project, Restricted Project

Feb 4 2022

zimmermann6 accepted D118962: [libc] Implement log1pf correctly rounded to all rounding modes..

apart from the compiler warnings, all exhaustive tests comparing to MPFR do pass, for all four rounding modes. Good work!

Feb 4 2022, 1:05 AM · Restricted Project
zimmermann6 added a comment to D118962: [libc] Implement log1pf correctly rounded to all rounding modes..

I get several warnings when I compile this version:

In file included from /localdisk/zimmerma/llvm-project/libc/src/stdlib/strtold.cpp:11:
In file included from /localdisk/zimmerma/llvm-project/libc/src/__support/str_to_float.h:16:
/localdisk/zimmerma/llvm-project/libc/src/__support/high_precision_decimal.h:115:42: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
    if (roundToDigit < 0 || roundToDigit >= this->num_digits) {
                            ~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~
/localdisk/zimmerma/llvm-project/libc/src/__support/high_precision_decimal.h:121:26: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
        roundToDigit + 1 == this->num_digits) {
        ~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~
Feb 4 2022, 12:55 AM · Restricted Project

Jan 31 2022

zimmermann6 resigned from D118157: [libc] Improve hypotf performance with different algorithm correctly rounded to all rounding modes..

the version of last Friday is fine for me: I did run exhaustive tests for 2^23 <= y < 2^24, and 2^(23+k) <= x < 2^(24+k) for 0 <= k <= 13.
However since it changed in the meantime, I don't have resources any more to review the new version.

Jan 31 2022, 12:12 AM · Restricted Project

Jan 28 2022

zimmermann6 added a comment to D118157: [libc] Improve hypotf performance with different algorithm correctly rounded to all rounding modes..

I'm still running semi-exhaustive tests; it takes some time. I wonder whether a full exhaustive test is possible, by comparing the LLVM implementation with the code from Alexei at https://core-math.gitlabpages.inria.fr/. On a 64-core machine (Intel Xeon Gold 6130 @ 2.10GHz), it takes 4.6s to check 2^33 pairs (x,y). If one tests only positive x,y with x>=y, an exhaustive comparison would have to check 2^61 pairs for each rounding mode, which would take less than 1.5 months using 10000 such machines. This would not be a proof, but the probability that both codes are wrong for the same inputs and give exactly the same wrong answer is quite small.
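
As a rough check of this estimate, the arithmetic can be sketched in Python. The figures (4.6 s per 2^33 pairs on one machine, 2^61 pairs per rounding mode, 10000 machines) are the ones quoted in the comment; the function name `estimated_days` and the assumption of perfect scaling across machines are illustrative only.

```python
# Back-of-the-envelope estimate for an exhaustive hypotf comparison.
# All constants come from the comment above; perfect scaling across
# machines is an assumption made for this sketch.
PAIRS_PER_BATCH = 2**33      # pairs (x, y) checked in one timed batch
SECONDS_PER_BATCH = 4.6      # wall-clock time for one batch on one machine
TOTAL_PAIRS = 2**61          # positive x, y with x >= y, one rounding mode
MACHINES = 10_000


def estimated_days(total_pairs=TOTAL_PAIRS, machines=MACHINES):
    """Return the estimated wall-clock time in days for one rounding mode."""
    batches = total_pairs / PAIRS_PER_BATCH
    seconds = batches * SECONDS_PER_BATCH / machines
    return seconds / 86_400


if __name__ == "__main__":
    print(f"{estimated_days():.2f} days per rounding mode")
```

Under these assumptions the estimate comes out between one and two days per rounding mode, comfortably within the bound quoted above.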

Jan 28 2022, 1:36 AM · Restricted Project

Jan 26 2022

zimmermann6 requested changes to D118157: [libc] Improve hypotf performance with different algorithm correctly rounded to all rounding modes..

I get some errors for rounding to nearest:

Difference for 0x1.faf49ep+25,0x1.480002p+23
llvm_hypot: 0x1.00c5bp+26
as_hypot:   0x1.00c5b2p+26
pz_hypot:   0x1.00c5b2p+26
Jan 26 2022, 6:07 AM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

These are the error messages that we got on aarch64-ubuntu: Buildbot log https://lab.llvm.org/buildbot/#/builders/138/builds/16983/steps/4/logs/stdio

Jan 26 2022, 1:31 AM · Restricted Project

Jan 25 2022

zimmermann6 accepted D118149: [libc] Make logf function correctly rounded for all rounding modes..

all exhaustive tests do pass, with all four rounding modes. Maybe add a comment giving the corresponding input values for the exceptional cases?

Jan 25 2022, 8:22 AM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

Dear Tue,

Jan 25 2022, 4:57 AM · Restricted Project
zimmermann6 accepted D118093: [libc] Implement log10f correctly rounded for all rounding modes..

this revision passes all exhaustive tests, for the four rounding modes, great work!

Jan 25 2022, 1:57 AM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

a performance graph is available at https://core-math.gitlabpages.inria.fr/graph_perf_hypotf.pdf

Jan 25 2022, 1:02 AM · Restricted Project

Jan 20 2022

zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..


attached is a file with 1200 binary32 exact cases with ulp(x)=2^12*ulp(y), x^2+y^2=z^2 having up to 72 bits. You might add them to your test cases.

Jan 20 2022, 9:07 AM · Restricted Project
zimmermann6 accepted D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

I'm ok with the new revision. However I see there are still some calls to get_round(). Did you try to replace them with floating-point operations?

Jan 20 2022, 7:31 AM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

I got similar results with binary64:

Checking hypot with llvm-project and rndu
Using seed 1078001
NEW hypot 0 -1 0x1.ccbbbcfef3c02p-523,0x1.924bf639c1a94p+500 [1.00] 1 1
libm gives 0x1.924bf639c1a94p+500
mpfr gives 0x1.924bf639c1a95p+500
Jan 20 2022, 6:02 AM · Restricted Project
zimmermann6 requested changes to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

after fixing my stress program I was able to find one value which does not seem to be correctly rounded (for binary32 and rounding up):

zimmerma@biscotte:~/svn/tbd/20/src/binary32$ CFLAGS=-DCHECK_CR LLVM=llvm-project VERBOSE=-v RND=rndu ./doitb.llvm hypot 1000
Checking hypot with llvm-project and rndu
Using seed 1076573
NEW hypot 0 -1 0x1.ffffecp-1,-0x1.000002p+27 [1.00] 1 1
libm gives 0x1.000002p+27
mpfr gives 0x1.000004p+27

Please can you confirm?

Jan 20 2022, 2:52 AM · Restricted Project

Jan 19 2022

zimmermann6 accepted D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

the stress tests were successful (for all four rounding modes, both in single and double precision).
Thus I am ok with this version, thanks!

Jan 19 2022, 11:23 PM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

I still get warnings with the latest revision:

/localdisk/zimmerma/llvm-project/libc/src/__support/FPUtil/Hypot.h:149:22: warning: hexadecimal floating literals are a C++17 feature [-Wc++17-extensions]
    if ((y != 0) && (0x1p0f + 0x1p-24f != 0x1p0f)) {
                     ^
Jan 19 2022, 10:55 PM · Restricted Project
zimmermann6 added a comment to D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

Dear Tue,

Jan 19 2022, 7:36 AM · Restricted Project
zimmermann6 updated subscribers of D117590: [libc] Implement correct rounding with all rounding modes for hypot functions..

Dear Tue,

Jan 19 2022, 2:16 AM · Restricted Project

Jan 14 2022

zimmermann6 accepted D115828: [libc] Implement correctly rounded log2f based on RLIBM library..
Jan 14 2022, 9:21 AM · Restricted Project
zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Tue,

Jan 14 2022, 9:20 AM · Restricted Project

Dec 23 2021

zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 23 2021, 9:17 PM · Restricted Project
zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 23 2021, 6:05 AM · Restricted Project
zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 23 2021, 2:40 AM · Restricted Project

Dec 21 2021

zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 21 2021, 7:19 AM · Restricted Project
zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 21 2021, 12:14 AM · Restricted Project

Dec 20 2021

zimmermann6 updated subscribers of D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

Dear Santosh,

Dec 20 2021, 6:08 AM · Restricted Project

Dec 17 2021

zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

the new version applies cleanly to the main branch. I have tested it on x86_64 under Linux (haswell). I confirm it is CR for rounding to nearest, and I get 3 failures if I disable the 3 exceptional cases. For other rounding modes I get 8 failures for rounding towards zero (with the exceptional cases), 8 failures too for rounding towards -Inf, and 7 failures for rounding towards +Inf.

Dec 17 2021, 1:59 AM · Restricted Project

Dec 16 2021

zimmermann6 requested changes to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

a rebase is needed so that this patch can be applied on the 'main' branch

Dec 16 2021, 9:11 AM · Restricted Project
zimmermann6 accepted D115408: [libc] Implement correctly rounded logf based on RLIBM library..

this revision is fine with me (for rounding to nearest), thanks!

Dec 16 2021, 9:10 AM · Restricted Project

Dec 15 2021

zimmermann6 added a comment to D115828: [libc] Implement correctly rounded log2f based on RLIBM library..

this patch does not apply to the current main branch (db5aceb):

$ patch -p1 -i /tmp/D115828.diff 
patching file libc/config/linux/aarch64/entrypoints.txt
Hunk #1 FAILED at 136.
1 out of 1 hunk FAILED -- saving rejects to file libc/config/linux/aarch64/entrypoints.txt.rej

It seems this patch was built on the branch which adds logf.

Dec 15 2021, 11:31 PM · Restricted Project

Dec 14 2021

zimmermann6 added a comment to D115408: [libc] Implement correctly rounded logf based on RLIBM library..

I confirm the new version is CR for all cases in rounding to nearest. A way to make the exceptional cases CR for directed rounding modes is the following:

Dec 14 2021, 12:56 AM · Restricted Project

Dec 13 2021

zimmermann6 added a comment to D115408: [libc] Implement correctly rounded logf based on RLIBM library..

maybe I did something wrong, but with the latest version I get two failures for x=0x1.2f1fd6p+3 and x=0x1.bacb4ap+25.
If I disable the test for exceptional values I get five failures, those two and x=0x1.01a33ep+0, x=0x1.b121a6p+76 and x=0x1.6351d8p+95.

Dec 13 2021, 1:40 AM · Restricted Project

Dec 10 2021

zimmermann6 added a comment to D115408: [libc] Implement correctly rounded logf based on RLIBM library..

I confirm this version is correctly rounded for all binary32 inputs and rounding to nearest, by exhaustive testing. For other rounding modes I find 46 incorrect roundings for rounding towards zero, 46 for rounding towards +Inf, and 44 for rounding towards -Inf. I guess part of them are due to the hard-coded values for the 21 exceptional cases, which are on the wrong side with probability 1/2 each. Thus with little additional effort you could get a correctly rounded function for all rounding modes.
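
To illustrate why a value hard-coded for rounding to nearest lands on the wrong side about half the time in a directed mode, here is a small Python sketch. It is only an illustration, not the patch's code: the helpers `bracket_f32` and `round_f32` are hypothetical, they handle positive finite values only, and a Python double stands in for the exact result.

```python
import struct


def f32_bits(x):
    """Bit pattern of x rounded to the nearest binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b):
    """binary32 value (returned as a Python float) for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def bracket_f32(y):
    """Return (down, up), the binary32 neighbours of a positive double y."""
    near = bits_f32(f32_bits(y))
    if near == y:                      # y is exactly representable
        return near, near
    if near < y:                       # nearest value lies below y
        return near, bits_f32(f32_bits(near) + 1)
    return bits_f32(f32_bits(near) - 1), near


def round_f32(y, mode):
    """Round a positive double y to binary32 in the given rounding mode."""
    down, up = bracket_f32(y)
    if mode == "nearest":
        return bits_f32(f32_bits(y))
    if mode in ("downward", "toward_zero"):   # identical for y > 0
        return down
    if mode == "upward":
        return up
    raise ValueError(f"unknown rounding mode: {mode}")


if __name__ == "__main__":
    for y in (0.1, 0.7):
        print(y, [round_f32(y, m) for m in ("nearest", "downward", "upward")])
```

For 0.1 the nearest binary32 value is the upper neighbour, so returning it unchanged would be wrong for rounding towards zero or -Inf; for 0.7 it is the lower neighbour, so it would be wrong for rounding towards +Inf.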

Dec 10 2021, 1:47 AM · Restricted Project

Dec 9 2021

zimmermann6 added a comment to D115408: [libc] Implement correctly rounded logf based on RLIBM library..

Do you know the cost in latency/throughput of the switch() with the 21 exceptional cases? Another way would be to perform a rounding test once the approximation of log has been computed, and enter the switch only if the test fails (which would happen very rarely).
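
The rounding-test idea can be sketched as follows, in Python for brevity and for rounding to nearest only. This is a hedged illustration, not the code under review: `math.log` stands in for the fast approximation, `REL_ERR` is an assumed error bound, and `slow_path` is a hypothetical fallback (for instance the switch over the exceptional inputs).

```python
import math
import struct

REL_ERR = 2.0**-45  # assumed relative error bound of the fast approximation


def to_f32(x):
    """Round a Python double to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def logf_sketch(x, slow_path, rel_err=REL_ERR):
    """Return log(x) rounded to binary32, using the slow path only if needed."""
    y = math.log(x)              # stand-in for the fast double approximation
    delta = abs(y) * rel_err     # absolute error bound on y
    lo = to_f32(y - delta)
    hi = to_f32(y + delta)
    if lo == hi:
        # Both ends of the error interval round to the same binary32 value,
        # so the rounding is decided without consulting the exceptional cases.
        return lo
    # Rounding test failed: the interval straddles a rounding boundary.
    return slow_path(x)
```

With a tight error bound the fallback is reached only for inputs whose result lies very close to a rounding boundary, so the cost of the switch is paid rarely.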

Dec 9 2021, 10:33 PM · Restricted Project