This is an archive of the discontinued LLVM Phabricator instance.

[libc][math] New algorithm for expf/expm1f/exp2f and for new functions sinhf/coshf.
Needs Revision · Public

Authored by orex on Jun 16 2022, 3:49 AM.

Details

Summary
A new common algorithm for expf/expm1f/exp2f/sinhf/coshf is introduced:
1) Lookup table size for expf/expm1f/exp2f reduced by a factor of 12.
2) Common algorithm for all 5 functions. The same code is used for expf/expm1f/sinhf/coshf.
3) Improved precision: the number of exceptional cases is reduced from 9 to 2 (+1 for sinhf).
4) More reliable algorithm. It uses pure mathematics and does not rely on Sollya
fitting. The lookup table size is easy to change, for example.
5) Perf tests show similar performance to the previous implementation.
6) Core-math performance tests below (glibc 2.31):

        expf    expm1f  exp2f   sinhf   coshf
glibc   10.519  42.483  10.377  64.084  23.605
this    14.696  14.518  20.374  29.922  33.572
prev    15.003  12.291  27.201  ---     ---

Event Timeline

orex created this revision. Jun 16 2022, 3:49 AM
Herald added projects: Restricted Project, Restricted Project. Jun 16 2022, 3:49 AM
orex edited the summary of this revision. Jun 16 2022, 3:49 AM
orex edited the summary of this revision. Jun 16 2022, 4:48 AM
orex published this revision for review. Jun 16 2022, 5:03 AM
lntue added inline comments. Jun 16 2022, 7:11 AM
libc/src/math/generic/common_constants.cpp
106

Use hexadecimal floats for constants.
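
As an illustration of this request, here is a minimal sketch of hexadecimal floating-point literals in C++17. The constant names are made up for this example and are not taken from the patch. The point is only that a hex literal states the exact binary value, so nothing depends on decimal-to-binary rounding.

```cpp
#include <cmath>

// Hexadecimal floating-point literals (C++17): mantissa in hex, exponent as a
// power of two after 'p'. All names here are illustrative assumptions.
constexpr double ONE_AND_HALF = 0x1.8p+0;        // exactly 1.5
constexpr float TWO_POW_MINUS_10 = 0x1p-10f;     // exactly 2^-10
constexpr double LOG2_E = 0x1.71547652b82fep+0;  // log2(e), about 1.4426950408889634

static_assert(ONE_AND_HALF == 1.5, "hex literal is exact");
static_assert(TWO_POW_MINUS_10 == 0.0009765625f, "hex literal is exact");

// Sanity check of the hex constant against the math library (tolerance-based,
// because std::log is itself rounded).
inline bool log2e_matches_libm() {
  return std::fabs(LOG2_E - 1.0 / std::log(2.0)) < 1e-15;
}
```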

libc/src/math/generic/common_constants.h
20–21

Use all caps for constants.

libc/src/math/generic/exp2f.cpp
22

Use hexadecimal floats for constants and explain how these constants are generated. Also use caps and more descriptive names.

73

Maybe you can just inline exval1, exval2, and exval_mask here, with the comment explaining how exval_mask is obtained.

83–89

Add comments explaining the range reduction computations in detail.
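
For readers unfamiliar with the term, the following is a textbook range reduction for 2^x with the kind of comments being requested. It is a sketch only and is not the scheme used in the patch: the function name, the Taylor coefficients, and the restriction to moderate finite inputs (no NaN, infinity, overflow, or underflow handling) are all assumptions made for this example.

```cpp
#include <cmath>

// Textbook sketch, NOT the patch's algorithm. Valid only for finite x of
// moderate size (roughly -126 <= x <= 127); no special-case handling.
inline float exp2f_sketch(float x) {
  double xd = static_cast<double>(x);

  // Range reduction: x = k + r, with k an integer and r in [-0.5, 0.5].
  // Then 2^x = 2^k * 2^r, and only 2^r needs a polynomial.
  double k = std::nearbyint(xd);  // nearest integer (default rounding mode)
  double r = xd - k;              // exact, since x is a float held in a double

  // 2^r = e^t with t = r * ln(2), so |t| <= about 0.35.
  double t = r * 0x1.62e42fefa39efp-1;

  // Degree-7 Taylor polynomial of e^t, evaluated in Horner form.
  constexpr double C[] = {1.0,        1.0,         1.0 / 2.0,   1.0 / 6.0,
                          1.0 / 24.0, 1.0 / 120.0, 1.0 / 720.0, 1.0 / 5040.0};
  double p = C[7];
  for (int i = 6; i >= 0; --i)
    p = std::fma(p, t, C[i]);

  // Reconstruction: scale by 2^k.
  return static_cast<float>(std::ldexp(p, static_cast<int>(k)));
}
```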

91

Use hexadecimal floats for constants. Also, you might want to try combining the terms a bit to take advantage of FMA when it's available, either

multiply_add(dx, polyeval(...), l2h*dx);

or

polyeval(dx, l2h*dx, l2l, ...);

Those two are actually the same, including when there is no FMA. Performance tests should show whether this helps when FMA is available.
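
To see why the two forms coincide, here is a small sketch. It assumes polyeval(x, a0, a1, ...) means Horner evaluation of a0 + x*(a1 + x*(...)) and that multiply_add(x, y, z) computes x*y + z. Both helpers below are hypothetical stand-ins written for this example (using std::fma), not the LLVM libc implementations, and c2 and c3 are placeholder coefficients.

```cpp
#include <cmath>

// Hypothetical stand-in: x * y + z with a single rounding.
inline double multiply_add(double x, double y, double z) {
  return std::fma(x, y, z);
}

// Hypothetical stand-in: Horner evaluation, a0 + x * (a1 + x * (a2 + ...)).
inline double polyeval(double /*x*/, double a0) { return a0; }

template <typename... Ts>
inline double polyeval(double x, double a0, Ts... rest) {
  return multiply_add(x, polyeval(x, rest...), a0);
}

// First suggested form: the l2h * dx term is added by an explicit multiply_add.
inline double form_a(double dx, double l2h, double l2l, double c2, double c3) {
  return multiply_add(dx, polyeval(dx, l2l, c2, c3), l2h * dx);
}

// Second suggested form: l2h * dx is passed as the constant coefficient.
// The outermost Horner step expands to exactly the expression in form_a.
inline double form_b(double dx, double l2h, double l2l, double c2, double c3) {
  return polyeval(dx, l2h * dx, l2l, c2, c3);
}
```

With dx = 0.5, l2h = 2, l2l = 1, c2 = 4, c3 = 8, both forms evaluate 1 + 0.5 * (1 + 0.5 * (4 + 0.5 * 8)) = 3.5.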

98

Maybe combine this into multiply_add(ml, pe, pe + ml + 1.0) to take advantage of FMA if available.
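
A minimal sketch of the suggested combination follows. The code at line 98 is not shown in this archive, so treating ml and pe as double-precision terms is a guess, and combine_terms is a hypothetical name. Algebraically, ml*pe + (pe + ml + 1.0) equals (1 + ml)(1 + pe); std::fma stands in for multiply_add here.

```cpp
#include <cmath>

// Hypothetical helper: evaluates ml * pe + (pe + ml + 1.0), with the final
// multiply-add performed by std::fma (a single rounding for that step).
inline double combine_terms(double ml, double pe) {
  return std::fma(ml, pe, pe + ml + 1.0);
}
```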

libc/test/src/math/CMakeLists.txt
1199

Indentation.

orex updated this revision to Diff 441028. Jun 29 2022, 8:34 AM

Added sinhf/coshf

orex retitled this revision from "[libc][math] New common algorithm for expf/expm1f/exp2f." to "[libc][math] New algorithm for expf/expm1f/exp2f and for new functions sinhf/coshf.". Jun 29 2022, 10:00 AM
orex edited the summary of this revision.
orex updated this revision to Diff 441306. Jun 30 2022, 1:39 AM
orex edited the summary of this revision.

Fix build problem.

orex updated this revision to Diff 448644. Jul 29 2022, 8:51 AM

Rebased to latest main.

orex added a comment. Jul 29 2022, 9:00 AM

Paul (@zimmermann6) and Tue (@lntue),

can you check the performance of these two functions, expf/expm1f, on your systems? My idea is to use these functions in place of the standard ones under the MinSizeRel build option. Reducing the size can be useful for embedded systems, for example. What do you think?

zimmermann6 requested changes to this revision. Sep 21 2022, 12:09 AM

sorry for the delay. It seems this does not compile properly with current main:

/localdisk/zimmerma/llvm-project/libc/src/math/generic/expm1f.cpp:11:10: fatal error: 'expxf.h' file not found
#include "expxf.h"
         ^~~~~~~~~
This revision now requires changes to proceed. Sep 21 2022, 12:09 AM