- Use intrinsics for x86-64 fma
- Optimize PolyEval for x86-64 with degree 3 & 5 polynomials.
- There might be a slight loss of accuracy compared to Horner's scheme because the computations use the higher powers x^2 and x^3.
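To make the accuracy remark concrete, here is a minimal sketch of the two evaluation orders for a degree-3 polynomial. It is not the code from the patch: the function names `horner3` and `split3` are hypothetical, and the grouping shown is only one plausible way to use x^2.

```cpp
#include <cmath>

// Horner's scheme: a0 + x*(a1 + x*(a2 + x*a3)).
// Three FMAs, each depending on the previous result.
inline double horner3(double x, double a0, double a1, double a2, double a3) {
  return std::fma(x, std::fma(x, std::fma(x, a3, a2), a1), a0);
}

// Split evaluation (assumed grouping): (a0 + a1*x) + x^2 * (a2 + a3*x).
// The two inner FMAs are independent and can run in parallel, but x*x is
// rounded before it is used, which adds one extra rounding error.
inline double split3(double x, double a0, double a1, double a2, double a3) {
  double x2 = x * x;
  double lo = std::fma(x, a1, a0);
  double hi = std::fma(x, a3, a2);
  return std::fma(x2, hi, lo);
}
```

Both functions agree exactly when every intermediate value is representable, for example `horner3(2.0, 1, 2, 3, 4)` and `split3(2.0, 1, 2, 3, 4)`. They can differ in the last bit when `x * x` is inexact.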
Repository: rG LLVM Github Monorepo
Event Timeline
The subject line and description are confusing. It seems that you have added specializations for 3rd- and 5th-degree polynomials, but the description says "3-6 polynomials".
libc/src/__support/FPUtil/PolyEval.h | ||
---|---|---|
36 | Couple of things:
| |
libc/src/__support/FPUtil/x86_64/PolyEvalDouble.h | ||
1 ↗ | (On Diff #392768) | Fix the line length. Also, I am not sure the description is correct. |
libc/src/__support/FPUtil/x86_64/PolyEvalFloat.h | ||
1 ↗ | (On Diff #392768) | Fix the line length. Also, I am not sure the description is correct. |
libc/src/math/generic/CMakeLists.txt | ||
503 ↗ | (On Diff #392768) | Can you add a comment explaining what in the implementation would be affected by this? AFAICT, we are either using the fma instructions directly or calling the fma-related builtins, so it is not clear to me why this should be required. |
libc/test/src/math/expm1f_test.cpp | ||
111 | If there is a possibility of trading off between performance and accuracy, we should provide build time switches for users to pick one or the other based on their needs. By default we want to "err" on the side of providing more accurate implementations over providing faster but less accurate implementations. |
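As an illustration of the build-time switch suggested above, the following sketch selects the evaluation order with a preprocessor macro and defaults to the more accurate path. The macro `SKETCH_POLYEVAL_FAST` and the function `poly3` are hypothetical names chosen for this example; the patch itself does not define them.

```cpp
#include <cmath>

// Hypothetical opt-in switch: build with -DSKETCH_POLYEVAL_FAST to trade a
// little accuracy for a shorter dependency chain. Left undefined by default,
// so the more accurate Horner path is used.
inline double poly3(double x, double a0, double a1, double a2, double a3) {
#ifdef SKETCH_POLYEVAL_FAST
  // Faster path (assumed grouping): uses the pre-rounded power x^2.
  double x2 = x * x;
  return std::fma(x2, std::fma(x, a3, a2), std::fma(x, a1, a0));
#else
  // Default path: Horner's scheme.
  return std::fma(x, std::fma(x, std::fma(x, a3, a2), a1), a0);
#endif
}
```

Callers use `poly3` the same way in either configuration, so the choice stays a build decision rather than a source change.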
libc/src/math/generic/CMakeLists.txt | ||
---|---|---|
503 ↗ | (On Diff #392768) | So I learned that the fma-related builtins require this option. The correct way to do this would be as follows:
Then targets depending on the helper library will automatically get the -mfma compile option. |
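On the C++ side, the dependence on the compile option can be sketched as follows. This is an assumption-laden illustration, not the patch's implementation: `fma_sketch` is a hypothetical name, and the guard relies on GCC and Clang defining `__FMA__` when FMA code generation is enabled (for example with -mfma).

```cpp
#include <cmath>
#if defined(__FMA__)
#include <immintrin.h>
#endif

// Hypothetical helper: computes x * y + z with a single rounding.
inline double fma_sketch(double x, double y, double z) {
#if defined(__FMA__)
  // Intrinsic path: only available when the translation unit is compiled
  // with FMA support enabled.
  __m128d r = _mm_fmadd_sd(_mm_set_sd(x), _mm_set_sd(y), _mm_set_sd(z));
  return _mm_cvtsd_f64(r);
#else
  // Portable fallback: the standard library fused multiply-add.
  return std::fma(x, y, z);
#endif
}
```

Without the option, the intrinsic path is not compiled at all, which is why every target that includes such a header needs the flag to propagate to it.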
libc/test/src/math/expm1f_test.cpp | ||
---|---|---|
111 | We're going to have a correctly rounded version of expm1f soon, so I'm not worried about this regression yet. |
[libc] Use intrinsic for x86-64 fma and optimize PolyEval for x86-64 with degree 3 & 5 polynomials.
libc/test/src/math/expm1f_test.cpp | ||
---|---|---|
111 | Added a possible reason to the patch's summary. |
[libc] Use intrinsics for x86-64 fma and optimize PolyEval for x86-64 with degree 3 & 5 polynomials.