This is an archive of the discontinued LLVM Phabricator instance.

[libc][math] Added tanhf function.
ClosedPublic

Authored by orex on Jul 29 2022, 7:51 AM.

Details

Summary

Adds a correctly rounded tanhf. About 2x faster than the glibc equivalent.

Performance (LLVM 12, Intel):

CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
13.279
37.492
18.145
CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='--latency' ./perf.sh tanhf
GNU libc version: 2.31
GNU libc release: stable
40.658
109.582
66.568

Event Timeline

orex created this revision. Jul 29 2022, 7:51 AM
Herald added projects: Restricted Project, Restricted Project. Jul 29 2022, 7:51 AM
orex requested review of this revision. Jul 29 2022, 7:51 AM
lntue accepted this revision. Aug 1 2022, 6:40 AM

The implementation passed exhaustive tests for all 3 modes: SSE2, SSE4.2, and FMA.
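Because a float has only 2^32 bit patterns, an exhaustive check of a single-precision function is practical. The sketch below shows the general shape of such a sweep; it is not the harness used for this revision. The names `float_from_bits`, `bits_from_float`, `reference_tanhf` and `count_mismatches` are hypothetical, and the reference (double-precision `std::tanh` rounded once to float) is an assumption that is not itself guaranteed to be correctly rounded. A real check would compare against a multi-precision reference and repeat the sweep for each rounding mode and build configuration.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float without violating aliasing rules.
inline float float_from_bits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Inverse of float_from_bits.
inline std::uint32_t bits_from_float(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Assumed stand-in reference: evaluate in double, then round once to float.
// Not guaranteed to be correctly rounded; illustration only.
inline float reference_tanhf(float x) {
  return static_cast<float>(std::tanh(static_cast<double>(x)));
}

// Count bit patterns in [first, last], visited every `stride` patterns,
// where candidate and reference return different bits. Two NaN results are
// treated as equal. `stride` must be nonzero; stride == 1 over the full
// range is the exhaustive sweep.
template <typename Candidate, typename Reference>
std::uint64_t count_mismatches(Candidate candidate, Reference reference,
                               std::uint32_t first = 0,
                               std::uint32_t last = UINT32_MAX,
                               std::uint32_t stride = 1) {
  std::uint64_t mismatches = 0;
  // A 64-bit counter lets the loop terminate when last == UINT32_MAX.
  for (std::uint64_t bits = first; bits <= last; bits += stride) {
    const float x = float_from_bits(static_cast<std::uint32_t>(bits));
    const float got = candidate(x);
    const float want = reference(x);
    if (std::isnan(got) && std::isnan(want))
      continue;
    if (bits_from_float(got) != bits_from_float(want))
      ++mismatches;
  }
  return mismatches;
}
```

Calling `count_mismatches(my_tanhf, reference_tanhf)` with the default arguments visits every float input; a larger `stride` gives a quick smoke test before the full sweep.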

Here are the performance numbers on a Ryzen 1700:

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf
CORE-MATH reciprocal throughput   : 13.379
System LIBC reciprocal throughput : 55.843
LIBC reciprocal throughput        : 25.130

$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf --latency
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency   : 43.814
System LIBC latency : 125.128
LIBC latency        : 94.906
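To put the Ryzen 1700 figures in relative terms, the short sketch below divides the system libc figure by the LIBC figure. It assumes that both numbers are in the same unit with lower being better, and that "System LIBC" is the installed glibc 2.31 while "LIBC" is the build under review. The helper names are hypothetical.

```cpp
#include <cstdio>

// Speedup of `candidate` over `baseline`; both are costs per call in the
// same unit, lower is better.
inline double speedup(double baseline, double candidate) {
  return baseline / candidate;
}

// Print the ratios for the Ryzen 1700 figures quoted above.
inline void print_ryzen_speedups() {
  std::printf("reciprocal throughput: %.2fx\n", speedup(55.843, 25.130));
  std::printf("latency:               %.2fx\n", speedup(125.128, 94.906));
}
```

On these numbers the reciprocal-throughput ratio is a little over 2.2x and the latency ratio a little over 1.3x, so the "about 2x" summary matches the throughput measurement more closely than the latency one.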
This revision is now accepted and ready to land. Aug 1 2022, 6:40 AM
This revision was automatically updated to reflect the committed changes.