This represents the start of my work to import and adapt the SLEEF vector math-function library, authored by Naoki Shibata, to LLVM. See https://github.com/shibatch/sleef for the original source. For the RFC, see: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102254.html
I've not changed any of the meat of the implementation in order to make this patch, but I've tried to make it more like a runtime library (and I've made the source files C++ files instead of C files). All of the external functions start with __. The largest issue is how to deal with vector ISA/ABI compatibility. I've tried to properly separate the concerns of:
- For what processor is the runtime library itself being compiled
- For what vector ABIs are vectorized functions being made available
Aside from the scalar versions, which are pure C/C++ and always compiled, vector versions are compiled when possible. For example, we have xsin and xsinf, the scalar versions; xsinsse2 and xsinfsse2 (which use the __m128d and __m128 types); xsinavx and xsinfavx (which use __m256d and __m256); and xsinavx2 and xsinfavx2 (which also use __m256d and __m256, although some functions use a different integer type compared to the AVX versions). As many of these variants as possible are compiled into the library simultaneously.
The library is implemented using intrinsics, not assembly, and so the associated target features must be enabled in the compiler to compile the relevant versions of these functions. By default, compilers on x86 often enable support only for SSE2 (i.e. not AVX or later ISAs). When the compiler supports flags to turn on AVX, AVX2, etc., the build will add them, but only for the files which require them. This is important because if you're building for an older core (or just trying to be portable), you don't want the compiler to start generating AVX instructions inside your SSE2 functions.
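To make the feature-gating idea concrete, here is a minimal single-file sketch. It is not code from the patch: the names example_scalar, example_sse2, example_avx, compiled_variants and vector_variants_match_scalar are hypothetical, and the kernel (2*x + 1) is just a stand-in for a real math function. It assumes the __SSE2__ and __AVX__ macros that GCC and Clang predefine when the corresponding target features are enabled. In the actual build, each ISA variant would live in its own file compiled with its own flags; the guards here only illustrate that a variant exists exactly when its feature is available.

```cpp
#include <string>
#include <vector>

// Pull in intrinsics headers only when the target feature is enabled
// (assumption: GCC/Clang-style predefined feature macros).
#if defined(__SSE2__)
#include <emmintrin.h>
#endif
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Scalar variant: plain C++, always compiled.
double example_scalar(double x) { return 2.0 * x + 1.0; }

#if defined(__SSE2__)
// SSE2 variant: two doubles per call, only built when SSE2 is enabled.
__m128d example_sse2(__m128d x) {
  return _mm_add_pd(_mm_mul_pd(_mm_set1_pd(2.0), x), _mm_set1_pd(1.0));
}
#endif

#if defined(__AVX__)
// AVX variant: four doubles per call, only built when AVX is enabled.
__m256d example_avx(__m256d x) {
  return _mm256_add_pd(_mm256_mul_pd(_mm256_set1_pd(2.0), x),
                       _mm256_set1_pd(1.0));
}
#endif

// Reports which variants were compiled into this translation unit.
std::vector<std::string> compiled_variants() {
  std::vector<std::string> names{"scalar"};
#if defined(__SSE2__)
  names.push_back("sse2");
#endif
#if defined(__AVX__)
  names.push_back("avx");
#endif
  return names;
}

// Checks that every compiled vector variant agrees with the scalar one
// when all lanes hold the same input value.
bool vector_variants_match_scalar(double x) {
  const double expected = example_scalar(x);
  bool ok = true;
#if defined(__SSE2__)
  double out2[2];
  _mm_storeu_pd(out2, example_sse2(_mm_set1_pd(x)));
  ok = ok && out2[0] == expected && out2[1] == expected;
#endif
#if defined(__AVX__)
  double out4[4];
  _mm256_storeu_pd(out4, example_avx(_mm256_set1_pd(x)));
  for (double v : out4) ok = ok && v == expected;
#endif
  return ok;
}
```

Compiled with default flags on a typical x86-64 toolchain, only the scalar and SSE2 variants would be built; adding a flag such as -mavx enables the AVX variant as well. That is exactly why such flags should be applied per file rather than globally.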
For ARM, NEON is supported (although only single-precision currently).
I've not yet dealt with testing; the source on GitHub has testing programs. They make use of MPFR (a dependency I doubt we want) and, in part, perform randomized testing (which, at least for regression tests, we probably don't want either). We need to figure out what we want to do here.
In any case, there's a lot to discuss here about code structure, naming conventions, testing, etc.