DO NOT REVIEW; For reference only.
The Arm A64 instruction set has instructions dedicated to some mathematical functions. This patch expands the llvm.sin.* and llvm.cos.* intrinsics using these instructions when the afn fast-math flag is present and SVE is available.
This can improve performance in the following ways:
- Use the dedicated instructions best suited to the target architecture features (SVE, NEON) (this can also be achieved by dedicated math libraries)
- Vectorize loops that contain mathematical function calls (this can also be achieved by vectorized math libraries and compiler support, e.g. D134719)
- Eliminate function call overhead
- Schedule caller and callee instructions together
- Enable better software pipelining (in the future)
- Increase the number of good candidate fission points for loop fission (in the future)
This patch is preliminary work. I posted it here so that its direction can be discussed on Discourse. A complete patch will be posted in a separate review.