This lets us emit e.g. sin.approx.f32. See
http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-sin
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
More tightly scope the USE_FAST_MATH macro.
tra pointed out that device_functions.hpp uses USE_FAST_MATH for its own
purposes. For this CL, we only want to define USE_FAST_MATH around
math_functions.hpp.
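
The scoping described above can be sketched with the push_macro/pop_macro pragmas, which GCC, Clang, and MSVC all support. This is an illustration only, not the code in this CL: the two preprocessor blocks are hypothetical stand-ins for device_functions.hpp and math_functions.hpp, the constant names are invented, and the macro name is spelled as it appears in this comment.

```cpp
// Stand-in for device_functions.hpp (hypothetical): it must NOT see the macro.
#if defined(USE_FAST_MATH)
constexpr bool kDeviceFunctionsSawMacro = true;
#else
constexpr bool kDeviceFunctionsSawMacro = false;
#endif

// Save whatever state the macro had (here: undefined), then define it
// only for the region that stands in for math_functions.hpp.
#pragma push_macro("USE_FAST_MATH")
#define USE_FAST_MATH 1

// Stand-in for math_functions.hpp (hypothetical): it SHOULD see the macro.
#if defined(USE_FAST_MATH)
constexpr bool kMathFunctionsSawMacro = true;
#else
constexpr bool kMathFunctionsSawMacro = false;
#endif

// Restore the saved state; the macro was undefined before the push,
// so it is undefined again from here on.
#pragma pop_macro("USE_FAST_MATH")

#if defined(USE_FAST_MATH)
constexpr bool kMacroLeakedAfterPop = true;
#else
constexpr bool kMacroLeakedAfterPop = false;
#endif

static_assert(!kDeviceFunctionsSawMacro, "macro visible too early");
static_assert(kMathFunctionsSawMacro, "macro not visible where intended");
static_assert(!kMacroLeakedAfterPop, "macro leaked past pop_macro");
```

Saving and restoring the macro, rather than using a bare #define/#undef pair, also preserves any definition the user supplied before the wrapper was included.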