In current Clang, the OpenMP NVPTX toolchain resolves math functions to their host versions. For example, a call to sqrt() in a target region produces an LLVM IR call like this:
call double @sqrt(double %1)
This patch makes math function calls in OpenMP NVPTX target regions resolve to the same device math functions that CUDA code calls. For example, for sqrt we get:
call double @llvm.nvvm.sqrt.rn.d(double %1)
This is necessary for both correctness and performance.