In current Clang, the OpenMP NVPTX toolchain resolves math functions to their host versions. For example, a call to sqrt() in a target region produces an LLVM IR call that looks like this:

call double @sqrt(double %1)

This patch makes math function calls in OpenMP NVPTX target regions resolve to the same device math functions that CUDA code uses. For example, for sqrt we get:

call double @llvm.nvvm.sqrt.rn.d(double %1)

This is necessary for both correctness and performance.