Rename -fcuda-approx-transcendentals as
-fgpu-approx-transcendentals and pass it
to both device and host clang -cc1.
Fix its interaction with -ffast-math to allow
-fno-gpu-approx-transcendentals to override
the implicit -fcuda-approx-transcendentals
due to -ffast-math.
Rename the predefined macro to be
__CLANG_GPU_APPROX_TRANSCENDENTALS__.
Emit the macro for both device and host compilation.