Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to compiler which leads to
compiler including real cuda_runtime.h from there instead
of the wrapper we need.
Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.
The file is only intended to be -include'd by clang and should never be included by users, hence '__'.
"by compiler" -> "by the compiler"