Currently, the NVPTX compilation toolchain can only be invoked through
CUDA or OpenMP, using --offload-device-only. This is because we cannot
build a CUDA toolchain without an accompanying host toolchain for the
offloading. As a result, using --target=nvptx64-nvidia-cuda directly
generates calls to the GNU assembler and linker, which leads to errors.
This patch abstracts the portions of the CUDA toolchain that are
independent of the host toolchain or offloading kind into a new base
class called NVPTXToolChain. We still need to read the host's triple
to build the CUDA installation, so if it is not present we assume for
now that it matches the host system. Alternatively, the user can
provide the path explicitly.
This should allow the compiler driver to create NVPTX device images
directly from C/C++ code.
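As a rough illustration of the split described above, the standalone
C++ sketch below models a host-independent base class with an
offloading toolchain layered on top. It is not the actual driver code:
the Triple struct, the constructor signatures, the CudaToolChain
derivation, and the installationTriple() and hasHostToolChain() helpers
are all hypothetical and exist only to show the fallback to the system
triple when no host triple is given.

```cpp
#include <optional>
#include <string>
#include <utility>

// Simplified stand-in for a target triple (NOT llvm::Triple).
struct Triple {
  std::string Str;
};

// Hypothetical model of the host-independent base class.
class NVPTXToolChain {
public:
  // HostT is optional: when it is absent we assume the triple of the
  // system the compiler runs on (SystemDefault), mirroring the
  // "assume it matches the host system for now" rule.
  NVPTXToolChain(Triple DeviceT, std::optional<Triple> HostT,
                 Triple SystemDefault)
      : Device(std::move(DeviceT)),
        Host(HostT ? std::move(*HostT) : std::move(SystemDefault)) {}
  virtual ~NVPTXToolChain() = default;

  const Triple &deviceTriple() const { return Device; }

  // Triple consulted when locating the CUDA installation.
  const Triple &installationTriple() const { return Host; }

  // Direct NVPTX compilation has no accompanying host toolchain.
  virtual bool hasHostToolChain() const { return false; }

private:
  Triple Device;
  Triple Host;
};

// Hypothetical offloading toolchain: always created with a host triple.
class CudaToolChain : public NVPTXToolChain {
public:
  CudaToolChain(Triple DeviceT, Triple HostT)
      : NVPTXToolChain(std::move(DeviceT), HostT, HostT) {}

  bool hasHostToolChain() const override { return true; }
};
```

In this model, constructing NVPTXToolChain with std::nullopt for the
host triple corresponds to the direct --target=nvptx64-nvidia-cuda case,
while CudaToolChain corresponds to the existing offloading path.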
Nit: it's hard to tell whether the whitespace additions are spaces or tabs. They show up as ">" for me, which suggests they may be tabs. If that is the case, please un-tabify the changes.