This patch adds a loader utility targeting the CUDA driver API to launch
NVPTX images called nvptx_loader. This takes a GPU image on the
command line and launches the _start kernel with the appropriate
arguments. The _start kernel is provided by the already implemented
nvptx/start.cpp. So, an application with a main function can be
compiled and run as follows.
clang++ --target=nvptx64-nvidia-cuda main.cpp crt1.o -march=sm_70 -o image ./nvptx_loader image args to kernel
This implementation is not tested and does not yet support RPC. This
requires further development to work around NVIDIA specific limitations
in atomics and linking.