We may want to mark certain functions as kernels even without being in
an accepted CUDA or OpenCL language mode. This patch introduces a new
attribute, nvptx_kernel, which is limited to NVPTX targets and performs
the same metadata action as the existing CUDA attributes. It closely
mimics the behaviour of the amdgpu_kernel attribute and allows making
executable NVPTX device images without using an existing offloading
language model.
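
As a rough illustration of the intended use, the sketch below marks a plain C function as a kernel entry point. It assumes the attribute is accepted in the GNU `__attribute__((nvptx_kernel))` spelling and that the file is compiled with Clang for an NVPTX triple such as nvptx64-nvidia-cuda. The `DEVICE_KERNEL` macro and the `scale` function are hypothetical names used only for this example, and the macro expands to nothing on other targets so the same file still builds on the host.

```c
/* Hypothetical helper macro (not part of the patch): apply the new
 * attribute only when compiling for NVPTX, so host builds still work. */
#if defined(__NVPTX__)
#define DEVICE_KERNEL __attribute__((nvptx_kernel))
#else
#define DEVICE_KERNEL
#endif

/* Kernel entry point written in plain C, with no CUDA or OpenCL
 * language mode. For simplicity every thread that runs this walks the
 * whole array; a real kernel would index by thread ID instead. */
DEVICE_KERNEL void scale(float *data, int n, float factor) {
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}
```

On an NVPTX target the function would be expected to get the same kernel metadata that a CUDA `__global__` function receives; on any other target it is an ordinary function.
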
I was unsure how best to do this. I could potentially re-use all the
CUDA attributes and just replace the CUDA language requirement with an
NVPTX architecture requirement. I also don't know whether I should add
more than just this attribute.
Nice.
This reminded me that we have a project compiling CUDA, but targeting SPIR-V instead of NVPTX. It looks like this will likely break it. The project is out-of-tree, but I'd still need to figure out how to keep it working. I guess it would be easy enough to expand TargetNVPTX to TargetNVPTXOrSpirV. I'm mostly concerned about the logistics of making that happen without disruption.