This patch adds the changes required to bundle and wrap HIP files.
Bundling currently uses clang-offload-bundler to mimic fatbinary, and
wrapping uses runtime calls very similar to CUDA's. Managed, surface,
and texture variables are still unsupported; they would require some
additional information in the entry.
One difference from the existing code generation for AMD is that I
don't check whether the handle is null before destructing it; I'm not
sure if that check is required.
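
To make the null-check question concrete, here is a minimal sketch of the
two destructor shapes. It does not reproduce the generated code or the HIP
runtime: mockRegisterFatBinary, mockUnregisterFatBinary, moduleCtor, and
both moduleDtor variants are hypothetical stand-ins invented for
illustration, and the counters exist only so the behavior can be observed.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's register/unregister entry
// points; these are NOT the real HIP runtime functions.
static int g_registered = 0;       // images currently registered
static int g_unregister_calls = 0; // how often unregister was invoked

static void **mockRegisterFatBinary(const void *image) {
  static void *slot = nullptr;
  slot = const_cast<void *>(image);
  ++g_registered;
  return &slot; // opaque handle returned to the wrapper code
}

static void mockUnregisterFatBinary(void **handle) {
  ++g_unregister_calls;
  if (handle)
    --g_registered;
}

// Handle stored by the (hypothetical) module constructor.
static void **g_handle = nullptr;

static void moduleCtor(const void *image) {
  g_handle = mockRegisterFatBinary(image);
}

// Shape described in this patch: unregister without checking the handle.
static void moduleDtorUnguarded() { mockUnregisterFatBinary(g_handle); }

// Alternative shape: skip the call when the handle is null and clear it
// afterwards, so a second invocation is a no-op.
static void moduleDtorGuarded() {
  if (g_handle) {
    mockUnregisterFatBinary(g_handle);
    g_handle = nullptr;
  }
}
```

In this mock, the only observable difference is that the unguarded form
can reach the unregister call with a null handle (for example, if the
destructor runs twice). Whether the real runtime tolerates that is the
open question above.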
With this we should be able to support HIP with the new driver.
Depends on D128850
Nit: This test case does not have any CHECK lines and could use a comment describing what it's supposed to test. AFAICT it's intended to make sure that no temporary files are left around, but I'm not 100% sure.