Currently when we generate OpenMP offloading code we always make
fallback code for the CPU. This is necessary for implementing features
like conditional offloading and ensuring that unhandled pragmas don't
result in missing symbols. However, this is problematic for a few cases.
For offloading tests we can silently fail to the host without realizing
that offloading failed. Additionally, this makes it impossible to
provide interoperabiility to other offloading schemes like HIP or CUDA
because those methods do not provide any such host fallback guaruntee.
this patch adds the -fopenmp-offload-mandatory flag to prevent
generating the fallback symbol on the CPU and instead replaces the
function with a dummy global and the failed branch with 'unreachable'.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
This is necessary for implementing features like conditional offloading and ensuring that unhandled pragmas don't result in missing symbols.
This behavior is part of the standard.
For offloading tests we can silently fail to the host without realizing that offloading failed.
It is controlled by the OMP_TARGET_OFFLOAD env variable, no? You can set this env var to mandatory to avoid this problem.
I believe it's reasonable to have this as an option flag to defy the standard, we have other flags that do this already (e.g. -fopenmp-cuda-mode).
For offloading tests we can silently fail to the host without realizing that offloading failed.
It is controlled by the OMP_TARGET_OFFLOAD env variable, no? You can set this env var to mandatory to avoid this problem.
Yes, I don't think we set it in the tests right now for some reason. But the main reason I made this patch is for interoperability. Without this if you want to call a CUDA function from the OpenMP device you'd need a variant and a dummy implementation. If you don't write a dummy implementation you'll get a linker error, if you don't use a variant you'll override the CUDA version.
I mean it is not the implementation feature.
For offloading tests we can silently fail to the host without realizing that offloading failed.
It is controlled by the OMP_TARGET_OFFLOAD env variable, no? You can set this env var to mandatory to avoid this problem.
Yes, I don't think we set it in the tests right now for some reason.
I thought it is the default behavior. But we need to set it for offloading tests to be sure in their behavior.
But the main reason I made this patch is for interoperability. Without this if you want to call a CUDA function from the OpenMP device you'd need a variant and a dummy implementation. If you don't write a dummy implementation you'll get a linker error, if you don't use a variant you'll override the CUDA version.
Ah, ok, I see. How is supposed to be used? In Cuda code or in plain C/C++ code?
I haven't finalized the implementation, but the basic support I've tested was calling a __device__ function compiled from another file with OpenMP, with this patch the source files would look like this for example. I think the inverse would also be possible given some code on the CUDA side. Calling CUDA kernels would take some extra work.
__device__ int cuda() { return 0; }
int cuda(void); #pragma omp declare target device_type(nohost) to(cuda) int main() { int x = 1; #pragma omp target map(from : x) x = cuda(); return x; }
If we have #pragma omp target if (...) then that requires a host fallback and violates the assertion the user passed in, it will hit the unreachable and fail. If the user passed in #pragma omp target device(...) we will assume that a host implementation exists as well.
In the test file I had a global, but forgot to check the globals to show that @x exists on the device. I should also put an if (0) to show that we always hit unreachable in that case.
Adding test case to check if codegen for unreachables, and an extra function to show that it is not created for the host while the other is. Also added an error message when the user specified offloading is mandatory but couldn't be created due to if(0) or a lack of triples.
I assume it would be good to notify the user somehow about target regions, which may require execution on the host. Maybe add a note during the codegen phase?
Technically all of them may require execution on the host according to the documentation. We could emit a warning whenever we codegen a target region with an if clause, but I feel the user should have a good enough idea that if won't work if they specifically turn on the flag that removes host execution.
Thanks! Seems a good thing to add to the offloading test runner, preferably in a separate change to avoid reverting this in case of unforeseen problems
Could definitely do that, it doesn't seem like we test if anywhere in the OpenMP tests so it shouldn't break anything.