Sometimes libomptarget's CUDA plugin produces unhelpful diagnostics
about a lack of CUDA devices before an application runs:

    $ clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa hello-world.c
    $ ./a.out
    CUDA error: Error returned from cuInit
    CUDA error: no CUDA-capable device is detected
    Hello World: 4

This can happen when the CUDA plugin was built but all CUDA devices
are currently disabled in some manner, perhaps because
CUDA_VISIBLE_DEVICES is set to the empty string. As shown in the
above example, it can even happen when we haven't compiled the
application for offloading to CUDA.

The following code from openmp/libomptarget/plugins/cuda/src/rtl.cpp
appears to be intended to handle this case, and it chooses not to
write a diagnostic to stderr unless debugging is enabled:

    if (NumberOfDevices == 0) {
      DP("There are no devices supporting CUDA.\n");
      return;
    }

The problem is that the above code is never reached because the
earlier cuInit returns CUDA_ERROR_NO_DEVICE. This patch handles
that cuInit case in the same manner as the above code handles the
NumberOfDevices == 0 case.
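
The intended control flow can be sketched in isolation as follows. This is a minimal illustration and not the code in rtl.cpp: the CUresult enum below is a stand-in so the sketch builds without the CUDA headers (its values mirror the CUDA driver API), and InitOutcome and classifyInit are hypothetical names introduced only for this example.

```cpp
// Stand-in for the CUDA driver API's CUresult so this sketch builds
// without CUDA installed. The values mirror the real enum.
enum CUresult {
  CUDA_SUCCESS = 0,
  CUDA_ERROR_NO_DEVICE = 100,
  CUDA_ERROR_UNKNOWN = 999
};

// Hypothetical classification of what plugin initialization should do.
enum class InitOutcome {
  Ready,          // devices are available, continue initialization
  NoDevicesQuiet, // no devices: debug-only message, then return
  Failed          // genuine error: report it on stderr
};

// Hypothetical helper: treat CUDA_ERROR_NO_DEVICE from cuInit the same
// way as a device count of zero, before any generic error reporting.
InitOutcome classifyInit(CUresult InitErr, int NumberOfDevices) {
  if (InitErr == CUDA_ERROR_NO_DEVICE)
    return InitOutcome::NoDevicesQuiet;
  if (InitErr != CUDA_SUCCESS)
    return InitOutcome::Failed;
  if (NumberOfDevices == 0)
    return InitOutcome::NoDevicesQuiet;
  return InitOutcome::Ready;
}
```

In this sketch, the check for CUDA_ERROR_NO_DEVICE comes before the generic failure path, so the no-device case never reaches the code that prints to stderr.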