This patch fixes an issue in the following simple case:
#include <omp.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
  int num = omp_get_num_devices();
  printf("%d\n", num);
  return 0;
}
Currently it prints 0 even when devices exist. Because this file doesn't
contain any target region, the host entry is empty, so further actions such as
initialization never run, and the runtime function call returns the wrong
number of devices.
In addition, this patch fixes the wrong number of devices returned by
`omp_get_num_devices` even when the binary contains target regions. Previously,
only libraries that could recognize kernels were initialized. For example, if
the fat binary contained only CUDA kernels, then only the CUDA offloading
library was loaded. The others were not, even though they existed. As a result,
`omp_get_num_devices` returned only the number of NVIDIA GPUs, which does
not conform to the specification, which says that "the omp_get_num_devices
routine returns the number of available target devices".
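
The difference between the two behaviors can be sketched with a small model.
This is only an illustration, not the runtime's actual code: `struct plugin`,
`count_devices_old`, and `count_devices_new` are hypothetical names, and the
`recognizes_image` flag stands in for "the fat binary contains kernels this
library can recognize".

```c
#include <stddef.h>

/* Hypothetical model of an offloading library; not a real runtime type. */
struct plugin {
  const char *name;
  int num_devices;      /* devices this library would report */
  int recognizes_image; /* nonzero if the binary has kernels it recognizes */
};

/* Old behavior (as described above): only libraries that recognize a
 * kernel in the binary are initialized and counted. */
int count_devices_old(const struct plugin *plugins, size_t n) {
  int total = 0;
  for (size_t i = 0; i < n; ++i)
    if (plugins[i].recognizes_image)
      total += plugins[i].num_devices;
  return total;
}

/* Intended behavior: every available library contributes its devices,
 * whether or not the binary contains kernels for it. */
int count_devices_new(const struct plugin *plugins, size_t n) {
  int total = 0;
  for (size_t i = 0; i < n; ++i)
    total += plugins[i].num_devices;
  return total;
}
```

In this model, a binary without target regions corresponds to every
`recognizes_image` being zero, so the old count is 0 regardless of how many
devices exist. A CUDA-only fat binary corresponds to the flag being set for
the CUDA entry alone, so the old count covers only the NVIDIA GPUs.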