A CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is passed only to the host compilation
so that host code can take advantage of multithreaded computation.
CUDA/HIP and OpenMP both use Sema::DeviceCallGraph to store functions that still need to be analyzed, and each removes a function
once it decides that the function is sure to be emitted. However, CUDA/HIP and OpenMP use different functions to determine
whether a function is sure to be emitted.
To perform host/device checks correctly for CUDA/HIP when -fopenmp is enabled, a unified logic is needed to determine
whether a function is to be emitted. That logic needs to be aware of both the CUDA/HIP and the OpenMP rules.
This patch affects only CUDA/HIP programs that are compiled with -fopenmp. It should have no effect on C++ programs
compiled with -fopenmp.
There's an overload of DenseMap::erase that just takes a key value, so this whole thing can be S.DeviceCallGraph.erase(OrigCallee);.
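To illustrate the simplification suggested above, here is a minimal sketch that contrasts a find-then-erase pattern with erasing by key. It uses std::unordered_map purely as a stand-in, since it offers the same two styles of erase; the CallGraph alias, the string keys, the int values, and both helper names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the call-graph map. The container discussed in
// the review is an llvm::DenseMap keyed by declarations; std::unordered_map
// is used here only so the sketch compiles without LLVM headers.
using CallGraph = std::unordered_map<std::string, int>;

// Verbose form: look the key up, then erase through the iterator.
bool eraseViaFind(CallGraph &G, const std::string &Callee) {
  auto It = G.find(Callee);
  if (It == G.end())
    return false;
  G.erase(It);
  return true;
}

// Compact form: erase directly by key. A missing key is simply a no-op,
// so no separate lookup or end() check is needed.
bool eraseByKey(CallGraph &G, const std::string &Callee) {
  return G.erase(Callee) != 0;
}
```

Both helpers leave the map in the same state, whether or not the key is present, which is why the by-key form can replace the longer one.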
Why do we need to erase the entry instead of re-using it? If the call graphs are different for the two use-cases, is that conflict a problem for other reasons?