The device runtime contains many calls to __kmpc_is_spmd_exec_mode, which
queries the execution mode of the current kernel. In many cases, a user program
only contains target regions that execute in one mode. As a consequence, those
runtime calls always return the same value. If we can fold these calls away
during compilation, it can potentially improve performance.
In this patch, we use AAKernelInfo to analyze kernel execution. For each
kernel (device) function F, we collect all kernel entries K that can reach F.
A new abstract attribute (AA), AAFoldRuntimeCall, is created for each call
site. In each iteration, it checks all reaching kernel entries and updates the
folded value accordingly.
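
The folding rule described above can be sketched as follows. This is a
simplified, standalone illustration and not the Attributor code in this patch:
ExecMode and foldIsSPMDExecMode are hypothetical names, and treating an empty
or unknown set of reaching kernels as "do not fold" is an assumption made to
keep the sketch conservative.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for what the analysis knows about one kernel entry.
enum class ExecMode { Generic, SPMD, Unknown };

// Try to fold a call to __kmpc_is_spmd_exec_mode, given the execution modes
// of all kernel entries that can reach the call site. Returns the constant
// the call can be replaced with, or std::nullopt if it cannot be folded.
std::optional<bool>
foldIsSPMDExecMode(const std::vector<ExecMode> &ReachingKernels) {
  // Assumption: with no known reaching kernel, stay conservative.
  if (ReachingKernels.empty())
    return std::nullopt;

  bool SawSPMD = false;
  bool SawGeneric = false;
  for (ExecMode Mode : ReachingKernels) {
    // A kernel whose mode is not known blocks folding.
    if (Mode == ExecMode::Unknown)
      return std::nullopt;
    if (Mode == ExecMode::SPMD)
      SawSPMD = true;
    else
      SawGeneric = true;
  }

  // Reaching kernels disagree, so the call can return either value.
  if (SawSPMD && SawGeneric)
    return std::nullopt;

  // All reaching kernels agree on one mode.
  return SawSPMD;
}
```

In this sketch, a call site reached only by SPMD kernels folds to true, one
reached only by generic kernels folds to false, and any mix is left as a call.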
In the future, we will support more runtime functions.
I don't think we need this anymore, do we?