The device runtime contains many calls to __kmpc_is_spmd_exec_mode
that query the execution mode of the current kernel. In many cases, a user program
only contains target regions that execute in a single mode, so those runtime
calls always return the same value. Folding these calls away
during compilation can potentially improve performance.
In this patch, we use AAKernelInfo to analyze kernel execution. For
each kernel (device) function F, we collect all kernel entries K that can
reach F. In each iteration, we go through all reaching kernel entries and check
their execution mode. If F can only be reached by kernel entries with the same mode,
we record the corresponding value in a map from CallBase * to Constant *. In
the manifest stage, any map entry that is not nullptr means the function
call can be folded to that Constant *, so we replace all uses of the call
with the constant and remove the call.
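
To make the folding condition concrete, here is a minimal standalone sketch of the idea described above. It does not use the Attributor or AAKernelInfo API. The names CallGraph, ExecMode, reachingKernels, and foldIsSPMDExecMode are hypothetical and exist only for illustration, and the call graph is modeled as plain string-keyed maps.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical model, not the LLVM API: the two execution modes a kernel
// entry can have.
enum class ExecMode { Generic, SPMD };

// Hypothetical model of the device call graph.
struct CallGraph {
  // Caller name -> names of the functions it calls.
  std::map<std::string, std::vector<std::string>> Callees;
  // Kernel entry name -> execution mode of that kernel.
  std::map<std::string, ExecMode> KernelModes;
};

// For every function, collect the kernel entries that can reach it.
// A kernel entry is considered to reach itself.
std::map<std::string, std::set<std::string>>
reachingKernels(const CallGraph &CG) {
  std::map<std::string, std::set<std::string>> Reaching;
  for (const auto &Entry : CG.KernelModes) {
    const std::string &Kernel = Entry.first;
    std::vector<std::string> Worklist{Kernel};
    std::set<std::string> Visited;
    while (!Worklist.empty()) {
      std::string F = Worklist.back();
      Worklist.pop_back();
      if (!Visited.insert(F).second)
        continue; // Already handled for this kernel.
      Reaching[F].insert(Kernel);
      auto It = CG.Callees.find(F);
      if (It == CG.Callees.end())
        continue;
      for (const std::string &Callee : It->second)
        Worklist.push_back(Callee);
    }
  }
  return Reaching;
}

// Returns the constant an "is SPMD mode" query inside a function would fold
// to, or std::nullopt if the reaching kernels disagree or none are known.
std::optional<bool> foldIsSPMDExecMode(const CallGraph &CG,
                                       const std::set<std::string> &Kernels) {
  if (Kernels.empty())
    return std::nullopt;
  std::optional<ExecMode> Common;
  for (const std::string &K : Kernels) {
    ExecMode Mode = CG.KernelModes.at(K);
    if (Common && *Common != Mode)
      return std::nullopt; // Mixed modes: the call cannot be folded.
    Common = Mode;
  }
  return *Common == ExecMode::SPMD;
}
```

In this sketch, std::nullopt plays the role of the nullptr map entry: the query is left in place whenever the reaching kernels do not agree on a single mode.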
Later we will also add more foldable functions, such as isMainThread.
Unsure, I think this way is more natural.