Currently an OpenMP thread is mapped to a hardware thread. In order to support
SIMD, we have to map an OpenMP thread to a warp (wavefront). This mapping has to
be determined when the kernel is launched, and the execution mode is encoded in
the int8_t Mode argument passed to __kmpc_target_init, which was introduced in
D110279. However, current Clang CodeGen cannot determine whether simd is used
and adjust Mode accordingly, because the call to __kmpc_target_init is emitted
before the body of the target region.
This patch adds a new clang argument, -fopenmp-target-simd, to emit code that
supports SIMD mapping. When this argument is set, the new mapping is always
used, whether or not the target region contains a simd directive. If it is not
set, or if -fno-openmp-target-simd is set, the existing mapping is used and
simd directives are ignored.
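
For illustration, here is a minimal sketch of the kind of target region the
new flag affects. The function name scaled_add and its parameters are
hypothetical and are not part of this patch. The inner loop carries the simd
directive that the existing mapping ignores.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example (not taken from the patch): y += a * x over a
// rows x cols array stored in row-major order.
void scaled_add(float a, const float *x, float *y, int rows, int cols) {
  assert(rows >= 0 && cols >= 0);
  // The outer loop is distributed across teams and OpenMP threads.
  #pragma omp target teams distribute parallel for \
      map(to: x[0:rows * cols]) map(tofrom: y[0:rows * cols])
  for (int i = 0; i < rows; ++i) {
    // With the existing mapping this directive is ignored on the device.
    // With -fopenmp-target-simd it is the loop the SIMD mapping targets.
    #pragma omp simd
    for (int j = 0; j < cols; ++j)
      y[i * cols + j] += a * x[i * cols + j];
  }
}
```

Assuming an NVPTX offload toolchain is installed, this could be built with
something like
clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-target-simd.
Only the last flag comes from this patch. The result should be the same with
or without it, since the flag selects the thread mapping and does not change
the semantics of the loop.
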
We do not reuse -fopenmp-simd because the CodeGen of device code shares some
of its implementation with host code. -fopenmp-simd can already change
CodeGen, and we do not want to risk introducing unknown effects by giving it
an additional meaning.