[WIP][Clang][OpenMP] Add new clang argument `-fopenmp-target-simd`
Needs Review · Public

Authored by tianshilei1992 on Sep 22 2021, 1:11 PM.

Details

Summary

Currently an OpenMP thread is mapped to a hardware thread. In order to support
SIMD, we have to map an OpenMP thread to a warp (wavefront). This mapping has to
be determined when the kernel is launched, and the execution mode is encoded in
the int8_t Mode when calling __kmpc_target_init, which is introduced in
D110279. However, current Clang CodeGen cannot determine whether simd is used
and adjust Mode accordingly, because the call to __kmpc_target_init is emitted
before the body of the target region.

This patch adds a new clang argument -fopenmp-target-simd to emit code that
supports SIMD mapping. When this argument is set, the new mapping is always
used, regardless of whether the target region contains a simd directive. If it
is not set, or if -fno-openmp-target-simd is set, the existing mapping is used
and any simd directive is ignored.
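
The selection rule described above can be sketched as a small function. This is only an illustration of the stated semantics, not code from the patch: ThreadMapping and selectMapping are hypothetical names, and the sketch assumes C++17 for std::optional.

```cpp
#include <optional>

// Hypothetical names for illustration only; they do not appear in the patch.
enum class ThreadMapping {
  HardwareThread, // existing mapping: one OpenMP thread per hardware thread
  Warp            // new mapping: one OpenMP thread per warp (wavefront)
};

// targetSimdFlag: std::nullopt if neither option was given,
//                 true for -fopenmp-target-simd,
//                 false for -fno-openmp-target-simd.
// regionHasSimd is accepted only to show that it does not affect the result.
ThreadMapping selectMapping(std::optional<bool> targetSimdFlag,
                            bool regionHasSimd) {
  (void)regionHasSimd; // deliberately unused: the flag alone decides
  return targetSimdFlag.value_or(false) ? ThreadMapping::Warp
                                        : ThreadMapping::HardwareThread;
}
```

Note that a region containing simd still gets the existing mapping when the flag is absent, which is the case where the simd directive is ignored.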

We don't reuse -fopenmp-simd because the CodeGen of device code shares some
implementation with host code. -fopenmp-simd can change CodeGen, and we don't
want to introduce any unknown effects.

Event Timeline

tianshilei1992 created this revision. Sep 22 2021, 1:11 PM
tianshilei1992 requested review of this revision. Sep 22 2021, 1:11 PM
tianshilei1992 edited the summary of this revision. Sep 22 2021, 1:12 PM

it can emit right mode

I'm excited about the thread to warp mapping but unclear why this needs to be a compile time flag. Can we use this mapping when pragma simd is present and otherwise stay with the current one?

As I mentioned in the description, when __kmpc_target_init is emitted, the body has not been emitted yet. If the target region is something like:

#pragma omp target
{
#pragma omp parallel
  { /* parallel region */ }
#pragma omp simd
  for (int i = 0; i < n; ++i) { /* simd loop body */ }
}

At that point, Clang does not know that the target region contains a simd region, so it cannot adjust the execution mode accordingly. Of course, we could always change the value in the middle end (for example, in OpenMPOpt), but that would not work if the pass is not invoked.
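
The ordering problem can be sketched with a toy single-pass emitter. This is a simplified model under stated assumptions, not Clang's actual CodeGen: Stmt, EmittedKernel, emitTargetRegion, and the two mode constants are hypothetical, and the real mode encoding is defined by the runtime.

```cpp
#include <cstdint>
#include <vector>

// Toy model only: these names are hypothetical and do not mirror Clang classes.
enum class Stmt { Parallel, Simd };

// Placeholder mode values; the real encoding is defined by the device runtime.
constexpr std::int8_t kExistingMapping = 0;
constexpr std::int8_t kSimdMapping = 1;

struct EmittedKernel {
  std::int8_t initMode = kExistingMapping; // fixed when the init call is emitted
  bool bodyContainsSimd = false;           // learned only while emitting the body
};

EmittedKernel emitTargetRegion(const std::vector<Stmt> &body,
                               bool targetSimdFlag) {
  EmittedKernel k;
  // Step 1: the init call comes first, so the mode must be chosen using only
  // information available before the body is visited (here, the flag).
  k.initMode = targetSimdFlag ? kSimdMapping : kExistingMapping;
  // Step 2: the body is emitted afterwards. Discovering simd here is too late
  // to change the mode already passed to the init call.
  for (Stmt s : body)
    if (s == Stmt::Simd)
      k.bodyContainsSimd = true;
  return k;
}
```

In this model, a region with a simd statement and no flag ends up with the existing mapping even though bodyContainsSimd is true, which is the mismatch a later middle-end fixup would have to repair.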