This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading, with corresponding changes in libomptarget. SPMD and non-SPMD parallel calls are unified under a single runtime entry point for parallel regions, kmpc_parallel_51 (which will be shared between target and host-side parallel regions), and data sharing is internalized in the runtime. Tests have been auto-generated using update_cc_test_checks.py. The revision also contains changes to OpenMPOpt for remark creation on target offloading regions.
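For readers unfamiliar with the two execution modes, here is a minimal source-level sketch of the kinds of regions this codegen path handles. The function names `scale_spmd` and `scale_generic` are hypothetical, and which mode the compiler actually selects for each kernel is an assumption that depends on the compiler version and flags. With this revision, the parallel region in both shapes would be expected to lower to the same runtime entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Combined construct: all device threads enter the loop together, so
// compilers commonly treat this kernel as SPMD (assumption, not guaranteed).
void scale_spmd(std::vector<double> &v, double factor) {
  double *data = v.data();
  const int n = static_cast<int>(v.size());
  assert(data != nullptr || n == 0);
#pragma omp target teams distribute parallel for map(tofrom : data[0 : n])
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}

// Sequential code precedes the parallel region inside the target region,
// so this kernel is typically compiled in generic (non-SPMD) mode
// (assumption). The variable `local` is written by the initial thread and
// read by the workers, which is where data sharing comes into play.
void scale_generic(std::vector<double> &v, double factor) {
  double *data = v.data();
  const int n = static_cast<int>(v.size());
  assert(data != nullptr || n == 0);
#pragma omp target map(tofrom : data[0 : n])
  {
    double local = factor; // set up before the parallel region starts
#pragma omp parallel for
    for (int i = 0; i < n; ++i)
      data[i] *= local;
  }
}
```

Built without -fopenmp the pragmas are ignored and both functions run sequentially on the host, so the sketch can be checked for correctness independently of any offloading setup.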
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Time | Platform | Test |
---|---|---|
70 ms | x64 debian | LLVM.Transforms/OpenMP::gpu_state_machine_function_ptr_replacement.ll |
190 ms | x64 windows | Clang.OpenMP::nvptx_data_sharing.cpp |
860 ms | x64 windows | Clang.OpenMP::nvptx_lambda_capturing.cpp |
290 ms | x64 windows | Clang.OpenMP::nvptx_parallel_codegen.cpp |
240 ms | x64 windows | Clang.OpenMP::nvptx_parallel_for_codegen.cpp |

Script for each test:
- gpu_state_machine_function_ptr_replacement.ll ('RUN: at line 1'): /mnt/disks/ssd0/agent/llvm-project/build/bin/opt -S -passes=openmpopt -pass-remarks=openmp-opt -openmp-print-gpu-kernels < /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/OpenMP/gpu_state_machine_function_ptr_replacement.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/OpenMP/gpu_state_machine_function_ptr_replacement.ll
- nvptx_data_sharing.cpp ('RUN: at line 5'): c:\ws\w16e2-2\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\w16e2-2\llvm-project\premerge-checks\build\lib\clang\13.0.0\include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc C:\ws\w16e2-2\llvm-project\premerge-checks\clang\test\OpenMP\nvptx_data_sharing.cpp -o C:\ws\w16e2-2\llvm-project\premerge-checks\build\tools\clang\test\OpenMP\Output\nvptx_data_sharing.cpp.tmp-ppc-host.bc
- nvptx_lambda_capturing.cpp ('RUN: at line 5'): c:\ws\w16e2-2\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\w16e2-2\llvm-project\premerge-checks\build\lib\clang\13.0.0\include -nostdsysteminc -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm C:\ws\w16e2-2\llvm-project\premerge-checks\clang\test\OpenMP\nvptx_lambda_capturing.cpp -o - | c:\ws\w16e2-2\llvm-project\premerge-checks\build\bin\filecheck.exe --allow-unused-prefixes C:\ws\w16e2-2\llvm-project\premerge-checks\clang\test\OpenMP\nvptx_lambda_capturing.cpp --check-prefix CHECK1
- nvptx_parallel_codegen.cpp ('RUN: at line 3'): c:\ws\w16e2-2\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\w16e2-2\llvm-project\premerge-checks\build\lib\clang\13.0.0\include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc C:\ws\w16e2-2\llvm-project\premerge-checks\clang\test\OpenMP\nvptx_parallel_codegen.cpp -o C:\ws\w16e2-2\llvm-project\premerge-checks\build\tools\clang\test\OpenMP\Output\nvptx_parallel_codegen.cpp.tmp-ppc-host.bc
- nvptx_parallel_for_codegen.cpp ('RUN: at line 3'): c:\ws\w16e2-2\llvm-project\premerge-checks\build\bin\clang.exe -cc1 -internal-isystem c:\ws\w16e2-2\llvm-project\premerge-checks\build\lib\clang\13.0.0\include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc C:\ws\w16e2-2\llvm-project\premerge-checks\clang\test\OpenMP\nvptx_parallel_for_codegen.cpp -o C:\ws\w16e2-2\llvm-project\premerge-checks\build\tools\clang\test\OpenMP\Output\nvptx_parallel_for_codegen.cpp.tmp-ppc-host.bc

Full test results: 10 failed.