Page MenuHomePhabricator

[OpenMP] Create custom state machines for generic target regions

Authored by jdoerfert on May 6 2021, 12:01 AM.



In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.

The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.


Diff Detail

Unit TestsFailed

490 msx64 debian > Clang.OpenMP::nvptx_lambda_capturing.cpp
Script: -- : 'RUN: at line 5'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/13.0.0/include -nostdsysteminc -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/nvptx_lambda_capturing.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/nvptx_lambda_capturing.cpp --check-prefix=CHECK1
350 msx64 debian > Clang.OpenMP::nvptx_target_codegen.cpp
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/13.0.0/include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/nvptx_target_codegen.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_target_codegen.cpp.tmp-ppc-host.bc
1,790 msx64 debian > Clang.OpenMP::nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/13.0.0/include -nostdsysteminc -x c++ -O1 -disable-llvm-optzns -verify -fopenmp -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/../Headers/Inputs/include -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/../../lib/Headers/openmp_wrappers -include __clang_openmp_device_functions.h -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp.tmp-ppc-host.bc

Event Timeline

jdoerfert created this revision.May 6 2021, 12:01 AM
jdoerfert requested review of this revision.May 6 2021, 12:01 AM
Herald added a reviewer: baziotis. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
jdoerfert updated this revision to Diff 343302.May 6 2021, 12:08 AM

Fix porting mistakes

jdoerfert updated this revision to Diff 344652.May 11 2021, 9:39 PM

Add comments and tests

FWIW, here is what clang currently generates:
With these two patches the result is better in basically every case except the last where there is no actual improvement to be made.

jdoerfert updated this revision to Diff 345172.May 13 2021, 9:11 AM

Update tests with --function-signature to avoid check-label matching calls

LG, some remarks


Nit, concatenation?





ggeorgakoudis added inline comments.May 17 2021, 4:04 PM

Is equality comparison enough for ParallelRegions or should we change ParallelRegions refer to same function? I guess size equality should be enough since we only add ParallelRegions, but still checking.

jdoerfert added inline comments.May 17 2021, 4:38 PM

I will check the content, it should not matter how we use it but you're right, better to plan for the future.

jdoerfert marked 3 inline comments as done.

Addressed comments
Also moved logic from the AAKernelInfoFunction::updateImpl to
AAKernelInfoCallSite::initialize as it looks cleaner to fix the
state of runtime calls right away.

jdoerfert added a comment.EditedMay 19 2021, 10:32 PM

[Wrong review -.- ]


It won't always be described properly by "concatenation". Already the |= part is on the border, once we do SPMD detection there will be a &= as well.

ggeorgakoudis accepted this revision.EditedSun, Jun 13, 11:09 AM

LGTM, minor comment, fix tests?


Remove dump()?

This revision is now accepted and ready to land.Sun, Jun 13, 11:09 AM