
Simplifying memory globalization from the front end to move optimizations to the middle end.
Needs Review · Public

Authored by josemonsalve2 on Nov 2 2020, 10:42 PM.

Details

Reviewers
jdoerfert
Summary

Memory globalization was fully implemented in the front end. There are three runtime
functions in Libomptarget:

  • __kmpc_data_sharing_push_stack
  • __kmpc_data_sharing_coalesced_push_stack
  • __kmpc_data_sharing_pop_stack

The front end performed an escape analysis and created a record declaration with all the
stack variables. Then, based on the context (isTTD and other parameters), it would create a
push for the size of the record, or for that size multiplied by the warp size (to globalize
for the whole warp).
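
As a rough illustration of the old scheme, the following host-side sketch mimics a single push sized for the whole record, optionally multiplied by the warp size. Everything here is hypothetical: `MockSharingStack`, `GlobalizedRecord`, `recordPushSize` and the warp size of 32 are stand-ins chosen for the example, not the Libomptarget implementation or the code clang emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for the device data-sharing stack.
// It only models the push/pop discipline, not the real runtime.
class MockSharingStack {
public:
  explicit MockSharingStack(std::size_t capacity)
      : storage_(capacity), top_(0) {}

  // Reserve `size` bytes and return the start of the new frame.
  void *push(std::size_t size) {
    assert(top_ + size <= storage_.size() && "mock stack overflow");
    void *frame = storage_.data() + top_;
    frames_.push_back(top_);
    top_ += size;
    return frame;
  }

  // Release the most recent frame; frames must be popped in LIFO order.
  void pop(void *frame) {
    assert(!frames_.empty() && frame == storage_.data() + frames_.back());
    top_ = frames_.back();
    frames_.pop_back();
  }

  std::size_t bytesInUse() const { return top_; }

private:
  std::vector<unsigned char> storage_;
  std::vector<std::size_t> frames_; // start offset of each live frame
  std::size_t top_;
};

// Hypothetical record, standing in for the one the front end synthesized
// from the stack variables of a function.
struct GlobalizedRecord {
  int counter;
  double scale;
};

constexpr std::size_t kWarpSize = 32; // assumption: NVPTX warp size

// Size of the single push under the old scheme: one record, or one record
// per lane when globalizing for the whole warp.
std::size_t recordPushSize(bool perWarp) {
  return perWarp ? sizeof(GlobalizedRecord) * kWarpSize
                 : sizeof(GlobalizedRecord);
}
```

A caller would pair `stack.push(recordPushSize(perWarp))` at function entry with `stack.pop(frame)` at exit, which is the shape of the push/pop pairing described above.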

This PR removes the record creation and reduces the front end's work to emitting a simple
runtime call that will later be optimized in the middle end. The middle end will be able to
determine which stack variables escape and which do not, as well as the appropriate merging
of different globalized variables.
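
To make the escaping/non-escaping distinction concrete, here is a small sketch of the kind of source such an analysis would look at. The function name and variables are made up for illustration, and the comments describe the expected classification under the assumption that the target region runs in generic mode; they are not output from this patch.

```cpp
#include <cassert>

// Hypothetical example: one local that escapes into a parallel region and
// one that does not.
int countThreadsPlusScratch(int n) {
  int total = 0;
#pragma omp target map(tofrom : total)
  {
    // Used only by the thread that executes the target region, so it does
    // not escape and could remain an ordinary stack variable.
    int scratch = n * 2;

    // Shared with the threads of the parallel region below, so it escapes
    // and is a candidate for globalization.
    int shared_local = 0;

#pragma omp parallel
    {
#pragma omp atomic
      shared_local += 1;
    }

    total = shared_local + scratch;
  }
  return total;
}
```

Without OpenMP enabled the pragmas are ignored and the function simply returns `2 * n + 1`; with OpenMP the result depends on how many threads run the parallel region.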

Diff Detail

Unit Tests: Failed

Time    Test
60 ms    x64 debian > Clang.OpenMP::declare_target_codegen_globalization.cpp
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /mnt/disks/ssd0/agent/llvm-project/clang/test/OpenMP/declare_target_codegen_globalization.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/tools/clang/test/OpenMP/Output/declare_target_codegen_globalization.cpp.tmp-ppc-host.bc
110 ms    x64 debian > Clang.OpenMP::nvptx_distribute_parallel_generic_mode_codegen.cpp
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -verify -fopenmp -fopenmp-version=45 -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /mnt/disks/ssd0/agent/llvm-project/clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_distribute_parallel_generic_mode_codegen.cpp.tmp-ppc-host.bc
150 ms    x64 debian > Clang.OpenMP::nvptx_parallel_codegen.cpp
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /mnt/disks/ssd0/agent/llvm-project/clang/test/OpenMP/nvptx_parallel_codegen.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_parallel_codegen.cpp.tmp-ppc-host.bc
100 ms    x64 debian > Clang.OpenMP::nvptx_parallel_for_codegen.cpp
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /mnt/disks/ssd0/agent/llvm-project/clang/test/OpenMP/nvptx_parallel_for_codegen.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_parallel_for_codegen.cpp.tmp-ppc-host.bc
160 ms    x64 debian > Clang.OpenMP::nvptx_target_codegen.cpp
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /mnt/disks/ssd0/agent/llvm-project/clang/test/OpenMP/nvptx_target_codegen.cpp -o /mnt/disks/ssd0/agent/llvm-project/build/tools/clang/test/OpenMP/Output/nvptx_target_codegen.cpp.tmp-ppc-host.bc
View Full Test Results (23 Failed)

Event Timeline

josemonsalve2 created this revision. Nov 2 2020, 10:42 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. Nov 2 2020, 10:42 PM
josemonsalve2 requested review of this revision. Nov 2 2020, 10:42 PM

Let's start by adding an updated test to this so we can see how the result looks.

Removing globalized record for parallel regions

When globalization occurred in parallel regions, a record was created that is no longer necessary.
This is expected to be done in the back end.

I'm working on the other tests right now.

Modifying 3 more tests to reflect changes in this patch