This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP] Codegen for teams directive for NVPTX
ClosedPublic

Authored by carlo.bertolli on Mar 8 2016, 10:15 AM.

Details

Summary

This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:

  • Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
  • Does not call kmpc_push_num_teams even if a num_teams or thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
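
For readers unfamiliar with the construct, the kind of source this patch generates code for looks roughly like the sketch below. The function name, clause values, and map clauses are assumptions made for illustration and are not taken from the patch or its tests. When built without OpenMP offloading the pragmas are ignored and the loop runs sequentially on the host, so the result is the same either way.

```cpp
#include <vector>

// Illustrative only: sums `data` with a teams construct nested in a target
// region. The clause values (4 teams, 64 threads) are arbitrary.
long sum_with_teams(const std::vector<int> &data) {
  const int *p = data.data();
  const int n = static_cast<int>(data.size());
  long total = 0;
  // On the NVPTX path described above, the teams and threads come from the
  // kernel launch for the target region rather than from a fork call.
  #pragma omp target map(to: p[0:n]) map(tofrom: total)
  #pragma omp teams num_teams(4) thread_limit(64) reduction(+: total)
  #pragma omp distribute
  for (int i = 0; i < n; ++i)
    total += p[i];
  return total;
}
```

Calling `sum_with_teams` on a vector holding 1, 2, 3, 4 returns 10 regardless of whether the region is offloaded.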

Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.
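
The motivation for passing an unevaluated expression can be sketched with a toy model. Everything below is hypothetical: `ClauseExpr`, `ToyRuntime`, and the emitted strings are stand-ins invented for this illustration and are not Clang's classes or the signatures used in the patch. The point is only that a callee receiving a deferred expression can choose not to emit it, whereas a caller that passes an already-emitted value has paid for it up front.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a clause expression: it yields "IR" only when a
// runtime decides to emit it.
using ClauseExpr = std::function<std::string()>;

struct ToyRuntime {
  std::vector<std::string> emitted;  // toy instruction stream
  virtual ~ToyRuntime() = default;
  virtual void emitNumTeamsClause(const ClauseExpr &numTeams,
                                  const ClauseExpr &threadLimit) = 0;
};

// Host-like path: evaluates both clauses, then records a runtime call.
struct ToyHostRuntime : ToyRuntime {
  void emitNumTeamsClause(const ClauseExpr &numTeams,
                          const ClauseExpr &threadLimit) override {
    emitted.push_back(numTeams());
    emitted.push_back(threadLimit());
    emitted.push_back("call push_num_teams");
  }
};

// Device-like path: ignores the clauses, so nothing is emitted for them.
struct ToyDeviceRuntime : ToyRuntime {
  void emitNumTeamsClause(const ClauseExpr &, const ClauseExpr &) override {}
};
```

With this shape, the device-like runtime leaves the instruction stream empty, which mirrors the stated goal of not emitting num_teams and thread_limit expressions inside the target region when they are not needed.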

Diff Detail

Repository
rL LLVM

Event Timeline

carlo.bertolli retitled this revision from to [OPENMP] Codegen for teams directive for NVPTX.
carlo.bertolli updated this object.
carlo.bertolli set the repository for this revision to rL LLVM.
ABataev added inline comments. Mar 8 2016, 11:47 PM
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
51–52

This will cause a crash for captured global variables. Emit it as an inlined directive instead.

carlo.bertolli marked an inline comment as done. Mar 9 2016, 12:55 PM

Addressed comment in new version of diff.

[OPENMP] This new version of the patch uses the inlining machinery of CGOpenMPRuntime.cpp instead of just dumping the teams body statement using EmitStmt.

ABataev edited edge metadata. Mar 9 2016, 7:21 PM

Add tests with captured globals to check that this problem is resolved.

Support for global variables in a target region requires the global to be placed in a pragma declare target region. Pragma declare target region is not currently available in Clang (no parsing, sema, or codegen). I will add a check for this in the regression test once the support becomes available. Thanks!

I can actually add a regression test that checks that the compiler does not break when using a global in a teams region, even without declare target, if that is what is required here.
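
A source input of the kind discussed here, a global referenced inside a teams region without declare target, might look like the sketch below. The names `team_hits` and `record_hits` and the explicit map clause are assumptions for illustration. This is not the regression test that was added, and the thread does not settle how a given compiler version handles such a global on the device. Without offloading the pragmas are ignored and the code runs on the host.

```cpp
// Hypothetical global used only for this illustration.
int team_hits = 0;

// Adds `n` to the global from inside a teams region and returns the new value.
// A single team with a single thread is requested so the update is not racy.
int record_hits(int n) {
  #pragma omp target map(tofrom: team_hits)
  #pragma omp teams num_teams(1) thread_limit(1)
  {
    team_hits += n;
  }
  return team_hits;
}
```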

ABataev accepted this revision. Mar 10 2016, 8:19 PM
ABataev edited edge metadata.

LG

lib/CodeGen/CGOpenMPRuntimeNVPTX.h
32–59

Remove 'virtual' and add 'override' at the end of each function.
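
The requested style can be shown on a small hypothetical pair of classes. `BaseRuntime`, `DeviceRuntime`, and `name` are invented for this sketch and are not the declarations in CGOpenMPRuntimeNVPTX.h. Marking the derived function `override` lets the compiler reject it if it stops matching a virtual function in the base class, and makes a repeated `virtual` redundant.

```cpp
#include <string>

struct BaseRuntime {
  virtual ~BaseRuntime() = default;
  virtual std::string name() const { return "host"; }
};

struct DeviceRuntime : BaseRuntime {
  // Before the requested change this would have read:
  //   virtual std::string name() const;
  std::string name() const override { return "nvptx"; }
};

// Dispatches through the base interface to show the override is picked up.
std::string describe(const BaseRuntime &rt) { return rt.name(); }
```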

This revision is now accepted and ready to land. Mar 10 2016, 8:19 PM
mkuron added a subscriber: mkuron. Mar 20 2016, 9:15 AM
carlo.bertolli edited edge metadata.

[OPENMP] Even though this patch was already accepted in its previous form, comments on depending patch D18286 (http://reviews.llvm.org/D18286) revealed that a new approach for this patch was necessary. Instead of committing something that I have to change anyway later on, I decided to provide a new version of this base patch. Please review it again and let me know about any comments you may have.

carlo.bertolli closed this revision. Apr 4 2016, 9:00 AM

Committed revision 265304.