Fix OMPT support for task frames, parallel regions, and parallel regions + loops

Description

This patch makes it possible for a performance tool that uses call stack
unwinding to map implementation-level call stacks from master and worker
threads into a unified global view. There are several components to this patch.

include/*/ompt.h.var

Add a new enumeration type that indicates whether the code for a master task
  for a parallel region is invoked by the user program or the runtime system.
Change the signature for OMPT parallel begin/end callbacks to indicate whether
  the master task will be invoked by the program or the runtime system. This
  enables a performance tool using call stack unwinding to handle these two
  cases differently. In particular, if the program invokes the master task,
  the profiler needs to know that the call path prefix for the master task
  may differ from the one available within the begin/end callbacks.
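As a rough illustration of the two items above, a tool-side callback might branch on the new parameter as sketched here. The type name invoker_t, its constants, and the reduced callback signature are hypothetical stand-ins chosen for this sketch, not the identifiers or full parameter list used in ompt.h.var.

```c
#include <stdio.h>

/* Hypothetical enumeration: who invokes the master task of a parallel region.
   The names are illustrative and not taken from the patch. */
typedef enum {
    INVOKER_PROGRAM = 0, /* compiler-generated code in the program invokes it */
    INVOKER_RUNTIME = 1  /* the OpenMP runtime invokes it */
} invoker_t;

/* Returns 1 when the master task's call path prefix may differ from the
   call path the tool can unwind inside the begin/end callback. */
int master_prefix_may_differ(invoker_t invoker) {
    return invoker == INVOKER_PROGRAM;
}

/* Simplified, hypothetical parallel-begin callback: a real callback would
   carry more parameters than the region id and the invoker. */
void on_parallel_begin(unsigned long parallel_id, invoker_t invoker) {
    if (master_prefix_may_differ(invoker)) {
        printf("region %lu: unwind the master task's prefix separately\n",
               parallel_id);
    } else {
        printf("region %lu: reuse the call path seen in the callback\n",
               parallel_id);
    }
}
```

A tool would register a callback of this shape with the runtime and record the invoker alongside the region id, so that later samples from the master thread can be attributed correctly.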

kmp.h

Change the signature for __kmp_join_call to take an additional parameter
indicating the fork_context type. This is needed to supply the OMPT parallel
end callback with information about whether the compiler or the runtime
invoked the master task for a parallel region.

kmp_csupport.c

Ensure that the OMPT task frame field reenter_runtime_frame is properly set
  and cleared before and after calls to fork and join threads for a parallel
  region.
Adjust the code for the new signature for __kmp_join_call.
Adjust the OMPT parallel begin callback invocations to carry the extra
  parameter indicating whether the program or the runtime invokes the master
  task for a parallel region.
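The set-and-clear discipline for reenter_runtime_frame described in the first item can be sketched as follows. This is a minimal model under stated assumptions: task_frame_t, fake_fork, and fork_with_frame are hypothetical names, the field is assumed to hold the address of the frame from which the runtime is entered, and __builtin_frame_address is a GCC/Clang builtin rather than something the description mentions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the OMPT task frame; only the field named in
   the description is modeled. */
typedef struct {
    void *reenter_runtime_frame;
} task_frame_t;

/* Records what a tool would observe if it unwound during the fork. */
void *observed_during_fork = NULL;

/* Hypothetical stand-in for the runtime's fork entry point. */
void fake_fork(task_frame_t *frame) {
    observed_during_fork = frame->reenter_runtime_frame;
}

/* Set the field before entering the runtime and clear it afterwards, so a
   tool never sees a stale frame address once control is back in user code. */
void fork_with_frame(task_frame_t *frame) {
    /* Assumption: the current frame marks where user code re-enters the
       runtime (GCC/Clang builtin). */
    frame->reenter_runtime_frame = __builtin_frame_address(0);
    fake_fork(frame);
    frame->reenter_runtime_frame = NULL;
}
```

The same pattern would apply around the join call: the field is only meaningful while the thread is inside the runtime on behalf of the task.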

kmp_gsupport.c

Apply all of the analogous changes described for kmp_csupport.c for the GOMP
  interface.
Add OMPT support for the GOMP combined parallel region + loop API to
  maintain the OMPT task frame field reenter_runtime_frame.

kmp_runtime.c

Use the new information passed by __kmp_join_call to adjust the OMPT
  parallel end callback invocations to carry the extra parameter indicating
  whether the program or the runtime invokes the master task for a parallel
  region.

ompt_internal.h

Use the flavor of the parallel region API (GNU or Intel) to determine who
  invokes the master task.
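A mapping from API flavor to invoker could look like the sketch below. All names here are hypothetical, and the direction of the mapping is an assumption: with the GNU-style interface, compiler-generated code in the program is taken to call the master's outlined function itself, while with the Intel-style interface the runtime is taken to call it. The description above does not spell out which flavor maps to which invoker.

```c
/* Hypothetical flavors of the parallel region API. */
typedef enum {
    FLAVOR_GNU,
    FLAVOR_INTEL
} api_flavor_t;

/* Hypothetical result type: who invokes the master task. */
typedef enum {
    MASTER_BY_PROGRAM = 0,
    MASTER_BY_RUNTIME = 1
} master_invoker_t;

/* Assumed mapping: GNU-style entry points leave the master's invocation to
   the program; Intel-style entry points have the runtime invoke it. */
master_invoker_t invoker_for_flavor(api_flavor_t flavor) {
    return (flavor == FLAVOR_GNU) ? MASTER_BY_PROGRAM : MASTER_BY_RUNTIME;
}
```

Deriving the invoker from the flavor keeps the decision in one place, so both the begin and end callbacks report the same value for a given region.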

Differential Revision: http://reviews.llvm.org/D11259