Fix OMPT support for task frames, parallel regions, and parallel regions + loops
This patch makes it possible for a performance tool that uses call stack
unwinding to map implementation-level call stacks from master and worker
threads into a unified global view. There are several components to this patch.
Add a new enumeration type that indicates whether the code for a master task for a parallel region is invoked by the user program or the runtime system Change the signature for OMPT parallel begin/end callbacks to indicate whether the master task will be invoked by the program or the runtime system. This enables a performance tool using call stack unwinding to handle these two cases differently. For this case, a profiler that uses call stack unwinding needs to know that the call path prefix for the master task may differ from those available within the begin/end callbacks if the program invokes the master.
Change the signature for __kmp_join_call to take an additional parameter indicating the fork_context type. This is needed to supply the OMPT parallel end callback with information about whether the compiler or the runtime invoked the master task for a parallel region.
Ensure that the OMPT task frame field reenter_runtime_frame is properly set and cleared before and after calls to fork and join threads for a parallel region. Adjust the code for the new signature for __kmp_join_call. Adjust the OMPT parallel begin callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region.
Apply all of the analogous changes described for kmp_csupport.c for the GOMP interface Add OMPT support for the GOMP combined parallel region + loop API to maintain the OMPT task frame field reenter_runtime_frame.
Use the new information passed by __kmp_join_call to adjust the OMPT parallel end callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region.
Use the flavor of the parallel region API (GNU or Intel) to determine who invokes the master task.
Differential Revision: http://reviews.llvm.org/D11259