This is an archive of the discontinued LLVM Phabricator instance.

Fix OMPT support for task frames for parallel regions and parallel regions + loops
ClosedPublic

Authored by jmellorcrummey on Jul 16 2015, 6:04 AM.

Details

Summary

This patch makes it possible for a performance tool that uses call stack unwinding to map implementation-level call stacks from master and worker threads into a unified global view. There are several components to this patch.

include/*/ompt.h.var

  • Add a new enumeration type that indicates whether the code for a master task for a parallel region is invoked by the user program or the runtime system.
  • Change the signature for OMPT parallel begin/end callbacks to indicate whether the master task will be invoked by the program or the runtime system. This enables a performance tool using call stack unwinding to handle these two cases differently. In particular, a profiler that uses call stack unwinding needs to know that the call path prefix for the master task may differ from those available within the begin/end callbacks if the program invokes the master.
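
As a rough illustration of the interface change described above, the following C sketch shows an invoker enumeration and a parallel begin callback that carries it. All names here (invoker_t, invoker_program, invoker_runtime, parallel_begin_cb_t, on_parallel_begin, prefix_may_differ) are hypothetical stand-ins chosen for this example and are not taken from ompt.h.var; the real callback takes more parameters than shown.

```c
#include <stdint.h>

/* Hypothetical stand-in for the new enumeration: who invokes the master task. */
typedef enum {
    invoker_program = 0, /* compiler-generated user code calls the master task */
    invoker_runtime = 1  /* the runtime system calls the master task */
} invoker_t;

/* Hypothetical, simplified parallel begin callback type with the extra parameter. */
typedef void (*parallel_begin_cb_t)(uint64_t parallel_id, invoker_t invoker);

/* State a tool might record for the region that was most recently started. */
static uint64_t last_parallel_id;
static invoker_t last_invoker;

/* Example tool callback: remember which side invokes the master task. */
static void on_parallel_begin(uint64_t parallel_id, invoker_t invoker)
{
    last_parallel_id = parallel_id;
    last_invoker = invoker;
}

/* When the program invokes the master task, the master's call path prefix
 * may differ from the one seen inside the begin/end callbacks. */
static int prefix_may_differ(invoker_t invoker)
{
    return invoker == invoker_program;
}
```

A tool built along these lines would consult prefix_may_differ when stitching the master thread's unwound call stack onto the call path captured in the callback.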

kmp.h

  • Change the signature for __kmp_join_call to take an additional parameter indicating the fork_context type. This is needed to supply the OMPT parallel end callback with information about whether the compiler or the runtime invoked the master task for a parallel region.

kmp_csupport.c

  • Ensure that the OMPT task frame field reenter_runtime_frame is properly set and cleared before and after calls to fork and join threads for a parallel region.
  • Adjust the code for the new signature for __kmp_join_call.
  • Adjust the OMPT parallel begin callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region.
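
The set-and-clear discipline for reenter_runtime_frame can be sketched as follows. This is a simplified model, not the code from kmp_csupport.c: task_frame_t, run_parallel, and sample_region are hypothetical names, the region callback stands in for the whole fork/join sequence, and __builtin_frame_address is the GCC/Clang builtin, assumed available here.

```c
#include <stddef.h>

/* Hypothetical, reduced task frame record holding only the field discussed here. */
typedef struct {
    void *reenter_runtime_frame; /* frame at which the task re-entered the runtime */
} task_frame_t;

/* Value a tool would observe if it queried the frame while the region runs. */
static void *observed_during_region;

/* Stand-in for work done between fork and join. */
static void sample_region(const task_frame_t *frame)
{
    observed_during_region = frame->reenter_runtime_frame;
}

/* Hypothetical entry point modelling a fork call followed by a join call. */
static void run_parallel(task_frame_t *frame,
                         void (*region)(const task_frame_t *))
{
    /* Set the field before entering the runtime, so an unwinder can tell
     * where user frames end and runtime frames begin. */
    frame->reenter_runtime_frame = __builtin_frame_address(0);

    region(frame); /* fork threads, run the region, join threads */

    /* Clear the field once control returns to user code. */
    frame->reenter_runtime_frame = NULL;
}
```

The point of the pairing is that the field is non-null only while the task is inside the runtime, which is what lets a tool separate user frames from runtime frames during unwinding.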

kmp_gsupport.c

  • Apply all of the analogous changes described for kmp_csupport.c for the GOMP interface.
  • Add OMPT support for the GOMP combined parallel region + loop API to maintain the OMPT task frame field reenter_runtime_frame.

kmp_runtime.c

  • Use the new information passed by __kmp_join_call to adjust the OMPT parallel end callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region.

ompt_internal.h

  • Use the flavor of the parallel region API (GNU or Intel) to determine who invokes the master task.
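
A minimal sketch of such a mapping is shown below. The names fork_api_t, api_gnu, api_intel, invoker_t, and invoker_for are hypothetical and need not match ompt_internal.h. The direction of the mapping is an assumption based on how the GOMP_parallel_start/GOMP_parallel_end interface works, where compiler-generated code calls the master's outlined function itself between the two calls.

```c
/* Hypothetical names throughout; the patch's identifiers may differ. */

/* Flavor of the parallel region API through which the region was entered. */
typedef enum { api_gnu, api_intel } fork_api_t;

/* Who invokes the master task for the region. */
typedef enum { invoker_program, invoker_runtime } invoker_t;

/* Assumed mapping: with the GNU-style interface the program calls the
 * master task itself; with the Intel-style interface the runtime does. */
static invoker_t invoker_for(fork_api_t api)
{
    return (api == api_gnu) ? invoker_program : invoker_runtime;
}
```

Passing the API flavor down to the join call, as the kmp.h change describes, is what would let the parallel end callback receive the same invoker value as the begin callback.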

Event Timeline

jmellorcrummey retitled this revision to Fix OMPT support for task frames for parallel regions and parallel regions + loops.
jmellorcrummey updated this object.
jmellorcrummey added reviewers: jcownie, jlpeyton.

Besides the #if 0 around the assert, there are only some minor issues with indentation. I don't know how rigid we should be with this...

runtime/src/kmp_gsupport.c
493

Indention

522

Looks like 3 space indention while there is 4 space indention around...

546

Indention

554

Indention

runtime/src/kmp_runtime.c
2094

This probably shouldn't be here, it's part of another issue...

jmellorcrummey added inline comments. Jul 16 2015, 6:54 AM
runtime/src/kmp_gsupport.c
554

The indentation anomalies came from working with vi, which used tabs rather than spaces. I can update the diff if this warrants it.

runtime/src/kmp_runtime.c
2094

I don't quite know how to handle this. The runtime works though I have test codes that cause the assert to trip. In my development version, I patched out the assert pending patches from Intel. Is leaving in the broken assert preferable? Should I delete the assert entirely and let Intel restore a variant of it later after they have altered the runtime to ensure that the condition is true?

AndreyChurbanov added inline comments.
runtime/src/kmp_runtime.c
2094

John, I think that either #if 0 or temporarily removing the assertion should work fine. But please do this as a separate patch, because it is indeed not related to the OMPT changes. We will then restore the assertion with the patch that fixes its underlying cause.

Address concerns about indentation.

Remove #if 0 around a problematic assertion, which is a separate issue from the purpose of this patch.

Hahnfeld accepted this revision. Jul 16 2015, 11:35 PM
Hahnfeld added a reviewer: Hahnfeld.

LGTM (if I'm allowed to say...)

No regressions with the test suite, and this finally fixes some of the errors with GCC and ompt_get_task_frame.

This revision is now accepted and ready to land. Jul 16 2015, 11:35 PM
This revision was automatically updated to reflect the committed changes.