This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] [OMPD] [1/6] Implementation of OMPD debugging library - libompd. Code changes in openmp/runtime to support libompd.

Authored by Vigneshbalu on Apr 9 2021, 5:21 AM.



These patches propose the implementation of OMPD, a debugging interface to support debugging of OpenMP programs.

"" acts as an interface to the third-party tools, typically the debuggers. The tool accesses the state of the OpenMP program through the library and libompd can read or write the state of the OpenMP program that has begun execution through the callback function provided by the tool.

A brief talk regarding the upstreaming plans was provided a few months back in the LLVM-OpenMP committee meeting, the notes of which were provided at:

Most of the implementation has been contributed to by folks from RWTH-Aachen University, LLNL, Rice University, Perforce, AMD and others. (Might have missed out some of the contributors – apologies for that). It has been developed and maintained in

OMPD support is restricted only to CPU and Linux currently.

OMPD APIs are implemented as per the OpenMP 5.0 standard .
These patches also include some gdb-plugin code which is a python module which can be loaded into the debugger, gdb. This provides OMPD specific debugging commands once the "ompd init" command is invoked at the gdb prompt. More details regarding this are provided in the README available in the gdb-plugin directory.

CMake build system and directory structure are similar to the openmp/runtime and openmp/libomptarget.
Testing is done by two methods. First by comparing the output of OMPT and OMPD, second through gdb-plugin using llvm-lit and Filecheck.

This implementation does not provide debugging support for offloading to CPUs or GPUs.

The breakdown of patches is below.
Note: Patches should be applied in the below order.

  1. Code changes in openmp/runtime to support libompd.
  2. TargetValue: Access OpenMP runtime state through callbacks provided by the tool.
  3. omp-debug: Implementation of OMPD APIs.
  4. omp-icv: OMPD Internal control variable handlers.
  5. gdb-plugin: A Plugin code to gdb to leverage libompd to provide debugging support.
  6. libompd-tests: Testcases for libompd.

Another set of around 50 testcases will be pushed subsequently.

Please let me know if the patches need to be reorganized.

Diff Detail

Event Timeline

Vigneshbalu created this revision.Apr 9 2021, 5:21 AM
Vigneshbalu requested review of this revision.Apr 9 2021, 5:21 AM
Herald added a project: Restricted Project. · View Herald Transcript edited the summary of this revision. (Show Details)Apr 9 2021, 7:32 AM
Vigneshbalu changed the visibility from "Public (No Login Required)" to "Custom Policy".Apr 9 2021, 7:55 AM edited the summary of this revision. (Show Details)Apr 9 2021, 8:12 AM
Vigneshbalu changed the visibility from "Custom Policy" to "Public (No Login Required)".Apr 11 2021, 9:47 AM

Can you please upload a diff with full context, see (


Here and elsewhere, if a change is not guarded by OMPD_SUPPORT, it seems to me they should be committed separately.

Diff with full context

Vigneshbalu added inline comments.Apr 11 2021, 11:48 PM

I have guarded other places, this change and "openmp/runtime/src/ompt-specific.cpp" are code movement that stems from .
I will create separate review for this.

Removed the code that is not guarded by OMPD and created a separate review for the same

cchen added a subscriber: cchen.Apr 21 2021, 2:07 PM

Addressed clang-tidy warnings.

hbae added inline comments.Apr 27 2021, 5:11 PM

We need to use ompd_wait_id_t.


Why do we need scheduling task for implicit tasks while we don't call it task scheduling in the specification?


We already have program points defined for OMPT callbacks.
Isn't it better to place parallel begin/end BP at the places near OMPT parallel begin/end?
Are there any missing parallel begin/end not covered by the current OMPT parallel begin/end?


Can you add some comments why it is 7.

jini.susan added inline comments.Apr 28 2021, 5:08 AM

Thanks for this good catch! Will work on modifying this.


According to the spec, there is a difference in the binding associated with the current parallel handle with the OMPT parallel-begin event and the OMPD parallel-begin event.

For OMPD, (Section 5.6.1.)

The OpenMP implementation must execute ompd_bp_parallel_begin at every
parallel-begin event. At the point that the implementation reaches
ompd_bp_parallel_begin, the binding for ompd_get_curr_parallel_handle is the
parallel region that is beginning and the binding for ompd_get_curr_task_handle is the
task that encountered the parallel construct.


Between a parallel-begin event and an implicit-task-begin event, a call to
ompt_get_parallel_info(0,...) may return information about the outer parallel team,
the new parallel team or an inconsistent state.

The OMPD runtime entry points in this implementation are invoked at points to ensure that the right binding holds as per the spec.

protze.joachim added inline comments.Apr 28 2021, 7:33 AM

The OMPT parallel-begin is placed before the runtime gets a lock for creating the team. All information provided by the callback is available at this point. Other information like the number of threads is passed in the implicit-task-begin event.
As Jini pointed out, there is not enough information available for OMPD parallel-begin at that point.


Please drop these lines. The current patches don't include any cuda stuff.


same here, drop the cuda symbol

Addressed the Review comments.

Vigneshbalu marked 5 inline comments as done.May 2 2021, 9:53 PM
Vigneshbalu added inline comments.

Thanks Hansang. it is unwanted variable and removed it




Thanks for the catch joachim

protze.joachim added inline comments.May 11 2021, 1:45 AM

I think, the specification of ompd_get_scheduling_task_handle and ompd_get_generating_task_handle needs some refinement.

I think, the background here is the following:

On return, the scheduling_task_handle argument points to a location that points to a handle for the
task that is still on the stack of execution on the same thread and was deferred in favor of executing
the selected task.

When reaching a taskwait, the execution of the encountering implicit task is deferred to execute some explicit tasks.
Similarly, for a parallel region, the execution of the encountering implicit task is deferred to execute the implicit task of the parallel region.

When thinking about possible debugger workflow, it might help to cut the chain as you suggest and implicitly tell the debugger, that the implicit task is reached and the debugger should use ompd_get_generating_task_handle to get the parent task on the stack.

For the function ompd_get_generating_task_handle, the following might be an issue:

The ompd_get_generating_task_handle function obtains a pointer to the task handle for the task that encountered the OpenMP task construct that generated the task represented by task_handle.

Since the parallel construct is no task construct, following this definition, the task encountering a parallel construct is also not the generating task. The glossary entry for generating task also refers to the parallel region and not the implicit task.

I think, we should clarify the terms in the next tools subcommittee call.

protze.joachim added inline comments.May 25 2021, 3:46 PM

As of today's OMPD discussion, we will add a clarification for 5.2, that implicit tasks have no scheduling parent task, but all tasks have a generating parent task.
Unwinding the stack of explicit tasks, a tool would be able to identify when an implicit task is reached and can look at the generating task to further unwind the stack of tasks.

This means, all assignments of scheduling_parent should be removed in this function.

Removing assignments of scheduling_parent from the function "__kmp_fork_call" as per review comment.

Vigneshbalu marked an inline comment as done.May 31 2021, 9:05 PM
hbae added inline comments.Jun 2 2021, 2:46 PM

Is there any reason to keep this for serialized parallel?

Vigneshbalu added inline comments.Jun 3 2021, 12:56 AM

Apologies for the oversight, removed the wrong assignments. Working on the changes, will update the diff.

Vigneshbalu marked an inline comment as not done.Jun 3 2021, 12:57 AM
hbae added inline comments.Jun 3 2021, 6:04 AM

I think other changes are OK. This one is also need to be removed.

Addressed review comments.

Vigneshbalu marked an inline comment as done.Jun 4 2021, 4:58 AM
Vigneshbalu added inline comments.

Removed and updated the diff, please let us know if there is any other review comments.

hbae accepted this revision.Jun 4 2021, 4:02 PM


This revision is now accepted and ready to land.Jun 4 2021, 4:02 PM

I am seeing intermittent failures with the new test case on our amdgpu buildbot

FAIL: ompd-test::ompd_parallel.c

please revert and fix.

Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2022, 2:27 AM
ye-luo added a subscriber: ye-luo.Nov 11 2022, 4:17 PM

I disabled the test
test/CMakeLists.txt breaks DeviceRTL in LLVM_ENABLE_PROJECTS builds.
find_package(LLVM) is likely the source of the trouble.