This is an archive of the discontinued LLVM Phabricator instance.

[OMPT] Fix wrong parent_task_id in serialized parallel_begin with GCC
ClosedPublic

Authored by Hahnfeld on Jan 29 2016, 12:58 AM.

Details

Summary

Without this patch a simple #pragma omp parallel num_threads(1) leads to

ompt_event_parallel_begin: parent_task_id=3, [...], parallel_id=2, [...]
ompt_event_parallel_end: parallel_id=2, task_id=4, [...]

Diff Detail

Event Timeline

Hahnfeld updated this revision to Diff 46362.Jan 29 2016, 12:58 AM
Hahnfeld retitled this revision from to [OMPT] Fix wrong parent_task_id in serialized parallel_begin with GCC.
Hahnfeld updated this object.
Hahnfeld added reviewers: jlpeyton, jmellorcrummey.
Hahnfeld added subscribers: openmp-commits, tcramer.
jlpeyton accepted this revision.Feb 4 2016, 1:41 PM
jlpeyton edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Feb 4 2016, 1:41 PM
jmellorcrummey edited edge metadata.Feb 4 2016, 2:13 PM

I haven't had the time to analyze this patch in detail. To me, it looks like it fixes a symptom (an incorrect argument passed to a callback) rather than an underlying cause (information about the top task frame may be improperly updated).

If the information for the task at depth 0 before the serialized parallel is what should be presented to the parallel begin callback, then is the information at depth 0 wrong? Remember, a tool using asynchronous sampling could observe this task at any time - either before or after the serialized parallel.

I haven't had the time to analyze this patch in detail. To me, it looks like it fixes a symptom (an incorrect argument passed to a callback) rather than an underlying cause (information about the top task frame may be improperly updated).

If the information for the task at depth 0 before the serialized parallel is what should be presented to the parallel begin callback, then is the information at depth 0 wrong? Remember, a tool using asynchronous sampling could observe this task at any time - either before or after the serialized parallel.

Not sure if I completely get your point...
What I have just tested is that the ids reported by ompt_get_{parallel,task}_id are correct when sampling user code. I think this is because there is a lightweight taskteam constructed on line 417 (with the patch applied)

Hahnfeld updated this revision to Diff 47002.Feb 5 2016, 1:58 AM
Hahnfeld edited edge metadata.

Move callback up to execute in correct context.

This way, there will only be wrong ids reported between __kmp_serialized_parallel and the creation of the lightweight task.
I don't know if this can be fixed because the task is already brought to execution in __kmp_serialized_parallel so we would have to assign the correct id somewehere in there...

omalyshe edited edge metadata.Feb 26 2016, 6:50 AM

Is it OK that ompt_parallel_id_new() is called several times for one region? First in line 393 and then inside kmp_serialized_parallel() it can be called once or twice.

Is it OK that ompt_parallel_id_new() is called several times for one region? First in line 393 and then inside kmp_serialized_parallel() it can be called once or twice.

While this isn't good from a performance perspective I think this won't hurt for correctness: It will skip some ids but contiguous ids aren't required by the standard.
On the other hand I don't know if this may be required in some cases, I don't always fully understand the code paths in the runtime library...

Olga, John,

Does this look ok now?

omalyshe accepted this revision.Mar 15 2016, 1:01 AM
omalyshe edited edge metadata.

LGTM.

This revision was automatically updated to reflect the committed changes.