This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Fix deadlock for detachable task with child tasks
ClosedPublic

Authored by protze.joachim on Apr 22 2021, 10:05 AM.

Details

Summary

This patch fixes https://bugs.llvm.org/show_bug.cgi?id=49066.

For detachable tasks, the assumption breaks that the proxy task cannot have remaining child tasks when the proxy completes.
In stead of increment/decrement the incomplete task count, a high order bit is flipped to mark and wait for the incomplete proxy task.

Diff Detail

Event Timeline

protze.joachim requested review of this revision.Apr 22 2021, 10:05 AM

I still have issues with this patch and I think, it's not the right solution. I wonder, why __kmp_bottom_half_finish_proxy is necessary at all - see the inline comment.
If it's necessary, I tend to change the "imaginary children" code to td_incomplete_child_tasks |= 0x4000000 and td_incomplete_child_tasks &= !0x4000000. Then wait while (td_incomplete_child_tasks & 0x4000000) != 0

Even with that change, for a testcase creating thousands of tasks, the execution gets stuck at some point with no task being executed and all threads trying to steal.

@AndreyChurbanov is there any good way to find all remaining tasks with open dependencies in a debugger?

openmp/runtime/src/kmp_tasking.cpp
3874–3876

Why can't this be part of __kmp_bottom_half_finish_proxy? Is the proxy task supposed to be completed, before the __kmp_bottom_half_finish_proxy task executes?

protze.joachim edited the summary of this revision. (Show Details)

The initial fix broke in some cases. I missed the difference between the incomplete and allocated task counters.
This updated patch should fix the issue, but I still run into starvation issues. I'll attach another reproducer in the bug.

openmp/runtime/src/kmp_tasking.cpp
3879

This statement zeroes the taskdata->td_incomplete_child_tasks.
Proper operation would be to clear the set bit only, that is

KMP_ATOMIC_AND(&taskdata->td_incomplete_child_tasks, PROXY_TASK_FLAG-1);

Once the PROXY_TASK_FLAG is the degree of 2, minus one should fill all bits but the highest one.

AndreyChurbanov requested changes to this revision.Jul 26 2021, 11:27 AM
This revision now requires changes to proceed.Jul 26 2021, 11:27 AM
protze.joachim marked an inline comment as done.Jul 27 2021, 7:25 AM
protze.joachim added inline comments.
openmp/runtime/src/kmp_tasking.cpp
3879

Good catch. This might explain the issues I saw with this patch.

I intended to make this ~PROXY_TASK_FLAG confused logical and bit-wise not.

protze.joachim updated this revision to Diff 362023.EditedJul 27 2021, 7:25 AM
protze.joachim marked an inline comment as done.

Address the issue raised by @AndreyChurbanov

Also rebased to current main

This revision is now accepted and ready to land.Jul 27 2021, 9:33 AM
This revision was landed with ongoing or failed builds.Jul 27 2021, 3:02 PM
This revision was automatically updated to reflect the committed changes.

As a result of testing with different compilers before pushing, I added support-level information to the testcase following other detached task tests and removed the event from firstprivate clause, because gcc thinks that it is illegal.