Spurious assertion failures are symptoms of a race condition in the handling of detached tasks:
Assertion failure at kmp_tasking.cpp(3744): taskdata->td_flags.complete == 1.
Assertion failure at kmp_tasking.cpp(710): taskdata->td_flags.executing == 0.
In the case of detach=true, all accesses to taskdata in __kmp_task_finish need to happen before this assignment (~line 873):
taskdata->td_flags.proxy = TASK_PROXY;
This assignment signals to __kmp_fulfill_event that the task will need to be freed there. So, conceptually, the ownership of taskdata is moved.
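To illustrate the ordering I have in mind, here is a minimal standalone sketch of such an ownership hand-off. It is not the libomp code: `TaskData`, `finish_task`, `fulfill_event` and `run_handoff_once` are hypothetical names, and modelling the proxy flag as a `std::atomic<bool>` with release/acquire ordering is an assumption made only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the task descriptor; not the real kmp_taskdata_t.
struct TaskData {
  int executing = 1;
  int complete = 0;
  std::atomic<bool> proxy{false}; // models td_flags.proxy = TASK_PROXY
};

// Finishing side: every access to *td happens before the hand-off store.
void finish_task(TaskData *td) {
  td->executing = 0;
  td->complete = 1;
  // Release store: publishes the writes above and gives up ownership.
  td->proxy.store(true, std::memory_order_release);
  // td must not be touched past this point; the other side may free it.
}

// Fulfilling side: takes ownership once the proxy flag is visible.
bool fulfill_event(TaskData *td) {
  while (!td->proxy.load(std::memory_order_acquire))
    std::this_thread::yield();
  // The acquire load guarantees the finisher's writes are visible here.
  bool consistent = (td->complete == 1 && td->executing == 0);
  delete td; // the new owner frees the task
  return consistent;
}

// Runs one hand-off between two threads and reports whether the
// fulfilling side saw a fully finished task.
bool run_handoff_once() {
  TaskData *td = new TaskData;
  bool ok = false;
  std::thread fulfiller([&] { ok = fulfill_event(td); });
  std::thread finisher(finish_task, td);
  finisher.join();
  fulfiller.join();
  return ok;
}
```

If any access to `td` in `finish_task` were moved below the store, the fulfilling thread could free the object first, which is the kind of race the assertions above would be hinting at.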
Is it safe to move the call to the destructors up in the code?
Do we need to acquire the lock in __kmp_fulfill_event to make sure that __kmp_task_finish has released it before taskdata is freed?
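A rough sketch of what I mean by the lock question, again as a standalone model and not the runtime code. It assumes the lock lives inside the task object, so freeing the task also destroys the lock. `DetachedTask`, `finish_side`, `fulfill_side` and `run_lock_handoff_once` are hypothetical names, and `std::mutex` stands in for whatever lock type the patch uses.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical model: the lock is a member of the task it protects.
struct DetachedTask {
  std::mutex lock;
  bool proxy = false;     // set by the finishing side, guarded by lock
  bool fulfilled = false; // set by the fulfilling side, guarded by lock
};

// Returns true if the finishing side keeps ownership and must free the task.
bool finish_side(DetachedTask *t) {
  std::lock_guard<std::mutex> guard(t->lock);
  if (t->fulfilled)
    return true;   // event already fulfilled: the finisher frees
  t->proxy = true; // hand ownership over to the fulfilling side
  return false;    // the guard unlocks while the task is still alive
}

// Returns true if the fulfilling side received ownership and must free.
bool fulfill_side(DetachedTask *t) {
  bool owns;
  {
    // Taking the lock here means the finisher can no longer be holding it
    // by the time we decide to free the task.
    std::lock_guard<std::mutex> guard(t->lock);
    t->fulfilled = true;
    owns = t->proxy;
  }
  return owns;
}

// Runs both sides concurrently and returns how many of them freed the task.
int run_lock_handoff_once() {
  DetachedTask *t = new DetachedTask;
  bool freed_by_finish = false, freed_by_fulfill = false;
  std::thread a([&] {
    freed_by_finish = finish_side(t);
    if (freed_by_finish)
      delete t;
  });
  std::thread b([&] {
    freed_by_fulfill = fulfill_side(t);
    if (freed_by_fulfill)
      delete t;
  });
  a.join();
  b.join();
  assert(freed_by_finish != freed_by_fulfill); // exactly one side frees
  return int(freed_by_finish) + int(freed_by_fulfill);
}
```

In this model, whichever side takes the lock second ends up owning the task, and neither side touches the task after releasing the lock unless it owns it. Whether the same reasoning carries over to the actual lock in the patch is the open question.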
completion_lock looks like a better name than competion_lock (typo?).