Page MenuHomePhabricator

Fix for third case reported in

Authored by AndreyChurbanov on May 22 2019, 5:58 AM.



The code fragments of recruiting a thread from pool (kmp_runtime.cpp:4297):

new_thr->th.th_in_pool = FALSE;
if (new_thr->th.th_active_in_pool == TRUE) {
  KMP_DEBUG_ASSERT(new_thr->th.th_active == FALSE);

and its awakening, executed under the mutex (z_Linux_util.cpp:1534):

th->th.th_active = TRUE;
if (th->th.th_in_pool) {
  th->th.th_active_in_pool = TRUE;

can be executed in parallel. Since the "th_in_pool=0" assignment is out of critical section, it is valid to have "th_active==1" and "th_active_in_pool==0" for the worker thread been recruited from common pool (e.g. in case of spurious wakeup). So the assertion triggered looks wrong. The patch deletes it.

I actually failed to reproduce the assertion on my server with RHEL7 (probably because it is hard to force spurious wakeup of a thread, and without it I failed to find code path of simultaneous execution of code fragments I referenced), but the patch looks logically correct.

Diff Detail


Event Timeline

AndreyChurbanov edited the summary of this revision. (Show Details)May 22 2019, 6:10 AM

I tried at r361335, where the original reproducer failed this assert consistently quickly for me today. After applying your patch there and running each reproducer from my bug report in a shell while loop for about an hour, I saw no failures. Thanks!

hbae accepted this revision.May 22 2019, 9:36 AM


This revision is now accepted and ready to land.May 22 2019, 9:36 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2019, 9:48 AM