Some code paths in nested parallelism miss updates for the dist barrier,
which can lead to a hang. Add the appropriate function calls.
Repository: rG LLVM Github Monorepo
Event Timeline
I noticed this when running check-openmp with the dist barrier enabled, using https://reviews.llvm.org/D122645
Btw, running the tests in parallel does not seem to work with the dist barrier, but I couldn't figure out the reason.
Hi -- can you tell me which specific tests failed reliably? Was it one particular arch? I want to make sure I'm covering those tests and will give your change a try and review soon. Thanks!
Hi
I saw check-openmp hang on both x86 and arm64. Adding a timeout value to the lit options gives me the following result:
$ env LIT_OPTS="--show-xfail --timeout=60 -j 1" CHECK_OPENMP_ENV="KMP_FORKJOIN_BARRIER_PATTERN=dist,dist KMP_PLAIN_BARRIER_PATTERN=dist,dist KMP_REDUCTION_BARRIER_PATTERN=dist,dist" ninja check-openmp
...
Timed Out Tests (15):
  libomp :: affinity/bug-nested.c
  libomp :: affinity/format/fields_values.c
  libomp :: affinity/format/nested.c
  libomp :: affinity/format/nested2.c
  libomp :: affinity/format/nested_mixed.c
  libomp :: env/omp_thread_limit.c
  libomp :: ompt/misc/threads_nested.c
  libomp :: ompt/parallel/nested.c
  libomp :: ompt/parallel/nested_lwt.c
  libomp :: ompt/parallel/nested_thread_num.c
  libomp :: ompt/parallel/nested_threadnum.c
  libomp :: ompt/teams/parallel_team.c
  libomp :: parallel/omp_nested.c
  libomp :: teams/kmp_num_teams.c
  libomp :: worksharing/for/omp_monotonic_schedule_set_get.c
As these tests are all related to nested parallelism or teams, I looked for suspect code paths there. With this patch, these timeout failures are gone.
(In addition to the above timeouts, ompt/synchronization/reduction/tree_reduce.c fails. I believe this is a test problem, partially addressed in D123359.)
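For reference, the pattern the timed-out tests have in common can be sketched roughly as below. This is a hypothetical minimal illustration, not one of the libomp tests listed above and not a confirmed reproducer; the function name run_nested is made up for this example. Each parallel region ends in an implicit join barrier, so nesting one region inside another exercises the fork/join barrier at two levels.

```c
/* Hypothetical sketch of nested parallelism; not taken from the libomp
 * test suite. The function name run_nested is an assumption made for
 * illustration only. */
#ifdef _OPENMP
#include <omp.h>
#endif

/* Opens an outer parallel region whose threads each open an inner
 * parallel region, and returns how many times the inner body ran. */
int run_nested(int outer, int inner) {
  int count = 0;
#ifdef _OPENMP
  omp_set_max_active_levels(2); /* allow two active nesting levels */
#endif
#pragma omp parallel num_threads(outer)
  {
#pragma omp parallel num_threads(inner)
    {
#pragma omp atomic
      count++; /* one increment per inner-team thread */
    }
    /* implicit join barrier of the inner region */
  }
  /* implicit join barrier of the outer region */
  return count;
}
```

Presumably one would build this with -fopenmp, call run_nested(2, 2) from a small main, and run the binary with KMP_FORKJOIN_BARRIER_PATTERN=dist,dist set, as in the check-openmp command above. The returned count is at most outer * inner; the runtime may grant fewer threads than requested.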