This is an archive of the discontinued LLVM Phabricator instance.

[flang][OpenMP] Fix the data race problem for firstprivate clause
AbandonedPublic

Authored by peixin on Jun 25 2022, 11:16 PM.

Details

Summary

As OpenMP Spec 5.0, to avoid the data races, concurrent updates of the
original list item must be synchronized with the read of the original
list item that occurs as a result of the firstprivate clause. Adding
barrier(s) before and/or after the worksharing region would remove the
data races, and it is the application(user)'s job. However, when
one list item is in both firstprivate and lastprivate clauses, the
synchronization should be ensured by compiler since the lastprivate
clause follows the reads of the original list item performed for the
initialization of each of the firstprivate list item.

Add FIXME and TODO for two special cases, sections construct and linear
clause.

The data race problem for single construct will be handled later.

This implementation is based on the discussion with OpenMP committee and
clang code (clang/lib/CodeGen/CGStmtOpenMP.cpp).

Diff Detail

Event Timeline

peixin created this revision.Jun 25 2022, 11:16 PM
peixin requested review of this revision.Jun 25 2022, 11:16 PM

Hi @peixin thanks for this. In the following, we need some sort of synchronization in this construct to make sure both the regions get their copies before they start executing. How should we handle this?

!$omp taskgroup
!$omp task firstprivate(x)
  call foo(x)
!$omp end task
!$omp task firstprivate(x)
  call bar(x)
!$omp end task
!$omp end taskgroup
peixin added a comment.EditedJun 26 2022, 4:31 AM
!$omp taskgroup
!$omp task firstprivate(x)
  call foo(x)
!$omp end task
!$omp task firstprivate(x)
  call bar(x)
!$omp end task
!$omp end taskgroup

I guess this is one bad code. It actually does not work as you think.

  1. Check the following:
program main
  integer, pointer :: p1, p2
  allocate(p2)
  p1=>p2
  p2 = 1
  !$omp parallel num_threads(4)
  !$omp taskgroup
  !$omp task firstprivate(p2)
    print *, p2
    p1 = p2 + 1
  !$omp end task
  !$omp task firstprivate(p2)
    print *, p2
    p1 = p2 + 1
  !$omp end task
  !$omp task firstprivate(p2)
    print *, p2
    p1 = p2 + 1
  !$omp end task
  !$omp task firstprivate(p2)
    print *, p2
    p1 = p2 + 1
  !$omp end task
  !$omp end taskgroup
  !$omp end parallel
end
$ gfortran test.f90 -fopenmp && ./a.out
           1
           2
           3
           4
           5
           6
           7
           8
           9
          10
          11
          12
          13
          14
          15
          16
$ ifort test.f90 -fopenmp && ./a.out
           1
           1
           2
           2
           3
           4
           5
           3
           4
           4
           5
           6
           7
           5
           5
           5
$ gfortran --version
GNU Fortran (GCC) 11.3.0
$ ifort --version
ifort (IFORT) 2021.5.0 20211109

Also check this case:

int task_test(int *x) {
  #pragma omp task firstprivate(x)
  {
    *x = *x + 1;
  }
  #pragma omp task firstprivate(x)
  {
    *x = *x + 1;
  }
  return *x;
}

There is no barrier generated using clang (clang test.c -fopenmp -S -emit-llvm).

  1. As OpenMP Spec 5.0 2.20 Nesting of regions, a barrier region may not be closely nested inside a worksharing, loop, task, taskloop, critical, ordered, atomic, or master region. According to this, we should not generate one barrier inside task construct. @shraiysh Can you also check this for taskloop when you work on that?
  1. As OpenMP examples 5-0-1.pdf 3.1, also note that the tasks will be executed in no specified order because there are no synchronization directives. Also check the examples tasking.1.f90 and tasking.2.f90, P is firstprivate by default. I think changing the firstprivate variable in task region is the bad code?

I haven't investigated the task construct too much. Please correct me if I am wrong somewhere.

Hmm, thats interesting. This makes me doubt if we need the barrier at all for firstprivate? When do we need such a barrier? Should we add it only when #pragma omp barrier is allowed inside a region? Maybe all such testcases are "bad" and we would be fine without the barrier too? We should have a discussion and maybe ask the OpenMP experts about this.

This makes me doubt if we need the barrier at all for firstprivate? When do we need such a barrier? Should we add it only when #pragma omp barrier is allowed inside a region?

In the short term, I don't think so. Emiting the barrier should be analyzed case-by-case.

Maybe all such testcases are "bad" and we would be fine without the barrier too?

No, at least these testcases for parallel and worksharing-loop are fine. I haven't investigated on others too much.

We should have a discussion and maybe ask the OpenMP experts about this.

Agree. Let's discuss it on the Flang OpenMP meeting this week.

shraiysh accepted this revision.Jul 4 2022, 10:53 PM
This revision is now accepted and ready to land.Jul 4 2022, 10:53 PM
peixin planned changes to this revision.Jul 7 2022, 6:08 PM
peixin updated this revision to Diff 443474.Jul 9 2022, 10:32 PM
peixin retitled this revision from [flang][OpenMP] Fix data-race problem for task construct to [flang][OpenMP] Fix the data race problem for firstprivate clause.
peixin edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Jul 9 2022, 10:32 PM
peixin updated this revision to Diff 443476.Jul 9 2022, 11:15 PM

Fix clang format

peixin abandoned this revision.Jul 26 2022, 4:48 AM