This is an archive of the discontinued LLVM Phabricator instance.

[flang][OpenMP] Lowering support for default clause
ClosedPublic

Authored by NimishMishra on Apr 18 2022, 5:21 AM.

Details

Summary

This patch adds lowering support for default clause.

  1. During symbol resolution in semantics, should the enclosing context have a default data sharing clause defined and a parser::Name is not attached to an explicit data sharing clause, the semantics::Symbol::Flag::OmpPrivate flag (in case of default(private)) and semantics::Symbol::Flag::OmpFirstprivate flag (in case of default(firstprivate)) is added to the symbol.
  1. During lowering, all symbols having either semantics::Symbol::Flag::OmpPrivate or semantics::Symbol::Flag::OmpFirstprivate flag are collected and privatised appropriately.

Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>

Diff Detail

Event Timeline

NimishMishra created this revision.Apr 18 2022, 5:21 AM
NimishMishra requested review of this revision.Apr 18 2022, 5:21 AM

Ping for review!

Some minor comments and suggestions. Nice work. Thank you!

flang/lib/Lower/OpenMP.cpp
95

Add hard todo instead of silently returning for unhandled cases.

149

If D125689 gets merged before this, please make sure you add a barrier after firstprivate code generation for default clause. If this patch gets merged first, I will try to handle it in D125689.

644

[nit] Why remove this later? Is there a plan to remove the TODO at the end? I see it as a safety net. Even if all the clauses are implemented, it will protect us from unexpected issues with future new clauses for the time when only parsing has been implemented for them...

flang/lib/Semantics/resolve-directives.cpp
1502

nit: maybe add a hard todo for other values of default clause?

NimishMishra edited the summary of this revision. (Show Details)May 27 2022, 6:59 PM
NimishMishra edited the summary of this revision. (Show Details)
peixin added a comment.Jun 7 2022, 4:00 AM

Please rebase this patch.

flang/test/Lower/OpenMP/default-clause.f90
6

Remove the prefix and move them in one line.

NimishMishra marked an inline comment as done.Jun 23 2022, 3:53 AM
NimishMishra added inline comments.
flang/lib/Lower/OpenMP.cpp
95

I don't think it's required. If you check the test cases, the behaviour for default(none) and default(shared seems fine. default(none) forces a separate data-sharing clause (for which privatisation is handled elsewhere). default(shared) is, I guess, the default behaviour anyway.

All in all, I don't think we need to do anything else in case of none or shared

644

Thank you. Removed the TODO

CI fails for the two examples in debian.

Flang :: Lower/OpenMP/omp-parallel-wsloop.f90
Flang :: Lower/OpenMP/omp-unstructured.f90

BTW, can you rebase this patch?

For changes in resolve-directives.cpp, can you add one test case for flang-new -fc1 -fdebug-dump-symbols or %python %S/../test_symbols.py %s %flang_fc1 -fopenmp? The examples are flang/test/Semantics/OpenMP/omp-symbol01.f90, omp-do14.f90, ....

flang/test/Lower/OpenMP/default-clause.f90
5

Please remove --check-prefix=FIRDialect. The checks for flang-new and bbc are the same.

Addressed comments

NimishMishra marked an inline comment as done.Jun 27 2022, 4:37 AM
NimishMishra added inline comments.
flang/lib/Lower/OpenMP.cpp
113

This accidentally crept in. I will remove this.

NimishMishra edited the summary of this revision. (Show Details)Jun 27 2022, 4:42 AM
peixin added inline comments.Jun 28 2022, 11:41 PM
flang/lib/Lower/Bridge.cpp
495

Don't understand why you change the previous design. The previous design should work OK for default clause. Which case does not work using the previous design?

flang/lib/Lower/OpenMP.cpp
95

The else is not necessary.

157

Why this? You can still use hasFirstPrivateOp for default clause in some way.

NimishMishra added inline comments.Jun 28 2022, 11:47 PM
flang/lib/Lower/Bridge.cpp
495

shallowLookupSymbol was not really finding the symbols I had as a result of default clause. I suppose those symbols reside somewhere else other than symbolMap.back() (maybe somewhere down the symbolMap).

flang/lib/Lower/OpenMP.cpp
157

I am inserting an omp.barrier for two cases:

  1. firstprivate is present
  2. default clause is firstprivate.

Now that I think of it, I should keep it as hasFirstPrivateOp || ompDefaultClauseType == Fortran::parser::OmpDefaultClause::Type::Firstprivate

peixin added inline comments.Jun 28 2022, 11:51 PM
flang/lib/Lower/Bridge.cpp
495

Can you give me one example (test case)? I can take a look at why it fails and check if it is the problem of the previous design.

flang/lib/Lower/OpenMP.cpp
157

You can change the boolean variable name into better one and add comment for it. To my understanding, these two cases actaully have the same meaning for the data-race problem.

NimishMishra added inline comments.Jun 29 2022, 3:49 AM
flang/lib/Lower/Bridge.cpp
495
integer :: x, y
!$omp parallel default(firstprivate)
   x = x * y
!$omp end parallel

If for this, I try converter.copyHostAssociateVar(*sym), then I get assertion failure "Host associated symbol box not found".

With the changes I did (basically instead of shallowLookupSymbol, do a separate createHostAssociateVarClone and lookupSymbol), it works.

kiranchandramohan requested changes to this revision.Jun 29 2022, 3:59 AM

Please add few more tests for nested constructs and with loop inside constructs having default clause.

flang/include/flang/Lower/AbstractConverter.h
104–112

I think exposing so many functions is not appropriate. Do we really need all these?

flang/lib/Lower/OpenMP.cpp
97

At this point, I believe the symbols themselves have the information whether it is private or firstprivate. If so, can we think of combining the genDefaultClause and privatizeVars functions? If there is no default clause present, then also collect all the symbols from the firstprivate, private clauses and hand it here and I believe it will work.

This revision now requires changes to proceed.Jun 29 2022, 3:59 AM

I will take a look at this later this week to check if there is any problem in previous design of copyHostAssociateVar, which may stop its reuse on default clause.

flang/include/flang/Lower/AbstractConverter.h
104–112

+1

flang/lib/Lower/Bridge.cpp
495

Thanks a lot for the detailed information.

flang/lib/Lower/OpenMP.cpp
97

+1

NimishMishra added inline comments.Jun 29 2022, 10:37 PM
flang/include/flang/Lower/AbstractConverter.h
104–112

shallowLookupSymbol isn't quite working in the case of default clause. I guess if we establish that there's isn't anything I am missing in the implementation and we really need can't use the current copyHostAssociateVar for default clause, probably I can pass on a boolean and modify behaviour inside copyHostAssociateVar itself.

peixin requested changes to this revision.Jul 1 2022, 10:08 PM

@NimishMishra I tried the following and it works.

diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index 0949cb7d3aea..636307e1d7dc 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -59,43 +59,59 @@ getOmpObjectSymbol(const Fortran::parser::OmpObject &ompObject) {
   return sym;
 }
 
-template <typename T>
-static void createPrivateVarSyms(Fortran::lower::AbstractConverter &converter,
-                                 const T *clause) {
-  const Fortran::parser::OmpObjectList &ompObjectList = clause->v;
-  for (const Fortran::parser::OmpObject &ompObject : ompObjectList.v) {
-    Fortran::semantics::Symbol *sym = getOmpObjectSymbol(ompObject);
-    // Privatization for symbols which are pre-determined (like loop index
-    // variables) happen separately, for everything else privatize here.
-    if (sym->test(Fortran::semantics::Symbol::Flag::OmpPreDetermined))
-      continue;
-    bool success = converter.createHostAssociateVarClone(*sym);
-    (void)success;
-    assert(success && "Privatization failed due to existing binding");
-    if constexpr (std::is_same_v<T, Fortran::parser::OmpClause::Firstprivate>) {
-      converter.copyHostAssociateVar(*sym);
-    }
-  }
-}
-
 static void privatizeVars(Fortran::lower::AbstractConverter &converter,
-                          const Fortran::parser::OmpClauseList &opClauseList) {
+                          const Fortran::parser::OmpClauseList &opClauseList,
+                          Fortran::lower::pft::Evaluation &eval) {
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
   auto insPt = firOpBuilder.saveInsertionPoint();
   firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock());
-  bool hasFirstPrivateOp = false;
+  // Has firstprivate or default(firstprivate) clause?
+  bool hasFirstPrivate = false;
+  // The symbols in private/firstprivate clause or region under
+  // default(private/firstprivate) clause.
+  llvm::SetVector<const Fortran::semantics::Symbol *> symbols;
+
+  // Collect the privatized symbols for private/firstprivate/default clause.
   for (const Fortran::parser::OmpClause &clause : opClauseList.v) {
-    if (const auto &privateClause =
+    auto collectOmpObjectListSymbol = [&](const Fortran::parser::OmpObjectList &ompObjectList) {
+      for (const Fortran::parser::OmpObject &ompObject : ompObjectList.v) {
+        Fortran::semantics::Symbol *sym = getOmpObjectSymbol(ompObject);
+        symbols.insert(sym);
+      }
+    };
+    if (const auto &defaultClause =
+            std::get_if<Fortran::parser::OmpClause::Default>(&clause.u)) {
+      if (defaultClause->v.v == Fortran::parser::OmpDefaultClause::Type::Private)
+        converter.collectSymbolSet(eval, symbols,
+            Fortran::semantics::Symbol::Flag::OmpPrivate, /*isUltimateSymbol=*/false);
+      else if (defaultClause->v.v == Fortran::parser::OmpDefaultClause::Type::Firstprivate)
+        converter.collectSymbolSet(
+            eval, symbols, Fortran::semantics::Symbol::Flag::OmpFirstPrivate,
+            /*isUltimateSymbol=*/false);
+    } else if (const auto &privateClause =
             std::get_if<Fortran::parser::OmpClause::Private>(&clause.u)) {
-      createPrivateVarSyms(converter, privateClause);
+      collectOmpObjectListSymbol(privateClause->v);
     } else if (const auto &firstPrivateClause =
                    std::get_if<Fortran::parser::OmpClause::Firstprivate>(
                        &clause.u)) {
-      createPrivateVarSyms(converter, firstPrivateClause);
-      hasFirstPrivateOp = true;
+      collectOmpObjectListSymbol(firstPrivateClause->v);
+    }
+  }
+
+  for (const Fortran::semantics::Symbol *sym : symbols) {
+    // Privatization for symbols which are pre-determined (like loop index
+    // variables) happen separately, for everything else privatize here.
+    if (sym->test(Fortran::semantics::Symbol::Flag::OmpPreDetermined))
+      continue;
+    bool success = converter.createHostAssociateVarClone(*sym);
+    (void)success;
+    assert(success && "Privatization failed due to existing binding");
+    if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate)) {
+      converter.copyHostAssociateVar(*sym);
+      hasFirstPrivate = true;
     }
   }
-  if (hasFirstPrivateOp)
+  if (hasFirstPrivate)
     firOpBuilder.create<mlir::omp::BarrierOp>(converter.getCurrentLocation());
   firOpBuilder.restoreInsertionPoint(insPt);
 }
@@ -376,7 +392,7 @@ createBodyOfOp(Op &op, Fortran::lower::AbstractConverter &converter,
 
   // Handle privatization. Do not privatize if this is the outer operation.
   if (clauses && !outerCombined)
-    privatizeVars(converter, *clauses);
+    privatizeVars(converter, *clauses, eval);
 
   if constexpr (std::is_same_v<Op, omp::ParallelOp>) {
     threadPrivatizeVars(converter, eval);
$ cat test.f90 
subroutine sub()
  integer :: x, y
  !$omp parallel default(firstprivate)
    x = x * y
  !$omp end parallel
end
$ bbc -emit-fir -fopenmp test.f90 
$ cat test.mlir 
func.func @_QPsub() {
  %0 = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsubEx"}
  %1 = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFsubEy"}
  omp.parallel   {
    %2 = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFsubEx"}
    %3 = fir.load %0 : !fir.ref<i32>
    fir.store %3 to %2 : !fir.ref<i32>
    %4 = fir.alloca i32 {bindc_name = "y", pinned, uniq_name = "_QFsubEy"}
    %5 = fir.load %1 : !fir.ref<i32>
    fir.store %5 to %4 : !fir.ref<i32>
    omp.barrier
    %6 = fir.load %2 : !fir.ref<i32>
    %7 = fir.load %4 : !fir.ref<i32>
    %8 = arith.muli %6, %7 : i32
    fir.store %8 to %2 : !fir.ref<i32>
    omp.terminator
  }
  return
}

Please verify this.

flang/lib/Semantics/resolve-directives.cpp
1493
AddToContextObjectWithDSA(
    *name.symbol, semantics::Symbol::Flag::OmpPrivate);

This seems to be not necessary.

1603

Also remove this if you remove the above.

Addressed comments

The build fails with "502 Bad Gateway". I will restart the build later

peixin accepted this revision.Jul 4 2022, 1:25 AM

Thanks @NimishMishra for the update.

Nothing additional is done for default(none) (which forces defining explicit data sharing clauses, which in turn are anyway handled separately) and for default(shared) (which is anyway the default data-sharing mode).

Nit: In summary, this has been done in another patch. Remove this?

Please wait for one more approval. @kiranchandramohan @shraiysh

NimishMishra edited the summary of this revision. (Show Details)Jul 4 2022, 1:56 AM
kiranchandramohan requested changes to this revision.Jul 4 2022, 2:02 AM

If you have used code written by @peixin then please add him as a co-author.

Please add a debug-dump-symbols test to check the default symbols.

flang/test/Lower/OpenMP/default-clause.f90
138

Is there no PRIVATE_Y?

143

INNER_PRIVATE_Y?

147–148

Please capture the dataflow here with Named variables.

154–157

Please capture the dataflow here.

175–178

capture the dataflow here or ignore the check lines for the constants.
Consider this as a common comment for all tests.

This revision now requires changes to proceed.Jul 4 2022, 2:02 AM
NimishMishra added inline comments.Jul 4 2022, 2:08 AM
flang/test/Lower/OpenMP/default-clause.f90
138

No. Variables used in the default(firstprivate) block are not getting captured here. Inside the inner parallel, the host association is being used to create a first-private var.

Should the behaviour be something else?

flang/test/Lower/OpenMP/default-clause.f90
138

Could you explain why the treatment of X is different from Y here? Both X and Y are private in the outer and firstprivate in the first inner parallel region right?

NimishMishra added inline comments.Jul 4 2022, 2:35 AM
flang/test/Lower/OpenMP/default-clause.f90
138

I do not know the exact reason. From playing around, it seems like the default clauses on the outer and inner parallel are getting matched. The fact that x is both inside the inner private and firstprivate block is doing something here. For example, if I remove X from the default(private) inner parallel block, it vanishes from here.

The symbol dump looks something like this (I have removed the other default clauses for brevity):

MainProgram scope: default_clause_lowering size=16 alignment=4
    w size=4 offset=12: ObjectEntity type: INTEGER(4)
    x size=4 offset=0: ObjectEntity type: INTEGER(4)
    y size=4 offset=4: ObjectEntity type: INTEGER(4)
    z size=4 offset=8: ObjectEntity type: INTEGER(4)
    Block scope: size=0 alignment=1
      Block scope: size=0 alignment=1
        x (OmpFirstPrivate): HostAssoc
        y (OmpFirstPrivate): HostAssoc
      Block scope: size=0 alignment=1
        w (OmpPrivate): HostAssoc
        x (OmpPrivate): HostAssoc

Now if we did something like:

!$omp parallel default(firstprivate)
        !$omp parallel default(firstprivate)
            x = y
        !$omp end parallel

        !$omp parallel default(private) shared(z)
            w = x + z
        !$omp end parallel
    !$omp end parallel

the IR generated is

omp.parallel   {
      %4 = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFEx"}
      %5 = fir.load %1 : !fir.ref<i32>
      fir.store %5 to %4 : !fir.ref<i32>
      %6 = fir.alloca i32 {bindc_name = "y", pinned, uniq_name = "_QFEy"}
      %7 = fir.load %2 : !fir.ref<i32>
      fir.store %7 to %6 : !fir.ref<i32>
      omp.barrier
      omp.parallel   {
        %8 = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFEx"}
        %9 = fir.load %4 : !fir.ref<i32>
        fir.store %9 to %8 : !fir.ref<i32>
        %10 = fir.alloca i32 {bindc_name = "y", pinned, uniq_name = "_QFEy"}
        %11 = fir.load %6 : !fir.ref<i32>
        fir.store %11 to %10 : !fir.ref<i32>
        omp.barrier
        %12 = fir.load %10 : !fir.ref<i32>
        fir.store %12 to %8 : !fir.ref<i32>
        omp.terminator
      }
      omp.parallel   {
        %8 = fir.alloca i32 {bindc_name = "w", pinned, uniq_name = "_QFEw"}
        %9 = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFEx"}
        %10 = fir.load %9 : !fir.ref<i32>
        %11 = fir.load %3 : !fir.ref<i32>
        %12 = arith.addi %10, %11 : i32
        fir.store %12 to %8 : !fir.ref<i32>
        omp.terminator
      }
      omp.terminator
    }

Note that x is present in both the inner blocks, yet it is first-privatized in the outer parallel.

NimishMishra added inline comments.Jul 4 2022, 2:43 AM
flang/test/Lower/OpenMP/default-clause.f90
138

Ok, I think the issue is:

if (const auto &privateClause =
                   std::get_if<Fortran::parser::OmpClause::Private>(
                       &clause.u)) {
      collectOmpObjectListSymbol(privateClause->v);
}

So it is capturing default(private) on the outermost block, and gathering symbols which have OmpPrivate only. Hence it is missing OmpFirstPrivate

NimishMishra added inline comments.Jul 4 2022, 2:46 AM
flang/test/Lower/OpenMP/default-clause.f90
138

Sorry I meant

if (defaultClause->v.v ==
          Fortran::parser::OmpDefaultClause::Type::Private)
        converter.collectSymbolSet(eval, symbols,
                                   Fortran::semantics::Symbol::Flag::OmpPrivate,
                                   /*isUltimateSymbol=*/false);
peixin requested changes to this revision.Jul 4 2022, 4:24 AM
peixin added inline comments.
flang/test/Lower/OpenMP/default-clause.f90
138

Sorry, I missed the special case of the nested constructs and no code between the nested constructs. I think the problem is the following code:

if (GetContext().defaultDSA == semantics::Symbol::Flag::OmpPrivate &&
    !HasDataSharingAttributeObject(*name.symbol)) {
  name.symbol = DeclarePrivateAccessEntity(
      *name.symbol, semantics::Symbol::Flag::OmpPrivate, currScope());
} else if (GetContext().defaultDSA ==
        semantics::Symbol::Flag::OmpFirstPrivate &&
    !HasDataSharingAttributeObject(*name.symbol)) {
  name.symbol = DeclarePrivateAccessEntity(
      *name.symbol, semantics::Symbol::Flag::OmpFirstPrivate, currScope());
}

You should traverse the dirContext to the top and check each level. The dumped symbol should be something like the following:

MainProgram scope: default_clause_lowering size=16 alignment=4
    w size=4 offset=12: ObjectEntity type: INTEGER(4)
    x size=4 offset=0: ObjectEntity type: INTEGER(4)
    y size=4 offset=4: ObjectEntity type: INTEGER(4)
    z size=4 offset=8: ObjectEntity type: INTEGER(4)
    Block scope: size=0 alignment=1
        w (OmpPrivate): HostAssoc
        x (OmpPrivate): HostAssoc
        y (OmpPrivate): HostAssoc
      Block scope: size=0 alignment=1
        x (OmpFirstPrivate): HostAssoc
        y (OmpFirstPrivate): HostAssoc
      Block scope: size=0 alignment=1
        w (OmpPrivate): HostAssoc
        x (OmpPrivate): HostAssoc

When you traverse the dirContext, you should use DeclarePrivateAccessEntity(..., ..., dirScope);. You can get the dirScope by context_.FindScope(dirContext_[i].directiveSource).

Also, you should add one more example for the following:

!$omp parallel default(private)
    !$omp parallel default(firstprivate)
        x = y
    !$omp end parallel

    !$omp parallel default(shared)
        w = x + z
    !$omp end parallel
!$omp end parallel
peixin added a comment.Jul 4 2022, 4:32 AM

You may check one more example as follows:

!$omp parallel default(firstprivate)
    !$omp single
        x = y
    !$omp end single
!$omp end parallel
NimishMishra edited the summary of this revision. (Show Details)
  • Addressed comments
  • Added support for handling nested block constructs with default clauses
NimishMishra added inline comments.Jul 10 2022, 6:29 AM
flang/lib/Lower/Bridge.cpp
525

Use of eval here allows all symbols to be collected into a llvm::SetVector, even from the nested constructs. I found two special cases:

!$omp parallel default(private)
    y = 10
    !$omp parallel default(private)
           y = 20
    !$omp end parallel
!$omp end parallel

I got double privatization assertion failures with these. Hence the use of std::map in the current implementation in OpenMP.cpp to keep unique values.

Another corner-case was

!$omp parallel shared(y) default(private)
    y = 10
    !$omp parallel default(private)
         y = 20
    !$omp end parallel
!$omp end parallel

This again looked at the default(...) of the outer construct, and collected all symbols with OmpPrivate. Means even though we have a shared(y), it was getting privatized. For this, I deliberately collect sharedSymbols and remove them from the std::map being privatised.

peixin added inline comments.Jul 11 2022, 6:43 PM
flang/lib/Lower/Bridge.cpp
517–518

Can you remove this change and add "Depends on D129190" in the summary?

flang/lib/Lower/OpenMP.cpp
199

This code is too fragile. What about the following:

llvm::SetVector<const Fortran::semantics::Symbol *> defaultSymbols;
llvm::SetVector<const Fortran::semantics::Symbol *> symbolsInNestedOmpRegions;
llvm::SetVector<const Fortran::semantics::Symbol *> privatizedSymbols;
// Collect the symbols.
for (auto privatizedSymbol: privatizedSymbols)
  privatizeSymbol(*privatizedSymbol);
for (auto defaultSymbol : defaultSymbols)
  if (!symbolsInNestedRegions.contains(defaultSymbol))
    privatizeSymbol(*defaultSymbol);
flang/lib/Semantics/resolve-directives.cpp
1490
1491
1515
Symbol *hostSymbol =
    defaultDSASymbols.size() ? defaultDSASymbols.back() : &symbol->GetUltimate();
defaultDSASymbols.push_back(DeclarePrivateAccessEntity(
    *hostSymbol, dirContext.defaultDSA,
    context_.FindScope(dirContext.directiveSource)));
NimishMishra added inline comments.Jul 13 2022, 12:16 AM
flang/lib/Lower/OpenMP.cpp
199

This doesn't solve one problem. Consider:

!$omp parallel default(private)
  x = 20
  !$omp parallel default(firstprivate)
         y = 10
  !$omp end parallel

    !$omp parallel default(first)
         y = 10
  !$omp end parallel
!$omp end parallel

Now defaultSymbols creates the vector [x, y, y]. This causes assertion failure.

symbolsInNestedOmpRegions creates the vector [y, y]. Final privatization thus proceeds only for x because of !symbolsInNestedRegions.contains(defaultSymbol)

I have another doubt. C/C++ does not have default(private | firstprivate). As standard says, the default(firstprivate | private) clause causes all variables in the construct that have implicitly determined data-sharing attributes to be (firstprivate | private). For the following case:

y = 20
!$omp parallel default(private)
  !$omp parallel default(firstprivate)
      y = 10
  !$omp end parallel
!$omp end parallel

Is there a private "y" between default(private) and default(firstprivate)? That is, is y copied from y = 20 or a private y with unknown value? Does y in y = 10 is taken as implicitly determined data-sharing?
What about the following case with the same question:

y = 20
!$omp parallel default(private)
  !$omp parallel firstprivate(y)
      y = 10
  !$omp end parallel
!$omp end parallel

Can you check the behaviors of other compilers?

flang/lib/Lower/OpenMP.cpp
199

defaultSymbols is [x, y1, y2, y3]. symbolsInNestedOmpRegions is [y2, y3].

Block scope: size=0 alignment=1
  x (OmpPrivate): HostAssoc
  y (OmpPrivate): HostAssoc // y1
  Block scope: size=0 alignment=1
    y (OmpFirstPrivate): HostAssoc // y2
  Block scope: size=0 alignment=1
    y (OmpFirstPrivate): HostAssoc // y3

It should work.

NimishMishra added inline comments.Jul 19 2022, 1:36 AM
flang/lib/Lower/OpenMP.cpp
199

You are right. It works. But it sprouts another problem.

Consider the following:

!$omp parallel default(private)
    !$omp parallel default(private)
         x = 10
   !$omp end parallel
!$omp end parallel

Even with symbolsInNestedRegions, you can see that we still get double privatization in the inner block. Reason being that collectSymbolSet collects both the symbol and the host-associated symbol (here, both have same flags).

I tried to break collectSymbolSet into two: collecting either the symbols or only the host-associated symbols. But that has complicated matters so far.

I'll keep trying and see how this approach can be made to work.

I have another doubt. C/C++ does not have default(private | firstprivate). As standard says, the default(firstprivate | private) clause causes all variables in the construct that have implicitly determined data-sharing attributes to be (firstprivate | private). For the following case:

y = 20
!$omp parallel default(private)
  !$omp parallel default(firstprivate)
      y = 10
  !$omp end parallel
!$omp end parallel

Is there a private "y" between default(private) and default(firstprivate)? That is, is y copied from y = 20 or a private y with unknown value? Does y in y = 10 is taken as implicitly determined data-sharing?
What about the following case with the same question:

y = 20
!$omp parallel default(private)
  !$omp parallel firstprivate(y)
      y = 10
  !$omp end parallel
!$omp end parallel

Can you check the behaviors of other compilers?

Please check and address this comment before going ahead in case we go in the wrong direction.

OpenMP 5.1 onwards permits firstprivate/private in C language with some restrictions. https://www.openmp.org/spec-html/5.1/openmpsu116.html#x151-1650002.21.4.1

If you want to test with Clang then you can use a command like ./bin/clang -fopenmp -fopenmp-version=51 -S -emit-llvm

@NimishMishra where are we with this? Do you need any help here? Would you be able to reply to @peixin's question and get this ready for submission today?

ronlieb removed a subscriber: ronlieb.
NimishMishra added a comment.EditedJul 25 2022, 7:19 AM

@NimishMishra where are we with this? Do you need any help here? Would you be able to reply to @peixin's question and get this ready for submission today?

I checked this. There is indeed a privatised variable between the outer block with default(...) and the inner block with default(...) in other compilers. This patch also exhibits the same behaviour.

As you would have seen, I use a std::map and some insertions/deletions while lowering to privatise default clause variables. @peixin suggested another approach but that got equally complex (as I mentioned in my previous comment). I have, so far, not been able to arrive at a simpler solution than what we have in this patch.

flang/lib/Lower/OpenMP.cpp
199

You are right. It works. But it sprouts another problem.

Consider the following:

!$omp parallel default(private)
    !$omp parallel default(private)
         x = 10
   !$omp end parallel
!$omp end parallel

Even with symbolsInNestedRegions, you can see that we still get double privatization in the inner block. Reason being that collectSymbolSet collects both the symbol and the host-associated symbol (here, both have same flags).

I tried to break collectSymbolSet into two: collecting either the symbols or only the host-associated symbols. But that has complicated matters so far.

I'll keep trying and see how this approach can be made to work.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?
Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?
Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

Yes. We could do that. You are right in saying that the current implementation mixes up too much of different phases. We should definitely look into reducing that.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?

I agree.

Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

I tired to differentiate the symbols (including implicit host-associated or explicit symbols) using the scope. Unfortunately, there is no such thing that I can find during lowering. Adding the scope info in lowering for OpenMP may benefit other constructs/clauses. I haven't tried it in pre-fir. It's maybe worth a try.

@NimishMishra Can you update this patch back to last version with some TODO(s) so not to collect and remove symbols, which is kind of fragile?

This revision was not accepted when it landed; it landed in state Needs Review.Jul 26 2022, 1:40 AM
This revision was automatically updated to reflect the committed changes.

Is it possible to add a hard TODO for the cases that are not properly handled?

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

flang/lib/Lower/Bridge.cpp
517–518

Nit: spelling Associated.

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me.

+1

flang/include/flang/Lower/AbstractConverter.h
111

This is wrong spelling. I have updated D129190, and you didn't notice it. Also, the comment is not as the function does now.

flang/lib/Lower/OpenMP.cpp
193

You only borrowed the code in D128595 without the description, FIXME and TODO. Also, the summary didn't mention this, and there is not test case for this.

Sorry for the confusion. I mistook it as in we get the implementation in before the code freeze, and later change it to a semantic pass. Should I revert the commit?

Is it possible to add a hard TODO for the cases that are not properly handled?

The problem with the implementation is that it's not good. But it handles the cases we have in the tests here. The other approach we were looking at (collecting the entire evaluations's symbols, collecting only the nested evaluations' symbols, and creating a set difference of the two) gave problems with shared(...) clause. It also gave problems with some other nested constructs which had parent constructs, but did not have children constructs.

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

I am sorry. I wait for 2-3 days usually after approval before merging. In this case, I mistook your comment as a go-ahead to submit a working implementation, which we shall change later. I guess there is still some time left before the code freeze, should I revert this commit?

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

That approach did not work with two cases (1. with shared(...) clause and 2. with nested constructs which had parent constructs, but did not have children constructs). The current approach, although fragile, worked with both. So I kept it as it is.

flang/lib/Lower/OpenMP.cpp
193

I did not borrow code from https://reviews.llvm.org/D128595. I was aware of D130027, and I added the change based on that patch and the discussions we had.

Should I revert this completely and begin anew? That might fix the dependencies

shraiysh reopened this revision.Jul 26 2022, 6:50 AM

I have reverted this change for now, we can discuss and add TODOs for the unhandled cases and then merge it.

Sorry for the confusion. I mistook it as in we get the implementation in before the code freeze, and later change it to a semantic pass. Should I revert the commit?

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

I am sorry. I wait for 2-3 days usually after approval before merging. In this case, I mistook your comment as a go-ahead to submit a working implementation, which we shall change later. I guess there is still some time left before the code freeze, should I revert this commit?

That is alright. The semantics implementation is a longer-term suggestion. You don't have to do it in this patch.

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

That approach did not work with two cases (1. with shared(...) clause and 2. with nested constructs which had parent constructs, but did not have children constructs). The current approach, although fragile, worked with both. So I kept it as it is.

I would prefer a hard TODO for the nested default but if that is not required then it is OK.
Could you fix the minor spelling nit and the comments from Peixin and then submit it with an approval from @shraiysh or @peixin today? (Don't wait for me)

Could you fix the minor spelling nit and the comments from Peixin and then submit it with an approval from @shraiysh or @peixin today? (Don't wait for me)

Given that @peixin has more context for this revision, it would be faster if you could review the addressing of comments for this @peixin.

If Peixin is not online/available then I will try to review it if the patch updates tonight.

It is at 23:30 now in my time zone. So, I am afraid that I cannot review it after the update.

I didn't review the approach of collecting and removing symbols of the current patch in detail since I feel like this is not a good way. Can we use the approach of https://reviews.llvm.org/D123930#3661846 and add TODO for the cases of double privitization? This is not one hard comment if @shraiysh help review the current approach.

@NimishMishra Do you have a plan on how to go forward with this? I am away for a couple of weeks. Could you discuss with peixin and shraiysh either here or in tomorrows call and make progress and submit?

@NimishMishra Do you have a plan on how to go forward with this? I am away for a couple of weeks. Could you discuss with peixin and shraiysh either here or in tomorrows call and make progress and submit?

Yes. I plan to discuss further steps on this in tomorrow's call and close at the earliest.

This latest update closely follows @peixin's suggestions and avoids std::map and inserting/deleting symbols. Because of this, one test that was earlier passing is now failing. I have added the same in flang/test/Lower/OpenMP/Todo/omp-default-clause-inner-loop.f90. Rest all nested constructs are passing.

NimishMishra added inline comments.Jul 28 2022, 2:59 AM
flang/lib/Lower/OpenMP.cpp
193

@peixin Sorry for the confusion with https://reviews.llvm.org/D128595. Should I bring those changes in here? Or will you move ahead with it separately? Apologies again for the confusion

peixin added inline comments.Jul 28 2022, 4:19 AM
flang/lib/Lower/OpenMP.cpp
193

Don't worry about that. I have abandoned it and won't work on that again. You can move those changes in this patch.

peixin added inline comments.Jul 28 2022, 8:31 AM
flang/include/flang/Lower/AbstractConverter.h
107

Can you merge these two functions into one by adding one more argument to collectSymbolSet since they are quite similar?

flang/lib/Lower/OpenMP.cpp
159
160

Merged the two symbol collector methods into one

peixin added inline comments.Aug 2 2022, 6:48 PM
flang/include/flang/Lower/AbstractConverter.h
108

Not correct comment.

111

The default value of false sounds to be more reasonable.

flang/lib/Lower/Bridge.cpp
521
flang/lib/Lower/OpenMP.cpp
63

This function needs to be removed.

122

This needs to be moved in line 185. This is confusing since privatizedSymbols is used for private/firstprivate clause instead of default clause.

125

There may not be one pre-FIR pass to address this issue. How about the following:

FIXME: Collect the symbols with private/firstprivate flag in the region of the construct with default(private/firstprivate) clause excluding the symbols with the same private/firstprivate flag in the inner nested regions.

126
161

The "{" is not necessary.

179

Remove the blank line.

225–226
converter.collectSymbolSet(eval, threadprivateSyms,
                           Fortran::semantics::Symbol::Flag::OmpThreadprivate);

Use this since this uses the default behavior of collectSymbolSet, and no need to specify these two arguments.

flang/lib/Semantics/resolve-directives.cpp
1490

This comment is missed.

1491

This comment is missed.

1515

This comment is missed.

1518
NimishMishra marked 16 inline comments as done.

Addressed comments.

flang/lib/Lower/OpenMP.cpp
63

This function is being used by the last private clause inside privatizeVars

peixin accepted this revision.Aug 4 2022, 2:07 AM

Thanks for the update. LGTM. You can wait for a day or two to land this if there is no further comments.

flang/lib/Lower/OpenMP.cpp
121–130

Remove this since this for loop is not only for private/firstprivate clause.

Can you land this so that other privatization-related patches can be implemented based on this?

This revision was not accepted when it landed; it landed in state Needs Review.Aug 12 2022, 4:09 AM
This revision was automatically updated to reflect the committed changes.

I have landed the patch with https://reviews.llvm.org/rG435feefbdd6c91faf24fa5e69c4e7c3bc127568a.

Thank you @peixin, @kiranchandramohan, @shraiysh and others for working through this review.