Page MenuHomePhabricator

[flang][OpenMP] Lowering support for default clause
ClosedPublic

Authored by NimishMishra on Apr 18 2022, 5:21 AM.

Details

Summary

This patch adds lowering support for default clause.

  1. During symbol resolution in semantics, should the enclosing context have a default data sharing clause defined and a parser::Name is not attached to an explicit data sharing clause, the semantics::Symbol::Flag::OmpPrivate flag (in case of default(private)) and semantics::Symbol::Flag::OmpFirstprivate flag (in case of default(firstprivate)) is added to the symbol.
  1. During lowering, all symbols having either semantics::Symbol::Flag::OmpPrivate or semantics::Symbol::Flag::OmpFirstprivate flag are collected and privatised appropriately.

Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
peixin requested changes to this revision.Jul 4 2022, 4:24 AM
peixin added inline comments.
flang/test/Lower/OpenMP/default-clause.f90
138

Sorry, I missed the special case of the nested constructs and no code between the nested constructs. I think the problem is the following code:

if (GetContext().defaultDSA == semantics::Symbol::Flag::OmpPrivate &&
    !HasDataSharingAttributeObject(*name.symbol)) {
  name.symbol = DeclarePrivateAccessEntity(
      *name.symbol, semantics::Symbol::Flag::OmpPrivate, currScope());
} else if (GetContext().defaultDSA ==
        semantics::Symbol::Flag::OmpFirstPrivate &&
    !HasDataSharingAttributeObject(*name.symbol)) {
  name.symbol = DeclarePrivateAccessEntity(
      *name.symbol, semantics::Symbol::Flag::OmpFirstPrivate, currScope());
}

You should traverse the dirContext to the top and check each level. The dumped symbol should be something like the following:

MainProgram scope: default_clause_lowering size=16 alignment=4
    w size=4 offset=12: ObjectEntity type: INTEGER(4)
    x size=4 offset=0: ObjectEntity type: INTEGER(4)
    y size=4 offset=4: ObjectEntity type: INTEGER(4)
    z size=4 offset=8: ObjectEntity type: INTEGER(4)
    Block scope: size=0 alignment=1
        w (OmpPrivate): HostAssoc
        x (OmpPrivate): HostAssoc
        y (OmpPrivate): HostAssoc
      Block scope: size=0 alignment=1
        x (OmpFirstPrivate): HostAssoc
        y (OmpFirstPrivate): HostAssoc
      Block scope: size=0 alignment=1
        w (OmpPrivate): HostAssoc
        x (OmpPrivate): HostAssoc

When you traverse the dirContext, you should use DeclarePrivateAccessEntity(..., ..., dirScope);. You can get the dirScope by context_.FindScope(dirContext_[i].directiveSource).

Also, you should add one more example for the following:

!$omp parallel default(private)
    !$omp parallel default(firstprivate)
        x = y
    !$omp end parallel

    !$omp parallel default(shared)
        w = x + z
    !$omp end parallel
!$omp end parallel
peixin added a comment.Jul 4 2022, 4:32 AM

You may check one more example as follows:

!$omp parallel default(firstprivate)
    !$omp single
        x = y
    !$omp end single
!$omp end parallel
NimishMishra edited the summary of this revision. (Show Details)
  • Addressed comments
  • Added support for handling nested block constructs with default clauses
NimishMishra added inline comments.Jul 10 2022, 6:29 AM
flang/lib/Lower/Bridge.cpp
575

Use of eval here allows all symbols to be collected into a llvm::SetVector, even from the nested constructs. I found two special cases:

!$omp parallel default(private)
    y = 10
    !$omp parallel default(private)
           y = 20
    !$omp end parallel
!$omp end parallel

I got double privatization assertion failures with these. Hence the use of std::map in the current implementation in OpenMP.cpp to keep unique values.

Another corner-case was

!$omp parallel shared(y) default(private)
    y = 10
    !$omp parallel default(private)
         y = 20
    !$omp end parallel
!$omp end parallel

This again looked at the default(...) of the outer construct, and collected all symbols with OmpPrivate. Means even though we have a shared(y), it was getting privatized. For this, I deliberately collect sharedSymbols and remove them from the std::map being privatised.

peixin added inline comments.Jul 11 2022, 6:43 PM
flang/lib/Lower/Bridge.cpp
559–560

Can you remove this change and add "Depends on D129190" in the summary?

flang/lib/Lower/OpenMP.cpp
167

This code is too fragile. What about the following:

llvm::SetVector<const Fortran::semantics::Symbol *> defaultSymbols;
llvm::SetVector<const Fortran::semantics::Symbol *> symbolsInNestedOmpRegions;
llvm::SetVector<const Fortran::semantics::Symbol *> privatizedSymbols;
// Collect the symbols.
for (auto privatizedSymbol: privatizedSymbols)
  privatizeSymbol(*privatizedSymbol);
for (auto defaultSymbol : defaultSymbols)
  if (!symbolsInNestedRegions.contains(defaultSymbol))
    privatizeSymbol(*defaultSymbol);
flang/lib/Semantics/resolve-directives.cpp
1495
1496
1520
Symbol *hostSymbol =
    defaultDSASymbols.size() ? defaultDSASymbols.back() : &symbol->GetUltimate();
defaultDSASymbols.push_back(DeclarePrivateAccessEntity(
    *hostSymbol, dirContext.defaultDSA,
    context_.FindScope(dirContext.directiveSource)));
NimishMishra added inline comments.Jul 13 2022, 12:16 AM
flang/lib/Lower/OpenMP.cpp
167

This doesn't solve one problem. Consider:

!$omp parallel default(private)
  x = 20
  !$omp parallel default(firstprivate)
         y = 10
  !$omp end parallel

    !$omp parallel default(first)
         y = 10
  !$omp end parallel
!$omp end parallel

Now defaultSymbols creates the vector [x, y, y]. This causes assertion failure.

symbolsInNestedOmpRegions creates the vector [y, y]. Final privatization thus proceeds only for x because of !symbolsInNestedRegions.contains(defaultSymbol)

I have another doubt. C/C++ does not have default(private | firstprivate). As standard says, the default(firstprivate | private) clause causes all variables in the construct that have implicitly determined data-sharing attributes to be (firstprivate | private). For the following case:

y = 20
!$omp parallel default(private)
  !$omp parallel default(firstprivate)
      y = 10
  !$omp end parallel
!$omp end parallel

Is there a private "y" between default(private) and default(firstprivate)? That is, is y copied from y = 20 or a private y with unknown value? Does y in y = 10 is taken as implicitly determined data-sharing?
What about the following case with the same question:

y = 20
!$omp parallel default(private)
  !$omp parallel firstprivate(y)
      y = 10
  !$omp end parallel
!$omp end parallel

Can you check the behaviors of other compilers?

flang/lib/Lower/OpenMP.cpp
167

defaultSymbols is [x, y1, y2, y3]. symbolsInNestedOmpRegions is [y2, y3].

Block scope: size=0 alignment=1
  x (OmpPrivate): HostAssoc
  y (OmpPrivate): HostAssoc // y1
  Block scope: size=0 alignment=1
    y (OmpFirstPrivate): HostAssoc // y2
  Block scope: size=0 alignment=1
    y (OmpFirstPrivate): HostAssoc // y3

It should work.

NimishMishra added inline comments.Jul 19 2022, 1:36 AM
flang/lib/Lower/OpenMP.cpp
167

You are right. It works. But it sprouts another problem.

Consider the following:

!$omp parallel default(private)
    !$omp parallel default(private)
         x = 10
   !$omp end parallel
!$omp end parallel

Even with symbolsInNestedRegions, you can see that we still get double privatization in the inner block. Reason being that collectSymbolSet collects both the symbol and the host-associated symbol (here, both have same flags).

I tried to break collectSymbolSet into two: collecting either the symbols or only the host-associated symbols. But that has complicated matters so far.

I'll keep trying and see how this approach can be made to work.

I have another doubt. C/C++ does not have default(private | firstprivate). As standard says, the default(firstprivate | private) clause causes all variables in the construct that have implicitly determined data-sharing attributes to be (firstprivate | private). For the following case:

y = 20
!$omp parallel default(private)
  !$omp parallel default(firstprivate)
      y = 10
  !$omp end parallel
!$omp end parallel

Is there a private "y" between default(private) and default(firstprivate)? That is, is y copied from y = 20 or a private y with unknown value? Does y in y = 10 is taken as implicitly determined data-sharing?
What about the following case with the same question:

y = 20
!$omp parallel default(private)
  !$omp parallel firstprivate(y)
      y = 10
  !$omp end parallel
!$omp end parallel

Can you check the behaviors of other compilers?

Please check and address this comment before going ahead in case we go in the wrong direction.

OpenMP 5.1 onwards permits firstprivate/private in C language with some restrictions. https://www.openmp.org/spec-html/5.1/openmpsu116.html#x151-1650002.21.4.1

If you want to test with Clang then you can use a command like ./bin/clang -fopenmp -fopenmp-version=51 -S -emit-llvm

@NimishMishra where are we with this? Do you need any help here? Would you be able to reply to @peixin's question and get this ready for submission today?

ronlieb removed a subscriber: ronlieb.
NimishMishra added a comment.EditedMon, Jul 25, 7:19 AM

@NimishMishra where are we with this? Do you need any help here? Would you be able to reply to @peixin's question and get this ready for submission today?

I checked this. There is indeed a privatised variable between the outer block with default(...) and the inner block with default(...) in other compilers. This patch also exhibits the same behaviour.

As you would have seen, I use a std::map and some insertions/deletions while lowering to privatise default clause variables. @peixin suggested another approach but that got equally complex (as I mentioned in my previous comment). I have, so far, not been able to arrive at a simpler solution than what we have in this patch.

flang/lib/Lower/OpenMP.cpp
167

You are right. It works. But it sprouts another problem.

Consider the following:

!$omp parallel default(private)
    !$omp parallel default(private)
         x = 10
   !$omp end parallel
!$omp end parallel

Even with symbolsInNestedRegions, you can see that we still get double privatization in the inner block. Reason being that collectSymbolSet collects both the symbol and the host-associated symbol (here, both have same flags).

I tried to break collectSymbolSet into two: collecting either the symbols or only the host-associated symbols. But that has complicated matters so far.

I'll keep trying and see how this approach can be made to work.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?
Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?
Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

Yes. We could do that. You are right in saying that the current implementation mixes up too much of different phases. We should definitely look into reducing that.

If nested default clause is the issue, we can consider adding a TODO for that particular case and get the rest of the implementation in. What do you think @NimishMishra @peixin @shraiysh ?

I agree.

Longer-term I think leaving the expansion of the default clause to private, firstprivate, shared to a pass between semantics and pre-fir tree generation seems to be a possible way. Doing all these traversals and collection of symbols during lowering seems to be mixing issues of frontend and lowering.

I tired to differentiate the symbols (including implicit host-associated or explicit symbols) using the scope. Unfortunately, there is no such thing that I can find during lowering. Adding the scope info in lowering for OpenMP may benefit other constructs/clauses. I haven't tried it in pre-fir. It's maybe worth a try.

@NimishMishra Can you update this patch back to last version with some TODO(s) so not to collect and remove symbols, which is kind of fragile?

This revision was not accepted when it landed; it landed in state Needs Review.Tue, Jul 26, 1:40 AM
This revision was automatically updated to reflect the committed changes.

Is it possible to add a hard TODO for the cases that are not properly handled?

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

flang/lib/Lower/Bridge.cpp
559–560

Nit: spelling Associated.

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me.

+1

flang/include/flang/Lower/AbstractConverter.h
117

This is wrong spelling. I have updated D129190, and you didn't notice it. Also, the comment is not as the function does now.

flang/lib/Lower/OpenMP.cpp
281

You only borrowed the code in D128595 without the description, FIXME and TODO. Also, the summary didn't mention this, and there is not test case for this.

Sorry for the confusion. I mistook it as in we get the implementation in before the code freeze, and later change it to a semantic pass. Should I revert the commit?

Is it possible to add a hard TODO for the cases that are not properly handled?

The problem with the implementation is that it's not good. But it handles the cases we have in the tests here. The other approach we were looking at (collecting the entire evaluations's symbols, collecting only the nested evaluations' symbols, and creating a set difference of the two) gave problems with shared(...) clause. It also gave problems with some other nested constructs which had parent constructs, but did not have children constructs.

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

I am sorry. I wait for 2-3 days usually after approval before merging. In this case, I mistook your comment as a go-ahead to submit a working implementation, which we shall change later. I guess there is still some time left before the code freeze, should I revert this commit?

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

That approach did not work with two cases (1. with shared(...) clause and 2. with nested constructs which had parent constructs, but did not have children constructs). The current approach, although fragile, worked with both. So I kept it as it is.

flang/lib/Lower/OpenMP.cpp
281

I did not borrow code from https://reviews.llvm.org/D128595. I was aware of D130027, and I added the change based on that patch and the discussions we had.

Should I revert this completely and begin anew? That might fix the dependencies

shraiysh reopened this revision.Tue, Jul 26, 6:50 AM

I have reverted this change for now, we can discuss and add TODOs for the unhandled cases and then merge it.

Sorry for the confusion. I mistook it as in we get the implementation in before the code freeze, and later change it to a semantic pass. Should I revert the commit?

BTW, it will be good to get an approval before merging. In this case, I guess we are in a bit of a hurry and you were probably confused by the messaging from me. Apologies for that.

I am sorry. I wait for 2-3 days usually after approval before merging. In this case, I mistook your comment as a go-ahead to submit a working implementation, which we shall change later. I guess there is still some time left before the code freeze, should I revert this commit?

That is alright. The semantics implementation is a longer-term suggestion. You don't have to do it in this patch.

Also, could you clarify whether you followed @peixin's suggestion or kept the code as it is?

That approach did not work with two cases (1. with shared(...) clause and 2. with nested constructs which had parent constructs, but did not have children constructs). The current approach, although fragile, worked with both. So I kept it as it is.

I would prefer a hard TODO for the nested default but if that is not required then it is OK.
Could you fix the minor spelling nit and the comments from Peixin and then submit it with an approval from @shraiysh or @peixin today? (Don't wait for me)

Could you fix the minor spelling nit and the comments from Peixin and then submit it with an approval from @shraiysh or @peixin today? (Don't wait for me)

Given that @peixin has more context for this revision, it would be faster if you could review the addressing of comments for this @peixin.

If Peixin is not online/available then I will try to review it if the patch updates tonight.

It is at 23:30 now in my time zone. So, I am afraid that I cannot review it after the update.

I didn't review the approach of collecting and removing symbols of the current patch in detail since I feel like this is not a good way. Can we use the approach of https://reviews.llvm.org/D123930#3661846 and add TODO for the cases of double privitization? This is not one hard comment if @shraiysh help review the current approach.

@NimishMishra Do you have a plan on how to go forward with this? I am away for a couple of weeks. Could you discuss with peixin and shraiysh either here or in tomorrows call and make progress and submit?

@NimishMishra Do you have a plan on how to go forward with this? I am away for a couple of weeks. Could you discuss with peixin and shraiysh either here or in tomorrows call and make progress and submit?

Yes. I plan to discuss further steps on this in tomorrow's call and close at the earliest.

This latest update closely follows @peixin's suggestions and avoids std::map and inserting/deleting symbols. Because of this, one test that was earlier passing is now failing. I have added the same in flang/test/Lower/OpenMP/Todo/omp-default-clause-inner-loop.f90. Rest all nested constructs are passing.

NimishMishra added inline comments.Thu, Jul 28, 2:59 AM
flang/lib/Lower/OpenMP.cpp
281

@peixin Sorry for the confusion with https://reviews.llvm.org/D128595. Should I bring those changes in here? Or will you move ahead with it separately? Apologies again for the confusion

peixin added inline comments.Thu, Jul 28, 4:19 AM
flang/lib/Lower/OpenMP.cpp
281

Don't worry about that. I have abandoned it and won't work on that again. You can move those changes in this patch.

peixin added inline comments.Thu, Jul 28, 8:31 AM
flang/include/flang/Lower/AbstractConverter.h
113

Can you merge these two functions into one by adding one more argument to collectSymbolSet since they are quite similar?

flang/lib/Lower/OpenMP.cpp
188
189

Merged the two symbol collector methods into one

peixin added inline comments.Tue, Aug 2, 6:48 PM
flang/include/flang/Lower/AbstractConverter.h
111

Not correct comment.

117

The default value of false sounds to be more reasonable.

flang/lib/Lower/Bridge.cpp
564
flang/lib/Lower/OpenMP.cpp
63–64

This function needs to be removed.

96

This needs to be moved in line 185. This is confusing since privatizedSymbols is used for private/firstprivate clause instead of default clause.

99

There may not be one pre-FIR pass to address this issue. How about the following:

FIXME: Collect the symbols with private/firstprivate flag in the region of the construct with default(private/firstprivate) clause excluding the symbols with the same private/firstprivate flag in the inner nested regions.

100
190

The "{" is not necessary.

208

Remove the blank line.

306–308
converter.collectSymbolSet(eval, threadprivateSyms,
                           Fortran::semantics::Symbol::Flag::OmpThreadprivate);

Use this since this uses the default behavior of collectSymbolSet, and no need to specify these two arguments.

flang/lib/Semantics/resolve-directives.cpp
1495

This comment is missed.

1496

This comment is missed.

1520

This comment is missed.

1523
NimishMishra marked 16 inline comments as done.

Addressed comments.

flang/lib/Lower/OpenMP.cpp
63–64

This function is being used by the last private clause inside privatizeVars

peixin accepted this revision.Thu, Aug 4, 2:07 AM

Thanks for the update. LGTM. You can wait for a day or two to land this if there is no further comments.

flang/lib/Lower/OpenMP.cpp
108

Remove this since this for loop is not only for private/firstprivate clause.

Can you land this so that other privatization-related patches can be implemented based on this?

This revision was not accepted when it landed; it landed in state Needs Review.Fri, Aug 12, 4:09 AM
This revision was automatically updated to reflect the committed changes.

I have landed the patch with https://reviews.llvm.org/rG435feefbdd6c91faf24fa5e69c4e7c3bc127568a.

Thank you @peixin, @kiranchandramohan, @shraiysh and others for working through this review.