Page MenuHomePhabricator

[flang][OpenMP] Handle lastprivate on sections construct
Needs ReviewPublic

Authored by NimishMishra on Sep 12 2022, 4:20 AM.

Details

Summary

This patch adds support for lastprivate on sections construct. One omp.sections operation can have several omp.section operation. As such, the privatization happens in the lexically last omp.section operation.

Diff Detail

Event Timeline

NimishMishra created this revision.Sep 12 2022, 4:20 AM
NimishMishra requested review of this revision.Sep 12 2022, 4:20 AM
kiranchandramohan requested changes to this revision.Sep 13 2022, 5:24 AM

Nice to see this patch.

I have some general concerns about having to special case in different places for handling the lastprivatisation in the section op.

Also, can you check why lowering for the following fails? This just adds some unstructured code into the last section.

subroutine lastprivate()
  integer :: x
  !$omp sections lastprivate(x)
      !$omp section
          x = x * 10
      !$omp section
          x = x + 1
      !$omp section
          goto 30
30        x = x - 3
    !$omp end sections
end subroutine
flang/lib/Lower/Bridge.cpp
523 ↗(On Diff #459420)

Can we work with the lastPrivBlock and avoid use of the targetOperation?

flang/lib/Lower/OpenMP.cpp
269

What issues did you face while trying to reuse existing code?

293

Can this special casing be removed?

590

Can the special casing be removed?

This revision now requires changes to proceed.Sep 13 2022, 5:24 AM
NimishMishra added inline comments.Sep 13 2022, 9:21 PM
flang/lib/Lower/Bridge.cpp
523 ↗(On Diff #459420)

I certainly tried to. I faced these problems.

For loops, we created one new scf.if %cmp operation which comes with a region. Thus by pointing lastPrivBlock to this, we were able to put the last private code in the correct location.

For omp.section however, we have one region and this is the workflow:

omp.section{
      // privatise here with a fir.alloc operation
      <extra FIR>
     // lastprivate operation with a fir.store
}

If I point lastPrivBlock to this region, then

if (lastPrivBlock)
      builder->setInsertionPointToStart(lastPrivBlock);

... expectedly does the following

omp.section{
       // lastprivate operation with a fir.store
      // privatise here with a fir.alloc operation
      <extra FIR>
}

If I change setInsertionPointToStart to setInsertionPointToEnd, then it inserts the lastprivate's fir.store after the omp.terminator in the block.

Hence, I thought to use precise locations through specifying the operation before which we want to insert the lastprivate FIR.

flang/lib/Lower/OpenMP.cpp
269

In addition to my previous comment, one issue was to set the insertion point in between these two things:

omp.section{
    // fir.allocate
 <extra FIR comes here>
    // fir.store 
}

With the usual workflow, the insertion point for all nested evaluations' lowering is set after all the code generated by privatizeVars. Concretely, I got the following:

omp.section{
    // fir.allocate
    // fir.store 
   <extra FIR comes here>
}

This was the issue due to which I needed the

if (!(hasLastPrivateOp && std::is_same_v<Op, omp::SectionOp>))
   firOpBuilder.restoreInsertionPoint(insPt);
peixin added inline comments.Sep 16 2022, 11:23 PM
flang/lib/Lower/Bridge.cpp
522 ↗(On Diff #459420)

Instead of passing the special argument to this function, do you consider pass the inerstion Point as one argument to handle all the cases? The argument name would be something like copyInsertionPoint/copyAssignInsertionPoint.

Updated the patch:

  • Modified copyHostAssociateVar to take in a single argument for all lastprivate cases
  • Split the handling of sections privatisation from the generic handler function

@kiranchandramohan I tried to work around the current implementation in a number of ways. One such way was to try and add more than 1 block in the operation's region. However that got nowhere. Should this current version be not acceptable, do you have any ideas on how to tackle this problem?

kiranchandramohan requested changes to this revision.Sep 23 2022, 5:01 AM

Updated the patch:

  • Modified copyHostAssociateVar to take in a single argument for all lastprivate cases
  • Split the handling of sections privatisation from the generic handler function

@kiranchandramohan I tried to work around the current implementation in a number of ways. One such way was to try and add more than 1 block in the operation's region. However that got nowhere. Should this current version be not acceptable, do you have any ideas on how to tackle this problem?

As discussed in the call, you can try to minimize further duplication of code. To match the lastprivate of wsloop behaviour, you can try to have an scf.if with true as the predicate value.

flang/include/flang/Lower/AbstractConverter.h
21 ↗(On Diff #461880)

Is this needed here now?

flang/include/flang/Lower/PFTBuilder.h
274 ↗(On Diff #461880)

Isn't this the default behaviour, do we need to specify this overload?

This revision now requires changes to proceed.Sep 23 2022, 5:01 AM

Updated the patch as per @kiranchandramohan's suggestion.

To simplify privatisation, currently we add a scf.if %true {...} operation to last lexically last omp.section operation in an omp.sections operation. Canonicalization in later phases of lowering shall remove this redundant operation.

NimishMishra marked 2 inline comments as done.Oct 2 2022, 10:08 PM
NimishMishra added inline comments.
flang/include/flang/Lower/PFTBuilder.h
274 ↗(On Diff #461880)

Yes. The eval == eval.parentConstruct->getLastNestedEvaluation() requires this overload

Rebase with main

kiranchandramohan requested changes to this revision.Oct 6 2022, 4:33 AM

Nice to see this patch.

I have some general concerns about having to special case in different places for handling the lastprivatisation in the section op.

Also, can you check why lowering for the following fails? This just adds some unstructured code into the last section.

subroutine lastprivate()
  integer :: x
  !$omp sections lastprivate(x)
      !$omp section
          x = x * 10
      !$omp section
          x = x + 1
      !$omp section
          goto 30
30        x = x - 3
    !$omp end sections
end subroutine

Have you checked the code given above? I think it fails with this change.

flang/lib/Lower/OpenMP.cpp
144

Can you add a comment describing why the scf::IfOp is required?

590

Is this still required?

This revision now requires changes to proceed.Oct 6 2022, 4:33 AM

Rebased and added comment to explain need of a dummy scf.if operation.

Privatization for unstructured sections to be handled in a separate patch.

NimishMishra marked an inline comment as done.Oct 13 2022, 11:10 PM
NimishMishra added inline comments.
flang/lib/Lower/OpenMP.cpp
590

I have a question on resetBeforeTerminator. It does one of two things:

  1. Either is sets the insertion point after the storeOp OR
  2. It sets it at the beginning on the block.

In the current context, the second point is happening. By setting the insertion to the top of the section block, FIR is generated before privatization. We can work around this by allowing privatizeVars to set storeOp. However, that sounds cumbersome.

I was wondering on the reasons for this function. Is it a wsloop specific function, for which you do have a storeOp? If so, does it make sense to call this function for certain operations like wsloop and leave it for others?

Something like:

if(lastPrivateOp && <wsloop operation>)
    firOpBuilder.setInsertionPoint(storeOp);

For other operations, it's not required to reset insertion point after privatizeVars has already done it.

Let me know what you think of this? Basically, we can use this functionality to "override" insertion point set by privatizeVars for cases which need such overriding (like wsloop), and leave it for other cases.

peixin requested changes to this revision.Oct 14 2022, 12:14 AM

Can you add one test case including nowait clause and compare what IR this patch generated with that of clang? It seems that one implicit barrier is missed. https://github.com/llvm/llvm-project/blob/6370bc2435a8406898eee7338ae7d795a252ad04/clang/lib/CodeGen/CGStmtOpenMP.cpp#L4075-L4083

flang/lib/Lower/OpenMP.cpp
127

No need to add the above operator.

144

I think it is redundant. wsloop uses it to check if it is the sequentially last iteration. For sections construct, you already use &eval == &eval.parentConstruct->getLastNestedEvaluation() to check it is lexically last section construct.

This revision now requires changes to proceed.Oct 14 2022, 12:14 AM
NimishMishra added inline comments.Oct 17 2022, 8:06 PM
flang/lib/Lower/OpenMP.cpp
127

Thanks. This is cleaner

144

By just using eval.parentConstruct->getLastNestedEvaluation(), we can know the lexically final block. The problem is to place the FIR correctly in this region.

omp.section{
<private/firstprivate>

<actual FIR>

<lastprivate store>
}

Now omp.section has one region. I was able to generate firstprivate/private FIR as well as the lastprivate FIR correctly, but to set the insertion point between these needed some logic in privatizeVars function.

To this end, @kiranchandramohan suggested that, to keep the privatization code simpler and generic, we can add a dummy operation like the one you are seeing currently. This will allow a subregion inside omp.section's own region, and prevent the need of handling insertion points separately. Anyway this operation shall be removed in later canonicalization.

peixin added inline comments.Oct 17 2022, 11:51 PM
flang/lib/Lower/OpenMP.cpp
144

OK. Got it. Thanks.

Added a test case for nowait clause on sections construct as well.

Do you have any pending work here?

flang/lib/Lower/OpenMP.cpp
163

Please add a comment on why this barrier is required.

peixin added inline comments.Wed, Nov 9, 10:57 PM
flang/lib/Lower/OpenMP.cpp
158

Can you try to get otherClauses, which is sectionsEndClauseList here, from eval.parentConstruct?