Page MenuHomePhabricator

[flang][OpenMP] Lowering support for atomic capture
Needs ReviewPublic

Authored by NimishMishra on Jun 8 2022, 12:56 AM.

Details

Summary

This patch adds lowering support for atomic capture operation. First is created a region (without any operand) for the atomic capture operation. Then based on one of the following configurations...

  1. [update-stmt, capture-stmt]
  2. [capture-stmt, update-stmt]
  3. [capture-stmt, write-stmt]

... the lowering proceeds by creating these individual operations inside the atomic capture's region.

Diff Detail

Event Timeline

NimishMishra created this revision.Jun 8 2022, 12:56 AM
NimishMishra requested review of this revision.Jun 8 2022, 12:56 AM

This follows after D125668.

TODO: handling pointers in atomic capture construct

Thanks for working on this. I have a few comments on generated IR and changes in OpenMPDialect.cpp. Please excuse the formatting issues, if any as I'm commenting this from a mobile :')

flang/lib/Lower/OpenMP.cpp
1491

Shouldn't this be done in semantics?

flang/test/Lower/OpenMP/atomic-capture.f90
35

Shouldn't all of this (34-41) be wrapped in an omp.atomic.read and omp.atomic.update operation? Why can't we generate that here? Relaxing the restrictions on omp.capture is not a solution for this when it's possible to express this same thing in current syntax.

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
936 ↗(On Diff #435066)

Is this move required?

941 ↗(On Diff #435066)

Can you please add a testcase for this in mlir/test/Dialect/OpenMP/ops.mlir? I'm having trouble understanding why this is required.

NimishMishra added inline comments.Jun 9 2022, 9:33 PM
flang/lib/Lower/OpenMP.cpp
1491

This is the best approach I could think of in order to understand which capture construct combination to lower to. So I put basic "structural" checks for v=x and x=x op expr statements. There are more semantic checks attached with these. I assume we rely on the semantics phase to take care of them.

These helper functions here are only doing structural checks.

flang/test/Lower/OpenMP/atomic-capture.f90
35

I do not understand. What do you mean by "wrapped" in a read and write operation?

Generally, the read operation is not problematic. In the write operation however, the FIR for expression evaluation is involved. I was attempting to control insertion points to make this expression evaluation "outside" the omp.atomic.capture, but I couldn't do it.

However, I am not too sure if it is wrong to put the FIR related to expression evaluation within the capture block (alongside the omp.atomic.write). Please correct me if I am missing something here.

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
936 ↗(On Diff #435066)

No. It's unintentional. I will revert it.

shraiysh added inline comments.Jun 9 2022, 10:20 PM
flang/test/Lower/OpenMP/atomic-capture.f90
34–43

If you can generate this, then that's the most accurate imo.

This does not violate the semantics of atomic construct because it clearly says that -

Only the read and write of the location designated by x are performed mutually atomically. Neither the evaluation of expr or expr_list, nor the write to the location designated by v, need be atomic with respect to the read or write of the location designated by x.

36–42

I do not understand. What do you mean by "wrapped" in a read and write operation?

This is what I meant.

However, I am not too sure if it is wrong to put the FIR related to expression evaluation within the capture block (alongside the omp.atomic.write).

Ideally, the evaluation of the expression should not be inside the atomic region at all. However, if that's somehow not possible, it should be pushed inside an omp.atomic.update expression because omp.atomic.update operation supports multiple operations in it's region. If that cannot be done too, we should justify adding this relaxation to omp.atomic.capture independent of Flang by answering "what kind of capture executions cannot be expressed in MLIR with the current syntax?". As long as they can be expressed, it should be the job of flang to lower appropriately.

NimishMishra added inline comments.Jun 9 2022, 10:38 PM
flang/test/Lower/OpenMP/atomic-capture.f90
36–42

Okay. Then I will work on moving the evaluation of the expression outside the atomic region. I think I have a way to do that.

shraiysh added inline comments.Jun 9 2022, 11:03 PM
flang/lib/Lower/OpenMP.cpp
1533–1534

This should not always be true. If the function call on rhs does not use the variable on the lhs, then this is an atomic write statement.

Modeling it as an update sort of works, in the sense that there is no visible change in behavior of the program, but the generated IR will not be entirely accurate. If it is very hard to deduce write here, maybe mention it as a todo?

1535–1537

I meant can something like this work? Relying on semantics for a valid binary operator.

NimishMishra added inline comments.Jun 9 2022, 11:09 PM
flang/lib/Lower/OpenMP.cpp
1533–1534

You are right. I missed it. I will revisit these functions and see what can be done.

Improved design of the solution

NimishMishra added inline comments.Mon, Aug 1, 4:55 AM
flang/test/Lower/OpenMP/atomic-capture.f90
81

I had a discussion to make at this point. The verifier for atomic capture checks if opsInRegion.size() == 3 i.e. if the number of operations in a region were 3.

This issue came up before during lowering of omp.atomic.write inside atomic capture, since a write statement has expression evaluation which also takes up space. @shraiysh suggested to keep this expression evaluation outside the omp.atomic.capture, since is what I have done currently.

However, with pointers, the issue has resurfaced. This particular b = a lowers as

%0 = allocate c
%1 = allocate d
%2 = allocate a
%3 = allocate b

omp.atomic.capture{
   omp.atomic.update{......}

 %4 = load a
  % 5 = load b
  omp.atomic.read %5 = %4

 omp.terminator
}

The discussion I wish to have is whether the verification method here should be changed, or should we evaluate all LHS and RHS of an atomic assignment statement beforehand. To change the verification, I was thinking like the following:

  1. Ensure in a list of operations, there are exactly two omp.atomic operations and exactly one omp.terminator operations
  1. Last operation in the region is omp.terminator
  1. Second last operation in the region is necessarily a omp.atomic operation