This is an archive of the discontinued LLVM Phabricator instance.

[Flang][OpenMP] Initial support for integer reduction in worksharing-loop
ClosedPublic

Authored by kiranchandramohan on Jul 19 2022, 4:57 AM.

Details

Summary

Lower the Flang parse-tree containing OpenMP reductions to the OpenMP
dialect. The OpenMP dialect models reductions with,

  1. A reduction declaration operation that specifies how to initialize, combine, and atomically combine private reduction variables.
  2. The OpenMP operation (like wsloop) that supports reductions has an array of reduction accumulator variables (operands) and an array attribute of the same size that points to the reduction declaration to be used for the reduction accumulation.
  3. The OpenMP reduction operation that takes a value and an accumulator. This operation replaces the original reduction operation in the source.

(1) is implemented by the createReductionDecl in OpenMP.cpp, (2) is implemented while creating the OpenMP operation, (3) is implemented by the genOpenMPReduction function in OpenMP.cpp, and called from Bridge.cpp. The implementation of (3) is not very robust.

NOTE 1: The patch currently supports only reductions for integer type addition.
NOTE 2: Only supports reduction in the worksharing loop.
NOTE 3: Does not generate atomic combination region.
NOTE 4: Other options for creating the reduction operation include
a) having the reduction operation as a construct containing an assignment
and then handling it appropriately in the Bridge.
b) we can modify genAssignment or genFIR(AssignmentStmt) in the Bridge to
handle OpenMP reduction but so far we have tried not to mix OpenMP
and non-OpenMP code and this will break that.
I will try (b) in a separate patch.
NOTE 4: OpenMP dialect gained support for reduction with the patches:
D105358, D107343. See https://discourse.llvm.org/t/rfc-openmp-reduction-support/3367
for more details.

Co-authored-by: Peixin-Qiao <qiaopeixin@huawei.com>

Diff Detail

Event Timeline

Herald added a reviewer: ftynse. · View Herald Transcript
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
kiranchandramohan requested review of this revision.Jul 19 2022, 4:57 AM

Support reduction of different integer types. Add more tests.

kiranchandramohan retitled this revision from [Flang][OpenMP] WIP : Initial support for reduction to [Flang][OpenMP] Initial support for reduction.Jul 20 2022, 4:39 PM
kiranchandramohan edited the summary of this revision. (Show Details)
kiranchandramohan edited the summary of this revision. (Show Details)
flang/lib/Lower/OpenMP.cpp
1517

Try the approach in NOTE 3.b

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1316

Move this to a separate patch.

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
121–126

Add a test.

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

flang/lib/Lower/OpenMP.cpp
1513

i think this should be done as a part of this patch itself.

Address review comments, fix a minor issue, add TODO tests to demonstrate
existence of TODOs.

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

This is a great point. We should keep this in mind and see whether we can try something like this when we want to use genFIR for assignment statement for reductions.

flang/lib/Lower/OpenMP.cpp
1513

Did you mean handling various operation types or the TODOs?
I have added tests to demonstrate that TODOs exist for unhandled cases and that includes the different operations or operations on unsupported types.

This is not my area of expertise, but overall makes sense to me. I really appreciate the detailed summary! IMHO, genOpenMPReduction would really benefit from a bit of refactoring (I've not read it yet, tbh).

A reduction declaration operation that specifies how to initialize, combine and atomically combine private reduction variables.

I'm guessing that in the following, combiner is for both "combine" and "atomically combine":

!CHECK-LABEL: omp.reduction.declare
!CHECK-SAME: @[[RED_I64_NAME:.*]] : i64 init {
!CHECK: ^bb0(%{{.*}}: i64):
!CHECK:  %[[C0_1:.*]] = arith.constant 0 : i64
!CHECK:  omp.yield(%[[C0_1]] : i64)
!CHECK: } combiner {
!CHECK: ^bb0(%[[ARG0:.*]]: i64, %[[ARG1:.*]]: i64):
!CHECK:  %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i64
!CHECK:  omp.yield(%[[RES]] : i64)
!CHECK: }

Or is combiner for regular "combine" only?

Also:

NOTE 1: The patch currently supports only reductions for integer type addition.

Why not call this out in the title (e.g. "[Flang][OpenMP] Add support for integer reductions in work-sharing loops". That's probably the key bit of information here.

flang/lib/Lower/OpenMP.cpp
1510–1511

[nit] IIUC, the key task for this method is to generate OpenMP reductions (that's what the method suggests). However, this comment starts with an implementation detail ("Find chain : load reduction var -> reduction_operation -> store reduction var") rather than a generic description ("eplace it with the reduction operation").

1513

I think that "one patch per type" is fine. It's much easier to review short patches.

1517

This method could really benefit from some early exits. It's quite tricky to read it ATM. Any chance to refactor a bit?

shraiysh added inline comments.Jul 25 2022, 5:00 AM
flang/lib/Lower/OpenMP.cpp
1513

I meant adding tests for TODOs itself, thanks! Handling everything will make this a huge patch. Separate patches for unhandled cases would be better!

When I test various integer scalar, I found several test cases without semantic errors and I am not sure if we should support them for now:

$ cat temp2.f90 
program main
  integer :: x(10)! = 0
  integer :: i = 1

!  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x(1))
  do i = 1, 10
    x(1) = x(1) + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp4.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp5.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt%x)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
flang/lib/Lower/OpenMP.cpp
1517

This does not work for the following two cases:

program main
  integer, allocatable :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
program main
  integer, pointer :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
awarzynski added inline comments.Jul 25 2022, 7:01 AM
flang/test/Lower/OpenMP/wsloop-reduction-int.f90
56

Not needed

88

Not needed

When I test various integer scalar, I found several test cases without semantic errors and I am not sure if we should support them for now:

$ cat temp2.f90 
program main
  integer :: x(10)! = 0
  integer :: i = 1

!  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x(1))
  do i = 1, 10
    x(1) = x(1) + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp4.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp5.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt%x)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

Thanks @peixin for going through this in detail. I think we can handle all these in future patches. Handling reduction during generation is a better approach as suggested in Note 3.b and switching to that approach will help handle most of the cases.

BTW, Could you file a github issue for temp2 and temp5? These seem to have issues in the semantic stage itself. temp4 hits a Todo in lowering.

Let me know if this is not alrite with you.

flang/lib/Lower/OpenMP.cpp
1517

These two (allocate and pointer tests) hit the hard Todo and we can handle them in subsequent patches. I will add these tests (as Todo tests) to the patch and add you as co-author.

kiranchandramohan retitled this revision from [Flang][OpenMP] Initial support for reduction to [Flang][OpenMP] Initial support for integer reduction in worksharing-loop.Jul 25 2022, 8:12 AM
kiranchandramohan edited the summary of this revision. (Show Details)

Thanks @peixin for going through this in detail. I think we can handle all these in future patches. Handling reduction during generation is a better approach as suggested in Note 3.b and switching to that approach will help handle most of the cases.

This is OK to me.

BTW, Could you file a github issue for temp2 and temp5? These seem to have issues in the semantic stage itself. temp4 hits a Todo in lowering.

Let me know if this is not alrite with you.

Sure, I will file a issue tomorrow in my time zone. I will also attach the gfortran and classic-flang results, which I tested using one script.

You want to catch up the release of llvm 15, right? This patch mostly looks good to me. Several nit comments.

flang/lib/Lower/OpenMP.cpp
803
912

Should this be in line 905 with one else branch with TODO for DefinedOpName? The DefinedOpName will be for declare reduction, if I understand correctly, and these will be refactored outside in the future.

1531

Same as above. Should the TODO also be in this function(genOpenMPReduction)?

kiranchandramohan marked 4 inline comments as done.

Address review comments by @awarzynski and @peixin, add tests provided by @peixin and add as co-author.

This is not my area of expertise, but overall makes sense to me. I really appreciate the detailed summary! IMHO, genOpenMPReduction would really benefit from a bit of refactoring (I've not read it yet, tbh).

A reduction declaration operation that specifies how to initialize, combine and atomically combine private reduction variables.

I'm guessing that in the following, combiner is for both "combine" and "atomically combine" or is combiner for regular "combine" only?

The atomic combiner is a separate one which is optional. Providing these will give a performance improvement but is not necessary for functional correctness. We can generate the atomic combiners in future patches. For an e.g see https://github.com/llvm/llvm-project/blob/8068751189af3099d9abef8953a9639d6798535c/mlir/test/Target/LLVMIR/openmp-reduction.mlir#L17
Note: I have added a comma after combine to make this clear.

Also:

NOTE 1: The patch currently supports only reductions for integer type addition.

Why not call this out in the title (e.g. "[Flang][OpenMP] Add support for integer reductions in work-sharing loops". That's probably the key bit of information here.

Sure. Good point, added to title.

flang/lib/Lower/OpenMP.cpp
912

Yes, it will be refactored. But i have moved it out of the loop as per your suggestion, this will enable us to have separate TODOs for unsupported operations and types.

1517

Many of them are due to front-end style code where we are accessing fields inside a container type. I have removed one based on a suggestion from Peixin, but otherwise have not addressed it.

1531

I am not adding TODOs here since they will never be reached (the corresponding code in creating the reduction declaration would have been hit). But I have converted them to continue in line with @awarzynski's comments.

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

Inheritance is an interesting idea. Another approach would be extend the FirConverter so that it knows how convert all constructs, i.e., add methods ala. genOpenMPReduction or genOpenACCTarget.

Add driver tests in a few Todo tests.

kiranchandramohan edited the summary of this revision. (Show Details)Jul 25 2022, 10:00 AM
awarzynski accepted this revision.Jul 25 2022, 10:41 AM

LGTM, thanks for addressing my comments!

As discussed in the call today, it would be great to merge this in time for LLVM 15 branch, i.e. today or early tomorrow (you may want to update the release notes when merging). I believe that you have addressed all comments and I suggest merging this as is - everything else can be addressed post-commit. @tschuett - I hope that that's fine. Implementing your suggestion would require quite a refactor and Kiran would definitely miss the LLVM 15 deadline. We can always re-visit this later!

This revision is now accepted and ready to land.Jul 25 2022, 10:41 AM

LGTM, thanks for addressing my comments!

As discussed in the call today, it would be great to merge this in time for LLVM 15 branch, i.e. today or early tomorrow (you may want to update the release notes when merging). I believe that you have addressed all comments and I suggest merging this as is - everything else can be addressed post-commit. @tschuett - I hope that that's fine. Implementing your suggestion would require quite a refactor and Kiran would definitely miss the LLVM 15 deadline. We can always re-visit this later!

Thanks @awarzynski. I will update the Readme separately.
@tschuett Thanks for your suggestion. We can discuss refactoring separately in a patch or in discourse.