This is an archive of the discontinued LLVM Phabricator instance.

Differential D130077

[Flang][OpenMP] Initial support for integer reduction in worksharing-loop
ClosedPublic

Authored by kiranchandramohan on Jul 19 2022, 4:57 AM.

Details

Reviewers

sscalpone
ftynse
jdoerfert
peixin
shraiysh
awarzynski
DylanFleming-arm

Commits

rG7bb1151ba21e: [Flang][OpenMP] Initial support for integer reduction in worksharing-loop

Summary

Lower the Flang parse-tree containing OpenMP reductions to the OpenMP
dialect. The OpenMP dialect models reductions with:

(1) A reduction declaration operation that specifies how to initialize, combine, and atomically combine private reduction variables.
(2) An OpenMP operation (like wsloop) that supports reductions. It has an array of reduction accumulator variables (operands) and an array attribute of the same size that points to the reduction declarations to be used for the reduction accumulation.
(3) The OpenMP reduction operation, which takes a value and an accumulator. This operation replaces the original reduction operation in the source.

(1) is implemented by createReductionDecl in OpenMP.cpp, (2) is implemented while creating the OpenMP operation, and (3) is implemented by the genOpenMPReduction function in OpenMP.cpp, which is called from Bridge.cpp. The implementation of (3) is not very robust.

NOTE 1: The patch currently supports only reductions for integer type addition.
NOTE 2: Only supports reduction in the worksharing loop.
NOTE 3: Does not generate atomic combination region.
NOTE 4: Other options for creating the reduction operation include
  a) having the reduction operation as a construct containing an assignment
     and then handling it appropriately in the Bridge.
  b) modifying genAssignment or genFIR(AssignmentStmt) in the Bridge to
     handle OpenMP reductions. So far we have tried not to mix OpenMP
     and non-OpenMP code, and this would break that.
  I will try (b) in a separate patch.
NOTE 5: The OpenMP dialect gained support for reductions with the patches
D105358 and D107343. See https://discourse.llvm.org/t/rfc-openmp-reduction-support/3367
for more details.

Co-authored-by: Peixin-Qiao <qiaopeixin@huawei.com>
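
The three-part model in the summary can be illustrated outside of MLIR with a small, self-contained C++ sketch. Everything here is hypothetical and only mirrors the roles: `ReductionDecl` stands in for the reduction declaration (1), the vector of private accumulators stands in for the reduction operands of the loop operation (2), and the per-iteration `combine` call stands in for the reduction operation (3). The workers are simulated sequentially, so this shows the data flow only, not real parallel execution.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a reduction declaration: how to initialize a
// private accumulator and how to combine two partial values.
struct ReductionDecl {
  std::function<std::int64_t()> init;
  std::function<std::int64_t(std::int64_t, std::int64_t)> combine;
};

// (1) A declaration for integer addition: neutral value 0, combiner `+`.
inline ReductionDecl makeAddReduction() {
  return {[] { return std::int64_t{0}; },
          [](std::int64_t a, std::int64_t b) { return a + b; }};
}

// Simulates a worksharing loop over [lo, hi] with `numWorkers` private
// accumulators. Iterations are dealt out round-robin; no threads are used.
inline std::int64_t runReduction(const ReductionDecl &decl,
                                 std::int64_t shared, std::int64_t lo,
                                 std::int64_t hi, int numWorkers) {
  assert(numWorkers > 0 && "need at least one worker");
  // (2) One private accumulator per worker, initialized by the declaration.
  std::vector<std::int64_t> priv(numWorkers);
  for (auto &p : priv)
    p = decl.init();
  // (3) Each iteration folds its value into the owning worker's accumulator.
  for (std::int64_t i = lo; i <= hi; ++i) {
    std::int64_t &acc = priv[(i - lo) % numWorkers];
    acc = decl.combine(acc, i);
  }
  // Finally, every private accumulator is folded into the original variable.
  for (std::int64_t p : priv)
    shared = decl.combine(shared, p);
  return shared;
}
```

Calling `runReduction(makeAddReduction(), 0, 1, 10, 4)` corresponds to the `do i = 1, 10` loops with `reduction(+:x)` used in the tests discussed in this review.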

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kiranchandramohan created this revision. · Jul 19 2022, 4:57 AM

Herald added a reviewer: sscalpone. · Jul 19 2022, 4:57 AM

Herald added a reviewer: ftynse.

Herald added projects: Restricted Project, Restricted Project.

Herald added subscribers: bzcheeseman, awarzynski, sdasgup3 and 21 others.

kiranchandramohan requested review of this revision. · Jul 19 2022, 4:57 AM

Herald added a reviewer: jdoerfert. · Jul 19 2022, 4:57 AM

Herald added a project: Restricted Project.

Herald added subscribers: sstefan1, stephenneuendorffer, nicolasvasilache, jdoerfert.

Harbormaster completed remote builds in B176228: Diff 445781. · Jul 19 2022, 5:14 AM

Support reduction of different integer types. Add more tests.

kiranchandramohan retitled this revision from [Flang][OpenMP] WIP : Initial support for reduction to [Flang][OpenMP] Initial support for reduction. · Jul 20 2022, 4:39 PM

kiranchandramohan edited the summary of this revision.

kiranchandramohan added reviewers: peixin, shraiysh, awarzynski, DylanFleming-arm.

kiranchandramohan edited the summary of this revision.

kiranchandramohan added inline comments. · Jul 20 2022, 4:43 PM

flang/lib/Lower/OpenMP.cpp
1517	Try the approach in NOTE 3.b
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1316	Move this to a separate patch.
mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
121–126	Add a test.

Harbormaster completed remote builds in B176620: Diff 446297. · Jul 20 2022, 4:53 PM

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

flang/lib/Lower/OpenMP.cpp
1513	I think this should be done as part of this patch itself.

Address review comments, fix a minor issue, add TODO tests to demonstrate
existence of TODOs.

In D130077#3671499, @shraiysh wrote:

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

This is a great point. We should keep this in mind and see whether we can try something like this when we want to use genFIR for assignment statement for reductions.

flang/lib/Lower/OpenMP.cpp
1513	Did you mean handling various operation types or the TODOs? I have added tests to demonstrate that TODOs exist for unhandled cases and that includes the different operations or operations on unsupported types.

Harbormaster completed remote builds in B177239: Diff 447146. · Jul 24 2022, 11:22 AM

This is not my area of expertise, but overall makes sense to me. I really appreciate the detailed summary! IMHO, genOpenMPReduction would really benefit from a bit of refactoring (I've not read it yet, tbh).

A reduction declaration operation that specifies how to initialize, combine and atomically combine private reduction variables.

I'm guessing that in the following, combiner is for both "combine" and "atomically combine":

!CHECK-LABEL: omp.reduction.declare
!CHECK-SAME: @[[RED_I64_NAME:.*]] : i64 init {
!CHECK: ^bb0(%{{.*}}: i64):
!CHECK:  %[[C0_1:.*]] = arith.constant 0 : i64
!CHECK:  omp.yield(%[[C0_1]] : i64)
!CHECK: } combiner {
!CHECK: ^bb0(%[[ARG0:.*]]: i64, %[[ARG1:.*]]: i64):
!CHECK:  %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i64
!CHECK:  omp.yield(%[[RES]] : i64)
!CHECK: }

Or is combiner for regular "combine" only?

Also:

NOTE 1: The patch currently supports only reductions for integer type addition.

Why not call this out in the title (e.g. "[Flang][OpenMP] Add support for integer reductions in work-sharing loops")? That's probably the key bit of information here.

flang/lib/Lower/OpenMP.cpp
1510–1511	[nit] IIUC, the key task for this method is to generate OpenMP reductions (that's what the method name suggests). However, this comment starts with an implementation detail ("Find chain : load reduction var -> reduction_operation -> store reduction var") rather than a generic description ("replace it with the reduction operation").
1513	I think that "one patch per type" is fine. It's much easier to review short patches.
1517	This method could really benefit from some early exits. It's quite tricky to read it ATM. Any chance to refactor a bit?
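
As an illustration of the early-exit style requested above, here is a minimal, self-contained C++ sketch of the "load reduction var -> reduction_operation -> store reduction var" replacement. It does not use the MLIR or Flang APIs: `Op`, `isReductionChainAt`, and `replaceReductionChains` are hypothetical names, and operations are modelled as a flat list of (kind, variable) pairs. Each mismatch returns or continues immediately instead of adding another nesting level.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, flattened model of an operation: a kind plus the variable it
// touches (left empty for the arithmetic operation).
struct Op {
  std::string kind;
  std::string var;
};

inline bool operator==(const Op &a, const Op &b) {
  return a.kind == b.kind && a.var == b.var;
}

// Returns true if ops[i..i+2] is `load var; add; store var`.
// Every failed check exits early.
inline bool isReductionChainAt(const std::vector<Op> &ops, std::size_t i,
                               const std::string &var) {
  if (i + 2 >= ops.size())
    return false;
  if (ops[i].kind != "load" || ops[i].var != var)
    return false;
  if (ops[i + 1].kind != "add")
    return false;
  return ops[i + 2].kind == "store" && ops[i + 2].var == var;
}

// Replaces each matching chain on `reductionVar` with one `reduction` op and
// copies every other operation through unchanged.
inline std::vector<Op> replaceReductionChains(const std::vector<Op> &ops,
                                              const std::string &reductionVar) {
  std::vector<Op> out;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (!isReductionChainAt(ops, i, reductionVar)) {
      out.push_back(ops[i]);
      continue; // early exit: nothing to rewrite at this position
    }
    out.push_back(Op{"reduction", reductionVar});
    i += 2; // skip the `add` and `store` that were folded into the new op
  }
  return out;
}
```

With this shape, supporting another operator would mean touching only the matcher, not the loop that drives the rewrite.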

shraiysh added inline comments. · Jul 25 2022, 5:00 AM

flang/lib/Lower/OpenMP.cpp
1513	I meant adding tests for TODOs itself, thanks! Handling everything will make this a huge patch. Separate patches for unhandled cases would be better!

When testing various integer scalars, I found several test cases that produce no semantic errors, and I am not sure whether we should support them for now:

$ cat temp2.f90 
program main
  integer :: x(10)! = 0
  integer :: i = 1

!  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x(1))
  do i = 1, 10
    x(1) = x(1) + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp4.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end
$ cat temp5.f90 
program main
  type t
    integer :: x
  end type
  integer :: i = 1
  type(t) :: mt

  mt%x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:mt%x)
  do i = 1, 10
    mt%x = mt%x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

flang/lib/Lower/OpenMP.cpp

1517

This does not work for the following two cases:

program main
  integer, allocatable :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

program main
  integer, pointer :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

awarzynski added inline comments. · Jul 25 2022, 7:01 AM

flang/test/Lower/OpenMP/wsloop-reduction-int.f90
56	Not needed
88	Not needed

In D130077#3675834, @peixin wrote:

When I test various integer scalar, I found several test cases without semantic errors and I am not sure if we should support them for now:

[...]

Thanks @peixin for going through this in detail. I think we can handle all these in future patches. Handling reduction during generation is a better approach as suggested in Note 3.b and switching to that approach will help handle most of the cases.

BTW, could you file a GitHub issue for temp2 and temp5? These seem to have issues in the semantic stage itself. temp4 hits a Todo in lowering.

Let me know if this is not alright with you.

flang/lib/Lower/OpenMP.cpp
1517	These two (allocate and pointer tests) hit the hard Todo and we can handle them in subsequent patches. I will add these tests (as Todo tests) to the patch and add you as co-author.

kiranchandramohan retitled this revision from [Flang][OpenMP] Initial support for reduction to [Flang][OpenMP] Initial support for integer reduction in worksharing-loop. · Jul 25 2022, 8:12 AM

kiranchandramohan edited the summary of this revision.

Thanks @peixin for going through this in detail. I think we can handle all these in future patches. Handling reduction during generation is a better approach as suggested in Note 3.b and switching to that approach will help handle most of the cases.

This is OK to me.

BTW, could you file a GitHub issue for temp2 and temp5? These seem to have issues in the semantic stage itself. temp4 hits a Todo in lowering.

Let me know if this is not alright with you.

Sure, I will file an issue tomorrow in my time zone. I will also attach the gfortran and classic-flang results, which I tested using one script.

You want to catch the LLVM 15 release, right? This patch mostly looks good to me. A few nit comments.

flang/lib/Lower/OpenMP.cpp
803
912	Should this be in line 905 with one else branch with TODO for DefinedOpName? The DefinedOpName will be for declare reduction, if I understand correctly, and these will be refactored outside in the future.
1531	Same as above. Should the TODO also be in this function(genOpenMPReduction)?

Address review comments by @awarzynski and @peixin, add tests provided by @peixin and add as co-author.

In D130077#3675635, @awarzynski wrote:

This is not my area of expertise, but overall makes sense to me. I really appreciate the detailed summary! IMHO, genOpenMPReduction would really benefit from a bit of refactoring (I've not read it yet, tbh).

A reduction declaration operation that specifies how to initialize, combine and atomically combine private reduction variables.

I'm guessing that in the following, combiner is for both "combine" and "atomically combine" or is combiner for regular "combine" only?

The atomic combiner is a separate, optional region. Providing it gives a performance improvement but is not necessary for functional correctness. We can generate the atomic combiners in future patches. For an example, see https://github.com/llvm/llvm-project/blob/8068751189af3099d9abef8953a9639d6798535c/mlir/test/Target/LLVMIR/openmp-reduction.mlir#L17
Note: I have added a comma after combine to make this clear.
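
To make the distinction between the plain combiner and an atomic combiner concrete, here is a small C++ sketch of two ways partial results could be merged into a shared variable. This is an analogy only, not what the OpenMP runtime or the dialect lowering does: `combineUnderLock`, `combineAtomically`, and `checkedTotal` are hypothetical names, and the partial values are merged sequentially so the example stays deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Plain combiner: `shared = shared + p` is not atomic by itself, so each
// merge is protected by a lock.
inline std::int64_t combineUnderLock(const std::vector<std::int64_t> &partials) {
  std::mutex m;
  std::int64_t shared = 0;
  for (std::int64_t p : partials) {
    std::lock_guard<std::mutex> guard(m);
    shared = shared + p;
  }
  return shared;
}

// Atomic combiner: the merge is a single atomic read-modify-write, so no
// lock is needed. This is the optional fast path.
inline std::int64_t combineAtomically(const std::vector<std::int64_t> &partials) {
  std::atomic<std::int64_t> shared{0};
  for (std::int64_t p : partials)
    shared.fetch_add(p, std::memory_order_relaxed);
  return shared.load();
}

// Both strategies must agree; only their cost differs.
inline std::int64_t checkedTotal(const std::vector<std::int64_t> &partials) {
  const std::int64_t locked = combineUnderLock(partials);
  assert(locked == combineAtomically(partials));
  return locked;
}
```

The point of the sketch is that dropping the atomic variant changes how the merge is synchronized, not the result, which matches the statement that it is not needed for functional correctness.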

Also:

NOTE 1: The patch currently supports only reductions for integer type addition.

Why not call this out in the title (e.g. "[Flang][OpenMP] Add support for integer reductions in work-sharing loops")? That's probably the key bit of information here.

Sure. Good point, added to title.

flang/lib/Lower/OpenMP.cpp
912	Yes, it will be refactored. But I have moved it out of the loop as per your suggestion; this will enable us to have separate TODOs for unsupported operations and types.
1517	Many of them are due to front-end style code where we are accessing fields inside a container type. I have removed one based on a suggestion from Peixin, but otherwise have not addressed it.
1531	I am not adding TODOs here since they will never be reached (the corresponding code in creating the reduction declaration would have been hit). But I have converted them to continue in line with @awarzynski's comments.
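
The idea of separate TODOs for unsupported operations and unsupported types can be sketched in plain C++ as follows. This is a hypothetical helper, not the patch's code: `ReductionOp` and `reductionDeclName` are invented for illustration, and `std::logic_error` stands in for Flang's `TODO` macro. The resulting name follows the `add_reduction_i_<bitwidth>` scheme that getReductionName in this patch produces for integer addition.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical subset of reduction operators.
enum class ReductionOp { Add, Multiply };

// Builds the symbol name of a reduction declaration, or reports which part
// of the request is unsupported. The operator is checked once, up front;
// the type is checked separately, so the two diagnostics stay distinct.
inline std::string reductionDeclName(ReductionOp op, bool isInteger,
                                     unsigned bitWidth) {
  if (op != ReductionOp::Add)
    throw std::logic_error("not yet implemented: reduction operator");
  if (!isInteger)
    throw std::logic_error("not yet implemented: reduction type");
  return "add_reduction_i_" + std::to_string(bitWidth);
}
```

For example, `reductionDeclName(ReductionOp::Add, true, 64)` yields `add_reduction_i_64`, while a multiply reduction or a non-integer type raises its own error.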

In D130077#3671499, @shraiysh wrote:

Just a thought - for the long run, would it make sense to have OpenMPConverter class inheriting from FirConverter and overriding some functions- like genFIR for OpenMP constructs and genFIR for assignment statement? IMO this would make the file more structured, while separating OpenMP and Fortran code. Right now, OpenMP.cpp is just a bunch of functions.

Inheritance is an interesting idea. Another approach would be to extend the FirConverter so that it knows how to convert all constructs, i.e., add methods à la genOpenMPReduction or genOpenACCTarget.
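
A rough C++ sketch of the inheritance idea discussed in this thread is given below. The class names `ConverterSketch` and `OpenMPConverterSketch` and the string "operations" they return are purely hypothetical; the real FirConverter has a very different interface. The sketch only shows the shape: the base class lowers an assignment the ordinary way, and an OpenMP-aware subclass overrides that single hook for variables named in a reduction clause.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-in for the Fortran converter.
class ConverterSketch {
public:
  virtual ~ConverterSketch() = default;
  // Default lowering of an assignment to `var`.
  virtual std::string genAssignment(const std::string &var) const {
    return "fir.store " + var;
  }
};

// Hypothetical OpenMP-aware converter: only assignments to the reduction
// variable are lowered differently; everything else defers to the base.
class OpenMPConverterSketch : public ConverterSketch {
public:
  explicit OpenMPConverterSketch(std::string reductionVar)
      : reductionVar(std::move(reductionVar)) {}

  std::string genAssignment(const std::string &var) const override {
    if (var != reductionVar)
      return ConverterSketch::genAssignment(var);
    return "omp.reduction " + var;
  }

private:
  std::string reductionVar;
};
```

The alternative mentioned above, adding OpenMP-specific methods directly to the converter, would avoid the virtual dispatch but put OpenMP and non-OpenMP lowering in the same class.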

Add driver tests in a few Todo tests.

kiranchandramohan edited the summary of this revision. · Jul 25 2022, 10:00 AM

LGTM, thanks for addressing my comments!

As discussed in the call today, it would be great to merge this in time for LLVM 15 branch, i.e. today or early tomorrow (you may want to update the release notes when merging). I believe that you have addressed all comments and I suggest merging this as is - everything else can be addressed post-commit. @tschuett - I hope that that's fine. Implementing your suggestion would require quite a refactor and Kiran would definitely miss the LLVM 15 deadline. We can always re-visit this later!

This revision is now accepted and ready to land. · Jul 25 2022, 10:41 AM

Harbormaster completed remote builds in B177417: Diff 447389. · Jul 25 2022, 11:15 AM

Closed by commit rG7bb1151ba21e: [Flang][OpenMP] Initial support for integer reduction in worksharing-loop (authored by kiranchandramohan). · Jul 25 2022, 11:47 AM

This revision was automatically updated to reflect the committed changes.

kiranchandramohan added a commit: rG7bb1151ba21e: [Flang][OpenMP] Initial support for integer reduction in worksharing-loop.

In D130077#3676863, @awarzynski wrote:

LGTM, thanks for addressing my comments!

As discussed in the call today, it would be great to merge this in time for LLVM 15 branch, i.e. today or early tomorrow (you may want to update the release notes when merging). I believe that you have addressed all comments and I suggest merging this as is - everything else can be addressed post-commit. @tschuett - I hope that that's fine. Implementing your suggestion would require quite a refactor and Kiran would definitely miss the LLVM 15 deadline. We can always re-visit this later!

Thanks @awarzynski. I will update the Readme separately.
@tschuett Thanks for your suggestion. We can discuss refactoring separately in a patch or in discourse.

DylanFleming-arm mentioned this in D130767: [Flang][OpenMP] Add support for integer multiplication reduction in worksharing-loop. · Jul 29 2022, 5:53 AM

DylanFleming-arm mentioned this in rG9893b26dfa75: [Flang][OpenMP] Add support for integer multiplication reduction in worksharing…. · Aug 9 2022, 12:32 PM

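Revision Contents

Path                                                               Size
flang/include/flang/Lower/OpenMP.h                                 2 lines
flang/lib/Lower/Bridge.cpp                                         9 lines
flang/lib/Lower/OpenMP.cpp                                         177 lines
flang/test/Lower/OpenMP/Todo/parallel-reduction.f90                11 lines
flang/test/Lower/OpenMP/Todo/reduction-allocatable.f90             21 lines
flang/test/Lower/OpenMP/Todo/reduction-and.f90                     15 lines
flang/test/Lower/OpenMP/Todo/reduction-arrays.f90                  15 lines
flang/test/Lower/OpenMP/Todo/reduction-derived-type-field.f90      21 lines
flang/test/Lower/OpenMP/Todo/reduction-eqv.f90                     15 lines
flang/test/Lower/OpenMP/Todo/reduction-iand.f90                    16 lines
flang/test/Lower/OpenMP/Todo/reduction-ieor.f90                    16 lines
flang/test/Lower/OpenMP/Todo/reduction-ior.f90                     16 lines
flang/test/Lower/OpenMP/Todo/reduction-max.f90                     16 lines
flang/test/Lower/OpenMP/Todo/reduction-min.f90                     16 lines
flang/test/Lower/OpenMP/Todo/reduction-multiply.f90                15 lines
flang/test/Lower/OpenMP/Todo/reduction-neqv.f90                    15 lines
flang/test/Lower/OpenMP/Todo/reduction-or.f90                      15 lines
flang/test/Lower/OpenMP/Todo/reduction-real.f90                    16 lines
flang/test/Lower/OpenMP/Todo/reduction-subtract.f90                15 lines
flang/test/Lower/OpenMP/wsloop-reduction-int.f90                   144 lines
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td                      3 lines
mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp                  26 lines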

Diff 447423

flang/include/flang/Lower/OpenMP.h

[...]
  } // namespace pft

  void genOpenMPConstruct(AbstractConverter &, pft::Evaluation &,
                          const parser::OpenMPConstruct &);
  void genOpenMPDeclarativeConstruct(AbstractConverter &, pft::Evaluation &,
                                     const parser::OpenMPDeclarativeConstruct &);
  int64_t getCollapseValue(const Fortran::parser::OmpClauseList &clauseList);
  void genThreadprivateOp(AbstractConverter &, const pft::Variable &);
+ void genOpenMPReduction(AbstractConverter &,
+                         const Fortran::parser::OmpClauseList &clauseList);

  } // namespace lower
  } // namespace Fortran

  #endif // FORTRAN_LOWER_OPENMP_H

flang/lib/Lower/Bridge.cpp

[...]
    void genFIR(const Fortran::parser::OpenMPConstruct &omp) {
[...]
      // If loop is part of an OpenMP Construct then the OpenMP dialect
      // workshare loop operation has already been created. Only the
      // body needs to be created here and the do_loop can be skipped.
      // Skip the number of collapsed loops, which is 1 when there is a
      // no collapse requested.

      Fortran::lower::pft::Evaluation *curEval = &getEval();
+     const Fortran::parser::OmpClauseList *loopOpClauseList = nullptr;
      if (ompLoop) {
-       const auto &wsLoopOpClauseList = std::get<Fortran::parser::OmpClauseList>(
+       loopOpClauseList = &std::get<Fortran::parser::OmpClauseList>(
            std::get<Fortran::parser::OmpBeginLoopDirective>(ompLoop->t).t);
        int64_t collapseValue =
-           Fortran::lower::getCollapseValue(wsLoopOpClauseList);
+           Fortran::lower::getCollapseValue(*loopOpClauseList);

        curEval = &curEval->getFirstNestedEvaluation();
        for (int64_t i = 1; i < collapseValue; i++) {
          curEval = &*std::next(curEval->getNestedEvaluations().begin());
        }
      }

      for (Fortran::lower::pft::Evaluation &e : curEval->getNestedEvaluations())
        genFIR(e);

+     if (ompLoop)
+       genOpenMPReduction(*this, *loopOpClauseList);
+
      localSymbols.popScope();
      builder->restoreInsertionPoint(insertPt);
    }

    void genFIR(const Fortran::parser::OpenMPDeclarativeConstruct &ompDecl) {
      mlir::OpBuilder::InsertPoint insertPt = builder->saveInsertionPoint();
      genOpenMPDeclarativeConstruct(*this, getEval(), ompDecl);
      for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations())
[...]

flang/lib/Lower/OpenMP.cpp

[...]
    auto taskOp = firOpBuilder.create<mlir::omp::TaskOp>(
        [...]
        /*in_reductions=*/nullptr, priorityClauseOperand, allocateOperands,
        allocatorOperands);
    createBodyOfOp(taskOp, converter, currentLocation, eval, &opClauseList);
  } else {
    TODO(converter.getCurrentLocation(), "Unhandled block directive");
  }
}

/// Creates an OpenMP reduction declaration and inserts it into the provided
/// symbol table. The declaration has a constant initializer with the neutral
/// value `initValue`, and the reduction combiner carried over from `reduce`.
/// TODO: Generalize this for non-integer types, add atomic region.
static omp::ReductionDeclareOp createReductionDecl(fir::FirOpBuilder &builder,
                                                   llvm::StringRef name,
                                                   mlir::Type type,
                                                   mlir::Location loc) {
  OpBuilder::InsertionGuard guard(builder);
  mlir::ModuleOp module = builder.getModule();
  mlir::OpBuilder modBuilder(module.getBodyRegion());
  auto decl = module.lookupSymbol<mlir::omp::ReductionDeclareOp>(name);
  if (!decl)
    decl = modBuilder.create<omp::ReductionDeclareOp>(loc, name, type);
  else
    return decl;

  builder.createBlock(&decl.initializerRegion(), decl.initializerRegion().end(),
                      {type}, {loc});
  builder.setInsertionPointToEnd(&decl.initializerRegion().back());
  Value init = builder.create<mlir::arith::ConstantOp>(
      loc, type, builder.getIntegerAttr(type, 0));
  builder.create<omp::YieldOp>(loc, init);

  builder.createBlock(&decl.reductionRegion(), decl.reductionRegion().end(),
                      {type, type}, {loc, loc});
  builder.setInsertionPointToEnd(&decl.reductionRegion().back());
  mlir::Value op1 = decl.reductionRegion().front().getArgument(0);
  mlir::Value op2 = decl.reductionRegion().front().getArgument(1);
  Value addRes = builder.create<mlir::arith::AddIOp>(loc, op1, op2);
  builder.create<omp::YieldOp>(loc, addRes);
  return decl;
}

static mlir::omp::ScheduleModifier
translateModifier(const Fortran::parser::OmpScheduleModifierType &m) {
  switch (m.v) {
  case Fortran::parser::OmpScheduleModifierType::ModType::Monotonic:
    return mlir::omp::ScheduleModifier::monotonic;
  case Fortran::parser::OmpScheduleModifierType::ModType::Nonmonotonic:
    return mlir::omp::ScheduleModifier::nonmonotonic;
  case Fortran::parser::OmpScheduleModifierType::ModType::Simd:
[...]
    const auto &modType2 = std::get<
        [...]
        modifier->t);
    if (modType2 && modType2->v.v ==
                        Fortran::parser::OmpScheduleModifierType::ModType::Simd)
      return mlir::omp::ScheduleModifier::simd;
  }
  return mlir::omp::ScheduleModifier::none;
}

static std::string getReductionName(
    Fortran::parser::DefinedOperator::IntrinsicOperator intrinsicOp,
    mlir::Type ty) {
  std::string reductionName;
  if (intrinsicOp == Fortran::parser::DefinedOperator::IntrinsicOperator::Add)

      [Inline comment by peixin, marked Done. Suggested edit:
         std::string reductionName;
       - if (intrinsicOp == Fortran::parser::DefinedOperator::IntrinsicOperator::Add) {
       + if (intrinsicOp == Fortran::parser::DefinedOperator::IntrinsicOperator::Add)
         reductionName = "add_reduction";
      ]

    reductionName = "add_reduction";
  else
    reductionName = "other_reduction";

  return (llvm::Twine(reductionName) +
          (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
          llvm::Twine(ty.getIntOrFloatBitWidth()))
      .str();
}

static void genOMP(Fortran::lower::AbstractConverter &converter,
                   Fortran::lower::pft::Evaluation &eval,
                   const Fortran::parser::OpenMPLoopConstruct &loopConstruct) {

  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
  mlir::Location currentLocation = converter.getCurrentLocation();
  llvm::SmallVector<mlir::Value> lowerBound, upperBound, step, linearVars,
      linearStepVars, reductionVars;
  mlir::Value scheduleChunkClauseOperand, ifClauseOperand;
  mlir::Attribute scheduleClauseOperand, noWaitClauseOperand,
      orderedClauseOperand, orderClauseOperand;
  SmallVector<Attribute> reductionDeclSymbols;
  Fortran::lower::StatementContext stmtCtx;
  const auto &loopOpClauseList = std::get<Fortran::parser::OmpClauseList>(
      std::get<Fortran::parser::OmpBeginLoopDirective>(loopConstruct.t).t);

  const auto ompDirective =
      std::get<Fortran::parser::OmpLoopDirective>(
          std::get<Fortran::parser::OmpBeginLoopDirective>(loopConstruct.t).t)
          .v;

[...]
    if (const auto &scheduleClause =
    [...]
        if (const auto *expr = Fortran::semantics::GetExpr(*chunkExpr)) {
          scheduleChunkClauseOperand =
              fir::getBase(converter.genExprValue(*expr, stmtCtx));
        }
      }
    } else if (const auto &ifClause =
                   std::get_if<Fortran::parser::OmpClause::If>(&clause.u)) {
      ifClauseOperand = getIfClauseOperand(converter, stmtCtx, ifClause);
    } else if (const auto &reductionClause =
                   std::get_if<Fortran::parser::OmpClause::Reduction>(
                       &clause.u)) {
      omp::ReductionDeclareOp decl;
      const auto &redOperator{std::get<Fortran::parser::OmpReductionOperator>(
          reductionClause->v.t)};
      const auto &objectList{
          std::get<Fortran::parser::OmpObjectList>(reductionClause->v.t)};
      if (const auto &redDefinedOp =
              std::get_if<Fortran::parser::DefinedOperator>(&redOperator.u)) {
        const auto &intrinsicOp{
            std::get<Fortran::parser::DefinedOperator::IntrinsicOperator>(
                redDefinedOp->u)};
        if (intrinsicOp !=
            Fortran::parser::DefinedOperator::IntrinsicOperator::Add)
          TODO(currentLocation,
               "Reduction of some intrinsic operators is not supported");
        for (const auto &ompObject : objectList.v) {
          if (const auto *name{
                  Fortran::parser::Unwrap<Fortran::parser::Name>(ompObject)}) {
            if (const auto *symbol{name->symbol}) {
              mlir::Value symVal = converter.getSymbolAddress(*symbol);
              mlir::Type redType =
                  symVal.getType().cast<fir::ReferenceType>().getEleTy();
              reductionVars.push_back(symVal);
              if (redType.isIntOrIndex()) {
                decl = createReductionDecl(
                    firOpBuilder, getReductionName(intrinsicOp, redType),
                    redType, currentLocation);
              } else {
                TODO(currentLocation,
                     "Reduction of some types is not supported");
              }
              reductionDeclSymbols.push_back(SymbolRefAttr::get(
                  firOpBuilder.getContext(), decl.sym_name()));
            }
          }
        }
      } else {
        TODO(currentLocation,
             "Reduction of intrinsic procedures is not supported");
      }
    }
  }

  // The types of lower bound, upper bound, and step are converted into the
  // type of the loop variable if necessary.
  mlir::Type loopVarType = getLoopVarType(converter, loopVarTypeSize);
  for (unsigned it = 0; it < (unsigned)lowerBound.size(); it++) {
    lowerBound[it] = firOpBuilder.createConvert(currentLocation, loopVarType,
[...]
  if (llvm::omp::OMPD_simd == ompDirective) {
    [...]
    return;
  }

  // FIXME: Add support for following clauses:
  // 1. linear
  // 2. order
  auto wsLoopOp = firOpBuilder.create<mlir::omp::WsLoopOp>(
      currentLocation, lowerBound, upperBound, step, linearVars, linearStepVars,
-     reductionVars, /*reductions=*/nullptr,
+     reductionVars,
+     reductionDeclSymbols.empty()
+         ? nullptr
+         : mlir::ArrayAttr::get(firOpBuilder.getContext(),
+                                reductionDeclSymbols),
      scheduleClauseOperand.dyn_cast_or_null<omp::ClauseScheduleKindAttr>(),
      scheduleChunkClauseOperand, /*schedule_modifiers=*/nullptr,
      /*simd_modifier=*/nullptr,
      noWaitClauseOperand.dyn_cast_or_null<UnitAttr>(),
      orderedClauseOperand.dyn_cast_or_null<IntegerAttr>(),
      orderClauseOperand.dyn_cast_or_null<omp::ClauseOrderKindAttr>(),
      /*inclusive=*/firOpBuilder.getUnitAttr());

[...]
  std::visit(
      [...]
          [&](const Fortran::parser::OpenMPThreadprivate &threadprivate) {
            // The directive is lowered when instantiating the variable to
            // support the case of threadprivate variable declared in module.
          },
      },
      ompDeclConstruct.u);
}

// Generate an OpenMP reduction operation. This implementation finds the chain :

// load reduction var -> reduction_operation -> store reduction var and replaces

awarzynskiUnsubmitted

Done

[nit] IIUC, the key task for this method is to generate OpenMP reductions (that's what the method suggests). However, this comment starts with an implementation detail ("Find chain : load reduction var -> reduction_operation -> store reduction var") rather than a generic description ("eplace it with the reduction operation").

awarzynski: [nit] IIUC, the key task for this method is to generate OpenMP reductions (that's what the…

// it with the reduction operation.

// TODO: Currently assumes it is an integer addition reduction. Generalize this

shraiyshUnsubmitted

Not Done

i think this should be done as a part of this patch itself.

shraiysh: i think this should be done as a part of this patch itself.

kiranchandramohanAuthorUnsubmitted

Done

Did you mean handling various operation types or the TODOs?
I have added tests to demonstrate that TODOs exist for unhandled cases and that includes the different operations or operations on unsupported types.

kiranchandramohan: Did you mean handling various operation types or the TODOs? I have added tests to demonstrate…

shraiyshUnsubmitted

Not Done

I meant adding tests for TODOs itself, thanks! Handling everything will make this a huge patch. Separate patches for unhandled cases would be better!

shraiysh: I meant adding tests for TODOs itself, thanks! Handling everything will make this a huge patch.

awarzynskiUnsubmitted

Not Done

I think that "one patch per type" is fine. It's much easier to review short patches.

awarzynski: I think that "one patch per type" is fine. It's much easier to review short patches.

// for various reduction operation types.

// TODO: Generate the reduction operation during lowering instead of creating

// and removing operations since this is not a robust approach. Also, removing

// ops in the builder (instead of a rewriter) is probably not the best approach.

kiranchandramohan: Try the approach in NOTE 3.b

awarzynski: This method could really benefit from some early exits (https://llvm.org/docs/CodingStandards.html#use…). It's quite tricky to read ATM. Any chance of refactoring it a bit?

kiranchandramohan: Many of them are due to front-end style code where we are accessing fields inside a container type. I have removed one based on a suggestion from Peixin, but otherwise have not addressed it.
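To illustrate the early-exit point raised in this thread, here is a minimal, self-contained C++ sketch. `Item`, `collectNested`, and `collectEarlyExit` are hypothetical stand-ins, not Flang or MLIR APIs; the sketch only shows how inverting a check and using `continue` flattens the nesting without changing behavior.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a reduction clause object (not a Flang type).
struct Item {
  std::optional<std::string> name; // empty when the object is not a plain name
  bool isInteger = false;          // stands in for a check like isIntOrIndex()
};

// Nested style: every additional check adds one level of indentation.
inline std::vector<std::string> collectNested(const std::vector<Item> &items) {
  std::vector<std::string> out;
  for (const Item &item : items) {
    if (item.name) {
      if (item.isInteger) {
        out.push_back(*item.name);
      }
    }
  }
  return out;
}

// Early-exit style: unsupported items are skipped up front with `continue`,
// so the interesting work stays at a single indentation level.
inline std::vector<std::string>
collectEarlyExit(const std::vector<Item> &items) {
  std::vector<std::string> out;
  for (const Item &item : items) {
    if (!item.name)
      continue;
    if (!item.isInteger)
      continue;
    out.push_back(*item.name);
  }
  return out;
}
```

Both functions return the same list for any input; only the control-flow shape differs.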

peixin: This does not work for the following two cases:

program main
  integer, allocatable :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

program main
  integer, pointer :: x! = 0
  integer :: i = 1

  allocate(x)
  x = 0

  !$omp parallel num_threads(4)
  !$omp do reduction(+:x)
  do i = 1, 10
    x = x + i
  enddo
  !$omp end do
  !$omp end parallel

  print *, x
end

kiranchandramohan: These two (allocatable and pointer tests) hit the hard TODO, and we can handle them in subsequent patches. I will add these tests (as TODO tests) to the patch and add you as co-author.

void Fortran::lower::genOpenMPReduction(
    Fortran::lower::AbstractConverter &converter,
    const Fortran::parser::OmpClauseList &clauseList) {
  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
  for (const auto &clause : clauseList.v) {
    if (const auto &reductionClause =
            std::get_if<Fortran::parser::OmpClause::Reduction>(&clause.u)) {
      const auto &redOperator{std::get<Fortran::parser::OmpReductionOperator>(
          reductionClause->v.t)};
      const auto &objectList{
          std::get<Fortran::parser::OmpObjectList>(reductionClause->v.t)};
      if (auto reductionOp =
              std::get_if<Fortran::parser::DefinedOperator>(&redOperator.u)) {

peixin: Same as above. Should the TODO also be in this function (genOpenMPReduction)?

kiranchandramohan: I am not adding TODOs here since they will never be reached (the corresponding code that creates the reduction declaration would have been hit first). But I have converted them to continue, in line with @awarzynski's comments.

        const auto &intrinsicOp{
            std::get<Fortran::parser::DefinedOperator::IntrinsicOperator>(
                reductionOp->u)};
        if (intrinsicOp !=
            Fortran::parser::DefinedOperator::IntrinsicOperator::Add)
          continue;
        for (const auto &ompObject : objectList.v) {
          if (const auto *name{
                  Fortran::parser::Unwrap<Fortran::parser::Name>(ompObject)}) {
            if (const auto *symbol{name->symbol}) {
              mlir::Value symVal = converter.getSymbolAddress(*symbol);
              mlir::Type redType =
                  symVal.getType().cast<fir::ReferenceType>().getEleTy();
              if (!redType.isIntOrIndex())
                continue;
              for (mlir::OpOperand &use1 : symVal.getUses()) {
                if (auto load = mlir::dyn_cast<fir::LoadOp>(use1.getOwner())) {
                  mlir::Value loadVal = load.getRes();
                  for (mlir::OpOperand &use2 : loadVal.getUses()) {
                    if (auto add = mlir::dyn_cast<mlir::arith::AddIOp>(
                            use2.getOwner())) {
                      mlir::Value addRes = add.getResult();
                      for (mlir::OpOperand &use3 : addRes.getUses()) {
                        if (auto store =
                                mlir::dyn_cast<fir::StoreOp>(use3.getOwner())) {
                          if (store.getMemref() == symVal) {
                            // Chain found! Now replace load->reduction->store
                            // with the OpenMP reduction operation.
                            mlir::OpBuilder::InsertPoint insertPtDel =
                                firOpBuilder.saveInsertionPoint();
                            firOpBuilder.setInsertionPoint(add);
                            if (add.getLhs() == loadVal) {
                              firOpBuilder.create<mlir::omp::ReductionOp>(
                                  add.getLoc(), add.getRhs(), symVal);
                            } else {
                              firOpBuilder.create<mlir::omp::ReductionOp>(
                                  add.getLoc(), add.getLhs(), symVal);
                            }
                            store.erase();
                            add.erase();
                            load.erase();
                            firOpBuilder.restoreInsertionPoint(insertPtDel);
                          }
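The chain rewrite above (load of the reduction variable, integer add, store back to the same variable, replaced by one reduction operation) can be sketched on a toy operation list. This is only an illustration under assumptions: `Op`, `Kind`, and `rewriteReduction` are hypothetical and do not correspond to FIR or MLIR classes, and, like the code in the patch, the sketch does not check whether the erased load or add has other users.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Load, Add, Store, Reduce };

// Toy SSA-like operation (hypothetical; not an MLIR or FIR class).
struct Op {
  Kind kind;
  int result = -1; // value id defined by Load/Add (ids are >= 0), else -1
  int lhs = -1;    // Add: left operand; Store/Reduce: the value operand
  int rhs = -1;    // Add: right operand
  std::string var; // Load/Store/Reduce: the variable being accessed
};

// Replaces every chain  load redVar -> add -> store redVar  with a single
// Reduce op that carries the other add operand and the reduction variable.
inline std::vector<Op> rewriteReduction(const std::vector<Op> &ops,
                                        const std::string &redVar) {
  std::vector<bool> erased(ops.size(), false);
  std::vector<int> reduceValue(ops.size(), -1); // indexed by the add's position
  for (std::size_t l = 0; l < ops.size(); ++l) {
    if (ops[l].kind != Kind::Load || ops[l].var != redVar)
      continue;
    for (std::size_t a = 0; a < ops.size(); ++a) {
      const Op &add = ops[a];
      if (add.kind != Kind::Add || erased[a])
        continue;
      if (add.lhs != ops[l].result && add.rhs != ops[l].result)
        continue;
      for (std::size_t s = 0; s < ops.size(); ++s) {
        const Op &store = ops[s];
        if (store.kind != Kind::Store || erased[s] || store.var != redVar ||
            store.lhs != add.result)
          continue;
        // Chain found: keep whichever add operand is not the loaded value.
        reduceValue[a] = (add.lhs == ops[l].result) ? add.rhs : add.lhs;
        erased[l] = erased[a] = erased[s] = true;
      }
    }
  }
  std::vector<Op> out;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (reduceValue[i] != -1) {
      // The reduction is emitted at the position of the original add.
      out.push_back(Op{Kind::Reduce, -1, reduceValue[i], -1, redVar});
      continue;
    }
    if (!erased[i])
      out.push_back(ops[i]);
  }
  return out;
}
```

For `x = x + i`, modeled as a load of `i`, a load of `x`, an add, and a store to `x`, the sketch leaves the load of `i` followed by a single reduce of that value into `x`. Swapping the add operands (`x = i + x`) gives the same result.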

flang/test/Lower/OpenMP/Todo/parallel-reduction.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: OpenMP Block construct clauses
				subroutine reduction_parallel
				integer :: x
				!$omp parallel reduction(+:x)
				x = x + i
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-allocatable.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some types is not supported
				subroutine reduction_allocatable
				integer, allocatable :: x
				integer :: i = 1

				allocate(x)
				x = 0

				!$omp parallel num_threads(4)
				!$omp do reduction(+:x)
				do i = 1, 10
				x = x + i
				enddo
				!$omp end do
				!$omp end parallel

				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-and.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_and(y)
				logical :: x, y(100)
				!$omp parallel
				!$omp do reduction(.and.:x)
				do i=1, 100
				x = x .and. y(i)
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-arrays.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some types is not supported
				subroutine reduction_array(y)
				integer :: x(100), y(100,100)
				!$omp parallel
				!$omp do reduction(+:x)
				do i=1, 100
				x = x + y(:,i)
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-derived-type-field.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some types is not supported
				subroutine reduction_allocatable
				type t
				integer :: x
				end type
				integer :: i = 1
				type(t) :: mt

				mt%x = 0

				!$omp parallel num_threads(4)
				!$omp do reduction(+:mt)
				do i = 1, 10
				mt%x = mt%x + i
				enddo
				!$omp end do
				!$omp end parallel
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-eqv.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_eqv(y)
				logical :: x, y(100)
				!$omp parallel
				!$omp do reduction(.eqv.:x)
				do i=1, 100
				x = x .eqv. y(i)
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-iand.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of intrinsic procedures is not supported
				subroutine reduction_iand(y)
				integer :: x, y(:)
				x = 0
				!$omp parallel
				!$omp do reduction(iand:x)
				do i=1, 100
				x = iand(x, y(i))
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-ieor.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of intrinsic procedures is not supported
				subroutine reduction_ieor(y)
				integer :: x, y(:)
				x = 0
				!$omp parallel
				!$omp do reduction(ieor:x)
				do i=1, 100
				x = ieor(x, y(i))
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-ior.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of intrinsic procedures is not supported
				subroutine reduction_ior(y)
				integer :: x, y(:)
				x = 0
				!$omp parallel
				!$omp do reduction(ior:x)
				do i=1, 100
				x = ior(x, y(i))
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-max.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of intrinsic procedures is not supported
				subroutine reduction_max(y)
				integer :: x, y(:)
				x = 0
				!$omp parallel
				!$omp do reduction(max:x)
				do i=1, 100
				x = max(x, y(i))
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-min.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of intrinsic procedures is not supported
				subroutine reduction_min(y)
				integer :: x, y(:)
				x = 0
				!$omp parallel
				!$omp do reduction(min:x)
				do i=1, 100
				x = min(x, y(i))
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-multiply.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_multiply
				integer :: x
				!$omp parallel
				!$omp do reduction(*:x)
				do i=1, 100
				x = x * i
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-neqv.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_neqv(y)
				logical :: x, y(100)
				!$omp parallel
				!$omp do reduction(.neqv.:x)
				do i=1, 100
				x = x .neqv. y(i)
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-or.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_or(y)
				logical :: x, y(100)
				!$omp parallel
				!$omp do reduction(.or.:x)
				do i=1, 100
				x = x .or. y(i)
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-real.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some types is not supported
				subroutine reduction_real
				real :: x
				x = 0.0
				!$omp parallel
				!$omp do reduction(+:x)
				do i=1, 100
				x = x + 1.0
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/Todo/reduction-subtract.f90

This file was added.

				! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
				! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s

				! CHECK: not yet implemented: Reduction of some intrinsic operators is not supported
				subroutine reduction_subtract
				integer :: x
				!$omp parallel
				!$omp do reduction(-:x)
				do i=1, 100
				x = x - i
				end do
				!$omp end do
				!$omp end parallel
				print *, x
				end subroutine

flang/test/Lower/OpenMP/wsloop-reduction-int.f90

This file was added.

				! RUN: bbc -emit-fir -fopenmp %s -o - | FileCheck %s
				! RUN: %flang_fc1 -emit-fir -fopenmp %s -o - | FileCheck %s

				!CHECK-LABEL: omp.reduction.declare
				!CHECK-SAME: @[[RED_I64_NAME:.*]] : i64 init {
				!CHECK: ^bb0(%{{.*}}: i64):
				!CHECK: %[[C0_1:.*]] = arith.constant 0 : i64
				!CHECK: omp.yield(%[[C0_1]] : i64)
				!CHECK: } combiner {
				!CHECK: ^bb0(%[[ARG0:.*]]: i64, %[[ARG1:.*]]: i64):
				!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i64
				!CHECK: omp.yield(%[[RES]] : i64)
				!CHECK: }

				!CHECK-LABEL: omp.reduction.declare
				!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init {
				!CHECK: ^bb0(%{{.*}}: i32):
				!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32
				!CHECK: omp.yield(%[[C0_1]] : i32)
				!CHECK: } combiner {
				!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32):
				!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32
				!CHECK: omp.yield(%[[RES]] : i32)
				!CHECK: }

				!CHECK-LABEL: func.func @_QPsimple_reduction
				!CHECK: %[[XREF:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_reductionEx"}
				!CHECK: %[[C0_2:.*]] = arith.constant 0 : i32
				!CHECK: fir.store %[[C0_2]] to %[[XREF]] : !fir.ref<i32>
				!CHECK: omp.parallel
				!CHECK: %[[I_PVT_REF:.*]] = fir.alloca i32 {adapt.valuebyref, pinned}
				!CHECK: %[[C1_1:.*]] = arith.constant 1 : i32
				!CHECK: %[[C100:.*]] = arith.constant 100 : i32
				!CHECK: %[[C1_2:.*]] = arith.constant 1 : i32
				!CHECK: omp.wsloop reduction(@[[RED_I32_NAME]] -> %[[XREF]] : !fir.ref<i32>) for (%[[IVAL:.*]]) : i32 = (%[[C1_1]]) to (%[[C100]]) inclusive step (%[[C1_2]])
				!CHECK: fir.store %[[IVAL]] to %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: %[[I_PVT_VAL:.*]] = fir.load %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: omp.reduction %[[I_PVT_VAL]], %[[XREF]] : !fir.ref<i32>
				!CHECK: omp.yield
				!CHECK: omp.terminator
				!CHECK: return

				subroutine simple_reduction
				integer :: x
				x = 0
				!$omp parallel
				!$omp do reduction(+:x)
				do i=1, 100
				x = x + i
				end do
				!$omp end do
				!$omp end parallel
				end subroutine

				!CHECK-LABEL: func.func @_QPsimple_reduction_switch_order
				!CHECK: %[[XREF:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_reduction_switch_orderEx"}
				awarzynski: Not needed
				!CHECK: %[[C0_2:.*]] = arith.constant 0 : i32
				!CHECK: fir.store %[[C0_2]] to %[[XREF]] : !fir.ref<i32>
				!CHECK: omp.parallel
				!CHECK: %[[I_PVT_REF:.*]] = fir.alloca i32 {adapt.valuebyref, pinned}
				!CHECK: %[[C1_1:.*]] = arith.constant 1 : i32
				!CHECK: %[[C100:.*]] = arith.constant 100 : i32
				!CHECK: %[[C1_2:.*]] = arith.constant 1 : i32
				!CHECK: omp.wsloop reduction(@[[RED_I32_NAME]] -> %[[XREF]] : !fir.ref<i32>) for (%[[IVAL:.*]]) : i32 = (%[[C1_1]]) to (%[[C100]]) inclusive step (%[[C1_2]])
				!CHECK: fir.store %[[IVAL]] to %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: %[[I_PVT_VAL:.*]] = fir.load %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: omp.reduction %[[I_PVT_VAL]], %[[XREF]] : !fir.ref<i32>
				!CHECK: omp.yield
				!CHECK: omp.terminator
				!CHECK: return

				subroutine simple_reduction_switch_order
				integer :: x
				x = 0
				!$omp parallel
				!$omp do reduction(+:x)
				do i=1, 100
				x = i + x
				end do
				!$omp end do
				!$omp end parallel
				end subroutine

				!CHECK-LABEL: func.func @_QPmultiple_reductions_same_type
				!CHECK: %[[XREF:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFmultiple_reductions_same_typeEx"}
				!CHECK: %[[YREF:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFmultiple_reductions_same_typeEy"}
				!CHECK: %[[ZREF:.*]] = fir.alloca i32 {bindc_name = "z", uniq_name = "_QFmultiple_reductions_same_typeEz"}
				!CHECK: omp.parallel
				awarzynski: Not needed
				!CHECK: %[[I_PVT_REF:.*]] = fir.alloca i32 {adapt.valuebyref, pinned}
				!CHECK: omp.wsloop reduction(@[[RED_I32_NAME]] -> %[[XREF]] : !fir.ref<i32>, @[[RED_I32_NAME]] -> %[[YREF]] : !fir.ref<i32>, @[[RED_I32_NAME]] -> %[[ZREF]] : !fir.ref<i32>) for (%[[IVAL]]) : i32
				!CHECK: fir.store %[[IVAL]] to %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: %[[I_PVT_VAL1:.*]] = fir.load %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: omp.reduction %[[I_PVT_VAL1]], %[[XREF]] : !fir.ref<i32>
				!CHECK: %[[I_PVT_VAL2:.*]] = fir.load %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: omp.reduction %[[I_PVT_VAL2]], %[[YREF]] : !fir.ref<i32>
				!CHECK: %[[I_PVT_VAL3:.*]] = fir.load %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: omp.reduction %[[I_PVT_VAL3]], %[[ZREF]] : !fir.ref<i32>
				!CHECK: omp.yield
				!CHECK: omp.terminator
				!CHECK: return

				subroutine multiple_reductions_same_type
				integer :: x,y,z
				x = 0
				y = 0
				z = 0
				!$omp parallel
				!$omp do reduction(+:x,y,z)
				do i=1, 100
				x = x + i
				y = y + i
				z = z + i
				end do
				!$omp end do
				!$omp end parallel
				end subroutine

				!CHECK-LABEL: func.func @_QPmultiple_reductions_different_type
				!CHECK: %[[XREF:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFmultiple_reductions_different_typeEx"}
				!CHECK: %[[YREF:.*]] = fir.alloca i64 {bindc_name = "y", uniq_name = "_QFmultiple_reductions_different_typeEy"}
				!CHECK: omp.parallel
				!CHECK: %[[I_PVT_REF:.*]] = fir.alloca i32 {adapt.valuebyref, pinned}
				!CHECK: omp.wsloop reduction(@[[RED_I32_NAME]] -> %[[XREF]] : !fir.ref<i32>, @[[RED_I64_NAME]] -> %[[YREF]] : !fir.ref<i64>) for (%[[IVAL:.*]]) : i32
				!CHECK: fir.store %[[IVAL]] to %[[I_PVT_REF]] : !fir.ref<i32>
				!CHECK: %[[C1_32:.*]] = arith.constant 1 : i32
				!CHECK: omp.reduction %[[C1_32]], %[[XREF]] : !fir.ref<i32>
				!CHECK: %[[C1_64:.*]] = arith.constant 1 : i64
				!CHECK: omp.reduction %[[C1_64]], %[[YREF]] : !fir.ref<i64>
				!CHECK: omp.yield
				!CHECK: omp.terminator
				!CHECK: return

				subroutine multiple_reductions_different_type
				integer :: x
				integer(kind=8) :: y
				!$omp parallel
				!$omp do reduction(+:x,y)
				do i=1, 100
				x = x + 1_4
				y = y + 1_8
				end do
				!$omp end do
				!$omp end parallel
				end subroutine
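
The reduction declarations checked in this test pair an init region that yields 0 with a combiner that uses `arith.addi`. As a rough illustration of what such a declaration means for a worksharing loop, the following C++ sketch simulates private accumulators sequentially. `ReductionDecl`, `simulateWsloopReduction`, and `addReductionI32` are hypothetical names introduced here, the round-robin split of iterations is an arbitrary assumption, and none of this is what the OpenMP runtime or the MLIR-to-LLVM translation actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical model of a reduction declaration: a neutral initial value
// (the init region) and a binary combiner (the combiner region).
template <typename T> struct ReductionDecl {
  T init;
  std::function<T(T, T)> combiner;
};

// Simulates `do i = 1, n` with reduction(+:x)-like semantics: each of
// `numThreads` private accumulators starts at decl.init, iterations are dealt
// out round-robin, and the private values are finally combined into the
// original value of the reduction variable.
template <typename T>
T simulateWsloopReduction(const ReductionDecl<T> &decl, T original, int n,
                          int numThreads) {
  std::vector<T> priv(numThreads, decl.init);
  for (int i = 1; i <= n; ++i) {
    T &acc = priv[(i - 1) % numThreads];
    acc = decl.combiner(acc, static_cast<T>(i));
  }
  T result = original;
  for (const T &p : priv)
    result = decl.combiner(result, p);
  return result;
}

// Integer addition reduction: init yields 0, combiner adds its arguments.
inline ReductionDecl<std::int32_t> addReductionI32() {
  return {0, [](std::int32_t a, std::int32_t b) { return a + b; }};
}
```

With `x = 0` and the loop `do i = 1, 100` from `simple_reduction`, the simulated result is the sum 1 + 2 + ... + 100 = 5050 regardless of how many private accumulators are used, because 0 is the neutral element of integer addition.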

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

@@ let assemblyFormat = [{ `cancellation_construct_type` `(`
   attr-dict}];
   let hasVerifier = 1;
 }

 //===----------------------------------------------------------------------===//
 // 2.19.5.7 declare reduction Directive
 //===----------------------------------------------------------------------===//

-def ReductionDeclareOp : OpenMP_Op<"reduction.declare", [Symbol]> {
+def ReductionDeclareOp : OpenMP_Op<"reduction.declare", [Symbol,
+    IsolatedFromAbove]> {

kiranchandramohan: Move this to a separate patch.

   let summary = "declares a reduction kind";

   let description = [{
     Declares an OpenMP reduction kind. This requires two mandatory and one
     optional region.

     1. The initializer region specifies how to initialize the thread-local
        reduction value. This is usually the neutral element of the reduction.

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp

@@ for (unsigned idx = 0; idx < curOp.getNumVariableOperands(); ++idx) {
       }
       convertedOperands.emplace_back(adaptor.getOperands()[idx]);
     }
     rewriter.replaceOpWithNewOp<T>(curOp, resTypes, convertedOperands,
                                    curOp->getAttrs());
     return success();
   }
 };

+struct ReductionOpConversion : public ConvertOpToLLVMPattern<omp::ReductionOp> {
+  using ConvertOpToLLVMPattern<omp::ReductionOp>::ConvertOpToLLVMPattern;
+  LogicalResult
+  matchAndRewrite(omp::ReductionOp curOp, OpAdaptor adaptor,
+                  ConversionPatternRewriter &rewriter) const override {
+    if (curOp.accumulator().getType().isa<MemRefType>()) {
+      // TODO: Support memref type in variable operands
+      return rewriter.notifyMatchFailure(curOp, "memref is not supported yet");
+    }
+    rewriter.replaceOpWithNewOp<omp::ReductionOp>(
+        curOp, TypeRange(), adaptor.getOperands(), curOp->getAttrs());
+    return success();
+  }
+};
 } // namespace

 void mlir::configureOpenMPToLLVMConversionLegality(
     ConversionTarget &target, LLVMTypeConverter &typeConverter) {
   target.addDynamicallyLegalOp<mlir::omp::CriticalOp, mlir::omp::ParallelOp,
                                mlir::omp::WsLoopOp, mlir::omp::MasterOp,
                                mlir::omp::SectionsOp, mlir::omp::SingleOp>(
       [&](Operation *op) {
         return typeConverter.isLegal(&op->getRegion(0)) &&
                typeConverter.isLegal(op->getOperandTypes()) &&
                typeConverter.isLegal(op->getResultTypes());
       });
   target
       .addDynamicallyLegalOp<mlir::omp::AtomicReadOp, mlir::omp::AtomicWriteOp,
                              mlir::omp::FlushOp, mlir::omp::ThreadprivateOp>(
           [&](Operation *op) {
             return typeConverter.isLegal(op->getOperandTypes()) &&
                    typeConverter.isLegal(op->getResultTypes());
           });
+  target.addDynamicallyLegalOp<mlir::omp::ReductionOp>([&](Operation *op) {
+    return typeConverter.isLegal(op->getOperandTypes());
+  });
 }

 void mlir::populateOpenMPToLLVMConversionPatterns(LLVMTypeConverter &converter,
                                                   RewritePatternSet &patterns) {
   patterns.add<
-      RegionOpConversion<omp::CriticalOp>, RegionOpConversion<omp::MasterOp>,
-      RegionOpConversion<omp::ParallelOp>, RegionOpConversion<omp::WsLoopOp>,
-      RegionOpConversion<omp::SectionsOp>, RegionOpConversion<omp::SingleOp>,
+      ReductionOpConversion, RegionOpConversion<omp::CriticalOp>,
+      RegionOpConversion<omp::MasterOp>, ReductionOpConversion,
+      RegionOpConversion<omp::MasterOp>, RegionOpConversion<omp::ParallelOp>,
+      RegionOpConversion<omp::WsLoopOp>, RegionOpConversion<omp::SectionsOp>,
+      RegionOpConversion<omp::SingleOp>,

kiranchandramohan: Add a test.

       RegionLessOpWithVarOperandsConversion<omp::AtomicReadOp>,
       RegionLessOpWithVarOperandsConversion<omp::AtomicWriteOp>,
       RegionLessOpWithVarOperandsConversion<omp::FlushOp>,
       RegionLessOpWithVarOperandsConversion<omp::ThreadprivateOp>>(converter);
 }

 namespace {
 struct ConvertOpenMPToLLVMPass


Revision Contents

Diff 447423

flang/include/flang/Lower/OpenMP.h

flang/lib/Lower/Bridge.cpp

flang/lib/Lower/OpenMP.cpp

flang/test/Lower/OpenMP/Todo/parallel-reduction.f90

flang/test/Lower/OpenMP/Todo/reduction-allocatable.f90

flang/test/Lower/OpenMP/Todo/reduction-and.f90

flang/test/Lower/OpenMP/Todo/reduction-arrays.f90

flang/test/Lower/OpenMP/Todo/reduction-derived-type-field.f90

flang/test/Lower/OpenMP/Todo/reduction-eqv.f90

flang/test/Lower/OpenMP/Todo/reduction-iand.f90

flang/test/Lower/OpenMP/Todo/reduction-ieor.f90

flang/test/Lower/OpenMP/Todo/reduction-ior.f90

flang/test/Lower/OpenMP/Todo/reduction-max.f90

flang/test/Lower/OpenMP/Todo/reduction-min.f90

flang/test/Lower/OpenMP/Todo/reduction-multiply.f90

flang/test/Lower/OpenMP/Todo/reduction-neqv.f90

flang/test/Lower/OpenMP/Todo/reduction-or.f90

flang/test/Lower/OpenMP/Todo/reduction-real.f90

flang/test/Lower/OpenMP/Todo/reduction-subtract.f90

flang/test/Lower/OpenMP/wsloop-reduction-int.f90

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

mlir/lib/Conversion/OpenMPToLLVM/OpenMPToLLVM.cpp
