This is an archive of the discontinued LLVM Phabricator instance.

Differential D126404

[mlir][OpenMP] Add recursive side effect information and tests for canonicalize pass
Changes Planned · Public

Authored by shraiysh on May 25 2022, 11:56 AM.

Details

Reviewers

jdoerfert
kiranktp
kiranchandramohan
NimishMishra
peixin
MatsPetersson
arnamoy10
ftynse
zero9178

Summary

This patch adds RecursiveSideEffects trait to the operations in OpenMP
Dialect that have a region associated with them. This allows the
canonicalize pass to remove such operations when they have an empty
region.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

shraiysh created this revision. May 25 2022, 11:56 AM

Herald added projects: Restricted Project, Restricted Project. May 25 2022, 11:56 AM

Herald added subscribers: mehdi_amini, guansong, yaxunl.

shraiysh requested review of this revision. May 25 2022, 11:56 AM

Herald added a reviewer: jdoerfert. May 25 2022, 11:56 AM

Herald added subscribers: sstefan1, jdoerfert.

shraiysh added reviewers: kiranktp, kiranchandramohan, NimishMishra, peixin, MatsPetersson, arnamoy10. May 25 2022, 11:56 AM

Harbormaster completed remote builds in B166327: Diff 432067. May 25 2022, 12:54 PM

LGTM. Thank you for this.

This revision is now accepted and ready to land. May 25 2022, 8:15 PM

Not sure about this change, particularly when there are no optimisations specified.

Would adding this as a pass in MLIR or OpenMPOpt in LLVM be a better option?

In D126404#3539560, @kiranchandramohan wrote:

Not sure about this change, particularly when there are no optimisations specified.

Would adding this as a pass in MLIR or OpenMPOpt in LLVM be a better option?

Move the change to MLIR Canonicalizer and add tests for the same.

Herald added a project: Restricted Project. May 26 2022, 6:59 AM

Herald added subscribers: bzcheeseman, sdasgup3, wenzhicui and 21 others.

shraiysh requested review of this revision. May 26 2022, 7:00 AM

shraiysh retitled this revision from [flang][OpenMP] Prevent emitting sections when there are none to [mlir][OpenMP] Add recursive side effect information and tests for canonicalize pass.

shraiysh edited the summary of this revision.

shraiysh added reviewers: ftynse, zero9178.

Harbormaster completed remote builds in B166463: Diff 432270. May 26 2022, 7:18 AM

I am not an expert in MLIR attributes, so I will leave the review to other experts. The results of this patch look good. @shraiysh Where can I find information about RecursiveSideEffects? Can you share a link?

I do not know of any link for this. The only explanation I found is in the comment to its definition -

This trait indicates that the side effects of an operation includes the effects of operations nested within its regions. If the operation has no derived effects interfaces, the operation itself can be assumed to have no side effects.

It is a bit tricky to understand, and it took me at least 10 re-reads of this comment and a lot of discussion to understand this, but I don't know how we would be able to phrase this better.

An example where we should have this trait: omp.parallel has the RecursiveSideEffects trait because when someone asks whether an omp.parallel operation in a .mlir file has side effects, we should look into the operations under it. If any of those child operations has side effects, then we say that the omp.parallel has side effects.

On an unrelated note, omp.parallel in itself introduces no side effects. When these two facts are combined, empty omp.parallel or omp.parallel with no-side-effect computations inside it can be eliminated (I will add a testcase for this - it is not already in this patch).

An example where we should not have this trait: when the operations inside the region are not executed while the main operation is executed and child operations with side effects cause no side effects to the main operation. For example, something like lambdas. Even if the operations within a lambda operation have side effects, the lambda operation will not have side effects because those child operations are not executed when a lambda is defined. Hence it has no recursive side effects. So, all lambda operations for which the results aren't used anywhere can be eliminated even if their inner bodies have side effects.

I hope this helps and I don't add to the confusion about this trait. :)
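The two examples above can be illustrated with a small toy model. This is only a sketch in Python, not the MLIR API: the `Op` class, its fields, and `has_side_effects` are hypothetical names, and the model assumes that an op without the recursive flag has been declared free of side effects (as in the lambda example).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Toy stand-in for an operation (hypothetical, not the MLIR API)."""
    name: str
    own_effects: bool = False   # effects the op itself introduces
    recursive: bool = False     # models the RecursiveSideEffects trait
    body: List["Op"] = field(default_factory=list)  # ops nested in its region


def has_side_effects(op: Op) -> bool:
    """True if the op introduces effects itself, or if it is marked
    recursive and any op nested in its region has effects."""
    if op.own_effects:
        return True
    if op.recursive:
        return any(has_side_effects(child) for child in op.body)
    # Not recursive: nested ops do not run when this op executes (lambda-like).
    return False


store = Op("store", own_effects=True)
add = Op("add")

empty_parallel = Op("omp.parallel", recursive=True)
pure_parallel = Op("omp.parallel", recursive=True, body=[add])
storing_parallel = Op("omp.parallel", recursive=True, body=[store])
lambda_with_store = Op("lambda", recursive=False, body=[store])

print(has_side_effects(empty_parallel))     # False
print(has_side_effects(pure_parallel))      # False
print(has_side_effects(storing_parallel))   # True
print(has_side_effects(lambda_with_store))  # False
```

In this model, the ops that report no side effects are the candidates for removal, provided their results are unused.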

I am not an expert in MLIR attributes, so I will leave the review to other experts.

Sure, no problem.

In D126404#3544055, @shraiysh wrote:

I do not know of any link for this. The only explanation I found is in the comment to its definition -

This trait indicates that the side effects of an operation includes the effects of operations nested within its regions. If the operation has no derived effects interfaces, the operation itself can be assumed to have no side effects.

It is a bit tricky to understand, and it took me at least 10 re-reads of this comment and a lot of discussion to understand this, but I don't know how we would be able to phrase this better.

An example where we should have this trait: omp.parallel has the RecursiveSideEffects trait because when someone asks whether an omp.parallel operation in a .mlir file has side effects, we should look into the operations under it. If any of those child operations has side effects, then we say that the omp.parallel has side effects.

On an unrelated note, omp.parallel in itself introduces no side effects. When these two facts are combined, empty omp.parallel or omp.parallel with no-side-effect computations inside it can be eliminated (I will add a testcase for this - it is not already in this patch).

An example where we should not have this trait: when the operations inside the region are not executed while the main operation is executed and child operations with side effects cause no side effects to the main operation. For example, something like lambdas. Even if the operations within a lambda operation have side effects, the lambda operation will not have side effects because those child operations are not executed when a lambda is defined. Hence it has no recursive side effects. So, all lambda operations for which the results aren't used anywhere can be eliminated even if their inner bodies have side effects.

I hope this helps and I don't add to the confusion about this trait. :)

Got it. Thanks for the explanations.

Would it always be correct to remove the OpenMP operation if there are clauses? Particularly, thinking about cases like allocate, lastprivate (not an issue now since we handle it elsewhere), worksharing loop (without nowait, will cause removal of a synchronization point).

In D126404#3554674, @kiranchandramohan wrote:

Would it always be correct to remove the OpenMP operation if there are clauses? Particularly, thinking about cases like allocate, lastprivate (not an issue now since we handle it elsewhere), worksharing loop (without nowait, will cause removal of a synchronization point).

That's a good concern. I think we cannot put this trait on the worksharing loop because of the implicit barrier. I believe the other constructs create their own thread-group and the implicit barrier would not affect anything outside. Another way to handle wsloop would be to disconnect the implicit barrier from the wsloop operation. That means #pragma omp wsloop {} gets translated to the following. With nowait, there would be no omp.barrier operation. This is less close to the standard, but allows simpler optimization by eliminating codegen for empty wsloops and gives us more explicit IR.

omp.wsloop {...}
omp.barrier
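A toy model of this explicit-barrier idea, written in Python only to show the intended effect. The tuple representation and the helper names `lower_wsloop` and `drop_empty_wsloops` are hypothetical and do not correspond to any existing pass or API.

```python
from typing import List, Tuple

# Each op is (name, body), where body is the list of ops nested in its region.
Op = Tuple[str, list]


def lower_wsloop(body: list, nowait: bool) -> List[Op]:
    """Emit the loop op, followed by an explicit barrier unless nowait is set."""
    ops: List[Op] = [("omp.wsloop", body)]
    if not nowait:
        ops.append(("omp.barrier", []))
    return ops


def drop_empty_wsloops(ops: List[Op]) -> List[Op]:
    """Erase wsloops with empty bodies; explicit barriers are left untouched."""
    return [op for op in ops if not (op[0] == "omp.wsloop" and not op[1])]


print(drop_empty_wsloops(lower_wsloop([], nowait=False)))  # [('omp.barrier', [])]
print(drop_empty_wsloops(lower_wsloop([], nowait=True)))   # []
```

With the barrier as its own op, removing an empty loop no longer removes the synchronization point.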

I don't know much about the intricacies of the allocate clause. However, afaiu, the allocate clause only affects how the data is perceived under the construct. Is that correct? Will the following code have a side effect outside the construct?

omp.some_construct allocate(..) {
  omp.terminator
}

In D126404#3555416, @shraiysh wrote:
In D126404#3554674, @kiranchandramohan wrote:

Would it always be correct to remove the OpenMP operation if there are clauses? Particularly, thinking about cases like allocate, lastprivate (not an issue now since we handle it elsewhere), worksharing loop (without nowait, will cause removal of a synchronization point).

That's a good concern. I think we cannot put this trait on the worksharing loop because of the implicit barrier. I believe the other constructs create their own thread-group and the implicit barrier would not affect anything outside. Another way to handle wsloop would be to disconnect the implicit barrier from the wsloop operation. That means #pragma omp wsloop {} gets translated to the following. With nowait, there would be no omp.barrier operation. This is less close to the standard, but allows simpler optimization by eliminating codegen for empty wsloops and gives us more explicit IR.
omp.wsloop {...}
omp.barrier

Let us not separate the barrier. As part of the current implementation, the OpenMPIRBuilder adds the barrier based on the presence of the nowait attribute.
If in future we have lastprivate support, and if the index variable is marked as lastprivate, then removing the region in that case is probably not correct.

I don't know much about the intricacies of the allocate clause. However, afaiu, the allocate clause only affects how the data is perceived under the construct. Is that correct? Will the following code have a side effect outside the construct?
omp.some_construct allocate(..) {
  omp.terminator
}

Allocate will only allocate inside the scope of the construct, in that sense it is local. But it can ask to allocate in different kinds of memory. These will be calls to the OpenMP runtime and I believe they will probably call the OS or drivers for these resources.

Another thought that came to mind is what if there are task dependency clauses specified and we remove an OpenMP operation.

BTW, the openmp-opt passes are able to remove some trivial OpenMP regions. In the past, there were some requests to perform these kinds of transformations in that layer so that both compilers benefit. But I agree that this is easier at the MLIR layer without the expansion of the region and the insertion of runtime calls. At the same time, I am not 100% sure whether clauses can cause side effects and hence would it be better to do it at the openmp-opt level when everything is expanded?
https://openmp.llvm.org/remarks/OMP160.html#omp160

Let us not separate the barrier. As part of the current implementation, the OpenMPIRBuilder adds the barrier based on the presence of the nowait attribute.
If in future we have lastprivate support, and if the index variable is marked as lastprivate, then removing the region in that case is probably not correct.

Okay, I will remove this from omp.wsloop operation. Thank you!

Allocate will only allocate inside the scope of the construct, in that sense it is local. But it can ask to allocate in different kinds of memory. These will be calls to the OpenMP runtime and I believe they will probably call the OS or drivers for these resources.

The system calls will have side-effects (like memory allocation on device etc.) but I thought those effects should be erased before the construct ends (deallocation should happen before it ends either because the programmer asks for it, or because of the OpenMP Runtime). Is that correct?

Another thought that came to mind is what if there are task dependency clauses specified and we remove an OpenMP operation.
BTW, the openmp-opt passes are able to remove some trivial OpenMP regions. In the past, there were some requests to perform these kinds of transformations in that layer so that both compilers benefit. But I agree that this is easier at the MLIR layer without the expansion of the region and the insertion of runtime calls. At the same time, I am not 100% sure whether clauses can cause side effects and hence would it be better to do it at the openmp-opt level when everything is expanded?
https://openmp.llvm.org/remarks/OMP160.html#omp160

Thank you for the reference, I did not know about this. However, as you pointed out, it would be more desirable to do it at the MLIR layer itself. Would it be okay if we had conditional elimination of constructs? i.e. we eliminate only when there are no side-effect-producing clauses on the construct - like if clause and num_threads clauses. I can try to break this into separate patches for each construct and we can have the discussion about side effects of clauses on that particular construct. Should we do it that way?

In D126404#3559976, @shraiysh wrote:

Another thought that came to mind is what if there are task dependency clauses specified and we remove an OpenMP operation.
BTW, the openmp-opt passes are able to remove some trivial OpenMP regions. In the past, there were some requests to perform these kinds of transformations in that layer so that both compilers benefit. But I agree that this is easier at the MLIR layer without the expansion of the region and the insertion of runtime calls. At the same time, I am not 100% sure whether clauses can cause side effects and hence would it be better to do it at the openmp-opt level when everything is expanded?
https://openmp.llvm.org/remarks/OMP160.html#omp160

Thank you for the reference, I did not know about this. However, as you pointed out, it would be more desirable to do it at the MLIR layer itself. Would it be okay if we had conditional elimination of constructs? i.e. we eliminate only when there are no side-effect-producing clauses on the construct - like if clause and num_threads clauses. I can try to break this into separate patches for each construct and we can have the discussion about side effects of clauses on that particular construct. Should we do it that way?

Yes, separating it out for each construct is a good way to focus and discuss.
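The conditional elimination proposed above could look roughly like the following Python sketch. Everything here is hypothetical: the `Construct` class, the `can_eliminate` helper, and in particular the allowlist, which reads the comment as naming if and num_threads as examples of harmless clauses. Which clauses are actually safe is exactly what the per-construct patches would need to settle.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Assumption: clauses considered harmless for removal. Illustrative only;
# the thread has not agreed on any such list.
HARMLESS_CLAUSES: FrozenSet[str] = frozenset({"if", "num_threads", "nowait"})


@dataclass
class Construct:
    """Toy stand-in for an OpenMP dialect op with a region (hypothetical)."""
    name: str
    clauses: FrozenSet[str] = frozenset()
    effectful_body_ops: List[str] = field(default_factory=list)
    ends_with_barrier: bool = False  # e.g. a worksharing loop without nowait


def can_eliminate(op: Construct) -> bool:
    """Conservative check: only erase a construct when nothing observable is lost."""
    if op.effectful_body_ops:
        return False  # the region still does work
    if op.ends_with_barrier and "nowait" not in op.clauses:
        return False  # erasing would drop a synchronization point
    # Every clause must be on the allowlist (allocate, depend, ... are not).
    return op.clauses <= HARMLESS_CLAUSES


print(can_eliminate(Construct("omp.parallel", frozenset({"num_threads"}))))  # True
print(can_eliminate(Construct("omp.task", frozenset({"depend"}))))           # False
print(can_eliminate(Construct("omp.sections", frozenset({"allocate"}))))     # False
print(can_eliminate(Construct("omp.wsloop", ends_with_barrier=True)))        # False
```

An allowlist is used rather than a blocklist so that a clause nobody has reviewed yet blocks removal by default.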

shraiysh planned changes to this revision. Jun 6 2022, 11:36 AM

Revision Contents

Path                                            Size
flang/test/Lower/OpenMP/parallel-sections.f90   2 lines
flang/test/Lower/OpenMP/sections.f90            25 lines
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td   25 lines
mlir/test/Dialect/OpenMP/canonicalize.mlir      248 lines
mlir/test/Dialect/OpenMP/invalid.mlir           2 lines
mlir/test/Dialect/OpenMP/ops.mlir               7 lines

Diff 432270

flang/test/Lower/OpenMP/parallel-sections.f90

@@ 20 unchanged lines hidden @@ !$omp parallel sections
 !OMPDialect: omp.section {
 !$omp section
 !FIRDialect: fir.load
 !FIRDialect: arith.subi
 !FIRDialect: fir.store
 y = y - 5
 !OMPDialect: omp.terminator
 !OMPDialect: omp.terminator
-!OMPDialect: omp.terminator
 !$omp end parallel sections
 end subroutine omp_parallel_sections

 !===============================================================================
 ! Parallel sections construct with allocate clause
 !===============================================================================

 !FIRDialect: func @_QPomp_parallel_sections
@@ 9 unchanged lines hidden @@ !$omp parallel sections allocate(omp_high_bw_mem_alloc: x)
 !$omp section
 x = x + 12
 !OMPDialect: omp.terminator
 !OMPDialect: omp.section {
 !$omp section
 y = y + 5
 !OMPDialect: omp.terminator
 !OMPDialect: omp.terminator
-!OMPDialect: omp.terminator
 !$omp end parallel sections
 end subroutine omp_parallel_sections_allocate

flang/test/Lower/OpenMP/sections.f90

@@ 41 unchanged lines hidden @@
 !FIRDialect: {{.*}} = fir.load %[[COUNT]] : !fir.ref<i32>
 !FIRDialect: {{.*}} = fir.convert {{.*}} : (i32) -> f32
 !FIRDialect: {{.*}} = fir.load %[[ETA]] : !fir.ref<f32>
 !FIRDialect: {{.*}} = arith.subf {{.*}}, {{.*}} : f32
 !FIRDialect: {{.*}} = fir.convert {{.*}} : (f32) -> i32
 !FIRDialect: fir.store {{.*}} to %[[DOUBLE_COUNT]] : !fir.ref<i32>
 !FIRDialect: omp.terminator
 !FIRDialect: }
-!FIRDialect: omp.terminator
-!FIRDialect: }
-!FIRDialect: omp.sections nowait {
-!FIRDialect: omp.terminator
 !FIRDialect: }
 !FIRDialect: return
 !FIRDialect: }

 !LLVMDialect: llvm.func @_QQmain() {
 !LLVMDialect: %[[COUNT:.*]] = llvm.mlir.addressof @_QFEcount : !llvm.ptr<i32>
 !LLVMDialect: {{.*}} = builtin.unrealized_conversion_cast %[[COUNT]] : !llvm.ptr<i32> to !fir.ref<i32>
 !LLVMDialect: %[[DOUBLE_COUNT:.*]] = llvm.mlir.addressof @_QFEdouble_count : !llvm.ptr<i32>
@@ 32 unchanged lines hidden @@
 !LLVMDialect: {{.*}} = llvm.load %[[COUNT]] : !llvm.ptr<i32>
 !LLVMDialect: {{.*}} = llvm.sitofp {{.*}} : i32 to f32
 !LLVMDialect: {{.*}} = llvm.load %[[ETA]] : !llvm.ptr<f32>
 !LLVMDialect: {{.*}} = llvm.fsub {{.*}}, {{.*}} : f32
 !LLVMDialect: {{.*}} = llvm.fptosi {{.*}} : f32 to i32
 !LLVMDialect: llvm.store {{.*}}, %[[DOUBLE_COUNT]] : !llvm.ptr<i32>
 !LLVMDialect: omp.terminator
 !LLVMDialect: }
-!LLVMDialect: omp.terminator
-!LLVMDialect: }
-!LLVMDialect: omp.sections nowait {
-!LLVMDialect: omp.section {
-!LLVMDialect: omp.terminator
-!LLVMDialect: }
-!LLVMDialect: omp.terminator
 !LLVMDialect: }
 !LLVMDialect: llvm.return
 !LLVMDialect: }

 program sample
 use omp_lib
 integer :: count = 0, double_count = 1
 !$omp sections private (eta, double_count) allocate(omp_high_bw_mem_alloc: count)
@@ 10 unchanged lines hidden @@ program sample

 !$omp sections
 !$omp end sections nowait
 end program sample

 !FIRDialect: func @_QPfirstprivate(%[[ARG:.*]]: !fir.ref<f32> {fir.bindc_name = "alpha"}) {
 !FIRDialect: omp.sections {
 !FIRDialect: omp.section {
-!FIRDialect: omp.terminator
-!FIRDialect: }
-!FIRDialect: omp.terminator
-!FIRDialect: }
-!FIRDialect: omp.sections {
-!FIRDialect: omp.section {
 !FIRDialect: %[[PRIVATE_VAR:.*]] = fir.load %[[ARG]] : !fir.ref<f32>
 !FIRDialect: %[[CONSTANT:.*]] = arith.constant 5.000000e+00 : f32
 !FIRDialect: %[[PRIVATE_VAR_2:.*]] = arith.mulf %[[PRIVATE_VAR]], %[[CONSTANT]] : f32
 !FIRDialect: fir.store %[[PRIVATE_VAR_2]] to %[[ARG]] : !fir.ref<f32>
 !FIRDialect: omp.terminator
 !FIRDialect: }
-!FIRDialect: omp.terminator
 !FIRDialect: }
 !FIRDialect: return
 !FIRDialect: }

 !LLVMDialect: llvm.func @_QPfirstprivate(%[[ARG:.*]]: !llvm.ptr<f32> {fir.bindc_name = "alpha"}) {
 !LLVMDialect: omp.sections {
 !LLVMDialect: omp.section {
-!LLVMDialect: omp.terminator
-!LLVMDialect: }
-!LLVMDialect: omp.terminator
-!LLVMDialect: }
-!LLVMDialect: omp.sections {
-!LLVMDialect: omp.section {
 !LLVMDialect: {{.*}} = llvm.load %[[ARG]] : !llvm.ptr<f32>
 !LLVMDialect: {{.*}} = llvm.mlir.constant(5.000000e+00 : f32) : f32
 !LLVMDialect: {{.*}} = llvm.fmul {{.*}}, {{.*}} : f32
 !LLVMDialect: llvm.store {{.*}}, %[[ARG]] : !llvm.ptr<f32>
 !LLVMDialect: omp.terminator
 !LLVMDialect: }
-!LLVMDialect: omp.terminator
 !LLVMDialect: }
 !LLVMDialect: llvm.return
 !LLVMDialect: }

 subroutine firstprivate(alpha)
 real :: alpha
 !$omp sections firstprivate(alpha)
 !$omp end sections

 !$omp sections
 alpha = alpha * 5
 !$omp end sections
 end subroutine

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

Show First 20 Lines • Show All 130 Lines • ▼ Show 20 Lines	def ParallelOp : OpenMP_Op<"parallel", [
let extraClassDeclaration = [{		let extraClassDeclaration = [{
// TODO: remove this once emitAccessorPrefix is set to		// TODO: remove this once emitAccessorPrefix is set to
// kEmitAccessorPrefix_Prefixed for the dialect.		// kEmitAccessorPrefix_Prefixed for the dialect.
/// Returns the reduction variables		/// Returns the reduction variables
operand_range getReductionVars() { return reduction_vars(); }		operand_range getReductionVars() { return reduction_vars(); }
}];		}];
}		}

def TerminatorOp : OpenMP_Op<"terminator", [Terminator]> {		def TerminatorOp : OpenMP_Op<"terminator", [Terminator, NoSideEffect]> {
let summary = "terminator for OpenMP regions";		let summary = "terminator for OpenMP regions";
let description = [{		let description = [{
A terminator operation for regions that appear in the body of OpenMP		A terminator operation for regions that appear in the body of OpenMP
operation. These regions are not expected to return any value so the		operation. These regions are not expected to return any value so the
terminator takes no operands. The terminator op returns control to the		terminator takes no operands. The terminator op returns control to the
enclosing op.		enclosing op.
}];		}];

Show All 16 Lines
}		}
def ScheduleModifierAttr : EnumAttr<OpenMP_Dialect, ScheduleModifier,		def ScheduleModifierAttr : EnumAttr<OpenMP_Dialect, ScheduleModifier,
"sched_mod">;		"sched_mod">;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// 2.8.1 Sections Construct		// 2.8.1 Sections Construct
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

def SectionOp : OpenMP_Op<"section", [HasParent<"SectionsOp">]> {		def SectionOp : OpenMP_Op<"section", [HasParent<"SectionsOp">,
		RecursiveSideEffects]> {
let summary = "section directive";		let summary = "section directive";
let description = [{		let description = [{
A section operation encloses a region which represents one section in a		A section operation encloses a region which represents one section in a
sections construct. A section op should always be surrounded by an		sections construct. A section op should always be surrounded by an
`omp.sections` operation.		`omp.sections` operation.
}];		}];
let regions = (region AnyRegion:$region);		let regions = (region AnyRegion:$region);
let assemblyFormat = "$region attr-dict";		let assemblyFormat = "$region attr-dict";
}		}

def SectionsOp : OpenMP_Op<"sections", [AttrSizedOperandSegments,		def SectionsOp : OpenMP_Op<"sections", [AttrSizedOperandSegments,
ReductionClauseInterface]> {		ReductionClauseInterface, RecursiveSideEffects,
		SingleBlockImplicitTerminator<"TerminatorOp">]> {
let summary = "sections construct";		let summary = "sections construct";
let description = [{		let description = [{
The sections construct is a non-iterative worksharing construct that		The sections construct is a non-iterative worksharing construct that
contains `omp.section` operations. The `omp.section` operations are to be		contains `omp.section` operations. The `omp.section` operations are to be
distributed among and executed by the threads in a team. Each `omp.section`		distributed among and executed by the threads in a team. Each `omp.section`
is executed once by one of the threads in the team in the context of its		is executed once by one of the threads in the team in the context of its
implicit task.		implicit task.

▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	let extraClassDeclaration = [{
operand_range getReductionVars() { return reduction_vars(); }		operand_range getReductionVars() { return reduction_vars(); }
}];		}];
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// 2.8.2 Single Construct		// 2.8.2 Single Construct
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

def SingleOp : OpenMP_Op<"single", [AttrSizedOperandSegments]> {		def SingleOp : OpenMP_Op<"single", [AttrSizedOperandSegments,
		RecursiveSideEffects]> {
let summary = "single directive";		let summary = "single directive";
let description = [{		let description = [{
The single construct specifies that the associated structured block is		The single construct specifies that the associated structured block is
executed by only one of the threads in the team (not necessarily the		executed by only one of the threads in the team (not necessarily the
master thread), in the context of its implicit task. The other threads		master thread), in the context of its implicit task. The other threads
in the team, which do not execute the block, wait at an implicit barrier		in the team, which do not execute the block, wait at an implicit barrier
at the end of the single construct unless a nowait clause is specified.		at the end of the single construct unless a nowait clause is specified.
}];		}];
▲ Show 20 Lines • Show All 143 Lines • ▼ Show 20 Lines	def WsLoopOp : OpenMP_Op<"wsloop", [AttrSizedOperandSegments,
let hasVerifier = 1;		let hasVerifier = 1;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Simd construct [2.9.3.1]		// Simd construct [2.9.3.1]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,		def SimdLoopOp : OpenMP_Op<"simdloop", [AttrSizedOperandSegments,
AllTypesMatch<["lowerBound", "upperBound", "step"]>]> {		AllTypesMatch<["lowerBound", "upperBound", "step"]>,
		RecursiveSideEffects]> {
let summary = "simd loop construct";		let summary = "simd loop construct";
let description = [{		let description = [{
The simd construct can be applied to a loop to indicate that the loop can be		The simd construct can be applied to a loop to indicate that the loop can be
transformed into a SIMD loop (that is, multiple iterations of the loop can		transformed into a SIMD loop (that is, multiple iterations of the loop can
be executed concurrently using SIMD instructions).. The lower and upper		be executed concurrently using SIMD instructions).. The lower and upper
bounds specify a half-open range: the range includes the lower bound but		bounds specify a half-open range: the range includes the lower bound but
does not include the upper bound.		does not include the upper bound.

▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// 2.10.1 task Construct		// 2.10.1 task Construct
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

def TaskOp : OpenMP_Op<"task", [AttrSizedOperandSegments,		def TaskOp : OpenMP_Op<"task", [AttrSizedOperandSegments,
OutlineableOpenMPOpInterface, AutomaticAllocationScope,		OutlineableOpenMPOpInterface, AutomaticAllocationScope,
ReductionClauseInterface]> {		ReductionClauseInterface, RecursiveSideEffects]> {
let summary = "task construct";		let summary = "task construct";
let description = [{		let description = [{
The task construct defines an explicit task.		The task construct defines an explicit task.

For definitions of "undeferred task", "included task", "final task" and		For definitions of "undeferred task", "included task", "final task" and
"mergeable task", please check OpenMP Specification.		"mergeable task", please check OpenMP Specification.

When an `if` clause is present on a task construct, and the value of		When an `if` clause is present on a task construct, and the value of
▲ Show 20 Lines • Show All 96 Lines • ▼ Show 20 Lines	def FlushOp : OpenMP_Op<"flush"> {
let arguments = (ins Variadic<AnyType>:$varList);		let arguments = (ins Variadic<AnyType>:$varList);

let assemblyFormat = [{ ( `(` $varList^ `:` type($varList) `)` )? attr-dict}];		let assemblyFormat = [{ ( `(` $varList^ `:` type($varList) `)` )? attr-dict}];
}		}
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// 2.14.5 target construct		// 2.14.5 target construct
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

def TargetOp : OpenMP_Op<"target",[AttrSizedOperandSegments]> {		def TargetOp : OpenMP_Op<"target",[AttrSizedOperandSegments,
		RecursiveSideEffects]> {
let summary = "target construct";		let summary = "target construct";
let description = [{		let description = [{
The target construct includes a region of code which is to be executed		The target construct includes a region of code which is to be executed
on a device.		on a device.

The optional $if_expr parameter specifies a boolean result of a		The optional $if_expr parameter specifies a boolean result of a
conditional check. If this value is 1 or is not provided then the target		conditional check. If this value is 1 or is not provided then the target
region runs on a device, if it is 0 then the target region is executed on the		region runs on a device, if it is 0 then the target region is executed on the
... (25 lines not shown) ...	let assemblyFormat = [{
) $region attr-dict		) $region attr-dict
}];		}];
}		}


//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// 2.16 master Construct		// 2.16 master Construct
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
def MasterOp : OpenMP_Op<"master"> {		def MasterOp : OpenMP_Op<"master", [RecursiveSideEffects]> {
let summary = "master construct";		let summary = "master construct";
let description = [{		let description = [{
The master construct specifies a structured block that is executed by		The master construct specifies a structured block that is executed by
the master thread of the team.		the master thread of the team.
}];		}];

let regions = (region AnyRegion:$region);		let regions = (region AnyRegion:$region);

... (19 lines not shown) ...	let assemblyFormat = [{
$sym_name oilist(`hint` `(` custom<SynchronizationHint>($hint_val) `)`)		$sym_name oilist(`hint` `(` custom<SynchronizationHint>($hint_val) `)`)
attr-dict		attr-dict
}];		}];
let hasVerifier = 1;		let hasVerifier = 1;
}		}


def CriticalOp : OpenMP_Op<"critical",		def CriticalOp : OpenMP_Op<"critical",
[DeclareOpInterfaceMethods<SymbolUserOpInterface>]> {		[DeclareOpInterfaceMethods<SymbolUserOpInterface>, RecursiveSideEffects]> {
let summary = "critical construct";		let summary = "critical construct";
let description = [{		let description = [{
The critical construct imposes a restriction on the associated structured		The critical construct imposes a restriction on the associated structured
block (region) to be executed by only a single thread at a time.		block (region) to be executed by only a single thread at a time.
}];		}];

let arguments = (ins OptionalAttr<FlatSymbolRefAttr>:$name);		let arguments = (ins OptionalAttr<FlatSymbolRefAttr>:$name);

... (62 lines not shown) ...	def OrderedOp : OpenMP_Op<"ordered"> {
let assemblyFormat = [{		let assemblyFormat = [{
( `depend_type` `` $depend_type_val^ )?		( `depend_type` `` $depend_type_val^ )?
( `depend_vec` `(` $depend_vec_vars^ `:` type($depend_vec_vars) `)` )?		( `depend_vec` `(` $depend_vec_vars^ `:` type($depend_vec_vars) `)` )?
attr-dict		attr-dict
}];		}];
let hasVerifier = 1;		let hasVerifier = 1;
}		}

def OrderedRegionOp : OpenMP_Op<"ordered_region"> {		def OrderedRegionOp : OpenMP_Op<"ordered_region", [RecursiveSideEffects]> {
let summary = "ordered construct with region";		let summary = "ordered construct with region";
let description = [{		let description = [{
The ordered construct with region specifies a structured block in a		The ordered construct with region specifies a structured block in a
worksharing-loop, SIMD, or worksharing-loop SIMD region that is executed in		worksharing-loop, SIMD, or worksharing-loop SIMD region that is executed in
the order of the loop iterations.		the order of the loop iterations.

The `simd` attribute corresponds to the SIMD clause specified. If it is not		The `simd` attribute corresponds to the SIMD clause specified. If it is not
present, it behaves as if the THREADS clause is specified or no clause is		present, it behaves as if the THREADS clause is specified or no clause is
... (362 lines not shown) ...

mlir/test/Dialect/OpenMP/canonicalize.mlir

This file was added.

				// RUN: mlir-opt %s -canonicalize -split-input-file | FileCheck %s

				// CHECK-LABEL: func.func @all_empty_sections()
				func.func @all_empty_sections() {
				omp.sections {
				omp.section {}
				omp.section {}
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.sections
				// CHECK-NOT: omp.section

				// -----

				// CHECK-LABEL: func.func @all_sections_only_terminator
				func.func @all_sections_only_terminator() {
				omp.sections {
				omp.section { omp.terminator }
				omp.section { omp.terminator }
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.sections
				// CHECK-NOT: omp.section

				// -----

				// CHECK-LABEL: func.func @no_sections
				func.func @no_sections() {
				omp.sections {}
				return
				}

				// CHECK-NOT: omp.sections
				// CHECK-NOT: omp.section

				// -----

				// CHECK-LABEL: func.func @no_sections_only_terminator
				func.func @no_sections_only_terminator() {
				omp.sections {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.sections
				// CHECK-NOT: omp.section

				// -----

				func.func @one_empty_section() {
				omp.sections {
				omp.section {} // this operation should be eliminated
				omp.section {
				"test.foo"() : () -> ()
				omp.terminator
				}
				omp.terminator
				}
				return
				}

				// CHECK-LABEL: func.func @one_empty_section() {
				// CHECK-NEXT: omp.sections {
				// CHECK-NEXT: omp.section {
				// CHECK-NEXT: "test.foo"() : () -> ()
				// CHECK-NEXT: omp.terminator
				// CHECK-NEXT: }
				// CHECK-NEXT: }
				// CHECK-NEXT: return
				// CHECK-NEXT: }

				// -----

				func.func @one_section_only_terminator() {
				omp.sections {
				omp.section {
				omp.terminator
				} // this operation should be eliminated
				omp.section {
				"test.foo"() : () -> ()
				omp.terminator
				}
				omp.terminator
				}
				return
				}

				// CHECK-LABEL: func.func @one_section_only_terminator() {
				// CHECK-NEXT: omp.sections {
				// CHECK-NEXT: omp.section {
				// CHECK-NEXT: "test.foo"() : () -> ()
				// CHECK-NEXT: omp.terminator
				// CHECK-NEXT: }
				// CHECK-NEXT: }
				// CHECK-NEXT: return
				// CHECK-NEXT: }

				// -----

				// CHECK-LABEL: func.func @empty_parallel
				func.func @empty_parallel() {
				omp.parallel {}
				return
				}

				// CHECK-NOT: omp.parallel

				// -----

				// CHECK-LABEL: func.func @parallel_only_terminator
				func.func @parallel_only_terminator() {
				omp.parallel {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.parallel

				// -----

				// CHECK-LABEL: func.func @single_only_terminator
				func.func @single_only_terminator() {
				omp.single {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.single

				// -----

				// CHECK-LABEL: func.func @wsloop_only_terminator
				func.func @wsloop_only_terminator(%lb: i32, %ub: i32, %step: i32) {
				omp.wsloop for (%i) : i32 = (%lb) to (%ub) step (%step) {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.wsloop

				// -----

				// CHECK-LABEL: func.func @wsloop_only_yield
				func.func @wsloop_only_yield(%lb: i32, %ub: i32, %step: i32) {
				omp.wsloop for (%i) : i32 = (%lb) to (%ub) step (%step) {
				omp.yield
				}
				return
				}

				// CHECK-NOT: omp.wsloop

				// -----

				// CHECK-LABEL: func.func @master_only_terminator
				func.func @master_only_terminator() {
				omp.master {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.master

				// -----

				// CHECK-LABEL: func.func @critical_only_terminator
				func.func @critical_only_terminator() {
				omp.critical {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.critical

				// -----

				// CHECK-LABEL: func.func @ordered_region_only_terminator
				func.func @ordered_region_only_terminator(%lb: i32, %ub: i32, %step: i32) {
				omp.wsloop ordered(0) for (%i) : i32 = (%lb) to (%ub) step (%step) {
				omp.ordered_region {
				omp.terminator
				}
				"test.foo"() : () -> ()
				}
				return
				}

				// CHECK-NOT: omp.ordered_region

				// -----

				// CHECK-LABEL: func.func @simdloop_only_terminator
				func.func @simdloop_only_terminator(%lb: i32, %ub: i32, %step: i32) {
				omp.simdloop (%i) : i32 = (%lb) to (%ub) step (%step) {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.simdloop

				// -----

				// CHECK-LABEL: func.func @simdloop_only_yield
				func.func @simdloop_only_yield(%lb: i32, %ub: i32, %step: i32) {
				omp.simdloop (%i) : i32 = (%lb) to (%ub) step (%step) {
				omp.yield
				}
				return
				}

				// CHECK-NOT: omp.simdloop

				// -----

				// CHECK-LABEL: func.func @task_only_terminator
				func.func @task_only_terminator() {
				omp.task {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.task

				// -----

				// CHECK-LABEL: func.func @target_only_terminator
				func.func @target_only_terminator() {
				omp.target {
				omp.terminator
				}
				return
				}

				// CHECK-NOT: omp.target

mlir/test/Dialect/OpenMP/invalid.mlir

... (1,057 lines not shown) ...	omp.sections order(concurrent) {
omp.terminator		omp.terminator
}		}
return		return
}		}

// -----		// -----

func.func @omp_sections() {		func.func @omp_sections() {
// expected-error @below {{failed to verify constraint: region with 1 blocks}}		// expected-error @below {{op expects region #0 to have 0 or 1 blocks}}
omp.sections {		omp.sections {
omp.section {		omp.section {
omp.terminator		omp.terminator
}		}
omp.terminator		omp.terminator
^bb2:		^bb2:
omp.terminator		omp.terminator
}		}
... (192 lines not shown) ...

mlir/test/Dialect/OpenMP/ops.mlir

... (1,040 lines not shown) ...	func.func @omp_atomic_capture(%v: memref<i32>, %x: memref<i32>, %expr: i32) {
return		return
}		}

// CHECK-LABEL: omp_sectionsop		// CHECK-LABEL: omp_sectionsop
func.func @omp_sectionsop(%data_var1 : memref<i32>, %data_var2 : memref<i32>,		func.func @omp_sectionsop(%data_var1 : memref<i32>, %data_var2 : memref<i32>,
%data_var3 : memref<i32>, %redn_var : !llvm.ptr<f32>) {		%data_var3 : memref<i32>, %redn_var : !llvm.ptr<f32>) {
// CHECK: omp.sections allocate(%{{.*}} : memref<i32> -> %{{.*}} : memref<i32>)		// CHECK: omp.sections allocate(%{{.*}} : memref<i32> -> %{{.*}} : memref<i32>)
"omp.sections" (%data_var1, %data_var1) ({		"omp.sections" (%data_var1, %data_var1) ({
// CHECK: omp.terminator
omp.terminator		omp.terminator
}) {operand_segment_sizes = dense<[0,1,1]> : vector<3xi32>} : (memref<i32>, memref<i32>) -> ()		}) {operand_segment_sizes = dense<[0,1,1]> : vector<3xi32>} : (memref<i32>, memref<i32>) -> ()

// CHECK: omp.sections reduction(@add_f32 -> %{{.*}} : !llvm.ptr<f32>)		// CHECK: omp.sections reduction(@add_f32 -> %{{.*}} : !llvm.ptr<f32>)
"omp.sections" (%redn_var) ({		"omp.sections" (%redn_var) ({
// CHECK: omp.terminator
omp.terminator		omp.terminator
}) {operand_segment_sizes = dense<[1,0,0]> : vector<3xi32>, reductions=[@add_f32]} : (!llvm.ptr<f32>) -> ()		}) {operand_segment_sizes = dense<[1,0,0]> : vector<3xi32>, reductions=[@add_f32]} : (!llvm.ptr<f32>) -> ()

// CHECK: omp.sections nowait {		// CHECK: omp.sections nowait {
omp.sections nowait {		omp.sections nowait {
// CHECK: omp.terminator
omp.terminator		omp.terminator
}		}

// CHECK: omp.sections reduction(@add_f32 -> %{{.*}} : !llvm.ptr<f32>) {		// CHECK: omp.sections reduction(@add_f32 -> %{{.*}} : !llvm.ptr<f32>) {
omp.sections reduction(@add_f32 -> %redn_var : !llvm.ptr<f32>) {		omp.sections reduction(@add_f32 -> %redn_var : !llvm.ptr<f32>) {
// CHECK: omp.terminator
omp.terminator		omp.terminator
}		}

// CHECK: omp.sections allocate(%{{.*}} : memref<i32> -> %{{.*}} : memref<i32>)		// CHECK: omp.sections allocate(%{{.*}} : memref<i32> -> %{{.*}} : memref<i32>)
omp.sections allocate(%data_var1 : memref<i32> -> %data_var1 : memref<i32>) {		omp.sections allocate(%data_var1 : memref<i32> -> %data_var1 : memref<i32>) {
// CHECK: omp.terminator
omp.terminator		omp.terminator
}		}

// CHECK: omp.sections nowait		// CHECK: omp.sections nowait
omp.sections nowait {		omp.sections nowait {
// CHECK: omp.section		// CHECK: omp.section
omp.section {		omp.section {
// CHECK: %{{.*}} = "test.payload"() : () -> i32		// CHECK: %{{.*}} = "test.payload"() : () -> i32
%1 = "test.payload"() : () -> i32		%1 = "test.payload"() : () -> i32
// CHECK: %{{.*}} = "test.payload"() : () -> i32		// CHECK: %{{.*}} = "test.payload"() : () -> i32
%2 = "test.payload"() : () -> i32		%2 = "test.payload"() : () -> i32
// CHECK: %{{.*}} = "test.payload"(%{{.*}}, %{{.*}}) : (i32, i32) -> i32		// CHECK: %{{.*}} = "test.payload"(%{{.*}}, %{{.*}}) : (i32, i32) -> i32
%3 = "test.payload"(%1, %2) : (i32, i32) -> i32		%3 = "test.payload"(%1, %2) : (i32, i32) -> i32
}		}
// CHECK: omp.section		// CHECK: omp.section
omp.section {		omp.section {
// CHECK: %{{.*}} = "test.payload"(%{{.*}}) : (!llvm.ptr<f32>) -> i32		// CHECK: %{{.*}} = "test.payload"(%{{.*}}) : (!llvm.ptr<f32>) -> i32
%1 = "test.payload"(%redn_var) : (!llvm.ptr<f32>) -> i32		%1 = "test.payload"(%redn_var) : (!llvm.ptr<f32>) -> i32
}		}
// CHECK: omp.section		// CHECK: omp.section
omp.section {		omp.section {
// CHECK: "test.payload"(%{{.*}}) : (!llvm.ptr<f32>) -> ()		// CHECK: "test.payload"(%{{.*}}) : (!llvm.ptr<f32>) -> ()
"test.payload"(%redn_var) : (!llvm.ptr<f32>) -> ()		"test.payload"(%redn_var) : (!llvm.ptr<f32>) -> ()
}		}
// CHECK: omp.terminator
omp.terminator
}		}
return		return
}		}

// CHECK-LABEL: func @omp_single		// CHECK-LABEL: func @omp_single
func.func @omp_single() {		func.func @omp_single() {
omp.parallel {		omp.parallel {
// CHECK: omp.single {		// CHECK: omp.single {
... (245 lines not shown) ...

Revision Contents

Diff 432270
