This is an archive of the discontinued LLVM Phabricator instance.


Differential D155765

[OpenMP Dialect] Add omp.canonical_loop operation.
Needs Review · Public

Authored by shraiysh on Jul 19 2023, 5:03 PM.


Details

Reviewers

abidmalikwaterloo
kiranchandramohan
do
ftynse
jdoerfert
nicolasvasilache
skatrak
jdenny
Meinersbur
raghavendhra
dpalermo

Summary

This patch continues the work of D147658. It adds the omp.canonical_loop operation as the basic building block for everything loop-related in OpenMP, such as worksharing-loop, distribute, loop transformation, etc.

In contrast to the current omp.wsloop approach, omp.canonical_loop:

- Requires loop-related semantics to be implemented only once
- Is composable with OpenMP loop transformations such as unrolling and tiling
- Is supposed to eventually support non-rectangular loops
- Supports expressing non-perfectly nested loops

This patch only adds the MLIR representation; to do something useful, I still have to implement lowering from Flang with at least the DO construct, and lowering to LLVM-IR using the OpenMPIRBuilder.

The pretty syntax currently is

omp.canonical_loop $iv in [0, %tripcount) { ... }

where [0, %tripcount) represents the half-open integer range of an OpenMP logical iteration space. The unbalanced parentheses/brackets and the 0 keyword might not be universally liked. I could think of alternatives such as

omp.canonical_loop $iv = range(%tripcount) { ... }
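To illustrate what a front-end would have to do to reach the [0, %tripcount) form, here is a minimal C sketch. It is not part of this patch: the helper names `trip_count` and `user_iv` are hypothetical, and the sketch assumes a loop of the shape `for (i = lb; i < ub; i += step)` for a positive step, or `for (i = lb; i > ub; i += step)` for a negative one.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): number of iterations of
 *   for (i = lb; i < ub; i += step)   when step > 0
 *   for (i = lb; i > ub; i += step)   when step < 0
 * The arithmetic is done in unsigned 64-bit so that the distance between
 * the bounds cannot overflow a signed type. */
static uint64_t trip_count(int64_t lb, int64_t ub, int64_t step) {
    if (step > 0) {
        if (lb >= ub)
            return 0; /* empty iteration space */
        uint64_t span = (uint64_t)ub - (uint64_t)lb;
        return (span - 1) / (uint64_t)step + 1; /* ceil(span / step) */
    }
    if (step < 0) {
        if (lb <= ub)
            return 0; /* empty iteration space */
        uint64_t span = (uint64_t)lb - (uint64_t)ub;
        uint64_t magnitude = (uint64_t)0 - (uint64_t)step; /* |step| */
        return (span - 1) / magnitude + 1;
    }
    return 0; /* step == 0 is not a canonical loop; treated as empty here */
}

/* Hypothetical helper: recover the user-visible induction variable from
 * the logical iteration number in [0, trip_count). */
static int64_t user_iv(int64_t lb, int64_t step, uint64_t logical) {
    return (int64_t)((uint64_t)lb + logical * (uint64_t)step);
}
```

With these helpers, a Fortran `do i = 1, 10` (exclusive upper bound 11) has a trip count of 10, and logical iteration 9 maps back to the user value 10.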

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Meinersbur created this revision. Jul 19 2023, 5:03 PM

Meinersbur requested review of this revision. Jul 19 2023, 5:03 PM


Thanks @Meinersbur for this patch. It is also exciting to see your patches in MLIR. :)

This patch only adds the MLIR representation; to do something useful, I still have to implement lowering from Flang with at least the DO construct, and lowering to LLVM-IR using the OpenMPIRBuilder.

I will give this a quick try to see how this fits in with the lowering from Flang. Previously, we were against the [0,tripcount) representation; now, with a better understanding, we can hopefully make it work without much hassle.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
440	`SingleBlockImplicitTerminator` will enforce the requirement for a single block in the loop and hence would disallow branches inside the loop.
443	While this representation definitely conforms to one of the representations mentioned in the OpenMP canonical loop definition, it does not directly cover all the forms. We might have to word it slightly differently.
mlir/test/Dialect/OpenMP/cli.mlir
81–85	Nit: empty lines

kiranchandramohan added a reviewer: skatrak. Jul 20 2023, 3:13 AM

Meinersbur added a reviewer: jdenny. Jul 21 2023, 10:15 AM

raghavendhra added a subscriber: raghavendhra. Jul 31 2023, 7:10 AM

TIFitis added a subscriber: TIFitis. Aug 14 2023, 5:46 AM

Hi @Meinersbur. Thank you for this patch. This looks like it would make the IR cleaner. It isn't clear to me how the loop info result of this operation is going to be used in the IR. Can you please elaborate on that? Also, an example of how it would look under omp.parallel would be helpful to understand.

jsjodin added a subscriber: jsjodin. Aug 25 2023, 8:02 AM

jsjodin added inline comments.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
523	I'm trying to understand how the yield works. What determines if an inner canonical loop should/must be yielded and be part of the result of an outer canonical loop? Can there be multiple non-nested canonical loops inside the body of a canonical loop?

Meinersbur added inline comments. Aug 25 2023, 12:23 PM

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
523	The inner loop should always be yielded if it is to be considered part of a canonical loop nest in the OpenMP sense. Multiple loops are returned for deeper loop nests -- see cli.mlir for examples. There can be other non-canonical loops nested in the loop body. They are basically ignored. OpenMP does not allow two canonical loops at the same level (e.g. sequentially executed), only nested within the other.

jsjodin added inline comments. Aug 25 2023, 1:53 PM

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
523	Thanks for the clarification; adding this op looks reasonable to me.

Hi Michael,
thank you very much for your work.

Could you provide a high level (Fortran/C) example with OpenMP pragmas and show the expected MLIR code? Do I understand correctly that the code:

!$omp parallel do
do index = 1, 10
    call do_body(index)
end do
!$omp end parallel do

will be lowered to the MLIR code:

omp.parallel {
   omp.do {
      omp.canonical_loop %index : i32 in [0, 10) {
          do_body(%index)
      }
   }
}

After discussing with Michael, I will be trying to implement this operation. @domada I have added an example of intended use in the operation with teams distribute construct. Hope it helps.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
440	The idea behind allowing only one block is to be able to statically infer the exact `CanonicalLoopInfo` objects to apply transformations to. We discussed branching, and while it has not been added right now, we were considering adding an omp.region (name suggestions welcome) later, which would allow us to have branching within a canonical loop while still having concrete information about CLIs.

%outer, %inner = omp.canonical_loop {
  %innertmp = omp.canonical_loop {
    omp.region {
      // ... branch within canonical loop
    }
    omp.yield
  }
  omp.yield %innertmp
}

What do you think about this, @kiranchandramohan? (@Meinersbur, please correct me if I am wrong.)
443	Is the current wording okay?

Address comments and nits

shraiysh marked 5 inline comments as done. Aug 28 2023, 5:33 PM

shraiysh added reviewers: raghavendhra, dpalermo.

Harbormaster completed remote builds in B255359: Diff 554116. Aug 28 2023, 6:51 PM

Fix build by adding enums and attributes in CMakeLists.txt

Harbormaster completed remote builds in B255419: Diff 554190. Aug 28 2023, 11:37 PM

In D155765#4623187, @shraiysh wrote:

After discussing with Michael, I will be trying to implement this operation. @domada I have added an example of intended use in the operation with teams distribute construct. Hope it helps.

I am thinking we might not have to name the canonical loops and use yield, since the structure is simply a list of nested loops they can be identified by nesting level if they need to be referred to at all.
So could we just nest the operations instead like below?

omp.collapse {
  omp.canonical_loop {
    omp.canonical_loop {
      omp.region { .... }
    }
  }
}

I like @jsjodin's idea about nesting these. It avoids generating problematic IR like this

%cli1, %cli2, %cli3 = omp.canonical_loop ... {
  %cli2, %cli3 = omp.canonical_loop ... {
    %cli3 = omp.canonical_loop ... {
    }
    omp.yield(%cli3)
  }
  omp.yield(%cli2, %cli3)
}
%mergedcli = omp.collapse(%cli1, %cli3) // Problem here - collapsing innermost and outermost loops

What do you think @Meinersbur?

In D155765#4625976, @shraiysh wrote:
I like @jsjodin's idea about nesting these. It avoids generating problematic IR like this
%cli1, %cli2, %cli3 = omp.canonical_loop ... {
  %cli2, %cli3 = omp.canonical_loop ... {
    %cli3 = omp.canonical_loop ... {
    }
    omp.yield(%cli3)
  }
  omp.yield(%cli2, %cli3)
}
%mergedcli = omp.collapse(%cli1, %cli3) // Problem here - collapsing innermost and outermost loops
What do you think @Meinersbur?

Nesting is sufficient to model up to OpenMP 5.2. But it will not be able to model clauses like apply that are coming in the next version of the standard. @Meinersbur was very particular about this and clarified in https://discourse.llvm.org/t/mlir-summit-openmp-roundtable-discussion-summary/66574/4.

In D155765#4630479, @kiranchandramohan wrote:
Nesting is sufficient to model up to OpenMP 5.2. But it will not be able to model clauses like apply that are coming in the next version of the standard. @Meinersbur was very particular about this and clarified in https://discourse.llvm.org/t/mlir-summit-openmp-roundtable-discussion-summary/66574/4.

Looking at the example on discourse I still don't see why it is not sufficient to nest operations, given that an index can be used to refer to any canonical loop in a nest. The example shows this:

%outer = omp.canonical_loop for (%c0) to (%c10) step (%c1) {
  %inner = omp.canonical_loop for (%d0) to (%d10) step (%d1) {
    ..
  }
}
%tiled:4 = omp.tile loops(%outer,%inner) { tile_sizes=[4,4] } ;
omp.ws loops(%tiled#0)

An equivalent nested version would be:

omp.ws {
  omp.tile_loops { tile_sizes = [4,4], loops = [0,1] } {
    omp.canonical_loop for (%c0) to (%c10) step (%c1) {
      omp.canonical_loop for (%d0) to (%d10) step (%d1) {
        ..
      }
    }
  }
}

In D155765#4631222, @jsjodin wrote:
Looking at the example on discourse I still don't see why it is not sufficient to nest operations, given that an index can be used to refer to any canonical loop in a nest. The example shows this:
%outer = omp.canonical_loop for (%c0) to (%c10) step (%c1) {
  %inner = omp.canonical_loop for (%d0) to (%d10) step (%d1) {
    ..
  }
}
%tiled:4 = omp.tile loops(%outer,%inner) { tile_sizes=[4,4] } ;
omp.ws loops(%tiled#0)
An equivalent nested version would be:
omp.ws {
  omp.tile_loops { tile_sizes = [4,4], loops = [0,1] } {
    omp.canonical_loop for (%c0) to (%c10) step (%c1) {
      omp.canonical_loop for (%d0) to (%d10) step (%d1) {
        ..
      }
    }
  }
}

I meant @Meinersbur's reply (https://discourse.llvm.org/t/mlir-summit-openmp-roundtable-discussion-summary/66574/5) to my question, which shows an example that tiles the loop and then unrolls the inner loop. Copying it here for reference.

#pragma omp tile sizes(4) apply(intratile:unroll)
for (int i = 0; i < 64; ++i) ;

which is equivalent to:

#pragma omp 
for (int i1 = 0; i1 < 64; i1+=4) 
  #pragma omp unroll
  for (int i = i1; i < i1+4; ++i) ;
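To make the claimed equivalence concrete, here is a small C sketch that records the visit order of the original loop and of a hand-written tiled version whose intratile loop is fully unrolled. The function names are hypothetical, and full unrolling is only one outcome an implementation may choose for a bare `unroll`; the point is just that both forms visit the same logical iterations in the same order.

```c
#include <stddef.h>

enum { N = 64, TILE = 4 };

/* Original loop: records each visited value of i in order. */
static size_t run_original(int out[N]) {
    size_t n = 0;
    for (int i = 0; i < N; ++i)
        out[n++] = i;
    return n;
}

/* Tiled by TILE, with the intratile loop unrolled by hand (assumed full
 * unroll; a partial unroll would also preserve the order). */
static size_t run_tiled_unrolled(int out[N]) {
    size_t n = 0;
    for (int i1 = 0; i1 < N; i1 += TILE) { /* floor loop over tiles */
        out[n++] = i1 + 0;
        out[n++] = i1 + 1;
        out[n++] = i1 + 2;
        out[n++] = i1 + 3;
    }
    return n;
}

/* Returns 1 if both versions visit the same iterations in the same order. */
static int same_iteration_order(void) {
    int a[N], b[N];
    if (run_original(a) != N || run_tiled_unrolled(b) != N)
        return 0;
    for (size_t k = 0; k < N; ++k)
        if (a[k] != b[k])
            return 0;
    return 1;
}
```

Because N is a multiple of TILE here, no remainder tile is needed; a trip count that is not a multiple of the tile size would require a guarded or shortened last tile.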

In D155765#4632789, @kiranchandramohan wrote:
I meant @Meinersbur's reply (https://discourse.llvm.org/t/mlir-summit-openmp-roundtable-discussion-summary/66574/5) to my question, which shows an example that tiles the loop and then unrolls the inner loop. Copying it here for reference.
#pragma omp tile sizes(4) apply(intratile:unroll)
for (int i = 0; i < 64; ++i) ;
which is equivalent to:
#pragma omp 
for (int i1 = 0; i1 < 64; i1+=4) 
  #pragma omp unroll
  for (int i = i1; i < i1+4; ++i) ;

Yes, that is the example I was referring to. I just copied the MLIR, but didn't bother to put in the OpenMP code, so thanks for doing that.

In D155765#4632957, @jsjodin wrote:
Yes, that is the example I was referring to. I just copied the MLIR, but didn't bother to put in the OpenMP code, so thanks for doing that.

I think I am missing the point here. But just to confirm, the example above has only a single loop on which a tiling transformation is performed, followed by an unroll on the inner loop. The approach proposed in this patch, I believe, would model it as:

#pragma omp tile sizes(4) apply(intratile:unroll)
for (int i = 0; i < 64; ++i) ;

%lp = omp.canonical_loop for (%c0) to (%c64) step (%c1) {
    ..
}
%tiled:2 = omp.tile loops(%lp) { tile_sizes=[4] } 
%unrolled = omp.unroll loops(%tiled#1)

Representing this with the nested approach would require specifying which loop the unroll should apply to, and since it is not the outermost loop, we will have to specify it with an index. I am using loop(n) with an integer argument to specify the loop at the depth that we want (0 for the loop enclosed, 1 for the following loop and so on).

omp.unroll loop(1) {
  omp.tile loop(0) { tile_sizes=[4] } {
    omp.canonical_loop for (%c0) to (%c64) step (%c1) {
      ..
    }
  }
}

I don't know whether coming up with an integer argument is always possible. We will have to effectively interpret the whole sequence of transformations to come up with this number. And with loop-fission (not in 6.0) there will be multiple loops at a nesting level and to index them accurately, each nesting level will need a multi-dimensional index.
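As a rough illustration of the bookkeeping involved, the following C sketch computes loop indices after a single tiling step. It assumes the loop order the OpenMP tile construct generates (the floor loops first, then the tile loops, then any untouched inner loops); the function names are hypothetical. A chain of transformations would need to compose such mappings step by step, which is the interpretation effort described above.

```c
/* Depth of the loop nest after tiling its outermost k loops:
 * each tiled loop is replaced by a floor loop and a tile loop. */
static int nest_depth_after_tile(int depth, int k) {
    return depth + k;
}

/* Index of the floor loop generated from original loop j (0 <= j < k). */
static int floor_loop_index(int j) {
    return j;
}

/* Index of the tile (intratile) loop generated from original loop j. */
static int tile_loop_index(int j, int k) {
    return k + j;
}

/* New index of an original loop j that was not tiled (j >= k). */
static int shifted_loop_index(int j, int k) {
    return j + k;
}
```

For the single-loop example, k = 1 gives a nest of depth 2 with the intratile loop at index 1, matching `loop(1)` above; tiling two loops gives four generated loops, matching `%tiled:4` in the earlier example.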

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
440	Sounds OK to me. `omp.region` looks like a nice Op. Might be useful for easing lowering in other cases too. Keep in mind that this will be required quite soon since even an `if` statement gets converted to branches by the time we are in LLVM dialect.
443	Yes, the wording looks fine.

In D155765#4634817, @kiranchandramohan wrote:
I think I am missing the point here. But just to confirm, the example above has only a single loop on which a tiling transformation is performed, followed by an unroll on the inner loop. The approach proposed in this patch, I believe, would model it as:

#pragma omp tile sizes(4) apply(intratile:unroll)
for (int i = 0; i < 64; ++i) ;
%lp = omp.canonical_loop for (%c0) to (%c64) step (%c1) {
    ..
}
%tiled:2 = omp.tile loops(%lp) { tile_sizes=[4] } 
%unrolled = omp.unroll loops(%tiled#1)
Representing this with the nested approach would require specifying which loop the unroll should apply to, and since it is not the outermost loop, we will have to specify it with an index. I am using loop(n) with an integer argument to specify the loop at the depth that we want (0 for the loop enclosed, 1 for the following loop and so on).

Yes, that is correct, and it looks like the example code, if I understand the notation correctly, uses indices to specify the loops. Does %tiled:2 mean it produces two loops, and does %tiled#1 mean index 1 of the loops?

omp.unroll loop(1) {
  omp.tile loop(0) { tile_sizes=[4] } {
    omp.canonical_loop for (%c0) to (%c64) step (%c1) {
      ..
    }
  }
}
I don't know whether coming up with an integer argument is always possible. We will have to effectively interpret the whole sequence of transformations to come up with this number. And with loop-fission (not in 6.0) there will be multiple loops at a nesting level and to index them accurately, each nesting level will need a multi-dimensional index.

Ah okay, I think that I wasn't clear about what the nesting operations do with respect to the loop information. Each operation would transform the loop information, so internally it would be similar to the current proposal. It is preferable to avoid having a representation that allows multiple ways to create malformed IR, which I think the nesting would avoid.

There are two possible approaches for the unrolled operation here -

%unrolled1 = omp.unroll loops(%tiled) at(1)
%unrolled2 = omp.unroll loops(%tiled#1)

The good thing about the first approach, with at(1), is that because both loops are inputs, we transform the inner loop and get the new single loop as output, which can be used directly in other constructs. Note that this is similar to Jan's suggestion about nesting and maintaining indices.
The problem with the second approach is that the input is only the inner loop and the output (%unrolled2) should represent a structured block (not a loop) which should be captured within the loop %tiled#0 somehow. (maybe an omp.merge_canonical_loop_with_body operation?).

In D155765#4638411, @shraiysh wrote:
There are two possible approaches for the unrolled operation here -
%unrolled1 = omp.unroll loops(%tiled) at(1)
%unrolled2 = omp.unroll loops(%tiled#1)

I realized that we cannot really do the first approach, because unroll could be present without nested loops. The nesting of OpenMP constructs could help with this. It might require complex indexing though. @kiranchandramohan do we have the details on loop-fission or the loop-modifiers for apply (like intratile)? I checked the technical report document here but it doesn't give much detail about these modifiers.

Moved to https://github.com/llvm/llvm-project/pull/65380 because phabricator has been unresponsive lately (I have to hit reply multiple times before it sends one).

Revision Contents

Path	Size
mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt	7 lines
mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h	3 lines
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td	151 lines
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp	121 lines
mlir/test/Dialect/OpenMP/cli.mlir	88 lines

Diff 554190

mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt

 set(LLVM_TARGET_DEFINITIONS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Frontend/OpenMP/OMP.td)
 mlir_tablegen(OmpCommon.td --gen-directive-decl --directives-dialect=OpenMP)
 add_public_tablegen_target(omp_common_td)

+add_mlir_dialect(OpenMPOps omp)
 set(LLVM_TARGET_DEFINITIONS OpenMPOps.td)
-mlir_tablegen(OpenMPOpsDialect.h.inc -gen-dialect-decls -dialect=omp)
-mlir_tablegen(OpenMPOpsDialect.cpp.inc -gen-dialect-defs -dialect=omp)
-mlir_tablegen(OpenMPOps.h.inc -gen-op-decls)
-mlir_tablegen(OpenMPOps.cpp.inc -gen-op-defs)
 mlir_tablegen(OpenMPOpsEnums.h.inc -gen-enum-decls)
 mlir_tablegen(OpenMPOpsEnums.cpp.inc -gen-enum-defs)
 mlir_tablegen(OpenMPOpsAttributes.h.inc -gen-attrdef-decls -attrdefs-dialect=omp)
 mlir_tablegen(OpenMPOpsAttributes.cpp.inc -gen-attrdef-defs -attrdefs-dialect=omp)
 add_mlir_doc(OpenMPOps OpenMPDialect Dialects/ -gen-dialect-doc -dialect=omp)
-add_public_tablegen_target(MLIROpenMPOpsIncGen)
 add_dependencies(OpenMPDialectDocGen omp_common_td)

 add_mlir_interface(OpenMPOpsInterfaces)

 set(LLVM_TARGET_DEFINITIONS OpenMPTypeInterfaces.td)
 mlir_tablegen(OpenMPTypeInterfaces.h.inc -gen-type-interface-decls)
 mlir_tablegen(OpenMPTypeInterfaces.cpp.inc -gen-type-interface-defs)
 add_public_tablegen_target(MLIROpenMPTypeInterfacesIncGen)
 add_dependencies(mlir-generic-headers MLIROpenMPTypeInterfacesIncGen)

mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h

(19 lines not shown)
 #include "mlir/IR/SymbolTable.h"
 #include "mlir/Interfaces/ControlFlowInterfaces.h"
 #include "mlir/Interfaces/SideEffectInterfaces.h"

 #include "mlir/Dialect/OpenMP/OpenMPOpsDialect.h.inc"
 #include "mlir/Dialect/OpenMP/OpenMPOpsEnums.h.inc"
 #include "mlir/Dialect/OpenMP/OpenMPTypeInterfaces.h.inc"

+#define GET_TYPEDEF_CLASSES
+#include "mlir/Dialect/OpenMP/OpenMPOpsTypes.h.inc"
+
 #define GET_ATTRDEF_CLASSES
 #include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.h.inc"

 #include "mlir/Dialect/OpenMP/OpenMPInterfaces.h"

 #define GET_OP_CLASSES
 #include "mlir/Dialect/OpenMP/OpenMPOps.h.inc"

 #endif // MLIR_DIALECT_OPENMP_OPENMPDIALECT_H_

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td

(22 lines not shown)
 include "mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td"
 include "mlir/Dialect/OpenMP/OpenMPTypeInterfaces.td"

 def OpenMP_Dialect : Dialect {
   let name = "omp";
   let cppNamespace = "::mlir::omp";
   let dependentDialects = ["::mlir::LLVM::LLVMDialect, ::mlir::func::FuncDialect"];
   let useDefaultAttributePrinterParser = 1;
+  let useDefaultTypePrinterParser = 1;
   let usePropertiesForAttributes = 1;
 }

 // OmpCommon requires definition of OpenACC_Dialect.
 include "mlir/Dialect/OpenMP/OmpCommon.td"

 //===----------------------------------------------------------------------===//
 // OpenMP Attributes
(35 lines not shown; context: def TargetAttr : OpenMP_Attr<"Target", "target"> {)
   let parameters = (ins
     StringRefParameter<>:$target_cpu,
     StringRefParameter<>:$target_features
   );
   let assemblyFormat = "`<` struct(params) `>`";
 }

+class OpenMP_Type<string name, string typeMnemonic> :
+    TypeDef<OpenMP_Dialect, name> {
+  let mnemonic = typeMnemonic;
+}
+
 class OpenMP_Op<string mnemonic, list<Trait> traits = []> :
     Op<OpenMP_Dialect, mnemonic, traits>;

 // Type which can be constraint accepting standard integers and indices.
 def IntLikeType : AnyTypeOf<[AnyInteger, Index]>;

 def OpenMP_PointerLikeType : TypeAlias<OpenMP_PointerLikeTypeInterface,
(305 lines not shown; context: oilist(`allocate` `(`)
       $allocators_vars, type($allocators_vars)
     ) `)`
     |`nowait` $nowait
   ) $region attr-dict
   }];
   let hasVerifier = 1;
 }

+//===---------------------------------------------------------------------===//
+// OpenMP Canonical Loop Operation
+//===---------------------------------------------------------------------===//
+
+def CanonicalLoopInfoType : OpenMP_Type<"CanonicalLoopInfo", "cli"> {
+  let summary = "Type for representing a reference to a canonical loop";
+  let description = [{
+    A variable of type CanonicalLoopInfo refers to an OpenMP-compatible
+    canonical loop in the same function. Variables of this type are not
+    available at runtime and therefore cannot be used by the program itself,
+    i.e. an opaque type. It is similar to the transform dialect's
+    `!transform.interface` type, but instead of implementing an interface
+    for each transformation, the OpenMP dialect itself defines possible
+    operations on this type.
+
+    A CanonicalLoopInfo variable can
+
+    1. be passed to omp.yield to be accessible outside the loop.
+    2. passed to omp operations that take a CanonicalLoopInfo argument,
+       such as `omp.unroll`.
+
+    A CanonicalLoopInfo variable can not
+
+    1. be returned from a function,
+    2. passed to operations that are not specifically designed to take a
+       CanonicalLoopInfo, including AnyType.
+
+    A CanonicalLoopInfo variable directly corresponds to an object of
+    OpenMPIRBuilder's CanonicalLoopInfo struct when lowering to LLVM-IR.
+  }];
+}

+def CanonicalLoopOp : OpenMP_Op<"canonical_loop", [SingleBlockImplicitTerminator<"omp::YieldOp">]> {

kiranchandramohan (Not Done): `SingleBlockImplicitTerminator` will enforce the requirement for a single block in the loop and hence would disallow branches inside the loop.

shraiysh (Author, Not Done): The idea behind allowing only one block is to be able to statically infer the exact `CanonicalLoopInfo` objects to apply transformations to. We discussed branching, and while it has not been added right now, we were considering adding an omp.region (name suggestions welcome) later, which would allow us to have branching within a canonical loop while still having concrete information about CLIs.

%outer, %inner = omp.canonical_loop {
  %innertmp = omp.canonical_loop {
    omp.region {
      // ... branch within canonical loop
    }
    omp.yield
  }
  omp.yield %innertmp
}

What do you think about this, @kiranchandramohan? (@Meinersbur, please correct me if I am wrong.)

kiranchandramohan (Not Done): Sounds OK to me. `omp.region` looks like a nice Op. Might be useful for easing lowering in other cases too. Keep in mind that this will be required quite soon since even an `if` statement gets converted to branches by the time we are in LLVM dialect.

+  let summary = "OpenMP Canonical Loop Operation";
+  let description = [{
+    All loops that conform to OpenMP's definition of a canonical loop can be

kiranchandramohan (Done):

   let description = [{
-    A CanonicalLoopOp represents a loop that is conforms to OpenMP's defintion
+    A CanonicalLoopOp represents a loop that conforms to OpenMP's definition
     of a canonical loop. In particular, there are no loop-carried variables

kiranchandramohan (Not Done): While this representation definitely conforms to one of the representations mentioned in the OpenMP canonical loop definition, it does not directly cover all the forms. We might have to word it slightly differently.

shraiysh (Author, Not Done): Is the current wording okay?

kiranchandramohan (Not Done): Yes, the wording looks fine.

simplified to a CanonicalLoopOp. In particular, there are no loop-carried

variables and the number of iterations it will execute is know before the

operation. This allows e.g. to determine the number of threads and chunks

the iterations space is split into before executing any iteration. More

restrictions may apply in cases such as (collapsed) loop nests, doacross

loops, etc.

The induction variable is always of the same type as the tripcount argument.

Since it can never be negative, tripcount is always interpreted as an

unsigned integer. It is the caller's responsbility to ensure the tripcount

is not negative when its interpretation is signed, i.e.

`%tripcount = max(0,%tripcount)`.

In contrast to other loop operations such as `scf.for`, the number of

iterations is determined by only a single variable, the trip-count. The

induction variable value is the logical iteration number of that iteration,

which OpenMP defines to be between 0 and the trip-count (exclusive).

Loop representations having lower-bound, upper-bound, and step-size operands

require passes to do more work than necessary, including handling special

kiranchandramohanUnsubmitted

Done

Loop representation having lower-bound, upper-bound, and step-size operands,

- require passes to do more work than necessary, incliding handling special

+ require passes to do more work than necessary, including handling special

cases such as upper-bound smaller than lower-bound, upper-bound equal to

kiranchandramohan:

cases such as upper-bound smaller than lower-bound, upper-bound equal to

the integer type's maximal value, negative step size, etc. This complexity

is better handled only once by the front-end, which can apply its semantics

for such cases while still being able to represent any kind of loop, which

is kind of the point of a mid-end intermediate representation. User-defined

types such as random-access iterators in C++ could not directly be

represented anyway.

The return value of an `omp.canonical_loop` is a CanonicalLoopInfo that can be

used to refer to the canonical loop to apply transformations -- such as

tiling, unrolling, or work-sharing -- to the loop, similar to the transform

dialect but with OpenMP-specific semantics. To refer to nested canonical

loops, the CanonicalLoopInfo can be passed to omp.yield and becomes an

additional return value of the outer `omp.canonical_loop`.

Every `omp.yield` on the loop body must be passed the same CanonicalLoopInfo

since nesting is a static/compile-time property.

A CanonicalLoopOp can be lowered to LLVM-IR using OpenMPIRBuilder's

createCanonicalLoop method.

#### Examples

Translation from lower-bound, upper-bound, step-size to trip-count.

```c

for (int i = 3; i < 42; i+=2) {

B[i] = A[i];

}

```

```mlir

%lb = arith.constant 3 : i32

%ub = arith.constant 42 : i32

%step = arith.constant 2 : i32

%range = arith.subi %ub, %lb : i32

%tc = arith.ceildivui %range, %step : i32

%cli = omp.canonical_loop %iv : i32 in [0, %tc) {

%offset = arith.muli %iv, %step : i32

%i = arith.addi %offset, %lb : i32

%a = load %arrA[%i] : memref<?xf32>

store %a, %arrB[%i] : memref<?xf32>

}

```
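
For reference, the translation in this example has to round up: `i = 3, 5, ..., 41` is 20 iterations, whereas a truncating `(42 - 3) / 2` gives 19. Below is a minimal C++ sketch of the front-end computation, assuming a positive step and a `<` comparison. `tripCount` and `userIV` are hypothetical helper names, not part of the patch or of OpenMPIRBuilder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical front-end helper (not part of the patch): number of iterations
// of `for (i = lb; i < ub; i += step)` for a positive step.
std::uint32_t tripCount(std::int32_t lb, std::int32_t ub, std::int32_t step) {
  assert(step > 0 && "this sketch only handles upward-counting loops");
  if (ub <= lb)
    return 0; // zero-trip loop; plays the role of the max(0, ...) clamp
  // Widen to 64 bits so that ub - lb cannot overflow for extreme bounds.
  std::int64_t range = static_cast<std::int64_t>(ub) - lb;
  // Ceiling division: a partial last stride still counts as an iteration.
  return static_cast<std::uint32_t>((range + step - 1) / step);
}

// Maps a logical iteration number back to the user's induction variable.
std::int32_t userIV(std::int32_t lb, std::int32_t step, std::uint32_t iv) {
  return static_cast<std::int32_t>(lb + static_cast<std::int64_t>(iv) * step);
}
```

With these definitions, `tripCount(3, 42, 2)` is 20 and the last logical iteration, 19, maps back to `i = 41`.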

Nested canonical loop with transformation.

```mlir

%outer,%inner = omp.canonical_loop %iv1 : i32 in [0, %tripcount) {

%inner = omp.canonical_loop %iv2 : i32 in [0, %tc) {

%a = load %arrA[%iv1, %iv2] : memref<?x?xf32>

store %a, %arrB[%iv1, %iv2] : memref<?x?xf32>

}

omp.yield(%inner : !omp.cli)

}

omp.tile(%outer, %inner : !omp.cli, !omp.cli)

```

Nested canonical loop with other constructs. The `omp.distribute`

operation has not been added yet, so this is a suggested use with other

constructs.

```mlir

omp.target {

%outer,%inner = omp.canonical_loop %iv1 : i32 in [0, %tripcount) {

jsjodinUnsubmitted

Done

I'm trying to understand how the yield works. What determines if an inner canonical loop should/must be yielded and be part of the result of an outer canonical loop? Can there be multiple non-nested canonical loops inside the body of a canonical loop?


MeinersburUnsubmitted

Done

The inner loop should always be yielded if it is to be considered part of a canonical loop nest in the OpenMP sense. Multiple loops are returned for deeper loop nests -- see cli.mlir for examples.

There can be other non-canonical loops nested in the loop body. They are basically ignored.

OpenMP does not allow two canonical loops at the same level (e.g. sequentially executed), only nested within the other.


jsjodinUnsubmitted

Done

> The inner loop should always be yielded if to be considered a canonical loop nest in the OpenMP sense. Multiple loops are returned for deeper loops nests -- see cli.mlir for examples.
>
> There can be other non-canonical loops nested in the loop body. They are basically ignored.
>
> OpenMP does not allow two canonical loops at the same level (e.g. sequentially executed), only nested within the other.

Thanks for the clarification, adding this op looks reasonable to me.

%inner = omp.canonical_loop %iv2 : i32 in [0, %tc) {

%a = load %arrA[%iv1, %iv2] : memref<?x?xf32>

store %a, %arrB[%iv1, %iv2] : memref<?x?xf32>

}

omp.yield(%inner : !omp.cli)

}

%collapsed_loopinfo = omp.collapse(%outer, %inner)

omp.teams {

call @foo() : () -> ()

omp.distribute(%collapsed_loopinfo)

}

```

}];

let hasCustomAssemblyFormat = 1;

let hasVerifier = 1;

let arguments = (ins IntLikeType:$tripCount);

let regions = (region AnyRegion:$region);

let results = (outs Variadic<CanonicalLoopInfoType>:$loopInfo);

let extraClassDeclaration = [{

::mlir::Value getInductionVar();

}];

}

//===----------------------------------------------------------------------===//

// 2.9.2 Workshare Loop Construct

//===----------------------------------------------------------------------===//

def WsLoopOp : OpenMP_Op<"wsloop", [AttrSizedOperandSegments,

AllTypesMatch<["lowerBound", "upperBound", "step"]>,

RecursiveMemoryEffects, ReductionClauseInterface]> {

let summary = "worksharing-loop construct";

(198 unchanged lines not shown; context: let summary = "simd loop construct";)

let hasCustomAssemblyFormat = 1;

let hasVerifier = 1;

}

def YieldOp : OpenMP_Op<"yield",

[Pure, ReturnLike, Terminator,

ParentOneOf<["WsLoopOp", "ReductionDeclareOp",

- "AtomicUpdateOp", "SimdLoopOp"]>]> {

+ "AtomicUpdateOp", "SimdLoopOp", "CanonicalLoopOp"]>]> {

let summary = "loop yield and termination operation";

let description = [{

"omp.yield" yields SSA values from the OpenMP dialect op region and

terminates the region. The semantics of how the values are yielded is

defined by the parent operation.

}];

let arguments = (ins Variadic<AnyType>:$results);

(1,166 unchanged lines not shown)

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp

//===- OpenMPDialect.cpp - MLIR Dialect for OpenMP implementation ---------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the OpenMP dialect and its operations.
(50 unchanged lines not shown; context: void OpenMPDialect::initialize() {)
addOperations<
#define GET_OP_LIST
#include "mlir/Dialect/OpenMP/OpenMPOps.cpp.inc"
>();
addAttributes<
#define GET_ATTRDEF_LIST
#include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.cpp.inc"
>();
+ addTypes<
+ #define GET_TYPEDEF_LIST
+ #include "mlir/Dialect/OpenMP/OpenMPOpsTypes.cpp.inc"
+ >();

addInterface<OpenMPDialectFoldInterface>();
LLVM::LLVMPointerType::attachInterface<
PointerLikeModel<LLVM::LLVMPointerType>>(*getContext());
MemRefType::attachInterface<PointerLikeModel<MemRefType>>(*getContext());
LLVM::LLVMPointerType::attachInterface<
PointerLikeModel<LLVM::LLVMPointerType>>(*getContext());

(1,443 unchanged lines not shown; context: if ((cct == ClauseCancellationConstructType::Sections) &&)
!(isa<SectionsOp>(parentOp) || isa<SectionOp>(parentOp))) {
return emitOpError() << "cancellation point sections must appear "
<< "inside a sections region";
}
// TODO : Add more when we support taskgroup.
return success();
}

		//===----------------------------------------------------------------------===//
		// CanonicalLoopOp
		//===----------------------------------------------------------------------===//

		Value mlir::omp::CanonicalLoopOp::getInductionVar() {
		return getRegion().getArgument(0);
		}

		void mlir::omp::CanonicalLoopOp::print(OpAsmPrinter &p) {
		p << " " << getInductionVar() << " : " << getInductionVar().getType()
		<< " in [0, " << getTripCount() << ") ";

		// omp.yield is implicit if no arguments passed to it.
		p.printRegion(getRegion(), /*printEntryBlockArgs=*/false,
		/*printBlockTerminators=*/getResultTypes().size() > 1);

		p.printOptionalAttrDict((*this)->getAttrs());
		}

		mlir::ParseResult
		mlir::omp::CanonicalLoopOp::parse(::mlir::OpAsmParser &parser,
		::mlir::OperationState &result) {
		Builder &builder = parser.getBuilder();
		MLIRContext *context = parser.getContext();

		// We derive the type of tripCount from inductionVariable. Unfortunately we
		// cannot do it the other way around because MLIR requires the type of
		// tripCount to be known when calling resolveOperand.
		OpAsmParser::Argument inductionVariable;
		if (parser.parseArgument(inductionVariable, /*allowType=*/true) ||
		parser.parseKeyword("in") || parser.parseLSquare())
		return failure();

		int zero = -1;
		SMLoc zeroLoc = parser.getCurrentLocation();
		if (parser.parseInteger(zero))
		return failure();
		if (zero != 0) {
		parser.emitError(zeroLoc, "Logical iteration space starts with zero");
		return failure();
		}

		OpAsmParser::UnresolvedOperand tripcount;
		if (parser.parseComma() || parser.parseOperand(tripcount) ||
		parser.parseRParen() ||
		parser.resolveOperand(tripcount, inductionVariable.type, result.operands))
		return failure();

		// Parse the loop body.
		Region *region = result.addRegion();
		if (parser.parseRegion(*region, {inductionVariable}))
		return failure();
		CanonicalLoopOp::ensureTerminator(*region, builder, result.location);

		// Return the CanonicalLoopInfo for this loop, plus the CanonicalLoopInfos
		// passed to omp.yield.
		int numResults = 0;
		for (Block &block : *region) {
		if (auto yield = dyn_cast<YieldOp>(block.getTerminator())) {
		numResults = yield.getNumOperands();
		break;
		}
		}
		numResults += 1;
		for (int i = 0; i < numResults; ++i)
		result.types.push_back(CanonicalLoopInfoType::get(context));

		// Parse the optional attribute list.
		if (parser.parseOptionalAttrDict(result.attributes))
		return failure();

		return mlir::success();
		}
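
The result-count rule implemented by the parser can be modeled in isolation: a loop returns one CanonicalLoopInfo for itself plus one per operand of its `omp.yield`. The sketch below is a toy model with hypothetical names (`LoopModel`, `numLoopInfos`, `makeNest`), not MLIR API. It assumes each loop yields every CanonicalLoopInfo of its nested loop, as the triple-nested case in cli.mlir does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Toy stand-in for omp.canonical_loop; this is not an MLIR class.
struct LoopModel {
  // The nested canonical loop whose CanonicalLoopInfos are passed to
  // omp.yield, or null if the body ends with an implicit, empty omp.yield.
  std::unique_ptr<LoopModel> yielded;
};

// One CanonicalLoopInfo for the loop itself, plus everything it yields.
std::size_t numLoopInfos(const LoopModel &loop) {
  return 1 + (loop.yielded ? numLoopInfos(*loop.yielded) : 0);
}

// Builds a nest of the given depth in which every level yields its inner loop.
LoopModel makeNest(std::size_t depth) {
  assert(depth >= 1 && "a loop nest has at least one loop");
  LoopModel loop;
  if (depth > 1)
    loop.yielded = std::make_unique<LoopModel>(makeNest(depth - 1));
  return loop;
}
```

Under that assumption the number of results equals the depth of the yielded nest, so `numLoopInfos(makeNest(3))` is 3, matching the `(!omp.cli, !omp.cli, !omp.cli)` result list of the outermost loop in the triple-nested test.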

		LogicalResult CanonicalLoopOp::verify() {
		Value indVar = getInductionVar();
		Value tripCount = getTripCount();
		Block *body = getBody();
		Region &region = getRegion();

		if (indVar.getType() != tripCount.getType())
		return emitOpError(
		"Region argument must be the same type as the trip count");

		auto numResults = getResultTypes().size();
		if (numResults <= 0)
		return emitOpError(
		"omp.canonical_loop must return at least one CanonicalLoopInfo");

		// Check arguments to omp.yield operations. The first yield is kept by
		// value; a pointer to the loop-scoped `yield` would dangle.
		YieldOp firstYield;
		for (Block &block : region) {
		if (auto yield = dyn_cast<YieldOp>(block.getTerminator())) {
		if (yield.getNumOperands() != numResults - 1)
		return emitOpError("omp.yield arguments must match number of return "
		"CanonicalLoopInfo's");

		if (!firstYield) {
		firstYield = yield;
		continue;
		}

		for (size_t i = 0; i < numResults - 1; ++i) {
		if (yield.getOperand(i) != firstYield.getOperand(i))
		return emitOpError("Each omp.yield must return the same values");
		}
		}
		}

		return success();
		}

#define GET_ATTRDEF_CLASSES
#include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.cpp.inc"

#define GET_OP_CLASSES
#include "mlir/Dialect/OpenMP/OpenMPOps.cpp.inc"

+ #define GET_TYPEDEF_CLASSES
+ #include "mlir/Dialect/OpenMP/OpenMPOpsTypes.cpp.inc"

mlir/test/Dialect/OpenMP/cli.mlir

This file was added.

				// RUN: mlir-opt %s | mlir-opt | FileCheck %s

				// CHECK-LABEL: @omp_canonloop_raw
				// CHECK-SAME: (%[[tc:.*]]: i32)
				func.func @omp_canonloop_raw(%tc : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %{{.*}} : i32 in [0, %[[tc]]) {
				%cli = "omp.canonical_loop" (%tc) ({
				^bb0(%iv: i32):
				// omp.yield without argument is implicit
				// CHECK-NOT: omp.yield
				omp.yield
				}) : (i32) -> (!omp.cli)
				return
				}

				// CHECK-LABEL: @omp_nested_canonloop_raw
				// CHECK-SAME: (%[[tc_outer:.*]]: i32, %[[tc_inner:.*]]: i32)
				func.func @omp_nested_canonloop_raw(%tc_outer : i32, %tc_inner : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %{{.*}} : i32 in [0, %[[tc_outer]]) {
				%outer,%inner = "omp.canonical_loop" (%tc_outer) ({
				^bb_outer(%iv_outer: i32):
				// CHECK: %[[inner_cli:.*]] = omp.canonical_loop %{{.*}} : i32 in [0, %[[tc_inner]]) {
				%inner = "omp.canonical_loop" (%tc_inner) ({
				^bb_inner(%iv_inner: i32):
				omp.yield
				}) : (i32) -> (!omp.cli)
				// CHECK: omp.yield(%[[inner_cli]] : !omp.cli)
				omp.yield (%inner : !omp.cli)
				}) : (i32) -> (!omp.cli, !omp.cli)
				return
				}

				// CHECK-LABEL: @omp_triple_nested_canonloop_raw
				func.func @omp_triple_nested_canonloop_raw(%tc_outer : i32,%tc_middle : i32, %tc_inner : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%outer, %middle, %inner = "omp.canonical_loop" (%tc_outer) ({
				^bb_outer(%iv_outer: i32):
				// CHECK: %[[middle:.*]]:2 = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%middle, %inner= "omp.canonical_loop" (%tc_middle) ({
				^bb_middle(%iv_middle: i32):
				// CHECK: %[[inner:.*]] = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%inner = "omp.canonical_loop" (%tc_inner) ({
				^bb_inner(%iv_inner: i32):
				omp.yield
				}) : (i32) -> (!omp.cli)
				// CHECK: omp.yield(%[[inner]] : !omp.cli)
				omp.yield (%inner : !omp.cli)
				}) : (i32) -> (!omp.cli,!omp.cli)
				// CHECK: omp.yield(%[[middle]]#0, %[[middle]]#1 : !omp.cli, !omp.cli)
				omp.yield (%middle, %inner : !omp.cli, !omp.cli)
				}) : (i32) -> (!omp.cli, !omp.cli, !omp.cli)
				return
				}

				// CHECK-LABEL: @omp_canonloop_pretty
				// CHECK-SAME: (%[[tc:.*]]: i32)
				func.func @omp_canonloop_pretty(%tc : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %[[iv:.*]] : i32 in [0, %[[tc]]) {
				%cli = omp.canonical_loop %iv : i32 in [0, %tc) {
				// CHECK-NEXT: %{{.*}} = llvm.add %[[iv]], %[[iv]] : i32
				%newval = llvm.add %iv, %iv: i32
				// CHECK-NOT: omp.yield
				}
				return
				}

				// CHECK-LABEL: @omp_canonloop_implicit_yield
				func.func @omp_canonloop_implicit_yield(%tc : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%cli = omp.canonical_loop %iv : i32 in [0, %tc) {
				// CHECK-NOT: omp.yield
				// CHECK-NEXT: }
				}
				return
				}

				// CHECK-LABEL: @omp_canonloop_nested_pretty
				func.func @omp_canonloop_nested_pretty(%tc : i32) -> () {
				// CHECK: %{{.*}} = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%outer,%inner = omp.canonical_loop %iv1 : i32 in [0, %tc) {
				// CHECK: %[[inner:.*]] = omp.canonical_loop %{{.*}} : i32 in [0, %{{.*}}) {
				%inner = omp.canonical_loop %iv2 : i32 in [0, %tc) {}
				// CHECK: omp.yield(%[[inner]] : !omp.cli)
				omp.yield (%inner : !omp.cli)
				}
kiranchandramohan (inline comment, marked Done): Nit: empty lines
				return
				}


Revision Contents

Diff 554190
