diff --git a/mlir/docs/Dialects/Linalg.md b/mlir/docs/Dialects/Linalg.md
--- a/mlir/docs/Dialects/Linalg.md
+++ b/mlir/docs/Dialects/Linalg.md
@@ -6,12 +6,12 @@
-Linalg is designed to solve the High-level Hierarchical Optimization
-(HHO box) in MLIR and to interoperate nicely within a
-*Mixture Of Expert Compilers* environment (i.e. the *CGSel* box).
+Linalg is designed to solve the High-level Hierarchical Optimization (HHO box)
+in MLIR and to interoperate nicely within a *Mixture Of Expert Compilers*
+environment (i.e. the *CGSel* box).
-The [Rationale Document](../Rationale/RationaleLinalgDialect.md)
-goes into significantly more design and architectural decision details.
+The [Rationale Document](../Rationale/RationaleLinalgDialect.md) goes into
+significantly more design and architectural decision details.
## Set of Key Transformations
@@ -20,51 +20,72 @@
`linalg.generic` OpInterface and avoid the pitfall of relying on hardcoded
one-off op knowledge.
-The textual form description of these transformations is left for future
-work. Still, it is useful to at least the key transformations that are
-performed on the Linalg IR and that have influenced its design:
-1. Progressive Buffer Allocation.
-1. Parametric Tiling.
-1. Promotion to Temporary Buffer in Fast Memory.
-1. Tiled Producer-Consumer Fusion with Parametric Tile-And-Fuse.
-1. Map to Parallel and Reduction Loops and Hardware.
-1. Vectorization: Rewrite in Vector Form.
-1. Lower to Loops (Affine, Generic, and Parallel).
-1. Lower to Library Calls or Special Instructions, Intrinsics or ISA.
-1. Partially Lower to Iterations Over a Finer-Grained Linalg Op.
+The textual form description of these transformations is left for future work.
+Still, it is useful to at least list the key transformations that are performed
+on the Linalg IR and that have influenced its design:
+
+1.  Progressive Buffer Allocation.
+1.  Parametric Tiling.
+1.  Promotion to Temporary Buffer in Fast Memory.
+1.  Tiled Producer-Consumer Fusion with Parametric Tile-And-Fuse.
+1.  Map to Parallel and Reduction Loops and Hardware.
+1.  Vectorization: Rewrite in Vector Form.
+1.  Lower to Loops (Affine, Generic, and Parallel).
+1.  Lower to Library Calls or Special Instructions, Intrinsics or ISA.
+1.  Partially Lower to Iterations Over a Finer-Grained Linalg Op.
## High-Level Description of Linalg Ops
-Linalg takes at least some inspiration from all previously [listed prior
-art](#prior_art). The design enables the definition of ***CustomOps*** with
-generic properties that enable [key transformations](#key_transformations),
-including lowering to scalar load/store and other operations or to external
-library calls and intrinsics.
-These ops can have ***either tensor or buffer operands***, subject to
-[conventions and limitations](#tensors_and_buffers).
+Linalg takes at least some inspiration from all previously
+[listed prior art](#prior_art). The design enables the definition of
+***CustomOps*** with generic properties that enable
+[key transformations](#key_transformations), including lowering to scalar
+load/store and other operations or to external library calls and intrinsics.
+
+These ops can have ***either tensor or buffer*** operands as both inputs and
+outputs. Output tensor operands serve the purpose of providing a unifying
+abstraction and of giving a shape to the result tensors, as described in the
+Discourse discussion
+[Linalg and Shapes](https://llvm.discourse.group/t/linalg-and-shapes/2421).
+
+Output tensors can come in two flavors and are always associated with a
+corresponding op result:
+
+1.  an "init tensor" output value which provides an initial value for a tensor
+    that is created by iteratively updating the result (also called
+    "destructive updates"). Such a tensor is always materialized in some form.
+    If enough fusion occurs, it may end up being materialized only as a
+    register-level SSA value. It is expected (but not required) that the
+    destructive update pattern can be rewritten as an in-place update on
+    buffers.
+
+2.  a "shape-only" tensor output value which is write-only and only serves the
+    purpose of carrying shape information to lower levels of abstraction. In
+    the future this will be replaced by an appropriate shape type when it is
+    available as a builtin type (see the Discourse discussion
+    [Linalg and Shapes](https://llvm.discourse.group/t/linalg-and-shapes/2421)
+    for more details).
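+
+As an illustration, an "init tensor" output can be sketched with a named
+structured op whose result is tied to the init operand. The syntax below is
+illustrative only and may differ from the current operation forms:
+
+```mlir
+// Hypothetical sketch: %C_init provides the initial value that the matmul
+// reduction iteratively updates; the op returns a new tensor of the same
+// shape, which buffer allocation may later fold into an in-place update.
+%C = linalg.matmul ins(%A, %B : tensor<?x?xf32>, tensor<?x?xf32>)
+                  init(%C_init : tensor<?x?xf32>) -> tensor<?x?xf32>
+```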
### Payload-Carrying Ops
-Linalg defines two payload carrying operations that implement the [structured ops](
-https://docs.google.com/presentation/d/1P-j1GrH6Q5gLBjao0afQ-GfvcAeF-QU4GXXeSy0eJ9I/edit#slide=id.p
-) abstraction on tensors and buffers. This is architected as two generic operations
-`linalg.generic` (resp. `linalg.indexed_generic`) that can express custom
-operations with *index-free semantics* (resp. *indexing semantics*).
-The properties of these generic ops are the result of applying the
-guiding principles described in the [Rationale Document](../Rationale/RationaleLinalgDialect.md).
-They are listed next, with a brief example and discussion for each.
+
+Linalg defines two payload-carrying operations that implement the
+[structured ops](https://docs.google.com/presentation/d/1P-j1GrH6Q5gLBjao0afQ-GfvcAeF-QU4GXXeSy0eJ9I/edit#slide=id.p)
+abstraction on tensors and buffers. This is architected as two generic
+operations `linalg.generic` (resp. `linalg.indexed_generic`) that can express
+custom operations with *index-free semantics* (resp. *indexing semantics*). The
+properties of these generic ops are the result of applying the guiding
+principles described in the
+[Rationale Document](../Rationale/RationaleLinalgDialect.md). They are listed
+next, with a brief example and discussion for each.
#### Property 1: Input and Output Operands Define The Iteration Space
+
A `linalg.generic` op fully *derives* the specification of its iteration space
-from its operands.
-The property enforces that a localized IR element (the op) *has* all the information
-needed to synthesize the control-flow required to iterate over its operands,
-according to their type. This notion of IR localization bears some resemblance
-to [URUK](http://icps.u-strasbg.fr/~bastoul/research/papers/GVBCPST06-IJPP.pdf).
+from its operands. The property enforces that a localized IR element (the op)
+*has* all the information needed to synthesize the control-flow required to
+iterate over its operands, according to their type. This notion of IR
+localization bears some resemblance to
+[URUK](http://icps.u-strasbg.fr/~bastoul/research/papers/GVBCPST06-IJPP.pdf).
-Consider the following fully specified `linalg.generic` example.
-Here, the first operand is a `memref` of `f32` scalar elements that
-has an ordinary identity layout, and the second one is a `memref` of
-4-element vectors with a 2-strided, 1-offset layout.
+Consider the following fully specified `linalg.generic` example. Here, the first
+operand is a `memref` of `f32` scalar elements that has an ordinary identity
+layout, and the second one is a `memref` of 4-element vectors with a 2-strided,
+1-offset layout.
```mlir
// File name: example1.mlir
@@ -117,39 +138,41 @@
The property participates in simplifying analyses and transformations. For
instance, it guarantees no out-of bounds access can occur by construction
-(assuming dynamic operand dimensions agree with each other, which is the
-purpose of the `assert` runtime check).
+(assuming dynamic operand dimensions agree with each other, which is the purpose
+of the `assert` runtime check).
-Before lowering to loop form, loop induction variables and iterators are *not yet
-materialized*. This is a necessary property if we want an abstraction that
+Before lowering to loop form, loop induction variables and iterators are *not
+yet materialized*. This is a necessary property if we want an abstraction that
works on both tensor values and buffers because ***values don’t escape
loops/nesting***.
The main implications are that:
-1. The semantics of the ops are *restricted to operate on structured data
-types*, on which we can define an iterator.
-2. This does not model arbitrary code with side-effects.
+
+1. The semantics of the ops are *restricted to operate on structured data
+ types*, on which we can define an iterator.
+
+2. This does not model arbitrary code with side-effects.
We do not think these are serious limitations in practice because MLIR is all
-about mixing different levels of abstractions in the same IR. As long as
-Linalg can progressively lower to the next level of abstraction, it can also
-be just bypassed for things that do not fit.
+about mixing different levels of abstractions in the same IR. As long as Linalg
+can progressively lower to the next level of abstraction, it can also be just
+bypassed for things that do not fit.
At the same time, conditioning op semantics on structured data types is a very
promising path towards extensibility to non-dense tensors as experience with
LIFT abstractions for
-[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf)
-and [position-dependent
-arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf),
+[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf) and
+[position-dependent arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf),
as well as [TACO](http://tensor-compiler.org/), has shown.
#### Property 2: Reversible Mappings Between Control and Data Structures
+
A `linalg.generic` *defines* the mapping between the iteration space (i.e. the
loops) and the data.
-Consider the following fully specified `linalg.generic` example.
-Here, the first `memref` is a 2-strided one on both of its dimensions,
-and the second `memref` uses an identity layout.
+Consider the following fully specified `linalg.generic` example. Here, the first
+`memref` is a 2-strided one on both of its dimensions, and the second `memref`
+uses an identity layout.
```
// File name: example2.mlir
@@ -176,195 +199,165 @@
```
The property "*Reversible Mappings Between Control and Data Structures*" is
-materialized by a lowering into a form that will resemble:
-```
-// Run: mlir-opt example2.mlir -allow-unregistered-dialect -convert-linalg-to-loops
-#map0 = affine_map<(d0, d1) -> (d0 * 2 + d1 * 2)>
-func @example(%arg0: memref<8x?xf32, #map0>, %arg1: memref<?xvector<4xf32>>) {
-  %c8 = constant 8 : index
-  %c0 = constant 0 : index
-  %c1 = constant 1 : index
-  %0 = dim %arg0, %c1 : memref<8x?xf32, #map0>
-  scf.for %arg2 = %c0 to %0 step %c1 {
-    scf.for %arg3 = %c0 to %c8 step %c1 {
-      %1 = load %arg0[%arg3, %arg2] : memref<8x?xf32, #map0>
-      %2 = load %arg1[%arg3] : memref<?xvector<4xf32>>
-      %3 = "some_compute"(%1, %2) : (f32, vector<4xf32>) -> vector<4xf32>
-      store %3, %arg1[%arg3] : memref<?xvector<4xf32>>
-    }
-  }
-  return
-}
-```
-This mapping needs to be reversible because we want to be
-able to go back and forth between the two and answer questions such as:
-- Given a subset of the iteration space, what subset of data does it read and
-write?
-- Given a subset of data read or written, what subset of the iteration space
-is responsible for this read or write?
+materialized by a lowering into a form that will resemble:
+
+```
+// Run: mlir-opt example2.mlir -allow-unregistered-dialect -convert-linalg-to-loops
+#map0 = affine_map<(d0, d1) -> (d0 * 2 + d1 * 2)>
+
+func @example(%arg0: memref<8x?xf32, #map0>, %arg1: memref<?xvector<4xf32>>) {
+  %c8 = constant 8 : index
+  %c0 = constant 0 : index
+  %c1 = constant 1 : index
+  %0 = dim %arg0, %c1 : memref<8x?xf32, #map0>
+  scf.for %arg2 = %c0 to %0 step %c1 {
+    scf.for %arg3 = %c0 to %c8 step %c1 {
+      %1 = load %arg0[%arg3, %arg2] : memref<8x?xf32, #map0>
+      %2 = load %arg1[%arg3] : memref<?xvector<4xf32>>
+      %3 = "some_compute"(%1, %2) : (f32, vector<4xf32>) -> vector<4xf32>
+      store %3, %arg1[%arg3] : memref<?xvector<4xf32>>
+    }
+  }
+  return
+}
+```
+
+This mapping needs to be reversible because we want to be able to go back and
+forth between the two and answer questions such as:
+
+-   Given a subset of the iteration space, what subset of data does it read and
+    write?
+-   Given a subset of data read or written, what subset of the iteration space
+    is responsible for this read or write?
Answering these `2` questions is one of the main analyses that Linalg uses to
implement transformations such as tiling, tiled producer-consumer fusion, and
promotion to temporary buffers in fast memory.
-In the current implementation, `linalg.generic` uses a list of [AffineMaps](https://mlir.llvm.org/docs/LangRef/#affinemap-attribute) (see the `#indexing_maps` attribute in the previous examples).
-This is a pragmatic short-term solution, but in the longer term note that
-this property could be even evaluated dynamically, similarly to
-inspector-executor algorithms.
+In the current implementation, `linalg.generic` uses a list of
+[AffineMaps](https://mlir.llvm.org/docs/LangRef/#affinemap-attribute) (see the
+`#indexing_maps` attribute in the previous examples). This is a pragmatic
+short-term solution, but in the longer term note that this property could even
+be evaluated dynamically, similarly to inspector-executor algorithms.
#### Property 3: The Type Of Iterators is Defined Explicitly
+
A `linalg.generic` op fully *declares* the type of its iterators. This
information is used in transformations.
These properties are derived from established practice in the field and mirror
-the properties from Ken Kennedy's [Optimizing Compilers for Modern Architectures](
-https://www.elsevier.com/books/optimizing-compilers-for-modern-architectures/allen/978-0-08-051324-9).
-The key idea of legality of loop transformations expressed by Kennedy is
-that ***the lexicographic order of all dependence vectors must be
-preserved***.
+the properties from Ken Kennedy's
+[Optimizing Compilers for Modern Architectures](https://www.elsevier.com/books/optimizing-compilers-for-modern-architectures/allen/978-0-08-051324-9).
+The key idea of legality of loop transformations expressed by Kennedy is that
+***the lexicographic order of all dependence vectors must be preserved***.
This can be better captured directly at the loop level thanks to specific
-iterator types, among which:
-*parallel*, *reduction*, *partition*, *permutable/monotonic*, *sequential*,
-*dependence distance*, ...
+iterator types, among which: *parallel*, *reduction*, *partition*,
+*permutable/monotonic*, *sequential*, *dependence distance*, ...
-These types are traditionally the result of complex dependence analyses and
-have been referred to as "*bands*" in the polyhedral community (e.g. *parallel
+These types are traditionally the result of complex dependence analyses and have
+been referred to as "*bands*" in the polyhedral community (e.g. *parallel
bands*, *permutable bands*, etc, in
[ISL](https://en.wikipedia.org/wiki/Integer_set_library) schedule tree
parlance).
-Specifying the information declaratively in a `linalg.generic` allows
-conveying properties that may be hard (or even impossible) to derive from
-lower-level information. These properties can be brought all the way to the
-moment when they are useful for transformations, used and then discarded.
+Specifying the information declaratively in a `linalg.generic` allows conveying
+properties that may be hard (or even impossible) to derive from lower-level
+information. These properties can be brought all the way to the moment when they
+are useful for transformations, used and then discarded.
Additionally, these properties may also be viewed as a contract that the
-frontend/user guarantees and that the compiler may take advantage of. The
-common example is the use of data-dependent reduction semantics for
-specifying histogram computations. If the frontend has additional knowledge
-that proper atomic operations are available, it may be better to specify
-parallel semantics and use the special atomic in the computation region.
+frontend/user guarantees and that the compiler may take advantage of. The common
+example is the use of data-dependent reduction semantics for specifying
+histogram computations. If the frontend has additional knowledge that proper
+atomic operations are available, it may be better to specify parallel semantics
+and use the special atomic operation in the computation region.
At this time, Linalg only has an explicit use for *parallel* and *reduction*
loops but previous experience shows that the abstraction generalizes.
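+
+For instance, a row-wise sum can declare one *parallel* and one *reduction*
+iterator in the same attribute style as the examples above (an illustrative
+sketch; `@row_sum` and its operand types are hypothetical):
+
+```mlir
+#attrs = {
+  args_in = 1,
+  args_out = 1,
+  indexing_maps = [
+    affine_map<(i, j) -> (i, j)>,
+    affine_map<(i, j) -> (i)>
+  ],
+  iterator_types = ["parallel", "reduction"]
+}
+// Iterations over i are parallel; the reduction along j may be reordered.
+func @row_sum(%A: memref<?x?xf32>, %sum: memref<?xf32>) {
+  linalg.generic #attrs %A, %sum {
+  ^bb0(%a: f32, %s: f32):
+    %0 = addf %s, %a : f32
+    linalg.yield %0 : f32
+  }: memref<?x?xf32>, memref<?xf32>
+  return
+}
+```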
#### Property 4: The Compute Payload is Specified With a Region
-A `linalg.generic` op has a compute payload that is fully generic thanks to
-the use of
+
+A `linalg.generic` op has a compute payload that is fully generic thanks to the
+use of
[Regions](https://github.com/llvm/llvm-project/blob/58265ad42a90ae8905be6a447cb42e53529a54a0/mlir/docs/LangRef.md#regions).
-The region takes as arguments the scalar elemental types of the tensor or
-buffer operands of the `linalg.generic`. For flexibility and ability to match
-library calls, additional special values may be passed. For instance, a
-`linalg.fill` operation takes a buffer and an additional scalar value.
+The region takes as arguments the scalar elemental types of the tensor or buffer
+operands of the `linalg.generic`. For flexibility and ability to match library
+calls, additional special values may be passed. For instance, a `linalg.fill`
+operation takes a buffer and an additional scalar value.
-At this time there are no additional restrictions to the region
-semantics. This is meant to allow the exploration of various design tradeoffs
-at the intersection of regions and iterator types.
-In particular, the frontend is responsible for the semantics of iterator types
-to correspond to the operations inside the region: the region can capture
-buffers arbitrarily and write into them. If this conflicts with some parallel
-iterator requirement, this is undefined behavior.
+At this time there are no additional restrictions to the region semantics. This
+is meant to allow the exploration of various design tradeoffs at the
+intersection of regions and iterator types. In particular, the frontend is
+responsible for the semantics of iterator types to correspond to the operations
+inside the region: the region can capture buffers arbitrarily and write into
+them. If this conflicts with some parallel iterator requirement, this is
+undefined behavior.
-Previous examples already elaborate compute payloads with an unregistered function `"some_compute"`. The following code snippet shows what the result will be when using a concrete operation `addf`:
-```
-// File name: example3.mlir
-#indexing_maps = [
- affine_map<(i, j) -> (i, j)>,
- affine_map<(i, j) -> (i, j)>,
- affine_map<(i, j) -> (i, j)>
-]
-#attrs = {
- args_in = 2,
- args_out = 1,
- indexing_maps = #indexing_maps,
- iterator_types = ["parallel", "parallel"]
-}
-func @example(%A: memref, %B: memref, %C: memref) {
- linalg.generic #attrs %A, %B, %C {
- ^bb0(%a: f32, %b: f32, %c: f32):
- %d = addf %a, %b : f32
- linalg.yield %d : f32
- }: memref, memref, memref
- return
-}
-```
-This function basically element-wise adds up two matrices (`%A` and `%B`) and stores the result into another one (`%C`).
-The property "*The Compute Payload is Specified With a Region*" is
-materialized by a lowering into a form that will resemble:
-```
-// Run: mlir-opt example3.mlir -convert-linalg-to-loops
-#indexing_maps = [
-  affine_map<(i, j) -> (i, j)>,
-  affine_map<(i, j) -> (i, j)>,
-  affine_map<(i, j) -> (i, j)>
-]
-#attrs = {
-  args_in = 2,
-  args_out = 1,
-  indexing_maps = #indexing_maps,
-  iterator_types = ["parallel", "parallel"]
-}
-func @example(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
-  linalg.generic #attrs %A, %B, %C {
-  ^bb0(%a: f32, %b: f32, %c: f32):
-    %d = addf %a, %b : f32
-    linalg.yield %d : f32
-  }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
-  return
-}
-```
+Previous examples already elaborate compute payloads with an unregistered
+function `"some_compute"`. The following code snippet shows what the result will
+be when using a concrete operation `addf`:
+
+```
+// File name: example3.mlir
+#indexing_maps = [
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>
+]
+#attrs = {
+  args_in = 2,
+  args_out = 1,
+  indexing_maps = #indexing_maps,
+  iterator_types = ["parallel", "parallel"]
+}
+func @example(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
+  linalg.generic #attrs %A, %B, %C {
+  ^bb0(%a: f32, %b: f32, %c: f32):
+    %d = addf %a, %b : f32
+    linalg.yield %d : f32
+  }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
+  return
+}
+```
+
+This function basically element-wise adds up two matrices (`%A` and `%B`) and
+stores the result into another one (`%C`).
+
+The property "*The Compute Payload is Specified With a Region*" is materialized
+by a lowering into a form that will resemble:
+
+```
+// Run: mlir-opt example3.mlir -convert-linalg-to-loops
+#indexing_maps = [
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>
+]
+#attrs = {
+  args_in = 2,
+  args_out = 1,
+  indexing_maps = #indexing_maps,
+  iterator_types = ["parallel", "parallel"]
+}
+func @example(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
+  linalg.generic #attrs %A, %B, %C {
+  ^bb0(%a: f32, %b: f32, %c: f32):
+    %d = addf %a, %b : f32
+    linalg.yield %d : f32
+  }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
+  return
+}
+```
In the process of lowering to loops and lower-level constructs, similar
-requirements are encountered, as are discussed in the [inlined call op
-proposal](https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282/2).
-We expect to be able to reuse the common lower-level infrastructure provided
-it evolves to support both region arguments and captures.
+requirements are encountered, as discussed in the
+[inlined call op proposal](https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282/2).
+We expect to be able to reuse the common lower-level infrastructure provided it
+evolves to support both region arguments and captures.
#### Property 5: May Map To an External Library Call
+
A `linalg.generic` op may map to an external library call by specifying a
-`SymbolAttr`. At this level of abstraction, the important glue is the ability
-to perform transformations that preserve the structure necessary to ***call
-the external library after different transformations have been applied***.
+`SymbolAttr`. At this level of abstraction, the important glue is the ability to
+perform transformations that preserve the structure necessary to ***call the
+external library after different transformations have been applied***.
-This involves considerations related to preservation of op semantics
-and integration at the ABI level. Regardless of whether one wants to use
-external library calls or a custom ISA, the problem for codegen is similar:
-preservation of a fixed granularity.
+This involves considerations related to preservation of op semantics and
+integration at the ABI level. Regardless of whether one wants to use external
+library calls or a custom ISA, the problem for codegen is similar: preservation
+of a fixed granularity.
-Consider the following example that adds an additional attribute `library_call="pointwise_add"`
-that specifies the name of an external library call we intend to use:
-```
-// File name: example4.mlir
-#indexing_maps = [
- affine_map<(i, j) -> (i, j)>,
- affine_map<(i, j) -> (i, j)>,
- affine_map<(i, j) -> (i, j)>
-]
-#attrs = {
- args_in = 2,
- args_out = 1,
- indexing_maps = #indexing_maps,
- iterator_types = ["parallel", "parallel"],
- library_call = "pointwise_add"
-}
-func @example(%A: memref, %B: memref, %C: memref) {
- linalg.generic #attrs %A, %B, %C {
- ^bb0(%a: f32, %b: f32, %c: f32):
- %d = addf %a, %b : f32
- linalg.yield %d : f32
- }: memref, memref, memref
- return
-}
-```
-The property "*Map To an External Library Call*" is
-materialized by a lowering into a form that will resemble:
+Consider the following example that adds an additional attribute
+`library_call="pointwise_add"` that specifies the name of an external library
+call we intend to use:
+
+```
+// File name: example4.mlir
+#indexing_maps = [
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>,
+  affine_map<(i, j) -> (i, j)>
+]
+#attrs = {
+  args_in = 2,
+  args_out = 1,
+  indexing_maps = #indexing_maps,
+  iterator_types = ["parallel", "parallel"],
+  library_call = "pointwise_add"
+}
+func @example(%A: memref<?x?xf32>, %B: memref<?x?xf32>, %C: memref<?x?xf32>) {
+  linalg.generic #attrs %A, %B, %C {
+  ^bb0(%a: f32, %b: f32, %c: f32):
+    %d = addf %a, %b : f32
+    linalg.yield %d : f32
+  }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
+  return
+}
+```
+
+The property "*Map To an External Library Call*" is materialized by a lowering
+into a form that will resemble:
```
// Run: mlir-opt example4.mlir -convert-linalg-to-std
@@ -383,204 +376,138 @@
func @pointwise_add(memref, memref, memref) attributes {llvm.emit_c_interface}
```
-Which, after lowering to LLVM resembles:
-```
-// Run: mlir-opt example4.mlir -convert-linalg-to-std | mlir-opt -convert-std-to-llvm
-// Some generated code are omitted here.
-func @example(%arg0: !llvm<"float*">, ...) {
-  ...
-  llvm.call @pointwise_add(...) : (!llvm<"float*">, ...) -> ()
-  return
-}
-llvm.func @pointwise_add(%arg0: !llvm<"float*">, ...) attributes {llvm.emit_c_interface} {
-  ...
-  llvm.call @_mlir_ciface_pointwise_add(%9, %19, %29) : (!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) -> ()
-  llvm.return
-}
-llvm.func @_mlir_ciface_pointwise_add(!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) attributes {llvm.emit_c_interface}
-```
+Which, after lowering to LLVM resembles:
+
+```
+// Run: mlir-opt example4.mlir -convert-linalg-to-std | mlir-opt -convert-std-to-llvm
+// Some generated code is omitted here.
+func @example(%arg0: !llvm<"float*">, ...) {
+  ...
+  llvm.call @pointwise_add(...) : (!llvm<"float*">, ...) -> ()
+  return
+}
+
+llvm.func @pointwise_add(%arg0: !llvm<"float*">, ...) attributes {llvm.emit_c_interface} {
+  ...
+  llvm.call @_mlir_ciface_pointwise_add(%9, %19, %29) : (!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) -> ()
+  llvm.return
+}
+llvm.func @_mlir_ciface_pointwise_add(!llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">, !llvm<"{ float*, float*, i64, [2 x i64], [2 x i64] }*">) attributes {llvm.emit_c_interface}
+```
##### Convention For External Library Interoperability
+
The `linalg` dialect adopts a convention that is similar to `BLAS` when
-offloading operations to fast library implementations: pass a non-owning
-pointer to input and output data with additional metadata. This convention
-is also found in libraries such as `MKL`, `OpenBLAS`, `BLIS`, `cuBLAS`,
-`cuDNN`, etc.. and more generally at interface points across language
-boundaries (e.g. C++ / Python).
+offloading operations to fast library implementations: pass a non-owning pointer
+to input and output data with additional metadata. This convention is also found
+in libraries such as `MKL`, `OpenBLAS`, `BLIS`, `cuBLAS`, `cuDNN`, etc., and
+more generally at interface points across language boundaries (e.g. C++ /
+Python).
-Generally, `linalg` passes non-owning pointers to View data structures
-to pre-compiled library calls linked externally.
+Generally, `linalg` passes non-owning pointers to View data structures to
+pre-compiled library calls linked externally.
-There is an [ongoing
-discussion](https://llvm.discourse.group/t/lowering-optional-attributes-in-linalg-structuredops-to-standard-dialect/333/3)
+There is an
+[ongoing discussion](https://llvm.discourse.group/t/lowering-optional-attributes-in-linalg-structuredops-to-standard-dialect/333/3)
on the topic of extending interoperability in the presence of key attributes.
#### Property 6: Perfectly Nested Writes To The Whole Output Operands
+
Perfectly nested loops form a particularly important class of structure that
enables key loop transformations such as tiling and mapping to library calls.
Unfortunately, this type of structure is easily broken by transformations such
as partial loop fusion. Tiling and mapping to library calls become more
-challenging, or even infeasible. Linalg ops adopt perfect-nestedness
-as a first-class property: the structure cannot be broken and is
-transported in the IR by construction.
+challenging, or even infeasible. Linalg ops adopt perfect-nestedness as a
+first-class property: the structure cannot be broken and is transported in the
+IR by construction.
A `linalg.generic` op represents a perfectly nested loop nest that writes the
-entire memory region. This is a structural constraint across regions and
-loops that has proven to be key in simplifying transformations.
+entire memory region. This is a structural constraint across regions and loops
+that has proven to be key in simplifying transformations.
-One particular point to mention is that converting imperfectly nested code
-into perfectly nested code can often be done with enough loop distribution
-and embedding of conditionals down to the innermost loop level.
+One particular point to mention is that converting imperfectly nested code into
+perfectly nested code can often be done with enough loop distribution and
+embedding of conditionals down to the innermost loop level.
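+
+As an illustrative sketch (hypothetical ops `"S1"`/`"S2"`, syntax approximate,
+and assuming the inner loop executes at least once), sinking a statement behind
+a conditional turns an imperfect nest into a perfect one:
+
+```mlir
+// Imperfectly nested: "S1" runs once per outer iteration.
+scf.for %i = %c0 to %N step %c1 {
+  "S1"(%i) : (index) -> ()
+  scf.for %j = %c0 to %M step %c1 {
+    "S2"(%i, %j) : (index, index) -> ()
+  }
+}
+
+// Perfectly nested: "S1" is predicated on the first inner iteration.
+scf.for %i = %c0 to %N step %c1 {
+  scf.for %j = %c0 to %M step %c1 {
+    %first = cmpi "eq", %j, %c0 : index
+    scf.if %first {
+      "S1"(%i) : (index) -> ()
+    }
+    "S2"(%i, %j) : (index, index) -> ()
+  }
+}
+```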
Previous experience with Tensor Comprehensions gave us the intuition that
-forcing innermost control-flow nesting is a lot like writing data-parallel
-code with arrays of boolean values and predication.
-This type of trick has also been used before in polyhedral compilers to
-convert non-affine control into affine compute dependencies.
+forcing innermost control-flow nesting is a lot like writing data-parallel code
+with arrays of boolean values and predication. This type of trick has also been
+used before in polyhedral compilers to convert non-affine control into affine
+compute dependencies.
While it may be possible to automate such rewrites from generic IR,
`linalg.generic` just forces the semantics for now.
The key implication is that this conversion to deep predication needs to be
-undone once we are done with Linalg transformations.
-After iterators and induction variables are materialized (i.e. after lowering
-out of `linalg.generic` occurred), the overall performance will be greatly
-influenced by the quality of canonicalizations, foldings and *Loop Independent
-Code Motion* (LICM).
+undone once we are done with Linalg transformations. After iterators and
+induction variables are materialized (i.e. after lowering out of
+`linalg.generic` has occurred), the overall performance will be greatly
+influenced by the quality of canonicalizations, foldings and *Loop Independent
+Code Motion* (LICM).
In the grander scheme, the reliance on late LICM was deemed a necessary risk.
#### Putting it Together
+
As it stands, the six properties above define the semantics of a
`linalg.generic` op. It is an open question whether all of these semantics are
strictly necessary in practice and whether some should or could be derived
-automatically while still maintaining the [core guiding
-principles](#guiding_principles).
+automatically while still maintaining the
+[core guiding principles](#guiding_principles).
For the time being, we have settled on the combination of these properties
because of empirical evidence building and working on multiple high-level
compilers. As we lay those down and engage more with the community, we expect
multiple rounds of discussions and design changes to the original architecture.
-### Tensors and Buffers: Conventions and Limitations
-
-Tensors are immutable SSA values, buffers are mutable regions of memory subject
-to side-effects and aliasing. As a consequence, output buffers are passed as
-operands whereas output tensors are new SSA values corresponding to op results.
-Inputs can be arbitrary tensors or buffers and are always passed as operands.
-
-The following convention is currently in-flight and is in the process of
-replacing other existing conventions. The following convention currently applies
-to "named" structured ops which are auto-generated by the linalg-ods tool.
-
-The convention adopted is as follows:
-
-1. A first block of `ins` op operands hold read-only inputs of ShapedType.
-2. An optional second block of `outs` op operands hold read-write output
- buffers of MemRefType.
-3. An optional third block of `init` operands hold initialization tensors of
- RankedTensorType. Such tensors can appear when the op performs a reduction
- and returns a tensor.
-
-Structured ops with fully parallel semantics, have empty `init`. They may either
-write in-place into `outs` buffers or return new tensors.
-
-Structured ops with reduction semantics and output tensor(s) however have
-additional restrictions:
-
-1. They can only return a single tensor for now.
-2. They cannot have any output buffer operand (i.e. `outs` is empty).
-3. They have exactly one `init` tensor of the same type as the unique output
- tensor. Such an `init` tensor does not have an explicit associate indexing
- map. Instead the map of the result tensor is used to signify that the `init`
- and the `result` are "tied".
-
-Points 1. and 2. keep complexity of the representation in check by allowing only
-a single result tensor, when reductions are present.
-
-Point 3. is related to the fact that SSA values cannot represent in-place
-updates. Instead, linalg adopts a similar convention that exists in e.g.
-`vector.outerproduct`: the value that is reduced into is passed as an explicit
-argument and a new result of the same shape is produced.
-
-It is expected buffer allocation will fold this last input onto the result in a
-single output buffer argument, which is why the same indexing map is required:
-the last input operand is said to be "tied" to the result.
-
-Alternative, more complex representations, would allow for:
-
-1. Multiple results and `init` tensors in arbitrary orders, which could be
- captured by an extra ArrayAttr of position pairs.
-2. Relaxing the conditions on the indexing map equalities on the each pair and
- e.g. allow implicit broadcasts of the input.
-
-These representations are deemed unnecessarily complex for now and are left for
-future discussion.
-
-As an illustration, the syntax for a `linalg.matmul` writing into a buffer is:
-
-```
-linalg.matmul ins(%a, %b : memref, tensor)
- outs(%c : memref)
-```
-
-, whereas the syntax for a `linalg.matmul` returning a new tensor is:
-
-```
-%d = linalg.matmul ins(%a, %b : tensor, memref)
- init(%c : tensor)
- -> tensor
-```
-
### Data Representation: Views
-The current implementation uses the [Strided MemRef (a.k.a View)](
-https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio)
+
+The current implementation uses the
+[Strided MemRef (a.k.a View)](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio)
abstraction. The name *View* is used interchangeably in `linalg` to signify
-*Strided MemRef*.
-In the future we expect to use other structured data types and
+*Strided MemRef*. In the future we expect to use other structured data types and
support ragged, mixed-sparse and other types. We expect to draw on the
experience from existing LIFT abstractions for
-[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf)
-and [position-dependent
-arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf).
+[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf) and
+[position-dependent arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf).
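+
+Concretely, a *View* is a memref type carrying an offset and strides, each of
+which may be static or dynamic (`?`); for instance:
+
+```mlir
+// A 2-D view with dynamic sizes and offset, a dynamic outer stride and a unit
+// innermost stride. Transformations rewrite this metadata, not the data.
+memref<?x?xf32, offset: ?, strides: [?, 1]>
+```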
### Metadata Ops
+
A set of ops that manipulate metadata but do not move memory. These ops take
-`view` operands + extra attributes and return new `view`s. The returned
-`view`s generally alias the operand `view`. At the moment the existing ops
-are:
+`view` operands + extra attributes and return new `view`s. The returned `view`s
+generally alias the operand `view`. At the moment the existing ops are:
- * `std.view`,
- * `std.subview`,
- * `std.transpose`.
- * `linalg.range`,
- * `linalg.slice`,
- * `linalg.reshape`,
+* `std.view`,
+* `std.subview`,
+* `std.transpose`,
+* `linalg.range`,
+* `linalg.slice`,
+* `linalg.reshape`.
Future ops are added on a per-need basis but should include:
- * `linalg.tile`,
- * `linalg.intersection`,
- * `linalg.convex_union`,
- * `linalg.difference` (would need to work on a list of views).
+* `linalg.tile`,
+* `linalg.intersection`,
+* `linalg.convex_union`,
+* `linalg.difference` (would need to work on a list of views).
These additional operations correspond to abstractions that have been known to
work in the field of large-scale distributed stencil computations.
-In a longer-term future, the abstractions from [Legion data-centric
-programming model](https://legion.stanford.edu/overview/) seem generally
-appealing.
+In a longer-term future, the abstractions from
+[Legion data-centric programming model](https://legion.stanford.edu/overview/)
+seem generally appealing.
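+
+As a sketch of the metadata-only behavior (with hypothetical SSA values), a
+`std.subview` returns a new `view` that aliases its operand:
+
+```mlir
+// %sv aliases %A: only offset/sizes/strides metadata is computed; no data is
+// copied or moved.
+%sv = subview %A[%i, %j][%m, %n][1, 1]
+  : memref<8x16xf32> to memref<?x?xf32, offset: ?, strides: [16, 1]>
+```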
### Named Payload-Carrying Ops
+
Additionally, `linalg` provides a small subset of commonly named operations:
- * `linalg.copy`,
- * `linalg.fill`,
- * `linalg.dot`,
- * `linalg.matmul`,
- * `linalg.conv`.
+* `linalg.copy`,
+* `linalg.fill`,
+* `linalg.dot`,
+* `linalg.matmul`,
+* `linalg.conv`.
These named operations adhere to the `linalg.generic` op interface. Work is in
progress to define declarative mechanisms to automatically generate named ops
@@ -608,7 +535,7 @@
1. The operations used to specify computations use EDSC intrinsics so that they
can easily be parsed and emitted into a simple region builder without
resorting to more general MLIR parsing.
-1. Reduction dimensions are specified with angle bracket notation on the
+1. Reduction dimensions are specified with angle bracket notation on the
operation they apply to (e.g. `std_add` specifies that `k` is a reduction
dimension). In TC, a reduction is specified with `op=` operator and the
reduction dimensions are inferred.
@@ -677,23 +604,24 @@
```
## Open Issues and Design Alternatives
-Multiple open issues and design alternatives are in flight and it is time to
-lay them out for the community to discuss and pick apart:
-1. Should `linalg.generic` support nesting?
-1. Should `linalg.generic` regions take views or only scalars?
-1. Should we try to solve automatic differentiation at this level of
-abstraction?
-1. Are all the six properties really necessary?
-1. Is this relying too much on declarative specification and would we be
-better off relying more on analyses?
-1. Is this general enough for the community's needs? If not how should this be
-extended, if at all?
-...
+
+Multiple open issues and design alternatives are in flight and it is time to lay
+them out for the community to discuss and pick apart:
+
+1. Should `linalg.generic` support nesting?
+1. Should `linalg.generic` regions take views or only scalars?
+1. Should we try to solve automatic differentiation at this level of
+ abstraction?
+1. Are all the six properties really necessary?
+1. Is this relying too much on declarative specification and would we be better
+ off relying more on analyses?
+1. Is this general enough for the community's needs? If not, how should this be
+   extended, if at all?
+
+...
These key questions (and much more) should be really thought of in the general
context of MLIR in which different levels of IR interoperate seamlessly. In
-practice, it is not necessary (or beneficial) to try and solve all problems in the
-same IR.
+practice, it is not necessary (or beneficial) to try and solve all problems in
+the same IR.
## Operations
diff --git a/mlir/include/mlir/Dialect/Linalg/Analysis/DependenceAnalysis.h b/mlir/include/mlir/Dialect/Linalg/Analysis/DependenceAnalysis.h
--- a/mlir/include/mlir/Dialect/Linalg/Analysis/DependenceAnalysis.h
+++ b/mlir/include/mlir/Dialect/Linalg/Analysis/DependenceAnalysis.h
@@ -45,19 +45,17 @@
class LinalgDependenceGraph {
public:
enum DependenceType { RAR = 0, RAW, WAR, WAW, NumTypes };
- struct LinalgOpView {
- Operation *op;
- unsigned operandIndex;
- };
+ // TODO: OpOperand tracks dependencies on buffer operands. Tensor result will
+ // need an extension to use OpResult.
struct LinalgDependenceGraphElem {
// dependentOpView may be either:
// 1. src in the case of dependencesIntoGraphs.
// 2. dst in the case of dependencesFromDstGraphs.
- LinalgOpView dependentOpView;
+ OpOperand *dependentOpView;
// View in the op that is used to index in the graph:
// 1. src in the case of dependencesFromDstGraphs.
// 2. dst in the case of dependencesIntoGraphs.
- LinalgOpView indexingOpView;
+ OpOperand *indexingOpView;
// Type of the dependence.
DependenceType dependenceType;
};
@@ -161,8 +159,8 @@
// Uses std::pair to keep operations and view together and avoid usage errors
// related to src/dst and producer/consumer terminology in the context of
// dependences.
- void addDependenceElem(DependenceType dt, LinalgOpView indexingOpView,
- LinalgOpView dependentOpView);
+ void addDependenceElem(DependenceType dt, OpOperand *indexingOpView,
+ OpOperand *dependentOpView);
/// Implementation detail for findCoveringxxx.
SmallVector
diff --git a/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h b/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
--- a/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
+++ b/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
@@ -30,8 +30,8 @@
namespace edsc {
inline void defaultRegionBuilder(ValueRange args) {}
-/// Build a `linalg.generic` op with the specified `inputs`, `outputBuffers`,
-/// `initTensors`, `resultTensorsTypes` and `region`.
+/// Build a `linalg.generic` op with the specified `inputs`, `outputs`,
+/// `resultTensorsTypes` and `region`.
///
/// `otherValues` and `otherAttributes` may be passed and will be appended as
/// operands and attributes respectively.
@@ -41,15 +41,12 @@
///
/// 1. `inputs` may contain StructuredIndexed that capture either buffer or
/// tensor values.
-/// 2. `outputsBuffers` may contain StructuredIndexed that capture buffer
-/// values.
-/// 3. `initTensors` contain tensor values, without indexing maps.
-/// 4. `resultTensorTypes` may contain StructuredIndexed that capture return
-/// tensor types.
+/// 2. `outputs` may contain StructuredIndexed that capture either buffer or
+/// tensor values. In the future this will be extended with ranked shape values.
+/// 4. `resultTensorTypes` may contain return tensor types.
Operation *makeGenericLinalgOp(
    ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,
-    ArrayRef<StructuredIndexed> outputBuffers, ArrayRef<Value> initTensors,
-    ArrayRef<StructuredIndexed> resultTensorTypes,
+    ArrayRef<StructuredIndexed> outputs, TypeRange resultTensorTypes,
    function_ref<void(ValueRange)> regionBuilder = defaultRegionBuilder,
    ArrayRef<Value> otherValues = {}, ArrayRef<Attribute> otherAttributes = {});
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h b/mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h
@@ -9,7 +9,6 @@
#ifndef MLIR_DIALECT_LINALG_LINALGOPS_H_
#define MLIR_DIALECT_LINALG_LINALGOPS_H_
-#include "mlir/Dialect/Linalg/IR/LinalgTraits.h"
#include "mlir/Dialect/Linalg/IR/LinalgTypes.h"
#include "mlir/Dialect/StandardOps/IR/Ops.h"
#include "mlir/Dialect/Utils/StructuredOpsUtils.h"
@@ -111,9 +110,17 @@
void getDimsOfType(Operation *op, StringRef iteratorTypeName,
                   SmallVectorImpl<AffineExpr> &res);
+namespace detail {
+LogicalResult verifyStructuredOpInterface(Operation *op);
+} // namespace detail
} // namespace linalg
} // namespace mlir
+namespace mlir {
+namespace linalg {
+class IndexedGenericOp;
+} // namespace linalg
+} // namespace mlir
#include "mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterfaces.h.inc"
#define GET_OP_CLASSES
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
@@ -19,26 +19,6 @@
include "mlir/Interfaces/CopyOpInterface.td"
include "mlir/Interfaces/SideEffectInterfaces.td"
-// The Linalg `NInputs` trait provides the API for ops that are known
-// to have a specified number of inputs, all passed as operands.
-// See Linalg/LinalgTraits.h for implementation details and usage.
-class NInputs<int n> :
-  NativeOpTrait<"linalg::NInputs<" # !cast<string>(n) # ">::Impl"> {}
-
-// The Linalg `ZeroInitTensors` trait provides the API for ops that are known
-// to not have input tensor operands.
-// See Linalg/LinalgTraits.h for implementation details and usage.
-def ZeroInitTensors : NativeOpTrait<"linalg::ZeroInitTensors"> {}
-
-// The Linalg `NOutputs` trait provides the API for ops that are known
-// to have a specified number of outputs, all passed as operands.
-// See Linalg/LinalgTraits.h for implementation details and usage.
-class NOutputs<int n> :
-  NativeOpTrait<"linalg::NOutputs<" # !cast<string>(n) # ">::Impl"> {}
-
-def StructuredOpTraits : NativeOpTrait<"linalg::StructuredOpTraits">;
-def NamedStructuredOpTrait : NativeOpTrait<"linalg::NamedStructuredOpTrait">;
-
// Base Tablegen class for Linalg ops.
// Linalg ops that correspond to library calls operate on ShapedType as their
// first operands. These may be optionally followed by non-view operands
@@ -50,7 +30,6 @@
class LinalgStructured_Op props>
: LinalgStructuredBase_Op])> {
code libraryCallName = [{
std::string getLibraryCallName() {
@@ -65,12 +44,7 @@
//===----------------------------------------------------------------------===//
// At the moment these are not declarative and require a bunch of C++ code.
// In the future, these should be migrated to a declarative specification.
-def CopyOp : LinalgStructured_Op<"copy", [
- CopyOpInterface,
- NInputs<1>,
- ZeroInitTensors,
- NOutputs<1>
- ]> {
+def CopyOp : LinalgStructured_Op<"copy", [CopyOpInterface]> {
let description = [{
Copies the data in the input view into the output view.
@@ -137,6 +111,14 @@
}]>];
let extraClassDeclaration = libraryCallName # [{
+ ValueRange inputs() {
+ return OperandRange{getOperands().begin(), getOperands().begin() + 1};
+ }
+
+ ValueRange outputs() {
+ return OperandRange{getOperands().begin() + 1, getOperands().begin() + 2};
+ }
+
// Rank-polymorphic.
// filling_value -> O(ivs) with parallel iterators.
ArrayAttr iterator_types() {
@@ -170,14 +152,16 @@
let hasCanonicalizer = 1;
}
-def FillOp : LinalgStructured_Op<"fill", [
- NInputs<0>,
- ZeroInitTensors,
- NOutputs<1>]> {
-
+def FillOp : LinalgStructured_Op<"fill", []> {
let arguments = (ins AnyStridedMemRef:$output,
AnyTypeOf<[AnyFloat, AnySignlessInteger, AnyVector]>:$value);
let extraClassDeclaration = libraryCallName # [{
+ ValueRange inputs() { return {}; }
+
+ ValueRange outputs() {
+ return OperandRange{getOperands().begin(), getOperands().begin() + 1};
+ }
+
// Rank-polymorphic.
// filling_value -> O(ivs) with parallel iterators.
ArrayAttr iterator_types() {
@@ -276,13 +260,8 @@
}];
}
-def ConvOp : PoolingBase_Op<"conv", [
- NInputs<2>,
- // Despite having reductions, this manually defined ConvOp may only take
- // memref operands and can never have init tensors.
- ZeroInitTensors,
- NOutputs<1>]> {
-
+// Only support buffer semantics.
+def ConvOp : PoolingBase_Op<"conv", []> {
let description = [{
Generic n-D convolution as described in the TF documentation:
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/nn/convolution
@@ -313,6 +292,14 @@
                       OptionalAttr<I64ElementsAttr>:$padding);
let extraClassDeclaration = commonUtils # [{
+ ValueRange inputs() {
+ return OperandRange{getOperands().begin(), getOperands().begin() + 2};
+ }
+
+ ValueRange outputs() {
+ return OperandRange{getOperands().begin() + 2, getOperands().begin() + 3};
+ }
+
// TODO: extend to support more than 1 dimensions and potentially grouping
// too.
unsigned getNumBatchDimensions() { return 1; }
@@ -335,6 +322,11 @@
// parallelized across; i.e. [zs] in the TF notation above whose number
// match `xs` (i.e. 1 window loop per "image" dimension).
// This may evolve in the future.
+    // Guard against ill-formed ops: assert that at least one window dimension
+    // exists so the unsigned subtraction computing `nWin` below cannot wrap
+    // around before the verifier runs.
+ assert(nPar > getNumBatchDimensions() + getNumInputFeatureDimensions() &&
+ "expected at least one window dimension (i.e. memref ranks greater "
+ "than 2)");
unsigned nWin =
nPar - getNumBatchDimensions() - getNumInputFeatureDimensions();
      SmallVector<StringRef, 8> iters(nPar, getParallelIteratorTypeName());
@@ -352,7 +344,8 @@
ArrayAttr indexing_maps() {
MLIRContext *context = getContext();
auto nWin = getNumWindowLoops();
- assert(nWin > 0 && "expected at least one window dimension");
+ assert(nWin > 0 && "expected at least one window dimension (i.e. memref "
+ "ranks greater than 2)");
unsigned idx = 0;
// In the following, AffineDimExprs are indexed in loop order:
// [ b, xs, k, q, zs]
@@ -394,13 +387,9 @@
let hasCanonicalizer = 1;
}
+// Only support buffer semantics.
class SingleInputPoolingBase_Op<string mnemonic>
-  : PoolingBase_Op<mnemonic, [NInputs<2>,
-    // Despite having reductions, this manually defined ConvOp may only take
-    // memref operands and can never have init tensors.
-    ZeroInitTensors,
-    NOutputs<1>]> {
+  : PoolingBase_Op<mnemonic, []> {
let description = [{
A base class for single input pooling function.
@@ -420,6 +409,14 @@
                       OptionalAttr<I64ElementsAttr>:$padding);
let extraClassDeclaration = commonUtils# [{
+ ValueRange inputs() {
+ return OperandRange{getOperands().begin(), getOperands().begin() + 2};
+ }
+
+ ValueRange outputs() {
+ return OperandRange{getOperands().begin() + 2, getOperands().begin() + 3};
+ }
+
ArrayAttr iterator_types() {
// Outer parallel loops are always the number of output dimensions.
unsigned nPar = getOutputShapedType(0).getRank();
@@ -493,11 +490,9 @@
class GenericOpBase<string mnemonic> : LinalgStructuredBase_Op<mnemonic, [AttrSizedOperandSegments,
- NamedStructuredOpTrait,
SingleBlockImplicitTerminator<"YieldOp">]> {
  let arguments = (ins Variadic<AnyShaped>:$inputs,
-                       Variadic<AnyMemRef>:$output_buffers,
-                       Variadic<AnyRankedTensor>:$init_tensors,
+                       Variadic<AnyShaped>:$outputs,
AffineMapArrayAttr:$indexing_maps,
ArrayAttr:$iterator_types,
OptionalAttr:$doc,
@@ -622,34 +617,26 @@
```mlir
%C = linalg.generic #trait_attribute
     ins(%A, %B : tensor<?x?xf32>, memref<?x?xf32, stride_specification>)
-    init(%C : tensor<?x?xf32>)
+    outs(%C : tensor<?x?xf32>)
{other-optional-attributes}
{region}
     -> (tensor<?x?xf32>)
```
-
- The `init` operand and the conventions around mixing tensors and buffers are
- described in more detail in the "Tensors and Buffers: Conventions and
- Limitations" section in the [Linalg Document](../docs/Linalg.md)
-
- Tensor values must be legalized by a buffer allocation pass before most
- transformations can be applied. Such legalizations move tensor return values
- into output buffer operands and updates the region arguments accordingly.
}];
let builders = [
OpBuilderDAG<(ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs,
- "ValueRange":$outputBuffers, "ValueRange":$initTensors,
-      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
-      "StringRef":$doc, "StringRef":$libraryCall,
+      "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps,
+      "ArrayRef<StringRef>":$iteratorTypes, "StringRef":$doc,
+ "StringRef":$libraryCall,
CArg<"function_ref", "nullptr">)>,
OpBuilderDAG<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers,
      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
"StringRef":$doc, "StringRef":$libraryCall,
CArg<"function_ref", "nullptr">)>,
OpBuilderDAG<(ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs,
- "ValueRange":$outputBuffers, "ValueRange":$initTensors,
-      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
+      "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps,
+      "ArrayRef<StringRef>":$iteratorTypes,
CArg<"function_ref", "nullptr">)>,
OpBuilderDAG<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers,
      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
@@ -714,8 +701,8 @@
```mlir
linalg.indexed_generic #matmul_trait
-      ins(%A, %B : memref<?x?xf32, stride_specification>,
-                   memref<?x?xf32, stride_specification>)
+      ins(%A, %B : memref<?x?xf32, stride_specification>,
+               memref<?x?xf32, stride_specification>)
       outs(%C : memref<?x?xf32, stride_specification>) {
(%offset_m: index, %offset_n: index, %offset_k: index,
%a: f32, %b: f32, %c: f32) :
@@ -761,27 +748,19 @@
```mlir
%C = linalg.indexed_generic #trait_attribute
-      ins(%A, %B : tensor<?x?xf32>, memref<?x?xf32, stride_specification>)
-      init(%C : tensor<?x?xf32>)
+      ins(%A, %B : tensor<?x?xf32>, memref<?x?xf32, stride_specification>)
+      outs(%C : tensor<?x?xf32>)
{other-optional-attributes}
{region_with_index_arguments}
      -> (tensor<?x?xf32>)
```
-
- The `init` operand and the conventions around mixing tensors and buffers are
- described in more detail in the "Tensors and Buffers: Conventions and
- Limitations" section in the [Linalg Document](../docs/Linalg.md)
-
- Tensor values must be legalized by a buffer allocation pass before most
- transformations can be applied. Such legalizations move tensor return values
- into output buffer operands and update the region arguments accordingly.
}];
let builders = [
OpBuilderDAG<(ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs,
- "ValueRange":$outputBuffers, "ValueRange":$initTensors,
-      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
-      "StringRef":$doc, "StringRef":$libraryCall,
+      "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps,
+      "ArrayRef<StringRef>":$iteratorTypes, "StringRef":$doc,
+ "StringRef":$libraryCall,
CArg<"function_ref",
"nullptr">)>,
OpBuilderDAG<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers,
@@ -790,8 +769,8 @@
CArg<"function_ref",
"nullptr">)>,
OpBuilderDAG<(ins "TypeRange":$resultTensorTypes, "ValueRange":$inputs,
- "ValueRange":$outputBuffers, "ValueRange":$initTensors,
-      "ArrayRef<AffineMap>":$indexingMaps, "ArrayRef<StringRef>":$iteratorTypes,
+      "ValueRange":$outputs, "ArrayRef<AffineMap>":$indexingMaps,
+      "ArrayRef<StringRef>":$iteratorTypes,
CArg<"function_ref",
"nullptr">)>,
OpBuilderDAG<(ins "ValueRange":$inputs, "ValueRange":$outputBuffers,
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterface.td b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterface.td
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterface.td
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterface.td
@@ -20,6 +20,24 @@
def LinalgStructuredInterface : OpInterface<"LinalgOp"> {
let cppNamespace = "::mlir::linalg";
let methods = [
+ //===------------------------------------------------------------------===//
+ // Loop types handling.
+ //===------------------------------------------------------------------===//
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the number of induction variables in the basic block. This should
+ always be 0 for index-free linalg ops. For IndexedGeneric, this must be
+      equal to numLoops.
+ }],
+ /*retTy=*/"unsigned",
+ /*methodName=*/"getNumPayloadInductionVariables",
+ /*args=*/(ins),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+        return isa<IndexedGenericOp>(this->getOperation()) ?
+ $_op.getNumLoops() : 0;
+ }]
+ >,
//===------------------------------------------------------------------===//
// Loop types handling.
//===------------------------------------------------------------------===//
@@ -125,42 +143,40 @@
getNumIterators(getReductionIteratorTypeName(), iters) == 1;
}]>,
//===------------------------------------------------------------------===//
- // Num input/output/initTensors arguments handling.
+ // Num input/output arguments handling.
//===------------------------------------------------------------------===//
- // These special methods must be defined by each op that wants to implement
- // the LinalgStructuredInterface. For now, this is either:
- // - Explicitly specified in the op definition.
- // - Derived from variadic attributes (for "named" ops, linalg.generic and
- // linalg.indexed_generic ops).
+ // These special methods rely on `inputs` and `outputs` being defined by
+ // each op that wants to implement the LinalgStructuredInterface.
InterfaceMethod<
/*desc=*/[{
Return the number of inputs.
}],
/*retTy=*/"unsigned",
- /*methodName=*/"getNumInputs"
- >,
- InterfaceMethod<
- /*desc=*/[{
- Return the number of init tensors.
- }],
- /*retTy=*/"unsigned",
- /*methodName=*/"getNumInitTensors"
+ /*methodName=*/"getNumInputs",
+ /*args=*/(ins),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return $_op.inputs().size();
+ }]
>,
InterfaceMethod<
/*desc=*/[{
Return the number of outputs.
}],
/*retTy=*/"unsigned",
- /*methodName=*/"getNumOutputs"
+ /*methodName=*/"getNumOutputs",
+ /*args=*/(ins),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return $_op.outputs().size();
+ }]
>,
//===------------------------------------------------------------------===//
- // Input arguments handling.
+ // Input operands handling.
//===------------------------------------------------------------------===//
InterfaceMethod<
/*desc=*/[{
- Return the `i`-th input value.
- The `i^th` input argument is always the `i^th` operand regardless of
- whether we have tensors or buffers.
+ Return the `i`-th input operand.
}],
/*retTy=*/"Value",
/*methodName=*/"getInput",
@@ -173,24 +189,7 @@
>,
InterfaceMethod<
/*desc=*/[{
- Return the index of the given input value `v`, or `None` if the value is
- not an input.
- }],
-    /*retTy=*/"llvm::Optional<unsigned>",
- /*methodName=*/"getIndexOfInput",
- /*args=*/(ins "Value":$value),
- /*methodBody=*/"",
- /*defaultImplementation=*/[{
- auto it = llvm::find(getInputs(), value);
- if (it != getInputs().end())
- return it - getInputs().begin();
- return llvm::None;
- }]
- >,
- InterfaceMethod<
- /*desc=*/[{
- Return the `i`-th input shaped type, irrespective of buffer or tensor
- type.
+      Return the `i`-th input shaped type.
}],
/*retTy=*/"ShapedType",
/*methodName=*/"getInputShapedType",
@@ -202,7 +201,7 @@
>,
InterfaceMethod<
/*desc=*/[{
- Return the input operands.
+ Return the range of input operands.
}],
/*retTy=*/"Operation::operand_range",
/*methodName=*/"getInputs",
@@ -215,7 +214,19 @@
>,
InterfaceMethod<
/*desc=*/[{
- Return the range over the input operands that are of buffer type.
+ Return the OpOperands for the input operands.
+ }],
+    /*retTy=*/"MutableArrayRef<OpOperand>",
+ /*methodName=*/"getInputOpOperands",
+ /*args=*/(ins),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return this->getOperation()->getOpOperands().take_front(getNumInputs());
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the subset of input operands that are of buffer type.
}],
    /*retTy=*/"SmallVector<Value, 4>",
/*methodName=*/"getInputBuffers",
@@ -223,417 +234,500 @@
/*methodBody=*/"",
/*defaultImplementation=*/[{
return llvm::to_vector<4>(llvm::make_filter_range(
-        getInputs(), [](Value in){ return in.getType().isa<MemRefType>(); }));
+        getInputs(),
+        [](Value in){ return in.getType().template isa<MemRefType>(); }));
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the subset of input operands that are of ranked tensor type.
+ Return the number of input buffer operands.
}],
-    /*retTy=*/"SmallVector<RankedTensorType, 4>",
- /*methodName=*/"getInputTensorTypes" ,
+ /*retTy=*/"unsigned",
+ /*methodName=*/"getNumInputBuffers",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
-      SmallVector<RankedTensorType, 4> res;
-      for (Type type : getInputs().getTypes())
-        if (auto t = type.template dyn_cast<RankedTensorType>())
- res.push_back(t);
- return res;
+ return $_op.getInputBuffers().size();
}]
>,
- //===------------------------------------------------------------------===//
- // Output arguments handling.
- //===------------------------------------------------------------------===//
InterfaceMethod<
/*desc=*/[{
- Return the output buffer at the given index, asserts that this is a
- buffer operand and not a tensor result.
- The `i^th` output argument is an operand (resp. a return value) iff it
- is a value of buffer type (resp. a return value of tensor type).
+ Return the `index`^th input buffer.
}],
/*retTy=*/"Value",
- /*methodName=*/"getOutputBuffer",
- /*args=*/(ins "unsigned":$i),
+ /*methodName=*/"getInputBuffer",
+ /*args=*/(ins "unsigned":$index),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- // Output buffers are passed as output buffer operands (side-effecting).
- // Output tensors are results.
- // The union of the 2 are all the outputs and we want to ensure i does
- // not overflow the buffer operands.
- assert(i + this->getOperation()->getNumResults() < $_op.getNumOutputs()
- && "overflowing output buffer index");
- return this->getOperation()->getOperand($_op.getNumInputs() + i);
+ assert(index < getNumInputBuffers());
+ return getInputBuffers()[index];
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the index of the given buffer value, or `None` if the value is
- not part of the output buffers.
+      Return the OpOperands for the subset of input operands that are of
+      buffer type.
      }],
-    /*retTy=*/"llvm::Optional<unsigned>",
-    /*methodName=*/"getIndexOfOutputBuffer",
-    /*args=*/(ins "Value":$value),
+    /*retTy=*/"SmallVector<OpOperand *, 4>",
+    /*methodName=*/"getInputBuffersOpOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto it = llvm::find(getOutputBuffers(), value);
- if (it != getOutputBuffers().end())
- return it - getOutputBuffers().begin();
- return llvm::None;
+      SmallVector<OpOperand *, 4> res;
+      res.reserve(getNumInputs());
+      for (OpOperand &o : getInputOpOperands())
+        if (o.get().getType().isa<MemRefType>())
+ res.push_back(&o);
+ return res;
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the type of the output buffer at the given index.
+ Return the subset of input operands that are of tensor type.
}],
- /*retTy=*/"MemRefType",
- /*methodName=*/"getOutputBufferType",
- /*args=*/(ins "unsigned":$i),
+    /*retTy=*/"SmallVector<Value, 4>",
+ /*methodName=*/"getInputTensors",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return getOutputBuffer(i).getType().template cast();
- }]>,
+ return llvm::to_vector<4>(llvm::make_filter_range(
+ getInputs(),
+        [](Value in){ return in.getType().template isa<RankedTensorType>(); }));
+ }]
+ >,
InterfaceMethod<
/*desc=*/[{
- Return the `i`-th output shaped type, irrespective of buffer or tensor
- type.
+      Return the OpOperands for the subset of input operands that are of
+      tensor type.
      }],
-    /*retTy=*/"ShapedType",
-    /*methodName=*/"getOutputShapedType",
-    /*args=*/(ins "unsigned":$i),
+    /*retTy=*/"SmallVector<OpOperand *, 4>",
+    /*methodName=*/"getInputTensorsOpOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return getShapedType(i + $_op.getNumInputs());
- }]>,
+      SmallVector<OpOperand *, 4> res;
+      res.reserve(getNumInputs());
+      for (OpOperand &o : getInputOpOperands())
+        if (o.get().getType().isa<RankedTensorType>())
+ res.push_back(&o);
+ return res;
+ }]
+ >,
InterfaceMethod<
/*desc=*/[{
- Return the results that are of ranked tensor type.
+ Return the types of the subset of input operands that are of buffer type.
}],
-    /*retTy=*/"SmallVector<RankedTensorType, 4>",
-    /*methodName=*/"getOutputTensorTypes",
+    /*retTy=*/"SmallVector<MemRefType, 4>",
+    /*methodName=*/"getInputBufferTypes",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
-      SmallVector<RankedTensorType, 4> res;
-      for (Type type : this->getOperation()->getResults().getTypes())
-        res.push_back(type.template cast<RankedTensorType>());
- return res;
- }]>,
+ return llvm::to_vector<4>(
+ llvm::map_range(
+ llvm::make_filter_range(
+ ValueRange(getInputs()).getTypes(),
+              [](Type in){ return in.isa<MemRefType>(); }),
+            [](Type in){ return in.cast<MemRefType>(); }));
+ }]
+ >,
InterfaceMethod<
/*desc=*/[{
- Return the output buffers (operands).
+ Return the types of the subset of input operands that are of ranked
+ tensor type.
}],
- /*retTy=*/"Operation::operand_range",
- /*methodName=*/"getOutputBuffers",
+    /*retTy=*/"SmallVector<RankedTensorType, 4>",
+    /*methodName=*/"getInputTensorTypes",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto range = this->getOperation()->getOperands();
- return {range.begin() + $_op.getNumInputs(),
- range.begin() + getNumInputsAndOutputBuffers()};
+ return llvm::to_vector<4>(
+ llvm::map_range(
+ llvm::make_filter_range(
+ ValueRange(getInputs()).getTypes(),
+              [](Type in){ return in.isa<RankedTensorType>(); }),
+            [](Type in){ return in.cast<RankedTensorType>(); }));
}]
>,
//===------------------------------------------------------------------===//
- // Input and Output arguments handling.
+ // Output operands handling.
//===------------------------------------------------------------------===//
InterfaceMethod<
/*desc=*/[{
- Return one single buffer at position `$i`.
+ Return the `i`-th output operand.
}],
/*retTy=*/"Value",
- /*methodName=*/"getBuffer",
+ /*methodName=*/"getOutput",
/*args=*/(ins "unsigned":$i),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(i < getNumInputsAndOutputBuffers() && "overflowing buffers index");
- return this->getOperation()->getOperand(i);
+ assert(i < $_op.getNumOutputs());
+ return this->getOperation()->getOperand(i + $_op.getNumInputs());
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the number of output buffers
+ Return the `i`-th output shaped type
}],
- /*retTy=*/"unsigned",
- /*methodName=*/"getNumOutputBuffers",
+ /*retTy=*/"ShapedType",
+ /*methodName=*/"getOutputShapedType",
+ /*args=*/(ins "unsigned":$i),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return getOutput(i).getType().template cast<ShapedType>();
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the range of output operands.
+ }],
+ /*retTy=*/"Operation::operand_range",
+ /*methodName=*/"getOutputs",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return $_op.getNumOutputs() - this->getOperation()->getNumResults();
+ auto start =
+ this->getOperation()->getOperands().begin() + $_op.getNumInputs();
+ return {start, start + $_op.getNumOutputs()};
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the number of inputs and outputs, irrespective of their buffer or
- tensor type.
+ Return the OpOperands for the output operands.
}],
- /*retTy=*/"unsigned",
- /*methodName=*/"getNumInputsAndOutputs",
+ /*retTy=*/"MutableArrayRef<OpOperand>",
+ /*methodName=*/"getOutputOpOperands",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return $_op.getNumInputs() + $_op.getNumOutputs();
+ return this->getOperation()->getOpOperands().slice(
+ getNumInputs(), getNumOutputs());
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the number of inputs, irrespective of their buffer or tensor type
- and output buffers
+ Return the subset of output operands that are of buffer type.
}],
- /*retTy=*/"unsigned",
- /*methodName=*/"getNumInputsAndOutputBuffers",
+ /*retTy=*/"SmallVector<Value, 4>",
+ /*methodName=*/"getOutputBuffers",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return $_op.getNumInputs() + $_op.getNumOutputs() -
- this->getOperation()->getNumResults();
+ return llvm::to_vector<4>(llvm::make_filter_range(
+ getOutputs(), [](Value in){ return in.getType().template isa<MemRefType>(); }));
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the range over inputs (irrespective of type) and output buffers.
+ Return the `index`^th output buffer.
}],
- /*retTy=*/"Operation::operand_range",
- /*methodName=*/"getInputsAndOutputBuffers",
+ /*retTy=*/"Value",
+ /*methodName=*/"getOutputBuffer",
+ /*args=*/(ins "unsigned":$index),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ assert(index < getNumOutputBuffers());
+ return getOutputBuffers()[index];
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the OpOperands for the subset of output operands that are of
+ buffer type.
+ }],
+ /*retTy=*/"SmallVector<OpOperand *, 4>",
+ /*methodName=*/"getOutputBuffersOpOperands",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto range = this->getOperation()->getOperands();
- return {range.begin(), range.begin() + getNumInputsAndOutputBuffers()};
+ SmallVector<OpOperand *, 4> res;
+ res.reserve(getNumOutputs());
+ for (OpOperand &o : getOutputOpOperands())
+ if (o.get().getType().isa<MemRefType>())
+ res.push_back(&o);
+ return res;
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the range over init tensors.
+ Return the number of output buffer operands.
}],
- /*retTy=*/"Operation::operand_range",
- /*methodName=*/"getInitTensors",
+ /*retTy=*/"unsigned",
+ /*methodName=*/"getNumOutputBuffers",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto range = this->getOperation()->getOperands();
- auto base = range.begin() + getNumInputsAndOutputBuffers();
- return {base, base + $_op.getNumInitTensors()};
+ return $_op.getOutputBuffers().size();
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return one single init tensor at position `$i`.
+ Return the subset of output operands that are of tensor type.
}],
- /*retTy=*/"Value",
- /*methodName=*/"getInitTensor",
- /*args=*/(ins "unsigned":$i),
+ /*retTy=*/"SmallVector<Value, 4>",
+ /*methodName=*/"getOutputTensors",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(i < $_op.getNumInitTensors() && "overflowing init tensor index");
- return getInitTensors()[i];
+ return llvm::to_vector<4>(llvm::make_filter_range(
+ getOutputs(),
+ [](Value in){ return in.getType().template isa<RankedTensorType>(); }));
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return true if the shaped operand index `i` is the index of an init
- tensor.
+ Return the OpOperands for the subset of output operands that are of
+ tensor type.
}],
- /*retTy=*/"bool",
- /*methodName=*/"isIndexOfAnInitTensor",
- /*args=*/(ins "unsigned":$i),
+ /*retTy=*/"SmallVector<OpOperand *, 4>",
+ /*methodName=*/"getOutputTensorsOpOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(i < $_op.getNumShapedOperands() && "overflowing shaped operand index");
- return i >= $_op.getNumInputs() + getNumOutputBuffers();
+ SmallVector<OpOperand *, 4> res;
+ res.reserve(getNumOutputs());
+ for (OpOperand &o : getOutputOpOperands())
+ if (o.get().getType().isa<RankedTensorType>())
+ res.push_back(&o);
+ return res;
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the relative init tensor index of the shaped operand index.
+ Return the number of output tensor operands.
}],
/*retTy=*/"unsigned",
- /*methodName=*/"getInitTensorIndexFromShapedIndex",
- /*args=*/(ins "unsigned":$i),
+ /*methodName=*/"getNumOutputTensors",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(isIndexOfAnInitTensor(i) && "expected an init tensor index");
- return i - $_op.getNumInputs() - getNumOutputBuffers();
+ return $_op.getOutputTensors().size();
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the index of the given init tensor value, or `None` if the value
- is not part of the init tensors.
+ Return the types of the subset of output operands that are of buffer type.
}],
- /*retTy=*/"llvm::Optional<unsigned>",
- /*methodName=*/"getIndexOfInitTensor",
- /*args=*/(ins "Value":$value),
+ /*retTy=*/"SmallVector<MemRefType, 4>",
+ /*methodName=*/"getOutputBufferTypes",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto it = llvm::find(getInitTensors(), value);
- if (it != getInitTensors().end())
- return it - getInitTensors().begin();
- return llvm::None;
+ return llvm::to_vector<4>(
+ llvm::map_range(
+ llvm::make_filter_range(
+ ValueRange(getOutputs()).getTypes(),
+ [](Type in){ return in.isa<MemRefType>(); }),
+ [](Type in){ return in.cast<MemRefType>(); }));
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the number of inputs, output buffers and init tensors operands.
+ Return the types of the subset of output operands that are of ranked
+ tensor type.
}],
- /*retTy=*/"unsigned",
- /*methodName=*/"getNumShapedOperands",
+ /*retTy=*/"SmallVector<RankedTensorType, 4>",
+ /*methodName=*/"getOutputTensorTypes",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return getNumInputsAndOutputBuffers() + $_op.getNumInitTensors();
+ return llvm::to_vector<4>(
+ llvm::map_range(
+ llvm::make_filter_range(
+ ValueRange(getOutputs()).getTypes(),
+ [](Type in){ return in.isa<RankedTensorType>(); }),
+ [](Type in){ return in.cast<RankedTensorType>(); }));
}]
>,
+
+ //===------------------------------------------------------------------===//
+ // Input and Output arguments handling.
+ //===------------------------------------------------------------------===//
InterfaceMethod<
/*desc=*/[{
- Return the `i`-th shaped operand value, which can be an arbitrary input
- tensor/buffer, init tensor or output buffer.
+ Return true if the payload uses the value loaded from `opOperand`. This
+ is useful to avoid loading from "write-only" memory that may be
+ uninitialized, as well as properly cloning "read-write" operands.
}],
- /*retTy=*/"Value",
- /*methodName=*/"getShapedOperand",
- /*args=*/(ins "unsigned":$i),
+ /*retTy=*/"bool",
+ /*methodName=*/"payloadUsesValueFromOpOperand",
+ /*args=*/(ins "OpOperand *":$opOperand),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(i < $_op.getNumShapedOperands());
- return this->getOperation()->getOperand(i);
+ unsigned bbArgNumber =
+ getNumPayloadInductionVariables() + opOperand->getOperandNumber();
+ // Safeguard against the named linalg ops that are manually defined and
+ // that only support buffer semantics: we should not be there.
+ assert(this->getOperation()->getNumRegions() == 1);
+ Block &block = this->getOperation()->getRegion(0).front();
+ // Init tensors have uses.
+ return !block.getArgument(bbArgNumber).use_empty();
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the range over inputs, output buffers and init tensors.
+ Return true if the payload uses the value loaded from input operand
+ `index`.
}],
- /*retTy=*/"Operation::operand_range",
- /*methodName=*/"getShapedOperands",
- /*args=*/(ins),
+ /*retTy=*/"bool",
+ /*methodName=*/"payloadUsesValueFromInputOperandIndex",
+ /*args=*/(ins "unsigned":$index),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto range = this->getOperation()->getOperands();
- return {range.begin(), range.begin() + getNumShapedOperands()};
+ return payloadUsesValueFromOpOperand(&getInputOpOperands()[index]);
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the `i`-th shaped type, there are 3 cases:
- 1. if `i < $_op.getNumInputs()` then return `getInputShapedType(i)`;
- otherwise
- 2. if `i < getNumInputsAndOutputBuffers()` then return the
- `getOutputBufferType(i - $_op.getNumInputs())`; otherwise
- 3. return the `i - getNumInputsAndOutputBuffers()` result type.
+ Return true if the payload uses the value loaded from output operand
+ `index`.
}],
- /*retTy=*/"ShapedType",
- /*methodName=*/"getShapedType",
- /*args=*/(ins "unsigned":$i),
+ /*retTy=*/"bool",
+ /*methodName=*/"payloadUsesValueFromOutputOperandIndex",
+ /*args=*/(ins "unsigned":$index),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- if (i < $_op.getNumInputs())
- return getInputShapedType(i);
- if (i < getNumInputsAndOutputBuffers())
- return getOutputBufferType(i - $_op.getNumInputs());
- return this->getOperation()->getResult(
- i - getNumInputsAndOutputBuffers()).
- getType().template cast();
- }]>,
+ return payloadUsesValueFromOpOperand(&getOutputOpOperands()[index]);
+ }]
+ >,
InterfaceMethod<
/*desc=*/[{
- Return the shaped types for all the inputs and outputs
+ Return true if `opOperand` is an init tensor. This is true when it is
+ an output tensor operand whose value is used in the payload region.
}],
- /*retTy=*/"SmallVector<ShapedType, 4>",
- /*methodName=*/"getInputOutputShapedTypes",
+ /*retTy=*/"bool",
+ /*methodName=*/"isInitTensor",
+ /*args=*/(ins "OpOperand *":$opOperand),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ if (!opOperand->get().getType().template isa<RankedTensorType>())
+ return false;
+ if (opOperand->getOperandNumber() < $_op.getNumInputs())
+ return false;
+ return payloadUsesValueFromOpOperand(opOperand);
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return true if the operand at output index `index` is an init tensor.
+ }],
+ /*retTy=*/"bool",
+ /*methodName=*/"isIndexOfInitTensor",
+ /*args=*/(ins "unsigned":$index),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ assert(index < getNumOutputs());
+ return isInitTensor(
+ &this->getOperation()->getOpOperands()[$_op.getNumInputs() + index]);
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the output operands that are init tensors.
+ }],
+ /*retTy=*/"SmallVector<Value, 4>",
+ /*methodName=*/"getInitTensors",
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- SmallVector<Type, 4> inputOutputTypes(
- this->getOperation()->operand_type_begin(),
- this->getOperation()->operand_type_end());
- inputOutputTypes.append(this->getOperation()->result_type_begin(),
- this->getOperation()->result_type_end());
+ auto start =
+ this->getOperation()->getOpOperands().begin() + $_op.getNumInputs();
return llvm::to_vector<4>(
- llvm::map_range(inputOutputTypes, [](Type type) -> ShapedType {
- return type.cast<ShapedType>();
- }));
+ llvm::map_range(
+ llvm::make_filter_range(
+ llvm::make_range(start, start + $_op.getNumOutputs()),
+ [&](OpOperand &opOperand) {
+ return $_op.isInitTensor(&opOperand);
+ }),
+ [&](OpOperand &opOperand) {
+ return opOperand.get();
+ }));
}]
>,
InterfaceMethod<
/*desc=*/[{
- Return the first position of the shaped operand in the operand list.
+ Return the number of init tensor operands.
}],
- /*retTy=*/"Optional<unsigned>",
- /*methodName=*/"getIndexOfShapedOperand",
- /*args=*/(ins "Value":$value),
+ /*retTy=*/"unsigned",
+ /*methodName=*/"getNumInitTensors",
+ /*args=*/(ins),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return getInitTensors().size();
+ }]
+ >,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the number of input and output operands.
+ }],
+ /*retTy=*/"unsigned",
+ /*methodName=*/"getNumShapedOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- Optional<unsigned> inputIndex = getIndexOfInput(value);
- if (inputIndex.hasValue()) return inputIndex.getValue();
- Optional<unsigned> outputIndex = getIndexOfOutputBuffer(value);
- if (outputIndex.hasValue())
- return $_op.getNumInputs() + outputIndex.getValue();
- Optional<unsigned> initTensorIndex = getIndexOfInitTensor(value);
- if (initTensorIndex.hasValue())
- return $_op.getNumInputs() + $_op.getNumOutputBuffers() + initTensorIndex.getValue();
- return llvm::None;
+ return $_op.getNumInputs() + $_op.getNumOutputs();
}]
>,
InterfaceMethod<
/*desc=*/[{
- Returns the operand index given the input index. Returns None
- of the input index is invalid.
+ Return the `i`-th shaped operand value.
}],
- /*retTy=*/"Optional<unsigned>",
- /*methodName=*/"getOperandIndexForInputIndex",
- /*args=*/(ins "unsigned":$input_index),
+ /*retTy=*/"Value",
+ /*methodName=*/"getShapedOperand",
+ /*args=*/(ins "unsigned":$i),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- if (input_index >= $_op.getNumInputs())
- return llvm::None;
- return input_index;
+ assert(i < $_op.getNumShapedOperands());
+ return this->getOperation()->getOperand(i);
}]
>,
InterfaceMethod<
/*desc=*/[{
- Returns the operand index given the output index. Returns None
- of the output index is invalid.
+ Return the range over input and output operands.
}],
- /*retTy=*/"Optional<unsigned>",
- /*methodName=*/"getOperandIndexForOutputIndex",
- /*args=*/(ins "unsigned":$output_index),
+ /*retTy=*/"Operation::operand_range",
+ /*methodName=*/"getShapedOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- if (output_index >= $_op.getNumOutputs())
- return llvm::None;
- return output_index + $_op.getNumInputs();
+ auto range = this->getOperation()->getOperands();
+ return {range.begin(), range.begin() + getNumShapedOperands()};
}]
>,
InterfaceMethod<
/*desc=*/[{
- Returns the input index given the operand index. Return None
- if the operand index doesnt corresponding to an input.
+ Return the OpOperands for all the shaped operands.
}],
- /*retTy=*/"Optional<unsigned>",
- /*methodName=*/"getInputIndex",
- /*args=*/(ins "unsigned":$operand_index),
+ /*retTy=*/"MutableArrayRef<OpOperand>",
+ /*methodName=*/"getShapedOpOperands",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- if (operand_index >= $_op.getNumInputs())
- return llvm::None;
- return operand_index;
+ return this->getOperation()->getOpOperands().take_front(
+ getNumShapedOperands());
}]
>,
InterfaceMethod<
/*desc=*/[{
- Returns the output index given the operand index. Return None
- if the operand index doesnt corresponding to an output.
+ Return the types of the input and output operands.
}],
- /*retTy=*/"Optional<unsigned>",
- /*methodName=*/"getOutputIndex",
- /*args=*/(ins "unsigned":$operand_index),
+ /*retTy=*/"SmallVector<ShapedType, 4>",
+ /*methodName=*/"getShapedOperandTypes",
+ /*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- if (operand_index < $_op.getNumInputs() ||
- operand_index >= $_op.getNumInputs() + $_op.getNumOutputs())
- return llvm::None;
- return operand_index - $_op.getNumInputs();
+ return llvm::to_vector<4>(
+ llvm::map_range(
+ getShapedOperands(),
+ [](Value v) { return v.getType().cast<ShapedType>(); }));
}]
>,
+ InterfaceMethod<
+ /*desc=*/[{
+ Return the `i`-th shaped type
+ }],
+ /*retTy=*/"ShapedType",
+ /*methodName=*/"getShapedType",
+ /*args=*/(ins "unsigned":$i),
+ /*methodBody=*/"",
+ /*defaultImplementation=*/[{
+ return $_op.getShapedOperand(i).getType().template cast<ShapedType>();
+ }]>,
//===------------------------------------------------------------------===//
// Other interface methods.
@@ -679,7 +773,7 @@
/*args=*/(ins "unsigned":$i),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- assert(i < getNumInputsAndOutputs());
+ assert(i < $_op.getNumShapedOperands());
return getIndexingMaps()[i];
}]
>,
@@ -719,8 +813,8 @@
/*methodBody=*/"",
/*defaultImplementation=*/[{
return this->getOperation()->getNumResults() == 0 &&
- llvm::all_of(getInputs(),
- [](Value v) { return v.getType().isa<MemRefType>(); });
+ llvm::all_of(getShapedOperands(), [](Value v) {
+ return v.getType().template isa<MemRefType>(); });
}]
>,
InterfaceMethod<
@@ -732,11 +826,9 @@
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- auto isTensorType = [](Value v) {
- return v.getType().isa<RankedTensorType>();
- };
- return llvm::all_of(getInputs(), isTensorType) &&
- llvm::all_of(this->getOperation()->getResults(), isTensorType);
+ return llvm::all_of(getShapedOperands(), [](Value v) {
+ return v.getType().template isa<RankedTensorType>();
+ });
}]
>,
InterfaceMethod<
@@ -748,7 +840,8 @@
/*args=*/(ins),
/*methodBody=*/"",
/*defaultImplementation=*/[{
- return $_op->getAttr(getSparseAttrName()).template dyn_cast_or_null<ArrayAttr>() != nullptr;
+ return $_op->getAttr(getSparseAttrName()).
+ template dyn_cast_or_null<ArrayAttr>() != nullptr;
}]
>,
InterfaceMethod<
@@ -871,7 +964,7 @@
];
let extraClassDeclaration = [{
- /// Return the flat list of all operand dimension sizes in the order they
+ /// Return the flat list of all operand dimension sizes in the order they
/// appear in the operands.
SmallVector<Value, 4> createFlatListOfOperandDims(OpBuilder &, Location);
@@ -893,7 +986,7 @@
for (unsigned i = 0; i < nExtraOperands; ++i) {
res.push_back(getOperation()->getOperand(numShapedOperands + i));
assert((res.back().getType().isSignlessIntOrIndexOrFloat()
- || res.back().getType().isa<VectorType>()) &&
+ || res.back().getType().template isa<VectorType>()) &&
"expected scalar or vector type");
}
return res;
@@ -904,7 +997,6 @@
//========================================================================//
void setNumInputs(unsigned num) { setOperandSegmentAt(0, num); }
void setNumOutputBuffers(unsigned num) { setOperandSegmentAt(1, num); }
- void setNumInitTensors(unsigned num) { setOperandSegmentAt(2, num); }
private:
void setOperandSegmentAt(unsigned idx, unsigned val) {
@@ -916,6 +1008,8 @@
getOperation()->setAttr("operand_segment_sizes", newAttr);
}
}];
+
+ let verify = [{ return detail::verifyStructuredOpInterface($_op); }];
}
#endif // LINALG_IR_STRUCTURED_OPS_INTERFACE
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h b/mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
deleted file mode 100644
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
+++ /dev/null
@@ -1,166 +0,0 @@
-//===- LinalgTraits.h - Linalg Traits ---------------------------*- C++ -*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#ifndef MLIR_DIALECT_LINALG_LINALGTRAITS_H_
-#define MLIR_DIALECT_LINALG_LINALGTRAITS_H_
-
-#include "mlir/Dialect/Linalg/IR/LinalgTypes.h"
-#include "mlir/Dialect/Utils/StructuredOpsUtils.h"
-#include "mlir/IR/AffineMap.h"
-#include "mlir/IR/BuiltinOps.h"
-#include "mlir/IR/BuiltinTypes.h"
-#include "mlir/IR/OpDefinition.h"
-#include "mlir/Support/LLVM.h"
-
-namespace mlir {
-namespace OpTrait {
-namespace linalg {
-
-/// This class provides the API for ops that are known to have a specified
-/// number of inputs, all passed as operands. Use as a trait as follows:
-///
-/// class DotOp : public Op<DotOp, OpTrait::NInputs<2>::Impl> {
-///
-template <unsigned N> class NInputs {
-public:
- template <typename ConcreteType>
- class Impl : public OpTrait::TraitBase<ConcreteType, NInputs<N>::Impl> {
- public:
- static unsigned getNumInputs() { return N; }
- };
-};
-
-/// This class provides the API for ops that are known to not have init tensor
-/// operands. Use as a trait as follows:
-///
-/// class CopyOp : public Op<CopyOp, OpTrait::ZeroInitTensors> {
-///
-template <typename ConcreteType>
-class ZeroInitTensors : public TraitBase<ConcreteType, ZeroInitTensors> {
-public:
- static unsigned getNumInitTensors() { return 0; }
-};
-
-/// This class provides the API for ops that are known to have a specified
-/// number of outputs, all passed as operands. Use as a trait as follows:
-///
-/// class DotOp : public Op<DotOp, OpTrait::NOutputs<1>::Impl> {
-///
-template <unsigned N> class NOutputs {
-public:
- template <typename ConcreteType>
- class Impl : public OpTrait::TraitBase<ConcreteType, NOutputs<N>::Impl> {
- public:
- static unsigned getNumOutputs() { return N; }
- };
-};
-
-/// This class provides a verifier for structured ops that are known to operate
-/// on buffers or tensors. This trait must be used in conjunction with an op
-/// definition or a trait that provides the methods `getNumInputs` and
-/// `getNumOutputs`. Use as a trait as follows:
-///
-/// class DotOp : public Op<DotOp, OpTrait::StructuredOpTraits> {
-///
-template <typename ConcreteType>
-class StructuredOpTraits
- : public OpTrait::TraitBase<ConcreteType, StructuredOpTraits> {
-public:
- static LogicalResult verifyTrait(Operation *op) {
- ConcreteType concreteOp = cast<ConcreteType>(op);
- auto nOperands = concreteOp.getNumInputsAndOutputBuffers();
- if (failed(OpTrait::impl::verifyAtLeastNOperands(op, nOperands)))
- return failure();
- if (op->getNumResults() > concreteOp.getNumOutputs())
- return op->emitError("unexpected #results > #outputs");
- return success();
- }
-};
-
-/// This class provides a verifier for structured ops that are known to operate
-/// on buffers or tensors and that support `ins`, `outs` and `init` arguments.
-/// This trait must be used in conjunction with an op definition or a trait that
-/// provides the methods `getNumInputs` and `getNumOutputs`.
-///
-/// Use as a trait as follows:
-///
-/// class MatmulOp : public Op<MatmulOp, OpTrait::NamedStructuredOpTrait> {
-///
-template <typename ConcreteType>
-class NamedStructuredOpTrait
- : public OpTrait::TraitBase<ConcreteType, NamedStructuredOpTrait> {
-public:
- unsigned getNumInputs() {
- return cast<ConcreteType>(this->getOperation()).inputs().size();
- }
- unsigned getNumInitTensors() {
- return cast<ConcreteType>(this->getOperation()).init_tensors().size();
- }
- unsigned getNumOutputs() {
- ConcreteType concreteOp = cast<ConcreteType>(this->getOperation());
- return concreteOp.output_buffers().size() +
- concreteOp.result_tensors().size();
- }
- static LogicalResult verifyTrait(Operation *op) {
- ConcreteType concreteOp = cast<ConcreteType>(op);
- unsigned nInputAndBufferOperands =
- concreteOp.getNumInputsAndOutputBuffers();
- if (failed(
- OpTrait::impl::verifyAtLeastNOperands(op, nInputAndBufferOperands)))
- return failure();
-
- SmallVector<AffineExpr, 4> redDims;
- concreteOp.getReductionDims(redDims);
- // If no result and no reduction, only check there is no init tensor and we
- // are done.
- if (redDims.empty() || op->getNumResults() == 0) {
- if (!concreteOp.init_tensors().empty())
- return op->emitError("expected empty `init` when op has no "
- "results or no reduction dims");
- return success();
- }
-
- // Only a single tensor result supported atm.
- if (op->getNumResults() != 1)
- return op->emitError(
- "expected single tensor result when reduction present");
-
- if (concreteOp.init_tensors().size() != op->getNumResults())
- return op->emitError(
- "expected #init tensors to match #results when reduction present");
-
- for (unsigned idx = 0, e = op->getNumResults(); idx < e; ++idx)
- if (concreteOp.init_tensors()[idx].getType() != op->getResultTypes()[idx])
- return op->emitError("expected init tensor #")
- << idx << " of the same type as result #" << idx;
-
- // Output tensor indexing map may not depend on reduction index.
- // TODO: this is not yet tested. Add a test when linalg.generic switches to
- // this representation.
- for (unsigned idx = 0, e = concreteOp.getNumOutputs(); idx < e; ++idx) {
- AffineMap outputMap = concreteOp.getOutputIndexingMap(idx);
- for (auto expr : outputMap.getResults()) {
- for (auto dim : redDims) {
- unsigned pos = dim.cast<AffineDimExpr>().getPosition();
- if (expr.isFunctionOfDim(pos))
- return op->emitError(
- "unexpected single tensor output indexing map ")
- << "is function of reduction dim @" << pos;
- }
- }
- }
-
- return success();
- }
-};
-
-} // namespace linalg
-} // namespace OpTrait
-} // namespace mlir
-
-#endif // MLIR_DIALECT_LINALG_LINALGTRAITS_H_
diff --git a/mlir/include/mlir/IR/OpBase.td b/mlir/include/mlir/IR/OpBase.td
--- a/mlir/include/mlir/IR/OpBase.td
+++ b/mlir/include/mlir/IR/OpBase.td
@@ -673,6 +673,11 @@
MemRefRankOf<[AnyType], [rank]>.predicate]>,
AnyStridedMemRef.description # " of rank " # rank>;
+class StridedMemRefRankOf<list<Type> allowedTypes, list<int> ranks> :
+ Type<And<[MemRefOf<allowedTypes>.predicate, HasAnyRankOfPred<ranks>]>,
+ StrJoin<!foreach(rank, ranks, rank # "D"), "/">.result # " " #
+ MemRefOf<allowedTypes>.description>;
+
// This represents a generic tuple without any constraints on element type.
def AnyTuple : Type<IsTupleTypePred, "tuple">;
diff --git a/mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir b/mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir
--- a/mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir
+++ b/mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir
@@ -22,7 +22,7 @@
%C = constant dense<1000.0> : tensor<2x4xf32>
%D = linalg.matmul ins(%A, %B: tensor<2x3xf32>, tensor<3x4xf32>)
- init(%C: tensor<2x4xf32>) -> tensor<2x4xf32>
+ outs(%C: tensor<2x4xf32>) -> tensor<2x4xf32>
%unranked = tensor.cast %D : tensor<2x4xf32> to tensor<*xf32>
call @print_memref_f32(%unranked) : (tensor<*xf32>) -> ()
diff --git a/mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp b/mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp
--- a/mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp
+++ b/mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp
@@ -13,6 +13,7 @@
#include "mlir/Dialect/Linalg/Analysis/DependenceAnalysis.h"
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/Dialect/StandardOps/IR/Ops.h"
+#include "mlir/IR/BuiltinOps.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
@@ -113,15 +114,16 @@
}
void LinalgDependenceGraph::addDependenceElem(DependenceType dt,
- LinalgOpView indexingOpView,
- LinalgOpView dependentOpView) {
+ OpOperand *indexingOpView,
+ OpOperand *dependentOpView) {
LLVM_DEBUG(dbgs() << "\nAdd dep type " << getDependenceTypeStr(dt) << ":\t ("
- << *indexingOpView.op << ", " << indexingOpView.operandIndex
- << ") -> \n\t\t(" << *dependentOpView.op << ", "
- << dependentOpView.operandIndex << ")");
- dependencesFromGraphs[dt][indexingOpView.op].push_back(
+ << indexingOpView->get() << " @"
+ << indexingOpView->getOperandNumber() << ") -> \n\t\t("
+ << dependentOpView->get() << " @"
+ << dependentOpView->getOperandNumber() << ")");
+ dependencesFromGraphs[dt][indexingOpView->getOwner()].push_back(
LinalgDependenceGraphElem{dependentOpView, indexingOpView, dt});
- dependencesIntoGraphs[dt][dependentOpView.op].push_back(
+ dependencesIntoGraphs[dt][dependentOpView->getOwner()].push_back(
LinalgDependenceGraphElem{indexingOpView, dependentOpView, dt});
}
@@ -156,57 +158,25 @@
}
void LinalgDependenceGraph::addDependencesBetween(LinalgOp src, LinalgOp dst) {
- for (auto srcView : llvm::enumerate(src.getOutputBuffers())) { // W
- unsigned srcIndex =
- src.getOperandIndexForOutputIndex(srcView.index()).getValue();
+ for (OpOperand *srcOpOperand : src.getOutputBuffersOpOperands()) { // W
// RAW graph
- for (auto dstView : llvm::enumerate(dst.getInputBuffers())) { // R
- if (aliases.alias(srcView.value(),
- dstView.value())) { // if alias, fill RAW
- unsigned dstIndex =
- dst.getOperandIndexForInputIndex(dstView.index()).getValue();
- addDependenceElem(DependenceType::RAW,
- LinalgOpView{src.getOperation(), srcIndex},
- LinalgOpView{dst.getOperation(), dstIndex});
- }
- }
+ for (OpOperand *dstOpOperand : dst.getInputBuffersOpOperands()) // R
+ if (aliases.alias(srcOpOperand->get(), dstOpOperand->get())) // RAW alias
+ addDependenceElem(DependenceType::RAW, srcOpOperand, dstOpOperand);
// WAW graph
- for (auto dstView : llvm::enumerate(dst.getOutputBuffers())) { // W
- if (aliases.alias(srcView.value(),
- dstView.value())) { // if alias, fill WAW
- unsigned dstIndex =
- dst.getOperandIndexForOutputIndex(dstView.index()).getValue();
- addDependenceElem(DependenceType::WAW,
- LinalgOpView{src.getOperation(), srcIndex},
- LinalgOpView{dst.getOperation(), dstIndex});
- }
- }
+ for (OpOperand *dstOpOperand : dst.getOutputBuffersOpOperands()) // W
+ if (aliases.alias(srcOpOperand->get(), dstOpOperand->get())) // WAW alias
+ addDependenceElem(DependenceType::WAW, srcOpOperand, dstOpOperand);
}
- for (auto srcView : llvm::enumerate(src.getInputBuffers())) { // R
- unsigned srcIndex =
- src.getOperandIndexForInputIndex(srcView.index()).getValue();
+ for (OpOperand *srcOpOperand : src.getInputBuffersOpOperands()) { // R
// RAR graph
- for (auto dstView : llvm::enumerate(dst.getInputBuffers())) { // R
- if (aliases.alias(srcView.value(),
- dstView.value())) { // if alias, fill RAR
- unsigned dstIndex =
- dst.getOperandIndexForInputIndex(dstView.index()).getValue();
- addDependenceElem(DependenceType::RAR,
- LinalgOpView{src.getOperation(), srcIndex},
- LinalgOpView{dst.getOperation(), dstIndex});
- }
- }
+ for (OpOperand *dstOpOperand : dst.getInputBuffersOpOperands()) // R
+ if (aliases.alias(srcOpOperand->get(), dstOpOperand->get())) // RAR alias
+ addDependenceElem(DependenceType::RAR, srcOpOperand, dstOpOperand);
// WAR graph
- for (auto dstView : llvm::enumerate(dst.getOutputBuffers())) { // W
- if (aliases.alias(srcView.value(),
- dstView.value())) { // if alias, fill WAR
- unsigned dstIndex =
- dst.getOperandIndexForOutputIndex(dstView.index()).getValue();
- addDependenceElem(DependenceType::WAR,
- LinalgOpView{src.getOperation(), srcIndex},
- LinalgOpView{dst.getOperation(), dstIndex});
- }
- }
+ for (OpOperand *dstOpOperand : dst.getOutputBuffersOpOperands()) // W
+ if (aliases.alias(srcOpOperand->get(), dstOpOperand->get())) // WAR alias
+ addDependenceElem(DependenceType::WAR, srcOpOperand, dstOpOperand);
}
}
@@ -248,17 +218,15 @@
// TODO: we are not considering paths yet, just interleaved positions.
for (auto dt : types) {
for (auto dependence : getDependencesFrom(src, dt)) {
- auto interimPos = linalgOpPositions.lookup(dependence.dependentOpView.op);
+ auto interimPos =
+ linalgOpPositions.lookup(dependence.dependentOpView->getOwner());
// Skip if not interleaved.
if (interimPos >= dstPos || interimPos <= srcPos)
continue;
- linalg::LinalgOp consumer =
- cast(dependence.indexingOpView.op);
- Value consumerView =
- consumer.getShapedOperand(dependence.indexingOpView.operandIndex);
+ Value consumerView = dependence.indexingOpView->get();
if (view && !aliases.alias(view, consumerView))
continue;
- auto *op = dependence.dependentOpView.op;
+ auto *op = dependence.dependentOpView->getOwner();
LLVM_DEBUG(dbgs() << "\n***Found covering dependence of type "
<< getDependenceTypeStr(dt) << ": " << *src << " -> "
<< *op << " on " << consumerView);
@@ -271,12 +239,10 @@
bool LinalgDependenceGraph::hasDependenceFrom(
LinalgOp srcLinalgOp, LinalgOp dstLinalgOp,
ArrayRef<DependenceType> depTypes) const {
- for (auto dep : depTypes) {
- for (auto dependence : getDependencesInto(dstLinalgOp, dep)) {
- if (dependence.dependentOpView.op == srcLinalgOp)
+ for (auto dep : depTypes)
+ for (auto dependence : getDependencesInto(dstLinalgOp, dep))
+ if (dependence.dependentOpView->getOwner() == srcLinalgOp)
return true;
- }
- }
return false;
}
diff --git a/mlir/lib/Dialect/Linalg/EDSC/Builders.cpp b/mlir/lib/Dialect/Linalg/EDSC/Builders.cpp
--- a/mlir/lib/Dialect/Linalg/EDSC/Builders.cpp
+++ b/mlir/lib/Dialect/Linalg/EDSC/Builders.cpp
@@ -23,36 +23,25 @@
Operation *mlir::edsc::makeGenericLinalgOp(
ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,
- ArrayRef<StructuredIndexed> outputBuffers, ArrayRef<StructuredIndexed> initTensors,
- ArrayRef<StructuredIndexed> resultTensorTypes,
+ ArrayRef<StructuredIndexed> outputs, TypeRange resultTensorTypes,
function_ref<void(ValueRange)> regionBuilder, ArrayRef<Value> otherValues,
ArrayRef<Attribute> otherAttributes) {
OpBuilder &builder = edsc::ScopedContext::getBuilderRef();
// Build maps
SmallVector<SmallVector<AffineExpr, 4>, 4> exprsList;
- exprsList.reserve(inputs.size() + outputBuffers.size() + initTensors.size());
- for (auto container : {inputs, outputBuffers, resultTensorTypes})
+ exprsList.reserve(inputs.size() + outputs.size());
+
+ for (auto container : {inputs, outputs})
for (const StructuredIndexed &s : container)
exprsList.emplace_back(s.getExprs().begin(), s.getExprs().end());
auto maps = AffineMap::inferFromExprList(exprsList);
- SmallVector<Type, 4> types;
- assert(llvm::all_of(resultTensorTypes, [](const StructuredIndexed &s) {
- return !s.hasValue();
- }));
- std::copy(resultTensorTypes.begin(), resultTensorTypes.end(),
- std::back_inserter(types));
-
- SmallVector<Value, 4> inputValues, outputBufferValues, initTensorValues;
+ SmallVector<Value, 4> inputValues, outputValues;
inputValues.reserve(inputs.size());
- outputBufferValues.reserve(outputBuffers.size());
- initTensorValues.reserve(initTensors.size());
+ outputValues.reserve(outputs.size());
std::copy(inputs.begin(), inputs.end(), std::back_inserter(inputValues));
- std::copy(outputBuffers.begin(), outputBuffers.end(),
- std::back_inserter(outputBufferValues));
- std::copy(initTensors.begin(), initTensors.end(),
- std::back_inserter(initTensorValues));
+ std::copy(outputs.begin(), outputs.end(), std::back_inserter(outputValues));
auto iteratorStrTypes =
llvm::to_vector<8>(llvm::map_range(iteratorTypes, toString));
@@ -61,10 +50,9 @@
edsc::ScopedContext::getBuilderRef()
.create<linalg::GenericOp>(
edsc::ScopedContext::getLocation(),
- types,
+ resultTensorTypes,
inputValues,
- outputBufferValues,
- initTensorValues,
+ outputValues,
builder.getAffineMapArrayAttr(maps),
builder.getStrArrayAttr(iteratorStrTypes),
StringAttr() /*doc*/,
@@ -77,12 +65,10 @@
using namespace edsc;
SmallVector<Type, 8> blockTypes;
- blockTypes.reserve(inputs.size() + outputBuffers.size() + initTensors.size());
- for (auto container : {inputs, outputBuffers})
+ blockTypes.reserve(inputs.size() + outputs.size());
+ for (auto container : {inputs, outputs})
for (const StructuredIndexed &s : container)
blockTypes.push_back(getElementTypeOrSelf(s.getType()));
- for (Value v : initTensors)
- blockTypes.push_back(getElementTypeOrSelf(v.getType()));
assert(op->getNumRegions() == 1);
assert(op->getRegion(0).empty());
@@ -119,11 +105,10 @@
linalg_yield(unaryOp(a));
};
if (O.getType().isa<RankedTensorType>())
- return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputBuffers=*/{},
- /*initTensors=*/{}, /*resultTensorTypes=*/{O},
- fun);
- return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputBuffers=*/{O},
- /*initTensors=*/{}, /*resultTensorTypes=*/{}, fun);
+ return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputs=*/{O},
+ /*resultTensorTypes=*/{O}, fun);
+ return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputs=*/{O},
+ /*resultTensorTypes=*/{}, fun);
}
Operation *mlir::edsc::ops::linalg_generic_pointwise_tanh(StructuredIndexed I,
@@ -144,12 +129,10 @@
linalg_yield(binaryOp(a, b));
};
if (O.getType().isa<RankedTensorType>())
- return makeGenericLinalgOp(
- iterTypes, /*inputs=*/{I1, I2}, /*outputBuffers=*/{},
- /*initTensors=*/{}, /*resultTensorTypes=*/{O}, fun);
+ return makeGenericLinalgOp(iterTypes, /*inputs=*/{I1, I2}, /*outputs=*/{O},
+ /*resultTensorTypes=*/{O}, fun);
return makeGenericLinalgOp(iterTypes, /*inputs=*/{I1, I2},
- /*outputBuffers=*/{O},
- /*initTensors=*/{}, /*resultTensorTypes=*/{}, fun);
+ /*outputs=*/{O}, /*resultTensorTypes=*/{}, fun);
}
Operation *mlir::edsc::ops::linalg_generic_pointwise_add(StructuredIndexed I1,
@@ -181,8 +164,7 @@
return makeGenericLinalgOp(
{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},
/*inputs=*/{A({m, k}), B({k, n})},
- /*outputBuffers=*/{C({m, n})},
- /*initTensors=*/{},
+ /*outputs=*/{C({m, n})},
/*resultTensorTypes=*/{},
regionBuilder);
// clang-format on
@@ -199,8 +181,7 @@
return makeGenericLinalgOp(
{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},
/*inputs=*/{A({m, k}), B({k, n})},
- /*outputBuffers=*/{},
- /*initTensors=*/{C({m, n})},
+ /*outputs=*/{C({m, n})},
/*resultTensorTypes=*/{D({m, n})},
regionBuilder);
// clang-format on
@@ -236,8 +217,7 @@
simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),
c}),
W({kh, kw, c, f}) },
- /*outputBuffers=*/{ O({b, h, w, f}) },
- /*initTensors=*/{},
+ /*outputs=*/{ O({b, h, w, f}) },
/*resultTensorTypes=*/{},
macRegionBuilder);
// clang-format on
@@ -272,9 +252,8 @@
simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),
c}),
W({kh, kw, c, dm})},
- /*outputBuffers=*/{
+ /*outputs=*/{
O({b, h, w, simplifyAffineExpr(c * depth_multiplier + dm, numDims, 0)})},
- /*initTensors=*/{},
/*resultTensorTypes=*/{},
macRegionBuilder);
// clang-format on
diff --git a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
--- a/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
+++ b/mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
@@ -88,22 +88,20 @@
/// Forward declarations.
template <typename NamedStructuredOpType>
-static void buildNamedStructuredOpRegionAndAttributes(
- OpBuilder &opBuilder, OperationState &result, TypeRange inputTypes,
- TypeRange outputBufferTypes, TypeRange initTensorTypes,
- TypeRange resultTypes);
+static void buildNamedStructuredOpRegionAndAttributes(OpBuilder &opBuilder,
+ OperationState &result,
+ TypeRange inputTypes,
+ TypeRange outputTypes);
static ParseResult
parseCommonStructuredOpParts(OpAsmParser &parser, OperationState &result,
SmallVectorImpl<Type> &inputTypes,
- SmallVectorImpl<Type> &outputBufferTypes,
- SmallVectorImpl<Type> &initTensorTypes);
+ SmallVectorImpl<Type> &outputTypes);
template <typename NamedStructuredOpType>
static ParseResult
parseNamedStructuredOpRegion(OpAsmParser &parser, Region ®ion,
- TypeRange inputTypes, TypeRange outputBufferTypes,
- TypeRange initTensorTypes, TypeRange resultTypes);
+ TypeRange inputTypes, TypeRange outputTypes);
static ParseResult
parseNamedStructuredOpResults(OpAsmParser &parser,
SmallVectorImpl<Type> &resultTypes);
@@ -122,9 +120,6 @@
template <typename NamedStructuredOpType>
static void printNamedStructuredOp(OpAsmPrinter &p, NamedStructuredOpType op);
-template <typename NamedStructuredOpType>
-static LogicalResult verifyNamedStructuredOp(NamedStructuredOpType op);
-
/// This is a common class used for patterns of the form
/// ```
/// someop(memrefcast) -> someop
@@ -152,11 +147,10 @@
//===----------------------------------------------------------------------===//
void GenericOp::build(
OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes,
- ValueRange inputs, ValueRange outputBuffers, ValueRange initTensors,
- ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
- StringRef doc, StringRef libraryCall,
+ ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
+ ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
- build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
+ build(builder, result, resultTensorTypes, inputs, outputs,
builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),
doc.empty() ? StringAttr() : builder.getStringAttr(doc),
@@ -166,7 +160,7 @@
return;
SmallVector<Type, 8> blockArgTypes;
- for (ValueRange container : {inputs, outputBuffers, initTensors})
+ for (ValueRange container : {inputs, outputs})
for (Value v : container)
blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());
@@ -178,41 +172,40 @@
void GenericOp::build(
OpBuilder &builder, OperationState &result, ValueRange inputs,
- ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
+ ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
- build(builder, result, TypeRange{}, inputs, outputBuffers, ValueRange{},
- indexingMaps, iteratorTypes, doc, libraryCall, bodyBuild);
+ build(builder, result, TypeRange{}, inputs, outputs, indexingMaps,
+ iteratorTypes, doc, libraryCall, bodyBuild);
}
void GenericOp::build(
OpBuilder &builder, OperationState &result, ValueRange inputs,
- ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
+ ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
- build(builder, result, inputs, outputBuffers, indexingMaps, iteratorTypes,
+ build(builder, result, inputs, outputs, indexingMaps, iteratorTypes,
/*doc=*/"",
/*libraryCall=*/"", bodyBuild);
}
void GenericOp::build(
OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes,
- ValueRange inputs, ValueRange outputBuffers, ValueRange initTensors,
- ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
+ ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
+ ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
- build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
- indexingMaps, iteratorTypes,
+ build(builder, result, resultTensorTypes, inputs, outputs, indexingMaps,
+ iteratorTypes,
/*doc=*/"",
/*libraryCall=*/"", bodyBuild);
}
void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes,
- ValueRange inputs, ValueRange outputBuffers, ValueRange initTensors,
- ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
- StringRef doc, StringRef libraryCall,
+ ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
+ ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>
bodyBuild) {
- build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
+ build(builder, result, resultTensorTypes, inputs, outputs,
builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),
doc.empty() ? StringAttr() : builder.getStringAttr(doc),
@@ -223,7 +216,7 @@
unsigned nLoops = iteratorTypes.size();
SmallVector<Type, 8> blockArgTypes(nLoops, builder.getIndexType());
- for (ValueRange container : {inputs, outputBuffers, initTensors})
+ for (ValueRange container : {inputs, outputs})
for (Value v : container)
blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());
@@ -237,32 +230,32 @@
void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, ValueRange inputs,
- ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
+ ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>
bodyBuild) {
- build(builder, result, TypeRange{}, inputs, outputBuffers, ValueRange{},
- indexingMaps, iteratorTypes, doc, libraryCall, bodyBuild);
+ build(builder, result, TypeRange{}, inputs, outputs, indexingMaps,
+ iteratorTypes, doc, libraryCall, bodyBuild);
}
void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, ValueRange inputs,
- ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
+ ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>
bodyBuild) {
- build(builder, result, inputs, outputBuffers, indexingMaps, iteratorTypes,
+ build(builder, result, inputs, outputs, indexingMaps, iteratorTypes,
/*doc=*/"", /*libraryCall=*/"", bodyBuild);
}
void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, TypeRange resultTensorTypes,
- ValueRange inputs, ValueRange outputBuffers, ValueRange initTensors,
- ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
+ ValueRange inputs, ValueRange outputs, ArrayRef<AffineMap> indexingMaps,
+ ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>
bodyBuild) {
- build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
- indexingMaps, iteratorTypes,
+ build(builder, result, resultTensorTypes, inputs, outputs, indexingMaps,
+ iteratorTypes,
/*doc=*/"",
/*libraryCall=*/"", bodyBuild);
}
@@ -327,9 +320,8 @@
dictAttr.getValue().end());
// Parsing is shared with named ops, except for the region.
- SmallVector<Type, 1> inputTypes, outputBufferTypes, initTensorTypes;
- if (parseCommonStructuredOpParts(parser, result, inputTypes,
- outputBufferTypes, initTensorTypes))
+ SmallVector<Type, 1> inputTypes, outputTypes;
+ if (parseCommonStructuredOpParts(parser, result, inputTypes, outputTypes))
return failure();
// Optional attributes may be added.
@@ -360,7 +352,7 @@
static void getGenericEffectsImpl(
SmallVectorImpl<SideEffects::EffectInstance<MemoryEffects::Effect>>
&effects,
- ValueRange results, ValueRange inputBuffers, ValueRange outputBuffers) {
+ ValueRange results, ValueRange inputBuffers, ValueRange outputs) {
for (Value value : results) {
effects.emplace_back(MemoryEffects::Allocate::get(), value,
SideEffects::DefaultResource::get());
@@ -369,7 +361,7 @@
effects.emplace_back(MemoryEffects::Read::get(), value,
SideEffects::DefaultResource::get());
}
- for (Value value : outputBuffers) {
+ for (Value value : outputs) {
effects.emplace_back(MemoryEffects::Read::get(), value,
SideEffects::DefaultResource::get());
effects.emplace_back(MemoryEffects::Write::get(), value,
@@ -391,65 +383,151 @@
getInputBuffers(), getOutputBuffers());
}
-namespace {
+LogicalResult mlir::linalg::detail::verifyStructuredOpInterface(Operation *op) {
+ LinalgOp linalgOp = cast<LinalgOp>(op);
+ // Expect at least one shaped operand.
+ // This means an op that constructs a tensor out of indices cannot be a
+ // LinalgOp at the moment. For now this will have to be a special op until we
+ // have output shape operands that are not tensors.
+ auto nShapedOperands = linalgOp.getNumShapedOperands();
+ if (nShapedOperands == 0)
+ return linalgOp.emitOpError("expected at least 1 Shaped operand");
+ if (failed(OpTrait::impl::verifyAtLeastNOperands(op, nShapedOperands)))
+ return failure();
+ // Should have at least one output tensor per result tensor.
+ // Can also have output buffers that do not correspond to results.
+ if (op->getNumResults() > linalgOp.getNumOutputs())
+ return op->emitError("unexpected #results > #outputs");
+
+ // All shaped operands must be indexed.
+ if (linalgOp.indexing_maps().size() != linalgOp.getNumShapedOperands())
+ return linalgOp.emitOpError("expected the number of indexing_map (")
+ << linalgOp.indexing_maps().size()
+ << ") to be equal to the number of shaped operands ("
+ << linalgOp.getNumShapedOperands() << ")";
-template <typename GenericOpType>
-struct BlockArgsVerifier {
- static LogicalResult verify(GenericOpType op, Block &block);
-};
+ SmallVector<AffineMap, 4> indexingMaps;
+ indexingMaps.reserve(linalgOp.indexing_maps().size());
+ for (auto en : llvm::enumerate(linalgOp.indexing_maps())) {
+ auto idx = en.index();
+ auto m = en.value().template cast<AffineMapAttr>().getValue();
+ indexingMaps.push_back(m); // Save reference to map for further checks.
+ auto shapedValue = linalgOp.getShapedType(idx);
-template <typename GenericOpType>
-LogicalResult BlockArgsVerifier<GenericOpType>::verify(GenericOpType op,
- Block &block) {
- auto nOperands = op.getNumOperands();
- if (block.getNumArguments() != nOperands)
- return op.emitOpError("expected number of block arguments to match number "
- "of operands");
+ // Symbols disallowed.
+ if (m.getNumSymbols() != 0)
+ return linalgOp.emitOpError("unexpected symbols in indexing_map #")
+ << idx;
- // Note: the number and type of yield values are checked in the YieldOp.
- auto nInputViews = op.getNumInputs();
- for (unsigned i = 0; i < nOperands; ++i) {
- auto viewType = op.getShapedType(i);
- if (viewType.getElementType() != block.getArgument(i).getType())
- return op.emitOpError("expected block argument ")
- << (i + 1) << " of the same type as elemental type of "
- << ((i < nInputViews) ? "input " : "output ")
- << "operand: " << viewType;
+ // Domain must be consistent.
+ auto nLoops = linalgOp.getNumLoops();
+ if (m.getNumDims() != nLoops)
+ return linalgOp.emitOpError("expected indexing_map #")
+ << idx << " to have " << nLoops
+ << " dim(s) to match the number of loops";
+
+ if (m.getNumResults() != shapedValue.getRank())
+ return linalgOp.emitOpError("expected shaped value rank (")
+ << shapedValue << ") to match the result rank of indexing_map #"
+ << idx << " (" << m.getNumResults() << ")";
}
- return success();
-}
-template <>
-LogicalResult BlockArgsVerifier<IndexedGenericOp>::verify(IndexedGenericOp op,
- Block &block) {
- auto nInputViews = op.getNumInputs();
- auto nLoops = op.getNumLoops();
- auto nOperands = op.getNumOperands();
- if (block.getNumArguments() != nOperands + nLoops)
- return op.emitOpError(
- "expected number of block arguments to match number of operands + "
- "number of loops");
+ SmallVector<AffineExpr, 4> redDims;
+ linalgOp.getReductionDims(redDims);
+
+ // Simplifying assumption: either full tensor or full buffer mode.
+ // This allows simpler verification of output operands vs result types
+ // without premature tracking of which operand is what in mixed-mode.
+ // TODO: relax when mixed-mode needs to pass verification.
+ if (linalgOp.getNumOutputBuffers() > 0 && linalgOp.getNumOutputTensors() > 0)
+ return op->emitError("expected output operands to all have tensor type or "
+ "all have buffer type");
+
+ if (op->getNumResults() != linalgOp.getNumOutputTensors())
+ return op->emitError("expected as many output tensor operands as results");
+ for (auto it :
+ llvm::zip(linalgOp.getOutputOpOperands(), op->getResultTypes())) {
+ if (!std::get<0>(it).get().getType().isa<RankedTensorType>())
+ continue;
+ if (std::get<0>(it).get().getType() != std::get<1>(it))
+ return op->emitError("expected type of operand #")
+ << std::get<0>(it).getOperandNumber() << " ("
+ << std::get<0>(it).get().getType() << ")"
+ << " to match type of corresponding result (" << std::get<1>(it)
+ << ")";
+ }
+
+ // Output tensor indexing map may not depend on reduction indices.
+ for (OpOperand &opOperand : linalgOp.getOutputOpOperands()) {
+ AffineMap outputMap = linalgOp.getIndexingMap(opOperand.getOperandNumber());
+ for (auto expr : outputMap.getResults()) {
+ for (auto dim : redDims) {
+ unsigned pos = dim.cast<AffineDimExpr>().getPosition();
+ if (expr.isFunctionOfDim(pos)) {
+ std::string exprStr;
+ {
+ llvm::raw_string_ostream os(exprStr);
+ os << expr;
+ }
+ return op->emitError(
+ "unexpected output tensor expression in indexing map #")
+ << (opOperand.getOperandNumber() - linalgOp.getNumInputs())
+ << " a.k.a '" << exprStr
+ << "' is function of reduction iterator 'd" << pos << "'";
+ }
+ }
+ }
+ }
+
+ // Named ops that are defined manually have a region builder but no region at
+ // this time. Assume the region is well-formed by specification.
+ // TODO: use linalg-ods-gen for all ops when we have enough expressive power.
+ if (linalgOp->getNumRegions() == 0) {
+ assert(!linalgOp.getRegionBuilder() && "regionBuilder but no region");
+ return success();
+ }
+
+ auto ®ion = linalgOp->getRegion(0);
+ if (linalgOp->getNumRegions() > 1 || !llvm::hasSingleElement(region))
+ return op->emitOpError("expected 1 region with 1 block");
+
+ if (!linalgOp.getShapesToLoopsMap())
+ return op->emitOpError("expected the shape-to-loops map to be non-null");
+
+ // Simplifying assumption: bbargs match 1-1 with shape operands elemental
+ // types.
+ // TODO: once ranked shape types are plugged in, we may want to drop the
+ // corresponding bbargs, that can never be read from. This will be subject to
+ // consistency discussions (i.e. what to do with output tensors whose bbarg is
+ // not used).
+ Block &block = linalgOp->getRegion(0).front();
+ unsigned numBBIvs = linalgOp.getNumPayloadInductionVariables();
+
+ if (linalgOp.getNumShapedOperands() + numBBIvs != block.getNumArguments())
+ return op->emitError("expected as many non-induction variable region basic "
+ "block arguments as the number of shaped operands");
// Note: the number and type of yield values are checked in the YieldOp.
- for (unsigned i = 0; i < nLoops; ++i)
+ for (unsigned i = 0; i < numBBIvs; ++i)
if (!block.getArgument(i).getType().isIndex())
- return op.emitOpError("expected block argument ")
- << (i + 1) << " to be an index";
-
- for (unsigned i = 0; i < nOperands; ++i) {
- unsigned memrefArgIndex = i + nLoops;
- auto viewType = op.getShapedType(i);
- if (viewType.getElementType() !=
- block.getArgument(memrefArgIndex).getType())
- return op.emitOpError("expected block argument ")
- << (memrefArgIndex + 1)
- << " of the same type as elemental type of "
- << ((i < nInputViews) ? "input " : "output ")
- << "operand: " << viewType;
+ return op->emitOpError("expected index block argument #") << i;
+
+ unsigned idx = 0;
+ for (auto it : llvm::zip(linalgOp.getShapedOperandTypes(),
+ block.getArguments().drop_front(numBBIvs))) {
+ if (std::get<0>(it).getElementType() != std::get<1>(it).getType())
+ return op->emitError("expected type of bb argument #")
+ << (idx + numBBIvs) << " (" << std::get<1>(it).getType() << ")"
+ << " to match element type of corresponding shaped operand ("
+ << std::get<0>(it).getElementType() << ")";
+ ++idx;
}
+
return success();
}
+namespace {
+
template <typename GenericOpType>
struct AnnotationsVerifier {
static LogicalResult verify(GenericOpType op) { return success(); }
@@ -465,7 +543,7 @@
return op.emitOpError("expected sparse annotations on tensors only");
if (op.getNumOutputs() != 1)
return op.emitOpError("expected single output tensor");
- unsigned numTensors = op.getNumInputsAndOutputs();
+ unsigned numTensors = op.getNumShapedOperands();
if (sparseAttr.size() != numTensors)
return op.emitOpError("expected one sparse annotation for each tensor");
for (unsigned t = 0; t < numTensors; t++) {
@@ -497,49 +575,6 @@
template <typename GenericOpType>
static LogicalResult verifyGenericOp(GenericOpType op) {
- auto nLoops = op.getNumLoops();
-
- if (op.inputs().size() + op.output_buffers().size() +
- op.init_tensors().size() + op.getNumResults() ==
- 0)
- return op.emitOpError("expected at least 1 Shaped operand or return");
-
- auto ®ion = op.region();
- if (!llvm::hasSingleElement(region))
- return op.emitOpError("expected region with 1 block");
- if (failed(BlockArgsVerifier<GenericOpType>::verify(op, region.front())))
- return failure();
-
- if (op.indexing_maps().size() != op.getNumInputsAndOutputs())
- return op.emitOpError("expected the number of indexing_map (")
- << op.indexing_maps().size()
- << ") to be equal to the number of inputs and outputs ("
- << op.getNumInputsAndOutputs() << ")";
-
- SmallVector<AffineMap, 4> indexingMaps;
- indexingMaps.reserve(op.indexing_maps().size());
- for (auto en : llvm::enumerate(op.indexing_maps())) {
- auto idx = en.index();
- auto m = en.value().template cast<AffineMapAttr>().getValue();
- indexingMaps.push_back(m); // Save reference to map for further checks.
- auto view = op.getShapedType(idx);
-
- if (m.getNumSymbols() != 0)
- return op.emitOpError("unexpected symbols in indexing_map #") << idx;
-
- if (m.getNumDims() != nLoops)
- return op.emitOpError("expected indexing_map #")
- << idx << " to have " << nLoops
- << " dim(s) to match the number of loops";
-
- if (m.getNumResults() != view.getRank())
- return op.emitOpError("expected indexing_map #")
- << idx << " results to match view rank: " << view;
- }
-
- if (!op.getShapesToLoopsMap())
- return op.emitOpError("expected the shape-to-loops map to be non-null");
-
if (failed(AnnotationsVerifier::verify(op)))
return failure();
@@ -1380,8 +1415,6 @@
return op.emitOpError("expects memref elemental types to match");
if (oType.getRank() != iType.getRank() || oType.getRank() != fType.getRank())
return op.emitOpError("expects memref ranks to match");
- if (oType.getRank() <= 2)
- return op.emitOpError("expects memref ranks to be greater than 2");
if (auto strides = op.strides()) {
if (failed(
verifyStrideOrDilation(op, strides->getValue(), /*isStride=*/true)))
@@ -1591,13 +1624,12 @@
template <typename NamedStructuredOpType>
static void buildNamedStructuredOpRegionAndAttributesImpl(
OpBuilder &opBuilder, Region ®ion, TypeRange inputTypes,
- TypeRange outputBufferTypes, TypeRange initTensorTypes,
- TypeRange resultTypes,
+ TypeRange outputTypes,
std::function<void(unsigned, unsigned)> errorHandler) {
// TODO: atm all operands go through getElementTypeOrSelf,
// reconsider when we have evidence we need to.
SmallVector argTypes;
- for (auto containers : {inputTypes, outputBufferTypes, resultTypes})
+ for (auto containers : {inputTypes, outputTypes})
for (auto t : containers)
argTypes.push_back(getElementTypeOrSelf(t));
@@ -1622,13 +1654,11 @@
void buildNamedStructuredOpRegionAndAttributes(OpBuilder &opBuilder,
OperationState &result,
TypeRange inputTypes,
- TypeRange outputBufferTypes,
- TypeRange initTensorTypes,
- TypeRange resultTypes) {
+ TypeRange outputTypes) {
Region ®ion = *result.addRegion();
buildNamedStructuredOpRegionAndAttributesImpl(
- opBuilder, region, inputTypes, outputBufferTypes, initTensorTypes,
- resultTypes, [&](unsigned expected, unsigned actual) {
+ opBuilder, region, inputTypes, outputTypes,
+ [&](unsigned expected, unsigned actual) {
llvm::errs() << "region expects " << expected << " args, got "
<< actual;
assert(expected != actual && "incorrect number of arguments");
@@ -1638,13 +1668,12 @@
template <typename NamedStructuredOpType>
static ParseResult
parseNamedStructuredOpRegion(OpAsmParser &parser, Region ®ion,
- TypeRange inputTypes, TypeRange outputBufferTypes,
- TypeRange initTensorTypes, TypeRange resultTypes) {
+ TypeRange inputTypes, TypeRange outputTypes) {
ParseResult res = success();
OpBuilder opBuilder(parser.getBuilder().getContext());
buildNamedStructuredOpRegionAndAttributesImpl(
- opBuilder, region, inputTypes, outputBufferTypes, initTensorTypes,
- resultTypes, [&](unsigned expected, unsigned actual) {
+ opBuilder, region, inputTypes, outputTypes,
+ [&](unsigned expected, unsigned actual) {
res = parser.emitError(parser.getCurrentLocation(),
llvm::formatv("region expects {0} args, got {1}",
expected, actual));
@@ -1664,12 +1693,9 @@
static ParseResult
parseCommonStructuredOpParts(OpAsmParser &parser, OperationState &result,
SmallVectorImpl