This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

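mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h (inline comments: 4/4 done)
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td (inline comments: 11/11 done)
mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h
mlir/lib/Conversion/LinalgToSPIRV/LinalgToSPIRV.cpp
mlir/lib/Dialect/Linalg/EDSC/Builders.cpp
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp (inline comments: 6/8 done)
mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp (inline comments: 3/3 done)
mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp
mlir/test/Conversion/LinalgToSPIRV/linalg-to-spirv.mlir
mlir/test/Dialect/Linalg/canonicalize.mlir
mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir
mlir/test/Dialect/Linalg/fold-unit-trip-loops.mlir
mlir/test/Dialect/Linalg/fusion-tensor.mlir
mlir/test/Dialect/Linalg/fusion.mlir
mlir/test/Dialect/Linalg/fusion_indexed_generic.mlir
mlir/test/Dialect/Linalg/inlining.mlir
mlir/test/Dialect/Linalg/invalid.mlir (inline comments: 4/4 done)
mlir/test/Dialect/Linalg/loops.mlir
mlir/test/Dialect/Linalg/parallel_loops.mlir
mlir/test/Dialect/Linalg/roundtrip.mlir
mlir/test/Dialect/Linalg/standard.mlir
mlir/test/Dialect/Linalg/tensors-to-buffers.mlir
mlir/test/Dialect/Linalg/tile.mlir
mlir/test/Dialect/Linalg/tile_indexed_generic.mlir
mlir/test/Dialect/Linalg/tile_parallel.mlir
mlir/test/Dialect/Linalg/tile_parallel_reduce.mlir
mlir/test/Dialect/Linalg/transform-patterns.mlir
mlir/test/EDSC/builder-api-test.cpp
mlir/test/Transforms/buffer-placement-preparation-allowed-memref-results.mlir
mlir/test/Transforms/buffer-placement-preparation.mlir
mlir/test/Transforms/buffer-placement.mlir
mlir/test/Transforms/copy-removal.mlir
mlir/test/lib/Transforms/TestBufferPlacement.cpp (inline comments: 2/3 done)
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp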

Differential D87938

[mlir][Linalg] Uniformize linalg.generic with named ops.
ClosedPublic

Authored by nicolasvasilache on Sep 18 2020, 1:22 PM.


Details

Reviewers

ftynse
pifon2a
mravishankar
stellaraccident
silvas
benvanik
herhut
antiagainst
aartbik
burmako

Summary

This revision allows representing a reduction at the level of linalg on tensors for generic ops by uniformizing with the named ops approach.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	80 ms	linux > MLIR.EDSC::builder-api-test.cpp
	4,340 ms	windows > MLIR.EDSC::builder-api-test.cpp

Event Timeline

nicolasvasilache created this revision. Sep 18 2020, 1:22 PM

Herald added a project: Restricted Project. Sep 18 2020, 1:22 PM

Herald added subscribers: tatianashp, ThomasRaoux, AlexeySotkin and 14 others.

nicolasvasilache requested review of this revision. Sep 18 2020, 1:22 PM

Herald added subscribers: limo1996, stephenneuendorffer. Sep 18 2020, 1:22 PM

Harbormaster completed remote builds in B72236: Diff 292889. Sep 18 2020, 1:51 PM

burmako accepted this revision. Sep 19 2020, 3:31 PM

burmako added inline comments.

mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
48	The comment looks outdated after the changes to the function signature.
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524	`{other-attributes}`? (Generally, I see the notation in this file is inconsistent between `[other-attributes]` and `{other-attributes}`, so maybe this change is intentional?).
614	Is this paragraph removed because this is no longer the case?
621	"update" instead of "updates"?
639	What do you think about the syntax which keeps type annotations at the end (while still introducing `ins` and friends)?
718	Missing curly brace?
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
46	In the function above, this parameter is called `outputBufferTypes` without the "s" in "Buffers". Also see `initTensorTypes` below.
mlir/test/Dialect/Linalg/invalid.mlir
394	Why is this no longer a test?
527	Likewise.

This revision is now accepted and ready to land. Sep 19 2020, 3:31 PM

nicolasvasilache marked 9 inline comments as done. Sep 21 2020, 12:09 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
614	Correct, "SSA values don't escape control flow" is not true anymore (e.g. scf.for with yield, which I am going to take advantage of in the future).
639	It has 2 drawbacks that I can see right now: (1) we still need to jump through hoops to figure out which arg is of what type in the multi-list; more local type information is more desirable, esp. when we mix input + output buffers + result tensors. (2) The goal is to have the parser and printer use the declarative assembly format with optional groups (blocked on some missing declarative assembly feature); optional groups need operands and types to be within the same parsing unit (can't have the type parsing relegated to the end).
mlir/test/Dialect/Linalg/invalid.mlir
394	it was duplicated, see the next test.
527	this is no longer valid, there is no such thing as #outputs anymore (previously independently specified by the args_out attribute).

burmako added inline comments. Sep 21 2020, 6:55 AM

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
639	"still need to jump through hoops to figure out which arg is of what type in the multi-list". If I understand correctly, you're talking about the more general problem of visually matching things with their types because there's a bunch of syntax between one and the other. This also exists in other operations like call/llvm.invoke (match arguments with their types) and, more generally, in default syntax for ops. One could even say it's also relevant to bigger ops in general, e.g. for subview (match the source with its type). Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax? The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func. Depending on the reader / formatting, this syntax may not necessarily be a readability improvement upon the status quo. Previously one could figure out the types of linalg.generic views by looking at the end of the syntax, and now the types live in the middle which is not necessarily easy to visually parse. This also affects the use case of mixing tensors and buffers because for that use case it seems to be beneficial to quickly skim the types involved in an op. Custom syntax that's stylistically different from the status quo may lead to other issues that I haven't thought about. For example, here's one that came to mind as I was ready to press the Submit button. https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices recommends that "Tests should be minimal, and only check what is absolutely necessary". As an example of minimalism, it shows an example that omits types from ops in expected syntax, e.g. `// CHECK-NEXT: return %[[RESULT]], %[[RESULT]]`. This is something that becomes harder with the proposed syntax since one cannot just drop stuff after the conventional colon (colon included), and instead we need to say things like `ins(%foo, %bar: {{.*}}), outs...`.

nicolasvasilache marked 5 inline comments as done. Sep 21 2020, 8:36 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
639	"and, more generally, in default syntax for ops": Yes, that is the unsugared form; it quickly becomes hard to read when you have enough operands.
"Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax?": Because parameter packs based on `operand_segment_sizes` need to be sugared, otherwise they are hard to segment out.
"The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func.": This relates to @mravishankar's comment on https://reviews.llvm.org/D87767. TL;DR: I agree that we want to go towards a func-like syntax where each arg is followed by its type. I would strongly prefer this to be done automatically, once and for all, by the declarative format to avoid developing more parser/printer debt. When the custom parser / printer accepts an interleaved mode it will be easy to have an NFC CL to update the syntax. The situation is much better than it was before though: if we have 3 parameter packs with 2, 3 and 4 arguments, there is strictly less jumping through hoops to determine the type of the second argument of the third pack (you just need to look for the 2nd local type instead of looking for the 7th global type).
"and now the types live in the middle which is not necessarily easy to visually parse": We could also put the region before the type arguments; would that alleviate part of the problem?
"readability improvement upon the status quo": Consistency wins here; first I need to fix the semantics gap and uniformize with named ops (see https://reviews.llvm.org/D87767). Also note that some of these arguments may become regions in the future. Once uniformity is achieved we can continue improving the parsing / pretty-printing in an NFC fashion. In particular, the regions should also ideally be simplified.
"stylistically different from the status quo": The status quo is now https://reviews.llvm.org/D87767; the generic ops have weaker semantics and need to be updated to allow tensors + reductions. Once the harder functional changes are landed, we can iterate on a better syntax. Note however that any proposed syntax will need to also work for named ops.

Additional changes and builders.

Harbormaster completed remote builds in B72390: Diff 293186. Sep 21 2020, 8:56 AM

I like the new syntax and encoding! Great cleanup. Some more verification of valid forms would be nice.

herhut added inline comments.

mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
51	I assume you mean `resultTensorTypes` here?
53	Is this an alternative way to encode tensor results?
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524	This is not the actual syntax, right? There is an `attrs` there.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
360	I would have expected `init_tensors` not to count here. They only exist for reductions on tensors, so they are implied. But this verification is not very precise anyway.
1222	Maybe assign to a local?
1450	nit: `parseColonTypeList`
1507	`printArrowTypeList` or `printOptionalArrowTypeList`?
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp
446	Left over?
512–521	I know this is a carry over but why `UnknownLoc`?
517	Reformat.
mlir/test/lib/Transforms/TestBufferPlacement.cpp
42	Not sure why this exists here. Can this not use the pattern from `TensorToBuffers.h` or is that not exposed there?

nicolasvasilache marked 10 inline comments as done. Sep 21 2020, 12:27 PM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h
53	Just stale stuff, cleaned up, thanks!
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
524	Indeed thanks! Ideally this would go away but this CL has grown fat and I am running out of patience writing throwaway parser/printer code. Will figure it out once we can use the declarative format.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
360	Ack, more work is needed to make reductions + tensors really work with Linalg. In particular for buffer allocation.
1222	Not getting this, could you please elaborate?
mlir/test/lib/Transforms/TestBufferPlacement.cpp
42	It's not exposed and it is not exposable without deeper refactorings because we don't want TensorToBuffer.h to depend on Linalg.

Address review comments.

Harbormaster completed remote builds in B72414: Diff 293233. Sep 21 2020, 12:50 PM

herhut added a subscriber: tpopp. Sep 22 2020, 1:39 AM

herhut added inline comments.

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
1222	It was just a nit to do `auto iteratorTypes = cast<LinalgOp>(op).iterator_types()` and then use that here and below.
mlir/test/lib/Transforms/TestBufferPlacement.cpp
42	There already is a `populateConvertLinalgOnTensorsToBuffersPattern` function; it is just not exposed. It should be enough to expose that function and call it here. No need to do this in this change; I can clean that up, too, once this has landed. We need to figure out where to put all the tensor-to-buffers patterns anyway, as different dialects will need them, and having a `populate` function in passes/transforms/rewrites seems the right approach to me. @tpopp FYI as you looked into patterns for `shape.assuming`.

This has landed as ed229132f1c4ea2ba0644fc345d8279e47a00565.

Revision Contents

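Path	Size
mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h	15 lines
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td	226 lines
mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h	2 lines
mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h	8 lines
mlir/lib/Conversion/LinalgToSPIRV/LinalgToSPIRV.cpp	2 lines
mlir/lib/Dialect/Linalg/EDSC/Builders.cpp	143 lines
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp	307 lines
mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp	44 lines
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp	81 lines
mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp	53 lines
mlir/test/Conversion/LinalgToSPIRV/linalg-to-spirv.mlir	32 lines
mlir/test/Dialect/Linalg/canonicalize.mlir	6 lines
mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir	39 lines
mlir/test/Dialect/Linalg/fold-unit-trip-loops.mlir	33 lines
mlir/test/Dialect/Linalg/fusion-tensor.mlir	198 lines
mlir/test/Dialect/Linalg/fusion.mlir	67 lines
mlir/test/Dialect/Linalg/fusion_indexed_generic.mlir	51 lines
mlir/test/Dialect/Linalg/inlining.mlir	8 lines
mlir/test/Dialect/Linalg/invalid.mlir	240 lines
mlir/test/Dialect/Linalg/loops.mlir	82 lines
mlir/test/Dialect/Linalg/parallel_loops.mlir	25 lines
mlir/test/Dialect/Linalg/roundtrip.mlir	118 lines
mlir/test/Dialect/Linalg/standard.mlir	17 lines
mlir/test/Dialect/Linalg/tensors-to-buffers.mlir	27 lines
mlir/test/Dialect/Linalg/tile.mlir	6 lines
mlir/test/Dialect/Linalg/tile_indexed_generic.mlir	12 lines
mlir/test/Dialect/Linalg/tile_parallel.mlir	17 lines
mlir/test/Dialect/Linalg/tile_parallel_reduce.mlir	15 lines
mlir/test/Dialect/Linalg/transform-patterns.mlir	42 lines
mlir/test/EDSC/builder-api-test.cpp	44 lines
mlir/test/Transforms/buffer-placement-preparation-allowed-memref-results.mlir	8 lines
mlir/test/Transforms/buffer-placement-preparation.mlir	85 lines
mlir/test/Transforms/buffer-placement.mlir	218 lines
mlir/test/Transforms/copy-removal.mlir	34 lines
mlir/test/lib/Transforms/TestBufferPlacement.cpp	71 lines
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp	2 lines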

Diff 292889

mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h

	[... 39 lines not shown ...]
	/// =============			/// =============
	///			///
	/// 1. `inputs` may contain StructuredIndexed that capture either buffer or			/// 1. `inputs` may contain StructuredIndexed that capture either buffer or
	/// tensor values.			/// tensor values.
	/// 2. `outputs` may contain StructuredIndexed that capture either buffer values			/// 2. `outputs` may contain StructuredIndexed that capture either buffer values
	/// or tensor types. If both buffer values and tensor types are present, then			/// or tensor types. If both buffer values and tensor types are present, then
	/// all buffer values must appear before any tensor type. Without this			/// all buffer values must appear before any tensor type. Without this
	/// restriction output tensor results would need to be reordered, which would			/// restriction output tensor results would need to be reordered, which would
	/// result in surprising behavior when combined with region definition.			/// result in surprising behavior when combined with region definition.
				[inline comment] burmako: The comment looks outdated after the changes to the function signature.
	Operation *makeGenericLinalgOp(			Operation *makeGenericLinalgOp(
	ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,			ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,
	ArrayRef<StructuredIndexed> outputs,			ArrayRef<StructuredIndexed> outputBuffers, ArrayRef<Value> initTensors,
				[inline comment] herhut: I assume you mean `resultTensorTypes` here?
				ArrayRef<StructuredIndexed> resultTensorTypes,
	function_ref<void(ValueRange)> regionBuilder = defaultRegionBuilder,			function_ref<void(ValueRange)> regionBuilder = defaultRegionBuilder,
				[inline comment] herhut: Is this an alternative way to encode tensor results?
				[inline comment] nicolasvasilache (author): Just stale stuff, cleaned up, thanks!
	ArrayRef<Value> otherValues = {}, ArrayRef<Attribute> otherAttributes = {});			ArrayRef<Value> otherValues = {}, ArrayRef<Attribute> otherAttributes = {});

	namespace ops {			namespace ops {
	using edsc::StructuredIndexed;			using edsc::StructuredIndexed;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// EDSC builders for linalg generic operations.			// EDSC builders for linalg generic operations.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	[... 73 lines not shown ...]
	linalg_generic_matmul(Value vA, Value vB, Value vC,			linalg_generic_matmul(Value vA, Value vB, Value vC,
	MatmulRegionBuilder regionBuilder = macRegionBuilder);			MatmulRegionBuilder regionBuilder = macRegionBuilder);

	/// Build a linalg.generic, under the current ScopedContext, at the current			/// Build a linalg.generic, under the current ScopedContext, at the current
	/// insert point, that computes:			/// insert point, that computes:
	/// ```			/// ```
	/// (m, n, k) = (par, par, seq)			/// (m, n, k) = (par, par, seq)
	/// \|			/// \|
	/// \| C(m, n) = sum_k(A(m, k) * B(k, n))
	/// ```
	/// and returns the tensor `C`.
	Operation *
	linalg_generic_matmul(Value vA, Value vB, RankedTensorType tC,
	MatmulRegionBuilder regionBuilder = mulRegionBuilder);

	/// Build a linalg.generic, under the current ScopedContext, at the current
	/// insert point, that computes:
	/// ```
	/// (m, n, k) = (par, par, seq)
	/// \|
	/// \| D(m, n) = C(m, n) + sum_k(A(m, k) * B(k, n))			/// \| D(m, n) = C(m, n) + sum_k(A(m, k) * B(k, n))
	/// ```			/// ```
	/// and returns the tensor `D`.			/// and returns the tensor `D`.
	Operation *			Operation *
	linalg_generic_matmul(Value vA, Value vB, Value vC, RankedTensorType tD,			linalg_generic_matmul(Value vA, Value vB, Value vC, RankedTensorType tD,
	MatmulRegionBuilder regionBuilder = macRegionBuilder);			MatmulRegionBuilder regionBuilder = macRegionBuilder);

	template <typename Container>			template <typename Container>
	[... 84 lines not shown ...]

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

[... 34 lines not shown ...]
def NamedStructuredOpTrait : NativeOpTrait<"linalg::NamedStructuredOpTrait">;		def NamedStructuredOpTrait : NativeOpTrait<"linalg::NamedStructuredOpTrait">;

// Base Tablegen class for Linalg ops.		// Base Tablegen class for Linalg ops.
// Linalg ops that correspond to library calls operate on linalg::View as their		// Linalg ops that correspond to library calls operate on linalg::View as their
// first operands. These may be optionally followed by non-view operands		// first operands. These may be optionally followed by non-view operands
// depending on the specific Linalg op.		// depending on the specific Linalg op.
class LinalgStructuredBase_Op<string mnemonic, list<OpTrait> props>		class LinalgStructuredBase_Op<string mnemonic, list<OpTrait> props>
: Op<Linalg_Dialect, mnemonic,		: Op<Linalg_Dialect, mnemonic,
!listconcat(props, [StructuredOpTraits, LinalgStructuredInterface])> {		!listconcat(props, [LinalgStructuredInterface])> {
}		}

class LinalgStructured_Op<string mnemonic, list<OpTrait> props>		class LinalgStructured_Op<string mnemonic, list<OpTrait> props>
: LinalgStructuredBase_Op<mnemonic, props> {		: LinalgStructuredBase_Op<mnemonic,
		!listconcat(props, [StructuredOpTraits])> {
code libraryCallName = [{		code libraryCallName = [{
std::string getLibraryCallName() {		std::string getLibraryCallName() {
return generateLibraryCallName(getOperation());		return generateLibraryCallName(getOperation());
}		}
}];		}];
let assemblyFormat = "`(` operands `)` attr-dict `:` type(operands)";		let assemblyFormat = "`(` operands `)` attr-dict `:` type(operands)";
}		}

[... 396 lines not shown ...]
def LinalgOperand: AnyTypeOf<[AnyRankedTensor, AnyStridedMemRef]>;		def LinalgOperand: AnyTypeOf<[AnyRankedTensor, AnyStridedMemRef]>;

class LinalgOperandOfRank<int rank>: Type<		class LinalgOperandOfRank<int rank>: Type<
And<[		And<[
LinalgOperand.predicate,		LinalgOperand.predicate,
CPred<"$_self.cast<ShapedType>().getRank() == " # rank>]		CPred<"$_self.cast<ShapedType>().getRank() == " # rank>]
>>;		>>;

class GenericOpBase<string mnemonic> : LinalgStructuredBase_Op<mnemonic,		class GenericOpBase<string mnemonic> : LinalgStructuredBase_Op<mnemonic, [
[SingleBlockImplicitTerminator<"YieldOp">]> {		NamedStructuredOpTrait,
let arguments = (ins Variadic<LinalgOperand>:$views,		AttrSizedOperandSegments,
I64Attr:$args_in,		SingleBlockImplicitTerminator<"YieldOp">]> {
I64Attr:$args_out,		let arguments = (ins Variadic<AnyShaped>:$inputs,
		Variadic<AnyMemRef>:$output_buffers,
		Variadic<AnyRankedTensor>:$init_tensors,
AffineMapArrayAttr:$indexing_maps,		AffineMapArrayAttr:$indexing_maps,
ArrayAttr:$iterator_types,		ArrayAttr:$iterator_types,
OptionalAttr<StrAttr>:$doc,		OptionalAttr<StrAttr>:$doc,
OptionalAttr<StrAttr>:$library_call,		OptionalAttr<StrAttr>:$library_call,
Confined<OptionalAttr<I64Attr>,		Confined<OptionalAttr<I64Attr>, [IntMinValue<0>]>
[IntMinValue<0>]>:$symbol_source);		:$symbol_source);
let results = (outs Variadic<AnyRankedTensor>:$output_tensors);		let results = (outs Variadic<AnyRankedTensor>:$result_tensors);
let regions = (region AnyRegion:$region);		let regions = (region AnyRegion:$region);
		let builders = [
		OpBuilder<
		"OpBuilder &builder, OperationState &result, "
		"ValueRange inputs, ValueRange outputBuffers, "
		"ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, "
		"StringRef = \"\", StringRef = \"\", "
		"IntegerAttr = IntegerAttr(), "
		"function_ref<void(OpBuilder &, Location, ValueRange)> = nullptr">,
		OpBuilder<
		"OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTensorTypes,"
		"ValueRange inputs, ValueRange outputBuffers, ValueRange initTensors, "
		"ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, "
		"StringRef = \"\", StringRef = \"\", IntegerAttr = IntegerAttr(), "
		"function_ref<void(OpBuilder &, Location, ValueRange)> = nullptr">
		];
let extraClassDeclaration = [{		let extraClassDeclaration = [{
SmallVector<StringRef, 8> linalgTraitAttrNames() {		SmallVector<StringRef, 8> linalgTraitAttrNames() {
return SmallVector<StringRef, 8>{		return SmallVector<StringRef, 8>{
getArgsInAttrName(), getArgsOutAttrName(), getDocAttrName(),		getDocAttrName(),
getIndexingMapsAttrName(), getLibraryCallAttrName(),		getIndexingMapsAttrName(), getLibraryCallAttrName(),
getIteratorTypesAttrName(), getSymbolSourceAttrName()		getIteratorTypesAttrName(), getSymbolSourceAttrName()
};		};
}		}

unsigned getNumInputs() { return args_in(); }

unsigned getNumOutputs() { return args_out(); }

StringRef getLibraryCallName() {		StringRef getLibraryCallName() {
return library_call().hasValue() ? library_call().getValue() : "";		return library_call().hasValue() ? library_call().getValue() : "";
}		}

llvm::Optional<unsigned> getSymbolSource() {		llvm::Optional<unsigned> getSymbolSource() {
auto ss = symbol_source();		auto ss = symbol_source();
return ss.hasValue() ?		return ss.hasValue() ?
llvm::Optional<unsigned>(ss.getValue()) : llvm::None;		llvm::Optional<unsigned>(ss.getValue()) : llvm::None;
}		}
}];		}];

let printer = [{ return ::print(p, *this); }];		let printer = [{ return ::print(p, *this); }];
let parser = [{ return ::parseGenericOp(parser, result); }];		let parser = [{ return ::parseGenericOp(parser, result); }];
}		}

/// Index-free GenericOp.		/// Index-free GenericOp.
def GenericOp : GenericOpBase<"generic"> {		def GenericOp : GenericOpBase<"generic"> {
let description = [{		let description = [{
Generic Linalg op form where the key properties of the computation are		Generic Linalg op form where the key properties of the computation are
specified as attributes. In pretty form, a linalg.generic op is written as:		specified as attributes. In pretty form, a `linalg.generic` op is written
		as:

```mlir		```mlir
linalg.generic #trait_attribute %A, %B, %C {other-attributes} :		linalg.generic #trait_attribute
memref<?x?xf32, stride_specification>,		ins(%A, %B : memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>)
memref<?x?xf32, stride_specification>		outs(%C : memref<?x?xf32, stride_specification>)
		[other-attributes]
		[inline comment] burmako: `{other-attributes}`? (Generally, I see the notation in this file is inconsistent between `[other-attributes]` and `{other-attributes}`, so maybe this change is intentional?).
		[inline comment] herhut: This is not the actual syntax, right? There is an `attrs` there.
		[inline comment] nicolasvasilache (author): Indeed thanks! Ideally this would go away but this CL has grown fat and I am running out of patience writing throwaway parser/printer code. Will figure it out once we can use the declarative format.
		{region}
```		```

Where #trait_attributes is an alias of a dictionary attribute containing:		Where #trait_attributes is an alias of a dictionary attribute containing:
- args_in: an I64Attr representing the number of input (readonly) views
- args_out: an I64Attr representing the number of output (readwrite) views
- doc [optional]: a documentation string		- doc [optional]: a documentation string
- indexing_maps: a list of AffineMapAttr, one AffineMapAttr per each input		- indexing_maps: a list of AffineMapAttr, one AffineMapAttr per each input
and output view. Such AffineMapAttr specifies the mapping between the		and output view. Such AffineMapAttr specifies the mapping between the
loops and the indexing within each view.		loops and the indexing within each view.
- library_call [optional]: a StringAttr containing the name of an		- library_call [optional]: a StringAttr containing the name of an
external library function that the linalg.generic operation maps to.		external library function that the linalg.generic operation maps to.
The external library is assumed to be dynamically linked and no strong		The external library is assumed to be dynamically linked and no strong
compile-time guarantees are provided. In the absence of such a library		compile-time guarantees are provided. In the absence of such a library
[... 14 lines not shown ...]	Defining a #matmul_trait attribute in MLIR can be done as follows:
(m, n, k) -> (m, k),		(m, n, k) -> (m, k),
(m, n, k) -> (k, n),		(m, n, k) -> (k, n),
(m, n, k) -> (m, n)		(m, n, k) -> (m, n)
]		]
#matmul_trait = {		#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",		doc = "C(m, n) += A(m, k) * B(k, n)",
indexing_maps = #matmul_accesses,		indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",		library_call = "linalg_matmul",
args_in = 2,
args_out = 1,
iterator_types = ["parallel", "parallel", "reduction"]		iterator_types = ["parallel", "parallel", "reduction"]
}		}
```		```

And can be reused in multiple places as:		And can be reused in multiple places as:
```mlir		```mlir
linalg.generic #matmul_trait %A, %B, %C [other-attributes] {		linalg.generic #matmul_trait
		ins(%A, %B : memref<?x?xf32, stride_specification>,
		memref<?x?xf32, stride_specification>)
		outs(%C : memref<?x?xf32, stride_specification>)
		[other-attributes] {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b: f32		%d = mulf %a, %b: f32
%e = addf %c, %d: f32		%e = addf %c, %d: f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?x?xf32, stride_specification>,		}
memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>
```		```

This may lower to either:		This may lower to either:
```mlir		```mlir
call @linalg_matmul(%A, %B, %C) :		call @linalg_matmul(%A, %B, %C) :
(memref<?x?xf32, stride_specification>,		(memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>)		memref<?x?xf32, stride_specification>)
[... 12 lines not shown ...]	scf.for %m = %c0 to %M step %c1 {
%e = addf %c, %d: f32		%e = addf %c, %d: f32
store %e, %C[%m, %n] : memref<?x?x?xf32, stride_specification>		store %e, %C[%m, %n] : memref<?x?x?xf32, stride_specification>
}		}
}		}
}		}
```		```

To allow progressive lowering from the value world (a.k.a tensor values) to		To allow progressive lowering from the value world (a.k.a tensor values) to
the buffer world (a.k.a memref values), a `linalg.generic` op accepts		the buffer world (a.k.a memref values), a `linalg.generic` op allows mixing
mixing input and output ranked tensor values with input and output memrefs.		tensors and buffers operands and tensor results.

```mlir		```mlir
%C = linalg.generic #trait_attribute %A, %B {other-attributes} {region} :		%C = linalg.generic #trait_attribute
tensor<?x?xf32>,		ins(%A, %B : tensor<?x?xf32>, memref<?x?xf32, stride_specification>)
memref<?x?xf32, stride_specification>		init(%C : tensor<?x?xf32>)
		[other-attributes]
		{region}
-> (tensor<?x?xf32>)		-> (tensor<?x?xf32>)
```		```

In this case, the number of outputs (args_out) must match the sum of (1) the		The `init` operand and the conventions around mixing tensors and buffers are
number of output buffer operands and (2) the number of tensor return values.		described in more detail in the "Tensors and Buffers: Conventions and
The semantics is that the `linalg.indexed_generic` op produces (i.e.		Limitations" section in the [Linalg Document](../docs/Linalg.md)
allocates and fills) its tensor return values.

Tensor values must be legalized by a buffer allocation pass before most		Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. Such legalization moves tensor return values		transformations can be applied. Such legalizations move tensor return values
into output buffer operands and updates the region arguments accordingly.		into output buffer operands and updates the region arguments accordingly.
		[inline comment] burmako: "update" instead of "updates"?

Transformations that create control-flow around linalg.indexed_generic		The `symbol_source` attribute allows selecting a particular operand and
operations are not expected to work with tensors because SSA values do not		introducing symbols for each operand dimension. Such symbols can then be
escape naturally. Still, transformations and rewrites that take advantage of		used in the indexing maps.
tensor SSA values are expected to be useful and will be added in the near
future.
[inline comment] burmako: Is this paragraph removed because this is no longer the case?
[inline comment] nicolasvasilache (author): Correct, "SSA values don't escape control flow" is not true anymore (e.g. scf.for with yield, which I am going to take advantage of in the future).

Example of 1D convolution with symbols:		Example of 1D convolution with symbols:
```mlir		```mlir
#conv_1d_accesses = [		#conv_1d_accesses = [
affine_map<(m, n)[dimN] -> (m + n - dimN floordiv 2)>, // in		affine_map<(m, n)[dimN] -> (m + n - dimN floordiv 2)>, // in
affine_map<(m, n)[dimN] -> (n)>, // filter		affine_map<(m, n)[dimN] -> (n)>, // filter
affine_map<(m, n)[dimN] -> (m)> // out		affine_map<(m, n)[dimN] -> (m)> // out
]		]

#conv_1d_trait = {		#conv_1d_trait = {
doc = "O(m) += I(m + n - size(n) floordiv 2) * K(n)",		doc = "O(m) += I(m + n - size(n) floordiv 2) * K(n)",
indexing_maps = #conv_1d_accesses,		indexing_maps = #conv_1d_accesses,
library_call = "linalg_conv_1d",		library_call = "linalg_conv_1d",
iterator_types = ["parallel", "parallel"],		iterator_types = ["parallel", "parallel"],
symbol_source = 1		symbol_source = 1
}		}

linalg.generic #conv_1d_trait %in, %filter, %out {		linalg.generic #conv_1d_trait
		ins(%in, %filter : memref<?xf32>, memref<?xf32>)
		outs(%out : memref<?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?xf32>,		}
memref<?xf32>,
memref<?xf32>
burmako: What do you think about the syntax which keeps type annotations at the end (while still introducing `ins` and friends)?
nicolasvasilache: It has 2 drawbacks that I can see right now:
- still need to jump through hoops to figure out which arg is of what type in the multi-list. More local type information is more desirable esp. when we mix input + output buffers + result tensors.
- the goal is to have the parser and printer use declarative assembly format with optional groups (blocked on some missing declarative assembly feature). Optional groups need operands and types to be within the same parsing unit (can't have the type parsing relegated at the end).
burmako: "still need to jump through hoops to figure out which arg is of what type in the multi-list".
If I understand correctly, you're talking about the more general problem of visually matching things with their types because there's a bunch of syntax between one and the other. This also exists in other operations like call/llvm.invoke (match arguments with their types) and, more generally, in default syntax for ops. One could even say it's also relevant to bigger ops in general, e.g. for subview (match the source with its type). Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax?
The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func.
Depending on the reader / formatting, this syntax may not necessarily be a readability improvement upon the status quo. Previously one could figure out the types of linalg.generic views by looking at the end of the syntax, and now the types live in the middle which is not necessarily easy to visually parse. This also affects the use case of mixing tensors and buffers because for that use case it seems to be beneficial to quickly skim the types involved in an op.
Custom syntax that's stylistically different from the status quo may lead to other issues that I haven't thought about. For example, here's one that came to mind as I was ready to press the Submit button. https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices recommends that "Tests should be minimal, and only check what is absolutely necessary". As an example of minimalism, it shows an example that omits types from ops in expected syntax, e.g. `// CHECK-NEXT: return %[[RESULT]], %[[RESULT]]`. This is something that becomes harder with the proposed syntax since one cannot just drop stuff after the conventional colon (colon included), and instead we need to say things like `ins(%foo, %bar: {{.*}}), outs...`.
nicolasvasilache:
> and, more generally, in default syntax for ops.
Yes that is the unsugared form, it quickly becomes hard to read when you have enough operands.
> Why does this need to be treated in a special way in linalg.generic / Linalg named ops syntax?
Because parameter packs based on `operand_segment_sizes` need to be sugared, otherwise they are hard to segment out.
> The problem is still not solved with the current notation because you still need to jump through hoops within argument lists unlike e.g. in cond_br and func.
This relates to @mravishankar's comment on https://reviews.llvm.org/D87767. TL;DR I agree that we want to go towards a func-like syntax where each arg is followed by its type. I would strongly prefer this to be done automatically, once and for all, by the declarative format to avoid developing more parser/printer debt. When the custom parser / printer accepts an interleaved mode it will be easy to have an NFC CL to update the syntax. The situation is much better than it was before though: if we have 3 parameter packs with 2, 3 and 4 arguments there is strictly less jumping through hoops to determine the type of the second argument of the third pack (you just need to look for the 2nd local type instead of looking for the 7th global type).
> and now the types live in the middle which is not necessarily easy to visually parse
We could also put the region before the type arguments, would that alleviate part of the problem?
> readability improvement upon the status quo
Consistency wins here, first I need to fix the semantics gap and uniformize with named ops (see https://reviews.llvm.org/D87767). Also note that some of these arguments may become regions in the future. Once uniformity is achieved we can continue improving the parsing / pretty-printing in an NFC fashion. In particular the regions should also ideally be simplified.
> stylistically different from the status quo
The status quo is now https://reviews.llvm.org/D87767, the generic ops have weaker semantics and need to be updated to allow tensors + reductions. Once the harder functional changes are landed, we can iterate on a better syntax. Note however that any proposed syntax will need to also work for named ops.
```		```
where symbol s0 will be substituted with `dim %filter, %c0` i.e. the first		where symbol s0 will be substituted with `dim %filter, %c0` i.e. the first
and only dimension of the second operand as specified by the symbol_source		and only dimension of the second operand as specified by the symbol_source
attribute.		attribute.
}];		}];
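To make the `symbol_source` example above more concrete, here is a minimal standalone C++ sketch of the loop nest that the `#conv_1d_trait` doc string describes, i.e. `O(m) += I(m + n - size(n) floordiv 2) * K(n)`. It is not MLIR code and not part of this revision: the names `floorDiv` and `conv1d` are hypothetical, and skipping out-of-range input indices is an assumption made only so that the sketch is safe to run.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Floor division that rounds toward negative infinity, in the spirit of the
// affine `floordiv`. Hypothetical helper; assumes rhs > 0.
inline int64_t floorDiv(int64_t lhs, int64_t rhs) {
  int64_t q = lhs / rhs;
  if (lhs % rhs != 0 && lhs < 0)
    --q;
  return q;
}

// Reference loop nest for O(m) += I(m + n - dimN floordiv 2) * K(n).
// `dimN` plays the role of the symbol bound to the filter's only dimension
// (symbol_source = 1 selects the second operand, i.e. the filter).
inline void conv1d(const std::vector<float> &in,
                   const std::vector<float> &filter,
                   std::vector<float> &out) {
  const int64_t dimN = static_cast<int64_t>(filter.size());
  const int64_t dimM = static_cast<int64_t>(out.size());
  const int64_t inSize = static_cast<int64_t>(in.size());
  for (int64_t m = 0; m < dimM; ++m) {
    for (int64_t n = 0; n < dimN; ++n) {
      const int64_t i = m + n - floorDiv(dimN, 2);
      // Assumption of this sketch only: out-of-range reads are skipped.
      if (i < 0 || i >= inSize)
        continue;
      out[m] += in[i] * filter[n];
    }
  }
}
```

With a filter of size 3 the symbol evaluates to 3, so the input index is `m + n - 1` and the filter is centered on each output position.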

let builders = [
OpBuilder<
"OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes, "
"ValueRange args, int64_t argsIn, int64_t argsOut, "
"ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, "
"function_ref<void(OpBuilder &, Location, ValueRange)> = nullptr">
];

let verifier = [{ return ::verify(*this); }];		let verifier = [{ return ::verify(*this); }];

let hasFolder = 1;		let hasFolder = 1;
let hasCanonicalizer = 1;		let hasCanonicalizer = 1;
}		}
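The inline discussion above compares finding an operand's type in one flat trailing type list against finding it in per-group lists such as `ins(...)` and `outs(...)`. The arithmetic behind the "2nd local type instead of 7th global type" remark can be sketched as follows. This is an illustrative standalone helper, not an MLIR API: `globalPosition` and the segment-size vector are hypothetical names, loosely modeled on the `operand_segment_sizes` attribute mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the 1-based position, in the flattened operand list, of the
// `local`-th (1-based) operand of group `group` (0-based).
// Hypothetical helper for illustration only.
inline std::size_t globalPosition(const std::vector<std::size_t> &segmentSizes,
                                  std::size_t group, std::size_t local) {
  assert(group < segmentSizes.size() && "group out of range");
  assert(local >= 1 && local <= segmentSizes[group] && "local out of range");
  std::size_t position = 0;
  // Skip every operand that belongs to an earlier group.
  for (std::size_t g = 0; g < group; ++g)
    position += segmentSizes[g];
  return position + local;
}
```

For packs of sizes 2, 3 and 4, the second operand of the third pack sits at flat position 2 + 3 + 2 = 7, which is the count a reader has to perform when all types trail the op.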

/// GenericOp with Indexing (i.e. multi-for style in which the region is passed		/// GenericOp with Indexing (i.e. multi-for style in which the region is passed
/// the enclosing loop induction variables)		/// the enclosing loop induction variables)
def IndexedGenericOp : GenericOpBase<"indexed_generic"> {		def IndexedGenericOp : GenericOpBase<"indexed_generic"> {
let description = [{		let description = [{
Indexed Generic Linalg op form where the key properties of the computation		Indexed Generic Linalg op form where the key properties of the computation
are specified as attributes. In pretty form, a linalg.indexed_generic op is		are specified as attributes. In pretty form, a `linalg.indexed_generic` op
written as:		is written as:

```mlir		```mlir
linalg.indexed_generic #trait_attribute %A, %B, %C {other-attributes} :		linalg.indexed_generic #trait_attribute
memref<?x?xf32, stride_specification>,		ins(%A, %B : memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>)
memref<?x?xf32, stride_specification>		outs(%C : memref<?x?xf32, stride_specification>)
		[other-attributes]
		{region}
```		```

Where #trait_attributes is an alias of a dictionary attribute containing:		Where #trait_attributes is an alias of a dictionary attribute containing:
- args_in: an I64Attr representing the number of input (readonly) views
- args_out: an I64Attr representing the number of output (readwrite) views
- doc [optional]: a documentation string		- doc [optional]: a documentation string
- indexing_maps: a list of AffineMapAttr, one AffineMapAttr per each input		- indexing_maps: a list of AffineMapAttr, one AffineMapAttr per each input
and output view. Such AffineMapAttr specifies the mapping between the		and output view. Such AffineMapAttr specifies the mapping between the
loops and the indexing within each view.		loops and the indexing within each view.
- library_call [optional]: a StringAttr containing the name of an		- library_call [optional]: a StringAttr containing the name of an
external library function that the linalg.indexed_generic operation		external library function that the linalg.indexed_generic operation
maps to. The external library is assumed to be dynamically linked and		maps to. The external library is assumed to be dynamically linked and
no strong compile-time guarantees are provided. In the absence of such		no strong compile-time guarantees are provided. In the absence of such
Show All 11 Lines	#matmul_accesses = [
(m, n, k) -> (m, k),		(m, n, k) -> (m, k),
(m, n, k) -> (k, n),		(m, n, k) -> (k, n),
(m, n, k) -> (m, n)		(m, n, k) -> (m, n)
]		]
#matmul_trait = {		#matmul_trait = {
doc = "C(m, n) += A(m, k) * B(k, n)",		doc = "C(m, n) += A(m, k) * B(k, n)",
indexing_maps = #matmul_accesses,		indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",		library_call = "linalg_matmul",
args_in = 2,
args_out = 1,
iterator_types = ["parallel", "parallel", "reduction"]		iterator_types = ["parallel", "parallel", "reduction"]
}		}
```		```

And can be reused in multiple places as:		And can be reused in multiple places as:

```mlir		```mlir
linalg.indexed_generic #matmul_trait %A, %B, %C [other-attributes] {		linalg.indexed_generic #matmul_trait
		ins(%A, %B : memref<?x?xf32, stride_specification>,
		memref<?x?xf32, stride_specification>)
		outs(%C : memref<?x?xf32, stride_specification>)
		burmako: Missing curly brace?
(%offset_m: index, %offset_n: index, %offset_k: index,		(%offset_m: index, %offset_n: index, %offset_k: index,
%a: f32, %b: f32, %c: f32) :		%a: f32, %b: f32, %c: f32) :
"some_optional_computation"(%offset_m, %offset_n, %offset_k)		"some_optional_computation"(%offset_m, %offset_n, %offset_k)
%d = mulf %a, %b: f32		%d = mulf %a, %b: f32
%e = addf %c, %d: f32		%e = addf %c, %d: f32
linalg_yield %e : f32		linalg_yield %e : f32
} : memref<?x?xf32, stride_specification>,		}
memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>
```		```

This may lower to either:		This may lower to either:

```mlir		```mlir
call @linalg_matmul(%offset_m, %offset_n, %offset_k, %A, %B, %C) :		call @linalg_matmul(%offset_m, %offset_n, %offset_k, %A, %B, %C) :
(memref<?x?xf32, stride_specification>,		(index, index, index,
		memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>,
memref<?x?xf32, stride_specification>)		memref<?x?xf32, stride_specification>)
-> ()		-> ()
```		```

or IR resembling:		or IR resembling:

```mlir		```mlir
Show All 9 Lines	scf.for %m = %c0 to %M step %c1 {
store %d, %C[%m, %n] : memref<?x?x?xf32, stride_specification>		store %d, %C[%m, %n] : memref<?x?x?xf32, stride_specification>
}		}
}		}
}		}
```		```

To allow progressive lowering from the value world (a.k.a tensor values) to		To allow progressive lowering from the value world (a.k.a tensor values) to
the buffer world (a.k.a memref values), a `linalg.indexed_generic` op		the buffer world (a.k.a memref values), a `linalg.indexed_generic` op
accepts mixing input and output ranked tensor values with input and output		allows mixing tensors and buffers operands and tensor results.
memrefs.

```mlir		```mlir
%C = linalg.indexed_generic #trait_attribute %A, %B {other-attributes}		%C = linalg.indexed_generic #trait_attribute
: tensor<?x?xf32>,		ins(%A, %B : tensor<?x?xf32>, memref<?x?xf32, stride_specification>)
memref<?x?xf32, stride_specification>		init(%C : tensor<?x?xf32>)
		[other-attributes]
		{region_with_index_arguments}
-> (tensor<?x?xf32>)		-> (tensor<?x?xf32>)
```		```

In this case, the number of outputs (args_out) must match the sum of (1) the		The `init` operand and the conventions around mixing tensors and buffers are
number of output buffer operands and (2) the number of tensor return values.		described in more detail in the "Tensors and Buffers: Conventions and
The semantics is that the `linalg.indexed_generic` op produces (i.e.		Limitations" section in the [Linalg Document](../docs/Linalg.md)
allocates and fills) its return values.

Tensor values must be legalized by a buffer allocation pass before most		Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. Such legalization moves tensor return values		transformations can be applied. Such legalizations move tensor return values
into output buffer operands and updates the region argument accordingly.		into output buffer operands and updates the region arguments accordingly.

Transformations that create control-flow around linalg.indexed_generic		The `symbol_source` attribute allows selecting a particular operand and
operations are not expected to work with tensors because SSA values do not		introducing symbols for each operand dimension. Such symbols can then be
escape naturally. Still, transformations and rewrites that take advantage of		used in the indexing maps.
tensor SSA values are expected to be useful and will be added in the near
future.
}];

let builders = [		Example of 1D convolution with symbols:
OpBuilder<		```mlir
"OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes, "		#conv_1d_accesses = [
"ValueRange args, int64_t argsIn, int64_t argsOut, "		affine_map<(m, n)[dimN] -> (m + n - dimN floordiv 2)>, // in
"ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, "		affine_map<(m, n)[dimN] -> (n)>, // filter
"function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)> "		affine_map<(m, n)[dimN] -> (m)> // out
"= nullptr">		]
];
		#conv_1d_trait = {
		doc = "O(m) += I(m + n - size(n) floordiv 2) * K(n)",
		indexing_maps = #conv_1d_accesses,
		library_call = "linalg_conv_1d",
		iterator_types = ["parallel", "parallel"],
		symbol_source = 1
		}

		linalg.generic #conv_1d_trait
		ins(%in, %filter : memref<?xf32>, memref<?xf32>)
		outs(%out : memref<?xf32>) {
		^bb0(%a: f32, %b: f32, %c: f32) :
		%d = mulf %a, %b : f32
		%e = addf %c, %d : f32
		linalg.yield %e : f32
		}
		```
		where symbol s0 will be substituted with `dim %filter, %c0` i.e. the first
		and only dimension of the second operand as specified by the symbol_source
		attribute.
		}];

let verifier = [{ return ::verify(*this); }];		let verifier = [{ return ::verify(*this); }];

let hasFolder = 1;		let hasFolder = 1;
let hasCanonicalizer = 1;		let hasCanonicalizer = 1;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Named Linalg ops, implemented as a declarative configurations of generic ops.		// Named Linalg ops, implemented as a declarative configurations of generic ops.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// This file is auto-generated from a TC def specification.		// This file is auto-generated from a TC def specification.
include "mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.td"		include "mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.td"

#endif // LINALG_STRUCTURED_OPS		#endif // LINALG_STRUCTURED_OPS
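The "IR resembling" loop nest in the `linalg.indexed_generic` description is partially collapsed in this view. As a rough standalone C++ analogue (not the actual lowering, and not MLIR code), the matmul trait `C(m, n) += A(m, k) * B(k, n)` with an indexed region can be pictured as below. The `Matrix` alias, `indexedMatmul`, and the `IndexedBody` callback signature are hypothetical; the callback stands in for the region, which receives the loop indices ahead of the scalar operands.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Stand-in for the op's region: loop indices first, then the scalar elements
// loaded from A, B and C. The returned value is what the region would yield.
using IndexedBody = std::function<float(std::size_t m, std::size_t n,
                                        std::size_t k, float a, float b,
                                        float c)>;

// Loop nest implied by iterator_types = [parallel, parallel, reduction] and
// the indexing maps (m, k), (k, n), (m, n). Hypothetical illustration.
inline void indexedMatmul(const Matrix &A, const Matrix &B, Matrix &C,
                          const IndexedBody &body) {
  const std::size_t dimM = C.size();
  const std::size_t dimN = dimM ? C[0].size() : 0;
  const std::size_t dimK = B.size();
  for (std::size_t m = 0; m < dimM; ++m)
    for (std::size_t n = 0; n < dimN; ++n)
      for (std::size_t k = 0; k < dimK; ++k)
        // Load a, b, c, run the region, store the yielded value back to C.
        C[m][n] = body(m, n, k, A[m][k], B[k][n], C[m][n]);
}
```

Passing a body that ignores the indices and returns `c + a * b` reproduces the plain `linalg.generic` matmul; a body that also reads `m`, `n` or `k` corresponds to the `"some_optional_computation"` call in the documented region.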

mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	class NamedStructuredOpTrait
: public OpTrait::TraitBase<ConcreteType, NamedStructuredOpTrait> {		: public OpTrait::TraitBase<ConcreteType, NamedStructuredOpTrait> {
public:		public:
unsigned getNumInputs() {		unsigned getNumInputs() {
return cast<ConcreteType>(this->getOperation()).inputs().size();		return cast<ConcreteType>(this->getOperation()).inputs().size();
}		}
unsigned getNumOutputs() {		unsigned getNumOutputs() {
ConcreteType concreteOp = cast<ConcreteType>(this->getOperation());		ConcreteType concreteOp = cast<ConcreteType>(this->getOperation());
return concreteOp.output_buffers().size() +		return concreteOp.output_buffers().size() +
concreteOp.output_tensors().size();		concreteOp.result_tensors().size();
}		}
static LogicalResult verifyTrait(Operation *op) {		static LogicalResult verifyTrait(Operation *op) {
ConcreteType concreteOp = cast<ConcreteType>(op);		ConcreteType concreteOp = cast<ConcreteType>(op);
unsigned nInputAndBufferOperands =		unsigned nInputAndBufferOperands =
concreteOp.getNumInputsAndOutputBuffers();		concreteOp.getNumInputsAndOutputBuffers();
if (failed(		if (failed(
OpTrait::impl::verifyAtLeastNOperands(op, nInputAndBufferOperands)))		OpTrait::impl::verifyAtLeastNOperands(op, nInputAndBufferOperands)))
return failure();		return failure();

mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h

	Show First 20 Lines • Show All 52 Lines • ▼ Show 20 Lines
	/// Attribute name for the AffineArrayAttr which encodes the relationship			/// Attribute name for the AffineArrayAttr which encodes the relationship
	/// between a structured op iterators' and its operands.			/// between a structured op iterators' and its operands.
	constexpr StringRef getIndexingMapsAttrName() { return "indexing_maps"; }			constexpr StringRef getIndexingMapsAttrName() { return "indexing_maps"; }

	/// Attribute name for the StrArrayAttr which encodes the type of a structured			/// Attribute name for the StrArrayAttr which encodes the type of a structured
	/// op's iterators.			/// op's iterators.
	constexpr StringRef getIteratorTypesAttrName() { return "iterator_types"; }			constexpr StringRef getIteratorTypesAttrName() { return "iterator_types"; }

	/// Attribute name for the IntegerAttr which encodes the number of input buffer
	/// arguments.
	constexpr StringRef getArgsInAttrName() { return "args_in"; }

	/// Attribute name for the IntegerAttr which encodes the number of input buffer
	/// arguments.
	constexpr StringRef getArgsOutAttrName() { return "args_out"; }

	/// Attribute name for the StringAttr which encodes an optional documentation			/// Attribute name for the StringAttr which encodes an optional documentation
	/// string of the structured op.			/// string of the structured op.
	constexpr StringRef getDocAttrName() { return "doc"; }			constexpr StringRef getDocAttrName() { return "doc"; }

	/// Attribute name for the StrArrayAttr which encodes the external library			/// Attribute name for the StrArrayAttr which encodes the external library
	/// function that implements the structured op.			/// function that implements the structured op.
	constexpr StringRef getLibraryCallAttrName() { return "library_call"; }			constexpr StringRef getLibraryCallAttrName() { return "library_call"; }


mlir/lib/Conversion/LinalgToSPIRV/LinalgToSPIRV.cpp

Show First 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	SingleWorkgroupReduction::matchAsPerformingReduction(
linalg::GenericOp genericOp) {		linalg::GenericOp genericOp) {
Operation *op = genericOp.getOperation();		Operation *op = genericOp.getOperation();

// Make sure the linalg.generic is working on memrefs.		// Make sure the linalg.generic is working on memrefs.
if (!genericOp.hasBufferSemantics())		if (!genericOp.hasBufferSemantics())
return llvm::None;		return llvm::None;

// Make sure this is reduction with one input and one output.		// Make sure this is reduction with one input and one output.
if (genericOp.args_in() != 1 || genericOp.args_out() != 1)		if (genericOp.getNumInputs() != 1 || genericOp.getNumOutputs() != 1)
return llvm::None;		return llvm::None;

auto originalInputType = op->getOperand(0).getType().cast<MemRefType>();		auto originalInputType = op->getOperand(0).getType().cast<MemRefType>();
auto originalOutputType = op->getOperand(1).getType().cast<MemRefType>();		auto originalOutputType = op->getOperand(1).getType().cast<MemRefType>();

// Make sure the original input has one dimension.		// Make sure the original input has one dimension.
if (!originalInputType.hasStaticShape() || originalInputType.getRank() != 1)		if (!originalInputType.hasStaticShape() || originalInputType.getRank() != 1)
return llvm::None;		return llvm::None;

mlir/lib/Dialect/Linalg/EDSC/Builders.cpp

Show All 17 Lines
using namespace mlir;		using namespace mlir;
using namespace mlir::edsc;		using namespace mlir::edsc;
using namespace mlir::edsc::intrinsics;		using namespace mlir::edsc::intrinsics;
using namespace mlir::linalg;		using namespace mlir::linalg;
using namespace mlir::scf;		using namespace mlir::scf;

Operation *mlir::edsc::makeGenericLinalgOp(		Operation *mlir::edsc::makeGenericLinalgOp(
ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,		ArrayRef<IteratorType> iteratorTypes, ArrayRef<StructuredIndexed> inputs,
ArrayRef<StructuredIndexed> outputs,		ArrayRef<StructuredIndexed> outputBuffers, ArrayRef<Value> initTensors,
		ArrayRef<StructuredIndexed> resultTensorTypes,
function_ref<void(ValueRange)> regionBuilder, ArrayRef<Value> otherValues,		function_ref<void(ValueRange)> regionBuilder, ArrayRef<Value> otherValues,
ArrayRef<Attribute> otherAttributes) {		ArrayRef<Attribute> otherAttributes) {
for (unsigned i = 0, e = outputs.size(); i + 1 < e; ++i)		OpBuilder &builder = edsc::ScopedContext::getBuilderRef();
assert(!(outputs[i].getType().isa<RankedTensorType>() &&
outputs[i + 1].getType().isa<MemRefType>()) &&
"output tensors must be passed after output buffers");
auto &builder = edsc::ScopedContext::getBuilderRef();
auto *ctx = builder.getContext();
unsigned nInputs = inputs.size();
unsigned nOutputs = outputs.size();

		// Build maps
SmallVector<SmallVector<AffineExpr, 4>, 4> exprsList;		SmallVector<SmallVector<AffineExpr, 4>, 4> exprsList;
exprsList.reserve(nInputs + nOutputs);		exprsList.reserve(inputs.size() + outputBuffers.size() + initTensors.size());
for (auto structuredIndexed : inputs)		for (auto container : {inputs, outputBuffers, resultTensorTypes})
exprsList.emplace_back(structuredIndexed.getExprs().begin(),		for (const StructuredIndexed &s : container)
structuredIndexed.getExprs().end());		exprsList.emplace_back(s.getExprs().begin(), s.getExprs().end());
for (auto structuredIndexed : outputs)
exprsList.emplace_back(structuredIndexed.getExprs().begin(),
structuredIndexed.getExprs().end());
auto maps = AffineMap::inferFromExprList(exprsList);		auto maps = AffineMap::inferFromExprList(exprsList);

unsigned nViews = nInputs + nOutputs;
SmallVector<Value, 4> values;
values.reserve(nViews);
values.append(inputs.begin(), inputs.end());
std::copy_if(outputs.begin(), outputs.end(), std::back_inserter(values),
[](StructuredIndexed s) { return s.hasValue(); });
SmallVector<Type, 4> types;		SmallVector<Type, 4> types;
std::copy_if(outputs.begin(), outputs.end(), std::back_inserter(types),		assert(llvm::all_of(resultTensorTypes, [](const StructuredIndexed &s) {
[](StructuredIndexed s) { return !s.hasValue(); });		return !s.hasValue();
		}));
		std::copy(resultTensorTypes.begin(), resultTensorTypes.end(),
		std::back_inserter(types));

		SmallVector<Value, 4> inputValues, outputBufferValues, initTensorValues;
		inputValues.reserve(inputs.size());
		outputBufferValues.reserve(outputBuffers.size());
		initTensorValues.reserve(initTensors.size());
		std::copy(inputs.begin(), inputs.end(), std::back_inserter(inputValues));
		std::copy(outputBuffers.begin(), outputBuffers.end(),
		std::back_inserter(outputBufferValues));
		std::copy(initTensors.begin(), initTensors.end(),
		std::back_inserter(initTensorValues));

auto iteratorStrTypes =		auto iteratorStrTypes =
llvm::to_vector<8>(llvm::map_range(iteratorTypes, toString));		llvm::to_vector<8>(llvm::map_range(iteratorTypes, toString));
// clang-format off		// clang-format off
auto *op =		auto *op =
edsc::ScopedContext::getBuilderRef()		edsc::ScopedContext::getBuilderRef()
.create<linalg::GenericOp>(		.create<linalg::GenericOp>(
edsc::ScopedContext::getLocation(),		edsc::ScopedContext::getLocation(),
types,		types,
values,		inputValues,
IntegerAttr::get(IntegerType::get(64, ctx), nInputs),		outputBufferValues,
IntegerAttr::get(IntegerType::get(64, ctx), nOutputs),		initTensorValues,
builder.getAffineMapArrayAttr(maps),		builder.getAffineMapArrayAttr(maps),
builder.getStrArrayAttr(iteratorStrTypes),		builder.getStrArrayAttr(iteratorStrTypes),
StringAttr() /*doc*/,		StringAttr() /*doc*/,
StringAttr() /*library_call*/,		StringAttr() /*library_call*/,
IntegerAttr() /*symbol_source*/		IntegerAttr() /*symbol_source*/
/* TODO: other attributes in op */		/* TODO: other attributes in op */
)		)
.getOperation();		.getOperation();
// clang-format on		// clang-format on

using namespace edsc;		using namespace edsc;
SmallVector<Type, 4> blockTypes;		SmallVector<Type, 4> blockTypes;
blockTypes.reserve(values.size());		blockTypes.reserve(inputs.size() + outputBuffers.size() + initTensors.size());
for (auto it : llvm::enumerate(values))		for (auto container : {inputs, outputBuffers})
blockTypes.push_back((it.index() < nViews)		for (const StructuredIndexed &s : container)
? getElementTypeOrSelf(it.value())		blockTypes.push_back(getElementTypeOrSelf(s.getType()));
: it.value().getType());		for (Value v : initTensors)
		blockTypes.push_back(getElementTypeOrSelf(v.getType()));

assert(op->getNumRegions() == 1);		assert(op->getNumRegions() == 1);
assert(op->getRegion(0).empty());		assert(op->getRegion(0).empty());
OpBuilder opBuilder(op);		OpBuilder opBuilder(op);
ScopedContext scope(opBuilder, op->getLoc());		ScopedContext scope(opBuilder, op->getLoc());
buildInNewBlock(op->getRegion(0), blockTypes, regionBuilder);		buildInNewBlock(op->getRegion(0), blockTypes, regionBuilder);
assert(llvm::hasSingleElement(op->getRegion(0)));		assert(llvm::hasSingleElement(op->getRegion(0)));
return op;		return op;
Show All 14 Lines	void mlir::edsc::ops::macRegionBuilder(ValueRange args) {
Value a(args[0]), b(args[1]), c(args[2]);		Value a(args[0]), b(args[1]), c(args[2]);
linalg_yield(c + a * b);		linalg_yield(c + a * b);
}		}

Operation *mlir::edsc::ops::linalg_generic_pointwise(		Operation *mlir::edsc::ops::linalg_generic_pointwise(
UnaryPointwiseOpBuilder unaryOp, StructuredIndexed I, StructuredIndexed O) {		UnaryPointwiseOpBuilder unaryOp, StructuredIndexed I, StructuredIndexed O) {
SmallVector<IteratorType, 4> iterTypes(O.getExprs().size(),		SmallVector<IteratorType, 4> iterTypes(O.getExprs().size(),
IteratorType::Parallel);		IteratorType::Parallel);
if (O.getType().isa<RankedTensorType>()) {
auto fun = [&unaryOp](ValueRange args) {		auto fun = [&unaryOp](ValueRange args) {
assert(args.size() == 1 && "expected 1 block arguments");		assert(args.size() >= 1 && "expected >= 1 block arguments");
Value a(args[0]);		Value a(args[0]);
linalg_yield(unaryOp(a));		linalg_yield(unaryOp(a));
};		};
return makeGenericLinalgOp(iterTypes, {I}, {O}, fun);		if (O.getType().isa<RankedTensorType>())
}		return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputBuffers=*/{},
auto fun = [&unaryOp](ValueRange args) {		/*initTensors=*/{}, /*resultTensorTypes=*/{O},
assert(args.size() == 2 && "expected 2 block arguments");		fun);
Value a(args[0]);		return makeGenericLinalgOp(iterTypes, /*inputs=*/{I}, /*outputBuffers=*/{O},
linalg_yield(unaryOp(a));		/*initTensors=*/{}, /*resultTensorTypes=*/{}, fun);
};
return makeGenericLinalgOp(iterTypes, {I}, {O}, fun);
}		}

Operation *mlir::edsc::ops::linalg_generic_pointwise_tanh(StructuredIndexed I,		Operation *mlir::edsc::ops::linalg_generic_pointwise_tanh(StructuredIndexed I,
StructuredIndexed O) {		StructuredIndexed O) {
UnaryPointwiseOpBuilder unOp([](Value a) -> Value { return std_tanh(a); });		UnaryPointwiseOpBuilder unOp([](Value a) -> Value { return std_tanh(a); });
return linalg_generic_pointwise(unOp, I, O);		return linalg_generic_pointwise(unOp, I, O);
}		}

/// Binary pointwise operation (with broadcast) entry point.		/// Binary pointwise operation (with broadcast) entry point.
Operation *mlir::edsc::ops::linalg_generic_pointwise(		Operation *mlir::edsc::ops::linalg_generic_pointwise(
BinaryPointwiseOpBuilder binaryOp, StructuredIndexed I1,		BinaryPointwiseOpBuilder binaryOp, StructuredIndexed I1,
StructuredIndexed I2, StructuredIndexed O) {		StructuredIndexed I2, StructuredIndexed O) {
SmallVector<IteratorType, 4> iterTypes(O.getExprs().size(),		SmallVector<IteratorType, 4> iterTypes(O.getExprs().size(),
IteratorType::Parallel);		IteratorType::Parallel);
if (O.getType().isa<RankedTensorType>()) {
auto fun = [&binaryOp](ValueRange args) {		auto fun = [&binaryOp](ValueRange args) {
assert(args.size() == 2 && "expected 2 block arguments");		assert(args.size() >= 2 && "expected >= 1 block arguments");
Value a(args[0]), b(args[1]);
linalg_yield(binaryOp(a, b));
};
return makeGenericLinalgOp(iterTypes, {I1, I2}, {O}, fun);
}
auto fun = [&binaryOp](ValueRange args) {
assert(args.size() == 3 && "expected 3 block arguments");
Value a(args[0]), b(args[1]);		Value a(args[0]), b(args[1]);
linalg_yield(binaryOp(a, b));		linalg_yield(binaryOp(a, b));
};		};
return makeGenericLinalgOp(iterTypes, {I1, I2}, {O}, fun);		if (O.getType().isa<RankedTensorType>())
		return makeGenericLinalgOp(
		iterTypes, /*inputs=*/{I1, I2}, /*outputBuffers=*/{},
		/*initTensors=*/{}, /*resultTensorTypes=*/{O}, fun);
		return makeGenericLinalgOp(iterTypes, /*inputs=*/{I1, I2},
		/*outputBuffers=*/{O},
		/*initTensors=*/{}, /*resultTensorTypes=*/{}, fun);
}		}

Operation *mlir::edsc::ops::linalg_generic_pointwise_add(StructuredIndexed I1,		Operation *mlir::edsc::ops::linalg_generic_pointwise_add(StructuredIndexed I1,
StructuredIndexed I2,		StructuredIndexed I2,
StructuredIndexed O) {		StructuredIndexed O) {
using edsc::op::operator+;		using edsc::op::operator+;
BinaryPointwiseOpBuilder binOp(		BinaryPointwiseOpBuilder binOp(
[](Value a, Value b) -> Value { return a + b; });		[](Value a, Value b) -> Value { return a + b; });
Show All 14 Lines
mlir::edsc::ops::linalg_generic_matmul(Value vA, Value vB, Value vC,		mlir::edsc::ops::linalg_generic_matmul(Value vA, Value vB, Value vC,
MatmulRegionBuilder regionBuilder) {		MatmulRegionBuilder regionBuilder) {
// clang-format off		// clang-format off
AffineExpr m, n, k;		AffineExpr m, n, k;
bindDims(ScopedContext::getContext(), m, n, k);		bindDims(ScopedContext::getContext(), m, n, k);
StructuredIndexed A(vA), B(vB), C(vC);		StructuredIndexed A(vA), B(vB), C(vC);
return makeGenericLinalgOp(		return makeGenericLinalgOp(
{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},		{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},
{A({m, k}), B({k, n})},		/*inputs=*/{A({m, k}), B({k, n})},
{C({m, n})},		/*outputBuffers=*/{C({m, n})},
regionBuilder);		/*initTensors=*/{},
// clang-format on		/*resultTensorTypes=*/{},
}

Operation *
mlir::edsc::ops::linalg_generic_matmul(Value vA, Value vB, RankedTensorType tC,
MatmulRegionBuilder regionBuilder) {
// clang-format off
AffineExpr m, n, k;
bindDims(ScopedContext::getContext(), m, n, k);
StructuredIndexed A(vA), B(vB), C(tC);
return makeGenericLinalgOp(
{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},
{A({m, k}), B({k, n})},
{C({m, n})},
regionBuilder);		regionBuilder);
// clang-format on		// clang-format on
}		}

Operation *		Operation *
mlir::edsc::ops::linalg_generic_matmul(Value vA, Value vB, Value vC,		mlir::edsc::ops::linalg_generic_matmul(Value vA, Value vB, Value vC,
RankedTensorType tD,		RankedTensorType tD,
MatmulRegionBuilder regionBuilder) {		MatmulRegionBuilder regionBuilder) {
// clang-format off		// clang-format off
AffineExpr m, n, k;		AffineExpr m, n, k;
bindDims(ScopedContext::getContext(), m, n, k);		bindDims(ScopedContext::getContext(), m, n, k);
StructuredIndexed A(vA), B(vB), C(vC), D(tD);		StructuredIndexed A(vA), B(vB), C(vC), D(tD);
return makeGenericLinalgOp(		return makeGenericLinalgOp(
{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},		{IteratorType::Parallel, IteratorType::Parallel, IteratorType::Reduction},
{A({m, k}), B({k, n}), C({m, n})},		/*inputs=*/{A({m, k}), B({k, n})},
{D({m, n})},		/*outputBuffers=*/{},
		/*initTensors=*/{C({m, n})},
		/*resultTensorTypes=*/{D({m, n})},
regionBuilder);		regionBuilder);
// clang-format on		// clang-format on
}		}
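
The iterator types and indexing maps passed by the matmul builders above (`m` and `n` parallel, `k` a reduction; `A(m, k)`, `B(k, n)`, `C(m, n)`) describe a three-deep loop nest. A minimal standalone sketch of that loop nest follows. It does not use MLIR, it assumes the region builder performs a multiply-accumulate, and `Matrix` and `matmulReference` are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix type, used only for this illustration.
using Matrix = std::vector<std::vector<float>>;

// Reference semantics assumed for the generic matmul:
//   C(m, n) += A(m, k) * B(k, n)
// where m and n are parallel dimensions and k is the reduction dimension.
void matmulReference(const Matrix &A, const Matrix &B, Matrix &C) {
  const std::size_t M = A.size();
  const std::size_t K = B.size();
  const std::size_t N = K == 0 ? 0 : B[0].size();
  for (std::size_t m = 0; m < M; ++m)       // IteratorType::Parallel
    for (std::size_t n = 0; n < N; ++n)     // IteratorType::Parallel
      for (std::size_t k = 0; k < K; ++k)   // IteratorType::Reduction
        C[m][n] += A[m][k] * B[k][n];       // assumed multiply-accumulate body
}
```

In the tensor variant with `initTensors`, `C` plays the role of the initial accumulator value and the updated values are returned as a new result tensor rather than written in place.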

Operation *mlir::edsc::ops::linalg_generic_conv_nhwc(Value vI, Value vW,		Operation *mlir::edsc::ops::linalg_generic_conv_nhwc(Value vI, Value vW,
Value vO,		Value vO,
ArrayRef<int> strides,		ArrayRef<int> strides,
ArrayRef<int> dilations) {		ArrayRef<int> dilations) {
[... 9 lines not shown ...]
auto d = dilations;		auto d = dilations;

AffineExpr b, f, h, w, kh, kw, c;		AffineExpr b, f, h, w, kh, kw, c;
bindDims(ctx, b, f, h, w, kh, kw, c);		bindDims(ctx, b, f, h, w, kh, kw, c);
unsigned numDims = c.cast<AffineDimExpr>().getPosition() + 1;		unsigned numDims = c.cast<AffineDimExpr>().getPosition() + 1;
StructuredIndexed I(vI), W(vW), O(vO);		StructuredIndexed I(vI), W(vW), O(vO);
// clang-format off		// clang-format off
return makeGenericLinalgOp(		return makeGenericLinalgOp(
{par, par, par, par, red, red, red}, {		{par, par, par, par, red, red, red},
		/*inputs=*/{
I({b,		I({b,
// Roundtrip to flattened form to serve as canonicalization and ensure		// Roundtrip to flattened form to serve as canonicalization and ensure
// consistent ordering of subexpressions.		// consistent ordering of subexpressions.
simplifyAffineExpr(s[0] * h + d[0] * kh, numDims, 0),		simplifyAffineExpr(s[0] * h + d[0] * kh, numDims, 0),
simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),		simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),
c}),		c}),
W({kh, kw, c, f})}, {		W({kh, kw, c, f}) },
O({b, h, w, f})},		/*outputBuffers=*/{ O({b, h, w, f}) },
		/*initTensors=*/{},
		/*resultTensorTypes=*/{},
macRegionBuilder);		macRegionBuilder);
// clang-format on		// clang-format on
}		}

Operation *mlir::edsc::ops::linalg_generic_dilated_conv_nhwc(		Operation *mlir::edsc::ops::linalg_generic_dilated_conv_nhwc(
Value vI, Value vW, Value vO, int depth_multiplier, ArrayRef<int> strides,		Value vI, Value vW, Value vO, int depth_multiplier, ArrayRef<int> strides,
ArrayRef<int> dilations) {		ArrayRef<int> dilations) {
MLIRContext *ctx = ScopedContext::getContext();		MLIRContext *ctx = ScopedContext::getContext();
// TODO: some template magic to make everything rank-polymorphic.		// TODO: some template magic to make everything rank-polymorphic.
assert((dilations.empty() || dilations.size() == 2) && "only 2-D conv atm");		assert((dilations.empty() || dilations.size() == 2) && "only 2-D conv atm");
assert((strides.empty() || strides.size() == 2) && "only 2-D conv atm");		assert((strides.empty() || strides.size() == 2) && "only 2-D conv atm");

// Some short names.		// Some short names.
auto par = IteratorType::Parallel;		auto par = IteratorType::Parallel;
auto red = IteratorType::Reduction;		auto red = IteratorType::Reduction;
auto s = strides;		auto s = strides;
auto d = dilations;		auto d = dilations;

// clang-format off		// clang-format off
AffineExpr b, dm, c, h, w, kh, kw;		AffineExpr b, dm, c, h, w, kh, kw;
bindDims(ctx, b, dm, c, h, w, kh, kw);		bindDims(ctx, b, dm, c, h, w, kh, kw);
unsigned numDims = kw.cast<AffineDimExpr>().getPosition() + 1;		unsigned numDims = kw.cast<AffineDimExpr>().getPosition() + 1;
StructuredIndexed I(vI), W(vW), O(vO);		StructuredIndexed I(vI), W(vW), O(vO);
return makeGenericLinalgOp(		return makeGenericLinalgOp(
{par, par, par, par, par, red, red}, {		{par, par, par, par, par, red, red},
		/*inputs=*/{
I({b,		I({b,
// Roundtrip to flattened form to serve as canonicalization and ensure		// Roundtrip to flattened form to serve as canonicalization and ensure
// consistent ordering of subexpressions.		// consistent ordering of subexpressions.
simplifyAffineExpr(s[0] * h + d[0] * kh, numDims, 0),		simplifyAffineExpr(s[0] * h + d[0] * kh, numDims, 0),
simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),		simplifyAffineExpr(s[1] * w + d[1] * kw, numDims, 0),
c}),		c}),
W({kh, kw, c, dm})}, {		W({kh, kw, c, dm})},
		/*outputBuffers=*/{
O({b, h, w, simplifyAffineExpr(c * depth_multiplier + dm, numDims, 0)})},		O({b, h, w, simplifyAffineExpr(c * depth_multiplier + dm, numDims, 0)})},
		/*initTensors=*/{},
		/*resultTensorTypes=*/{},
macRegionBuilder);		macRegionBuilder);
// clang-format on		// clang-format on
}		}

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

[... 34 lines not shown ...]

/// Forward declarations.		/// Forward declarations.
template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static void buildNamedStructuredOpRegionAndAttributes(		static void buildNamedStructuredOpRegionAndAttributes(
OpBuilder &opBuilder, OperationState &result, TypeRange inputTypes,		OpBuilder &opBuilder, OperationState &result, TypeRange inputTypes,
TypeRange outputBufferTypes, TypeRange initTensorTypes,		TypeRange outputBufferTypes, TypeRange initTensorTypes,
TypeRange resultTypes);		TypeRange resultTypes);

		static ParseResult
		parseCommonStructuredOpParts(OpAsmParser &parser, OperationState &result,
		SmallVectorImpl<Type> &inputsTypes,
		SmallVectorImpl<Type> &outputBuffersTypes,
		burmako (inline comment, Done): In the function above, this parameter is called `outputBufferTypes` without the "s" in "Buffers". Also see `initTensorTypes` below.
		SmallVectorImpl<Type> &initTensorsTypes);

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static ParseResult		static ParseResult
parseNamedStructuredOpRegion(OpAsmParser &parser, Region &region,		parseNamedStructuredOpRegion(OpAsmParser &parser, Region &region,
TypeRange inputTypes, TypeRange outputBufferTypes,		TypeRange inputTypes, TypeRange outputBufferTypes,
TypeRange initTensorTypes, TypeRange resultTypes);		TypeRange initTensorTypes, TypeRange resultTypes);
static ParseResult		static ParseResult
parseNamedStructuredOpResults(OpAsmParser &parser,		parseNamedStructuredOpResults(OpAsmParser &parser,
SmallVectorImpl<Type> &resultTypes);		SmallVectorImpl<Type> &resultTypes);

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static ParseResult parseNamedStructuredOp(OpAsmParser &parser,		static ParseResult parseNamedStructuredOp(OpAsmParser &parser,
OperationState &result);		OperationState &result);

		template <typename NamedStructuredOpType>
		static void printCommonStructuredOpParts(OpAsmPrinter &p,
		NamedStructuredOpType op);

static void printNamedStructuredOpResults(OpAsmPrinter &p,		static void printNamedStructuredOpResults(OpAsmPrinter &p,
TypeRange resultTypes);		TypeRange resultTypes);

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static void printNamedStructuredOp(OpAsmPrinter &p, NamedStructuredOpType op);		static void printNamedStructuredOp(OpAsmPrinter &p, NamedStructuredOpType op);

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static LogicalResult verifyNamedStructuredOp(NamedStructuredOpType op);		static LogicalResult verifyNamedStructuredOp(NamedStructuredOpType op);
[... 18 lines not shown ...]
///////////////////// Operations defined with Tablegen /////////////////////////		///////////////////// Operations defined with Tablegen /////////////////////////
// For such operations that do not correspond to library calls (i.e. defined in		// For such operations that do not correspond to library calls (i.e. defined in
// LinalgOps.td), we define an overloaded `print` function and a		// LinalgOps.td), we define an overloaded `print` function and a
// parse`className` function.		// parse`className` function.

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GenericOps		// GenericOps
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		void GenericOp::build(
		OpBuilder &builder, OperationState &result, ValueRange inputs,
		ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
		ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
		IntegerAttr symbolSource,
		function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
		build(builder, result, ArrayRef<Type>{}, inputs, outputBuffers, ValueRange{},
		builder.getAffineMapArrayAttr(indexingMaps),
		builder.getStrArrayAttr(iteratorTypes),
		doc.empty() ? StringAttr() : builder.getStringAttr(doc),
		libraryCall.empty() ? StringAttr() : builder.getStringAttr(libraryCall),
		symbolSource);
		if (!bodyBuild)
		return;

		SmallVector<Type, 4> blockArgTypes;
		for (ValueRange container : {inputs, outputBuffers})
		for (Value v : container)
		blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());

		OpBuilder::InsertionGuard guard(builder);
		auto &region = *result.regions.front();
		Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);
		bodyBuild(builder, result.location, bodyBlock->getArguments());
		}

void GenericOp::build(		void GenericOp::build(
OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,		OpBuilder &builder, OperationState &result,
ValueRange args, int64_t argsIn, int64_t argsOut,		ArrayRef<Type> resultTensorTypes, ValueRange inputs,
		ValueRange outputBuffers, ValueRange initTensors,
ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,		ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
		StringRef doc, StringRef libraryCall, IntegerAttr symbolSource,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {		function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),		build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
builder.getI64IntegerAttr(argsOut),
builder.getAffineMapArrayAttr(indexingMaps),		builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),		builder.getStrArrayAttr(iteratorTypes),
/*doc=*/nullptr, /*library_call=*/nullptr,		doc.empty() ? StringAttr() : builder.getStringAttr(doc),
/*symbol_source=*/nullptr);		libraryCall.empty() ? StringAttr() : builder.getStringAttr(libraryCall),
		symbolSource);
if (!bodyBuild)		if (!bodyBuild)
return;		return;

SmallVector<Type, 4> blockArgTypes;		SmallVector<Type, 4> blockArgTypes;
for (Value arg : args)		for (ValueRange container : {inputs, outputBuffers, initTensors})
blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());		for (Value v : container)
		blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());

OpBuilder::InsertionGuard guard(builder);		OpBuilder::InsertionGuard guard(builder);
auto &region = *result.regions.front();		auto &region = *result.regions.front();
Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);		Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);
bodyBuild(builder, result.location, bodyBlock->getArguments());		bodyBuild(builder, result.location, bodyBlock->getArguments());
}		}
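
The updated builder derives the region's block argument types by walking the operand groups in a fixed order (inputs, then output buffers, then init tensors) and taking each operand's element type. The ordering can be illustrated with a minimal standalone sketch. It does not use MLIR: `Operand`, its `elementType` field, and `collectBlockArgTypes` are hypothetical stand-ins for `Value`, `ShapedType::getElementType()`, and the nested loop in `GenericOp::build`.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for an MLIR Value of shaped type; only the element
// type matters for this illustration.
struct Operand {
  std::string name;
  std::string elementType;
};

// Mirrors the nested loop in the builder: one block argument type per operand,
// visiting inputs first, then output buffers, then init tensors.
std::vector<std::string>
collectBlockArgTypes(const std::vector<Operand> &inputs,
                     const std::vector<Operand> &outputBuffers,
                     const std::vector<Operand> &initTensors) {
  std::vector<std::string> blockArgTypes;
  for (const auto *container : {&inputs, &outputBuffers, &initTensors})
    for (const Operand &v : *container)
      blockArgTypes.push_back(v.elementType);
  return blockArgTypes;
}
```

For the indexed variant, the same list would be preceded by one index-typed argument per loop, as in the `IndexedGenericOp::build` overloads in this diff.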

void IndexedGenericOp::build(		void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,		OpBuilder &builder, OperationState &result, ValueRange inputs,
ValueRange args, int64_t argsIn, int64_t argsOut,		ValueRange outputBuffers, ArrayRef<AffineMap> indexingMaps,
		ArrayRef<StringRef> iteratorTypes, StringRef doc, StringRef libraryCall,
		IntegerAttr symbolSource,
		function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
		build(builder, result, ArrayRef<Type>{}, inputs, outputBuffers, ValueRange{},
		builder.getAffineMapArrayAttr(indexingMaps),
		builder.getStrArrayAttr(iteratorTypes),
		doc.empty() ? StringAttr() : builder.getStringAttr(doc),
		libraryCall.empty() ? StringAttr() : builder.getStringAttr(libraryCall),
		symbolSource);
		if (!bodyBuild)
		return;

		unsigned nLoops = iteratorTypes.size();
		SmallVector<Type, 4> blockArgTypes(nLoops, builder.getIndexType());
		for (ValueRange container : {inputs, outputBuffers})
		for (Value v : container)
		blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());

		OpBuilder::InsertionGuard guard(builder);
		auto &region = *result.regions.front();
		Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);
		bodyBuild(builder, result.location, bodyBlock->getArguments());
		}

		void IndexedGenericOp::build(
		OpBuilder &builder, OperationState &result,
		ArrayRef<Type> resultTensorTypes, ValueRange inputs,
		ValueRange outputBuffers, ValueRange initTensors,
ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,		ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>		StringRef doc, StringRef libraryCall, IntegerAttr symbolSource,
bodyBuild) {		function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),		build(builder, result, resultTensorTypes, inputs, outputBuffers, initTensors,
builder.getI64IntegerAttr(argsOut),
builder.getAffineMapArrayAttr(indexingMaps),		builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),		builder.getStrArrayAttr(iteratorTypes),
/*doc=*/nullptr, /*library_call=*/nullptr,		doc.empty() ? StringAttr() : builder.getStringAttr(doc),
/*symbol_source=*/nullptr);		libraryCall.empty() ? StringAttr() : builder.getStringAttr(libraryCall),
		symbolSource);
if (!bodyBuild)		if (!bodyBuild)
return;		return;

unsigned nLoops = iteratorTypes.size();		unsigned nLoops = iteratorTypes.size();
SmallVector<Type, 4> blockArgTypes(nLoops, builder.getIndexType());		SmallVector<Type, 4> blockArgTypes(nLoops, builder.getIndexType());
for (Value arg : args)		for (ValueRange container : {inputs, outputBuffers, initTensors})
blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());		for (Value v : container)
		blockArgTypes.push_back(v.getType().cast<ShapedType>().getElementType());

OpBuilder::InsertionGuard guard(builder);		OpBuilder::InsertionGuard guard(builder);
auto &region = *result.regions.front();		auto &region = *result.regions.front();
Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);		Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);
bodyBuild(builder, result.location,		bodyBuild(builder, result.location, bodyBlock->getArguments());
bodyBlock->getArguments().take_front(nLoops),
bodyBlock->getArguments().drop_front(nLoops));
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static void printGenericOp(OpAsmPrinter &p, GenericOpType op) {		static void printGenericOp(OpAsmPrinter &p, GenericOpType op) {
auto attrNames = op.linalgTraitAttrNames();		p << op.getOperationName() << " ";
llvm::StringSet<> linalgTraitAttrsSet;
linalgTraitAttrsSet.insert(attrNames.begin(), attrNames.end());		// Print extra attributes.
SmallVector<NamedAttribute, 8> attrs;		auto genericAttrNames = op.linalgTraitAttrNames();

		llvm::StringSet<> genericAttrNamesSet;
		genericAttrNamesSet.insert(genericAttrNames.begin(), genericAttrNames.end());
		SmallVector<NamedAttribute, 8> genericAttrs;
for (auto attr : op.getAttrs())		for (auto attr : op.getAttrs())
if (linalgTraitAttrsSet.count(attr.first.strref()) > 0)		if (genericAttrNamesSet.count(attr.first.strref()) > 0)
attrs.push_back(attr);		genericAttrs.push_back(attr);
		if (!genericAttrs.empty()) {
		auto genericDictAttr = DictionaryAttr::get(genericAttrs, op.getContext());
		p << genericDictAttr;
		}

auto dictAttr = DictionaryAttr::get(attrs, op.getContext());		// Printing is shared with named ops, except for the region and attributes
p << op.getOperationName() << " " << dictAttr;		printCommonStructuredOpParts(p, op);
p.printOptionalAttrDict(op.getAttrs(), attrNames);
p << " " << op.getOperands();		genericAttrNames.push_back("operand_segment_sizes");

		bool hasExtraAttrs = false;
		for (NamedAttribute n : op.getAttrs()) {
		if ((hasExtraAttrs = !genericAttrNamesSet.contains(n.first.strref())))
		break;
		}
		if (hasExtraAttrs) {
		p << " attrs = ";
		p.printOptionalAttrDict(op.getAttrs(), /*elidedAttrs=*/genericAttrNames);
		}

		// Print region.
if (!op.region().empty())		if (!op.region().empty())
p.printRegion(op.region());		p.printRegion(op.region());
p << ": " << op.getOperandTypes();
auto outputTensorTypes = op.getResultTypes();		// Print results.
if (!outputTensorTypes.empty())		printNamedStructuredOpResults(p, op.result_tensors().getTypes());
p << " -> " << outputTensorTypes;
}		}

static void print(OpAsmPrinter &p, GenericOp op) { printGenericOp(p, op); }		static void print(OpAsmPrinter &p, GenericOp op) { printGenericOp(p, op); }

static void print(OpAsmPrinter &p, IndexedGenericOp op) {		static void print(OpAsmPrinter &p, IndexedGenericOp op) {
printGenericOp(p, op);		printGenericOp(p, op);
}		}

static ParseResult parseGenericOp(OpAsmParser &parser, OperationState &result) {		static ParseResult parseGenericOp(OpAsmParser &parser, OperationState &result) {
SmallVector<OpAsmParser::OperandType, 8> operandsInfo, regionOperandsInfo;
DictionaryAttr dictAttr;		DictionaryAttr dictAttr;
// Parse the core linalg traits that must check into a dictAttr.		// Parse the core linalg traits that must check into a dictAttr.
// The name is unimportant as we will overwrite result.attributes.		// The name is unimportant as we will overwrite result.attributes.
// The core linalg traits must contain the information necessary to pass the		// The core linalg traits must contain the information necessary to pass the
// verifier.		// verifier.
if (parser.parseAttribute(dictAttr, "_", result.attributes))		if (parser.parseAttribute(dictAttr, "_", result.attributes))
return failure();		return failure();
result.attributes.assign(dictAttr.getValue().begin(),		result.attributes.assign(dictAttr.getValue().begin(),
dictAttr.getValue().end());		dictAttr.getValue().end());

		// Parsing is shared with named ops, except for the region.
		SmallVector<Type, 1> inputsTypes, outputBuffersTypes, initTensorsTypes;
		if (parseCommonStructuredOpParts(parser, result, inputsTypes,
		outputBuffersTypes, initTensorsTypes))
		return failure();

// Optional attributes may be added.		// Optional attributes may be added.
if (parser.parseOptionalAttrDict(result.attributes) ||		if (succeeded(parser.parseOptionalKeyword("attrs")))
parser.parseOperandList(operandsInfo))		if (failed(parser.parseEqual()) ||
		failed(parser.parseOptionalAttrDict(result.attributes)))
return failure();		return failure();

Region &region = *result.addRegion();		SmallVector<OpAsmParser::OperandType, 8> regionOperands;
		std::unique_ptr<Region> region = std::make_unique<Region>();
SmallVector<Type, 8> operandTypes, regionTypes;		SmallVector<Type, 8> operandTypes, regionTypes;
if (parser.parseRegion(region, regionOperandsInfo, regionTypes))		if (parser.parseRegion(*region, regionOperands, regionTypes))
return failure();
if (parser.parseColonTypeList(operandTypes))
return failure();		return failure();
		result.addRegion(std::move(region));

// Generic ops may specify that a subset of its outputs are tensors. Such		// Generic ops may specify that a subset of its outputs are tensors. Such
// outputs are specified in the result type.		// outputs are specified in the result type.
SmallVector<Type, 8> tensorResultTypes;		// TODO: may need to move output parsing before region parsing.
if (parser.parseOptionalArrowTypeList(tensorResultTypes))		// Need to wait for declarative assembly resolution to decide.
		SmallVector<Type, 1> outputTensorsTypes;
		if (parseNamedStructuredOpResults(parser, outputTensorsTypes))
return failure();		return failure();
if (!tensorResultTypes.empty())		result.addTypes(outputTensorsTypes);
result.addTypes(tensorResultTypes);
return parser.resolveOperands(operandsInfo, operandTypes,		return success();
parser.getCurrentLocation(), result.operands);
}		}

namespace {		namespace {
template <typename GenericOpType> struct BlockArgsVerifier {		template <typename GenericOpType> struct BlockArgsVerifier {
static LogicalResult verify(GenericOpType op, Block &block);		static LogicalResult verify(GenericOpType op, Block &block);
};		};

template <typename GenericOpType>		template <typename GenericOpType>
[... 49 lines not shown ...]
}		}
} // namespace		} // namespace

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyGenericOp(GenericOpType op) {		static LogicalResult verifyGenericOp(GenericOpType op) {
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();

		if (op.inputs().size() + op.output_buffers().size() +
		herhut (inline comment, Done): I would have expected `init_tensors` not to count here. They only exist for reductions on tensors, so they are implied. But this verification is not very precise anyway.
		nicolasvasilache (author, inline comment, Done): Ack, more work is needed to make reductions + tensors really work with Linalg, in particular for buffer allocation.
		op.init_tensors().size() + op.getNumResults() ==
		0)
		return op.emitOpError("expected at least 1 Shaped operand or return");

auto &region = op.region();		auto &region = op.region();
if (!llvm::hasSingleElement(region))		if (!llvm::hasSingleElement(region))
return op.emitOpError("expected region with 1 block");		return op.emitOpError("expected region with 1 block");
if (failed(BlockArgsVerifier<GenericOpType>::verify(op, region.front())))		if (failed(BlockArgsVerifier<GenericOpType>::verify(op, region.front())))
return failure();		return failure();

auto symbolSourceAttr =		auto symbolSourceAttr =
op.template getAttrOfType<IntegerAttr>("symbol_source");		op.template getAttrOfType<IntegerAttr>("symbol_source");
[... 32 lines not shown ...]
// TODO: Bound inference for maps with symbols		// TODO: Bound inference for maps with symbols
if (!concatMap.getNumSymbols() && !inversePermutation(concatMap))		if (!concatMap.getNumSymbols() && !inversePermutation(concatMap))
return op.emitOpError("expected the concatenation of maps in indexing_map "		return op.emitOpError("expected the concatenation of maps in indexing_map "
"to be invertible");		"to be invertible");

return success();		return success();
}		}

static LogicalResult verify(GenericOp op) {		static LogicalResult verify(GenericOp op) { return verifyGenericOp(op); }
// Temporarily hoisted here to avoid duplicating more code.
// TODO: uniformize with named structured ops.		static LogicalResult verify(IndexedGenericOp op) { return verifyGenericOp(op); }
auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();
if (nInputsAndOutputBuffers != llvm::size(op.views()))
return op.emitOpError("expected exactly ")
<< nInputsAndOutputBuffers
<< " inputs (tensor or buffer) and output buffer operands";
return verifyGenericOp(op);
}

static LogicalResult verify(IndexedGenericOp op) {
// Temporarily hoisted here to avoid duplicating more code.
// TODO: uniformize with named structured ops.
auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();
if (nInputsAndOutputBuffers != llvm::size(op.views()))
return op.emitOpError("expected exactly ")
<< nInputsAndOutputBuffers
<< " inputs (tensor or buffer) and output buffer operands";
return verifyGenericOp(op);
}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// ReshapeOp		// ReshapeOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Collapse reassociation maps that are used in pair of reshape ops where one		/// Collapse reassociation maps that are used in pair of reshape ops where one
/// is a producer and other is the consumer. Only valid to use this method when		/// is a producer and other is the consumer. Only valid to use this method when
/// both the producer and consumer are collapsing dimensions or both are		/// both the producer and consumer are collapsing dimensions or both are
[... 790 lines not shown ...]

#define GET_OP_CLASSES		#define GET_OP_CLASSES
#include "mlir/Dialect/Linalg/IR/LinalgStructuredOps.cpp.inc"		#include "mlir/Dialect/Linalg/IR/LinalgStructuredOps.cpp.inc"

/// Return the dims that are `iteratorTypeName` loops in the LinalgOp `op`.		/// Return the dims that are `iteratorTypeName` loops in the LinalgOp `op`.
/// Assumes `op` is a LinalgOp.		/// Assumes `op` is a LinalgOp.
void mlir::linalg::getDimsOfType(Operation *op, StringRef iteratorTypeName,		void mlir::linalg::getDimsOfType(Operation *op, StringRef iteratorTypeName,
SmallVectorImpl<AffineExpr> &res) {		SmallVectorImpl<AffineExpr> &res) {
		if (!cast<LinalgOp>(op).iterator_types())
		herhut (inline comment, Not Done): Maybe assign to a local?
		nicolasvasilache (author, inline comment, Done): Not getting this, could you please elaborate?
		herhut (inline comment, Not Done): It was just a nit to do `auto iteratorTypes = cast<LinalgOp>(op).iterator_types()` and then use that here and below.
		return;

unsigned dim = 0;		unsigned dim = 0;
MLIRContext *ctx = op->getContext();		MLIRContext *ctx = op->getContext();
for (auto tn :		for (auto tn :
cast<LinalgOp>(op).iterator_types().getAsValueRange<StringAttr>()) {		cast<LinalgOp>(op).iterator_types().getAsValueRange<StringAttr>()) {
if (tn == iteratorTypeName)		if (tn == iteratorTypeName)
res.push_back(getAffineDimExpr(dim, ctx));		res.push_back(getAffineDimExpr(dim, ctx));
++dim;		++dim;
}		}
[... 184 lines not shown ...]
parseNamedStructuredOpResults(OpAsmParser &parser,		parseNamedStructuredOpResults(OpAsmParser &parser,
SmallVectorImpl<Type> &resultTypes) {		SmallVectorImpl<Type> &resultTypes) {
if (succeeded(parser.parseOptionalArrow()))		if (succeeded(parser.parseOptionalArrow()))
if (parser.parseTypeList(resultTypes))		if (parser.parseTypeList(resultTypes))
return failure();		return failure();
return success();		return success();
}		}

template <typename NamedStructuredOpType>		static ParseResult
static ParseResult parseNamedStructuredOp(OpAsmParser &parser,		parseCommonStructuredOpParts(OpAsmParser &parser, OperationState &result,
OperationState &result) {		SmallVectorImpl<Type> &inputsTypes,
		SmallVectorImpl<Type> &outputBuffersTypes,
		SmallVectorImpl<Type> &initTensorsTypes) {
llvm::SMLoc inputsOperandsLoc, outputBuffersOperandsLoc,		llvm::SMLoc inputsOperandsLoc, outputBuffersOperandsLoc,
initTensorsOperandsLoc;		initTensorsOperandsLoc;
SmallVector<OpAsmParser::OperandType, 4> inputsOperands,		SmallVector<OpAsmParser::OperandType, 4> inputsOperands,
outputBuffersOperands, initTensorsOperands;		outputBuffersOperands, initTensorsOperands;
SmallVector<Type, 1> inputsTypes, outputBuffersTypes, initTensorsTypes,
outputTensorsTypes;
std::unique_ptr<Region> regionRegion = std::make_unique<Region>();

if (parser.parseOptionalAttrDict(result.attributes) ||		parser.parseOptionalAttrDict(result.attributes);
parser.parseKeyword("ins") || parser.parseLParen())
		if (succeeded(parser.parseOptionalKeyword("ins"))) {
		if (parser.parseLParen())
return failure();		return failure();

inputsOperandsLoc = parser.getCurrentLocation();		inputsOperandsLoc = parser.getCurrentLocation();
if (parser.parseOperandList(inputsOperands) || parser.parseColon() ||		if (parser.parseOperandList(inputsOperands) || parser.parseColon() ||
parser.parseTypeList(inputsTypes) \|\| parser.parseRParen())		parser.parseTypeList(inputsTypes) \|\| parser.parseRParen())
return failure();		return failure();
		}

if (succeeded(parser.parseOptionalKeyword("outs"))) {		if (succeeded(parser.parseOptionalKeyword("outs"))) {
outputBuffersOperandsLoc = parser.getCurrentLocation();		outputBuffersOperandsLoc = parser.getCurrentLocation();
if (parser.parseLParen() ||		if (parser.parseLParen() ||
parser.parseOperandList(outputBuffersOperands) || parser.parseColon() ||		parser.parseOperandList(outputBuffersOperands) || parser.parseColon() ||
		herhut (inline comment, Done): nit: `parseColonTypeList`
parser.parseTypeList(outputBuffersTypes) \|\| parser.parseRParen())		parser.parseTypeList(outputBuffersTypes) \|\| parser.parseRParen())
return failure();		return failure();
}		}
if (succeeded(parser.parseOptionalKeyword("init"))) {		if (succeeded(parser.parseOptionalKeyword("init"))) {
initTensorsOperandsLoc = parser.getCurrentLocation();		initTensorsOperandsLoc = parser.getCurrentLocation();
if (parser.parseLParen() || parser.parseOperandList(initTensorsOperands) ||		if (parser.parseLParen() || parser.parseOperandList(initTensorsOperands) ||
parser.parseColon() || parser.parseTypeList(initTensorsTypes) ||		parser.parseColon() || parser.parseTypeList(initTensorsTypes) ||
parser.parseRParen())		parser.parseRParen())
return failure();		return failure();
}		}

if (parseNamedStructuredOpResults(parser, outputTensorsTypes))
return failure();

if (parseNamedStructuredOpRegion<NamedStructuredOpType>(
parser, *regionRegion, inputsTypes, outputBuffersTypes,
initTensorsTypes, outputTensorsTypes))
return failure();

if (parser.resolveOperands(inputsOperands, inputsTypes, inputsOperandsLoc,		if (parser.resolveOperands(inputsOperands, inputsTypes, inputsOperandsLoc,
result.operands) ||		result.operands) ||
parser.resolveOperands(outputBuffersOperands, outputBuffersTypes,		parser.resolveOperands(outputBuffersOperands, outputBuffersTypes,
outputBuffersOperandsLoc, result.operands) ||		outputBuffersOperandsLoc, result.operands) ||
parser.resolveOperands(initTensorsOperands, initTensorsTypes,		parser.resolveOperands(initTensorsOperands, initTensorsTypes,
initTensorsOperandsLoc, result.operands))		initTensorsOperandsLoc, result.operands))
return failure();		return failure();

result.addTypes(outputTensorsTypes);
result.addRegion(std::move(regionRegion));
result.addAttribute("operand_segment_sizes",		result.addAttribute("operand_segment_sizes",
parser.getBuilder().getI32VectorAttr(		parser.getBuilder().getI32VectorAttr(
{static_cast<int32_t>(inputsOperands.size()),		{static_cast<int32_t>(inputsOperands.size()),
static_cast<int32_t>(outputBuffersOperands.size()),		static_cast<int32_t>(outputBuffersOperands.size()),
static_cast<int32_t>(initTensorsOperands.size())}));		static_cast<int32_t>(initTensorsOperands.size())}));
return success();		return success();
}		}

		template <typename NamedStructuredOpType>
		static ParseResult parseNamedStructuredOp(OpAsmParser &parser,
		OperationState &result) {
		SmallVector<Type, 1> inputsTypes, outputBuffersTypes, initTensorsTypes;
		if (parseCommonStructuredOpParts(parser, result, inputsTypes,
		outputBuffersTypes, initTensorsTypes))
		return failure();

		// TODO: consider merging results parsing into region parsing.
		// Need to wait for declarative assembly resolution to decide.
		SmallVector<Type, 1> outputTensorsTypes;
		if (parseNamedStructuredOpResults(parser, outputTensorsTypes))
		return failure();
		result.addTypes(outputTensorsTypes);

		std::unique_ptr<Region> region = std::make_unique<Region>();
		if (parseNamedStructuredOpRegion<NamedStructuredOpType>(
		parser, *region, inputsTypes, outputBuffersTypes, initTensorsTypes,
		outputTensorsTypes))
		return failure();
		result.addRegion(std::move(region));

		return success();
		}

static void printNamedStructuredOpResults(OpAsmPrinter &p,		static void printNamedStructuredOpResults(OpAsmPrinter &p,
TypeRange resultTypes) {		TypeRange resultTypes) {
if (resultTypes.empty())		if (resultTypes.empty())
return;		return;
p << "-> " << resultTypes;		p << "-> " << resultTypes;
		herhut: `printArrowTypeList` or `printOptionalArrowTypeList`?
}		}

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static void printNamedStructuredOp(OpAsmPrinter &p, NamedStructuredOpType op) {		static void printCommonStructuredOpParts(OpAsmPrinter &p,
p << op.getOperationName();		NamedStructuredOpType op) {
p.printOptionalAttrDict(op.getAttrs(),
/*elidedAttrs=*/{"operand_segment_sizes"});
p << " ins(" << op.inputs() << " : " << op.inputs().getTypes() << ")";		p << " ins(" << op.inputs() << " : " << op.inputs().getTypes() << ")";
if (!op.output_buffers().empty())		if (!op.output_buffers().empty())
p << " outs(" << op.output_buffers() << " : "		p << " outs(" << op.output_buffers() << " : "
<< op.output_buffers().getTypes() << ")";		<< op.output_buffers().getTypes() << ")";
if (!op.init_tensors().empty())		if (!op.init_tensors().empty())
p << " init(" << op.init_tensors() << " : " << op.init_tensors().getTypes()		p << " init(" << op.init_tensors() << " : " << op.init_tensors().getTypes()
<< ")";		<< ") ";
p << " ";		}
printNamedStructuredOpResults(p, op.output_tensors().getTypes());
p << " ";		template <typename NamedStructuredOpType>
		static void printNamedStructuredOp(OpAsmPrinter &p, NamedStructuredOpType op) {
		p << op.getOperationName();
		p.printOptionalAttrDict(op.getAttrs(),
		/*elidedAttrs=*/{"operand_segment_sizes"});

		// Printing is shared with generic ops, except for the region and attributes.
		printCommonStructuredOpParts(p, op);

		// Results printing.
		printNamedStructuredOpResults(p, op.result_tensors().getTypes());

// Region is elided.		// Region is elided.
}		}

template <typename NamedStructuredOpType>		template <typename NamedStructuredOpType>
static LogicalResult verifyNamedStructuredOp(NamedStructuredOpType op) {		static LogicalResult verifyNamedStructuredOp(NamedStructuredOpType op) {
return verifyGenericOp<NamedStructuredOpType>(op);		return verifyGenericOp<NamedStructuredOpType>(op);
}		}
[... 63 lines not shown ...]
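
The parser in LinalgOps.cpp above finishes by recording an `operand_segment_sizes` attribute with three counts, in the order inputs, output buffers, init tensors. As a minimal sketch of what that attribute encodes, the standalone C++ below splits a flat operand list by those counts. `Operand`, `Segments`, and `splitOperands` are hypothetical names used only for illustration; they are not MLIR API, and the real code works on `mlir::Value` through generated accessors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::Value: just the SSA name as a string.
using Operand = std::string;

struct Segments {
  std::vector<Operand> inputs;
  std::vector<Operand> outputBuffers;
  std::vector<Operand> initTensors;
};

// Splits `flat` into three contiguous groups using `sizes`, ordered as
// {inputs, output_buffers, init_tensors}. Returns false if the sizes are
// malformed or do not sum to the operand count.
bool splitOperands(const std::vector<Operand> &flat,
                   const std::vector<int32_t> &sizes, Segments &out) {
  if (sizes.size() != 3)
    return false;
  std::size_t total = 0;
  for (int32_t s : sizes) {
    if (s < 0)
      return false;
    total += static_cast<std::size_t>(s);
  }
  if (total != flat.size())
    return false;

  // Consume the flat list front to back, one segment at a time.
  auto begin = flat.begin();
  auto take = [&](int32_t n) {
    std::vector<Operand> group(begin, begin + n);
    begin += n;
    return group;
  };
  out.inputs = take(sizes[0]);
  out.outputBuffers = take(sizes[1]);
  out.initTensors = take(sizes[2]);
  assert(begin == flat.end() && "all operands must be consumed");
  return true;
}
```

With operands `%A, %B, %C` and sizes `{2, 0, 1}`, the first two operands land in `inputs`, `outputBuffers` stays empty, and `%C` becomes the single init tensor.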

mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp

[... 255 lines not shown ...]	UnitExtentReplacementInfo info = {
RankedTensorType::get(newShape, type.getElementType()),		RankedTensorType::get(newShape, type.getElementType()),
AffineMap::get(indexMap.getNumDims(), indexMap.getNumSymbols(),		AffineMap::get(indexMap.getNumDims(), indexMap.getNumSymbols(),
newIndexExprs, context),		newIndexExprs, context),
ArrayAttr::get(reassociationMaps, context)};		ArrayAttr::get(reassociationMaps, context)};
return info;		return info;
}		}

namespace {		namespace {

/// Pattern to replace tensors operands/results that are unit extents.		/// Pattern to replace tensors operands/results that are unit extents.
struct ReplaceUnitExtentTensors : public OpRewritePattern<GenericOp> {		struct ReplaceUnitExtentTensors : public OpRewritePattern<GenericOp> {
using OpRewritePattern<GenericOp>::OpRewritePattern;		using OpRewritePattern<GenericOp>::OpRewritePattern;
LogicalResult matchAndRewrite(GenericOp genericOp,		LogicalResult matchAndRewrite(GenericOp genericOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (!genericOp.hasTensorSemantics())		// TODO: support init_tensors and reductions.
		if (!genericOp.hasTensorSemantics() || !genericOp.init_tensors().empty())
return failure();		return failure();

MLIRContext *context = rewriter.getContext();		MLIRContext *context = rewriter.getContext();
Location loc = genericOp.getLoc();		Location loc = genericOp.getLoc();

SmallVector<AffineMap, 4> newIndexingMaps;		SmallVector<AffineMap, 4> newIndexingMaps;
SmallVector<ArrayAttr, 4> reassociationMaps;		SmallVector<ArrayAttr, 4> reassociationMaps;
SmallVector<ShapedType, 4> newInputOutputTypes;		SmallVector<ShapedType, 4> newInputOutputTypes;
[... 12 lines not shown ...]	LogicalResult matchAndRewrite(GenericOp genericOp,
// If the indexing maps of the result operation are not invertible (i.e. not		// If the indexing maps of the result operation are not invertible (i.e. not
// legal), abort.		// legal), abort.
if (!doCanonicalization ||		if (!doCanonicalization ||
!inversePermutation(concatAffineMaps(newIndexingMaps)))		!inversePermutation(concatAffineMaps(newIndexingMaps)))
return failure();		return failure();

// If any operand type change, insert a reshape to convert from the original		// If any operand type change, insert a reshape to convert from the original
// type to the new type.		// type to the new type.
SmallVector<Value, 4> newOperands;		// TODO: get rid of flattenedIdx which assumes operand order and contiguity.
newOperands.reserve(genericOp.getNumOperands());		unsigned flattenedIdx = 0;
for (auto operand : llvm::enumerate(genericOp.getOperands())) {		auto insertReshapes = [&](ValueRange values) {
if (operand.value().getType() == newInputOutputTypes[operand.index()]) {		SmallVector<Value, 4> res;
newOperands.push_back(operand.value());		res.reserve(values.size());
} else {		for (auto operand : llvm::enumerate(values)) {
newOperands.push_back(rewriter.create<linalg::TensorReshapeOp>(		if (operand.value().getType() == newInputOutputTypes[flattenedIdx])
loc, newInputOutputTypes[operand.index()], operand.value(),		res.push_back(operand.value());
reassociationMaps[operand.index()]));		else
}		res.push_back(rewriter.create<linalg::TensorReshapeOp>(
		loc, newInputOutputTypes[flattenedIdx], operand.value(),
		reassociationMaps[flattenedIdx]));
		++flattenedIdx;
}		}
		return res;
		};

		SmallVector<Value, 4> newInputs = insertReshapes(genericOp.inputs());
		SmallVector<Value, 4> newOutputBuffers =
		insertReshapes(genericOp.output_buffers());
		SmallVector<Value, 4> newInitTensors =
		insertReshapes(genericOp.init_tensors());

// If any result type change, insert a reshape to convert from the original		// If any result type change, insert a reshape to convert from the original
// type to the new type.		// type to the new type.
SmallVector<Type, 4> resultTypes;		SmallVector<Type, 4> resultTypes;
resultTypes.reserve(genericOp.getNumResults());		resultTypes.reserve(genericOp.getNumResults());
for (unsigned i : llvm::seq<unsigned>(0, genericOp.getNumResults()))		for (unsigned i : llvm::seq<unsigned>(0, genericOp.getNumResults()))
resultTypes.push_back(		resultTypes.push_back(
newInputOutputTypes[i + genericOp.getNumOperands()]);		newInputOutputTypes[i + genericOp.getNumOperands()]);
GenericOp replacementOp = rewriter.create<GenericOp>(		GenericOp replacementOp = rewriter.create<GenericOp>(
loc, resultTypes, newOperands, genericOp.args_in(),		loc, resultTypes, newInputs, newOutputBuffers, newInitTensors,
genericOp.args_out(), rewriter.getAffineMapArrayAttr(newIndexingMaps),		rewriter.getAffineMapArrayAttr(newIndexingMaps),
genericOp.iterator_types(),		genericOp.iterator_types(),
/*doc = */ nullptr,		/*doc = */ nullptr,
/*library_call = */ nullptr,		/*library_call = */ nullptr,
/*symbol_source = */ nullptr);		/*symbol_source = */ nullptr);
rewriter.inlineRegionBefore(genericOp.region(), replacementOp.region(),		rewriter.inlineRegionBefore(genericOp.region(), replacementOp.region(),
replacementOp.region().begin());		replacementOp.region().begin());

// If any result tensor has a modified shape, then add reshape to recover		// If any result tensor has a modified shape, then add reshape to recover
// the original shape.		// the original shape.
SmallVector<Value, 4> resultReplacements;		SmallVector<Value, 4> resultReplacements;
for (auto result : llvm::enumerate(replacementOp.getResults())) {		for (auto result : llvm::enumerate(replacementOp.getResults())) {
unsigned index = result.index() + replacementOp.getNumOperands();		unsigned index = result.index() + replacementOp.getNumOperands();
RankedTensorType origResultType = genericOp.getResult(result.index())		RankedTensorType origResultType = genericOp.getResult(result.index())
.getType()		.getType()
.cast<RankedTensorType>();		.cast<RankedTensorType>();
if (origResultType != result.value().getType()) {		if (origResultType != result.value().getType())
resultReplacements.push_back(rewriter.create<linalg::TensorReshapeOp>(		resultReplacements.push_back(rewriter.create<linalg::TensorReshapeOp>(
loc, origResultType, result.value(), reassociationMaps[index]));		loc, origResultType, result.value(), reassociationMaps[index]));
} else {		else
resultReplacements.push_back(result.value());		resultReplacements.push_back(result.value());
}		}
}
rewriter.replaceOp(genericOp, resultReplacements);		rewriter.replaceOp(genericOp, resultReplacements);
return success();		return success();
}		}
};		};
} // namespace		} // namespace

/// Patterns that are used to canonicalize the use of unit-extent dims for		/// Patterns that are used to canonicalize the use of unit-extent dims for
/// broadcasting.		/// broadcasting.
[... 27 lines not shown ...]
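
The `insertReshapes` lambda in DropUnitDims.cpp above walks the inputs, output buffers, and init tensors in turn while sharing one `flattenedIdx` counter, which is why its TODO notes the dependence on operand order and contiguity. The following is a minimal standalone sketch of that pattern, using strings in place of MLIR types. `TypeList` and `rewriteGroups` are hypothetical names, and the `"reshape(...)"` string only stands in for creating a `linalg::TensorReshapeOp`.

```cpp
#include <cassert>
#include <string>
#include <vector>

using TypeList = std::vector<std::string>;

// Walks several operand groups in order with ONE running index into
// `newTypes`, so the result is only correct if the groups are contiguous and
// ordered the same way as `newTypes`. Unchanged types pass through; changed
// ones are wrapped in a "reshape" note.
std::vector<TypeList> rewriteGroups(const std::vector<TypeList> &groups,
                                    const TypeList &newTypes) {
  unsigned flattenedIdx = 0;
  auto insertReshapes = [&](const TypeList &types) {
    TypeList res;
    res.reserve(types.size());
    for (const std::string &type : types) {
      assert(flattenedIdx < newTypes.size() && "newTypes is too short");
      const std::string &target = newTypes[flattenedIdx];
      if (type == target)
        res.push_back(type);
      else
        res.push_back("reshape(" + type + " -> " + target + ")");
      ++flattenedIdx;
    }
    return res;
  };

  std::vector<TypeList> result;
  result.reserve(groups.size());
  for (const TypeList &group : groups)
    result.push_back(insertReshapes(group));
  return result;
}
```

Because the counter is captured by reference, calling the lambda on the groups in a different order would silently pair operands with the wrong replacement types.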

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp

[... 437 lines not shown ...]
/// Implementation of fusion of generic ops and indexed_generic ops.		/// Implementation of fusion of generic ops and indexed_generic ops.
struct FuseGenericOpsOnTensors {		struct FuseGenericOpsOnTensors {
static bool isFusible(LinalgOp producer, LinalgOp consumer,		static bool isFusible(LinalgOp producer, LinalgOp consumer,
unsigned consumerIdx) {		unsigned consumerIdx) {
// Producer and consumer must have tensor semantics.		// Producer and consumer must have tensor semantics.
if (!producer.hasTensorSemantics() || !consumer.hasTensorSemantics())		if (!producer.hasTensorSemantics() || !consumer.hasTensorSemantics())
return false;		return false;

		// TODO: maybe allow init_tensors and reductions.
		herhut: Left over?
		// if (producer.init_tensors().empty() || consumer.init_tensors().empty())
		// return false;

// Verify that		// Verify that
// - the producer has all "parallel" iterator type.		// - the producer has all "parallel" iterator type.
if (producer.getNumParallelLoops() != producer.getNumLoops())		if (producer.getNumParallelLoops() != producer.getNumLoops())
return false;		return false;

// Get the consumer index map. The number of results of the consumer index		// Get the consumer index map. The number of results of the consumer index
// map must match the number of loops of the producer.		// map must match the number of loops of the producer.
AffineMap consumerIndexMap = consumer.getIndexingMap(consumerIdx);		AffineMap consumerIndexMap = consumer.getIndexingMap(consumerIdx);
[... 40 lines not shown ...]	static LinalgOp fuse(LinalgOp producer, LinalgOp consumer,
computeProducerOperandIndex(		computeProducerOperandIndex(
producer, consumer.getInputIndexingMap(consumerIdx), fusedIndexMaps);		producer, consumer.getInputIndexingMap(consumerIdx), fusedIndexMaps);

// Append the indexing maps for the remaining consumer operands.		// Append the indexing maps for the remaining consumer operands.
fusedIndexMaps.append(std::next(consumerIndexMaps.begin(), consumerIdx + 1),		fusedIndexMaps.append(std::next(consumerIndexMaps.begin(), consumerIdx + 1),
consumerIndexMaps.end());		consumerIndexMaps.end());

// Generate the fused op.		// Generate the fused op.
		// Tensor-level fusion is only on ops without initTensors and outputBuffers.
LinalgOp fusedOp;		LinalgOp fusedOp;
if (isa<GenericOp>(producer.getOperation()) &&		if (isa<GenericOp>(producer.getOperation()) &&
isa<GenericOp>(consumer.getOperation())) {		isa<GenericOp>(consumer.getOperation())) {
fusedOp =		fusedOp =
rewriter		rewriter
.create<GenericOp>(		.create<GenericOp>(rewriter.getUnknownLoc(),
rewriter.getUnknownLoc(),		consumer.getOperation()->getResultTypes(),
consumer.getOperation()->getResultTypes(), fusedOperands,		/*inputs=*/fusedOperands,
rewriter.getI64IntegerAttr(fusedOperands.size()),		/*outputBuffers=*/ValueRange{},
rewriter.getI64IntegerAttr(		/*initTensors=*/ValueRange{},
consumer.getOperation()->getNumResults()),
rewriter.getArrayAttr(fusedIndexMaps),		rewriter.getArrayAttr(fusedIndexMaps),
		herhut: Reformat.
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr,		/*library_call=*/nullptr,
/*symbol_source=*/nullptr)		/*symbol_source=*/nullptr)
		herhut: I know this is a carry over but why `UnknownLoc`?
.getOperation();		.getOperation();
} else {		} else {
fusedOp =		fusedOp = rewriter
rewriter
.create<IndexedGenericOp>(		.create<IndexedGenericOp>(
rewriter.getUnknownLoc(),		rewriter.getUnknownLoc(),
consumer.getOperation()->getResultTypes(), fusedOperands,		consumer.getOperation()->getResultTypes(),
rewriter.getI64IntegerAttr(fusedOperands.size()),		/*inputs=*/fusedOperands,
rewriter.getI64IntegerAttr(		/*outputBuffers=*/ValueRange{},
consumer.getOperation()->getNumResults()),		/*initTensors=*/ValueRange{},
rewriter.getArrayAttr(fusedIndexMaps),		rewriter.getArrayAttr(fusedIndexMaps),
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr,		/*library_call=*/nullptr,
/*symbol_source=*/nullptr)		/*symbol_source=*/nullptr)
.getOperation();		.getOperation();
}		}

// Construct an AffineMap from consumer loops to producer loops.		// Construct an AffineMap from consumer loops to producer loops.
// consumer loop -> tensor index		// consumer loop -> tensor index
AffineMap consumerResultIndexMap =		AffineMap consumerResultIndexMap =
consumer.getInputIndexingMap(consumerIdx);		consumer.getInputIndexingMap(consumerIdx);
// producer loop -> tensor index		// producer loop -> tensor index
AffineMap producerResultIndexMap = producer.getOutputIndexingMap(0);		AffineMap producerResultIndexMap = producer.getOutputIndexingMap(0);
[... 265 lines not shown ...]	if (!inversePermutation(concatAffineMaps(fusedIndexMaps)))
return nullptr;		return nullptr;

SmallVector<Attribute, 4> indexMapAttrs = llvm::to_vector<4>(		SmallVector<Attribute, 4> indexMapAttrs = llvm::to_vector<4>(
llvm::map_range(fusedIndexMaps, [](AffineMap map) -> Attribute {		llvm::map_range(fusedIndexMaps, [](AffineMap map) -> Attribute {
return AffineMapAttr::get(map);		return AffineMapAttr::get(map);
}));		}));
LinalgOp fusedOp = createLinalgOpOfSameType(		LinalgOp fusedOp = createLinalgOpOfSameType(
consumer, rewriter, rewriter.getUnknownLoc(),		consumer, rewriter, rewriter.getUnknownLoc(),
consumerOp->getResultTypes(), fusedOperands,		consumerOp->getResultTypes(),
rewriter.getI64IntegerAttr(fusedOperands.size()),		/*inputs=*/fusedOperands,
rewriter.getI64IntegerAttr(consumerOp->getNumResults()),		/*outputBuffers=*/ValueRange{},
		/*initTensors=*/ValueRange{}, // no init tensors for now.
rewriter.getArrayAttr(indexMapAttrs), consumer.iterator_types(),		rewriter.getArrayAttr(indexMapAttrs), consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr,		/*library_call=*/nullptr,
/*symbol_source=*/nullptr);		/*symbol_source=*/nullptr);
auto &fusedRegion = fusedOp.getOperation()->getRegion(0);		auto &fusedRegion = fusedOp.getOperation()->getRegion(0);
rewriter.cloneRegionBefore(consumerOp->getRegion(0), fusedRegion,		rewriter.cloneRegionBefore(consumerOp->getRegion(0), fusedRegion,
fusedRegion.begin());		fusedRegion.begin());
return fusedOp;		return fusedOp;
[... 40 lines not shown ...]	static LinalgOp fuseCollapsingCase(LinalgOp producer,
SmallVector<Attribute, 4> indexMapAttrs = llvm::to_vector<4>(		SmallVector<Attribute, 4> indexMapAttrs = llvm::to_vector<4>(
llvm::map_range(fusedIndexMaps, [](AffineMap map) -> Attribute {		llvm::map_range(fusedIndexMaps, [](AffineMap map) -> Attribute {
return AffineMapAttr::get(map);		return AffineMapAttr::get(map);
}));		}));

Operation *producerOp = producer.getOperation();		Operation *producerOp = producer.getOperation();
LinalgOp fusedOp = createLinalgOpOfSameType(		LinalgOp fusedOp = createLinalgOpOfSameType(
producer, rewriter, rewriter.getUnknownLoc(), consumer.getResultType(),		producer, rewriter, rewriter.getUnknownLoc(), consumer.getResultType(),
producerOp->getOperands(),		/*inputs=*/producerOp->getOperands(),
rewriter.getI64IntegerAttr(producerOp->getNumOperands()),		/*outputBuffers=*/ValueRange{},
rewriter.getI64IntegerAttr(1), rewriter.getArrayAttr(indexMapAttrs),		/*initTensors=*/ValueRange{}, // no init tensors for now.
producer.iterator_types(),		rewriter.getArrayAttr(indexMapAttrs), producer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr,		/*library_call=*/nullptr,
/*symbol_source=*/nullptr);		/*symbol_source=*/nullptr);
auto &fusedRegion = fusedOp.getOperation()->getRegion(0);		auto &fusedRegion = fusedOp.getOperation()->getRegion(0);
rewriter.cloneRegionBefore(producerOp->getRegion(0), fusedRegion,		rewriter.cloneRegionBefore(producerOp->getRegion(0), fusedRegion,
fusedRegion.begin());		fusedRegion.begin());
return fusedOp;		return fusedOp;
}		}
[... 41 lines not shown ...]	static LinalgOp fuseExpandingCase(LinalgOp producer, TensorReshapeOp consumer,
SmallVector<Type, 4> resultTypes;		SmallVector<Type, 4> resultTypes;
for (auto t : producer.getOutputTensorTypes()) {		for (auto t : producer.getOutputTensorTypes()) {
Type type = RankedTensorType::get(dstShape,		Type type = RankedTensorType::get(dstShape,
t.cast<ShapedType>().getElementType());		t.cast<ShapedType>().getElementType());
resultTypes.push_back(type);		resultTypes.push_back(type);
}		}

int rank = dstShape.size();		int rank = dstShape.size();
int numArgsIn = producer.getNumInputs();
int numArgsOut = producer.getNumOutputs();
auto genericOp = rewriter.create<linalg::GenericOp>(		auto genericOp = rewriter.create<linalg::GenericOp>(
loc, resultTypes, args, numArgsIn, numArgsOut,		loc, resultTypes, /*inputs=*/args,
		/*outputBuffers=*/ValueRange{},
		/*initTensors=*/ValueRange{},
SmallVector<AffineMap, 3>(args.size() + resultTypes.size(),		SmallVector<AffineMap, 3>(args.size() + resultTypes.size(),
rewriter.getMultiDimIdentityMap(rank)),		rewriter.getMultiDimIdentityMap(rank)),
SmallVector<StringRef, 3>(rank, getParallelIteratorTypeName()));		SmallVector<StringRef, 3>(rank, getParallelIteratorTypeName()));
Region &region = genericOp.getRegion();		Region &region = genericOp.getRegion();
rewriter.cloneRegionBefore(producer.getOperation()->getRegion(0), region,		rewriter.cloneRegionBefore(producer.getOperation()->getRegion(0), region,
region.begin());		region.begin());
return cast<LinalgOp>(genericOp.getOperation());		return cast<LinalgOp>(genericOp.getOperation());
}		}
[... 43 lines not shown ...]	static LinalgOp fuse(ConstantOp producer, LinalgOp consumer,

// Create a constant scalar value from the splat constant.		// Create a constant scalar value from the splat constant.
Value scalarConstant = rewriter.create<ConstantOp>(		Value scalarConstant = rewriter.create<ConstantOp>(
producer.getLoc(),		producer.getLoc(),
producer.value().cast<DenseElementsAttr>().getSplatValue());		producer.value().cast<DenseElementsAttr>().getSplatValue());

LinalgOp fusedOp = createLinalgOpOfSameType(		LinalgOp fusedOp = createLinalgOpOfSameType(
consumer, rewriter, rewriter.getUnknownLoc(),		consumer, rewriter, rewriter.getUnknownLoc(),
consumerOp->getResultTypes(), fusedOperands,		consumerOp->getResultTypes(),
rewriter.getI64IntegerAttr(consumerOp->getNumOperands() - 1),		/*inputs=*/fusedOperands,
rewriter.getI64IntegerAttr(consumerOp->getNumResults()),		/*outputBuffers=*/ValueRange{},
		/*initTensors=*/ValueRange{}, // no init tensors for now.
rewriter.getAffineMapArrayAttr(fusedIndexMaps),		rewriter.getAffineMapArrayAttr(fusedIndexMaps),
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr,		/*library_call=*/nullptr,
/*symbol_source=*/nullptr);		/*symbol_source=*/nullptr);

// Map the block argument corresponding to the replaced argument with the		// Map the block argument corresponding to the replaced argument with the
// scalar constant.		// scalar constant.
[... 106 lines not shown ...]

mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp

[... 30 lines not shown ...]	class GenericOpConverter
: public BufferAssignmentOpConversionPattern<linalg::GenericOp> {		: public BufferAssignmentOpConversionPattern<linalg::GenericOp> {
public:		public:
using BufferAssignmentOpConversionPattern<		using BufferAssignmentOpConversionPattern<
linalg::GenericOp>::BufferAssignmentOpConversionPattern;		linalg::GenericOp>::BufferAssignmentOpConversionPattern;

LogicalResult		LogicalResult
matchAndRewrite(linalg::GenericOp op, ArrayRef<Value> operands,		matchAndRewrite(linalg::GenericOp op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const final {		ConversionPatternRewriter &rewriter) const final {
		linalg::GenericOpAdaptor adaptor(operands,
		op.getOperation()->getAttrDictionary());

		// TODO: support ops with reduction.
		if (!op.init_tensors().empty())
		return failure();

		// All inputs need to be turned into buffers first. Until then, bail out.
		if (llvm::any_of(adaptor.inputs(),
		[](Value in) { return !in.getType().isa<MemRefType>(); }))
		return failure();

Location loc = op.getLoc();		Location loc = op.getLoc();
ResultRange results = op.getOperation()->getResults();		SmallVector<Value, 2> outputBuffers, newOutputBuffers;
SmallVector<Value, 2> newArgs, newResults;		outputBuffers.assign(adaptor.output_buffers().begin(),
newArgs.reserve(operands.size() + results.size());		adaptor.output_buffers().end());
newArgs.append(operands.begin(), operands.end());		newOutputBuffers.reserve(op.getNumOutputs());
newResults.reserve(results.size());		newOutputBuffers.append(adaptor.output_buffers().begin(),
		adaptor.output_buffers().end());

// Update all types to memref types.		// Update all types to memref types.
for (auto result : results) {		for (Type t : op.getResultTypes()) {
auto type = result.getType().cast<ShapedType>();		auto type = t.cast<ShapedType>();
assert(type && "tensor to buffer conversion expects ranked results");
if (!type.hasStaticShape())		if (!type.hasStaticShape())
return rewriter.notifyMatchFailure(		return rewriter.notifyMatchFailure(
op, "dynamic shapes not currently supported");		op, "dynamic shapes not currently supported");
auto memrefType = MemRefType::get(type.getShape(), type.getElementType());		auto memrefType = MemRefType::get(type.getShape(), type.getElementType());
auto alloc = rewriter.create<AllocOp>(loc, memrefType);		auto alloc = rewriter.create<AllocOp>(loc, memrefType);
newArgs.push_back(alloc);		newOutputBuffers.push_back(alloc);
newResults.push_back(alloc);
}		}

// Generate a new linalg operation that works on buffers.		// Generate a new linalg operation that works on buffers.
auto linalgOp = rewriter.create<linalg::GenericOp>(		auto linalgOp = rewriter.create<linalg::GenericOp>(
loc, llvm::None, newArgs, rewriter.getI64IntegerAttr(operands.size()),		loc,
rewriter.getI64IntegerAttr(results.size()), op.indexing_maps(),		/*resultTensorTypes=*/ArrayRef<Type>{},
op.iterator_types(), op.docAttr(), op.library_callAttr(),		/*inputs=*/adaptor.inputs(),
op.symbol_sourceAttr());		/*outputBuffers=*/newOutputBuffers,
		/*initTensors=*/ValueRange{}, op.indexing_maps(), op.iterator_types(),
		op.docAttr(), op.library_callAttr(), op.symbol_sourceAttr());

// Create a new block in the region of the new Generic Op.		// Create a new block in the region of the new Generic Op.
Block &oldBlock = op.getRegion().front();		Block &oldBlock = op.getRegion().front();
Region &newRegion = linalgOp.region();		Region &newRegion = linalgOp.region();
Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),		Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),
oldBlock.getArgumentTypes());		oldBlock.getArgumentTypes());

// Add the result arguments to the new block.		// Add the result arguments to the new block.
for (auto result : newResults)		for (Value v : newOutputBuffers)
newBlock->addArgument(		newBlock->addArgument(v.getType().cast<MemRefType>().getElementType());
result.getType().cast<ShapedType>().getElementType());

// Clone the body of the old block to the new block.		// Clone the body of the old block to the new block.
BlockAndValueMapping mapping;		BlockAndValueMapping mapping;
for (unsigned i = 0; i < oldBlock.getNumArguments(); i++)		for (unsigned i = 0; i < oldBlock.getNumArguments(); i++)
mapping.map(oldBlock.getArgument(i), newBlock->getArgument(i));		mapping.map(oldBlock.getArgument(i), newBlock->getArgument(i));

		OpBuilder::InsertionGuard guard(rewriter);
rewriter.setInsertionPointToEnd(newBlock);		rewriter.setInsertionPointToEnd(newBlock);
for (auto &op : oldBlock.getOperations()) {		for (auto &op : oldBlock.getOperations()) {
Operation *clonedOp = rewriter.clone(op, mapping);		Operation *clonedOp = rewriter.clone(op, mapping);
mapping.map(op.getResults(), clonedOp->getResults());		mapping.map(op.getResults(), clonedOp->getResults());
}		}

// Replace the results of the old Generic Op with the results of the new		// Replace the results of the old op with the new output buffers.
// one.		rewriter.replaceOp(op, newOutputBuffers);
rewriter.replaceOp(op, newResults);
return success();		return success();
}		}
};		};

/// Populate the given list with patterns to convert Linalg operations on		/// Populate the given list with patterns to convert Linalg operations on
/// tensors to buffers.		/// tensors to buffers.
static void populateConvertLinalgOnTensorsToBuffersPattern(		static void populateConvertLinalgOnTensorsToBuffersPattern(
MLIRContext *context, BufferAssignmentTypeConverter *converter,		MLIRContext *context, BufferAssignmentTypeConverter *converter,
[... 58 lines not shown ...]
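
The updated `GenericOpConverter` above keeps the existing output buffers and appends one freshly allocated buffer per statically shaped result tensor, bailing out on dynamic shapes. The following is a minimal standalone sketch of that bookkeeping, with shapes modeled as integer vectors. `Shape`, `kDynamicDim`, and `allocateResultBuffers` are hypothetical names, and the `push_back` only stands in for creating an `AllocOp`. Unlike the pattern in the diff, this sketch checks all shapes before appending anything.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;

// Hypothetical marker for a dynamic dimension in this sketch.
const int64_t kDynamicDim = -1;

// Appends one "allocated" buffer shape per result tensor to `outputBuffers`.
// Returns false and leaves `outputBuffers` untouched if any result has a
// dynamic dimension.
bool allocateResultBuffers(const std::vector<Shape> &resultTensors,
                           std::vector<Shape> &outputBuffers) {
  for (const Shape &shape : resultTensors)
    for (int64_t dim : shape)
      if (dim == kDynamicDim)
        return false;

  outputBuffers.reserve(outputBuffers.size() + resultTensors.size());
  for (const Shape &shape : resultTensors)
    outputBuffers.push_back(shape); // Stands in for creating an alloc.
  assert(outputBuffers.size() >= resultTensors.size());
  return true;
}
```

Starting from one existing output buffer and one static result of shape 16x8, the call leaves two output buffers, with the new one last.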

mlir/test/Conversion/LinalgToSPIRV/linalg-to-spirv.mlir

	// RUN: mlir-opt -split-input-file -convert-linalg-to-spirv -canonicalize -verify-diagnostics %s -o - | FileCheck %s			// RUN: mlir-opt -split-input-file -convert-linalg-to-spirv -canonicalize -verify-diagnostics %s -o - | FileCheck %s

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Single workgroup reduction			// Single workgroup reduction
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#single_workgroup_reduction_trait = {			#single_workgroup_reduction_trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["reduction"],			iterator_types = ["reduction"],
	indexing_maps = [			indexing_maps = [
	affine_map<(i) -> (i)>,			affine_map<(i) -> (i)>,
	affine_map<(i) -> (0)>			affine_map<(i) -> (0)>
	]			]
	}			}

	module attributes {			module attributes {
	[... 26 lines not shown ...]
	// CHECK: ^bb2:			// CHECK: ^bb2:
	// CHECK: spv._merge			// CHECK: spv._merge
	// CHECK: }			// CHECK: }
	// CHECK: spv.Return			// CHECK: spv.Return

	func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) attributes {			func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) attributes {
	spv.entry_point_abi = {local_size = dense<[16, 1, 1]>: vector<3xi32>}			spv.entry_point_abi = {local_size = dense<[16, 1, 1]>: vector<3xi32>}
	} {			} {
	linalg.generic #single_workgroup_reduction_trait %input, %output {			linalg.generic #single_workgroup_reduction_trait
				ins(%input : memref<16xi32>)
				outs(%output : memref<1xi32>) {
	^bb(%in: i32, %out: i32):			^bb(%in: i32, %out: i32):
	%sum = addi %in, %out : i32			%sum = addi %in, %out : i32
	linalg.yield %sum : i32			linalg.yield %sum : i32
	} : memref<16xi32>, memref<1xi32>			}
	spv.Return			spv.Return
	}			}
	}			}

	// -----			// -----

	// Missing shader entry point ABI			// Missing shader entry point ABI

	#single_workgroup_reduction_trait = {			#single_workgroup_reduction_trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["reduction"],			iterator_types = ["reduction"],
	indexing_maps = [			indexing_maps = [
	affine_map<(i) -> (i)>,			affine_map<(i) -> (i)>,
	affine_map<(i) -> (0)>			affine_map<(i) -> (0)>
	]			]
	}			}

	module attributes {			module attributes {
	spv.target_env = #spv.target_env<			spv.target_env = #spv.target_env<
	#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>			#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>
	} {			} {
	func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) {			func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) {
	// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}			// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}
	linalg.generic #single_workgroup_reduction_trait %input, %output {			linalg.generic #single_workgroup_reduction_trait
				ins(%input : memref<16xi32>)
				outs(%output : memref<1xi32>) {
	^bb(%in: i32, %out: i32):			^bb(%in: i32, %out: i32):
	%sum = addi %in, %out : i32			%sum = addi %in, %out : i32
	linalg.yield %sum : i32			linalg.yield %sum : i32
	} : memref<16xi32>, memref<1xi32>			}
	return			return
	}			}
	}			}

	// -----			// -----

	// Mismatch between shader entry point ABI and input memref shape			// Mismatch between shader entry point ABI and input memref shape

	#single_workgroup_reduction_trait = {			#single_workgroup_reduction_trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["reduction"],			iterator_types = ["reduction"],
	indexing_maps = [			indexing_maps = [
	affine_map<(i) -> (i)>,			affine_map<(i) -> (i)>,
	affine_map<(i) -> (0)>			affine_map<(i) -> (0)>
	]			]
	}			}

	module attributes {			module attributes {
	spv.target_env = #spv.target_env<			spv.target_env = #spv.target_env<
	#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>			#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>
	} {			} {
	func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) attributes {			func @single_workgroup_reduction(%input: memref<16xi32>, %output: memref<1xi32>) attributes {
	spv.entry_point_abi = {local_size = dense<[32, 1, 1]>: vector<3xi32>}			spv.entry_point_abi = {local_size = dense<[32, 1, 1]>: vector<3xi32>}
	} {			} {
	// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}			// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}
	linalg.generic #single_workgroup_reduction_trait %input, %output {			linalg.generic #single_workgroup_reduction_trait
				ins(%input : memref<16xi32>)
				outs(%output : memref<1xi32>) {
	^bb(%in: i32, %out: i32):			^bb(%in: i32, %out: i32):
	%sum = addi %in, %out : i32			%sum = addi %in, %out : i32
	linalg.yield %sum : i32			linalg.yield %sum : i32
	} : memref<16xi32>, memref<1xi32>			}
	spv.Return			spv.Return
	}			}
	}			}

	// -----			// -----

	// Unsupported multi-dimension input memref			// Unsupported multi-dimension input memref

	#single_workgroup_reduction_trait = {			#single_workgroup_reduction_trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "reduction"],			iterator_types = ["parallel", "reduction"],
	indexing_maps = [			indexing_maps = [
	affine_map<(i, j) -> (i, j)>,			affine_map<(i, j) -> (i, j)>,
	affine_map<(i, j) -> (i)>			affine_map<(i, j) -> (i)>
	]			]
	}			}

	module attributes {			module attributes {
	spv.target_env = #spv.target_env<			spv.target_env = #spv.target_env<
	#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>			#spv.vce<v1.3, [Shader, GroupNonUniformArithmetic], []>, {}>
	} {			} {
	func @single_workgroup_reduction(%input: memref<16x8xi32>, %output: memref<16xi32>) attributes {			func @single_workgroup_reduction(%input: memref<16x8xi32>, %output: memref<16xi32>) attributes {
	spv.entry_point_abi = {local_size = dense<[16, 8, 1]>: vector<3xi32>}			spv.entry_point_abi = {local_size = dense<[16, 8, 1]>: vector<3xi32>}
	} {			} {
	// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}			// expected-error @+1 {{failed to legalize operation 'linalg.generic'}}
	linalg.generic #single_workgroup_reduction_trait %input, %output {			linalg.generic #single_workgroup_reduction_trait
				ins(%input : memref<16x8xi32>)
				outs(%output : memref<16xi32>) {
	^bb(%in: i32, %out: i32):			^bb(%in: i32, %out: i32):
	%sum = addi %in, %out : i32			%sum = addi %in, %out : i32
	linalg.yield %sum : i32			linalg.yield %sum : i32
	} : memref<16x8xi32>, memref<16xi32>			}
	spv.Return			spv.Return
	}			}
	}			}

mlir/test/Dialect/Linalg/canonicalize.mlir

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(i) -> (i)>,			affine_map<(i) -> (i)>,
	affine_map<(i) -> (i)>			affine_map<(i) -> (i)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel"]			iterator_types = ["parallel"]
	}			}

	func @dce_zero_memref(%arg0 : memref<0xf32>, %arg1: tensor<0xf32>) -> tensor<0xf32> {			func @dce_zero_memref(%arg0 : memref<0xf32>, %arg1: tensor<0xf32>) -> tensor<0xf32> {
	// memref<0xf32> is expected to be dce'ed			// memref<0xf32> is expected to be dce'ed
	linalg.copy(%arg0, %arg0): memref<0xf32>, memref<0xf32>			linalg.copy(%arg0, %arg0): memref<0xf32>, memref<0xf32>

	// tensor<0xf32> cannot be dce'ed			// tensor<0xf32> cannot be dce'ed
	%1 = linalg.generic #trait %arg1 {			%1 = linalg.generic #trait ins(%arg1 : tensor<0xf32>) {
	^bb(%0: f32) :			^bb(%0: f32) :
	linalg.yield %0 : f32			linalg.yield %0 : f32
	} : tensor<0xf32> -> tensor<0xf32>			} -> tensor<0xf32>

	return %1: tensor<0xf32>			return %1: tensor<0xf32>
	}			}
	// CHECK-LABEL: @dce_zero_memref			// CHECK-LABEL: @dce_zero_memref
	// CHECK-NOT: linalg.copy			// CHECK-NOT: linalg.copy
	// CHECK-NEXT: linalg.generic			// CHECK-NEXT: linalg.generic

	// -----			// -----

mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir

	// RUN: mlir-opt %s -linalg-fold-unit-extent-dims -split-input-file | FileCheck %s			// RUN: mlir-opt %s -linalg-fold-unit-extent-dims -split-input-file | FileCheck %s

	#accesses = [			#accesses = [
	affine_map<(i, j, k, l, m) -> (i, k, m)>,			affine_map<(i, j, k, l, m) -> (i, k, m)>,
	affine_map<(i, j, k, l, m) -> (i, k, j, l, m)>			affine_map<(i, j, k, l, m) -> (i, k, j, l, m)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"],
	indexing_maps = #accesses,			indexing_maps = #accesses,
	library_call = "some_external_func"			library_call = "some_external_func"
	}			}

	func @drop_one_trip_loops(%arg0 : tensor<?x1x?xf32>) -> tensor<?x1x?x1x?xf32>			func @drop_one_trip_loops(%arg0 : tensor<?x1x?xf32>) -> tensor<?x1x?x1x?xf32>
	{			{
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<?x1x?xf32>) {
	^bb0(%arg1 : f32) :			^bb0(%arg1 : f32) :
	linalg.yield %arg1 : f32			linalg.yield %arg1 : f32
	} : tensor<?x1x?xf32> -> tensor<?x1x?x1x?xf32>			} -> tensor<?x1x?x1x?xf32>
	return %0 : tensor<?x1x?x1x?xf32>			return %0 : tensor<?x1x?x1x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2) -> (d2)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2) -> (d2)>
	// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1, d2) -> (d0, d2)>			// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1, d2) -> (d0, d2)>
	// CHECK-DAG: #[[$MAP3:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$MAP3:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK-DAG: #[[$MAP4:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d0, d1)>			// CHECK-DAG: #[[$MAP4:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d0, d1)>
	// CHECK-DAG: #[[$MAP5:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d2, d3)>			// CHECK-DAG: #[[$MAP5:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d2, d3)>
	// CHECK-DAG: #[[$MAP6:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d4)>			// CHECK-DAG: #[[$MAP6:.*]] = affine_map<(d0, d1, d2, d3, d4) -> (d4)>
	// CHECK-LABEL: func @drop_one_trip_loops			// CHECK-LABEL: func @drop_one_trip_loops
	// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP0]], #[[$MAP1]]]			// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP0]], #[[$MAP1]]]
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP2]], #[[$MAP3]]]			// CHECK-SAME: indexing_maps = [#[[$MAP2]], #[[$MAP3]]]
	// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"]			// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"]
	// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP4]], #[[$MAP5]], #[[$MAP6]]]			// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP4]], #[[$MAP5]], #[[$MAP6]]]

	// -----			// -----

	#map0 = affine_map<(i, j) -> (i, j)>			#map0 = affine_map<(i, j) -> (i, j)>
	#access = [#map0, #map0]			#access = [#map0, #map0]
	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	indexing_maps = #access,			indexing_maps = #access,
	library_call = "some_external_func"			library_call = "some_external_func"
	}			}

	func @drop_all_loops(%arg0 : tensor<1x1xf32>) -> tensor<1x1xf32>			func @drop_all_loops(%arg0 : tensor<1x1xf32>) -> tensor<1x1xf32>
	{			{
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<1x1xf32>) {
	^bb0(%arg1: f32) :			^bb0(%arg1: f32) :
	linalg.yield %arg1 : f32			linalg.yield %arg1 : f32
	} : tensor<1x1xf32> -> tensor<1x1xf32>			} -> tensor<1x1xf32>
	return %0 : tensor<1x1xf32>			return %0 : tensor<1x1xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> ()>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> ()>
	// CHECK-LABEL: func @drop_all_loops			// CHECK-LABEL: func @drop_all_loops
	// CHECK: linalg.tensor_reshape %{{.*}} []			// CHECK: linalg.tensor_reshape %{{.*}} []
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]
	// CHECK-SAME: iterator_types = []			// CHECK-SAME: iterator_types = []

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0) -> (0, d0)>,			affine_map<(d0) -> (0, d0)>,
	affine_map<(d0) -> (d0)>			affine_map<(d0) -> (d0)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel"],			iterator_types = ["parallel"],
	library_call = "some_external_fn"			library_call = "some_external_fn"
	}			}

	func @leading_dim_1_canonicalization(%arg0: tensor<1x5xf32>) -> tensor<5xf32> {			func @leading_dim_1_canonicalization(%arg0: tensor<1x5xf32>) -> tensor<5xf32> {
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<1x5xf32>) {
	^bb0(%arg2: f32): // no predecessors			^bb0(%arg2: f32): // no predecessors
	linalg.yield %arg2 : f32			linalg.yield %arg2 : f32
	} : tensor<1x5xf32> -> tensor<5xf32>			} -> tensor<5xf32>
	return %0 : tensor<5xf32>			return %0 : tensor<5xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @leading_dim_1_canonicalization			// CHECK-LABEL: func @leading_dim_1_canonicalization
	// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP0]]]			// CHECK: linalg.tensor_reshape %{{.*}} [#[[$MAP0]]]
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP1]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP1]], #[[$MAP1]]]
	// CHECK-SAME: iterator_types = ["parallel"]			// CHECK-SAME: iterator_types = ["parallel"]

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0, d1) -> (0, d1)>,			affine_map<(d0, d1) -> (0, d1)>,
	affine_map<(d0, d1) -> (d0, 0)>,			affine_map<(d0, d1) -> (d0, 0)>,
	affine_map<(d0, d1) -> (d0, d1)>			affine_map<(d0, d1) -> (d0, d1)>
	]			]

	#trait = {			#trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	library_call = "some_external_fn"			library_call = "some_external_fn"
	}			}

	func @broadcast_test(%arg0 : tensor<5xf32>, %arg1 : tensor<5xf32>) -> tensor<5x5xf32>			func @broadcast_test(%arg0 : tensor<5xf32>, %arg1 : tensor<5xf32>) -> tensor<5x5xf32>
	{			{
	%0 = linalg.tensor_reshape %arg0 [affine_map<(d0, d1) -> (d0, d1)>] :			%0 = linalg.tensor_reshape %arg0 [affine_map<(d0, d1) -> (d0, d1)>] :
	tensor<5xf32> into tensor<1x5xf32>			tensor<5xf32> into tensor<1x5xf32>
	%1 = linalg.tensor_reshape %arg1 [affine_map<(d0, d1) -> (d0, d1)>] :			%1 = linalg.tensor_reshape %arg1 [affine_map<(d0, d1) -> (d0, d1)>] :
	tensor<5xf32> into tensor<5x1xf32>			tensor<5xf32> into tensor<5x1xf32>
	%2 = linalg.generic #trait %0, %1 {			%2 = linalg.generic #trait
				ins(%0, %1 : tensor<1x5xf32>, tensor<5x1xf32>) {
	^bb0(%arg2: f32, %arg3: f32):			^bb0(%arg2: f32, %arg3: f32):
	%3 = addf %arg2, %arg3 : f32			%3 = addf %arg2, %arg3 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	} : tensor<1x5xf32>, tensor<5x1xf32> -> tensor<5x5xf32>			} -> tensor<5x5xf32>
	return %2 : tensor<5x5xf32>			return %2 : tensor<5x5xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d1)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0)>
	// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @broadcast_test			// CHECK-LABEL: func @broadcast_test
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP2]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP2]]]
	// CHECK-SAME: iterator_types = ["parallel", "parallel"]			// CHECK-SAME: iterator_types = ["parallel", "parallel"]
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0, d1) -> (0, 0)>,			affine_map<(d0, d1) -> (0, 0)>,
	affine_map<(d0, d1) -> (d0, d1)>			affine_map<(d0, d1) -> (d0, d1)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	library_call = "some_external_fn"			library_call = "some_external_fn"
	}			}

	func @broadcast_scalar(%arg0 : tensor<1x1xf32>) -> tensor<?x?xf32>			func @broadcast_scalar(%arg0 : tensor<1x1xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<1x1xf32>) {
	^bb0(%arg1 : f32):			^bb0(%arg1 : f32):
	linalg.yield %arg1 : f32			linalg.yield %arg1 : f32
	} : tensor<1x1xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	return %0 : tensor<?x?xf32>			return %0 : tensor<?x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> ()>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> ()>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @broadcast_scalar			// CHECK-LABEL: func @broadcast_scalar
	// CHECK-SAME: %[[ARG0:.*]]: tensor<1x1xf32>			// CHECK-SAME: %[[ARG0:.*]]: tensor<1x1xf32>
	// CHECK: %[[A:.*]] = linalg.tensor_reshape %[[ARG0]] []			// CHECK: %[[A:.*]] = linalg.tensor_reshape %[[ARG0]] []
	// CHECK-SAME: tensor<1x1xf32> into tensor<f32>			// CHECK-SAME: tensor<1x1xf32> into tensor<f32>
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]
	// CHECK-SAME: iterator_types = ["parallel", "parallel"]			// CHECK-SAME: iterator_types = ["parallel", "parallel"]
	// CHECK-SAME: %[[A]]			// CHECK-SAME: %[[A]]

mlir/test/Dialect/Linalg/fold-unit-trip-loops.mlir

	// RUN: mlir-opt %s -linalg-fold-unit-extent-dims="fold-one-trip-loops-only" -split-input-file | FileCheck %s			// RUN: mlir-opt %s -linalg-fold-unit-extent-dims="fold-one-trip-loops-only" -split-input-file | FileCheck %s

	#accesses = [			#accesses = [
	affine_map<(i, j, k, l, m) -> (i, k, m)>,			affine_map<(i, j, k, l, m) -> (i, k, m)>,
	affine_map<(i, j, k, l, m) -> (i, k, j, l, m)>			affine_map<(i, j, k, l, m) -> (i, k, j, l, m)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"],
	indexing_maps = #accesses,			indexing_maps = #accesses,
	library_call = "some_external_func"			library_call = "some_external_func"
	}			}

	func @drop_one_trip_loops(%arg0 : tensor<?x1x?xf32>) -> tensor<?x1x?x1x?xf32>			func @drop_one_trip_loops(%arg0 : tensor<?x1x?xf32>) -> tensor<?x1x?x1x?xf32>
	{			{
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<?x1x?xf32>) {
	^bb0(%arg1 : f32) :			^bb0(%arg1 : f32) :
	linalg.yield %arg1 : f32			linalg.yield %arg1 : f32
	} : tensor<?x1x?xf32> -> tensor<?x1x?x1x?xf32>			} -> tensor<?x1x?x1x?xf32>
	return %0 : tensor<?x1x?x1x?xf32>			return %0 : tensor<?x1x?x1x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, 0, d2)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, 0, d2)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2) -> (d0, 0, d1, 0, d2)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2) -> (d0, 0, d1, 0, d2)>
	// CHECK-LABEL: func @drop_one_trip_loops			// CHECK-LABEL: func @drop_one_trip_loops
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]
	// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"]			// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"]

	// -----			// -----

	#map0 = affine_map<(i, j) -> (i, j)>			#map0 = affine_map<(i, j) -> (i, j)>
	#access = [#map0, #map0]			#access = [#map0, #map0]
	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	indexing_maps = #access,			indexing_maps = #access,
	library_call = "some_external_func"			library_call = "some_external_func"
	}			}

	func @drop_all_loops(%arg0 : tensor<1x1xf32>) -> tensor<1x1xf32>			func @drop_all_loops(%arg0 : tensor<1x1xf32>) -> tensor<1x1xf32>
	{			{
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<1x1xf32>) {
	^bb0(%arg1: f32) :			^bb0(%arg1: f32) :
	linalg.yield %arg1 : f32			linalg.yield %arg1 : f32
	} : tensor<1x1xf32> -> tensor<1x1xf32>			} -> tensor<1x1xf32>
	return %0 : tensor<1x1xf32>			return %0 : tensor<1x1xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> (0, 0)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> (0, 0)>
	// CHECK-LABEL: func @drop_all_loops			// CHECK-LABEL: func @drop_all_loops
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]
	// CHECK-SAME: iterator_types = []			// CHECK-SAME: iterator_types = []

	// -----			// -----

	#map0 = affine_map<(i, j) -> (i, j)>			#map0 = affine_map<(i, j) -> (i, j)>
	#access = [#map0, #map0]			#access = [#map0, #map0]
	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	indexing_maps = #access,			indexing_maps = #access,
	library_call = "some_external_func"			library_call = "some_external_func"
	}			}

	func @drop_all_loops(%arg0 : memref<1x1xf32>, %arg1 : memref<1x1xf32>)			func @drop_all_loops(%arg0 : memref<1x1xf32>, %arg1 : memref<1x1xf32>)
	{			{
	linalg.generic #trait %arg0, %arg1 {			linalg.generic #trait
				ins(%arg0 : memref<1x1xf32>)
				outs(%arg1 : memref<1x1xf32>) {
	^bb0(%arg2: f32, %arg3 : f32) :			^bb0(%arg2: f32, %arg3 : f32) :
	linalg.yield %arg2 : f32			linalg.yield %arg2 : f32
	} : memref<1x1xf32>, memref<1x1xf32>			}
	return			return
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> (0, 0)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<() -> (0, 0)>
	// CHECK-LABEL: func @drop_all_loops			// CHECK-LABEL: func @drop_all_loops
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]
	// CHECK-SAME: iterator_types = []			// CHECK-SAME: iterator_types = []

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0, d1) -> (d0, d1)>,			affine_map<(d0, d1) -> (d0, d1)>,
	affine_map<(d0, d1) -> (d1)>			affine_map<(d0, d1) -> (d1)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	library_call = "some_external_fn"			library_call = "some_external_fn"
	}			}

	func @leading_dim_1_canonicalization(%arg0: tensor<1x5xf32>) -> tensor<5xf32> {			func @leading_dim_1_canonicalization(%arg0: tensor<1x5xf32>) -> tensor<5xf32> {
	%0 = linalg.generic #trait %arg0 {			%0 = linalg.generic #trait
				ins(%arg0 : tensor<1x5xf32>) {
	^bb0(%arg2: f32): // no predecessors			^bb0(%arg2: f32): // no predecessors
	linalg.yield %arg2 : f32			linalg.yield %arg2 : f32
	} : tensor<1x5xf32> -> tensor<5xf32>			} -> tensor<5xf32>
	return %0 : tensor<5xf32>			return %0 : tensor<5xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0) -> (0, d0)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0) -> (0, d0)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0) -> (d0)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0) -> (d0)>
	// CHECK-LABEL: func @leading_dim_1_canonicalization			// CHECK-LABEL: func @leading_dim_1_canonicalization
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]
	// CHECK-SAME: iterator_types = ["parallel"]			// CHECK-SAME: iterator_types = ["parallel"]

mlir/test/Dialect/Linalg/fusion-tensor.mlir

	// RUN: mlir-opt %s -linalg-fusion-for-tensor-ops -split-input-file | FileCheck %s			// RUN: mlir-opt %s -linalg-fusion-for-tensor-ops -split-input-file | FileCheck %s

	// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>

	// CHECK-LABEL: @add_mul_fusion			// CHECK-LABEL: @add_mul_fusion
	func @add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = {{\[}}[[$MAP0]], [[$MAP0]], [[$MAP0]], [[$MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[$MAP0]], [[$MAP0]], [[$MAP0]], [[$MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%0, %arg2 : tensor<?x?xf32>, tensor<?x?xf32>) {
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: [[ARG0:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG0:%[a-zA-Z0-9_]*]]
	// CHECK-SAME: [[ARG1:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG1:%[a-zA-Z0-9_]*]]
	// CHECK-SAME: [[ARG2:%[a-zA-Z0-9_]*]]			// CHECK-SAME: [[ARG2:%[a-zA-Z0-9_]*]]
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg5: f32, %arg6: f32): // no predecessors
	// CHECK: [[T1:%[a-zA-Z0-9_]*]] = addf [[ARG0]], [[ARG1]]			// CHECK: [[T1:%[a-zA-Z0-9_]*]] = addf [[ARG0]], [[ARG1]]
	// CHECK-NOT: linalg.yield			// CHECK-NOT: linalg.yield
	// CHECK: mulf [[T1]], [[ARG2]]			// CHECK: mulf [[T1]], [[ARG2]]
	// CHECK: linalg.yield			// CHECK: linalg.yield
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg5, %arg6 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>			// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d1, d0)>			#map1 = affine_map<(d0, d1) -> (d1, d0)>

	// CHECK-LABEL: @transpose_add_mul_fusion			// CHECK-LABEL: @transpose_add_mul_fusion
	func @transpose_add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @transpose_add_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = {{\[}}[[$MAP0]], [[$MAP1]], [[$MAP0]], [[$MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[$MAP0]], [[$MAP1]], [[$MAP0]], [[$MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%0, %arg2 : tensor<?x?xf32>, tensor<?x?xf32>) {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg5: f32, %arg6: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg5, %arg6 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>			// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d1, d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d1, d0)>			#map1 = affine_map<(d0, d1) -> (d1, d0)>

	// CHECK-LABEL: @add_transpose_mul_fusion			// CHECK-LABEL: @add_transpose_mul_fusion
	func @add_transpose_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_transpose_mul_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]} %arg0, %arg1 {			%0 = linalg.generic {indexing_maps = [#map0, #map1, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = {{\[}}[[$MAP1]], [[$MAP0]], [[$MAP0]], [[$MAP0]]{{\]}}			// CHECK-SAME: indexing_maps = {{\[}}[[$MAP1]], [[$MAP0]], [[$MAP0]], [[$MAP0]]{{\]}}
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%0, %arg2 : tensor<?x?xf32>, tensor<?x?xf32>) {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg5: f32, %arg6: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg5, %arg6 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0)>			// CHECK-DAG: [[$MAP1:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0)>
	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1) -> (d0)>			#map1 = affine_map<(d0, d1) -> (d0)>
	#map2 = affine_map<(d0) -> (d0)>			#map2 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: @add_broadcast_mul_fusion			// CHECK-LABEL: @add_broadcast_mul_fusion
	func @add_broadcast_mul_fusion(%arg0: tensor<?xf32>, %arg1 : tensor<?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>			func @add_broadcast_mul_fusion(%arg0: tensor<?xf32>, %arg1 : tensor<?xf32>, %arg2 : tensor<?x?xf32>) -> tensor<?x?xf32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map2, #map2, #map2], iterator_types = ["parallel"]} %arg0, %arg1 {			%0 = linalg.generic {indexing_maps = [#map2, #map2, #map2], iterator_types = ["parallel"]}
				ins(%arg0, %arg1 : tensor<?xf32>, tensor<?xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>			} -> tensor<?xf32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = {{\[}}[[$MAP1]], [[$MAP1]], [[$MAP0]], [[$MAP0]]			// CHECK-SAME: indexing_maps = {{\[}}[[$MAP1]], [[$MAP1]], [[$MAP0]], [[$MAP0]]
	%2 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]} %0, %arg2 {			%2 = linalg.generic {indexing_maps = [#map1, #map0, #map0], iterator_types = ["parallel", "parallel"]}
				ins(%0, %arg2 : tensor<?xf32>, tensor<?x?xf32>) {
	^bb0(%arg5: f32, %arg6: f32): // no predecessors			^bb0(%arg5: f32, %arg6: f32): // no predecessors
	%3 = mulf %arg5, %arg6 : f32			%3 = mulf %arg5, %arg6 : f32
	linalg.yield %3 : f32			linalg.yield %3 : f32
	}: tensor<?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>			} -> tensor<?x?xf32>
	return %2 : tensor<?x?xf32>			return %2 : tensor<?x?xf32>
	}			}

	// -----			// -----

	// CHECK: #[[$MAP0:.*]] = affine_map<() -> ()>			// CHECK: #[[$MAP0:.*]] = affine_map<() -> ()>
	#map0 = affine_map<() -> ()>			#map0 = affine_map<() -> ()>

	// CHECK-LABEL: @add_mul_scalar_fusion			// CHECK-LABEL: @add_mul_scalar_fusion
	func @add_mul_scalar_fusion(%arg0: tensor<f32>, %arg1: tensor<f32>, %arg2: tensor<f32>) -> tensor<f32>			func @add_mul_scalar_fusion(%arg0: tensor<f32>, %arg1: tensor<f32>, %arg2: tensor<f32>) -> tensor<f32>
	{			{
	%0 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = []} %arg0, %arg1 {			%0 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = []}
				ins(%arg0, %arg1 : tensor<f32>, tensor<f32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = addf %arg3, %arg4 : f32			%1 = addf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<f32>, tensor<f32> -> tensor<f32>			} -> tensor<f32>
	// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64			// CHECK: linalg.generic {
	// CHECK: addf			// CHECK: addf
	// CHECK: mulf			// CHECK: mulf
	%1 = linalg.generic {args_in = 2 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = []} %0, %arg2 {			%1 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = []}
				ins(%0, %arg2 : tensor<f32>, tensor<f32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = mulf %arg3, %arg4 : f32			%1 = mulf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<f32>, tensor<f32> -> tensor<f32>			} -> tensor<f32>

	return %1 : tensor<f32>			return %1 : tensor<f32>
	}			}

	// -----			// -----

	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 4 + d2, d3)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 4 + d2, d3)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>

	#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	func @generic_op_reshape_producer_fusion(%arg0 : tensor<?x?x?xf32>,			func @generic_op_reshape_producer_fusion(%arg0 : tensor<?x?x?xf32>,
	%arg1 : tensor<?x?x4x?xf32>) ->			%arg1 : tensor<?x?x4x?xf32>) ->
	tensor<?x?x4x?xf32>			tensor<?x?x4x?xf32>
	{			{
	%0 = linalg.tensor_reshape %arg0 [affine_map<(i, j, k, l) -> (i)>,			%0 = linalg.tensor_reshape %arg0 [affine_map<(i, j, k, l) -> (i)>,
	affine_map<(i, j, k, l) -> (j, k)>,			affine_map<(i, j, k, l) -> (j, k)>,
	affine_map<(i, j, k, l) -> (l)>] :			affine_map<(i, j, k, l) -> (l)>] :
	tensor<?x?x?xf32> into tensor<?x?x4x?xf32>			tensor<?x?x?xf32> into tensor<?x?x4x?xf32>
	%1 = linalg.generic			%1 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel", "parallel"]}
	%0, %arg1 {			ins(%0, %arg1 : tensor<?x?x4x?xf32>, tensor<?x?x4x?xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = mulf %arg3, %arg4 : f32			%1 = mulf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?x4x?xf32>, tensor<?x?x4x?xf32> -> tensor<?x?x4x?xf32>			} -> tensor<?x?x4x?xf32>
	return %1 : tensor<?x?x4x?xf32>			return %1 : tensor<?x?x4x?xf32>
	}			}

	// CHECK-LABEL: func @generic_op_reshape_producer_fusion			// CHECK-LABEL: func @generic_op_reshape_producer_fusion
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: args_in = 2
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP1]]]
	// CHECK-NOT: linalg.generic			// CHECK-NOT: linalg.generic


	// -----			// -----

	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 20 + d2 * 5 + d3)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 20 + d2 * 5 + d3)>

	#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	func @generic_op_reshape_consumer_fusion(%arg0 : tensor<?x?x4x5xf32>,			func @generic_op_reshape_consumer_fusion(%arg0 : tensor<?x?x4x5xf32>,
	%arg1 : tensor<?x?x4x5xf32>) ->			%arg1 : tensor<?x?x4x5xf32>) ->
	tensor<?x?xf32>			tensor<?x?xf32>
	{			{
	%0 = linalg.generic			%0 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel", "parallel"]}
	%arg0, %arg1 {			ins(%arg0, %arg1 : tensor<?x?x4x5xf32>, tensor<?x?x4x5xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = mulf %arg3, %arg4 : f32			%1 = mulf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?x4x5xf32>, tensor<?x?x4x5xf32> -> tensor<?x?x4x5xf32>			} -> tensor<?x?x4x5xf32>
	%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,			%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,
	affine_map<(i, j, k, l) -> (j, k, l)>] :			affine_map<(i, j, k, l) -> (j, k, l)>] :
	tensor<?x?x4x5xf32> into tensor<?x?xf32>			tensor<?x?x4x5xf32> into tensor<?x?xf32>
	return %1 : tensor<?x?xf32>			return %1 : tensor<?x?xf32>
	}			}

	// CHECK-LABEL: func @generic_op_reshape_consumer_fusion			// CHECK-LABEL: func @generic_op_reshape_consumer_fusion
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: args_in = 2
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP1]]]
	// CHECK-NOT: linalg.generic			// CHECK-NOT: linalg.generic

	// -----			// -----

	#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	func @generic_op_reshape_consumer_nofusion(%arg0 : tensor<?x?x?x5xf32>,			func @generic_op_reshape_consumer_nofusion(%arg0 : tensor<?x?x?x5xf32>,
	%arg1 : tensor<?x?x?x5xf32>) ->			%arg1 : tensor<?x?x?x5xf32>) ->
	tensor<?x?xf32>			tensor<?x?xf32>
	{			{
	%0 = linalg.generic			%0 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel", "parallel"]}
	%arg0, %arg1 {			ins(%arg0, %arg1 : tensor<?x?x?x5xf32>, tensor<?x?x?x5xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%1 = mulf %arg3, %arg4 : f32			%1 = mulf %arg3, %arg4 : f32
	linalg.yield %1 : f32			linalg.yield %1 : f32
	}: tensor<?x?x?x5xf32>, tensor<?x?x?x5xf32> -> tensor<?x?x?x5xf32>			} -> tensor<?x?x?x5xf32>
	%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,			%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,
	affine_map<(i, j, k, l) -> (j, k, l)>] :			affine_map<(i, j, k, l) -> (j, k, l)>] :
	tensor<?x?x?x5xf32> into tensor<?x?xf32>			tensor<?x?x?x5xf32> into tensor<?x?xf32>
	return %1 : tensor<?x?xf32>			return %1 : tensor<?x?xf32>
	}			}

	// CHECK-LABEL: func @generic_op_reshape_consumer_nofusion			// CHECK-LABEL: func @generic_op_reshape_consumer_nofusion
	// CHECK: linalg.tensor_reshape			// CHECK: linalg.tensor_reshape

	// -----			// -----

	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	#map1 = affine_map<(d0, d1, d2) -> (d0, d1)>			#map1 = affine_map<(d0, d1, d2) -> (d0, d1)>
	#map2 = affine_map<(d0, d1, d2) -> (d2)>			#map2 = affine_map<(d0, d1, d2) -> (d2)>

	func @generic_op_reshape_consumer_expanding(%arg0: tensor<264x4xf32>)			func @generic_op_reshape_consumer_expanding(%arg0: tensor<264x4xf32>)
	-> tensor<8x33x4xf32> {			-> tensor<8x33x4xf32> {
	%cst = constant dense<2.000000e+00> : tensor<264x4xf32>			%cst = constant dense<2.000000e+00> : tensor<264x4xf32>
	%0 = linalg.generic			%0 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel"]}			iterator_types = ["parallel", "parallel"]}
	%arg0, %cst {			ins(%arg0, %cst : tensor<264x4xf32>, tensor<264x4xf32>) {
	^bb0(%arg1: f32, %arg2: f32): // no predecessors			^bb0(%arg1: f32, %arg2: f32): // no predecessors
	%2 = mulf %arg1, %arg2 : f32			%2 = mulf %arg1, %arg2 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: tensor<264x4xf32>, tensor<264x4xf32> -> tensor<264x4xf32>			} -> tensor<264x4xf32>
	%1 = linalg.tensor_reshape %0 [#map1, #map2] :			%1 = linalg.tensor_reshape %0 [#map1, #map2] :
	tensor<264x4xf32> into tensor<8x33x4xf32>			tensor<264x4xf32> into tensor<8x33x4xf32>
	return %1 : tensor<8x33x4xf32>			return %1 : tensor<8x33x4xf32>
	}			}

	// The reshape op in `%arg0` is folded into the indexing map of generic op.			// The reshape op in `%arg0` is folded into the indexing map of generic op.
	// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0, d1, d2) -> (d0 * 33 + d1, d2)>			// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0, d1, d2) -> (d0 * 33 + d1, d2)>
	// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK: func @generic_op_reshape_consumer_expanding			// CHECK: func @generic_op_reshape_consumer_expanding
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape
	// CHECK: %[[CST:.*]] = constant {{.*}} : f32			// CHECK: %[[CST:.*]] = constant {{.*}} : f32
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: indexing_maps = [#[[MAP0]], #[[MAP1]]]			// CHECK-SAME: indexing_maps = [#[[MAP0]], #[[MAP1]]]
	// CHECK: tensor<264x4xf32> -> tensor<8x33x4xf32>			// CHECK-SAME: tensor<264x4xf32>
				// CHECK: -> tensor<8x33x4xf32>
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape

	// -----			// -----

	#map0 = affine_map<(d0, d1, d2) -> (d0)>			#map0 = affine_map<(d0, d1, d2) -> (d0)>
	#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	func @generic_op_constant_fusion(%arg0 : tensor<5x?x?xf32>) -> tensor<5x?x?xf32>			func @generic_op_constant_fusion(%arg0 : tensor<5x?x?xf32>) -> tensor<5x?x?xf32>
	{			{
	%0 = constant dense<42.0> : tensor<5xf32>			%0 = constant dense<42.0> : tensor<5xf32>
	%1 = linalg.generic			%1 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map1, #map1],			indexing_maps = [#map0, #map1, #map1],
	iterator_types = ["parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel"]}
	%0, %arg0 {			ins(%0, %arg0 : tensor<5xf32>, tensor<5x?x?xf32>) {
	^bb0(%arg1: f32, %arg2: f32):			^bb0(%arg1: f32, %arg2: f32):
	%2 = mulf %arg1, %arg2 : f32			%2 = mulf %arg1, %arg2 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: tensor<5xf32>, tensor<5x?x?xf32> -> tensor<5x?x?xf32>			} -> tensor<5x?x?xf32>
	return %1 : tensor<5x?x?xf32>			return %1 : tensor<5x?x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK-LABEL: func @generic_op_constant_fusion			// CHECK-LABEL: func @generic_op_constant_fusion
	// CHECK: %[[CST:.*]] = constant {{.*}} : f32			// CHECK: %[[CST:.*]] = constant {{.*}} : f32
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: args_in = 1 : i64
	// CHECK-SAME: args_out = 1 : i64
	// CHECK: ^{{.*}}(%[[ARG1:.*]]: f32)			// CHECK: ^{{.*}}(%[[ARG1:.*]]: f32)
	// CHECK: mulf %[[CST]], %[[ARG1]]			// CHECK: mulf %[[CST]], %[[ARG1]]

	// -----			// -----

	#map0 = affine_map<(d0, d1, d2) -> (d0)>			#map0 = affine_map<(d0, d1, d2) -> (d0)>
	#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	func @indexed_generic_op_constant_fusion(%arg0 : tensor<5x?x?xf32>)			func @indexed_generic_op_constant_fusion(%arg0 : tensor<5x?x?xf32>)
	-> tensor<5x?x?xf32>			-> tensor<5x?x?xf32>
	{			{
	%0 = constant dense<42.0> : tensor<5xf32>			%0 = constant dense<42.0> : tensor<5xf32>
	%1 = linalg.indexed_generic			%1 = linalg.indexed_generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map1, #map1],			indexing_maps = [#map0, #map1, #map1],
	iterator_types = ["parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel"]}
	%0, %arg0 {			ins(%0, %arg0 : tensor<5xf32>, tensor<5x?x?xf32>) {
	^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: f32, %arg5 : f32):			^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: f32, %arg5 : f32):
	%2 = mulf %arg4, %arg5 : f32			%2 = mulf %arg4, %arg5 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: tensor<5xf32>, tensor<5x?x?xf32> -> tensor<5x?x?xf32>			} -> tensor<5x?x?xf32>
	return %1 : tensor<5x?x?xf32>			return %1 : tensor<5x?x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK-LABEL: func @indexed_generic_op_constant_fusion			// CHECK-LABEL: func @indexed_generic_op_constant_fusion
	// CHECK: %[[CST:.*]] = constant {{.*}} : f32			// CHECK: %[[CST:.*]] = constant {{.*}} : f32
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 1 : i64
	// CHECK-SAME: args_out = 1 : i64
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG3:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG3:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG4:.*]]: f32)			// CHECK-SAME: %[[ARG4:.*]]: f32)
	// CHECK: mulf %[[CST]], %[[ARG4]]			// CHECK: mulf %[[CST]], %[[ARG4]]

	// -----			// -----

	#map0 = affine_map<(d0, d1, d2) -> ()>			#map0 = affine_map<(d0, d1, d2) -> ()>
	#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	func @generic_op_zero_dim_constant_fusion(%arg0 : tensor<5x?x?xf32>)			func @generic_op_zero_dim_constant_fusion(%arg0 : tensor<5x?x?xf32>)
	-> tensor<5x?x?xf32>			-> tensor<5x?x?xf32>
	{			{
	%0 = constant dense<42.0> : tensor<f32>			%0 = constant dense<42.0> : tensor<f32>
	%1 = linalg.generic			%1 = linalg.generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map1, #map1],			indexing_maps = [#map0, #map1, #map1],
	iterator_types = ["parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel"]}
	%0, %arg0 {			ins(%0, %arg0 : tensor<f32>, tensor<5x?x?xf32>) {
	^bb0(%arg1: f32, %arg2: f32):			^bb0(%arg1: f32, %arg2: f32):
	%2 = mulf %arg1, %arg2 : f32			%2 = mulf %arg1, %arg2 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: tensor<f32>, tensor<5x?x?xf32> -> tensor<5x?x?xf32>			} -> tensor<5x?x?xf32>
	return %1 : tensor<5x?x?xf32>			return %1 : tensor<5x?x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK-LABEL: func @generic_op_zero_dim_constant_fusion			// CHECK-LABEL: func @generic_op_zero_dim_constant_fusion
	// CHECK: %[[CST:.*]] = constant {{.*}} : f32			// CHECK: %[[CST:.*]] = constant {{.*}} : f32
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: args_in = 1 : i64
	// CHECK-SAME: args_out = 1 : i64
	// CHECK: ^{{.*}}(%[[ARG1:.*]]: f32)			// CHECK: ^{{.*}}(%[[ARG1:.*]]: f32)
	// CHECK: mulf %[[CST]], %[[ARG1]]			// CHECK: mulf %[[CST]], %[[ARG1]]

	// -----			// -----

	#map0 = affine_map<(d0, d1, d2) -> ()>			#map0 = affine_map<(d0, d1, d2) -> ()>
	#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			#map1 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	func @indexed_generic_op_zero_dim_constant_fusion			func @indexed_generic_op_zero_dim_constant_fusion
	(%arg0 : tensor<5x?x?xf32>) -> tensor<5x?x?xf32>			(%arg0 : tensor<5x?x?xf32>) -> tensor<5x?x?xf32>
	{			{
	%0 = constant dense<42.0> : tensor<f32>			%0 = constant dense<42.0> : tensor<f32>
	%1 = linalg.indexed_generic			%1 = linalg.indexed_generic {
	{args_in = 2 : i64, args_out = 1 : i64,
	indexing_maps = [#map0, #map1, #map1],			indexing_maps = [#map0, #map1, #map1],
	iterator_types = ["parallel", "parallel", "parallel"]}			iterator_types = ["parallel", "parallel", "parallel"]}
	%0, %arg0 {			ins(%0, %arg0 : tensor<f32>, tensor<5x?x?xf32>) {
	^bb0(%arg1 : index, %arg2 : index, %arg3 : index, %arg4: f32, %arg5: f32):			^bb0(%arg1 : index, %arg2 : index, %arg3 : index, %arg4: f32, %arg5: f32):
	%2 = mulf %arg4, %arg5 : f32			%2 = mulf %arg4, %arg5 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: tensor<f32>, tensor<5x?x?xf32> -> tensor<5x?x?xf32>			} -> tensor<5x?x?xf32>
	return %1 : tensor<5x?x?xf32>			return %1 : tensor<5x?x?xf32>
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
	// CHECK-LABEL: func @indexed_generic_op_zero_dim_constant_fusion			// CHECK-LABEL: func @indexed_generic_op_zero_dim_constant_fusion
	// CHECK: %[[CST:.*]] = constant {{.*}} : f32			// CHECK: %[[CST:.*]] = constant {{.*}} : f32
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 1 : i64
	// CHECK-SAME: args_out = 1 : i64
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG3:[a-zA-Z0-9]*]]: index			// CHECK-SAME: %[[ARG3:[a-zA-Z0-9]*]]: index
	// CHECK-SAME: %[[ARG4:.*]]: f32)			// CHECK-SAME: %[[ARG4:.*]]: f32)
	// CHECK: mulf %[[CST]], %[[ARG4]]			// CHECK: mulf %[[CST]], %[[ARG4]]

	// -----			// -----

	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	func @generic_op_indexed_generic_op_fusion(%arg0: tensor<?x?xi32>,			func @generic_op_indexed_generic_op_fusion(%arg0: tensor<?x?xi32>,
	%arg1: tensor<?x?xi32>) {			%arg1: tensor<?x?xi32>) {
	%0 = linalg.generic {			%0 = linalg.generic {
	args_in = 2 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel"] } %arg0, %arg1 {			iterator_types = ["parallel", "parallel"] }
				ins(%arg0, %arg1 : tensor<?x?xi32>, tensor<?x?xi32>) {
	^bb0(%arg2: i32, %arg3: i32): // no predecessors			^bb0(%arg2: i32, %arg3: i32): // no predecessors
	%10 = addi %arg2, %arg3 : i32			%10 = addi %arg2, %arg3 : i32
	linalg.yield %10 : i32			linalg.yield %10 : i32
	} : tensor<?x?xi32>, tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	%1 = linalg.indexed_generic {			%1 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel", "parallel"] } %0 {			iterator_types = ["parallel", "parallel"] }
				ins(%0 : tensor<?x?xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = index_cast %arg3 : index to i32			%3 = index_cast %arg3 : index to i32
	%4 = addi %arg4, %2 : i32			%4 = addi %arg4, %2 : i32
	%5 = subi %4, %3 : i32			%5 = subi %4, %3 : i32
	linalg.yield %5 : i32			linalg.yield %5 : i32
	}: tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	return			return
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @generic_op_indexed_generic_op_fusion			// CHECK-LABEL: func @generic_op_indexed_generic_op_fusion
	// CHECK-NOT: linalg.generic			// CHECK-NOT: linalg.generic
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 2
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP0]]]
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32			// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32
	// CHECK-SAME: %[[ARG3:[a-zA-Z0-9_]*]]: i32			// CHECK-SAME: %[[ARG3:[a-zA-Z0-9_]*]]: i32
	// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ARG3]] : i32			// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ARG3]] : i32
	// CHECK: %[[ADD_OPERAND:.+]] = index_cast %[[ARG0]] : index to i32			// CHECK: %[[ADD_OPERAND:.+]] = index_cast %[[ARG0]] : index to i32
	// CHECK: %[[SUB_OPERAND:.+]] = index_cast %[[ARG1]] : index to i32			// CHECK: %[[SUB_OPERAND:.+]] = index_cast %[[ARG1]] : index to i32
	// CHECK: %[[VAL2:.+]] = addi %[[VAL1]], %[[ADD_OPERAND]] : i32			// CHECK: %[[VAL2:.+]] = addi %[[VAL1]], %[[ADD_OPERAND]] : i32
	// CHECK: %[[VAL3:.+]] = subi %[[VAL2]], %[[SUB_OPERAND]] : i32			// CHECK: %[[VAL3:.+]] = subi %[[VAL2]], %[[SUB_OPERAND]] : i32
	// CHECK: linalg.yield %[[VAL3]] : i32			// CHECK: linalg.yield %[[VAL3]] : i32

	// -----			// -----

	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	func @indexed_generic_op_generic_op_fusion(%arg0: tensor<?x?xi32>,			func @indexed_generic_op_generic_op_fusion(%arg0: tensor<?x?xi32>,
	%arg1: tensor<?x?xi32>) {			%arg1: tensor<?x?xi32>) {
	%0 = linalg.indexed_generic {			%0 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel", "parallel"] } %arg0 {			iterator_types = ["parallel", "parallel"] }
				ins(%arg0 : tensor<?x?xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = index_cast %arg3 : index to i32			%3 = index_cast %arg3 : index to i32
	%4 = addi %arg4, %2 : i32			%4 = addi %arg4, %2 : i32
	%5 = subi %4, %3 : i32			%5 = subi %4, %3 : i32
	linalg.yield %5 : i32			linalg.yield %5 : i32
	}: tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	%1 = linalg.generic {			%1 = linalg.generic {
	args_in = 2 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel"] } %0, %arg1 {			iterator_types = ["parallel", "parallel"] }
				ins(%0, %arg1 : tensor<?x?xi32>, tensor<?x?xi32>) {
	^bb0(%arg2: i32, %arg3: i32): // no predecessors			^bb0(%arg2: i32, %arg3: i32): // no predecessors
	%10 = addi %arg2, %arg3 : i32			%10 = addi %arg2, %arg3 : i32
	linalg.yield %10 : i32			linalg.yield %10 : i32
	} : tensor<?x?xi32>, tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	return			return
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @indexed_generic_op_generic_op_fusion			// CHECK-LABEL: func @indexed_generic_op_generic_op_fusion
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 2
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]], #[[$MAP0]]]
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32			// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32
	// CHECK-SAME: %[[ARG3:[a-zA-Z0-9_]*]]: i32			// CHECK-SAME: %[[ARG3:[a-zA-Z0-9_]*]]: i32
	// CHECK: %[[ADD_OPERAND:.+]] = index_cast %[[ARG0]] : index to i32			// CHECK: %[[ADD_OPERAND:.+]] = index_cast %[[ARG0]] : index to i32
	// CHECK: %[[SUB_OPERAND:.+]] = index_cast %[[ARG1]] : index to i32			// CHECK: %[[SUB_OPERAND:.+]] = index_cast %[[ARG1]] : index to i32
	// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ADD_OPERAND]] : i32			// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ADD_OPERAND]] : i32
	// CHECK: %[[VAL2:.+]] = subi %[[VAL1]], %[[SUB_OPERAND]] : i32			// CHECK: %[[VAL2:.+]] = subi %[[VAL1]], %[[SUB_OPERAND]] : i32
	// CHECK: %[[VAL3:.+]] = addi %[[VAL2]], %[[ARG3]] : i32			// CHECK: %[[VAL3:.+]] = addi %[[VAL2]], %[[ARG3]] : i32
	// CHECK: linalg.yield %[[VAL3]] : i32			// CHECK: linalg.yield %[[VAL3]] : i32
	// CHECK-NOT: linalg.generic			// CHECK-NOT: linalg.generic

	// -----			// -----

	// The indices of the first indexed_generic op are swapped after fusion.			// The indices of the first indexed_generic op are swapped after fusion.
	#map0 = affine_map<(d0, d1) -> (d1, d0)>			#map0 = affine_map<(d0, d1) -> (d1, d0)>
	#map1 = affine_map<(d0, d1) -> (d0, d1)>			#map1 = affine_map<(d0, d1) -> (d0, d1)>
	func @indexed_generic_op_fusion(%arg0: tensor<?x?xi32>) {			func @indexed_generic_op_fusion(%arg0: tensor<?x?xi32>) {
	%0 = linalg.indexed_generic {			%0 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel", "parallel"] } %arg0 {			iterator_types = ["parallel", "parallel"] }
				ins(%arg0 : tensor<?x?xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = index_cast %arg3 : index to i32			%3 = index_cast %arg3 : index to i32
	%4 = addi %arg4, %2 : i32			%4 = addi %arg4, %2 : i32
	%5 = subi %4, %3 : i32			%5 = subi %4, %3 : i32
	linalg.yield %5 : i32			linalg.yield %5 : i32
	}: tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	%1 = linalg.indexed_generic {			%1 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map1, #map1],			indexing_maps = [#map1, #map1],
	iterator_types = ["parallel", "parallel"] } %0 {			iterator_types = ["parallel", "parallel"] }
				ins(%0 : tensor<?x?xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = index_cast %arg3 : index to i32			%3 = index_cast %arg3 : index to i32
	%4 = addi %arg4, %2 : i32			%4 = addi %arg4, %2 : i32
	%5 = subi %4, %3 : i32			%5 = subi %4, %3 : i32
	linalg.yield %5 : i32			linalg.yield %5 : i32
	}: tensor<?x?xi32> -> tensor<?x?xi32>			} -> tensor<?x?xi32>
	return			return
	}			}
	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0, d1)>
	// CHECK-LABEL: func @indexed_generic_op_fusion			// CHECK-LABEL: func @indexed_generic_op_fusion
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 1
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP0]]]
	// CHECK: ^{{[a-zA-Z0-9_]*}}			// CHECK: ^{{[a-zA-Z0-9_]*}}
	// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index			// CHECK-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: index
	// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32			// CHECK-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: i32
	// CHECK: %[[ADD_OPERAND1:.+]] = index_cast %[[ARG1]] : index to i32			// CHECK: %[[ADD_OPERAND1:.+]] = index_cast %[[ARG1]] : index to i32
	// CHECK: %[[SUB_OPERAND1:.+]] = index_cast %[[ARG0]] : index to i32			// CHECK: %[[SUB_OPERAND1:.+]] = index_cast %[[ARG0]] : index to i32
	// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ADD_OPERAND1]] : i32			// CHECK: %[[VAL1:.+]] = addi %[[ARG2]], %[[ADD_OPERAND1]] : i32
	#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	func @indexed_generic_op_reshape_producer_fusion(%arg0 : tensor<?x?x?xi32>)			func @indexed_generic_op_reshape_producer_fusion(%arg0 : tensor<?x?x?xi32>)
	-> tensor<?x?x4x?xi32> {			-> tensor<?x?x4x?xi32> {
	%0 = linalg.tensor_reshape %arg0 [affine_map<(i, j, k, l) -> (i)>,			%0 = linalg.tensor_reshape %arg0 [affine_map<(i, j, k, l) -> (i)>,
	affine_map<(i, j, k, l) -> (j, k)>,			affine_map<(i, j, k, l) -> (j, k)>,
	affine_map<(i, j, k, l) -> (l)>] :			affine_map<(i, j, k, l) -> (l)>] :
	tensor<?x?x?xi32> into tensor<?x?x4x?xi32>			tensor<?x?x?xi32> into tensor<?x?x4x?xi32>
	%1 = linalg.indexed_generic {			%1 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel", "parallel", "parallel", "parallel"] } %0 {			iterator_types = ["parallel", "parallel", "parallel", "parallel"] }
				ins(%0 : tensor<?x?x4x?xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: index, %arg5: index, %arg6: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: index, %arg5: index, %arg6: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = addi %arg6, %2 : i32			%3 = addi %arg6, %2 : i32
	linalg.yield %3 : i32			linalg.yield %3 : i32
	}: tensor<?x?x4x?xi32> -> tensor<?x?x4x?xi32>			} -> tensor<?x?x4x?xi32>
	return %1 : tensor<?x?x4x?xi32>			return %1 : tensor<?x?x4x?xi32>
	}			}

	// CHECK-LABEL: func @indexed_generic_op_reshape_producer_fusion			// CHECK-LABEL: func @indexed_generic_op_reshape_producer_fusion
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 1
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape

	// -----			// -----

	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 20 + d2 * 5 + d3)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d1 * 20 + d2 * 5 + d3)>

	#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>			#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
	func @indexed_generic_op_reshape_consumer_fusion(%arg0 : tensor<?x?x4x5xi32>)			func @indexed_generic_op_reshape_consumer_fusion(%arg0 : tensor<?x?x4x5xi32>)
	-> tensor<?x?xi32> {			-> tensor<?x?xi32> {
	%0 = linalg.indexed_generic {			%0 = linalg.indexed_generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel", "parallel", "parallel", "parallel"] } %arg0 {			iterator_types = ["parallel", "parallel", "parallel", "parallel"] }
				ins(%arg0 : tensor<?x?x4x5xi32>) {
	^bb0(%arg2: index, %arg3: index, %arg4: index, %arg5: index, %arg6: i32): // no predecessors			^bb0(%arg2: index, %arg3: index, %arg4: index, %arg5: index, %arg6: i32): // no predecessors
	%2 = index_cast %arg2 : index to i32			%2 = index_cast %arg2 : index to i32
	%3 = addi %arg6, %2 : i32			%3 = addi %arg6, %2 : i32
	linalg.yield %3 : i32			linalg.yield %3 : i32
	}: tensor<?x?x4x5xi32> -> tensor<?x?x4x5xi32>			} -> tensor<?x?x4x5xi32>
	%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,			%1 = linalg.tensor_reshape %0 [affine_map<(i, j, k, l) -> (i)>,
	affine_map<(i, j, k, l) -> (j, k, l)>] :			affine_map<(i, j, k, l) -> (j, k, l)>] :
	tensor<?x?x4x5xi32> into tensor<?x?xi32>			tensor<?x?x4x5xi32> into tensor<?x?xi32>
	return %1 : tensor<?x?xi32>			return %1 : tensor<?x?xi32>
	}			}

	// CHECK-LABEL: func @indexed_generic_op_reshape_consumer_fusion			// CHECK-LABEL: func @indexed_generic_op_reshape_consumer_fusion
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK-SAME: args_in = 1
	// CHECK-SAME: args_out = 1
	// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]			// CHECK-SAME: indexing_maps = [#[[$MAP0]], #[[$MAP1]]]
	// CHECK-NOT: linalg.tensor_reshape			// CHECK-NOT: linalg.tensor_reshape

mlir/test/Dialect/Linalg/fusion.mlir

	// CHECK: scf.for			// CHECK: scf.for
	// CHECK: linalg.matmul			// CHECK: linalg.matmul
	// CHECK-NOT: linalg.matmul			// CHECK-NOT: linalg.matmul

	// -----			// -----

	#id_2d = affine_map<(i, j) -> (i, j)>			#id_2d = affine_map<(i, j) -> (i, j)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}
	func @pointwise(%A: memref<?x?xf32, offset: 0, strides: [?, ?]>,			func @pointwise(%A: memref<?x?xf32, offset: 0, strides: [?, ?]>,
	%B: memref<?x?xf32, offset: 0, strides: [?, ?]>,			%B: memref<?x?xf32, offset: 0, strides: [?, ?]>,
	%C: memref<?x?xf32, offset: 0, strides: [?, ?]>,			%C: memref<?x?xf32, offset: 0, strides: [?, ?]>,
	%D: memref<?x?xf32, offset: 0, strides: [?, ?]>) {			%D: memref<?x?xf32, offset: 0, strides: [?, ?]>) {
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c3 = constant 3 : index			%c3 = constant 3 : index
	%c2 = constant 2 : index			%c2 = constant 2 : index
	linalg.generic #pointwise_2d_trait %A, %A, %B {			linalg.generic #pointwise_2d_trait
				ins(%A, %A: memref<?x?xf32, offset: 0, strides: [?, ?]>,
				memref<?x?xf32, offset: 0, strides: [?, ?]>)
				outs(%B : memref<?x?xf32, offset: 0, strides: [?, ?]>) {
	^bb0(%E: f32, %arg5: f32, %arg6: f32): // no predecessors			^bb0(%E: f32, %arg5: f32, %arg6: f32): // no predecessors
	%2 = addf %E, %arg5 : f32			%2 = addf %E, %arg5 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: memref<?x?xf32, offset: 0, strides: [?, ?]>,			}
	memref<?x?xf32, offset: 0, strides: [?, ?]>,
	memref<?x?xf32, offset: 0, strides: [?, ?]>
	%0 = dim %B, %c0 : memref<?x?xf32, offset: 0, strides: [?, ?]>			%0 = dim %B, %c0 : memref<?x?xf32, offset: 0, strides: [?, ?]>
	%1 = dim %B, %c1 : memref<?x?xf32, offset: 0, strides: [?, ?]>			%1 = dim %B, %c1 : memref<?x?xf32, offset: 0, strides: [?, ?]>
	scf.for %arg4 = %c0 to %0 step %c2 {			scf.for %arg4 = %c0 to %0 step %c2 {
	scf.for %arg5 = %c0 to %1 step %c3 {			scf.for %arg5 = %c0 to %1 step %c3 {
	%4 = std.subview %B[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%4 = std.subview %B[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32, offset: 0, strides: [?, ?]> to			memref<?x?xf32, offset: 0, strides: [?, ?]> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	%5 = std.subview %C[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%5 = std.subview %C[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32, offset: 0, strides: [?, ?]> to			memref<?x?xf32, offset: 0, strides: [?, ?]> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	%6 = std.subview %D[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%6 = std.subview %D[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32, offset: 0, strides: [?, ?]> to			memref<?x?xf32, offset: 0, strides: [?, ?]> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	linalg.generic #pointwise_2d_trait %4, %5, %6 {			linalg.generic #pointwise_2d_trait
				ins(%4, %5: memref<?x?xf32, offset: ?, strides: [?, ?]>,
				memref<?x?xf32, offset: ?, strides: [?, ?]>)
				outs(%6 : memref<?x?xf32, offset: ?, strides: [?, ?]>) {
	^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	%7 = mulf %arg6, %arg7 : f32			%7 = mulf %arg6, %arg7 : f32
	linalg.yield %7 : f32			linalg.yield %7 : f32
	}: memref<?x?xf32, offset: ?, strides: [?, ?]>,			}
	memref<?x?xf32, offset: ?, strides: [?, ?]>,
	memref<?x?xf32, offset: ?, strides: [?, ?]>
	}			}
	}			}
	return			return
	}			}
	// CHECK-LABEL: func @pointwise			// CHECK-LABEL: func @pointwise
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK-NOT: scf.for			// CHECK-NOT: scf.for
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK: addf			// CHECK: addf
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK: mulf			// CHECK: mulf

	// -----			// -----

	#id_2d = affine_map<(i, j) -> (i, j)>			#id_2d = affine_map<(i, j) -> (i, j)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}
	func @pointwise_no_view(%M: index, %N: index) {			func @pointwise_no_view(%M: index, %N: index) {
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c3 = constant 3 : index			%c3 = constant 3 : index
	%c2 = constant 2 : index			%c2 = constant 2 : index
	%A = alloc (%M, %N): memref<?x?xf32>			%A = alloc (%M, %N): memref<?x?xf32>
	%B = alloc (%M, %N): memref<?x?xf32>			%B = alloc (%M, %N): memref<?x?xf32>
	%C = alloc (%M, %N): memref<?x?xf32>			%C = alloc (%M, %N): memref<?x?xf32>
	%D = alloc (%M, %N): memref<?x?xf32>			%D = alloc (%M, %N): memref<?x?xf32>
	%E = alloc (%M, %N): memref<?x?xf32>			%E = alloc (%M, %N): memref<?x?xf32>
	linalg.generic #pointwise_2d_trait %A, %A, %B {			linalg.generic #pointwise_2d_trait
				ins(%A, %A : memref<?x?xf32>, memref<?x?xf32>)
				outs(%B : memref<?x?xf32>) {
	^bb0(%e: f32, %arg5: f32, %arg6: f32): // no predecessors			^bb0(%e: f32, %arg5: f32, %arg6: f32): // no predecessors
	%2 = addf %e, %arg5 : f32			%2 = addf %e, %arg5 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: memref<?x?xf32>,			}
	memref<?x?xf32>,
	memref<?x?xf32>
	%0 = dim %B, %c0 : memref<?x?xf32>			%0 = dim %B, %c0 : memref<?x?xf32>
	%1 = dim %B, %c1 : memref<?x?xf32>			%1 = dim %B, %c1 : memref<?x?xf32>
	scf.for %arg4 = %c0 to %0 step %c2 {			scf.for %arg4 = %c0 to %0 step %c2 {
	scf.for %arg5 = %c0 to %1 step %c3 {			scf.for %arg5 = %c0 to %1 step %c3 {
	%4 = std.subview %B[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%4 = std.subview %B[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32> to			memref<?x?xf32> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	%5 = std.subview %C[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%5 = std.subview %C[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32> to			memref<?x?xf32> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	%6 = std.subview %D[%arg4, %arg5][%c2, %c3][%c1, %c1] :			%6 = std.subview %D[%arg4, %arg5][%c2, %c3][%c1, %c1] :
	memref<?x?xf32> to			memref<?x?xf32> to
	memref<?x?xf32, offset: ?, strides: [?, ?]>			memref<?x?xf32, offset: ?, strides: [?, ?]>
	linalg.generic #pointwise_2d_trait %4, %5, %6 {			linalg.generic #pointwise_2d_trait
				ins(%4, %5: memref<?x?xf32, offset: ?, strides: [?, ?]>,
				memref<?x?xf32, offset: ?, strides: [?, ?]>)
				outs(%6 : memref<?x?xf32, offset: ?, strides: [?, ?]>) {
	^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors			^bb0(%arg6: f32, %arg7: f32, %arg8: f32): // no predecessors
	%7 = mulf %arg6, %arg7 : f32			%7 = mulf %arg6, %arg7 : f32
	linalg.yield %7 : f32			linalg.yield %7 : f32
	}: memref<?x?xf32, offset: ?, strides: [?, ?]>,			}
	memref<?x?xf32, offset: ?, strides: [?, ?]>,
	memref<?x?xf32, offset: ?, strides: [?, ?]>
	}			}
	}			}
	return			return
	}			}
	// CHECK-LABEL: func @pointwise_no_view			// CHECK-LABEL: func @pointwise_no_view
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK-NOT: scf.for			// CHECK-NOT: scf.for

	func @fusion_of_three(%arg0: memref<100x10xf32>,			func @fusion_of_three(%arg0: memref<100x10xf32>,
	%arg1: memref<100xf32>,			%arg1: memref<100xf32>,
	%arg2: memref<100x10xf32>) {			%arg2: memref<100x10xf32>) {
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%0 = alloc() {temp = true} : memref<100x10xf32>			%0 = alloc() {temp = true} : memref<100x10xf32>
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map1],			indexing_maps = [#map0, #map1],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]}
	} %arg1, %0 {			ins(%arg1 : memref<100xf32>)
				outs(%0 : memref<100x10xf32>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	linalg.yield %arg3 : f32			linalg.yield %arg3 : f32
	}: memref<100xf32>, memref<100x10xf32>			}
	%1 = alloc() {temp = true} : memref<100x10xf32>			%1 = alloc() {temp = true} : memref<100x10xf32>
	linalg.generic {			linalg.generic {
	args_in = 2 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map1, #map1, #map1],			indexing_maps = [#map1, #map1, #map1],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]}
	} %arg0, %0, %1 {			ins(%arg0, %0: memref<100x10xf32>, memref<100x10xf32>)
				outs(%1 : memref<100x10xf32>) {
	^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32, %arg5: f32): // no predecessors
	%2 = subf %arg3, %arg4 : f32			%2 = subf %arg3, %arg4 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: memref<100x10xf32>, memref<100x10xf32>, memref<100x10xf32>			}
	dealloc %0 : memref<100x10xf32>			dealloc %0 : memref<100x10xf32>
	%2 = dim %1, %c0 : memref<100x10xf32>			%2 = dim %1, %c0 : memref<100x10xf32>
	%3 = dim %1, %c1 : memref<100x10xf32>			%3 = dim %1, %c1 : memref<100x10xf32>
	%4 = dim %arg2, %c0 : memref<100x10xf32>			%4 = dim %arg2, %c0 : memref<100x10xf32>
	%5 = dim %arg2, %c1 : memref<100x10xf32>			%5 = dim %arg2, %c1 : memref<100x10xf32>
	scf.for %i = %c0 to %2 step %c1 {			scf.for %i = %c0 to %2 step %c1 {
	scf.for %j = %c0 to %3 step %c1 {			scf.for %j = %c0 to %3 step %c1 {
	%6 = std.subview %1[%i, %j][%c1, %c1][%c1, %c1] :			%6 = std.subview %1[%i, %j][%c1, %c1][%c1, %c1] :
	memref<100x10xf32> to memref<?x?xf32, #map2>			memref<100x10xf32> to memref<?x?xf32, #map2>
	%7 = std.subview %arg2[%i, %j][%c1, %c1][%c1, %c1] :			%7 = std.subview %arg2[%i, %j][%c1, %c1][%c1, %c1] :
	memref<100x10xf32> to memref<?x?xf32, #map2>			memref<100x10xf32> to memref<?x?xf32, #map2>
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map1, #map1],			indexing_maps = [#map1, #map1],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]}
	} %6, %7 {			ins(%6 : memref<?x?xf32, #map2>)
				outs(%7 : memref<?x?xf32, #map2>) {
	^bb0(%arg3: f32, %arg4: f32): // no predecessors			^bb0(%arg3: f32, %arg4: f32): // no predecessors
	%8 = exp %arg3 : f32			%8 = exp %arg3 : f32
	linalg.yield %8 : f32			linalg.yield %8 : f32
	}: memref<?x?xf32, #map2>,			}
	memref<?x?xf32, #map2>
	}			}
	}			}
	dealloc %1 : memref<100x10xf32>			dealloc %1 : memref<100x10xf32>
	return			return
	}			}
	// CHECK-LABEL: func @fusion			// CHECK-LABEL: func @fusion
	// CHECK-NOT: linalg.generic			// CHECK-NOT: linalg.generic
	// CHECK: scf.for			// CHECK: scf.for

mlir/test/Dialect/Linalg/fusion_indexed_generic.mlir

	// RUN: mlir-opt %s -linalg-fusion -split-input-file | FileCheck %s			// RUN: mlir-opt %s -linalg-fusion -split-input-file | FileCheck %s

	#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>			#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
	#id_2d = affine_map<(d0, d1) -> (d0, d1)>			#id_2d = affine_map<(d0, d1) -> (d0, d1)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}
	func @fuse_indexed_generic_consumer(%A: memref<?x?xf32>,			func @fuse_indexed_generic_consumer(%A: memref<?x?xf32>,
	%B: memref<?x?xf32>,			%B: memref<?x?xf32>,
	%C: memref<?x?xf32>,			%C: memref<?x?xf32>,
	%D: memref<?x?xf32>) {			%D: memref<?x?xf32>) {
	linalg.generic #pointwise_2d_trait %A, %B, %C {			linalg.generic #pointwise_2d_trait
				ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
				outs(%C : memref<?x?xf32>) {
	^bb0(%e: f32, %arg5: f32, %arg6: f32): // no predecessors			^bb0(%e: f32, %arg5: f32, %arg6: f32): // no predecessors
	%2 = addf %e, %arg5 : f32			%2 = addf %e, %arg5 : f32
	linalg.yield %2 : f32			linalg.yield %2 : f32
	}: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>			}
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c25 = constant 25 : index			%c25 = constant 25 : index
	%c10 = constant 10 : index			%c10 = constant 10 : index
	%0 = dim %C, %c0 : memref<?x?xf32>			%0 = dim %C, %c0 : memref<?x?xf32>
	%1 = dim %C, %c1 : memref<?x?xf32>			%1 = dim %C, %c1 : memref<?x?xf32>
	%2 = dim %D, %c0 : memref<?x?xf32>			%2 = dim %D, %c0 : memref<?x?xf32>
	%3 = dim %D, %c1 : memref<?x?xf32>			%3 = dim %D, %c1 : memref<?x?xf32>
	scf.for %arg2 = %c0 to %0 step %c10 {			scf.for %arg2 = %c0 to %0 step %c10 {
	scf.for %arg3 = %c0 to %1 step %c25 {			scf.for %arg3 = %c0 to %1 step %c25 {
	%4 = std.subview %C[%arg2, %arg3][%c10, %c25][%c1, %c1] :			%4 = std.subview %C[%arg2, %arg3][%c10, %c25][%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>
	%5 = std.subview %D[%arg2, %arg3][%c10, %c25][%c1, %c1] :			%5 = std.subview %D[%arg2, %arg3][%c10, %c25][%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>
	linalg.indexed_generic {			linalg.indexed_generic {
	indexing_maps = [#id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"]}
	args_in = 1,			ins(%4 : memref<?x?xf32, #map>)
	args_out = 1			outs(%5 : memref<?x?xf32, #map>) {
	} %4, %5 {
	^bb0(%arg4: index, %arg5: index, %arg6: f32, %arg7: f32):			^bb0(%arg4: index, %arg5: index, %arg6: f32, %arg7: f32):
	%6 = addi %arg4, %arg2 : index			%6 = addi %arg4, %arg2 : index
	%7 = addi %arg5, %arg3 : index			%7 = addi %arg5, %arg3 : index
	%8 = index_cast %6 : index to i32			%8 = index_cast %6 : index to i32
	%9 = sitofp %8 : i32 to f32			%9 = sitofp %8 : i32 to f32
	%10 = index_cast %7 : index to i32			%10 = index_cast %7 : index to i32
	%11 = sitofp %10 : i32 to f32			%11 = sitofp %10 : i32 to f32
	%12 = addf %9, %11 : f32			%12 = addf %9, %11 : f32
	linalg.yield %12 : f32			linalg.yield %12 : f32
	}: memref<?x?xf32, #map>, memref<?x?xf32, #map>			}
	}			}
	}			}
	return			return
	}			}
	// CHECK-LABEL: func @fuse_indexed_generic_consumer			// CHECK-LABEL: func @fuse_indexed_generic_consumer
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK: scf.for			// CHECK: scf.for
	// CHECK-NOT: scf.for			// CHECK-NOT: scf.for
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-NOT: addi			// CHECK-NOT: addi
	// CHECK: addf			// CHECK: addf
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK: index_cast			// CHECK: index_cast

	// -----			// -----

	#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>			#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
	#id_2d = affine_map<(d0, d1) -> (d0, d1)>			#id_2d = affine_map<(d0, d1) -> (d0, d1)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}
	func @fuse_indexed_generic_producer(%A: memref<?x?xf32>,			func @fuse_indexed_generic_producer(%A: memref<?x?xf32>,
	%B: memref<?x?xf32>,			%B: memref<?x?xf32>,
	%C: memref<?x?xf32>,			%C: memref<?x?xf32>,
	%D: memref<?x?xf32>) {			%D: memref<?x?xf32>) {
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c25 = constant 25 : index			%c25 = constant 25 : index
	%c10 = constant 10 : index			%c10 = constant 10 : index
	linalg.indexed_generic #pointwise_2d_trait %A, %B, %C {			linalg.indexed_generic #pointwise_2d_trait
				ins(%A, %B : memref<?x?xf32>, memref<?x?xf32>)
				outs(%C : memref<?x?xf32>) {
	^bb0(%i: index, %j: index, %a: f32, %b: f32, %c: f32): // no predecessors			^bb0(%i: index, %j: index, %a: f32, %b: f32, %c: f32): // no predecessors
	%i_int = index_cast %i: index to i32			%i_int = index_cast %i: index to i32
	%i_float = sitofp %i_int : i32 to f32			%i_float = sitofp %i_int : i32 to f32
	%ab = addf %a, %b : f32			%ab = addf %a, %b : f32
	%out = addf %ab, %i_float : f32			%out = addf %ab, %i_float : f32
	linalg.yield %out : f32			linalg.yield %out : f32
	}: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>			}
	%C_X = dim %C, %c0 : memref<?x?xf32>			%C_X = dim %C, %c0 : memref<?x?xf32>
	%C_Y = dim %C, %c1 : memref<?x?xf32>			%C_Y = dim %C, %c1 : memref<?x?xf32>
	%D_X = dim %D, %c0 : memref<?x?xf32>			%D_X = dim %D, %c0 : memref<?x?xf32>
	%D_Y = dim %D, %c1 : memref<?x?xf32>			%D_Y = dim %D, %c1 : memref<?x?xf32>
	scf.parallel (%arg2, %arg3) = (%c0, %c0) to (%C_X, %C_Y) step (%c10, %c25) {			scf.parallel (%arg2, %arg3) = (%c0, %c0) to (%C_X, %C_Y) step (%c10, %c25) {
	%C_view = std.subview %C[%arg2, %arg3][%c10, %c25][%c1, %c1] :			%C_view = std.subview %C[%arg2, %arg3][%c10, %c25][%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>
	%D_view = std.subview %D[%arg2, %arg3][%c10, %c25][%c1, %c1] :			%D_view = std.subview %D[%arg2, %arg3][%c10, %c25][%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>
	linalg.generic {			linalg.generic {
	indexing_maps = [#id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"]}
	args_in = 1,			ins(%C_view : memref<?x?xf32, #map>)
	args_out = 1			outs(%D_view : memref<?x?xf32, #map>) {
	} %C_view, %D_view {
	^bb0( %a: f32, %b: f32):			^bb0( %a: f32, %b: f32):
	%ab = addf %a, %b : f32			%ab = addf %a, %b : f32
	linalg.yield %ab : f32			linalg.yield %ab : f32
	}: memref<?x?xf32, #map>, memref<?x?xf32, #map>			}
	}			}
	return			return
	}			}
	// CHECK-LABEL: func @fuse_indexed_generic_producer			// CHECK-LABEL: func @fuse_indexed_generic_producer
	// CHECK: scf.parallel ([[I:%.*]], [[J:%.*]]) =			// CHECK: scf.parallel ([[I:%.*]], [[J:%.*]]) =
	// CHECK-NOT: scf.parallel			// CHECK-NOT: scf.parallel
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK: ^bb0([[i:%.*]]: index, [[j:%.*]]: index			// CHECK: ^bb0([[i:%.*]]: index, [[j:%.*]]: index
	// CHECK: [[i_new:%.*]] = addi [[i]], [[I]] : index			// CHECK: [[i_new:%.*]] = addi [[i]], [[I]] : index
	// CHECK: [[j_new:%.*]] = addi [[j]], [[J]] : index			// CHECK: [[j_new:%.*]] = addi [[j]], [[J]] : index
	// CHECK: {{.*}} = index_cast [[i_new]] : index to i32			// CHECK: {{.*}} = index_cast [[i_new]] : index to i32
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK: addf			// CHECK: addf

	// -----			// -----

	#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>			#map = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
	#id_2d = affine_map<(d0, d1) -> (d0, d1)>			#id_2d = affine_map<(d0, d1) -> (d0, d1)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,
	args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}
	func @fuse_indexed_generic_producer_tile_second_dim_only(%A: memref<?x?xf32>,			func @fuse_indexed_generic_producer_tile_second_dim_only(%A: memref<?x?xf32>,
	%B: memref<?x?xf32>,			%B: memref<?x?xf32>,
	%C: memref<?x?xf32>,			%C: memref<?x?xf32>,
	%D: memref<?x?xf32>) {			%D: memref<?x?xf32>) {
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%c3 = constant 3 : index			%c3 = constant 3 : index
	%c0 = constant 0 : index			%c0 = constant 0 : index
	linalg.indexed_generic #pointwise_2d_trait %A, %B, %C {			linalg.indexed_generic #pointwise_2d_trait
				ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
				outs(%C : memref<?x?xf32>) {
	^bb0(%i: index, %j: index, %a: f32, %b: f32, %c: f32): // no predecessors			^bb0(%i: index, %j: index, %a: f32, %b: f32, %c: f32): // no predecessors
	%j_int = index_cast %j: index to i32			%j_int = index_cast %j: index to i32
	%j_float = sitofp %j_int : i32 to f32			%j_float = sitofp %j_int : i32 to f32
	%ab = addf %a, %b : f32			%ab = addf %a, %b : f32
	%out = addf %ab, %j_float : f32			%out = addf %ab, %j_float : f32
	linalg.yield %out : f32			linalg.yield %out : f32
	}: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>			}
	%C_X = dim %C, %c0 : memref<?x?xf32>			%C_X = dim %C, %c0 : memref<?x?xf32>
	%C_Y = dim %C, %c1 : memref<?x?xf32>			%C_Y = dim %C, %c1 : memref<?x?xf32>
	%D_X = dim %D, %c0 : memref<?x?xf32>			%D_X = dim %D, %c0 : memref<?x?xf32>
	%D_Y = dim %D, %c1 : memref<?x?xf32>			%D_Y = dim %D, %c1 : memref<?x?xf32>
	%3 = linalg.range %c0 : %C_Y : %c3 : !linalg.range			%3 = linalg.range %c0 : %C_Y : %c3 : !linalg.range
	scf.parallel (%j) = (%c0) to (%C_Y) step (%c3) {			scf.parallel (%j) = (%c0) to (%C_Y) step (%c3) {
	%0 = affine.min affine_map<(d0, d1, d2) -> (d0, d1 - d2)>(%c3, %C_Y, %j)			%0 = affine.min affine_map<(d0, d1, d2) -> (d0, d1 - d2)>(%c3, %C_Y, %j)
	%C_view = subview %C[%c0, %j] [%C_X, %0] [%c1, %c1] :			%C_view = subview %C[%c0, %j] [%C_X, %0] [%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>

	%1 = affine.min affine_map<(d0, d1, d2) -> (d0, d1 - d2)>(%c3, %D_Y, %j)			%1 = affine.min affine_map<(d0, d1, d2) -> (d0, d1 - d2)>(%c3, %D_Y, %j)
	%D_view = subview %D[%c0, %j] [%D_X, %1] [%c1, %c1] :			%D_view = subview %D[%c0, %j] [%D_X, %1] [%c1, %c1] :
	memref<?x?xf32> to memref<?x?xf32, #map>			memref<?x?xf32> to memref<?x?xf32, #map>

	linalg.generic {			linalg.generic {
	indexing_maps = [#id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"]}
	args_in = 1,			ins(%C_view : memref<?x?xf32, #map>)
	args_out = 1			outs(%D_view : memref<?x?xf32, #map>) {
	} %C_view, %D_view {
	^bb0( %a: f32, %b: f32):			^bb0( %a: f32, %b: f32):
	%ab = addf %a, %b : f32			%ab = addf %a, %b : f32
	linalg.yield %ab : f32			linalg.yield %ab : f32
	}: memref<?x?xf32, #map>, memref<?x?xf32, #map>			}
	scf.yield			scf.yield
	}			}
	return			return
	}			}
	// CHECK-LABEL: func @fuse_indexed_generic_producer_tile_second_dim_only			// CHECK-LABEL: func @fuse_indexed_generic_producer_tile_second_dim_only
	// CHECK: [[C0:%.*]] = constant 0 : index			// CHECK: [[C0:%.*]] = constant 0 : index
	// CHECK: scf.parallel ([[J:%.*]]) =			// CHECK: scf.parallel ([[J:%.*]]) =
	// CHECK-NOT: scf.parallel			// CHECK-NOT: scf.parallel
	// CHECK: linalg.indexed_generic			// CHECK: linalg.indexed_generic
	// CHECK: ^bb0([[i:%.*]]: index, [[j:%.*]]: index			// CHECK: ^bb0([[i:%.*]]: index, [[j:%.*]]: index
	// CHECK: [[i_new:%.*]] = addi [[i]], [[C0]] : index			// CHECK: [[i_new:%.*]] = addi [[i]], [[C0]] : index
	// CHECK: [[j_new:%.*]] = addi [[j]], [[J]] : index			// CHECK: [[j_new:%.*]] = addi [[j]], [[J]] : index
	// CHECK: {{.*}} = index_cast [[j_new]] : index to i32			// CHECK: {{.*}} = index_cast [[j_new]] : index to i32
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK: addf			// CHECK: addf

mlir/test/Dialect/Linalg/inlining.mlir

	// RUN: mlir-opt %s -inline | FileCheck %s			// RUN: mlir-opt %s -inline | FileCheck %s

	// These tests verify that regions with operations from Lingalg dialect			// These tests verify that regions with operations from Lingalg dialect
	// can be inlined.			// can be inlined.

	#accesses = [			#accesses = [
	affine_map<(i) -> (i)>,			affine_map<(i) -> (i)>,
	affine_map<(i) -> (i)>			affine_map<(i) -> (i)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel"]			iterator_types = ["parallel"]
	}			}

	func @inline_into(%arg0: memref<?xf32>) {			func @inline_into(%arg0: memref<?xf32>) {
	// CHECK: linalg.generic			// CHECK: linalg.generic
	call @inlined_fn(%arg0) : (memref<?xf32>) -> ()			call @inlined_fn(%arg0) : (memref<?xf32>) -> ()
	return			return
	}			}

	func @inlined_fn(%arg0: memref<?xf32>) {			func @inlined_fn(%arg0: memref<?xf32>) {
	// CHECK: linalg.generic			// CHECK: linalg.generic
	linalg.generic #trait %arg0, %arg0 {			linalg.generic #trait
				ins(%arg0 : memref<?xf32>)
				outs(%arg0 : memref<?xf32>) {
	^bb(%0 : f32, %1 : f32) :			^bb(%0 : f32, %1 : f32) :
	linalg.yield %0 : f32			linalg.yield %0 : f32
	} : memref<?xf32>, memref<?xf32>			}
	return			return
	}			}

mlir/test/Dialect/Linalg/invalid.mlir

func @yield_parent(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @yield_parent(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected parent op with LinalgOp interface}}		// expected-error @+1 {{op expected parent op with LinalgOp interface}}
linalg.yield %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>		linalg.yield %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}

// -----		// -----

func @generic_no_region(%arg0: memref<f32>) {		func @generic_no_region(%arg0: memref<f32>) {
// expected-error @+6 {{expected '{' to begin a region}}		// expected-error @+5 {{expected '{' to begin a region}}
linalg.generic {		linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0 : memref<f32>		} ins(%arg0 : memref<f32>)
}

// -----

func @generic_at_least_2_operands(%arg0: memref<f32>) {
// expected-error @+1 {{op expected 2 or more operands}}
linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []
} %arg0 {} : memref<f32>
}

// -----

func @generic_exactly_2_views(%arg0: memref<f32>) {
// expected-error @+1 {{op expected exactly 2 inputs (tensor or buffer) and output buffer operands}}
linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []
} %arg0, %arg0, %arg0 {}: memref<f32>, memref<f32>, memref<f32>
}		}

// -----		// -----

func @generic_mismatched_num_returns(%arg0: memref<f32>) {		func @generic_mismatched_num_returns(%arg0: memref<f32>) {
// expected-error @+8 {{op expected number of yield values (1) to match the number of operands of the enclosing LinalgOp (0)}}		// expected-error @+6 {{op expected number of yield values (1) to match the number of operands of the enclosing LinalgOp (0)}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<() -> ()> ],		indexing_maps = [ affine_map<() -> ()> ],
iterator_types = []		iterator_types = []}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%0: f32):		^bb(%0: f32):
linalg.yield		linalg.yield
}: memref<f32>		}
}		}

// -----		// -----

func @generic_symbol_in_map(%arg0: memref<i32>) {		func @generic_symbol_in_map(%arg0: memref<i32>) {
// expected-error @+1 {{expected the number of symbols in indexing_map #0 to match rank of operand `symbol_source`}}		// expected-error @+1 {{expected the number of symbols in indexing_map #0 to match rank of operand `symbol_source`}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<()[N] -> (0)> ],		indexing_maps = [ affine_map<()[N] -> (0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<i32>) {
^bb(%i : i32):		^bb(%i : i32):
linalg.yield %i : i32		linalg.yield %i : i32
}: memref<i32>		}
}		}

// -----		// -----

func @generic_symbol_source_out_of_range(%arg0: memref<i32>) {		func @generic_symbol_source_out_of_range(%arg0: memref<i32>) {
// expected-error @+1 {{symbol_source index out of range}}		// expected-error @+1 {{symbol_source index out of range}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<()[N] -> (0)> ],		indexing_maps = [ affine_map<()[N] -> (0)> ],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
symbol_source = 1		symbol_source = 1}
} %arg0 {		outs(%arg0 : memref<i32>) {
^bb(%i : i32):		^bb(%i : i32):
linalg.yield %i : i32		linalg.yield %i : i32
}: memref<i32>		}
}		}

// -----		// -----

func @generic_wrong_dim_in_map(%arg0: memref<1xi32>) {		func @generic_wrong_dim_in_map(%arg0: memref<1xi32>) {
// expected-error @+1 {{op expected indexing_map #0 to have 1 dim(s) to match the number of loops}}		// expected-error @+1 {{op expected indexing_map #0 to have 1 dim(s) to match the number of loops}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<1xi32>) {
^bb(%i : i32):		^bb(%i : i32):
linalg.yield %i : i32		linalg.yield %i : i32
}: memref<1xi32>		}
}		}

// -----		// -----

func @generic_one_d_view(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_one_d_view(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected indexing_map #0 results to match view rank: 'memref<?xf32, affine_map<(d0)[s0] -> (d0 + s0)>>'}}		// expected-error @+1 {{op expected indexing_map #0 results to match view rank: 'memref<?xf32, affine_map<(d0)[s0] -> (d0 + s0)>>'}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<() -> (0, 0)> ],		indexing_maps = [ affine_map<() -> (0, 0)> ],
iterator_types = []		iterator_types = []}
} %arg0 {		outs(%arg0 : memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
^bb(%f : f32):		^bb(%f : f32):
linalg.yield %f: f32		linalg.yield %f: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>>		}
}		}

// -----		// -----

func @generic_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+9 {{'linalg.yield' op type of yield operand 1 ('i4') doesn't match the element type of the enclosing linalg.generic op ('f32')}}		// expected-error @+7 {{'linalg.yield' op type of yield operand 1 ('i4') doesn't match the element type of the enclosing linalg.generic op ('f32')}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
^bb(%0: f32):		^bb(%0: f32):
%1 = constant 1: i4		%1 = constant 1: i4
linalg.yield %1: i4		linalg.yield %1: i4
}: memref<?xf32, affine_map<(i)[off]->(off + i)>>		}
}		}

// -----		// -----

func @generic_singular_maps(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>, %arg1: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_singular_maps(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>, %arg1: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected the concatenation of maps in indexing_map to be invertible}}		// expected-error @+1 {{op expected the concatenation of maps in indexing_map to be invertible}}
linalg.generic {		linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [		indexing_maps = [
affine_map<(i, j) -> (i + j)>,		affine_map<(i, j) -> (i + j)>,
affine_map<(i, j) -> (i + j)>		affine_map<(i, j) -> (i + j)>
],		],
iterator_types = ["parallel","parallel"]		iterator_types = ["parallel","parallel"]}
} %arg0, %arg1 {		ins(%arg0 : memref<?xf32, affine_map<(i)[off]->(off + i)>>)
		outs(%arg1 : memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
^bb(%0: f32, %1: f32):		^bb(%0: f32, %1: f32):
linalg.yield %1: f32		linalg.yield %1: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>>,		}
memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}

////////////////////////////////////////////////////////////////////////////////		////////////////////////////////////////////////////////////////////////////////
///////////////////////////// Region tests /////////////////////////////////////		///////////////////////////// Region tests /////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////		////////////////////////////////////////////////////////////////////////////////

// -----		// -----

func @generic_empty_region(%arg0: memref<f32>) {		func @generic_empty_region(%arg0: memref<f32>) {
%f0 = constant 0.0: f32		%f0 = constant 0.0: f32
// expected-error @+1 {{op expects region #0 to have 0 or 1 blocks}}		// expected-error @+1 {{op expects region #0 to have 0 or 1 blocks}}
linalg.generic {		linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []}
} %arg0, %arg0 {		ins(%arg0 : memref<f32>)
		outs(%arg0 : memref<f32>) {
^bb1:		^bb1:
linalg.yield %f0: f32		linalg.yield %f0: f32
^bb2:		^bb2:
linalg.yield %f0: f32		linalg.yield %f0: f32
}: memref<f32>, memref<f32>		}
}		}

// -----		// -----

func @generic_empty_region(%arg0: memref<f32>) {		func @generic_empty_region(%arg0: memref<f32>) {
%f0 = constant 0.0: f32		%f0 = constant 0.0: f32
// expected-error @+1 {{linalg.generic' op expected region with 1 block}}		// expected-error @+1 {{linalg.generic' op expected region with 1 block}}
linalg.generic {		linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []}
} %arg0, %arg0 {		ins(%arg0 : memref<f32>)
}: memref<f32>, memref<f32>		outs(%arg0 : memref<f32>) {
		}
}		}

// -----		// -----

func @generic_mismatched_num_arguments(%arg0: memref<f32>) {		func @generic_mismatched_num_arguments(%arg0: memref<f32>) {
// expected-error @+1 {{op expected number of block arguments to match number of operands}}		// expected-error @+1 {{op expected number of block arguments to match number of operands}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%f: f32, %g: f32):		^bb(%f: f32, %g: f32):
linalg.yield %f: f32		linalg.yield %f: f32
}: memref<f32>		}
}		}

// -----		// -----

func @generic_block_arg_type(%arg0: memref<f32>) {		func @generic_block_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 1 of the same type as elemental type of output operand: 'memref<f32>'}}		// expected-error @+1 {{op expected block argument 1 of the same type as elemental type of output operand: 'memref<f32>'}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%i: i1):		^bb(%i: i1):
linalg.yield %i : i1		linalg.yield %i : i1
}: memref<f32>		}
}		}

// -----		// -----

func @indexed_generic_block_arg_count(%arg0: memref<f32>) {		func @indexed_generic_block_arg_count(%arg0: memref<f32>) {
// expected-error @+1 {{op expected number of block arguments to match number of operands + number of loops}}		// expected-error @+1 {{op expected number of block arguments to match number of operands + number of loops}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%f: f32):		^bb(%f: f32):
linalg.yield %f : f32		linalg.yield %f : f32
}: memref<f32>		}
}		}

// -----		// -----

func @indexed_generic_block_induction_var_arg_type(%arg0: memref<f32>) {		func @indexed_generic_block_induction_var_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 1 to be an index}}		// expected-error @+1 {{op expected block argument 1 to be an index}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%i: f64, %f: f32):		^bb(%i: f64, %f: f32):
linalg.yield %f: f32		linalg.yield %f: f32
}: memref<f32>		}
}		}

// -----		// -----

func @indexed_generic_block_arg_type(%arg0: memref<f32>) {		func @indexed_generic_block_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 2 of the same type as elemental type of output operand: 'memref<f32>'}}		// expected-error @+1 {{op expected block argument 2 of the same type as elemental type of output operand: 'memref<f32>'}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%i: index, %f: i1):		^bb(%i: index, %f: i1):
linalg.yield %i: index		linalg.yield %i: index
}: memref<f32>		}
}		}

// -----		// -----

func @indexed_generic_arg_count(%arg0: memref<f32>) {		func @indexed_generic_arg_count(%arg0: memref<f32>) {
// expected-error @+1 {{op expected number of block arguments to match number of operands + number of loops}}		// expected-error @+1 {{op expected number of block arguments to match number of operands + number of loops}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<()[] -> ()> ],		indexing_maps = [ affine_map<()[] -> ()> ],
iterator_types = []		iterator_types = []}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%0: index, %1: f32):		^bb(%0: index, %1: f32):
linalg.yield %1: f32		linalg.yield %1: f32
} : memref<f32>		}
return		return
}		}

// -----		// -----

func @indexed_generic_induction_var_arg_type(%arg0: memref<f32>) {		func @indexed_generic_induction_var_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 1 to be an index}}		// expected-error @+1 {{op expected block argument 1 to be an index}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
iterator_types = ["parallel"],		iterator_types = ["parallel"],
indexing_maps = [ affine_map<(i) -> (i)> ]		indexing_maps = [ affine_map<(i) -> (i)> ]}
} %arg0 {		outs(%arg0 : memref<f32>) {
^bb(%0: i32, %1: f32):		^bb(%0: i32, %1: f32):
linalg.yield %1: f32		linalg.yield %1: f32
} : memref<f32>		}
}		}

// -----		// -----

func @indexed_generic_result_count(%arg0: memref<?xf32>) {		func @indexed_generic_result_count(%arg0: memref<?xf32>) {
// expected-error @+8 {{op expected number of yield values (1) to match the number of operands of the enclosing LinalgOp (2)}}		// expected-error @+6 {{op expected number of yield values (1) to match the number of operands of the enclosing LinalgOp (2)}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<?xf32>) {
^bb(%i: index, %val: f32):		^bb(%i: index, %val: f32):
linalg.yield %val, %val: f32, f32		linalg.yield %val, %val: f32, f32
}: memref<?xf32>		}
}		}

// -----		// -----

func @generic_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+9 {{type of yield operand 1 ('i1') doesn't match the element type of the enclosing linalg.generic op ('f32')}}		// expected-error @+7 {{type of yield operand 1 ('i1') doesn't match the element type of the enclosing linalg.generic op ('f32')}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
^bb(%i: f32):		^bb(%i: f32):
%0 = constant 0: i1		%0 = constant 0: i1
linalg.yield %0: i1		linalg.yield %0: i1
}: memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}

// -----

func @generic_result_tensor_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op result #0 must be ranked tensor of any type values, but got 'f32'}}
%0 = linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]
} %arg0 {
^bb(%i: f32):
linalg.yield %i: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>> -> f32
burmako: Why is this no longer a test?
nicolasvasilache (author): It was duplicated, see the next test.
}		}

// -----		// -----

func @generic_result_tensor_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_result_tensor_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op result #0 must be ranked tensor of any type values, but got 'f32'}}		// expected-error @+1 {{op result #0 must be ranked tensor of any type values, but got 'f32'}}
%0 = linalg.generic {		%0 = linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		ins(%arg0 : memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
^bb(%i: f32):		^bb(%i: f32):
linalg.yield %i: f32		linalg.yield %i: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>> -> f32		} -> f32
}		}

// -----		// -----

func @generic(%arg0: memref<?x?xi4>) {		func @generic(%arg0: memref<?x?xi4>) {
// expected-error @+2 {{op expects regions to end with 'linalg.yield', found 'std.addf'}}		// expected-error @+2 {{op expects regions to end with 'linalg.yield', found 'std.addf'}}
// expected-note @+1 {{in custom textual format, the absence of terminator implies 'linalg.yield'}}		// expected-note @+1 {{in custom textual format, the absence of terminator implies 'linalg.yield'}}
linalg.generic {		linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]}
} %arg0 {		outs(%arg0 : memref<?x?xi4>) {
^bb(%0: i4) :		^bb(%0: i4) :
%1 = std.addf %0, %0: i4		%1 = std.addf %0, %0: i4
} : memref<?x?xi4>		}
return		return
}		}

// -----		// -----

func @conv_rank_limit(%arg0: memref<?xf32>, %arg1: memref<?xf32>, %arg2: memref<?xf32>) {		func @conv_rank_limit(%arg0: memref<?xf32>, %arg1: memref<?xf32>, %arg2: memref<?xf32>) {
// expected-error @+1 {{expects memref ranks to be greater than 2}}		// expected-error @+1 {{expects memref ranks to be greater than 2}}
linalg.conv(%arg0, %arg1, %arg2) : memref<?xf32>, memref<?xf32>, memref<?xf32>		linalg.conv(%arg0, %arg1, %arg2) : memref<?xf32>, memref<?xf32>, memref<?xf32>
[72 lines not shown]
func @named_ops(%a3: memref<?x?x?xf32>, %b3: memref<?x?xf32>, %c3: memref<?x?x?xf32>) {
// expected-error @+1 {{op expected indexing_map #1 results to match view rank: 'memref<?x?xf32>'}}		// expected-error @+1 {{op expected indexing_map #1 results to match view rank: 'memref<?x?xf32>'}}
linalg.batch_matmul ins(%a3, %b3: memref<?x?x?xf32>, memref<?x?xf32>)		linalg.batch_matmul ins(%a3, %b3: memref<?x?x?xf32>, memref<?x?xf32>)
outs(%c3 : memref<?x?x?xf32>)		outs(%c3 : memref<?x?x?xf32>)
return		return
}		}

// -----		// -----

func @generic(%arg0: tensor<?x?xi4>) {
// expected-error @+1 {{unexpected #results > #outputs}}
linalg.generic {
args_in = 1,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]
} %arg0 {
^bb(%0: i4) :
%1 = std.addi %0, %0: i4
linalg.yield %1, %1: i4, i4
} : tensor<?x?xi4> -> (tensor<?x?xi4>, tensor<?x?xi4>)
return
}
burmako: Likewise.
nicolasvasilache (author): This is no longer valid; there is no such thing as #outputs anymore (previously independently specified by the args_out attribute).
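As a compact illustration of the reply above, here is the before/after shape of `linalg.generic`, copied from the `parallel_loops.mlir` test in this same revision. It is a sketch for reading the diff, not a standalone test: the two halves use different syntaxes, so presumably no single `mlir-opt` build accepts both.

```mlir
#map0 = affine_map<(d0, d1) -> (d0, d1)>

// Before this revision: the number of inputs and outputs is given by the
// args_in / args_out attributes, operands are listed positionally, and the
// operand types trail the region.
func @linalg_generic_sum(%lhs: memref<2x2xf32>,
                         %rhs: memref<2x2xf32>,
                         %sum: memref<2x2xf32>) {
  linalg.generic {
    args_in = 2 : i64,
    args_out = 1 : i64,
    indexing_maps = [#map0, #map0, #map0],
    iterator_types = ["parallel", "parallel"]
  } %lhs, %rhs, %sum {
    ^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32):
      %0 = addf %lhs_in, %rhs_in : f32
      linalg.yield %0 : f32
  }: memref<2x2xf32>, memref<2x2xf32>, memref<2x2xf32>
  return
}

// After this revision: args_in / args_out are gone. Inputs and outputs are
// listed, with their types, in the ins(...) and outs(...) clauses, matching
// the named-op syntax (compare linalg.matmul ins(...) outs(...)).
func @linalg_generic_sum(%lhs: memref<2x2xf32>,
                         %rhs: memref<2x2xf32>,
                         %sum: memref<2x2xf32>) {
  linalg.generic {
    indexing_maps = [#map0, #map0, #map0],
    iterator_types = ["parallel", "parallel"]}
    ins(%lhs, %rhs : memref<2x2xf32>, memref<2x2xf32>)
    outs(%sum : memref<2x2xf32>) {
    ^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32):
      %0 = addf %lhs_in, %rhs_in : f32
      linalg.yield %0 : f32
  }
  return
}
```

Because the output count is now implied by the `outs(...)` list rather than stated separately in `args_out`, a check that compares the number of results against a separately declared number of outputs no longer has anything to compare, which appears to be why the removed test above does not carry over.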

// -----

func @empty_init_expected(%m: memref<?x?xf32>, %t: tensor<?x?xf32>) {		func @empty_init_expected(%m: memref<?x?xf32>, %t: tensor<?x?xf32>) {
// expected-error @+1 {{expected empty `init` when op has no results or no reduction dims}}		// expected-error @+1 {{expected empty `init` when op has no results or no reduction dims}}
linalg.matmul ins(%m, %m: memref<?x?xf32>, memref<?x?xf32>)		linalg.matmul ins(%m, %m: memref<?x?xf32>, memref<?x?xf32>)
outs(%m : memref<?x?xf32>)		outs(%m : memref<?x?xf32>)
init(%t : tensor<?x?xf32>)		init(%t : tensor<?x?xf32>)
return		return
}		}

[38 lines not shown]

mlir/test/Dialect/Linalg/loops.mlir

[551 lines not shown]
#trait2 = {
args_in = 1,		args_in = 1,
args_out = 2,		args_out = 2,
iterator_types = ["parallel", "parallel", "parallel"],		iterator_types = ["parallel", "parallel", "parallel"],
indexing_maps = #accesses,		indexing_maps = #accesses,
library_call = "some_external_function_name_2",		library_call = "some_external_function_name_2",
doc = "B(i,j,k), C(i,k,j) = foo(A(i, j), B(i,j,k), C(i,k,j))"		doc = "B(i,j,k), C(i,k,j) = foo(A(i, j), B(i,j,k), C(i,k,j))"
}		}
func @generic_region(%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>, %arg2: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {		func @generic_region(%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>, %arg2: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
linalg.generic #trait2 %arg0, %arg1, %arg2 {		linalg.generic #trait2
		ins(%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>)
		outs(%arg1, %arg2 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>,
		memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
^bb0(%a: f32, %b: f32, %c: f32):		^bb0(%a: f32, %b: f32, %c: f32):
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %d, %e : f32, f32		linalg.yield %d, %e : f32, f32
}: memref<?x?xf32, offset: ?, strides: [?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>		}
return		return
}		}
// CHECKLOOP-LABEL: @generic_region		// CHECKLOOP-LABEL: @generic_region
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: scf.for %[[j:.]] = {{.}}		// CHECKLOOP: scf.for %[[j:.]] = {{.}}
// CHECKLOOP: scf.for %[[k:.]] = {{.}}		// CHECKLOOP: scf.for %[[k:.]] = {{.}}
// CHECKLOOP: %[[a:.]] = load %{{.}}[%[[i]], %[[j]]] : memref<?x?xf32, #[[$strided2D]]>		// CHECKLOOP: %[[a:.]] = load %{{.}}[%[[i]], %[[j]]] : memref<?x?xf32, #[[$strided2D]]>
// CHECKLOOP: %[[b:.]] = load %{{.}}[%[[i]], %[[j]], %[[k]]] : memref<?x?x?xf32, #[[$strided3D]]>		// CHECKLOOP: %[[b:.]] = load %{{.}}[%[[i]], %[[j]], %[[k]]] : memref<?x?x?xf32, #[[$strided3D]]>
[20 lines not shown]
#trait4 = {
indexing_maps = #accesses,		indexing_maps = #accesses,
library_call = "some_external_function_name_2",		library_call = "some_external_function_name_2",
doc = "B(i,j,k), C(i,k,j) = foo(A(i, j) * B(i,j,k), i * j * k + C(i,k,j))"		doc = "B(i,j,k), C(i,k,j) = foo(A(i, j) * B(i,j,k), i * j * k + C(i,k,j))"
}		}
func @indexed_generic_region(		func @indexed_generic_region(
%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>,		%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>,		%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>,
%arg2: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {		%arg2: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
linalg.indexed_generic #trait4 %arg0, %arg1, %arg2 {		linalg.indexed_generic #trait4
		ins(%arg0 : memref<?x?xf32, offset: ?, strides: [?, 1]>)
		outs(%arg1, %arg2 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>,
		memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
^bb0(%i: index, %j: index, %k: index, %a: f32, %b: f32, %c: f32):		^bb0(%i: index, %j: index, %k: index, %a: f32, %b: f32, %c: f32):
%result_1 = mulf %a, %b : f32		%result_1 = mulf %a, %b : f32

%ij = addi %i, %j : index		%ij = addi %i, %j : index
%ijk = addi %ij, %k : index		%ijk = addi %ij, %k : index
%ijk_int = index_cast %ijk : index to i32		%ijk_int = index_cast %ijk : index to i32
%ijk_float = sitofp %ijk_int : i32 to f32		%ijk_float = sitofp %ijk_int : i32 to f32

%result_2 = addf %c, %ijk_float : f32		%result_2 = addf %c, %ijk_float : f32
linalg.yield %result_1, %result_2 : f32, f32		linalg.yield %result_1, %result_2 : f32, f32
}: memref<?x?xf32, offset: ?, strides: [?, 1]>,		}
memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>,
memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
return		return
}		}

// CHECKLOOP-LABEL: @indexed_generic_region		// CHECKLOOP-LABEL: @indexed_generic_region
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: scf.for %[[j:.]] = {{.}}		// CHECKLOOP: scf.for %[[j:.]] = {{.}}
// CHECKLOOP: scf.for %[[k:.]] = {{.}}		// CHECKLOOP: scf.for %[[k:.]] = {{.}}
// CHECKLOOP: %[[a:.]] = load %{{.}}[%[[i]], %[[j]]]		// CHECKLOOP: %[[a:.]] = load %{{.}}[%[[i]], %[[j]]]
[34 lines not shown]
#trait_broadcast = {
args_out = 1,		args_out = 1,
indexing_maps = #broadcast_access,		indexing_maps = #broadcast_access,
iterator_types = ["parallel", "parallel"],		iterator_types = ["parallel", "parallel"],
library_call = "some_broadcast_external_fn"		library_call = "some_broadcast_external_fn"
}		}

func @generic_op_zero_rank(%arg0: memref<f32>, %arg1: memref<3x4xf32>)		func @generic_op_zero_rank(%arg0: memref<f32>, %arg1: memref<3x4xf32>)
{		{
linalg.generic #trait_broadcast %arg0, %arg1 {		linalg.generic #trait_broadcast
		ins(%arg0 : memref<f32>)
		outs(%arg1 : memref<3x4xf32>) {
^bb(%a: f32, %b: f32) :		^bb(%a: f32, %b: f32) :
linalg.yield %a : f32		linalg.yield %a : f32
} : memref<f32>, memref<3x4xf32>		}
return		return
}		}

// CHECKLOOP-LABEL: @generic_op_zero_rank		// CHECKLOOP-LABEL: @generic_op_zero_rank
// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xf32>		// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xf32>
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: scf.for %[[j:.]] = {{.}}		// CHECKLOOP: scf.for %[[j:.]] = {{.}}
// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][]		// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][]
// CHECKLOOP: store %[[a]], %[[ARG1]][%[[i]], %[[j]]]		// CHECKLOOP: store %[[a]], %[[ARG1]][%[[i]], %[[j]]]

// CHECKPARALLEL-LABEL: @generic_op_zero_rank		// CHECKPARALLEL-LABEL: @generic_op_zero_rank
// CHECKPARALLEL-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKPARALLEL-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKPARALLEL-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xf32>		// CHECKPARALLEL-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xf32>
// CHECKPARALLEL: scf.parallel (%[[i:[a-zA-Z0-9_]]], %[[j:[a-zA-Z0-9_]]])		// CHECKPARALLEL: scf.parallel (%[[i:[a-zA-Z0-9_]]], %[[j:[a-zA-Z0-9_]]])
// CHECKPARALLEL: %[[a:.*]] = load %[[ARG0]][]		// CHECKPARALLEL: %[[a:.*]] = load %[[ARG0]][]
// CHECKPARALLEL: store %[[a]], %[[ARG1]][%[[i]], %[[j]]]		// CHECKPARALLEL: store %[[a]], %[[ARG1]][%[[i]], %[[j]]]

func @indexed_generic_op_zero_rank(%arg0: memref<i32>, %arg1: memref<3x4xi32>)		func @indexed_generic_op_zero_rank(%arg0: memref<i32>, %arg1: memref<3x4xi32>)
{		{
linalg.indexed_generic #trait_broadcast %arg0, %arg1 {		linalg.indexed_generic #trait_broadcast
		ins(%arg0 : memref<i32>)
		outs(%arg1 : memref<3x4xi32>) {
^bb(%i: index, %j: index, %a: i32, %b: i32) :		^bb(%i: index, %j: index, %a: i32, %b: i32) :
%ij = addi %i, %j : index		%ij = addi %i, %j : index
%ij_int = index_cast %ij : index to i32		%ij_int = index_cast %ij : index to i32
%result = addi %a, %ij_int : i32		%result = addi %a, %ij_int : i32
linalg.yield %result : i32		linalg.yield %result : i32
} : memref<i32>, memref<3x4xi32>		}
return		return
}		}

// CHECKLOOP-LABEL: @indexed_generic_op_zero_rank		// CHECKLOOP-LABEL: @indexed_generic_op_zero_rank
// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<i32>		// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<i32>
// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xi32>		// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<3x4xi32>
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: scf.for %[[j:.]] = {{.}}		// CHECKLOOP: scf.for %[[j:.]] = {{.}}
[23 lines not shown]
#trait_reduce_1D = {
args_out = 1,		args_out = 1,
indexing_maps = #reduce_1D_access,		indexing_maps = #reduce_1D_access,
iterator_types = ["reduction"],		iterator_types = ["reduction"],
library_call = "some_reduce_external_fn"		library_call = "some_reduce_external_fn"
}		}

func @generic_op_1D_reduce(%arg0: memref<?xf32>, %arg1: memref<f32>)		func @generic_op_1D_reduce(%arg0: memref<?xf32>, %arg1: memref<f32>)
{		{
linalg.generic #trait_reduce_1D %arg0, %arg1 {		linalg.generic #trait_reduce_1D
		ins(%arg0 : memref<?xf32>)
		outs(%arg1 : memref<f32>) {
^bb(%a: f32, %b: f32) :		^bb(%a: f32, %b: f32) :
%0 = addf %a, %b : f32		%0 = addf %a, %b : f32
linalg.yield %0 : f32		linalg.yield %0 : f32
} : memref<?xf32>, memref<f32>		}
return		return
}		}
// CHECKLOOP-LABEL: @generic_op_1D_reduce		// CHECKLOOP-LABEL: @generic_op_1D_reduce
// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<?xf32>		// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<?xf32>
// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][%[[i]]]		// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][%[[i]]]
// CHECKLOOP: %[[b:.*]] = load %[[ARG1]][]		// CHECKLOOP: %[[b:.*]] = load %[[ARG1]][]
[23 lines not shown]
#trait_reduce_init_1D = {
iterator_types = ["reduction"],		iterator_types = ["reduction"],
library_call = "some_reduce_external_fn"		library_call = "some_reduce_external_fn"
}		}

func @indexed_generic_op_1D_reduce(%arg0: memref<?xf32>,		func @indexed_generic_op_1D_reduce(%arg0: memref<?xf32>,
%arg1: memref<f32>,		%arg1: memref<f32>,
%arg2: memref<f32>)		%arg2: memref<f32>)
{		{
linalg.indexed_generic #trait_reduce_init_1D %arg0, %arg1, %arg2 {		linalg.indexed_generic #trait_reduce_init_1D
		ins(%arg0, %arg1 : memref<?xf32>, memref<f32>)
		outs(%arg2 : memref<f32>) {
^bb(%i : index, %a: f32, %b: f32, %c: f32) :		^bb(%i : index, %a: f32, %b: f32, %c: f32) :
%0 = constant 0 : index		%0 = constant 0 : index
%1 = cmpi "eq", %0, %i : index		%1 = cmpi "eq", %0, %i : index
%2 = select %1, %b, %c : f32		%2 = select %1, %b, %c : f32
%3 = addf %a, %2 : f32		%3 = addf %a, %2 : f32
linalg.yield %3 : f32		linalg.yield %3 : f32
} : memref<?xf32>, memref<f32>, memref<f32>		}
return		return
}		}
// CHECKLOOP-LABEL: @indexed_generic_op_1D_reduce		// CHECKLOOP-LABEL: @indexed_generic_op_1D_reduce
// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<?xf32>		// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<?xf32>
// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][%[[i]]]		// CHECKLOOP: %[[a:.*]] = load %[[ARG0]][%[[i]]]
[19 lines not shown]
#trait_const_fill = {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [affine_map<(i) -> (i)>],		indexing_maps = [affine_map<(i) -> (i)>],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
library_call = "some_external_fn"		library_call = "some_external_fn"
}		}
func @generic_const_init(%arg0: memref<?xf32>) {		func @generic_const_init(%arg0: memref<?xf32>) {
%cst = constant 1.0 : f32		%cst = constant 1.0 : f32
linalg.generic #trait_const_fill %arg0 {		linalg.generic #trait_const_fill outs(%arg0 : memref<?xf32>) {
^bb0(%arg1: f32): // no predecessors		^bb0(%arg1: f32): // no predecessors
linalg.yield %cst : f32		linalg.yield %cst : f32
}: memref<?xf32>		}
return		return
}		}
// CHECKLOOP-LABEL: @generic_const_init		// CHECKLOOP-LABEL: @generic_const_init
// CHECKLOOP-SAME: %[[ARG0:.*]]: memref<?xf32>		// CHECKLOOP-SAME: %[[ARG0:.*]]: memref<?xf32>
// CHECKLOOP: %[[CONST:.*]] = constant 1.000000e+00 : f32		// CHECKLOOP: %[[CONST:.*]] = constant 1.000000e+00 : f32
// CHECKLOOP: scf.for %[[i:.]] = {{.}}		// CHECKLOOP: scf.for %[[i:.]] = {{.}}
// CHECKLOOP: store %[[CONST]], %[[ARG0]]		// CHECKLOOP: store %[[CONST]], %[[ARG0]]

[12 lines not shown]
#scalar_trait = {
args_in = 2,		args_in = 2,
args_out = 1,		args_out = 1,
iterator_types = [],		iterator_types = [],
indexing_maps = #scalar_access,		indexing_maps = #scalar_access,
library_call = "some_external_fn"		library_call = "some_external_fn"
}		}
func @scalar_code(%arg0: memref<f32>, %arg1 : memref<f32>, %arg2 : memref<f32>)		func @scalar_code(%arg0: memref<f32>, %arg1 : memref<f32>, %arg2 : memref<f32>)
{		{
linalg.generic #scalar_trait %arg0, %arg1, %arg2 {		linalg.generic #scalar_trait
		ins(%arg0, %arg1 : memref<f32>, memref<f32>)
		outs(%arg2 : memref<f32>) {
^bb(%a : f32, %b : f32, %c : f32) :		^bb(%a : f32, %b : f32, %c : f32) :
%0 = addf %a, %b : f32		%0 = addf %a, %b : f32
linalg.yield %0 : f32		linalg.yield %0 : f32
} : memref<f32>, memref<f32>, memref<f32>		}
return		return
}		}
// CHECKLOOP-LABEL: @scalar_code		// CHECKLOOP-LABEL: @scalar_code
// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG0:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG1:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: memref<f32>		// CHECKLOOP-SAME: %[[ARG2:[a-zA-Z0-9_]*]]: memref<f32>
// CHECKLOOP-NOT: scf.for		// CHECKLOOP-NOT: scf.for
// CHECKLOOP: load %[[ARG0]][]		// CHECKLOOP: load %[[ARG0]][]
[68 lines not shown]
#conv_1d_trait = {
indexing_maps = #conv_1d_accesses,		indexing_maps = #conv_1d_accesses,
library_call = "linalg_conv_1d",		library_call = "linalg_conv_1d",
n_views = [2, 1],		n_views = [2, 1],
iterator_types = ["parallel", "parallel"],		iterator_types = ["parallel", "parallel"],
symbol_source = 1		symbol_source = 1
}		}

func @conv1d(%in : memref<?xf32>, %filter : memref<?xf32>, %out : memref<?xf32>) -> () {		func @conv1d(%in : memref<?xf32>, %filter : memref<?xf32>, %out : memref<?xf32>) -> () {
linalg.generic #conv_1d_trait %in, %filter, %out {		linalg.generic #conv_1d_trait
		ins(%in, %filter : memref<?xf32>, memref<?xf32>)
		outs(%out : memref<?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?xf32>,		}
memref<?xf32>,
memref<?xf32>
return		return
}		}

// CHECKLOOP-LABEL: @conv1d		// CHECKLOOP-LABEL: @conv1d
// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>		// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>
// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>		// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>
// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>		// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>
// CHECKLOOP: %[[c0:.*]] = constant 0 : index		// CHECKLOOP: %[[c0:.*]] = constant 0 : index
[44 lines not shown]
#conv_2d_trait = {
indexing_maps = #conv_2d_accesses,		indexing_maps = #conv_2d_accesses,
library_call = "linalg_conv_2d",		library_call = "linalg_conv_2d",
n_views = [2, 1],		n_views = [2, 1],
iterator_types = ["parallel", "parallel", "parallel", "parallel"],		iterator_types = ["parallel", "parallel", "parallel", "parallel"],
symbol_source = 1		symbol_source = 1
}		}

func @conv2d(%in : memref<?x?xf32>, %filter : memref<?x?xf32>, %out : memref<?x?xf32>) -> () {		func @conv2d(%in : memref<?x?xf32>, %filter : memref<?x?xf32>, %out : memref<?x?xf32>) -> () {
linalg.generic #conv_2d_trait %in, %filter, %out {		linalg.generic #conv_2d_trait
		ins(%in, %filter : memref<?x?xf32>, memref<?x?xf32>)
		outs(%out : memref<?x?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?x?xf32>,		}
memref<?x?xf32>,
memref<?x?xf32>
return		return
}		}

// CHECKLOOP-LABEL: @conv2d		// CHECKLOOP-LABEL: @conv2d
// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>		// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>
// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>		// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>
// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>		// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>
// CHECKLOOP: %[[c0:.*]] = constant 0 : index		// CHECKLOOP: %[[c0:.*]] = constant 0 : index
[60 lines not shown]
#conv_3d_trait = {
indexing_maps = #conv_3d_accesses,		indexing_maps = #conv_3d_accesses,
library_call = "linalg_conv_3d",		library_call = "linalg_conv_3d",
n_views = [2, 1],		n_views = [2, 1],
iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],		iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],
symbol_source = 1		symbol_source = 1
}		}

func @conv3d(%in : memref<?x?x?xf32>, %filter : memref<?x?x?xf32>, %out : memref<?x?x?xf32>) -> () {		func @conv3d(%in : memref<?x?x?xf32>, %filter : memref<?x?x?xf32>, %out : memref<?x?x?xf32>) -> () {
linalg.generic #conv_3d_trait %in, %filter, %out {		linalg.generic #conv_3d_trait
		ins(%in, %filter : memref<?x?x?xf32>, memref<?x?x?xf32>)
		outs(%out : memref<?x?x?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?x?x?xf32>,		}
memref<?x?x?xf32>,
memref<?x?x?xf32>
return		return
}		}

// CHECKLOOP-LABEL: @conv3d		// CHECKLOOP-LABEL: @conv3d
// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>		// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>		// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>		// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
// CHECKLOOP: %[[c0:.*]] = constant 0 : index		// CHECKLOOP: %[[c0:.*]] = constant 0 : index
[76 lines not shown]
#conv_4d_trait = {
indexing_maps = #conv_4d_accesses,		indexing_maps = #conv_4d_accesses,
library_call = "linalg_conv_4d",		library_call = "linalg_conv_4d",
n_views = [2, 1],		n_views = [2, 1],
iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],		iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],
symbol_source = 1		symbol_source = 1
}		}

func @conv4d(%in : memref<?x?x?x?xf32>, %filter : memref<?x?x?x?xf32>, %out : memref<?x?x?x?xf32>) -> () {		func @conv4d(%in : memref<?x?x?x?xf32>, %filter : memref<?x?x?x?xf32>, %out : memref<?x?x?x?xf32>) -> () {
linalg.generic #conv_4d_trait %in, %filter, %out {		linalg.generic #conv_4d_trait
		ins(%in, %filter : memref<?x?x?x?xf32>, memref<?x?x?x?xf32>)
		outs(%out : memref<?x?x?x?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32) :		^bb0(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b : f32		%d = mulf %a, %b : f32
%e = addf %c, %d : f32		%e = addf %c, %d : f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<?x?x?x?xf32>,		}
memref<?x?x?x?xf32>,
memref<?x?x?x?xf32>
return		return
}		}

// CHECKLOOP-LABEL: @conv4d		// CHECKLOOP-LABEL: @conv4d
// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>		// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>		// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>		// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
// CHECKLOOP: %[[c0:.*]] = constant 0 : index		// CHECKLOOP: %[[c0:.*]] = constant 0 : index

mlir/test/Dialect/Linalg/parallel_loops.mlir

	// RUN: mlir-opt %s -convert-linalg-to-parallel-loops -split-input-file | FileCheck %s			// RUN: mlir-opt %s -convert-linalg-to-parallel-loops -split-input-file | FileCheck %s

	#map0 = affine_map<(d0, d1) -> (d0, d1)>			#map0 = affine_map<(d0, d1) -> (d0, d1)>
	func @linalg_generic_sum(%lhs: memref<2x2xf32>,			func @linalg_generic_sum(%lhs: memref<2x2xf32>,
	%rhs: memref<2x2xf32>,			%rhs: memref<2x2xf32>,
	%sum: memref<2x2xf32>) {			%sum: memref<2x2xf32>) {
	linalg.generic {			linalg.generic {
	args_in = 2 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0, #map0],			indexing_maps = [#map0, #map0, #map0],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]}
	} %lhs, %rhs, %sum {			ins(%lhs, %rhs : memref<2x2xf32>, memref<2x2xf32>)
				outs(%sum : memref<2x2xf32>) {
	^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32): // no predecessors			^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32): // no predecessors
	%0 = addf %lhs_in, %rhs_in : f32			%0 = addf %lhs_in, %rhs_in : f32
	linalg.yield %0 : f32			linalg.yield %0 : f32
	}: memref<2x2xf32>, memref<2x2xf32>, memref<2x2xf32>			}
	return			return
	}			}
	// CHECK-LABEL: @linalg_generic_sum			// CHECK-LABEL: @linalg_generic_sum
	// CHECK: (%[[LHS:.*]]:{{.*}}, %[[RHS:.*]]:{{.*}}, %[[SUM:.*]]:{{.*}})			// CHECK: (%[[LHS:.*]]:{{.*}}, %[[RHS:.*]]:{{.*}}, %[[SUM:.*]]:{{.*}})
	// CHECK-DAG: %[[C2:.*]] = constant 2			// CHECK-DAG: %[[C2:.*]] = constant 2
	// CHECK-DAG: %[[C0:.*]] = constant 0			// CHECK-DAG: %[[C0:.*]] = constant 0
	// CHECK-DAG: %[[C1:.*]] = constant 1			// CHECK-DAG: %[[C1:.*]] = constant 1
	// CHECK: scf.parallel (%[[I:.*]], %[[J:.*]]) = {{.*}}			// CHECK: scf.parallel (%[[I:.*]], %[[J:.*]]) = {{.*}}
	// CHECK: %[[LHS_ELEM:.*]] = load %[[LHS]][%[[I]], %[[J]]]			// CHECK: %[[LHS_ELEM:.*]] = load %[[LHS]][%[[I]], %[[J]]]
	// CHECK: %[[RHS_ELEM:.*]] = load %[[RHS]][%[[I]], %[[J]]]			// CHECK: %[[RHS_ELEM:.*]] = load %[[RHS]][%[[I]], %[[J]]]
	// CHECK: %[[SUM:.*]] = addf %[[LHS_ELEM]], %[[RHS_ELEM]] : f32			// CHECK: %[[SUM:.*]] = addf %[[LHS_ELEM]], %[[RHS_ELEM]] : f32
	// CHECK: store %[[SUM]], %{{.*}}[%[[I]], %[[J]]]			// CHECK: store %[[SUM]], %{{.*}}[%[[I]], %[[J]]]
	// CHECK: scf.yield			// CHECK: scf.yield

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>,			affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>,
	affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>			affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>
	]			]
	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "reduction", "parallel"],			iterator_types = ["parallel", "parallel", "reduction", "parallel"],
	indexing_maps = #accesses			indexing_maps = #accesses
	}			}

	func @lower_outer_parallel(%A: memref<?x?x?x?xf32>, %B: memref<?x?x?xf32>) {			func @lower_outer_parallel(%A: memref<?x?x?x?xf32>, %B: memref<?x?x?xf32>) {
	linalg.generic #trait %A, %B {			linalg.generic #trait
				ins(%A : memref<?x?x?x?xf32>)
				outs(%B : memref<?x?x?xf32>) {
	^bb0(%a: f32, %b: f32):			^bb0(%a: f32, %b: f32):
	linalg.yield %a: f32			linalg.yield %a: f32
	} : memref<?x?x?x?xf32>, memref<?x?x?xf32>			}
	return			return
	}			}
	// CHECK-LABEL: @lower_outer_parallel			// CHECK-LABEL: @lower_outer_parallel
	// CHECK-DAG: %[[C0:.*]] = constant 0			// CHECK-DAG: %[[C0:.*]] = constant 0
	// CHECK-DAG: %[[C1:.*]] = constant 1			// CHECK-DAG: %[[C1:.*]] = constant 1
	// CHECK-DAG: %[[D0:.*]] = dim %{{.*}}, %c0			// CHECK-DAG: %[[D0:.*]] = dim %{{.*}}, %c0
	// CHECK-DAG: %[[D1:.*]] = dim %{{.*}}, %c1			// CHECK-DAG: %[[D1:.*]] = dim %{{.*}}, %c1
	// CHECK-DAG: %[[D2:.*]] = dim %{{.*}}, %c2			// CHECK-DAG: %[[D2:.*]] = dim %{{.*}}, %c2
	// CHECK-DAG: %[[D3:.*]] = dim %{{.*}}, %c3			// CHECK-DAG: %[[D3:.*]] = dim %{{.*}}, %c3
	// CHECK: scf.parallel (%[[IV0:.*]], %[[IV1:.*]]) = (%[[C0]], %[[C0]]) to (%[[D0]], %[[D1]]) step (%[[C1]], %[[C1]])			// CHECK: scf.parallel (%[[IV0:.*]], %[[IV1:.*]]) = (%[[C0]], %[[C0]]) to (%[[D0]], %[[D1]]) step (%[[C1]], %[[C1]])
	// CHECK: scf.for %[[IV2:.*]] = %[[C0]] to %[[D2]] step %[[C1]]			// CHECK: scf.for %[[IV2:.*]] = %[[C0]] to %[[D2]] step %[[C1]]
	// CHECK: scf.parallel (%[[IV3:.*]]) = (%[[C0]]) to (%[[D3]]) step (%[[C1]])			// CHECK: scf.parallel (%[[IV3:.*]]) = (%[[C0]]) to (%[[D3]]) step (%[[C1]])
	// CHECK: load %{{.*}}[%[[IV0]], %[[IV1]], %[[IV2]], %[[IV3]]]			// CHECK: load %{{.*}}[%[[IV0]], %[[IV1]], %[[IV2]], %[[IV3]]]
	// CHECK: store %{{.*}}, %{{.*}}[%[[IV0]], %[[IV1]], %[[IV3]]]			// CHECK: store %{{.*}}, %{{.*}}[%[[IV0]], %[[IV1]], %[[IV3]]]

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>,			affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>,
	affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d2, d4, d5)>			affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d2, d4, d5)>
	]			]
	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "reduction", "parallel", "parallel", "reduction"],			iterator_types = ["parallel", "parallel", "reduction", "parallel", "parallel", "reduction"],
	indexing_maps = #accesses			indexing_maps = #accesses
	}			}

	func @lower_mixed_parallel(%A: memref<?x?x?x?x?x?xf32>, %B: memref<?x?x?x?xf32>) {			func @lower_mixed_parallel(%A: memref<?x?x?x?x?x?xf32>, %B: memref<?x?x?x?xf32>) {
	linalg.generic #trait %A, %B {			linalg.generic #trait
				ins(%A : memref<?x?x?x?x?x?xf32>)
				outs(%B : memref<?x?x?x?xf32>) {
	^bb0(%a: f32, %b: f32):			^bb0(%a: f32, %b: f32):
	linalg.yield %a: f32			linalg.yield %a: f32
	} : memref<?x?x?x?x?x?xf32>, memref<?x?x?x?xf32>			}
	return			return
	}			}
	// CHECK-LABEL: @lower_mixed_parallel			// CHECK-LABEL: @lower_mixed_parallel
	// CHECK-DAG: %[[C0:.*]] = constant 0			// CHECK-DAG: %[[C0:.*]] = constant 0
	// CHECK-DAG: %[[C1:.*]] = constant 1			// CHECK-DAG: %[[C1:.*]] = constant 1
	// CHECK-DAG: %[[D0:.*]] = dim %{{.*}}, %c0			// CHECK-DAG: %[[D0:.*]] = dim %{{.*}}, %c0
	// CHECK-DAG: %[[D1:.*]] = dim %{{.*}}, %c1			// CHECK-DAG: %[[D1:.*]] = dim %{{.*}}, %c1
	// CHECK-DAG: %[[D2:.*]] = dim %{{.*}}, %c2			// CHECK-DAG: %[[D2:.*]] = dim %{{.*}}, %c2

mlir/test/Dialect/Linalg/roundtrip.mlir

	// CHECK-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECK-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]

	#trait = {			#trait = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	library_call = "some_external_function_name_1"			library_call = "some_external_function_name_1"
	}			}

	func @generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			func @generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,
	%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.generic #trait {foo = 1} %arg0, %arg1 {			linalg.generic #trait
				ins(%arg0 : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>)
				outs(%arg1 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>)
				attrs = {foo = 1} {
	^bb(%0: vector<3x4xi4>, %1: f32) :			^bb(%0: vector<3x4xi4>, %1: f32) :
	%f0 = constant 0.0 : f32			%f0 = constant 0.0 : f32
	linalg.yield %f0 : f32			linalg.yield %f0 : f32
	} : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			}
	memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @generic			// CHECK-LABEL: func @generic
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{[0-9a-z]*}}, #{{[0-9a-z]*}}],
	// CHECK-SAME: library_call = "some_external_function_name_1"			// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"],
				// CHECK-SAME: library_call = "some_external_function_name_1"}
				// CHECK-SAME: ins({{.*}} : memref<?x?xvector<3x4xi4>, #[[$strided2D]]>)
				// CHECK-SAME: outs({{.*}} : memref<?x?x?xf32, #[[$strided3D]]>)
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: {foo = 1 : i64}
	// CHECK: memref<?x?xvector<3x4xi4>, #[[$strided2D]]>, memref<?x?x?xf32, #[[$strided3D]]>

	func @generic_with_tensor_input(%arg0: tensor<?x?xvector<3x4xi4>>,			func @generic_with_tensor_input(%arg0: tensor<?x?xvector<3x4xi4>>,
	%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.generic #trait {foo = 1} %arg0, %arg1 {			linalg.generic #trait
				ins(%arg0 : tensor<?x?xvector<3x4xi4>>)
				outs(%arg1 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>)
				attrs = {foo = 1} {
	^bb(%0: vector<3x4xi4>, %1: f32) :			^bb(%0: vector<3x4xi4>, %1: f32) :
	%f0 = constant 0.0 : f32			%f0 = constant 0.0 : f32
	linalg.yield %f0 : f32			linalg.yield %f0 : f32
	} : tensor<?x?xvector<3x4xi4>>,			}
	memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @generic_with_tensor_input			// CHECK-LABEL: func @generic_with_tensor_input
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_1"}			// CHECK-SAME: library_call = "some_external_function_name_1"}
				// CHECK-SAME: ins({{.*}} : tensor<?x?xvector<3x4xi4>>)
				// CHECK-SAME: outs({{.*}} : memref<?x?x?xf32, #[[$strided3D]]>)
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: {foo = 1 : i64}
	// CHECK: tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, #[[$strided3D]]>

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]

	#trait2 = {			#trait2 = {
	args_in = 2,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	library_call = "some_external_function_name_1"			library_call = "some_external_function_name_1"
	}			}

	func @generic_with_tensor_input_and_output(			func @generic_with_tensor_input_and_output(
	%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)			%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)
	-> (tensor<?x?x?xf32>) {			-> (tensor<?x?x?xf32>) {
	%0 = linalg.generic #trait2 {foo = 1} %arg0, %arg1 {			%0 = linalg.generic #trait2
				ins(%arg0, %arg1 : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32>)
				attrs = {foo = 1} {
	^bb(%0: vector<3x4xi4>, %1: f32) :			^bb(%0: vector<3x4xi4>, %1: f32) :
	%f0 = constant 0.0 : f32			%f0 = constant 0.0 : f32
	linalg.yield %f0 : f32			linalg.yield %f0 : f32
	} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			} -> tensor<?x?x?xf32>
	return %0 : tensor<?x?x?xf32>			return %0 : tensor<?x?x?xf32>
	}			}
	// CHECK-LABEL: func @generic_with_tensor_input_and_output			// CHECK-LABEL: func @generic_with_tensor_input_and_output
	// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_1"}			// CHECK-SAME: library_call = "some_external_function_name_1"}
				// CHECK-SAME: ins({{.*}} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32>)
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: {foo = 1 : i64}
	// CHECK-SAME: %{{.*}}, %{{.*}}			// CHECK: -> tensor<?x?x?xf32>
	// CHECK: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	// CHECK: return {{.*}} : tensor<?x?x?xf32>			// CHECK: return {{.*}} : tensor<?x?x?xf32>

	// -----			// -----

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]

	#trait2 = {			#trait2 = {
	args_in = 2,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	library_call = "some_external_function_name_1"			library_call = "some_external_function_name_1"
	}			}

	func @indexed_generic_with_tensor_input_and_output(			func @indexed_generic_with_tensor_input_and_output(
	%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)			%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>)
	-> (tensor<?x?x?xf32>) {			-> (tensor<?x?x?xf32>) {
	%0 = linalg.indexed_generic #trait2 {foo = 1} %arg0, %arg1 {			%0 = linalg.indexed_generic #trait2
				ins(%arg0, %arg1 : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32>)
				attrs = {foo = 1} {
	^bb(%i: index, %j: index, %k: index, %0: vector<3x4xi4>, %1: f32) :			^bb(%i: index, %j: index, %k: index, %0: vector<3x4xi4>, %1: f32) :
	%f0 = constant 0.0 : f32			%f0 = constant 0.0 : f32
	linalg.yield %f0 : f32			linalg.yield %f0 : f32
	} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			} -> tensor<?x?x?xf32>
	return %0 : tensor<?x?x?xf32>			return %0 : tensor<?x?x?xf32>
	}			}
	// CHECK-LABEL: func @indexed_generic_with_tensor_input_and_output			// CHECK-LABEL: func @indexed_generic_with_tensor_input_and_output
	// CHECK: linalg.indexed_generic {args_in = 2 : i64, args_out = 1 : i64,			// CHECK: linalg.indexed_generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_1"}			// CHECK-SAME: library_call = "some_external_function_name_1"}
				// CHECK-SAME: ins({{.*}} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32>)
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: {foo = 1 : i64}
	// CHECK-SAME: %{{.*}}, %{{.*}}			// CHECK: -> tensor<?x?x?xf32>
	// CHECK: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	// CHECK: return {{.*}} : tensor<?x?x?xf32>			// CHECK: return {{.*}} : tensor<?x?x?xf32>

	// -----			// -----

	#broadcast_access = [			#broadcast_access = [
	affine_map<(i, j) -> ()>,			affine_map<(i, j) -> ()>,
	affine_map<(i, j) -> (i, j)>			affine_map<(i, j) -> (i, j)>
	]			]

	#trait_broadcast = {			#trait_broadcast = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #broadcast_access,			indexing_maps = #broadcast_access,
	iterator_types = ["parallel", "parallel"],			iterator_types = ["parallel", "parallel"],
	library_call = "some_broadcast_external_fn"			library_call = "some_broadcast_external_fn"
	}			}

	func @generic_op_zero_rank(%arg0: tensor<f32>) -> (tensor<3x4xf32>)			func @generic_op_zero_rank(%arg0: tensor<f32>) -> (tensor<3x4xf32>)
	{			{
	%0 = linalg.generic #trait_broadcast %arg0 {			%0 = linalg.generic #trait_broadcast
				ins(%arg0 : tensor<f32>) {
	^bb(%a: f32) :			^bb(%a: f32) :
	linalg.yield %a : f32			linalg.yield %a : f32
	} : tensor<f32> -> tensor<3x4xf32>			} -> tensor<3x4xf32>
	return %0 : tensor<3x4xf32>			return %0 : tensor<3x4xf32>
	}			}

	func @indexed_generic_op_zero_rank(%arg0: tensor<f32>) -> (tensor<3x4xf32>)			func @indexed_generic_op_zero_rank(%arg0: tensor<f32>) -> (tensor<3x4xf32>)
	{			{
	%0 = linalg.indexed_generic #trait_broadcast %arg0 {			%0 = linalg.indexed_generic #trait_broadcast
				ins(%arg0 : tensor<f32>) {
	^bb(%i: index, %j: index, %a: f32) :			^bb(%i: index, %j: index, %a: f32) :
	linalg.yield %a : f32			linalg.yield %a : f32
	} : tensor<f32> -> tensor<3x4xf32>			} -> tensor<3x4xf32>
	return %0 : tensor<3x4xf32>			return %0 : tensor<3x4xf32>
	}			}

	// -----			// -----

	// CHECK-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>			// CHECK-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>
	// CHECK-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECK-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>

	#accesses = [			#accesses = [
	affine_map<(i, j, k) -> (j, i)>,			affine_map<(i, j, k) -> (j, i)>,
	affine_map<(i, j, k) -> (i, k, i + j)>			affine_map<(i, j, k) -> (i, k, i + j)>
	]			]

	#trait3 = {			#trait3 = {
	args_in = 1,
	args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	library_call = "some_external_function_name_2"			library_call = "some_external_function_name_2"
	}			}

	func @generic_region(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			func @generic_region(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,
	%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.generic #trait3 {foo = 1} %arg0, %arg1 {			linalg.generic #trait3
				ins(%arg0 : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>)
				outs(%arg1 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>)
				attrs = {foo = 1} {
	^bb(%a: vector<3x4xi4>, %b: f32) :			^bb(%a: vector<3x4xi4>, %b: f32) :
	linalg.yield %b : f32			linalg.yield %b : f32
	} : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			}
	memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @generic_region			// CHECK-LABEL: func @generic_region
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,			// CHECK: linalg.generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{[0-9a-z]*}}, #{{[0-9a-z]*}}],
				// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_2"			// CHECK-SAME: library_call = "some_external_function_name_2"
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: ins({{.*}} : memref<?x?xvector<3x4xi4>, #[[$strided2D]]>)
				// CHECK-SAME: outs({{.*}} : memref<?x?x?xf32, #[[$strided3D]]>)
				// CHECK-SAME: attrs = {foo = 1 : i64} {
	// CHECK: ^{{.*}}(%{{.*}}: vector<3x4xi4>, %{{.*}}: f32):			// CHECK: ^{{.*}}(%{{.*}}: vector<3x4xi4>, %{{.*}}: f32):
	// CHECK: linalg.yield %{{.*}} : f32			// CHECK: linalg.yield %{{.*}} : f32
	// CHECK: memref<?x?xvector<3x4xi4>, #[[$strided2D]]>,
	// CHECK-SAME: memref<?x?x?xf32, #[[$strided3D]]>

	func @indexed_generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			func @indexed_generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,
	%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.indexed_generic #trait3 {foo = 1} %arg0, %arg1 {			linalg.indexed_generic #trait3
				ins(%arg0 : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>)
				outs(%arg1 : memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>)
				attrs = {foo = 1} {
	^bb(%i: index, %j: index, %k: index, %a: vector<3x4xi4>, %b: f32) :			^bb(%i: index, %j: index, %k: index, %a: vector<3x4xi4>, %b: f32) :
	linalg.yield %b : f32			linalg.yield %b : f32
	}: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			}
	memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @indexed_generic			// CHECK-LABEL: func @indexed_generic
	// CHECK: linalg.indexed_generic {args_in = 1 : i64, args_out = 1 : i64,			// CHECK: linalg.indexed_generic {
	// CHECK-SAME: indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"],			// CHECK-SAME: indexing_maps = [#{{[0-9a-z]*}}, #{{[0-9a-z]*}}],
				// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel"],
	// CHECK-SAME: library_call = "some_external_function_name_2"			// CHECK-SAME: library_call = "some_external_function_name_2"
				// CHECK-SAME: ins({{.*}} : memref<?x?xvector<3x4xi4>, #[[$strided2D]]>)
				// CHECK-SAME: outs({{.*}} : memref<?x?x?xf32, #[[$strided3D]]>)
	// CHECK-SAME: {foo = 1 : i64}			// CHECK-SAME: {foo = 1 : i64}
	// CHECK: ^{{.*}}(%{{.*}}: index, %{{.*}}: index, %{{.*}}: index, %{{.*}}: vector<3x4xi4>, %{{.*}}: f32):			// CHECK: ^{{.*}}(%{{.*}}: index, %{{.*}}: index, %{{.*}}: index, %{{.*}}: vector<3x4xi4>, %{{.*}}: f32):
	// CHECK: linalg.yield %{{.*}} : f32			// CHECK: linalg.yield %{{.*}} : f32
	// CHECK: }: memref<?x?xvector<3x4xi4>, #[[$strided2D]]>,			// CHECK: }
	// CHECK-SAME: memref<?x?x?xf32, #[[$strided3D]]>

	// -----			// -----

	// CHECK-DAG: #[[$reshapeD01:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>			// CHECK-DAG: #[[$reshapeD01:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>
	// CHECK-DAG: #[[$reshapeD2:.*]] = affine_map<(d0, d1, d2) -> (d2)>			// CHECK-DAG: #[[$reshapeD2:.*]] = affine_map<(d0, d1, d2) -> (d2)>
	// CHECK-DAG: #[[$reshapeD0:.*]] = affine_map<(d0, d1, d2) -> (d0)>			// CHECK-DAG: #[[$reshapeD0:.*]] = affine_map<(d0, d1, d2) -> (d0)>
	// CHECK-DAG: #[[$reshapeD12:.*]] = affine_map<(d0, d1, d2) -> (d1, d2)>			// CHECK-DAG: #[[$reshapeD12:.*]] = affine_map<(d0, d1, d2) -> (d1, d2)>
	// CHECK-DAG: #[[$reshapeD012:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>			// CHECK-DAG: #[[$reshapeD012:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>

mlir/test/Dialect/Linalg/standard.mlir

	// CHECK-SAME: memref<?x?x?xf32, #[[$map8]]>, memref<?x?x?xf32, #[[$map8]]>			// CHECK-SAME: memref<?x?x?xf32, #[[$map8]]>, memref<?x?x?xf32, #[[$map8]]>

	#matmul_accesses = [			#matmul_accesses = [
	affine_map<(m, n, k) -> (m, k)>,			affine_map<(m, n, k) -> (m, k)>,
	affine_map<(m, n, k) -> (k, n)>,			affine_map<(m, n, k) -> (k, n)>,
	affine_map<(m, n, k) -> (m, n)>			affine_map<(m, n, k) -> (m, n)>
	]			]
	#matmul_trait = {			#matmul_trait = {
	args_in = 2,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "reduction"],			iterator_types = ["parallel", "parallel", "reduction"],
	indexing_maps = #matmul_accesses,			indexing_maps = #matmul_accesses,
	library_call = "external_outerproduct_matmul"			library_call = "external_outerproduct_matmul"
	}			}

	!vector_type_A = type vector<4xf32>			!vector_type_A = type vector<4xf32>
	!vector_type_B = type vector<4xf32>			!vector_type_B = type vector<4xf32>
	!vector_type_C = type vector<4x4xf32>			!vector_type_C = type vector<4x4xf32>

	!matrix_type_A = type memref<?x?x!vector_type_A>			!matrix_type_A = type memref<?x?x!vector_type_A>
	!matrix_type_B = type memref<?x?x!vector_type_B>			!matrix_type_B = type memref<?x?x!vector_type_B>
	!matrix_type_C = type memref<?x?x!vector_type_C>			!matrix_type_C = type memref<?x?x!vector_type_C>

	func @matmul_vec_impl(%A: !matrix_type_A, %B: !matrix_type_B, %C: !matrix_type_C) {			func @matmul_vec_impl(%A: !matrix_type_A, %B: !matrix_type_B, %C: !matrix_type_C) {
	linalg.generic #matmul_trait %A, %B, %C {			linalg.generic #matmul_trait
				ins(%A, %B : !matrix_type_A, !matrix_type_B)
				outs(%C : !matrix_type_C) {
	^bb0(%a: !vector_type_A, %b: !vector_type_B, %c: !vector_type_C):			^bb0(%a: !vector_type_A, %b: !vector_type_B, %c: !vector_type_C):
	%d = vector.outerproduct %a, %b, %c: !vector_type_A, !vector_type_B			%d = vector.outerproduct %a, %b, %c: !vector_type_A, !vector_type_B
	linalg.yield %d: !vector_type_C			linalg.yield %d: !vector_type_C
	} : !matrix_type_A, !matrix_type_B, !matrix_type_C			}

	return			return
	}			}
	// CHECK-LABEL: func @matmul_vec_impl(			// CHECK-LABEL: func @matmul_vec_impl(
	// CHECK: call @external_outerproduct_matmul(%{{.*}}) :			// CHECK: call @external_outerproduct_matmul(%{{.*}}) :

	#indexed_matmul_trait = {			#indexed_matmul_trait = {
	args_in = 2,
	args_out = 1,
	iterator_types = ["parallel", "parallel", "reduction"],			iterator_types = ["parallel", "parallel", "reduction"],
	indexing_maps = #matmul_accesses,			indexing_maps = #matmul_accesses,
	library_call = "external_indexed_outerproduct_matmul"			library_call = "external_indexed_outerproduct_matmul"
	}			}
	func @matmul_vec_indexed(%A: !matrix_type_A,			func @matmul_vec_indexed(%A: !matrix_type_A,
	%B: !matrix_type_B,			%B: !matrix_type_B,
	%C: !matrix_type_C) {			%C: !matrix_type_C) {
	linalg.indexed_generic #indexed_matmul_trait %A, %B, %C {			linalg.indexed_generic #indexed_matmul_trait
				ins(%A, %B : !matrix_type_A, !matrix_type_B)
				outs(%C : !matrix_type_C) {
	^bb0(%i: index, %j: index, %k: index,			^bb0(%i: index, %j: index, %k: index,
	%a: !vector_type_A, %b: !vector_type_B, %c: !vector_type_C):			%a: !vector_type_A, %b: !vector_type_B, %c: !vector_type_C):
	%d = vector.outerproduct %a, %b, %c: !vector_type_A, !vector_type_B			%d = vector.outerproduct %a, %b, %c: !vector_type_A, !vector_type_B
	linalg.yield %d: !vector_type_C			linalg.yield %d: !vector_type_C
	} : !matrix_type_A, !matrix_type_B, !matrix_type_C			}
	return			return
	}			}
	// CHECK-LABEL: func @matmul_vec_indexed(			// CHECK-LABEL: func @matmul_vec_indexed(
	// CHECK: %[[ZERO:.*]] = constant 0 : index			// CHECK: %[[ZERO:.*]] = constant 0 : index
	// CHECK: call @external_indexed_outerproduct_matmul(%[[ZERO]], %[[ZERO]], %[[ZERO]], %{{.*}})			// CHECK: call @external_indexed_outerproduct_matmul(%[[ZERO]], %[[ZERO]], %[[ZERO]], %{{.*}})

mlir/test/Dialect/Linalg/tensors-to-buffers.mlir

	// RUN: mlir-opt -convert-linalg-on-tensors-to-buffers -buffer-placement -split-input-file %s | FileCheck %s			// RUN: mlir-opt -convert-linalg-on-tensors-to-buffers -buffer-placement -split-input-file %s | FileCheck %s

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @multiple_results_generic_op			// CHECK-LABEL: func @multiple_results_generic_op
	func @multiple_results_generic_op(%arg0: tensor<4xf32>) -> (tensor<4xf32>, tensor<4xf32>) {			func @multiple_results_generic_op(%arg0: tensor<4xf32>) -> (tensor<4xf32>, tensor<4xf32>) {
	%0, %1 = linalg.generic {args_in = 1 : i64, args_out = 2 : i64, indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel"]} %arg0 {			%0, %1 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<4xf32>) {
	^bb0(%gen_arg1: f32):			^bb0(%gen_arg1: f32):
	%tmp1 = exp %gen_arg1 : f32			%tmp1 = exp %gen_arg1 : f32
	linalg.yield %tmp1, %tmp1 : f32, f32			linalg.yield %tmp1, %tmp1 : f32, f32
	}: tensor<4xf32> -> (tensor<4xf32>, tensor<4xf32>)			} -> tensor<4xf32>, tensor<4xf32>
	return %0, %1 : tensor<4xf32>, tensor<4xf32>			return %0, %1 : tensor<4xf32>, tensor<4xf32>
	}			}
	// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]], %[[ARG1_RESULT:.*]]: [[TYPE]], %[[ARG2_RESULT:.*]]: [[TYPE]])			// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]], %[[ARG1_RESULT:.*]]: [[TYPE]], %[[ARG2_RESULT:.*]]: [[TYPE]])
	// CHECK: %[[FIRST_ALLOC:.*]] = alloc() : [[TYPE]]			// CHECK: %[[FIRST_ALLOC:.*]] = alloc() : [[TYPE]]
	// CHECK: %[[SECOND_ALLOC:.*]] = alloc() : [[TYPE]]			// CHECK: %[[SECOND_ALLOC:.*]] = alloc() : [[TYPE]]
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: %[[NEW_ARG0]], %[[FIRST_ALLOC]], %[[SECOND_ALLOC]]			// CHECK-SAME: ins(%[[NEW_ARG0]] : [[TYPE]]
				// CHECK-SAME: outs(%[[FIRST_ALLOC]], %[[SECOND_ALLOC]] : [[TYPE]], [[TYPE]]
	// CHECK-NEXT: ^{{[a-z0-9_]*}}			// CHECK-NEXT: ^{{[a-z0-9_]*}}
	// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32, %{{.*}}: f32			// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32, %{{.*}}: f32
	// CHECK-NEXT: %{{.*}} = exp			// CHECK-NEXT: %{{.*}} = exp
	// CHECK-NEXT: linalg.yield			// CHECK-NEXT: linalg.yield
	// CHECK-NEXT: [[TYPE]], [[TYPE]], [[TYPE]]
	// CHECK: linalg.copy(%[[FIRST_ALLOC]], %[[ARG1_RESULT]])			// CHECK: linalg.copy(%[[FIRST_ALLOC]], %[[ARG1_RESULT]])
	// CHECK: dealloc %[[FIRST_ALLOC]]			// CHECK: dealloc %[[FIRST_ALLOC]]
	// CHECK: linalg.copy(%[[SECOND_ALLOC]], %[[ARG2_RESULT]])			// CHECK: linalg.copy(%[[SECOND_ALLOC]], %[[ARG2_RESULT]])
	// CHECK: dealloc %[[SECOND_ALLOC]]			// CHECK: dealloc %[[SECOND_ALLOC]]
	// CHECK: return			// CHECK: return

	// -----			// -----

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @chained_operations			// CHECK-LABEL: func @chained_operations
	func @chained_operations(%arg0: tensor<4xf32>) -> tensor<4xf32> {			func @chained_operations(%arg0: tensor<4xf32>) -> tensor<4xf32> {
	%0 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%0 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<4xf32>) {
	^bb0(%gen_arg1: f32):			^bb0(%gen_arg1: f32):
	%tmp1 = exp %gen_arg1 : f32			%tmp1 = exp %gen_arg1 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<4xf32> -> tensor<4xf32>			} -> tensor<4xf32>
	%1 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %0 {			%1 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%0 : tensor<4xf32>) {
	^bb0(%gen_arg2: f32):			^bb0(%gen_arg2: f32):
	%tmp2 = exp %gen_arg2 : f32			%tmp2 = exp %gen_arg2 : f32
	linalg.yield %tmp2 : f32			linalg.yield %tmp2 : f32
	}: tensor<4xf32> -> tensor<4xf32>			} -> tensor<4xf32>
	return %1 : tensor<4xf32>			return %1 : tensor<4xf32>
	}			}
	// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]], %[[ARG1_RESULT:.*]]: [[TYPE]])			// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]], %[[ARG1_RESULT:.*]]: [[TYPE]])
	// CHECK: %[[FIRST_ALLOC:.*]] = alloc() : [[TYPE]]			// CHECK: %[[FIRST_ALLOC:.*]] = alloc() : [[TYPE]]
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: %[[NEW_ARG0]], %[[FIRST_ALLOC]]			// CHECK-SAME: ins(%[[NEW_ARG0]] : [[TYPE]]
				// CHECK-SAME: outs(%[[FIRST_ALLOC]] : [[TYPE]]
	// CHECK: ^{{[a-z0-9_]*}}			// CHECK: ^{{[a-z0-9_]*}}
	// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32			// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32
	// CHECK: [[TYPE]], [[TYPE]]
	// CHECK: %[[SECOND_ALLOC:.*]] = alloc() : [[TYPE]]			// CHECK: %[[SECOND_ALLOC:.*]] = alloc() : [[TYPE]]
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK-SAME: %[[FIRST_ALLOC]], %[[SECOND_ALLOC]]			// CHECK-SAME: ins(%[[FIRST_ALLOC]] : [[TYPE]]
				// CHECK-SAME: outs(%[[SECOND_ALLOC]] : [[TYPE]]
	// CHECK: ^{{[a-z0-9_]*}}			// CHECK: ^{{[a-z0-9_]*}}
	// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32			// CHECK-SAME: %{{.*}}: f32, %{{.*}}: f32
	// CHECK: [[TYPE]], [[TYPE]]
	// CHECK: dealloc %[[FIRST_ALLOC]]			// CHECK: dealloc %[[FIRST_ALLOC]]
	// CHECK: linalg.copy(%[[SECOND_ALLOC]], %[[ARG1_RESULT]])			// CHECK: linalg.copy(%[[SECOND_ALLOC]], %[[ARG1_RESULT]])
	// CHECK: dealloc %[[SECOND_ALLOC]]			// CHECK: dealloc %[[SECOND_ALLOC]]
	// CHECK: return			// CHECK: return

	// -----			// -----

	// CHECK-LABEL: func @no_linalg_op			// CHECK-LABEL: func @no_linalg_op
	func @no_linalg_op(%arg0: f32) -> (f32, f32) {			func @no_linalg_op(%arg0: f32) -> (f32, f32) {
	%0 = mulf %arg0, %arg0 : f32			%0 = mulf %arg0, %arg0 : f32
	return %0, %0 : f32, f32			return %0, %0 : f32, f32
	}			}
	// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]]) -> ([[TYPE]], [[TYPE]])			// CHECK: (%[[NEW_ARG0:.*]]: [[TYPE:.*]]) -> ([[TYPE]], [[TYPE]])
	// CHECK: %[[RESULT:.*]] = mulf %[[NEW_ARG0]], %[[NEW_ARG0]] : [[TYPE]]			// CHECK: %[[RESULT:.*]] = mulf %[[NEW_ARG0]], %[[NEW_ARG0]] : [[TYPE]]
	// CHECK: return %[[RESULT]], %[[RESULT]] : [[TYPE]], [[TYPE]]			// CHECK: return %[[RESULT]], %[[RESULT]] : [[TYPE]], [[TYPE]]

mlir/test/Dialect/Linalg/tile.mlir

#pointwise_2d_trait = {
args_in = 2,		args_in = 2,
args_out = 1,		args_out = 1,
indexing_maps = [#id_2d, #id_2d, #id_2d],		indexing_maps = [#id_2d, #id_2d, #id_2d],
iterator_types = ["parallel", "parallel"]		iterator_types = ["parallel", "parallel"]
}		}

func @pointwise(%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>, %arg1: memref<?x?xf32, offset: ?, strides: [?, 1]>,		func @pointwise(%arg0: memref<?x?xf32, offset: ?, strides: [?, 1]>, %arg1: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%arg2: memref<?x?xf32, offset: ?, strides: [?, 1]>) {		%arg2: memref<?x?xf32, offset: ?, strides: [?, 1]>) {
linalg.generic #pointwise_2d_trait %arg0, %arg1, %arg2 {		linalg.generic #pointwise_2d_trait
		ins(%arg0, %arg1 : memref<?x?xf32, offset: ?, strides: [?, 1]>, memref<?x?xf32, offset: ?, strides: [?, 1]>)
		outs(%arg2 : memref<?x?xf32, offset: ?, strides: [?, 1]>) {
^bb0(%arg4: f32, %arg5: f32, %arg6: f32): // no predecessors		^bb0(%arg4: f32, %arg5: f32, %arg6: f32): // no predecessors
%4 = addf %arg4, %arg5 : f32		%4 = addf %arg4, %arg5 : f32
linalg.yield %4 : f32		linalg.yield %4 : f32
}: memref<?x?xf32, offset: ?, strides: [?, 1]>, memref<?x?xf32, offset: ?, strides: [?, 1]>, memref<?x?xf32, offset: ?, strides: [?, 1]>		}
return		return
}		}
// TILE-2-LABEL: func @pointwise		// TILE-2-LABEL: func @pointwise
// TILE-2: for		// TILE-2: for
// TILE-2-NOT: for		// TILE-2-NOT: for
// TILE-2: linalg.generic		// TILE-2: linalg.generic

// TILE-02-LABEL: func @pointwise		// TILE-02-LABEL: func @pointwise

mlir/test/Dialect/Linalg/tile_indexed_generic.mlir

// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=10,25" | FileCheck %s -check-prefix=TILE-10n25		// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=10,25" | FileCheck %s -check-prefix=TILE-10n25
// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=25,0" | FileCheck %s -check-prefix=TILE-25n0		// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=25,0" | FileCheck %s -check-prefix=TILE-25n0
// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=0,25" | FileCheck %s -check-prefix=TILE-0n25		// RUN: mlir-opt %s -linalg-tile="linalg-tile-sizes=0,25" | FileCheck %s -check-prefix=TILE-0n25

#id_1d = affine_map<(i) -> (i)>		#id_1d = affine_map<(i) -> (i)>
#pointwise_1d_trait = {		#pointwise_1d_trait = {
args_in = 1,		args_in = 1,
args_out = 1,		args_out = 1,
indexing_maps = [#id_1d, #id_1d],		indexing_maps = [#id_1d, #id_1d],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
}		}
func @indexed_generic_vector(%operand: memref<50xf32>, %result: memref<50xf32>) {		func @indexed_generic_vector(%operand: memref<50xf32>, %result: memref<50xf32>) {
linalg.indexed_generic #pointwise_1d_trait %operand, %result {		linalg.indexed_generic #pointwise_1d_trait
		ins(%operand :memref<50xf32>)
		outs(%result : memref<50xf32>) {
^bb0(%i: index, %operand_in: f32, %result_in: f32):		^bb0(%i: index, %operand_in: f32, %result_in: f32):
%i_int = index_cast %i: index to i32		%i_int = index_cast %i: index to i32
%i_float = sitofp %i_int : i32 to f32		%i_float = sitofp %i_int : i32 to f32
%out = addf %operand_in, %i_float : f32		%out = addf %operand_in, %i_float : f32
linalg.yield %out : f32		linalg.yield %out : f32
}: memref<50xf32>, memref<50xf32>		}
return		return
}		}
// TILE-10n25-LABEL: func @indexed_generic_vector		// TILE-10n25-LABEL: func @indexed_generic_vector
// TILE-10n25: %[[C10:.*]] = constant 10 : index		// TILE-10n25: %[[C10:.*]] = constant 10 : index
// TILE-10n25: scf.for %[[J:.*]] = {{.*}} step %[[C10]]		// TILE-10n25: scf.for %[[J:.*]] = {{.*}} step %[[C10]]
// TILE-10n25: linalg.indexed_generic		// TILE-10n25: linalg.indexed_generic
// TILE-10n25: ^bb0(%[[I:.*]]: index, %[[IN:.*]]: f32, %[[OUT:.*]]: f32)		// TILE-10n25: ^bb0(%[[I:.*]]: index, %[[IN:.*]]: f32, %[[OUT:.*]]: f32)
// TILE-10n25: %[[NEW_I:.*]] = addi %[[I]], %[[J]] : index		// TILE-10n25: %[[NEW_I:.*]] = addi %[[I]], %[[J]] : index
#combined_indices_trait = {
args_out = 1,		args_out = 1,
indexing_maps = [		indexing_maps = [
affine_map<(i, j) -> (j, i + j)>,		affine_map<(i, j) -> (j, i + j)>,
affine_map<(i, j) -> (i, j)>		affine_map<(i, j) -> (i, j)>
],		],
iterator_types = ["parallel", "parallel"]		iterator_types = ["parallel", "parallel"]
}		}
func @indexed_generic_matrix(%operand: memref<50x100xf32>, %result: memref<50x100xf32>) {		func @indexed_generic_matrix(%operand: memref<50x100xf32>, %result: memref<50x100xf32>) {
linalg.indexed_generic #combined_indices_trait %operand, %result {		linalg.indexed_generic #combined_indices_trait
		ins(%operand : memref<50x100xf32>)
		outs(%result : memref<50x100xf32>) {
^bb0(%i: index, %j: index, %operand_in: f32, %result_in: f32):		^bb0(%i: index, %j: index, %operand_in: f32, %result_in: f32):
%i_int = index_cast %i: index to i32		%i_int = index_cast %i: index to i32
%i_float = sitofp %i_int : i32 to f32		%i_float = sitofp %i_int : i32 to f32
%j_int = index_cast %j: index to i32		%j_int = index_cast %j: index to i32
%j_float = sitofp %j_int : i32 to f32		%j_float = sitofp %j_int : i32 to f32
%out = addf %i_float, %j_float : f32		%out = addf %i_float, %j_float : f32
linalg.yield %out : f32		linalg.yield %out : f32
}: memref<50x100xf32>, memref<50x100xf32>		}
return		return
}		}
// TILE-10n25-LABEL: func @indexed_generic_matrix		// TILE-10n25-LABEL: func @indexed_generic_matrix
// TILE-10n25-DAG: %[[C25:.*]] = constant 25 : index		// TILE-10n25-DAG: %[[C25:.*]] = constant 25 : index
// TILE-10n25-DAG: %[[C10:.*]] = constant 10 : index		// TILE-10n25-DAG: %[[C10:.*]] = constant 10 : index
// TILE-10n25: scf.for %[[K:.*]] = {{.*}} step %[[C10]]		// TILE-10n25: scf.for %[[K:.*]] = {{.*}} step %[[C10]]
// TILE-10n25: scf.for %[[L:.*]] = {{.*}} step %[[C25]]		// TILE-10n25: scf.for %[[L:.*]] = {{.*}} step %[[C25]]
// TILE-10n25: linalg.indexed_generic		// TILE-10n25: linalg.indexed_generic

mlir/test/Dialect/Linalg/tile_parallel.mlir

	// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=2" | FileCheck %s -check-prefix=TILE-2			// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=2" | FileCheck %s -check-prefix=TILE-2
	// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=0,2" | FileCheck %s -check-prefix=TILE-02			// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=0,2" | FileCheck %s -check-prefix=TILE-02
	// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=0,0,2" | FileCheck %s -check-prefix=TILE-002			// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=0,0,2" | FileCheck %s -check-prefix=TILE-002
	// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=2,3,4" | FileCheck %s -check-prefix=TILE-234			// RUN: mlir-opt %s -linalg-tile-to-parallel-loops="linalg-tile-sizes=2,3,4" | FileCheck %s -check-prefix=TILE-234

	#id_2d = affine_map<(i, j) -> (i, j)>			#id_2d = affine_map<(i, j) -> (i, j)>
	#pointwise_2d_trait = {			#pointwise_2d_trait = {
	args_in = 2,			args_in = 2,
	args_out = 1,			args_out = 1,
	indexing_maps = [#id_2d, #id_2d, #id_2d],			indexing_maps = [#id_2d, #id_2d, #id_2d],
	iterator_types = ["parallel", "parallel"]			iterator_types = ["parallel", "parallel"]
	}			}

	func @sum(%lhs: memref<?x?xf32, offset: ?, strides: [?, 1]>,			func @sum(%lhs: memref<?x?xf32, offset: ?, strides: [?, 1]>,
	%rhs: memref<?x?xf32, offset: ?, strides: [?, 1]>,			%rhs: memref<?x?xf32, offset: ?, strides: [?, 1]>,
	%sum: memref<?x?xf32, offset: ?, strides: [?, 1]>) {			%sum: memref<?x?xf32, offset: ?, strides: [?, 1]>) {
	linalg.generic #pointwise_2d_trait %lhs, %rhs, %sum {			linalg.generic #pointwise_2d_trait
				ins(%lhs, %rhs: memref<?x?xf32, offset: ?, strides: [?, 1]>,
				memref<?x?xf32, offset: ?, strides: [?, 1]>)
				outs(%sum : memref<?x?xf32, offset: ?, strides: [?, 1]>) {
	^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32):			^bb0(%lhs_in: f32, %rhs_in: f32, %sum_out: f32):
	%result = addf %lhs_in, %rhs_in : f32			%result = addf %lhs_in, %rhs_in : f32
	linalg.yield %result : f32			linalg.yield %result : f32
	}: memref<?x?xf32, offset: ?, strides: [?, 1]>,			}
	memref<?x?xf32, offset: ?, strides: [?, 1]>,
	memref<?x?xf32, offset: ?, strides: [?, 1]>
	return			return
	}			}
	// TILE-2-LABEL: func @sum(			// TILE-2-LABEL: func @sum(
	// TILE-2-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {			// TILE-2-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {
	// TILE-2-DAG: [[C0:%.*]] = constant 0 : index			// TILE-2-DAG: [[C0:%.*]] = constant 0 : index
	// TILE-2-DAG: [[C2:%.*]] = constant 2 : index			// TILE-2-DAG: [[C2:%.*]] = constant 2 : index
	// TILE-2: [[LHS_ROWS:%.*]] = dim [[LHS]], %c0			// TILE-2: [[LHS_ROWS:%.*]] = dim [[LHS]], %c0
	// TILE-2: scf.parallel ([[I:%.*]]) = ([[C0]]) to ([[LHS_ROWS]]) step ([[C2]]) {			// TILE-2: scf.parallel ([[I:%.*]]) = ([[C0]]) to ([[LHS_ROWS]]) step ([[C2]]) {
	// TILE-2-NO: scf.parallel			// TILE-2-NO: scf.parallel
	// TILE-2: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]			// TILE-2: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]
	// TILE-2: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]			// TILE-2: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]
	// TILE-2: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]			// TILE-2: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]
	// TILE-2: linalg.generic {{.*}} [[LHS_SUBVIEW]], [[RHS_SUBVIEW]], [[SUM_SUBVIEW]] {			// TILE-2: linalg.generic {{.*}} ins([[LHS_SUBVIEW]], [[RHS_SUBVIEW]]{{.*}} outs([[SUM_SUBVIEW]]

	// TILE-02-LABEL: func @sum(			// TILE-02-LABEL: func @sum(
	// TILE-02-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {			// TILE-02-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {
	// TILE-02-DAG: [[C0:%.*]] = constant 0 : index			// TILE-02-DAG: [[C0:%.*]] = constant 0 : index
	// TILE-02-DAG: [[C2:%.*]] = constant 2 : index			// TILE-02-DAG: [[C2:%.*]] = constant 2 : index
	// TILE-02: [[LHS_COLS:%.*]] = dim [[LHS]], %c1			// TILE-02: [[LHS_COLS:%.*]] = dim [[LHS]], %c1
	// TILE-02: scf.parallel ([[I:%.*]]) = ([[C0]]) to ([[LHS_COLS]]) step ([[C2]]) {			// TILE-02: scf.parallel ([[I:%.*]]) = ([[C0]]) to ([[LHS_COLS]]) step ([[C2]]) {
	// TILE-02-NO: scf.parallel			// TILE-02-NO: scf.parallel
	// TILE-02: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]			// TILE-02: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]
	// TILE-02: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]			// TILE-02: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]
	// TILE-02: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]			// TILE-02: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]
	// TILE-02: linalg.generic {{.*}} [[LHS_SUBVIEW]], [[RHS_SUBVIEW]], [[SUM_SUBVIEW]] {			// TILE-02: linalg.generic {{.*}} ins([[LHS_SUBVIEW]], [[RHS_SUBVIEW]]{{.*}} outs([[SUM_SUBVIEW]]

	// TILE-002-LABEL: func @sum(			// TILE-002-LABEL: func @sum(
	// TILE-002-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {			// TILE-002-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {
	// TILE-002-NO: scf.parallel			// TILE-002-NO: scf.parallel
	// TILE-002: linalg.generic {{.*}} [[LHS]], [[RHS]], [[SUM]] {			// TILE-002: linalg.generic {{.*}} ins([[LHS]], [[RHS]]{{.*}} outs([[SUM]]

	// TILE-234-LABEL: func @sum(			// TILE-234-LABEL: func @sum(
	// TILE-234-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {			// TILE-234-SAME: [[LHS:%.*]]: {{.*}}, [[RHS:%.*]]: {{.*}}, [[SUM:%.*]]: {{.*}}) {
	// TILE-234-DAG: [[C0:%.*]] = constant 0 : index			// TILE-234-DAG: [[C0:%.*]] = constant 0 : index
	// TILE-234-DAG: [[C2:%.*]] = constant 2 : index			// TILE-234-DAG: [[C2:%.*]] = constant 2 : index
	// TILE-234-DAG: [[C3:%.*]] = constant 3 : index			// TILE-234-DAG: [[C3:%.*]] = constant 3 : index
	// TILE-234: [[LHS_ROWS:%.*]] = dim [[LHS]], %c0			// TILE-234: [[LHS_ROWS:%.*]] = dim [[LHS]], %c0
	// TILE-234: [[LHS_COLS:%.*]] = dim [[LHS]], %c1			// TILE-234: [[LHS_COLS:%.*]] = dim [[LHS]], %c1
	// TILE-234: scf.parallel ([[I:%.*]], [[J:%.*]]) = ([[C0]], [[C0]]) to ([[LHS_ROWS]], [[LHS_COLS]]) step ([[C2]], [[C3]]) {			// TILE-234: scf.parallel ([[I:%.*]], [[J:%.*]]) = ([[C0]], [[C0]]) to ([[LHS_ROWS]], [[LHS_COLS]]) step ([[C2]], [[C3]]) {
	// TILE-234-NO: scf.parallel			// TILE-234-NO: scf.parallel
	// TILE-234: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]			// TILE-234: [[LHS_SUBVIEW:%.*]] = subview [[LHS]]
	// TILE-234: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]			// TILE-234: [[RHS_SUBVIEW:%.*]] = subview [[RHS]]
	// TILE-234: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]			// TILE-234: [[SUM_SUBVIEW:%.*]] = subview [[SUM]]
	// TILE-234: linalg.generic {{.*}} [[LHS_SUBVIEW]], [[RHS_SUBVIEW]], [[SUM_SUBVIEW]] {			// TILE-234: linalg.generic {{.*}} ins([[LHS_SUBVIEW]], [[RHS_SUBVIEW]]{{.*}} outs([[SUM_SUBVIEW]]

mlir/test/Dialect/Linalg/tile_parallel_reduce.mlir

#trait = {
iterator_types = ["reduction", "parallel", "reduction"],		iterator_types = ["reduction", "parallel", "reduction"],
indexing_maps = #accesses		indexing_maps = #accesses
}		}

func @reduction(%arg0 : memref<?x?x?xf32>,		func @reduction(%arg0 : memref<?x?x?xf32>,
%arg1 : memref<?x?xf32>,		%arg1 : memref<?x?xf32>,
%arg2 : memref<?xf32>)		%arg2 : memref<?xf32>)
{		{
linalg.generic #trait %arg0, %arg1, %arg2 {		linalg.generic #trait
		ins(%arg0, %arg1 : memref<?x?x?xf32>, memref<?x?xf32>)
		outs(%arg2 : memref<?xf32>) {
^bb0(%arg3 : f32, %arg4 : f32, %arg5 : f32):		^bb0(%arg3 : f32, %arg4 : f32, %arg5 : f32):
%0 = addf %arg3, %arg4 : f32		%0 = addf %arg3, %arg4 : f32
%1 = addf %0, %arg5 : f32		%1 = addf %0, %arg5 : f32
linalg.yield %1 : f32		linalg.yield %1 : f32
} : memref<?x?x?xf32>, memref<?x?xf32>, memref<?xf32>		}
return		return
}		}

// CHECK-LABEL: func @reduction		// CHECK-LABEL: func @reduction
// CHECK-DAG: %[[C2:.*]] = constant 2 : index		// CHECK-DAG: %[[C2:.*]] = constant 2 : index
// CHECK-DAG: %[[C4:.*]] = constant 4 : index		// CHECK-DAG: %[[C4:.*]] = constant 4 : index
// CHECK-DAG: %[[C8:.*]] = constant 8 : index		// CHECK-DAG: %[[C8:.*]] = constant 8 : index
// CHECK: scf.for %[[ARG3:.*]] =		// CHECK: scf.for %[[ARG3:.*]] =
// CHECK-SAME: step %[[C2]]		// CHECK-SAME: step %[[C2]]
// CHECK: scf.parallel (%[[ARG4:.*]]) =		// CHECK: scf.parallel (%[[ARG4:.*]]) =
// CHECK-SAME: step (%[[C4]])		// CHECK-SAME: step (%[[C4]])
// CHECK: scf.for %[[ARG5:.*]] =		// CHECK: scf.for %[[ARG5:.*]] =
// CHECK-SAME: step %[[C8]]		// CHECK-SAME: step %[[C8]]
// CHECK: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG4]], %[[ARG5]]]		// CHECK: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG4]], %[[ARG5]]]
// CHECK: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG5]]]		// CHECK: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG5]]]
// CHECK: %[[SV3:.*]] = subview %{{.*}}[%[[ARG4]]]		// CHECK: %[[SV3:.*]] = subview %{{.*}}[%[[ARG4]]]
// CHECK: linalg.generic		// CHECK: linalg.generic
// CHECK-SAME: %[[SV1]], %[[SV2]], %[[SV3]]		// CHECK-SAME: ins(%[[SV1]], %[[SV2]]
		// CHECK-SAME: outs(%[[SV3]]

// TILE1-LABEL: func @reduction		// TILE1-LABEL: func @reduction
// TILE1-DAG: %[[C2:.*]] = constant 2 : index		// TILE1-DAG: %[[C2:.*]] = constant 2 : index
// TILE1: scf.for %[[ARG3:.*]] =		// TILE1: scf.for %[[ARG3:.*]] =
// TILE1-SAME: step %[[C2]]		// TILE1-SAME: step %[[C2]]
// TILE1: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], 0, 0]		// TILE1: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], 0, 0]
// TILE1: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], 0]		// TILE1: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], 0]
// TILE1-NOT: subview		// TILE1-NOT: subview
// TILE1: linalg.generic		// TILE1: linalg.generic
// TILE1-SAME: %[[SV1]], %[[SV2]], %{{.*}}		// TILE1-SAME: ins(%[[SV1]], %[[SV2]]
		// TILE1-SAME: outs(%{{.*}}

// TILE2-LABEL: func @reduction		// TILE2-LABEL: func @reduction
// TILE2-DAG: %[[C2:.*]] = constant 2 : index		// TILE2-DAG: %[[C2:.*]] = constant 2 : index
// TILE2-DAG: %[[C4:.*]] = constant 4 : index		// TILE2-DAG: %[[C4:.*]] = constant 4 : index
// TILE2: scf.for %[[ARG3:.*]] =		// TILE2: scf.for %[[ARG3:.*]] =
// TILE2-SAME: step %[[C2]]		// TILE2-SAME: step %[[C2]]
// TILE2: scf.parallel (%[[ARG4:.*]]) =		// TILE2: scf.parallel (%[[ARG4:.*]]) =
// TILE2-SAME: step (%[[C4]])		// TILE2-SAME: step (%[[C4]])
// TILE2: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG4]], 0]		// TILE2: %[[SV1:.*]] = subview %{{.*}}[%[[ARG3]], %[[ARG4]], 0]
// TILE2: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], 0]		// TILE2: %[[SV2:.*]] = subview %{{.*}}[%[[ARG3]], 0]
// TILE2: %[[SV3:.*]] = subview %{{.*}}[%[[ARG4]]]		// TILE2: %[[SV3:.*]] = subview %{{.*}}[%[[ARG4]]]
// TILE2: linalg.generic		// TILE2: linalg.generic
// TILE2-SAME: %[[SV1]], %[[SV2]], %[[SV3]]		// TILE2-SAME: ins(%[[SV1]], %[[SV2]]
		// TILE2-SAME: outs(%[[SV3]]

mlir/test/Dialect/Linalg/transform-patterns.mlir

indexing_maps = [
affine_map<(m, n, k) -> (k, n)>,		affine_map<(m, n, k) -> (k, n)>,
affine_map<(m, n, k) -> (m, n)>		affine_map<(m, n, k) -> (m, n)>
],		],
iterator_types = ["parallel", "parallel", "reduction"],		iterator_types = ["parallel", "parallel", "reduction"],
__internal_linalg_transform__ = "VECTORIZE"		__internal_linalg_transform__ = "VECTORIZE"
}		}
func @vectorization_test(%A: memref<8x16xf32>, %B: memref<16x32xf32>,		func @vectorization_test(%A: memref<8x16xf32>, %B: memref<16x32xf32>,
%C: memref<8x32xf32>) {		%C: memref<8x32xf32>) {
linalg.generic #matmul_trait %A, %B, %C {		linalg.generic #matmul_trait
		ins(%A, %B : memref<8x16xf32>, memref<16x32xf32>)
		outs(%C : memref<8x32xf32>) {
^bb(%a: f32, %b: f32, %c: f32) :		^bb(%a: f32, %b: f32, %c: f32) :
%d = mulf %a, %b: f32		%d = mulf %a, %b: f32
%e = addf %c, %d: f32		%e = addf %c, %d: f32
linalg.yield %e : f32		linalg.yield %e : f32
} : memref<8x16xf32>, memref<16x32xf32>, memref<8x32xf32>		}
return		return
}		}
// CHECK-LABEL: func @vectorization_test		// CHECK-LABEL: func @vectorization_test
// CHECK: vector.transfer_read %{{.*}} : memref<8x16xf32>, vector<8x16xf32>		// CHECK: vector.transfer_read %{{.*}} : memref<8x16xf32>, vector<8x16xf32>
// CHECK: vector.transfer_read %{{.*}} : memref<16x32xf32>, vector<16x32xf32>		// CHECK: vector.transfer_read %{{.*}} : memref<16x32xf32>, vector<16x32xf32>
// CHECK: vector.transfer_read %{{.*}} : memref<8x32xf32>, vector<8x32xf32>		// CHECK: vector.transfer_read %{{.*}} : memref<8x32xf32>, vector<8x32xf32>
// CHECK: vector.contract {indexing_maps = [#[[$mk]], #[[$kn]], #[[$mn]]], iterator_types = ["parallel", "parallel", "reduction"]} %{{.*}}, %{{.*}}, %{{.*}} : vector<8x16xf32>, vector<16x32xf32> into vector<8x32xf32>		// CHECK: vector.contract {indexing_maps = [#[[$mk]], #[[$kn]], #[[$mn]]], iterator_types = ["parallel", "parallel", "reduction"]} %{{.*}}, %{{.*}}, %{{.*}} : vector<8x16xf32>, vector<16x32xf32> into vector<8x32xf32>
// CHECK: vector.transfer_write %{{.*}}, %{{.*}} : vector<8x32xf32>, memref<8x32xf32>		// CHECK: vector.transfer_write %{{.*}}, %{{.*}} : vector<8x32xf32>, memref<8x32xf32>

func @vectorization_test_integer(%A: memref<8x16xi32>, %B: memref<16x32xi32>,		func @vectorization_test_integer(%A: memref<8x16xi32>, %B: memref<16x32xi32>,
%C: memref<8x32xi32>) {		%C: memref<8x32xi32>) {
linalg.generic #matmul_trait %A, %B, %C {		linalg.generic #matmul_trait
		ins(%A, %B : memref<8x16xi32>, memref<16x32xi32>)
		outs(%C : memref<8x32xi32>) {
^bb(%a: i32, %b: i32, %c: i32) :		^bb(%a: i32, %b: i32, %c: i32) :
%d = muli %a, %b: i32		%d = muli %a, %b: i32
%e = addi %c, %d: i32		%e = addi %c, %d: i32
linalg.yield %e : i32		linalg.yield %e : i32
} : memref<8x16xi32>, memref<16x32xi32>, memref<8x32xi32>		}
return		return
}		}
// CHECK-LABEL: func @vectorization_test_integer		// CHECK-LABEL: func @vectorization_test_integer
// CHECK: vector.transfer_read %{{.*}} : memref<8x16xi32>, vector<8x16xi32>		// CHECK: vector.transfer_read %{{.*}} : memref<8x16xi32>, vector<8x16xi32>
// CHECK: vector.transfer_read %{{.*}} : memref<16x32xi32>, vector<16x32xi32>		// CHECK: vector.transfer_read %{{.*}} : memref<16x32xi32>, vector<16x32xi32>
// CHECK: vector.transfer_read %{{.*}} : memref<8x32xi32>, vector<8x32xi32>		// CHECK: vector.transfer_read %{{.*}} : memref<8x32xi32>, vector<8x32xi32>
// CHECK: vector.contract {indexing_maps = [#[[$mk]], #[[$kn]], #[[$mn]]], iterator_types = ["parallel", "parallel", "reduction"]} %{{.*}}, %{{.*}}, %{{.*}} : vector<8x16xi32>, vector<16x32xi32> into vector<8x32xi32>		// CHECK: vector.contract {indexing_maps = [#[[$mk]], #[[$kn]], #[[$mn]]], iterator_types = ["parallel", "parallel", "reduction"]} %{{.*}}, %{{.*}}, %{{.*}} : vector<8x16xi32>, vector<16x32xi32> into vector<8x32xi32>
// CHECK: vector.transfer_write %{{.*}}, %{{.*}} : vector<8x32xi32>, memref<8x32xi32>		// CHECK: vector.transfer_write %{{.*}}, %{{.*}} : vector<8x32xi32>, memref<8x32xi32>
#generic_matmul_trait = {
args_out = 1,		args_out = 1,
indexing_maps = #matmul_accesses,		indexing_maps = #matmul_accesses,
library_call = "linalg_matmul",		library_call = "linalg_matmul",
iterator_types = ["parallel", "parallel", "reduction"]		iterator_types = ["parallel", "parallel", "reduction"]
}		}
func @permute_generic(%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,		func @permute_generic(%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%B: memref<?x?xf32, offset: ?, strides: [?, 1]>,		%B: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%C: memref<?x?xf32, offset: ?, strides: [?, 1]>) {		%C: memref<?x?xf32, offset: ?, strides: [?, 1]>) {
linalg.generic #generic_matmul_trait %A, %B, %C {		linalg.generic #generic_matmul_trait
		ins(%A, %B : memref<?x?xf32, offset: ?, strides: [?, 1]>,
		memref<?x?xf32, offset: ?, strides: [?, 1]>)
		outs(%C : memref<?x?xf32, offset: ?, strides: [?, 1]>) {
^bb(%a: f32, %b: f32, %c: f32):		^bb(%a: f32, %b: f32, %c: f32):
%d = mulf %a, %b: f32		%d = mulf %a, %b: f32
%e = addf %c, %d: f32		%e = addf %c, %d: f32
linalg.yield %e: f32		linalg.yield %e: f32
}: memref<?x?xf32, offset: ?, strides: [?, 1]>,		}
memref<?x?xf32, offset: ?, strides: [?, 1]>,
memref<?x?xf32, offset: ?, strides: [?, 1]>
return		return
}		}
// CHECK-LABEL: func @permute_generic		// CHECK-LABEL: func @permute_generic
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [#[[$kn]], #[[$nm]], #[[$km]]],		// CHECK-SAME: indexing_maps = [#[[$kn]], #[[$nm]], #[[$km]]],
// CHECK-SAME: iterator_types = ["parallel", "reduction", "parallel"],		// CHECK-SAME: iterator_types = ["parallel", "reduction", "parallel"],
// CHECK-SAME: library_call = "linalg_matmul"} %{{.*}}, %{{.*}}, %{{.*}}		// CHECK-SAME: library_call = "linalg_matmul"}
// CHECK: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,		// CHECK: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,
// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,		// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>
// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>		// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>

#indexed_matmul_trait = {		#indexed_matmul_trait = {
args_in = 2,		args_in = 2,
args_out = 1,		args_out = 1,
indexing_maps = #matmul_accesses,		indexing_maps = #matmul_accesses,
library_call = "linalg_matmul_indexed",		library_call = "linalg_matmul_indexed",
iterator_types = ["parallel", "parallel", "reduction"]		iterator_types = ["parallel", "parallel", "reduction"]
}		}
func @permute_generic_indexed(		func @permute_generic_indexed(
%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,		%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%B: memref<?x?xf32, offset: ?, strides: [?, 1]>,		%B: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%C: memref<?x?xf32, offset: ?, strides: [?, 1]>) {		%C: memref<?x?xf32, offset: ?, strides: [?, 1]>) {
linalg.indexed_generic #indexed_matmul_trait %A, %B, %C {		linalg.indexed_generic #indexed_matmul_trait
		ins(%A, %B : memref<?x?xf32, offset: ?, strides: [?, 1]>,
		memref<?x?xf32, offset: ?, strides: [?, 1]>)
		outs(%C : memref<?x?xf32, offset: ?, strides: [?, 1]>) {
^bb(%i: index, %j: index, %k: index, %a: f32, %b: f32, %c: f32):		^bb(%i: index, %j: index, %k: index, %a: f32, %b: f32, %c: f32):
%d = mulf %a, %b: f32		%d = mulf %a, %b: f32
%e = addf %c, %d: f32		%e = addf %c, %d: f32
linalg.yield %e: f32		linalg.yield %e: f32
} : memref<?x?xf32, offset: ?, strides: [?, 1]>,		}
memref<?x?xf32, offset: ?, strides: [?, 1]>,
memref<?x?xf32, offset: ?, strides: [?, 1]>
return		return
}		}
// CHECK-LABEL: func @permute_generic_indexed		// CHECK-LABEL: func @permute_generic_indexed
// CHECK: linalg.indexed_generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.indexed_generic {
// CHECK-SAME: indexing_maps = [#[[$kn]], #[[$nm]], #[[$km]]],		// CHECK-SAME: indexing_maps = [#[[$kn]], #[[$nm]], #[[$km]]],
// CHECK-SAME: iterator_types = ["parallel", "reduction", "parallel"],		// CHECK-SAME: iterator_types = ["parallel", "reduction", "parallel"],
// CHECK-SAME: library_call = "linalg_matmul_indexed"} %{{.*}}, %{{.*}}, %{{.*}}		// CHECK-SAME: library_call = "linalg_matmul_indexed"}
// CHECK: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,		// CHECK: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,
// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>,		// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>
// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>		// CHECK-SAME: memref<?x?xf32, #[[$STRIDED_2D_u_1]]>

func @matvec_perm(%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,		func @matvec_perm(%A: memref<?x?xf32, offset: ?, strides: [?, 1]>,
%x: memref<?xf32, offset: ?, strides: [1]>,		%x: memref<?xf32, offset: ?, strides: [1]>,
%y: memref<?xf32, offset: ?, strides: [1]>) {		%y: memref<?xf32, offset: ?, strides: [1]>) {
linalg.matvec {__internal_linalg_transform__ = "__with_perm__"}		linalg.matvec {__internal_linalg_transform__ = "__with_perm__"}
ins(%A, %x: memref<?x?xf32, offset: ?, strides: [?, 1]>,		ins(%A, %x: memref<?x?xf32, offset: ?, strides: [?, 1]>,
memref<?xf32, offset: ?, strides: [1]>)		memref<?xf32, offset: ?, strides: [1]>)

mlir/test/EDSC/builder-api-test.cpp

TEST_FUNC(affine_if_op) {
intrinsics::affine_if(intSet, affineIfArgs, /*withElseRegion=*/true);		intrinsics::affine_if(intSet, affineIfArgs, /*withElseRegion=*/true);

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}

// clang-format off		// clang-format off
// CHECK-LABEL: func @linalg_generic_pointwise		// CHECK-LABEL: func @linalg_generic_pointwise
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
		// CHECK-SAME: ins({{.*}}memref<?x?xf32>, memref<?x?xf32>)
		// CHECK-SAME: outs({{.*}}memref<?x?xf32>)
// CHECK: addf		// CHECK: addf
// CHECK: }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>		// CHECK: linalg.generic {
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
		// CHECK-SAME: ins({{.*}}memref<?x?xf32>, memref<?x?xf32>)
		// CHECK-SAME: outs({{.*}}memref<?x?xf32>)
// CHECK: cmpf "ogt"		// CHECK: cmpf "ogt"
// CHECK: select		// CHECK: select
// CHECK: }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>		// CHECK: linalg.generic {
// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
		// CHECK-SAME: ins({{.*}}memref<?x?xf32>)
		// CHECK-SAME: outs({{.*}}memref<?x?xf32>)
// CHECK: tanh		// CHECK: tanh
// CHECK: }: memref<?x?xf32>, memref<?x?xf32>
// clang-format on		// clang-format on
TEST_FUNC(linalg_generic_pointwise_test) {		TEST_FUNC(linalg_generic_pointwise_test) {
using namespace edsc;		using namespace edsc;
using namespace edsc::ops;		using namespace edsc::ops;

auto f32Type = FloatType::getF32(&globalContext());		auto f32Type = FloatType::getF32(&globalContext());
auto memrefType = MemRefType::get(		auto memrefType = MemRefType::get(
{ShapedType::kDynamicSize, ShapedType::kDynamicSize}, f32Type, {}, 0);		{ShapedType::kDynamicSize, ShapedType::kDynamicSize}, f32Type, {}, 0);
linalg_generic_pointwise_tanh(SA({i, j}), SC({i, j}));		linalg_generic_pointwise_tanh(SA({i, j}), SC({i, j}));

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}

// clang-format off		// clang-format off
// CHECK-LABEL: func @linalg_generic_matmul		// CHECK-LABEL: func @linalg_generic_matmul
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d2, d1)>, affine_map<(d0, d1, d2) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]}
/// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):		/// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):
// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32		// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32
// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32		// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32
// CHECK: linalg.yield %[[a4]] : f32		// CHECK: linalg.yield %[[a4]] : f32
// CHECK: }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>		// CHECK: }: memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
// clang-format on		// clang-format on
TEST_FUNC(linalg_generic_matmul_test) {
linalg_generic_matmul(f.getArguments());		linalg_generic_matmul(f.getArguments());

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}

// clang-format off		// clang-format off
// CHECK-LABEL: func @linalg_generic_conv_nhwc		// CHECK-LABEL: func @linalg_generic_conv_nhwc
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d2 * 3 + d4 * 5, d3 * 4 + d5 * 6, d6)>,		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d2 * 3 + d4 * 5, d3 * 4 + d5 * 6, d6)>,
// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d1)>,		// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d1)>,
// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d2, d3, d1)>],		// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d2, d3, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]}
/// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):		/// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):
// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32		// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32
// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32		// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32
// CHECK: linalg.yield %[[a4]] : f32		// CHECK: linalg.yield %[[a4]] : f32
[17 unchanged lines not shown]	linalg_generic_conv_nhwc(f.getArguments(),
/*strides=*/{3, 4}, /*dilations=*/{5, 6});		/*strides=*/{3, 4}, /*dilations=*/{5, 6});

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}
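
The three affine maps checked for `linalg_generic_conv_nhwc` tie the seven loop indices to the input, filter, and output coordinates, with the strides `{3, 4}` and dilations `{5, 6}` appearing as the multipliers in the input map. Below is a small C++ sketch that evaluates those maps for one loop point. It is an illustration only: `ConvAccess` and `convNhwcAccess` are hypothetical names and not part of the EDSC API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Index4 = std::array<std::int64_t, 4>;
using LoopIndices = std::array<std::int64_t, 7>; // d0 .. d6
using Pair = std::array<std::int64_t, 2>;

// Coordinates touched by one iteration of the generic op.
struct ConvAccess {
  Index4 input;
  Index4 filter;
  Index4 output;
};

// Evaluates the three indexing maps from the CHECK lines:
//   input : (d0, d2 * s0 + d4 * l0, d3 * s1 + d5 * l1, d6)
//   filter: (d4, d5, d6, d1)
//   output: (d0, d2, d3, d1)
// where s = strides and l = dilations.
ConvAccess convNhwcAccess(const LoopIndices &d, const Pair &strides,
                          const Pair &dilations) {
  // Positive strides and dilations are an assumption of this sketch.
  assert(strides[0] > 0 && strides[1] > 0);
  assert(dilations[0] > 0 && dilations[1] > 0);
  return ConvAccess{
      Index4{d[0], d[2] * strides[0] + d[4] * dilations[0],
             d[3] * strides[1] + d[5] * dilations[1], d[6]},
      Index4{d[4], d[5], d[6], d[1]},
      Index4{d[0], d[2], d[3], d[1]}};
}
```

In this reading, `d0` to `d3` (batch, output feature, output height, output width) are the parallel dimensions and `d4` to `d6` (kernel height, kernel width, input channel) are the reductions, matching the `iterator_types` in the CHECK lines.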

// clang-format off		// clang-format off
// CHECK-LABEL: func @linalg_generic_dilated_conv_nhwc		// CHECK-LABEL: func @linalg_generic_dilated_conv_nhwc
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d3 * 3 + d5 * 5, d4 * 4 + d6 * 6, d2)>,		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d3 * 3 + d5 * 5, d4 * 4 + d6 * 6, d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d5, d6, d2, d1)>,		// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d5, d6, d2, d1)>,
// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d3, d4, d1 + d2 * 7)>],		// CHECK-SAME: affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d3, d4, d1 + d2 * 7)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "reduction", "reduction"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "reduction", "reduction"]}
// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):		// CHECK: ^bb0(%[[a0:.*]]: f32, %[[a1:.*]]: f32, %[[a2:.*]]: f32):
// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32		// CHECK: %[[a3:.*]] = mulf %[[a0]], %[[a1]] : f32
// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32		// CHECK: %[[a4:.*]] = addf %[[a2]], %[[a3]] : f32
// CHECK: linalg.yield %[[a4]] : f32		// CHECK: linalg.yield %[[a4]] : f32
[44 unchanged lines not shown]	TEST_FUNC(linalg_metadata_ops) {
linalg_reshape(memrefType, reshaped, maps);		linalg_reshape(memrefType, reshaped, maps);

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}

// clang-format off		// clang-format off
// CHECK-LABEL: func @linalg_tensors		// CHECK-LABEL: func @linalg_tensors
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
// CHECK: addf		// CHECK: addf
// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>		// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
// CHECK: cmpf "ogt"		// CHECK: cmpf "ogt"
// CHECK: select		// CHECK: select
// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>		// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>
// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel"]}
// CHECK: tanh		// CHECK: tanh
// CHECK: }: tensor<?x?xf32> -> tensor<?x?xf32>		// CHECK: }: tensor<?x?xf32> -> tensor<?x?xf32>
// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d2, d1)>,		// CHECK-SAME: affine_map<(d0, d1, d2) -> (d2, d1)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>],		// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]}		// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]}
// CHECK: mulf		// CHECK: mulf
// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>		// CHECK: }: tensor<?x?xf32>, memref<?x?xf32> -> tensor<?x?xf32>
// CHECK: linalg.generic {args_in = 3 : i64, args_out = 1 : i64,		// CHECK: linalg.generic {
// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,		// CHECK-SAME: indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d2, d1)>,		// CHECK-SAME: affine_map<(d0, d1, d2) -> (d2, d1)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>,		// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>],		// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1)>],
// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]		// CHECK-SAME: iterator_types = ["parallel", "parallel", "reduction"]
// CHECK: mulf		// CHECK: mulf
// CHECK: addf		// CHECK: addf
// CHECK: }: tensor<?x?xf32>, memref<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>		// CHECK: }: tensor<?x?xf32>, memref<?x?xf32>, tensor<?x?xf32> -> tensor<?x?xf32>
[10 unchanged lines not shown]	TEST_FUNC(linalg_tensors_test) {
auto f = makeFunction("linalg_tensors", {}, {tensorType, memrefType});		auto f = makeFunction("linalg_tensors", {}, {tensorType, memrefType});

OpBuilder builder(f.getBody());		OpBuilder builder(f.getBody());
ScopedContext scope(builder, f.getLoc());		ScopedContext scope(builder, f.getLoc());
Value A(f.getArgument(0)), B(f.getArgument(1));		Value A(f.getArgument(0)), B(f.getArgument(1));
AffineExpr i, j;		AffineExpr i, j;
bindDims(&globalContext(), i, j);		bindDims(&globalContext(), i, j);
StructuredIndexed SA(A), SB(B), SC(tensorType);		StructuredIndexed SA(A), SB(B), SC(tensorType);
linalg_generic_pointwise_add(SA({i, j}), SB({i, j}), SC({i, j}));		Value added = linalg_generic_pointwise_add(SA({i, j}), SB({i, j}), SC({i, j}))
linalg_generic_pointwise_max(SA({i, j}), SB({i, j}), SC({i, j}));		->getResult(0);
linalg_generic_pointwise_tanh(SA({i, j}), SC({i, j}));		Value maxed = linalg_generic_pointwise_max(SA({i, j}), SB({i, j}),
Value o1 = linalg_generic_matmul(A, B, tensorType)->getResult(0);		StructuredIndexed(added)({i, j}))
		->getResult(0);
		Value tanhed = linalg_generic_pointwise_tanh(SA({i, j}),
		StructuredIndexed(maxed)({i, j}))
		->getResult(0);
		Value o1 = linalg_generic_matmul(A, B, tanhed, tensorType)->getResult(0);
linalg_generic_matmul(A, B, o1, tensorType);		linalg_generic_matmul(A, B, o1, tensorType);

f.print(llvm::outs());		f.print(llvm::outs());
f.erase();		f.erase();
}		}

TEST_FUNC(vector_extractelement_op_i32) {		TEST_FUNC(vector_extractelement_op_i32) {
using namespace edsc::op;		using namespace edsc::op;

mlir/test/Transforms/buffer-placement-preparation-allowed-memref-results.mlir

	// -----			// -----

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @complex_signature_conversion			// CHECK-LABEL: func @complex_signature_conversion
	func @complex_signature_conversion(%arg0: tensor<5xf32>, %arg1: memref<10xf32>, %arg2: i1, %arg3: f16) -> (i1, tensor<5xf32>, memref<10xf32>, memref<15xf32>, f16) {			func @complex_signature_conversion(%arg0: tensor<5xf32>, %arg1: memref<10xf32>, %arg2: i1, %arg3: f16) -> (i1, tensor<5xf32>, memref<10xf32>, memref<15xf32>, f16) {
	%0 = alloc() : memref<15xf32>			%0 = alloc() : memref<15xf32>
	%1 = linalg.generic {			%1 = linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]			iterator_types = ["parallel"]}
	} %arg0 {			ins(%arg0 : tensor<5xf32>) {
	^bb0(%gen1_arg0: f32):			^bb0(%gen1_arg0: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<5xf32> -> tensor<5xf32>			} -> tensor<5xf32>
	return %arg2, %1, %arg1, %0, %arg3 : i1, tensor<5xf32>, memref<10xf32>, memref<15xf32>, f16			return %arg2, %1, %arg1, %0, %arg3 : i1, tensor<5xf32>, memref<10xf32>, memref<15xf32>, f16
	}			}
	// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[ARG2:.*]]: i1, %[[ARG3:.*]]: f16)			// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[ARG2:.*]]: i1, %[[ARG3:.*]]: f16)
	// CHECK-SAME: (i1, memref<5xf32>, memref<10xf32>, memref<15xf32>, f16)			// CHECK-SAME: (i1, memref<5xf32>, memref<10xf32>, memref<15xf32>, f16)
	// CHECK: %[[FIRST_ALLOC:.*]] = alloc()			// CHECK: %[[FIRST_ALLOC:.*]] = alloc()
	// CHECK: %[[LINALG_ALLOC:.*]] = alloc()			// CHECK: %[[LINALG_ALLOC:.*]] = alloc()
	// CHECK: return %[[ARG2]], %[[LINALG_ALLOC]], %[[ARG1]], %[[FIRST_ALLOC]], %[[ARG3]]			// CHECK: return %[[ARG2]], %[[LINALG_ALLOC]], %[[ARG1]], %[[FIRST_ALLOC]], %[[ARG3]]


mlir/test/Transforms/buffer-placement-preparation.mlir

	// function arguments list. The other memref function results remain as function			// function arguments list. The other memref function results remain as function
	// results.			// results.

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @memref_in_function_results			// CHECK-LABEL: func @memref_in_function_results
	func @memref_in_function_results(%arg0: tensor<5xf32>, %arg1: memref<10xf32>) -> (tensor<5xf32>, memref<10xf32>, memref<15xf32>) {			func @memref_in_function_results(%arg0: tensor<5xf32>, %arg1: memref<10xf32>) -> (tensor<5xf32>, memref<10xf32>, memref<15xf32>) {
	%0 = alloc() : memref<15xf32>			%0 = alloc() : memref<15xf32>
	%1 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%1 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<5xf32>) {
	^bb0(%gen1_arg0: f32):			^bb0(%gen1_arg0: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<5xf32> -> tensor<5xf32>			} -> tensor<5xf32>
	return %1, %arg1, %0 : tensor<5xf32>, memref<10xf32>, memref<15xf32>			return %1, %arg1, %0 : tensor<5xf32>, memref<10xf32>, memref<15xf32>
	}			}
	// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)			// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)
	// CHECK-SAME: (memref<10xf32>, memref<15xf32>)			// CHECK-SAME: (memref<10xf32>, memref<15xf32>)
	// CHECK: %[[FIRST_ALLOC:.*]] = alloc()			// CHECK: %[[FIRST_ALLOC:.*]] = alloc()
	// CHECK: %[[LINALG_ALLOC:.*]] = alloc()			// CHECK: %[[LINALG_ALLOC:.*]] = alloc()
	// CHECK: linalg.copy(%[[LINALG_ALLOC]], %[[RESULT]])			// CHECK: linalg.copy(%[[LINALG_ALLOC]], %[[RESULT]])
	// CHECK: return %[[ARG1]], %[[FIRST_ALLOC]]			// CHECK: return %[[ARG1]], %[[FIRST_ALLOC]]
	[59 unchanged lines not shown]
	// -----			// -----

	// Test Case: Simple case for checking if BufferAssignmentPlacer creates AllocOps right before GenericOps.			// Test Case: Simple case for checking if BufferAssignmentPlacer creates AllocOps right before GenericOps.

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @compute_allocs_position_simple			// CHECK-LABEL: func @compute_allocs_position_simple
	func @compute_allocs_position_simple(%cond: i1, %arg0: tensor<2xf32>) -> tensor<2xf32>{			func @compute_allocs_position_simple(%cond: i1, %arg0: tensor<2xf32>) -> tensor<2xf32>{
	%0 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%0 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<2xf32>) {
	^bb0(%gen1_arg0: f32):			^bb0(%gen1_arg0: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	%1 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %0 {			%1 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%0 : tensor<2xf32>) {
	^bb0(%gen2_arg0: f32):			^bb0(%gen2_arg0: f32):
	%tmp2 = exp %gen2_arg0 : f32			%tmp2 = exp %gen2_arg0 : f32
	linalg.yield %tmp2 : f32			linalg.yield %tmp2 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	return %1 : tensor<2xf32>			return %1 : tensor<2xf32>
	}			}
	// CHECK: (%{{.*}}: {{.*}}, %[[ARG0:.*]]: memref<2xf32>,			// CHECK: (%{{.*}}: {{.*}}, %[[ARG0:.*]]: memref<2xf32>,
	// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()			// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[FIRST_ALLOC]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}} outs(%[[FIRST_ALLOC]]
	// CHECK: %[[SECOND_ALLOC:.*]] = alloc()			// CHECK: %[[SECOND_ALLOC:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[FIRST_ALLOC]], %[[SECOND_ALLOC]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[FIRST_ALLOC]]{{.*}} outs(%[[SECOND_ALLOC]]

	// -----			// -----

	// Test Case: if-else case for checking if BufferAssignmentPlacer creates AllocOps right before GenericOps.			// Test Case: if-else case for checking if BufferAssignmentPlacer creates AllocOps right before GenericOps.

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @compute_allocs_position			// CHECK-LABEL: func @compute_allocs_position
	func @compute_allocs_position(%cond: i1, %arg0: tensor<2xf32>) -> tensor<2xf32>{			func @compute_allocs_position(%cond: i1, %arg0: tensor<2xf32>) -> tensor<2xf32>{
	%0 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%0 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<2xf32>) {
	^bb0(%gen1_arg0: f32):			^bb0(%gen1_arg0: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	%1 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %0 {			%1 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%0 : tensor<2xf32>) {
	^bb0(%gen2_arg0: f32):			^bb0(%gen2_arg0: f32):
	%tmp2 = exp %gen2_arg0 : f32			%tmp2 = exp %gen2_arg0 : f32
	linalg.yield %tmp2 : f32			linalg.yield %tmp2 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	cond_br %cond, ^bb1(%arg0, %0: tensor<2xf32>, tensor<2xf32>),			cond_br %cond, ^bb1(%arg0, %0: tensor<2xf32>, tensor<2xf32>),
	^bb2(%0, %arg0: tensor<2xf32>, tensor<2xf32>)			^bb2(%0, %arg0: tensor<2xf32>, tensor<2xf32>)
	^bb1(%arg1 : tensor<2xf32>, %arg2 : tensor<2xf32>):			^bb1(%arg1 : tensor<2xf32>, %arg2 : tensor<2xf32>):
	%2 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%2 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<2xf32>) {
	^bb0(%gen3_arg0: f32):			^bb0(%gen3_arg0: f32):
	%tmp3 = exp %gen3_arg0 : f32			%tmp3 = exp %gen3_arg0 : f32
	linalg.yield %tmp3 : f32			linalg.yield %tmp3 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	%3 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %2 {			%3 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%2 : tensor<2xf32>) {
	^bb0(%gen4_arg0: f32):			^bb0(%gen4_arg0: f32):
	%tmp4 = exp %gen4_arg0 : f32			%tmp4 = exp %gen4_arg0 : f32
	linalg.yield %tmp4 : f32			linalg.yield %tmp4 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	br ^exit(%arg1, %arg2 : tensor<2xf32>, tensor<2xf32>)			br ^exit(%arg1, %arg2 : tensor<2xf32>, tensor<2xf32>)
	^bb2(%arg3 : tensor<2xf32>, %arg4 : tensor<2xf32>):			^bb2(%arg3 : tensor<2xf32>, %arg4 : tensor<2xf32>):
	%4 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%4 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<2xf32>) {
	^bb0(%gen5_arg0: f32):			^bb0(%gen5_arg0: f32):
	%tmp5 = exp %gen5_arg0 : f32			%tmp5 = exp %gen5_arg0 : f32
	linalg.yield %tmp5 : f32			linalg.yield %tmp5 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	%5 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %4 {			%5 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%4 : tensor<2xf32>) {
	^bb0(%gen6_arg0: f32):			^bb0(%gen6_arg0: f32):
	%tmp6 = exp %gen6_arg0 : f32			%tmp6 = exp %gen6_arg0 : f32
	linalg.yield %tmp6 : f32			linalg.yield %tmp6 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	br ^exit(%arg3, %arg4 : tensor<2xf32>, tensor<2xf32>)			br ^exit(%arg3, %arg4 : tensor<2xf32>, tensor<2xf32>)
	^exit(%arg5 : tensor<2xf32>, %arg6 : tensor<2xf32>):			^exit(%arg5 : tensor<2xf32>, %arg6 : tensor<2xf32>):
	%6 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0 {			%6 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%arg0 : tensor<2xf32>) {
	^bb0(%gen7_arg0: f32):			^bb0(%gen7_arg0: f32):
	%tmp7 = exp %gen7_arg0 : f32			%tmp7 = exp %gen7_arg0 : f32
	linalg.yield %tmp7 : f32			linalg.yield %tmp7 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	%7 = linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %6 {			%7 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
				ins(%6 : tensor<2xf32>) {
	^bb0(%gen8_arg0: f32):			^bb0(%gen8_arg0: f32):
	%tmp8 = exp %gen8_arg0 : f32			%tmp8 = exp %gen8_arg0 : f32
	linalg.yield %tmp8 : f32			linalg.yield %tmp8 : f32
	}: tensor<2xf32> -> tensor<2xf32>			} -> tensor<2xf32>
	return %7 : tensor<2xf32>			return %7 : tensor<2xf32>
	}			}
	// CHECK: (%{{.*}}: {{.*}}, %[[ARG0:.*]]: memref<2xf32>,			// CHECK: (%{{.*}}: {{.*}}, %[[ARG0:.*]]: memref<2xf32>,
	// CHECK-NEXT: %[[ALLOC0:.*]] = alloc()			// CHECK-NEXT: %[[ALLOC0:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[ALLOC0]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}} outs(%[[ALLOC0]]
	// CHECK: %[[ALLOC1:.*]] = alloc()			// CHECK: %[[ALLOC1:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ALLOC0]], %[[ALLOC1]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ALLOC0]]{{.*}} outs(%[[ALLOC1]]
	// CHECK: cond_br %{{.*}}, ^[[BB0:.*]]({{.*}}), ^[[BB1:.*]](			// CHECK: cond_br %{{.*}}, ^[[BB0:.*]]({{.*}}), ^[[BB1:.*]](
	// CHECK-NEXT: ^[[BB0]]			// CHECK-NEXT: ^[[BB0]]
	// CHECK-NEXT: %[[ALLOC2:.*]] = alloc()			// CHECK-NEXT: %[[ALLOC2:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[ALLOC2]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}} outs(%[[ALLOC2]]
	// CHECK: %[[ALLOC3:.*]] = alloc()			// CHECK: %[[ALLOC3:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ALLOC2]], %[[ALLOC3]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ALLOC2]]{{.*}} outs(%[[ALLOC3]]
	// CHECK: br ^[[EXIT:.*]]({{.*}})			// CHECK: br ^[[EXIT:.*]]({{.*}})
	// CHECK-NEXT: ^[[BB1]]			// CHECK-NEXT: ^[[BB1]]
	// CHECK-NEXT: %[[ALLOC4:.*]] = alloc()			// CHECK-NEXT: %[[ALLOC4:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[ALLOC4]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}} outs(%[[ALLOC4]]
	// CHECK: %[[ALLOC5:.*]] = alloc()			// CHECK: %[[ALLOC5:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ALLOC4]], %[[ALLOC5]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ALLOC4]]{{.*}} outs(%[[ALLOC5]]
	// CHECK: br ^[[EXIT]]			// CHECK: br ^[[EXIT]]
	// CHECK-NEXT: ^[[EXIT]]			// CHECK-NEXT: ^[[EXIT]]
	// CHECK-NEXT: %[[ALLOC6:.*]] = alloc()			// CHECK-NEXT: %[[ALLOC6:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[ALLOC6]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}} outs(%[[ALLOC6]]
	// CHECK: %[[ALLOC7:.*]] = alloc()			// CHECK: %[[ALLOC7:.*]] = alloc()
	// CHECK-NEXT: linalg.generic {{.*}} %[[ALLOC6]], %[[ALLOC7]]			// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ALLOC6]]{{.*}} outs(%[[ALLOC7]]

	// -----			// -----

	// Test case: Checking BufferAssignmentCallOpConverter and			// Test case: Checking BufferAssignmentCallOpConverter and
	// BufferAssignmentFuncOpConverter and BufferAssignmentReturnOpConverter all			// BufferAssignmentFuncOpConverter and BufferAssignmentReturnOpConverter all
	// together. The signature of `callee` after signature conversion would be:			// together. The signature of `callee` after signature conversion would be:

	// func @callee(%arg0: memref<5xf32>,%arg1: memref<5xf32>) -> ()			// func @callee(%arg0: memref<5xf32>,%arg1: memref<5xf32>) -> ()

	// The operands and results of caller and return operations must be matched			// The operands and results of caller and return operations must be matched
	// respectively.			// respectively.

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @callee			// CHECK-LABEL: func @callee
	func @callee(%arg1: tensor<5xf32>) -> tensor<5xf32> {			func @callee(%arg1: tensor<5xf32>) -> tensor<5xf32> {
	%0 = linalg.generic {			%0 = linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
	args_in = 1 : i64,			ins(%arg1 : tensor<5xf32>) {
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]
	} %arg1 {
	^bb0(%gen1_arg0: f32):			^bb0(%gen1_arg0: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: tensor<5xf32> -> tensor<5xf32>			} -> tensor<5xf32>
	return %0 : tensor<5xf32>			return %0 : tensor<5xf32>
	}			}
	// CHECK: (%[[CALLEE_ARG:.*]]: memref<5xf32>, %[[CALLEE_RESULT:.*]]: memref<5xf32>)			// CHECK: (%[[CALLEE_ARG:.*]]: memref<5xf32>, %[[CALLEE_RESULT:.*]]: memref<5xf32>)
	// CHECK: %[[ALLOC:.*]] = alloc()			// CHECK: %[[ALLOC:.*]] = alloc()
	// CHECK: linalg.generic			// CHECK: linalg.generic
	// CHECK: linalg.copy(%[[ALLOC]], %[[CALLEE_RESULT]])			// CHECK: linalg.copy(%[[ALLOC]], %[[CALLEE_RESULT]])
	// CHECK: return			// CHECK: return


mlir/test/Transforms/buffer-placement.mlir

// CHECK-LABEL: func @condBranch		// CHECK-LABEL: func @condBranch
func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
cond_br %arg0, ^bb1, ^bb2		cond_br %arg0, ^bb1, ^bb2
^bb1:		^bb1:
br ^bb3(%arg1 : memref<2xf32>)		br ^bb3(%arg1 : memref<2xf32>)
^bb2:		^bb2:
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^bb3(%0 : memref<2xf32>)		br ^bb3(%0 : memref<2xf32>)
^bb3(%1: memref<2xf32>):		^bb3(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}
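
In the updated buffer-placement tests, each `linalg.generic` names its operands through `ins(...)` and `outs(...)`, and the region receives one scalar argument per operand. As a rough C++ analogue of the `exp` op used throughout this file (a sketch only, with `genericExp` a hypothetical name and `std::vector<float>` standing in for `memref<2xf32>`):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Loop reading of:
//   linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
//     ins(%in) outs(%out) { ^bb0(%gen1_arg0, %gen1_arg1): ... }
// with #map0 the identity map (d0) -> (d0).
void genericExp(const std::vector<float> &in, std::vector<float> &out) {
  // Both operands use the same identity map, so their sizes must agree.
  assert(in.size() == out.size());
  for (std::size_t d0 = 0; d0 < in.size(); ++d0) {
    float gen1_arg0 = in[d0];       // element of the ins operand
    // %gen1_arg1 would be out[d0]; the region in these tests never reads it.
    float tmp1 = std::exp(gen1_arg0); // exp
    out[d0] = tmp1;                 // linalg.yield
  }
}
```

The output buffer is passed in by the caller, which mirrors how the tests allocate `%0` with `alloc()` before the op and pass it through `outs`.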

// CHECK-NEXT: %[[ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[ALLOC:.*]] = alloc()
// CHECK-NEXT: cond_br		// CHECK-NEXT: cond_br
[25 unchanged lines not shown]	func @condBranchDynamicType(
%arg2: memref<?xf32>,		%arg2: memref<?xf32>,
%arg3: index) {		%arg3: index) {
cond_br %arg0, ^bb1, ^bb2(%arg3: index)		cond_br %arg0, ^bb1, ^bb2(%arg3: index)
^bb1:		^bb1:
br ^bb3(%arg1 : memref<?xf32>)		br ^bb3(%arg1 : memref<?xf32>)
^bb2(%0: index):		^bb2(%0: index):
%1 = alloc(%0) : memref<?xf32>		%1 = alloc(%0) : memref<?xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %1 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<?xf32>)
		outs(%1: memref<?xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<?xf32>, memref<?xf32>		}
br ^bb3(%1 : memref<?xf32>)		br ^bb3(%1 : memref<?xf32>)
^bb3(%2: memref<?xf32>):		^bb3(%2: memref<?xf32>):
"linalg.copy"(%2, %arg2) : (memref<?xf32>, memref<?xf32>) -> ()		"linalg.copy"(%2, %arg2) : (memref<?xf32>, memref<?xf32>) -> ()
return		return
}		}

// CHECK-NEXT: cond_br		// CHECK-NEXT: cond_br
// CHECK: %[[DIM0:.*]] = dim		// CHECK: %[[DIM0:.*]] = dim
[44 unchanged lines not shown]	func @condBranchDynamicTypeNested(
%arg2: memref<?xf32>,		%arg2: memref<?xf32>,
%arg3: index) {		%arg3: index) {
cond_br %arg0, ^bb1, ^bb2(%arg3: index)		cond_br %arg0, ^bb1, ^bb2(%arg3: index)
^bb1:		^bb1:
br ^bb6(%arg1 : memref<?xf32>)		br ^bb6(%arg1 : memref<?xf32>)
^bb2(%0: index):		^bb2(%0: index):
%1 = alloc(%0) : memref<?xf32>		%1 = alloc(%0) : memref<?xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %1 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<?xf32>)
		outs(%1: memref<?xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<?xf32>, memref<?xf32>		}
cond_br %arg0, ^bb3, ^bb4		cond_br %arg0, ^bb3, ^bb4
^bb3:		^bb3:
br ^bb5(%1 : memref<?xf32>)		br ^bb5(%1 : memref<?xf32>)
^bb4:		^bb4:
br ^bb5(%1 : memref<?xf32>)		br ^bb5(%1 : memref<?xf32>)
^bb5(%2: memref<?xf32>):		^bb5(%2: memref<?xf32>):
br ^bb6(%2 : memref<?xf32>)		br ^bb6(%2 : memref<?xf32>)
^bb6(%3: memref<?xf32>):		^bb6(%3: memref<?xf32>):
[59 unchanged lines not shown]
#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @criticalEdge		// CHECK-LABEL: func @criticalEdge
func @criticalEdge(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @criticalEdge(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
cond_br %arg0, ^bb1, ^bb2(%arg1 : memref<2xf32>)		cond_br %arg0, ^bb1, ^bb2(%arg1 : memref<2xf32>)
^bb1:		^bb1:
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^bb2(%0 : memref<2xf32>)		br ^bb2(%0 : memref<2xf32>)
^bb2(%1: memref<2xf32>):		^bb2(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %[[ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[ALLOC:.*]] = alloc()
// CHECK-NEXT: cond_br		// CHECK-NEXT: cond_br
[14 unchanged lines not shown]
// for %0 and %arg1.		// for %0 and %arg1.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @invCriticalEdge		// CHECK-LABEL: func @invCriticalEdge
func @invCriticalEdge(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @invCriticalEdge(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0, ^bb1, ^bb2(%arg1 : memref<2xf32>)		cond_br %arg0, ^bb1, ^bb2(%arg1 : memref<2xf32>)
^bb1:		^bb1:
br ^bb2(%0 : memref<2xf32>)		br ^bb2(%0 : memref<2xf32>)
^bb2(%1: memref<2xf32>):		^bb2(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

[14 unchanged lines not shown]
// Dealloc for %7 should happen after the CopyOp.		// Dealloc for %7 should happen after the CopyOp.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @ifElse		// CHECK-LABEL: func @ifElse
func @ifElse(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @ifElse(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0,		cond_br %arg0,
^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),		^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),
^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)		^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)
^bb1(%1: memref<2xf32>, %2: memref<2xf32>):		^bb1(%1: memref<2xf32>, %2: memref<2xf32>):
br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)		br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)
^bb2(%3: memref<2xf32>, %4: memref<2xf32>):		^bb2(%3: memref<2xf32>, %4: memref<2xf32>):
br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)		br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)
^bb3(%5: memref<2xf32>, %6: memref<2xf32>):		^bb3(%5: memref<2xf32>, %6: memref<2xf32>):
%7 = alloc() : memref<2xf32>		%7 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %5, %7 {		iterator_types = ["parallel"]}
		ins(%5: memref<2xf32>)
		outs(%7: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
"linalg.copy"(%7, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%7, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: %[[SECOND_ALLOC:.*]] = alloc()		// CHECK: %[[SECOND_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// aliases of %0.		// aliases of %0.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @ifElseNoUsers		// CHECK-LABEL: func @ifElseNoUsers
func @ifElseNoUsers(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @ifElseNoUsers(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0,		cond_br %arg0,
^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),		^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),
^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)		^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)
^bb1(%1: memref<2xf32>, %2: memref<2xf32>):		^bb1(%1: memref<2xf32>, %2: memref<2xf32>):
br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)		br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)
^bb2(%3: memref<2xf32>, %4: memref<2xf32>):		^bb2(%3: memref<2xf32>, %4: memref<2xf32>):
br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)		br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)
^bb3(%5: memref<2xf32>, %6: memref<2xf32>):		^bb3(%5: memref<2xf32>, %6: memref<2xf32>):
// Two missing DeallocOps should be inserted in the exit block.		// Two missing DeallocOps should be inserted in the exit block.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @ifElseNested		// CHECK-LABEL: func @ifElseNested
func @ifElseNested(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @ifElseNested(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0,		cond_br %arg0,
^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),		^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),
^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)		^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)
^bb1(%1: memref<2xf32>, %2: memref<2xf32>):		^bb1(%1: memref<2xf32>, %2: memref<2xf32>):
br ^bb5(%1, %2 : memref<2xf32>, memref<2xf32>)		br ^bb5(%1, %2 : memref<2xf32>, memref<2xf32>)
^bb2(%3: memref<2xf32>, %4: memref<2xf32>):		^bb2(%3: memref<2xf32>, %4: memref<2xf32>):
cond_br %arg0, ^bb3(%3 : memref<2xf32>), ^bb4(%4 : memref<2xf32>)		cond_br %arg0, ^bb3(%3 : memref<2xf32>), ^bb4(%4 : memref<2xf32>)
^bb3(%5: memref<2xf32>):		^bb3(%5: memref<2xf32>):
br ^bb5(%5, %3 : memref<2xf32>, memref<2xf32>)		br ^bb5(%5, %3 : memref<2xf32>, memref<2xf32>)
^bb4(%6: memref<2xf32>):		^bb4(%6: memref<2xf32>):
br ^bb5(%3, %6 : memref<2xf32>, memref<2xf32>)		br ^bb5(%3, %6 : memref<2xf32>, memref<2xf32>)
^bb5(%7: memref<2xf32>, %8: memref<2xf32>):		^bb5(%7: memref<2xf32>, %8: memref<2xf32>):
%9 = alloc() : memref<2xf32>		%9 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %7, %9 {		iterator_types = ["parallel"]}
		ins(%7: memref<2xf32>)
		outs(%9: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
"linalg.copy"(%9, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%9, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: %[[SECOND_ALLOC:.*]] = alloc()		// CHECK: %[[SECOND_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// inserts the two missing DeallocOps after the last GenericOp.		// inserts the two missing DeallocOps after the last GenericOp.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @redundantOperations		// CHECK-LABEL: func @redundantOperations
func @redundantOperations(%arg0: memref<2xf32>) {		func @redundantOperations(%arg0: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %0 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
%1 = alloc() : memref<2xf32>		%1 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %0, %1 {		iterator_types = ["parallel"]}
		ins(%0: memref<2xf32>)
		outs(%1: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
return		return
}		}

// CHECK: (%[[ARG0:.*]]: {{.*}})		// CHECK: (%[[ARG0:.*]]: {{.*}})
// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[FIRST_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic {{.*}} %[[ARG0]], %[[FIRST_ALLOC]]		// CHECK-NEXT: linalg.generic {{.*}} ins(%[[ARG0]]{{.*}}outs(%[[FIRST_ALLOC]]
// CHECK: %[[SECOND_ALLOC:.*]] = alloc()		// CHECK: %[[SECOND_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic {{.*}} %[[FIRST_ALLOC]], %[[SECOND_ALLOC]]		// CHECK-NEXT: linalg.generic {{.*}} ins(%[[FIRST_ALLOC]]{{.*}}outs(%[[SECOND_ALLOC]]
// CHECK: dealloc		// CHECK: dealloc
// CHECK-NEXT: dealloc		// CHECK-NEXT: dealloc
// CHECK-NEXT: return		// CHECK-NEXT: return

// -----		// -----

// Test Case:		// Test Case:
// bb0		// bb0
func @moving_alloc_and_inserting_missing_dealloc(		func @moving_alloc_and_inserting_missing_dealloc(
%cond: i1,		%cond: i1,
%arg0: memref<2xf32>,		%arg0: memref<2xf32>,
%arg1: memref<2xf32>) {		%arg1: memref<2xf32>) {
cond_br %cond, ^bb1, ^bb2		cond_br %cond, ^bb1, ^bb2
^bb1:		^bb1:
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %0 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^exit(%0 : memref<2xf32>)		br ^exit(%0 : memref<2xf32>)
^bb2:		^bb2:
%1 = alloc() : memref<2xf32>		%1 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %1 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%1: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^exit(%1 : memref<2xf32>)		br ^exit(%1 : memref<2xf32>)
^exit(%arg2: memref<2xf32>):		^exit(%arg2: memref<2xf32>):
"linalg.copy"(%arg2, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%arg2, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %{{.*}} = alloc()		// CHECK-NEXT: %{{.*}} = alloc()
// CHECK-NEXT: %{{.*}} = alloc()		// CHECK-NEXT: %{{.*}} = alloc()
%cond: i1,		%cond: i1,
%arg0: memref<2xf32>,		%arg0: memref<2xf32>,
%arg1: memref<2xf32>) {		%arg1: memref<2xf32>) {
cond_br %cond, ^bb1, ^bb2		cond_br %cond, ^bb1, ^bb2
^bb1:		^bb1:
br ^exit(%arg0 : memref<2xf32>)		br ^exit(%arg0 : memref<2xf32>)
^bb2:		^bb2:
%1 = alloc() : memref<2xf32>		%1 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %1 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%1: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
dealloc %1 : memref<2xf32>		dealloc %1 : memref<2xf32>
br ^exit(%1 : memref<2xf32>)		br ^exit(%1 : memref<2xf32>)
^exit(%arg2: memref<2xf32>):		^exit(%arg2: memref<2xf32>):
"linalg.copy"(%arg2, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%arg2, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %{{.*}} = alloc()		// CHECK-NEXT: %{{.*}} = alloc()
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc		// CHECK-NEXT: dealloc
// CHECK-NEXT: return		// CHECK-NEXT: return

// -----		// -----

// Test Case: Inserting missing DeallocOp in a single block.		// Test Case: Inserting missing DeallocOp in a single block.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @inserting_missing_dealloc_simple		// CHECK-LABEL: func @inserting_missing_dealloc_simple
func @inserting_missing_dealloc_simple(		func @inserting_missing_dealloc_simple(
%arg0 : memref<2xf32>,		%arg0 : memref<2xf32>,
%arg1: memref<2xf32>) {		%arg1: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %0 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
"linalg.copy"(%0, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%0, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc		// CHECK-NEXT: dealloc

// -----		// -----

// Test Case: Moving invalid DeallocOp (there is a user after deallocation) in a		// Test Case: Moving invalid DeallocOp (there is a user after deallocation) in a
// single block.		// single block.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @moving_invalid_dealloc_op		// CHECK-LABEL: func @moving_invalid_dealloc_op
func @moving_invalid_dealloc_op(%arg0 : memref<2xf32>, %arg1: memref<2xf32>) {		func @moving_invalid_dealloc_op(%arg0 : memref<2xf32>, %arg1: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg0, %0 {		iterator_types = ["parallel"]}
		ins(%arg0: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
dealloc %0 : memref<2xf32>		dealloc %0 : memref<2xf32>
"linalg.copy"(%0, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%0, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc		// CHECK-NEXT: dealloc


// CHECK-LABEL: func @nested_regions_and_cond_branch		// CHECK-LABEL: func @nested_regions_and_cond_branch
func @nested_regions_and_cond_branch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @nested_regions_and_cond_branch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
cond_br %arg0, ^bb1, ^bb2		cond_br %arg0, ^bb1, ^bb2
^bb1:		^bb1:
br ^bb3(%arg1 : memref<2xf32>)		br ^bb3(%arg1 : memref<2xf32>)
^bb2:		^bb2:
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg1, %0 {		linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%1 = alloc() : memref<2xf32>		%1 = alloc() : memref<2xf32>
linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg1, %1 {		linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%1: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^bb3(%0 : memref<2xf32>)		br ^bb3(%0 : memref<2xf32>)
^bb3(%1: memref<2xf32>):		^bb3(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}
// CHECK: (%[[cond:.*]]: {{.*}}, %[[ARG1:.*]]: {{.*}}, %{{.*}}: {{.*}})		// CHECK: (%[[cond:.*]]: {{.*}}, %[[ARG1:.*]]: {{.*}}, %{{.*}}: {{.*}})
// CHECK-NEXT: %[[GENERIC1_ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[GENERIC1_ALLOC:.*]] = alloc()
// CHECK-NEXT: cond_br %[[cond]], ^[[BB1:.*]], ^[[BB2:.*]]		// CHECK-NEXT: cond_br %[[cond]], ^[[BB1:.*]], ^[[BB2:.*]]
// CHECK: ^[[BB2]]:		// CHECK: ^[[BB2]]:
// CHECK-NEXT: linalg.generic {{{.*}}} %[[ARG1]], %[[GENERIC1_ALLOC]]		// CHECK-NEXT: linalg.generic {{{.*}}} ins(%[[ARG1]]{{.*}}outs(%[[GENERIC1_ALLOC]]
// CHECK: %[[GENERIC2_ALLOC:.*]] = alloc()		// CHECK: %[[GENERIC2_ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic {{{.*}}} %[[ARG1]], %[[GENERIC2_ALLOC]]		// CHECK-NEXT: linalg.generic {{{.*}}} ins(%[[ARG1]]{{.*}}outs(%[[GENERIC2_ALLOC]]
// CHECK: dealloc %[[GENERIC2_ALLOC]]		// CHECK: dealloc %[[GENERIC2_ALLOC]]
// CHECK-NEXT: %{{.*}} = exp		// CHECK-NEXT: %{{.*}} = exp
// CHECK: ^[[BB3:.*]]({{.*}}):		// CHECK: ^[[BB3:.*]]({{.*}}):
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc %[[GENERIC1_ALLOC]]		// CHECK-NEXT: dealloc %[[GENERIC1_ALLOC]]

// -----		// -----

// Test Case: buffer deallocation escaping		// Test Case: buffer deallocation escaping
// BufferPlacement Expected Behaviour: It must not dealloc %arg1 and %x		// BufferPlacement Expected Behaviour: It must not dealloc %arg1 and %x
// since they are operands of return operation and should escape from		// since they are operands of return operation and should escape from
// deallocating. It should dealloc %y after linalg.copy.		// deallocating. It should dealloc %y after linalg.copy.

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @memref_in_function_results		// CHECK-LABEL: func @memref_in_function_results
func @memref_in_function_results(%arg0: memref<5xf32>, %arg1: memref<10xf32>, %arg2: memref<5xf32>) -> (memref<10xf32>, memref<15xf32>) {		func @memref_in_function_results(%arg0: memref<5xf32>, %arg1: memref<10xf32>, %arg2: memref<5xf32>) -> (memref<10xf32>, memref<15xf32>) {
%x = alloc() : memref<15xf32>		%x = alloc() : memref<15xf32>
%y = alloc() : memref<5xf32>		%y = alloc() : memref<5xf32>
linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg0, %y {		linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
		ins(%arg0: memref<5xf32>)
		outs(%y: memref<5xf32>) {
^bb0(%arg3: f32, %arg4: f32):		^bb0(%arg3: f32, %arg4: f32):
%2 = exp %arg3 : f32		%2 = exp %arg3 : f32
linalg.yield %2 : f32		linalg.yield %2 : f32
}: memref<5xf32>, memref<5xf32>		}
linalg.copy(%y, %arg2) : memref<5xf32>, memref<5xf32>		linalg.copy(%y, %arg2) : memref<5xf32>, memref<5xf32>
return %arg1, %x : memref<10xf32>, memref<15xf32>		return %arg1, %x : memref<10xf32>, memref<15xf32>
}		}
// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)		// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)
// CHECK: %[[X:.*]] = alloc()		// CHECK: %[[X:.*]] = alloc()
// CHECK: %[[Y:.*]] = alloc()		// CHECK: %[[Y:.*]] = alloc()
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK: dealloc %[[Y]]		// CHECK: dealloc %[[Y]]
// CHECK-LABEL: func @condBranchAlloca		// CHECK-LABEL: func @condBranchAlloca
func @condBranchAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @condBranchAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
cond_br %arg0, ^bb1, ^bb2		cond_br %arg0, ^bb1, ^bb2
^bb1:		^bb1:
br ^bb3(%arg1 : memref<2xf32>)		br ^bb3(%arg1 : memref<2xf32>)
^bb2:		^bb2:
%0 = alloca() : memref<2xf32>		%0 = alloca() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^bb3(%0 : memref<2xf32>)		br ^bb3(%0 : memref<2xf32>)
^bb3(%1: memref<2xf32>):		^bb3(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: cond_br		// CHECK-NEXT: cond_br
// CHECK: %[[ALLOCA:.*]] = alloca()		// CHECK: %[[ALLOCA:.*]] = alloca()
// CHECK: br ^bb3(%[[ALLOCA:.*]])		// CHECK: br ^bb3(%[[ALLOCA:.*]])
// CHECK-NEXT: ^bb3		// CHECK-NEXT: ^bb3
// CHECK-NEXT: linalg.copy		// CHECK-NEXT: linalg.copy
// CHECK-NEXT: return		// CHECK-NEXT: return

// -----		// -----

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @ifElseAlloca		// CHECK-LABEL: func @ifElseAlloca
func @ifElseAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @ifElseAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0,		cond_br %arg0,
^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),		^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),
^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)		^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)
^bb1(%1: memref<2xf32>, %2: memref<2xf32>):		^bb1(%1: memref<2xf32>, %2: memref<2xf32>):
br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)		br ^bb3(%1, %2 : memref<2xf32>, memref<2xf32>)
^bb2(%3: memref<2xf32>, %4: memref<2xf32>):		^bb2(%3: memref<2xf32>, %4: memref<2xf32>):
br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)		br ^bb3(%3, %4 : memref<2xf32>, memref<2xf32>)
^bb3(%5: memref<2xf32>, %6: memref<2xf32>):		^bb3(%5: memref<2xf32>, %6: memref<2xf32>):
%7 = alloca() : memref<2xf32>		%7 = alloca() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %5, %7 {		iterator_types = ["parallel"]}
		ins(%5: memref<2xf32>)
		outs(%7: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
"linalg.copy"(%7, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%7, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %[[ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: %[[ALLOCA:.*]] = alloca()		// CHECK: %[[ALLOCA:.*]] = alloca()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: dealloc %[[ALLOC]]		// CHECK: dealloc %[[ALLOC]]
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: return		// CHECK-NEXT: return

// -----		// -----

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @ifElseNestedAlloca		// CHECK-LABEL: func @ifElseNestedAlloca
func @ifElseNestedAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @ifElseNestedAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
%0 = alloca() : memref<2xf32>		%0 = alloca() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %arg1, %0 {		iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
cond_br %arg0,		cond_br %arg0,
^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),		^bb1(%arg1, %0 : memref<2xf32>, memref<2xf32>),
^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)		^bb2(%0, %arg1 : memref<2xf32>, memref<2xf32>)
^bb1(%1: memref<2xf32>, %2: memref<2xf32>):		^bb1(%1: memref<2xf32>, %2: memref<2xf32>):
br ^bb5(%1, %2 : memref<2xf32>, memref<2xf32>)		br ^bb5(%1, %2 : memref<2xf32>, memref<2xf32>)
^bb2(%3: memref<2xf32>, %4: memref<2xf32>):		^bb2(%3: memref<2xf32>, %4: memref<2xf32>):
cond_br %arg0, ^bb3(%3 : memref<2xf32>), ^bb4(%4 : memref<2xf32>)		cond_br %arg0, ^bb3(%3 : memref<2xf32>), ^bb4(%4 : memref<2xf32>)
^bb3(%5: memref<2xf32>):		^bb3(%5: memref<2xf32>):
br ^bb5(%5, %3 : memref<2xf32>, memref<2xf32>)		br ^bb5(%5, %3 : memref<2xf32>, memref<2xf32>)
^bb4(%6: memref<2xf32>):		^bb4(%6: memref<2xf32>):
br ^bb5(%3, %6 : memref<2xf32>, memref<2xf32>)		br ^bb5(%3, %6 : memref<2xf32>, memref<2xf32>)
^bb5(%7: memref<2xf32>, %8: memref<2xf32>):		^bb5(%7: memref<2xf32>, %8: memref<2xf32>):
%9 = alloc() : memref<2xf32>		%9 = alloc() : memref<2xf32>
linalg.generic {		linalg.generic {
args_in = 1 : i64,
args_out = 1 : i64,
indexing_maps = [#map0, #map0],		indexing_maps = [#map0, #map0],
iterator_types = ["parallel"]} %7, %9 {		iterator_types = ["parallel"]}
		ins(%7: memref<2xf32>)
		outs(%9: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
"linalg.copy"(%9, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%9, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}

// CHECK-NEXT: %[[ALLOCA:.*]] = alloca()		// CHECK-NEXT: %[[ALLOCA:.*]] = alloca()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: %[[ALLOC:.*]] = alloc()		// CHECK: %[[ALLOC:.*]] = alloc()
// CHECK-NEXT: linalg.generic		// CHECK-NEXT: linalg.generic
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc %[[ALLOC]]		// CHECK-NEXT: dealloc %[[ALLOC]]
// CHECK-NEXT: return		// CHECK-NEXT: return

// -----		// -----

#map0 = affine_map<(d0) -> (d0)>		#map0 = affine_map<(d0) -> (d0)>

// CHECK-LABEL: func @nestedRegionsAndCondBranchAlloca		// CHECK-LABEL: func @nestedRegionsAndCondBranchAlloca
func @nestedRegionsAndCondBranchAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {		func @nestedRegionsAndCondBranchAlloca(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
cond_br %arg0, ^bb1, ^bb2		cond_br %arg0, ^bb1, ^bb2
^bb1:		^bb1:
br ^bb3(%arg1 : memref<2xf32>)		br ^bb3(%arg1 : memref<2xf32>)
^bb2:		^bb2:
%0 = alloc() : memref<2xf32>		%0 = alloc() : memref<2xf32>
linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg1, %0 {		linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%0: memref<2xf32>) {
^bb0(%gen1_arg0: f32, %gen1_arg1: f32):		^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
%1 = alloca() : memref<2xf32>		%1 = alloca() : memref<2xf32>
linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#map0, #map0], iterator_types = ["parallel"]} %arg1, %1 {		linalg.generic {indexing_maps = [#map0, #map0], iterator_types = ["parallel"]}
		ins(%arg1: memref<2xf32>)
		outs(%1: memref<2xf32>) {
^bb0(%gen2_arg0: f32, %gen2_arg1: f32):		^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
%tmp2 = exp %gen2_arg0 : f32		%tmp2 = exp %gen2_arg0 : f32
linalg.yield %tmp2 : f32		linalg.yield %tmp2 : f32
}: memref<2xf32>, memref<2xf32>		}
%tmp1 = exp %gen1_arg0 : f32		%tmp1 = exp %gen1_arg0 : f32
linalg.yield %tmp1 : f32		linalg.yield %tmp1 : f32
}: memref<2xf32>, memref<2xf32>		}
br ^bb3(%0 : memref<2xf32>)		br ^bb3(%0 : memref<2xf32>)
^bb3(%1: memref<2xf32>):		^bb3(%1: memref<2xf32>):
"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()		"linalg.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
return		return
}		}
// CHECK: (%[[cond:.*]]: {{.*}}, %[[ARG1:.*]]: {{.*}}, %{{.*}}: {{.*}})		// CHECK: (%[[cond:.*]]: {{.*}}, %[[ARG1:.*]]: {{.*}}, %{{.*}}: {{.*}})
// CHECK-NEXT: %[[ALLOC:.*]] = alloc()		// CHECK-NEXT: %[[ALLOC:.*]] = alloc()
// CHECK-NEXT: cond_br %[[cond]], ^[[BB1:.*]], ^[[BB2:.*]]		// CHECK-NEXT: cond_br %[[cond]], ^[[BB1:.*]], ^[[BB2:.*]]
// CHECK: ^[[BB2]]:		// CHECK: ^[[BB2]]:
// CHECK-NEXT: linalg.generic {{{.*}}} %[[ARG1]], %[[ALLOC]]		// CHECK-NEXT: linalg.generic {{{.*}}} ins(%[[ARG1]]{{.*}}outs(%[[ALLOC]]
// CHECK: %[[ALLOCA:.*]] = alloca()		// CHECK: %[[ALLOCA:.*]] = alloca()
// CHECK-NEXT: linalg.generic {{{.*}}} %[[ARG1]], %[[ALLOCA]]		// CHECK-NEXT: linalg.generic {{{.*}}} ins(%[[ARG1]]{{.*}}outs(%[[ALLOCA]]
// CHECK: %{{.*}} = exp		// CHECK: %{{.*}} = exp
// CHECK: ^[[BB3:.*]]({{.*}}):		// CHECK: ^[[BB3:.*]]({{.*}}):
// CHECK: linalg.copy		// CHECK: linalg.copy
// CHECK-NEXT: dealloc %[[ALLOC]]		// CHECK-NEXT: dealloc %[[ALLOC]]

// -----		// -----

// CHECK-LABEL: func @nestedRegionControlFlowAlloca		// CHECK-LABEL: func @nestedRegionControlFlowAlloca

mlir/test/Transforms/copy-removal.mlir


	// CHECK-LABEL: func @test_with_temp_usage_after_copy			// CHECK-LABEL: func @test_with_temp_usage_after_copy
	func @test_with_temp_usage_after_copy() -> memref<5xf32> {			func @test_with_temp_usage_after_copy() -> memref<5xf32> {
	%ret = alloc() : memref<5xf32>			%ret = alloc() : memref<5xf32>
	%res = alloc() : memref<5xf32>			%res = alloc() : memref<5xf32>
	%temp = alloc() : memref<5xf32>			%temp = alloc() : memref<5xf32>
	linalg.copy(%ret, %temp) : memref<5xf32>, memref<5xf32>			linalg.copy(%ret, %temp) : memref<5xf32>, memref<5xf32>
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]} %temp, %res {			iterator_types = ["parallel"]}
				ins(%temp : memref<5xf32>)
				outs(%res : memref<5xf32>) {
	^bb0(%gen1_arg0: f32, %gen1_arg1: f32):			^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: memref<5xf32>, memref<5xf32>			}
	dealloc %ret : memref<5xf32>			dealloc %ret : memref<5xf32>
	return %temp : memref<5xf32>			return %temp : memref<5xf32>
	}			}
	// CHECK-NEXT: %[[ret:.*]] = alloc()			// CHECK-NEXT: %[[ret:.*]] = alloc()
	// CHECK-NEXT: %[[res:.*]] = alloc()			// CHECK-NEXT: %[[res:.*]] = alloc()
	// CHECK-NOT: %{{.*}} = alloc()			// CHECK-NOT: %{{.*}} = alloc()
	// CHECK-NOT: linalg.copy			// CHECK-NOT: linalg.copy
	// CHECK-NOT: dealloc %[[ret]]			// CHECK-NOT: dealloc %[[ret]]
	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @test_ReuseCopyTargetAsSource			// CHECK-LABEL: func @test_ReuseCopyTargetAsSource
	func @test_ReuseCopyTargetAsSource(%arg0: memref<2xf32>, %result: memref<2xf32>){			func @test_ReuseCopyTargetAsSource(%arg0: memref<2xf32>, %result: memref<2xf32>){
	// CHECK-SAME: (%[[ARG0:.*]]: memref<2xf32>, %[[RES:.*]]: memref<2xf32>)			// CHECK-SAME: (%[[ARG0:.*]]: memref<2xf32>, %[[RES:.*]]: memref<2xf32>)
	// CHECK-NOT: %{{.*}} = alloc			// CHECK-NOT: %{{.*}} = alloc
	%temp = alloc() : memref<2xf32>			%temp = alloc() : memref<2xf32>
	// CHECK-NEXT: linalg.generic			// CHECK-NEXT: linalg.generic
	// CHECK-SAME: %[[ARG0]], %[[RES]]			// CHECK-SAME: ins(%[[ARG0]]{{.*}}outs(%[[RES]]
	// CHECK-NOT: linalg.copy(%{{.*}}, %[[RES]])			// CHECK-NOT: linalg.copy(%{{.*}}, %[[RES]])
	// CHECK-NOT: dealloc %{{.*}}			// CHECK-NOT: dealloc %{{.*}}
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]} %arg0, %temp {			iterator_types = ["parallel"]}
				ins(%arg0 : memref<2xf32>)
				outs(%temp : memref<2xf32>) {
	^bb0(%gen2_arg0: f32, %gen2_arg1: f32):			^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
	%tmp2 = exp %gen2_arg0 : f32			%tmp2 = exp %gen2_arg0 : f32
	linalg.yield %tmp2 : f32			linalg.yield %tmp2 : f32
	}: memref<2xf32>, memref<2xf32>			}
	"linalg.copy"(%temp, %result) : (memref<2xf32>, memref<2xf32>) -> ()			"linalg.copy"(%temp, %result) : (memref<2xf32>, memref<2xf32>) -> ()
	dealloc %temp : memref<2xf32>			dealloc %temp : memref<2xf32>
	// CHECK: return			// CHECK: return
	return			return
	}			}

	// -----			// -----

	// Copy operation must not be removed since an operation writes to %to value			// Copy operation must not be removed since an operation writes to %to value
	// before copy.			// before copy.

	#map0 = affine_map<(d0) -> (d0)>			#map0 = affine_map<(d0) -> (d0)>

	// CHECK-LABEL: func @test_ReuseCopyTargetAsSource			// CHECK-LABEL: func @test_ReuseCopyTargetAsSource
	func @test_ReuseCopyTargetAsSource(%arg0: memref<2xf32>){			func @test_ReuseCopyTargetAsSource(%arg0: memref<2xf32>){
	%to = alloc() : memref<2xf32>			%to = alloc() : memref<2xf32>
	%temp = alloc() : memref<2xf32>			%temp = alloc() : memref<2xf32>
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]} %arg0, %temp {			iterator_types = ["parallel"]}
				ins(%arg0 : memref<2xf32>)
				outs(%temp : memref<2xf32>) {
	^bb0(%gen1_arg0: f32, %gen1_arg1: f32):			^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
	%tmp1 = exp %gen1_arg0 : f32			%tmp1 = exp %gen1_arg0 : f32
	linalg.yield %tmp1 : f32			linalg.yield %tmp1 : f32
	}: memref<2xf32>, memref<2xf32>			}
	linalg.generic {			linalg.generic {
	args_in = 1 : i64,
	args_out = 1 : i64,
	indexing_maps = [#map0, #map0],			indexing_maps = [#map0, #map0],
	iterator_types = ["parallel"]} %arg0, %to {			iterator_types = ["parallel"]}
				ins(%arg0 : memref<2xf32>)
				outs(%to : memref<2xf32>) {
	^bb0(%gen2_arg0: f32, %gen2_arg1: f32):			^bb0(%gen2_arg0: f32, %gen2_arg1: f32):
	%tmp2 = exp %gen2_arg0 : f32			%tmp2 = exp %gen2_arg0 : f32
	linalg.yield %tmp2 : f32			linalg.yield %tmp2 : f32
	}: memref<2xf32>, memref<2xf32>			}
	// CHECK: linalg.copy			// CHECK: linalg.copy
	"linalg.copy"(%temp, %to) : (memref<2xf32>, memref<2xf32>) -> ()			"linalg.copy"(%temp, %to) : (memref<2xf32>, memref<2xf32>) -> ()
	dealloc %temp : memref<2xf32>			dealloc %temp : memref<2xf32>
	return			return
	}			}

	// -----			// -----


mlir/test/lib/Transforms/TestBufferPlacement.cpp

	template <bool allowMemrefFunctionResults>			template <bool allowMemrefFunctionResults>
	struct TestBufferPlacementPreparationPass			struct TestBufferPlacementPreparationPass
	: mlir::PassWrapper<			: mlir::PassWrapper<
	TestBufferPlacementPreparationPass<allowMemrefFunctionResults>,			TestBufferPlacementPreparationPass<allowMemrefFunctionResults>,
	OperationPass<ModuleOp>> {			OperationPass<ModuleOp>> {

	/// Converts tensor-type generic linalg operations to memref ones using			/// Converts tensor-type generic linalg operations to memref ones using
	/// buffer assignment.			/// buffer assignment.
				/// TODO: Avoid the copy-pasta by exposing the pattern from BufferPlacement.h
				herhut: Not sure why this exists here. Can this not use the pattern from `TensorToBuffers.h` or is that not exposed there?
				nicolasvasilache (author): It's not exposed and it is not exposable without deeper refactorings because we don't want TensorToBuffer.h to depend on Linalg.
				herhut: There already is a `populateConvertLinalgOnTensorsToBuffersPattern` function, it is just not exposed. It should be enough to just expose that function and call it here. No need to do this in this change, I can clean that up, too, once this landed. We need to figure out where to put all the tensor to buffers pattern anyway, as different dialects will need them and having a `populate` function in passes/transforms/rewrites seems the right approach to me. @tpopp FYI as you looked into patterns for `shape.assuming`.
				/// This probably requires an OpConversionPattern working on generic
				/// Operation*. For now only RewritePattern allow this.
	class GenericOpConverter			class GenericOpConverter
	: public BufferAssignmentOpConversionPattern<linalg::GenericOp> {			: public BufferAssignmentOpConversionPattern<linalg::GenericOp> {
	public:			public:
	using BufferAssignmentOpConversionPattern<			using BufferAssignmentOpConversionPattern<
	linalg::GenericOp>::BufferAssignmentOpConversionPattern;			linalg::GenericOp>::BufferAssignmentOpConversionPattern;

	LogicalResult			LogicalResult
	matchAndRewrite(linalg::GenericOp op, ArrayRef<Value> operands,			matchAndRewrite(linalg::GenericOp op, ArrayRef<Value> operands,
	ConversionPatternRewriter &rewriter) const final {			ConversionPatternRewriter &rewriter) const final {
				linalg::GenericOpAdaptor adaptor(operands,
				op.getOperation()->getAttrDictionary());

				// TODO: support ops with reduction.
				if (!op.init_tensors().empty())
				return failure();

				// All inputs need to be turned into buffers first. Until then, bail out.
				if (llvm::any_of(adaptor.inputs(), [](Value in) {
				return !in.getType().isa<MemRefType>();
				}))
				return failure();

	Location loc = op.getLoc();			Location loc = op.getLoc();
	ResultRange results = op.getOperation()->getResults();			SmallVector<Value, 2> outputBuffers, newOutputBuffers;
	SmallVector<Value, 2> newArgs, newResults;			outputBuffers.assign(adaptor.output_buffers().begin(),
	newArgs.reserve(operands.size() + results.size());			adaptor.output_buffers().end());
	newArgs.append(operands.begin(), operands.end());			newOutputBuffers.reserve(op.getNumOutputs());
	newResults.reserve(results.size());			newOutputBuffers.append(adaptor.output_buffers().begin(),
				adaptor.output_buffers().end());

	// Update all types to memref types.			// Update all types to memref types.
	for (auto result : results) {			for (Type t : op.getResultTypes()) {
	ShapedType type = result.getType().cast<ShapedType>();			auto type = t.cast<ShapedType>();
	assert(type && "Generic operations with non-shaped typed results are "
	"not currently supported.");
	if (!type.hasStaticShape())			if (!type.hasStaticShape())
	return rewriter.notifyMatchFailure(			return rewriter.notifyMatchFailure(
	op, "dynamic shapes not currently supported");			op, "dynamic shapes not currently supported");
	auto memrefType =			auto memrefType =
	MemRefType::get(type.getShape(), type.getElementType());			MemRefType::get(type.getShape(), type.getElementType());
	auto alloc = rewriter.create<AllocOp>(loc, memrefType);			auto alloc = rewriter.create<AllocOp>(loc, memrefType);
	newArgs.push_back(alloc);			newOutputBuffers.push_back(alloc);
	newResults.push_back(alloc);
	}			}

	// Generate a new linalg operation that works on buffers.			// Generate a new linalg operation that works on buffers.
	auto linalgOp = rewriter.create<linalg::GenericOp>(			auto linalgOp = rewriter.create<linalg::GenericOp>(
	loc, llvm::None, newArgs, rewriter.getI64IntegerAttr(operands.size()),			loc,
	rewriter.getI64IntegerAttr(results.size()), op.indexing_maps(),			/*resultTensorTypes=*/ArrayRef<Type>{},
	op.iterator_types(), op.docAttr(), op.library_callAttr(),			/*inputs=*/adaptor.inputs(),
	op.symbol_sourceAttr());			/*outputBuffers=*/newOutputBuffers,
				/*initTensors=*/ValueRange{}, op.indexing_maps(), op.iterator_types(),
				op.docAttr(), op.library_callAttr(), op.symbol_sourceAttr());

	// Create a new block in the region of the new Generic Op.			// Create a new block in the region of the new Generic Op.
	Block &oldBlock = op.getRegion().front();			Block &oldBlock = op.getRegion().front();
	Region &newRegion = linalgOp.region();			Region &newRegion = linalgOp.region();
	Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),			Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),
	oldBlock.getArgumentTypes());			oldBlock.getArgumentTypes());

	// Map the old block arguments to the new ones.
	BlockAndValueMapping mapping;
	mapping.map(oldBlock.getArguments(), newBlock->getArguments());

	// Add the result arguments to the new block.			// Add the result arguments to the new block.
	for (auto result : newResults)			for (Value v : newOutputBuffers)
	newBlock->addArgument(			newBlock->addArgument(v.getType().cast<MemRefType>().getElementType());
	result.getType().cast<ShapedType>().getElementType());

	// Clone the body of the old block to the new block.			// Clone the body of the old block to the new block.
				BlockAndValueMapping mapping;
				for (unsigned i = 0; i < oldBlock.getNumArguments(); i++)
				mapping.map(oldBlock.getArgument(i), newBlock->getArgument(i));

				OpBuilder::InsertionGuard guard(rewriter);
	rewriter.setInsertionPointToEnd(newBlock);			rewriter.setInsertionPointToEnd(newBlock);
	for (auto &op : oldBlock.getOperations())			for (auto &op : oldBlock.getOperations()) {
	rewriter.clone(op, mapping);			Operation *clonedOp = rewriter.clone(op, mapping);
				mapping.map(op.getResults(), clonedOp->getResults());
				}

	// Replace the results of the old Generic Op with the results of the new			// Replace the results of the old op with the new output buffers.
	// one.			rewriter.replaceOp(op, newOutputBuffers);
	rewriter.replaceOp(op, newResults);
	return success();			return success();
	}			}
	};			};

	void populateTensorLinalgToBufferLinalgConversionPattern(			void populateTensorLinalgToBufferLinalgConversionPattern(
	MLIRContext *context, BufferAssignmentTypeConverter *converter,			MLIRContext *context, BufferAssignmentTypeConverter *converter,
	OwningRewritePatternList *patterns) {			OwningRewritePatternList *patterns) {
	populateWithBufferAssignmentOpConversionPatterns<			populateWithBufferAssignmentOpConversionPatterns<
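
Note on the cloning loop in GenericOpConverter above: the pattern seeds a BlockAndValueMapping with old-to-new block arguments, clones each op of the old body through that mapping, and records the cloned results. The following is only a toy sketch of that remapping idea in plain C++, with no MLIR dependency. ValueId, ToyOp and cloneBody are hypothetical names made up for illustration, and every toy op is assumed to produce exactly one result, which is not true of real MLIR ops such as linalg.yield.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an SSA value; NOT an MLIR type (assumption for illustration).
using ValueId = int;

// Toy stand-in for an operation with a single result (assumption).
struct ToyOp {
  std::string name;
  std::vector<ValueId> operands;
  ValueId result;
};

// Clones `body`, rewriting each operand through `mapping`.
// `mapping` is expected to start with old block argument -> new block argument
// entries; it is extended with old result -> cloned result after each op, so
// later ops in the body see the cloned values instead of the originals.
std::vector<ToyOp> cloneBody(const std::vector<ToyOp> &body,
                             std::unordered_map<ValueId, ValueId> mapping,
                             ValueId &nextId) {
  std::vector<ToyOp> cloned;
  cloned.reserve(body.size());
  for (const ToyOp &op : body) {
    ToyOp copy{op.name, {}, nextId++};
    for (ValueId v : op.operands) {
      auto it = mapping.find(v);
      // Values without an entry are kept as-is (defined outside the block).
      copy.operands.push_back(it == mapping.end() ? v : it->second);
    }
    // Record the result mapping so uses further down are redirected.
    mapping[op.result] = copy.result;
    cloned.push_back(copy);
  }
  return cloned;
}
```

In this toy model, cloning a two-op body whose first op reads block argument 0 with the mapping {0 -> 50} makes the cloned first op read 50, and the cloned second op reads the cloned first result rather than the original one.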

mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp

void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName,		void TCParser::printODS(llvm::raw_ostream &os, StringRef cppOpName,
ComprehensionParsingState &state) {		ComprehensionParsingState &state) {
const char *header = R"FMT( def {0} : LinalgStructuredBase_Op<"{1}", [		const char *header = R"FMT( def {0} : LinalgStructuredBase_Op<"{1}", [
NamedStructuredOpTrait,		NamedStructuredOpTrait,
AttrSizedOperandSegments,		AttrSizedOperandSegments,
SingleBlockImplicitTerminator<"YieldOp">]> {		SingleBlockImplicitTerminator<"YieldOp">]> {
let arguments = (ins Variadic<AnyShaped>:$inputs,		let arguments = (ins Variadic<AnyShaped>:$inputs,
Variadic<AnyMemRef>:$output_buffers,		Variadic<AnyMemRef>:$output_buffers,
Variadic<AnyRankedTensor>:$init_tensors);		Variadic<AnyRankedTensor>:$init_tensors);
let results = (outs Variadic<AnyRankedTensor>:$output_tensors);		let results = (outs Variadic<AnyRankedTensor>:$result_tensors);
let regions = (region AnyRegion:$region);		let regions = (region AnyRegion:$region);

let builders = [ OpBuilder<		let builders = [ OpBuilder<
"OpBuilder &b, OperationState &result,"		"OpBuilder &b, OperationState &result,"
"ValueRange inputs, ValueRange outputBuffers",		"ValueRange inputs, ValueRange outputBuffers",
[{{		[{{
result.addOperands(inputs);		result.addOperands(inputs);
result.addOperands(outputBuffers);		result.addOperands(outputBuffers);
