This is an archive of the discontinued LLVM Phabricator instance.

Differential D72555

[mlir][Linalg] Update the semantics, verifier and test for Linalg with tensors.
ClosedPublic

Authored by nicolasvasilache on Jan 10 2020, 11:22 PM.


Details

Reviewers

ftynse
jpienaar
mravishankar
asaadaldien
pifon2a
herhut
stellaraccident
sanjoy.google

Commits

rGf52d71736b10: [mlir][Linalg] Update the semantics, verifier and test for Linalg with tensors.

Summary

This diff fixes issues with the semantics of linalg.generic on tensors that appeared when converting directly from HLO to linalg.generic.
The changes are self-contained within MLIR and can be captured and tested independently of XLA.

The semantics of linalg.generic and linalg.indexed_generic are updated as follows:

To allow progressive lowering from the value world (a.k.a. tensor values) to
the buffer world (a.k.a. memref values), a linalg.generic op accepts
mixing input and output ranked tensor values with input and output memrefs.

%1 = linalg.generic #trait_attribute %A, %B {other-attributes} :
  tensor<?x?xf32>,
  memref<?x?xf32, stride_specification>
  -> (tensor<?x?xf32>)

In this case, the number of outputs (args_out) must match the sum of (1) the
number of output buffer operands and (2) the number of tensor return values.
The semantics is that the op produces (i.e. allocates and fills) its tensor
return values.
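
The counting rule above can be sketched in plain C++. This is a toy model, not the MLIR verifier: `ToyGenericOp`, `OperandKind`, `numOutputBuffers` and `verifyOutputCount` are hypothetical names, and the model assumes that operands are ordered inputs first, then output buffers, and that every output operand is a memref.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: an operand is either a ranked tensor or a memref (buffer).
enum class OperandKind { Tensor, Buffer };

struct ToyGenericOp {
  unsigned argsIn;                    // declared number of inputs (args_in)
  unsigned argsOut;                   // declared number of outputs (args_out)
  std::vector<OperandKind> operands;  // assumed order: inputs, then output buffers
  unsigned numTensorResults;          // number of tensor return values
};

// Number of output buffer operands: every operand after the inputs.
unsigned numOutputBuffers(const ToyGenericOp &op) {
  if (op.operands.size() < op.argsIn)
    return 0;
  return static_cast<unsigned>(op.operands.size()) - op.argsIn;
}

// Checks args_out == (#output buffer operands) + (#tensor return values).
bool verifyOutputCount(const ToyGenericOp &op) {
  if (op.operands.size() < op.argsIn)
    return false;
  // Assumption of this sketch: output operands are always buffers.
  for (std::size_t i = op.argsIn; i < op.operands.size(); ++i)
    if (op.operands[i] != OperandKind::Buffer)
      return false;
  return op.argsOut == numOutputBuffers(op) + op.numTensorResults;
}
```

In this model, the example above (two inputs, no output buffer operand, one tensor return value) passes with args_out = 1, while adding an output buffer operand without raising args_out to 2 fails.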

Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. Such legalization moves tensor return values
into output buffer operands and updates the region arguments accordingly.
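
A rough sketch of the bookkeeping such a legalization implies, tracking counts only. `ToyCounts`, `numOutputs` and `legalizeToBuffers` are hypothetical names, and the assumption that each moved result adds exactly one region argument is this sketch's own, not something stated in the revision.

```cpp
#include <cassert>

// Toy model only: tracks counts, not actual values or types.
struct ToyCounts {
  unsigned numInputs;
  unsigned numOutputBuffers;  // output memref operands
  unsigned numTensorResults;  // tensor return values
  unsigned numRegionArgs;     // arguments of the op's region
};

// Total number of outputs, whether they are buffers or tensor results.
unsigned numOutputs(const ToyCounts &op) {
  return op.numOutputBuffers + op.numTensorResults;
}

// Moves every tensor return value into an output buffer operand.
// Assumption: each new output buffer gets one matching region argument.
ToyCounts legalizeToBuffers(ToyCounts op) {
  op.numOutputBuffers += op.numTensorResults;
  op.numRegionArgs += op.numTensorResults;
  op.numTensorResults = 0;
  return op;
}
```

Under these assumptions the total number of outputs is unchanged by legalization; only the split between buffers and tensor results moves.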

Transformations that create control-flow around linalg.generic and
linalg.indexed_generic operations are not expected to work with tensors
because SSA values do not escape naturally. Still, transformations and
rewrites that take advantage of tensor SSA values are expected to be useful
and will be added in the near future.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nicolasvasilache created this revision. Jan 10 2020, 11:22 PM

Herald added a project: Restricted Project. Jan 10 2020, 11:22 PM

Herald added subscribers: llvm-commits, lucyrfox, mgester and 8 others.

nicolasvasilache edited the summary of this revision. Jan 10 2020, 11:25 PM

nicolasvasilache added reviewers: ftynse, jpienaar, mravishankar, asaadaldien, pifon2a, herhut, stellaraccident, sanjoy.

clang-format.

Unit tests: pass. 61744 tests passed, 0 failed and 780 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: pass. 61744 tests passed, 0 failed and 780 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B43756: Diff 237485! Jan 10 2020, 11:48 PM

Harbormaster failed remote builds in B43755: Diff 237484!

sanjoy edited reviewers, added: sanjoy.google; removed: sanjoy. Jan 12 2020, 1:46 PM

Herald added a subscriber: liufengdb. Jan 12 2020, 1:46 PM

hanchung added a subscriber: hanchung. Jan 13 2020, 12:01 AM

I browsed over the code and left some minor comments. The idea and the code in the verifiers LGTM. I find it much harder to read now with all those similar-sounding accessors, but I assume that is inevitable if you want to support mixed forms.

It seems that all transformations and code generation still assume an all-buffer representation. That is fine (and also makes sense), but I'd prefer to have something emit an error or assert if this is not the case. One less way to shoot myself in the foot.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
693–695	So this example has a single output assuming that there is an order constraint on the arguments (inputs first, then outputs). Could you describe this here?
mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
186	The number of results alone needs to be smaller than nOutputs(). Maybe assert that?
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
189	This iterates over all operands, including output buffers.
mlir/lib/Dialect/Linalg/Transforms/LinalgToLoops.cpp
92–94	This code assumes an all buffer representation? Maybe add an assert?
230	This code assumes an all buffers representation?
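
The guard requested in these comments could look roughly like the following. This is a self-contained toy, not the code that landed: `ToyLinalgOp`, `ValueKind` and `lowerToLoops` are hypothetical, and the diagnostic wording is invented for illustration. Only the idea of `hasBufferSemantics` (the op has only memref inputs and outputs) comes from the interface in this revision.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: a value is either a ranked tensor or a memref (buffer).
enum class ValueKind { Tensor, Buffer };

struct ToyLinalgOp {
  std::vector<ValueKind> operands;  // inputs followed by output buffers
  unsigned numTensorResults;        // tensor return values
};

// True when the op has only memref inputs and outputs.
bool hasBufferSemantics(const ToyLinalgOp &op) {
  if (op.numTensorResults != 0)
    return false;
  for (ValueKind kind : op.operands)
    if (kind != ValueKind::Buffer)
      return false;
  return true;
}

// A transformation that only understands buffers rejects anything else up
// front. Returns an empty string on success, a diagnostic otherwise.
std::string lowerToLoops(const ToyLinalgOp &op) {
  if (!hasBufferSemantics(op))
    return "expected linalg op with buffer semantics";  // hypothetical wording
  // ... the actual lowering would go here ...
  return "";
}
```

Returning a diagnostic rather than asserting is a design choice of this sketch; the replies below mention that asserts were added instead.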

nicolasvasilache marked 8 inline comments as done. Jan 13 2020, 1:28 PM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
693–695	Updated the example, had forgotten this, thanks!
mlir/lib/Dialect/Linalg/Transforms/LinalgToLoops.cpp
92–94	Sprinkled a bunch in the relevant places, thanks!
230	sprinkled a bunch of asserts in the relevant places, thanks!

Address review comments.

Unit tests: pass. 61794 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B43859: Diff 237764. Jan 13 2020, 1:43 PM

ftynse requested changes to this revision. Jan 14 2020, 7:23 AM

ftynse added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
60	Apologies for being obnoxious, but the review would be significantly faster if NFC code motion changes were separate from functional changes (in the former case I can quickly eyeball it to confirm it's indeed NFC, whereas in the latter I need to understand deeply what the code does; without separation, the understanding time also extends to the NFC parts)
132	Why remove the documentation?
573–574	Unclear whether "allocates and fills" extends to both memrefs and tensors, or just tensors since only tensors are "return values".
578	Nit: argument -> arguments
580–582	I cannot understand what "does not mix with tensors" means in this context. Consider rephrasing or providing an example.
698–702	I wonder if you could write this text only once in a tablegen variable for both linalg.generic and linalg.indexed_generic, and then just concatenate that variable with the rest of the op-specific documentation. This would avoid the duplication and everything that comes with it, but may make it harder for a casual reader of the op definition to follow. Since you are the only person actively supporting this and doing the work twice, your call.
mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
123–124	Nit: values->buffers
136	Typo "depending regardless"
138	Nit: remove this `//`, I was confused by it to read the block comment above as describing the function below, which is contradictory.
188	Spurious semicolon
mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp
142–146	This dependence analysis now only works if the output is a buffer, and cannot be queried if it is a tensor. Consider documenting (and maybe extending later to also work for tensors).
148	Do you need any modifications to support tensor inputs here? From a quick look, it should just work, but better check...
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
186	Ultra-nit: functions are not necessarily fun, please use "function" in user-visible messages
188–201	Why +1 here? It's inconsistent with block arguments above.
220	The use of 0-based index here is inconsistent with the 1-based index below. Please adopt a convention, document it in the dialect documentation and use everywhere.
225–239	This looks common with the non-indexed version. Can this be factored out into a helper function?
243	Nit: "inputs and buffer operands" sounds like a false dichotomy.
730	Ultra-nit: whitespace before `(`

This revision now requires changes to proceed. Jan 14 2020, 7:23 AM

Address review comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
60	Apologies for this reordering. I was contemplating extracting and landing an NFC but please note that the code that moved is clearly marked by phabricator with the yellow vertical bar. When you hover over it it says `Moved from line xxx`. If you think that is still not enough I can rebase the NFC part but I personally have found phabricator to be good for this.
132	Because the proper place for this doc is inside the op interface and it is already there, almost word for word.
698–702	Will consider for a followup, thanks!
mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp
142–146	Added an assertion; there is no short-term plan to support such transformations and analyses on tensors. The answer is to allocate buffers first: once in the buffer world, these analyses apply. Mixed mode may appear in the future but would require rethinking things.
148	added assertions above.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
188–201	made it consistent, error messages for function arguments start at 1: e.g. 1st argument etc.
220	Made it consistent, thanks for spotting!
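
One way to keep such a convention in a single place is a small helper that turns a 0-based index into the 1-based ordinal used in messages ("1st argument" and so on). The sketch below is hypothetical: `ordinal` and `argumentTypeError` are not part of the patch, and the message wording is invented.

```cpp
#include <cassert>
#include <string>

// Converts a 0-based index into a 1-based English ordinal: 0 -> "1st".
std::string ordinal(unsigned zeroBasedIndex) {
  unsigned n = zeroBasedIndex + 1;
  unsigned lastTwo = n % 100;
  const char *suffix = "th";
  // 11, 12 and 13 take "th"; otherwise the last digit decides.
  if (lastTwo < 11 || lastTwo > 13) {
    switch (n % 10) {
    case 1:
      suffix = "st";
      break;
    case 2:
      suffix = "nd";
      break;
    case 3:
      suffix = "rd";
      break;
    default:
      break;
    }
  }
  return std::to_string(n) + suffix;
}

// Hypothetical diagnostic text; only the 1-based numbering is the point.
std::string argumentTypeError(unsigned zeroBasedIndex) {
  return "expected " + ordinal(zeroBasedIndex) +
         " function argument to match the operand type";
}
```

Routing every message through one helper means the index base is decided once, which is the consistency asked for in the review.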

Unit tests: pass. 61794 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B43957: Diff 238007. Jan 14 2020, 9:55 AM

Something went wrong with formatting in the commit description (mixed double space prefix and backticks?). Please reformat and feel free to land.

This revision is now accepted and ready to land. Jan 14 2020, 1:55 PM

Rebasing.

nicolasvasilache edited the summary of this revision. Jan 14 2020, 2:22 PM

nicolasvasilache edited the summary of this revision.

nicolasvasilache edited the summary of this revision. Jan 14 2020, 2:24 PM

nicolasvasilache edited the summary of this revision.

Closed by commit rGf52d71736b10: [mlir][Linalg] Update the semantics, verifier and test for Linalg with tensors. (authored by nicolasvasilache). Jan 14 2020, 2:27 PM

This revision was automatically updated to reflect the committed changes.

Unit tests: pass. 61863 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B43999: Diff 238100. Jan 14 2020, 2:37 PM

asaadaldien added inline comments. Jan 14 2020, 3:24 PM

mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h
204	Do we need to `assert(i - getNumInputsAndOutputBuffers() < getOutputTensorTypes().size())` ?
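
To make the suggested bound check concrete, here is a toy sketch of an index that runs over inputs, then output buffers, then tensor results. `ToyShapedOp` and its string-typed "types" are hypothetical stand-ins, and the three-way layout is an assumption based on the accessor names in this revision rather than the actual code at that line.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: types are plain strings instead of MLIR types.
struct ToyShapedOp {
  std::vector<std::string> inputTypes;         // tensor or memref inputs
  std::vector<std::string> outputBufferTypes;  // memref output operands
  std::vector<std::string> outputTensorTypes;  // tensor results

  unsigned getNumInputs() const {
    return static_cast<unsigned>(inputTypes.size());
  }
  unsigned getNumInputsAndOutputBuffers() const {
    return getNumInputs() + static_cast<unsigned>(outputBufferTypes.size());
  }

  // Assumed layout for `i`: inputs, then output buffers, then tensor results.
  const std::string &getShapedType(unsigned i) const {
    if (i < getNumInputs())
      return inputTypes[i];
    if (i < getNumInputsAndOutputBuffers())
      return outputBufferTypes[i - getNumInputs()];
    // The check suggested in the comment above: without it, an index past
    // the last tensor result would read out of bounds.
    assert(i - getNumInputsAndOutputBuffers() < outputTensorTypes.size() &&
           "index out of range");
    return outputTensorTypes[i - getNumInputsAndOutputBuffers()];
  }
};
```

The first two branches are bounded by construction; only the last one relies on the caller, which is why the assert sits there.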

hanchung mentioned this in D74267: [mlir][Linalg] Add a roundtrip test for indexed_generic op with tensors. Feb 7 2020, 4:38 PM

hanchung mentioned this in rG4687822b9e8e: [mlir][Linalg] Add a roundtrip test for indexed_generic op with tensors. Feb 10 2020, 12:55 PM

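Revision Contents

Path	Size
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td	178 lines
mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h	167 lines
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h	2 lines
mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp	10 lines
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp	187 lines
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp	31 lines
mlir/lib/Dialect/Linalg/Transforms/LinalgToLoops.cpp	43 lines
mlir/lib/Dialect/Linalg/Transforms/LinalgTransforms.cpp	14 lines
mlir/lib/Dialect/Linalg/Transforms/Promotion.cpp	11 lines
mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp	9 lines
mlir/test/Dialect/Linalg/invalid.mlir	94 lines
mlir/test/Dialect/Linalg/roundtrip.mlir	24 lines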

Diff 238102

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

[29 lines not shown]	class NOutputs<int args_out> :
NativeOpTrait<"linalg::NOutputs<" # !cast<string>(args_out) # ">::Impl"> {}		NativeOpTrait<"linalg::NOutputs<" # !cast<string>(args_out) # ">::Impl"> {}

def StructuredOpTraits : NativeOpTrait<"linalg::StructuredOpTraits">;		def StructuredOpTraits : NativeOpTrait<"linalg::StructuredOpTraits">;

// The linalg 'LinalgStructuredInterface' provides access to the 'LinalgOp'		// The linalg 'LinalgStructuredInterface' provides access to the 'LinalgOp'
// interface.		// interface.
def LinalgStructuredInterface : OpInterface<"LinalgOp"> {		def LinalgStructuredInterface : OpInterface<"LinalgOp"> {
let methods = [		let methods = [
InterfaceMethod<		//========================================================================//
"Query the number of inputs from the current operation.",		// Loop types handling.
"unsigned", "getNumInputs"		//========================================================================//
>,
InterfaceMethod<
"Query the number of outputs from the current operation.",
"unsigned", "getNumOutputs"
>,
InterfaceMethod<
"Query the number of inputs and outputs from the current operation.",
"unsigned", "getNumInputsAndOutputs"
>,
InterfaceMethod<
"Query the input operands from the current operation.",
"Operation::operand_range", "getInputs"
>,
InterfaceMethod<
"Query the output operands from the current operation.",
"Operation::operand_range", "getOutputs"
>,
InterfaceMethod<
"Query the input and output operands from the current operation.",
"Operation::operand_range", "getInputsAndOutputs"
>,
InterfaceMethod<
"Query the iterator types attribute within the current operation.",
"ArrayAttr", "iterator_types"
>,
InterfaceMethod<
"Query the indexing maps attribute within the current operation.",
"ArrayAttr", "indexing_maps"
>,
InterfaceMethod<		InterfaceMethod<
"Query the number of parallel loops within the current operation.",		"Query the number of parallel loops within the current operation.",
"unsigned", "getNumParallelLoops"		"unsigned", "getNumParallelLoops"
>,		>,
InterfaceMethod<		InterfaceMethod<
"Query the number of reduction loops within the current operation.",		"Query the number of reduction loops within the current operation.",
"unsigned", "getNumReductionLoops"		"unsigned", "getNumReductionLoops"
>,		>,
InterfaceMethod<		InterfaceMethod<
"Query the number of window loops within the current operation.",		"Query the number of window loops within the current operation.",
"unsigned", "getNumWindowLoops"		"unsigned", "getNumWindowLoops"
>,		>,
InterfaceMethod<		InterfaceMethod<
"Query the number of loops within the current operation.",		"Query the number of loops within the current operation.",
"unsigned", "getNumLoops">,		"unsigned", "getNumLoops">,

		//========================================================================//
		// Input arguments handling.
		//========================================================================//
		InterfaceMethod<
		"Query the number of inputs from the current operation.",
		"unsigned", "getNumInputs"
		>,
InterfaceMethod<"Query the input view at the given index.",		InterfaceMethod<"Query the input view at the given index.",
"Value ", "getInput", (ins "unsigned":$i)		"Value ", "getInput", (ins "unsigned":$i)
>,		>,
InterfaceMethod<"Query the output view at the given index.",
"Value ", "getOutput", (ins "unsigned":$i)
>,
InterfaceMethod<[{		InterfaceMethod<[{
Return the index of the given input value `v`, or `None` if the value is		Return the index of the given input value `v`, or `None` if the value is
not an input.		not an input.
}],		}],
"llvm::Optional<unsigned>", "getIndexOfInput", (ins "Value ":$v)		"llvm::Optional<unsigned>", "getIndexOfInput", (ins "Value ":$v)
>,		>,
InterfaceMethod<[{		InterfaceMethod<
Query the index of the given view value, or `None` if the value is not		"Query the input operands from the current operation.",
a view.		"Operation::operand_range", "getInputs"
}],
"llvm::Optional<unsigned>", "getIndexOfOutput", (ins "Value ":$view)
>,		>,
InterfaceMethod<[{		InterfaceMethod<[{
Query the type of the input shape at the given index.		Query the type of the input shape at the given index.
}], "ShapedType", "getInputShapedType", (ins "unsigned":$i)>,		}], "ShapedType", "getInputShapedType", (ins "unsigned":$i)>,
InterfaceMethod<[{		InterfaceMethod<[{
Query the type of the output view at the given index.
}], "ShapedType", "getOutputShapedType", (ins "unsigned":$i)>,
InterfaceMethod<[{
Query whether the op has only MemRef input and outputs.
}], "bool", "hasBufferSemantics">,
InterfaceMethod<[{
Query the subset of input operands that are of ranked tensor type.		Query the subset of input operands that are of ranked tensor type.
}], "SmallVector<RankedTensorType, 4>", "getInputTensorTypes">,		}], "SmallVector<RankedTensorType, 4>", "getInputTensorTypes">,


		//========================================================================//
		// Output arguments handling.
		//========================================================================//
		InterfaceMethod<
		"Query the number of outputs from the current operation.",
		"unsigned", "getNumOutputs"
		>,
		InterfaceMethod<"Query the output buffer at the given index.",
		"Value ", "getOutputBuffer", (ins "unsigned":$i)
		>,
		InterfaceMethod<[{
		Query the index of the given buffer value, or `None` if the value is not
		part of the output buffers.
		}],
		"llvm::Optional<unsigned>", "getIndexOfOutputBuffer", (ins "Value ":$view)
		>,
		InterfaceMethod<[{
		Query the type of the output buffer at the given index.
		}], "MemRefType", "getOutputBufferType", (ins "unsigned":$i)>,
InterfaceMethod<[{		InterfaceMethod<[{
Query the subset of output operands that are of ranked tensor type.		Query the results that are of ranked tensor type.
}], "SmallVector<RankedTensorType, 4>", "getOutputTensorTypes">,		}], "SmallVector<RankedTensorType, 4>", "getOutputTensorTypes">,
		InterfaceMethod<
		"Query the output buffers (operands) from the current operation.",
		"Operation::operand_range", "getOutputBuffers"
		>,

		//========================================================================//
		// Input and Output arguments handling.
		//========================================================================//
		InterfaceMethod<
		"Return the number of inputs and outputs, irrespective of their buffer "
		"or tensor type.",
		"unsigned", "getNumInputsAndOutputs"
		>,
		InterfaceMethod<
		"Return the number of inputs, irrespective of their buffer or tensor "
		"type, and output buffers",
		"unsigned", "getNumInputsAndOutputBuffers"
		>,
		InterfaceMethod<
		"Return the range over inputs (irrespective of type) and output buffers.",
		"Operation::operand_range", "getInputsAndOutputBuffers"
		>,

		//========================================================================//
		// Other interface methods.
		//========================================================================//
		InterfaceMethod<
		"Query the iterator types attribute within the current operation.",
		"ArrayAttr", "iterator_types"
		>,
		InterfaceMethod<
		"Query the indexing maps attribute within the current operation.",
		"ArrayAttr", "indexing_maps"
		>,
		InterfaceMethod<[{
		Query whether the op has only MemRef input and outputs.
		}], "bool", "hasBufferSemantics">,

		//========================================================================//
		// Other static interface methods.
		//========================================================================//
StaticInterfaceMethod<[{		StaticInterfaceMethod<[{
Create an operation of the current type with the given location,		Create an operation of the current type with the given location,
operands, and attributes.		operands, and attributes.
}],		}],
"Operation *", "create",		"Operation *", "create",
(ins "OpBuilder &":$builder, "Location":$loc,		(ins "OpBuilder &":$builder, "Location":$loc,
"ValueRange":$operands,		"ValueRange":$operands,
"ArrayRef<NamedAttribute>":$attributes), [{		"ArrayRef<NamedAttribute>":$attributes), [{
return builder.create<ConcreteOp>(loc, ArrayRef<Type>{}, operands,		return builder.create<ConcreteOp>(loc, ArrayRef<Type>{}, operands,
attributes);		attributes);
}]		}]
>,		>,

/// Clone an operation with the given location and operands. This is used to
/// abstract away the optional underlying region creation.
InterfaceMethod<[{		InterfaceMethod<[{
Clone the current operation with the given location and operands. This		Clone the current operation with the given location and operands. This
is used to abstract away the optional underlying region creation.		is used to abstract away the optional underlying region creation.
}],		}],
"Operation *", "clone",		"Operation *", "clone",
(ins "OpBuilder &":$b, "Location":$loc, "ValueRange":$operands), [{		(ins "OpBuilder &":$b, "Location":$loc, "ValueRange":$operands), [{
BlockAndValueMapping map;		BlockAndValueMapping map;
unsigned numRegions = op.getOperation()->getNumRegions();		unsigned numRegions = op.getOperation()->getNumRegions();
[389 lines not shown]	let description = [{
}		}
```		```

To allow progressive lowering from the value world (a.k.a tensor values) to		To allow progressive lowering from the value world (a.k.a tensor values) to
the buffer world (a.k.a memref values), a `linalg.generic` op accepts		the buffer world (a.k.a memref values), a `linalg.generic` op accepts
mixing input and output ranked tensor values with input and output memrefs.		mixing input and output ranked tensor values with input and output memrefs.

```mlir		```mlir
%1 = linalg.generic #trait_attribute %A, %B, %C {other-attributes} :		%C = linalg.generic #trait_attribute %A, %B {other-attributes} :
tensor<?x?xf32>,		tensor<?x?xf32>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>
tensor<?x?xf32>
-> (tensor<?x?xf32>)		-> (tensor<?x?xf32>)
```		```

In this case, the number of return values must match the number of output		In this case, the number of outputs (args_out) must match the sum of (1) the
tensor arguments. The semantics is that the `linalg.generic` op		number of output buffer operands and (2) the number of tensor return values.
produces (i.e. allocates and fills) its return values.		The semantics is that the `linalg.indexed_generic` op produces (i.e.
		allocates and fills) its tensor return values.

Tensor values must be legalized by a buffer allocation pass before most		Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. In particular, transformations that create		transformations can be applied. Such legalization moves tensor return values
control flow around linalg.generic operations are not expected to mix with		into output buffer operands and updates the region arguments accordingly.
tensors because SSA values do not escape naturally. Still, transformations
and rewrites that take advantage of tensor SSA values are expected to be		Transformations that create control-flow around linalg.indexed_generic
useful and will be added in the near future.		operations are not expected to work with tensors because SSA values do not
		escape naturally. Still, transformations and rewrites that take advantage of
		tensor SSA values are expected to be useful and will be added in the near
		future.
}];		}];
let verifier = [{ return ::verify(*this); }];		let verifier = [{ return ::verify(*this); }];
}		}

def IndexedGenericOp : GenericOpBase<"indexed_generic"> {		def IndexedGenericOp : GenericOpBase<"indexed_generic"> {
let description = [{		let description = [{
Indexed Generic Linalg op form where the key properties of the computation		Indexed Generic Linalg op form where the key properties of the computation
are specified as attributes. In pretty form, a linalg.indexed_generic op is		are specified as attributes. In pretty form, a linalg.indexed_generic op is
[91 lines not shown]	let description = [{
```		```

To allow progressive lowering from the value world (a.k.a tensor values) to		To allow progressive lowering from the value world (a.k.a tensor values) to
the buffer world (a.k.a memref values), a `linalg.indexed_generic` op		the buffer world (a.k.a memref values), a `linalg.indexed_generic` op
accepts mixing input and output ranked tensor values with input and output		accepts mixing input and output ranked tensor values with input and output
memrefs.		memrefs.

```mlir		```mlir
%1 = linalg.indexed_generic #trait_attribute %A, %B, %C {other-attributes}		%C = linalg.indexed_generic #trait_attribute %A, %B {other-attributes}
: tensor<?x?xf32>,		: tensor<?x?xf32>,
memref<?x?xf32, stride_specification>,		memref<?x?xf32, stride_specification>
tensor<?x?xf32>
-> (tensor<?x?xf32>)		-> (tensor<?x?xf32>)
```		```

In this case, the number of return values must match the number of output		In this case, the number of outputs (args_out) must match the sum of (1) the
tensor arguments. The semantics is that the `linalg.indexed_generic` op		number of output buffer operands and (2) the number of tensor return values.
produces (i.e. allocates and fills) its return values.		The semantics is that the `linalg.indexed_generic` op produces (i.e.
		allocates and fills) its return values.

Tensor values must be legalized by a buffer allocation pass before most		Tensor values must be legalized by a buffer allocation pass before most
transformations can be applied. In particular, transformations that create		transformations can be applied. Such legalization moves tensor return values
control flow around linalg.generic operations are not expected to mix with		into output buffer operands and updates the region argument accordingly.
tensors because SSA values do not escape naturally. Still, transformations
and rewrites that take advantage of tensor SSA values are expected to be		Transformations that create control-flow around linalg.indexed_generic
useful and will be added in the near future.		operations are not expected to work with tensors because SSA values do not
		escape naturally. Still, transformations and rewrites that take advantage of
		tensor SSA values are expected to be useful and will be added in the near
		future.
}];		}];
let verifier = [{ return ::verify(*this); }];		let verifier = [{ return ::verify(*this); }];
}		}

#endif // LINALG_STRUCTURED_OPS		#endif // LINALG_STRUCTURED_OPS

mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h

	[52 lines not shown]
	/// `getNumOutputs`. Use as a trait as follows:			/// `getNumOutputs`. Use as a trait as follows:
	///			///
	/// class DotOp : public Op<DotOp, OpTrait::StructuredOpTraits> {			/// class DotOp : public Op<DotOp, OpTrait::StructuredOpTraits> {
	///			///
	template <typename ConcreteType>			template <typename ConcreteType>
	class StructuredOpTraits			class StructuredOpTraits
	: public OpTrait::TraitBase<ConcreteType, StructuredOpTraits> {			: public OpTrait::TraitBase<ConcreteType, StructuredOpTraits> {
	private:			private:
	/// Return the number of inputs. For internal use only.			/// Return the number of inputs, irrespective of their buffer or tensor type.
				/// For internal use only.
	unsigned nInputs() {			unsigned nInputs() {
	return cast<ConcreteType>(this->getOperation()).getNumInputs();			return cast<ConcreteType>(this->getOperation()).getNumInputs();
	}			}
	/// Return the number of outputs. For internal use only.			/// Return the number of outputs, irrespective of their buffer or tensor type.
				/// For internal use only.
	unsigned nOutputs() {			unsigned nOutputs() {
	return cast<ConcreteType>(this->getOperation()).getNumOutputs();			return cast<ConcreteType>(this->getOperation()).getNumOutputs();
	}			}

	public:			public:
				//==========================================================================//
				// Loop types handling.
				//==========================================================================//
				unsigned getNumParallelLoops() {
				return getNumIterators(
				getParallelIteratorTypeName(),
				cast<ConcreteType>(this->getOperation()).iterator_types());
				}
				unsigned getNumReductionLoops() {
				return getNumIterators(
				getReductionIteratorTypeName(),
				cast<ConcreteType>(this->getOperation()).iterator_types());
				}
				unsigned getNumWindowLoops() {
				return getNumIterators(
				getWindowIteratorTypeName(),
				cast<ConcreteType>(this->getOperation()).iterator_types());
				}
				unsigned getNumLoops() {
				return getNumIterators(
				cast<ConcreteType>(this->getOperation()).iterator_types());
				}

				//==========================================================================//
				// Input arguments handling.
				//==========================================================================//
				// The `i^th` input argument is always the `i^th` operand regardless of
				// whether we have tensors or buffers.
				//
	/// Return the `i`-th input value.			/// Return the `i`-th input value.
	Value getInput(unsigned i) {			Value getInput(unsigned i) {
	assert(i < nInputs());			assert(i < nInputs());
	return this->getOperation()->getOperand(i);			return this->getOperation()->getOperand(i);
	}			}
	/// Return the index of `value` in the list of inputs if found, llvm::None			/// Return the index of `value` in the list of inputs if found, llvm::None
	/// otherwise.			/// otherwise.
	Optional<unsigned> getIndexOfInput(Value value) {			Optional<unsigned> getIndexOfInput(Value value) {
	auto it = llvm::find(getInputs(), value);			auto it = llvm::find(getInputs(), value);
	if (it != getInputs().end())			if (it != getInputs().end())
	return it - getInputs().begin();			return it - getInputs().begin();
	return llvm::None;			return llvm::None;
	}			}
	/// Return the `i`-th input buffer type.			/// Return the `i`-th input buffer type.
	ShapedType getInputShapedType(unsigned i) {			ShapedType getInputShapedType(unsigned i) {
	return getInput(i).getType().template cast<ShapedType>();			return getInput(i).getType().template cast<ShapedType>();
	}			}
	/// Return the range over inputs.			/// Return the range over inputs.
	Operation::operand_range getInputs() {			Operation::operand_range getInputs() {
	auto range = this->getOperation()->getOperands();			auto range = this->getOperation()->getOperands();
	return {range.begin(), range.begin() + nInputs()};			return {range.begin(), range.begin() + nInputs()};
	}			}
	/// Return the `i`-th output.
	Value getOutput(unsigned i) {
	return this->getOperation()->getOperand(nInputs() + i);
	}
	/// Return the index of `value` in the list of output values if found,
	/// llvm::None otherwise.
	Optional<unsigned> getIndexOfOutput(Value value) {
	auto it = llvm::find(getOutputs(), value);
	if (it != getOutputs().end())
	return it - getOutputs().begin();
	return llvm::None;
	}
	/// Return the `i`-th output buffer type.
	ShapedType getOutputShapedType(unsigned i) {
	return getOutput(i).getType().template cast<ShapedType>();
	}
	/// Query whether the op has only MemRef input and outputs.
	bool hasBufferSemantics() {
	return this->getOperation()->getNumResults() == 0 &&
	llvm::all_of(getInputsAndOutputs(),
	[](Value v) { return v.getType().isa<MemRefType>(); });
	}
	/// Query the subset of input operands that are of ranked tensor type.			/// Query the subset of input operands that are of ranked tensor type.
				ftynse (Done): Nit: values->buffers
	SmallVector<RankedTensorType, 4> getInputTensorTypes() {			SmallVector<RankedTensorType, 4> getInputTensorTypes() {
	SmallVector<RankedTensorType, 4> res;			SmallVector<RankedTensorType, 4> res;
	for (Type type : getInputs().getTypes())			for (Type type : getInputs().getTypes())
	if (auto t = type.template dyn_cast<RankedTensorType>())			if (auto t = type.template dyn_cast<RankedTensorType>())
	res.push_back(t);			res.push_back(t);
	return res;			return res;
	}			}
	/// Query the subset of output operands that are of ranked tensor type.
				//==========================================================================//
				// Output arguments handling.
				//==========================================================================//
				// The `i^th` output argument is an operand (resp. a return value) iff it is
				ftynse (Done): Typo "depending regardless"
				// a value of buffer type (resp. a return value of tensor type).

				ftynse (Done): Nit: remove this `//`. It confused me into reading the block comment above as describing the function below, which is contradictory.
				/// Return the `i`-th output, asserts that this is a buffer operand and not
				/// a tensor result.
				Value getOutputBuffer(unsigned i) {
				assert(i + this->getOperation()->getNumResults() < nOutputs() &&
				"overflowing output buffer index");
				return this->getOperation()->getOperand(nInputs() + i);
				}
				/// Return the index of `value` in the list of output buffers if found,
				/// llvm::None otherwise.
				Optional<unsigned> getIndexOfOutputBuffer(Value value) {
				auto it = llvm::find(getOutputBuffers(), value);
				if (it != getOutputBuffers().end())
				return it - getOutputBuffers().begin();
				return llvm::None;
				}
				/// Return the `i`-th output buffer type.
				MemRefType getOutputBufferType(unsigned i) {
				return getOutputBuffer(i).getType().template cast<MemRefType>();
				}
				/// Return the `i`-th output shaped type, irrespective of buffer of tensor
				/// type.
				ShapedType getOutputShapedType(unsigned i) {
				return getShapedType(i + nInputs());
				}
				/// Query the subset of results that are of ranked tensor type.
	SmallVector<RankedTensorType, 4> getOutputTensorTypes() {			SmallVector<RankedTensorType, 4> getOutputTensorTypes() {
	SmallVector<RankedTensorType, 4> res;			SmallVector<RankedTensorType, 4> res;
	for (Type type : getOutputs().getTypes())			for (Type type : this->getOperation()->getResults().getTypes())
	if (auto t = type.template dyn_cast<RankedTensorType>())			res.push_back(type.template cast<RankedTensorType>());
	res.push_back(t);
	return res;			return res;
	}			}
	/// Return the range over outputs.			/// Return the range over outputs.
	Operation::operand_range getOutputs() {			Operation::operand_range getOutputBuffers() {
	auto range = this->getOperation()->getOperands();			auto range = this->getOperation()->getOperands();
	return {range.begin() + nInputs(),			return {range.begin() + nInputs(),
	range.begin() + getNumInputsAndOutputs()};			range.begin() + getNumInputsAndOutputBuffers()};
	}			}
	/// Return the number of inputs and outputs.
				//==========================================================================//
				// Input and Output arguments handling.
				//==========================================================================//
				/// Return the number of inputs and outputs, irrespective of their buffer or
				/// tensor type.
	unsigned getNumInputsAndOutputs() { return nInputs() + nOutputs(); }			unsigned getNumInputsAndOutputs() { return nInputs() + nOutputs(); }
	/// Return the `i`-th buffer type.			/// Return the number of inputs, irrespective of their buffer or tensor type,
	ShapedType getShapedType(unsigned i) {			/// and output buffers.
	return (i < nInputs()) ? getInputShapedType(i)			unsigned getNumInputsAndOutputBuffers() {
	: getOutputShapedType(i - nInputs());			assert(this->getOperation()->getNumResults() <= nOutputs());
				herhut (Done): The number of results alone needs to be smaller than nOutputs(). Maybe assert that?
				return nInputs() + nOutputs() - this->getOperation()->getNumResults();
	}			}
				ftynse (Done): Spurious semicolon
	/// Return the range over inputs and outputs.			/// Return the range over inputs (irrespective of type) and output buffers.
	Operation::operand_range getInputsAndOutputs() {			Operation::operand_range getInputsAndOutputBuffers() {
	auto range = this->getOperation()->getOperands();			auto range = this->getOperation()->getOperands();
	return {range.begin(), range.begin() + getNumInputsAndOutputs()};			return {range.begin(), range.begin() + getNumInputsAndOutputBuffers()};
	}			}
	unsigned getNumParallelLoops() {			/// Return the `i`-th shaped type, there are 3 cases:
	return getNumIterators(			/// 1. if `i < nInputs()` then return `getInputShapedType(i)`; otherwise
	getParallelIteratorTypeName(),			/// 2. if `i < getNumInputsAndOutputBuffers()` then return the
	cast<ConcreteType>(this->getOperation()).iterator_types());			/// `getOutputBufferType(i - nInputs())`; otherwise
	}			/// 3. return the `i - getNumInputsAndOutputBuffers()` result type.
	unsigned getNumReductionLoops() {			ShapedType getShapedType(unsigned i) {
	return getNumIterators(			if (i < nInputs())
	getReductionIteratorTypeName(),			return getInputShapedType(i);
	cast<ConcreteType>(this->getOperation()).iterator_types());			if (i < getNumInputsAndOutputBuffers())
	}			return getOutputBufferType(i - nInputs()).template cast<ShapedType>();
	unsigned getNumWindowLoops() {			return getOutputTensorTypes()[i - getNumInputsAndOutputBuffers()]
				asaadaldien (Not Done): Do we need to `assert(i - getNumInputsAndOutputBuffers() < getOutputTensorTypes().size())`?
	return getNumIterators(			.template cast<ShapedType>();
	getWindowIteratorTypeName(),
	cast<ConcreteType>(this->getOperation()).iterator_types());
	}			}
	unsigned getNumLoops() {
	return getNumIterators(			//==========================================================================//
	cast<ConcreteType>(this->getOperation()).iterator_types());			// Other interface methods.
				//==========================================================================//
				/// Query whether the op has only buffer inputs and no returns.
				bool hasBufferSemantics() {
				return this->getOperation()->getNumResults() == 0 &&
				llvm::all_of(getInputs(),
				[](Value v) { return v.getType().isa<MemRefType>(); });
	}			}

				//==========================================================================//
				// Other static interface methods.
				//==========================================================================//
	static LogicalResult verifyTrait(Operation *op) {			static LogicalResult verifyTrait(Operation *op) {
	auto nOperands = cast<ConcreteType>(op).getNumInputsAndOutputs();			auto nOperands = cast<ConcreteType>(op).getNumInputsAndOutputBuffers();
	if (failed(OpTrait::impl::verifyAtLeastNOperands(op, nOperands)))			if (failed(OpTrait::impl::verifyAtLeastNOperands(op, nOperands)))
	return failure();			return failure();
	return success();			return success();
	}			}
	};			};

	} // namespace linalg			} // namespace linalg
	} // namespace OpTrait			} // namespace OpTrait
	} // namespace mlir			} // namespace mlir

	#endif // MLIR_DIALECT_LINALG_LINALGTRAITS_H_			#endif // MLIR_DIALECT_LINALG_LINALGTRAITS_H_
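The three-case lookup in `getShapedType` above can be illustrated with a small standalone sketch. It does not use MLIR: `ToyStructuredOp` and its string-typed fields are hypothetical stand-ins, assuming only the layout described in the diff (operands are inputs followed by output buffers, and tensor outputs are op results). It also includes the bounds assertion asked about in the review comment on the result-type case.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a structured op (not an MLIR class).
// Operands are laid out as [inputs..., output buffers...]; tensor outputs
// are results of the op rather than operands.
struct ToyStructuredOp {
  std::vector<std::string> operandTypes; // inputs followed by output buffers
  std::vector<std::string> resultTypes;  // tensor results
  unsigned numInputs;                    // inputs, tensor or buffer
  unsigned numOutputs;                   // output buffers + tensor results

  unsigned getNumInputsAndOutputBuffers() const {
    assert(resultTypes.size() <= numOutputs);
    return numInputs + numOutputs - static_cast<unsigned>(resultTypes.size());
  }

  // Cases 1 and 2 (inputs and output buffers) are both operands in this toy
  // model, so they share one branch; case 3 indexes into the results.
  const std::string &getShapedType(unsigned i) const {
    if (i < getNumInputsAndOutputBuffers())
      return operandTypes[i];
    unsigned resultIdx = i - getNumInputsAndOutputBuffers();
    assert(resultIdx < resultTypes.size() && "shaped type index out of range");
    return resultTypes[resultIdx];
  }
};
```

With one tensor input, one output buffer and one tensor result, indices 0 and 1 resolve to operands and index 2 resolves to the result.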

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h

[... 113 lines hidden ...]	Optional<FusionInfo> fuseProducerOf(OpBuilder &b, LinalgOp consumer,
OperationFolder *folder = nullptr);		OperationFolder *folder = nullptr);

/// Returns the linearized list of all view dimensions in a linalgOp. Applying		/// Returns the linearized list of all view dimensions in a linalgOp. Applying
/// the inverse, concatenated loopToOperandRangeMaps to this list allows the		/// the inverse, concatenated loopToOperandRangeMaps to this list allows the
/// derivation of loop ranges for any linalgOp.		/// derivation of loop ranges for any linalgOp.
template <typename ConcreteOp>		template <typename ConcreteOp>
SmallVector<Value, 8> getViewSizes(ConcreteOp linalgOp) {		SmallVector<Value, 8> getViewSizes(ConcreteOp linalgOp) {
SmallVector<Value, 8> res;		SmallVector<Value, 8> res;
for (auto v : linalgOp.getInputsAndOutputs()) {		for (auto v : linalgOp.getInputsAndOutputBuffers()) {
MemRefType t = v.getType().template cast<MemRefType>();		MemRefType t = v.getType().template cast<MemRefType>();
for (unsigned i = 0; i < t.getRank(); ++i)		for (unsigned i = 0; i < t.getRank(); ++i)
res.push_back(edsc::intrinsics::dim(v, i));		res.push_back(edsc::intrinsics::dim(v, i));
}		}
return res;		return res;
}		}

/// Returns the values obtained by applying `map` to the list of values.		/// Returns the values obtained by applying `map` to the list of values.
[... 111 lines hidden ...]

mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp

[... 133 lines hidden ...]	LinalgDependenceGraph::getDependencesInto(
Operation *dst, LinalgDependenceGraph::DependenceType dt) const {		Operation *dst, LinalgDependenceGraph::DependenceType dt) const {
auto iter = dependencesIntoGraphs[dt].find(dst);		auto iter = dependencesIntoGraphs[dt].find(dst);
if (iter == dependencesIntoGraphs[dt].end())		if (iter == dependencesIntoGraphs[dt].end())
return llvm::make_range(nullptr, nullptr);		return llvm::make_range(nullptr, nullptr);
return llvm::make_range(iter->second.begin(), iter->second.end());		return llvm::make_range(iter->second.begin(), iter->second.end());
}		}

void LinalgDependenceGraph::addDependencesBetween(LinalgOp src, LinalgOp dst) {		void LinalgDependenceGraph::addDependencesBetween(LinalgOp src, LinalgOp dst) {
for (auto srcView : src.getOutputs()) { // W		assert(src.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(dst.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		for (auto srcView : src.getOutputBuffers()) { // W
		ftynse (Done): This dependence analysis now only works if the output is a buffer, and cannot be queried if it is a tensor. Consider documenting (and maybe extending later to also work for tensors).
		nicolasvasilache (Author, Done): Added an assertion. There is no short-term plan to support such transformations and analyses on tensors. The answer is to run buffer allocation first; the op is then in the buffer world and these analyses apply. Mixed mode may appear in the future but would require rethinking things.
// RAW graph		// RAW graph
for (auto dstView : dst.getInputs()) { // R		for (auto dstView : dst.getInputs()) { // R
		ftynse (Done): Do you need any modifications to support tensor inputs here? From a quick look, it should just work, but better check...
		nicolasvasilache (Author, Done): Added assertions above.
if (aliases.alias(srcView, dstView)) { // if alias, fill RAW		if (aliases.alias(srcView, dstView)) { // if alias, fill RAW
addDependenceElem(DependenceType::RAW,		addDependenceElem(DependenceType::RAW,
LinalgOpView{src.getOperation(), srcView},		LinalgOpView{src.getOperation(), srcView},
LinalgOpView{dst.getOperation(), dstView});		LinalgOpView{dst.getOperation(), dstView});
}		}
}		}
// WAW graph		// WAW graph
for (auto dstView : dst.getOutputs()) { // W		for (auto dstView : dst.getOutputBuffers()) { // W
if (aliases.alias(srcView, dstView)) { // if alias, fill WAW		if (aliases.alias(srcView, dstView)) { // if alias, fill WAW
addDependenceElem(DependenceType::WAW,		addDependenceElem(DependenceType::WAW,
LinalgOpView{src.getOperation(), srcView},		LinalgOpView{src.getOperation(), srcView},
LinalgOpView{dst.getOperation(), dstView});		LinalgOpView{dst.getOperation(), dstView});
}		}
}		}
}		}
for (auto srcView : src.getInputs()) { // R		for (auto srcView : src.getInputs()) { // R
// RAR graph		// RAR graph
for (auto dstView : dst.getInputs()) { // R		for (auto dstView : dst.getInputs()) { // R
if (aliases.alias(srcView, dstView)) { // if alias, fill RAR		if (aliases.alias(srcView, dstView)) { // if alias, fill RAR
addDependenceElem(DependenceType::RAR,		addDependenceElem(DependenceType::RAR,
LinalgOpView{src.getOperation(), srcView},		LinalgOpView{src.getOperation(), srcView},
LinalgOpView{dst.getOperation(), dstView});		LinalgOpView{dst.getOperation(), dstView});
}		}
}		}
// WAR graph		// WAR graph
for (auto dstView : dst.getOutputs()) { // W		for (auto dstView : dst.getOutputBuffers()) { // W
if (aliases.alias(srcView, dstView)) { // if alias, fill WAR		if (aliases.alias(srcView, dstView)) { // if alias, fill WAR
addDependenceElem(DependenceType::WAR,		addDependenceElem(DependenceType::WAR,
LinalgOpView{src.getOperation(), srcView},		LinalgOpView{src.getOperation(), srcView},
LinalgOpView{dst.getOperation(), dstView});		LinalgOpView{dst.getOperation(), dstView});
}		}
}		}
}		}
}		}
[... 54 lines hidden ...]
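The loop structure of `addDependencesBetween` above can be sketched outside MLIR as follows. This is only an illustration under stated assumptions: `ToyLinalgOp`, `ToyDependence`, `toyAlias` and `collectDependences` are hypothetical names, views are modeled as integer buffer ids, and two views are taken to alias exactly when their ids are equal (the real code queries an alias analysis instead).

```cpp
#include <cassert>
#include <vector>

enum class DependenceType { RAW, WAW, RAR, WAR };

// Hypothetical stand-in for a linalg op with buffer semantics:
// views are modeled as integer buffer ids.
struct ToyLinalgOp {
  std::vector<int> inputs;        // views that are read
  std::vector<int> outputBuffers; // views that are written
};

struct ToyDependence {
  DependenceType type;
  int srcView;
  int dstView;
};

// Assumption: two views alias iff they have the same id.
inline bool toyAlias(int a, int b) { return a == b; }

std::vector<ToyDependence> collectDependences(const ToyLinalgOp &src,
                                              const ToyLinalgOp &dst) {
  std::vector<ToyDependence> deps;
  for (int srcView : src.outputBuffers) { // W
    for (int dstView : dst.inputs)        // R
      if (toyAlias(srcView, dstView))
        deps.push_back({DependenceType::RAW, srcView, dstView});
    for (int dstView : dst.outputBuffers) // W
      if (toyAlias(srcView, dstView))
        deps.push_back({DependenceType::WAW, srcView, dstView});
  }
  for (int srcView : src.inputs) {        // R
    for (int dstView : dst.inputs)        // R
      if (toyAlias(srcView, dstView))
        deps.push_back({DependenceType::RAR, srcView, dstView});
    for (int dstView : dst.outputBuffers) // W
      if (toyAlias(srcView, dstView))
        deps.push_back({DependenceType::WAR, srcView, dstView});
  }
  return deps;
}
```

Because only operands are visited, an op that returns tensor results has nothing to contribute on the write side in this model, which is consistent with the buffer-semantics assertions added in the diff.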

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

[... 106 lines hidden ...]	static ParseResult parseGenericOp(OpAsmParser &parser, OperationState &result) {
return parser.resolveOperands(operandsInfo, operandTypes,		return parser.resolveOperands(operandsInfo, operandTypes,
parser.getCurrentLocation(), result.operands);		parser.getCurrentLocation(), result.operands);
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyBlockArgs(GenericOpType op, Block &block);		static LogicalResult verifyBlockArgs(GenericOpType op, Block &block);

template <> LogicalResult verifyBlockArgs(GenericOp op, Block &block) {		template <> LogicalResult verifyBlockArgs(GenericOp op, Block &block) {
auto nViews = op.getNumInputsAndOutputs();		auto nOperands = op.getNumOperands();
auto nInputViews = op.getNumInputs();		if (block.getNumArguments() != nOperands)
if (block.getNumArguments() != nViews)		return op.emitOpError("expected number of block arguments to match number "
return op.emitOpError(		"of operands");
"expected number of block arguments to match number of views");

for (unsigned i = 0; i < nViews; ++i) {		// Note: the number and type of yield values are checked in the YieldOp.
		auto nInputViews = op.getNumInputs();
		for (unsigned i = 0; i < nOperands; ++i) {
auto viewType = op.getShapedType(i);		auto viewType = op.getShapedType(i);
if (viewType.getElementType() != block.getArgument(i).getType())		if (viewType.getElementType() != block.getArgument(i).getType())
return op.emitOpError("expected block argument ")		return op.emitOpError("expected block argument ")
<< i << " of the same type as elemental type of "		<< (i + 1) << " of the same type as elemental type of "
<< ((i < nInputViews) ? "input " : "output ")		<< ((i < nInputViews) ? "input " : "output ")
<< "view: " << viewType;		<< "operand: " << viewType;
}		}
return success();		return success();
}		}

template <> LogicalResult verifyBlockArgs(IndexedGenericOp op, Block &block) {		template <> LogicalResult verifyBlockArgs(IndexedGenericOp op, Block &block) {
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nViews = op.getNumInputsAndOutputs();		auto nOperands = op.getNumOperands();
if (block.getNumArguments() != nViews + nLoops)		if (block.getNumArguments() != nOperands + nLoops)
return op.emitOpError(		return op.emitOpError(
"expected number of block arguments to match number of views + "		"expected number of block arguments to match number of operands + "
"number of loops");		"number of loops");

for (unsigned i = 0; i < nLoops; ++i) {		// Note: the number and type of yield values are checked in the YieldOp.
		for (unsigned i = 0; i < nLoops; ++i)
if (!block.getArgument(i).getType().isIndex())		if (!block.getArgument(i).getType().isIndex())
return op.emitOpError("expected block argument ")		return op.emitOpError("expected block argument ")
<< i << " to be of IndexType";		<< (i + 1) << " to be an index";
}

for (unsigned i = 0; i < nViews; ++i) {		for (unsigned i = 0; i < nOperands; ++i) {
unsigned memrefArgIndex = i + nLoops;		unsigned memrefArgIndex = i + nLoops;
auto viewType = op.getShapedType(i);		auto viewType = op.getShapedType(i);
if (viewType.getElementType() !=		if (viewType.getElementType() !=
block.getArgument(memrefArgIndex).getType())		block.getArgument(memrefArgIndex).getType())
return op.emitOpError("expected block argument ")		return op.emitOpError("expected block argument ")
<< memrefArgIndex << " of the same type as elemental type of "		<< (memrefArgIndex + 1)
		<< " of the same type as elemental type of "
<< ((i < nInputViews) ? "input " : "output ")		<< ((i < nInputViews) ? "input " : "output ")
<< "view: " << viewType;		<< "operand: " << viewType;
}		}
return success();		return success();
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyFuncArgs(GenericOpType op, FunctionType funType);		static LogicalResult verifyFuncArgs(GenericOpType op, FunctionType funType);

template <> LogicalResult verifyFuncArgs(GenericOp op, FunctionType funType) {		template <typename GenericOpType>
auto nViews = op.getNumInputsAndOutputs();		LogicalResult verifyFuncArgsGeneric(GenericOpType op, FunctionType funType) {
auto nInputViews = op.getNumInputs();		auto res = verifyFuncArgs(op, funType);
if (funType.getNumInputs() != nViews)		if (failed(res))
return op.emitOpError("expected fun arguments to match number of views");		return res;
if (funType.getNumResults() != op.getNumOutputs())
return op.emitOpError(
"expected fun results to match number of output views");

for (auto en : llvm::enumerate(op.indexing_maps())) {		auto nInputs = op.getNumInputs();
auto idx = en.index();		auto nOutputs = op.getNumOutputs();
auto view = (idx < nInputViews) ? op.getInputShapedType(idx)		// linalg.generic output element types are exactly the function results.
: op.getOutputShapedType(idx - nInputViews);		for (unsigned idx = 0; idx < nOutputs; ++idx) {
if (funType.getInput(idx) != view.getElementType())		ShapedType shapedType = op.getShapedType(nInputs + idx);
return op.emitOpError("expected fun argument ")		if (funType.getResult(idx) != shapedType.getElementType())
<< idx << " of the same type as elemental type "		return op.emitOpError("expected function result ")
<< view.getElementType() << " of view " << idx;		<< (idx + 1) << " of the same type as elemental type "
		<< shapedType.getElementType() << " of output " << (idx + 1);
if (idx >= nInputViews) {
auto resultIdx = idx - nInputViews;
if (funType.getResult(resultIdx) != view.getElementType())
return op.emitOpError("expected fun result ")
<< resultIdx << " of the same type as elemental type "
<< view.getElementType() << " of view " << idx;
}		}
		return success();
		}

		template <> LogicalResult verifyFuncArgs(GenericOp op, FunctionType funType) {
		auto nOperands = op.getNumOperands();
		if (funType.getNumInputs() != nOperands)
		ftynse (Done): Ultra-nit: functions are not necessarily fun, please use "function" in user-visible messages
		return op.emitOpError(
		"expected function arguments to match number of operands");
		if (funType.getNumResults() != op.getNumOutputs())
		herhut (Done): This iterates over all operands, including output buffers.
		return op.emitOpError("expected function results(")
		<< funType.getNumResults() << ") to match number of outputs("
		<< op.getNumOutputs() << ")";

		// linalg.generic operands element types are exactly the first function
		// arguments.
		for (unsigned idx = 0; idx < nOperands; ++idx) {
		ShapedType shapedType = op.getShapedType(idx);
		if (funType.getInput(idx) != shapedType.getElementType())
		return op.emitOpError("expected function argument ")
		<< (idx + 1) << " of the same type as elemental type "
		<< shapedType.getElementType() << " of operand " << (idx + 1);
		ftynse (Done): Why +1 here? It's inconsistent with block arguments above.
		nicolasvasilache (Author, Done): Made it consistent: error messages for function arguments start at 1 (e.g. 1st argument).
}		}

return success();		return success();
}		}

template <>		template <>
LogicalResult verifyFuncArgs(IndexedGenericOp op, FunctionType funType) {		LogicalResult verifyFuncArgs(IndexedGenericOp op, FunctionType funType) {
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nInputViews = op.getNumInputs();
auto nOutputs = op.getNumOutputs();		auto nOutputs = op.getNumOutputs();
auto nViews = op.getNumInputsAndOutputs();		auto nOperands = op.getNumOperands();
if (funType.getNumInputs() != nViews + nLoops)		if (funType.getNumInputs() != nOperands + nLoops)
return op.emitOpError(		return op.emitOpError("expected function arguments to match number of "
"expected fun arguments to match number of views + number of loops");		"loops + number of operands");
if (funType.getNumResults() != nOutputs)		if (funType.getNumResults() != nOutputs)
return op.emitOpError(		return op.emitOpError(
"expected fun results to match number of output views");		"expected function results to match number of outputs");
for (unsigned i = 0; i < nLoops; ++i) {		for (unsigned i = 0; i < nLoops; ++i)
if (!funType.getInput(i).isIndex())		if (!funType.getInput(i).isIndex())
return op.emitOpError("expected fun argument ")		return op.emitOpError("expected function argument ")
		ftynse (Done): The use of a 0-based index here is inconsistent with the 1-based index below. Please adopt a convention, document it in the dialect documentation and use it everywhere.
		nicolasvasilache (Author, Done): Made it consistent, thanks for spotting!
<< i << " to be of IndexType";		<< (i + 1) << " to be an index";
}
for (auto en : llvm::enumerate(op.indexing_maps())) {		// linalg.generic operands element types are exactly the first function
auto idx = en.index();		// arguments.
auto funIdx = nLoops + idx;		for (unsigned idx = 0; idx < nOperands; ++idx) {
auto view = (idx < nInputViews) ? op.getInputShapedType(idx)		ShapedType shapedType = op.getShapedType(idx);
: op.getOutputShapedType(idx - nInputViews);		if (funType.getInput(idx + nLoops) != shapedType.getElementType())
if (funType.getInput(funIdx) != view.getElementType())		return op.emitOpError("expected function argument ")
return op.emitOpError("expected fun argument ")		<< (idx + nLoops + 1) << " of the same type as elemental type "
<< funIdx << " of the same type as elemental type "		<< shapedType.getElementType() << " of input " << (idx + 1);
<< view.getElementType() << " of view " << idx;

if (idx >= nInputViews) {
auto resultIdx = idx - nInputViews;
if (funType.getResult(resultIdx) != view.getElementType())
return op.emitOpError("expected fun result ")
<< resultIdx << " of the same type as elemental type "
<< view.getElementType() << " of view " << idx;
}
}		}

return success();		return success();
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyGenericOp(GenericOpType op) {		static LogicalResult verifyGenericOp(GenericOpType op) {
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
		ftynse (Done): This looks common with the non-indexed version. Can this be factored out into a helper function?
auto nViews = op.getNumInputsAndOutputs();		auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();
if (nViews != llvm::size(op.views()))		if (nInputsAndOutputBuffers != llvm::size(op.views()))
return op.emitOpError("expected exactly ") << nViews << " view operands";		return op.emitOpError("expected exactly ")
		<< nInputsAndOutputBuffers
		ftynse (Done): Nit: "inputs and buffer operands" sounds like a false dichotomy.
		<< " inputs (tensor or buffer) and output buffer operands";

auto &region = op.region();		auto &region = op.region();
auto funOp = op.getFunction();		auto funOp = op.getFunction();
auto funType = funOp ? funOp.getType() : FunctionType();		auto funType = funOp ? funOp.getType() : FunctionType();
if (!region.empty()) {		if (!region.empty()) {
if (region.getBlocks().size() != 1)		if (region.getBlocks().size() != 1)
return op.emitOpError("expected region with 1 block");		return op.emitOpError("expected region with 1 block");
if (failed(verifyBlockArgs(op, region.getBlocks().front())))		if (failed(verifyBlockArgs(op, region.getBlocks().front())))
return failure();		return failure();
} else {		} else {
if (!funOp || !funOp.getType())		if (!funOp || !funOp.getType())
return op.emitOpError(		return op.emitOpError(
"expected fun attribute to refer to a defined symbol");		"expected function attribute to refer to a defined symbol");
if (failed(verifyFuncArgs(op, funType)))		if (failed(verifyFuncArgsGeneric(op, funType)))
return failure();		return failure();
}		}

SmallVector<AffineMap, 4> indexingMaps;		SmallVector<AffineMap, 4> indexingMaps;
indexingMaps.reserve(op.indexing_maps().size());		indexingMaps.reserve(op.indexing_maps().size());
for (auto en : llvm::enumerate(op.indexing_maps())) {		for (auto en : llvm::enumerate(op.indexing_maps())) {
auto idx = en.index();		auto idx = en.index();
auto m = en.value().template cast<AffineMapAttr>().getValue();		auto m = en.value().template cast<AffineMapAttr>().getValue();
[... 23 lines hidden ...]	static LogicalResult verifyGenericOp(GenericOpType op) {
}		}

auto concatMap = concatAffineMaps(indexingMaps);		auto concatMap = concatAffineMaps(indexingMaps);
auto aggregateMap = inversePermutation(concatMap);		auto aggregateMap = inversePermutation(concatMap);
if (!aggregateMap)		if (!aggregateMap)
return op.emitOpError("expected the concatenation of maps in indexing_map "		return op.emitOpError("expected the concatenation of maps in indexing_map "
"to be invertible");		"to be invertible");

auto outputTensorTypes = op.getOutputTensorTypes();
if (outputTensorTypes.size() != op.getNumResults())
return op.emitOpError("expected #output tensor operands (")
<< outputTensorTypes.size() << ") to match #results ("
<< op.getNumResults() << ")";

unsigned index = 0;
for (auto it : llvm::zip(op.getResultTypes(), outputTensorTypes)) {
auto resTy = std::get<0>(it);
auto outOpTy = std::get<1>(it);
if (resTy != outOpTy)
return op.emitOpError("result #")
<< index << " must be " << outOpTy << ", but got " << resTy;
++index;
}

return success();		return success();
}		}

static LogicalResult verify(GenericOp op) { return verifyGenericOp(op); }		static LogicalResult verify(GenericOp op) { return verifyGenericOp(op); }
static LogicalResult verify(IndexedGenericOp op) { return verifyGenericOp(op); }		static LogicalResult verify(IndexedGenericOp op) { return verifyGenericOp(op); }

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// RangeOp		// RangeOp
[... 412 lines hidden ...]	return failure(parser.parseOperandList(opInfo) ||
parser.parseOptionalAttrDict(result.attributes) ||		parser.parseOptionalAttrDict(result.attributes) ||
(!opInfo.empty() && parser.parseColonTypeList(types)) ||		(!opInfo.empty() && parser.parseColonTypeList(types)) ||
parser.resolveOperands(opInfo, types, loc, result.operands));		parser.resolveOperands(opInfo, types, loc, result.operands));
}		}

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyYield(YieldOp op, GenericOpType genericOp) {		static LogicalResult verifyYield(YieldOp op, GenericOpType genericOp) {
// The operand number and types must match the view element types.		// The operand number and types must match the view element types.
auto nOutputViews = genericOp.getNumOutputs();		auto nOutputs = genericOp.getNumOutputs();
if (op.getNumOperands() != nOutputViews)		if (op.getNumOperands() != nOutputs)
return op.emitOpError("expected ")		return op.emitOpError("expected number of yield values (")
<< nOutputViews << " operand to match enclosing linalg.generic op";		<< nOutputs << ") to match the number of operands of the enclosing "
		<< "linalg.generic op (" << op.getNumOperands() << ")";
		ftynse: Ultra-nit: whitespace before `(`

for (unsigned i = 0; i != nOutputViews; ++i) {		for (unsigned i = 0; i != nOutputs; ++i) {
auto elementType = genericOp.getOutputShapedType(i).getElementType();		auto elementType = genericOp.getOutputShapedType(i).getElementType();
if (op.getOperand(i).getType() != elementType)		if (op.getOperand(i).getType() != elementType)
return op.emitOpError("type of return operand ")		return op.emitOpError("type of yield operand ")
<< i << " (" << op.getOperand(i).getType()		<< (i + 1) << " (" << op.getOperand(i).getType()
<< ") doesn't match view element type (" << elementType << ")";		<< ") doesn't match "
		<< "the element type of the enclosing linalg.generic op ("
		<< elementType << ")";
}		}
return success();		return success();
}		}

static LogicalResult verify(YieldOp op) {		static LogicalResult verify(YieldOp op) {
auto *parentOp = op.getParentOp();		auto *parentOp = op.getParentOp();
if (parentOp->getNumRegions() != 1 || parentOp->getRegion(0).empty())		if (parentOp->getNumRegions() != 1 || parentOp->getRegion(0).empty())
return op.emitOpError("expected single non-empty parent region");		return op.emitOpError("expected single non-empty parent region");
[338 lines not shown]
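
The result check added to `verifyGenericOp` above can be illustrated outside of MLIR. The following is a minimal standalone sketch, not the code in this diff: types are modeled as plain strings instead of `mlir::Type`, and `checkResultTypes` is a hypothetical name introduced only for illustration. The messages mirror the ones emitted by the verifier.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the result check in verifyGenericOp: types are
// modeled as plain strings rather than mlir::Type.
// Returns an error message on mismatch, or std::nullopt on success.
std::optional<std::string>
checkResultTypes(const std::vector<std::string> &resultTypes,
                 const std::vector<std::string> &outputTensorTypes) {
  // The op must return exactly one result per output tensor.
  if (outputTensorTypes.size() != resultTypes.size())
    return "expected #output tensor operands (" +
           std::to_string(outputTensorTypes.size()) + ") to match #results (" +
           std::to_string(resultTypes.size()) + ")";

  // From here on both lists have the same length.
  assert(resultTypes.size() == outputTensorTypes.size());

  // Each result type must equal the corresponding output tensor type.
  for (std::size_t i = 0; i < resultTypes.size(); ++i)
    if (resultTypes[i] != outputTensorTypes[i])
      return "result #" + std::to_string(i) + " must be " +
             outputTensorTypes[i] + ", but got " + resultTypes[i];

  return std::nullopt;
}
```

Returning an optional message keeps the sketch free of MLIR diagnostics; the real verifier reports through `op.emitOpError` instead.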

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp

[61 lines not shown]	static llvm::cl::list<unsigned> clTileSizes(
llvm::cl::cat(clOptionsCategory));		llvm::cl::cat(clOptionsCategory));

// Return a cloned version of `op` that operates on `loopRanges`, assumed to be		// Return a cloned version of `op` that operates on `loopRanges`, assumed to be
// a subset of the original loop ranges of `op`.		// a subset of the original loop ranges of `op`.
// This is achieved by applying the `loopToOperandRangesMaps` permutation maps		// This is achieved by applying the `loopToOperandRangesMaps` permutation maps
// to the `loopRanges` in order to obtain view ranges.		// to the `loopRanges` in order to obtain view ranges.
static LinalgOp cloneWithLoopRanges(OpBuilder &b, Location loc, LinalgOp op,		static LinalgOp cloneWithLoopRanges(OpBuilder &b, Location loc, LinalgOp op,
ArrayRef<SubViewOp::Range> loopRanges) {		ArrayRef<SubViewOp::Range> loopRanges) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");
auto maps = loopToOperandRangesMaps(op);		auto maps = loopToOperandRangesMaps(op);
SmallVector<Value, 8> clonedViews;		SmallVector<Value, 8> clonedViews;
clonedViews.reserve(op.getNumInputsAndOutputs());		clonedViews.reserve(op.getNumInputsAndOutputs());
// Iterate over the inputs and outputs in order.		// Iterate over the inputs and outputs in order.
// Extract the subranges from the linearized ranges.		// Extract the subranges from the linearized ranges.
SmallVector<Value, 8> ios(op.getInputsAndOutputs());		SmallVector<Value, 8> ios(op.getInputsAndOutputBuffers());
for (auto en : llvm::enumerate(ios)) {		for (auto en : llvm::enumerate(ios)) {
unsigned idx = en.index();		unsigned idx = en.index();
auto map = maps[idx];		auto map = maps[idx];
LLVM_DEBUG(dbgs() << "map: " << map << "\n");		LLVM_DEBUG(dbgs() << "map: " << map << "\n");
Value view = en.value();		Value view = en.value();
SmallVector<SubViewOp::Range, 4> viewRanges(map.getNumResults());		SmallVector<SubViewOp::Range, 4> viewRanges(map.getNumResults());
for (auto en2 : llvm::enumerate(map.getResults())) {		for (auto en2 : llvm::enumerate(map.getResults())) {
unsigned d = en2.index();		unsigned d = en2.index();
[29 lines not shown]
};		};

// Given an `op`, returns the first (`view`, `dimension`) pair that identifies		// Given an `op`, returns the first (`view`, `dimension`) pair that identifies
// the loop range at `loopDepth`. The semantics of the loopToOperandRangesMaps		// the loop range at `loopDepth`. The semantics of the loopToOperandRangesMaps
// guarantees at least one such dimension is found. If multiple candidates exist		// guarantees at least one such dimension is found. If multiple candidates exist
// they must agree by construction (i.e. have the same size) and we just return		// they must agree by construction (i.e. have the same size) and we just return
// the first one.		// the first one.
static ViewDimension getViewDefiningLoopRange(LinalgOp op, unsigned loopDepth) {		static ViewDimension getViewDefiningLoopRange(LinalgOp op, unsigned loopDepth) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");
auto maps = loopToOperandRangesMaps(op);		auto maps = loopToOperandRangesMaps(op);
// Iterate over the inputs and outputs in order.		// Iterate over the inputs and outputs in order.
// Extract the subranges from the linearized ranges.		// Extract the subranges from the linearized ranges.
SmallVector<Value, 8> ios(op.getInputsAndOutputs());		SmallVector<Value, 8> ios(op.getInputsAndOutputBuffers());
for (auto en : llvm::enumerate(ios)) {		for (auto en : llvm::enumerate(ios)) {
unsigned idx = en.index();		unsigned idx = en.index();
auto map = maps[idx];		auto map = maps[idx];
LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange I/O idx: " << idx << "\n");		LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange I/O idx: " << idx << "\n");
LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange map: " << map << "\n");		LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange map: " << map << "\n");
Value view = en.value();		Value view = en.value();
SmallVector<Value, 8> viewRanges(map.getNumResults(), nullptr);		SmallVector<Value, 8> viewRanges(map.getNumResults(), nullptr);
for (auto en2 : llvm::enumerate(map.getResults())) {		for (auto en2 : llvm::enumerate(map.getResults())) {
if (loopDepth == en2.value().cast<AffineDimExpr>().getPosition()) {		if (loopDepth == en2.value().cast<AffineDimExpr>().getPosition()) {
LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange loopDepth: " << loopDepth		LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange loopDepth: " << loopDepth
<< "\n");		<< "\n");
LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange view: " << view << "\n");		LLVM_DEBUG(dbgs() << "getViewDefiningLoopRange view: " << view << "\n");
return ViewDimension{view, static_cast<unsigned>(en2.index())};		return ViewDimension{view, static_cast<unsigned>(en2.index())};
}		}
}		}
}		}
llvm_unreachable("Expect to be able to extract a view defining loop range");		llvm_unreachable("Expect to be able to extract a view defining loop range");
}		}

static LinalgOp fuse(Value producedView, LinalgOp producer, LinalgOp consumer,		static LinalgOp fuse(Value producedView, LinalgOp producer, LinalgOp consumer,
unsigned consumerIdx, unsigned producerIdx,		unsigned consumerIdx, unsigned producerIdx,
OperationFolder *folder) {		OperationFolder *folder) {
		assert(producer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto subView = dyn_cast_or_null<SubViewOp>(		auto subView = dyn_cast_or_null<SubViewOp>(
consumer.getInput(consumerIdx).getDefiningOp());		consumer.getInput(consumerIdx).getDefiningOp());
auto slice =		auto slice =
dyn_cast_or_null<SliceOp>(consumer.getInput(consumerIdx).getDefiningOp());		dyn_cast_or_null<SliceOp>(consumer.getInput(consumerIdx).getDefiningOp());
assert(subView || slice);		assert(subView || slice);
(void)subView;		(void)subView;
(void)slice;		(void)slice;

[37 lines not shown]	static LinalgOp fuse(Value producedView, LinalgOp producer, LinalgOp consumer,

return cloneWithLoopRanges(b, loc, producer, loopRanges);		return cloneWithLoopRanges(b, loc, producer, loopRanges);
}		}

// Encode structural fusion safety preconditions.		// Encode structural fusion safety preconditions.
// Some of these will be lifted in the future with better analysis.		// Some of these will be lifted in the future with better analysis.
static bool isStructurallyFusableProducer(LinalgOp producer, Value consumedView,		static bool isStructurallyFusableProducer(LinalgOp producer, Value consumedView,
LinalgOp consumer) {		LinalgOp consumer) {
		assert(producer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
if (producer.getNumOutputs() != 1) {		if (producer.getNumOutputs() != 1) {
LLVM_DEBUG(dbgs() << "\nNot structurally fusable (multi-output)");		LLVM_DEBUG(dbgs() << "\nNot structurally fusable (multi-output)");
return false;		return false;
}		}
// Only fuse when the producer block dominates.		// Only fuse when the producer block dominates.
DominanceInfo dom(producer.getOperation());		DominanceInfo dom(producer.getOperation());
if (!dom.dominates(producer.getOperation()->getBlock(),		if (!dom.dominates(producer.getOperation()->getBlock(),
consumer.getOperation()->getBlock())) {		consumer.getOperation()->getBlock())) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs()		dbgs()
<< "\nNot structurally fusable (producer block does not dominate)");		<< "\nNot structurally fusable (producer block does not dominate)");
return false;		return false;
}		}
return true;		return true;
}		}

bool mlir::linalg::isProducerLastWriteOfView(const LinalgDependenceGraph &graph,		bool mlir::linalg::isProducerLastWriteOfView(const LinalgDependenceGraph &graph,
LinalgOp consumer,		LinalgOp consumer,
Value consumedView,		Value consumedView,
LinalgOp producer) {		LinalgOp producer) {
		assert(producer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
// Make some simple structural checks that alleviate the need for more		// Make some simple structural checks that alleviate the need for more
// complex analyses.		// complex analyses.
if (!isStructurallyFusableProducer(producer, consumedView, consumer)) {		if (!isStructurallyFusableProducer(producer, consumedView, consumer)) {
LLVM_DEBUG(dbgs() << "\n***Not static last write due to structure:\t"		LLVM_DEBUG(dbgs() << "\n***Not static last write due to structure:\t"
<< *producer.getOperation());		<< *producer.getOperation());
return false;		return false;
}		}
// Check for any interleaved write to consumedView.		// Check for any interleaved write to consumedView.
if (!graph.findCoveringWrites(producer, consumer, consumedView).empty()) {		if (!graph.findCoveringWrites(producer, consumer, consumedView).empty()) {
LLVM_DEBUG(dbgs() << "\n***Not fusable due to interleaved write:\t"		LLVM_DEBUG(dbgs() << "\n***Not fusable due to interleaved write:\t"
<< *producer.getOperation());		<< *producer.getOperation());
return false;		return false;
}		}
return true;		return true;
}		}

bool mlir::linalg::isFusableInto(const LinalgDependenceGraph &graph,		bool mlir::linalg::isFusableInto(const LinalgDependenceGraph &graph,
LinalgOp consumer, Value consumedView,		LinalgOp consumer, Value consumedView,
LinalgOp producer) {		LinalgOp producer) {
		assert(producer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
if (!isProducerLastWriteOfView(graph, consumer, consumedView, producer))		if (!isProducerLastWriteOfView(graph, consumer, consumedView, producer))
return false;		return false;
// Check for any fusion-preventing dependence to any view read/written that		// Check for any fusion-preventing dependence to any view read/written that
// would violate dependences.		// would violate dependences.
if (!graph.findCoveringDependences(producer, consumer).empty()) {		if (!graph.findCoveringDependences(producer, consumer).empty()) {
LLVM_DEBUG(dbgs() << "\n***Not fusable due to an interleaved dependence:\t"		LLVM_DEBUG(dbgs() << "\n***Not fusable due to an interleaved dependence:\t"
<< *producer.getOperation());		<< *producer.getOperation());
return false;		return false;
}		}
return true;		return true;
}		}

// Only consider RAW atm.		// Only consider RAW atm.
Optional<FusionInfo> mlir::linalg::fuseProducerOf(		Optional<FusionInfo> mlir::linalg::fuseProducerOf(
OpBuilder &b, LinalgOp consumer, unsigned consumerIdx,		OpBuilder &b, LinalgOp consumer, unsigned consumerIdx,
const LinalgDependenceGraph &graph, OperationFolder *folder) {		const LinalgDependenceGraph &graph, OperationFolder *folder) {
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
LLVM_DEBUG(dbgs() << "\nStart examining consumer: "		LLVM_DEBUG(dbgs() << "\nStart examining consumer: "
<< *consumer.getOperation());		<< *consumer.getOperation());
for (auto dependence : graph.getDependencesInto(		for (auto dependence : graph.getDependencesInto(
consumer, LinalgDependenceGraph::DependenceType::RAW)) {		consumer, LinalgDependenceGraph::DependenceType::RAW)) {
LLVM_DEBUG(dbgs() << "\n***Consider producer:\t"		LLVM_DEBUG(dbgs() << "\n***Consider producer:\t"
<< *dependence.dependentOpView.op << "\n");		<< *dependence.dependentOpView.op << "\n");
auto producer = cast<LinalgOp>(dependence.dependentOpView.op);		auto producer = cast<LinalgOp>(dependence.dependentOpView.op);

// Check that the dependence is indeed on the input `consumerIdx` view.		// Check that the dependence is indeed on the input `consumerIdx` view.
auto consumedView = dependence.indexingView;		auto consumedView = dependence.indexingView;
if (consumer.getInput(consumerIdx) != consumedView)		if (consumer.getInput(consumerIdx) != consumedView)
continue;		continue;

// Consumer consumes this view, `isStructurallyFusableProducer` also checks		// Consumer consumes this view, `isStructurallyFusableProducer` also checks
// whether it is a strict subview of the producer view.		// whether it is a strict subview of the producer view.
auto producedView = dependence.dependentOpView.view;		auto producedView = dependence.dependentOpView.view;
auto producerIdx = producer.getIndexOfOutput(producedView).getValue();		auto producerIdx = producer.getIndexOfOutputBuffer(producedView).getValue();
// `consumerIdx` and `producerIdx` exist by construction.		// `consumerIdx` and `producerIdx` exist by construction.
LLVM_DEBUG(dbgs() << "\nRAW producer: " << *producer.getOperation()		LLVM_DEBUG(dbgs() << "\nRAW producer: " << *producer.getOperation()
<< " view: " << producedView		<< " view: " << producedView
<< " output index: " << producerIdx);		<< " output index: " << producerIdx);

// Must be a subview or a slice to guarantee there are loops we can fuse		// Must be a subview or a slice to guarantee there are loops we can fuse
// into.		// into.
auto subView = dyn_cast_or_null<SubViewOp>(consumedView.getDefiningOp());		auto subView = dyn_cast_or_null<SubViewOp>(consumedView.getDefiningOp());
[24 lines not shown]	static void fuseLinalgOpsGreedily(FuncOp f) {
LLVM_DEBUG(f.print(dbgs() << "\nBefore linalg-fusion: \n"));		LLVM_DEBUG(f.print(dbgs() << "\nBefore linalg-fusion: \n"));

OpBuilder b(f);		OpBuilder b(f);
OperationFolder folder(f.getContext());		OperationFolder folder(f.getContext());
DenseSet<Operation *> eraseSet;		DenseSet<Operation *> eraseSet;

// Save original Linalg ops, we only want to make a pass over those.		// Save original Linalg ops, we only want to make a pass over those.
SmallVector<Operation *, 8> linalgOps;		SmallVector<Operation *, 8> linalgOps;
f.walk([&](LinalgOp op) { linalgOps.push_back(op); });		f.walk([&](LinalgOp op) {
		if (op.hasBufferSemantics())
		linalgOps.push_back(op);
		});

Aliases aliases;		Aliases aliases;
LinalgDependenceGraph G(aliases, linalgOps);		LinalgDependenceGraph G(aliases, linalgOps);
for (auto *op : llvm::reverse(linalgOps)) {		for (auto *op : llvm::reverse(linalgOps)) {
for (unsigned consumerIdx = 0, e = LinalgOp(op).getNumInputs();		for (unsigned consumerIdx = 0, e = LinalgOp(op).getNumInputs();
consumerIdx < e; ++consumerIdx) {		consumerIdx < e; ++consumerIdx) {
if (auto fusionInfo = fuseProducerOf(b, op, consumerIdx, G, &folder))		if (auto fusionInfo = fuseProducerOf(b, op, consumerIdx, G, &folder))
eraseSet.insert(fusionInfo->originalProducer.getOperation());		eraseSet.insert(fusionInfo->originalProducer.getOperation());
[25 lines not shown]
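
The lookup described in the comment on `getViewDefiningLoopRange` above can be sketched without MLIR types. This is a simplified illustration under stated assumptions, not the code in this diff: each permutation map is modeled as a vector of loop positions, the view is identified by its operand index rather than an `mlir::Value`, and `ViewDim` and `findViewDefiningLoopRange` are hypothetical names. Where the original ends in `llvm_unreachable`, the sketch returns `std::nullopt`.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for ViewDimension: the operand index replaces the
// mlir::Value of the view.
struct ViewDim {
  std::size_t operand;
  unsigned dimension;
};

// Each map lists, per operand dimension, the position of the loop that
// indexes it; a permutation map such as (i, j, k) -> (i, k) becomes {0, 2}.
// Returns the first (operand, dimension) pair indexed by `loopDepth`.
std::optional<ViewDim>
findViewDefiningLoopRange(const std::vector<std::vector<unsigned>> &maps,
                          unsigned loopDepth) {
  // Iterate over the inputs and outputs in order.
  for (std::size_t operand = 0; operand < maps.size(); ++operand)
    for (unsigned dim = 0; dim < maps[operand].size(); ++dim)
      if (maps[operand][dim] == loopDepth)
        return ViewDim{operand, dim};
  // The original asserts this is unreachable; the sketch reports "not found".
  return std::nullopt;
}
```

For a matmul-like op with maps `{{0, 2}, {2, 1}, {0, 1}}` for A(i, k), B(k, j) and C(i, j), the reduction loop `k` (depth 2) is first found in dimension 1 of operand A.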

mlir/lib/Dialect/Linalg/Transforms/LinalgToLoops.cpp

[83 lines not shown]
}		}

template <typename IndexedValueType, typename LinalgOpType>		template <typename IndexedValueType, typename LinalgOpType>
class LinalgScopedEmitter {};		class LinalgScopedEmitter {};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, CopyOp> {		class LinalgScopedEmitter<IndexedValueType, CopyOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs, CopyOp copyOp) {		static void emitScalarImplementation(ArrayRef<Value> allIvs, CopyOp copyOp) {
		assert(copyOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		herhut: This code assumes an all buffer representation? Maybe add an assert?
		nicolasvasilache (author): Sprinkled a bunch in the relevant places, thanks!
auto nPar = copyOp.getNumParallelLoops();		auto nPar = copyOp.getNumParallelLoops();
assert(nPar == allIvs.size());		assert(nPar == allIvs.size());
auto inputIvs =		auto inputIvs =
permuteIvs(allIvs.take_front(nPar), copyOp.inputPermutation());		permuteIvs(allIvs.take_front(nPar), copyOp.inputPermutation());
auto outputIvs =		auto outputIvs =
permuteIvs(allIvs.take_front(nPar), copyOp.outputPermutation());		permuteIvs(allIvs.take_front(nPar), copyOp.outputPermutation());
SmallVector<IndexHandle, 8> iivs(inputIvs.begin(), inputIvs.end());		SmallVector<IndexHandle, 8> iivs(inputIvs.begin(), inputIvs.end());
SmallVector<IndexHandle, 8> oivs(outputIvs.begin(), outputIvs.end());		SmallVector<IndexHandle, 8> oivs(outputIvs.begin(), outputIvs.end());
IndexedValueType O(copyOp.getOutput(0)), I(copyOp.getInput(0));		IndexedValueType O(copyOp.getOutputBuffer(0)), I(copyOp.getInput(0));
// Emit the proper scalar assignment, whether we are dealing with a 0-D or		// Emit the proper scalar assignment, whether we are dealing with a 0-D or
// an n-D loop nest; with or without permutations.		// an n-D loop nest; with or without permutations.
// clang-format off		// clang-format off
nPar > 0 ? O(oivs) = I(iivs) :		nPar > 0 ? O(oivs) = I(iivs) :
O() = I();		O() = I();
// clang-format on		// clang-format on
}		}
};		};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, FillOp> {		class LinalgScopedEmitter<IndexedValueType, FillOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs, FillOp fillOp) {		static void emitScalarImplementation(ArrayRef<Value> allIvs, FillOp fillOp) {
		assert(fillOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto nPar = fillOp.getNumParallelLoops();		auto nPar = fillOp.getNumParallelLoops();
assert(nPar == allIvs.size());		assert(nPar == allIvs.size());
auto ivs =		auto ivs =
SmallVector<IndexHandle, 4>(allIvs.begin(), allIvs.begin() + nPar);		SmallVector<IndexHandle, 4>(allIvs.begin(), allIvs.begin() + nPar);
IndexedValueType O(fillOp.getOutput(0));		IndexedValueType O(fillOp.getOutputBuffer(0));
// Emit the proper scalar assignment, whether we are dealing with a 0-D or		// Emit the proper scalar assignment, whether we are dealing with a 0-D or
// an n-D loop nest; with or without permutations.		// an n-D loop nest; with or without permutations.
nPar > 0 ? O(ivs) = ValueHandle(fillOp.value())		nPar > 0 ? O(ivs) = ValueHandle(fillOp.value())
: O() = ValueHandle(fillOp.value());		: O() = ValueHandle(fillOp.value());
}		}
};		};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, DotOp> {		class LinalgScopedEmitter<IndexedValueType, DotOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs, DotOp dotOp) {		static void emitScalarImplementation(ArrayRef<Value> allIvs, DotOp dotOp) {
		assert(dotOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
assert(allIvs.size() == 1);		assert(allIvs.size() == 1);
IndexHandle r_i(allIvs[0]);		IndexHandle r_i(allIvs[0]);
IndexedValueType A(dotOp.getInput(0)), B(dotOp.getInput(1)),		IndexedValueType A(dotOp.getInput(0)), B(dotOp.getInput(1)),
C(dotOp.getOutput(0));		C(dotOp.getOutputBuffer(0));
// Emit scalar form.		// Emit scalar form.
C() = C() + A(r_i) * B(r_i);		C() = C() + A(r_i) * B(r_i);
}		}
};		};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, MatvecOp> {		class LinalgScopedEmitter<IndexedValueType, MatvecOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs,		static void emitScalarImplementation(ArrayRef<Value> allIvs,
MatvecOp matvecOp) {		MatvecOp matvecOp) {
		assert(matvecOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
assert(allIvs.size() == 2);		assert(allIvs.size() == 2);
IndexHandle i(allIvs[0]), r_j(allIvs[1]);		IndexHandle i(allIvs[0]), r_j(allIvs[1]);
IndexedValueType A(matvecOp.getInput(0)), B(matvecOp.getInput(1)),		IndexedValueType A(matvecOp.getInput(0)), B(matvecOp.getInput(1)),
C(matvecOp.getOutput(0));		C(matvecOp.getOutputBuffer(0));
// Emit scalar form.		// Emit scalar form.
C(i) = C(i) + A(i, r_j) * B(r_j);		C(i) = C(i) + A(i, r_j) * B(r_j);
}		}
};		};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, MatmulOp> {		class LinalgScopedEmitter<IndexedValueType, MatmulOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs,		static void emitScalarImplementation(ArrayRef<Value> allIvs,
MatmulOp matmulOp) {		MatmulOp matmulOp) {
		assert(matmulOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
assert(allIvs.size() == 3);		assert(allIvs.size() == 3);
IndexHandle i(allIvs[0]), j(allIvs[1]), r_k(allIvs[2]);		IndexHandle i(allIvs[0]), j(allIvs[1]), r_k(allIvs[2]);
IndexedValueType A(matmulOp.getInput(0)), B(matmulOp.getInput(1)),		IndexedValueType A(matmulOp.getInput(0)), B(matmulOp.getInput(1)),
C(matmulOp.getOutput(0));		C(matmulOp.getOutputBuffer(0));
// Emit scalar form.		// Emit scalar form.
C(i, j) = C(i, j) + A(i, r_k) * B(r_k, j);		C(i, j) = C(i, j) + A(i, r_k) * B(r_k, j);
}		}
};		};

template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, ConvOp> {		class LinalgScopedEmitter<IndexedValueType, ConvOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs, ConvOp convOp) {		static void emitScalarImplementation(ArrayRef<Value> allIvs, ConvOp convOp) {
		assert(convOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto b = ScopedContext::getBuilder();		auto b = ScopedContext::getBuilder();
auto loc = ScopedContext::getLocation();		auto loc = ScopedContext::getLocation();
auto maps = loopToOperandRangesMaps(convOp);		auto maps = loopToOperandRangesMaps(convOp);
SmallVector<ValueHandle, 8> fIdx(		SmallVector<ValueHandle, 8> fIdx(
makeCanonicalAffineApplies(b, loc, maps[0], allIvs));		makeCanonicalAffineApplies(b, loc, maps[0], allIvs));
SmallVector<ValueHandle, 8> imIdx(		SmallVector<ValueHandle, 8> imIdx(
makeCanonicalAffineApplies(b, loc, maps[1], allIvs));		makeCanonicalAffineApplies(b, loc, maps[1], allIvs));
SmallVector<ValueHandle, 8> oIdx(		SmallVector<ValueHandle, 8> oIdx(
[30 lines not shown]
// memref<?x?x?Xf32, stride_specification>		// memref<?x?x?Xf32, stride_specification>
// store %14#1, %arg2[%i, %k, %j] :		// store %14#1, %arg2[%i, %k, %j] :
// memref<?x?x?Xf32, stride_specification>		// memref<?x?x?Xf32, stride_specification>
// }		// }
// }		// }
// }		// }
// ```		// ```
template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, GenericOp> {		class LinalgScopedEmitter<IndexedValueType, GenericOp> {
		herhut: This code assumes an all buffers representation?
		nicolasvasilache (author): Sprinkled a bunch of asserts in the relevant places, thanks!
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs,		static void emitScalarImplementation(ArrayRef<Value> allIvs,
GenericOp genericOp) {		GenericOp genericOp) {
		assert(genericOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto b = ScopedContext::getBuilder();		auto b = ScopedContext::getBuilder();
auto loc = ScopedContext::getLocation();		auto loc = ScopedContext::getLocation();
using edsc::intrinsics::detail::ValueHandleArray;		using edsc::intrinsics::detail::ValueHandleArray;
unsigned nInputs = genericOp.getNumInputs();		unsigned nInputs = genericOp.getNumInputs();
unsigned nOutputs = genericOp.getNumOutputs();		unsigned nOutputs = genericOp.getNumOutputs();
SmallVector<Value, 4> indexedValues(nInputs + nOutputs);		SmallVector<Value, 4> indexedValues(nInputs + nOutputs);

// 1.a. Emit std_load from input views.		// 1.a. Emit std_load from input views.
for (unsigned i = 0; i < nInputs; ++i) {		for (unsigned i = 0; i < nInputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, genericOp.getInputIndexingMap(i), allIvs));		b, loc, genericOp.getInputIndexingMap(i), allIvs));
indexedValues[i] = std_load(genericOp.getInput(i), indexing);		indexedValues[i] = std_load(genericOp.getInput(i), indexing);
}		}

// 1.b. Emit std_load from output views.		// 1.b. Emit std_load from output views.
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, genericOp.getOutputIndexingMap(i), allIvs));		b, loc, genericOp.getOutputIndexingMap(i), allIvs));
indexedValues[nInputs + i] = std_load(genericOp.getOutput(i), indexing);		indexedValues[nInputs + i] =
		std_load(genericOp.getOutputBuffer(i), indexing);
}		}

auto funcOp = genericOp.getFunction();		auto funcOp = genericOp.getFunction();
if (funcOp) {		if (funcOp) {
// 2. Emit call.		// 2. Emit call.
Operation *callOp = call(funcOp, indexedValues);		Operation *callOp = call(funcOp, indexedValues);
assert(callOp->getNumResults() == genericOp.getNumOutputs());		assert(callOp->getNumResults() == genericOp.getNumOutputs());

// 3. Emit std_store.		// 3. Emit std_store.
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, genericOp.getOutputIndexingMap(i), allIvs));		b, loc, genericOp.getOutputIndexingMap(i), allIvs));
std_store(callOp->getResult(i), genericOp.getOutput(i), indexing);		std_store(callOp->getResult(i), genericOp.getOutputBuffer(i), indexing);
}		}
return;		return;
}		}
// TODO(ntv): When a region inliner exists, use it.		// TODO(ntv): When a region inliner exists, use it.
// 2. Inline region, currently only works for a single basic block.		// 2. Inline region, currently only works for a single basic block.
BlockAndValueMapping map;		BlockAndValueMapping map;
auto &block = genericOp.region().front();		auto &block = genericOp.region().front();
for (auto it : llvm::zip(block.getArguments(), indexedValues))		for (auto it : llvm::zip(block.getArguments(), indexedValues))
map.map(std::get<0>(it), std::get<1>(it));		map.map(std::get<0>(it), std::get<1>(it));
for (auto &op : block.without_terminator()) {		for (auto &op : block.without_terminator()) {
assert(op.getNumRegions() == 0);		assert(op.getNumRegions() == 0);
auto *newOp = b.clone(op, map);		auto *newOp = b.clone(op, map);
for (auto it : llvm::zip(op.getResults(), newOp->getResults()))		for (auto it : llvm::zip(op.getResults(), newOp->getResults()))
map.map(std::get<0>(it), std::get<1>(it));		map.map(std::get<0>(it), std::get<1>(it));
}		}

// 3. Emit std_store.		// 3. Emit std_store.
auto *yieldOp = cast<YieldOp>(block.back()).getOperation();		auto *yieldOp = cast<YieldOp>(block.back()).getOperation();
assert(yieldOp->getNumOperands() == nOutputs);		assert(yieldOp->getNumOperands() == nOutputs);
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, genericOp.getOutputIndexingMap(i), allIvs));		b, loc, genericOp.getOutputIndexingMap(i), allIvs));
std_store(map.lookup(yieldOp->getOperand(i)), genericOp.getOutput(i),		std_store(map.lookup(yieldOp->getOperand(i)),
indexing);		genericOp.getOutputBuffer(i), indexing);
}		}
}		}
};		};

// Emits the MLIR for the scalar part of the indexed generic op by:		// Emits the MLIR for the scalar part of the indexed generic op by:
// 1. Emitting std_load and std_store ops for each input and output view in		// 1. Emitting std_load and std_store ops for each input and output view in
// order. This is achieved by applying the appropriate input or output map		// order. This is achieved by applying the appropriate input or output map
// to the enclosing induction variables.		// to the enclosing induction variables.
[23 lines not shown]
// }		// }
// }		// }
// ```		// ```
template <typename IndexedValueType>		template <typename IndexedValueType>
class LinalgScopedEmitter<IndexedValueType, IndexedGenericOp> {		class LinalgScopedEmitter<IndexedValueType, IndexedGenericOp> {
public:		public:
static void emitScalarImplementation(ArrayRef<Value> allIvs,		static void emitScalarImplementation(ArrayRef<Value> allIvs,
IndexedGenericOp indexedGenericOp) {		IndexedGenericOp indexedGenericOp) {
		assert(indexedGenericOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto b = ScopedContext::getBuilder();		auto b = ScopedContext::getBuilder();
auto loc = ScopedContext::getLocation();		auto loc = ScopedContext::getLocation();
using edsc::intrinsics::detail::ValueHandleArray;		using edsc::intrinsics::detail::ValueHandleArray;
unsigned nInputs = indexedGenericOp.getNumInputs();		unsigned nInputs = indexedGenericOp.getNumInputs();
unsigned nOutputs = indexedGenericOp.getNumOutputs();		unsigned nOutputs = indexedGenericOp.getNumOutputs();
unsigned nLoops = allIvs.size();		unsigned nLoops = allIvs.size();
SmallVector<Value, 4> indexedValues(nLoops + nInputs + nOutputs);		SmallVector<Value, 4> indexedValues(nLoops + nInputs + nOutputs);

[9 lines not shown]	for (unsigned i = 0; i < nInputs; ++i) {
std_load(indexedGenericOp.getInput(i), indexing);		std_load(indexedGenericOp.getInput(i), indexing);
}		}

// 1.b. Emit std_load from output views.		// 1.b. Emit std_load from output views.
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));		b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));
indexedValues[nLoops + nInputs + i] =		indexedValues[nLoops + nInputs + i] =
std_load(indexedGenericOp.getOutput(i), indexing);		std_load(indexedGenericOp.getOutputBuffer(i), indexing);
}		}

if (auto funcOp = indexedGenericOp.getFunction()) {		if (auto funcOp = indexedGenericOp.getFunction()) {
// 2. Emit call.		// 2. Emit call.
Operation *callOp = call(funcOp, indexedValues);		Operation *callOp = call(funcOp, indexedValues);
assert(callOp->getNumResults() == indexedGenericOp.getNumOutputs());		assert(callOp->getNumResults() == indexedGenericOp.getNumOutputs());

// 3. Emit std_store.		// 3. Emit std_store.
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));		b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));
std_store(callOp->getResult(i), indexedGenericOp.getOutput(i),		std_store(callOp->getResult(i), indexedGenericOp.getOutputBuffer(i),
indexing);		indexing);
}		}
return;		return;
}		}
// TODO(ntv): When a region inliner exists, use it.		// TODO(ntv): When a region inliner exists, use it.
// 2. Inline region, currently only works for a single basic block.		// 2. Inline region, currently only works for a single basic block.
BlockAndValueMapping map;		BlockAndValueMapping map;
auto &block = indexedGenericOp.region().front();		auto &block = indexedGenericOp.region().front();
for (auto it : llvm::zip(block.getArguments(), indexedValues))		for (auto it : llvm::zip(block.getArguments(), indexedValues))
map.map(std::get<0>(it), std::get<1>(it));		map.map(std::get<0>(it), std::get<1>(it));
for (auto &op : block.without_terminator()) {		for (auto &op : block.without_terminator()) {
assert(op.getNumRegions() == 0);		assert(op.getNumRegions() == 0);
auto *newOp = b.clone(op, map);		auto *newOp = b.clone(op, map);
for (auto it : llvm::zip(op.getResults(), newOp->getResults()))		for (auto it : llvm::zip(op.getResults(), newOp->getResults()))
map.map(std::get<0>(it), std::get<1>(it));		map.map(std::get<0>(it), std::get<1>(it));
}		}

// 3. Emit std_store.		// 3. Emit std_store.
auto *yieldOp = cast<YieldOp>(block.back()).getOperation();		auto *yieldOp = cast<YieldOp>(block.back()).getOperation();
assert(yieldOp->getNumOperands() == nOutputs);		assert(yieldOp->getNumOperands() == nOutputs);
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
ValueHandleArray indexing(makeCanonicalAffineApplies(		ValueHandleArray indexing(makeCanonicalAffineApplies(
b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));		b, loc, indexedGenericOp.getOutputIndexingMap(i), allIvs));
std_store(map.lookup(yieldOp->getOperand(i)),		std_store(map.lookup(yieldOp->getOperand(i)),
indexedGenericOp.getOutput(i), indexing);		indexedGenericOp.getOutputBuffer(i), indexing);
}		}
}		}
};		};
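
The emitter above follows a load / compute / store pattern for each loop iteration: it loads one scalar from every input and every output buffer, hands them to the scalar function (or inlined region) after the induction variables, and stores each result into the matching output buffer. A minimal stand-alone sketch of that pattern on plain 1-D `std::vector` buffers is given below. It assumes identity indexing maps and a single loop; `ScalarFn`, `emitScalarIteration` and `runLoop` are hypothetical names used for illustration only and are not part of MLIR.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the scalar function or region body of an
// indexed_generic op: it receives the loop index, then one scalar per input
// buffer followed by one scalar per output buffer, and returns one scalar per
// output buffer.
using ScalarFn =
    std::function<std::vector<float>(std::size_t, const std::vector<float> &)>;

// One loop iteration, assuming identity indexing maps on 1-D buffers.
inline void emitScalarIteration(std::size_t iv,
                                const std::vector<std::vector<float>> &inputs,
                                std::vector<std::vector<float>> &outputs,
                                const ScalarFn &body) {
  std::vector<float> indexedValues;
  indexedValues.reserve(inputs.size() + outputs.size());
  // 1.a. Load from input buffers.
  for (const auto &in : inputs)
    indexedValues.push_back(in[iv]);
  // 1.b. Load from output buffers.
  for (const auto &out : outputs)
    indexedValues.push_back(out[iv]);
  // 2. Call the scalar body.
  std::vector<float> results = body(iv, indexedValues);
  assert(results.size() == outputs.size() &&
         "expected one result per output buffer");
  // 3. Store each result into the matching output buffer.
  for (std::size_t i = 0; i < outputs.size(); ++i)
    outputs[i][iv] = results[i];
}

// Runs the iteration above for every index in [0, tripCount).
inline void runLoop(std::size_t tripCount,
                    const std::vector<std::vector<float>> &inputs,
                    std::vector<std::vector<float>> &outputs,
                    const ScalarFn &body) {
  for (std::size_t iv = 0; iv < tripCount; ++iv)
    emitScalarIteration(iv, inputs, outputs, body);
}
```

Only buffers appear in this sketch, which mirrors the `hasBufferSemantics()` assertion in the diff: the store in step 3 needs an output buffer to write into.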

// This struct is for factoring out the implementation and support template		// This struct is for factoring out the implementation and support template
// instantiations in the following 2 cases:		// instantiations in the following 2 cases:
// 1. Appending to a list of patterns via RewritePatternList.		// 1. Appending to a list of patterns via RewritePatternList.
// 2. Direct invocation via `linalgOpToLoops` and `linalgOpToAffineLoops`.		// 2. Direct invocation via `linalgOpToLoops` and `linalgOpToAffineLoops`.
[... 11 lines not shown ...]
LogicalResult LinalgOpToLoopsImpl<LoopTy, IndexedValueTy, ConcreteOpTy>::doit(		LogicalResult LinalgOpToLoopsImpl<LoopTy, IndexedValueTy, ConcreteOpTy>::doit(
Operation *op, PatternRewriter &rewriter) {		Operation *op, PatternRewriter &rewriter) {
OpBuilder b(op);		OpBuilder b(op);
ScopedContext scope(b, op->getLoc());		ScopedContext scope(b, op->getLoc());

// The flattened loopToOperandRangesMaps is expected to be an invertible		// The flattened loopToOperandRangesMaps is expected to be an invertible
// permutation map (which is asserted in the inverse calculation).		// permutation map (which is asserted in the inverse calculation).
auto linalgOp = cast<ConcreteOpTy>(op);		auto linalgOp = cast<ConcreteOpTy>(op);
		assert(linalgOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
auto invertedMap =		auto invertedMap =
inversePermutation(concatAffineMaps(loopToOperandRangesMaps(linalgOp)));		inversePermutation(concatAffineMaps(loopToOperandRangesMaps(linalgOp)));
if (!invertedMap) {		if (!invertedMap) {
LinalgScopedEmitter<IndexedValueTy, ConcreteOpTy>::emitScalarImplementation(		LinalgScopedEmitter<IndexedValueTy, ConcreteOpTy>::emitScalarImplementation(
{}, linalgOp);		{}, linalgOp);
return success();		return success();
}		}

[... 187 lines not shown ...]

mlir/lib/Dialect/Linalg/Transforms/LinalgTransforms.cpp

[... 87 lines not shown ...]	for (auto *originalProducer : originalProducers)
rewriter.eraseOp(originalProducer);		rewriter.eraseOp(originalProducer);
return success();		return success();
}		}

bool mlir::linalg::detail::isProducedByOpOfTypeImpl(		bool mlir::linalg::detail::isProducedByOpOfTypeImpl(
Operation *consumerOp, Value consumedView,		Operation *consumerOp, Value consumedView,
function_ref<bool(Operation *)> isaOpType) {		function_ref<bool(Operation *)> isaOpType) {
LinalgOp consumer = dyn_cast<LinalgOp>(consumerOp);		LinalgOp consumer = dyn_cast<LinalgOp>(consumerOp);
if (!consumer)		if (!consumer)
return false;		return false;
		// Assert only after the null check: dyn_cast may return a null op.
		assert(consumer.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");

auto maybeConsumerIndex = consumer.getIndexOfInput(consumedView);		auto maybeConsumerIndex = consumer.getIndexOfInput(consumedView);
if (!maybeConsumerIndex)		if (!maybeConsumerIndex)
return false;		return false;

Aliases aliases;		Aliases aliases;
[... 62 lines not shown ...]	mlir::linalg::vectorizeGenericLinalgOpPrecondition(Operation *op) {

// TODO(ntv): non-identity layout.		// TODO(ntv): non-identity layout.
auto isStaticMemRefWithIdentityLayout = [](Value v) {		auto isStaticMemRefWithIdentityLayout = [](Value v) {
auto m = v.getType().dyn_cast<MemRefType>();		auto m = v.getType().dyn_cast<MemRefType>();
if (!m || !m.hasStaticShape() || !m.getAffineMaps().empty())		if (!m || !m.hasStaticShape() || !m.getAffineMaps().empty())
return false;		return false;
return true;		return true;
};		};
if (!llvm::all_of(genericOp.getInputsAndOutputs(),		if (!llvm::all_of(genericOp.getInputsAndOutputBuffers(),
isStaticMemRefWithIdentityLayout))		isStaticMemRefWithIdentityLayout))
return failure();		return failure();
return success();		return success();
}		}

SmallVector<Value, 0>		SmallVector<Value, 0>
mlir::linalg::vectorizeGenericLinalgOp(PatternRewriter &rewriter,		mlir::linalg::vectorizeGenericLinalgOp(PatternRewriter &rewriter,
Operation *op) {		Operation *op) {
LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE		LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE
"]: Rewrite linalg op as vector.contract: "		"]: Rewrite linalg op as vector.contract: "
<< *op << ":\n");		<< *op << ":\n");

assert(succeeded(vectorizeGenericLinalgOpPrecondition(op)) &&		assert(succeeded(vectorizeGenericLinalgOpPrecondition(op)) &&
"DRR failure case must be a precondition");		"DRR failure case must be a precondition");

auto genericOp = cast<linalg::GenericOp>(op);		auto genericOp = cast<linalg::GenericOp>(op);
		assert(genericOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
edsc::ScopedContext scope(rewriter, op->getLoc());		edsc::ScopedContext scope(rewriter, op->getLoc());
using edsc::intrinsics::std_load;		using edsc::intrinsics::std_load;
using edsc::intrinsics::std_store;		using edsc::intrinsics::std_store;
using vector_contract = edsc::intrinsics::ValueBuilder<vector::ContractionOp>;		using vector_contract = edsc::intrinsics::ValueBuilder<vector::ContractionOp>;
using vector_type_cast = edsc::intrinsics::ValueBuilder<vector::TypeCastOp>;		using vector_type_cast = edsc::intrinsics::ValueBuilder<vector::TypeCastOp>;
auto vA = std_load(vector_type_cast(genericOp.getInput(0)));		auto vA = std_load(vector_type_cast(genericOp.getInput(0)));
auto vB = std_load(vector_type_cast(genericOp.getInput(1)));		auto vB = std_load(vector_type_cast(genericOp.getInput(1)));
auto vectorMemRefC = vector_type_cast(genericOp.getOutput(0));		auto vectorMemRefC = vector_type_cast(genericOp.getOutputBuffer(0));
auto vC = std_load(vectorMemRefC);		auto vC = std_load(vectorMemRefC);
auto vRes = vector_contract(vA, vB, vC, genericOp.indexing_maps(),		auto vRes = vector_contract(vA, vB, vC, genericOp.indexing_maps(),
genericOp.iterator_types());		genericOp.iterator_types());
std_store(vRes, vectorMemRefC);		std_store(vRes, vectorMemRefC);
return {};		return {};
}		}

//============================================================================//		//============================================================================//
[... 50 lines not shown ...]
//============================================================================//		//============================================================================//
// Precondition and transformation for Linalg subview promotion.		// Precondition and transformation for Linalg subview promotion.
//============================================================================//		//============================================================================//
LogicalResult mlir::linalg::promoteSubviewsLinalgOpPrecondition(Operation *op) {		LogicalResult mlir::linalg::promoteSubviewsLinalgOpPrecondition(Operation *op) {
LinalgOp linOp = dyn_cast<LinalgOp>(op);		LinalgOp linOp = dyn_cast<LinalgOp>(op);
// Transformation applies to buffers only.		// Transformation applies to buffers only.
if (!linOp || !linOp.hasBufferSemantics())		if (!linOp || !linOp.hasBufferSemantics())
return failure();		return failure();
if (llvm::none_of(linOp.getInputsAndOutputs(), [](Value v) {		if (llvm::none_of(linOp.getInputsAndOutputBuffers(), [](Value v) {
return isa_and_nonnull<SubViewOp>(v.getDefiningOp());		return isa_and_nonnull<SubViewOp>(v.getDefiningOp());
}))		}))
return failure();		return failure();
return success();		return success();
}		}

SmallVector<Value, 0>		SmallVector<Value, 0>
mlir::linalg::promoteSubviewsLinalgOp(PatternRewriter &rewriter,		mlir::linalg::promoteSubviewsLinalgOp(PatternRewriter &rewriter,
Operation *op) {		Operation *op) {
LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE "]: Promote subviews for linalg op: "		LLVM_DEBUG(dbgs() << "\n[" DEBUG_TYPE "]: Promote subviews for linalg op: "
<< *op << ":\n");		<< *op << ":\n");

assert(succeeded(promoteSubviewsLinalgOpPrecondition(op)) &&		assert(succeeded(promoteSubviewsLinalgOpPrecondition(op)) &&
"DRR failure case must be a precondition");		"DRR failure case must be a precondition");

LinalgOp linOp = cast<LinalgOp>(op);		LinalgOp linOp = cast<LinalgOp>(op);
		assert(linOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
SetVector<Value> subViews;		SetVector<Value> subViews;
for (auto it : linOp.getInputsAndOutputs())		for (auto it : linOp.getInputsAndOutputBuffers())
if (auto sv = dyn_cast_or_null<SubViewOp>(it.getDefiningOp()))		if (auto sv = dyn_cast_or_null<SubViewOp>(it.getDefiningOp()))
subViews.insert(sv);		subViews.insert(sv);
if (!subViews.empty()) {		if (!subViews.empty()) {
promoteSubViewOperands(rewriter, linOp, subViews);		promoteSubViewOperands(rewriter, linOp, subViews);
return {};		return {};
}		}
llvm_unreachable("DRR failure case must be a precondition");		llvm_unreachable("DRR failure case must be a precondition");
}		}

mlir/lib/Dialect/Linalg/Transforms/Promotion.cpp

[... 149 lines not shown ...]	mlir::linalg::promoteSubViews(OpBuilder &b, Location loc,
}		}
return res;		return res;
}		}

LinalgOp mlir::linalg::promoteSubViewOperands(OpBuilder &b, LinalgOp op,		LinalgOp mlir::linalg::promoteSubViewOperands(OpBuilder &b, LinalgOp op,
SetVector<Value> subViews,		SetVector<Value> subViews,
bool dynamicBuffers,		bool dynamicBuffers,
OperationFolder *folder) {		OperationFolder *folder) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");

// 1. Promote the specified views and use them in the new op.		// 1. Promote the specified views and use them in the new op.
ScopedContext scope(b, op.getLoc());		ScopedContext scope(b, op.getLoc());
auto promotedBufferAndViews = promoteSubViews(		auto promotedBufferAndViews = promoteSubViews(
b, op.getLoc(), subViews.getArrayRef(), dynamicBuffers, folder);		b, op.getLoc(), subViews.getArrayRef(), dynamicBuffers, folder);
SmallVector<Value, 8> opViews;		SmallVector<Value, 8> opViews;
opViews.reserve(op.getNumInputsAndOutputs());		opViews.reserve(op.getNumInputsAndOutputs());
SmallVector<std::pair<Value, Value>, 8> writebackViews;		SmallVector<std::pair<Value, Value>, 8> writebackViews;
writebackViews.reserve(subViews.size());		writebackViews.reserve(subViews.size());
unsigned promotedIdx = 0;		unsigned promotedIdx = 0;
for (auto view : op.getInputsAndOutputs()) {		for (auto view : op.getInputsAndOutputBuffers()) {
if (subViews.count(view) != 0) {		if (subViews.count(view) != 0) {
opViews.push_back(promotedBufferAndViews[promotedIdx].fullLocalView);		opViews.push_back(promotedBufferAndViews[promotedIdx].fullLocalView);
writebackViews.emplace_back(std::make_pair(		writebackViews.emplace_back(std::make_pair(
view, promotedBufferAndViews[promotedIdx].partialLocalView));		view, promotedBufferAndViews[promotedIdx].partialLocalView));
promotedIdx++;		promotedIdx++;
} else {		} else {
opViews.push_back(view);		opViews.push_back(view);
}		}
}		}

// 2. Append all other operands as they appear, this enforces that such		// 2. Append all other operands as they appear, this enforces that such
// operands are not views. This is to support cases such as FillOp taking		// operands are not views. This is to support cases such as FillOp taking
// extra scalars etc.		// extra scalars etc.
auto operands = getAssumedNonViewOperands(op);		auto operands = getAssumedNonViewOperands(op);
opViews.append(operands.begin(), operands.end());		opViews.append(operands.begin(), operands.end());
LinalgOp res = op.clone(b, op.getLoc(), opViews);		LinalgOp res = op.clone(b, op.getLoc(), opViews);

// 3. Emit write-back for the promoted output views: copy the partial view.		// 3. Emit write-back for the promoted output views: copy the partial view.
for (auto viewAndPartialLocalView : writebackViews) {		for (auto viewAndPartialLocalView : writebackViews) {
// WARNING: MUST use the old op to determine whether the operand view is an		// WARNING: MUST use the old op to determine whether the operand view is an
// output.		// output.
bool isOutput =		bool isOutput =
op.getIndexOfOutput(viewAndPartialLocalView.first).hasValue();		op.getIndexOfOutputBuffer(viewAndPartialLocalView.first).hasValue();
if (isOutput)		if (isOutput)
copy(viewAndPartialLocalView.second, viewAndPartialLocalView.first);		copy(viewAndPartialLocalView.second, viewAndPartialLocalView.first);
}		}

// 4. Dealloc local buffers.		// 4. Dealloc local buffers.
for (const auto &pi : promotedBufferAndViews)		for (const auto &pi : promotedBufferAndViews)
dealloc(pi.buffer);		dealloc(pi.buffer);

return res;		return res;
}		}
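
The four numbered steps of `promoteSubViewOperands` (promote, rebuild the op on the promoted views, write back outputs only, deallocate) can be illustrated outside MLIR. The sketch below models a subview as a window into a `std::vector` and is hedged accordingly: `SubView`, `Kernel` and `runOnPromotedViews` are hypothetical names, and the copy-in performed during promotion is an assumption of this sketch, since only the write-back and dealloc steps are visible in the excerpt above.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of a 1-D subview: the window [offset, offset + size)
// of a larger buffer, tagged as input or output of the op.
struct SubView {
  std::vector<float> *base;
  std::size_t offset;
  std::size_t size;
  bool isOutput;
};

// Hypothetical stand-in for the cloned op running on the promoted views.
using Kernel = std::function<void(std::vector<std::vector<float>> &)>;

inline void runOnPromotedViews(const std::vector<SubView> &views,
                               const Kernel &kernel) {
  // 1. Promote: allocate a local buffer per view and copy the window in
  //    (the copy-in is an assumption of this sketch).
  std::vector<std::vector<float>> local;
  local.reserve(views.size());
  for (const SubView &v : views) {
    auto first = v.base->begin() + static_cast<std::ptrdiff_t>(v.offset);
    local.emplace_back(first, first + static_cast<std::ptrdiff_t>(v.size));
  }
  // 2. Run the op on the promoted (local) views.
  kernel(local);
  // 3. Write back the promoted output views only; inputs are left untouched.
  for (std::size_t i = 0; i < views.size(); ++i) {
    if (!views[i].isOutput)
      continue;
    std::copy(local[i].begin(), local[i].end(),
              views[i].base->begin() +
                  static_cast<std::ptrdiff_t>(views[i].offset));
  }
  // 4. The local buffers are released when `local` goes out of scope.
}
```

As in the diff, whether a view is an output is decided from the original op's operands (here the `isOutput` flag recorded before promotion), not from the promoted copies.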

static void promoteSubViews(FuncOp f, bool dynamicBuffers) {		static void promoteSubViews(FuncOp f, bool dynamicBuffers) {
SmallVector<LinalgOp, 8> toErase;		SmallVector<LinalgOp, 8> toErase;
OperationFolder folder(f.getContext());		OperationFolder folder(f.getContext());
f.walk([dynamicBuffers, &folder, &toErase](LinalgOp op) {		f.walk([dynamicBuffers, &folder, &toErase](LinalgOp op) {
		if (!op.hasBufferSemantics())
		return;

// TODO(ntv) some heuristic here to decide what to promote. Atm it is all or		// TODO(ntv) some heuristic here to decide what to promote. Atm it is all or
// nothing.		// nothing.
SetVector<Value> subViews;		SetVector<Value> subViews;
OpBuilder b(op);		OpBuilder b(op);
for (auto it : op.getInputsAndOutputs())		for (auto it : op.getInputsAndOutputBuffers())
if (auto sv = dyn_cast_or_null<SubViewOp>(it.getDefiningOp()))		if (auto sv = dyn_cast_or_null<SubViewOp>(it.getDefiningOp()))
subViews.insert(sv);		subViews.insert(sv);
if (!subViews.empty()) {		if (!subViews.empty()) {
promoteSubViewOperands(b, op, subViews, dynamicBuffers, &folder);		promoteSubViewOperands(b, op, subViews, dynamicBuffers, &folder);
toErase.push_back(op);		toErase.push_back(op);
}		}
});		});
for (auto op : toErase)		for (auto op : toErase)
[... 25 lines not shown ...]

mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp

[... 167 lines not shown ...]
// }		// }
// }		// }
//		//
// TODO(pifon, ntv): Investigate whether mixing implicit and explicit indices		// TODO(pifon, ntv): Investigate whether mixing implicit and explicit indices
// does not lead to losing information.		// does not lead to losing information.
static void transformIndexedGenericOpIndices(		static void transformIndexedGenericOpIndices(
OpBuilder &b, LinalgOp op, ArrayRef<ValueHandle *> pivs,		OpBuilder &b, LinalgOp op, ArrayRef<ValueHandle *> pivs,
const LoopIndexToRangeIndexMap &loopIndexToRangeIndex) {		const LoopIndexToRangeIndexMap &loopIndexToRangeIndex) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");
auto indexedGenericOp = dyn_cast<IndexedGenericOp>(op.getOperation());		auto indexedGenericOp = dyn_cast<IndexedGenericOp>(op.getOperation());
if (!indexedGenericOp)		if (!indexedGenericOp)
return;		return;

// `linalg.indexed_generic` comes in two flavours. One has a region with a		// `linalg.indexed_generic` comes in two flavours. One has a region with a
// single block that defines the loop body. The other has a `fun` attribute		// single block that defines the loop body. The other has a `fun` attribute
// that refers to an existing function symbol. The `fun` function call will be		// that refers to an existing function symbol. The `fun` function call will be
// inserted in the loop body in that case.		// inserted in the loop body in that case.
[... 43 lines not shown ...]	if (isTiled(map.getResult(r), tileSizes))
return true;		return true;
return false;		return false;
}		}

static SmallVector<Value, 4>		static SmallVector<Value, 4>
makeTiledViews(OpBuilder &b, Location loc, LinalgOp linalgOp,		makeTiledViews(OpBuilder &b, Location loc, LinalgOp linalgOp,
ArrayRef<Value> ivs, ArrayRef<Value> tileSizes,		ArrayRef<Value> ivs, ArrayRef<Value> tileSizes,
ArrayRef<Value> viewSizes, OperationFolder *folder) {		ArrayRef<Value> viewSizes, OperationFolder *folder) {
		assert(linalgOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
assert(ivs.size() == static_cast<size_t>(llvm::count_if(		assert(ivs.size() == static_cast<size_t>(llvm::count_if(
llvm::make_range(tileSizes.begin(), tileSizes.end()),		llvm::make_range(tileSizes.begin(), tileSizes.end()),
[](Value v) { return !isZero(v); })) &&		[](Value v) { return !isZero(v); })) &&
"expected as many ivs as non-zero sizes");		"expected as many ivs as non-zero sizes");

using edsc::intrinsics::select;		using edsc::intrinsics::select;
using edsc::op::operator+;		using edsc::op::operator+;
using edsc::op::operator<;		using edsc::op::operator<;

// Construct (potentially temporary) mins and maxes on which to apply maps		// Construct (potentially temporary) mins and maxes on which to apply maps
// that define tile subviews.		// that define tile subviews.
SmallVector<Value, 8> lbs, subViewSizes;		SmallVector<Value, 8> lbs, subViewSizes;
for (unsigned idx = 0, idxIvs = 0, e = tileSizes.size(); idx < e; ++idx) {		for (unsigned idx = 0, idxIvs = 0, e = tileSizes.size(); idx < e; ++idx) {
bool isTiled = !isZero(tileSizes[idx]);		bool isTiled = !isZero(tileSizes[idx]);
lbs.push_back(isTiled ? ivs[idxIvs++] : (Value)constant_index(folder, 0));		lbs.push_back(isTiled ? ivs[idxIvs++] : (Value)constant_index(folder, 0));
subViewSizes.push_back(isTiled ? tileSizes[idx] : viewSizes[idx]);		subViewSizes.push_back(isTiled ? tileSizes[idx] : viewSizes[idx]);
}		}

auto *op = linalgOp.getOperation();		auto *op = linalgOp.getOperation();

SmallVector<Value, 4> res;		SmallVector<Value, 4> res;
res.reserve(op->getNumOperands());		res.reserve(op->getNumOperands());
auto viewIteratorBegin = linalgOp.getInputsAndOutputs().begin();		auto viewIteratorBegin = linalgOp.getInputsAndOutputBuffers().begin();
for (unsigned viewIndex = 0; viewIndex < linalgOp.getNumInputsAndOutputs();		for (unsigned viewIndex = 0; viewIndex < linalgOp.getNumInputsAndOutputs();
++viewIndex) {		++viewIndex) {
Value view = *(viewIteratorBegin + viewIndex);		Value view = *(viewIteratorBegin + viewIndex);
unsigned rank = view.getType().cast<MemRefType>().getRank();		unsigned rank = view.getType().cast<MemRefType>().getRank();
auto map = loopToOperandRangesMaps(linalgOp)[viewIndex];		auto map = loopToOperandRangesMaps(linalgOp)[viewIndex];
// If the view is not tiled, we can use it as is.		// If the view is not tiled, we can use it as is.
if (!isTiled(map, tileSizes)) {		if (!isTiled(map, tileSizes)) {
res.push_back(view);		res.push_back(view);
[... 38 lines not shown ...]	makeTiledViews(OpBuilder &b, Location loc, LinalgOp linalgOp,

return res;		return res;
}		}

Optional<TiledLinalgOp>		Optional<TiledLinalgOp>
mlir::linalg::tileLinalgOp(OpBuilder &b, LinalgOp op, ArrayRef<Value> tileSizes,		mlir::linalg::tileLinalgOp(OpBuilder &b, LinalgOp op, ArrayRef<Value> tileSizes,
ArrayRef<unsigned> permutation,		ArrayRef<unsigned> permutation,
OperationFolder *folder) {		OperationFolder *folder) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");
// 1. Enforce the convention that "tiling by zero" skips tiling a particular		// 1. Enforce the convention that "tiling by zero" skips tiling a particular
// dimension. This convention is significantly simpler to handle instead of		// dimension. This convention is significantly simpler to handle instead of
// adjusting affine maps to account for missing dimensions.		// adjusting affine maps to account for missing dimensions.
assert(op.getNumParallelLoops() + op.getNumReductionLoops() +		assert(op.getNumParallelLoops() + op.getNumReductionLoops() +
op.getNumWindowLoops() ==		op.getNumWindowLoops() ==
tileSizes.size() &&		tileSizes.size() &&
"expected matching number of tile sizes and loops");		"expected matching number of tile sizes and loops");

[... 58 lines not shown ...]	for (auto iv : ivs)
loops.push_back(loop::getForInductionVarOwner(iv));		loops.push_back(loop::getForInductionVarOwner(iv));

return TiledLinalgOp{res, loops};		return TiledLinalgOp{res, loops};
}		}

Optional<TiledLinalgOp> mlir::linalg::tileLinalgOp(		Optional<TiledLinalgOp> mlir::linalg::tileLinalgOp(
OpBuilder &b, LinalgOp op, ArrayRef<int64_t> tileSizes,		OpBuilder &b, LinalgOp op, ArrayRef<int64_t> tileSizes,
ArrayRef<unsigned> permutation, OperationFolder *folder) {		ArrayRef<unsigned> permutation, OperationFolder *folder) {
		assert(op.hasBufferSemantics() && "expected linalg op with buffer semantics");
if (tileSizes.empty())		if (tileSizes.empty())
return llvm::None;		return llvm::None;

// The following uses the convention that "tiling by zero" skips tiling a		// The following uses the convention that "tiling by zero" skips tiling a
// particular dimension. This convention is significantly simpler to handle		// particular dimension. This convention is significantly simpler to handle
// instead of adjusting affine maps to account for missing dimensions.		// instead of adjusting affine maps to account for missing dimensions.
auto nLoops = op.getNumParallelLoops() + op.getNumReductionLoops() +		auto nLoops = op.getNumParallelLoops() + op.getNumReductionLoops() +
op.getNumWindowLoops();		op.getNumWindowLoops();
[... 20 lines not shown ...]	Optional<TiledLinalgOp> mlir::linalg::tileLinalgOp(

return tileLinalgOp(b, op, tileSizeValues, permutation, folder);		return tileLinalgOp(b, op, tileSizeValues, permutation, folder);
}		}
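
The "tiling by zero" convention used by `tileLinalgOp` and `makeTiledViews` can be shown on plain integers. In the sketch below, a tile size of zero means the dimension is not tiled: its lower bound is 0 and its extent is the full view size, while a tiled dimension takes the next induction variable as lower bound and the tile size as extent. `TileBounds` and `computeTileBounds` are hypothetical names for illustration, and the boundary clamping that the real code applies afterwards is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: per-dimension lower bounds and extents.
struct TileBounds {
  std::vector<std::int64_t> lbs;
  std::vector<std::int64_t> sizes;
};

// `ivs` holds one induction-variable value per non-zero tile size, in order.
inline TileBounds computeTileBounds(const std::vector<std::int64_t> &ivs,
                                    const std::vector<std::int64_t> &tileSizes,
                                    const std::vector<std::int64_t> &viewSizes) {
  assert(tileSizes.size() == viewSizes.size() &&
         "expected one view size per tile size");
  TileBounds res;
  std::size_t idxIvs = 0;
  for (std::size_t idx = 0; idx < tileSizes.size(); ++idx) {
    bool isTiled = tileSizes[idx] != 0;
    if (isTiled)
      assert(idxIvs < ivs.size() && "expected as many ivs as non-zero sizes");
    // Tiled dimension: start at the induction variable, span one tile.
    // Untiled dimension: start at 0, span the whole view.
    res.lbs.push_back(isTiled ? ivs[idxIvs++] : 0);
    res.sizes.push_back(isTiled ? tileSizes[idx] : viewSizes[idx]);
  }
  assert(idxIvs == ivs.size() && "expected as many ivs as non-zero sizes");
  return res;
}
```

For example, with tile sizes `{2, 0, 4}` only the first and third dimensions consume an induction variable; the middle one keeps its full extent.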

static void tileLinalgOps(FuncOp f, ArrayRef<int64_t> tileSizes) {		static void tileLinalgOps(FuncOp f, ArrayRef<int64_t> tileSizes) {
OpBuilder b(f);		OpBuilder b(f);
OperationFolder folder(f.getContext());		OperationFolder folder(f.getContext());
f.walk([tileSizes, &b, &folder](LinalgOp op) {		f.walk([tileSizes, &b, &folder](LinalgOp op) {
		if (!op.hasBufferSemantics())
		return;
auto opLoopsPair =		auto opLoopsPair =
tileLinalgOp(b, op, tileSizes, /*permutation=*/{}, &folder);		tileLinalgOp(b, op, tileSizes, /*permutation=*/{}, &folder);
// If tiling occurred successfully, erase old op.		// If tiling occurred successfully, erase old op.
if (opLoopsPair)		if (opLoopsPair)
op.erase();		op.erase();
});		});
f.walk([](LinalgOp op) {		f.walk([](LinalgOp op) {
if (!op.getOperation()->hasNoSideEffect())		if (!op.getOperation()->hasNoSideEffect())
[... 32 lines not shown ...]

mlir/test/Dialect/Linalg/invalid.mlir

[... 62 lines not shown ...]	linalg.generic {
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @generic_exactly_2_views(%arg0: memref<f32>) {		func @generic_exactly_2_views(%arg0: memref<f32>) {
// expected-error @+1 {{op expected exactly 2 view operands}}		// expected-error @+1 {{op expected exactly 2 inputs (tensor or buffer) and output buffer operands}}
linalg.generic {		linalg.generic {
args_in = 1,		args_in = 1,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0, %arg0, %arg0: memref<f32>, memref<f32>, memref<f32>		} %arg0, %arg0, %arg0: memref<f32>, memref<f32>, memref<f32>
}		}
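
The updated diagnostic in the test above counts inputs (tensor or buffer) plus output buffers. One way to read it, based on the rule in the summary that `args_out` equals the number of output buffer operands plus the number of tensor return values, is sketched below. This is an interpretation of the diagnostic, not the verifier code from this diff; `GenericOpShape` and the helper functions are hypothetical names.

```cpp
#include <cassert>

// Hypothetical summary of the attributes and results relevant to the check.
struct GenericOpShape {
  unsigned argsIn;           // value of the args_in attribute
  unsigned argsOut;          // value of the args_out attribute
  unsigned numTensorResults; // number of tensor values returned by the op
};

// args_out = output buffers + tensor results, hence:
inline unsigned numOutputBuffers(const GenericOpShape &s) {
  assert(s.numTensorResults <= s.argsOut &&
         "tensor results cannot exceed args_out");
  return s.argsOut - s.numTensorResults;
}

// Operands are the inputs (tensor or buffer) followed by the output buffers.
inline unsigned expectedNumOperands(const GenericOpShape &s) {
  return s.argsIn + numOutputBuffers(s);
}

inline bool hasExpectedOperandCount(const GenericOpShape &s,
                                    unsigned numOperands) {
  return numOperands == expectedNumOperands(s);
}
```

Under this reading, the test case with `args_in = 1`, `args_out = 1`, no tensor results and three memref operands is rejected because exactly two operands are expected.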

// -----		// -----

func @generic_undefined_fun(%arg0: memref<f32>) {		func @generic_undefined_fun(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun attribute to refer to a defined symbol}}		// expected-error @+1 {{op expected function attribute to refer to a defined symbol}}
linalg.generic {		linalg.generic {
args_in = 1,		args_in = 1,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0, %arg0: memref<f32>, memref<f32>		} %arg0, %arg0: memref<f32>, memref<f32>
}		}

// -----		// -----

func @foo() { return }		func @foo() { return }

func @generic_mismatched_num_arguments(%arg0: memref<f32>) {		func @generic_mismatched_num_arguments(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun arguments to match number of views}}		// expected-error @+1 {{op expected function arguments to match number of operands}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @foo(%0: i32) { return }		func @foo(%0: i32) { return }

func @generic_mismatched_num_returns(%arg0: memref<f32>) {		func @generic_mismatched_num_returns(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun results to match number of output views}}		// expected-error @+1 {{op expected function results(0) to match number of outputs(1)}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

		func @foo(%0: i32, %1: i32, %2: i32) { return }

		func @generic_mismatched_num_returns(%0: memref<i32>, %1: memref<f32>) {
		// expected-error @+1 {{op expected function argument 2 of the same type as elemental type 'f32' of operand 2}}
		linalg.generic {
		args_in = 3,
		args_out = 0,
		fun = @foo,
		indexing_maps = [ affine_map<() -> (0)> ],
		iterator_types = []
		} %0, %1, %1: memref<i32>, memref<f32>, memref<f32>
		}

		// -----

		func @foo(%0: i32, %1: i32, %2: f32) -> i32 { return %1: i32}

		func @generic_mismatched_num_returns(%0: memref<i32>, %1: memref<f32>) {
		// expected-error @+1 {{op expected function result 1 of the same type as elemental type 'f32' of output 1}}
		linalg.generic {
		args_in = 2,
		args_out = 1,
		fun = @foo,
		indexing_maps = [ affine_map<() -> (0)> ],
		iterator_types = []
		} %0, %0, %1: memref<i32>, memref<i32>, memref<f32>
		}

		// -----

func @foo(%0: i32) -> i32 { return %0: i32 }		func @foo(%0: i32) -> i32 { return %0: i32 }

func @generic_symbol_in_map(%arg0: memref<i32>) {		func @generic_symbol_in_map(%arg0: memref<i32>) {
// expected-error @+1 {{op expected indexing_map #0 to have no symbols}}		// expected-error @+1 {{op expected indexing_map #0 to have no symbols}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
[... 50 lines not shown ...]
// -----		// -----

func @foo(%0: i32) -> f32 {		func @foo(%0: i32) -> f32 {
%1 = constant 0.0: f32		%1 = constant 0.0: f32
return %1: f32		return %1: f32
}		}

func @generic_fun_arg_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_fun_arg_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected fun argument 0 of the same type as elemental type 'f32' of view 0}}		// expected-error @+1 {{op expected function argument 1 of the same type as elemental type 'f32' of operand 1}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>		} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}

// -----		// -----

func @foo(%0: f32) -> i4 {		func @foo(%0: f32) -> i4 {
%1 = constant 1: i4		%1 = constant 1: i4
return %1: i4		return %1: i4
}		}

func @generic_fun_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_fun_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected fun result 0 of the same type as elemental type 'f32' of view 0}}		// expected-error @+1 {{op expected function result 1 of the same type as elemental type 'f32' of output 1}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
fun = @foo,		fun = @foo,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>		} %arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>
}		}
[... 33 lines not shown ...]	linalg.generic {
^bb1:		^bb1:
^bb2:		^bb2:
}: memref<f32>, memref<f32>		}: memref<f32>, memref<f32>
}		}

// -----		// -----

func @generic_mismatched_num_arguments(%arg0: memref<f32>) {		func @generic_mismatched_num_arguments(%arg0: memref<f32>) {
// expected-error @+1 {{op expected number of block arguments to match number of views}}		// expected-error @+1 {{op expected number of block arguments to match number of operands}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0 {		} %arg0 {
^bb:		^bb:
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @generic_block_arg_type(%arg0: memref<f32>) {		func @generic_block_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 0 of the same type as elemental type of output view: 'memref<f32>'}}		// expected-error @+1 {{op expected block argument 1 of the same type as elemental type of output operand: 'memref<f32>'}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = []		iterator_types = []
} %arg0 {		} %arg0 {
^bb(%i: i1):		^bb(%i: i1):
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @indexed_generic_block_arg_count(%arg0: memref<f32>) {		func @indexed_generic_block_arg_count(%arg0: memref<f32>) {
// expected-error @+1 {{op expected number of block arguments to match number of views + number of loops}}		// expected-error @+1 {{op expected number of block arguments to match number of operands + number of loops}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
^bb(%f: f32):		^bb(%f: f32):
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @indexed_generic_block_induction_var_arg_type(%arg0: memref<f32>) {		func @indexed_generic_block_induction_var_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 0 to be of IndexType}}		// expected-error @+1 {{op expected block argument 1 to be an index}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
^bb(%i: f64, %f: f32):		^bb(%i: f64, %f: f32):
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @indexed_generic_block_arg_type(%arg0: memref<f32>) {		func @indexed_generic_block_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected block argument 1 of the same type as elemental type of output view: 'memref<f32>'}}		// expected-error @+1 {{op expected block argument 2 of the same type as elemental type of output operand: 'memref<f32>'}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
^bb(%i: index, %f: i1):		^bb(%i: index, %f: i1):
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @foo(%f: f32) -> (f32) {		func @foo(%f: f32) -> (f32) {
return %f : f32		return %f : f32
}		}
func @indexed_generic_fun_arg_count(%arg0: memref<f32>) {		func @indexed_generic_fun_arg_count(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun arguments to match number of views + number of loops}}		// expected-error @+1 {{op expected function arguments to match number of loops + number of operands}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
fun = @foo		fun = @foo
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @foo(%i: i32, %val: f32) -> (f32) {		func @foo(%i: i32, %val: f32) -> (f32) {
return %val : f32		return %val : f32
}		}
func @indexed_generic_fun_induction_var_arg_type(%arg0: memref<f32>) {		func @indexed_generic_fun_induction_var_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun argument 0 to be of IndexType}}		// expected-error @+1 {{op expected function argument 1 to be an index}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
iterator_types = ["parallel"],		iterator_types = ["parallel"],
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
fun = @foo		fun = @foo
} %arg0 : memref<f32>		} %arg0 : memref<f32>
}		}

// -----		// -----

func @foo(%i: index, %val: i1) -> (i1) {		func @foo(%i: index, %val: i1) -> (i1) {
return %val : i1		return %val : i1
}		}
func @indexed_generic_fun_arg_type(%arg0: memref<f32>) {		func @indexed_generic_fun_arg_type(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun argument 1 of the same type as elemental type 'f32' of view 0}}		// expected-error @+1 {{op expected function argument 2 of the same type as elemental type 'f32' of input 1}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
fun = @foo		fun = @foo
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @foo(%i: index, %val: i1) -> (i1, i1) {		func @foo(%i: index, %val: i1) -> (i1, i1) {
return %val, %val : i1, i1		return %val, %val : i1, i1
}		}
func @indexed_generic_fun_result_count(%arg0: memref<f32>) {		func @indexed_generic_fun_result_count(%arg0: memref<f32>) {
// expected-error @+1 {{op expected fun results to match number of output views}}		// expected-error @+1 {{op expected function results to match number of outputs}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
fun = @foo		fun = @foo
} %arg0: memref<f32>		} %arg0: memref<f32>
}		}

// -----		// -----

func @foo(%i: index, %val: i32) -> (f32) {		func @foo(%i: index, %val: i32) -> (f32) {
%val_float = sitofp %val : i32 to f32		%val_float = sitofp %val : i32 to f32
return %val_float : f32		return %val_float : f32
}		}
func @indexed_generic_fun_result_count(%arg0: memref<i32>) {		func @indexed_generic_fun_result_count(%arg0: memref<i32>) {
// expected-error @+1 {{op expected fun result 0 of the same type as elemental type 'i32' of view 0}}		// expected-error @+1 {{op expected function result 1 of the same type as elemental type 'i32' of output 1}}
linalg.indexed_generic {		linalg.indexed_generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(d0) -> (d0)> ],		indexing_maps = [ affine_map<(d0) -> (d0)> ],
iterator_types = ["parallel"],		iterator_types = ["parallel"],
fun = @foo		fun = @foo
} %arg0: memref<i32>		} %arg0: memref<i32>
}		}

// -----		// -----

func @generic_fun_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {		func @generic_fun_result_0_element_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+9 {{type of return operand 0 ('i1') doesn't match view element type ('f32')}}		// expected-error @+9 {{type of yield operand 1 ('i1') doesn't match the element type of the enclosing linalg.generic op ('f32')}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],		indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
^bb(%i: f32):		^bb(%i: f32):
%0 = constant 0: i1		%0 = constant 0: i1
[13 lines not shown; context: func @generic_result_tensor_type(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {]
} %arg0 {		} %arg0 {
^bb(%i: f32):		^bb(%i: f32):
linalg.yield %i: f32		linalg.yield %i: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>> -> f32		}: memref<?xf32, affine_map<(i)[off]->(off + i)>> -> f32
}		}

// -----		// -----

func @generic_result_tensor_count(%arg0: memref<?xf32, affine_map<(i)[off]->(off + i)>>) {
// expected-error @+1 {{op expected #output tensor operands (0) to match #results (1)}}
%0 = linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]
} %arg0 {
^bb(%i: f32):
linalg.yield %i: f32
}: memref<?xf32, affine_map<(i)[off]->(off + i)>> -> tensor<?xf32>
}

// -----

func @generic_result_tensor_type(%arg0: tensor<?xf32>) {
// expected-error @+1 {{op result #0 must be 'tensor<?xf32>', but got 'tensor<?x?xf32>'}}
%0 = linalg.generic {
args_in = 0,
args_out = 1,
indexing_maps = [ affine_map<(i) -> (i)> ],
iterator_types = ["parallel"]
} %arg0 {
^bb(%i: f32):
linalg.yield %i: f32
}: tensor<?xf32> -> tensor<?x?xf32>
}

// -----

func @generic_fun_result_0_element_type(%arg0: memref<?xf32>) {		func @generic_fun_result_0_element_type(%arg0: memref<?xf32>) {
// expected-error @+1 {{'linalg.dot' op expected 3 or more operands}}		// expected-error @+1 {{'linalg.dot' op expected 3 or more operands}}
linalg.dot(%arg0, %arg0): memref<?xf32>, memref<?xf32>		linalg.dot(%arg0, %arg0): memref<?xf32>, memref<?xf32>
}		}

// -----		// -----

// expected-error @+1 {{unknown Linalg type}}		// expected-error @+1 {{unknown Linalg type}}
[52 lines not shown]

mlir/test/Dialect/Linalg/roundtrip.mlir

	[151 lines not shown]

	func @generic_with_tensor_input(%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			func @generic_with_tensor_input(%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.generic #trait %arg0, %arg1 {foo = 1} : tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>			linalg.generic #trait %arg0, %arg1 {foo = 1} : tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @generic_with_tensor_input			// CHECK-LABEL: func @generic_with_tensor_input
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}: tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, #[[strided3D]]>			// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}: tensor<?x?xvector<3x4xi4>>, memref<?x?x?xf32, #[[strided3D]]>

	func @generic_with_tensor_output(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, %arg1: tensor<?x?x?xf32>) -> (tensor<?x?x?xf32>) {			#trait2 = {
	%0 = linalg.generic #trait %arg0, %arg1 {foo = 1} : memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			args_in = 2,
	return %0 : tensor<?x?x?xf32>			args_out = 1,
				indexing_maps = #accesses,
				iterator_types = ["parallel", "parallel", "parallel"],
				fun = @foo,
				library_call = "some_external_function_name_1"
	}			}
	// CHECK-LABEL: func @generic_with_tensor_output
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}: memref<?x?xvector<3x4xi4>, #[[strided2D]]>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	// CHECK: return {{.*}} : tensor<?x?x?xf32>

	func @generic_with_tensor_input_and_output(%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>) -> (tensor<?x?x?xf32>) {			func @generic_with_tensor_input_and_output(%arg0: tensor<?x?xvector<3x4xi4>>, %arg1: tensor<?x?x?xf32>) -> (tensor<?x?x?xf32>) {
	%0 = linalg.generic #trait %arg0, %arg1 {foo = 1} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			%0 = linalg.generic #trait2 %arg0, %arg1 {foo = 1} : tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	return %0 : tensor<?x?x?xf32>			return %0 : tensor<?x?x?xf32>
	}			}
	// CHECK-LABEL: func @generic_with_tensor_input_and_output			// CHECK-LABEL: func @generic_with_tensor_input_and_output
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, fun = @foo, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>			// CHECK: linalg.generic {args_in = 2 : i64, args_out = 1 : i64, fun = @foo, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_1"} %{{.*}}, %{{.*}} {foo = 1 : i64}: tensor<?x?xvector<3x4xi4>>, tensor<?x?x?xf32> -> tensor<?x?x?xf32>
	// CHECK: return {{.*}} : tensor<?x?x?xf32>			// CHECK: return {{.*}} : tensor<?x?x?xf32>

	#trait2 = {			#trait3 = {
	args_in = 1,			args_in = 1,
	args_out = 1,			args_out = 1,
	indexing_maps = #accesses,			indexing_maps = #accesses,
	iterator_types = ["parallel", "parallel", "parallel"],			iterator_types = ["parallel", "parallel", "parallel"],
	library_call = "some_external_function_name_2"			library_call = "some_external_function_name_2"
	}			}
	func @generic_region(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			func @generic_region(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, %arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.generic #trait2 %arg0, %arg1 {			linalg.generic #trait3 %arg0, %arg1 {
	^bb(%a: vector<3x4xi4>, %b: f32) :			^bb(%a: vector<3x4xi4>, %b: f32) :
	linalg.yield %b : f32			linalg.yield %b : f32
	} {foo = 1}: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>			} {foo = 1}: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @generic_region			// CHECK-LABEL: func @generic_region
	// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_2"} %{{.*}}, %{{.*}} {			// CHECK: linalg.generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_2"} %{{.*}}, %{{.*}} {
	// CHECK: ^{{.*}}(%{{.*}}: vector<3x4xi4>, %{{.*}}: f32): // no predecessors			// CHECK: ^{{.*}}(%{{.*}}: vector<3x4xi4>, %{{.*}}: f32): // no predecessors
	// CHECK: linalg.yield %{{.*}} : f32			// CHECK: linalg.yield %{{.*}} : f32
	// CHECK: } {foo = 1 : i64}: memref<?x?xvector<3x4xi4>, #[[strided2D]]>, memref<?x?x?xf32, #[[strided3D]]>			// CHECK: } {foo = 1 : i64}: memref<?x?xvector<3x4xi4>, #[[strided2D]]>, memref<?x?x?xf32, #[[strided3D]]>
	func @indexed_generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,			func @indexed_generic(%arg0: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>,
	%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {			%arg1: memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>) {
	linalg.indexed_generic #trait2 %arg0, %arg1 {			linalg.indexed_generic #trait3 %arg0, %arg1 {
	^bb(%i: index, %j: index, %k: index, %a: vector<3x4xi4>, %b: f32) :			^bb(%i: index, %j: index, %k: index, %a: vector<3x4xi4>, %b: f32) :
	linalg.yield %b : f32			linalg.yield %b : f32
	} {foo = 1}: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>			} {foo = 1}: memref<?x?xvector<3x4xi4>, offset: ?, strides: [?, 1]>, memref<?x?x?xf32, offset: ?, strides: [?, ?, 1]>
	return			return
	}			}
	// CHECK-LABEL: func @indexed_generic			// CHECK-LABEL: func @indexed_generic
	// CHECK: linalg.indexed_generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_2"} %{{.*}}, %{{.*}} {			// CHECK: linalg.indexed_generic {args_in = 1 : i64, args_out = 1 : i64, indexing_maps = [#{{.*}}, #{{.*}}], iterator_types = ["parallel", "parallel", "parallel"], library_call = "some_external_function_name_2"} %{{.*}}, %{{.*}} {
	// CHECK: ^{{.*}}(%{{.*}}: index, %{{.*}}: index, %{{.*}}: index, %{{.*}}: vector<3x4xi4>, %{{.*}}: f32):			// CHECK: ^{{.*}}(%{{.*}}: index, %{{.*}}: index, %{{.*}}: index, %{{.*}}: vector<3x4xi4>, %{{.*}}: f32):
	[90 lines not shown]

Revision Contents

Diff 238102

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

mlir/include/mlir/Dialect/Linalg/IR/LinalgTraits.h

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h

mlir/lib/Dialect/Linalg/Analysis/DependenceAnalysis.cpp

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp

mlir/lib/Dialect/Linalg/Transforms/LinalgToLoops.cpp

mlir/lib/Dialect/Linalg/Transforms/LinalgTransforms.cpp

mlir/lib/Dialect/Linalg/Transforms/Promotion.cpp

mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp

mlir/test/Dialect/Linalg/invalid.mlir

mlir/test/Dialect/Linalg/roundtrip.mlir
