This is an archive of the discontinued LLVM Phabricator instance.


Differential D83879

[mlir][Linalg] Conv1D, Conv2D and Conv3D added as named ops
ClosedPublic

Authored by limo1996 on Jul 15 2020, 8:08 AM.

Details

Reviewers

ftynse
nicolasvasilache
bondhugula

Commits

rG1aaf8aa53d69: [mlir][Linalg] Conv1D, Conv2D and Conv3D added as named ops

Summary

This commit is part of a larger project that aims to add
full end-to-end support for convolutions in MLIR. The
reason for having a conv op for each rank, rather than
one generic ConvOp, is to enable better optimizations
for every N-D case, which reflects the memory layout of the input/kernel
buffers better and simplifies the code as well. We expect plain linalg.conv
to be progressively retired.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

limo1996 created this revision. Jul 15 2020, 8:08 AM

Herald added a project: Restricted Project. Jul 15 2020, 8:08 AM

Herald added subscribers: msifontes, jurahul, Kayjukh and 13 others.

Harbormaster failed remote builds in B64361: Diff 278195! Jul 15 2020, 8:09 AM

bondhugula requested changes to this revision. Jul 15 2020, 8:28 PM

bondhugula added a subscriber: bondhugula.

bondhugula added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
186	Nit: Please terminate with period.
237	Missing doc.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
355	These are all missing doc comments - they do need at least a couple of lines however obvious it might appear to the authors now.
368–373	I'm missing something here. Is this diff correctly generated? Where did the existing `emitScalarImplementation` of `linalg.conv` go?

This revision now requires changes to proceed. Jul 15 2020, 8:28 PM

This is a reasonable first take. Please make sure to document your assumptions, e.g. the order of dimensions. Also make sure the long documentation of the "base" class gets visible when you generate documentation from tablegen.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
223	Don't iterator types (at least, their number) also depend on rank?
233	The "input rank * 2" as number of loops sounds fragile to me. Please add documentation about this assumption somewhere, it will have to be revisited when we want to model batch convolutions.
234	@nicolasvasilache should this use "window" type for appropriate loops?
239	This function is long enough to deserve having its definition in a .cc file.
265	`typeof` is a non-standard GCC extension, this will break most other compilers. Why do you even need explicit template instantiation here?
288	This does not correspond to the maps the code above constructs. The code will produce `(d0,d1,d2,d3) -> (d0 + d2, d1 + d3)`.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
368–373	I see it at line 390....
387	Can this at least use a consistent naming scheme: m1,m2,m3?
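
The map structure discussed in the comments above can be sketched without any MLIR dependency. The following is a minimal illustration, assuming a hypothetical `ConvMaps` struct and `convIndexingMaps` helper (neither is part of the diff) that mirror the loop in `createConvNDIndexingMaps` but build strings instead of `AffineExpr` objects. For rank 2 it yields the `(d0 + d2, d1 + d3)` input access mentioned in the comment on line 288.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the three affine maps of an N-D convolution.
// Each result expression is kept as a plain string such as "d0 + d2".
struct ConvMaps {
  std::vector<std::string> input, kernel, output;
};

// Builds the result expressions for a convolution of the given rank.
// Dimensions d0..d(rank-1) iterate over the output and dimensions
// d(rank)..d(2*rank-1) iterate over the kernel.
ConvMaps convIndexingMaps(unsigned rank) {
  assert(rank > 0 && "expected a convolution of rank at least 1");
  ConvMaps maps;
  for (unsigned i = 0; i < rank; ++i) {
    std::string m = "d" + std::to_string(i);        // Output iterator.
    std::string n = "d" + std::to_string(rank + i); // Kernel iterator.
    maps.input.push_back(m + " + " + n);
    maps.kernel.push_back(n);
    maps.output.push_back(m);
  }
  return maps;
}
```

Calling `convIndexingMaps(2)` gives the input expressions `d0 + d2` and `d1 + d3`, the kernel expressions `d2` and `d3`, and the output expressions `d0` and `d1`.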

10/12 comments resolved

Harbormaster failed remote builds in B64548: Diff 278520! Jul 16 2020, 9:34 AM

ftynse added inline comments. Jul 16 2020, 9:59 AM

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
355	I think @bondhugula meant function-level comments.
359	All comments must use proper English grammar, with capitalization and periods.
372	Nit: this comment would make more sense above the first line.
mlir/test/Dialect/Linalg/loops.mlir
1303–1311	This currently generates out-of-bounds accesses. Please also update the loop bound generator to either (a) handle the access expressions properly or (b) return an error in presence of such expressions. (b) is simpler for this commit, but we will ultimately want (a). Your choice.

Could you please rewrite the commit summary? The current one isn't really one - that's more of a plan.

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
368–373	If that's still there, how is Conv2D different from that? Isn't linalg.conv the Conv2D here?

This revision now requires changes to proceed. Jul 16 2020, 12:16 PM

evolve

Harbormaster failed remote builds in B65035: Diff 279432! Jul 20 2020, 11:44 PM

Most of the comments addressed

mlir/test/Dialect/Linalg/loops.mlir
1303–1311	Here I assume that the input array is padded and the output one is not. Should I do it another way?

limo1996 marked 2 inline comments as done. Jul 21 2020, 2:49 AM

limo1996 added inline comments.

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
368–373	There is a new approach to convolution where we will have named ops for each rank so we can optimize better for each case. For more information you can reach out to @nicolasvasilache

Harbormaster failed remote builds in B65048: Diff 279458! Jul 21 2020, 2:50 AM

ftynse added inline comments. Jul 21 2020, 3:33 AM

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
1129	I think this is sufficiently clear to avoid additional comments in the argument list. If you want to keep a comment above the expression, please use proper English grammar with capitalization and punctuation. I already mentioned this guideline in this diff, it applies to _all_ comments.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
368–373	We expect plain linalg.conv to be progressively retired. It was mimicking TF convolution, which isn't necessarily generic or transformation-friendly enough. This will happen after we reach feature parity.
mlir/test/Dialect/Linalg/loops.mlir
18	Please avoid spurious whitespace changes.
1303–1311	This assumption should be explicit in the op documentation (in ODS). Ideally, we should generate `std.assert` on sizes now that we have that operation, but it can be done consistently for all of Linalg, in a separate commit.

Documentation of ConvOpBase now includes a comment that input buffers need to be padded
+ other minor changes

mlir/test/Dialect/Linalg/loops.mlir
18	It is not spurious. It is intended, as I think an empty line after 5 lines of code increases readability. Actually I am just following the pattern set above (every sixth line is empty).
1303–1311	Ok. It's mentioned in ODS documentation. We can add `std.assert` to our TODO list.

Harbormaster failed remote builds in B65077: Diff 279508! Jul 21 2020, 6:45 AM

Loop type of kernel iterators changed from parallel to reduction as those cannot be parallelized.
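
To illustrate why the kernel loops are reductions, here is a plain C++ sketch of the 1-D case. It is not the code generated by Linalg; the function name `conv1D` and the use of `std::vector<float>` are assumptions made for the example. Every iteration of the `n` loop accumulates into the same `out[m]`, so only the `m` loop can run in parallel without extra synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference 1-D convolution following O[m] += I[m + n] * K[n].
// `out` is accumulated into, so the caller is expected to initialize it.
void conv1D(const std::vector<float> &in, const std::vector<float> &kernel,
            std::vector<float> &out) {
  if (out.empty() || kernel.empty())
    return;
  // The input must cover the largest access, (out - 1) + (kernel - 1).
  assert(in.size() + 1 >= out.size() + kernel.size() &&
         "input is too small for the given output and kernel");
  for (std::size_t m = 0; m < out.size(); ++m)      // "parallel" iterator
    for (std::size_t n = 0; n < kernel.size(); ++n) // "reduction" iterator
      out[m] += in[m + n] * kernel[n];
}
```

For example, an input of `{1, 2, 3, 4}` with kernel `{1, 1}` and a zero-initialized output of size 3 produces `{3, 5, 7}`.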

Harbormaster failed remote builds in B65087: Diff 279524! Jul 21 2020, 8:05 AM

Please update the diff/commit description as requested before. It should explain why you are doing the change, https://mlir.llvm.org/getting_started/Contributing/#commit-messages.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
227	This sounds overly restrictive in two ways: (1) there will be bad accesses _unless_ the size of the input is greater than or equal to the size of the output + size of the kernel - 1, not just when the size of the input is equal to the size of the output; (2) the behavior is explained in terms of a specific code generation scheme, whereas it should rather say something like "the behavior of the op is undefined if ...".
mlir/test/Dialect/Linalg/loops.mlir
18	It is. It isn't anyhow related to what this diff does, that is "Conv1D, Conv2D and Conv3D added as named ops". Piggy-backing on commits to do irrelevant changes is bad practice that makes it harder to track changes in the future (irrelevant commits show up in blame). If you want to make a non-functional cleanup change, submit a separate diff with a proper description and put "NFC" in the name to simplify the review. Speaking about readability, the empty lines between the blocks seem to be attributable to https://github.com/llvm/llvm-project/commit/e36337a998a6be39d65872eab3e3e2291b6518b9, where there was a different number of lines per block. So I doubt the claim that every sixth line is intentionally empty. Even if it had been the case, I would have challenged a rule based on syntax rather than semantics. It makes more sense to separate blocks by semantics (maps with dilation / maps without dilation) than by the number of lines. Your code below has a snippet of 28 CHECK lines without vertical whitespace, why haven't those been split into 6 different blocks then?
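
The size requirement from the comment on line 227 can be written as a small per-dimension check. This is only a sketch of the condition under the padding assumption discussed above; the helper name `convSizesAreInBounds` and the shape-vector representation are hypothetical and do not appear in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if, in every dimension, the input is large enough for the
// accesses I[m + n]: input >= output + kernel - 1. Otherwise the largest
// index, (output - 1) + (kernel - 1), would fall outside the input.
bool convSizesAreInBounds(const std::vector<std::int64_t> &input,
                          const std::vector<std::int64_t> &kernel,
                          const std::vector<std::int64_t> &output) {
  assert(input.size() == kernel.size() && kernel.size() == output.size() &&
         "all operands must have the same rank");
  for (std::size_t d = 0; d < input.size(); ++d)
    if (input[d] < output[d] + kernel[d] - 1)
      return false;
  return true;
}
```

A 6x6 input with a 3x3 kernel and a 4x4 output satisfies the condition, while a 6x5 input with the same kernel and output does not. An input larger than strictly needed is also accepted, which matches the "greater than or equal" wording.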

This revision now requires changes to proceed. Jul 22 2020, 4:53 AM

evolve

Harbormaster failed remote builds in B65210: Diff 279769! Jul 22 2020, 4:58 AM

limo1996 added a child revision: D84317: [mlir] Added verification check for linalg.conv to ensure memrefs are of rank > 2. Jul 22 2020, 6:04 AM

ftynse removed a child revision: D84317: [mlir] Added verification check for linalg.conv to ensure memrefs are of rank > 2. Jul 22 2020, 6:22 AM

evolve & merge

limo1996 marked 3 inline comments as done. Jul 22 2020, 7:36 AM

limo1996 added inline comments.

mlir/test/Dialect/Linalg/loops.mlir
18	You are right. Sorry

Harbormaster failed remote builds in B65233: Diff 279825! Jul 22 2020, 7:37 AM

comment regarding input sizes done

Harbormaster failed remote builds in B65234: Diff 279827! Jul 22 2020, 7:44 AM

bondhugula added inline comments. Jul 22 2020, 11:38 AM

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
368–373	"We expect plain linalg.conv to be progressively retired. It was mimicking TF…"
Please add a comment on this. Ideally, this should be mentioned in the commit summary or else it wouldn't be clear to those outside of your internal discussion.
"There is a new approach to convolution where we will have named ops for each rank so we can optimize better for each case. For more information you can reach out to…"
Yes, but I was asking what would happen to the current linalg.conv which is equivalent to this conv2d.

bondhugula resigned from this revision. Jul 22 2020, 11:38 AM

Small refactoring

Harbormaster failed remote builds in B65504: Diff 280348! Jul 24 2020, 1:01 AM

limo1996 edited the summary of this revision. Jul 24 2020, 1:01 AM

limo1996 marked an inline comment as done.

limo1996 added a child revision: D84628: [mlir][Linalg] Conv {1,2,3}D ops defined with TC syntax. Jul 27 2020, 3:12 AM

Computation of indexes split into multiple statements to make Windows tests (hopefully) pass

Harbormaster failed remote builds in B65843: Diff 280939! Jul 27 2020, 10:10 AM

ftynse accepted this revision. Jul 28 2020, 5:45 AM

This revision is now accepted and ready to land. Jul 28 2020, 5:45 AM

Could you please rebase?

In D83879#2178481, @ftynse wrote:

Could you please rebase?

Done

rebased

Harbormaster completed remote builds in B66203: Diff 281571. Jul 29 2020, 7:18 AM

This revision was landed with ongoing or failed builds. Jul 29 2020, 7:40 AM

Closed by commit rG1aaf8aa53d69: [mlir][Linalg] Conv1D, Conv2D and Conv3D added as named ops (authored by limo1996, committed by ftynse).

This revision was automatically updated to reflect the committed changes.

ftynse added a commit: rG1aaf8aa53d69: [mlir][Linalg] Conv1D, Conv2D and Conv3D added as named ops.

ftynse removed a child revision: D84628: [mlir][Linalg] Conv {1,2,3}D ops defined with TC syntax. Jul 30 2020, 9:35 AM

ftynse mentioned this in D84628: [mlir][Linalg] Conv {1,2,3}D ops defined with TC syntax. Jul 31 2020, 12:45 AM

Revision Contents

Path                                                          Size
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h               8 lines
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td    125 lines
mlir/lib/Conversion/LinalgToStandard/LinalgToStandard.cpp     5 lines
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp                      44 lines
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp                  55 lines
mlir/test/Dialect/Linalg/invalid.mlir                         8 lines
mlir/test/Dialect/Linalg/loops.mlir                           153 lines

Diff 281581

mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h

[79 lines not shown]
 /// symbol-less identity map of `rank`.
 AffineMap extractOrIdentityMap(Optional<AffineMap> maybeMap, unsigned rank,
                                MLIRContext *context);

 /// Return the vector that is the concatenation of `a` and `b`.
 SmallVector<AffineExpr, 4> concat(ArrayRef<AffineExpr> a,
                                   ArrayRef<AffineExpr> b);

+/// Generates indexing maps for convolution with the following structure:
+/// input:  (m_1, ..., m_r, n_1, ..., n_r) -> (m_1 + n_1, ..., m_r + n_r)
+/// kernel: (m_1, ..., m_r, n_1, ..., n_r) -> (n_1, ..., n_r)
+/// output: (m_1, ..., m_r, n_1, ..., n_r) -> (m_1, ..., m_r)
+/// where r is the rank of the input, kernel and output
+llvm::Optional<SmallVector<AffineMap, 8>>
+createConvNDIndexingMaps(MLIRContext *context, unsigned rank);
+
 #include "mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterfaces.h.inc"

 #define GET_OP_CLASSES
 #include "mlir/Dialect/Linalg/IR/LinalgOps.h.inc"

 #define GET_OP_CLASSES
 #include "mlir/Dialect/Linalg/IR/LinalgStructuredOps.h.inc"

 } // namespace linalg
 } // namespace mlir

 #endif // MLIR_DIALECT_LINALG_LINALGOPS_H_

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

[174 lines not shown; enclosing context: let extraClassDeclaration = libraryCallName # [{]
     }
   }];

   let verifier = [{ return ::verify(*this); }];

   let hasFolder = 1;
 }

+class ConvOpBase<string mnemonic, int N>
+  : LinalgStructured_Op<mnemonic, [NInputs<2>, NOutputs<1>]> {
+  let description = [{
+    Base operation for any N-D Convolution implemented as a linalg.generic op.
+
+    Usage:
+
+    ```mlir
+    linalg.conv<N>D(%in, %filter, %out) : memref<(?x)+f32>,
+                                          memref<(?x)+f32>,
+                                          memref<(?x)+f32>
+    ```
+
+    where %in:     input array
+          %filter: kernel or filter that will be applied on the input array
+          %out:    output array
+
+    and rank of the operands is N.
+
+    Every child convolution is expressed as:
+
+    ```mlir
+    #conv_trait = {
+      args_in = 2,
+      args_out = 1,
+      indexing_maps = #conv_accesses,
+      library_call = "linalg_conv",
+      iterator_types = [("parallel", "parallel")+], // `2 * rank` iterators
+    }
+
+    linalg.generic #conv_trait %in, %filter, %out {
+      ^bb0(%a: f32, %b: f32, %c: f32) :
+        %d = mulf %a, %b : f32
+        %e = addf %c, %d : f32
+        linalg.yield %e : f32
+    } : memref<(?x)+f32>,
+        memref<(?x)+f32>,
+        memref<(?x)+f32>
+    ```
+
+    where #conv_accesses depend on the rank of the operands and thus
+    can be found in the documentation of each N-D case.
+    Please note that the input array is expected to be right-padded i.e.
+    the size of the input is greater than or equal to the size of the output
+    + size of the kernel - 1. If it is not padded the behavior of the op
+    is undefined.
+  }];
+
+  let arguments = (ins AnyStridedMemRefOfRank<N>,
+                       AnyStridedMemRefOfRank<N>,
+                       AnyStridedMemRefOfRank<N>);
+
+  let extraClassDeclaration = libraryCallName # [{
+    llvm::Optional<SmallVector<StringRef, 8>> referenceIterators() {
+      // There are always 2 loops for each dimension of the convolution. First
+      // iterates output and second kernel. Since ranks of all 3 operands must
+      // be the same it does not matter which operand is picked to get the rank.
+      // Loops iterating the output can be parallelized and thus are marked as
+      // "parallel" while loops iterating the kernel are accumulating the
+      // products and therefore are marked as "reduction".
+      unsigned rank = getInputShapedType(0).getRank();
+      SmallVector<StringRef, 8> parallel(rank, getParallelIteratorTypeName());
+      SmallVector<StringRef, 8> reduction(rank, getReductionIteratorTypeName());
+      parallel.insert(parallel.end(), reduction.begin(), reduction.end());
+      return parallel;
+    }
+
+    // Generates indexing maps with the following structure:
+    // input:  (m_1, ..., m_r, n_1, ..., n_r) -> (m_1 + n_1, ..., m_r + n_r)
+    // kernel: (m_1, ..., m_r, n_1, ..., n_r) -> (n_1, ..., n_r)
+    // output: (m_1, ..., m_r, n_1, ..., n_r) -> (m_1, ..., m_r)
+    // where r is the rank of the input, kernel and output
+    llvm::Optional<SmallVector<AffineMap, 8>> referenceIndexingMaps() {
+      MLIRContext *context = getContext();
+      unsigned rank = getInputShapedType(0).getRank();
+      return createConvNDIndexingMaps(context, rank);
+    }
+  }];
+
+  let hasFolder = 1;
+  let verifier = [{ return ::verify(*this); }];
+}
+
+def Conv1DOp : ConvOpBase<"conv1D", 1> {
+  let description = [{
+    1D convolution which uses following affine maps to access operands:
+
+    ```mlir
+    #conv_accesses = [
+      affine_map<(m, n) -> (m + n)>, // in
+      affine_map<(m, n) -> (n)>, // kernel
+      affine_map<(m, n) -> (m)> // out
+    ]
+    ```
+  }];
+}
+
+def Conv2DOp : ConvOpBase<"conv2D", 2> {
+  let description = [{
+    2D convolution which uses following affine maps to access operands:
+
+    ```mlir
+    #conv_accesses = [
+      affine_map<(m1, m2, n1, n2) -> (m1 + n1, m2 + n2)>, // in
+      affine_map<(m1, m2, n1, n2) -> (n1, n2)>, // kernel
+      affine_map<(m1, m2, n1, n2) -> (m1, m2)> // out
+    ]
+    ```
+  }];
+}
+
+def Conv3DOp : ConvOpBase<"conv3D", 3> {
+  let description = [{
+    3D convolution which uses following affine maps to access operands:
+
+    ```mlir
+    #conv_accesses = [
+      affine_map<(m1, m2, m3, n1, n2, n3) -> (m1 + n1, m2 + n2, m3 + n3)>, // in
+      affine_map<(m1, m2, m3, n1, n2, n3) -> (n1, n2, n3)>, // kernel
+      affine_map<(m1, m2, m3, n1, n2, n3) -> (m1, m2, m3)> // out
+    ]
+    ```
+  }];
+}
+
 /// A base class for pooling operation such as conv. The arguments must contain
 /// optional arguments `strides`, `dilations` and `padding` with following type:
 ///   OptionalAttr<I64ArrayAttr>:$strides
 ///   OptionalAttr<I64ArrayAttr>:$dilations
 ///   OptionalAttr<I64ElementsAttr>:$padding
 /// `strides` denotes the step of each window along the dimension.
 class PoolingBase_Op<string mnemonic, list<OpTrait> props>
   : LinalgStructured_Op<mnemonic, props> {
[636 lines not shown]

mlir/lib/Conversion/LinalgToStandard/LinalgToStandard.cpp

[229 lines not shown; enclosing context: void mlir::populateLinalgToStandardConversionPatterns(]
   // attribute values such as kernel striding and dilation.
   // clang-format off
   patterns.insert<
       CopyTransposeConversion,
       LinalgOpConversion<ConvOp>,
       LinalgOpConversion<PoolingMaxOp>,
       LinalgOpConversion<PoolingMinOp>,
       LinalgOpConversion<PoolingSumOp>,
       LinalgOpConversion<CopyOp>,
+      LinalgOpConversion<Conv1DOp>,
+      LinalgOpConversion<Conv2DOp>,
+      LinalgOpConversion<Conv3DOp>,
       LinalgOpConversion<FillOp>,
       LinalgOpConversion<GenericOp>,
       LinalgOpConversion<IndexedGenericOp>>(ctx);
   // TODO: collect all auto-generated named ops with a tblgen directive.
   patterns.insert<
       LinalgOpConversion<DotOp>,
       LinalgOpConversion<BatchMatmulOp>,
       LinalgOpConversion<MatvecOp>,
[27 lines not shown]

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

[980 lines not shown; enclosing context: static LogicalResult verifyStrideOrDilation(LinalgPoolingOp op,]
   if (attrs.size() != op.getNumWindowLoops())
     return op.emitOpError("expects num ")
            << strideOrDilation
            << "s equal to number of window dimensions: " << attrs.size()
            << " vs " << op.getNumWindowLoops();
   return success();
 }

+template <typename ConvNDOp>
+static LogicalResult verify(ConvNDOp op) {
+  auto outputType = op.getOutputShapedType(0).getElementType();
+  auto inputType = op.getInputShapedType(0).getElementType();
+  auto kernelType = op.getInputShapedType(1).getElementType();
+  if (outputType != inputType || inputType != kernelType)
+    return op.emitOpError("expected all element types of operands to match");
+
+  return success();
+}
+
 static LogicalResult verify(ConvOp op) {
   auto oType = op.output().getType().cast<MemRefType>();
   auto fType = op.filter().getType().cast<MemRefType>();
   auto iType = op.input().getType().cast<MemRefType>();
   if (oType.getElementType() != iType.getElementType() ||
       oType.getElementType() != fType.getElementType())
     return op.emitOpError("expects memref elemental types to match");
   if (oType.getRank() != iType.getRank() || oType.getRank() != fType.getRank())
[94 lines not shown; enclosing context: for (unsigned i = 0, e = outputDims.size(); i < e; ++i) {]
     // TODO: add a level of indirection to linalg.generic.
     auto expr = op.getStride(i) * outputDims[i] +
                 op.getDilation(i) * windowDims[i] - op.getLowPad(i);
     res.push_back(expr);
   }
   return res;
 }

+llvm::Optional<SmallVector<AffineMap, 8>>
+mlir::linalg::createConvNDIndexingMaps(MLIRContext *context, unsigned rank) {
+  unsigned numDims = rank * 2, idx = 0;
+
+  SmallVector<AffineExpr, 8> dims, in, kernel, out;
+  dims = makeAffineDimExprs(numDims, idx, context);
+  in.reserve(rank);
+  kernel.reserve(rank);
+  out.reserve(rank);
+
+  for (unsigned i = 0; i < rank; i++) {
+    in.push_back(dims[i] + dims[rank + i]);
+    kernel.push_back(dims[rank + i]);
+    out.push_back(dims[i]);
+  }
+
+  return SmallVector<AffineMap, 8>{AffineMap::get(numDims, 0, in, context),
+                                   AffineMap::get(numDims, 0, kernel, context),
+                                   AffineMap::get(numDims, 0, out, context)};
+}
+
 #define INSTANTIATE_WEIGHTED_POOLING_INPUT_INDEX(OP_TYPE) \
   template SmallVector<AffineExpr, 4> \
   mlir::linalg::weightedPoolingInputIndex<OP_TYPE>( \
       OP_TYPE op, ArrayRef<AffineExpr> outputDims, \
       ArrayRef<AffineExpr> windowDims);

 INSTANTIATE_WEIGHTED_POOLING_INPUT_INDEX(ConvOp)
 INSTANTIATE_WEIGHTED_POOLING_INPUT_INDEX(PoolingMaxOp)
[65 lines not shown]
 LogicalResult CopyOp::fold(ArrayRef<Attribute>,
                            SmallVectorImpl<OpFoldResult> &) {
   return foldMemRefCast(*this);
 }
 LogicalResult FillOp::fold(ArrayRef<Attribute>,
                            SmallVectorImpl<OpFoldResult> &) {
   return foldMemRefCast(*this);
 }
+LogicalResult Conv1DOp::fold(ArrayRef<Attribute>,
+                             SmallVectorImpl<OpFoldResult> &) {
+  return foldMemRefCast(*this);
+}
+LogicalResult Conv2DOp::fold(ArrayRef<Attribute>,
+                             SmallVectorImpl<OpFoldResult> &) {
+  return foldMemRefCast(*this);
+}
+LogicalResult Conv3DOp::fold(ArrayRef<Attribute>,
+                             SmallVectorImpl<OpFoldResult> &) {
+  return foldMemRefCast(*this);
+}
 LogicalResult GenericOp::fold(ArrayRef<Attribute>,
                               SmallVectorImpl<OpFoldResult> &) {
   return foldMemRefCast(*this);
 }
 LogicalResult IndexedGenericOp::fold(ArrayRef<Attribute>,
                                      SmallVectorImpl<OpFoldResult> &) {
   return foldMemRefCast(*this);
 }
[133 lines not shown]

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp

Show First 20 Lines • Show All 289 Lines • ▼ Show 20 Lines	void emitScalarImplementation(ArrayRef<Value> allIvs, FillOp fillOp) {
assert(nPar == allIvs.size());		assert(nPar == allIvs.size());
auto ivs = SmallVector<Value, 4>(allIvs.begin(), allIvs.begin() + nPar);		auto ivs = SmallVector<Value, 4>(allIvs.begin(), allIvs.begin() + nPar);
IndexedValueType O(fillOp.getOutputBuffer(0));		IndexedValueType O(fillOp.getOutputBuffer(0));
// Emit the proper scalar assignment, whether we are dealing with a 0-D or		// Emit the proper scalar assignment, whether we are dealing with a 0-D or
// an n-D loop nest; with or without permutations.		// an n-D loop nest; with or without permutations.
nPar > 0 ? O(ivs) = fillOp.value() : O() = fillOp.value();		nPar > 0 ? O(ivs) = fillOp.value() : O() = fillOp.value();
}		}

		/// Following functions emit scalar part of the N-D convolution op.
		/// N-D convolution has 2N loops:
		/// 1-N: Iterate over the output array O with iterators m1, ..., mN.
		/// N-2N:. Iterate over the kernel K with iterators n1, ..., nN.
		///
		/// The scalar part accumulates products of input array I values with kernel
		/// ones. The accumulation expression therefore looks like:
		/// O[m1, ..., mN] += I[m1 + n1, ..., mN + nN] * K[n1, ..., nN].
		/// Note that the input array has to be padded in order to prevent
		/// out of bounds accesses.
		template <typename IndexedValueType>
		void emitScalarImplementation(ArrayRef<Value> allIvs, Conv1DOp convOp) {
		assert(convOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(allIvs.size() == 2);
		Value m1(allIvs[0]);
		Value n1(allIvs[1]);
		IndexedValueType I(convOp.getInput(0)), K(convOp.getInput(1)),
		O(convOp.getOutputBuffer(0));
		// Emit scalar form for the 1D conv case.
		Value i1 = m1 + n1;
		O(m1) = O(m1) + I(i1) * K(n1);
		}

		template <typename IndexedValueType>
		void emitScalarImplementation(ArrayRef<Value> allIvs, Conv2DOp convOp) {
		assert(convOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(allIvs.size() == 4);
		Value m1(allIvs[0]), m2(allIvs[1]);
		Value n1(allIvs[2]), n2(allIvs[3]);
		IndexedValueType I(convOp.getInput(0)), K(convOp.getInput(1)),
		O(convOp.getOutputBuffer(0));
		// Emit scalar form for the 2D conv case.
		Value i1 = m1 + n1;
		Value i2 = m2 + n2;
		O(m1, m2) = O(m1, m2) + I(i1, i2) * K(n1, n2);
		}

		template <typename IndexedValueType>
		void emitScalarImplementation(ArrayRef<Value> allIvs, Conv3DOp convOp) {
		assert(convOp.hasBufferSemantics() &&
		"expected linalg op with buffer semantics");
		assert(allIvs.size() == 6);
		Value m1(allIvs[0]), m2(allIvs[1]), m3(allIvs[2]);
		Value n1(allIvs[3]), n2(allIvs[4]), n3(allIvs[5]);
		IndexedValueType I(convOp.getInput(0)), K(convOp.getInput(1)),
		O(convOp.getOutputBuffer(0));
		// Emit scalar form for the 3D conv case.
		Value i1 = m1 + n1;
		Value i2 = m2 + n2;
		Value i3 = m3 + n3;
		O(m1, m2, m3) = O(m1, m2, m3) + I(i1, i2, i3) * K(n1, n2, n3);
		}

template <typename IndexedValueType>		template <typename IndexedValueType>
Value getConvOpInput(ConvOp convOp, StdIndexedValue im,		Value getConvOpInput(ConvOp convOp, StdIndexedValue im,
MutableArrayRef<Value> imIdx) {		MutableArrayRef<Value> imIdx) {
		bondhugulaUnsubmitted Done Reply Inline Actions These are all missing doc comments - they do need at least a couple of lines however obvious it might appear to the authors now. bondhugula: These are all missing doc comments - they do need at least a couple of lines however obvious it…
		ftynseUnsubmitted Done Reply Inline Actions I think @bondhugula meant function-level comments. ftynse: I think @bondhugula meant function-level comments.
// TODO: add a level of indirection to linalg.generic.		// TODO: add a level of indirection to linalg.generic.
if (!convOp.padding())		if (!convOp.padding())
return im(imIdx);		return im(imIdx);

		ftynse: All comments must use proper English grammar, with capitalization and periods.
auto *context = ScopedContext::getContext();		auto *context = ScopedContext::getContext();
Value zeroIndex = std_constant_index(0);		Value zeroIndex = std_constant_index(0);
SmallVector<Value, 8> conds;		SmallVector<Value, 8> conds;
SmallVector<Value, 8> clampedImIdx;		SmallVector<Value, 8> clampedImIdx;
for (auto iter : llvm::enumerate(imIdx)) {		for (auto iter : llvm::enumerate(imIdx)) {
int idx = iter.index();		int idx = iter.index();
auto dim = iter.value();		auto dim = iter.value();
// Only need to iterate over the window dimensions.		// Only need to iterate over the window dimensions.
if (idx == 0 || idx == static_cast<int>(imIdx.size()) - 1) {		if (idx == 0 || idx == static_cast<int>(imIdx.size()) - 1) {
clampedImIdx.push_back(dim);		clampedImIdx.push_back(dim);
continue;		continue;
}		}

		ftynse: Nit: this comment would make more sense above the first line.
using edsc::op::sge;		using edsc::op::sge;
		bondhugula: I'm missing something here. Is this diff correctly generated? Where did the existing `emitScalarImplementation` of `linalg.conv` go?
		ftynse: I see it at line 390....
		bondhugula: If that's still there, how is Conv2D different from that? Isn't linalg.conv the Conv2D here?
		limo1996 (author): There is a new approach to convolution where we will have named ops for each rank so we can optimize better for each case. For more information you can reach out to @nicolasvasilache
		ftynse: We expect plain linalg.conv to be progressively retired. It was mimicking TF convolution, which isn't necessarily generic or transformation-friendly enough. This will happen after we reach feature parity.
		bondhugula:
		> We expect plain linalg.conv to be progressively retired. It was mimicking TF
		Please add a comment on this. Ideally, this should be mentioned in the commit summary, or else it wouldn't be clear to those outside of your internal discussion.
		> There is a new approach to convolution where we will have named ops for each rank so we can optimize better for each case. For more information you can reach out to
		Yes, but I was asking what would happen to the current linalg.conv which is equivalent to this conv2d.
using edsc::op::slt;		using edsc::op::slt;
using edsc::op::operator||;		using edsc::op::operator||;
Value leftOutOfBound = slt(dim, zeroIndex);		Value leftOutOfBound = slt(dim, zeroIndex);
if (conds.empty())		if (conds.empty())
conds.push_back(leftOutOfBound);		conds.push_back(leftOutOfBound);
else		else
conds.push_back(conds.back() || leftOutOfBound);		conds.push_back(conds.back() || leftOutOfBound);
Value rightBound = std_dim(convOp.input(), idx);		Value rightBound = std_dim(convOp.input(), idx);
conds.push_back(conds.back() || (sge(dim, rightBound)));		conds.push_back(conds.back() || (sge(dim, rightBound)));

// When padding is involved, the indices will only be shifted to negative,		// When padding is involved, the indices will only be shifted to negative,
// so having a max op is enough.		// so having a max op is enough.
auto maxMap = AffineMap::get(/*dimCount=*/1, 0,		auto maxMap = AffineMap::get(/*dimCount=*/1, 0,
{getAffineDimExpr(/*position=*/0, context),		{getAffineDimExpr(/*position=*/0, context),
		ftynse: Can this at least use a consistent naming scheme: m1,m2,m3?
getAffineConstantExpr(0, context)},		getAffineConstantExpr(0, context)},
context);		context);
clampedImIdx.push_back(affine_max(dim.getType(), maxMap, ValueRange{dim}));		clampedImIdx.push_back(affine_max(dim.getType(), maxMap, ValueRange{dim}));
}		}

auto &b = ScopedContext::getBuilderRef();		auto &b = ScopedContext::getBuilderRef();
Type type = convOp.input().getType().cast<MemRefType>().getElementType();		Type type = convOp.input().getType().cast<MemRefType>().getElementType();
Value zero = std_constant(type, b.getZeroAttr(type));		Value zero = std_constant(type, b.getZeroAttr(type));
[... 385 lines not shown ...]
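
The visible part of `getConvOpInput` builds two things per window dimension: an out-of-bounds condition (`dim < 0` or `dim >= rightBound`) and an index clamped from below with `affine_max(dim, 0)`. The rest of the function is not shown here, so the sketch below is only an assumed scalar analogue of that pattern for a single dimension: it yields zero for padded positions and the stored value otherwise. The name `readPadded` is hypothetical, and unlike IR that loads unconditionally and then selects, this version branches so that the C++ never reads outside the vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar analogue of a padded read along one window dimension.
float readPadded(const std::vector<float> &input, std::int64_t idx) {
  const auto size = static_cast<std::int64_t>(input.size());
  // Mirrors the accumulated condition: left out of bound OR right out of bound.
  const bool outOfBounds = idx < 0 || idx >= size;
  // Mirrors affine_max(dim, 0): negative indices are clamped to zero.
  const std::int64_t clamped = std::max<std::int64_t>(idx, 0);
  // Padded positions contribute the zero value of the element type.
  if (outOfBounds)
    return 0.0f;
  return input[static_cast<std::size_t>(clamped)];
}
```

For example, with a three-element input, `readPadded(v, -1)` and `readPadded(v, 3)` both yield zero, while indices 0 through 2 return the stored elements.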

mlir/test/Dialect/Linalg/invalid.mlir

	[... 501 lines not shown ...]

	// -----			// -----

	func @named_ops(%a3: memref<?x?x?xf32>, %b3: memref<?x?xf32>, %c3: memref<?x?x?xf32>) {			func @named_ops(%a3: memref<?x?x?xf32>, %b3: memref<?x?xf32>, %c3: memref<?x?x?xf32>) {
	// expected-error @+1 {{op expected indexing_map #1 results to match view rank: 'memref<?x?xf32>'}}			// expected-error @+1 {{op expected indexing_map #1 results to match view rank: 'memref<?x?xf32>'}}
	linalg.batch_matmul %a3, %b3, %c3 : (memref<?x?x?xf32>, memref<?x?xf32>, memref<?x?x?xf32>) -> ()			linalg.batch_matmul %a3, %b3, %c3 : (memref<?x?x?xf32>, memref<?x?xf32>, memref<?x?x?xf32>) -> ()
	return			return
	}			}

				// -----

				func @conv_type_mismatch(%in: memref<?xi32>, %filter: memref<?xf32>, %out: memref<?xf32>) {
				// expected-error @+1 {{expected all element types of operands to match}}
				linalg.conv1D(%in, %filter, %out) : memref<?xi32>, memref<?xf32>, memref<?xf32>
				return
				}

mlir/test/Dialect/Linalg/loops.mlir

	[... 9 lines not shown ...]
	// CHECKLOOP-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>			// CHECKLOOP-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>
	// CHECKLOOP-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>			// CHECKLOOP-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>

	// CHECKLOOP-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>			// CHECKLOOP-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
	// CHECKLOOP-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>			// CHECKLOOP-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>
	// CHECKLOOP-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>			// CHECKLOOP-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>
	// CHECKLOOP-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>			// CHECKLOOP-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>
	// CHECKLOOP-DAG: #[[$convLowerBound:.*]] = affine_map<()[s0] -> (s0 floordiv 2)>			// CHECKLOOP-DAG: #[[$convLowerBound:.*]] = affine_map<()[s0] -> (s0 floordiv 2)>
	// CHECKLOOP-DAG: #[[$convUpperBound:.*]] = affine_map<()[s0, s1] -> (s1 + s0 floordiv 2 - s0 + 1)>			// CHECKLOOP-DAG: #[[$convUpperBound:.*]] = affine_map<()[s0, s1] -> (s1 + s0 floordiv 2 - s0 + 1)>
				ftynse: Please avoid spurious whitespace changes.
				limo1996 (author): It is not spurious. It is intended, as I think an empty line after 5 lines of code increases readability. Actually I am just following the pattern set above (every sixth line is empty).
				ftynse: It is. It isn't in any way related to what this diff does, that is "Conv1D, Conv2D and Conv3D added as named ops". Piggy-backing on commits to do irrelevant changes is bad practice that makes it harder to track changes in the future (irrelevant commits show up in blame). If you want to make a non-functional cleanup change, submit a separate diff with a proper description and put "NFC" in the name to simplify the review.
				Speaking about readability, the empty lines between the blocks seem to be attributable to https://github.com/llvm/llvm-project/commit/e36337a998a6be39d65872eab3e3e2291b6518b9, where there was a different number of lines per block. So I doubt the claim that every sixth line is intentionally empty. Even if it had been the case, I would have challenged a rule based on syntax rather than semantics. It makes more sense to separate blocks by semantics (maps with dilation / maps without dilation) than by the number of lines. Your code below has a snippet of 28 CHECK lines without vertical whitespace; why haven't those been split into 6 different blocks then?
				limo1996 (author): You are right. Sorry.
	// CHECKLOOP-DAG: #[[$convMap:.*]] = affine_map<(d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2)>			// CHECKLOOP-DAG: #[[$convMap:.*]] = affine_map<(d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2)>

	// CHECKPARALLEL-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>			// CHECKPARALLEL-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>
	// CHECKPARALLEL-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>			// CHECKPARALLEL-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>
	// CHECKPARALLEL-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECKPARALLEL-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>
	// CHECKPARALLEL-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>			// CHECKPARALLEL-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>
	// CHECKPARALLEL-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>			// CHECKPARALLEL-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>
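
The `$strideXDilationY` maps checked above are named after the stride and dilation they encode: for an output position `d0` and a kernel position `d1`, the input index is `d0 * stride + d1 * dilation`. The new `conv1D`/`conv2D`/`conv3D` tests only use `$stride1Dilation1`, that is `d0 + d1`. A minimal sketch of that arithmetic, where the helper name `convInputIndex` is an assumption for illustration and not something defined by the patch:

```cpp
#include <cstdint>

// Hypothetical helper: input index read for output position d0 and kernel
// position d1, matching affine_map<(d0, d1) -> (d0 * stride + d1 * dilation)>.
constexpr std::int64_t convInputIndex(std::int64_t d0, std::int64_t d1,
                                      std::int64_t stride,
                                      std::int64_t dilation) {
  return d0 * stride + d1 * dilation;
}

// With stride 1 and dilation 1 the map degenerates to d0 + d1.
static_assert(convInputIndex(3, 2, 1, 1) == 5, "stride1Dilation1 is d0 + d1");
```

With stride 2 and dilation 4, output position 3 and kernel position 2 read input index `3 * 2 + 2 * 4 = 14`.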

	[... 1,254 lines not shown ...]
	// CHECKPARALLEL: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim10]]]			// CHECKPARALLEL: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim10]]]
	// CHECKPARALLEL: %[[aff4:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim11]]]			// CHECKPARALLEL: %[[aff4:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim11]]]
	// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]], %[[aff4]]] : memref<?x?x?x?xf32>			// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]], %[[aff4]]] : memref<?x?x?x?xf32>
	// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[i4]], %[[i5]], %[[i6]], %[[i7]]] : memref<?x?x?x?xf32>			// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[i4]], %[[i5]], %[[i6]], %[[i7]]] : memref<?x?x?x?xf32>
	// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>			// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>
	// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32			// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
	// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32			// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
	// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>			// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>

				func @conv1d_no_symbols(%in : memref<?xf32>, %filter : memref<?xf32>, %out : memref<?xf32>) -> () {
				linalg.conv1D(%in, %filter, %out) : memref<?xf32>, memref<?xf32>, memref<?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv1d_no_symbols
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg2]], %[[c0]] : memref<?xf32>
				// CHECKLOOP: scf.for %[[b:.*]] = %[[c0]] to %[[dim1]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[m:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKLOOP: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[b]], %[[m]])
				// CHECKLOOP: %[[va:.*]] = load %[[arg1]][%[[m]]] : memref<?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg0]][%[[aff]]] : memref<?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[b]]] : memref<?xf32>
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[b]]] : memref<?xf32>
				ftynse: This currently generates out-of-bounds accesses. Please also update the loop bound generator to either (a) handle the access expressions properly or (b) return an error in presence of such expressions. (b) is simpler for this commit, but we will ultimately want (a). Your choice.
				limo1996 (author): Here I assume that the input array has the padding and the output one does not. Should I do it in another way?
				ftynse: This assumption should be explicit in the op documentation (in ODS). Ideally, we should generate `std.assert` on sizes now that we have that operation, but it can be done consistently for all of Linalg, in a separate commit.
				limo1996 (author): Ok. It's mentioned in ODS documentation. We can add `std.assert` to our TODO list.
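
To make the assumption in this thread concrete: the loops run over the output and kernel sizes and read the input at `b + m`, so along each dimension the largest index read is `(outSize - 1) + (kernelSize - 1)`. The accesses stay in bounds only if the input holds at least `outSize + kernelSize - 1` elements. The sketch below shows the kind of size check a future `std.assert` could express; the name `convSizesInBounds` is hypothetical and the function is an illustration, not part of this patch.

```cpp
#include <cstddef>

// Hypothetical check for one dimension: true when every access I(b + m) with
// b < outSize and m < kernelSize stays inside an input of inSize elements.
constexpr bool convSizesInBounds(std::size_t inSize, std::size_t kernelSize,
                                 std::size_t outSize) {
  // If either loop has zero iterations, no access happens at all.
  if (kernelSize == 0 || outSize == 0)
    return true;
  // The largest index read is (outSize - 1) + (kernelSize - 1).
  return (outSize - 1) + (kernelSize - 1) < inSize;
}
```

For instance, an input of 4 elements with a kernel of 2 supports an output of at most 3 elements.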

				// CHECKPARALLEL-LABEL: @conv1d_no_symbols
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg2]], %[[c0]] : memref<?xf32>
				// CHECKPARALLEL: scf.parallel (%[[b:.*]]) = (%[[c0]]) to (%[[dim1]]) step (%[[c1]]) {
				// CHECKPARALLEL: scf.for %[[m:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKPARALLEL: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[b]], %[[m]])
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg1]][%[[m]]] : memref<?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg0]][%[[aff]]] : memref<?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[b]]] : memref<?xf32>
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[b]]] : memref<?xf32>


				func @conv2d_no_symbols(%in : memref<?x?xf32>, %filter : memref<?x?xf32>, %out : memref<?x?xf32>) -> () {
				linalg.conv2D(%in, %filter, %out) : memref<?x?xf32>, memref<?x?xf32>, memref<?x?xf32>
				return
				}
				// CHECKLOOP-LABEL: @conv2d_no_symbols
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim3:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?xf32>
				// CHECKLOOP: scf.for %[[arg3:.*]] = %[[c0]] to %[[dim2]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg4:.*]] = %[[c0]] to %[[dim3]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg5:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg6:.*]] = %[[c0]] to %[[dim1]] step %[[c1]] {
				// CHECKLOOP: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg3]], %[[arg5]])
				// CHECKLOOP: %[[aff2:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg4]], %[[arg6]])
				// CHECKLOOP: %[[va:.*]] = load %[[arg1]][%[[arg5]], %[[arg6]]] : memref<?x?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg0]][%[[aff]], %[[aff2]]] : memref<?x?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[arg3]], %[[arg4]]] : memref<?x?xf32>
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[arg3]], %[[arg4]]] : memref<?x?xf32>

				// CHECKPARALLEL-LABEL: @conv2d_no_symbols
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim3:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?xf32>
				// CHECKPARALLEL: scf.parallel (%[[arg3:.*]], %[[arg4:.*]]) = (%[[c0]], %[[c0]]) to (%[[dim2]], %[[dim3]]) step (%[[c1]], %[[c1]]) {
				// CHECKPARALLEL: scf.for %[[arg5:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKPARALLEL: scf.for %[[arg6:.*]] = %[[c0]] to %[[dim1]] step %[[c1]] {
				// CHECKPARALLEL: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg3]], %[[arg5]])
				// CHECKPARALLEL: %[[aff2:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg4]], %[[arg6]])
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg1]][%[[arg5]], %[[arg6]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg0]][%[[aff]], %[[aff2]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[arg3]], %[[arg4]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[arg3]], %[[arg4]]] : memref<?x?xf32>


				func @conv3d_no_symbols(%in : memref<?x?x?xf32>, %filter : memref<?x?x?xf32>, %out : memref<?x?x?xf32>) -> () {
				linalg.conv3D(%in, %filter, %out) : memref<?x?x?xf32>, memref<?x?x?xf32>, memref<?x?x?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv3d_no_symbols
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP: %[[c2:.*]] = constant 2 : index
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim3:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim4:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim5:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?xf32>
				// CHECKLOOP: scf.for %[[arg3:.*]] = %[[c0]] to %[[dim3]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg4:.*]] = %[[c0]] to %[[dim4]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg5:.*]] = %[[c0]] to %[[dim5]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg6:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg7:.*]] = %[[c0]] to %[[dim1]] step %[[c1]] {
				// CHECKLOOP: scf.for %[[arg8:.*]] = %[[c0]] to %[[dim2]] step %[[c1]] {
				// CHECKLOOP: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg3]], %[[arg6]])
				// CHECKLOOP: %[[aff2:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg4]], %[[arg7]])
				// CHECKLOOP: %[[aff3:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg5]], %[[arg8]])
				// CHECKLOOP: %[[va:.*]] = load %[[arg1]][%[[arg6]], %[[arg7]], %[[arg8]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg0]][%[[aff]], %[[aff2]], %[[aff3]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[arg3]], %[[arg4]], %[[arg5]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[arg3]], %[[arg4]], %[[arg5]]] : memref<?x?x?xf32>

				// CHECKPARALLEL-LABEL: @conv3d_no_symbols
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL: %[[c2:.*]] = constant 2 : index
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim3:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim4:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim5:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?xf32>
				// CHECKPARALLEL: scf.parallel (%[[arg3:.*]], %[[arg4:.*]], %[[arg5:.*]]) = (%[[c0]], %[[c0]], %[[c0]]) to (%[[dim3]], %[[dim4]], %[[dim5]]) step (%[[c1]], %[[c1]], %[[c1]]) {
				// CHECKPARALLEL: scf.for %[[arg6:.*]] = %[[c0]] to %[[dim0]] step %[[c1]] {
				// CHECKPARALLEL: scf.for %[[arg7:.*]] = %[[c0]] to %[[dim1]] step %[[c1]] {
				// CHECKPARALLEL: scf.for %[[arg8:.*]] = %[[c0]] to %[[dim2]] step %[[c1]] {
				// CHECKPARALLEL: %[[aff:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg3]], %[[arg6]])
				// CHECKPARALLEL: %[[aff2:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg4]], %[[arg7]])
				// CHECKPARALLEL: %[[aff3:.*]] = affine.apply #[[$stride1Dilation1]](%[[arg5]], %[[arg8]])
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg1]][%[[arg6]], %[[arg7]], %[[arg8]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg0]][%[[aff]], %[[aff2]], %[[aff3]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[vb]], %[[va]] : f32
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[arg3]], %[[arg4]], %[[arg5]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[arg3]], %[[arg4]], %[[arg5]]] : memref<?x?x?xf32>

Revision Contents

Diff 281581

mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.h

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

mlir/lib/Conversion/LinalgToStandard/LinalgToStandard.cpp

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp

mlir/test/Dialect/Linalg/invalid.mlir

mlir/test/Dialect/Linalg/loops.mlir
