This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td	4/4
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h	1/1
mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h
mlir/include/mlir/IR/AffineExpr.h
mlir/lib/Dialect/Linalg/EDSC/Builders.cpp
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp	18/18
mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp	7/7
mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp
mlir/lib/IR/AffineExpr.cpp	1/1
mlir/lib/IR/AffineMap.cpp	9/9
mlir/test/Dialect/Linalg/invalid.mlir
mlir/test/Dialect/Linalg/loops.mlir	8/8
mlir/test/lib/Transforms/TestBufferPlacement.cpp

Differential D83158

[mlir] Added support for symbols inside linalg.generic and map concatenation
Closed · Public

Authored by limo1996 on Jul 4 2020, 6:38 AM.

Details

Reviewers

ftynse
nicolasvasilache
rriddle

Commits

rGf9c8febc522c: [mlir] Added support for symbols inside linalg.generic and map concatenation

Summary

This commit adds functionality needed for implementation of convolutions with
linalg.generic op. Since linalg.generic right now expects indexing maps to be
just permutations, offset indexing needed in convolutions is not possible.
Therefore in this commit we address the issue by adding support for symbols inside
indexing maps which enables more advanced indexing. The upcoming commit will
solve the problem of computing loop bounds from such maps.
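
To make the offset indexing concrete, here is a minimal C++ sketch that evaluates the input access `(m, n)[s0] -> (m + n - s0 floordiv 2)` used by the 1-D convolution example in this revision, with `s0` standing for the filter size. It does not use the MLIR API; `floorDiv` and `convInputIndex` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Division rounding toward negative infinity, as `floordiv` does in affine
// expressions. Assumes a positive divisor.
int64_t floorDiv(int64_t lhs, int64_t rhs) {
  assert(rhs > 0 && "divisor must be positive");
  int64_t quotient = lhs / rhs;
  if (lhs % rhs != 0 && lhs < 0)
    --quotient;
  return quotient;
}

// Hypothetical helper: index into the input for output position `m` and
// filter position `n`, where the symbol `s0` is the filter size.
// Mirrors affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>.
int64_t convInputIndex(int64_t m, int64_t n, int64_t s0) {
  return m + n - floorDiv(s0, 2);
}
```

For example, `convInputIndex(4, 2, 3)` evaluates to `4 + 2 - 1`. A pure permutation map cannot express this, because the result depends on a value (`s0`) that is not one of the loop dimensions.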

This commit is a fold of these commits:

[mlir] Added support for symbols inside linalg.generic indexing maps

[mlir] Added support for the shift of the symbols inside AffineExpr

[mlir] Added support for symbols in map concatenation

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

limo1996 created this revision. Jul 4 2020, 6:38 AM

Herald added a reviewer: rriddle. Jul 4 2020, 6:38 AM

Herald added a project: Restricted Project.

Herald added subscribers: msifontes, jurahul, Kayjukh and 13 others.

Harbormaster failed remote builds in B62902: Diff 275499! Jul 4 2020, 7:31 AM

Fixed clang-tidy warning

Harbormaster completed remote builds in B62904: Diff 275503. Jul 4 2020, 9:40 AM

limo1996 added a child revision: D83191: [mlir] Loop bounds inference in linalg.generic op improved to support bounds for convolution. Jul 6 2020, 12:59 AM

Thanks, a couple of comments

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
39	Please don't commit commented-out code
183	Please don't use `auto` unless it improves readability (e.g. for very long types such as iterators, or when the type is already mentioned in the same statement such as `getAttrOfType`).
187	Call `.reserve` to allocate space in the vector before calling `push_back` in a loop.
188	Can you just do `for (unsigned i = 0, e = shapedType.getRank(); i < e; ++i)` instead of enumerating the shape and using only the indices?
mlir/test/Dialect/Linalg/loops.mlir
17	Nit: the expression doesn't seem to correspond to a _strided_ convolution
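
The two suggestions for Loops.cpp lines 187 and 188 (reserve before the loop, iterate over indices directly) amount to the pattern below. This is a generic sketch using `std::vector` and a plain shape vector rather than the MLIR types in the patch; `collectDimIndices` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the loop in the patch: produce one entry per
// dimension of a shape by iterating over the indices directly.
std::vector<unsigned> collectDimIndices(const std::vector<int64_t> &shape) {
  assert(shape.size() <= 64 && "unexpectedly large rank in this toy example");
  std::vector<unsigned> indices;
  // Allocate once up front instead of growing on every push_back.
  indices.reserve(shape.size());
  for (unsigned i = 0, e = shape.size(); i < e; ++i)
    indices.push_back(i);
  return indices;
}
```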

ftynse added inline comments. Jul 6 2020, 5:45 AM

mlir/test/Dialect/Linalg/loops.mlir
949	Please don't pattern-match on SSA value names (`%c0`), they are not guaranteed to be stable.
953	Hmm, would it be possible to put this operation before the loops (actually, there is one already, maybe we can just reuse its value)?

This revision now requires changes to proceed. Jul 6 2020, 5:45 AM

Most of the comments incorporated

Harbormaster completed remote builds in B63035: Diff 275737. Jul 6 2020, 9:56 AM

ftynse added inline comments. Jul 7 2020, 1:32 AM

mlir/lib/IR/AffineMap.cpp
415	This function is still not supposed to work for maps with symbols IIUC
431	Is this change necessary?
443–446	There are more of these, I only commented on the first occurrence.

limo1996 marked 6 inline comments as done. Jul 7 2020, 1:56 AM

limo1996 added inline comments.

mlir/lib/IR/AffineMap.cpp
431	When numSymbols == 0 then numDims == numInputs but yeah no need for the change..

removed commented code

limo1996 marked an inline comment as done. Jul 7 2020, 2:21 AM

limo1996 added inline comments.

mlir/lib/IR/AffineMap.cpp
415	At this point it should. The next diff introduces function that bypasses inversePermutation call
431	Actually at this point there is a need for it. I will revert it in the next commit..
mlir/test/Dialect/Linalg/loops.mlir
949	Hmm I saw them in other tests so I just used them.. When naming convention changes other tests will need changes as well..
953	Hmm I think `--cse` solves it right?

ftynse requested changes to this revision. Jul 7 2020, 3:38 AM

ftynse added inline comments.

mlir/lib/IR/AffineMap.cpp
415	I see two bad practices here:
- trying to commit commented-out code: if the assertion is no longer necessary, remove it completely, add a test that exercises the new behavior, and update the doc;
- commits are not self-contained: if this change is only required for the next commit, it should be included in that commit and removed from this one.
mlir/test/Dialect/Linalg/loops.mlir
949	The fact that other tests currently contradict the testing guide https://mlir.llvm.org/getting_started/TestingGuide/ because of legacy (they likely existed before the current convention was adopted) does not mean you are allowed to commit new code that also contradicts the testing guide. If you notice such tests, the proper hygiene is to update them as well, in a separate commit.

There is _no_ naming convention for SSA names. The only convention is that SSA names can change at any time without any warning, that's why tests should not rely on them. If you disagree with the guide, feel free to start a discussion on the forum.

MLIR currently has ~400 test input files totalling ~70k LoC. Updating matches for SSA names in all of them would be extremely painful, will likely take days or weeks even with automation and require multiple rounds because other commits will be editing tests concurrently. This sounds like extremely poor use of engineering time, compared to several minutes spent writing the proper match conditions for each new commit.
953	If it is simple to do in your code, you should do it instead of expecting the caller to run CSE after the fact. Otherwise, we risk getting into a situation where each pass requires five other cleanup passes to run, each of which also require five other cleanup passes and so on. If doing so feels like you start reimplementing a generic CSE, then you shouldn't do it. It's a trade-off that must be considered rather than shrugged off.

This revision now requires changes to proceed. Jul 7 2020, 3:38 AM

nicolasvasilache requested changes to this revision. Jul 7 2020, 5:14 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
560	Can we increase the description and add IR examples ? In particular, if `symbol_source` is specified, it seems we need a symbol for every single dim of the corresponding operand? What happens if the operand is partially static? Atm it seems we add symbols for everything. Will partially static operands break something down the line? (If so please at least add a TODO). Additionally, it seems like symbol_source should be a proper ODS Optional Attribute this way some of the code you write would be auto generated?
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
265	I think most of this could go away with a proper confined integer attr (e.g. https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/StandardOps/IR/Ops.td#L1983)
270	It seems the logic behind this `targetRank` and its usage should be hidden behind a `op.getNumSymbolsPerOperand()` that would return a `SmallVector<unsigned>` where each operand index has the number of symbols. This would also be somewhat future proof if we decided we wanted to turn the symbol source into an ArrayAttr
280–282	this seems incorrect as all indexing maps could have the number of symbols of operand X. If you had the helper above you could check `m.getNumSymbols() != 0 && m.getNumSymbols() != op.getNumSymbolsPerOperand()[idx]`
mlir/lib/IR/AffineExpr.cpp
101	Drop `dims` above and make this return replaceDimsAndSymbols({}, symbols); ?
mlir/lib/IR/AffineMap.cpp
415	I don't see how this is correct. Bypassing a caller has no incidence on potential other callers and the code would be bugged. Let's revert the 2 changes to this function please.
431	If it is not necessary let's revert it now, I don't see value in making a change that is reversed in a subsequent commit without a solid justification.
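
The suggestion for AffineExpr.cpp line 101 (implement the symbol shift as a call to `replaceDimsAndSymbols` with an empty dimension list) can be illustrated on a toy expression tree. Everything below is a self-contained stand-in, not `mlir::AffineExpr`: the `Expr` struct and the `makeDim`, `makeSymbol`, and `makeAdd` helpers are hypothetical, and the free functions only mimic the shape of the real member functions.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy expression tree standing in for mlir::AffineExpr (not the real class).
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  enum Kind { Dim, Symbol, Add };
  Kind kind;
  unsigned position; // Meaningful for Dim and Symbol only.
  ExprPtr lhs, rhs;  // Meaningful for Add only.
};

ExprPtr makeDim(unsigned pos) {
  return std::make_shared<Expr>(Expr{Expr::Dim, pos, nullptr, nullptr});
}
ExprPtr makeSymbol(unsigned pos) {
  return std::make_shared<Expr>(Expr{Expr::Symbol, pos, nullptr, nullptr});
}
ExprPtr makeAdd(ExprPtr lhs, ExprPtr rhs) {
  return std::make_shared<Expr>(Expr{Expr::Add, 0, lhs, rhs});
}

// Replaces dim #i with dims[i] and symbol #i with syms[i]. Positions without
// a replacement are left unchanged (an assumption of this toy model).
ExprPtr replaceDimsAndSymbols(const ExprPtr &e, const std::vector<ExprPtr> &dims,
                              const std::vector<ExprPtr> &syms) {
  assert(e && "null expression");
  switch (e->kind) {
  case Expr::Dim:
    return e->position < dims.size() ? dims[e->position] : e;
  case Expr::Symbol:
    return e->position < syms.size() ? syms[e->position] : e;
  case Expr::Add:
    return makeAdd(replaceDimsAndSymbols(e->lhs, dims, syms),
                   replaceDimsAndSymbols(e->rhs, dims, syms));
  }
  return e;
}

// Shifts symbols [0, numSymbols) by `shift`; dimensions are left untouched
// because the dimension replacement list is empty.
ExprPtr shiftSymbols(const ExprPtr &e, unsigned numSymbols, unsigned shift) {
  std::vector<ExprPtr> syms;
  syms.reserve(numSymbols);
  for (unsigned i = 0; i < numSymbols; ++i)
    syms.push_back(makeSymbol(i + shift));
  return replaceDimsAndSymbols(e, {}, syms);
}
```

Applied to `d0 + s0` with `numSymbols = 1` and `shift = 2`, the sketch returns `d0 + s2` and leaves the original tree unmodified.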

Changed static addressing of constants in tests. shiftSymbols function modified as Nicolas suggested.

limo1996 marked 3 inline comments as done. Jul 8 2020, 6:14 AM

Harbormaster completed remote builds in B63399: Diff 276401. Jul 8 2020, 6:51 AM

Documentation for symbol_source attribute extended

limo1996 marked 2 inline comments as done. Jul 9 2020, 2:26 AM

limo1996 added inline comments.

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
560	Documentation updated..
> In particular, if symbol_source is specified, it seems we need a symbol for every single dim of the corresponding operand?
Yes we do.
> What happens if the operand is partially static?
It behaves the same as if it was dynamic. Do you think it should behave differently? i.e. to not require a symbol for static dimension?
> Will partially static operands break something down the line?
I don't think so.
> Additionally, it seems like symbol_source should be a proper ODS Optional Attribute this way some of the code you write would be auto generated?
Yes there is a separate WIP revision for that. Once it is approved I will merge it into this one.

Harbormaster completed remote builds in B63554: Diff 276669. Jul 9 2020, 2:29 AM

problem with commented code inside inversePermutation solved

limo1996 marked 3 inline comments as done. Jul 9 2020, 7:05 AM

Harbormaster completed remote builds in B63582: Diff 276730. Jul 9 2020, 7:29 AM

limo1996 marked 3 inline comments as done. Jul 10 2020, 1:46 AM

limo1996 added inline comments.

mlir/test/Dialect/Linalg/loops.mlir
953	I will do it in the follow up commit

revision D83378 merged: [mlir][WIP] symbol_source attr in linalg.generic proposal

limo1996 mentioned this in D83378: [mlir][WIP] symbol_source attr in linalg.generic proposal. Jul 10 2020, 5:42 AM

Harbormaster completed remote builds in B63733: Diff 277002. Jul 10 2020, 5:57 AM

limo1996 marked 3 inline comments as done. Jul 10 2020, 8:44 AM

limo1996 added inline comments.

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
247	@nicolasvasilache @ftynse I found the function name misleading as not only `GenericOp` and `IndexedGenericOp` are passed but also `LinalgStructuredOp`s which does not allow me to call functions defined in `GenericOpBase`. Should I circumvent it somehow or just avoid functions I defined in `GenericOpBase`?
270	If we turn symbol source into `ArrayAttr` then we must either pass dimensions from all operands specified in the symbol source into the maps as symbols or we will pass only dimensions of the corresponding operand which is really limited and does not support our case when we need dimensions of the operand in the map that does not correspond to it.
280–282	Yes, all maps can have the number of symbols of operand X. Is it wrong?
> If you had the helper above you could check `m.getNumSymbols() != 0 && m.getNumSymbols() != op.getNumSymbolsPerOperand()[idx]`
This condition basically allows only map at index equal to `symbol_source` to have symbols which is not correct as in our scenario we want to have symbols in the map corresponding to the input view and not the kernel view.

ftynse added inline comments. Jul 10 2020, 9:26 AM

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
247	If you can avoid new ops, just do that. Possibly you can have a verifier in the GenericOp that calls this function, and then does some additional things. Otherwise, it's template-parameterized by the actual OpType, so it should be possible to use template specialization to work around missing functions.
270	A helper function looks reasonable from the code complexity perspective regardless of the further evolution. `ArrayAttr` discussion is irrelevant for the current patch.

evolve

Harbormaster completed remote builds in B64123: Diff 277724. Jul 14 2020, 2:56 AM

ftynse added inline comments. Jul 16 2020, 4:27 AM

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
295–296	Please don't put usernames after TODO in LLVM codebase.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
471	Please use full sentences in comments: start with a capital letter and terminate with a period, https://llvm.org/docs/CodingStandards.html#commenting.

Works for me after the last comments are addressed.

Very sorry for the delay, thanks Jakub!

One more question inline re the dropping of symbols that seems to look a bit fishy.

Thanks for pushing this!

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td
489	thanks!
560	sounds good, thanks!
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
247	This is mostly related to better OpInterface definition. This is good for now, I'll refactor and cleanup a bit at some later time, thanks!
270	The comment was about code hygiene, not suggesting we should discuss this extension right now. Basically l 264-272 is an implementation detail that should be hidden away in a helper function today. If tomorrow the logic grows this new helper function will be the right place to impl this.
280–282	> this seems incorrect as all indexing maps could have the number of symbols of operand X.
I was thinking about the case where we have >1 operands that want to propagate their symbols. This falls into the extension to ArrayAttr that is better disregarded for now.
280–282	> This condition basically allows only map at index equal to symbol_source to have symbols which is not correct as in our scenario we want to have symbols in the map corresponding to the input view and not the kernel view.
Ah I see, thanks for explaining, the symbols of the kernel are indeed used in the input. Then I'd recommend making things homogeneous from the get go. Instead of

    #conv_1d_accesses = [
      affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>, // in
      affine_map<(m, n) -> (n)>, // filter
      affine_map<(m, n) -> (m)> // out
    ]

let's use:

    #conv_1d_accesses = [
      affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>, // in
      affine_map<(m, n)[s0] -> (n)>, // filter
      affine_map<(m, n)[s0] -> (m)> // out
    ]

?
The fact that only the first operand map had `[s0]` had me think that `s0` corresponds to the operand 0 (even though it clearly says 1).
<don't consider for now> and so in the future, when we want symbols from multiple operands, we can just concatenate them all and still remain homogeneous. </don't consider for now>
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
473	This looks quite unsafe to me. It seems the proper way to drop symbols would be to replace them all with `0`. Is this just a temporary thing that needs to be fixed in the next CL and the state of the codebase is that incorrect code may be generated?

ftynse added inline comments. Jul 16 2020, 6:29 AM

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
473	Good point. Can we just `return {}` to signify failure in presence of symbols? The tests for generated loops can go into the following commit where the loop bound computation is added. Here, we can just keep the "roundtripping" test that makes sure we print back what we parsed.

limo1996 marked 3 inline comments as done and an inline comment as not done. Jul 16 2020, 11:39 PM

All comments resolved

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
247	Ok let me know once it's refactored so I can simplify my code as well. Thanks :) Code starting at line 264 can then be simplified to:

    auto symSrc = op.getSymbolSource();
    if (symSrc.hasValue() && symSrc.getValue() >= op.getNumOperands())
      return op.emitOpError("symbol_source index out of range");

270	I think in the future once this function is refactored such that it takes only `GenericOp` and `IndexedGenericOp` we can introduce `op.getNumSymbols()` that would return sum of ranks of operands specified in `symbol_source` and lines 264-269 can be refactored like this:

    auto symSrc = op.getSymbolSource();
    if (symSrc.hasValue() && symSrc.getValue() >= op.getNumOperands())
      return op.emitOpError("symbol_source index out of range");

For now I refactored only line 271.
280–282	Yeah exactly, we are on the same page. I added `s0` to every map to make it less confusing and also I enforce it now. Thanks for the feedback!
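
As a rough illustration of the checks discussed in this thread, the sketch below combines the range check quoted above with the rule from the `symbol_source` documentation in this revision (each indexing map has either 0 symbols or as many symbols as the rank of the source operand). It is a plain C++ model, not the verifier in LinalgOps.cpp: `verifySymbolSource` and its parameters are hypothetical, it only covers the case where `symbol_source` is specified, and the committed verifier may be stricter given that the author reports enforcing `s0` on every map.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the verification. Returns an empty string on success
// or an error message. `operandRanks[i]` is the rank of operand i and
// `mapNumSymbols[i]` is the number of symbols of its indexing map.
std::string verifySymbolSource(unsigned symbolSource,
                               const std::vector<unsigned> &operandRanks,
                               const std::vector<unsigned> &mapNumSymbols) {
  assert(operandRanks.size() == mapNumSymbols.size() &&
         "expected one indexing map per operand");
  if (symbolSource >= operandRanks.size())
    return "symbol_source index out of range";

  // Every map may either use no symbols or exactly one symbol per dimension
  // of the source operand.
  unsigned expected = operandRanks[symbolSource];
  for (unsigned numSymbols : mapNumSymbols)
    if (numSymbols != 0 && numSymbols != expected)
      return "unexpected number of symbols in indexing map";
  return "";
}
```

With three rank-1 operands and `symbol_source = 1`, maps carrying one symbol each pass the check, which matches the homogeneous `[s0]` form recommended above.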

All comments resolved

Harbormaster failed remote builds in B64683: Diff 278770! Jul 17 2020, 8:00 AM

nicolasvasilache accepted this revision. Jul 17 2020, 8:27 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
114	you should be able to just do: linalgOp.indexing_maps().template getAsRange<AffineMapAttr, AffineMap>() and drop the extra function.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
270	Ok I see how the fact that it is use on Generic, IndexedGeneric and on the op interface makes it less attractive to refactor now. I'll fix, thanks!
280–282	Looks great, thanks!

This revision is now accepted and ready to land. Jul 17 2020, 8:27 AM

limo1996 marked 2 inline comments as done. Jul 17 2020, 9:17 AM

Resolved Nicolas's last comment. Also found that one test was not passing because of a fairly significant bug, and fixed that as well.

Harbormaster failed remote builds in B64939: Diff 279274! Jul 20 2020, 9:19 AM

limo1996 marked an inline comment as done. Jul 20 2020, 9:21 AM

Closed by commit rGf9c8febc522c: [mlir] Added support for symbols inside linalg.generic and map concatenation (authored by limo1996, committed by ftynse). Jul 20 2020, 10:21 AM

This revision was automatically updated to reflect the committed changes.

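Revision Contents

Path	Size
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td	49 lines
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h	17 lines
mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h	4 lines
mlir/include/mlir/IR/AffineExpr.h	4 lines
mlir/lib/Dialect/Linalg/EDSC/Builders.cpp	3 lines
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp	25 lines
mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp	3 lines
mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp	15 lines
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp	44 lines
mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp	3 lines
mlir/lib/IR/AffineExpr.cpp	8 lines
mlir/lib/IR/AffineMap.cpp	11 lines
mlir/test/Dialect/Linalg/invalid.mlir	18 lines
mlir/test/Dialect/Linalg/loops.mlir	330 lines
mlir/test/lib/Transforms/TestBufferPlacement.cpp	3 lines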

Diff 279301

mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td

@@ 479 lines not shown @@
 class GenericOpBase<string mnemonic> : LinalgStructuredBase_Op<mnemonic,
     [SingleBlockImplicitTerminator<"YieldOp">]> {
   let arguments = (ins Variadic<LinalgOperand>:$views,
                    I64Attr:$args_in,
                    I64Attr:$args_out,
                    AffineMapArrayAttr:$indexing_maps,
                    ArrayAttr:$iterator_types,
                    OptionalAttr<StrAttr>:$doc,
-                   OptionalAttr<StrAttr>:$library_call);
+                   OptionalAttr<StrAttr>:$library_call,
+                   Confined<OptionalAttr<I64Attr>,
+                     [IntMinValue<0>]>:$symbol_source);
   let results = (outs Variadic<AnyRankedTensor>:$output_tensors);
   let regions = (region AnyRegion:$region);
   let extraClassDeclaration = [{
     SmallVector<StringRef, 8> linalgTraitAttrNames() {
       return SmallVector<StringRef, 8>{
         getArgsInAttrName(), getArgsOutAttrName(), getDocAttrName(),
         getIndexingMapsAttrName(), getLibraryCallAttrName(),
-        getIteratorTypesAttrName()
+        getIteratorTypesAttrName(), getSymbolSourceAttrName()
       };
     }

     unsigned getNumInputs() { return args_in().getSExtValue(); }

     unsigned getNumOutputs() { return args_out().getSExtValue(); }

     StringRef getLibraryCallName() {
       return library_call().hasValue() ? library_call().getValue() : "";
     }

     llvm::Optional<SmallVector<StringRef, 8>> referenceIterators() {
       llvm_unreachable(
         "No such thing as reference iterator types for a generic op.");
     }

     llvm::Optional<SmallVector<AffineMap, 8>> referenceIndexingMaps() {
       llvm_unreachable(
         "No such thing as reference indexing maps for a generic op.");
     }
+
+    llvm::Optional<unsigned> getSymbolSource() {
+      auto ss = symbol_source();
+      return ss.hasValue() ?
+        llvm::Optional<unsigned>(ss.getValue().getLimitedValue()) : llvm::None;
+    }
   }];

   let printer = [{ return ::print(p, *this); }];
   let parser = [{ return ::parseGenericOp(parser, result); }];
 }

 /// Index-free GenericOp.
 def GenericOp : GenericOpBase<"generic"> {
@@ 19 lines not shown @@ Where #trait_attributes is an alias of a dictionary attribute containing:
         external library function that the linalg.generic operation maps to.
         The external library is assumed to be dynamically linked and no strong
         compile-time guarantees are provided. In the absence of such a library
         call, linalg.generic will always lower to loops.
       - iterator_types: an ArrayAttr specifying the type of the enclosing loops.
         Each element of the list represents and iterator of one of the following
         types:
           parallel, reduction, window
+      - symbol_source: index of the operand whose dimensions will be propagated
+        as symbols to the indexing maps. When specified the number of symbols
+        in each of the indexing maps has to be either 0 or the rank of the
+        specified operand.

     Example:
     Defining a #matmul_trait attribute in MLIR can be done as follows:
       ```mlir
       #matmul_accesses = [
         (m, n, k) -> (m, k),
         (m, n, k) -> (k, n),
         (m, n, k) -> (m, n)
@@ 65 lines not shown @@ let description = [{
     transformations can be applied. Such legalization moves tensor return values
     into output buffer operands and updates the region arguments accordingly.

     Transformations that create control-flow around linalg.indexed_generic
     operations are not expected to work with tensors because SSA values do not
     escape naturally. Still, transformations and rewrites that take advantage of
     tensor SSA values are expected to be useful and will be added in the near
     future.
+
+    Example of 1D convolution with symbols:
+    ```mlir
+    #conv_1d_accesses = [
+      affine_map<(m, n)[dimN] -> (m + n - dimN floordiv 2)>, // in
+      affine_map<(m, n)[dimN] -> (n)>, // filter
+      affine_map<(m, n)[dimN] -> (m)> // out
+    ]
+
+    #conv_1d_trait = {
+      doc = "O(m) += I(m + n - size(n) floordiv 2) * K(n)",
+      indexing_maps = #conv_1d_accesses,
+      library_call = "linalg_conv_1d",
+      iterator_types = ["parallel", "parallel"],
+      symbol_source = 1
+    }
+
+    linalg.generic #conv_1d_trait %in, %filter, %out {
+      ^bb0(%a: f32, %b: f32, %c: f32) :
+        %d = mulf %a, %b : f32
+        %e = addf %c, %d : f32
+        linalg.yield %e : f32
+    } : memref<?xf32>,
+        memref<?xf32>,
+        memref<?xf32>
+    ```
+    where symbol s0 will be substituted with `dim %filter, %c0` i.e. the first
+    and only dimension of the second operand as specified by the symbol_source
+    attribute.
   }];

   let builders = [
     OpBuilder<
       "OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes, "
       "ValueRange args, int64_t argsIn, int64_t argsOut, "
       "ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes, "
       "function_ref<void(OpBuilder &, Location, ValueRange)> = nullptr">
@@ 170 lines not shown @@

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h

@@ 95 lines not shown @@

 /// Returns the linearized list of all view dimensions in a linalgOp. Applying
 /// the inverse, concatenated loopToOperandRangeMaps to this list allows the
 /// derivation of loop ranges for any linalgOp.
 template <typename ConcreteOp>
 SmallVector<Value, 8> getViewSizes(OpBuilder &builder, ConcreteOp linalgOp) {
   auto loc = linalgOp.getLoc();
   SmallVector<Value, 8> res;
+  SmallVector<unsigned, 4> ranks;
   for (auto v : linalgOp.getInputsAndOutputBuffers()) {
     MemRefType t = v.getType().template cast<MemRefType>();
+    ranks.push_back(t.getRank());
     for (unsigned i = 0; i < t.getRank(); ++i)
       res.push_back(builder.create<DimOp>(loc, v, i));
   }
+
+  auto attr = linalgOp.template getAttrOfType<IntegerAttr>("symbol_source");
+  if (attr) {
+    // Find the correct position for inserting values for symbols.
+    unsigned numSymb = ranks[attr.getInt()], symbolsPos = 0;
+    for (unsigned idx = 0; idx < attr.getInt(); idx++)
+      symbolsPos += ranks[idx];
+
+    // Append or rewrite the end of the value list that corresponds to the
+    // values mapping to symbols. Since inside concatinated map symbols are
+    // repeated we have to repeat the sizes as well.
+    for (unsigned idx = 0, s = ranks.size(); idx < s; ++idx)
+      for (unsigned idx2 = 0; idx2 < numSymb; ++idx2)
+        res.push_back(res[symbolsPos + idx2]);
+  }
   return res;
 }

 /// Returns the values obtained by applying `map` to the list of values.
 /// When non-null, the optional pointer `folder` is used to call into the
 /// `createAndFold` builder method. If `folder` is null, the regular `create`
 /// method is called.
 SmallVector<Value, 4> applyMapToValues(OpBuilder &b, Location loc,
@@ 40 lines not shown @@
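
The symbol handling added to `getViewSizes` can be followed more easily on plain integers. The sketch below is a simplified model under stated assumptions: sizes are `int64_t` values instead of `DimOp` results, and `appendSymbolSizes` is a hypothetical name. It locates the sizes of the `symbol_source` operand in the linearized list and appends them once per operand, since the symbols are repeated in the concatenated map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the symbol part of getViewSizes. `viewSizes` is the
// linearized list of all operand dimensions, `ranks[i]` is the rank of
// operand i, and `symbolSource` is the index of the operand whose sizes are
// used as symbol values.
std::vector<int64_t> appendSymbolSizes(std::vector<int64_t> viewSizes,
                                       const std::vector<unsigned> &ranks,
                                       unsigned symbolSource) {
  assert(symbolSource < ranks.size() && "symbol_source index out of range");

  // Offset of the source operand's first dimension in the linearized list.
  unsigned numSymbols = ranks[symbolSource];
  unsigned symbolsPos = 0;
  for (unsigned idx = 0; idx < symbolSource; ++idx)
    symbolsPos += ranks[idx];
  assert(symbolsPos + numSymbols <= viewSizes.size() && "inconsistent ranks");

  // One copy of the symbol values per operand (that is, per indexing map).
  for (std::size_t idx = 0; idx < ranks.size(); ++idx) {
    for (unsigned idx2 = 0; idx2 < numSymbols; ++idx2) {
      int64_t size = viewSizes[symbolsPos + idx2]; // Copy before growing.
      viewSizes.push_back(size);
    }
  }
  return viewSizes;
}
```

For three rank-1 operands with sizes `{10, 3, 8}` and `symbol_source = 1`, the filter size `3` is appended three times, once for each indexing map.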

mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h

Show All 40 Lines	inline bool isColumnMajorMatmul(ArrayAttr indexingMaps) {
bindDims(context, m, n, k);		bindDims(context, m, n, k);
auto mapA = AffineMapAttr::get(AffineMap::get(3, 0, {k, n}, context));		auto mapA = AffineMapAttr::get(AffineMap::get(3, 0, {k, n}, context));
auto mapB = AffineMapAttr::get(AffineMap::get(3, 0, {m, k}, context));		auto mapB = AffineMapAttr::get(AffineMap::get(3, 0, {m, k}, context));
auto mapC = AffineMapAttr::get(AffineMap::get(3, 0, {n, m}, context));		auto mapC = AffineMapAttr::get(AffineMap::get(3, 0, {n, m}, context));
auto maps = ArrayAttr::get({mapA, mapB, mapC}, context);		auto maps = ArrayAttr::get({mapA, mapB, mapC}, context);
return indexingMaps == maps;		return indexingMaps == maps;
}		}

		/// Attribute name for the IntegerAttr which encodes the index of operand
		/// whose dimensions will be propagated as symbols to the indexing maps
		constexpr StringRef getSymbolSourceAttrName() { return "symbol_source"; }

/// Attribute name for the AffineArrayAttr which encodes the relationship		/// Attribute name for the AffineArrayAttr which encodes the relationship
/// between a structured op iterators' and its operands.		/// between a structured op iterators' and its operands.
constexpr StringRef getIndexingMapsAttrName() { return "indexing_maps"; }		constexpr StringRef getIndexingMapsAttrName() { return "indexing_maps"; }

/// Attribute name for the StrArrayAttr which encodes the type of a structured		/// Attribute name for the StrArrayAttr which encodes the type of a structured
/// op's iterators.		/// op's iterators.
constexpr StringRef getIteratorTypesAttrName() { return "iterator_types"; }		constexpr StringRef getIteratorTypesAttrName() { return "iterator_types"; }


mlir/include/mlir/IR/AffineExpr.h

Show First 20 Lines • Show All 112 Lines • ▼ Show 20 Lines	public:
/// Walk all of the AffineExpr's in this expression in postorder.		/// Walk all of the AffineExpr's in this expression in postorder.
void walk(std::function<void(AffineExpr)> callback) const;		void walk(std::function<void(AffineExpr)> callback) const;

/// This method substitutes any uses of dimensions and symbols (e.g.		/// This method substitutes any uses of dimensions and symbols (e.g.
/// dim#0 with dimReplacements[0]) and returns the modified expression tree.		/// dim#0 with dimReplacements[0]) and returns the modified expression tree.
AffineExpr replaceDimsAndSymbols(ArrayRef<AffineExpr> dimReplacements,		AffineExpr replaceDimsAndSymbols(ArrayRef<AffineExpr> dimReplacements,
ArrayRef<AffineExpr> symReplacements) const;		ArrayRef<AffineExpr> symReplacements) const;

		/// Replace symbols[0 .. numSymbols - 1] by
		/// symbols[shift .. shift + numSymbols - 1].
		AffineExpr shiftSymbols(unsigned numSymbols, unsigned shift) const;

AffineExpr operator+(int64_t v) const;		AffineExpr operator+(int64_t v) const;
AffineExpr operator+(AffineExpr other) const;		AffineExpr operator+(AffineExpr other) const;
AffineExpr operator-() const;		AffineExpr operator-() const;
AffineExpr operator-(int64_t v) const;		AffineExpr operator-(int64_t v) const;
AffineExpr operator-(AffineExpr other) const;		AffineExpr operator-(AffineExpr other) const;
AffineExpr operator*(int64_t v) const;		AffineExpr operator*(int64_t v) const;
AffineExpr operator*(AffineExpr other) const;		AffineExpr operator*(AffineExpr other) const;
AffineExpr floorDiv(uint64_t v) const;		AffineExpr floorDiv(uint64_t v) const;
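
Editorial sketch (not part of the patch): the effect described by the `shiftSymbols` doc comment, modeled on a toy expression tree rather than on the real `AffineExpr`. The `Expr` struct and the `dim`, `sym`, `add` and `toString` helpers are hypothetical stand-ins, and leaving symbols at positions `>= numSymbols` untouched is an assumption about the intended behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

// Toy stand-in for an affine expression: a dimension, a symbol, or a sum.
struct Expr {
  enum Kind { Dim, Symbol, Add };
  Kind kind;
  unsigned pos;     // position, used by Dim and Symbol
  ExprPtr lhs, rhs; // children, used by Add
};

ExprPtr dim(unsigned pos) {
  return std::make_shared<Expr>(Expr{Expr::Dim, pos, nullptr, nullptr});
}

ExprPtr sym(unsigned pos) {
  return std::make_shared<Expr>(Expr{Expr::Symbol, pos, nullptr, nullptr});
}

ExprPtr add(ExprPtr lhs, ExprPtr rhs) {
  assert(lhs && rhs && "both operands of a sum must be present");
  return std::make_shared<Expr>(Expr{Expr::Add, 0, lhs, rhs});
}

// Replace symbols[0 .. numSymbols - 1] by
// symbols[shift .. shift + numSymbols - 1]; everything else is kept.
ExprPtr shiftSymbols(const ExprPtr &e, unsigned numSymbols, unsigned shift) {
  switch (e->kind) {
  case Expr::Dim:
    return e;
  case Expr::Symbol:
    return e->pos < numSymbols ? sym(e->pos + shift) : e;
  case Expr::Add:
    return add(shiftSymbols(e->lhs, numSymbols, shift),
               shiftSymbols(e->rhs, numSymbols, shift));
  }
  return e;
}

// Prints dimensions as d<N> and symbols as s<N>.
std::string toString(const ExprPtr &e) {
  switch (e->kind) {
  case Expr::Dim:
    return "d" + std::to_string(e->pos);
  case Expr::Symbol:
    return "s" + std::to_string(e->pos);
  case Expr::Add:
    return toString(e->lhs) + " + " + toString(e->rhs);
  }
  return "";
}
```

With this model, shifting `d0 + s0` by 2 with `numSymbols = 1` gives `d0 + s2`. Renumbering of this kind is presumably what lets the symbols of several maps coexist after concatenation without clashing.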

mlir/lib/Dialect/Linalg/EDSC/Builders.cpp

Show First 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	auto *op =
edsc::ScopedContext::getLocation(),		edsc::ScopedContext::getLocation(),
types,		types,
values,		values,
IntegerAttr::get(IntegerType::get(64, ctx), nInputs),		IntegerAttr::get(IntegerType::get(64, ctx), nInputs),
IntegerAttr::get(IntegerType::get(64, ctx), nOutputs),		IntegerAttr::get(IntegerType::get(64, ctx), nOutputs),
builder.getAffineMapArrayAttr(maps),		builder.getAffineMapArrayAttr(maps),
builder.getStrArrayAttr(iteratorStrTypes),		builder.getStrArrayAttr(iteratorStrTypes),
StringAttr() /*doc*/,		StringAttr() /*doc*/,
StringAttr() /*library_call*/		StringAttr() /*library_call*/,
		IntegerAttr() /*symbol_source*/
/* TODO: other attributes in op */		/* TODO: other attributes in op */
)		)
.getOperation();		.getOperation();
// clang-format on		// clang-format on

using namespace edsc;		using namespace edsc;
SmallVector<Type, 4> blockTypes;		SmallVector<Type, 4> blockTypes;
blockTypes.reserve(values.size());		blockTypes.reserve(values.size());

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

Show First 20 Lines • Show All 73 Lines • ▼ Show 20 Lines	void GenericOp::build(
OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,		OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,
ValueRange args, int64_t argsIn, int64_t argsOut,		ValueRange args, int64_t argsIn, int64_t argsOut,
ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,		ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {		function_ref<void(OpBuilder &, Location, ValueRange)> bodyBuild) {
build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),		build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),
builder.getI64IntegerAttr(argsOut),		builder.getI64IntegerAttr(argsOut),
builder.getAffineMapArrayAttr(indexingMaps),		builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),		builder.getStrArrayAttr(iteratorTypes),
/*doc=*/nullptr, /*library_call=*/nullptr);		/*doc=*/nullptr, /*library_call=*/nullptr,
		/*symbol_source=*/nullptr);
if (!bodyBuild)		if (!bodyBuild)
return;		return;

SmallVector<Type, 4> blockArgTypes;		SmallVector<Type, 4> blockArgTypes;
for (Value arg : args)		for (Value arg : args)
blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());		blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());

OpBuilder::InsertionGuard guard(builder);		OpBuilder::InsertionGuard guard(builder);
auto &region = *result.regions.front();		auto &region = *result.regions.front();
Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);		Block *bodyBlock = builder.createBlock(&region, region.end(), blockArgTypes);
bodyBuild(builder, result.location, bodyBlock->getArguments());		bodyBuild(builder, result.location, bodyBlock->getArguments());
}		}

void IndexedGenericOp::build(		void IndexedGenericOp::build(
OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,		OpBuilder &builder, OperationState &result, ArrayRef<Type> resultTypes,
ValueRange args, int64_t argsIn, int64_t argsOut,		ValueRange args, int64_t argsIn, int64_t argsOut,
ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,		ArrayRef<AffineMap> indexingMaps, ArrayRef<StringRef> iteratorTypes,
function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>		function_ref<void(OpBuilder &, Location, ValueRange, ValueRange)>
bodyBuild) {		bodyBuild) {
build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),		build(builder, result, resultTypes, args, builder.getI64IntegerAttr(argsIn),
builder.getI64IntegerAttr(argsOut),		builder.getI64IntegerAttr(argsOut),
builder.getAffineMapArrayAttr(indexingMaps),		builder.getAffineMapArrayAttr(indexingMaps),
builder.getStrArrayAttr(iteratorTypes),		builder.getStrArrayAttr(iteratorTypes),
/*doc=*/nullptr, /*library_call=*/nullptr);		/*doc=*/nullptr, /*library_call=*/nullptr,
		/*symbol_source=*/nullptr);
if (!bodyBuild)		if (!bodyBuild)
return;		return;

unsigned nLoops = iteratorTypes.size();		unsigned nLoops = iteratorTypes.size();
SmallVector<Type, 4> blockArgTypes(nLoops, builder.getIndexType());		SmallVector<Type, 4> blockArgTypes(nLoops, builder.getIndexType());
for (Value arg : args)		for (Value arg : args)
blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());		blockArgTypes.push_back(arg.getType().cast<ShapedType>().getElementType());

▲ Show 20 Lines • Show All 122 Lines • ▼ Show 20 Lines	if (viewType.getElementType() !=
<< ((i < nInputViews) ? "input " : "output ")		<< ((i < nInputViews) ? "input " : "output ")
<< "operand: " << viewType;		<< "operand: " << viewType;
}		}
return success();		return success();
}		}
} // namespace		} // namespace

template <typename GenericOpType>		template <typename GenericOpType>
static LogicalResult verifyGenericOp(GenericOpType op) {		static LogicalResult verifyGenericOp(GenericOpType op) {
		limo1996 (author): @nicolasvasilache @ftynse I found the function name misleading: not only `GenericOp` and `IndexedGenericOp` are passed to it but also `LinalgStructuredOp`s, which does not allow me to call functions defined in `GenericOpBase`. Should I circumvent it somehow or just avoid the functions I defined in `GenericOpBase`?
		ftynse: If you can avoid new ops, just do that. Possibly you can have a verifier in the GenericOp that calls this function, and then does some additional things. Otherwise, it's template-parameterized by the actual OpType, so it should be possible to use template specialization to work around missing functions.
		nicolasvasilache: This is mostly related to better OpInterface definition. This is good for now, I'll refactor and clean up a bit at some later time, thanks!
		limo1996 (author): OK, let me know once it's refactored so I can simplify my code as well. Thanks :) Code starting at line 264 can then be simplified to:
		    auto symSrc = op.getSymbolSource();
		    if (symSrc.hasValue() && symSrc.getValue() >= op.getNumOperands())
		      return op.emitOpError("symbol_source index out of range");
auto nInputViews = op.getNumInputs();		auto nInputViews = op.getNumInputs();
auto nLoops = op.getNumLoops();		auto nLoops = op.getNumLoops();
auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();		auto nInputsAndOutputBuffers = op.getNumInputsAndOutputBuffers();
if (nInputsAndOutputBuffers != llvm::size(op.views()))		if (nInputsAndOutputBuffers != llvm::size(op.views()))
return op.emitOpError("expected exactly ")		return op.emitOpError("expected exactly ")
<< nInputsAndOutputBuffers		<< nInputsAndOutputBuffers
<< " inputs (tensor or buffer) and output buffer operands";		<< " inputs (tensor or buffer) and output buffer operands";

auto &region = op.region();		auto &region = op.region();
if (!llvm::hasSingleElement(region))		if (!llvm::hasSingleElement(region))
return op.emitOpError("expected region with 1 block");		return op.emitOpError("expected region with 1 block");
if (failed(BlockArgsVerifier<GenericOpType>::verify(op, region.front())))		if (failed(BlockArgsVerifier<GenericOpType>::verify(op, region.front())))
return failure();		return failure();

		auto attr = op.template getAttrOfType<IntegerAttr>("symbol_source");
		int64_t targetRank = 0;
		if (attr) {
		unsigned index = attr.getInt();
		nicolasvasilache: I think most of this could go away with a proper confined integer attr (e.g. https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/StandardOps/IR/Ops.td#L1983).
		if (index >= op.getNumOperands())
		return op.emitOpError("symbol_source index out of range");
		targetRank = op.getShapedType(index).getRank();
		}

		nicolasvasilache: It seems the logic behind this `targetRank` and its usage should be hidden behind an `op.getNumSymbolsPerOperand()` that would return a `SmallVector<unsigned>` where each operand index has the number of symbols. This would also be somewhat future-proof if we decided we wanted to turn the symbol source into an ArrayAttr.
		limo1996 (author): If we turn the symbol source into an `ArrayAttr`, then we must either pass the dimensions of all operands specified in the symbol source into the maps as symbols, or pass only the dimensions of the corresponding operand. The latter is really limited and does not support our case, where a map needs the dimensions of an operand that does not correspond to it.
		ftynse: A helper function looks reasonable from the code complexity perspective regardless of the further evolution. The `ArrayAttr` discussion is irrelevant for the current patch.
		nicolasvasilache: The comment was about code hygiene, not suggesting we should discuss this extension right now. Basically lines 264-272 are an implementation detail that should be hidden away in a helper function today. If tomorrow the logic grows, this new helper function will be the right place to implement this.
		limo1996 (author): I think in the future, once this function is refactored such that it takes only `GenericOp` and `IndexedGenericOp`, we can introduce `op.getNumSymbols()` that would return the sum of ranks of the operands specified in `symbol_source`, and lines 264-269 can be refactored like this:
		    auto symSrc = op.getSymbolSource();
		    if (symSrc.hasValue() && symSrc.getValue() >= op.getNumOperands())
		      return op.emitOpError("symbol_source index out of range");
		For now I refactored only line 271.
		nicolasvasilache: OK, I see how the fact that it is used on Generic, IndexedGeneric and on the op interface makes it less attractive to refactor now. I'll fix, thanks!
SmallVector<AffineMap, 4> indexingMaps;		SmallVector<AffineMap, 4> indexingMaps;
indexingMaps.reserve(op.indexing_maps().size());		indexingMaps.reserve(op.indexing_maps().size());
for (auto en : llvm::enumerate(op.indexing_maps())) {		for (auto en : llvm::enumerate(op.indexing_maps())) {
auto idx = en.index();		auto idx = en.index();
auto m = en.value().template cast<AffineMapAttr>().getValue();		auto m = en.value().template cast<AffineMapAttr>().getValue();
indexingMaps.push_back(m); // Save reference to map for further checks.		indexingMaps.push_back(m); // Save reference to map for further checks.
auto view = (idx < nInputViews) ? op.getInputShapedType(idx)		auto view = (idx < nInputViews) ? op.getInputShapedType(idx)
: op.getOutputShapedType(idx - nInputViews);		: op.getOutputShapedType(idx - nInputViews);

if (m.getNumSymbols() != 0)		if (m.getNumSymbols() != targetRank)
return op.emitOpError("expected indexing_map #")		return op.emitOpError("expected the number of symbols in indexing_map #")
<< idx << " to have no symbols";		<< idx << " to match target rank";
		nicolasvasilache: This seems incorrect as all indexing maps could have the number of symbols of operand X. If you had the helper above you could check `m.getNumSymbols() != 0 && m.getNumSymbols() != op.getNumSymbolsPerOperand()[idx]`.
		limo1996 (author): Yes, all maps can have the number of symbols of operand X. Is it wrong?
		> If you had the helper above you could check `m.getNumSymbols() != 0 && m.getNumSymbols() != op.getNumSymbolsPerOperand()[idx]`
		This condition basically allows only the map at the index equal to `symbol_source` to have symbols, which is not correct: in our scenario we want to have symbols in the map corresponding to the input view and not the kernel view.
		nicolasvasilache:
		> this seems incorrect as all indexing maps could have the number of symbols of operand X.
		I was thinking about the case where we have >1 operands that want to propagate their symbols. This falls into the extension to ArrayAttr that is better disregarded for now.
		nicolasvasilache:
		> This condition basically allows only map at index equal to symbol_source to have symbols which is not correct as in our scenario we want to have symbols in the map corresponding to the input view and not the kernel view.
		Ah I see, thanks for explaining, the symbols of the kernel are indeed used in the input. Then I'd recommend making things homogeneous from the get-go. Instead of
		    #conv_1d_accesses = [
		      affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>, // in
		      affine_map<(m, n) -> (n)>, // filter
		      affine_map<(m, n) -> (m)> // out
		    ]
		let's use:
		    #conv_1d_accesses = [
		      affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>, // in
		      affine_map<(m, n)[s0] -> (n)>, // filter
		      affine_map<(m, n)[s0] -> (m)> // out
		    ]
		? The fact that only the first operand map had `[s0]` had me think that `s0` corresponds to operand 0 (even though it clearly says 1).
		<don't consider for now> and so in the future, when we want symbols from multiple operands, we can just concatenate them all and still remain homogeneous. </don't consider for now>
		limo1996 (author): Yeah exactly, we are on the same page. I added `s0` to every map to make it less confusing and I also enforce it now. Thanks for the feedback!
		nicolasvasilache: Looks great, thanks!

if (m.getNumDims() != nLoops)		if (m.getNumDims() != nLoops)
return op.emitOpError("expected indexing_map #")		return op.emitOpError("expected indexing_map #")
<< idx << " to have " << nLoops		<< idx << " to have " << nLoops
<< " dim(s) to match the number of loops";		<< " dim(s) to match the number of loops";

if (m.getNumResults() != view.getRank())		if (m.getNumResults() != view.getRank())
return op.emitOpError("expected indexing_map #")		return op.emitOpError("expected indexing_map #")
<< idx << " results to match view rank: " << view;		<< idx << " results to match view rank: " << view;
}		}

auto concatMap = concatAffineMaps(indexingMaps);		auto concatMap = concatAffineMaps(indexingMaps);
auto aggregateMap = inversePermutation(concatMap);		// TODO: Bound inference for maps with symbols
if (!aggregateMap)		if (!concatMap.getNumSymbols() && !inversePermutation(concatMap))
		ftynse: Please don't put usernames after TODO in LLVM codebase.
return op.emitOpError("expected the concatenation of maps in indexing_map "		return op.emitOpError("expected the concatenation of maps in indexing_map "
"to be invertible");		"to be invertible");

return success();		return success();
}		}

static LogicalResult verify(GenericOp op) { return verifyGenericOp(op); }		static LogicalResult verify(GenericOp op) { return verifyGenericOp(op); }
static LogicalResult verify(IndexedGenericOp op) { return verifyGenericOp(op); }		static LogicalResult verify(IndexedGenericOp op) { return verifyGenericOp(op); }
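
Editorial sketch (not part of the patch): the symbol-count rule the review converged on, where every indexing map must carry as many symbols as the rank of the `symbol_source` operand, or none when the attribute is absent. `ToyMap`, `ToyOp` and `verifySymbols` are hypothetical stand-ins for the real op classes and verifier; only the two diagnostic strings are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an indexing map: only the counts the check needs.
struct ToyMap {
  unsigned numDims;
  unsigned numSymbols;
};

// Toy stand-in for a generic op (hypothetical, not the Linalg op classes).
struct ToyOp {
  std::vector<unsigned> operandRanks; // rank of each operand
  std::vector<ToyMap> indexingMaps;   // one map per operand
  bool hasSymbolSource;               // whether symbol_source is set
  unsigned symbolSource;              // operand index, meaningful only if set
};

// Returns an empty string on success, otherwise a diagnostic message.
std::string verifySymbols(const ToyOp &op) {
  assert(op.indexingMaps.size() == op.operandRanks.size() &&
         "expected one indexing map per operand");
  unsigned targetRank = 0;
  if (op.hasSymbolSource) {
    if (op.symbolSource >= op.operandRanks.size())
      return "symbol_source index out of range";
    targetRank = op.operandRanks[op.symbolSource];
  }
  // All maps must agree on the symbol count, not just the source operand's.
  for (std::size_t idx = 0; idx < op.indexingMaps.size(); ++idx)
    if (op.indexingMaps[idx].numSymbols != targetRank)
      return "expected the number of symbols in indexing_map #" +
             std::to_string(idx) + " to match target rank";
  return "";
}
```

For the 1-D convolution discussed in the thread (three rank-1 operands, `symbol_source = 1`), the homogeneous form with `[s0]` on every map passes this check, while the earlier form with `[s0]` only on the input map is rejected at map #1.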

mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp

Show First 20 Lines • Show All 313 Lines • ▼ Show 20 Lines	LogicalResult matchAndRewrite(GenericOp genericOp,
for (unsigned i : llvm::seq<unsigned>(0, genericOp.getNumResults()))		for (unsigned i : llvm::seq<unsigned>(0, genericOp.getNumResults()))
resultTypes.push_back(		resultTypes.push_back(
newInputOutputTypes[i + genericOp.getNumOperands()]);		newInputOutputTypes[i + genericOp.getNumOperands()]);
GenericOp replacementOp = rewriter.create<GenericOp>(		GenericOp replacementOp = rewriter.create<GenericOp>(
loc, resultTypes, newOperands, genericOp.args_in(),		loc, resultTypes, newOperands, genericOp.args_in(),
genericOp.args_out(), rewriter.getAffineMapArrayAttr(newIndexingMaps),		genericOp.args_out(), rewriter.getAffineMapArrayAttr(newIndexingMaps),
genericOp.iterator_types(),		genericOp.iterator_types(),
/*doc = */ nullptr,		/*doc = */ nullptr,
/*library_call = */ nullptr);		/*library_call = */ nullptr,
		/*symbol_source = */ nullptr);
rewriter.inlineRegionBefore(genericOp.region(), replacementOp.region(),		rewriter.inlineRegionBefore(genericOp.region(), replacementOp.region(),
replacementOp.region().begin());		replacementOp.region().begin());

// If any result tensor has a modified shape, then add reshape to recover		// If any result tensor has a modified shape, then add reshape to recover
// the original shape.		// the original shape.
SmallVector<Value, 4> resultReplacements;		SmallVector<Value, 4> resultReplacements;
for (auto result : llvm::enumerate(replacementOp.getResults())) {		for (auto result : llvm::enumerate(replacementOp.getResults())) {
unsigned index = result.index() + replacementOp.getNumOperands();		unsigned index = result.index() + replacementOp.getNumOperands();

mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp

Show First 20 Lines • Show All 504 Lines • ▼ Show 20 Lines	if (isa<GenericOp>(producer.getOperation()) &&
rewriter.getUnknownLoc(),		rewriter.getUnknownLoc(),
consumer.getOperation()->getResultTypes(), fusedOperands,		consumer.getOperation()->getResultTypes(), fusedOperands,
rewriter.getI64IntegerAttr(fusedOperands.size()),		rewriter.getI64IntegerAttr(fusedOperands.size()),
rewriter.getI64IntegerAttr(		rewriter.getI64IntegerAttr(
consumer.getOperation()->getNumResults()),		consumer.getOperation()->getNumResults()),
rewriter.getArrayAttr(fusedIndexMaps),		rewriter.getArrayAttr(fusedIndexMaps),
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr)		/*library_call=*/nullptr,
		/*symbol_source=*/nullptr)
.getOperation();		.getOperation();
} else {		} else {
fusedOp =		fusedOp =
rewriter		rewriter
.create<IndexedGenericOp>(		.create<IndexedGenericOp>(
rewriter.getUnknownLoc(),		rewriter.getUnknownLoc(),
consumer.getOperation()->getResultTypes(), fusedOperands,		consumer.getOperation()->getResultTypes(), fusedOperands,
rewriter.getI64IntegerAttr(fusedOperands.size()),		rewriter.getI64IntegerAttr(fusedOperands.size()),
rewriter.getI64IntegerAttr(		rewriter.getI64IntegerAttr(
consumer.getOperation()->getNumResults()),		consumer.getOperation()->getNumResults()),
rewriter.getArrayAttr(fusedIndexMaps),		rewriter.getArrayAttr(fusedIndexMaps),
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr)		/*library_call=*/nullptr,
		/*symbol_source=*/nullptr)
.getOperation();		.getOperation();
}		}

// Construct an AffineMap from consumer loops to producer loops.		// Construct an AffineMap from consumer loops to producer loops.
// consumer loop -> tensor index		// consumer loop -> tensor index
AffineMap consumerResultIndexMap =		AffineMap consumerResultIndexMap =
consumer.getInputIndexingMap(consumerIdx);		consumer.getInputIndexingMap(consumerIdx);
// producer loop -> tensor index		// producer loop -> tensor index
▲ Show 20 Lines • Show All 246 Lines • ▼ Show 20 Lines	SmallVector<Attribute, 4> indexMapAttrs = llvm::to_vector<4>(
return AffineMapAttr::get(map);		return AffineMapAttr::get(map);
}));		}));
auto fusedOp = rewriter.create<LinalgOpTy>(		auto fusedOp = rewriter.create<LinalgOpTy>(
rewriter.getUnknownLoc(), consumer.getResultTypes(), fusedOperands,		rewriter.getUnknownLoc(), consumer.getResultTypes(), fusedOperands,
rewriter.getI64IntegerAttr(fusedOperands.size()),		rewriter.getI64IntegerAttr(fusedOperands.size()),
rewriter.getI64IntegerAttr(consumer.getNumResults()),		rewriter.getI64IntegerAttr(consumer.getNumResults()),
rewriter.getArrayAttr(indexMapAttrs), consumer.iterator_types(),		rewriter.getArrayAttr(indexMapAttrs), consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr);		/*library_call=*/nullptr,
		/*symbol_source=*/nullptr);
auto &fusedRegion = fusedOp.region();		auto &fusedRegion = fusedOp.region();
rewriter.cloneRegionBefore(consumer.region(), fusedRegion,		rewriter.cloneRegionBefore(consumer.region(), fusedRegion,
fusedRegion.begin());		fusedRegion.begin());
return fusedOp;		return fusedOp;
}		}
};		};

/// Implementation of fusion on tensor ops when consumer is a TensorReshapeOp.		/// Implementation of fusion on tensor ops when consumer is a TensorReshapeOp.
Show All 39 Lines	static Operation *fuse(LinalgOpTy producer, TensorReshapeOp consumer,

auto fusedOp = rewriter.create<LinalgOpTy>(		auto fusedOp = rewriter.create<LinalgOpTy>(
rewriter.getUnknownLoc(), consumer.getResultType(),		rewriter.getUnknownLoc(), consumer.getResultType(),
producer.getOperands(),		producer.getOperands(),
rewriter.getI64IntegerAttr(producer.getNumOperands()),		rewriter.getI64IntegerAttr(producer.getNumOperands()),
rewriter.getI64IntegerAttr(1), rewriter.getArrayAttr(indexMapAttrs),		rewriter.getI64IntegerAttr(1), rewriter.getArrayAttr(indexMapAttrs),
producer.iterator_types(),		producer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr);		/*library_call=*/nullptr,
		/*symbol_source=*/nullptr);
auto &fusedRegion = fusedOp.region();		auto &fusedRegion = fusedOp.region();
rewriter.cloneRegionBefore(producer.region(), fusedRegion,		rewriter.cloneRegionBefore(producer.region(), fusedRegion,
fusedRegion.begin());		fusedRegion.begin());
return fusedOp;		return fusedOp;
}		}
};		};

/// Implementation of fusion on tensor ops when producer is a splat constant.		/// Implementation of fusion on tensor ops when producer is a splat constant.
Show All 33 Lines	static Operation *fuse(ConstantOp producer, LinalgOpTy consumer,

auto fusedOp = rewriter.create<LinalgOpTy>(		auto fusedOp = rewriter.create<LinalgOpTy>(
rewriter.getUnknownLoc(), consumer.getResultTypes(), fusedOperands,		rewriter.getUnknownLoc(), consumer.getResultTypes(), fusedOperands,
rewriter.getI64IntegerAttr(consumer.getNumOperands() - 1),		rewriter.getI64IntegerAttr(consumer.getNumOperands() - 1),
rewriter.getI64IntegerAttr(consumer.getNumResults()),		rewriter.getI64IntegerAttr(consumer.getNumResults()),
rewriter.getAffineMapArrayAttr(fusedIndexMaps),		rewriter.getAffineMapArrayAttr(fusedIndexMaps),
consumer.iterator_types(),		consumer.iterator_types(),
/*doc=*/nullptr,		/*doc=*/nullptr,
/*library_call=*/nullptr);		/*library_call=*/nullptr,
		/*symbol_source=*/nullptr);

// Map the block argument corresponding to the replaced argument with the		// Map the block argument corresponding to the replaced argument with the
// scalar constant.		// scalar constant.
Region &consumerRegion = consumer.region();		Region &consumerRegion = consumer.region();
Block &entryBlock = *consumerRegion.begin();		Block &entryBlock = *consumerRegion.begin();
unsigned argIndex =		unsigned argIndex =
entryBlock.getNumArguments() - consumer.getNumOperands() + consumerIdx;		entryBlock.getNumArguments() - consumer.getNumOperands() + consumerIdx;
BlockAndValueMapping mapping;		BlockAndValueMapping mapping;

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp

using edsc::op::operator+;		using edsc::op::operator+;

static SmallVector<Value, 8> makeCanonicalAffineApplies(OpBuilder &b,		static SmallVector<Value, 8> makeCanonicalAffineApplies(OpBuilder &b,
Location loc,		Location loc,
AffineMap map,		AffineMap map,
ArrayRef<Value> vals) {		ArrayRef<Value> vals) {
if (map.isEmpty())		if (map.isEmpty())
return {};		return {};
assert(map.getNumSymbols() == 0);
		ftynse: Please don't commit commented-out code.
assert(map.getNumInputs() == vals.size());		assert(map.getNumInputs() == vals.size());
SmallVector<Value, 8> res;		SmallVector<Value, 8> res;
res.reserve(map.getNumResults());		res.reserve(map.getNumResults());
auto dims = map.getNumDims();		auto dims = map.getNumDims();
for (auto e : map.getResults()) {		for (auto e : map.getResults()) {
auto exprMap = AffineMap::get(dims, 0, e);		auto exprMap = AffineMap::get(dims, map.getNumSymbols(), e);
SmallVector<Value, 4> operands(vals.begin(), vals.end());		SmallVector<Value, 4> operands(vals.begin(), vals.end());
canonicalizeMapAndOperands(&exprMap, &operands);		canonicalizeMapAndOperands(&exprMap, &operands);
res.push_back(affine_apply(exprMap, operands));		res.push_back(affine_apply(exprMap, operands));
}		}
return res;		return res;
}		}

static SmallVector<Value, 4> permuteIvs(ArrayRef<Value> ivs,		static SmallVector<Value, 4> permuteIvs(ArrayRef<Value> ivs,
▲ Show 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	assert(linalgOp.hasBufferSemantics() &&
"expected linalg op with buffer semantics");		"expected linalg op with buffer semantics");
auto &b = ScopedContext::getBuilderRef();		auto &b = ScopedContext::getBuilderRef();
auto loc = ScopedContext::getLocation();		auto loc = ScopedContext::getLocation();
unsigned nInputs = linalgOp.getNumInputs();		unsigned nInputs = linalgOp.getNumInputs();
unsigned nOutputs = linalgOp.getNumOutputs();		unsigned nOutputs = linalgOp.getNumOutputs();
SmallVector<Value, 4> indexedValues;		SmallVector<Value, 4> indexedValues;
indexedValues.reserve(nInputs + nOutputs);		indexedValues.reserve(nInputs + nOutputs);

		auto attr = linalgOp.template getAttrOfType<IntegerAttr>("symbol_source");
		auto allIvsPlusDims = SmallVector<Value, 4>(allIvs.begin(), allIvs.end());
		if (attr) {
		auto operand = linalgOp.getOperand(attr.getInt());
		auto shapedType = operand.getType().template cast<ShapedType>();
		allIvsPlusDims.reserve(allIvs.size() + shapedType.getRank());
		for (unsigned idx = 0, e = shapedType.getRank(); idx < e; ++idx)
		allIvsPlusDims.push_back(b.create<DimOp>(loc, operand, idx));
		}

// TODO: Avoid the loads if the corresponding argument of the		// TODO: Avoid the loads if the corresponding argument of the
// region has no uses.		// region has no uses.
// 1.a. Emit load from input views.		// 1.a. Emit load from input views.
for (unsigned i = 0; i < nInputs; ++i) {		for (unsigned i = 0; i < nInputs; ++i) {
auto indexing = makeCanonicalAffineApplies(		auto indexing = makeCanonicalAffineApplies(
b, loc, linalgOp.getInputIndexingMap(i), allIvs);		b, loc, linalgOp.getInputIndexingMap(i), allIvsPlusDims);
		ftynse: Please don't use `auto` unless it improves readability (e.g. for very long types such as iterators, or when the type is already mentioned in the same statement such as `getAttrOfType`).
// Passing through IndexedValueType emits the proper load operation.		// Passing through IndexedValueType emits the proper load operation.
indexedValues.push_back(IndexedValueType(linalgOp.getInput(i))(indexing));		indexedValues.push_back(IndexedValueType(linalgOp.getInput(i))(indexing));
}		}
// 1.b. Emit load from output views.		// 1.b. Emit load from output views.
		ftynse: Call `.reserve` to allocate space in the vector before calling `push_back` in a loop.
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
		ftynse: Can you just do `for (unsigned i = 0, e = shapedType.getRank(); i < e; ++i)` instead of enumerating the shape and using only the indices?
auto indexing = makeCanonicalAffineApplies(		auto indexing = makeCanonicalAffineApplies(
b, loc, linalgOp.getOutputIndexingMap(i), allIvs);		b, loc, linalgOp.getOutputIndexingMap(i), allIvsPlusDims);
// Passing through IndexedValueType emits the proper load operation.		// Passing through IndexedValueType emits the proper load operation.
indexedValues.push_back(		indexedValues.push_back(
IndexedValueType(linalgOp.getOutputBuffer(i))(indexing));		IndexedValueType(linalgOp.getOutputBuffer(i))(indexing));
}		}

// TODO: When a region inliner exists, use it.		// TODO: When a region inliner exists, use it.
// 2. Inline region, currently only works for a single basic block.		// 2. Inline region, currently only works for a single basic block.
// 3. Emit store.		// 3. Emit store.
SmallVector<SmallVector<Value, 8>, 8> indexing;		SmallVector<SmallVector<Value, 8>, 8> indexing;
SmallVector<Value, 8> outputBuffers;		SmallVector<Value, 8> outputBuffers;
for (unsigned i = 0; i < nOutputs; ++i) {		for (unsigned i = 0; i < nOutputs; ++i) {
indexing.push_back(makeCanonicalAffineApplies(		indexing.push_back(makeCanonicalAffineApplies(
b, loc, linalgOp.getOutputIndexingMap(i), allIvs));		b, loc, linalgOp.getOutputIndexingMap(i), allIvsPlusDims));
outputBuffers.push_back(linalgOp.getOutputBuffer(i));		outputBuffers.push_back(linalgOp.getOutputBuffer(i));
}		}
inlineRegionAndEmitStore<IndexedValueType>(linalgOp, indexedValues, indexing,		inlineRegionAndEmitStore<IndexedValueType>(linalgOp, indexedValues, indexing,
outputBuffers);		outputBuffers);
}		}

template <typename IndexedValueType>		template <typename IndexedValueType>
void emitScalarImplementation(ArrayRef<Value> allIvs, CopyOp copyOp) {		void emitScalarImplementation(ArrayRef<Value> allIvs, CopyOp copyOp) {
▲ Show 20 Lines • Show All 250 Lines • ▼ Show 20 Lines	Optional<LinalgLoops> linalgOpToLoopsImpl(Operation *op, OpBuilder &builder) {
// permutation map (which is asserted in the inverse calculation).		// permutation map (which is asserted in the inverse calculation).
auto linalgOp = cast<ConcreteOpTy>(op);		auto linalgOp = cast<ConcreteOpTy>(op);
assert(linalgOp.hasBufferSemantics() &&		assert(linalgOp.hasBufferSemantics() &&
"expected linalg op with buffer semantics");		"expected linalg op with buffer semantics");
auto mapsRange =		auto mapsRange =
linalgOp.indexing_maps().template getAsRange<AffineMapAttr>();		linalgOp.indexing_maps().template getAsRange<AffineMapAttr>();
auto maps = llvm::to_vector<8>(		auto maps = llvm::to_vector<8>(
llvm::map_range(mapsRange, [](AffineMapAttr a) { return a.getValue(); }));		llvm::map_range(mapsRange, [](AffineMapAttr a) { return a.getValue(); }));
AffineMap invertedMap = inversePermutation(concatAffineMaps(maps));		SmallVector<Value, 8> sizes = getViewSizes(builder, linalgOp);
		AffineMap map = concatAffineMaps(maps);
		ftynse: Please use full sentences in comments: start with a capital letter and terminate with a period, https://llvm.org/docs/CodingStandards.html#commenting.
		if (map.getNumSymbols()) {
		// Ignore symbols for now as they are not supported by inversePermutation.
		nicolasvasilache: This looks quite unsafe to me. It seems the proper way to drop symbols would be to replace them all with `0`. Is this just a temporary thing that needs to be fixed in the next CL, and is the state of the codebase such that incorrect code may be generated?
		ftynse: Good point. Can we just `return {}` to signify failure in the presence of symbols? The tests for generated loops can go into the following commit where the loop bound computation is added. Here, we can just keep the "roundtripping" test that makes sure we print back what we parsed.
		unsigned dims = map.getNumDims();
		SmallVector<AffineExpr, 8> zeros(
		map.getNumSymbols(), getAffineConstantExpr(0, map.getContext()));
		SmallVector<AffineExpr, 8> res;
		for (auto result : map.getResults())
		res.push_back(result.replaceDimsAndSymbols({}, zeros));

		map = AffineMap::get(dims, 0, res, map.getContext());

		// Cut off values that would have been applied to symbols
		sizes.resize(res.size());
		}

		AffineMap invertedMap = inversePermutation(map);
if (!invertedMap)		if (!invertedMap)
return {};		return {};
if (invertedMap.isEmpty()) {		if (invertedMap.isEmpty()) {
emitScalarImplementation<IndexedValueTy>({}, linalgOp);		emitScalarImplementation<IndexedValueTy>({}, linalgOp);
return LinalgLoops();		return LinalgLoops();
}		}

SmallVector<Value, 4> allIvs;		SmallVector<Value, 4> allIvs;
auto loopRanges =		auto loopRanges = emitLoopRanges(scope.getBuilderRef(), scope.getLocation(),
emitLoopRanges(scope.getBuilderRef(), scope.getLocation(), invertedMap,		invertedMap, sizes);
getViewSizes(builder, linalgOp));
GenerateLoopNest<LoopTy>::doit(		GenerateLoopNest<LoopTy>::doit(
loopRanges, linalgOp.iterator_types().getValue(), [&](ValueRange ivs) {		loopRanges, linalgOp.iterator_types().getValue(), [&](ValueRange ivs) {
allIvs.append(ivs.begin(), ivs.end());		allIvs.append(ivs.begin(), ivs.end());
emitScalarImplementation<IndexedValueTy>(allIvs, linalgOp);		emitScalarImplementation<IndexedValueTy>(allIvs, linalgOp);
});		});
// Number of loop ops might be different from the number of ivs since some		// Number of loop ops might be different from the number of ivs since some
// loops like affine.parallel and scf.parallel have multiple ivs.		// loops like affine.parallel and scf.parallel have multiple ivs.
llvm::SetVector<Operation *> loopSet;		llvm::SetVector<Operation *> loopSet;
▲ Show 20 Lines • Show All 201 Lines • Show Last 20 Lines

mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp

Show First 20 Lines • Show All 59 Lines • ▼ Show 20 Lines	for (auto result : results) {
newArgs.push_back(alloc);		newArgs.push_back(alloc);
newResults.push_back(alloc);		newResults.push_back(alloc);
}		}

// Generate a new linalg operation that works on buffers.		// Generate a new linalg operation that works on buffers.
auto linalgOp = rewriter.create<linalg::GenericOp>(		auto linalgOp = rewriter.create<linalg::GenericOp>(
loc, llvm::None, newArgs, rewriter.getI64IntegerAttr(operands.size()),		loc, llvm::None, newArgs, rewriter.getI64IntegerAttr(operands.size()),
rewriter.getI64IntegerAttr(results.size()), op.indexing_maps(),		rewriter.getI64IntegerAttr(results.size()), op.indexing_maps(),
op.iterator_types(), op.docAttr(), op.library_callAttr());		op.iterator_types(), op.docAttr(), op.library_callAttr(),
		op.symbol_sourceAttr());

// Create a new block in the region of the new Generic Op.		// Create a new block in the region of the new Generic Op.
Block &oldBlock = op.getRegion().front();		Block &oldBlock = op.getRegion().front();
Region &newRegion = linalgOp.region();		Region &newRegion = linalgOp.region();
Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),		Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),
oldBlock.getArgumentTypes());		oldBlock.getArgumentTypes());

// Add the result arguments to the new block.		// Add the result arguments to the new block.
▲ Show 20 Lines • Show All 84 Lines • Show Last 20 Lines

mlir/lib/IR/AffineExpr.cpp

Show First 20 Lines • Show All 87 Lines • ▼ Show 20 Lines	case AffineExprKind::Mod:
auto newRHS = rhs.replaceDimsAndSymbols(dimReplacements, symReplacements);		auto newRHS = rhs.replaceDimsAndSymbols(dimReplacements, symReplacements);
if (newLHS == lhs && newRHS == rhs)		if (newLHS == lhs && newRHS == rhs)
return *this;		return *this;
return getAffineBinaryOpExpr(getKind(), newLHS, newRHS);		return getAffineBinaryOpExpr(getKind(), newLHS, newRHS);
}		}
llvm_unreachable("Unknown AffineExpr");		llvm_unreachable("Unknown AffineExpr");
}		}

		/// Replace symbols[0 .. numSymbols - 1] by symbols[shift .. shift + numSymbols - 1].
		AffineExpr AffineExpr::shiftSymbols(unsigned numSymbols, unsigned shift) const {
		SmallVector<AffineExpr, 4> symbols;
		for (unsigned idx = 0; idx < numSymbols; ++idx)
		symbols.push_back(getAffineSymbolExpr(idx + shift, getContext()));
		return replaceDimsAndSymbols({}, symbols);
		nicolasvasilache: Drop `dims` above and make this `return replaceDimsAndSymbols({}, symbols);`?
		}
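
The renumbering that `shiftSymbols` performs can be illustrated outside MLIR. The following is a minimal sketch only: `Expr`, `makeLeaf`, `makeAdd` and `shiftSymbolsToy` are hypothetical stand-ins written for this illustration, not the MLIR `AffineExpr` API, and only addition is modeled.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-in for an affine expression (hypothetical, not mlir::AffineExpr).
enum class Kind { Dim, Symbol, Constant, Add };

struct Expr {
  Kind kind = Kind::Constant;
  long value = 0;                  // Position for Dim/Symbol, value for Constant.
  std::shared_ptr<Expr> lhs, rhs;  // Operands, set only for Add.
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeLeaf(Kind kind, long value) {
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->value = value;
  return e;
}

ExprPtr makeAdd(ExprPtr lhs, ExprPtr rhs) {
  auto e = std::make_shared<Expr>();
  e->kind = Kind::Add;
  e->lhs = std::move(lhs);
  e->rhs = std::move(rhs);
  return e;
}

// Renumbers every symbol s_i as s_(i + shift). Dims and constants are
// returned unchanged; Add nodes are rebuilt from their shifted operands.
ExprPtr shiftSymbolsToy(const ExprPtr &e, unsigned shift) {
  switch (e->kind) {
  case Kind::Symbol:
    return makeLeaf(Kind::Symbol, e->value + shift);
  case Kind::Add:
    return makeAdd(shiftSymbolsToy(e->lhs, shift),
                   shiftSymbolsToy(e->rhs, shift));
  default:
    return e;
  }
}
```

In this toy model, shifting `d0 + s0` by 2 gives `d0 + s2`: the dimension keeps its position and only the symbol is renumbered.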

/// Returns true if this expression is made out of only symbols and		/// Returns true if this expression is made out of only symbols and
/// constants (no dimensional identifiers).		/// constants (no dimensional identifiers).
bool AffineExpr::isSymbolicOrConstant() const {		bool AffineExpr::isSymbolicOrConstant() const {
switch (getKind()) {		switch (getKind()) {
case AffineExprKind::Constant:		case AffineExprKind::Constant:
return true;		return true;
case AffineExprKind::DimId:		case AffineExprKind::DimId:
return false;		return false;
▲ Show 20 Lines • Show All 785 Lines • Show Last 20 Lines

mlir/lib/IR/AffineMap.cpp

Show First 20 Lines • Show All 406 Lines • ▼ Show 20 Lines	uniqueExprs.erase(std::unique(uniqueExprs.begin(), uniqueExprs.end()),
uniqueExprs.end());		uniqueExprs.end());
return AffineMap::get(map.getNumDims(), map.getNumSymbols(), uniqueExprs,		return AffineMap::get(map.getNumDims(), map.getNumSymbols(), uniqueExprs,
map.getContext());		map.getContext());
}		}

AffineMap mlir::inversePermutation(AffineMap map) {		AffineMap mlir::inversePermutation(AffineMap map) {
if (map.isEmpty())		if (map.isEmpty())
return map;		return map;
assert(map.getNumSymbols() == 0 && "expected map without symbols");		assert(map.getNumSymbols() == 0 && "expected map without symbols");
		ftynse: This function is still not supposed to work for maps with symbols, IIUC.
		limo1996 (Author): At this point it should. The next diff introduces a function that bypasses the inversePermutation call.
		ftynse: I see two bad practices here:
		- trying to commit commented-out code: if the assertion is no longer necessary, remove it completely, add a test that exercises the new behavior, and update the doc;
		- commits are not self-contained: if this change is only required for the next commit, it should be included in that commit and removed from this one.
		nicolasvasilache: I don't see how this is correct. Bypassing one caller has no effect on other potential callers, and the code would be buggy. Let's revert the 2 changes to this function please.
SmallVector<AffineExpr, 4> exprs(map.getNumDims());		SmallVector<AffineExpr, 4> exprs(map.getNumDims());
for (auto en : llvm::enumerate(map.getResults())) {		for (auto en : llvm::enumerate(map.getResults())) {
auto expr = en.value();		auto expr = en.value();
// Skip non-permutations.		// Skip non-permutations.
if (auto d = expr.dyn_cast<AffineDimExpr>()) {		if (auto d = expr.dyn_cast<AffineDimExpr>()) {
if (exprs[d.getPosition()])		if (exprs[d.getPosition()])
continue;		continue;
exprs[d.getPosition()] = getAffineDimExpr(en.index(), d.getContext());		exprs[d.getPosition()] = getAffineDimExpr(en.index(), d.getContext());
}		}
}		}
SmallVector<AffineExpr, 4> seenExprs;		SmallVector<AffineExpr, 4> seenExprs;
seenExprs.reserve(map.getNumDims());		seenExprs.reserve(map.getNumDims());
for (auto expr : exprs)		for (auto expr : exprs)
if (expr)		if (expr)
seenExprs.push_back(expr);		seenExprs.push_back(expr);
if (seenExprs.size() != map.getNumInputs())		if (seenExprs.size() != map.getNumInputs())
		ftynse: Is this change necessary?
		limo1996 (Author): When numSymbols == 0 then numDims == numInputs, but yeah, no need for the change.
		limo1996 (Author): Actually, at this point there is a need for it. I will revert it in the next commit.
		nicolasvasilache: If it is not necessary, let's revert it now. I don't see value in making a change that is reversed in a subsequent commit without a solid justification.
return AffineMap();		return AffineMap();
return AffineMap::get(map.getNumResults(), 0, seenExprs, map.getContext());		return AffineMap::get(map.getNumResults(), 0, seenExprs, map.getContext());
}		}
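
To make the reviewers' concern about non-permutation results easier to follow, here is a minimal standalone sketch of the inversion logic above. It assumes a simplified encoding that is not part of MLIR: each result is stored as the dim position it reads, or `-1` when it is anything other than a bare dim (for example `d0 + d1`). The name `toyInversePermutation` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// results[i] is the dim position read by result i, or -1 if result i is not
// a bare dim. On success, inverse[d] holds the index of the first result
// that reads dim d. Returns false if some dim is never read by a bare-dim
// result, which mirrors the null AffineMap returned by the code above.
bool toyInversePermutation(const std::vector<int> &results, unsigned numDims,
                           std::vector<int> &inverse) {
  inverse.assign(numDims, -1);
  unsigned seen = 0;
  for (std::size_t i = 0; i < results.size(); ++i) {
    int d = results[i];
    // Skip non-permutation results and dims that were already seen.
    if (d < 0 || static_cast<unsigned>(d) >= numDims || inverse[d] != -1)
      continue;
    inverse[d] = static_cast<int>(i);
    ++seen;
  }
  return seen == numDims;
}
```

For a matmul-style concatenation `(d0, d2), (d2, d1), (d0, d1)`, encoded as `{0, 2, 2, 1, 0, 1}` with three dims, the sketch yields `{0, 3, 1}`. For a 1-D convolution whose input access is not a bare dim, encoded as `{-1, 1, 0}` with two dims, it yields `{2, 1}`: the loop bounds come from the filter and output accesses only.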

AffineMap mlir::concatAffineMaps(ArrayRef<AffineMap> maps) {		AffineMap mlir::concatAffineMaps(ArrayRef<AffineMap> maps) {
unsigned numResults = 0;		unsigned numResults = 0, numDims = 0, numSymbols = 0;
for (auto m : maps)		for (auto m : maps)
numResults += m.getNumResults();		numResults += m.getNumResults();
unsigned numDims = 0;
SmallVector<AffineExpr, 8> results;		SmallVector<AffineExpr, 8> results;
results.reserve(numResults);		results.reserve(numResults);
for (auto m : maps) {		for (auto m : maps) {
assert(m.getNumSymbols() == 0 && "expected map without symbols");		for (auto res : m.getResults())
results.append(m.getResults().begin(), m.getResults().end());		results.push_back(res.shiftSymbols(m.getNumSymbols(), numSymbols));

		numSymbols += m.getNumSymbols();
		ftynse: There are more of these, I only commented on the first occurrence.
numDims = std::max(m.getNumDims(), numDims);		numDims = std::max(m.getNumDims(), numDims);
}		}
return AffineMap::get(numDims, /*numSymbols=*/0, results,		return AffineMap::get(numDims, numSymbols, results,
maps.front().getContext());		maps.front().getContext());
}		}
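
The effect of the new `concatAffineMaps` on symbols can be sketched with a toy model. The assumptions are stated up front: `LinearExpr`, `ToyMap` and `concatToyMaps` are hypothetical names, results are restricted to linear forms stored as coefficient vectors, and none of this is the MLIR API. The point is only that each map's local symbols are moved past the symbols of the maps before it, while dims are shared.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy linear form: sum(dimCoeffs[i] * d_i) + sum(symCoeffs[j] * s_j) + constant.
struct LinearExpr {
  std::vector<long> dimCoeffs;
  std::vector<long> symCoeffs;
  long constant;
};

struct ToyMap {
  unsigned numDims;
  unsigned numSymbols;
  std::vector<LinearExpr> results;
};

// Concatenates the results of all maps. Dims are shared across maps, so the
// dim count is the maximum; symbols are private to each map, so the symbols
// of every map are shifted past those of the maps that precede it.
ToyMap concatToyMaps(const std::vector<ToyMap> &maps) {
  ToyMap out{0, 0, {}};
  for (const ToyMap &m : maps) {
    for (LinearExpr res : m.results) {
      // Local s_j becomes global s_(j + out.numSymbols): pad with zeros in front.
      res.symCoeffs.insert(res.symCoeffs.begin(),
                           static_cast<std::size_t>(out.numSymbols), 0L);
      out.results.push_back(std::move(res));
    }
    out.numSymbols += m.numSymbols;
    out.numDims = std::max(out.numDims, m.numDims);
  }
  return out;
}
```

Under this model, concatenating `(d0)[s0] -> (d0 + s0)` with `(d0, d1)[s0] -> (d1 + 2 * s0)` gives a map with two dims and two symbols whose second result reads `s1` rather than `s0`.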

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// MutableAffineMap.		// MutableAffineMap.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

MutableAffineMap::MutableAffineMap(AffineMap map)		MutableAffineMap::MutableAffineMap(AffineMap map)
Show All 37 Lines

mlir/test/Dialect/Linalg/invalid.mlir

Show First 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	linalg.generic {
^bb(%0: f32):		^bb(%0: f32):
linalg.yield		linalg.yield
}: memref<f32>		}: memref<f32>
}		}

// -----		// -----

func @generic_symbol_in_map(%arg0: memref<i32>) {		func @generic_symbol_in_map(%arg0: memref<i32>) {
// expected-error @+1 {{op expected indexing_map #0 to have no symbols}}		// expected-error @+1 {{expected the number of symbols in indexing_map #0 to match target rank}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<()[N] -> (0)> ],		indexing_maps = [ affine_map<()[N] -> (0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
^bb(%i : i32):		^bb(%i : i32):
linalg.yield %i : i32		linalg.yield %i : i32
}: memref<i32>		}: memref<i32>
}		}

// -----		// -----

		func @generic_symbol_source_out_of_range(%arg0: memref<i32>) {
		// expected-error @+1 {{symbol_source index out of range}}
		linalg.generic {
		args_in = 0,
		args_out = 1,
		indexing_maps = [ affine_map<()[N] -> (0)> ],
		iterator_types = ["parallel"],
		symbol_source = 1
		} %arg0 {
		^bb(%i : i32):
		linalg.yield %i : i32
		}: memref<i32>
		}

		// -----

func @generic_wrong_dim_in_map(%arg0: memref<1xi32>) {		func @generic_wrong_dim_in_map(%arg0: memref<1xi32>) {
// expected-error @+1 {{op expected indexing_map #0 to have 1 dim(s) to match the number of loops}}		// expected-error @+1 {{op expected indexing_map #0 to have 1 dim(s) to match the number of loops}}
linalg.generic {		linalg.generic {
args_in = 0,		args_in = 0,
args_out = 1,		args_out = 1,
indexing_maps = [ affine_map<() -> (0)> ],		indexing_maps = [ affine_map<() -> (0)> ],
iterator_types = ["parallel"]		iterator_types = ["parallel"]
} %arg0 {		} %arg0 {
▲ Show 20 Lines • Show All 356 Lines • Show Last 20 Lines

mlir/test/Dialect/Linalg/loops.mlir

	// RUN: mlir-opt %s -convert-linalg-to-loops | FileCheck --check-prefix=CHECKLOOP %s			// RUN: mlir-opt %s -convert-linalg-to-loops | FileCheck --check-prefix=CHECKLOOP %s
	// RUN: mlir-opt %s -convert-linalg-to-parallel-loops | FileCheck --check-prefix=CHECKPARALLEL %s			// RUN: mlir-opt %s -convert-linalg-to-parallel-loops | FileCheck --check-prefix=CHECKPARALLEL %s

	// Test that we can lower all the way to LLVM without crashing, don't check results here.			// Test that we can lower all the way to LLVM without crashing, don't check results here.
	// RUN: mlir-opt %s -convert-linalg-to-loops -convert-linalg-to-llvm -o=/dev/null 2>&1			// RUN: mlir-opt %s -convert-linalg-to-loops -convert-linalg-to-llvm -o=/dev/null 2>&1

	// CHECKLOOP-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>			// CHECKLOOP-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>
	// CHECKLOOP-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>			// CHECKLOOP-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>
	// CHECKLOOP-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECKLOOP-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>
	// CHECKLOOP-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>			// CHECKLOOP-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>
	// CHECKLOOP-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>			// CHECKLOOP-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>

	// CHECKLOOP-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>			// CHECKLOOP-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
	// CHECKLOOP-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>			// CHECKLOOP-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>
	// CHECKLOOP-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>			// CHECKLOOP-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>
	// CHECKLOOP-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>			// CHECKLOOP-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>
				// CHECKLOOP-DAG: #[[$convMap:.*]] = affine_map<(d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2)>
				ftynse: Nit: the expression doesn't seem to correspond to a _strided_ convolution.

	// CHECKPARALLEL-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>			// CHECKPARALLEL-DAG: #[[$strided1D:.*]] = affine_map<(d0)[s0] -> (d0 + s0)>
	// CHECKPARALLEL-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>			// CHECKPARALLEL-DAG: #[[$strided2D:.*]] = affine_map<(d0, d1)[s0, s1] -> (d0 * s1 + s0 + d1)>
	// CHECKPARALLEL-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>			// CHECKPARALLEL-DAG: #[[$strided3D:.*]] = affine_map<(d0, d1, d2)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2 + d2)>
	// CHECKPARALLEL-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>			// CHECKPARALLEL-DAG: #[[$strided4D:.*]] = affine_map<(d0, d1, d2, d3)[s0, s1, s2, s3] -> (d0 * s1 + s0 + d1 * s2 + d2 * s3 + d3)>
	// CHECKPARALLEL-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>			// CHECKPARALLEL-DAG: #[[$clampMinMap:.*]] = affine_map<(d0) -> (d0, 0)>

	// CHECKPARALLEL-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>			// CHECKPARALLEL-DAG: #[[$stride1Dilation1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
	// CHECKPARALLEL-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>			// CHECKPARALLEL-DAG: #[[$stride2Dilation1:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1)>
	// CHECKPARALLEL-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>			// CHECKPARALLEL-DAG: #[[$stride2Dilation4:.*]] = affine_map<(d0, d1) -> (d0 * 2 + d1 * 4)>
	// CHECKPARALLEL-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>			// CHECKPARALLEL-DAG: #[[$stride3Dilation5:.*]] = affine_map<(d0, d1) -> (d0 * 3 + d1 * 5)>
				// CHECKPARALLEL-DAG: #[[$convMap:.*]] = affine_map<(d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2)>


	func @matmul(%arg0: memref<?xi8>, %M: index, %N: index, %K: index) {			func @matmul(%arg0: memref<?xi8>, %M: index, %N: index, %K: index) {
	%c0 = constant 0 : index			%c0 = constant 0 : index
	%c1 = constant 1 : index			%c1 = constant 1 : index
	%A = view %arg0[%c0][%M, %K] : memref<?xi8> to memref<?x?xf32>			%A = view %arg0[%c0][%M, %K] : memref<?xi8> to memref<?x?xf32>
	%B = view %arg0[%c0][%K, %N] : memref<?xi8> to memref<?x?xf32>			%B = view %arg0[%c0][%K, %N] : memref<?xi8> to memref<?x?xf32>
	%C = view %arg0[%c0][%M, %N] : memref<?xi8> to memref<?x?xf32>			%C = view %arg0[%c0][%M, %N] : memref<?xi8> to memref<?x?xf32>
	▲ Show 20 Lines • Show All 869 Lines • ▼ Show 20 Lines
	// CHECKPARALLEL: scf.parallel (%[[b:.*]], %[[m:.*]], %[[n:.*]]) = ({{.*}}) to (%[[B]], %[[M]], %[[N]]) step ({{.*}}) {			// CHECKPARALLEL: scf.parallel (%[[b:.*]], %[[m:.*]], %[[n:.*]]) = ({{.*}}) to (%[[B]], %[[M]], %[[N]]) step ({{.*}}) {
	// CHECKPARALLEL: scf.for %[[k:.*]] = %{{.*}} to %[[K]] step %{{.*}} {			// CHECKPARALLEL: scf.for %[[k:.*]] = %{{.*}} to %[[K]] step %{{.*}} {
	// CHECKPARALLEL: %[[va:.*]] = load %[[mA]][%[[b]], %[[m]], %[[k]]] : memref<?x?x?xf32>			// CHECKPARALLEL: %[[va:.*]] = load %[[mA]][%[[b]], %[[m]], %[[k]]] : memref<?x?x?xf32>
	// CHECKPARALLEL: %[[vb:.*]] = load %[[mB]][%[[b]], %[[k]], %[[n]]] : memref<?x?x?xf32>			// CHECKPARALLEL: %[[vb:.*]] = load %[[mB]][%[[b]], %[[k]], %[[n]]] : memref<?x?x?xf32>
	// CHECKPARALLEL: %[[vc:.*]] = load %[[mC]][%[[b]], %[[m]], %[[n]]] : memref<?x?x?xf32>			// CHECKPARALLEL: %[[vc:.*]] = load %[[mC]][%[[b]], %[[m]], %[[n]]] : memref<?x?x?xf32>
	// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32			// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
	// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32			// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
	// CHECKPARALLEL: store %[[res]], %[[mC]][%[[b]], %[[m]], %[[n]]] : memref<?x?x?xf32>			// CHECKPARALLEL: store %[[res]], %[[mC]][%[[b]], %[[m]], %[[n]]] : memref<?x?x?xf32>

				#conv_1d_accesses = [
				affine_map<(m, n)[s0] -> (m + n - s0 floordiv 2)>, // in
				affine_map<(m, n)[s0] -> (n)>, // filter
				affine_map<(m, n)[s0] -> (m)> // out
				]

				#conv_1d_trait = {
				args_in = 2,
				args_out = 1,
				doc = "C(m) += A(m) * B(n)",
				indexing_maps = #conv_1d_accesses,
				library_call = "linalg_conv_1d",
				n_views = [2, 1],
				iterator_types = ["parallel", "parallel"],
				symbol_source = 1
				}

				func @conv1d(%in : memref<?xf32>, %filter : memref<?xf32>, %out : memref<?xf32>) -> () {
				linalg.generic #conv_1d_trait %in, %filter, %out {
				^bb0(%a: f32, %b: f32, %c: f32) :
				%d = mulf %a, %b : f32
				%e = addf %c, %d : f32
				linalg.yield %e : f32
				} : memref<?xf32>,
				memref<?xf32>,
				memref<?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv1d
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				ftynse: Please don't pattern-match on SSA value names (`%c0`), they are not guaranteed to be stable.
				limo1996 (Author): Hmm, I saw them in other tests so I just used them. When the naming convention changes, other tests will need changes as well.
				ftynse: The fact that other tests currently contradict the testing guide https://mlir.llvm.org/getting_started/TestingGuide/ because of legacy (they likely existed before the current convention was adopted) does not mean you are allowed to commit new code that also contradicts the testing guide. If you notice such tests, the proper hygiene is to update them as well, in a separate commit.
				There is _no_ naming convention for SSA names. The only convention is that SSA names can change at any time without any warning; that's why tests should not rely on them. If you disagree with the guide, feel free to start a discussion on the forum.
				MLIR currently has ~400 test input files totalling ~70k LoC. Updating matches for SSA names in all of them would be extremely painful, would likely take days or weeks even with automation, and would require multiple rounds because other commits will be editing tests concurrently. This sounds like an extremely poor use of engineering time, compared to several minutes spent writing the proper match conditions for each new commit.
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg2]], %[[c0]] : memref<?xf32>
				// CHECKLOOP: scf.for %[[b:.*]] = %{{.*}} to %[[dim1]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[m:.*]] = %{{.*}} to %[[dim0]] step %{{.*}} {
				ftynse: Hmm, would it be possible to put this operation before the loops (actually, there is one already, maybe we can just reuse its value)?
				limo1996 (Author): Hmm, I think `--cse` solves it, right?
				ftynse: If it is simple to do in your code, you should do it instead of expecting the caller to run CSE after the fact. Otherwise, we risk getting into a situation where each pass requires five other cleanup passes to run, each of which also requires five other cleanup passes, and so on. If doing so feels like you start reimplementing a generic CSE, then you shouldn't do it. It's a trade-off that must be considered rather than shrugged off.
				limo1996 (Author): I will do it in the follow-up commit.
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKLOOP: %[[aff:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim2]]]
				// CHECKLOOP: %[[va:.*]] = load %[[arg0]][%[[aff]]] : memref<?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg1]][%[[m]]] : memref<?xf32>
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[b]]] : memref<?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[b]]] : memref<?xf32>

				// CHECKPARALLEL-LABEL: @conv1d
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg2]], %[[c0]] : memref<?xf32>
				// CHECKPARALLEL: scf.parallel (%[[b:.*]], %[[m:.*]]) = (%{{.*}}, %{{.*}}) to (%[[dim1]], %[[dim0]]) step ({{.*}}) {
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg1]], %[[c0]] : memref<?xf32>
				// CHECKPARALLEL: %[[aff:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim2]]]
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff]]] : memref<?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[m]]] : memref<?xf32>
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[b]]] : memref<?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[b]]] : memref<?xf32>

				#conv_2d_accesses = [
				affine_map<(m, n, m1, n1)[s0, s1] -> (m + m1 - s0 floordiv 2, n + n1 - s1 floordiv 2)>, // in
				affine_map<(m, n, m1, n1)[s0, s1] -> (m1, n1)>, // filter
				affine_map<(m, n, m1, n1)[s0, s1] -> (m, n)> // out
				]

				#conv_2d_trait = {
				args_in = 2,
				args_out = 1,
				doc = "C(m,n) += A(m,n) * B(m1,n1)",
				indexing_maps = #conv_2d_accesses,
				library_call = "linalg_conv_2d",
				n_views = [2, 1],
				iterator_types = ["parallel", "parallel", "parallel", "parallel"],
				symbol_source = 1
				}

				func @conv2d(%in : memref<?x?xf32>, %filter : memref<?x?xf32>, %out : memref<?x?xf32>) -> () {
				linalg.generic #conv_2d_trait %in, %filter, %out {
				^bb0(%a: f32, %b: f32, %c: f32) :
				%d = mulf %a, %b : f32
				%e = addf %c, %d : f32
				linalg.yield %e : f32
				} : memref<?x?xf32>,
				memref<?x?xf32>,
				memref<?x?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv2d
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim3:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?xf32>
				// CHECKLOOP: scf.for %[[i0:.*]] = %{{.*}} to %[[dim2]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i1:.*]] = %{{.*}} to %[[dim3]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i2:.*]] = %{{.*}} to %[[dim0]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i3:.*]] = %{{.*}} to %[[dim1]] step %{{.*}} {
				// CHECKLOOP: %[[dim4:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKLOOP: %[[dim5:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKLOOP: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim4]]]
				// CHECKLOOP: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim5]]]
				// CHECKLOOP: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]]] : memref<?x?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg1]][%[[i2]], %[[i3]]] : memref<?x?xf32>
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]]] : memref<?x?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[i0]], %[[i1]]] : memref<?x?xf32>

				// CHECKPARALLEL-LABEL: @conv2d
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim3:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?xf32>
				// CHECKPARALLEL: scf.parallel (%[[i0:.*]], %[[i1:.*]], %[[i2:.*]], %[[i3:.*]]) = (%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) to (%[[dim2]], %[[dim3]], %[[dim0]], %[[dim1]]) step ({{.*}}) {
				// CHECKPARALLEL: %[[dim4:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[dim5:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim4]]]
				// CHECKPARALLEL: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim5]]]
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[i2]], %[[i3]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]]] : memref<?x?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[i0]], %[[i1]]] : memref<?x?xf32>

				#conv_3d_accesses = [
				affine_map<(m, n, k, m1, n1, k1)[s0, s1, s2] -> (m + m1 - s0 floordiv 2, n + n1 - s1 floordiv 2, k + k1 - s2 floordiv 2)>, // in
				affine_map<(m, n, k, m1, n1, k1)[s0, s1, s2] -> (m1, n1, k1)>, // filter
				affine_map<(m, n, k, m1, n1, k1)[s0, s1, s2] -> (m, n, k)> // out
				]

				#conv_3d_trait = {
				args_in = 2,
				args_out = 1,
				doc = "C(m,n,k) += A(m,n,k) * B(m1,n1,k1)",
				indexing_maps = #conv_3d_accesses,
				library_call = "linalg_conv_3d",
				n_views = [2, 1],
				iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],
				symbol_source = 1
				}

				func @conv3d(%in : memref<?x?x?xf32>, %filter : memref<?x?x?xf32>, %out : memref<?x?x?xf32>) -> () {
				linalg.generic #conv_3d_trait %in, %filter, %out {
				^bb0(%a: f32, %b: f32, %c: f32) :
				%d = mulf %a, %b : f32
				%e = addf %c, %d : f32
				linalg.yield %e : f32
				} : memref<?x?x?xf32>,
				memref<?x?x?xf32>,
				memref<?x?x?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv3d
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[c2:.*]] = constant 2 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim3:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim4:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim5:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?xf32>
				// CHECKLOOP: scf.for %[[i0:.*]] = %{{.*}} to %[[dim3]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i1:.*]] = %{{.*}} to %[[dim4]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i2:.*]] = %{{.*}} to %[[dim5]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i3:.*]] = %{{.*}} to %[[dim0]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i4:.*]] = %{{.*}} to %[[dim1]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i5:.*]] = %{{.*}} to %[[dim2]] step %{{.*}} {
				// CHECKLOOP: %[[dim6:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim7:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[dim8:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim6]]]
				// CHECKLOOP: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim7]]]
				// CHECKLOOP: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim8]]]
				// CHECKLOOP: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg1]][%[[i3]], %[[i4]], %[[i5]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]]] : memref<?x?x?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]]] : memref<?x?x?xf32>

				// CHECKPARALLEL-LABEL: @conv3d
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[c2:.*]] = constant 2 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim3:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim4:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim5:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?xf32>
				// CHECKPARALLEL: scf.parallel (%[[i0:.*]], %[[i1:.*]], %[[i2:.*]], %[[i3:.*]], %[[i4:.*]], %[[i5:.*]]) = (%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) to (%[[dim3]], %[[dim4]], %[[dim5]], %[[dim0]], %[[dim1]], %[[dim2]]) step ({{.*}}) {
				// CHECKPARALLEL: %[[dim6:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim7:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[dim8:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim6]]]
				// CHECKPARALLEL: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim7]]]
				// CHECKPARALLEL: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim8]]]
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[i3]], %[[i4]], %[[i5]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]]] : memref<?x?x?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]]] : memref<?x?x?xf32>

				#conv_4d_accesses = [
				affine_map<(m, n, k, l, m1, n1, k1, l1)[s0, s1, s2, s3] -> (m + m1 - s0 floordiv 2, n + n1 - s1 floordiv 2, k + k1 - s2 floordiv 2, l + l1 - s3 floordiv 2)>, // in
				affine_map<(m, n, k, l, m1, n1, k1, l1)[s0, s1, s2, s3] -> (m1, n1, k1, l1)>, // filter
				affine_map<(m, n, k, l, m1, n1, k1, l1)[s0, s1, s2, s3] -> (m, n, k, l)> // out
				]

				#conv_4d_trait = {
				args_in = 2,
				args_out = 1,
				doc = "C(m,n,k,l) += A(m,n,k,l) * B(m1,n1,k1,l1)",
				indexing_maps = #conv_4d_accesses,
				library_call = "linalg_conv_4d",
				n_views = [2, 1],
				iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel", "parallel"],
				symbol_source = 1
				}

				func @conv4d(%in : memref<?x?x?x?xf32>, %filter : memref<?x?x?x?xf32>, %out : memref<?x?x?x?xf32>) -> () {
				linalg.generic #conv_4d_trait %in, %filter, %out {
				^bb0(%a: f32, %b: f32, %c: f32) :
				%d = mulf %a, %b : f32
				%e = addf %c, %d : f32
				linalg.yield %e : f32
				} : memref<?x?x?x?xf32>,
				memref<?x?x?x?xf32>,
				memref<?x?x?x?xf32>
				return
				}

				// CHECKLOOP-LABEL: @conv4d
				// CHECKLOOP-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKLOOP-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKLOOP: %[[c0:.*]] = constant 0 : index
				// CHECKLOOP: %[[c1:.*]] = constant 1 : index
				// CHECKLOOP: %[[c2:.*]] = constant 2 : index
				// CHECKLOOP: %[[c3:.*]] = constant 3 : index
				// CHECKLOOP: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim3:.*]] = dim %[[arg1]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim4:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim5:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim6:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim7:.*]] = dim %[[arg2]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKLOOP: scf.for %[[i0:.*]] = %{{.*}} to %[[dim4]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i1:.*]] = %{{.*}} to %[[dim5]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i2:.*]] = %{{.*}} to %[[dim6]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i3:.*]] = %{{.*}} to %[[dim7]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i4:.*]] = %{{.*}} to %[[dim0]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i5:.*]] = %{{.*}} to %[[dim1]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i6:.*]] = %{{.*}} to %[[dim2]] step %{{.*}} {
				// CHECKLOOP: scf.for %[[i7:.*]] = %{{.*}} to %[[dim3]] step %{{.*}} {
				// CHECKLOOP: %[[dim8:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim9:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim10:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[dim11:.*]] = dim %[[arg1]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim8]]]
				// CHECKLOOP: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim9]]]
				// CHECKLOOP: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim10]]]
				// CHECKLOOP: %[[aff4:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim11]]]
				// CHECKLOOP: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]], %[[aff4]]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[vb:.*]] = load %[[arg1]][%[[i4]], %[[i5]], %[[i6]], %[[i7]]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>
				// CHECKLOOP: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKLOOP: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKLOOP: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>

				// CHECKPARALLEL-LABEL: @conv4d
				// CHECKPARALLEL-SAME: %[[arg0:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg1:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKPARALLEL-SAME: %[[arg2:[a-zA-Z0-9]+]]: memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[c0:.*]] = constant 0 : index
				// CHECKPARALLEL: %[[c1:.*]] = constant 1 : index
				// CHECKPARALLEL: %[[c2:.*]] = constant 2 : index
				// CHECKPARALLEL: %[[c3:.*]] = constant 3 : index
				// CHECKPARALLEL: %[[dim0:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim1:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim2:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim3:.*]] = dim %[[arg1]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim4:.*]] = dim %[[arg2]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim5:.*]] = dim %[[arg2]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim6:.*]] = dim %[[arg2]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim7:.*]] = dim %[[arg2]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: scf.parallel (%[[i0:.*]], %[[i1:.*]], %[[i2:.*]], %[[i3:.*]], %[[i4:.*]], %[[i5:.*]], %[[i6:.*]], %[[i7:.*]]) = (%{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) to (%[[dim4]], %[[dim5]], %[[dim6]], %[[dim7]], %[[dim0]], %[[dim1]], %[[dim2]], %[[dim3]]) step ({{.*}}) {
				// CHECKPARALLEL: %[[dim8:.*]] = dim %[[arg1]], %[[c0]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim9:.*]] = dim %[[arg1]], %[[c1]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim10:.*]] = dim %[[arg1]], %[[c2]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[dim11:.*]] = dim %[[arg1]], %[[c3]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[aff1:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim8]]]
				// CHECKPARALLEL: %[[aff2:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim9]]]
				// CHECKPARALLEL: %[[aff3:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim10]]]
				// CHECKPARALLEL: %[[aff4:.*]] = affine.apply #[[$convMap]](%{{.*}}, %{{.*}})[%[[dim11]]]
				// CHECKPARALLEL: %[[va:.*]] = load %[[arg0]][%[[aff1]], %[[aff2]], %[[aff3]], %[[aff4]]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[vb:.*]] = load %[[arg1]][%[[i4]], %[[i5]], %[[i6]], %[[i7]]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[vc:.*]] = load %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>
				// CHECKPARALLEL: %[[inc:.*]] = mulf %[[va]], %[[vb]] : f32
				// CHECKPARALLEL: %[[res:.*]] = addf %[[vc]], %[[inc]] : f32
				// CHECKPARALLEL: store %[[res]], %[[arg2]][%[[i0]], %[[i1]], %[[i2]], %[[i3]]] : memref<?x?x?x?xf32>
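
For readers tracing the expected output: the CHECKLOOP lines for @conv2d, @conv3d and @conv4d describe a loop nest over the output dimensions followed by the filter dimensions, with each input index computed by applying the offset map and binding its symbol to the matching filter dimension (the operand selected by symbol_source = 1). The standalone C++ sketch below mirrors that structure for the 2-D case. It is not code from this revision: the names floorDiv, convMap, Matrix and conv2d are hypothetical, the map body is assumed to be (d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2) based on the access maps in the test, and the bounds check that skips out-of-range input reads is an addition of the sketch, since the expected output shows no such guard.

```cpp
#include <cstdint>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Floor division (rounds toward negative infinity), as affine `floordiv` does.
// C++ `/` truncates toward zero, so inexact negative quotients need a correction.
inline int64_t floorDiv(int64_t lhs, int64_t rhs) {
  int64_t q = lhs / rhs;
  bool inexact = (lhs % rhs) != 0;
  bool negative = (lhs < 0) != (rhs < 0);
  return (inexact && negative) ? q - 1 : q;
}

// Assumed body of the offset map: (d0, d1)[s0] -> (d0 + d1 - s0 floordiv 2).
inline int64_t convMap(int64_t d0, int64_t d1, int64_t s0) {
  return d0 + d1 - floorDiv(s0, 2);
}

// Loop nest in the order of the checked scf.for ops: output dims, then filter dims.
void conv2d(const Matrix &in, const Matrix &filter, Matrix &out) {
  const int64_t inRows = static_cast<int64_t>(in.size());
  const int64_t inCols = inRows ? static_cast<int64_t>(in[0].size()) : 0;
  const int64_t fRows = static_cast<int64_t>(filter.size());
  const int64_t fCols = fRows ? static_cast<int64_t>(filter[0].size()) : 0;
  const int64_t outRows = static_cast<int64_t>(out.size());
  const int64_t outCols = outRows ? static_cast<int64_t>(out[0].size()) : 0;

  for (int64_t i0 = 0; i0 < outRows; ++i0)
    for (int64_t i1 = 0; i1 < outCols; ++i1)
      for (int64_t i2 = 0; i2 < fRows; ++i2)
        for (int64_t i3 = 0; i3 < fCols; ++i3) {
          // The symbol of each map application is the filter size along that dim.
          const int64_t r = convMap(i0, i2, fRows);
          const int64_t c = convMap(i1, i3, fCols);
          // Sketch-only guard: treat out-of-range input reads as zero.
          if (r < 0 || r >= inRows || c < 0 || c >= inCols)
            continue;
          out[i0][i1] += in[r][c] * filter[i2][i3];
        }
}
```

With a 3x3 filter whose only nonzero entry is a 1 at its center, convMap(i, 1, 3) evaluates to i, so conv2d copies in into a zero-initialized out of the same shape.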

mlir/test/lib/Transforms/TestBufferPlacement.cpp

@@ matchAndRewrite(linalg::GenericOp op, ArrayRef<Value> operands,
  newArgs.push_back(alloc);
  newResults.push_back(alloc);
  }

  // Generate a new linalg operation that works on buffers.
  auto linalgOp = rewriter.create<linalg::GenericOp>(
      loc, llvm::None, newArgs, rewriter.getI64IntegerAttr(operands.size()),
      rewriter.getI64IntegerAttr(results.size()), op.indexing_maps(),
-     op.iterator_types(), op.docAttr(), op.library_callAttr());
+     op.iterator_types(), op.docAttr(), op.library_callAttr(),
+     op.symbol_sourceAttr());

  // Create a new block in the region of the new Generic Op.
  Block &oldBlock = op.getRegion().front();
  Region &newRegion = linalgOp.region();
  Block *newBlock = rewriter.createBlock(&newRegion, newRegion.begin(),
      oldBlock.getArgumentTypes());

  // Map the old block arguments to the new ones.

Revision Contents

Diff 279301