This is an archive of the discontinued LLVM Phabricator instance.

Differential D138119

Introduce `tensor.pack` and `tensor.unpack` operations
ClosedPublic

Authored by chelini on Nov 16 2022, 4:35 AM.

Details

Reviewers

nicolasvasilache
rengolin
mravishankar
hanchung
silvas
stellaraccident
mehdi_amini

Commits

rG9aa505a28d82: Introduce `tensor.pack` and `tensor.unpack` operations

Summary

Pack and Unpack return new tensors within which the individual elements
are reshuffled according to the packing specification. This has the
consequence of modifying the canonical order in which a given operator
(e.g., Matmul) accesses the individual elements. After bufferization,
this typically translates to better access locality and cache
behavior, e.g., by eliminating cache line splitting.

Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Han-Chung Wang <hanchung@google.com>

RFC: https://discourse.llvm.org/t/rfc-tensor-pack-and-tensor-unpack/66408/1
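
To make the reshuffling described in the summary concrete, here is a minimal sketch in plain C++ (not code from this patch) of where a single element lands for the simplest NC_to_NCnc case. It assumes `inner_dims_pos = [0, 1]`, no `outer_dims_perm`, and tiles that divide the source shape evenly; the names `packedIndexNC` and `linearOffset` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the patch): maps element (i, j) of a 2-D
// source to its 4-D index in the packed tensor for NC_to_NCnc, assuming
// inner_dims_pos = [0, 1], no outer_dims_perm, inner_tiles = [tileN, tileC].
std::array<int64_t, 4> packedIndexNC(int64_t i, int64_t j, int64_t tileN,
                                     int64_t tileC) {
  assert(i >= 0 && j >= 0 && tileN > 0 && tileC > 0);
  // Outer coordinates select the tile, inner coordinates the element in it.
  return {i / tileN, j / tileC, i % tileN, j % tileC};
}

// Row-major linear offset of a 4-D index inside a tensor of the given shape.
int64_t linearOffset(const std::array<int64_t, 4> &index,
                     const std::array<int64_t, 4> &shape) {
  int64_t offset = 0;
  for (std::size_t d = 0; d < 4; ++d)
    offset = offset * shape[d] + index[d];
  return offset;
}
```

With the 128x256 source and 8x32 tiles used in the op documentation, element (1, 0) sits at row-major offset 256 in the source but at offset 32 in the 16x8x8x32 packed tensor, so each 8x32 tile occupies one contiguous run of 256 elements.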

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

chelini created this revision. Nov 16 2022, 4:35 AM

Herald added a project: Restricted Project. Nov 16 2022, 4:35 AM

Herald added subscribers: Moerafaat, zero9178, bzcheeseman and 20 others.

chelini requested review of this revision. Nov 16 2022, 4:35 AM

Herald added a reviewer: nicolasvasilache. Nov 16 2022, 4:35 AM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

chelini added reviewers: rengolin, mravishankar, hanchung, silvas, stellaraccident, mehdi_amini. Nov 16 2022, 4:38 AM

Harbormaster completed remote builds in B197969: Diff 475775. Nov 16 2022, 5:19 AM

rengolin added inline comments. Nov 16 2022, 5:20 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1771	Shouldn't we have a similar one for unpack? `getUnpackedType`?
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3187	Merge ifs?
3211	why is this a lambda?
mlir/test/Dialect/Tensor/ops.mlir
354	Missing CHECK lines for the third test

Generally looks good, thanks @chelini !

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1717	Can you rephrase and make it sound more from the point of view of the IR attached to the op? The pack operation converts an `input` N-D tensor into a 2N-D tensor with tiled and packed layout. The mandatory `inner_dims_pos` attribute specifies the order in which the original N dimensions are permuted to obtain the data order inside the tile. The optional `outer_dims_pos` ... The optional `padding_value` operand specifies a padding value at the boundary on non-perfectly divisible dimensions: - if absent: ... UB - if present: ...
1720	`s/tiled loops/tiled data dimensions`, there are no loops here
1737	`s/outer loops/outer data dimensions`, there are no loops here
1771	We usually call this "inferXXXType" in other places.
1787	similar description to what I suggested above in shorter form.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3070	better name plz: `isInvalidPackingPosSpecification` ?
3116	use `isDynamic` plz, we want to remove leaky uses of the magic constant.
3167	avoid leaky magic values plz
3213	I could swear I had a factored out util that implemented a templated form of this .. try to find it ?
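
The two "magic constant" remarks above (lines 3116 and 3167) can be illustrated with a small stand-alone sketch. It is plain C++ rather than MLIR code: `kDynamicSketch`, `isDynamicSketch`, and `areAllInBoundSketch` are hypothetical stand-ins for `ShapedType::kDynamic`, `ShapedType::isDynamic`, and the patch's bound check, and the sentinel value chosen here is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel standing in for ShapedType::kDynamic. Its numeric
// value is an implementation detail, which is why callers should go through
// the predicate below instead of comparing against a literal.
constexpr int64_t kDynamicSketch = std::numeric_limits<int64_t>::min();

// Stand-in for ShapedType::isDynamic.
constexpr bool isDynamicSketch(int64_t extent) {
  return extent == kDynamicSketch;
}

// A dimension counts as in bound when either side is dynamic (unknown at
// compile time) or when source <= limit.
bool areAllInBoundSketch(const std::vector<int64_t> &sourceShape,
                         const std::vector<int64_t> &limitShape) {
  assert(sourceShape.size() == limitShape.size());
  for (std::size_t d = 0; d < sourceShape.size(); ++d) {
    if (isDynamicSketch(sourceShape[d]) || isDynamicSketch(limitShape[d]))
      continue;
    if (sourceShape[d] > limitShape[d])
      return false;
  }
  return true;
}
```

Because every caller goes through the predicate, changing the sentinel's value would not require touching any of the checks.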

This revision is now accepted and ready to land. Nov 16 2022, 12:24 PM

mehdi_amini added inline comments. Nov 16 2022, 12:55 PM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1674	Pure is reserved for operations that don't have any undefined behavior, this does not seem to be the case here.
1695	I don't think we should have member methods that look like accessors but are actually "heavy processing". Better leave this to free functions (same everywhere else).
1751	Please use `DenseI64ArrayAttr`. We shouldn't use `I64ArrayAttr` anywhere moving forward I think.
1782	Typo: unpack
mlir/test/Dialect/Tensor/ops.mlir
328	Please use CHECK-LABEL
335	Please minimize the CHECK to the absolute minimum needed for what you intend to test.

mehdi_amini requested changes to this revision. Nov 16 2022, 12:55 PM

This revision now requires changes to proceed. Nov 16 2022, 12:55 PM

hanchung requested changes to this revision. Nov 16 2022, 2:24 PM

hanchung added inline comments.

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1717	Note that we don't require the op to pack all the dimensions. It is not always packing an N-D tensor into a 2N-D tensor. E.g., we can pack something like NHWC to NHWChw.
1720	+1, `tiled data dimensions` makes more sense to me. There are no loops.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3010–3012	`Returns`, and maybe we can format it a bit like
    /// Returns ... is invalid when:
    /// a) ..
    /// b) ..
    /// c) ..
3014	it is a redundant comment to me. I'd delete it.
3030	It also accepts equal case. How about renaming it to `areAllInBound`?
3075	s/`less or equal than`/`less than or equal to`
3170–3176	[optional] I'd use `continue` to have fewer indents. It can save one level of nesting. E.g.,
    if (it == dimTileMapping.end())
      continue;
    Optional<int64_t> cstTileValue = ...
    if (!cstTileValue)
      continue;
    if (...)
      return true;
3187	It's also used below. Maybe just declare a variable and merge the checks.
    auto paddingValue = getPaddingValue();
    if (paddingValue && ... ) { .. }
3211	I think it's worth making it a method. We'll need interchange and undoInterchange for tiling implementation. We can add undoInterchange method when upstreaming tiling implementation. FYI, here is the implementation used in IREE: https://github.com/iree-org/iree/blob/3625adf98f0b87c24a89f8d4101550c1ef1eea44/llvm-external-projects/iree-dialects/include/iree-dialects/Dialect/LinalgExt/Utils/Utils.h#L29-L50 RE Nicolas: I did not find a similar thing when prototyping it in IREE. Maybe I searched with a bad keyword. The keyword I used is `interchange`, like `rg --ignore-case 'interchange' */.h`. :-(
mlir/test/Dialect/Tensor/ops.mlir
320	We should use `// -----` to split tests. I don't know why `--split-input-file` is not added in the test command (i.e., line 1), but we should add it at least for consistency. That's how we write the tests in this file.
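
The remark on line 1717 above, that pack need not tile every dimension, can be sketched as a static shape computation. This is an illustrative stand-alone version in plain C++, not the patch's `inferPackedType`: the name `inferPackedShape` is hypothetical, dynamic sizes are deliberately not modeled, and the convention `result[i] = outer[outerDimsPerm[i]]` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; static shapes only.
std::vector<int64_t>
inferPackedShape(const std::vector<int64_t> &sourceShape,
                 const std::vector<int64_t> &innerTileSizes,
                 const std::vector<int64_t> &innerDimsPos,
                 const std::vector<int64_t> &outerDimsPerm = {}) {
  assert(innerTileSizes.size() == innerDimsPos.size());
  std::vector<int64_t> outer = sourceShape;
  // Each tiled dimension shrinks to its number of tiles, rounded up (the
  // rounded-up case is the one that needs a padding value).
  for (std::size_t k = 0; k < innerDimsPos.size(); ++k) {
    int64_t pos = innerDimsPos[k];
    int64_t tile = innerTileSizes[k];
    assert(pos >= 0 && pos < static_cast<int64_t>(outer.size()) && tile > 0);
    outer[pos] = (outer[pos] + tile - 1) / tile;
  }
  // Optionally interchange the outer dimensions (assumed convention:
  // result[i] = outer[outerDimsPerm[i]]).
  std::vector<int64_t> result;
  if (outerDimsPerm.empty()) {
    result = outer;
  } else {
    assert(outerDimsPerm.size() == outer.size());
    for (int64_t p : outerDimsPerm)
      result.push_back(outer[static_cast<std::size_t>(p)]);
  }
  // The inner tile sizes become the trailing dimensions.
  result.insert(result.end(), innerTileSizes.begin(), innerTileSizes.end());
  return result;
}
```

For an NHWC-style shape 1x56x56x64 with 8x8 tiles on H and W only (`innerDimsPos = {1, 2}`), this yields the rank-6 shape 1x7x7x64x8x8 rather than a rank-8 one; the three examples from the op documentation come out as 16x8x8x32, 8x16x8x32, and 2x8x8x2.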

nicolasvasilache added inline comments. Nov 17 2022, 2:33 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1695	This is not something specific to this PR. This is something we have more generally and predates the automatic generation of prefixed getXXX as accessors everywhere AFAIR. Could you please start an RFC with a call for general cleanup and a proposal for properly naming these getters of derived information? I don't think free functions is reasonable here, there is a prohibitive cognitive cost in finding those functions when not attached to the op directly. Additionally, does that thinking also carry to interfaces?

Use NoMemoryEffect instead of Pure as the operation may trigger UB.

Update pack and unpack documentation as suggested.

Switch to use DenseI64ArrayAttr for all attributes but static_inner_tiles.

Rename getPackedType to inferPackedType.

Rename isSmallerThan to areAllInBound.

Rename isInvalid to isInvalidPackingPosSpecification.

Avoid leaking magic constant.

Update tests and add some dynamic tests.

Add co-authors.

chelini edited the summary of this revision. Nov 17 2022, 2:56 AM

Herald added a subscriber: kristof.beyls. Nov 17 2022, 2:56 AM

chelini added inline comments. Nov 17 2022, 2:56 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1751	Update `outer_dims_perm` and `inner_dims_pos` to use `DenseI64ArrayAttr`. Moving `static_inner_tiles` is a bit more involved as we need to update `parseDynamicIndexList` which is used by other ops. I can follow-up with a PR if we want to use `DenseI64ArrayAttr` in the future.

Harbormaster completed remote builds in B198165: Diff 476065. Nov 17 2022, 3:14 AM

rengolin added inline comments. Nov 17 2022, 3:24 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1695	Definitely not for this PR. This is a much larger conversation, and whoever takes that task, will easily (and mechanically) convert these methods, too.

Model UB behavior for pack and unpack

Update comment.

Harbormaster completed remote builds in B198182: Diff 476087. Nov 17 2022, 4:39 AM

Drop lambda and use free function for interchange.

Harbormaster completed remote builds in B198193: Diff 476101. Nov 17 2022, 6:16 AM

LGTM, just few nits!

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3195	nit: !paddingValue
3211	llvm style nit: do not use braces for single statement. I.e., change it to
    for (...)
      vec[en.index() + offset] = ...
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
3234–3235	I don't see the point of `is available` because there are no checks. Maybe we can just drop this comment. It's obvious to me because the code describes what's happening. If you want to keep the comment, how about `Swap outer tiled data dimensions`.
mlir/test/Dialect/Tensor/ops.mlir
320	I meant add `// -----` to the new tests. Sorry for the ambiguous comment. The change looks good to me, I was trying not to add too many non-specific things to the revision. Anyway, thanks for fixing the other parts in this file!

Drop braces

Drop confusing comment

Address nit

chelini added inline comments. Nov 18 2022, 5:12 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1695	@mehdi_amini other than this, do you have further comments/suggestions on the PR?

Harbormaster completed remote builds in B198440: Diff 476432. Nov 18 2022, 5:31 AM

LGTM too, with a nit.

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3076	you changed the comment but not the user-visible error message

Fix user visible comment.

chelini marked an inline comment as done. Nov 21 2022, 12:51 AM

nicolasvasilache accepted this revision. Nov 21 2022, 4:54 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1695	I sync'ed with @mehdi_amini offline and he is OOO for 2 weeks, I do not expect him to check this thread (but I may be wrong ..) I'd favor not blocking this for 2 weeks and landing as is with followup post-commit review / improvements.
1751	+1 yes we need to gain that muscle memory indeed .. Any comment/suggestion on the 2 issues I have been seeing re DenseXXArrayAttr in https://discourse.llvm.org/t/rfc-inconsistency-between-dynamic-and-static-attributes-i64-v-index/66612 ?
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
26	plz double check you really need all new includes.

Rebase

inferPackedType was also handling the memref case as in IREE; we have the
operations working at both abstractions; this is unnecessary here. Remove logic
and drop TypeSwitch include.

chelini marked an inline comment as done. Nov 21 2022, 7:28 AM

chelini added inline comments.

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1751	I can follow up with this.

mehdi_amini added inline comments. Nov 21 2022, 11:20 AM

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
1695	> I do not expect him to check this thread (but I may be wrong ..)
:)
> I'd favor not blocking this for 2 weeks and landing as is with followup post-commit review / improvements.
Seems reasonable, my comments aren't intrinsic about the direction this PR is going to fundamentally and are really about some "coding convention" issues that we'll address later.

mehdi_amini resigned from this revision. Nov 21 2022, 11:20 AM

This revision is now accepted and ready to land. Nov 21 2022, 11:20 AM

Harbormaster completed remote builds in B198789: Diff 476897. Nov 21 2022, 4:16 PM

Closed by commit rG9aa505a28d82: Introduce `tensor.pack` and `tensor.unpack` operations (authored by chelini). Nov 22 2022, 12:12 AM

This revision was automatically updated to reflect the committed changes.

chelini added a commit: rG9aa505a28d82: Introduce `tensor.pack` and `tensor.unpack` operations.

nicolasvasilache added inline comments. Nov 22 2022, 1:00 AM

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3211	template <typename T, unsigned N>
    void applyPermutationToVector(SmallVector<T, N> &inVec, ArrayRef<int64_t> permutation) {
In IndexingUtils.cpp/h
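
For readers without the MLIR sources at hand, the helper referenced above can be approximated as follows. This is a hedged sketch using `std::vector` instead of `SmallVector`/`ArrayRef`, under the assumed convention `result[i] = inVec[permutation[i]]`; the name `applyPermutationSketch` is hypothetical and this is not the MLIR implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical std::vector-based analogue of the quoted helper. Assumed
// convention: after the call, inVec[i] holds the old inVec[permutation[i]].
template <typename T>
void applyPermutationSketch(std::vector<T> &inVec,
                            const std::vector<int64_t> &permutation) {
  assert(inVec.size() == permutation.size());
  std::vector<T> permuted;
  permuted.reserve(inVec.size());
  for (int64_t src : permutation) {
    assert(src >= 0 && static_cast<std::size_t>(src) < inVec.size());
    permuted.push_back(inVec[static_cast<std::size_t>(src)]);
  }
  inVec = std::move(permuted);
}
```

Applied to the outer dimensions `[16, 8]` with `outer_dims_perm = [1, 0]`, this produces `[8, 16]`, which matches the CK to KCck example in the op documentation.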

chelini mentioned this in D138480: [MLIR][Tensor] Use the existing helper function `applyPermutationToVector` (NFC). Nov 22 2022, 1:23 AM

chelini mentioned this in rG85e38e5292a3: [MLIR][Tensor] Use the existing helper function `applyPermutationToVector` (NFC). Nov 22 2022, 2:34 AM

hanchung added inline comments. Nov 23 2022, 1:46 PM

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3211	got it, thanks!

Revision Contents

Path

Size

mlir/

include/

mlir/

Dialect/

Tensor/

IR/

TensorOps.td

164 lines

lib/

Dialect/

Tensor/

IR/

TensorOps.cpp

364 lines

test/

Dialect/

Tensor/

invalid.mlir

89 lines

ops.mlir

141 lines

Transforms/

loop-invariant-code-motion.mlir

55 lines

Diff 477084

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td

[1,658 lines not shown]
let builders = [
OpBuilder<(ins "Value":$element, "Type":$aggregateType),
[{ build($_builder, $_state, aggregateType, element); }]>];
let assemblyFormat = "$input attr-dict `:` type($aggregate)";

let hasFolder = 1;
}

//===----------------------------------------------------------------------===//
		// PackOp
		//===----------------------------------------------------------------------===//

		class Tensor_RelayoutOp<string mnemonic, list<Trait> traits = []> :
		Tensor_Op<mnemonic, !listconcat(traits, [
		DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>,
		DestinationStyleOpInterface,
		ConditionallySpeculatable, NoMemoryEffect,
		DeclareOpInterfaceMethods<ReifyRankedShapedTypeOpInterface>,
		TypesMatchWith<"result type matches type of dest",
		"dest", "result",
		"$_self">])> {

		code commonExtraClassDeclaration = [{
		int64_t getSourceRank() { return getSource().getType().getRank(); };
		int64_t getDestRank() { return getDest().getType().getRank(); };
		RankedTensorType getSourceType() {
		return getSource().getType().cast<RankedTensorType>(); };
		RankedTensorType getDestType() {
		return getDest().getType().cast<RankedTensorType>(); };

		/// Return position for init operand. Init operand is `dest`.
		std::pair<int64_t, int64_t> getDpsInitsPositionRange() {
		return {1, 2}; // `dest` operand
		}

		/// Interface method for ConditionallySpeculatable.
		Speculation::Speculatability getSpeculatability();

		/// Return a mapping from positions `inner_dims_pos` to their
		/// tile factors.
		DenseMap<int64_t, OpFoldResult> getDimAndTileMapping();

		/// Return the tile sizes as OpFoldResult.
		SmallVector<OpFoldResult> getMixedTiles();

		/// Return the tile sizes as `int64_t`. If a tile size is dynamic
		/// a sentinel `kDynamic` is introduced at that position in
		/// the returned vector.
		SmallVector<int64_t> getStaticTiles();
		}];

		let hasVerifier = 1;
		}

		def Tensor_PackOp : Tensor_RelayoutOp<"pack", [
		AttrSizedOperandSegments]> {
		let summary = "tensor pack operation";
		let description = [{
		The pack operation converts an input tensor to a higher-dimensional tensor
		with a tiled and packed layout. The mandatory `inner_dims_pos` attribute
		specifies a permutation for the original dimensions, while `inner_tiles` is the
		tiling factor for each dimension. The optional attribute `outer_dims_perm`
		specifies the order for the tiled data dimensions, while the operand
		`padding_value` specifies a padding value at the boundary on non-perfectly
		divisible dimensions. Padding is optional:
		- If absent, it is UB if the tile does not perfectly divide the dimension.
		- If present, it will pad along high dimensions (high-padding) to make the
		tile complete.

		Example NC_to_NCnc:

		```mlir
		tensor.pack %source inner_dims_pos = [0, 1]
		inner_tiles = [8, 32] into %dest : tensor<128x256xf32> -> tensor<16x8x8x32xf32>
		```
		Example CK_to_KCck:

		```mlir
		tensor.pack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
		inner_tiles = [8, 32] into %dest : tensor<128x256xf32> -> tensor<8x16x8x32xf32>
		```

		In all cases, dimension at position 0 in the input tensor (128) is tiled
		with a factor of 8, while dimension at position 1 (256) is tiled with a factor
		of 32. In the second example, the outer data dimensions are interchanged
		according to `outer_dims_perm`.

		Example NC_to_NCnc with padding:

		```mlir
		tensor.pack %arg padding_value(%pad : f32) inner_dims_pos = [0, 1]
		inner_tiles = [8, 2] into %arg1 : tensor<13x15xf32> -> tensor<2x8x8x2xf32>
		```

		}];
		let arguments = (ins AnyRankedTensor:$source,
		AnyRankedTensor:$dest,
		Optional<AnyType>:$padding_value,
		DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$outer_dims_perm,
		DenseI64ArrayAttr:$inner_dims_pos,
		Variadic<Index>:$inner_tiles,
		I64ArrayAttr:$static_inner_tiles);
		let results = (outs AnyRankedTensor:$result);
		let assemblyFormat = [{
		$source
		(`padding_value` `(` $padding_value^ `:` type($padding_value) `)`)?
		(`outer_dims_perm` `=` $outer_dims_perm^)?
		`inner_dims_pos` `=` $inner_dims_pos
		`inner_tiles` `=`
		custom<DynamicIndexList>($inner_tiles, $static_inner_tiles,
		"ShapedType::kDynamic")
		`into` $dest attr-dict `:` type($source) `->` type($dest)
		}];

		let extraClassDeclaration = commonExtraClassDeclaration # [{
		// Method to get the `ShapedType` of the result based on the inner tiles,
		// position of the inner tiles (innerDimsPos) and interchange vector of
		// outer data dimensions (outerDimsPerm).
		static ShapedType inferPackedType(ShapedType sourceType,
		ArrayRef<int64_t> innerTileSizes, ArrayRef<int64_t> innerDimsPos,
		ArrayRef<int64_t> outerDimsPerm = {});
		}];
		}

		//===----------------------------------------------------------------------===//
		// UnPackOp
		//===----------------------------------------------------------------------===//

		def Tensor_UnPackOp : Tensor_RelayoutOp<"unpack"> {
		let summary = "tensor unpack operation";
		let description = [{
		The unpack operation converts a tensor with a tiled and packed layout to a
		lower-dimensional tensor. Similar to `pack`, the mandatory attribute
		`inner_dims_pos` specifies a permutation for the inner data dimensions, while
		`inner_tiles` is the tiling factor. The attribute `outer_dims_perm` has
		exactly the same behavior as the one described in `pack`. In `unpack`, it is
		UB if the tile does not perfectly divide the dimension.

		Example NCnc_to_NC:

		```mlir
		tensor.unpack %source inner_dims_pos = [0, 1]
		inner_tiles = [8, 32] into %dest : tensor<16x8x8x32xf32> -> tensor<128x256xf32>
		```

		Example KCck_to_CK:

		```mlir
		tensor.unpack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
		inner_tiles = [8, 32] into %dest : tensor<8x16x8x32xf32> -> tensor<128x256xf32>
		```
		}];
		let arguments = (ins AnyRankedTensor:$source,
		AnyRankedTensor:$dest,
		DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$outer_dims_perm,
		DenseI64ArrayAttr:$inner_dims_pos,
		Variadic<Index>:$inner_tiles,
		I64ArrayAttr:$static_inner_tiles);
		let results = (outs AnyRankedTensor:$result);
		let assemblyFormat = [{
		$source
		(`outer_dims_perm` `=` $outer_dims_perm^)?
		`inner_dims_pos` `=` $inner_dims_pos
		`inner_tiles` `=`
		custom<DynamicIndexList>($inner_tiles, $static_inner_tiles,
		"ShapedType::kDynamic")
		`into` $dest attr-dict `:` type($source) `->` type($dest)
		}];

		let extraClassDeclaration = commonExtraClassDeclaration;
		}

		//===----------------------------------------------------------------------===//
// YieldOp
//===----------------------------------------------------------------------===//

def Tensor_YieldOp : Tensor_Op<"yield",
[Pure, ReturnLike, Terminator,
HasParent<"::mlir::tensor::GenerateOp, ::mlir::tensor::PadOp">]> {
let summary = "Yield a value from a region";
let description = [{
[14 lines not shown]

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp

[12 lines not shown]
#include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
#include "mlir/Dialect/Utils/StaticValueUtils.h"
#include "mlir/IR/BlockAndValueMapping.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributeInterfaces.h"
#include "mlir/IR/Matchers.h"
#include "mlir/IR/TypeUtilities.h"
#include "mlir/Interfaces/DestinationStyleOpInterface.h"
		#include "mlir/Support/MathExtras.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallBitVector.h"
#include "llvm/ADT/StringRef.h"
#include <algorithm>

using namespace mlir;
using namespace mlir::tensor;

/// Materialize a single constant operation from a given attribute value with
/// the desired resultant type.
Operation *TensorDialect::materializeConstant(OpBuilder &builder,
Attribute value, Type type,
[2,906 lines not shown]
if (!constOperand.isa_and_nonnull<IntegerAttr, FloatAttr>())
return {};

// SplatElementsAttr::get treats single value for second arg as being a
// splat.
return SplatElementsAttr::get(getType(), {constOperand});
}

//===----------------------------------------------------------------------===//
		// PackOp/UnPackOp Common
		//===----------------------------------------------------------------------===//

		template <typename OpTy>
		static LogicalResult
		reifyResultShapesImpl(OpTy op, OpBuilder &builder,
		ReifiedRankedShapedTypeDims &reifiedReturnShapes) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		int64_t destRank = op.getDestRank();
		reifiedReturnShapes.resize(1, SmallVector<Value>(destRank));
		for (auto dim : llvm::seq<int64_t>(0, destRank)) {
		reifiedReturnShapes[0][dim] =
		builder.createOrFold<tensor::DimOp>(op.getLoc(), op.getDest(), dim);
		}
		return success();
		}

		template <typename OpTy>
		static DenseMap<int64_t, OpFoldResult> getDimAndTileMappingImpl(OpTy op) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		DenseMap<int64_t, OpFoldResult> dimAndTileMapping;
		ArrayRef<int64_t> dimsToTile = op.getInnerDimsPos();
		SmallVector<OpFoldResult> tiles = op.getMixedTiles();
		assert(tiles.size() == dimsToTile.size() &&
		"tiles must match indices of dimension to block");
		// bind the dimension `i` with the tile factor.
		for (auto i : llvm::seq<int64_t>(0, dimsToTile.size()))
		dimAndTileMapping[dimsToTile[i]] = tiles[i];
		return dimAndTileMapping;
		}

		template <typename OpTy>
		static SmallVector<OpFoldResult> getMixedTilesImpl(OpTy op) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		SmallVector<OpFoldResult> mixedInnerTiles;
		unsigned dynamicValIndex = 0;
		for (Attribute attr : op.getStaticInnerTiles()) {
		auto tileAttr = attr.cast<IntegerAttr>();
		if (!ShapedType::isDynamic(tileAttr.getInt()))
		mixedInnerTiles.push_back(tileAttr);
		else
		mixedInnerTiles.push_back(op.getInnerTiles()[dynamicValIndex++]);
		}
		return mixedInnerTiles;
		}

		template <typename OpTy>
		static SmallVector<int64_t> getStaticTilesImpl(OpTy op) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		SmallVector<Value> dynamicTiles;
		SmallVector<int64_t> staticTiles;
		dispatchIndexOpFoldResults(op.getMixedTiles(), dynamicTiles, staticTiles,
		ShapedType::kDynamic);
		return staticTiles;
		}

		/// Returns true if `dimsPos` is invalid. It is invalid when:
		/// a) It contains duplicates.
		/// b) At least one dimension is out of bound (a valid `dimPos` is >= 0 and
		///    < rank).
		/// c) The number of elements in `dimsPos` is greater than `rank`.
		static bool isInvalidPackingPosSpecification(ArrayRef<int64_t> dimsPos,
		size_t rank) {
		hanchung (Done): It is a redundant comment to me. I'd delete it.
		size_t dimsPosSize = dimsPos.size();
		if (dimsPosSize > rank)
		return true;
		DenseSet<int64_t> uniqued;
		for (int64_t dim : dimsPos)
		uniqued.insert(dim);
		if (dimsPosSize != uniqued.size())
		return true;
		return llvm::any_of(dimsPos, [rank](int64_t dimPos) {
		return dimPos < 0 || dimPos >= static_cast<int64_t>(rank);
		});
		}
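
The three conditions checked above can be tried outside MLIR. The following is a minimal sketch using only the standard library; `isInvalidDimsPos` is a hypothetical stand-in, not the function from this patch, and `std::vector` replaces `ArrayRef`.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for isInvalidPackingPosSpecification.
// Returns true if `dimsPos` has more entries than `rank`, contains an
// out-of-bounds position, or contains a duplicate.
bool isInvalidDimsPos(const std::vector<int64_t> &dimsPos, std::size_t rank) {
  if (dimsPos.size() > rank)
    return true;
  std::unordered_set<int64_t> seen;
  for (int64_t dim : dimsPos) {
    // A valid position lies in [0, rank).
    if (dim < 0 || dim >= static_cast<int64_t>(rank))
      return true;
    // insert().second is false when the value was already present.
    if (!seen.insert(dim).second)
      return true;
  }
  return false;
}
```

For a rank-2 tensor, `{2, 0}` (out of bounds) and `{1, 1}` (duplicate) are rejected, which mirrors the `inner_dims_pos` cases exercised in invalid.mlir.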

		/// Returns true if every dimension of `sourceShape` is smaller than or equal
		/// to the corresponding dimension of `limitShape`. Dynamic dimensions are
		/// treated as in bound.
		static bool areAllInBound(ArrayRef<int64_t> sourceShape,
		hanchung (Done): It also accepts equal case. How about renaming it to `areAllInBound`?
		ArrayRef<int64_t> limitShape) {
		assert(
		sourceShape.size() == limitShape.size() &&
		"expected source shape rank, and limit of the shape to have same rank");
		return llvm::all_of(
		llvm::zip(sourceShape, limitShape), [](std::tuple<int64_t, int64_t> it) {
		int64_t sourceExtent = std::get<0>(it);
		int64_t limit = std::get<1>(it);
		return ShapedType::isDynamic(sourceExtent) ||
		ShapedType::isDynamic(limit) || sourceExtent <= limit;
		});
		}
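
A minimal standalone sketch of the same bound check, assuming a hypothetical sentinel `kDynamicDim` in place of `ShapedType::kDynamic` and `std::vector` in place of `ArrayRef`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel standing in for ShapedType::kDynamic.
constexpr int64_t kDynamicDim = std::numeric_limits<int64_t>::min();

// Returns true if every static extent of `sourceShape` is <= the matching
// extent of `limitShape`. Pairs with a dynamic extent are skipped.
bool areAllInBound(const std::vector<int64_t> &sourceShape,
                   const std::vector<int64_t> &limitShape) {
  assert(sourceShape.size() == limitShape.size() && "ranks must match");
  for (std::size_t i = 0; i < sourceShape.size(); ++i) {
    if (sourceShape[i] == kDynamicDim || limitShape[i] == kDynamicDim)
      continue;
    if (sourceShape[i] > limitShape[i])
      return false;
  }
  return true;
}
```

With an expected shape of `8x8x16x32` and an actual shape of `8x8x32x16`, the check fails on the last dimension (32 > 16), which is the situation reported by the "not large enough" diagnostic.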

		template <typename OpTy>
		static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		Operation *op = packOrUnPack.getOperation();

		// Return true if we have a zero-value tile.
		auto hasZeros = [&](ArrayRef<OpFoldResult> tiles) {
		return llvm::any_of(
		tiles, [](OpFoldResult tile) { return isConstantIntValue(tile, 0); });
		};

		// Verify tiles. Do not allow zero tiles.
		SmallVector<OpFoldResult> mixedTiles = packOrUnPack.getMixedTiles();
		if (hasZeros(mixedTiles))
		return op->emitError("invalid zero tile factor");

		// Verify inner_dims_pos and outer_dims_perm.
		ShapedType unpackedType = (std::is_same<OpTy, PackOp>::value)
		? packOrUnPack.getSourceType()
		: packOrUnPack.getDestType();
		size_t unpackedRank = unpackedType.getRank();
		ArrayRef<int64_t> innerDimsPos = packOrUnPack.getInnerDimsPos();
		ArrayRef<int64_t> outerDimPerm = packOrUnPack.getOuterDimsPerm();
		if (isInvalidPackingPosSpecification(innerDimsPos, unpackedRank))
		return op->emitError("invalid inner_dims_pos vector");
		if (isInvalidPackingPosSpecification(outerDimPerm, unpackedRank))
		nicolasvasilache (Done): better name plz: `isInvalidPackingPosSpecification` ?
		return op->emitError("invalid outer_dims_perm vector");

		// The number of tiling factors must be less than or equal to the input rank
		// for pack (or output rank for unpack), and must match the number of
		// `inner_dims_pos`.
		if (mixedTiles.size() > unpackedRank) {
		hanchung (Done): s/`less or equal than`/`less than or equal to`
		return op->emitError("tiling factors must be less than or equal to the "
		rengolin (Done): you changed the comment but not the user-visible error message
		"input rank for pack or output rank for unpack");
		}
		if (mixedTiles.size() != innerDimsPos.size()) {
		return op->emitError(
		"tiling factors must equal the number of dimensions to tile");
		}

		ShapedType packedType = (std::is_same<OpTy, PackOp>::value)
		? packOrUnPack.getDestType()
		: packOrUnPack.getSourceType();
		size_t packedRank = packedType.getRank();
		// Require output rank to match input rank + number of blocking factors.
		if (unpackedRank + mixedTiles.size() != packedRank) {
		return op->emitError(
		"packed rank must equal unpacked rank + tiling factors");
		}

		// Verify that the result shape is at least as large as the minimum expected
		// by the pack operation, and that the output shape represents full tiles.
		ShapedType expectedPackedType = PackOp::inferPackedType(
		unpackedType, packOrUnPack.getStaticTiles(), innerDimsPos, outerDimPerm);
		if (!areAllInBound(expectedPackedType.getShape(), packedType.getShape())) {
		return op->emitError("the shape of output is not large enough to hold the "
		"packed data. Expected at least ")
		<< expectedPackedType << ", got " << packedType;
		}
		if (!llvm::all_of(
		llvm::zip(packedType.getShape().take_back(mixedTiles.size()),
		mixedTiles),
		[](std::tuple<int64_t, OpFoldResult> it) {
		Optional<int64_t> constTileSize =
		getConstantIntValue(std::get<1>(it));
		int64_t shape = std::get<0>(it);
		if (!constTileSize) {
		// If specified tile size is dynamic, output shape should
		// be dynamic too.
		return ShapedType::isDynamic(shape);
		} else {
		if (ShapedType::isDynamic(shape)) {
		nicolasvasilache (Done): use `isDynamic` plz, we want to remove leaky uses of the magic constant.
		// For the shape being dynamic when tile size is
		// specified, return true. In canonical form a constant
		// tile size should lead to constant shape of the tiled
		// dimension, but not needed for verification.
		return true;
		}
		return shape == constTileSize.value();
		}
		})) {
		return op->emitError("mismatch in inner tile sizes specified and shaped of "
		"tiled dimension in the packed type");
		}
		return success();
		}

		//===----------------------------------------------------------------------===//
		// PackOp
		//===----------------------------------------------------------------------===//

		void PackOp::getAsmResultNames(function_ref<void(Value, StringRef)> setNameFn) {
		setNameFn(getResult(), "pack");
		}

		LogicalResult
		PackOp::reifyResultShapes(OpBuilder &builder,
		ReifiedRankedShapedTypeDims &reifiedReturnShapes) {
		return reifyResultShapesImpl(*this, builder, reifiedReturnShapes);
		}

		DenseMap<int64_t, OpFoldResult> PackOp::getDimAndTileMapping() {
		return getDimAndTileMappingImpl(*this);
		}

		SmallVector<OpFoldResult> PackOp::getMixedTiles() {
		return getMixedTilesImpl(*this);
		}

		SmallVector<int64_t> PackOp::getStaticTiles() {
		return getStaticTilesImpl(*this);
		}

		/// Returns true if there is enough static information to prove that a tile
		/// size does not perfectly divide a dimension of the input tensor, i.e., to
		/// catch the undefined behavior of a partial tile statically.
		static bool
		areNotFullTiles(ArrayRef<int64_t> inputShape,
		DenseMap<int64_t, OpFoldResult> const &dimAndTileMapping) {
		int64_t rank = inputShape.size();
		for (int64_t dim = 0; dim < rank; dim++) {
		if (ShapedType::isDynamic(inputShape[dim]))
		continue;
		auto it = dimAndTileMapping.find(dim);
		nicolasvasilache (Done): avoid leaky magic values plz
		if (it == dimAndTileMapping.end())
		continue;
		Optional<int64_t> constantTile = getConstantIntValue(it->second);
		if (!constantTile)
		continue;
		if (inputShape[dim] % (*constantTile) != 0)
		return true;
		}
		return false;
		hanchung (Not Done): [optional] I'd use `continue` for having less indents. It can save one level of nesting. E.g., `if (it == dimTileMapping.end()) continue; Optional<int64_t> cstTileValue = ... if (!cstTileValue) continue; if (...) return true;`
		}
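
The divisibility check can be reproduced without MLIR types. This is a sketch under stated assumptions: `hasPartialTile` and `kDynamicDim` are hypothetical names, a `std::map` from dimension to tile size replaces the `DenseMap<int64_t, OpFoldResult>`, and a dynamic tile is encoded with the same sentinel as a dynamic dimension.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <vector>

// Hypothetical sentinel standing in for a dynamic extent or tile size.
constexpr int64_t kDynamicDim = std::numeric_limits<int64_t>::min();

// Returns true if some static dimension is provably not a multiple of its
// static tile size. Dynamic dimensions, dynamic tiles, and untiled
// dimensions are skipped because nothing can be proven for them.
bool hasPartialTile(const std::vector<int64_t> &inputShape,
                    const std::map<int64_t, int64_t> &dimToTile) {
  for (std::size_t dim = 0; dim < inputShape.size(); ++dim) {
    if (inputShape[dim] == kDynamicDim)
      continue;
    auto it = dimToTile.find(static_cast<int64_t>(dim));
    if (it == dimToTile.end() || it->second == kDynamicDim)
      continue;
    if (inputShape[dim] % it->second != 0)
      return true;
  }
  return false;
}
```

For a `256x128` input with dimension 1 tiled by 16 and dimension 0 tiled by 33, the check fires because 256 is not a multiple of 33. This matches the `pack_invalid_no_padding_no_full_tiles` test in invalid.mlir.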

		LogicalResult PackOp::verify() {
		if (failed(commonVerifierPackAndUnPackOp(*this)))
		return failure();

		// Verify padding value, and bail out if the tile does not divide the
		// dimension fully. In the case of dynamic tile factors or dimensions, having
		// a partial tile is undefined behavior.
		auto paddingValue = getPaddingValue();
		if (paddingValue &&
		rengolin (Done): Merge ifs?
		hanchung (Done): It's also used below. Maybe just declare a variable and merge the checks. `auto paddingValue = getPaddingValue(); if (paddingValue && ... ){ .. }`
		paddingValue.getType() != getSourceType().getElementType()) {
		return emitOpError("expected padding_value has ")
		<< getSourceType().getElementType()
		<< " but got: " << paddingValue.getType();
		}

		auto dimAndTileMapping = getDimAndTileMapping();
		if (!paddingValue &&
		hanchung (Not Done): nit: !paddingValue
		areNotFullTiles(getSourceType().getShape(), dimAndTileMapping)) {
		return emitOpError("invalid tile factor provided. Only full tiles are "
		"supported when padding_value is not set");
		}
		return success();
		}

		/// Returns a vector that interchanges `elements` starting at offset `offset`
		/// based on the indexes in `interchangeVector`.
		template <typename T>
		SmallVector<T> interchange(ArrayRef<T> elements,
		ArrayRef<int64_t> interchangeVector,
		int offset = 0) {
		SmallVector<T> vec = llvm::to_vector(elements);
		for (auto en : llvm::enumerate(interchangeVector))
		vec[en.index() + offset] = elements[en.value() + offset];
		rengolin (Not Done): why is this a lambda?
		hanchung (Not Done): I think it's worth making it a method. We'll need interchange and undoInterchange for the tiling implementation. We can add the undoInterchange method when upstreaming the tiling implementation. FYI, here is the implementation used in IREE: https://github.com/iree-org/iree/blob/3625adf98f0b87c24a89f8d4101550c1ef1eea44/llvm-external-projects/iree-dialects/include/iree-dialects/Dialect/LinalgExt/Utils/Utils.h#L29-L50 RE Nicolas: I did not find a similar thing when prototyping it in IREE. Maybe I searched with a bad keyword. The keyword I used is `interchange`, like `rg --ignore-case 'interchange' */.h`. :-(
		hanchung (Not Done): llvm style nit: do not use braces for single statement. I.e., change it to `for (...) vec[en.index() + offset] = ...` https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
		nicolasvasilache (Not Done): `template <typename T, unsigned N> void applyPermutationToVector(SmallVector<T, N> &inVec, ArrayRef<int64_t> permutation) {` In IndexingUtils.cpp/h
		hanchung (Not Done): got it, thanks!

		return vec;
		nicolasvasilache (Not Done): I could swear I had a factored out util that implemented a templated form of this .. try to find it ?
		}
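
To see what the permutation does on concrete data, here is a minimal sketch of the same loop with `std::vector` in place of `SmallVector` and `ArrayRef`. The added bounds assertion is an assumption of this sketch and is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a copy of `elements` where position `i + offset` receives the value
// found at position `interchangeVector[i] + offset` of the input. Positions
// not covered by `interchangeVector` are left unchanged.
template <typename T>
std::vector<T> interchange(const std::vector<T> &elements,
                           const std::vector<int64_t> &interchangeVector,
                           std::size_t offset = 0) {
  assert(interchangeVector.size() + offset <= elements.size() &&
         "permutation must fit in the input");
  std::vector<T> result = elements;
  for (std::size_t i = 0; i < interchangeVector.size(); ++i)
    result[i + offset] =
        elements[static_cast<std::size_t>(interchangeVector[i]) + offset];
  return result;
}
```

For example, applying `{2, 0, 1}` to `{10, 20, 30}` yields `{30, 10, 20}`, and an empty permutation returns the input unchanged, which is why an absent `outer_dims_perm` leaves the outer dimensions in place.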

		/// Get the expected packed type based on source type, tile factors, position of
		/// the inner tiles and permutation of the outer tiled loop.
		ShapedType PackOp::inferPackedType(ShapedType sourceType,
		ArrayRef<int64_t> innerTileSizes,
		ArrayRef<int64_t> innerDimsPos,
		ArrayRef<int64_t> outerDimsPerm) {
		SmallVector<int64_t> resultShape = llvm::to_vector(sourceType.getShape());
		for (auto tiledDim : llvm::enumerate(innerDimsPos)) {
		if (ShapedType::isDynamic(resultShape[tiledDim.value()]))
		continue;
		if (ShapedType::isDynamic(innerTileSizes[tiledDim.index()])) {
		resultShape[tiledDim.value()] = ShapedType::kDynamic;
		continue;
		}
		resultShape[tiledDim.value()] = ceilDiv(resultShape[tiledDim.value()],
		innerTileSizes[tiledDim.index()]);
		}

		resultShape = interchange<int64_t>(resultShape, outerDimsPerm);

		hanchung (Not Done): I don't see the point of `is available` because there are no checks. Maybe we can just drop this comment. It's obvious to me because the code describes what's happening. If you want to keep the comment, how about `Swap outer tiled data dimensions`.
		// Append the inner tile dimensions.
		resultShape.append(innerTileSizes.begin(), innerTileSizes.end());
		return RankedTensorType::get(resultShape, sourceType.getElementType());
		}
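
The shape inference above can be followed on concrete numbers. The sketch below mirrors its three steps (ceil-divide the tiled dimensions, permute the outer dimensions, append the inner tiles) with the standard library only. `inferPackedShape` and `kDynamicDim` are hypothetical names, and the function returns a plain shape rather than a `RankedTensorType`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel standing in for ShapedType::kDynamic.
constexpr int64_t kDynamicDim = std::numeric_limits<int64_t>::min();

std::vector<int64_t> inferPackedShape(std::vector<int64_t> shape,
                                      const std::vector<int64_t> &innerTiles,
                                      const std::vector<int64_t> &innerDimsPos,
                                      const std::vector<int64_t> &outerDimsPerm) {
  assert(innerTiles.size() == innerDimsPos.size() &&
         "one tile size per tiled dimension");
  // Step 1: each tiled outer dimension becomes ceil(extent / tile).
  for (std::size_t i = 0; i < innerDimsPos.size(); ++i) {
    int64_t &extent = shape[static_cast<std::size_t>(innerDimsPos[i])];
    if (extent == kDynamicDim)
      continue;
    if (innerTiles[i] == kDynamicDim) {
      extent = kDynamicDim;
      continue;
    }
    extent = (extent + innerTiles[i] - 1) / innerTiles[i];
  }
  // Step 2: permute the outer dimensions (no-op for an empty permutation).
  std::vector<int64_t> result = shape;
  for (std::size_t i = 0; i < outerDimsPerm.size(); ++i)
    result[i] = shape[static_cast<std::size_t>(outerDimsPerm[i])];
  // Step 3: append the inner tile sizes as trailing dimensions.
  result.insert(result.end(), innerTiles.begin(), innerTiles.end());
  return result;
}
```

For a `256x128` source with `inner_dims_pos = [1, 0]` and `inner_tiles = [16, 32]`, dimension 1 becomes 128 / 16 = 8 and dimension 0 becomes 256 / 32 = 8, so the packed shape is `8x8x16x32`. That is the expected type quoted in the `pack_invalid` diagnostic in invalid.mlir.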

		/// Returns true if the tiles and the tiled dims are constant.
		template <typename OpTy>
		bool areTilesAndTiledDimsAllConstant(OpTy op) {
		static_assert(llvm::is_one_of<OpTy, PackOp, UnPackOp>::value,
		"applies to only pack or unpack operations");
		ShapedType packedType = (std::is_same<OpTy, PackOp>::value)
		? op.getDestType()
		: op.getSourceType();
		SmallVector<OpFoldResult> mixedTiles = op.getMixedTiles();
		for (auto [dimDest, tile] : llvm::zip(
		packedType.getShape().take_back(mixedTiles.size()), mixedTiles)) {
		Optional<int64_t> constTileSize = getConstantIntValue(tile);
		if (!constTileSize || ShapedType::isDynamic(dimDest))
		return false;
		}
		return true;
		}

		Speculation::Speculatability PackOp::getSpeculatability() {
		if (auto paddingValue = getPaddingValue())
		return Speculation::Speculatable;

		// The verifier already rejects operations when we can statically prove that
		// the tile sizes do not perfectly divide the dimension; thus, only check
		// that the tiles and the tiled inner dimensions are constant.
		if (!areTilesAndTiledDimsAllConstant(*this))
		return Speculation::NotSpeculatable;

		return Speculation::Speculatable;
		}

		//===----------------------------------------------------------------------===//
		// UnPackOp
		//===----------------------------------------------------------------------===//

		void UnPackOp::getAsmResultNames(
		function_ref<void(Value, StringRef)> setNameFn) {
		setNameFn(getResult(), "unpack");
		}

		LogicalResult
		UnPackOp::reifyResultShapes(OpBuilder &builder,
		ReifiedRankedShapedTypeDims &reifiedReturnShapes) {
		return reifyResultShapesImpl(*this, builder, reifiedReturnShapes);
		}

		DenseMap<int64_t, OpFoldResult> UnPackOp::getDimAndTileMapping() {
		return getDimAndTileMappingImpl(*this);
		}

		SmallVector<OpFoldResult> UnPackOp::getMixedTiles() {
		return getMixedTilesImpl(*this);
		}

		SmallVector<int64_t> UnPackOp::getStaticTiles() {
		return getStaticTilesImpl(*this);
		}

		LogicalResult UnPackOp::verify() {
		return commonVerifierPackAndUnPackOp(*this);
		}

		Speculation::Speculatability UnPackOp::getSpeculatability() {
		// See PackOp::getSpeculatability.
		if (!areTilesAndTiledDimsAllConstant(*this))
		return Speculation::NotSpeculatable;

		return Speculation::Speculatable;
		}

		//===----------------------------------------------------------------------===//
// TableGen'd op method definitions		// TableGen'd op method definitions
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#define GET_OP_CLASSES		#define GET_OP_CLASSES
#include "mlir/Dialect/Tensor/IR/TensorOps.cpp.inc"		#include "mlir/Dialect/Tensor/IR/TensorOps.cpp.inc"

mlir/test/Dialect/Tensor/invalid.mlir

	[... 516 lines not shown ...]

	// -----			// -----

	func.func @empty_wrong_number_of_operands(%sz : index) {			func.func @empty_wrong_number_of_operands(%sz : index) {
	// expected-error@+1 {{incorrect number of dynamic sizes, has 1, expected 2}}			// expected-error@+1 {{incorrect number of dynamic sizes, has 1, expected 2}}
	%out = tensor.empty(%sz) : tensor<2x?x?x5xf32>			%out = tensor.empty(%sz) : tensor<2x?x?x5xf32>
	return			return
	}			}

				// -----

				func.func @pack_invalid_no_padding_no_full_tiles(%input: tensor<256x128xf32>, %output: tensor<8x8x16x33xf32>) -> tensor<8x8x16x33xf32> {
				// expected-error@+1 {{invalid tile factor provided. Only full tiles are supported when padding_value is not set}}
				%0 = tensor.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 33] into %output : tensor<256x128xf32> -> tensor<8x8x16x33xf32>
				return %0 : tensor<8x8x16x33xf32>
				}

				// -----

				func.func @pad_and_pack_invalid_type(%input: tensor<13x15xf32>, %output: tensor<2x8x8x2xf32>, %pad: i32) -> tensor<2x8x8x2xf32> {
				// expected-error@+1 {{expected padding_value has 'f32' but got: 'i32'}}
				%0 = tensor.pack %input padding_value(%pad: i32) inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %output : tensor<13x15xf32> -> tensor<2x8x8x2xf32>
				return %0 : tensor<2x8x8x2xf32>
				}

				// -----

				func.func @pack_invalid_inner_dims_pos_vector(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{invalid inner_dims_pos vector}}
				%0 = tensor.pack %input inner_dims_pos = [2, 0] inner_tiles = [2, 2] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
				return %0 : tensor<8x8x32x16xf32>
				}

				// -----

				func.func @pack_invalid_duplicate_element_in_inner_dims(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{invalid inner_dims_pos vector}}
				%0 = tensor.pack %input inner_dims_pos = [1, 1] inner_tiles = [2, 2] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
				return %0 : tensor<8x8x32x16xf32>
				}

				// -----

				func.func @pack_invalid_duplicate_element_in_outer_perm(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{invalid outer_dims_perm vector}}
				%0 = tensor.pack %input outer_dims_perm = [1, 1] inner_dims_pos = [0, 1] inner_tiles = [2, 2] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
				return %0 : tensor<8x8x32x16xf32>
				}

				// -----

				func.func @unpack_invalid_out_of_bound_outer_perm(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{invalid outer_dims_perm vector}}
				%0 = tensor.unpack %output outer_dims_perm = [2, 1] inner_dims_pos = [0, 1] inner_tiles = [2, 2] into %input : tensor<8x8x32x16xf32> -> tensor<256x128xf32>
				return %0 : tensor<256x128xf32>
				}

				// -----

				func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<8x8x16x32xf32>', got 'tensor<8x8x32x16xf32>'}}
				%0 = tensor.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 32] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
				return %0 : tensor<8x8x32x16xf32>
				}

				// -----

				func.func @unpack_invalid(%output: tensor<256x128xf32>, %input: tensor<8x8x32x16xf32>) -> tensor<256x128xf32> {
				// expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<8x32x4x32xf32>', got 'tensor<8x8x32x16xf32>'}}
				%0 = tensor.unpack %input inner_dims_pos = [1, 0] inner_tiles = [4, 32] into %output : tensor<8x8x32x16xf32> -> tensor<256x128xf32>
				return %0 : tensor<256x128xf32>
				}

				// -----

				func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
				// expected-error@+1 {{invalid zero tile factor}}
				%0 = tensor.pack %input inner_dims_pos = [1, 0] inner_tiles = [0, 2] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
				return %0 : tensor<8x8x32x16xf32>
				}

				// -----
				func.func @pack_mismatch_inner_tile_size_and_output_shape(
				%input : tensor<?x?xf32>, %output : tensor<?x?x8x8xf32>) -> tensor<?x?x8x8xf32> {
				// expected-error@+1 {{mismatch in inner tile sizes specified and shaped of tiled dimension in the packed type}}
				%0 = tensor.pack %input inner_dims_pos = [0, 1] inner_tiles = [8, 4] into %output : tensor<?x?xf32> -> tensor<?x?x8x8xf32>
				return %0 : tensor<?x?x8x8xf32>
				}

				// -----

				func.func @unpack_mismatch_inner_tile_size_and_output_shape(
				%input : tensor<?x?x8x8xf32>, %output : tensor<?x?xf32>) -> tensor<?x?xf32> {
				// expected-error@+1 {{mismatch in inner tile sizes specified and shaped of tiled dimension in the packed type}}
				%0 = tensor.unpack %input inner_dims_pos = [0, 1] inner_tiles = [8, 4] into %output : tensor<?x?x8x8xf32> -> tensor<?x?xf32>
				return %0 : tensor<?x?xf32>
				}

mlir/test/Dialect/Tensor/ops.mlir

// RUN: mlir-opt %s | mlir-opt | FileCheck %s		// RUN: mlir-opt --split-input-file %s | mlir-opt | FileCheck %s

// CHECK-LABEL: func @cast(		// CHECK-LABEL: func @cast(
func.func @cast(%arg0: tensor<*xf32>, %arg1 : tensor<4x4xf32>, %arg2: tensor<?x?xf32>) {		func.func @cast(%arg0: tensor<*xf32>, %arg1 : tensor<4x4xf32>, %arg2: tensor<?x?xf32>) {
// CHECK: tensor.cast %{{.*}} : tensor<*xf32> to tensor<?x?xf32>		// CHECK: tensor.cast %{{.*}} : tensor<*xf32> to tensor<?x?xf32>
%0 = tensor.cast %arg0 : tensor<*xf32> to tensor<?x?xf32>		%0 = tensor.cast %arg0 : tensor<*xf32> to tensor<?x?xf32>
// CHECK: tensor.cast %{{.*}} : tensor<4x4xf32> to tensor<*xf32>		// CHECK: tensor.cast %{{.*}} : tensor<4x4xf32> to tensor<*xf32>
%1 = tensor.cast %arg1 : tensor<4x4xf32> to tensor<*xf32>		%1 = tensor.cast %arg1 : tensor<4x4xf32> to tensor<*xf32>
// CHECK: tensor.cast %{{.*}} : tensor<?x?xf32> to tensor<4x?xf32>		// CHECK: tensor.cast %{{.*}} : tensor<?x?xf32> to tensor<4x?xf32>
%2 = tensor.cast %arg2 : tensor<?x?xf32> to tensor<4x?xf32>		%2 = tensor.cast %arg2 : tensor<?x?xf32> to tensor<4x?xf32>
// CHECK: tensor.cast %{{.*}} : tensor<4x?xf32> to tensor<?x?xf32>		// CHECK: tensor.cast %{{.*}} : tensor<4x?xf32> to tensor<?x?xf32>
%3 = tensor.cast %2 : tensor<4x?xf32> to tensor<?x?xf32>		%3 = tensor.cast %2 : tensor<4x?xf32> to tensor<?x?xf32>
return		return
}		}

		// -----

// CHECK-LABEL: func @empty(		// CHECK-LABEL: func @empty(
// CHECK-SAME: %[[sz:.*]]: index		// CHECK-SAME: %[[sz:.*]]: index
func.func @empty(%sz: index) -> tensor<5x?x6xf32> {		func.func @empty(%sz: index) -> tensor<5x?x6xf32> {
// CHECK: tensor.empty(%[[sz]]) : tensor<5x?x6xf32>		// CHECK: tensor.empty(%[[sz]]) : tensor<5x?x6xf32>
%0 = tensor.empty(%sz) : tensor<5x?x6xf32>		%0 = tensor.empty(%sz) : tensor<5x?x6xf32>
return %0 : tensor<5x?x6xf32>		return %0 : tensor<5x?x6xf32>
}		}

		// -----

// CHECK-LABEL: func @empty_with_encoding(		// CHECK-LABEL: func @empty_with_encoding(
// CHECK-SAME: %[[sz:.*]]: index		// CHECK-SAME: %[[sz:.*]]: index
func.func @empty_with_encoding(%sz: index) -> tensor<5x?x6xf32, "foo"> {		func.func @empty_with_encoding(%sz: index) -> tensor<5x?x6xf32, "foo"> {
// CHECK: tensor.empty(%[[sz]]) : tensor<5x?x6xf32, "foo">		// CHECK: tensor.empty(%[[sz]]) : tensor<5x?x6xf32, "foo">
%0 = tensor.empty(%sz) : tensor<5x?x6xf32, "foo">		%0 = tensor.empty(%sz) : tensor<5x?x6xf32, "foo">
return %0 : tensor<5x?x6xf32, "foo">		return %0 : tensor<5x?x6xf32, "foo">
}		}

		// -----

// CHECK-LABEL: func @extract(		// CHECK-LABEL: func @extract(
// CHECK-SAME: %[[TENSOR:.*]]: tensor<?x?x?xf32>,		// CHECK-SAME: %[[TENSOR:.*]]: tensor<?x?x?xf32>,
// CHECK-SAME: %[[INDEX:.*]]: index) {		// CHECK-SAME: %[[INDEX:.*]]: index) {
func.func @extract(%arg0: tensor<?x?x?xf32>, %arg1: index) {		func.func @extract(%arg0: tensor<?x?x?xf32>, %arg1: index) {
// CHECK: tensor.extract %[[TENSOR]][%[[INDEX]], %[[INDEX]], %[[INDEX]]] : tensor<?x?x?xf32>		// CHECK: tensor.extract %[[TENSOR]][%[[INDEX]], %[[INDEX]], %[[INDEX]]] : tensor<?x?x?xf32>
%0 = tensor.extract %arg0[%arg1, %arg1, %arg1] : tensor<?x?x?xf32>		%0 = tensor.extract %arg0[%arg1, %arg1, %arg1] : tensor<?x?x?xf32>
return		return
}		}

		// -----

// CHECK-LABEL: func @insert(		// CHECK-LABEL: func @insert(
// CHECK-SAME: %[[SCALAR:.*]]: f32		// CHECK-SAME: %[[SCALAR:.*]]: f32
// CHECK-SAME: %[[INDEX:.*]]: index		// CHECK-SAME: %[[INDEX:.*]]: index
// CHECK-SAME: %[[DEST1:.*]]: tensor<?x?x?xf32>		// CHECK-SAME: %[[DEST1:.*]]: tensor<?x?x?xf32>
func.func @insert(%arg0: f32, %arg1: index, %arg2: tensor<?x?x?xf32>) {		func.func @insert(%arg0: f32, %arg1: index, %arg2: tensor<?x?x?xf32>) {
// CHECK: tensor.insert %[[SCALAR]] into %[[DEST1]][%[[INDEX]], %[[INDEX]], %[[INDEX]]] : tensor<?x?x?xf32>		// CHECK: tensor.insert %[[SCALAR]] into %[[DEST1]][%[[INDEX]], %[[INDEX]], %[[INDEX]]] : tensor<?x?x?xf32>
%0 = tensor.insert %arg0 into %arg2[%arg1, %arg1, %arg1] : tensor<?x?x?xf32>		%0 = tensor.insert %arg0 into %arg2[%arg1, %arg1, %arg1] : tensor<?x?x?xf32>
return		return
}		}

		// -----

// CHECK-LABEL: func @tensor.from_elements() {		// CHECK-LABEL: func @tensor.from_elements() {
func.func @tensor.from_elements() {		func.func @tensor.from_elements() {
%c0 = "arith.constant"() {value = 0: index} : () -> index		%c0 = "arith.constant"() {value = 0: index} : () -> index
// CHECK: tensor.from_elements %c0 : tensor<1xindex>		// CHECK: tensor.from_elements %c0 : tensor<1xindex>
%0 = tensor.from_elements %c0 : tensor<1xindex>		%0 = tensor.from_elements %c0 : tensor<1xindex>

%c1 = "arith.constant"() {value = 1: index} : () -> index		%c1 = "arith.constant"() {value = 1: index} : () -> index
// CHECK: tensor.from_elements %c0, %c1 : tensor<2xindex>		// CHECK: tensor.from_elements %c0, %c1 : tensor<2xindex>
[... 10 lines not shown ...]
// CHECK: tensor.from_elements %c0, %c1, %c0, %c1, %c0, %c1 : tensor<2x3xindex>		// CHECK: tensor.from_elements %c0, %c1, %c0, %c1, %c0, %c1 : tensor<2x3xindex>
%4 = tensor.from_elements %c0, %c1, %c0, %c1, %c0, %c1 : tensor<2x3xindex>		%4 = tensor.from_elements %c0, %c1, %c0, %c1, %c0, %c1 : tensor<2x3xindex>

// CHECK: tensor.from_elements %c0 : tensor<index>		// CHECK: tensor.from_elements %c0 : tensor<index>
%5 = tensor.from_elements %c0 : tensor<index>		%5 = tensor.from_elements %c0 : tensor<index>
return		return
}		}

		// -----

// CHECK-LABEL: @tensor.generate		// CHECK-LABEL: @tensor.generate
func.func @tensor.generate(%m : index, %n : index)		func.func @tensor.generate(%m : index, %n : index)
-> tensor<?x3x?xf32> {		-> tensor<?x3x?xf32> {
%tnsr = tensor.generate %m, %n {		%tnsr = tensor.generate %m, %n {
^bb0(%i : index, %j : index, %k : index):		^bb0(%i : index, %j : index, %k : index):
%elem = arith.constant 8.0 : f32		%elem = arith.constant 8.0 : f32
tensor.yield %elem : f32		tensor.yield %elem : f32
} : tensor<?x3x?xf32>		} : tensor<?x3x?xf32>
return %tnsr : tensor<?x3x?xf32>		return %tnsr : tensor<?x3x?xf32>
}		}

		// -----

// CHECK-LABEL: func @tensor_reshape		// CHECK-LABEL: func @tensor_reshape
func.func @tensor_reshape(%unranked: tensor<*xf32>, %shape1: tensor<1xi32>,		func.func @tensor_reshape(%unranked: tensor<*xf32>, %shape1: tensor<1xi32>,
%shape2: tensor<2xi32>, %shape3: tensor<?xi32>) -> tensor<*xf32> {		%shape2: tensor<2xi32>, %shape3: tensor<?xi32>) -> tensor<*xf32> {
%dyn_vec = tensor.reshape %unranked(%shape1)		%dyn_vec = tensor.reshape %unranked(%shape1)
: (tensor<*xf32>, tensor<1xi32>) -> tensor<?xf32>		: (tensor<*xf32>, tensor<1xi32>) -> tensor<?xf32>
%dyn_mat = tensor.reshape %dyn_vec(%shape2)		%dyn_mat = tensor.reshape %dyn_vec(%shape2)
: (tensor<?xf32>, tensor<2xi32>) -> tensor<?x?xf32>		: (tensor<?xf32>, tensor<2xi32>) -> tensor<?x?xf32>
%new_unranked = tensor.reshape %dyn_mat(%shape3)		%new_unranked = tensor.reshape %dyn_mat(%shape3)
: (tensor<?x?xf32>, tensor<?xi32>) -> tensor<*xf32>		: (tensor<?x?xf32>, tensor<?xi32>) -> tensor<*xf32>
return %new_unranked : tensor<*xf32>		return %new_unranked : tensor<*xf32>
}		}

		// -----

// CHECK-LABEL: func @slice({{.*}}) {		// CHECK-LABEL: func @slice({{.*}}) {
func.func @slice(%t: tensor<8x16x4xf32>, %idx : index) {		func.func @slice(%t: tensor<8x16x4xf32>, %idx : index) {
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index		%c1 = arith.constant 1 : index

// CHECK: tensor.extract_slice		// CHECK: tensor.extract_slice
// CHECK-SAME: tensor<8x16x4xf32> to tensor<?x?x?xf32>		// CHECK-SAME: tensor<8x16x4xf32> to tensor<?x?x?xf32>
%1 = tensor.extract_slice %t[%c0, %c0, %c0][%idx, %idx, %idx][%c1, %c1, %c1]		%1 = tensor.extract_slice %t[%c0, %c0, %c0][%idx, %idx, %idx][%c1, %c1, %c1]
: tensor<8x16x4xf32> to tensor<?x?x?xf32>		: tensor<8x16x4xf32> to tensor<?x?x?xf32>

// CHECK: tensor.extract_slice		// CHECK: tensor.extract_slice
// CHECK-SAME: tensor<8x16x4xf32> to tensor<4x4x4xf32>		// CHECK-SAME: tensor<8x16x4xf32> to tensor<4x4x4xf32>
%2 = tensor.extract_slice %t[0, 2, 0][4, 4, 4][1, 1, 1]		%2 = tensor.extract_slice %t[0, 2, 0][4, 4, 4][1, 1, 1]
: tensor<8x16x4xf32> to tensor<4x4x4xf32>		: tensor<8x16x4xf32> to tensor<4x4x4xf32>

// CHECK: tensor.extract_slice		// CHECK: tensor.extract_slice
// CHECK-SAME: tensor<8x16x4xf32> to tensor<4x4xf32>		// CHECK-SAME: tensor<8x16x4xf32> to tensor<4x4xf32>
%3 = tensor.extract_slice %t[0, 2, 0][4, 1, 4][1, 1, 1]		%3 = tensor.extract_slice %t[0, 2, 0][4, 1, 4][1, 1, 1]
: tensor<8x16x4xf32> to tensor<4x4xf32>		: tensor<8x16x4xf32> to tensor<4x4xf32>

return		return
}		}

		// -----

// CHECK-LABEL: func @insert_slice({{.*}}) {		// CHECK-LABEL: func @insert_slice({{.*}}) {
func.func @insert_slice(		func.func @insert_slice(
%t: tensor<8x16x4xf32>,		%t: tensor<8x16x4xf32>,
%td: tensor<8x?x4xf32>,		%td: tensor<8x?x4xf32>,
%t2: tensor<16x32x8xf32>,		%t2: tensor<16x32x8xf32>,
%t3: tensor<4x4xf32>,		%t3: tensor<4x4xf32>,
%idx : index,		%idx : index,
%sz : index) {		%sz : index) {
[... 18 lines not shown ...]
// CHECK: tensor.insert_slice		// CHECK: tensor.insert_slice
// CHECK-SAME: tensor<8x?x4xf32> into tensor<8x16x4xf32>		// CHECK-SAME: tensor<8x?x4xf32> into tensor<8x16x4xf32>
%4 = tensor.insert_slice %td into %t[0, %idx, 0][8, %sz, 4][1, 1, 1]		%4 = tensor.insert_slice %td into %t[0, %idx, 0][8, %sz, 4][1, 1, 1]
: tensor<8x?x4xf32> into tensor<8x16x4xf32>		: tensor<8x?x4xf32> into tensor<8x16x4xf32>

return		return
}		}

		// -----

func.func @tensor_reshape_zero_dim(%arg0 : tensor<1x1xf32>, %arg1 : tensor<f32>)		func.func @tensor_reshape_zero_dim(%arg0 : tensor<1x1xf32>, %arg1 : tensor<f32>)
-> (tensor<f32>, tensor<1x1xf32>) {		-> (tensor<f32>, tensor<1x1xf32>) {
%0 = tensor.collapse_shape %arg0 [] : tensor<1x1xf32> into tensor<f32>		%0 = tensor.collapse_shape %arg0 [] : tensor<1x1xf32> into tensor<f32>
%1 = tensor.expand_shape %0 [] : tensor<f32> into tensor<1x1xf32>		%1 = tensor.expand_shape %0 [] : tensor<f32> into tensor<1x1xf32>
return %0, %1 : tensor<f32>, tensor<1x1xf32>		return %0, %1 : tensor<f32>, tensor<1x1xf32>
}		}
// CHECK-LABEL: func @tensor_reshape_zero_dim		// CHECK-LABEL: func @tensor_reshape_zero_dim
// CHECK: tensor.collapse_shape %{{.*}} [] : tensor<1x1xf32> into tensor<f32>		// CHECK: tensor.collapse_shape %{{.*}} [] : tensor<1x1xf32> into tensor<f32>
// CHECK: tensor.expand_shape %{{.*}} [] : tensor<f32> into tensor<1x1xf32>		// CHECK: tensor.expand_shape %{{.*}} [] : tensor<f32> into tensor<1x1xf32>

		// -----

func.func @legal_collapsing_reshape_dynamic_tensor		func.func @legal_collapsing_reshape_dynamic_tensor
(%arg0: tensor<?x?x?x4x?xf32>) -> tensor<?x?x?xf32>		(%arg0: tensor<?x?x?x4x?xf32>) -> tensor<?x?x?xf32>
{		{
%0 = tensor.collapse_shape %arg0 [[0], [1], [2, 3, 4]] :		%0 = tensor.collapse_shape %arg0 [[0], [1], [2, 3, 4]] :
tensor<?x?x?x4x?xf32> into tensor<?x?x?xf32>		tensor<?x?x?x4x?xf32> into tensor<?x?x?xf32>
return %0 : tensor<?x?x?xf32>		return %0 : tensor<?x?x?xf32>
}		}
// CHECK: func @legal_collapsing_reshape_dynamic_tensor		// CHECK: func @legal_collapsing_reshape_dynamic_tensor
// CHECK: tensor.collapse_shape		// CHECK: tensor.collapse_shape
// CHECK-SAME: [0], [1], [2, 3, 4]		// CHECK-SAME: [0], [1], [2, 3, 4]

		// -----

func.func @rank(%t : tensor<4x4x?xf32>) {		func.func @rank(%t : tensor<4x4x?xf32>) {
// CHECK: %{{.*}} = tensor.rank %{{.*}} : tensor<4x4x?xf32>		// CHECK: %{{.*}} = tensor.rank %{{.*}} : tensor<4x4x?xf32>
%0 = "tensor.rank"(%t) : (tensor<4x4x?xf32>) -> index		%0 = "tensor.rank"(%t) : (tensor<4x4x?xf32>) -> index

// CHECK: %{{.*}} = tensor.rank %{{.*}} : tensor<4x4x?xf32>		// CHECK: %{{.*}} = tensor.rank %{{.*}} : tensor<4x4x?xf32>
%1 = tensor.rank %t : tensor<4x4x?xf32>		%1 = tensor.rank %t : tensor<4x4x?xf32>
return		return
}		}

		// -----

func.func @pad_dynamic(%arg0: tensor<1x2x2x?xf32>, %low: index, %high: index,		func.func @pad_dynamic(%arg0: tensor<1x2x2x?xf32>, %low: index, %high: index,
%pad_value: f32) -> tensor<6x?x?x?xf32> {		%pad_value: f32) -> tensor<6x?x?x?xf32> {
%0 = tensor.pad %arg0 low[2, %low, 3, 3] high[3, 3, %high, 2] {		%0 = tensor.pad %arg0 low[2, %low, 3, 3] high[3, 3, %high, 2] {
^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: index):		^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: index):
tensor.yield %pad_value : f32		tensor.yield %pad_value : f32
} : tensor<1x2x2x?xf32> to tensor<6x?x?x?xf32>		} : tensor<1x2x2x?xf32> to tensor<6x?x?x?xf32>
return %0 : tensor<6x?x?x?xf32>		return %0 : tensor<6x?x?x?xf32>
}		}
// CHECK-LABEL: func @pad_dynamic		// CHECK-LABEL: func @pad_dynamic
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[LOW:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[LOW:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[HIGH:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[HIGH:[a-zA-Z0-9_]*]]
// CHECK: tensor.pad %[[ARG0]]		// CHECK: tensor.pad %[[ARG0]]
// CHECK-SAME: low[2, %[[LOW]], 3, 3]		// CHECK-SAME: low[2, %[[LOW]], 3, 3]
// CHECK-SAME: high[3, 3, %[[HIGH]], 2]		// CHECK-SAME: high[3, 3, %[[HIGH]], 2]
// CHECK: : tensor<1x2x2x?xf32> to tensor<6x?x?x?xf32>		// CHECK: : tensor<1x2x2x?xf32> to tensor<6x?x?x?xf32>

		// -----

func.func @pad_static(%arg0: tensor<3x4xf32>, %pad_value: f32) -> tensor<6x9xf32> {		func.func @pad_static(%arg0: tensor<3x4xf32>, %pad_value: f32) -> tensor<6x9xf32> {
%0 = tensor.pad %arg0 low[1, 2] high[2, 3] {		%0 = tensor.pad %arg0 low[1, 2] high[2, 3] {
^bb0(%arg1 : index, %arg2 : index):		^bb0(%arg1 : index, %arg2 : index):
tensor.yield %pad_value : f32		tensor.yield %pad_value : f32
} : tensor<3x4xf32> to tensor<6x9xf32>		} : tensor<3x4xf32> to tensor<6x9xf32>
return %0 : tensor<6x9xf32>		return %0 : tensor<6x9xf32>
}		}
// CHECK-LABEL: func @pad_static		// CHECK-LABEL: func @pad_static
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]
// CHECK: tensor.pad %[[ARG0]] low[1, 2] high[2, 3]		// CHECK: tensor.pad %[[ARG0]] low[1, 2] high[2, 3]
// CHECK: : tensor<3x4xf32> to tensor<6x9xf32>		// CHECK: : tensor<3x4xf32> to tensor<6x9xf32>

		// -----

func.func @pad_asymmetrical(%arg0: tensor<2x3xf32>, %ub0: index, %ub1: index,		func.func @pad_asymmetrical(%arg0: tensor<2x3xf32>, %ub0: index, %ub1: index,
%pad_value: f32) -> tensor<?x?xf32> {		%pad_value: f32) -> tensor<?x?xf32> {
%0 = tensor.pad %arg0 low[0, 0] high[%ub0, %ub1] {		%0 = tensor.pad %arg0 low[0, 0] high[%ub0, %ub1] {
^bb0(%arg1: index, %arg2: index):		^bb0(%arg1: index, %arg2: index):
tensor.yield %pad_value : f32		tensor.yield %pad_value : f32
} : tensor<2x3xf32> to tensor<?x?xf32>		} : tensor<2x3xf32> to tensor<?x?xf32>
return %0 : tensor<?x?xf32>		return %0 : tensor<?x?xf32>
}		}
// CHECK-LABEL: func @pad_asymmetrical		// CHECK-LABEL: func @pad_asymmetrical
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[UB0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[UB0:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[UB1:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[UB1:[a-zA-Z0-9_]*]]
// CHECK: tensor.pad %[[ARG0]]		// CHECK: tensor.pad %[[ARG0]]
// CHECK-SAME: low[0, 0]		// CHECK-SAME: low[0, 0]
// CHECK-SAME: high[%[[UB0]], %[[UB1]]]		// CHECK-SAME: high[%[[UB0]], %[[UB1]]]
// CHECK: : tensor<2x3xf32> to tensor<?x?xf32>		// CHECK: : tensor<2x3xf32> to tensor<?x?xf32>

		// -----

func.func @pad_to_static_size(%arg0: tensor<?x?xf32>, %ub0: index, %ub1: index,		func.func @pad_to_static_size(%arg0: tensor<?x?xf32>, %ub0: index, %ub1: index,
%pad_value: f32) -> tensor<2x3xf32> {		%pad_value: f32) -> tensor<2x3xf32> {
%0 = tensor.pad %arg0 low[0, 0] high[%ub0, %ub1] {		%0 = tensor.pad %arg0 low[0, 0] high[%ub0, %ub1] {
^bb0(%arg1: index, %arg2: index):		^bb0(%arg1: index, %arg2: index):
tensor.yield %pad_value : f32		tensor.yield %pad_value : f32
} : tensor<?x?xf32> to tensor<2x3xf32>		} : tensor<?x?xf32> to tensor<2x3xf32>
return %0 : tensor<2x3xf32>		return %0 : tensor<2x3xf32>
}		}
// CHECK-LABEL: func @pad_to_static_size		// CHECK-LABEL: func @pad_to_static_size
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[UB0:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[UB0:[a-zA-Z0-9_]*]]
// CHECK-SAME: %[[UB1:[a-zA-Z0-9_]*]]		// CHECK-SAME: %[[UB1:[a-zA-Z0-9_]*]]
// CHECK: tensor.pad %[[ARG0]]		// CHECK: tensor.pad %[[ARG0]]
// CHECK-SAME: low[0, 0]		// CHECK-SAME: low[0, 0]
// CHECK-SAME: high[%[[UB0]], %[[UB1]]]		// CHECK-SAME: high[%[[UB0]], %[[UB1]]]
// CHECK: : tensor<?x?xf32> to tensor<2x3xf32>		// CHECK: : tensor<?x?xf32> to tensor<2x3xf32>

		// -----

// CHECK-LABEL: func @test_splat_op		// CHECK-LABEL: func @test_splat_op
// CHECK-SAME: [[S:%arg[0-9]+]]: f32		// CHECK-SAME: [[S:%arg[0-9]+]]: f32
func.func @test_splat_op(%s : f32) {		func.func @test_splat_op(%s : f32) {
// CHECK: tensor.splat [[S]] : tensor<8xf32>		// CHECK: tensor.splat [[S]] : tensor<8xf32>
%v = tensor.splat %s : tensor<8xf32>		%v = tensor.splat %s : tensor<8xf32>

// CHECK: tensor.splat [[S]] : tensor<4xf32>		// CHECK: tensor.splat [[S]] : tensor<4xf32>
%u = "tensor.splat"(%s) : (f32) -> tensor<4xf32>		%u = "tensor.splat"(%s) : (f32) -> tensor<4xf32>
return		return
}		}

		// -----

// CHECK-LABEL: func.func @gather_scatter(		// CHECK-LABEL: func.func @gather_scatter(
// CHECK-SAME: %[[ARG0:.*]]: tensor<4x5x6xf32>,		// CHECK-SAME: %[[ARG0:.*]]: tensor<4x5x6xf32>,
// CHECK-SAME: %[[ARG1:.*]]: tensor<1x3x2xindex>,		// CHECK-SAME: %[[ARG1:.*]]: tensor<1x3x2xindex>,
// CHECK-SAME: %[[ARG2:.*]]: tensor<1x3x2xi32>) {		// CHECK-SAME: %[[ARG2:.*]]: tensor<1x3x2xi32>) {
func.func @gather_scatter(		func.func @gather_scatter(
%dest : tensor<4x5x6xf32>, %indices: tensor<1x3x2xindex>, %indices_i32: tensor<1x3x2xi32>) {		%dest : tensor<4x5x6xf32>, %indices: tensor<1x3x2xindex>, %indices_i32: tensor<1x3x2xi32>) {
// CHECK: %[[GATHER:.*]] = tensor.gather %[[ARG0]][%[[ARG2]]] gather_dims([1, 2]) unique : (tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<1x3x4x1x1xf32>		// CHECK: %[[GATHER:.*]] = tensor.gather %[[ARG0]][%[[ARG2]]] gather_dims([1, 2]) unique : (tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<1x3x4x1x1xf32>
%gathered = tensor.gather %dest[%indices_i32] gather_dims([1, 2]) unique:		%gathered = tensor.gather %dest[%indices_i32] gather_dims([1, 2]) unique:
(tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<1x3x4x1x1xf32>		(tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<1x3x4x1x1xf32>
// CHECK: %[[GATHER0:.*]] = tensor.gather %[[ARG0]][%[[ARG1]]] gather_dims([1, 2]) unique : (tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<1x3x4xf32>		// CHECK: %[[GATHER0:.*]] = tensor.gather %[[ARG0]][%[[ARG1]]] gather_dims([1, 2]) unique : (tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<1x3x4xf32>
%rank_reduced_gathered = tensor.gather %dest[%indices] gather_dims([1, 2]) unique:		%rank_reduced_gathered = tensor.gather %dest[%indices] gather_dims([1, 2]) unique:
(tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<1x3x4xf32>		(tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<1x3x4xf32>

// CHECK: %{{.*}} = tensor.scatter %[[GATHER]] into %[[ARG0]][%[[ARG1]]] scatter_dims([1, 2]) unique : (tensor<1x3x4x1x1xf32>, tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<4x5x6xf32>		// CHECK: %{{.*}} = tensor.scatter %[[GATHER]] into %[[ARG0]][%[[ARG1]]] scatter_dims([1, 2]) unique : (tensor<1x3x4x1x1xf32>, tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<4x5x6xf32>
%scattered = tensor.scatter %gathered into %dest[%indices]		%scattered = tensor.scatter %gathered into %dest[%indices]
scatter_dims([1, 2]) unique:		scatter_dims([1, 2]) unique:
(tensor<1x3x4x1x1xf32>, tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<4x5x6xf32>		(tensor<1x3x4x1x1xf32>, tensor<4x5x6xf32>, tensor<1x3x2xindex>) -> tensor<4x5x6xf32>
// CHECK: %{{.*}} = tensor.scatter %[[GATHER0]] into %[[ARG0]][%[[ARG2]]] scatter_dims([1, 2]) unique : (tensor<1x3x4xf32>, tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<4x5x6xf32>		// CHECK: %{{.*}} = tensor.scatter %[[GATHER0]] into %[[ARG0]][%[[ARG2]]] scatter_dims([1, 2]) unique : (tensor<1x3x4xf32>, tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<4x5x6xf32>
%rank_reduced_scattered = tensor.scatter %rank_reduced_gathered into %dest[%indices_i32]		%rank_reduced_scattered = tensor.scatter %rank_reduced_gathered into %dest[%indices_i32]
scatter_dims([1, 2]) unique:		scatter_dims([1, 2]) unique:
(tensor<1x3x4xf32>, tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<4x5x6xf32>		(tensor<1x3x4xf32>, tensor<4x5x6xf32>, tensor<1x3x2xi32>) -> tensor<4x5x6xf32>
return		return
}		}

hanchung: We should use `// -----` to split tests. I don't know why `--split-input-file` is not added in the test command (i.e., line 1), but we should add it at least for consistency. That's how we write the tests in this file.
hanchung: I meant adding `// -----` to the new tests. Sorry for the ambiguous comment. The change looks good to me; I was trying not to add too many non-specific things to the revision. Anyway, thanks for fixing the other parts of this file!
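As an illustration of the splitting the comments above refer to, here is a minimal Python sketch of how a test file could be cut into independent chunks at `// -----` lines. The helper name `split_input_file` is hypothetical, and the sketch assumes a separator is a line that contains only the marker; it is a rough model, not the logic `mlir-opt` actually uses.

```python
def split_input_file(text, marker="// -----"):
    """Split test text into chunks at lines consisting only of the marker.

    Hypothetical helper: a rough model of chunked test processing.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip() == marker:
            # A separator closes the current chunk and starts a new one.
            chunks.append("\n".join(current))
            current = []
        else:
            current.append(line)
    chunks.append("\n".join(current))
    return chunks


example = "func.func @a() { return }\n// -----\nfunc.func @b() { return }\n"
chunks = split_input_file(example)
```

Each chunk can then be handled on its own, which is why a failure in one test function need not affect the functions in the other chunks.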
		// -----

		func.func @pack_nc_to_ncnc(%source: tensor<128x256xf32>, %dest: tensor<4x16x32x16xf32>) -> tensor<128x256xf32> {
		%0 = tensor.pack %source inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %dest : tensor<128x256xf32> -> tensor<4x16x32x16xf32>
		%1 = tensor.empty() : tensor<128x256xf32>
		%2 = tensor.unpack %0 inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %1 : tensor<4x16x32x16xf32> -> tensor<128x256xf32>
		return %2 : tensor<128x256xf32>
		}
mehdi_amini: Please use CHECK-LABEL.

		// CHECK-LABEL: func.func @pack_nc_to_ncnc(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<128x256xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<4x16x32x16xf32>)
		// CHECK: %[[PACKED:.*]] = tensor.pack %[[SOURCE]] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %[[DEST]] : tensor<128x256xf32> -> tensor<4x16x32x16xf32>
		// CHECK: %[[BUFF:.*]] = tensor.empty() : tensor<128x256xf32>
		// CHECK: %{{.*}} = tensor.unpack %[[PACKED]] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %[[BUFF]] : tensor<4x16x32x16xf32> -> tensor<128x256xf32>
mehdi_amini: Please minimize the CHECK to the absolute minimum needed for what you intend to test.

		// -----

		func.func @pack_nc_to_ncnc_with_padding(%source: tensor<13x15xf32>, %dest: tensor<2x8x8x2xf32>, %padding: f32) -> tensor<13x15xf32> {
		%0 = tensor.pack %source padding_value(%padding : f32) inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %dest : tensor<13x15xf32> -> tensor<2x8x8x2xf32>
		%1 = tensor.empty() : tensor<13x15xf32>
		%2 = tensor.unpack %0 inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %1 : tensor<2x8x8x2xf32> -> tensor<13x15xf32>
		return %2 : tensor<13x15xf32>
		}

		// CHECK-LABEL: func.func @pack_nc_to_ncnc_with_padding(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<13x15xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<2x8x8x2xf32>,
		// CHECK-SAME: %[[PADDING:.*]]: f32)
		// CHECK: %[[PACKED:.*]] = tensor.pack %[[SOURCE]] padding_value(%[[PADDING]] : f32) inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %[[DEST]] : tensor<13x15xf32> -> tensor<2x8x8x2xf32>
		// CHECK: %[[BUFF:.*]] = tensor.empty() : tensor<13x15xf32>
		// CHECK: %{{.*}} = tensor.unpack %[[PACKED]] inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %[[BUFF]] : tensor<2x8x8x2xf32> -> tensor<13x15xf32>

		// -----
rengolin: Missing CHECK lines for the third test.

		func.func @pack_ck_to_kcck(%source: tensor<128x256xf32>, %dest: tensor<16x4x32x16xf32>) -> tensor<128x256xf32> {
		%0 = tensor.pack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %dest : tensor<128x256xf32> -> tensor<16x4x32x16xf32>
		%1 = tensor.empty() : tensor<128x256xf32>
		%2 = tensor.unpack %0 outer_dims_perm = [1, 0] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %1 : tensor<16x4x32x16xf32> -> tensor<128x256xf32>
		return %2 : tensor<128x256xf32>
		}

		// CHECK-LABEL: func.func @pack_ck_to_kcck(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<128x256xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<16x4x32x16xf32>)
		// CHECK: %[[PACKED:.*]] = tensor.pack %[[SOURCE]] outer_dims_perm = [1, 0] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %[[DEST]] : tensor<128x256xf32> -> tensor<16x4x32x16xf32>
		// CHECK: %[[BUFF:.*]] = tensor.empty() : tensor<128x256xf32>
		// CHECK: %{{.*}} = tensor.unpack %[[PACKED]] outer_dims_perm = [1, 0] inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %[[BUFF]] : tensor<16x4x32x16xf32> -> tensor<128x256xf32>

		// -----

		func.func @pad_and_pack_fully_dynamic(%source: tensor<?x?xf32>, %dest: tensor<?x?x?x?xf32>, %pad: f32, %tile_n : index, %tile_m : index) -> tensor<?x?x?x?xf32> {
		%0 = tensor.pack %source padding_value(%pad : f32) inner_dims_pos = [0, 1] inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?xf32> -> tensor<?x?x?x?xf32>
		return %0 : tensor<?x?x?x?xf32>
		}

		// CHECK-LABEL: func.func @pad_and_pack_fully_dynamic(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<?x?xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<?x?x?x?xf32>,
		// CHECK-SAME: %[[PAD:.*]]: f32,
		// CHECK-SAME: %[[TILE_N:.*]]: index,
		// CHECK-SAME: %[[TILE_M:.*]]: index)
		// CHECK: %{{.*}} = tensor.pack %[[SOURCE]] padding_value(%[[PAD]] : f32) inner_dims_pos = [0, 1] inner_tiles = [%[[TILE_N]], %[[TILE_M]]] into %[[DEST]] : tensor<?x?xf32> -> tensor<?x?x?x?xf32>

		// -----

		func.func @pad_and_pack_partially_dynamic(%source: tensor<?x?xf32>, %dest: tensor<?x?x8x2xf32>, %pad: f32) -> tensor<?x?x8x2xf32> {
		%0 = tensor.pack %source padding_value(%pad : f32) inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %dest : tensor<?x?xf32> -> tensor<?x?x8x2xf32>
		return %0 : tensor<?x?x8x2xf32>
		}

		// CHECK-LABEL: func.func @pad_and_pack_partially_dynamic(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<?x?xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<?x?x8x2xf32>,
		// CHECK-SAME: %[[PAD:.*]]: f32)
		// CHECK: %{{.*}} = tensor.pack %[[SOURCE]] padding_value(%[[PAD]] : f32) inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %[[DEST]] : tensor<?x?xf32> -> tensor<?x?x8x2xf32>

		// -----

		func.func @unpack_fully_dynamic(%source: tensor<?x?x?x?xf32>, %dest: tensor<?x?xf32>, %tile_n : index, %tile_m : index) -> tensor<?x?xf32> {
		%0 = tensor.unpack %source inner_dims_pos = [0, 1] inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?x?x?xf32> -> tensor<?x?xf32>
		return %0 : tensor<?x?xf32>
		}

		// CHECK-LABEL: func.func @unpack_fully_dynamic(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<?x?x?x?xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<?x?xf32>,
		// CHECK-SAME: %[[TILE_N:.*]]: index,
		// CHECK-SAME: %[[TILE_M:.*]]: index)
		// CHECK: %{{.*}} = tensor.unpack %[[SOURCE]] inner_dims_pos = [0, 1] inner_tiles = [%[[TILE_N]], %[[TILE_M]]] into %[[DEST]] : tensor<?x?x?x?xf32> -> tensor<?x?xf32>

		// -----

		func.func @unpack_partially_dynamic(%source: tensor<?x?x8x2xf32>, %dest: tensor<?x?xf32>) -> tensor<?x?xf32> {
		%0 = tensor.unpack %source inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %dest : tensor<?x?x8x2xf32> -> tensor<?x?xf32>
		return %0: tensor<?x?xf32>
		}

		// CHECK-LABEL: func.func @unpack_partially_dynamic(
		// CHECK-SAME: %[[SOURCE:.*]]: tensor<?x?x8x2xf32>,
		// CHECK-SAME: %[[DEST:.*]]: tensor<?x?xf32>)
		// CHECK: %{{.*}} = tensor.unpack %[[SOURCE]] inner_dims_pos = [0, 1] inner_tiles = [8, 2] into %[[DEST]] : tensor<?x?x8x2xf32> -> tensor<?x?xf32>
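The static result types in the pack tests above can be reproduced with a little arithmetic. The Python sketch below is only an illustration under stated assumptions: the helper `packed_shape` is hypothetical, it handles static shapes only, and it assumes that outer dimension `i` of the result is taken from position `outer_dims_perm[i]` (the only permutation tested here, `[1, 0]`, cannot distinguish this from the inverse convention).

```python
def packed_shape(source_shape, inner_dims_pos, inner_tiles, outer_dims_perm=None):
    """Static shape of a tensor.pack result (hypothetical illustrative helper)."""
    # Each tiled source dimension is ceil-divided by its tile size, so a
    # partial trailing tile still counts (the padded case).
    outer = list(source_shape)
    for pos, tile in zip(inner_dims_pos, inner_tiles):
        outer[pos] = -(-outer[pos] // tile)
    # Optionally permute the outer dimensions (assumed convention, see above).
    if outer_dims_perm is not None:
        outer = [outer[i] for i in outer_dims_perm]
    # The tile sizes become the innermost dimensions.
    return outer + list(inner_tiles)


nc_to_ncnc = packed_shape([128, 256], [0, 1], [32, 16])
with_padding = packed_shape([13, 15], [0, 1], [8, 2])
ck_to_kcck = packed_shape([128, 256], [0, 1], [32, 16], outer_dims_perm=[1, 0])
```

These three calls correspond to `@pack_nc_to_ncnc`, `@pack_nc_to_ncnc_with_padding`, and `@pack_ck_to_kcck`. In the padded case 13 and 15 are not multiples of 8 and 2, which is why that test supplies a `padding_value`.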

mlir/test/Transforms/loop-invariant-code-motion.mlir

// CHECK-LABEL: @speculate_ceildivsi_const(
scf.for %i = %lb to %ub step %step {		scf.for %i = %lb to %ub step %step {
// CHECK: arith.ceildivsi		// CHECK: arith.ceildivsi
// CHECK: scf.for		// CHECK: scf.for
%val = arith.ceildivsi %num, %c5 : i32		%val = arith.ceildivsi %num, %c5 : i32
}		}

return		return
}		}

		// -----

		func.func @speculate_static_pack_and_unpack(%source: tensor<128x256xf32>,
		%dest: tensor<4x16x32x16xf32>, %lb: index, %ub: index, %step: index) {

		// CHECK: tensor.pack
		// CHECK-NEXT: scf.for
		scf.for %i = %lb to %ub step %step {
		%packed = tensor.pack %source
		inner_dims_pos = [0, 1]
		inner_tiles = [32, 16] into %dest : tensor<128x256xf32> -> tensor<4x16x32x16xf32>
		}

		// CHECK: tensor.unpack
		// CHECK-NEXT: scf.for
		scf.for %i = %lb to %ub step %step {
		%unpacked = tensor.unpack %dest
		inner_dims_pos = [0, 1]
		inner_tiles = [32, 16] into %source : tensor<4x16x32x16xf32> -> tensor<128x256xf32>
		}
		return
		}

		// -----

		func.func @speculate_dynamic_pack_and_unpack(%source: tensor<?x?xf32>,
		%dest: tensor<?x?x?x?xf32>, %lb: index, %ub: index, %step: index,
		%tile_m: index, %tile_n: index, %pad: f32) {

		// CHECK: scf.for
		// CHECK-NEXT: tensor.pack
		scf.for %i = %lb to %ub step %step {
		%packed = tensor.pack %source
		inner_dims_pos = [0, 1]
		inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?xf32> -> tensor<?x?x?x?xf32>
		}

		// CHECK: scf.for
		// CHECK-NEXT: tensor.unpack
		scf.for %i = %lb to %ub step %step {
		%unpacked = tensor.unpack %dest
		inner_dims_pos = [0, 1]
		inner_tiles = [%tile_n, %tile_m] into %source : tensor<?x?x?x?xf32> -> tensor<?x?xf32>
		}

		// CHECK: tensor.pack
		// CHECK-NEXT: scf.for
		scf.for %i = %lb to %ub step %step {
		%packed = tensor.pack %source padding_value(%pad : f32)
		inner_dims_pos = [0, 1]
		inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?xf32> -> tensor<?x?x?x?xf32>
		}
		return
		}
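The CHECK lines in the two tests above encode which operations loop-invariant code motion is expected to move in front of the `scf.for`. The Python sketch below merely restates those expectations as a predicate. The function `expected_hoisted` and its parameters are hypothetical, and this is not the logic in TensorOps.cpp. Presumably a pack without a padding value and with dynamic tile sizes stays in the loop because it cannot be shown that the tiles divide the source dimensions evenly.

```python
def expected_hoisted(op, all_static, has_padding=False):
    """Whether the tests above expect `op` to appear before the loop.

    Hypothetical model of the CHECK expectations only.
    """
    if op == "pack":
        # Hoisted when everything is static, or when a padding value is given.
        return all_static or has_padding
    if op == "unpack":
        # Hoisted only in the fully static test.
        return all_static
    raise ValueError(f"unknown op: {op}")


cases = {
    "static pack": expected_hoisted("pack", all_static=True),
    "static unpack": expected_hoisted("unpack", all_static=True),
    "dynamic pack": expected_hoisted("pack", all_static=False),
    "dynamic unpack": expected_hoisted("unpack", all_static=False),
    "dynamic pack with padding": expected_hoisted("pack", all_static=False, has_padding=True),
}
```

The five entries in `cases` correspond, in order, to the five loops in `@speculate_static_pack_and_unpack` and `@speculate_dynamic_pack_and_unpack`.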

Revision Contents

Diff 477084

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp

mlir/test/Dialect/Tensor/invalid.mlir

mlir/test/Dialect/Tensor/ops.mlir

mlir/test/Transforms/loop-invariant-code-motion.mlir
