This is an archive of the discontinued LLVM Phabricator instance.

Differential D148159

[mlir][TransformDialect] Simplify the lowering of pack/unpack when these are just pad/unpad
ClosedPublic

Authored by qcolombet on Apr 12 2023, 1:21 PM.

Details

Reviewers

ftynse
nicolasvasilache

Commits

rG0bfbecf52e8f: [mlir][TransformDialect] Simplify the lowering of pack/unpack when these are…

Summary

This patch recognizes when tensor.pack/unpack operations are simple tensor.pad/unpad operations (an unpad being a tensor.extract_slice) and lowers them into a simpler sequence of operations.
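
The recognition condition can be illustrated outside of MLIR. The following is a minimal sketch in plain C++, assuming static shapes; `isLikePadSketch` and its vector-based parameters are hypothetical stand-ins for the `isLikePadUnPad` helper added to TensorOps.cpp in this diff, not the actual MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the check added in TensorOps.cpp; it works on
// plain vectors instead of tensor::PackOp / tensor::UnPackOp.
// `innerDimsPos` lists the source dimensions that get tiled and
// `packedShape` is the static shape of the packed tensor.
bool isLikePadSketch(const std::vector<int64_t> &innerDimsPos,
                     const std::vector<int64_t> &packedShape) {
  const std::size_t numPackedDims = innerDimsPos.size();
  assert(numPackedDims <= packedShape.size() && "more tiles than dimensions");

  // The tiled dimensions must be 0, 1, ..., numPackedDims - 1, in order;
  // anything else implies a transpose.
  for (std::size_t i = 0; i < numPackedDims; ++i)
    if (innerDimsPos[i] != static_cast<int64_t>(i))
      return false;

  // Every dimension except the trailing tile dimensions must have size 1,
  // so that moving the tiles innermost does not reorder any data.
  for (std::size_t i = 0; i + numPackedDims < packedShape.size(); ++i)
    if (packedShape[i] != 1)
      return false;
  return true;
}
```

Under these assumptions, a pack with inner_dims_pos = [0, 1] producing a 1x1x16x16 tensor qualifies, while the same pack producing 2x1x8x16, or one with inner_dims_pos = [1, 0], does not.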

For pack, instead of doing:

pad
expand_shape
transpose

we do

pad
insert_slice

For unpack, instead of doing:

transpose
collapse_shape
extract_slice

we do

extract_slice
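
For the unpack case, the only thing the simplified lowering has to compute is the size operand of the slice (offsets are all zeros and strides all ones). A minimal sketch, assuming static shapes; `unpadSliceSizesSketch` is a hypothetical name and uses plain integers where the patch uses `OpFoldResult` values:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the size computation in the isLikeUnPad
// branch of lowerUnPack: the leading (outer) dimensions of the packed
// tensor are all 1, and the trailing ones take the destination sizes.
std::vector<int64_t> unpadSliceSizesSketch(int64_t packedRank,
                                           const std::vector<int64_t> &destShape) {
  const int64_t destRank = static_cast<int64_t>(destShape.size());
  assert(packedRank >= destRank && "packed tensor must not have lower rank");

  // One unit-sized entry per extra outer dimension of the packed tensor.
  std::vector<int64_t> sizes(static_cast<std::size_t>(packedRank - destRank), 1);
  // Followed by the destination sizes for the inner dimensions.
  sizes.insert(sizes.end(), destShape.begin(), destShape.end());
  return sizes;
}
```

For example, unpacking a rank-4 packed tensor into a 13x15 destination would slice with sizes [1, 1, 13, 15]; the resulting extract_slice is rank-reducing.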

Note: returning nullptr for the operations that the simplified lowering does not create is fine for the transform dialect. The related handles are just ignored by the following transformation.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

qcolombet created this revision. Apr 12 2023, 1:21 PM

Herald added a project: Restricted Project. Apr 12 2023, 1:21 PM

Herald added subscribers: bviyer, hanchung, Moerafaat and 27 others.

qcolombet requested review of this revision. Apr 12 2023, 1:21 PM

Herald added a subscriber: stephenneuendorffer. Apr 12 2023, 1:21 PM

Harbormaster completed remote builds in B225163: Diff 512949. Apr 12 2023, 1:41 PM

can we add a test with dynamic shape ?
(maybe we can't if some op does not support it, but I *think* we can)

mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
877	This assumes static, we really want `createOrFold<DimOp>` here; we should have some helpers around that return the vector<OpFoldResult> with the right quantities. Not sure where to look offhand for those helpers ..
883	3 newlines ? :)
884	can we early exit here and avoid the else nesting ?
908	with early exit, I believe we can drop the trailing assert.
995	same re dynamic sizes
1060	same re early exit and dropping the trailing assert.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
3722	Something more idiomatic with `ArrayRef == llvm::seq` ?
3741	Something like `return llvm::all_of(..., [](){ == 1; })` ?

This revision is now accepted and ready to land. Apr 12 2023, 1:57 PM

mravishankar added inline comments. Apr 12 2023, 2:06 PM

mlir/test/Dialect/Linalg/transform-lower-pack.mlir
41	I am wondering if we even want to have an `insert_slice` here? We could just do a `reshape` here. In which case the only difference is whether there is a transpose or not? If we can teach bufferization to handle this reshape without introducing additional buffers, that'd be very useful (we used to have a `memref.reshape` to just change the indexing, wonder if we still have that).

nicolasvasilache added inline comments. Apr 13 2023, 1:11 AM

mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
877	Ok, after quick offline chat, I see this is not strictly needed given that the shape is asserted to be static. This is because `tensor.expand_shape` is not yet powerful enough re dynamic sizes, but this is a longer endeavor. I still pulled out the relevant functionality here: https://reviews.llvm.org/D148201 for better reuse. While it will do the same thing for now using an idiomatic API, it is also future proof. Could you update to using getMixedDimensions ?
mlir/test/Dialect/Linalg/transform-lower-pack.mlir
41	In principle we could but I'd like to see how these things connect in practice. At the graph level, reshapes tend to simplify reasonably well. At the codegen level, as soon as we start tiling and taking slices, reshapes don't behave properly atm. I'd say this is dependent on where we apply the lowering and we will likely want this to be configurable in the future, based on data.

Forgot to submit these comments!

mlir/include/mlir/Dialect/Utils/IndexingUtils.h
208	I ended up not needing this refactoring but I wanted to show it nevertheless. I'll remove them from that commit after I get your feedback. Do you think it makes sense to expose these in the utility functions (in that case I'll make a separate NFC commit) or should I just drop them altogether?
mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h
535	Ditto, not needed but I wanted to show it to you.

nicolasvasilache added inline comments. Apr 13 2023, 1:23 AM

mlir/include/mlir/Dialect/Utils/IndexingUtils.h
208	I think these are useful utils to expose in general, fine as is IMO.
mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h
535	I think these are useful utils to expose in general, fine as is IMO.

Use all_of and llvm::seq where appropriate
Use getMixedDimensions to future proof the code w.r.t. dynamic dimensions
Early-exit to avoid else indentation

qcolombet marked 10 inline comments as done. Apr 13 2023, 2:37 AM

qcolombet added inline comments.

mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
883	It was like that in the existing code :). Removed it.
mlir/test/Dialect/Linalg/transform-lower-pack.mlir
41	Alright I'll land as is and we'll revisit later then.

Harbormaster completed remote builds in B225281: Diff 513132. Apr 13 2023, 2:48 AM

nicolasvasilache accepted this revision. Apr 13 2023, 3:06 AM

Closed by commit rG0bfbecf52e8f: [mlir][TransformDialect] Simplify the lowering of pack/unpack when these are… (authored by qcolombet). Apr 13 2023, 3:47 AM

This revision was automatically updated to reflect the committed changes.

qcolombet marked an inline comment as done.

qcolombet added a commit: rG0bfbecf52e8f: [mlir][TransformDialect] Simplify the lowering of pack/unpack when these are….

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td	13 lines
mlir/include/mlir/Dialect/Utils/IndexingUtils.h	9 lines
mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h	13 lines
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp	131 lines
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp	43 lines
mlir/lib/Dialect/Utils/IndexingUtils.cpp	21 lines
mlir/lib/Dialect/Utils/ReshapeOpsUtils.cpp	39 lines
mlir/test/Dialect/Linalg/transform-lower-pack.mlir	110 lines

Diff 513159

mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td

[1,811 lines not shown]	let extraClassDeclaration = commonExtraClassDeclaration # [{
/// - If not empty, innerPermutation is a valid permutation of size		/// - If not empty, innerPermutation is a valid permutation of size
/// matching innerDimPos.		/// matching innerDimPos.
/// - If not empty, outerPermutation is a valid permutation of size		/// - If not empty, outerPermutation is a valid permutation of size
/// matching outerDimsPerm.		/// matching outerDimsPerm.
PackOp createTransposedClone(OpBuilder &b,		PackOp createTransposedClone(OpBuilder &b,
Location loc,		Location loc,
ArrayRef<int64_t> innerPermutation,		ArrayRef<int64_t> innerPermutation,
ArrayRef<int64_t> outerPermutation);		ArrayRef<int64_t> outerPermutation);

		/// Check if this PackOp is like a simple pad operation.
		/// In other words, this operation:
		/// 1. adds useless dimensions (dimension of size 1),
		/// 2. pads the other ones, and
		/// 3. doesn't shuffle the dimensions
		bool isLikePad();
}];		}];

let hasCanonicalizeMethod = 1;		let hasCanonicalizeMethod = 1;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// UnPackOp		// UnPackOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[59 lines not shown]	let extraClassDeclaration = commonExtraClassDeclaration # [{
/// matching innerDimPos.		/// matching innerDimPos.
/// - If not empty, outerPermutation is a valid permutation of size		/// - If not empty, outerPermutation is a valid permutation of size
/// matching outerDimsPerm.		/// matching outerDimsPerm.
UnPackOp createTransposedClone(OpBuilder &b,		UnPackOp createTransposedClone(OpBuilder &b,
Location loc,		Location loc,
Value transposedSource,		Value transposedSource,
ArrayRef<int64_t> innerPermutation,		ArrayRef<int64_t> innerPermutation,
ArrayRef<int64_t> outerPermutation);		ArrayRef<int64_t> outerPermutation);

		/// Check if this UnPackOp is like a simple unpad operation.
		/// In other words, this operation:
		/// 1. drops useless dimensions (dimension of size 1), and
		/// 2. reduces dimensions in place (i.e., no transpose).
		bool isLikeUnPad();
}];		}];

let hasCanonicalizeMethod = 1;		let hasCanonicalizeMethod = 1;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// YieldOp		// YieldOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[20 lines not shown]

mlir/include/mlir/Dialect/Utils/IndexingUtils.h

	[192 lines not shown]
	}			}

	/// Helper method to apply to inverse a permutation.			/// Helper method to apply to inverse a permutation.
	SmallVector<int64_t> invertPermutationVector(ArrayRef<int64_t> permutation);			SmallVector<int64_t> invertPermutationVector(ArrayRef<int64_t> permutation);

	/// Method to check if an interchange vector is a permutation.			/// Method to check if an interchange vector is a permutation.
	bool isPermutationVector(ArrayRef<int64_t> interchange);			bool isPermutationVector(ArrayRef<int64_t> interchange);

				/// Return a permutation vector of size permSize that would result in moving
				/// positions into desiredPositions.
				///
				/// For example, permSize == 5, positions = {2, 4}, desiredPositions = {1, 0}
				/// would result in a {4, 2, 0, 1, 3} permutation vector.
				SmallVector<int64_t>
				computePermutationVector(int64_t permSize, ArrayRef<int64_t> positions,
				ArrayRef<int64_t> desiredPositions);
				[inline comment, done] qcolombet: I ended up not needing this refactoring but I wanted to show it nevertheless. I'll remove them from that commit after I get your feedback. Do you think it makes sense to expose these in the utility functions (in that case I'll make a separate NFC commit) or should I just drop them altogether?
				[inline comment, done] nicolasvasilache: I think these are useful utils to expose in general, fine as is IMO.

	/// Helper to return a subset of `arrayAttr` as a vector of int64_t.			/// Helper to return a subset of `arrayAttr` as a vector of int64_t.
	// TODO: Port everything relevant to DenseArrayAttr and drop this util.			// TODO: Port everything relevant to DenseArrayAttr and drop this util.
	SmallVector<int64_t> getI64SubArray(ArrayAttr arrayAttr, unsigned dropFront = 0,			SmallVector<int64_t> getI64SubArray(ArrayAttr arrayAttr, unsigned dropFront = 0,
	unsigned dropBack = 0);			unsigned dropBack = 0);

	} // namespace mlir			} // namespace mlir

	#endif // MLIR_DIALECT_UTILS_INDEXINGUTILS_H			#endif // MLIR_DIALECT_UTILS_INDEXINGUTILS_H
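
To make the documented example concrete, here is a minimal standalone sketch of the permutation computation. It follows the logic of the function this diff removes from LinalgTransformOps.cpp, but uses standard containers instead of `SmallVector`, `ArrayRef` and `DenseSet`; the name `computePermutationVectorSketch` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch only: standard-library rendition of computePermutationVector.
// Entry `desiredPositions[i]` of the result receives `positions[i]`; the
// remaining entries are filled, in increasing order, with the indices that
// were not listed in `positions`.
std::vector<int64_t>
computePermutationVectorSketch(int64_t permSize,
                               const std::vector<int64_t> &positions,
                               const std::vector<int64_t> &desiredPositions) {
  assert(positions.size() == desiredPositions.size() && "size mismatch");
  std::vector<int64_t> res(static_cast<std::size_t>(permSize), -1);
  std::unordered_set<int64_t> seen;
  for (std::size_t i = 0; i < positions.size(); ++i) {
    res[desiredPositions[i]] = positions[i];
    seen.insert(positions[i]);
  }
  int64_t nextPos = 0;
  for (int64_t &entry : res) {
    if (entry != -1)
      continue; // Already pinned by a desired position.
    while (seen.count(nextPos))
      ++nextPos; // Skip indices that were explicitly placed.
    entry = nextPos++;
  }
  return res;
}
```

With permSize = 5, positions = {2, 4} and desiredPositions = {1, 0}, this yields {4, 2, 0, 1, 3}, matching the example in the header comment.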

mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h

	[514 lines not shown]
	/// [1, 1, 1]			/// [1, 1, 1]
	/// : tensor<?x1x30xf32> to tensor<?x30xf32>			/// : tensor<?x1x30xf32> to tensor<?x30xf32>
	/// ```			/// ```
	FailureOr<CollapseShapeRankReducingSliceSimplificationInfo>			FailureOr<CollapseShapeRankReducingSliceSimplificationInfo>
	getSimplifyCollapseShapeWithRankReducingSliceInfo(			getSimplifyCollapseShapeWithRankReducingSliceInfo(
	RankedTensorType sourceType,			RankedTensorType sourceType,
	ArrayRef<ReassociationIndices> reassociationIndices);			ArrayRef<ReassociationIndices> reassociationIndices);

				struct PackingMetadata {
				SmallVector<int64_t> insertPositions;
				SmallVector<ReassociationIndices> reassociations;
				};

				/// Given a vector of `positions` indices representing desired packing insertion
				/// points into a target vector (i.e. pack/unpack.inner_dim_pos), compute the
				/// final positions in the target shape as well as the reshape reassociations.
				// Note: This should not be called with a large positions array (or the
				// implementation needs to be updated to use an N.log N sort instead of
				// repeated N^2 counts).
				PackingMetadata computePackingMetadata(int64_t packedRank,
				ArrayRef<int64_t> innerDimPos);
				[inline comment, done] qcolombet: Ditto, not needed but I wanted to show it to you.
				[inline comment, done] nicolasvasilache: I think these are useful utils to expose in general, fine as is IMO.
	} // namespace mlir			} // namespace mlir

	#endif // MLIR_DIALECT_UTILS_RESHAPEOPSUTILS_H			#endif // MLIR_DIALECT_UTILS_RESHAPEOPSUTILS_H
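
The insert positions and reassociations described above can be traced with a small standalone sketch. It mirrors the logic of the function this diff removes from LinalgTransformOps.cpp, with `std::vector` standing in for `SmallVector` and `ReassociationIndices`; the `...Sketch` names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch only: standard-library stand-in for PackingMetadata.
struct PackingMetadataSketch {
  std::vector<int64_t> insertPositions;
  std::vector<std::vector<int64_t>> reassociations;
};

PackingMetadataSketch
computePackingMetadataSketch(int64_t packedRank,
                             const std::vector<int64_t> &innerDimPos) {
  PackingMetadataSketch res;
  // offset == 1 places each tile dimension right after the dimension it
  // tiles (AaBbCD rather than aAbBCD).
  const int64_t offset = 1;
  for (int64_t pos : innerDimPos) {
    // Number of tile dimensions inserted before `pos`.
    int64_t numInsertedBefore =
        std::count_if(innerDimPos.begin(), innerDimPos.end(),
                      [pos](int64_t other) { return pos > other; });
    res.insertPositions.push_back(pos + numInsertedBefore + offset);
  }

  std::unordered_set<int64_t> posSet(res.insertPositions.begin(),
                                     res.insertPositions.end());
  for (int64_t i = 1; i <= packedRank; ++i) {
    if (!posSet.count(i)) {
      // Untiled dimension: it forms a group on its own.
      res.reassociations.push_back(std::vector<int64_t>{i - 1});
      continue;
    }
    // Tiled dimension: group it with the tile dimension that follows it.
    res.reassociations.push_back(std::vector<int64_t>{i - 1, i});
    ++i;
  }
  return res;
}
```

For the ABCD to ABCDba example from the source comments (packedRank = 6, inner dims [1, 0]), this gives insert positions [3, 1] and reassociations [[0, 1], [2, 3], [4], [5]], i.e. the AaBbCD layout.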

mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp

[133 lines not shown]	if (op->getNumResults() != 1 || !op->getResult(0).getType().isIndex()) {
return diag;		return diag;
}		}
result.push_back(op->getResult(0));		result.push_back(op->getResult(0));
}		}

return DiagnosedSilenceableFailure::success();		return DiagnosedSilenceableFailure::success();
}		}

/// Return a permutation vector of size permSize that would result in moving
/// positions into desiredPositions.
///
/// For example, permSize == 5, positions = {2, 4}, desiredPositions = {1, 0}
/// would result in a {4, 2, 0, 1, 3} permutation vector.
static SmallVector<int64_t>
computePermutationVector(int64_t permSize, ArrayRef<int64_t> positions,
ArrayRef<int64_t> desiredPositions) {
SmallVector<int64_t> res(permSize, -1);
DenseSet<int64_t> seen;
for (auto [pos, desiredPos] : llvm::zip_equal(positions, desiredPositions)) {
res[desiredPos] = pos;
seen.insert(pos);
}
int64_t nextPos = 0;
for (int64_t &entry : res) {
if (entry != -1)
continue;
while (seen.contains(nextPos))
++nextPos;
entry = nextPos;
++nextPos;
}
return res;
}

struct PackingMetadata {
SmallVector<int64_t> insertPositions;
SmallVector<ReassociationIndices> reassociations;
};
/// Given a vector of `positions` indices representing desired packing insertion
/// points into a target vector (i.e. pack/unpack.inner_dim_pos), compute the
/// final positions in the target shape as well as the reshape reassociations.
// Note: This should not be called with a large positions array (or the
// implementation needs to be updated to use an N.log N sort instead of
// repeated N^2 counts).
static PackingMetadata computePackingMetadata(int64_t packedRank,
ArrayRef<int64_t> innerDimPos) {
PackingMetadata res;
res.insertPositions.reserve(innerDimPos.size());
// The pack insert position is the position + the number of previously
// inserted positions + offset.
// The offset controls whether the packing dimension is the first or last.
//
// Example
// =======
// Consider packing from a hypothetical ABCD layout to ABCDba whose
// pack.inner_dims is [1, 0]. The first step consists in undoing the
// permutation and producing AaBbCD. This is achieved purely by computing the
// insert positions of `b` and `a` into `ABCD`, starting from [1, 0]. One
// possibility, is to produce insert positions [2, 0], this would result in an
// aAbBCD layout (i.e. offset 0). The other possibility, is to produce insert
// positions [3, 1], this would result in an AaBbCD layout (i.e. offset 1).
// The latter is what we expect from packing.
int64_t offset = 1;
for (int64_t pos : innerDimPos) {
int64_t numInsertedBefore = llvm::count_if(
innerDimPos, [&pos](int64_t pos2) { return pos > pos2; });
res.insertPositions.push_back(pos + numInsertedBefore + offset);
}

DenseSet<int64_t> posSet(res.insertPositions.begin(),
res.insertPositions.end());
res.reassociations.reserve(packedRank);
for (int64_t i = 1; i <= packedRank; ++i) {
if (!posSet.contains(i)) {
res.reassociations.push_back(ReassociationIndices{i - 1});
continue;
}
res.reassociations.push_back(ReassociationIndices{i - 1, i});
++i;
}
return res;
}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BufferizeToAllocationOp		// BufferizeToAllocationOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
DiagnosedSilenceableFailure		DiagnosedSilenceableFailure
transform::BufferizeToAllocationOp::apply(transform::TransformResults &results,		transform::BufferizeToAllocationOp::apply(transform::TransformResults &results,
transform::TransformState &state) {		transform::TransformState &state) {
Attribute memorySpace =		Attribute memorySpace =
getMemorySpace().has_value() ? getMemorySpace().value() : Attribute();		getMemorySpace().has_value() ? getMemorySpace().value() : Attribute();
[658 lines not shown]	LLVM_DEBUG(
packingMetadata.reassociations, DBGS() << "reassociations: ",		packingMetadata.reassociations, DBGS() << "reassociations: ",
[&](ReassociationIndices ri) {		[&](ReassociationIndices ri) {
llvm::interleaveComma(ri, llvm::dbgs() << "|");		llvm::interleaveComma(ri, llvm::dbgs() << "|");
});		});
DBGSNL();		DBGSNL();
llvm::interleaveComma(stripMinedShape, DBGS() << "stripMinedShape: ");		llvm::interleaveComma(stripMinedShape, DBGS() << "stripMinedShape: ");
DBGSNL(); DBGS() << "collapsed type: " << collapsed; DBGSNL(););		DBGSNL(); DBGS() << "collapsed type: " << collapsed; DBGSNL(););

		if (packOp.isLikePad()) {
		// This pack is just a plain pad.
		// Just insert the pad in the higher ranked tensor.
		auto emptyOp =
		rewriter.create<tensor::EmptyOp>(loc, packedTensorType, ValueRange{});
		// Offsets.
		SmallVector<OpFoldResult> zeros(packedRank, rewriter.getIndexAttr(0));
		// Strides.
		SmallVector<OpFoldResult> ones(packedRank, rewriter.getIndexAttr(1));
		SmallVector<OpFoldResult> sizes =
		getMixedDimensions(rewriter, loc, packOp.getDest());

		auto insertSliceOp = rewriter.create<tensor::InsertSliceOp>(
		loc, /*source=*/padOp, /*dest=*/emptyOp,
		/*offsets=*/zeros, sizes,
		/*strides=*/ones);

		LLVM_DEBUG(DBGS() << "insert_slice op: " << insertSliceOp; DBGSNL(););

		rewriter.replaceOp(packOp, insertSliceOp->getResults());

		return LowerPackResult{padOp, /*reshapeOp=*/nullptr,
		/*transposeOp=*/nullptr};
		}
// 5. Expand from the padded result to the stripMinedShape.		// 5. Expand from the padded result to the stripMinedShape.
auto reshapeOp = rewriter.create<tensor::ExpandShapeOp>(		auto reshapeOp = rewriter.create<tensor::ExpandShapeOp>(
loc,		loc,
RankedTensorType::Builder(packedTensorType).setShape(stripMinedShape),		RankedTensorType::Builder(packedTensorType).setShape(stripMinedShape),
padOp.getResult(), packingMetadata.reassociations);		padOp.getResult(), packingMetadata.reassociations);

// 6. Transpose stripMinedShape to packedShape.		// 6. Transpose stripMinedShape to packedShape.
SmallVector<int64_t> insertPositionsToLastDimsPerm = computePermutationVector(		SmallVector<int64_t> insertPositionsToLastDimsPerm = computePermutationVector(
[21 lines not shown]	DiagnosedSilenceableFailure transform::LowerPackOp::applyToOne(
rewriter.setInsertionPoint(target);		rewriter.setInsertionPoint(target);
FailureOr<LowerPackResult> res = lowerPack(rewriter, target);		FailureOr<LowerPackResult> res = lowerPack(rewriter, target);
if (failed(res)) {		if (failed(res)) {
return mlir::emitSilenceableFailure(target->getLoc())		return mlir::emitSilenceableFailure(target->getLoc())
<< "cannot lower to pad + expand + transpose";		<< "cannot lower to pad + expand + transpose";
}		}
transformResults.push_back(res->padOp);		transformResults.push_back(res->padOp);
transformResults.push_back(res->expandShapeOp);		transformResults.push_back(res->expandShapeOp);
transformResults.push_back(res->transposeOp);		transformResults.push_back(res->transposeOp);
		[inline comment, done] nicolasvasilache: This assumes static, we really want `createOrFold<DimOp>` here; we should have some helpers around that return the vector<OpFoldResult> with the right quantities. Not sure where to look offhand for those helpers ..
		[inline comment, done] nicolasvasilache: Ok, after quick offline chat, I see this is not strictly needed given that the shape is asserted to be static. This is because `tensor.expand_shape` is not yet powerful enough re dynamic sizes, but this is a longer endeavor. I still pulled out the relevant functionality here: https://reviews.llvm.org/D148201 for better reuse. While it will do the same thing for now using an idiomatic API, it is also future proof. Could you update to using getMixedDimensions ?
return DiagnosedSilenceableFailure::success();		return DiagnosedSilenceableFailure::success();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// LowerUnPackOp		// LowerUnPackOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		[inline comment, done] nicolasvasilache: 3 newlines ? :)
		[inline comment, done] qcolombet: It was like that in the existing code :). Removed it.

		[inline comment, done] nicolasvasilache: can we early exit here and avoid the else nesting ?
struct LowerUnPackOpResult {		struct LowerUnPackOpResult {
tensor::EmptyOp emptyOp;		tensor::EmptyOp emptyOp;
linalg::TransposeOp transposeOp;		linalg::TransposeOp transposeOp;
tensor::CollapseShapeOp collapseShapeOp;		tensor::CollapseShapeOp collapseShapeOp;
tensor::ExtractSliceOp extractSliceOp;		tensor::ExtractSliceOp extractSliceOp;
};		};

/// Rewrite pack as empty + transpose + reshape + extract_slice.		/// Rewrite pack as empty + transpose + reshape + extract_slice.
static FailureOr<LowerUnPackOpResult> lowerUnPack(RewriterBase &rewriter,		static FailureOr<LowerUnPackOpResult> lowerUnPack(RewriterBase &rewriter,
tensor::UnPackOp unPackOp) {		tensor::UnPackOp unPackOp) {
// 1. Filter out NYI cases.		// 1. Filter out NYI cases.
if (!unPackOp.getOuterDimsPerm().empty())		if (!unPackOp.getOuterDimsPerm().empty())
return rewriter.notifyMatchFailure(unPackOp, "outer dims perm NYI");		return rewriter.notifyMatchFailure(unPackOp, "outer dims perm NYI");

RankedTensorType packedTensorType = unPackOp.getSourceType();		RankedTensorType packedTensorType = unPackOp.getSourceType();
if (!packedTensorType.hasStaticShape()) {		if (!packedTensorType.hasStaticShape()) {
return rewriter.notifyMatchFailure(		return rewriter.notifyMatchFailure(
unPackOp,		unPackOp,
"non-static shape NYI, needs a more powerful tensor.expand_shape op");		"non-static shape NYI, needs a more powerful tensor.expand_shape op");
}		}

Location loc = unPackOp->getLoc();		Location loc = unPackOp->getLoc();
OpBuilder::InsertionGuard g(rewriter);		OpBuilder::InsertionGuard g(rewriter);
rewriter.setInsertionPoint(unPackOp);		rewriter.setInsertionPoint(unPackOp);
		[inline comment, not done] nicolasvasilache: with early exit, I believe we can drop the trailing assert.

// 2. Compute the permutation vector to move the last `numPackedDims` into the
// `innerPosDims` of a shape of rank `packedRank`.
int64_t numPackedDims = unPackOp.getInnerDimsPos().size();
int64_t packedRank = packedTensorType.getRank();		int64_t packedRank = packedTensorType.getRank();

		OpFoldResult zero = rewriter.getIndexAttr(0), one = rewriter.getIndexAttr(1);
		auto destTensorType = unPackOp.getDest().getType().cast<RankedTensorType>();
		if (unPackOp.isLikeUnPad()) {
		// This unpack is just a plain unpad.
		// Just extract the slice from the higher ranked tensor.
		ArrayRef<int64_t> destShape = destTensorType.getShape();
		// The inner dimensions stay the same as the destination tensor, but the
		// outer ones are additional 1s.
		SmallVector<OpFoldResult> sizes(packedRank - destShape.size(), one);
		sizes.append(getMixedDimensions(rewriter, loc, unPackOp.getDest()));

		auto extractSliceOp = rewriter.create<tensor::ExtractSliceOp>(
		loc, destTensorType, unPackOp.getSource(),
		SmallVector<OpFoldResult>(packedRank, zero), sizes,
		SmallVector<OpFoldResult>(packedRank, one));

		rewriter.replaceOp(unPackOp, extractSliceOp->getResults());

		return LowerUnPackOpResult{/*emptyOp=*/nullptr, /*transposeOp=*/nullptr,
		/*reshapeOp=*/nullptr, extractSliceOp};
		}
		// 2. Compute the permutation vector to move the last `numPackedDims` into
		// the `innerPosDims` of a shape of rank `packedRank`.
		int64_t numPackedDims = unPackOp.getInnerDimsPos().size();
auto lastDims = llvm::to_vector(		auto lastDims = llvm::to_vector(
llvm::seq<int64_t>(packedRank - numPackedDims, packedRank));		llvm::seq<int64_t>(packedRank - numPackedDims, packedRank));
PackingMetadata packingMetadata =		PackingMetadata packingMetadata =
computePackingMetadata(packedRank, unPackOp.getInnerDimsPos());		computePackingMetadata(packedRank, unPackOp.getInnerDimsPos());
SmallVector<int64_t> lastDimsToInsertPositionsPerm = computePermutationVector(		SmallVector<int64_t> lastDimsToInsertPositionsPerm = computePermutationVector(
packedRank, lastDims, packingMetadata.insertPositions);		packedRank, lastDims, packingMetadata.insertPositions);

// 3. Compute the stripMinedShape: this is the packed shape without outer and		// 3. Compute the stripMinedShape: this is the packed shape without outer and
[29 lines not shown]	LLVM_DEBUG(
DBGSNL(); DBGS() << "collapsed type: " << collapsedType; DBGSNL(););		DBGSNL(); DBGS() << "collapsed type: " << collapsedType; DBGSNL(););

// 5. Collapse from the stripMinedShape to the padded result.		// 5. Collapse from the stripMinedShape to the padded result.
auto reshapeOp = rewriter.create<tensor::CollapseShapeOp>(		auto reshapeOp = rewriter.create<tensor::CollapseShapeOp>(
loc, collapsedType, transposeOp->getResult(0),		loc, collapsedType, transposeOp->getResult(0),
packingMetadata.reassociations);		packingMetadata.reassociations);

// 6. ExtractSlice		// 6. ExtractSlice
auto destTensorType = unPackOp.getDest().getType().cast<RankedTensorType>();
int64_t destRank = destTensorType.getRank();		int64_t destRank = destTensorType.getRank();
OpFoldResult zero = rewriter.getIndexAttr(0), one = rewriter.getIndexAttr(1);
auto extractSliceOp = rewriter.create<tensor::ExtractSliceOp>(		auto extractSliceOp = rewriter.create<tensor::ExtractSliceOp>(
loc, destTensorType, reshapeOp->getResult(0),		loc, destTensorType, reshapeOp->getResult(0),
SmallVector<OpFoldResult>(destRank, zero),		SmallVector<OpFoldResult>(destRank, zero),
tensor::getMixedSizes(rewriter, loc, unPackOp->getResult(0)),		tensor::getMixedSizes(rewriter, loc, unPackOp->getResult(0)),
SmallVector<OpFoldResult>(destRank, one));		SmallVector<OpFoldResult>(destRank, one));

// 7. Replace unPackOp by transposeOp.		// 7. Replace unPackOp by extractSliceOp.
rewriter.replaceOp(unPackOp, extractSliceOp->getResults());		rewriter.replaceOp(unPackOp, extractSliceOp->getResults());

return LowerUnPackOpResult{emptyOp, transposeOp, reshapeOp, extractSliceOp};		return LowerUnPackOpResult{emptyOp, transposeOp, reshapeOp, extractSliceOp};
}		}

DiagnosedSilenceableFailure transform::LowerUnPackOp::applyToOne(		DiagnosedSilenceableFailure transform::LowerUnPackOp::applyToOne(
tensor::UnPackOp target, transform::ApplyToEachResultList &transformResults,		tensor::UnPackOp target, transform::ApplyToEachResultList &transformResults,
		[inline comment, done] nicolasvasilache: same re dynamic sizes
transform::TransformState &state) {		transform::TransformState &state) {
IRRewriter rewriter(target->getContext());		IRRewriter rewriter(target->getContext());
rewriter.setInsertionPoint(target);		rewriter.setInsertionPoint(target);
FailureOr<LowerUnPackOpResult> res = lowerUnPack(rewriter, target);		FailureOr<LowerUnPackOpResult> res = lowerUnPack(rewriter, target);
if (failed(res)) {		if (failed(res)) {
return mlir::emitSilenceableFailure(target->getLoc())		return mlir::emitSilenceableFailure(target->getLoc())
<< "cannot rewrite to pad + expand + transpose";		<< "cannot rewrite to pad + expand + transpose";
}		}
[48 lines not shown]	auto matchFun = [&](Operation *op) {

// Check if all specified attributes match.		// Check if all specified attributes match.
if (getOpAttrs().has_value()) {		if (getOpAttrs().has_value()) {
DictionaryAttr opAttrs = getOpAttrs().value();		DictionaryAttr opAttrs = getOpAttrs().value();
for (NamedAttribute attr : opAttrs) {		for (NamedAttribute attr : opAttrs) {
if (attr.getName() == getInterfaceAttrName() ||		if (attr.getName() == getInterfaceAttrName() ||
attr.getName() == getOpsAttrName())		attr.getName() == getOpsAttrName())
continue;		continue;
if (!op->hasAttr(attr.getName()))		if (!op->hasAttr(attr.getName()))
		[inline comment, done] nicolasvasilache: same re early exit and dropping the trailing assert.
return;		return;
if (op->getAttr(attr.getName()) != attr.getValue())		if (op->getAttr(attr.getName()) != attr.getValue())
return;		return;
}		}
}		}

if (getFilterResultType().has_value()) {		if (getFilterResultType().has_value()) {
Type t = getFilterResultType().value();		Type t = getFilterResultType().value();
[2,204 lines not shown]

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp

[3,703 lines not shown]	LogicalResult PackOp::canonicalize(PackOp packOp, PatternRewriter &rewriter) {
if (packOp.getPaddingValue() ||		if (packOp.getPaddingValue() ||
!hasSameInnerOuterAttribute(packOp, unPackOp) ||		!hasSameInnerOuterAttribute(packOp, unPackOp) ||
!haveSameTiles(packOp, unPackOp))		!haveSameTiles(packOp, unPackOp))
return failure();		return failure();
rewriter.replaceOp(packOp, unPackOp.getSource());		rewriter.replaceOp(packOp, unPackOp.getSource());
return success();		return success();
}		}

		template <typename PackOrUnpackOp>
		static bool isLikePadUnPad(PackOrUnpackOp packOp,
		RankedTensorType packedTensorType) {
		static_assert(std::is_same<PackOrUnpackOp, tensor::PackOp>::value ||
		std::is_same<PackOrUnpackOp, tensor::UnPackOp>::value,
		"Function meant for pack/unpack");
		// This is a pad if packing only adds ones and we don't transpose dimensions.

		// Check that we are not transposing any dimensions.
		ArrayRef<int64_t> innerDimsPos = packOp.getInnerDimsPos();
		int64_t numPackedDims = innerDimsPos.size();
		[inline comment, done] nicolasvasilache: Something more idiomatic with `ArrayRef == llvm::seq` ?
		auto orderedDims = llvm::to_vector<4>(llvm::seq<int64_t>(0, numPackedDims));
		if (orderedDims != innerDimsPos) {
		// Dimensions don't happen in order.
		return false;
		}

		ArrayRef<int64_t> packedShape = packedTensorType.getShape();
		int64_t packedRank = packedTensorType.getRank();
		// At this point we know that we are taking numPackedDims outer
		// dimensions and pushing them all the way as the inner most dimensions.
		// What's left on the outer most dimensions is, in this order:
		// - the factor of the packed dimensions, then
		// - the untouched dimensions
		// This shifting inward of dimensions is a no-op (as opposed to a transpose)
		// if all the dimensions that bubble outward are ones.
		// Therefore check that all the dimensions but the numPackedDims inner most
		// ones are ones.
		return llvm::all_of(
		llvm::seq<int64_t>(0, packedRank - numPackedDims),
		nicolasvasilache (Done): Something like `return llvm::all_of(..., [](){ == 1; })` ?
		[&packedShape](int64_t i) { return packedShape[i] == 1; });
		}

		bool PackOp::isLikePad() {
		auto packedTensorType =
		(*this)->getResultTypes().front().cast<RankedTensorType>();
		return isLikePadUnPad(*this, packedTensorType);
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// UnPackOp		// UnPackOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

void UnPackOp::getAsmResultNames(		void UnPackOp::getAsmResultNames(
function_ref<void(Value, StringRef)> setNameFn) {		function_ref<void(Value, StringRef)> setNameFn) {
setNameFn(getResult(), "unpack");		setNameFn(getResult(), "unpack");
}		}
[...]	LogicalResult UnPackOp::canonicalize(UnPackOp unPackOp,
if (packOp.getPaddingValue() ||		if (packOp.getPaddingValue() ||
!hasSameInnerOuterAttribute(packOp, unPackOp) ||		!hasSameInnerOuterAttribute(packOp, unPackOp) ||
!haveSameTiles(packOp, unPackOp))		!haveSameTiles(packOp, unPackOp))
return failure();		return failure();
rewriter.replaceOp(unPackOp, packOp.getSource());		rewriter.replaceOp(unPackOp, packOp.getSource());
return success();		return success();
}		}

		bool UnPackOp::isLikeUnPad() {
		RankedTensorType packedTensorType = getSourceType();
		return isLikePadUnPad(*this, packedTensorType);
		}
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Common Canonicalizers and Folders.		// Common Canonicalizers and Folders.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Folds a tensor.cast op into a consuming DestinationStyleOpInterface op if		/// Folds a tensor.cast op into a consuming DestinationStyleOpInterface op if
/// the `tensor.cast` has source that is more static than the consuming op.		/// the `tensor.cast` has source that is more static than the consuming op.
///		///
/// Example:		/// Example:
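
The check performed by the `isLikePadUnPad` helper added in this file can be reproduced outside of MLIR. The following is a minimal standalone sketch, assuming static shapes and using plain `std::vector` instead of `ArrayRef`/`RankedTensorType`. The name `looksLikePad` and its signature are hypothetical and not part of the patch; only the two conditions (identity `inner_dims_pos`, and all dimensions except the innermost packed ones equal to 1) come from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone mirror of isLikePadUnPad (not the MLIR API).
// innerDimsPos: the pack/unpack inner_dims_pos.
// packedShape:  the static shape of the packed tensor.
bool looksLikePad(const std::vector<int64_t> &innerDimsPos,
                  const std::vector<int64_t> &packedShape) {
  const std::size_t numPackedDims = innerDimsPos.size();
  assert(numPackedDims <= packedShape.size() && "more tiles than dimensions");

  // Condition 1: no transposition, i.e. inner_dims_pos == [0, 1, ..., n-1].
  for (std::size_t i = 0; i < numPackedDims; ++i)
    if (innerDimsPos[i] != static_cast<int64_t>(i))
      return false;

  // Condition 2: every dimension other than the n innermost ones is 1.
  for (std::size_t i = 0; i + numPackedDims < packedShape.size(); ++i)
    if (packedShape[i] != 1)
      return false;
  return true;
}
```

With the shapes from the tests in this revision, `looksLikePad({0, 1, 2, 3}, {1, 1, 1, 1, 136, 64, 16, 16})` holds (`@pack_as_pad`), whereas `looksLikePad({0, 1}, {1, 1, 16, 16, 136, 64})` does not (`@pack_not_a_pad`), because the two untouched dimensions of size 16 remain among the outer dimensions.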

mlir/lib/Dialect/Utils/IndexingUtils.cpp

[...]	bool mlir::isPermutationVector(ArrayRef<int64_t> interchange) {
for (auto val : interchange) {		for (auto val : interchange) {
if (seenVals.count(val))		if (seenVals.count(val))
return false;		return false;
seenVals.insert(val);		seenVals.insert(val);
}		}
return seenVals.size() == interchange.size();		return seenVals.size() == interchange.size();
}		}

		SmallVector<int64_t>
		mlir::computePermutationVector(int64_t permSize, ArrayRef<int64_t> positions,
		ArrayRef<int64_t> desiredPositions) {
		SmallVector<int64_t> res(permSize, -1);
		DenseSet<int64_t> seen;
		for (auto [pos, desiredPos] : llvm::zip_equal(positions, desiredPositions)) {
		res[desiredPos] = pos;
		seen.insert(pos);
		}
		int64_t nextPos = 0;
		for (int64_t &entry : res) {
		if (entry != -1)
		continue;
		while (seen.contains(nextPos))
		++nextPos;
		entry = nextPos;
		++nextPos;
		}
		return res;
		}

SmallVector<int64_t> mlir::getI64SubArray(ArrayAttr arrayAttr,		SmallVector<int64_t> mlir::getI64SubArray(ArrayAttr arrayAttr,
unsigned dropFront,		unsigned dropFront,
unsigned dropBack) {		unsigned dropBack) {
assert(arrayAttr.size() > dropFront + dropBack && "Out of bounds");		assert(arrayAttr.size() > dropFront + dropBack && "Out of bounds");
auto range = arrayAttr.getAsRange<IntegerAttr>();		auto range = arrayAttr.getAsRange<IntegerAttr>();
SmallVector<int64_t> res;		SmallVector<int64_t> res;
res.reserve(arrayAttr.size() - dropFront - dropBack);		res.reserve(arrayAttr.size() - dropFront - dropBack);
for (auto it = range.begin() + dropFront, eit = range.end() - dropBack;		for (auto it = range.begin() + dropFront, eit = range.end() - dropBack;
it != eit; ++it)		it != eit; ++it)
res.push_back((*it).getValue().getSExtValue());		res.push_back((*it).getValue().getSExtValue());
return res;		return res;
}		}
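
To make the behaviour of `computePermutationVector` concrete, here is a standalone port that can be compiled without LLVM. It is only a sketch: `std::vector` and `std::set` stand in for `SmallVector` and `DenseSet`, and the explicit size assertion stands in for the check that `llvm::zip_equal` performs. The logic is otherwise copied from the function above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Standalone port of computePermutationVector using standard containers.
std::vector<int64_t>
computePermutationVector(int64_t permSize,
                         const std::vector<int64_t> &positions,
                         const std::vector<int64_t> &desiredPositions) {
  // llvm::zip_equal requires equal lengths; check the same thing here.
  assert(positions.size() == desiredPositions.size());
  std::vector<int64_t> res(permSize, -1);
  std::set<int64_t> seen;
  // Pin each requested source position at its desired slot.
  for (std::size_t i = 0; i < positions.size(); ++i) {
    res[desiredPositions[i]] = positions[i];
    seen.insert(positions[i]);
  }
  // Fill the remaining slots with the unused positions, in increasing order.
  int64_t nextPos = 0;
  for (int64_t &entry : res) {
    if (entry != -1)
      continue;
    while (seen.count(nextPos))
      ++nextPos;
    entry = nextPos++;
  }
  return res;
}
```

For example, `computePermutationVector(6, {4, 5}, {3, 1})` yields `[0, 5, 1, 4, 2, 3]`: positions 4 and 5 are pinned at slots 3 and 1, and the remaining slots receive 0, 1, 2, 3 in order. These are the same values as the permutation checked in the `@unpack` test of this revision, although the exact arguments passed by the lowering are not shown on this page.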

mlir/lib/Dialect/Utils/ReshapeOpsUtils.cpp

[...]	if ((*trivialSegments)[groupIdx] ||
reassociation.clear();		reassociation.clear();
groupIdx++;		groupIdx++;
}		}
}		}

return CollapseShapeRankReducingSliceSimplificationInfo{		return CollapseShapeRankReducingSliceSimplificationInfo{
sliceType, newReassociationIndices};		sliceType, newReassociationIndices};
}		}

		PackingMetadata mlir::computePackingMetadata(int64_t packedRank,
		ArrayRef<int64_t> innerDimPos) {
		PackingMetadata res;
		res.insertPositions.reserve(innerDimPos.size());
		// The pack insert position is the position + the number of previously
		// inserted positions + offset.
		// The offset controls whether the packing dimension is the first or last.
		//
		// Example
		// =======
		// Consider packing from a hypothetical ABCD layout to ABCDba whose
		// pack.inner_dims is [1, 0]. The first step consists in undoing the
		// permutation and producing AaBbCD. This is achieved purely by computing the
		// insert positions of `b` and `a` into `ABCD`, starting from [1, 0]. One
		// possibility, is to produce insert positions [2, 0], this would result in an
		// aAbBCD layout (i.e. offset 0). The other possibility, is to produce insert
		// positions [3, 1], this would result in an AaBbCD layout (i.e. offset 1).
		// The latter is what we expect from packing.
		int64_t offset = 1;
		for (int64_t pos : innerDimPos) {
		int64_t numInsertedBefore = llvm::count_if(
		innerDimPos, [&pos](int64_t pos2) { return pos > pos2; });
		res.insertPositions.push_back(pos + numInsertedBefore + offset);
		}

		DenseSet<int64_t> posSet(res.insertPositions.begin(),
		res.insertPositions.end());
		res.reassociations.reserve(packedRank);
		for (int64_t i = 1; i <= packedRank; ++i) {
		if (!posSet.contains(i)) {
		res.reassociations.push_back(ReassociationIndices{i - 1});
		continue;
		}
		res.reassociations.push_back(ReassociationIndices{i - 1, i});
		++i;
		}
		return res;
		}
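
The example in the comment of `computePackingMetadata` can be checked with a standalone port. This is a sketch under assumptions: the struct name `PackingMetadataSketch` and the function name `computePackingMetadataSketch` are hypothetical, the two fields mirror the ones used above (`insertPositions` and `reassociations`), and standard containers replace the LLVM ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for PackingMetadata (only the fields used above).
struct PackingMetadataSketch {
  std::vector<int64_t> insertPositions;
  std::vector<std::vector<int64_t>> reassociations;
};

PackingMetadataSketch
computePackingMetadataSketch(int64_t packedRank,
                             const std::vector<int64_t> &innerDimPos) {
  assert(packedRank >= static_cast<int64_t>(innerDimPos.size()));
  PackingMetadataSketch res;
  // offset = 1 inserts each tile dimension right after the dimension it tiles.
  const int64_t offset = 1;
  for (int64_t pos : innerDimPos) {
    int64_t numInsertedBefore =
        std::count_if(innerDimPos.begin(), innerDimPos.end(),
                      [pos](int64_t pos2) { return pos > pos2; });
    res.insertPositions.push_back(pos + numInsertedBefore + offset);
  }

  // Group each inserted position with the dimension preceding it.
  std::set<int64_t> posSet(res.insertPositions.begin(),
                           res.insertPositions.end());
  for (int64_t i = 1; i <= packedRank; ++i) {
    if (!posSet.count(i)) {
      res.reassociations.push_back(std::vector<int64_t>{i - 1});
      continue;
    }
    res.reassociations.push_back(std::vector<int64_t>{i - 1, i});
    ++i;
  }
  return res;
}
```

Calling `computePackingMetadataSketch(6, {1, 0})` produces insert positions `[3, 1]`, as stated in the comment, and reassociations `[0, 1], [2, 3], [4], [5]`, which is the grouping that appears in the `expand_shape` and `collapse_shape` CHECK lines of the test file.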

mlir/test/Dialect/Linalg/transform-lower-pack.mlir

[...]	func.func @pack(%arg0: tensor<129x47x16x16xf32>, %arg1: tensor<17x2x16x16x32x8xf32>) -> tensor<17x2x16x16x32x8xf32> {
// CHECK-SAME: permutation = [0, 2, 4, 5, 3, 1]		// CHECK-SAME: permutation = [0, 2, 4, 5, 3, 1]
%pack = tensor.pack %arg0 padding_value(%cst_0 : f32) inner_dims_pos = [1, 0] inner_tiles = [32, 8] into %arg1		%pack = tensor.pack %arg0 padding_value(%cst_0 : f32) inner_dims_pos = [1, 0] inner_tiles = [32, 8] into %arg1
: tensor<129x47x16x16xf32> -> tensor<17x2x16x16x32x8xf32>		: tensor<129x47x16x16xf32> -> tensor<17x2x16x16x32x8xf32>
return %pack : tensor<17x2x16x16x32x8xf32>		return %pack : tensor<17x2x16x16x32x8xf32>
}		}

transform.sequence failures(propagate) {		transform.sequence failures(propagate) {
^bb1(%module_op: !pdl.operation):		^bb1(%module_op: !pdl.operation):
%pack = transform.structured.match ops{["tensor.pack"]} in %module_op		%pack = transform.structured.match ops{["tensor.pack"]} in %module_op
: (!pdl.operation) -> !transform.op<"tensor.pack">		: (!pdl.operation) -> !transform.op<"tensor.pack">
transform.structured.lower_pack %pack : (!transform.op<"tensor.pack">)		transform.structured.lower_pack %pack : (!transform.op<"tensor.pack">)
-> (!transform.op<"tensor.pad">, !transform.op<"tensor.expand_shape">, !transform.op<"linalg.transpose">)		-> (!transform.op<"tensor.pad">, !transform.op<"tensor.expand_shape">, !transform.op<"linalg.transpose">)
}		}

// -----		// -----

		// CHECK-LABEL: func.func @pack_as_pad(
		func.func @pack_as_pad(%arg0: tensor<129x47x16x16xf32>, %arg1: tensor<1x1x1x1x136x64x16x16xf32>) -> tensor<1x1x1x1x136x64x16x16xf32> {
		%cst_0 = arith.constant 0.0 : f32

		// tensor.pack is lowered to tensor.pad + tensor.insert_slice
		// CHECK: %[[C0:.*]] = arith.constant 0 : index
		// CHECK: %[[PAD:.*]] = tensor.pad {{.*}} low[%[[C0]], %[[C0]], %[[C0]], %[[C0]]]
		// CHECK: : tensor<129x47x16x16xf32> to tensor<136x64x16x16xf32>
		// CHECK: %[[EMPTY:.*]] = tensor.empty() : tensor<1x1x1x1x136x64x16x16xf32>
		// CHECK: %[[RES:.*]] = tensor.insert_slice %[[PAD]] into %[[EMPTY]]
		mravishankar (Not Done): I'm wondering if we even want to have an `insert_slice` here. We could just do a `reshape`, in which case the only difference is whether there is a transpose or not. If we can teach bufferization to handle this reshape without introducing additional buffers, that would be very useful (we used to have a `memref.reshape` to just change the indexing; I wonder if we still have that).
		nicolasvasilache (Not Done): In principle we could, but I'd like to see how these things connect in practice. At the graph level, reshapes tend to simplify reasonably well. At the codegen level, as soon as we start tiling and taking slices, reshapes don't behave properly at the moment. I'd say this depends on where we apply the lowering, and we will likely want this to be configurable in the future, based on data.
		qcolombet (Author, Done): Alright, I'll land as is and we'll revisit later then.
		// offsets.
		// CHECK-SAME: [0, 0, 0, 0, 0, 0, 0, 0]
		// sizes.
		// CHECK-SAME: [1, 1, 1, 1, 136, 64, 16, 16]
		// strides multipliers.
		// CHECK-SAME: [1, 1, 1, 1, 1, 1, 1, 1]
		// CHECK-SAME: : tensor<136x64x16x16xf32> into tensor<1x1x1x1x136x64x16x16xf32>
		// CHECK: return %[[RES]]
		%pack = tensor.pack %arg0 padding_value(%cst_0 : f32) inner_dims_pos = [0, 1, 2, 3] inner_tiles = [136, 64, 16, 16] into %arg1
		: tensor<129x47x16x16xf32> -> tensor<1x1x1x1x136x64x16x16xf32>
		return %pack : tensor<1x1x1x1x136x64x16x16xf32>
		}

		transform.sequence failures(propagate) {
		^bb1(%module_op: !pdl.operation):
		%pack = transform.structured.match ops{["tensor.pack"]} in %module_op
		: (!pdl.operation) -> !transform.op<"tensor.pack">
		transform.structured.lower_pack %pack : (!transform.op<"tensor.pack">)
		-> (!transform.op<"tensor.pad">, !transform.op<"tensor.expand_shape">, !transform.op<"linalg.transpose">)
		}

		// -----

		// Check that we don't lower the following pack as a pad.
		// Although all the outer most dimensions in the resulting shape are 1s,
		// some of the original dimensions are not part of the inner_dims_pos, hence
		// some transpose needs to happen.
		// CHECK-LABEL: func.func @pack_not_a_pad(
		func.func @pack_not_a_pad(%arg0: tensor<129x47x16x16xf32>, %arg1: tensor<1x1x16x16x136x64xf32>) -> tensor<1x1x16x16x136x64xf32> {
		%cst_0 = arith.constant 0.0 : f32

		// CHECK: %[[C0:.*]] = arith.constant 0 : index
		// CHECK: tensor.pad {{.*}} low[%[[C0]], %[[C0]], %[[C0]], %[[C0]]]
		// CHECK: : tensor<129x47x16x16xf32> to tensor<136x64x16x16xf32>
		// CHECK: tensor.expand_shape %{{.*}} [{{.*}}[0, 1], [2, 3], [4], [5]]
		// CHECK-SAME: : tensor<136x64x16x16xf32> into tensor<1x136x1x64x16x16xf32>
		// CHECK: linalg.transpose
		// CHECK-SAME: ins(%{{.*}} : tensor<1x136x1x64x16x16xf32>)
		// CHECK-SAME: outs(%{{.*}} : tensor<1x1x16x16x136x64xf32>)
		// CHECK-SAME: permutation = [0, 2, 4, 5, 1, 3]

		%pack = tensor.pack %arg0 padding_value(%cst_0 : f32) inner_dims_pos = [0, 1] inner_tiles = [136, 64] into %arg1
		: tensor<129x47x16x16xf32> -> tensor<1x1x16x16x136x64xf32>
		return %pack : tensor<1x1x16x16x136x64xf32>
		}

		transform.sequence failures(propagate) {
		^bb1(%module_op: !pdl.operation):
		%pack = transform.structured.match ops{["tensor.pack"]} in %module_op
		: (!pdl.operation) -> !transform.op<"tensor.pack">
		transform.structured.lower_pack %pack : (!transform.op<"tensor.pack">)
		-> (!transform.op<"tensor.pad">, !transform.op<"tensor.expand_shape">, !transform.op<"linalg.transpose">)
		}

		// -----
// CHECK-LABEL: func.func @unpack(		// CHECK-LABEL: func.func @unpack(
func.func @unpack(%arg0: tensor<17x2x16x16x32x8xf32>, %arg1: tensor<129x47x16x16xf32>) -> tensor<129x47x16x16xf32> {		func.func @unpack(%arg0: tensor<17x2x16x16x32x8xf32>, %arg1: tensor<129x47x16x16xf32>) -> tensor<129x47x16x16xf32> {
%cst_0 = arith.constant 0.0 : f32		%cst_0 = arith.constant 0.0 : f32

// CHECK: tensor.empty() : tensor<17x8x2x32x16x16xf32>		// CHECK: tensor.empty() : tensor<17x8x2x32x16x16xf32>
// CHECK: linalg.transpose		// CHECK: linalg.transpose
// CHECK-SAME: ins(%{{.*}} : tensor<17x2x16x16x32x8xf32>)		// CHECK-SAME: ins(%{{.*}} : tensor<17x2x16x16x32x8xf32>)
// CHECK-SAME: outs(%{{.*}} : tensor<17x8x2x32x16x16xf32>)		// CHECK-SAME: outs(%{{.*}} : tensor<17x8x2x32x16x16xf32>)
// CHECK-SAME: permutation = [0, 5, 1, 4, 2, 3]		// CHECK-SAME: permutation = [0, 5, 1, 4, 2, 3]
// CHECK: tensor.collapse_shape {{.*}}[0, 1], [2, 3], [4], [5]]		// CHECK: tensor.collapse_shape {{.*}}[0, 1], [2, 3], [4], [5]]
// CHECK-SAME: : tensor<17x8x2x32x16x16xf32> into tensor<136x64x16x16xf32>		// CHECK-SAME: : tensor<17x8x2x32x16x16xf32> into tensor<136x64x16x16xf32>
// CHECK: tensor.extract_slice %{{.*}}[0, 0, 0, 0] [129, 47, 16, 16] [1, 1, 1, 1]		// CHECK: tensor.extract_slice %{{.*}}[0, 0, 0, 0] [129, 47, 16, 16] [1, 1, 1, 1]
// CHECK-SAME: : tensor<136x64x16x16xf32> to tensor<129x47x16x16xf32>		// CHECK-SAME: : tensor<136x64x16x16xf32> to tensor<129x47x16x16xf32>
%pack = tensor.unpack %arg0 inner_dims_pos = [1, 0] inner_tiles = [32, 8] into %arg1		%pack = tensor.unpack %arg0 inner_dims_pos = [1, 0] inner_tiles = [32, 8] into %arg1
: tensor<17x2x16x16x32x8xf32> -> tensor<129x47x16x16xf32>		: tensor<17x2x16x16x32x8xf32> -> tensor<129x47x16x16xf32>
return %pack : tensor<129x47x16x16xf32>		return %pack : tensor<129x47x16x16xf32>
}		}

transform.sequence failures(propagate) {		transform.sequence failures(propagate) {
^bb1(%module_op: !pdl.operation):		^bb1(%module_op: !pdl.operation):
%unpack = transform.structured.match ops{["tensor.unpack"]} in %module_op		%unpack = transform.structured.match ops{["tensor.unpack"]} in %module_op
: (!pdl.operation) -> !transform.op<"tensor.unpack">		: (!pdl.operation) -> !transform.op<"tensor.unpack">
transform.structured.lower_unpack %unpack : (!transform.op<"tensor.unpack">)		transform.structured.lower_unpack %unpack : (!transform.op<"tensor.unpack">)
-> (!transform.op<"tensor.empty">,		-> (!transform.op<"tensor.empty">,
!transform.op<"linalg.transpose">,		!transform.op<"linalg.transpose">,
!transform.op<"tensor.collapse_shape">,		!transform.op<"tensor.collapse_shape">,
!transform.op<"tensor.extract_slice">)		!transform.op<"tensor.extract_slice">)
}		}

		// -----
		// When an unpack is a plain 'unpad', lower it to a simple extract_slice.
		// CHECK-LABEL: func.func @unpack_as_pad(
		func.func @unpack_as_pad(%arg0: tensor<1x1x1x1x136x64x16x16xf32>, %arg1: tensor<129x47x16x16xf32>) -> tensor<129x47x16x16xf32> {
		%cst_0 = arith.constant 0.0 : f32

		// CHECK-SAME: %[[ARG0:[^:]*]]: tensor<1x1x1x1x136x64x16x16xf32>
		// CHECK: %[[RES:.*]] = tensor.extract_slice %[[ARG0]]
		// offsets.
		// CHECK-SAME: [0, 0, 0, 0, 0, 0, 0, 0]
		// sizes.
		// CHECK-SAME: [1, 1, 1, 1, 129, 47, 16, 16]
		// strides multipliers.
		// CHECK-SAME: [1, 1, 1, 1, 1, 1, 1, 1]
		// CHECK-SAME: : tensor<1x1x1x1x136x64x16x16xf32> to tensor<129x47x16x16xf32>
		%pack = tensor.unpack %arg0 inner_dims_pos = [0, 1, 2, 3] inner_tiles = [136, 64, 16, 16] into %arg1
		: tensor<1x1x1x1x136x64x16x16xf32> -> tensor<129x47x16x16xf32>
		return %pack : tensor<129x47x16x16xf32>
		}

		transform.sequence failures(propagate) {
		^bb1(%module_op: !pdl.operation):
		%unpack = transform.structured.match ops{["tensor.unpack"]} in %module_op
		: (!pdl.operation) -> !transform.op<"tensor.unpack">
		transform.structured.lower_unpack %unpack : (!transform.op<"tensor.unpack">)
		-> (!transform.op<"tensor.empty">,
		!transform.op<"linalg.transpose">,
		!transform.op<"tensor.collapse_shape">,
		!transform.op<"tensor.extract_slice">)
		}
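
The tensor shapes used in these tests follow from the `inner_dims_pos` and `inner_tiles` attributes. As a hedged illustration, the sketch below computes the packed shape for a static source shape, assuming no `outer_dims_perm` is given (none of the tests here use one). The helper name `packedShape` is hypothetical: each tiled dimension becomes the ceiling of its size divided by the tile size, and the tile sizes are appended as the innermost dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: static result shape of a tensor.pack with the given
// inner_dims_pos and inner_tiles, assuming no outer_dims_perm.
std::vector<int64_t> packedShape(std::vector<int64_t> srcShape,
                                 const std::vector<int64_t> &innerDimsPos,
                                 const std::vector<int64_t> &innerTiles) {
  assert(innerDimsPos.size() == innerTiles.size());
  // Each tiled dimension shrinks to ceil(size / tile); padding covers the rest.
  for (std::size_t i = 0; i < innerDimsPos.size(); ++i) {
    int64_t &dim = srcShape[innerDimsPos[i]];
    dim = (dim + innerTiles[i] - 1) / innerTiles[i];
  }
  // The tiles themselves become the innermost dimensions, in attribute order.
  srcShape.insert(srcShape.end(), innerTiles.begin(), innerTiles.end());
  return srcShape;
}
```

For `@pack`, `packedShape({129, 47, 16, 16}, {1, 0}, {32, 8})` gives `17x2x16x16x32x8`. For `@pack_as_pad`, the tiles `{136, 64, 16, 16}` each cover a whole dimension, so all outer dimensions collapse to 1 and the result is `1x1x1x1x136x64x16x16`, which is why that pack reduces to a pad followed by an `insert_slice`.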


Diff 513159
