This is an archive of the discontinued LLVM Phabricator instance.

Differential D80772

[mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.tranpose'
ClosedPublic

Authored by aartbik on May 28 2020, 6:12 PM.

Details

Reviewers

nicolasvasilache
reidtatge
andydavis1
ftynse

Commits

rG6391da98f43a: [mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.tranpose'

Summary

Progressive lowering of vector.transpose into an operation that
is closer to an intrinsic, and thus to the hardware ISA. Currently
enabled under the common vector transform testing flag, as we prepare
to deploy this transformation in the LLVM lowering pipeline.
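
As a rough illustration of what the flattened form computes, the standalone C++ sketch below transposes a matrix held in a flat 1-D buffer, which is the data movement sitting between the two shape casts that the patch inserts around `vector.flat_transpose`. It does not use MLIR. The helper name `flatTranspose` and the row-major storage order are assumptions made for illustration only; this page does not state the layout convention of the op or of the underlying intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MLIR API): transposes a rows x columns matrix
// held in a flat buffer. Row-major storage is assumed for illustration.
std::vector<float> flatTranspose(const std::vector<float> &flat,
                                 std::size_t rows, std::size_t columns) {
  assert(flat.size() == rows * columns && "buffer must hold rows * columns");
  std::vector<float> result(flat.size());
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < columns; ++c)
      // Element (r, c) of the input becomes element (c, r) of the output,
      // which has `columns` rows and `rows` columns.
      result[c * rows + r] = flat[r * columns + c];
  return result;
}
```

Applying the helper twice with the dimensions swapped returns the original buffer, which is a quick sanity check for any flat transpose.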

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aartbik created this revision. · May 28 2020, 6:12 PM

Herald added a reviewer: nicolasvasilache. · May 28 2020, 6:12 PM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, jurahul, Kayjukh and 14 others.

aartbik marked an inline comment as done. · May 28 2020, 6:14 PM

aartbik added inline comments.

mlir/test/Dialect/Vector/vector-flat-transforms.mlir
Line 6: sending this out for some early discussion; this hits a phase ordering issue. I propose to move the shape cast lowering a bit later, so we can fold shape casts first and only lower them when they cannot be eliminated.

Harbormaster failed remote builds in B58362: Diff 267103! · May 28 2020, 6:43 PM

nicolasvasilache added inline comments. · May 29 2020, 2:56 AM

mlir/include/mlir/Dialect/Vector/VectorOps.h
Line 67: Can we use a `struct VectorTransposeLowering` to keep it consistent with the one above (even if there are really only 2 options for now)?

mlir/test/Dialect/Vector/vector-flat-transforms.mlir
Line 6: I think this should be resolved in a separate revision and this is fine for now. Note that this is more an order of visitation problem. Given:

  %a = shape_cast %0
  %b = shape_cast %a

where the shape casts can fold, if %a is visited before %b then it will be expanded. I have seen this type of behavior a bunch of times in different places (albeit not involving folding + canonicalization + lowering IIRC).

Seems like ShapeCast should have a canonicalizer / canonicalization pattern (`hasCanonicalizer=1`) with a separate match and rewrite. Then ShapeCastLowering could query that on all its uses and fail to lower if any foldable use is left. In other words, this type of pattern ordering can be resolved by finer-grained case disjunction.

However, this seems like it can still "miss folding at a distance": consider a chain of transposes that are lowered in some arbitrary order, introducing reshapes. Folding opportunities would only appear if consecutive transposes are lowered before any newly introduced shape_cast is visited. It seems like the worklist-based algorithm would handle this ordering naturally, but I imagine we can construct more intricate cases where this would not be true?

Pinging @rriddle to see if there are more idiomatic ways of doing this, if this should be integrated in the rewriter itself (i.e. delay pattern application if any operand has folding opportunities), or something else.

made option an enum

aartbik added reviewers: reidtatge, andydavis1, ftynse. · May 29 2020, 4:43 PM

aartbik added inline comments.

mlir/include/mlir/Dialect/Vector/VectorOps.h
Line 67: made this an enum

mlir/test/Dialect/Vector/vector-flat-transforms.mlir
Line 6: yes, sounds good fixing this later somehow; the TODO reminds us to follow up :-)

Harbormaster failed remote builds in B58517: Diff 267411! · May 29 2020, 5:30 PM

ftynse accepted this revision. · Jun 2 2020, 7:51 AM

ftynse added inline comments.

mlir/test/Dialect/Vector/vector-flat-transforms.mlir
Line 6: Canonicalizer doesn't run in any pattern rewriting (it can in Nicolas's multi-level driver if you configure it that way); folding is run in both the greedy rewriter and the dialect converter. However, folding does not recurse upwards on operands of the given op, which seems to be what you need here. In dialect conversion, there's a TODO comment related to potentially folding any operation it visits even if it is considered legal. Maybe that will help.
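
The visitation-order problem discussed in this thread can be reproduced in a small standalone model. The C++ sketch below is not MLIR code: `Op`, `foldShapeCast`, `isLive` and `countLowered` are hypothetical names, types are plain strings, and "lowering" just marks an op as expanded. It only assumes what the thread describes, namely that a fold inspects the direct producer of a cast and that a cast which cannot fold when visited gets lowered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a single-operand op; not an MLIR data structure.
enum class Kind { Argument, ShapeCast, Lowered };

struct Op {
  Kind kind;
  int source;        // index of the operand's producer, -1 for arguments
  std::string type;  // result type, e.g. "16xf32"
};

// Restates the two fold rules from the patch: a no-op cast, or a cast that
// cancels the cast feeding it. Returns the index of the replacement value,
// or -1 if nothing folds. Only the direct producer is inspected.
int foldShapeCast(const std::vector<Op> &ops, int i) {
  const Op &op = ops[i];
  const Op &src = ops[op.source];
  if (src.type == op.type)
    return op.source;
  if (src.kind == Kind::ShapeCast && ops[src.source].type == op.type)
    return src.source;
  return -1;
}

// An op is live if the returned value depends on it.
bool isLive(const std::vector<Op> &ops, int returned, int i) {
  for (int v = returned; v != -1; v = ops[v].source)
    if (v == i)
      return true;
  return false;
}

// Visits casts in the given order: fold when possible, otherwise "lower".
// Returns how many casts ended up lowered.
int countLowered(std::vector<Op> ops, int returned,
                 const std::vector<int> &order) {
  int lowered = 0;
  for (int i : order) {
    if (ops[i].kind != Kind::ShapeCast || !isLive(ops, returned, i))
      continue;
    int replacement = foldShapeCast(ops, i);
    if (replacement == -1) {
      ops[i].kind = Kind::Lowered;  // expanded; no longer foldable by users
      ++lowered;
      continue;
    }
    for (Op &user : ops)
      if (user.source == i)
        user.source = replacement;
    if (returned == i)
      returned = replacement;
  }
  return lowered;
}
```

With the chain `%a = shape_cast %0` and `%b = shape_cast %a`, visiting `%b` first lets both casts disappear in this model, while visiting `%a` first lowers both, which is the behavior the test file's TODO records.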

This revision is now accepted and ready to land. · Jun 2 2020, 7:51 AM

Herald added a project: Restricted Project. · Jun 2 2020, 7:51 AM

rebased

nicolasvasilache accepted this revision. · Jun 3 2020, 2:33 PM

Closed by commit rG6391da98f43a: [mlir] [VectorOps] Use 'vector.flat_transpose' for 2-D 'vector.tranpose' (authored by aartbik). · Jun 3 2020, 2:56 PM

This revision was automatically updated to reflect the committed changes.

Harbormaster failed remote builds in B58987: Diff 268301! · Jun 3 2020, 3:29 PM

Revision Contents

Path                                                       Size
mlir/include/mlir/Dialect/Vector/VectorOps.h               10 lines
mlir/include/mlir/Dialect/Vector/VectorOps.td              1 line
mlir/lib/Dialect/Vector/VectorOps.cpp                      13 lines
mlir/lib/Dialect/Vector/VectorTransforms.cpp               30 lines
mlir/test/Dialect/Vector/vector-contract-transforms.mlir   20 lines
mlir/test/Dialect/Vector/vector-flat-transforms.mlir       62 lines
mlir/test/lib/Transforms/TestVectorTransforms.cpp          18 lines

Diff 268312

mlir/include/mlir/Dialect/Vector/VectorOps.h

@@ 47 lines not shown @@
 enum class VectorContractLowering {
   /// Progressively lower to finer grained `vector.contract` and `vector.fma`.
   FMA = 0,
   /// Lower to `vector.matrix_multiply`, maps 1-1 to LLVM matrix intrinsics.
   Matmul = 1,
   /// Lower to `vector.outerproduct`.
   OuterProduct = 2,
 };
+/// Enum to control the lowering of `vector.transpose` operations.
+enum class VectorTransposeLowering {
+  // Lower transpose into element-wise extract and inserts.
+  EltWise = 0,
+  /// Lower 2-D transpose to `vector.flat_transpose`, maps 1-1 to LLVM matrix
+  /// intrinsics.
+  Flat = 1,
+};
 /// Structure to control the behavior of vector transform patterns.
 struct VectorTransformsOptions {
   VectorContractLowering vectorContractLowering = VectorContractLowering::FMA;
+  VectorTransposeLowering vectorTransposeLowering =
+      VectorTransposeLowering::EltWise;
   VectorTransformsOptions &
   setVectorTransformsOptions(VectorContractLowering opt) {
     vectorContractLowering = opt;
     return *this;
   }
 };

 /// Collect a set of transformation patterns that are related to contracting
@@ 37 lines not shown @@

mlir/include/mlir/Dialect/Vector/VectorOps.td

@@ 1,200 lines not shown @@ let extraClassDeclaration = [{
     VectorType getSourceVectorType() {
       return source().getType().cast<VectorType>();
     }
     VectorType getResultVectorType() {
       return getResult().getType().cast<VectorType>();
     }
   }];
   let assemblyFormat = "$source attr-dict `:` type($source) `to` type($result)";
+  let hasFolder = 1;
 }

 def Vector_TypeCastOp :
   Vector_Op<"type_cast", [NoSideEffect]>,
   Arguments<(ins StaticShapeMemRefOf<[AnyType]>:$memref)>,
   Results<(outs AnyMemRef:$result)> {
   let summary = "type_cast op converts a scalar memref to a vector memref";
   let description = [{
@@ 383 lines not shown @@

mlir/lib/Dialect/Vector/VectorOps.cpp

@@ 1,661 lines not shown @@ for (unsigned i = 0, e = sourceTupleType.size(); i < e; ++i)
     if (failed(verifyVectorShapeCast(
             op, sourceTupleType.getType(i).cast<VectorType>(),
             resultTupleType.getType(i).cast<VectorType>())))
       return failure();

   return success();
 }

+OpFoldResult ShapeCastOp::fold(ArrayRef<Attribute> operands) {
+  // Nop shape cast.
+  if (source().getType() == result().getType())
+    return source();
+
+  // Canceling shape casts.
+  if (auto otherOp = source().getDefiningOp<ShapeCastOp>())
+    if (result().getType() == otherOp.source().getType())
+      return otherOp.source();
+
+  return {};
+}
+
 //===----------------------------------------------------------------------===//
 // TypeCastOp
 //===----------------------------------------------------------------------===//

 static SmallVector<int64_t, 8> extractShape(MemRefType memRefType) {
   auto vectorType = memRefType.getElementType().dyn_cast<VectorType>();
   SmallVector<int64_t, 8> res(memRefType.getShape().begin(),
                               memRefType.getShape().end());
@@ 329 lines not shown @@

mlir/lib/Dialect/Vector/VectorTransforms.cpp

@@ 1,180 lines not shown @@
 /// %0 = vector.extract %y[0, 0]
 /// %1 = vector.insert %0, %z [0, 0]
 /// ..
 /// %x = vector.insert .., .. [.., ..]
 class TransposeOpLowering : public OpRewritePattern<vector::TransposeOp> {
 public:
   using OpRewritePattern<vector::TransposeOp>::OpRewritePattern;

+  TransposeOpLowering(vector::VectorTransformsOptions vectorTransformsOptions,
+                      MLIRContext *context)
+      : OpRewritePattern<vector::TransposeOp>(context),
+        vectorTransformsOptions(vectorTransformsOptions) {}
+
   LogicalResult matchAndRewrite(vector::TransposeOp op,
                                 PatternRewriter &rewriter) const override {
     auto loc = op.getLoc();

     VectorType resType = op.getResultType();

     // Set up convenience transposition table.
     SmallVector<int64_t, 4> transp;
     for (auto attr : op.transp())
       transp.push_back(attr.cast<IntegerAttr>().getInt());

+    // Handle a true 2-D matrix transpose differently when requested.
+    if (vectorTransformsOptions.vectorTransposeLowering ==
+            vector::VectorTransposeLowering::Flat &&
+        resType.getRank() == 2 && transp[0] == 1 && transp[1] == 0) {
+      Type flattenedType =
+          VectorType::get(resType.getNumElements(), resType.getElementType());
+      auto matrix =
+          rewriter.create<vector::ShapeCastOp>(loc, flattenedType, op.vector());
+      auto rows = rewriter.getI32IntegerAttr(resType.getShape()[0]);
+      auto columns = rewriter.getI32IntegerAttr(resType.getShape()[1]);
+      Value trans = rewriter.create<vector::FlatTransposeOp>(
+          loc, flattenedType, matrix, rows, columns);
+      rewriter.replaceOpWithNewOp<vector::ShapeCastOp>(op, resType, trans);
+      return success();
+    }
+
     // Generate fully unrolled extract/insert ops.
     Value result = rewriter.create<ConstantOp>(loc, resType,
                                                rewriter.getZeroAttr(resType));
     SmallVector<int64_t, 4> lhs(transp.size(), 0);
     SmallVector<int64_t, 4> rhs(transp.size(), 0);
     rewriter.replaceOp(op, expandIndices(loc, resType, 0, transp, lhs, rhs,
                                          op.vector(), result, rewriter));
     return success();
@@ 17 lines not shown @@ Value expandIndices(Location loc, VectorType resType, int64_t pos,
     for (int64_t d = 0, e = resType.getDimSize(pos); d < e; ++d) {
       lhs[pos] = d;
       rhs[transp[pos]] = d;
       result = expandIndices(loc, resType, pos + 1, transp, lhs, rhs, input,
                              result, rewriter);
     }
     return result;
   }

+  /// Options to control the vector patterns.
+  vector::VectorTransformsOptions vectorTransformsOptions;
 };

 /// Progressive lowering of OuterProductOp.
 /// One:
 /// %x = vector.outerproduct %lhs, %rhs, %acc
 /// is replaced by:
 /// %z = zero-result
 /// %0 = vector.extract %lhs[0]
@@ 583 lines not shown @@ void mlir::vector::populateVectorContractLoweringPatterns(
     OwningRewritePatternList &patterns, MLIRContext *context,
     VectorTransformsOptions parameters) {
   // clang-format off
   patterns.insert<BroadcastOpLowering,
                   CreateMaskOpLowering,
                   ConstantMaskOpLowering,
                   OuterProductOpLowering,
                   ShapeCastOp2DDownCastRewritePattern,
-                  ShapeCastOp2DUpCastRewritePattern,
-                  TransposeOpLowering>(context);
-  patterns.insert<ContractionOpLowering,
+                  ShapeCastOp2DUpCastRewritePattern>(context);
+  patterns.insert<TransposeOpLowering,
+                  ContractionOpLowering,
                   ContractionOpToMatmulOpLowering,
                   ContractionOpToOuterProductOpLowering>(parameters, context);
   // clang-format on
 }
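
The guard that selects the flat path in `TransposeOpLowering` can be restated outside MLIR as a small predicate. The C++ sketch below copies the two enums and a simplified options struct from the patch (setter omitted) and adds a hypothetical helper, `takesFlatPath`, that is not part of the patch. It assumes the permutation's length equals the result rank, so checking the length stands in for `resType.getRank() == 2`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class VectorContractLowering { FMA = 0, Matmul = 1, OuterProduct = 2 };
enum class VectorTransposeLowering { EltWise = 0, Flat = 1 };

// Simplified copy of the options struct from the patch (setter omitted).
struct VectorTransformsOptions {
  VectorContractLowering vectorContractLowering = VectorContractLowering::FMA;
  VectorTransposeLowering vectorTransposeLowering =
      VectorTransposeLowering::EltWise;
};

// Hypothetical predicate, not part of the patch: true when the options
// request the flat lowering and the permutation is the 2-D swap [1, 0].
bool takesFlatPath(const VectorTransformsOptions &options,
                   const std::vector<std::int64_t> &transp) {
  return options.vectorTransposeLowering == VectorTransposeLowering::Flat &&
         transp.size() == 2 && transp[0] == 1 && transp[1] == 0;
}
```

A caller would build the options the way the test pass does, for example `VectorTransformsOptions options{VectorContractLowering::FMA, VectorTransposeLowering::Flat};`. With default options, or with any permutation other than `[1, 0]`, the predicate is false and the element-wise extract/insert expansion applies.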

mlir/test/Dialect/Vector/vector-contract-transforms.mlir

@@ 313 lines not shown @@
 // CHECK: %[[T11:.*]] = vector.insert %[[T10]], %[[T9]] [2, 1] : f32 into vector<3x2xf32>
 // CHECK: return %[[T11]] : vector<3x2xf32>

 func @transpose23(%arg0: vector<2x3xf32>) -> vector<3x2xf32> {
   %0 = vector.transpose %arg0, [1, 0] : vector<2x3xf32> to vector<3x2xf32>
   return %0 : vector<3x2xf32>
 }

+
+// CHECK-LABEL: func @nop_shape_cast
+// CHECK-SAME: %[[A:.*]]: vector<16xf32>
+// CHECK: return %[[A]] : vector<16xf32>
+
+func @nop_shape_cast(%arg0: vector<16xf32>) -> vector<16xf32> {
+  %0 = vector.shape_cast %arg0 : vector<16xf32> to vector<16xf32>
+  return %0 : vector<16xf32>
+}
+
+// CHECK-LABEL: func @cancel_shape_cast
+// CHECK-SAME: %[[A:.*]]: vector<16xf32>
+// CHECK: return %[[A]] : vector<16xf32>
+
+func @cancel_shape_cast(%arg0: vector<16xf32>) -> vector<16xf32> {
+  %0 = vector.shape_cast %arg0 : vector<16xf32> to vector<4x4xf32>
+  %1 = vector.shape_cast %0 : vector<4x4xf32> to vector<16xf32>
+  return %1 : vector<16xf32>
+}
+
 // Shape up and downcasts for 2-D vectors, for supporting conversion to
 // llvm.matrix operations
 // CHECK-LABEL: func @shape_casts
 func @shape_casts(%a: vector<2x2xf32>) -> (vector<4xf32>, vector<2x2xf32>) {
   // CHECK: %[[cst:.*]] = constant dense<0.000000e+00> : vector<4xf32>
   // CHECK: %[[cst22:.*]] = constant dense<0.000000e+00> : vector<2x2xf32>
   // CHECK: %[[ex0:.*]] = vector.extract %{{.*}}[0] : vector<2x2xf32>
   //
@@ 570 lines not shown @@

mlir/test/Dialect/Vector/vector-flat-transforms.mlir

This file was added.

+// RUN: mlir-opt %s -test-vector-contraction-conversion=vector-flat-transpose=1 | FileCheck %s --dump-input-on-failure
+
+// Tests for lowering 2-D vector.transpose into vector.flat_transpose.
+//
+// TODO(ajcbik,ntv): having ShapeCastOp2DDownCastRewritePattern and
+//                   ShapeCastOp2DUpCastRewritePattern too early in
+//                   the greedy rewriting patterns misses opportunities
+//                   to fold shape casts!
+
+// No shape cast folding expected.
+//
+// CHECK-LABEL: func @transpose44_44(
+// CHECK-SAME: %[[A:.*]]: vector<4x4xf32>
+// CHECK: %[[T0:.*]] = vector.extract %[[A]][0] : vector<4x4xf32>
+// CHECK: %[[T8:.*]] = vector.flat_transpose %{{.*}} {columns = 4 : i32, rows = 4 : i32} : vector<16xf32> -> vector<16xf32>
+// CHECK: %[[T9:.*]] = vector.extract_strided_slice %[[T8]] {offsets = [0], sizes = [4], strides = [1]} : vector<16xf32> to vector<4xf32>
+//
+func @transpose44_44(%arg0: vector<4x4xf32>) -> vector<4x4xf32> {
+  %0 = vector.transpose %arg0, [1, 0] : vector<4x4xf32> to vector<4x4xf32>
+  return %0 : vector<4x4xf32>
+}
+
+// Folds preceding shape cast as expected,
+// no following shape cast folding expected.
+//
+// CHECK-LABEL: func @transpose16_44(
+// CHECK-SAME: %[[A:.*]]: vector<16xf32>
+// CHECK: %[[T0:.*]] = vector.flat_transpose %[[A]] {columns = 4 : i32, rows = 4 : i32} : vector<16xf32> -> vector<16xf32>
+// CHECK: %[[T1:.*]] = vector.extract_strided_slice %[[T0]] {offsets = [0], sizes = [4], strides = [1]} : vector<16xf32> to vector<4xf32>
+//
+func @transpose16_44(%arg0: vector<16xf32>) -> vector<4x4xf32> {
+  %0 = vector.shape_cast %arg0 : vector<16xf32> to vector<4x4xf32>
+  %1 = vector.transpose %0, [1, 0] : vector<4x4xf32> to vector<4x4xf32>
+  return %1 : vector<4x4xf32>
+}
+
+// No preceding shape cast folding expected,
+// but FAILS to fold following cast.
+//
+// CHECK-LABEL: func @transpose44_16(
+// CHECK-SAME: %[[A:.*]]: vector<4x4xf32>
+// CHECK: %[[T0:.*]] = vector.extract %[[A]][0] : vector<4x4xf32>
+// CHECK: %[[T8:.*]] = vector.flat_transpose %{{.*}} {columns = 4 : i32, rows = 4 : i32} : vector<16xf32> -> vector<16xf32>
+func @transpose44_16(%arg0: vector<4x4xf32>) -> vector<16xf32> {
+  %0 = vector.transpose %arg0, [1, 0] : vector<4x4xf32> to vector<4x4xf32>
+  %1 = vector.shape_cast %0 : vector<4x4xf32> to vector<16xf32>
+  return %1 : vector<16xf32>
+}
+
+// Folds preceding shape cast as expected,
+// but FAILS to fold following cast.
+//
+// CHECK-LABEL: func @transpose16_16(
+// CHECK-SAME: %[[A:.*]]: vector<16xf32>
+// CHECK: %[[T0:.*]] = vector.flat_transpose %[[A]] {columns = 4 : i32, rows = 4 : i32} : vector<16xf32> -> vector<16xf32>
+//
+func @transpose16_16(%arg0: vector<16xf32>) -> vector<16xf32> {
+  %0 = vector.shape_cast %arg0 : vector<16xf32> to vector<4x4xf32>
+  %1 = vector.transpose %0, [1, 0] : vector<4x4xf32> to vector<4x4xf32>
+  %2 = vector.shape_cast %1 : vector<4x4xf32> to vector<16xf32>
+  return %2 : vector<16xf32>
+}

mlir/test/lib/Transforms/TestVectorTransforms.cpp

@@ 41 lines not shown @@
 };

 struct TestVectorContractionConversion
     : public PassWrapper<TestVectorContractionConversion, FunctionPass> {
   TestVectorContractionConversion() = default;
   TestVectorContractionConversion(const TestVectorContractionConversion &pass) {
   }

-  Option<bool> lowerToLLVMMatrixIntrinsics{
+  Option<bool> lowerToFlatMatrix{
       *this, "vector-lower-matrix-intrinsics",
       llvm::cl::desc("Lower vector.contract to llvm.intr.matrix.multiply"),
       llvm::cl::init(false)};
+  Option<bool> lowerToFlatTranspose{
+      *this, "vector-flat-transpose",
+      llvm::cl::desc("Lower 2-D vector.transpose to vector.flat_transpose"),
+      llvm::cl::init(false)};
   Option<bool> lowerToOuterProduct{
       *this, "vector-outerproduct",
       llvm::cl::desc("Lower vector.contract to vector.outerproduct"),
       llvm::cl::init(false)};

   void runOnFunction() override {
     OwningRewritePatternList patterns;
     if (lowerToOuterProduct) {
       VectorContractLowering lowering = VectorContractLowering::OuterProduct;
       VectorTransformsOptions options{lowering};
       patterns.insert<ContractionOpToOuterProductOpLowering>(options,
                                                              &getContext());
       applyPatternsAndFoldGreedily(getFunction(), patterns);
       return;
     }

-    VectorContractLowering lowering = VectorContractLowering::FMA;
-    if (lowerToLLVMMatrixIntrinsics)
-      lowering = VectorContractLowering::Matmul;
-    VectorTransformsOptions options{lowering};
+    VectorContractLowering contractLowering = VectorContractLowering::FMA;
+    if (lowerToFlatMatrix)
+      contractLowering = VectorContractLowering::Matmul;
+    VectorTransposeLowering transposeLowering =
+        VectorTransposeLowering::EltWise;
+    if (lowerToFlatTranspose)
+      transposeLowering = VectorTransposeLowering::Flat;
+    VectorTransformsOptions options{contractLowering, transposeLowering};
     populateVectorContractLoweringPatterns(patterns, &getContext(), options);
     applyPatternsAndFoldGreedily(getFunction(), patterns);
   }
 };

 } // end anonymous namespace

 namespace mlir {
@@ 14 lines not shown @@
