This is an archive of the discontinued LLVM Phabricator instance.

Differential D149065

[mlir][vector] Add pattern to break down vector.bitcast
ClosedPublic

Authored by qedawkins on Apr 24 2023, 7:48 AM.

Details

Reviewers

aartbik
nicolasvasilache
dcaballe
antiagainst
kuhar

Commits

rG650f04feda90: [mlir][vector] Add pattern to break down vector.bitcast

Summary

The pattern added here is intended as a last resort for targets like
SPIR-V where there are vector size restrictions and we need to be able
to break down large vector types. Vectorizing loads/stores for small
bitwidths (e.g. i8) relies on bitcasting to a larger element type and
patterns to bubble bitcast ops to where they can cancel.
This fails for cases such as

%1 = arith.trunci %0 : vector<2x32xi32> to vector<2x32xi8>
vector.transfer_write %1, %destination[%c0, %c0] {in_bounds = [true, true]} : vector<2x32xi8>, memref<2x32xi8>

where the arith.trunci op essentially does the job of one of the
bitcasts, leading to a bitcast that needs to be further broken down:

vector.bitcast %0 : vector<16xi8> to vector<4xi32>
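The index arithmetic behind this break-down can be sketched outside of MLIR. The following is a minimal standalone C++ model, not the patch's code: `SlicePlan` and `planBreakDown` are hypothetical names, and the divisibility guard on the destination length is an assumption of the sketch rather than something the patch checks. For the vector<16xi8> to vector<4xi32> case it yields four slices of four i8 elements, each bitcast to a single i32.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One step of the break-down: which source elements are extracted and where
// the narrower bitcast result is inserted in the destination vector.
struct SlicePlan {
  int64_t extractOffset; // offset into the source vector
  int64_t extractSize;   // number of source elements extracted
  int64_t insertOffset;  // offset into the destination vector
  int64_t insertSize;    // number of destination elements produced
};

// Hypothetical helper (not part of MLIR): models the offsets used for a 1-D
// bitcast from `srcLen` elements to `dstLen` elements. Returns an empty plan
// when the break-down would not apply.
std::vector<SlicePlan> planBreakDown(int64_t srcLen, int64_t dstLen) {
  std::vector<SlicePlan> plan;
  // Only casts to fewer (or equally many) elements are modeled.
  if (dstLen <= 0 || srcLen < dstLen || srcLen % dstLen != 0)
    return plan;
  int64_t shrinkRatio = srcLen / dstLen;
  // Already bitcasting to a single destination element: nothing to do.
  if (srcLen == shrinkRatio)
    return plan;
  // Assumption of this sketch: each slice must map to a whole number of
  // destination elements.
  if (dstLen % shrinkRatio != 0)
    return plan;
  for (int64_t i = 0; i < shrinkRatio; ++i)
    plan.push_back({i * dstLen, dstLen, i * dstLen / shrinkRatio,
                    dstLen / shrinkRatio});
  return plan;
}
```

Calling `planBreakDown(16, 4)` gives extract offsets 0, 4, 8, 12 and insert offsets 0, 1, 2, 3, which are the offsets checked in vector-break-down-bitcast.mlir.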

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

qedawkins created this revision. Apr 24 2023, 7:48 AM

Herald added a reviewer: aartbik. Apr 24 2023, 7:48 AM

Herald added a project: Restricted Project.

Herald added subscribers: bviyer, Moerafaat, zero9178 and 24 others.

qedawkins requested review of this revision. Apr 24 2023, 7:48 AM

Herald added a reviewer: nicolasvasilache. Apr 24 2023, 7:48 AM

Herald added a reviewer: dcaballe.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

qedawkins added reviewers: antiagainst, kuhar. Apr 24 2023, 7:49 AM

LGTM. You may want to wait for a review from Lei before submitting.

mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
856	nit: `Location loc` and `Type elemType`
870	nit: I'd think a regular for loop would work fine here

kuhar accepted this revision. Apr 24 2023, 8:13 AM

This revision is now accepted and ready to land. Apr 24 2023, 8:13 AM

Harbormaster completed remote builds in B227730: Diff 516410. Apr 24 2023, 8:27 AM

Address comments

Harbormaster completed remote builds in B227794: Diff 516500. Apr 24 2023, 12:47 PM

LGTM! Thanks for adding this!

rebase

Harbormaster completed remote builds in B228137: Diff 516963. Apr 25 2023, 4:19 PM

Closed by commit rG650f04feda90: [mlir][vector] Add pattern to break down vector.bitcast (authored by qedawkins). Apr 25 2023, 5:22 PM

This revision was automatically updated to reflect the committed changes.

qedawkins added a commit: rG650f04feda90: [mlir][vector] Add pattern to break down vector.bitcast.

LGTM! Wondering if it would make more sense to move these SPIR-V specific patterns to some kind of SPIR-V legalization group of patterns. Are there cases where this can be useful in general?

Revision Contents

Path    Size
mlir/include/mlir/Dialect/Vector/Transforms/VectorRewritePatterns.h    16 lines
mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp    92 lines
mlir/test/Dialect/Vector/vector-break-down-bitcast.mlir    41 lines
mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp    22 lines

Diff 516994

mlir/include/mlir/Dialect/Vector/Transforms/VectorRewritePatterns.h

[...]
///
/// If `controlFn` is not nullptr, the pattern will only be invoked on ops that
/// `controlFn` returns true. Otherwise runs on ops.
void populateVectorExtractStridedSliceToExtractInsertChainPatterns(
RewritePatternSet &patterns,
std::function<bool(ExtractStridedSliceOp)> controlFn = nullptr,
PatternBenefit benefit = 1);

				/// Populate `patterns` with a pattern to break down 1-D vector.bitcast ops
				/// based on the destination vector shape. Bitcasts from a lower bitwidth
				/// element type to a higher bitwidth one are extracted from the lower bitwidth
				/// based on the native destination vector shape and inserted based on the ratio
				/// of the bitwidths.
				///
				/// This acts as a last resort way to break down vector.bitcast ops to smaller
				/// vector sizes. Because this pattern composes until it is bitcasting to a
				/// single element of the higher bitwidth, there is an optional control function.
				/// If `controlFn` is not nullptr, the pattern will only apply to ops where
				/// `controlFn` returns true, otherwise applies to all bitcast ops.
				void populateBreakDownVectorBitCastOpPatterns(
				RewritePatternSet &patterns,
				std::function<bool(BitCastOp)> controlFn = nullptr,
				PatternBenefit benefit = 1);

/// Populate `patterns` with the following patterns.
///
/// Patterns in populateVectorInsertExtractStridedSliceDecompositionPatterns();
///
/// [ConvertSameRankInsertStridedSliceIntoShuffle]
/// ==============================================
/// RewritePattern for InsertStridedSliceOp where source and destination vectors
/// have the same rank. For each outermost index in the slice:
[...]

mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp

[...] LogicalResult matchAndRewrite(vector::BitCastOp bitcastOp,
rewriter.replaceOpWithNewOp<vector::InsertStridedSliceOp>(
bitcastOp, bitcastOp.getType(), newCastSrcOp, newCastDstOp, newOffsets,
insertOp.getStrides());

return success();
}
};

		// Breaks down vector.bitcast op
		//
		// This transforms IR like:
		// %1 = vector.bitcast %0: vector<8xf16> to vector<4xf32>
		// Into:
		// %cst = vector.splat %c0_f32 : vector<4xf32>
		// %1 = vector.extract_strided_slice %0 {
		// offsets = [0], sizes = [4], strides = [1]
		// } : vector<8xf16> to vector<4xf16>
		// %2 = vector.bitcast %1 : vector<4xf16> to vector<2xf32>
		// %4 = vector.insert_strided_slice %2, %cst {
		// offsets = [0], strides = [1]} : vector<2xf32> into vector<4xf32>
		// %5 = vector.extract_strided_slice %0 {
		// offsets = [4], sizes = [4], strides = [1]
		// } : vector<8xf16> to vector<4xf16>
		// %6 = vector.bitcast %5 : vector<4xf16> to vector<2xf32>
		// %7 = vector.insert_strided_slice %6, %4 {
		// offsets = [2], strides = [1]} : vector<2xf32> into vector<4xf32>
		struct BreakDownVectorBitCast : public OpRewritePattern<vector::BitCastOp> {
		using OpRewritePattern::OpRewritePattern;

		public:
		BreakDownVectorBitCast(MLIRContext *context,
		std::function<bool(vector::BitCastOp)> controlFn,
		PatternBenefit benefit)
		: OpRewritePattern(context, benefit), controlFn(std::move(controlFn)) {}

		LogicalResult matchAndRewrite(vector::BitCastOp bitcastOp,
		PatternRewriter &rewriter) const override {

		if (controlFn && !controlFn(bitcastOp))
		return failure();

		VectorType castSrcType = bitcastOp.getSourceVectorType();
		VectorType castDstType = bitcastOp.getResultVectorType();
		assert(castSrcType.getRank() == castDstType.getRank());

		// Only support rank 1 case for now.
		if (castSrcType.getRank() != 1)
		return failure();

		int64_t castSrcLastDim = castSrcType.getShape().back();
		int64_t castDstLastDim = castDstType.getShape().back();
		// Require casting to fewer elements for now; other cases to be implemented.
		if (castSrcLastDim < castDstLastDim)
		return failure();

		assert(castSrcLastDim % castDstLastDim == 0);
		int64_t shrinkRatio = castSrcLastDim / castDstLastDim;
		// Nothing to do if it is already bitcasting to a single element.
		if (castSrcLastDim == shrinkRatio)
		return failure();

		Location loc = bitcastOp.getLoc();
		Type elemType = castDstType.getElementType();
		assert(elemType.isSignlessIntOrIndexOrFloat());

		Value zero = rewriter.create<arith::ConstantOp>(
		loc, elemType, rewriter.getZeroAttr(elemType));
		Value res = rewriter.create<SplatOp>(loc, castDstType, zero);

		SmallVector<int64_t> sliceShape{castDstLastDim};
		SmallVector<int64_t> strides{1};
		VectorType newCastDstType =
		VectorType::get(SmallVector<int64_t>{castDstLastDim / shrinkRatio},
		castDstType.getElementType());

		for (int i = 0, e = shrinkRatio; i < e; ++i) {
		Value extracted = rewriter.create<ExtractStridedSliceOp>(
		loc, bitcastOp.getSource(), ArrayRef<int64_t>{i * castDstLastDim},
		sliceShape, strides);
		Value bitcast =
		rewriter.create<BitCastOp>(loc, newCastDstType, extracted);
		res = rewriter.create<InsertStridedSliceOp>(
		loc, bitcast, res,
		ArrayRef<int64_t>{i * castDstLastDim / shrinkRatio}, strides);
		}
		rewriter.replaceOp(bitcastOp, res);
		return success();
		}

		private:
		std::function<bool(BitCastOp)> controlFn;
		};

// Helper that returns a vector comparison that constructs a mask:
// mask = [0,1,..,n-1] + [o,o,..,o] < [b,b,..,b]
//
// If `dim == 0` then the result will be a 0-D vector.
//
// NOTE: The LLVM::GetActiveLaneMaskOp intrinsic would provide an alternative,
// much more compact, IR for this operation, but LLVM eventually
// generates more elaborate instructions for this intrinsic since it
[...]
void mlir::vector::populateBubbleVectorBitCastOpPatterns(
RewritePatternSet &patterns, PatternBenefit benefit) {
patterns.add<BubbleDownVectorBitCastForExtract,
BubbleDownBitCastForStridedSliceExtract,
BubbleUpBitCastForStridedSliceInsert>(patterns.getContext(),
benefit);
}

		void mlir::vector::populateBreakDownVectorBitCastOpPatterns(
		RewritePatternSet &patterns,
		std::function<bool(vector::BitCastOp)> controlFn, PatternBenefit benefit) {
		patterns.add<BreakDownVectorBitCast>(patterns.getContext(),
		std::move(controlFn), benefit);
		}

void mlir::vector::populateVectorContractCanonicalizeMatmulToMMT(
RewritePatternSet &patterns,
std::function<LogicalResult(vector::ContractionOp)> constraint,
PatternBenefit benefit) {
patterns.add<CanonicalizeContractMatmulToMMT>(patterns.getContext(), benefit,
std::move(constraint));
}

[...]

mlir/test/Dialect/Vector/vector-break-down-bitcast.mlir

This file was added.

				// RUN: mlir-opt -split-input-file -test-vector-break-down-bitcast %s | FileCheck %s

				// CHECK-LABEL: func.func @bitcast_f16_to_f32
				// CHECK-SAME: (%[[INPUT:.+]]: vector<8xf16>)
				func.func @bitcast_f16_to_f32(%input: vector<8xf16>) -> vector<4xf32> {
				%0 = vector.bitcast %input : vector<8xf16> to vector<4xf32>
				return %0: vector<4xf32>
				}

				// CHECK: %[[INIT:.+]] = arith.constant dense<0.000000e+00> : vector<4xf32>
				// CHECK: %[[EXTRACT0:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [0], sizes = [4], strides = [1]} : vector<8xf16> to vector<4xf16>
				// CHECK: %[[CAST0:.+]] = vector.bitcast %[[EXTRACT0]] : vector<4xf16> to vector<2xf32>
				// CHECK: %[[INSERT0:.+]] = vector.insert_strided_slice %[[CAST0]], %[[INIT]] {offsets = [0], strides = [1]} : vector<2xf32> into vector<4xf32>
				// CHECK: %[[EXTRACT1:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [4], sizes = [4], strides = [1]} : vector<8xf16> to vector<4xf16>
				// CHECK: %[[CAST1:.+]] = vector.bitcast %[[EXTRACT1]] : vector<4xf16> to vector<2xf32>
				// CHECK: %[[INSERT1:.+]] = vector.insert_strided_slice %[[CAST1]], %[[INSERT0]] {offsets = [2], strides = [1]} : vector<2xf32> into vector<4xf32>
				// CHECK: return %[[INSERT1]]

				// -----

				// CHECK-LABEL: func.func @bitcast_i8_to_i32
				// CHECK-SAME: (%[[INPUT:.+]]: vector<16xi8>)
				func.func @bitcast_i8_to_i32(%input: vector<16xi8>) -> vector<4xi32> {
				%0 = vector.bitcast %input : vector<16xi8> to vector<4xi32>
				return %0: vector<4xi32>
				}

				// CHECK: %[[INIT:.+]] = arith.constant dense<0> : vector<4xi32>
				// CHECK: %[[EXTRACT0:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [0], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
				// CHECK: %[[CAST0:.+]] = vector.bitcast %[[EXTRACT0]] : vector<4xi8> to vector<1xi32>
				// CHECK: %[[INSERT0:.+]] = vector.insert_strided_slice %[[CAST0]], %[[INIT]] {offsets = [0], strides = [1]} : vector<1xi32> into vector<4xi32>
				// CHECK: %[[EXTRACT1:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [4], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
				// CHECK: %[[CAST1:.+]] = vector.bitcast %[[EXTRACT1]] : vector<4xi8> to vector<1xi32>
				// CHECK: %[[INSERT1:.+]] = vector.insert_strided_slice %[[CAST1]], %[[INSERT0]] {offsets = [1], strides = [1]} : vector<1xi32> into vector<4xi32>
				// CHECK: %[[EXTRACT2:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [8], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
				// CHECK: %[[CAST2:.+]] = vector.bitcast %[[EXTRACT2]] : vector<4xi8> to vector<1xi32>
				// CHECK: %[[INSERT2:.+]] = vector.insert_strided_slice %[[CAST2]], %[[INSERT1]] {offsets = [2], strides = [1]} : vector<1xi32> into vector<4xi32>
				// CHECK: %[[EXTRACT3:.+]] = vector.extract_strided_slice %[[INPUT]] {offsets = [12], sizes = [4], strides = [1]} : vector<16xi8> to vector<4xi8>
				// CHECK: %[[CAST3:.+]] = vector.bitcast %[[EXTRACT3]] : vector<4xi8> to vector<1xi32>
				// CHECK: %[[INSERT3:.+]] = vector.insert_strided_slice %[[CAST3]], %[[INSERT2]] {offsets = [3], strides = [1]} : vector<1xi32> into vector<4xi32>
				// CHECK: return %[[INSERT3]]

mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp

[...] struct TestVectorExtractStridedSliceLowering
}
void runOnOperation() override {
RewritePatternSet patterns(&getContext());
populateVectorExtractStridedSliceToExtractInsertChainPatterns(patterns);
(void)applyPatternsAndFoldGreedily(getOperation(), std::move(patterns));
}
};

		struct TestVectorBreakDownBitCast
		: public PassWrapper<TestVectorBreakDownBitCast,
		OperationPass<func::FuncOp>> {
		MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestVectorBreakDownBitCast)

		StringRef getArgument() const final {
		return "test-vector-break-down-bitcast";
		}
		StringRef getDescription() const final {
		return "Test pattern that breaks down vector.bitcast ops ";
		}
		void runOnOperation() override {
		RewritePatternSet patterns(&getContext());
		populateBreakDownVectorBitCastOpPatterns(patterns, [](BitCastOp op) {
		return op.getSourceVectorType().getShape().back() > 4;
		});
		(void)applyPatternsAndFoldGreedily(getOperation(), std::move(patterns));
		}
		};

struct TestCreateVectorBroadcast
: public PassWrapper<TestCreateVectorBroadcast,
OperationPass<func::FuncOp>> {
MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestCreateVectorBroadcast)

StringRef getArgument() const final { return "test-create-vector-broadcast"; }
StringRef getDescription() const final {
return "Test optimization transformations for transfer ops";
[...] void registerTestVectorLowerings() {
PassRegistration<TestFlattenVectorTransferPatterns>();

PassRegistration<TestVectorScanLowering>();

PassRegistration<TestVectorDistribution>();

PassRegistration<TestVectorExtractStridedSliceLowering>();

		PassRegistration<TestVectorBreakDownBitCast>();

PassRegistration<TestCreateVectorBroadcast>();

PassRegistration<TestVectorGatherLowering>();
}
} // namespace test
} // namespace mlir
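
How the control function bounds the break-down can be modeled without MLIR. The sketch below is a hypothetical standalone C++ model: the `Cast` alias and `breakDownToFixpoint` are not part of the patch, the worklist only approximates what a greedy rewrite driver does, and the divisibility guard is an assumption of the sketch. It keeps splitting every bitcast the control function accepts and returns the bitcasts that remain.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// A 1-D bitcast described only by its source and destination element counts.
using Cast = std::pair<int64_t, int64_t>;

// Hypothetical model (not MLIR code): split every bitcast that `controlFn`
// accepts into `ratio` narrower bitcasts until no further split applies.
// A null `controlFn` accepts every bitcast.
std::vector<Cast>
breakDownToFixpoint(Cast root,
                    const std::function<bool(const Cast &)> &controlFn) {
  std::vector<Cast> worklist{root};
  std::vector<Cast> done;
  while (!worklist.empty()) {
    Cast current = worklist.back();
    worklist.pop_back();
    int64_t src = current.first;
    int64_t dst = current.second;
    bool applies = (!controlFn || controlFn(current)) && dst > 0 &&
                   src >= dst && src % dst == 0;
    int64_t ratio = applies ? src / dst : 0;
    // Stop when rejected, when already casting to a single element, or when
    // a slice would not map to a whole number of destination elements.
    if (!applies || src == ratio || dst % ratio != 0) {
      done.push_back(current);
      continue;
    }
    // Each slice takes `dst` source elements and yields `dst / ratio`
    // destination elements.
    for (int64_t i = 0; i < ratio; ++i)
      worklist.push_back({dst, dst / ratio});
  }
  return done;
}
```

With the predicate used by the test pass (source length greater than 4), a 16-to-4 cast ends as four 4-to-1 casts and an 8-to-4 cast ends as two 4-to-2 casts, as in the CHECK lines. With no control function, the 8-to-4 cast continues down to four 2-to-1 casts.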
