Details

Reviewers

c-rhodes
awarzynski
aartbik
ftynse
dcaballe
nicolasvasilache

Commits

rG2a82dfd70402: [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes…

Summary

This allows rank > 1 transfer_reads/writes to be lowered to equivalent
lower-rank ones when the trailing dimension is scalable. The resulting
ops still cannot be completely lowered, as that depends on arrays of
scalable vectors being enabled and on a few related fixes (see D158517).

This patch also explicitly disables lowering transfer_reads/writes with
a leading scalable dimension, as more changes would be needed to handle
that correctly and it is unclear whether that is required.

Examples of ops that can now be further lowered:

  %vec = vector.transfer_read %arg0[%c0, %c0], %cst, %mask
      {in_bounds = [true, true]} : memref<3x?xf32>, vector<3x[4]xf32>

  vector.transfer_write %vec, %arg0[%c0, %c0], %mask
      {in_bounds = [true, true]} : vector<3x[4]xf32>, memref<3x?xf32>
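
The difference between dropping and keeping the scalable flag can be modelled outside MLIR. The sketch below is a hypothetical, standalone C++ stand-in (`VecShape` and `dropLeadingDim` are illustrative names, not MLIR APIs). It removes the leading dimension of a shape such as 3x[4] while carrying the per-dimension scalable flags along, and it refuses when the leading dimension itself is scalable.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a vector type: a shape plus one "scalable" flag
// per dimension, e.g. vector<3x[4]xf32> is {{3, 4}, {false, true}}.
struct VecShape {
  std::vector<int64_t> dims;
  std::vector<bool> scalable;
};

// Drops the leading dimension while keeping the scalable flags of the
// remaining dimensions. Returns std::nullopt when the leading dimension is
// scalable, because its size is not known at compile time.
std::optional<VecShape> dropLeadingDim(const VecShape &type) {
  assert(type.dims.size() == type.scalable.size() && !type.dims.empty());
  if (type.scalable.front())
    return std::nullopt;
  VecShape result;
  result.dims.assign(type.dims.begin() + 1, type.dims.end());
  result.scalable.assign(type.scalable.begin() + 1, type.scalable.end());
  return result;
}
```

In this model 3x[4] becomes [4] with its scalable flag intact, whereas [4]x4 is rejected, which mirrors the two cases described in the summary.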

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

benmxwl-arm created this revision. Aug 24 2023, 9:43 AM

Herald added a reviewer: aartbik. Aug 24 2023, 9:43 AM

Herald added a reviewer: ftynse.

Herald added a reviewer: dcaballe.

Herald added a project: Restricted Project.

Herald added subscribers: gysit, Dinistro, bviyer and 26 others.

benmxwl-arm requested review of this revision. Aug 24 2023, 9:43 AM

Herald added a reviewer: nicolasvasilache. Aug 24 2023, 9:43 AM

Herald added a project: Restricted Project.

Herald added subscribers: alextsao1999, stephenneuendorffer, nicolasvasilache.

benmxwl-arm added a parent revision: D158751: [mlir][BuiltinTypes] Add VectorType::sliceDims() and drop[Front/Back]Dims(). Aug 24 2023, 9:44 AM

Harbormaster completed remote builds in B254658: Diff 553174. Aug 24 2023, 11:03 AM

What about vector.transfer_write? Wouldn't that be affected too?

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1872 ↗	(On Diff #553174)	Is it just me or does `[0, 0, 0, 0]` feel wrong? That's unrelated to this change, but it should be something "scalable" instead, right?
mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
647	An example with the trailing dim other than 1 would be more interesting ;-)

benmxwl-arm marked an inline comment as done. Aug 29 2023, 4:33 AM

benmxwl-arm added inline comments.

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1872 ↗	(On Diff #553174)	It looks wrong, but I think it's okay. Looking at the LangRef, only zeroinitializer (for a splat), undef, or poison are allowed there for scalable vectors. When actually lowered to LLVM this becomes:

  shufflevector <vscale x 4 x i32> %4, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer

So, this is actually a scalable splat. https://llvm.org/docs/LangRef.html#shufflevector-instruction
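
To see why an all-zero mask amounts to a splat, here is a minimal fixed-width model of the shufflevector selection rule in standalone C++. It is only a sketch: `shuffleVector` is a hypothetical helper, the vectors have a fixed length rather than a vscale-dependent one, and undef/poison mask elements are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models the shufflevector rule for two equally sized inputs: result element
// i is taken from the concatenation of `a` and `b` at index mask[i].
std::vector<int> shuffleVector(const std::vector<int> &a,
                               const std::vector<int> &b,
                               const std::vector<std::size_t> &mask) {
  assert(a.size() == b.size());
  std::vector<int> result;
  result.reserve(mask.size());
  for (std::size_t index : mask) {
    assert(index < a.size() + b.size());
    result.push_back(index < a.size() ? a[index] : b[index - a.size()]);
  }
  return result;
}
```

With a mask of all zeros every result lane reads element 0 of the first operand, so the second operand is never consulted and the result is a broadcast of that single element.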

Use something more interesting than a 1-dim for vector-to-scf example :)
Add transfer_write tests/example (these changes also fix transfer writes, as the code is shared)
Use VectorType::Builder for dropping dims (rather than abandoned helpers)

benmxwl-arm retitled this revision from [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads to [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes. Aug 30 2023, 5:21 AM

benmxwl-arm edited the summary of this revision.

benmxwl-arm marked an inline comment as done.

benmxwl-arm edited parent revisions, added: D159122: [mlir][BuiltinTypes] Return VectorType from VectorType::Builder conversion operator; removed: D158751: [mlir][BuiltinTypes] Add VectorType::sliceDims() and drop[Front/Back]Dims(). Aug 30 2023, 5:26 AM

Harbormaster completed remote builds in B255760: Diff 554671. Aug 30 2023, 7:15 AM

LGTM, thanks!

In the spirit of "documenting through testing", it would be good to add some negative tests, i.e. where vector.transfer_{write|read} _is not_ rewritten because a non-trailing dim is scalable.

mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
63 ↗	(On Diff #554671)	[nit]

This revision is now accepted and ready to land. Sep 1 2023, 6:45 AM

Split out vector-to-LLVM changes (these depend on D158517)
Added negative tests (and updated patch to avoid hitting an assert on types it cannot handle)

benmxwl-arm retitled this revision from [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes to [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToSCF). Sep 7 2023, 7:14 AM

benmxwl-arm edited the summary of this revision.

Still LGTM :) Two small suggestions added.

mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp
314–315	[nit] A return value of std::nullopt indicates failure :) At this point in time that happens when scalable dims are encountered, but keep in mind that this can change in the future. Saying something like "Vectors with the leading scalable dims are not supported (we cannot unroll scalable dims at compile time)." would, IMHO, be a bit more future-proof :)
1090	[nit] Could this be `xferVecType.getScalableDims()[0]` instead? Basically to match this bit further down: 1099: int64_t dimSize = xferVecType.getShape()[0]; #self-documenting-code

benmxwl-arm mentioned this in D159482: [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToLLVM). Sep 7 2023, 8:27 AM

Harbormaster completed remote builds in B256795: Diff 556146. Sep 7 2023, 8:29 AM

Fixup some nits

benmxwl-arm marked 2 inline comments as done. Sep 7 2023, 10:31 AM

Harbormaster completed remote builds in B256809: Diff 556173. Sep 7 2023, 12:56 PM

dcaballe accepted this revision. Sep 7 2023, 1:32 PM

dcaballe added inline comments.

mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp
317	If optional is used to signal failure, it would be better to use FailureOr.

LGTM barring Diego's comment, cheers

Use FailureOr rather than std::optional for failable unpackOneDim()
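
The calling pattern this update introduces (check `failed(...)`, then dereference) can be illustrated with a small standalone C++ sketch. Everything in it is a hypothetical, simplified stand-in: `Failure`, `FailureOr`, and `leadingDimSize` are illustrative names written for this example, not the MLIR definitions, and the wrapper is assumed to behave like an optional whose empty state means failure.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical, simplified stand-ins; these are not the MLIR definitions.
struct Failure {};
inline Failure failure() { return {}; }

template <typename T>
class FailureOr {
public:
  FailureOr(Failure) {}
  FailureOr(T value) : storage(std::move(value)) {}
  bool failed() const { return !storage.has_value(); }
  const T &operator*() const {
    assert(storage && "dereferencing a failed result");
    return *storage;
  }

private:
  std::optional<T> storage;
};

// Mirrors the shape of a failable helper: returns the leading dimension
// size, or failure when that dimension is scalable.
FailureOr<long> leadingDimSize(long size, bool isScalable) {
  if (isScalable)
    return failure();
  return size;
}
```

Compared with a bare optional, the return type states in the signature that an empty result is an error rather than a legitimate "no value", which is the readability point raised in the review.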

This revision was landed with ongoing or failed builds. Sep 8 2023, 2:45 AM

Closed by commit rG2a82dfd70402: [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes… (authored by benmxwl-arm).

This revision was automatically updated to reflect the committed changes.

benmxwl-arm added a commit: rG2a82dfd70402: [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes….

benmxwl-arm marked an inline comment as done. Sep 8 2023, 2:47 AM

Harbormaster completed remote builds in B256847: Diff 556241. Sep 8 2023, 3:07 AM

benmxwl-arm mentioned this in rGccef726d09b1: [mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes…. Sep 11 2023, 9:49 AM

Diff 556242

mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp

@@ [305 unchanged lines hidden] @@ if (xferOp.getMask()) {
     b.create<memref::StoreOp>(loc, xferOp.getMask(), maskBuffer);
     result.maskBuffer = b.create<memref::LoadOp>(loc, maskBuffer, ValueRange());
   }

   return result;
 }

 /// Given a MemRefType with VectorType element type, unpack one dimension from
 /// the VectorType into the MemRefType.
 ///
 /// E.g.: memref<9xvector<5x6xf32>> --> memref<9x5xvector<6xf32>>
-static MemRefType unpackOneDim(MemRefType type) {
+static FailureOr<MemRefType> unpackOneDim(MemRefType type) {
   auto vectorType = dyn_cast<VectorType>(type.getElementType());
+  // Vectors with leading scalable dims are not supported.
+  // It may be possible to support these in future by using dynamic memref dims.
+  if (vectorType.getScalableDims().front())
+    return failure();
   auto memrefShape = type.getShape();
   SmallVector<int64_t, 8> newMemrefShape;
   newMemrefShape.append(memrefShape.begin(), memrefShape.end());
   newMemrefShape.push_back(vectorType.getDimSize(0));
   return MemRefType::get(newMemrefShape,
-                         VectorType::get(vectorType.getShape().drop_front(),
-                                         vectorType.getElementType()));
+                         VectorType::Builder(vectorType).dropDim(0));
 }

 /// Given a transfer op, find the memref from which the mask is loaded. This
 /// is similar to Strategy<TransferWriteOp>::getBuffer.
 template <typename OpTy>
 static Value getMaskBuffer(OpTy xferOp) {
   assert(xferOp.getMask() && "Expected that transfer op has mask");
   auto loadOp = xferOp.getMask().template getDefiningOp<memref::LoadOp>();
@@ [203 unchanged lines hidden] @@

 template <typename OpTy>
 LogicalResult checkPrepareXferOp(OpTy xferOp,
                                  VectorTransferToSCFOptions options) {
   if (xferOp->hasAttr(kPassLabel))
     return failure();
   if (xferOp.getVectorType().getRank() <= options.targetRank)
     return failure();
+  // Currently the unpacking of the leading dimension into the memref is not
+  // supported for scalable dimensions.
+  if (xferOp.getVectorType().getScalableDims().front())
+    return failure();
   if (isTensorOp(xferOp) && !options.lowerTensors)
     return failure();
   // Transfer ops that modify the element type are not supported atm.
   if (xferOp.getVectorType().getElementType() !=
       xferOp.getShapedType().getElementType())
     return failure();
   return success();
 }
@@ [308 unchanged lines hidden] @@ LogicalResult matchAndRewrite(OpTy xferOp,
     if (!xferOp->hasAttr(kPassLabel))
       return failure();

     // Find and cast data buffer. How the buffer can be found depends on OpTy.
     ImplicitLocOpBuilder locB(xferOp.getLoc(), rewriter);
     auto dataBuffer = Strategy<OpTy>::getBuffer(xferOp);
     auto dataBufferType = dyn_cast<MemRefType>(dataBuffer.getType());
     auto castedDataType = unpackOneDim(dataBufferType);
+    if (failed(castedDataType))
+      return failure();
+
     auto castedDataBuffer =
-        locB.create<vector::TypeCastOp>(castedDataType, dataBuffer);
+        locB.create<vector::TypeCastOp>(*castedDataType, dataBuffer);

     // If the xferOp has a mask: Find and cast mask buffer.
     Value castedMaskBuffer;
     if (xferOp.getMask()) {
       auto maskBuffer = getMaskBuffer(xferOp);
       auto maskBufferType = dyn_cast<MemRefType>(maskBuffer.getType());
       if (xferOp.isBroadcastDim(0) || xferOp.getMaskType().getRank() == 1) {
         // Do not unpack a dimension of the mask, if:
         // * To-be-unpacked transfer op dimension is a broadcast.
         // * Mask is 1D, i.e., the mask cannot be further unpacked.
         //   (That means that all remaining dimensions of the transfer op must
         //   be broadcasted.)
         castedMaskBuffer = maskBuffer;
       } else {
-        auto castedMaskType = unpackOneDim(maskBufferType);
+        // It's safe to assume the mask buffer can be unpacked if the data
+        // buffer was unpacked.
+        auto castedMaskType = *unpackOneDim(maskBufferType);
         castedMaskBuffer =
             locB.create<vector::TypeCastOp>(castedMaskType, maskBuffer);
       }
     }

     // Loop bounds and step.
     auto lb = locB.create<arith::ConstantIndexOp>(0);
     auto ub = locB.create<arith::ConstantIndexOp>(
-        castedDataType.getDimSize(castedDataType.getRank() - 1));
+        castedDataType->getDimSize(castedDataType->getRank() - 1));
     auto step = locB.create<arith::ConstantIndexOp>(1);
     // TransferWriteOps that operate on tensors return the modified tensor and
     // require a loop state.
     auto loopState = Strategy<OpTy>::initialLoopState(xferOp);

     // Generate for loop.
     auto result = locB.create<scf::ForOp>(
         lb, ub, step, loopState ? ValueRange(loopState) : ValueRange(),
@@ [166 unchanged lines hidden] @@ LogicalResult matchAndRewrite(TransferReadOp xferOp,
     if (xferOp.getVectorType().getElementType() !=
         xferOp.getShapedType().getElementType())
       return failure();

     auto insertOp = getInsertOp(xferOp);
     auto vec = getResultVector(xferOp, rewriter);
     auto vecType = dyn_cast<VectorType>(vec.getType());
     auto xferVecType = xferOp.getVectorType();
-    auto newXferVecType = VectorType::get(xferVecType.getShape().drop_front(),
-                                          xferVecType.getElementType());
+    if (xferVecType.getScalableDims()[0]) {
+      // Cannot unroll a scalable dimension at compile time.
+      return failure();
+    }
+
+    VectorType newXferVecType = VectorType::Builder(xferVecType).dropDim(0);
+
     int64_t dimSize = xferVecType.getShape()[0];

     // Generate fully unrolled loop of transfer ops.
     Location loc = xferOp.getLoc();
     for (int64_t i = 0; i < dimSize; ++i) {
       Value iv = rewriter.create<arith::ConstantIndexOp>(loc, i);

       vec = generateInBoundsCheck(
@@ [402 unchanged lines hidden] @@

mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir

@@ [629 unchanged lines hidden] @@
 // CHECK: scf.if %[[IS_NOT_LAST]] {
 // CHECK: vector.print punctuation <comma>
 // CHECK: }
 // CHECK: }
 // CHECK: vector.print punctuation <close>
 // CHECK: vector.print
 // CHECK: return
 // CHECK: }

+// -----
+
+func.func @transfer_read_array_of_scalable(%arg0: memref<3x?xf32>) -> vector<3x[4]xf32> {
+  %c0 = arith.constant 0 : index
+  %c1 = arith.constant 1 : index
+  %cst = arith.constant 0.000000e+00 : f32
+  %dim = memref.dim %arg0, %c1 : memref<3x?xf32>
+  %mask = vector.create_mask %c1, %dim : vector<3x[4]xi1>
+  %read = vector.transfer_read %arg0[%c0, %c0], %cst, %mask {in_bounds = [true, true]} : memref<3x?xf32>, vector<3x[4]xf32>
+  return %read : vector<3x[4]xf32>
+}
+// CHECK-LABEL: func.func @transfer_read_array_of_scalable(
+// CHECK-SAME: %[[ARG:.*]]: memref<3x?xf32>) -> vector<3x[4]xf32> {
+// CHECK: %[[PADDING:.*]] = arith.constant 0.000000e+00 : f32
+// CHECK: %[[C0:.*]] = arith.constant 0 : index
+// CHECK: %[[C3:.*]] = arith.constant 3 : index
+// CHECK: %[[C1:.*]] = arith.constant 1 : index
+// CHECK: %[[ALLOCA_VEC:.*]] = memref.alloca() : memref<vector<3x[4]xf32>>
+// CHECK: %[[ALLOCA_MASK:.*]] = memref.alloca() : memref<vector<3x[4]xi1>>
+// CHECK: %[[DIM_SIZE:.*]] = memref.dim %[[ARG]], %[[C1]] : memref<3x?xf32>
+// CHECK: %[[MASK:.*]] = vector.create_mask %[[C1]], %[[DIM_SIZE]] : vector<3x[4]xi1>
+// CHECK: memref.store %[[MASK]], %[[ALLOCA_MASK]][] : memref<vector<3x[4]xi1>>
+// CHECK: %[[UNPACK_VECTOR:.*]] = vector.type_cast %[[ALLOCA_VEC]] : memref<vector<3x[4]xf32>> to memref<3xvector<[4]xf32>>
+// CHECK: %[[UNPACK_MASK:.*]] = vector.type_cast %[[ALLOCA_MASK]] : memref<vector<3x[4]xi1>> to memref<3xvector<[4]xi1>>
+// CHECK: scf.for %[[VAL_11:.*]] = %[[C0]] to %[[C3]] step %[[C1]] {
+// CHECK: %[[MASK_SLICE:.*]] = memref.load %[[UNPACK_MASK]]{{\[}}%[[VAL_11]]] : memref<3xvector<[4]xi1>>
+// CHECK: %[[READ_SLICE:.*]] = vector.transfer_read %[[ARG]]{{\[}}%[[VAL_11]], %[[C0]]], %[[PADDING]], %[[MASK_SLICE]] {in_bounds = [true]} : memref<3x?xf32>, vector<[4]xf32>
+// CHECK: memref.store %[[READ_SLICE]], %[[UNPACK_VECTOR]]{{\[}}%[[VAL_11]]] : memref<3xvector<[4]xf32>>
+// CHECK: }
+// CHECK: %[[RESULT:.*]] = memref.load %[[ALLOCA_VEC]][] : memref<vector<3x[4]xf32>>
+// CHECK: return %[[RESULT]] : vector<3x[4]xf32>
+// CHECK: }

+// -----
+
+func.func @transfer_write_array_of_scalable(%vec: vector<3x[4]xf32>, %arg0: memref<3x?xf32>) {
+  %c0 = arith.constant 0 : index
+  %c1 = arith.constant 1 : index
+  %cst = arith.constant 0.000000e+00 : f32
+  %dim = memref.dim %arg0, %c1 : memref<3x?xf32>
+  %mask = vector.create_mask %c1, %dim : vector<3x[4]xi1>
+  vector.transfer_write %vec, %arg0[%c0, %c0], %mask {in_bounds = [true, true]} : vector<3x[4]xf32>, memref<3x?xf32>
+  return
+}
+// CHECK-LABEL: func.func @transfer_write_array_of_scalable(
+// CHECK-SAME: %[[VEC:.*]]: vector<3x[4]xf32>,
+// CHECK-SAME: %[[MEMREF:.*]]: memref<3x?xf32>) {
+// CHECK: %[[C0:.*]] = arith.constant 0 : index
+// CHECK: %[[C3:.*]] = arith.constant 3 : index
+// CHECK: %[[C1:.*]] = arith.constant 1 : index
+// CHECK: %[[ALLOCA_VEC:.*]] = memref.alloca() : memref<vector<3x[4]xf32>>
+// CHECK: %[[ALLOCA_MASK:.*]] = memref.alloca() : memref<vector<3x[4]xi1>>
+// CHECK: %[[DIM_SIZE:.*]] = memref.dim %[[MEMREF]], %[[C1]] : memref<3x?xf32>
+// CHECK: %[[MASK:.*]] = vector.create_mask %[[C1]], %[[DIM_SIZE]] : vector<3x[4]xi1>
+// CHECK: memref.store %[[MASK]], %[[ALLOCA_MASK]][] : memref<vector<3x[4]xi1>>
+// CHECK: memref.store %[[VEC]], %[[ALLOCA_VEC]][] : memref<vector<3x[4]xf32>>
+// CHECK: %[[UNPACK_VECTOR:.*]] = vector.type_cast %[[ALLOCA_VEC]] : memref<vector<3x[4]xf32>> to memref<3xvector<[4]xf32>>
+// CHECK: %[[UNPACK_MASK:.*]] = vector.type_cast %[[ALLOCA_MASK]] : memref<vector<3x[4]xi1>> to memref<3xvector<[4]xi1>>
+// CHECK: scf.for %[[VAL_11:.*]] = %[[C0]] to %[[C3]] step %[[C1]] {
+// CHECK: %[[MASK_SLICE:.*]] = memref.load %[[UNPACK_VECTOR]]{{\[}}%[[VAL_11]]] : memref<3xvector<[4]xf32>>
+// CHECK: %[[VECTOR_SLICE:.*]] = memref.load %[[UNPACK_MASK]]{{\[}}%[[VAL_11]]] : memref<3xvector<[4]xi1>>
+// CHECK: vector.transfer_write %[[MASK_SLICE]], %[[MEMREF]]{{\[}}%[[VAL_11]], %[[C0]]], %[[VECTOR_SLICE]] {in_bounds = [true]} : vector<[4]xf32>, memref<3x?xf32>
+// CHECK: }
+// CHECK: return
+// CHECK: }

+// -----
+
+/// The following two tests currently cannot be lowered via unpacking the leading dim since it is scalable.
+/// It may be possible to special case this via a dynamic dim in future.
+
+func.func @cannot_lower_transfer_write_with_leading_scalable(%vec: vector<[4]x4xf32>, %arg0: memref<?x4xf32>) {
+  %c0 = arith.constant 0 : index
+  %c4 = arith.constant 4 : index
+  %cst = arith.constant 0.000000e+00 : f32
+  %dim = memref.dim %arg0, %c0 : memref<?x4xf32>
+  %mask = vector.create_mask %dim, %c4 : vector<[4]x4xi1>
+  vector.transfer_write %vec, %arg0[%c0, %c0], %mask {in_bounds = [true, true]} : vector<[4]x4xf32>, memref<?x4xf32>
+  return
+}
+// CHECK-LABEL: func.func @cannot_lower_transfer_write_with_leading_scalable(
+// CHECK-SAME: %[[VEC:.*]]: vector<[4]x4xf32>,
+// CHECK-SAME: %[[MEMREF:.*]]: memref<?x4xf32>)
+// CHECK: vector.transfer_write %[[VEC]], %[[MEMREF]][%{{.}}, %{{.}}], %{{.*}} {in_bounds = [true, true]} : vector<[4]x4xf32>, memref<?x4xf32>
+
+// -----
+
+func.func @cannot_lower_transfer_read_with_leading_scalable(%arg0: memref<?x4xf32>) -> vector<[4]x4xf32> {
+  %c0 = arith.constant 0 : index
+  %c1 = arith.constant 1 : index
+  %c4 = arith.constant 4 : index
+  %cst = arith.constant 0.000000e+00 : f32
+  %dim = memref.dim %arg0, %c0 : memref<?x4xf32>
+  %mask = vector.create_mask %dim, %c4 : vector<[4]x4xi1>
+  %read = vector.transfer_read %arg0[%c0, %c0], %cst, %mask {in_bounds = [true, true]} : memref<?x4xf32>, vector<[4]x4xf32>
+  return %read : vector<[4]x4xf32>
+}
+// CHECK-LABEL: func.func @cannot_lower_transfer_read_with_leading_scalable(
+// CHECK-SAME: %[[MEMREF:.*]]: memref<?x4xf32>)
+// CHECK: %{{.}} = vector.transfer_read %[[MEMREF]][%{{.}}, %{{.}}], %{{.}}, %{{.*}} {in_bounds = [true, true]} : memref<?x4xf32>, vector<[4]x4xf32>
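
The loop structure checked in the transfer_read_array_of_scalable test above (one lower-rank masked read per index of the fixed leading dimension) can be imitated in plain C++. This is only an illustrative sketch: `maskedReadRow` and `readByRows` are hypothetical names, and an ordinary run-time row length stands in for the scalable trailing dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;
using MaskRow = std::vector<bool>;

// Reads one row under a mask: masked-off lanes receive the padding value.
Row maskedReadRow(const Row &source, const MaskRow &mask, float padding) {
  assert(source.size() == mask.size());
  Row result(source.size(), padding);
  for (std::size_t lane = 0; lane < source.size(); ++lane)
    if (mask[lane])
      result[lane] = source[lane];
  return result;
}

// Reads a 2-D block by looping over the fixed leading dimension and issuing
// one lower-rank masked read per row, much like the generated scf.for loop.
std::vector<Row> readByRows(const std::vector<Row> &source,
                            const std::vector<MaskRow> &mask, float padding) {
  assert(source.size() == mask.size());
  std::vector<Row> result;
  result.reserve(source.size());
  for (std::size_t row = 0; row < source.size(); ++row)
    result.push_back(maskedReadRow(source[row], mask[row], padding));
  return result;
}
```

The outer loop only needs the leading dimension to have a compile-time size; the row length may stay unknown until run time, which is why a scalable trailing dimension is fine while a scalable leading one is rejected.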

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToSCF)
Closed, Public
