Diff 344318

mlir/lib/Dialect/Vector/VectorTransforms.cpp

struct TransferReadPermutationLowering
using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferReadOp op,		LogicalResult matchAndRewrite(vector::TransferReadOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
SmallVector<unsigned> permutation;		SmallVector<unsigned> permutation;
AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))		if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))
return failure();		return failure();

AffineMap permutationMap =		AffineMap permutationMap =
map.getPermutationMap(permutation, op.getContext());		map.getPermutationMap(permutation, op.getContext());
if (permutationMap.isIdentity())		if (permutationMap.isIdentity())
return failure();		return failure();
if (op.mask())
return failure();
// Calculate the map of the new read by applying the inverse permutation.		// Calculate the map of the new read by applying the inverse permutation.
permutationMap = inversePermutation(permutationMap);		permutationMap = inversePermutation(permutationMap);
AffineMap newMap = permutationMap.compose(map);		AffineMap newMap = permutationMap.compose(map);
		springerm (Author): I don't fully understand what this is doing. If there's a better way to implement the mask handling below (maybe reusing some of these functions), please let me know.
		ThomasRaoux: You should be able to use `permutation` directly for the transpose. (Unfortunately you'll have to convert it to int64_t, since AffineMap takes unsigned but transpose takes int64_t; this is something we should fix.) What `isPermutationOfMinorIdentityWithBroadcasting` does is pick a permutation to get to a minor identity with broadcast. If there is a broadcast in the map there may be several potential permutations, but the right thing to do is to apply the same transpose to masks and dimensions. In the code we invert the permutation map to apply it to the result of the transfer read, as we want to convert it back to the original order, but for masks you just want to apply the permutation directly.
		springerm (Author): Thanks for explaining. I cannot use `permutation` directly, because `mask` ignores all broadcast dimensions. E.g., a 4D transfer read op would have a 1D mask if 3 dimensions are broadcast. Therefore, I would somehow have to remove source dimensions from `permutation` and "compress" (re-index) the map. Note that I added support for transfer op masks recently, and the commit handling broadcasts (D101745) is not submitted yet. So if the "vector type shape != mask type shape" thing is surprising, this is something we could still discuss in that commit.
		ThomasRaoux: I see, I missed the part where the mask skips the broadcast dimensions. In that case it makes sense. I added a potential simplification below.
// Apply the reverse transpose to deduce the type of the transfer_read.		// Apply the reverse transpose to deduce the type of the transfer_read.
ArrayRef<int64_t> originalShape = op.getVectorType().getShape();		ArrayRef<int64_t> originalShape = op.getVectorType().getShape();
SmallVector<int64_t> newVectorShape(originalShape.size());		SmallVector<int64_t> newVectorShape(originalShape.size());
for (auto pos : llvm::enumerate(permutation)) {		for (auto pos : llvm::enumerate(permutation)) {
newVectorShape[pos.value()] = originalShape[pos.index()];		newVectorShape[pos.value()] = originalShape[pos.index()];
}		}

		Value newMask;
		if (op.mask()) {
		// Build helper array of size "number of dimensions" of the permutation
		// map. For each dim, assign an increasing counter if the dim is used in
		// the result. E.g.:
		// permutation map: (d0, d1, d2, d3, d4, d5) -> (d5, 0, d3, 0, d2)
		// dim in result? [ 0, 0, 1, 1, 0, 1]
		// dimUseIndexer: [ 0, 0, 0, 1, 2, 2]
		SmallVector<unsigned> dimUseIndexer(map.getNumDims());
		for (unsigned i = 0, pos = 0; i < map.getNumDims(); ++i) {
		auto dimInResult = llvm::any_of(map.getResults(), [&](AffineExpr e) {
		return e.isa<AffineDimExpr>() &&
		e.dyn_cast<AffineDimExpr>().getPosition() == i;
		ThomasRaoux: This should be `cast<>` instead of `dyn_cast`, since you already know the type.
		});
		dimUseIndexer[i] = dimInResult ? pos++ : pos;
		}
		ThomasRaoux: Can you skip all this by doing `compressUnusedDims(map)`? This will remove all the unused dimensions; then you can just loop through all the results and do:
		    for (unsigned i = 0; i < map.getNumResults(); ++i) {
		      if (auto expr = map.getResult(i).dyn_cast<AffineDimExpr>())
		        maskTransposeIndices.push_back(expr.getPosition());
		    }
		nicolasvasilache: +1
		springerm (Author): Nice, I didn't know that there's actually a `compressUnusedDims` helper function.

		// Compute mask transpose indices. For each result dim, take corresponding
		// mask dim from `dimUseIndexer`. Note: Mask vectors have a dimension for
		// each result dim that is not a broadcast.
		SmallVector<int64_t> maskTransposeIndices;
		for (unsigned i = 0; i < map.getNumResults(); ++i) {
		if (auto expr = map.getResult(i).dyn_cast<AffineDimExpr>())
		maskTransposeIndices.push_back(dimUseIndexer[expr.getPosition()]);
		}

		newMask = rewriter.create<vector::TransposeOp>(op.getLoc(), op.mask(),
		maskTransposeIndices);
		}

VectorType newReadType =		VectorType newReadType =
VectorType::get(newVectorShape, op.getVectorType().getElementType());		VectorType::get(newVectorShape, op.getVectorType().getElementType());
Value newRead = rewriter.create<vector::TransferReadOp>(		vector::TransferReadOp newRead = rewriter.create<vector::TransferReadOp>(
op.getLoc(), newReadType, op.source(), op.indices(), newMap,		op.getLoc(), newReadType, op.source(), op.indices(), newMap,
op.padding(), op.in_bounds() ? *op.in_bounds() : ArrayAttr());		op.padding(), newMask, op.in_bounds() ? *op.in_bounds() : ArrayAttr());

SmallVector<int64_t> transposePerm(permutation.begin(), permutation.end());		SmallVector<int64_t> transposePerm(permutation.begin(), permutation.end());
rewriter.replaceOpWithNewOp<vector::TransposeOp>(op, newRead,		rewriter.replaceOpWithNewOp<vector::TransposeOp>(op, newRead,
transposePerm);		transposePerm);
return success();		return success();
}		}
};		};
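
The mask handling discussed in the inline comments above can be modelled without MLIR. The following is a minimal sketch, not the code from this revision: `MapResult` is a hypothetical stand-in for an AffineMap result (a dim position, or `std::nullopt` for a broadcast), and both function names are invented for illustration. The first function mirrors the `dimUseIndexer` loop; the second mimics the effect of compressing unused dims first and then reading the dim positions, as suggested in review.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one AffineMap result: the position of a dim
// (d0, d1, ...) or std::nullopt for a broadcast (the constant 0).
using MapResult = std::optional<unsigned>;

// Variant 1: mirrors the dimUseIndexer loop. Each dim that appears in the
// results gets the next counter value; unused dims just repeat the counter.
std::vector<int64_t>
maskTransposeViaIndexer(unsigned numDims,
                        const std::vector<MapResult> &results) {
  std::vector<unsigned> dimUseIndexer(numDims);
  unsigned pos = 0;
  for (unsigned i = 0; i < numDims; ++i) {
    bool used = false;
    for (const MapResult &r : results)
      used = used || (r && *r == i);
    dimUseIndexer[i] = used ? pos++ : pos;
  }
  // Broadcast results have no mask dimension, so they are skipped.
  std::vector<int64_t> indices;
  for (const MapResult &r : results)
    if (r)
      indices.push_back(dimUseIndexer[*r]);
  return indices;
}

// Variant 2: renumber the used dims densely first (the effect assumed here
// for a "compress unused dims" step), then read the positions directly.
std::vector<int64_t>
maskTransposeViaCompression(unsigned numDims,
                            const std::vector<MapResult> &results) {
  std::vector<bool> used(numDims, false);
  for (const MapResult &r : results)
    if (r)
      used[*r] = true;
  std::vector<unsigned> renumber(numDims, 0);
  unsigned next = 0;
  for (unsigned i = 0; i < numDims; ++i)
    if (used[i])
      renumber[i] = next++;
  std::vector<int64_t> indices;
  for (const MapResult &r : results)
    if (r)
      indices.push_back(renumber[*r]);
  return indices;
}
```

For the map from the code comment, (d0, d1, d2, d3, d4, d5) -> (d5, 0, d3, 0, d2), both variants yield the mask transpose indices [2, 1, 0]. The values stored for unused dims never reach the output, which is why the two approaches agree.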

/// Lower transfer_read op with broadcast in the leading dimensions into		/// Lower transfer_read op with broadcast in the leading dimensions into
/// transfer_read of lower rank + vector.broadcast.		/// transfer_read of lower rank + vector.broadcast.
/// Ex: vector.transfer_read ...		/// Ex: vector.transfer_read ...
/// permutation_map: (d0, d1, d2, d3) -> (0, d1, 0, d3)		/// permutation_map: (d0, d1, d2, d3) -> (0, d1, 0, d3)
/// into:		/// into:
/// %v = vector.transfer_read ...		/// %v = vector.transfer_read ...
/// permutation_map: (d0, d1, d2, d3) -> (d1, 0, d3)		/// permutation_map: (d0, d1, d2, d3) -> (d1, 0, d3)
/// vector.broadcast %v		/// vector.broadcast %v
struct TransferOpReduceRank : public OpRewritePattern<vector::TransferReadOp> {		struct TransferOpReduceRank : public OpRewritePattern<vector::TransferReadOp> {
using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferReadOp op,		LogicalResult matchAndRewrite(vector::TransferReadOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (op.mask())
return failure();
AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
unsigned numLeadingBroadcast = 0;		unsigned numLeadingBroadcast = 0;
for (auto expr : map.getResults()) {		for (auto expr : map.getResults()) {
auto dimExpr = expr.dyn_cast<AffineConstantExpr>();		auto dimExpr = expr.dyn_cast<AffineConstantExpr>();
if (!dimExpr || dimExpr.getValue() != 0)		if (!dimExpr || dimExpr.getValue() != 0)
break;		break;
numLeadingBroadcast++;		numLeadingBroadcast++;
}		}
[19 lines not shown]
VectorType newReadType =		VectorType newReadType =
VectorType::get(newShape, originalVecType.getElementType());		VectorType::get(newShape, originalVecType.getElementType());
ArrayAttr newInBounds =		ArrayAttr newInBounds =
op.in_bounds()		op.in_bounds()
? rewriter.getArrayAttr(		? rewriter.getArrayAttr(
op.in_boundsAttr().getValue().take_back(reducedShapeRank))		op.in_boundsAttr().getValue().take_back(reducedShapeRank))
: ArrayAttr();		: ArrayAttr();
Value newRead = rewriter.create<vector::TransferReadOp>(		Value newRead = rewriter.create<vector::TransferReadOp>(
op.getLoc(), newReadType, op.source(), op.indices(), newMap,		op.getLoc(), newReadType, op.source(), op.indices(), newMap,
op.padding(), newInBounds);		op.padding(), op.mask(), newInBounds);
		ThomasRaoux: Don't we need to remove the leading dimension of the mask so that it matches the new rank of the transfer read? We should also add a test for this case.
		springerm (Author): Since broadcast dims do not have a corresponding dimension in the mask vector, removing a broadcast does not require any changes to the mask vector.
		ThomasRaoux: Makes sense.
rewriter.replaceOpWithNewOp<vector::BroadcastOp>(op, originalVecType,		rewriter.replaceOpWithNewOp<vector::BroadcastOp>(op, originalVecType,
newRead);		newRead);
return success();		return success();
}		}
};		};
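
The point settled in the inline comments above, that dropping leading broadcasts leaves the mask untouched, can be checked on a small model. This is a sketch under assumptions: `MapResult` is a hypothetical stand-in for an AffineMap result (`std::nullopt` plays the role of the constant-0 broadcast expression), and the helper names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for one AffineMap result: a dim position, or
// std::nullopt for a broadcast (the constant 0).
using MapResult = std::optional<unsigned>;

// Counts broadcasts at the front of the result list, like the loop that
// computes numLeadingBroadcast.
std::size_t countLeadingBroadcasts(const std::vector<MapResult> &results) {
  std::size_t n = 0;
  for (const MapResult &r : results) {
    if (r)
      break;
    ++n;
  }
  return n;
}

// The mask has one dimension per non-broadcast result.
std::size_t maskRank(const std::vector<MapResult> &results) {
  std::size_t rank = 0;
  for (const MapResult &r : results)
    if (r)
      ++rank;
  return rank;
}

// Returns the results of the lower-rank read: leading broadcasts removed.
std::vector<MapResult>
dropLeadingBroadcasts(const std::vector<MapResult> &results) {
  return std::vector<MapResult>(
      results.begin() + countLeadingBroadcasts(results), results.end());
}
```

For the documented example (0, d1, 0, d3), one leading broadcast is dropped, giving (d1, 0, d3). The mask rank is 2 both before and after, so the same mask value can be passed to the new read.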

// Trims leading one dimensions from `oldType` and returns the result type.		// Trims leading one dimensions from `oldType` and returns the result type.
// Returns `vector<1xT>` if `oldType` only has one element.		// Returns `vector<1xT>` if `oldType` only has one element.

mlir/test/Dialect/Vector/vector-transfer-lowering.mlir

	// CHECK-LABEL: func @transfer_read_permutations			// CHECK-LABEL: func @transfer_read_permutations
	func @transfer_read_permutations(%arg0 : memref<?x?xf32>, %arg1 : memref<?x?x?x?xf32>)			func @transfer_read_permutations(%arg0 : memref<?x?xf32>, %arg1 : memref<?x?x?x?xf32>)
	-> (vector<7x14x8x16xf32>, vector<7x14x8x16xf32>, vector<7x14x8x16xf32>,			-> (vector<7x14x8x16xf32>, vector<7x14x8x16xf32>, vector<7x14x8x16xf32>,
	vector<7x14x8x16xf32>, vector<7x14x8x16xf32>, vector<7x14x8x16xf32>) {			vector<7x14x8x16xf32>, vector<7x14x8x16xf32>, vector<7x14x8x16xf32>) {
	// CHECK-DAG: %[[CF0:.*]] = constant 0.000000e+00 : f32			// CHECK-DAG: %[[CF0:.*]] = constant 0.000000e+00 : f32
	// CHECK-DAG: %[[C0:.*]] = constant 0 : index			// CHECK-DAG: %[[C0:.*]] = constant 0 : index
	%cst = constant 0.000000e+00 : f32			%cst = constant 0.000000e+00 : f32
	%c0 = constant 0 : index			%c0 = constant 0 : index
				%m = constant 1 : i1

	%0 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst {permutation_map = #map0} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>			%mask0 = splat %m : vector<7x14xi1>
	// CHECK: vector.transfer_read {{.*}} {permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<14x7x8x16xf32>			%0 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst, %mask0 {permutation_map = #map0} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
				// CHECK: %[[MASK0:.*]] = vector.transpose {{.*}} : vector<7x14xi1> to vector<14x7xi1>
				// CHECK: vector.transfer_read {{.*}} %[[MASK0]] {permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<14x7x8x16xf32>
	// CHECK: vector.transpose %{{.*}}, [1, 0, 2, 3] : vector<14x7x8x16xf32> to vector<7x14x8x16xf32>			// CHECK: vector.transpose %{{.*}}, [1, 0, 2, 3] : vector<14x7x8x16xf32> to vector<7x14x8x16xf32>

	%1 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst {permutation_map = #map1} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>			%mask1 = splat %m : vector<14x16xi1>
	// CHECK: vector.transfer_read {{.*}} {permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>			%1 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst, %mask1 {permutation_map = #map1} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
				// CHECK: %[[MASK1:.*]] = vector.transpose {{.*}} : vector<14x16xi1> to vector<16x14xi1>
				// CHECK: vector.transfer_read {{.*}} %[[MASK1]] {permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>
	// CHECK: vector.transpose %{{.*}}, [2, 1, 3, 0] : vector<16x14x7x8xf32> to vector<7x14x8x16xf32>			// CHECK: vector.transpose %{{.*}}, [2, 1, 3, 0] : vector<16x14x7x8xf32> to vector<7x14x8x16xf32>

	%2 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, false, true], permutation_map = #map2} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>			%mask2 = splat %m : vector<7x14xi1>
	// CHECK: vector.transfer_read {{.*}} {in_bounds = [true, false, true], permutation_map = #[[$MAP1]]} : memref<?x?x?x?xf32>, vector<14x16x7xf32>			%2 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst, %mask2 {in_bounds = [true, true, false, true], permutation_map = #map2} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
				// CHECK: %[[MASK2:.*]] = vector.transpose {{.*}} : vector<7x14xi1> to vector<14x7xi1>
				// CHECK: vector.transfer_read {{.*}} %[[MASK2]] {in_bounds = [true, false, true], permutation_map = #[[$MAP1]]} : memref<?x?x?x?xf32>, vector<14x16x7xf32>
	// CHECK: vector.broadcast %{{.*}} : vector<14x16x7xf32> to vector<8x14x16x7xf32>			// CHECK: vector.broadcast %{{.*}} : vector<14x16x7xf32> to vector<8x14x16x7xf32>
	// CHECK: vector.transpose %{{.*}}, [3, 1, 0, 2] : vector<8x14x16x7xf32> to vector<7x14x8x16xf32>			// CHECK: vector.transpose %{{.*}}, [3, 1, 0, 2] : vector<8x14x16x7xf32> to vector<7x14x8x16xf32>

	%3 = vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = #map3} : memref<?x?xf32>, vector<7x14x8x16xf32>			%3 = vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = #map3} : memref<?x?xf32>, vector<7x14x8x16xf32>
	// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] : memref<?x?xf32>, vector<14x7xf32>			// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] : memref<?x?xf32>, vector<14x7xf32>
	// CHECK: vector.broadcast %{{.*}} : vector<14x7xf32> to vector<8x16x14x7xf32>			// CHECK: vector.broadcast %{{.*}} : vector<14x7xf32> to vector<8x16x14x7xf32>
	// CHECK: vector.transpose %{{.*}}, [3, 2, 0, 1] : vector<8x16x14x7xf32> to vector<7x14x8x16xf32>			// CHECK: vector.transpose %{{.*}}, [3, 2, 0, 1] : vector<8x16x14x7xf32> to vector<7x14x8x16xf32>
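
The vector shapes in the CHECK lines above can be reproduced by hand from the shape loop in TransferReadPermutationLowering. The sketch below is a standalone model with invented helper names; it assumes that the result dimension i of vector.transpose takes source dimension perm[i], which is consistent with the shapes in the CHECK lines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shape of the new transfer_read: newShape[perm[i]] = originalShape[i],
// as in the loop over `permutation`.
std::vector<int64_t> readShape(const std::vector<int64_t> &originalShape,
                               const std::vector<unsigned> &permutation) {
  std::vector<int64_t> newShape(originalShape.size());
  for (std::size_t i = 0; i < permutation.size(); ++i)
    newShape[permutation[i]] = originalShape[i];
  return newShape;
}

// Shape after the transpose: result dim i takes source dim perm[i]
// (an assumption that matches the CHECK lines).
std::vector<int64_t> transposeShape(const std::vector<int64_t> &sourceShape,
                                    const std::vector<unsigned> &permutation) {
  std::vector<int64_t> result(sourceShape.size());
  for (std::size_t i = 0; i < permutation.size(); ++i)
    result[i] = sourceShape[permutation[i]];
  return result;
}
```

With the original shape 7x14x8x16 and the permutation [1, 0, 2, 3], the read has shape 14x7x8x16 and the transpose restores 7x14x8x16. With [2, 1, 3, 0], the read has shape 16x14x7x8, again restored by the transpose.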


This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Support masks in TransferOpReduceRank and TransferReadPermutationLowering
Closed, Public
