Canonicalizes the pattern

```mlir
%0 = tensor.insert %scalar into %t1[...] : (scalar tensor type)
%1 = tensor.insert_slice %0 into %t2[<indices>]
```

into

```mlir
%1 = tensor.insert %scalar into %t2[<indices>]
```
This has a side effect on bufferization: prior to this canonicalization, the
IR below would result in two allocations (even with empty tensor elimination),
whereas afterwards it results in just two memref.store ops:
```mlir
func.func @func(%arg0 : f32, %arg1 : f32, %arg2 : tensor<4xf32>) -> (tensor<4xf32>) {
  %c0 = arith.constant 0 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %e1 = tensor.empty() : tensor<1xf32>
  %e2 = tensor.empty() : tensor<f32>
  %0 = tensor.insert %arg0 into %e1[%c0] : tensor<1xf32>
  %1 = tensor.insert %arg1 into %e2[] : tensor<f32>
  %2 = tensor.insert_slice %0 into %arg2[%c2][1][1] : tensor<1xf32> into tensor<4xf32>
  %3 = tensor.insert_slice %1 into %2[%c3][1][1] : tensor<f32> into tensor<4xf32>
  return %3 : tensor<4xf32>
}
```
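For reference, a sketch of what this example might look like after the pattern fires (illustrative only; SSA value names are made up, and this assumes both `tensor.insert_slice` ops fold into direct `tensor.insert` ops on the destination, so bufferization lowers each to a single `memref.store`):

```mlir
func.func @func(%arg0 : f32, %arg1 : f32, %arg2 : tensor<4xf32>) -> (tensor<4xf32>) {
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  // The tensor.empty / tensor.insert pairs are gone; the scalars are
  // inserted straight into the destination at the slice offsets.
  %0 = tensor.insert %arg0 into %arg2[%c2] : tensor<4xf32>
  %1 = tensor.insert %arg1 into %0[%c3] : tensor<4xf32>
  return %1 : tensor<4xf32>
}
```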
Seems to me you either need to check that the tensor.insert indices are all 0, or you need to combine them together, no?
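To illustrate the index concern with a hypothetical example (not from the patch): when the `tensor.insert` index is nonzero, the slice offset alone is not the right destination index, so the fold must add the two together:

```mlir
// Hypothetical IR, for illustration only.
%e = tensor.empty() : tensor<2xf32>
%0 = tensor.insert %scalar into %e[%c1] : tensor<2xf32>
%1 = tensor.insert_slice %0 into %t[%c4][2][1] : tensor<2xf32> into tensor<8xf32>
// The scalar lands at index %c4 + %c1 = 5 of %t, so folding this to
// `tensor.insert %scalar into %t[%c4]` would be incorrect; the slice
// offset and the insert index must be combined.
```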
Additionally, have you looked at mlir/lib/Dialect/Tensor/Transforms/FoldTensorSubsetOps.cpp?
It seems you could add a new pattern there?
Note that those patterns are not canonicalizations because they are not always desirable, so we made them opt-in.