This revision adds support for bufferization by using a mix of tensor_load, subview, linalg.copy and tensor_to_memref.
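For reference, here is a minimal sketch (not code from the patch; "some.op", the shapes, and the SSA names are placeholders) of the kind of rewrite this style of bufferization produces:

```mlir
// Before: a tensor-level computation ("some.op" stands in for an op
// on tensors).
%0 = "some.op"(%t) : (tensor<4xf32>) -> tensor<4xf32>

// After: the operand is bridged into the memref world, a destination
// buffer is allocated (with linalg.copy initializing it when an
// existing value must be preserved), and the result is bridged back
// to a tensor.
%m = tensor_to_memref %t : memref<4xf32>
%alloc = alloc() : memref<4xf32>
linalg.copy(%m, %alloc) : memref<4xf32>, memref<4xf32>
"some.op"(%alloc) : (memref<4xf32>) -> ()
%0 = tensor_load %alloc : memref<4xf32>
```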
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp

- Line 242: This shouldn't be needed; see below.
- Line 250: This shouldn't be needed; see below.
- Line 279: The conversion framework guarantees that operands all have legal types (which would be memref here), so you shouldn't need this if condition and can proceed assuming that sourceMemref is a memref.
- Line 283: You can use getTypeConverter()->convertType(op.getType()).
- Line 296: You should not be creating TensorLoad / TensorToMemref ops yourself; the dialect conversion framework does that automatically. You can replace this line with rewriter.replaceOp(op, alloc), and tensor_load will be inserted automatically if needed (see the sketch after this list).
- Line 331: Same comment; this isn't needed. The conversion is guaranteed by the framework.
- Line 384: Not needed.
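To illustrate the comment on line 296, a hypothetical before/after ("test.produce" and "test.use" are placeholder ops, not from the patch). When a pattern replaces a tensor-producing op with a memref value via rewriter.replaceOp(op, alloc), the conversion framework materializes the bridging tensor_load by itself for any not-yet-converted uses:

```mlir
// Input: a tensor producer whose user has not been converted yet.
%0 = "test.produce"() : () -> tensor<4xf32>
"test.use"(%0) : (tensor<4xf32>) -> ()

// After rewriter.replaceOp(op, alloc): the pattern only supplies the
// memref replacement value; the framework inserts the tensor_load.
%alloc = alloc() : memref<4xf32>
%0 = tensor_load %alloc : memref<4xf32>
"test.use"(%0) : (tensor<4xf32>) -> ()
```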
mlir/test/Dialect/Linalg/bufferize.mlir

- Line 232: This is not correct, sorry :( Consider %t = std.get_global_memref @some_constant_memref. Short of a non-peephole analysis, there is no way to prove that a memref is writable and not aliased in a way that would invalidate this optimization (a sketch of the hazard follows below). Let's land this without attempting any tricks, and then do a "whiteboarding" session to look at ways to get the desired optimization here (I can think of many ways, but I think we will need to stare at the "inefficient but correct" versions to really make a decision on the tradeoffs).
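A minimal sketch of the hazard (the global name, function, and shapes are hypothetical): the source of a tensor_load can be a read-only global, so reusing that memref as a write destination instead of allocating and copying would be unsound:

```mlir
// Hypothetical read-only global.
global_memref "private" constant @some_constant_memref : memref<4xf32> = dense<1.0>

func @f() -> tensor<4xf32> {
  %m = get_global_memref @some_constant_memref : memref<4xf32>
  %t = tensor_load %m : memref<4xf32>
  // A bufferization that elides the alloc + copy and writes a later
  // result directly into %m would clobber the constant global.
  return %t : tensor<4xf32>
}
```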
Thanks @silvas
FYI @ulysseB @albertcohen
mlir/test/Dialect/Linalg/bufferize.mlir

- Line 232: Actually, note that the TensorToMemRefOp documentation says: "Create a memref from a tensor. This is equivalent to allocating a new memref of the appropriate (possibly dynamic) shape, and then copying the elements (as if by a tensor_store op) into the newly allocated memref." I dropped all the copies that are unnecessary according to these semantics (spelled out in the sketch below).
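Spelled out as a sketch (with a static shape for brevity), those documented semantics equate the op with a fresh allocation plus a tensor_store:

```mlir
// The op itself...
%m = tensor_to_memref %t : memref<4xf32>

// ...is specified to behave as if it were a new allocation that the
// tensor's elements are copied into.
%a = alloc() : memref<4xf32>
tensor_store %t, %a : memref<4xf32>
```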
mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp

- Line 244: nit: remove tensor_to_memref here, as it no longer reflects the code in the pattern body.
mlir/test/Dialect/Linalg/bufferize.mlir

- Line 232: Ugh, I wrote that line, and somehow I didn't piece together the ramifications of these semantics. In practice, the tensor_to_memref(tensor_load(memref)) -> memref folding pattern gets applied in many places without considering these semantics, which is a huge latent bug. This is partly due to how the dialect conversion framework works: it elides these pairs "on the fly" without considering the aliasing implications of removing the copies. I'll need to think more about how to fix this. Thanks for reminding me of this. Example:

```mlir
func @f(%arg0: memref<?xf32>) {
  %t0 = tensor_load %arg0 : memref<?xf32>
  %0 = tensor_to_memref %t0 : memref<?xf32>
  %t1 = tensor_load %arg0 : memref<?xf32>
  %1 = tensor_to_memref %t1 : memref<?xf32>
  // In the current code, %0 and %1 will alias each other (and %arg0).
  // Under the corrected semantics, %0 and %1 must not alias each
  // other unless it is provable that the aliasing doesn't matter. :/
  return
}
```
mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp

- Line 244: Addressed: squashed into the commit but forgot to update phab.