This is an archive of the discontinued LLVM Phabricator instance.

mehdi_amini added inline comments.

mlir/test/Dialect/Linalg/bufferize.mlir
49	Should there be something like `CHECK-NOT: memref.alloc` here? If I understand correctly you're avoiding generating a second alloc? (assuming this test case had two generated?) Otherwise the test does not really demonstrate what I understand from the description.

Added CHECK-NOT as suggested by the reviewer.

wenzhicui added inline comments. Sep 7 2021, 10:45 PM

mlir/test/Dialect/Linalg/bufferize.mlir
49	Thanks for the suggestion. Added the CHECK. That was the intention of this patch.

mehdi_amini added inline comments. Sep 7 2021, 11:00 PM

mlir/test/Dialect/Linalg/bufferize.mlir
49	I don't think it is satisfying actually: the test seems to be passing without the code change. Can you look into this?

mehdi_amini added inline comments. Sep 7 2021, 11:01 PM

mlir/test/Dialect/Linalg/bufferize.mlir

I see this IR in the test output before the code change:

func @init_tensor(%arg0: tensor<?xf32>, %arg1: index) -> tensor<?xf32> {
  %0 = memref.buffer_cast %arg0 : memref<?xf32>
  %1 = memref.alloc(%arg1) : memref<?xf32>
  linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel"]} ins(%0 : memref<?xf32>) outs(%1 : memref<?xf32>) {
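The observation in this comment can be illustrated with a simplified, line-based model of the check sequence in the test. This is only a sketch: `findLine` and `passesAllocCheck` are hypothetical helpers, and plain substring search stands in for FileCheck's real matching (CHECK-DAG ordering and variable captures are not modeled). Under that assumption, output with a single `memref.alloc` before `linalg.generic` satisfies `CHECK-NOT: memref.alloc` whether or not the patch is applied, so the test cannot tell the two apart.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Index of the first line at or after `from` that contains `needle`,
// or lines.size() if no such line exists.
std::size_t findLine(const std::vector<std::string> &lines,
                     const std::string &needle, std::size_t from) {
  for (std::size_t i = from; i < lines.size(); ++i)
    if (lines[i].find(needle) != std::string::npos)
      return i;
  return lines.size();
}

// Simplified model (an assumption, not FileCheck's algorithm) of:
//   CHECK:     memref.alloc
//   CHECK-NOT: memref.alloc
//   CHECK:     linalg.generic
bool passesAllocCheck(const std::vector<std::string> &lines) {
  std::size_t alloc = findLine(lines, "memref.alloc", 0);
  if (alloc == lines.size())
    return false; // first CHECK did not match
  std::size_t generic = findLine(lines, "linalg.generic", alloc + 1);
  if (generic == lines.size())
    return false; // last CHECK did not match
  // CHECK-NOT: no further memref.alloc may appear between the two matches.
  return findLine(lines, "memref.alloc", alloc + 1) >= generic;
}
```

In this model the check fails only when a second `memref.alloc` is still present between the first one and `linalg.generic`, which is not the case in the quoted output.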

Harbormaster completed remote builds in B122996: Diff 371256. Sep 7 2021, 11:20 PM

wenzhicui added inline comments. Sep 8 2021, 1:44 AM

mlir/test/Dialect/Linalg/bufferize.mlir
49	I looked into this test and I realized that the CSE/Canonicalize pass is removing the extra memref.alloc ops because those ops are not used by any other ops. Do you think this patch is still needed since we can always have a CSE/Canonicalize pass to remove the extra alloc ops?
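To illustrate the cleanup described in this comment, here is a toy model of erasing allocations that have no users. It is not MLIR's canonicalizer: `Op`, `hasUse`, `eraseUnusedAllocs`, `countOps`, and `exampleBeforeCleanup` are hypothetical, and the op list (including the value name `%init`) is an assumed shape for the IR before cleanup, not output captured from the pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy operation: a name, the value it defines, and the values it reads.
struct Op {
  std::string name;
  std::string result; // empty if the op defines no value
  std::vector<std::string> operands;
};

// Returns true if any op in `ops` reads `value`.
bool hasUse(const std::vector<Op> &ops, const std::string &value) {
  for (const Op &op : ops)
    if (std::find(op.operands.begin(), op.operands.end(), value) !=
        op.operands.end())
      return true;
  return false;
}

// Keeps every op except "memref.alloc" ops whose result is never read.
std::vector<Op> eraseUnusedAllocs(const std::vector<Op> &ops) {
  std::vector<Op> kept;
  for (const Op &op : ops)
    if (op.name != "memref.alloc" || hasUse(ops, op.result))
      kept.push_back(op);
  return kept;
}

// Counts the ops with the given name.
std::size_t countOps(const std::vector<Op> &ops, const std::string &name) {
  std::size_t n = 0;
  for (const Op &op : ops)
    if (op.name == name)
      ++n;
  return n;
}

// Assumed pre-cleanup IR: the init_tensor buffer is allocated but unused,
// and a second, fresh buffer is the one actually written by linalg.generic.
std::vector<Op> exampleBeforeCleanup() {
  return {
      {"memref.buffer_cast", "%0", {"%arg0"}},
      {"memref.alloc", "%init", {"%arg1"}},
      {"memref.alloc", "%1", {"%arg1"}},
      {"linalg.generic", "", {"%0", "%1"}},
  };
}
```

In this sketch the unused `%init` allocation disappears and only the buffer written by `linalg.generic` remains, which matches the single-alloc output quoted earlier in the thread.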

This should be reviewed by @herhut @pifon2a and DFKI friends

nicolasvasilache edited reviewers, added: herhut, pifon2a; removed: nicolasvasilache. Sep 8 2021, 4:20 AM

Herald added a reviewer: nicolasvasilache. Sep 8 2021, 4:20 AM

I believe this change is not correct in general.

The existing alloc is the one for the initial value (e.g. init_tensor). However, there is no guarantee that this buffer is passed to only one linalg.generic operation; it might be reused. Hence, we have to create a copy if the value is required (see a couple of lines below the change). Otherwise, we allocate a fresh result buffer and use only the shape of the provided buffer.

If it then later turns out that the passed buffer indeed is not used anywhere else, we clean this up.
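The aliasing concern raised here can be sketched with plain vectors standing in for memrefs. Everything in the block is hypothetical (the `Buffer` alias, `addOne`, `timesTwo`, `holdsInPlusOne`, and both scenario functions); it models only the sharing problem, not the actual logic in Bufferize.cpp. The first scenario mimics the abandoned patch by letting two ops write into the same init buffer, and the second mimics the existing behavior of giving each op a fresh buffer that borrows only the shape.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a memref: a flat vector of floats (not an MLIR type).
using Buffer = std::vector<float>;

// Hypothetical elementwise op: out[i] = in[i] + 1.
void addOne(const Buffer &in, Buffer &out) {
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = in[i] + 1.0f;
}

// Hypothetical elementwise op: out[i] = in[i] * 2.
void timesTwo(const Buffer &in, Buffer &out) {
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = in[i] * 2.0f;
}

// Returns true if `result` still holds in[i] + 1 for every element.
bool holdsInPlusOne(const Buffer &in, const Buffer &result) {
  for (std::size_t i = 0; i < in.size(); ++i)
    if (result[i] != in[i] + 1.0f)
      return false;
  return true;
}

// Models the abandoned patch: both ops write straight into the init buffer,
// so the second op overwrites the first op's result.
bool firstResultSurvivesWhenInitIsReused(const Buffer &in) {
  Buffer init(in.size()); // models the alloc created for init_tensor
  addOne(in, init);       // result of op A lives in init
  timesTwo(in, init);     // op B reuses init and clobbers op A's result
  return holdsInPlusOne(in, init);
}

// Models the existing behavior: each op gets a fresh buffer and the init
// buffer contributes only its shape.
bool firstResultSurvivesWithFreshBuffers(const Buffer &in) {
  Buffer init(in.size());
  Buffer resultA(init.size());
  addOne(in, resultA);
  Buffer resultB(init.size());
  timesTwo(in, resultB);
  return holdsInPlusOne(in, resultA);
}
```

The extra allocation in the second scenario is the price of correctness when the init buffer has more than one consumer; when it has only one, the unused buffer can be removed afterwards, as the next comment in the thread notes.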

This revision now requires changes to proceed. Sep 8 2021, 7:31 AM

Stephan/Mehdi, thanks for the review. Will abandon this change since it is incorrect.

Revision Contents

Path	Size
mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp	4 lines
mlir/test/Dialect/Linalg/bufferize.mlir	5 lines

Diff 371256

mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp

 for (auto en : llvm::enumerate(linalgOp->getResultTypes())) {
   ...
   if (tensorType == nullptr) {
     linalgOp.emitOpError()
         << "tensor to buffer conversion expects ranked tensor results";
     return failure();
   }
   auto tensorShape = tensorType.getShape();
   auto memrefType = MemRefType::get(tensorShape, tensorType.getElementType());
   Value resultTensor = outputs[resultIndex];
+  if (isa<memref::AllocOp>(resultTensor.getDefiningOp())) {
+    resultBuffers.push_back(resultTensor);
+    continue;
+  }

   // Clone output buffers whose value is actually used.
   OpOperand *tiedOpOperand = linalgOp.getOutputOperand(resultIndex);
   if (linalgOp.payloadUsesValueFromOperand(tiedOpOperand)) {
     resultBuffers.push_back(cloneMemref(loc, resultTensor, b));
     continue;
   }


mlir/test/Dialect/Linalg/bufferize.mlir


 #map0 = affine_map<(d0) -> (d0)>

 // Same as above but with linalg.init_tensor op.

 // CHECK: #map = affine_map<(d0) -> (d0)>
 // CHECK-LABEL: func @init_tensor(
 // CHECK-SAME: %[[IN:.*]]: tensor<?xf32>, %[[SIZE:.*]]: index)
-// CHECK: %[[MEMREF:.*]] = memref.buffer_cast %[[IN]] : memref<?xf32>
-// CHECK: %[[OUT_BUF:.*]] = memref.alloc(%[[SIZE]]) : memref<?xf32>
+// CHECK-DAG: %[[MEMREF:.*]] = memref.buffer_cast %[[IN]] : memref<?xf32>
+// CHECK-DAG: %[[OUT_BUF:.*]] = memref.alloc(%[[SIZE]]) : memref<?xf32>
+// CHECK-NOT: memref.alloc
 // CHECK: linalg.generic
 // CHECK-SAME: ins(%[[MEMREF]] : memref<?xf32>)
 // CHECK-SAME: outs(%[[OUT_BUF]] : memref<?xf32>) {
 func @init_tensor(%in : tensor<?xf32>, %size: index) -> tensor<?xf32> {
   %init = linalg.init_tensor [%size] : tensor<?xf32>
   %0 = linalg.generic {
     indexing_maps = [#map0, #map0],
     iterator_types = ["parallel"]


[MLIR] Avoid adding extra memref::AllocOp in bufferize pass.
Status: Abandoned (Public)
