This is an archive of the discontinued LLVM Phabricator instance.

Differential D105293

Refactor GenericPadTensorOpVectorizationPattern
Closed, Public

Authored by cathyzhyi on Jul 1 2021, 9:35 AM.

Details

Reviewers

silvas
nicolasvasilache
springerm
aartbik

Commits

rG35df2f6fbd1a: Refactor GenericPadTensorOpVectorizationPattern

Summary

Refactor the original code to rewrite a PadTensorOp into a
sequence of InitTensorOp, FillOp and InsertSliceOp without
vectorization by default. GenericPadTensorOpVectorizationPattern
provides a customized OptimizeCopyFn to vectorize the
copying step.
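
The decomposition named in the summary can be illustrated outside MLIR with a plain C++ model. The sketch below is only an analogy on a 2-D array: `Tensor2D` and `padTensor` are hypothetical names, not MLIR APIs, and the three commented steps stand in for InitTensorOp, FillOp and InsertSliceOp respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a ranked 2-D tensor; this is not an MLIR type.
using Tensor2D = std::vector<std::vector<float>>;

// Models the three-step lowering on a 2-D tensor:
//   1. "InitTensorOp":  create a result of the padded shape,
//   2. "FillOp":        fill it with the padding value,
//   3. "InsertSliceOp": copy the source into the interior at the low offsets.
Tensor2D padTensor(const Tensor2D &source, std::size_t lowRow,
                   std::size_t lowCol, std::size_t highRow,
                   std::size_t highCol, float padValue) {
  const std::size_t rows = source.size();
  const std::size_t cols = rows == 0 ? 0 : source[0].size();

  // Steps 1 and 2: allocate the padded result and fill it.
  Tensor2D result(lowRow + rows + highRow,
                  std::vector<float>(lowCol + cols + highCol, padValue));

  // Step 3: insert the source as a slice starting at (lowRow, lowCol).
  for (std::size_t r = 0; r < rows; ++r) {
    assert(source[r].size() == cols && "source must be rectangular");
    for (std::size_t c = 0; c < cols; ++c)
      result[lowRow + r][lowCol + c] = source[r][c];
  }
  return result;
}
```

Calling `padTensor` on a 2x3 source with two low rows and two high columns of padding yields a 4x5 result whose first two rows hold only the padding value, which mirrors the inner two dimensions of the integration test in this revision.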

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

cathyzhyi created this revision. Jul 1 2021, 9:35 AM

Herald added subscribers: dcaballe, cota, mravishankar and 17 others. Jul 1 2021, 9:35 AM

cathyzhyi requested review of this revision. Jul 1 2021, 9:35 AM

Herald added a reviewer: nicolasvasilache. Jul 1 2021, 9:35 AM

Herald added a project: Restricted Project.

Herald added subscribers: limo1996, stephenneuendorffer, nicolasvasilache.

delete redundant comments

remove redundant type check

Can you add an integration test analogous to mlir/test/Integration/Dialect/Linalg/CPU/test-subtensor-insert.mlir ?

mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
223	nit: Prefer PadTensorOp::Adaptor
237	nit: use llvm::hasSingleElement
256	nit: Use `.` at the end of the comment. Same with the other comments. https://llvm.org/docs/CodingStandards.html#commenting
mlir/test/Dialect/Linalg/bufferize.mlir
270	can you give meaningful names to the CHECK values?

This revision now requires changes to proceed. Jul 1 2021, 10:14 AM

Harbormaster completed remote builds in B112019: Diff 355925. Jul 1 2021, 10:43 AM

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned, and then just let existing bufferization kick in?

In D105293#2854047, @nicolasvasilache wrote:

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned, and then just let existing bufferization kick in?

That makes sense to me!

address comments

@cathyzhyi I copyhacked your code into a tensor rewrite here: https://reviews.llvm.org/D105317

It seems fine modulo canonicalizations missing on memref::DimOp + tensor (which also interferes with tensor::DimOp that @springerm is looking at).

Feel free to take whatever makes sense to you and refactor / land.
Depending on where you go with this I'll adapt.

Will pick it up again tomorrow.

Thanks for pushing this!

In D105293#2854084, @silvas wrote:

In D105293#2854047, @nicolasvasilache wrote:

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned, and then just let existing bufferization kick in?

That makes sense to me!

+1 thanks for the suggestion!

Harbormaster completed remote builds in B112081: Diff 356010. Jul 1 2021, 2:34 PM

use rewrite pattern instead

@nicolasvasilache seems with the rewrite pattern version the integration test passed but there are some failures in the unit test. Will need to take a closer look tmr. Here is the unit test.

func @pad_tensor(%arg0: tensor<4x?x2x?xf32>, %arg1: index) -> tensor<4x?x?x?xf32> {
  %cst = constant 0.0 : f32
  %out = linalg.pad_tensor %arg0 low[0, 0, %arg1, 0] high[0, 0, 0, %arg1]  {
  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index, %gen_arg4: index):  // no predecessors
    linalg.yield %cst : f32
  } : tensor<4x?x2x?xf32> to tensor<4x?x?x?xf32>
  return %out : tensor<4x?x?x?xf32>
}

The functionality of this pattern is already provided by GenericPadTensorOpVectorizationPattern, which also translates a PadTensorOp into InitTensorOp + FillOp + InsertSliceOp. However, GenericPadTensorOpVectorizationPattern does a bit more: it tries to generate vectorized alternatives to FillOp and InsertSliceOp, and it generates the plain FillOp and InsertSliceOp only if there is not enough static type information.

See https://reviews.llvm.org/D103679 for details. (The pattern was extended in subsequent commits, so check on Github for the most recent version.)

Can you run the vectorization pass or is vectorization not desired in your use case? If you want just InitTensorOp + FillOp + InsertSliceOp and no vectorization, I think there should be a way to avoid duplicating that functionality.

This revision now requires changes to proceed. Jul 1 2021, 7:30 PM

Harbormaster completed remote builds in B112132: Diff 356084. Jul 1 2021, 7:48 PM

In D105293#2854620, @cathyzhyi wrote:
@nicolasvasilache seems with the rewrite pattern version the integration test passed but there are some failures in the unit test. Will need to take a closer look tmr. Here is the unit test.
func @pad_tensor(%arg0: tensor<4x?x2x?xf32>, %arg1: index) -> tensor<4x?x?x?xf32> {
  %cst = constant 0.0 : f32
  %out = linalg.pad_tensor %arg0 low[0, 0, %arg1, 0] high[0, 0, 0, %arg1]  {
  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index, %gen_arg4: index):  // no predecessors
    linalg.yield %cst : f32
  } : tensor<4x?x2x?xf32> to tensor<4x?x?x?xf32>
  return %out : tensor<4x?x?x?xf32>
}

It seems the version @springerm has for vectorization already fits the bill and would need to be split out from the vectorization part.
@springerm, indeed, if we get to bufferization and still have this form, we should bufferize without relying on vectorization.

After looking at this in more detail, the easiest way would be to add a boolean (template) parameter to GenericPadTensorOpVectorizationPattern, which enables or disables vectorization (and maybe rename the pattern). Then register the pattern in applyEnablingTransformations. However, then we would have a dependency of ComprehensiveBufferize.cpp on Vectorization.cpp. Not sure if that is a good idea. I don't see a better way of factoring out common functionality.

Alternatively, just duplicate the pattern (without vectorization) as this revision does at the moment. With the above code suggestion, we would be duplicating around 50 lines of code. Maybe not too bad.

@nicolasvasilache What do you think?

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
2490–2507 (On Diff #356084): Can be replaced with:
  Value paddingValue = op.getConstantPaddingValue();
  if (!paddingValue)
    return failure();

In D105293#2854818, @springerm wrote:

After looking at this in more detail, the easiest way would be to add a boolean (template) parameter to GenericPadTensorOpVectorizationPattern, which enables or disables vectorization (and maybe rename the pattern). Then register the pattern in applyEnablingTransformations. However, then we would have a dependency of ComprehensiveBufferize.cpp on Vectorization.cpp. Not sure if that is a good idea. I don't see a better way of factoring out common functionality.

Alternatively, just duplicate the pattern (without vectorization) as this revision does at the moment. With the above code suggestion, we would be duplicating around 50 lines of code. Maybe not too bad.

@nicolasvasilache What do you think?

Indeed, so what I have in my local branch is quite close:

- The FillOp is not a "tryVectorize" but just a "createFillOrGenerateOp"; it is always there, unconditionally.
- I just deleted the copy vectorization.

It seems this could be refactored and moved to Linalg/Transforms/Transforms.h and take a lambda for the copy.
By default it would take nothing (to just lower to other tensor ops).
The vectorization pattern would just reconfigure the pattern to add its own copy vectorization mechanism.
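
The design suggested here, a pattern that takes an optional copy hook and falls back to the plain lowering when none is given, can be sketched in standalone C++. This is a hypothetical model and not the code that landed: `GeneralizePadPattern` and the string-based op log are invented for illustration, and only the name `OptimizeCopyFn` is taken from the revision summary.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical copy hook: it may emit its own copy ops into `emitted` and
// returns true if it handled the copy, false to request the default path.
using OptimizeCopyFn = std::function<bool(std::vector<std::string> &emitted)>;

// Hypothetical model of a pattern configured with an optional copy hook.
struct GeneralizePadPattern {
  // By default no hook is installed, so the pattern lowers to tensor ops only.
  explicit GeneralizePadPattern(OptimizeCopyFn fn = nullptr)
      : optimizeCopyFn(std::move(fn)) {}

  // Returns the names of the ops the rewrite would create, in order.
  std::vector<std::string> rewrite() const {
    std::vector<std::string> emitted = {"init_tensor", "fill"};
    // Give the hook the first chance to produce the copy.
    if (optimizeCopyFn && optimizeCopyFn(emitted))
      return emitted;
    // Default copy: a plain slice insertion.
    emitted.push_back("insert_slice");
    return emitted;
  }

  OptimizeCopyFn optimizeCopyFn;
};
```

Under this model, a vectorization pattern would construct `GeneralizePadPattern` with a hook that emits its own copy and returns true, while a bufferization pipeline would use the default-constructed pattern.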

refactor and reuse existing code as suggested.

Herald added a reviewer: aartbik. Jul 2 2021, 11:18 AM

update comments that are no longer relevant.

@silvas @nicolasvasilache @springerm Any suggestion on which pass to put this pattern in? There is a LinalgGeneralizationPass which currently only applies to patterns lowering to Linalg.generic.

Harbormaster completed remote builds in B112236: Diff 356232. Jul 2 2021, 11:57 AM

In D105293#2855869, @cathyzhyi wrote:

@silvas @nicolasvasilache @springerm Any suggestion on which pass to put this pattern in? There is a LinalgGeneralizationPass which currently only applies to patterns lowering to Linalg.generic.

The pattern will be used by whichever passes need it, such as ComprehensiveBufferize or Bufferize. (You should be able to just add this pattern into Bufferize.cpp in your original patch and let dialect conversion just work.)

add the pattern into linalg bufferization pass

Oh, sorry, I thought you meant which pass would use it. It would be nice if it was in a generic helper file. You can probably put it next to PadTensorOpTransformationPattern for consistency.

Harbormaster completed remote builds in B112246: Diff 356246. Jul 2 2021, 12:55 PM

move the new pattern's source code next to PadTensorOpTransformationPattern for
consistency

LGTM. Wait for @springerm / @nicolasvasilache approval too.

nicolasvasilache accepted this revision. Jul 2 2021, 2:00 PM

Harbormaster completed remote builds in B112262: Diff 356269. Jul 2 2021, 2:17 PM

Nice refactoring!

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
689–690 (On Diff #356269): can be deleted

This revision is now accepted and ready to land. Jul 2 2021, 6:23 PM

delete redundant comments as suggested

cathyzhyi marked an inline comment as done. Jul 2 2021, 7:42 PM

Harbormaster completed remote builds in B112302: Diff 356314. Jul 2 2021, 8:08 PM

This has reached consensus. I suspect @cathyzhyi does not yet have write privileges on GitHub, so I am committing on her behalf.

Closed by commit rG35df2f6fbd1a: Refactor GenericPadTensorOpVectorizationPattern (authored by cathyzhyi, committed by nicolasvasilache). Jul 7 2021, 4:45 AM

This revision was automatically updated to reflect the committed changes.

nicolasvasilache added a commit: rG35df2f6fbd1a: Refactor GenericPadTensorOpVectorizationPattern.

cathyzhyi mentioned this in D105642: Mark TensorDialect legal and PadTensor op illegal. Jul 8 2021, 12:14 PM

silvas mentioned this in rG7c35aae35b2c: Mark TensorDialect legal and PadTensor op illegal. Jul 8 2021, 3:02 PM

Revision Contents

Path                                                            Size
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td                3 lines
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp                        10 lines
mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp                77 lines
mlir/test/Dialect/Linalg/bufferize.mlir                         27 lines
mlir/test/Integration/Dialect/Linalg/CPU/test-padtensor.mlir    32 lines

Diff 356010

mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td

(216 lines not shown) ... let extraClassDeclaration = [{

     RankedTensorType getSourceType() {
       return source().getType().cast<RankedTensorType>();
     }
     RankedTensorType getResultType() {
       return getResult().getType().cast<RankedTensorType>();
     }

+    // Infer the dynamic shape of the result tensor along each dim.
+    SmallVector<Value> getResultTypeShapes(OpBuilder &b);
+
     // Infer the shape of the result tensor given the static shapes
     // and element type of the result tensor.
     static RankedTensorType inferResultType(RankedTensorType sourceType,
                                 ArrayRef<int64_t> staticLow,
                                 ArrayRef<int64_t> staticHigh);

     // Return a PadTensorOp that pads `source` to `type` size where the static
     // sizes are assumed to be greater than the dynamic sizes. The op performs
(635 lines not shown)

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

(958 lines not shown) ... for (int i = 0; i < rank; ++i) {
     auto highValue = builder.createOrFold<SubIOp>(loc, resultDimSize, dimOp);
     high.push_back(highValue);
     low.push_back(builder.createOrFold<ConstantIndexOp>(loc, 0));
   }
   return PadTensorOp::createPadScalarOp(type, source, pad, low, high, loc,
                                         builder);
 }

-LogicalResult PadTensorOp::reifyReturnTypeShapesPerResultDim(
-    OpBuilder &b, SmallVectorImpl<SmallVector<Value>> &reifiedReturnShapes) {
+SmallVector<Value> PadTensorOp::getResultTypeShapes(OpBuilder &b) {
   Location loc = getLoc();
   auto lowPad = getMixedLowPad();
   auto highPad = getMixedHighPad();
   SmallVector<Value> shapes;
   for (auto dim : llvm::seq<int64_t>(0, getSourceType().getRank())) {
     // Shape along each dimension is source dim + low pad + high pad.
     SmallVector<Value> mapOperands;
     mapOperands.push_back(b.createOrFold<memref::DimOp>(loc, source(), dim));
(9 lines not shown) ... auto addOpFoldResult = [&](OpFoldResult valueOrAttr) {
           valueOrAttr.get<Attribute>().cast<IntegerAttr>().getInt();
       expr = expr + staticValue;
     };
     addOpFoldResult(lowPad[dim]);
     addOpFoldResult(highPad[dim]);
     shapes.push_back(applyMapToValues(
         b, loc, AffineMap::get(1, numSymbols, expr), mapOperands)[0]);
   }
-  reifiedReturnShapes.emplace_back(std::move(shapes));
+  return shapes;
+}
+
+LogicalResult PadTensorOp::reifyReturnTypeShapesPerResultDim(
+    OpBuilder &b, SmallVectorImpl<SmallVector<Value>> &reifiedReturnShapes) {
+  reifiedReturnShapes.emplace_back(getResultTypeShapes(b));
   return success();
 }

 namespace {
 // Folds linalg.pad_tensor when padding is static zeros.
 struct FoldStaticZeroPadding : public OpRewritePattern<PadTensorOp> {
   using OpRewritePattern<PadTensorOp>::OpRewritePattern;

(2,270 lines not shown)
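
The shape rule stated in the code comment of this hunk (result dim = source dim + low pad + high pad) can be checked with a small standalone function. This is a sketch on static integer shapes only; `paddedShape` is a hypothetical helper, whereas the real `getResultTypeShapes` builds affine maps over SSA values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: computes the padded shape for fully static inputs.
// Each result extent is the source extent plus the low and high padding.
std::vector<std::int64_t> paddedShape(const std::vector<std::int64_t> &source,
                                      const std::vector<std::int64_t> &low,
                                      const std::vector<std::int64_t> &high) {
  assert(source.size() == low.size() && source.size() == high.size() &&
         "one low and one high padding amount per dimension");
  std::vector<std::int64_t> result;
  result.reserve(source.size());
  for (std::size_t dim = 0; dim < source.size(); ++dim)
    result.push_back(source[dim] + low[dim] + high[dim]);
  return result;
}
```

For the integration test in this revision, the source is 1x2x3 at run time with low padding [0, 2, 0] and high padding [0, 0, 2], so the rule gives 1x4x5, which agrees with the `sizes = [1, 4, 5]` that the test checks.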

mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp

(188 lines not shown) ... matchAndRewrite(FillOp op, ArrayRef<Value> operands,

     rewriter.create<FillOp>(op.getLoc(), adaptor.value(), adaptor.output());
     rewriter.replaceOp(op, adaptor.output());

     return success();
   }
 };

+/// Returns the static/dynamic mixed sizes of the memref.
+static SmallVector<OpFoldResult> getMixedSizes(OpBuilder &b, Location loc,
+                                               Value memref) {
+  auto inputType = memref.getType().cast<ShapedType>();
+  auto inputShape = inputType.getShape();
+  SmallVector<OpFoldResult> sizeMixedValues;
+  for (int64_t i = 0; i < inputType.getRank(); ++i) {
+    if (inputShape[i] == ShapedType::kDynamicSize) {
+      Value dim = b.create<memref::DimOp>(loc, memref, i);
+      sizeMixedValues.push_back(dim);
+    } else {
+      sizeMixedValues.push_back(b.getI64IntegerAttr(inputShape[i]));
+    }
+  }
+  return sizeMixedValues;
+}
+
+/// Conversion pattern that bufferizes `linalg.pad_tensor` operation.
+class BufferizePadTensorOp : public OpConversionPattern<PadTensorOp> {
+public:
+  using OpConversionPattern<PadTensorOp>::OpConversionPattern;
+
+  LogicalResult
+  matchAndRewrite(PadTensorOp op, ArrayRef<Value> operands,
+                  ConversionPatternRewriter &rewriter) const final {
+    Location loc = op->getLoc();
+    PadTensorOp::Adaptor adaptor(operands, op->getAttrDictionary());
>> silvas (inline comment, marked done): nit: Prefer PadTensorOp::Adaptor
+    Value sourceMemRef = adaptor.source();
+    assert(sourceMemRef.getType().isa<MemRefType>());
+
+    auto sourceType = sourceMemRef.getType().cast<ShapedType>();
+    // Allocate the destination buffer
+    SmallVector<Value> shapes = op.getResultTypeShapes(rewriter);
+    SmallVector<int64_t> dynShapes(sourceType.getRank(), -1);
+    auto memrefType = MemRefType::get(dynShapes, sourceType.getElementType());
+    Value destMemRef =
+        rewriter.create<memref::AllocOp>(loc, memrefType, shapes);
+
+    // Get padding value and fill the destination buffer.
+    auto yieldOps = op.region().getOps<linalg::YieldOp>();
+    if (!llvm::hasSingleElement(yieldOps)) {
>> silvas (inline comment, marked done): nit: use llvm::hasSingleElement
+      return rewriter.notifyMatchFailure(op,
+                                         "linalg.pad_tensor with more than one "
+                                         "padding value is not supported");
+    }
+    Value paddingValue = (*yieldOps.begin()).values()[0];
+    auto constOp = paddingValue.getDefiningOp<ConstantOp>();
+    if (!constOp) {
+      return rewriter.notifyMatchFailure(
+          op,
+          "linalg.pad_tensor with non-constant padding value is not supported");
+    }
+    if (constOp.getValue().isa<DenseElementsAttr>()) {
+      return rewriter.notifyMatchFailure(
+          op, "linalg.pad_tensor with non-scalar constant padding value is not "
+              "supported");
+    }
+    rewriter.create<linalg::FillOp>(loc, paddingValue, destMemRef);
+
+    // Get the interior region.
>> silvas (inline comment, marked done): nit: Use `.` at the end of the comment. Same with the other comments. https://llvm.org/docs/CodingStandards.html#commenting
+    SmallVector<OpFoldResult> sizes =
+        getMixedSizes(rewriter, loc, sourceMemRef);
+    SmallVector<OpFoldResult> strides(sourceType.getRank(),
+                                      rewriter.getI64IntegerAttr(1));
+    auto resultSubView = rewriter.create<memref::SubViewOp>(
+        loc, destMemRef, op.getMixedLowPad(), sizes, strides);
+    // Copy input into the interior region.
+    rewriter.create<linalg::CopyOp>(loc, sourceMemRef, resultSubView);
+    auto newResultType = getTypeConverter()->convertType(op.getType());
+    rewriter.replaceOpWithNewOp<memref::CastOp>(op, newResultType, destMemRef);
+    return success();
+  }
+};
+
 /// Generic conversion pattern that matches any LinalgOp. This avoids template
 /// instantiating one pattern for each LinalgOp.
 class BufferizeAnyLinalgOp : public OpInterfaceConversionPattern<LinalgOp> {
 public:
   using OpInterfaceConversionPattern<LinalgOp>::OpInterfaceConversionPattern;

   LogicalResult
   matchAndRewrite(LinalgOp op, ArrayRef<Value> operands,
(149 lines not shown) ... void mlir::linalg::populateLinalgBufferizePatterns(
   // clang-format off
   patterns.add<
       BufferizeAnyLinalgOp,
       BufferizeFillOp,
       BufferizeInitTensorOp,
       BufferizeTensorReshapeOp<TensorExpandShapeOp>,
       BufferizeTensorReshapeOp<TensorCollapseShapeOp>,
       ExtractSliceOpConverter,
-      InsertSliceOpConverter
+      InsertSliceOpConverter,
+      BufferizePadTensorOp
     >(typeConverter, patterns.getContext());
   // clang-format on
 }
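
The idea behind the `getMixedSizes` helper in this hunk, returning each extent either as a compile-time constant or as a value to be queried at run time, can be modeled without MLIR using `std::variant`. This is a hypothetical C++17 sketch: `RuntimeDim`, `MixedSize` and the `-1` sentinel are assumptions chosen to mirror `OpFoldResult`, `memref::DimOp` and the `-1` used for `dynShapes` in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

// Assumed sentinel for a dynamic extent, mirroring the -1 used in the diff.
constexpr std::int64_t kDynamicSize = -1;

// Hypothetical placeholder for "query dimension `index` at run time".
struct RuntimeDim {
  std::size_t index;
};

// Either a static extent or a deferred run-time query.
using MixedSize = std::variant<std::int64_t, RuntimeDim>;

// For each dimension, keep the static extent if it is known and otherwise
// record that the extent must be read from the buffer at run time.
std::vector<MixedSize> getMixedSizes(const std::vector<std::int64_t> &shape) {
  std::vector<MixedSize> sizes;
  sizes.reserve(shape.size());
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (shape[i] == kDynamicSize) {
      sizes.push_back(RuntimeDim{i});
    } else {
      assert(shape[i] >= 0 && "static extents must be non-negative");
      sizes.push_back(shape[i]);
    }
  }
  return sizes;
}
```

For a shape like 4x?x2x?, this model yields static entries for dimensions 0 and 2 and run-time queries for dimensions 1 and 3, which is the mix that the subview sizes in the unit test show.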

mlir/test/Dialect/Linalg/bufferize.mlir

(259 lines not shown) ... %out = linalg.tensor_collapse_shape %arg0 [[0, 1]] :
       tensor<4x5xf32> into tensor<20xf32>
   return %out : tensor<20xf32>
 }
 // CHECK: %[[MEMREF:.*]] = memref.buffer_cast %[[IN]] : memref<4x5xf32>
 // CHECK: %[[RESHAPE:.*]] = linalg.collapse_shape %[[MEMREF]] {{\[}}[0, 1]]
 // CHECK-SAME: : memref<4x5xf32> into memref<20xf32>
 // CHECK: %[[TENSOR:.*]] = memref.tensor_load %[[RESHAPE]] : memref<20xf32>
 // CHECK: return %[[TENSOR]]
+
+// CHECK-LABEL: func @bufferize_pad_tensor(
+// CHECK-SAME: %[[IN:.*]]: tensor<4x?x2x?xf32>,
>> silvas (inline comment, not marked done): can you give meaningful names to the CHECK values?
+// CHECK-SAME: %[[PAD_DYNMIC:.*]]: index) -> tensor<4x?x?x?xf32> {
+// CHECK: %[[C3:.*]] = constant 3 : index
+// CHECK: %[[C1:.*]] = constant 1 : index
+// CHECK: %[[C0_FLOAT:.*]] = constant 0.000000e+00 : f32
+// CHECK: %[[IN_MEMREF:.*]] = memref.buffer_cast %[[IN]] : memref<4x?x2x?xf32>
+// CHECK: %[[DIM1:.*]] = memref.dim %[[IN]], %[[C1]] : tensor<4x?x2x?xf32>
+// CHECK: %[[OUT_DIM2:.*]] = affine.apply #map0(){{\[}}%[[PAD_DYNMIC]]]
+// CHECK: %[[DIM3:.*]] = memref.dim %[[IN]], %[[C3]] : tensor<4x?x2x?xf32>
+// CHECK: %[[OUT_DIM3:.*]] = affine.apply #map1(){{\[}}%[[PAD_DYNMIC]], %[[DIM3]]]
+// CHECK: %[[OUT_MEMREF:.*]] = memref.alloc(%[[DIM1]], %[[OUT_DIM2]], %[[OUT_DIM3]]) : memref<4x?x?x?xf32>
+// CHECK: linalg.fill(%[[C0_FLOAT]], %[[OUT_MEMREF]]) : f32, memref<4x?x?x?xf32>
+// CHECK: %[[OUT_INTERIOR:.*]] = memref.subview %[[OUT_MEMREF]][0, 0, %[[PAD_DYNMIC]], 0] [4, %[[DIM1]], 2, %[[DIM3]]] [1, 1, 1, 1] : memref<4x?x?x?xf32> to memref<4x?x2x?xf32, #map2>
+// CHECK: linalg.copy(%[[IN_MEMREF]], %[[OUT_INTERIOR]]) : memref<4x?x2x?xf32>, memref<4x?x2x?xf32, #map2>
+// CHECK: %[[OUT:.*]] = memref.tensor_load %[[OUT_MEMREF]] : memref<4x?x?x?xf32>
+// CHECK: return %[[OUT]] : tensor<4x?x?x?xf32>
+func @bufferize_pad_tensor(%arg0: tensor<4x?x2x?xf32>, %arg1: index) -> tensor<4x?x?x?xf32> {
+  %c0 = constant 0 : index
+  %cst = constant 0.0 : f32
+  %out = linalg.pad_tensor %arg0 low[%c0, %c0, %arg1, %c0] high[%c0, %c0, %c0, %arg1] {
+  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index, %gen_arg4: index): // no predecessors
+    linalg.yield %cst : f32
+  } : tensor<4x?x2x?xf32> to tensor<4x?x?x?xf32>
+  return %out : tensor<4x?x?x?xf32>
+}

mlir/test/Integration/Dialect/Linalg/CPU/test-padtensor.mlir

This file was added.

// RUN: mlir-opt %s -linalg-bufferize -std-bufferize \
// RUN: -tensor-constant-bufferize -tensor-bufferize -func-bufferize \
// RUN: -finalizing-bufferize \
// RUN: -convert-linalg-to-loops -convert-scf-to-std -convert-linalg-to-llvm -convert-std-to-llvm | \
// RUN: mlir-cpu-runner -e main -entry-point-result=void \
// RUN: -shared-libs=%mlir_integration_test_dir/libmlir_runner_utils%shlibext \
// RUN: | FileCheck %s

func @main() {
  %const = constant dense<[[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]]> : tensor<1x2x3xf32>
  %dynamic = tensor.cast %const: tensor<1x2x3xf32> to tensor<1x?x3xf32>
  %offset = constant 2 : index
  %cst = constant 2.3 : f32
  %c0 = constant 0 : index
  %out = linalg.pad_tensor %dynamic low[%c0, %offset, %c0] high[%c0, %c0, %offset] {
  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index): // no predecessors
    linalg.yield %cst : f32
  } : tensor<1x?x3xf32> to tensor<1x?x?xf32>
  %unranked = tensor.cast %out: tensor<1x?x?xf32> to tensor<*xf32>
  call @print_memref_f32(%unranked) : (tensor<*xf32>) -> ()

  // CHECK: Unranked Memref base@ = {{0x[-9a-f]*}}
  // CHECK-SAME: rank = 3 offset = 0 sizes = [1, 4, 5] strides = [20, 5, 1] data =
  // CHECK-NEXT{LITERAL}: [[[2.3, 2.3, 2.3, 2.3, 2.3],
  // CHECK-NEXT: [2.3, 2.3, 2.3, 2.3, 2.3],
  // CHECK-NEXT: [1, 2, 3, 2.3, 2.3],
  // CHECK-NEXT: [2, 3, 4, 2.3, 2.3]]]

  return
}

func private @print_memref_f32(%ptr : tensor<*xf32>)
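
The `strides = [20, 5, 1]` that the test expects for `sizes = [1, 4, 5]` are what a contiguous row-major layout produces. The standalone sketch below shows that arithmetic; `rowMajorStrides` is a hypothetical helper written for this illustration and is not part of MLIR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: strides (in elements) of a contiguous row-major buffer.
// The innermost stride is 1; each outer stride is the next inner stride
// multiplied by the next inner size.
std::vector<std::int64_t>
rowMajorStrides(const std::vector<std::int64_t> &sizes) {
  std::vector<std::int64_t> strides(sizes.size(), 1);
  // Walk from the innermost dimension outwards.
  for (std::size_t i = sizes.size(); i-- > 1;)
    strides[i - 1] = strides[i] * sizes[i];
  return strides;
}
```

With sizes [1, 4, 5], the innermost stride is 1, the middle stride is 1 * 5 = 5, and the outermost stride is 5 * 4 = 20, matching the CHECK-SAME line of the test.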
