This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Linalg: ensure tile-and-pad always creates padding as requested
Closed, Public

Authored by ftynse on Sep 24 2021, 9:26 AM.

Details

Summary

Initially, the padding transformation and the related operation were only used
to guarantee static shapes of subtensors in tiled operations. The
transformation would not insert the padding operation if the shapes were
already static, and the overall code generation would actively remove such
"noop" pads. However, this transformation can also be used to pack data into
smaller tensors and marshal them into faster memory, regardless of size
mismatches. In the context of expert-driven transformations, we should assume
that, if padding is requested, a potentially padded tensor must always be
created. Update the transformation accordingly. To do this, introduce an
optional packing attribute to the pad_tensor op that serves as an indication
that the padding is an intentional choice (as opposed to a side effect of type
normalization) and should be left alone by cleanups.
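
To make this concrete, here is a minimal sketch (hypothetical function name and tile sizes, written in the op syntax used at the time of this revision) of the kind of op the transformation now always emits when padding is requested; the packing attribute is what tells cleanups to keep the op even if the source later proves to already have the target shape:

// Illustrative sketch only: a dynamically shaped tile is padded up to the
// static 4x8 tile shape requested by the transformation. The packing
// attribute records that this pad is intentional and must not be folded.
func @pad_to_tile(%tile: tensor<?x?xf32>, %h0: index, %h1: index) -> tensor<4x8xf32> {
  %cst = constant 0.000000e+00 : f32
  %0 = linalg.pad_tensor %tile packing low[0, 0] high[%h0, %h1] {
        ^bb0(%arg1: index, %arg2: index):
          linalg.yield %cst : f32
  } : tensor<?x?xf32> to tensor<4x8xf32>
  return %0 : tensor<4x8xf32>
}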

Diff Detail

Event Timeline

ftynse created this revision. Sep 24 2021, 9:26 AM
ftynse requested review of this revision. Sep 24 2021, 9:26 AM
nicolasvasilache accepted this revision. Sep 24 2021, 9:31 AM

Nice, shipit!

This revision is now accepted and ready to land. Sep 24 2021, 9:31 AM
This revision was landed with ongoing or failed builds. Sep 24 2021, 9:40 AM
This revision was automatically updated to reflect the committed changes.
silvas added a subscriber: silvas. Sep 28 2021, 4:22 PM

Sorry I didn't see this earlier, but I don't think this direction makes sense. This op operates on tensors, which have value semantics, so terminology like "guaranteed to create a new tensor suitable for packing" simply doesn't make sense.

Having an attribute that prevents folding from happening gives me bad memories of e.g. tf.Identity being marked as having side effects to prevent folding. I would rather not go down that route.

> Sorry I didn't see this earlier, but I don't think this direction makes sense. This op operates on tensors, which have value semantics, so terminology like "guaranteed to create a new tensor suitable for packing" simply doesn't make sense.

I fail to see what doesn't make sense here. Packing is done with tensor semantics too: this can also be called "tensor tiling", as opposed to "op tiling".

> Having an attribute that prevents folding from happening gives me bad memories of e.g. tf.Identity being marked as having side effects to prevent folding. I would rather not go down that route.

I do not see the relation between the two.
Marking tf.Identity as having side effects is clearly a hack that misuses side effects for other purposes: the op still has no side effects.
Here, the semantics of the op are: it is illegitimate to fold this op. There is no discussion about side effects or any other interface.

This is much more similar to the "inbounds" attribute of vector.transfer ops: it is well-contained and part of the op semantics.
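
For reference, a minimal sketch of the vector.transfer attribute being compared to here (spelled in_bounds in the textual IR; the exact spelling and defaults may differ across MLIR versions, so treat this as an illustration rather than a normative example). The point of the analogy is that the attribute states a property of the op as part of its semantics rather than acting as a hint to the folder:

// Hypothetical function: in_bounds asserts, as part of the op's semantics,
// that the read does not run past the end of %A along either dimension.
func @read_tile(%A: memref<?x?xf32>, %i: index, %j: index, %pad: f32) -> vector<4x4xf32> {
  %v = vector.transfer_read %A[%i, %j], %pad {in_bounds = [true, true]}
      : memref<?x?xf32>, vector<4x4xf32>
  return %v : vector<4x4xf32>
}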

>> Sorry I didn't see this earlier, but I don't think this direction makes sense. This op operates on tensors, which have value semantics, so terminology like "guaranteed to create a new tensor suitable for packing" simply doesn't make sense.
>
> I fail to see what doesn't make sense here. Packing is done with tensor semantics too: this can also be called "tensor tiling", as opposed to "op tiling".
>
>> Having an attribute that prevents folding from happening gives me bad memories of e.g. tf.Identity being marked as having side effects to prevent folding. I would rather not go down that route.
>
> I do not see the relation between the two.
> Marking tf.Identity as having side effects is clearly a hack that misuses side effects for other purposes: the op still has no side effects.
> Here, the semantics of the op are: it is illegitimate to fold this op. There is no discussion about side effects or any other interface.

"the semantics of the op are : it is illegitimate to fold this op." - that is not a valid semantics. For example, if I prove that the output tensor and the input tensor have the same value, then SSA IR semantics + value semantics mean that it *must* be correct to replace the output with the input.

Specifically, it is impossible for these two functions to have different semantics (as long as linalg.pad_tensor is marked NoSideEffect) -- both return a value that is identical to their argument, and it is always legal to fold @with_pad to @without_pad per tensor value semantics:

func @with_pad(%arg0: tensor<5x6xf32>)
    -> tensor<5x6xf32> {
  %cst = constant 0.000000e+00 : f32
  %0 = linalg.pad_tensor %arg0 packing low[0, 0] high[0, 0] {
        ^bb0(%arg1: index, %arg2: index):
          linalg.yield %cst : f32
  } : tensor<5x6xf32> to tensor<5x6xf32>
  return %0 : tensor<5x6xf32>
}
func @without_pad(%arg0: tensor<5x6xf32>)
    -> tensor<5x6xf32> {
  return %arg0 : tensor<5x6xf32>
}
mehdi_amini added a comment (edited). Sep 29 2021, 10:28 AM

I share the same feeling as Sean: there is confusion between the semantics definition of the operation and what should be "optimization hints". I'm not sure why the "guarantees" that are provided are semantically important here.

What would be a more correct phrasing for the semantics description, in your opinion? We want the op to always result in a new value.

I don't really understand the argument about the operation having to always be folded away. This would effectively make any clone or copy operation impossible. Yet, there is a TensorCloneOp in IREE, if I remember correctly, and it isn't an always-folded noop. So there is clearly precedent and a use case for operations that define new values with the same "content" as existing values.

The additional attribute may indeed be interpreted as preventing some canonicalization with respect to the semantics of the original op (or its version with the attribute unset). However, I argue that we simply choose to define different canonical forms for the op depending on whether it carries the attribute. Referring to the post our canonicalization doc quotes in its opening paragraph - https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html - which describes how canonicalization and redundancy elimination become ambiguous: "<...> ultimately, in its purest form, canonicalization just focuses on removing unnecessary variation so that subsequent optimizations can be simpler", this change is exactly about removing, or rather shifting, variation to simplify subsequent optimizations. There exist subsequent optimizations, such as hoisting, that are rendered significantly simpler by always having a fresh value defined by pad_tensor.
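
As a rough illustration of the hoisting point (hypothetical loop, names, and sizes; this is not code produced by the diff), the shape of the IR such a transformation can anchor on looks roughly like the sketch below. Because padding is now emitted for every tile, a hoisting pattern can match linalg.pad_tensor unconditionally instead of also handling iterations whose tiles were left unpadded:

// Sketch of a tiled loop after tile-and-pad: every iteration pads its tile
// to the static 8x16 shape, even when the tile is already full-sized.
func @tiled(%t: tensor<?x16xf32>, %ub: index) {
  %cst = constant 0.000000e+00 : f32
  %c0 = constant 0 : index
  %c8 = constant 8 : index
  scf.for %i = %c0 to %ub step %c8 {
    // Tile size along the first dimension: min(8, %ub - %i).
    %size = affine.min affine_map<(d0)[s0] -> (8, s0 - d0)>(%i)[%ub]
    // Amount of padding needed to reach the static size 8.
    %h = subi %c8, %size : index
    %slice = tensor.extract_slice %t[%i, 0] [%size, 16] [1, 1]
        : tensor<?x16xf32> to tensor<?x16xf32>
    // Guaranteed to be present, so a hoisting pattern can anchor on it.
    %padded = linalg.pad_tensor %slice packing low[0, 0] high[%h, 0] {
          ^bb0(%arg1: index, %arg2: index):
            linalg.yield %cst : f32
    } : tensor<?x16xf32> to tensor<8x16xf32>
    // ... tiled computation consuming %padded ...
  }
  return
}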

silvas added a comment (edited). Sep 30 2021, 11:56 AM

> What would be a more correct phrasing for the semantics description, in your opinion? We want the op to always result in a new value.
>
> I don't really understand the argument about the operation having to always be folded away. This would effectively make any clone or copy operation impossible. Yet, there is a TensorCloneOp in IREE, if I remember correctly, and it isn't an always-folded noop. So there is clearly precedent and a use case for operations that define new values with the same "content" as existing values.
>
> The additional attribute may indeed be interpreted as preventing some canonicalization with respect to the semantics of the original op (or its version with the attribute unset). However, I argue that we simply choose to define different canonical forms for the op depending on whether it carries the attribute. Referring to the post our canonicalization doc quotes in its opening paragraph - https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html - which describes how canonicalization and redundancy elimination become ambiguous: "<...> ultimately, in its purest form, canonicalization just focuses on removing unnecessary variation so that subsequent optimizations can be simpler", this change is exactly about removing, or rather shifting, variation to simplify subsequent optimizations. There exist subsequent optimizations, such as hoisting, that are rendered significantly simpler by always having a fresh value defined by pad_tensor.

I agree; I don't think this is a question about op semantics. It is about suppressing a transformation under certain circumstances. I think we can distinguish between which transformations are legal and which are desirable. Based on tensor value semantics, NoSideEffect, and the definition of the content of the output of linalg.pad_tensor, we can conclude that the transformation "remove a pad when it is an identity" is always legal. I think if the attribute were simply named doNotFold or suppressFolds, with the meaning "advisory flag suggesting that folding the op is undesirable -- used internally by certain multi-step transformations to maintain invariants that would otherwise be broken by folding", it would be a lot clearer.

There is precedent for things like this, e.g. the llvm.ssa.copy intrinsic, which is used internally by PredicateInfo: https://llvm.org/docs/LangRef.html#llvm-ssa-copy-intrinsic

ftynse added a comment. Oct 4 2021, 2:13 AM

>> What would be a more correct phrasing for the semantics description, in your opinion? We want the op to always result in a new value.
>>
>> I don't really understand the argument about the operation having to always be folded away. This would effectively make any clone or copy operation impossible. Yet, there is a TensorCloneOp in IREE, if I remember correctly, and it isn't an always-folded noop. So there is clearly precedent and a use case for operations that define new values with the same "content" as existing values.
>>
>> The additional attribute may indeed be interpreted as preventing some canonicalization with respect to the semantics of the original op (or its version with the attribute unset). However, I argue that we simply choose to define different canonical forms for the op depending on whether it carries the attribute. Referring to the post our canonicalization doc quotes in its opening paragraph - https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html - which describes how canonicalization and redundancy elimination become ambiguous: "<...> ultimately, in its purest form, canonicalization just focuses on removing unnecessary variation so that subsequent optimizations can be simpler", this change is exactly about removing, or rather shifting, variation to simplify subsequent optimizations. There exist subsequent optimizations, such as hoisting, that are rendered significantly simpler by always having a fresh value defined by pad_tensor.
>
> I agree; I don't think this is a question about op semantics. It is about suppressing a transformation under certain circumstances. I think we can distinguish between which transformations are legal and which are desirable. Based on tensor value semantics, NoSideEffect, and the definition of the content of the output of linalg.pad_tensor, we can conclude that the transformation "remove a pad when it is an identity" is always legal. I think if the attribute were simply named doNotFold or suppressFolds, with the meaning "advisory flag suggesting that folding the op is undesirable -- used internally by certain multi-step transformations to maintain invariants that would otherwise be broken by folding", it would be a lot clearer.

This sounds good to me. @nicolasvasilache, WDYT?

>>> What would be a more correct phrasing for the semantics description, in your opinion? We want the op to always result in a new value.
>>>
>>> I don't really understand the argument about the operation having to always be folded away. This would effectively make any clone or copy operation impossible. Yet, there is a TensorCloneOp in IREE, if I remember correctly, and it isn't an always-folded noop. So there is clearly precedent and a use case for operations that define new values with the same "content" as existing values.
>>>
>>> The additional attribute may indeed be interpreted as preventing some canonicalization with respect to the semantics of the original op (or its version with the attribute unset). However, I argue that we simply choose to define different canonical forms for the op depending on whether it carries the attribute. Referring to the post our canonicalization doc quotes in its opening paragraph - https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html - which describes how canonicalization and redundancy elimination become ambiguous: "<...> ultimately, in its purest form, canonicalization just focuses on removing unnecessary variation so that subsequent optimizations can be simpler", this change is exactly about removing, or rather shifting, variation to simplify subsequent optimizations. There exist subsequent optimizations, such as hoisting, that are rendered significantly simpler by always having a fresh value defined by pad_tensor.
>>
>> I agree; I don't think this is a question about op semantics. It is about suppressing a transformation under certain circumstances. I think we can distinguish between which transformations are legal and which are desirable. Based on tensor value semantics, NoSideEffect, and the definition of the content of the output of linalg.pad_tensor, we can conclude that the transformation "remove a pad when it is an identity" is always legal. I think if the attribute were simply named doNotFold or suppressFolds, with the meaning "advisory flag suggesting that folding the op is undesirable -- used internally by certain multi-step transformations to maintain invariants that would otherwise be broken by folding", it would be a lot clearer.
>
> This sounds good to me. @nicolasvasilache, WDYT?

Great, approved, thanks @ftynse!