This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg]: Add memory space to linalg transform::PromoteOp
ClosedPublic

Authored by AviadCo on Aug 29 2023, 2:19 AM.

Details

Summary

This patch allows supplying an optional memory space for the promoted buffer.
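
For illustration, usage would look roughly like the following sketch. The `memory_space` attribute name and placement are assumptions based on this revision's summary; `operands_to_promote` and the surrounding transform-sequence syntax are the existing upstream conventions of the time.

```mlir
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  // Find the op to promote.
  %matmul = transform.structured.match ops{["linalg.matmul"]} in %arg0
    : (!transform.any_op) -> !transform.any_op
  // Promote operand 0 into a buffer placed in a specific memory space
  // (here the GPU workgroup space, purely as an example target).
  %promoted = transform.structured.promote %matmul
    { operands_to_promote = [0],
      memory_space = #gpu.address_space<workgroup> }
    : (!transform.any_op) -> !transform.any_op
}
```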

Diff Detail

Event Timeline

AviadCo created this revision.Aug 29 2023, 2:19 AM
Herald added a project: Restricted Project.Aug 29 2023, 2:19 AM
AviadCo requested review of this revision.Aug 29 2023, 2:19 AM
AviadCo updated this revision to Diff 555006.Aug 31 2023, 6:00 AM

Updated pass name and used MemRef copy instead of DMA.

Hello @AviadCo, thanks for your contribution.

I have a question regarding your intended usage of this: do you have a pass pipeline in which this fits (and if so, is it possible to share it)?
The reason I am asking is that most passes in Linalg should be considered "test" passes, as we do not have automatic profitability heuristics upstream.

Instead we have been switching to implementing key functionality as transform dialect ops and apply / test them independently of a pass pipeline.
Separately, pass pipelines with proper heuristics can be built from these building blocks.

Here is a recent example of adding a new transform and a new test, in case it is useful: https://reviews.llvm.org/D153420.

You could also directly target a memory space attribute rather than using the hardcoded constants that are backend-specific and can be confusing (see also: https://discourse.llvm.org/t/confused-by-inconsistencies-in-gpu-magic-constants/72041).
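
For illustration, the contrast the reviewer is drawing looks roughly like this (a sketch; the element type and shape are arbitrary):

```mlir
// Backend-specific magic constant: the meaning of "3" depends on which
// target happens to interpret it, which is the confusion referenced above.
%a = memref.alloc() : memref<16x16xf32, 3>

// Dedicated memory space attribute: self-documenting and verified by the
// dialect that defines it.
%b = memref.alloc() : memref<16x16xf32, #gpu.address_space<workgroup>>
```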

mlir/include/mlir/Dialect/Linalg/Passes.td
157 ↗(On Diff #555006)

nit: dimensions

mlir/lib/Dialect/Linalg/Transforms/UpdateAddressSpace.cpp
147 ↗(On Diff #555006)

MemRefType::Builder(srcMemRefType).setAddressSpace(...) or something similar please

159 ↗(On Diff #555006)

You would likely be better off here with a linalg::CopyOp, as it can further be mapped to e.g. GPU thread ids and vector load/store using something like: https://reviews.llvm.org/D154836
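
The distinction being suggested, roughly sketched (types are illustrative assumptions): memref.copy is opaque to later structured-codegen transforms, while linalg.copy is itself a structured op that can be tiled, vectorized, and mapped like any other linalg op.

```mlir
// Opaque copy: later transformations treat this as a black box.
memref.copy %src, %promoted
  : memref<16x16xf32> to memref<16x16xf32, #gpu.address_space<workgroup>>

// Structured copy: a linalg op that downstream transforms can tile and
// map to e.g. GPU thread ids.
linalg.copy ins(%src : memref<16x16xf32>)
            outs(%promoted : memref<16x16xf32, #gpu.address_space<workgroup>>)
```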

Actually, here is a contribution that seems related: https://reviews.llvm.org/D144666

Do you see a way to generalize and/or reuse the existing one?

@nicolasvasilache thanks for feedback, I really appreciate it!

I will take a look at Transform Dialect and the references you added here and see how I can generalize a solution.

Thanks,
Aviad Cohen

AviadCo updated this revision to Diff 555975.Sep 6 2023, 12:24 AM

Updated to use the linalg transform::PromoteOp instead of a dedicated pass.

AviadCo retitled this revision from [mlir][linalg]: Add LinalgDMAAddressSpace pass to [mlir][Linalg]: Add memory space to linalg transform::PromoteOp.Sep 6 2023, 12:25 AM
AviadCo edited the summary of this revision. (Show Details)

Actually, here is a contribution that seems related: https://reviews.llvm.org/D144666

Do you see a way to generalize and/or reuse the existing one?

I really appreciate your references. I now see that I can use the Linalg transform::PromoteOp with a small modification to add memory space support.
Not related to this patch, but I am a bit confused as to why the GPU dialect is so deeply connected to this transform. IMO we should avoid that coupling. Moreover, I am not using the GPU dialect at all but still need to specify the memory space.

Can you please review this new patch?

nicolasvasilache accepted this revision.Sep 6 2023, 4:52 AM

@AviadCo thanks this looks good to me.

As to the existing state, I suspect the person who added these patterns wanted to implement some simple heuristic to raise the level of control / automation in the transform but ended up coupling the GPU dialect with it in the process.
I'd welcome a refactoring to decouple these concerns indeed.
The operations used for alloc and copy could also be made parametric to avoid hardcoding.

This revision is now accepted and ready to land.Sep 6 2023, 4:52 AM
This revision was landed with ongoing or failed builds.Sep 7 2023, 7:35 AM
This revision was automatically updated to reflect the committed changes.


Thanks for reviewing so fast!

I actually have some ideas to improve the promotion a bit (e.g. not copying outputs before the linalg operation if they have no uses inside the linalg block, meaning the output is not an "in/out" operand), so I believe I will make further commits to PromoteOp.