This is an archive of the discontinued LLVM Phabricator instance.

Differential D114803

[mlir][Vector] Thread 0-d vectors through vector.transfer ops
ClosedPublic

Authored by nicolasvasilache on Nov 30 2021, 7:46 AM.

Details

Reviewers

ThomasRaoux
dcaballe
springerm
mravishankar
ftynse
aartbik
herhut

Commits

rGc537a943342b: [mlir][Vector] Thread 0-d vectors through vector.transfer ops

Summary

This revision adds 0-d vector support to vector.transfer ops.
In the process, numerous cleanups are applied, in particular around normalizing
and reducing the number of builders.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nicolasvasilache created this revision. Nov 30 2021, 7:46 AM

Herald added a reviewer: aartbik. Nov 30 2021, 7:46 AM

Herald added subscribers: sdasgup3, wenzhicui, wrengr and 21 others.

nicolasvasilache requested review of this revision. Nov 30 2021, 7:46 AM

Herald added a reviewer: herhut. Nov 30 2021, 7:46 AM

Herald added a project: Restricted Project.

Herald added a subscriber: stephenneuendorffer.

ThomasRaoux added inline comments. Nov 30 2021, 8:03 AM

mlir/test/Dialect/Vector/ops.mlir
7–14	Can a 0-D vector be read from a non-0-D tensor? It would be good to add a test for it.

Harbormaster completed remote builds in B136708: Diff 390725. Nov 30 2021, 8:26 AM

Address comment.
Use mandatory empty AffineMapAttr instead of optional null to reduce corner cases as suggested offline by @ThomasRaoux.
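
The "mandatory empty map instead of optional null" idea can be illustrated outside of MLIR. The following is a minimal sketch, not code from this revision: `MapResult`, `PermutationMap` and the free function `hasBroadcastDim` are toy stand-ins invented here for illustration. It assumes that a 0-d transfer carries a map with zero results, so the broadcast query needs neither a null check nor a 0-d special case.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for one result of a permutation map (not an MLIR type):
// either a dimension d<dim> or the constant 0, which marks a broadcast.
struct MapResult {
  bool isConstantZero;
  unsigned dim; // only meaningful when isConstantZero is false
};

// Toy stand-in for a permutation map: just its list of results.
// The map is always present; a 0-d transfer uses an empty list.
using PermutationMap = std::vector<MapResult>;

// Returns true if any result of the map is a broadcast.
// For an empty (0-d) map the loop body never runs and the answer is false.
bool hasBroadcastDim(const PermutationMap &map) {
  for (const MapResult &result : map)
    if (result.isConstantZero)
      return true;
  return false;
}
```

With this representation the 0-d case falls out of the loop bounds, which mirrors how the `hasBroadcastDim` default implementation in this diff drops its `isZeroD()` early exit.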

nicolasvasilache marked an inline comment as done. Nov 30 2021, 1:56 PM

Harbormaster completed remote builds in B136773: Diff 390817. Nov 30 2021, 2:13 PM

springerm accepted this revision. Dec 1 2021, 1:42 AM

springerm added inline comments.

mlir/include/mlir/Dialect/Vector/VectorOps.td
1157	why did this change?
1329	extract from
1464	insert into

This revision is now accepted and ready to land. Dec 1 2021, 1:42 AM

Thanks for helping with the 0-d vector problem! Added some minor comments. I have a question about the lowering of 0-d vectors to the scalar world. See comments inline.

mlir/include/mlir/Dialect/Vector/VectorOps.td
1157	+1, same for xfer write
1306–1329	and an?
mlir/lib/Conversion/VectorToROCDL/VectorToROCDL.cpp
69	Do these checks on GPU and ROCDL mean that 0-d tensors are not expected at all by these backends, or is it more of a TODO to be implemented in the future? If the latter, we should add a TODO comment stating that it will be implemented in the future.
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
230	This looks like we are vectorizing the operand to be stored in place. Shouldn't this call some generic function that takes care of that (`vectorizeOneOp` or some other one)?
mlir/lib/Dialect/Vector/VectorOps.cpp
2317	and an?
mlir/lib/Dialect/Vector/VectorTransforms.cpp
2805	Move comment before the if?
2870–2877	load?
2895	Question about this lowering: why are we lowering 0-d vectors to the scalar world? I thought one of the goals discussed in the RFC was to keep everything within the vector world to avoid the scalar<->vector transition that could be very expensive for some targets. Shouldn't we lower 0-d vectors to a `vector<1xtype>` in LLVM instead?

ThomasRaoux accepted this revision. Dec 1 2021, 7:02 AM

ThomasRaoux added inline comments.

mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
74–76	We check that the rank must be 2 right below, so this won't really change anything. I'm assuming you are just adding it for consistency? (same below)

nicolasvasilache marked 12 inline comments as done. Dec 1 2021, 8:24 AM

nicolasvasilache added inline comments.

mlir/include/mlir/Dialect/Vector/VectorOps.td
1157	Because the comment was off by 1. The thinking is that if you start counting operands at 0 and:
- plug in rank == 0, you now get `[1 .. 1)`, which is empty as expected; before, it used to be `2 .. 1`, which is weird
- plug in rank == 1, you now get `[1 .. 2)` instead of the more ambiguous `2 .. 2`.
1329	dead now
1464	dead
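
The half-open convention described in the reply on line 1157 can be checked with a small sketch. `IndexOperandRange` and `indexOperandRange` are hypothetical names used only for this illustration; they are not part of MLIR. The sketch assumes operands are counted from 0, and that the indices start at operand 1 for `transfer_read` (after the source) and at operand 2 for `transfer_write` (after the vector and the source).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a half-open range [begin, end) of operand positions.
struct IndexOperandRange {
  std::size_t begin;
  std::size_t end;
  std::size_t size() const { return end - begin; }
  bool empty() const { return begin == end; }
};

// Positions of the index operands, given the position of the first index
// operand and the rank of the memref/tensor being accessed.
inline IndexOperandRange indexOperandRange(std::size_t firstIndexOperand,
                                           std::size_t sourceRank) {
  return {firstIndexOperand, firstIndexOperand + sourceRank};
}
```

For a rank-0 source, `indexOperandRange(1, 0)` is `[1 .. 1)` and is empty, whereas the old closed-interval wording would have read `2 .. 1`.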
mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
74–76	Yes the evolution has been that I broke the API to make sure by way of compiler crashes that I am not missing any entry. Then I just copied this everywhere. In this very special case I can just remove indeed.
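
The redundancy discussed for lines 74–76 can be sketched with a toy model. `Transfer` and both predicates are hypothetical and only mimic the shape of the check; they are not the MLIR API. The sketch assumes that the transfer rank equals the vector rank, so a 0-d transfer always has a vector rank other than 2.

```cpp
#include <cassert>

// Toy description of a transfer op (hypothetical, for illustration only).
struct Transfer {
  int transferRank; // assumed equal to vectorRank in this sketch
  int vectorRank;
  bool masked;
  bool outOfBounds;
};

// Variant with the explicit 0-d early exit.
bool supportsWithZeroDCheck(const Transfer &t) {
  if (t.transferRank == 0)
    return false;
  return !(t.masked || t.outOfBounds || t.vectorRank != 2);
}

// Variant without it: the rank != 2 test already rejects rank 0.
bool supportsWithoutZeroDCheck(const Transfer &t) {
  return !(t.masked || t.outOfBounds || t.vectorRank != 2);
}
```

Under the stated assumption both variants agree for every rank, which is why the extra check could be dropped in this one place.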
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
230	This is a side-effect of the piece below that says:
    // Not all ops support 0-d vectors, extract the scalar for now.
    // TODO: remove this.
    if (readValue.getType().cast<VectorType>().getRank() == 0)
      readValue = b.create<vector::ExtractElementOp>(loc, readValue);
Both should go away once all ops support 0-d vectors.
mlir/lib/Dialect/Vector/VectorTransforms.cpp
2895	Note that this is not changing any behavior here and we were already doing this extraction for `vector<1x..x1xtype>`. Your point is very valid, but I think it is beyond this CL. We need to either:
- go directly to LLVM + bitcast
- introduce a bitcast
- let memref.load/store additionally support the 0-d vector case, which itself may be quite fraught with issues
There are still deeper data layout issues lingering even in this trivial case (for architectures for which this matters, that is). For now I don't think I can do better than adding a TODO.

Address comments and rebase.

This revision was landed with ongoing or failed builds. Dec 1 2021, 8:49 AM

Closed by commit rGc537a943342b: [mlir][Vector] Thread 0-d vectors through vector.transfer ops (authored by nicolasvasilache).

This revision was automatically updated to reflect the committed changes.

nicolasvasilache added a commit: rGc537a943342b: [mlir][Vector] Thread 0-d vectors through vector.transfer ops.

Harbormaster completed remote builds in B136934: Diff 391041. Dec 1 2021, 9:11 AM

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Vector/VectorOps.td	137 lines
mlir/include/mlir/Interfaces/VectorInterfaces.td	28 lines
mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp	14 lines
mlir/lib/Conversion/VectorToROCDL/VectorToROCDL.cpp	4 lines
mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp	8 lines
mlir/lib/Dialect/Linalg/ComprehensiveBufferize/VectorInterfaceImpl.cpp	3 lines
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp	102 lines
mlir/lib/Dialect/MemRef/Transforms/FoldSubViewOps.cpp	31 lines
mlir/lib/Dialect/Vector/VectorDropLeadUnitDim.cpp	24 lines
mlir/lib/Dialect/Vector/VectorOps.cpp	258 lines
mlir/lib/Dialect/Vector/VectorTransferPermutationMapRewritePatterns.cpp	30 lines
mlir/lib/Dialect/Vector/VectorTransforms.cpp	96 lines
mlir/lib/Interfaces/VectorInterfaces.cpp	2 lines
mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir	31 lines
mlir/test/Dialect/Linalg/vectorization.mlir	25 lines
mlir/test/Dialect/Vector/invalid.mlir	12 lines
mlir/test/Dialect/Vector/ops.mlir	32 lines
mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir	12 lines

Diff 391048

mlir/include/mlir/Dialect/Vector/VectorOps.td

[... 1,127 lines not shown ...]

def Vector_TransferReadOp :		def Vector_TransferReadOp :
Vector_Op<"transfer_read", [		Vector_Op<"transfer_read", [
DeclareOpInterfaceMethods<VectorTransferOpInterface>,		DeclareOpInterfaceMethods<VectorTransferOpInterface>,
DeclareOpInterfaceMethods<VectorUnrollOpInterface, ["getShapeForUnroll"]>,		DeclareOpInterfaceMethods<VectorUnrollOpInterface, ["getShapeForUnroll"]>,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,		DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
AttrSizedOperandSegments		AttrSizedOperandSegments
]>,		]>,
Arguments<(ins AnyShaped:$source, Variadic<Index>:$indices,		Arguments<(ins AnyShaped:$source,
AffineMapAttr:$permutation_map, AnyType:$padding,		Variadic<Index>:$indices,
		AffineMapAttr:$permutation_map,
		AnyType:$padding,
Optional<VectorOf<[I1]>>:$mask,		Optional<VectorOf<[I1]>>:$mask,
OptionalAttr<BoolArrayAttr>:$in_bounds)>,		OptionalAttr<BoolArrayAttr>:$in_bounds)>,
Results<(outs AnyVector:$vector)> {		Results<(outs AnyVectorOfAnyRank:$vector)> {

let summary = "Reads a supervector from memory into an SSA vector value.";		let summary = "Reads a supervector from memory into an SSA vector value.";

let description = [{		let description = [{
The `vector.transfer_read` op performs a read from a slice within a		The `vector.transfer_read` op performs a read from a slice within a
[MemRef](../LangRef.md#memref-type) or a Ranked		[MemRef](../LangRef.md#memref-type) or a Ranked
[Tensor](../LangRef.md#tensor-type) supplied as its first operand into a		[Tensor](../LangRef.md#tensor-type) supplied as its first operand
[vector](../LangRef.md#vector-type) of the same base elemental type.		into a [vector](../LangRef.md#vector-type) of the same base elemental type.

A memref/tensor operand with vector element type, must have its vector		A memref/tensor operand with vector element type, must have its vector
element type match a suffix (shape and element type) of the vector (e.g.		element type match a suffix (shape and element type) of the vector (e.g.
memref<3x2x6x4x3xf32>, vector<1x1x4x3xf32>).		memref<3x2x6x4x3xf32>, vector<1x1x4x3xf32>).

The slice is further defined by a full-rank index within the MemRef/Tensor,		The slice is further defined by a full-rank index within the MemRef/Tensor,
supplied as the operands `2 .. 1 + rank(memref/tensor)`.		supplied as the operands `[1 .. 1 + rank(memref/tensor))`.
		springerm: why did this change?
		dcaballe: +1, same for xfer write
		nicolasvasilache (author): Because the comment was off by 1. The thinking is that if you start counting operands at 0 and: (1) plug in rank == 0, you now get `[1 .. 1)`, which is empty as expected; before, it used to be `2 .. 1`, which is weird; (2) plug in rank == 1, you now get `[1 .. 2)` instead of the more ambiguous `2 .. 2`.

The permutation_map [attribute](../LangRef.md#attributes) is an		The permutation_map [attribute](../LangRef.md#attributes) is an
[affine-map](Affine.md#affine-maps) which specifies the transposition on the		[affine-map](Affine.md#affine-maps) which specifies the transposition on the
slice to match the vector shape. The permutation map may be implicit and		slice to match the vector shape. The permutation map may be implicit and
omitted from parsing and printing if it is the canonical minor identity map		omitted from parsing and printing if it is the canonical minor identity map
(i.e. if it does not permute or broadcast any dimension).		(i.e. if it does not permute or broadcast any dimension).

The size of the slice is specified by the size of the vector, given as the		The size of the slice is specified by the size of the vector, given as the
[... 132 lines not shown ...]	let description = [{
// Special encoding for 0-d transfer with 0-d tensor/memref, vector shape		// Special encoding for 0-d transfer with 0-d tensor/memref, vector shape
// {1} and permutation_map () -> (0).		// {1} and permutation_map () -> (0).
%0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<()->(0)>} :		%0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<()->(0)>} :
tensor<f32>, vector<1xf32>		tensor<f32>, vector<1xf32>
```		```
}];		}];

let builders = [		let builders = [
// Builder that sets padding to zero.		/// 1. Builder that sets padding to zero and an empty mask (variant with attrs).
OpBuilder<(ins "VectorType":$vector, "Value":$source,		OpBuilder<(ins "VectorType":$vectorType,
"ValueRange":$indices, "AffineMap":$permutationMap,		"Value":$source,
CArg<"ArrayRef<bool>", "{}">:$inBounds)>,		"ValueRange":$indices,
// Builder that sets permutation map to 'getMinorIdentityMap'.		"AffineMapAttr":$permutationMapAttr,
OpBuilder<(ins "VectorType":$vector, "Value":$source,		"ArrayAttr":$inBoundsAttr)>,
"ValueRange":$indices, "Value":$padding,		/// 2. Builder that sets padding to zero and an empty mask (variant without attrs).
CArg<"ArrayRef<bool>", "{}">:$inBounds)>,		OpBuilder<(ins "VectorType":$vectorType,
// Builder that sets permutation map (resp. padding) to		"Value":$source,
// 'getMinorIdentityMap' (resp. zero).		"ValueRange":$indices,
OpBuilder<(ins "VectorType":$vector, "Value":$source,		"AffineMap":$permutationMap,
"ValueRange":$indices, CArg<"ArrayRef<bool>", "{}">:$inBounds)>,		CArg<"Optional<ArrayRef<bool>>", "::llvm::None">:$inBounds)>,
// Builder that does not set mask.		/// 3. Builder that sets permutation map to 'getMinorIdentityMap'.
OpBuilder<(ins "Type":$vector, "Value":$source,		OpBuilder<(ins "VectorType":$vectorType,
"ValueRange":$indices, "AffineMapAttr":$permutationMap, "Value":$padding,		"Value":$source,
"ArrayAttr":$inBounds)>,		"ValueRange":$indices,
// Builder that does not set mask.		"Value":$padding,
OpBuilder<(ins "Type":$vector, "Value":$source,		CArg<"Optional<ArrayRef<bool>>", "::llvm::None">:$inBounds)>,
"ValueRange":$indices, "AffineMap":$permutationMap, "Value":$padding,		/// 4. Builder that sets padding to zero and permutation map to
"ArrayAttr":$inBounds)>		/// 'getMinorIdentityMap'.
		OpBuilder<(ins "VectorType":$vectorType,
		"Value":$source,
		"ValueRange":$indices,
		CArg<"Optional<ArrayRef<bool>>", "::llvm::None">:$inBounds)>,
		dcaballe: and an?
];		];

let extraClassDeclaration = [{
/// Temporary convenience builders to account for the fact that we do not
/// have 0-d vectors atm. These create a constant `vector<1xt>` and
/// insert/extract into it.
springerm: extract from
nicolasvasilache (author): dead now
// Builder that sets permutation map (resp. padding) to
// 'getMinorIdentityMap' (resp. zero).
static Value createScalarOp(OpBuilder &builder, Location loc, Value source,
ValueRange indices,
ArrayRef<bool> inBounds = ArrayRef<bool>{});
}];

let hasCanonicalizer = 1;		let hasCanonicalizer = 1;
let hasFolder = 1;		let hasFolder = 1;
}		}

def Vector_TransferWriteOp :		def Vector_TransferWriteOp :
Vector_Op<"transfer_write", [		Vector_Op<"transfer_write", [
DeclareOpInterfaceMethods<VectorTransferOpInterface>,		DeclareOpInterfaceMethods<VectorTransferOpInterface>,
DeclareOpInterfaceMethods<VectorUnrollOpInterface, ["getShapeForUnroll"]>,		DeclareOpInterfaceMethods<VectorUnrollOpInterface, ["getShapeForUnroll"]>,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,		DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
AttrSizedOperandSegments		AttrSizedOperandSegments
]>,		]>,
Arguments<(ins AnyVector:$vector, AnyShaped:$source,		Arguments<(ins AnyVectorOfAnyRank:$vector,
		AnyShaped:$source,
Variadic<Index>:$indices,		Variadic<Index>:$indices,
AffineMapAttr:$permutation_map,		AffineMapAttr:$permutation_map,
Optional<VectorOf<[I1]>>:$mask,		Optional<VectorOf<[I1]>>:$mask,
OptionalAttr<BoolArrayAttr>:$in_bounds)>,		OptionalAttr<BoolArrayAttr>:$in_bounds)>,
Results<(outs Optional<AnyRankedTensor>:$result)> {		Results<(outs Optional<AnyRankedTensor>:$result)> {

let summary = "The vector.transfer_write op writes a supervector to memory.";		let summary = "The vector.transfer_write op writes a supervector to memory.";

let description = [{		let description = [{
The `vector.transfer_write` op performs a write from a		The `vector.transfer_write` op performs a write from a
[vector](../LangRef.md#vector-type), supplied as its first operand, into a		[vector](../LangRef.md#vector-type), supplied as its first operand, into a
slice within a [MemRef](../LangRef.md#memref-type) or a Ranked		slice within a [MemRef](../LangRef.md#memref-type) or a Ranked
[Tensor](../LangRef.md#tensor-type) of the same base elemental type,		[Tensor](../LangRef.md#tensor-type) of the same base elemental type,
supplied as its second operand.		supplied as its second operand.

A vector memref/tensor operand must have its vector element type match a		A vector memref/tensor operand must have its vector element type match a
suffix (shape and element type) of the vector (e.g. memref<3x2x6x4x3xf32>,		suffix (shape and element type) of the vector (e.g. memref<3x2x6x4x3xf32>,
vector<1x1x4x3xf32>). If the operand is a tensor, the operation returns a		vector<1x1x4x3xf32>). If the operand is a tensor, the operation returns a
new tensor of the same type.		new tensor of the same type.

The slice is further defined by a full-rank index within the MemRef/Tensor,		The slice is further defined by a full-rank index within the MemRef/Tensor,
supplied as the operands `3 .. 2 + rank(memref/tensor)`.		supplied as the operands `[2 .. 2 + rank(memref/tensor))`.

The permutation_map [attribute](../LangRef.md#attributes) is an		The permutation_map [attribute](../LangRef.md#attributes) is an
[affine-map](Affine.md#affine-maps) which specifies the transposition on the		[affine-map](Affine.md#affine-maps) which specifies the transposition on the
slice to match the vector shape. The permutation map may be implicit and		slice to match the vector shape. The permutation map may be implicit and
omitted from parsing and printing if it is the canonical minor identity map		omitted from parsing and printing if it is the canonical minor identity map
(i.e. if it does not permute any dimension). In contrast to `transfer_read`,		(i.e. if it does not permute any dimension). In contrast to `transfer_read`,
write ops cannot have broadcast dimensions.		write ops cannot have broadcast dimensions.

[... 60 lines not shown ...]	let description = [{
// Special encoding for 0-d transfer with 0-d tensor/memref, vector shape		// Special encoding for 0-d transfer with 0-d tensor/memref, vector shape
// {1} and permutation_map () -> (0).		// {1} and permutation_map () -> (0).
%1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->(0)>} :		%1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->(0)>} :
vector<1xf32>, tensor<f32>		vector<1xf32>, tensor<f32>
```		```
}];		}];

let builders = [		let builders = [
// Builder that sets an empty mask.		/// 1. Builder with type inference.
OpBuilder<(ins "Value":$vector, "Value":$source, "ValueRange":$indices,		OpBuilder<(ins "Value":$vector,
"AffineMap":$permutationMap, CArg<"ArrayRef<bool>", "{}">:$inBounds)>,		"Value":$dest,
// Builder that sets permutation map to 'getMinorIdentityMap'.		"ValueRange":$indices,
OpBuilder<(ins "Value":$vector, "Value":$source, "ValueRange":$indices,		"AffineMapAttr":$permutationMapAttr,
CArg<"ArrayRef<bool>", "{}">:$inBounds)>,		"Value":$mask,
OpBuilder<(ins "Value":$vector, "Value":$source, "ValueRange":$indices,		"ArrayAttr":$inBoundsAttr)>,
"AffineMapAttr":$permutationMap, "ArrayAttr":$inBounds)>,		/// 2. Builder with type inference that sets an empty mask (variant with attrs).
OpBuilder<(ins "Value":$vector, "Value":$source, "ValueRange":$indices,		OpBuilder<(ins "Value":$vector,
"AffineMap":$permutationMap, "Value":$mask, "ArrayAttr":$inBounds)>,		"Value":$dest,
OpBuilder<(ins "Value":$vector, "Value":$source, "ValueRange":$indices,		"ValueRange":$indices,
"AffineMap":$permutationMap, "ArrayAttr":$inBounds)>,		"AffineMapAttr":$permutationMapAttr,
		"ArrayAttr":$inBoundsAttr)>,
		/// 3. Builder with type inference that sets an empty mask (variant without attrs).
		OpBuilder<(ins "Value":$vector,
		"Value":$dest,
		"ValueRange":$indices,
		"AffineMap":$permutationMap,
		CArg<"Optional<ArrayRef<bool>>", "::llvm::None">:$inBounds)>,
		/// 4. Builder with type inference that sets an empty mask and sets permutation
		/// map to 'getMinorIdentityMap'.
		OpBuilder<(ins "Value":$vector,
		"Value":$dest,
		"ValueRange":$indices,
		CArg<"Optional<ArrayRef<bool>>", "::llvm::None">:$inBounds)>,
];		];

let extraClassDeclaration = [{
/// Temporary convenience builders to account for the fact that we do not
/// have 0-d vectors atm. These create a constant `vector<1xt>` and
/// insert/extract into it.
springerm: insert into
nicolasvasilache (author): dead
// Builder that sets permutation map (resp. padding) to
// 'getMinorIdentityMap' (resp. zero).
static Operation *createScalarOp(
OpBuilder &builder, Location loc, Value value,
Value dest, ValueRange indices,
ArrayRef<bool> inBounds = ArrayRef<bool>{});
}];

let hasFolder = 1;		let hasFolder = 1;
let hasCanonicalizer = 1;		let hasCanonicalizer = 1;
}		}

def Vector_LoadOp : Vector_Op<"load"> {		def Vector_LoadOp : Vector_Op<"load"> {
let summary = "reads an n-D slice of memory into an n-D vector";		let summary = "reads an n-D slice of memory into an n-D vector";
let description = [{		let description = [{
The 'vector.load' operation reads an n-D slice of memory into an n-D		The 'vector.load' operation reads an n-D slice of memory into an n-D
[... 906 lines not shown ...]

mlir/include/mlir/Interfaces/VectorInterfaces.td

[... 109 lines not shown ...]	InterfaceMethod<
/desc=/"Return the permutation map.",		/desc=/"Return the permutation map.",
/retTy=/"::mlir::AffineMap",		/retTy=/"::mlir::AffineMap",
/methodName=/"permutation_map",		/methodName=/"permutation_map",
/args=/(ins),		/args=/(ins),
/methodBody=/"return $_op.permutation_map();"		/methodBody=/"return $_op.permutation_map();"
/defaultImplementation=/		/defaultImplementation=/
>,		>,
InterfaceMethod<		InterfaceMethod<
/desc=/[{
Returns true if op involves a 0-d tensor/memref and a vector
of shape {1}. This is temporary until we have 0-d vectors.
// TODO: turn this into 0-d vectors + empty permutation_map.
}],
/retTy=/"bool",
/methodName=/"isZeroD",
/args=/(ins),
/methodBody=/"",
/defaultImplementation=/[{
if (getShapedType().getRank() > 0)
return false;
if (getVectorType().getShape() != ArrayRef<int64_t>{1})
return false;
AffineMap map = AffineMap::get(
/numDims=/0, /numSymbols=/0,
getAffineConstantExpr(0, $_op->getContext()));
if ($_op.permutation_map() != map)
return false;
return true;
}]
>,
InterfaceMethod<
/desc=/[{ Returns true if the specified dimension is a broadcast. }],		/desc=/[{ Returns true if the specified dimension is a broadcast. }],
/retTy=/"bool",		/retTy=/"bool",
/methodName=/"isBroadcastDim",		/methodName=/"isBroadcastDim",
/args=/(ins "unsigned":$idx),		/args=/(ins "unsigned":$idx),
/methodBody=/"",		/methodBody=/"",
/defaultImplementation=/[{		/defaultImplementation=/[{
auto expr = $_op.permutation_map().getResult(idx);		auto expr = $_op.permutation_map().getResult(idx);
return expr.template isa<::mlir::AffineConstantExpr>() &&		return expr.template isa<::mlir::AffineConstantExpr>() &&
expr.template dyn_cast<::mlir::AffineConstantExpr>().getValue() == 0;		expr.template dyn_cast<::mlir::AffineConstantExpr>().getValue() == 0;
}]		}]
>,		>,
InterfaceMethod<		InterfaceMethod<
/desc=/[{ Returns true if at least one of the dimensions in the		/desc=/[{ Returns true if at least one of the dimensions in the
permutation map is a broadcast.}],		permutation map is a broadcast.}],
/retTy=/"bool",		/retTy=/"bool",
/methodName=/"hasBroadcastDim",		/methodName=/"hasBroadcastDim",
/args=/(ins),		/args=/(ins),
/methodBody=/"",		/methodBody=/"",
/defaultImplementation=/[{		/defaultImplementation=/[{
// 0-d transfers are not considered broadcasts but they need to be		for (unsigned i = 0, rank = getTransferRank(); i < rank; ++i) {
// represented with a vector<1xt> until we have 0-d vectors.
if ($_op.isZeroD()) return false;
for (unsigned i = 0; i < $_op.permutation_map().getNumResults(); ++i) {
if ($_op.isBroadcastDim(i))		if ($_op.isBroadcastDim(i))
return true;		return true;
}		}
return false;		return false;
}]		}]
>,		>,
InterfaceMethod<		InterfaceMethod<
/desc=/"Return the `in_bounds` boolean ArrayAttr.",		/desc=/"Return the `in_bounds` boolean ArrayAttr.",
[... 104 lines not shown ...]

mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp

[... 65 lines not shown ...]	if (failed(getStridesAndOffset(memrefType, strides, offset)))
return llvm::None;		return llvm::None;
if (strides[0] == ShapedType::kDynamicStrideOrOffset)		if (strides[0] == ShapedType::kDynamicStrideOrOffset)
return llvm::None;		return llvm::None;
return strides[0];		return strides[0];
}		}

// Return true if the transfer op can be converted to a MMA matrix load.		// Return true if the transfer op can be converted to a MMA matrix load.
static bool transferReadSupportsMMAMatrixType(vector::TransferReadOp readOp) {		static bool transferReadSupportsMMAMatrixType(vector::TransferReadOp readOp) {
if (readOp.mask() || readOp.hasOutOfBoundsDim() ||		if (readOp.mask() || readOp.hasOutOfBoundsDim() ||
readOp.getVectorType().getRank() != 2)		readOp.getVectorType().getRank() != 2)
return false;		return false;
		ThomasRaoux: We check that the rank must be 2 right below, so this won't really change anything. I'm assuming you are just adding it for consistency? (same below)
		nicolasvasilache (author): Yes, the evolution has been that I broke the API to make sure, by way of compiler crashes, that I am not missing any entry. Then I just copied this everywhere. In this very special case I can just remove it indeed.
if (!getMemrefConstantHorizontalStride(readOp.getShapedType()))		if (!getMemrefConstantHorizontalStride(readOp.getShapedType()))
return false;		return false;
AffineMap map = readOp.permutation_map();		AffineMap map = readOp.permutation_map();
OpBuilder b(readOp.getContext());		OpBuilder b(readOp.getContext());
AffineExpr innerDim = b.getAffineDimExpr(map.getNumDims() - 1);		AffineExpr innerDim = b.getAffineDimExpr(map.getNumDims() - 1);
AffineExpr zero = b.getAffineConstantExpr(0);		AffineExpr zero = b.getAffineConstantExpr(0);
auto broadcastInnerDim = AffineMap::get(map.getNumDims(), 0, {zero, innerDim},		auto broadcastInnerDim = AffineMap::get(map.getNumDims(), 0, {zero, innerDim},
readOp.getContext());		readOp.getContext());
// TODO: Support transpose once it is added to GPU dialect ops.		// TODO: Support transpose once it is added to GPU dialect ops.
// For now we only support (d0, d1) -> (d0, d1) and (d0, d1) -> (0, d1).		// For now we only support (d0, d1) -> (d0, d1) and (d0, d1) -> (0, d1).
if (!map.isMinorIdentity() && map != broadcastInnerDim)		if (!map.isMinorIdentity() && map != broadcastInnerDim)
return false;		return false;
return true;		return true;
}		}

// Return true if the transfer op can be converted to a MMA matrix store.		// Return true if the transfer op can be converted to a MMA matrix store.
static bool		static bool
transferWriteSupportsMMAMatrixType(vector::TransferWriteOp writeOp) {		transferWriteSupportsMMAMatrixType(vector::TransferWriteOp writeOp) {
		// TODO: support 0-d corner case.
		if (writeOp.getTransferRank() == 0)
		return false;

if (writeOp.mask() || writeOp.hasOutOfBoundsDim() ||		if (writeOp.mask() || writeOp.hasOutOfBoundsDim() ||
writeOp.getVectorType().getRank() != 2)		writeOp.getVectorType().getRank() != 2)
return false;		return false;
if (!getMemrefConstantHorizontalStride(writeOp.getShapedType()))		if (!getMemrefConstantHorizontalStride(writeOp.getShapedType()))
return false;		return false;
// TODO: Support transpose once it is added to GPU dialect ops.		// TODO: Support transpose once it is added to GPU dialect ops.
if (!writeOp.permutation_map().isMinorIdentity())		if (!writeOp.permutation_map().isMinorIdentity())
return false;		return false;
[... 187 lines not shown ...]	struct CombineTransferReadOpTranspose final
: public OpRewritePattern<vector::TransposeOp> {		: public OpRewritePattern<vector::TransposeOp> {
using OpRewritePattern<vector::TransposeOp>::OpRewritePattern;		using OpRewritePattern<vector::TransposeOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransposeOp op,		LogicalResult matchAndRewrite(vector::TransposeOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
auto transferReadOp = op.vector().getDefiningOp<vector::TransferReadOp>();		auto transferReadOp = op.vector().getDefiningOp<vector::TransferReadOp>();
if (!transferReadOp)		if (!transferReadOp)
return failure();		return failure();

		// TODO: support 0-d corner case.
		if (transferReadOp.getTransferRank() == 0)
		return failure();

if (transferReadOp.mask() || transferReadOp.hasOutOfBoundsDim())		if (transferReadOp.mask() || transferReadOp.hasOutOfBoundsDim())
return failure();		return failure();
SmallVector<int64_t, 2> perm;		SmallVector<int64_t, 2> perm;
op.getTransp(perm);		op.getTransp(perm);
SmallVector<unsigned, 2> permU;		SmallVector<unsigned, 2> permU;
for (int64_t o : perm)		for (int64_t o : perm)
permU.push_back(unsigned(o));		permU.push_back(unsigned(o));
AffineMap permutationMap =		AffineMap permutationMap =
AffineMap::getPermutationMap(permU, op.getContext());		AffineMap::getPermutationMap(permU, op.getContext());
AffineMap newMap = permutationMap.compose(transferReadOp.permutation_map());		AffineMap newMap = permutationMap.compose(transferReadOp.permutation_map());
rewriter.replaceOpWithNewOp<vector::TransferReadOp>(		rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
op, op.getType(), transferReadOp.source(), transferReadOp.indices(),		op, op.getType(), transferReadOp.source(), transferReadOp.indices(),
newMap, transferReadOp.padding(), transferReadOp.mask(),		AffineMapAttr::get(newMap), transferReadOp.padding(),
transferReadOp.in_boundsAttr());		transferReadOp.mask(), transferReadOp.in_boundsAttr());
return success();		return success();
}		}
};		};

} // namespace		} // namespace

// MMA types have different layout based on how they are used in matmul ops.		// MMA types have different layout based on how they are used in matmul ops.
// Figure the right layout to use by looking at op uses.		// Figure the right layout to use by looking at op uses.
[... 10 lines not shown ...]	for (Operation *users : op->getUsers()) {
if (contract.rhs() == op.getResult())		if (contract.rhs() == op.getResult())
return "BOp";		return "BOp";
}		}
return "COp";		return "COp";
}		}

static void convertTransferReadOp(vector::TransferReadOp op,		static void convertTransferReadOp(vector::TransferReadOp op,
llvm::DenseMap<Value, Value> &valueMapping) {		llvm::DenseMap<Value, Value> &valueMapping) {
		assert(op.getTransferRank() > 0 && "unexpected 0-d transfer");
assert(transferReadSupportsMMAMatrixType(op));		assert(transferReadSupportsMMAMatrixType(op));
Optional<int64_t> stride =		Optional<int64_t> stride =
getMemrefConstantHorizontalStride(op.getShapedType());		getMemrefConstantHorizontalStride(op.getShapedType());
AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
// Handle broadcast by setting the stride to 0.		// Handle broadcast by setting the stride to 0.
if (map.getResult(0).isa<AffineConstantExpr>()) {		if (map.getResult(0).isa<AffineConstantExpr>()) {
assert(map.getResult(0).cast<AffineConstantExpr>().getValue() == 0);		assert(map.getResult(0).cast<AffineConstantExpr>().getValue() == 0);
stride = 0;		stride = 0;

mlir/lib/Conversion/VectorToROCDL/VectorToROCDL.cpp

	template <typename ConcreteOp>			template <typename ConcreteOp>
	class VectorTransferConversion : public ConvertOpToLLVMPattern<ConcreteOp> {			class VectorTransferConversion : public ConvertOpToLLVMPattern<ConcreteOp> {
	public:			public:
	using ConvertOpToLLVMPattern<ConcreteOp>::ConvertOpToLLVMPattern;			using ConvertOpToLLVMPattern<ConcreteOp>::ConvertOpToLLVMPattern;

	LogicalResult			LogicalResult
	matchAndRewrite(ConcreteOp xferOp, typename ConcreteOp::Adaptor adaptor,			matchAndRewrite(ConcreteOp xferOp, typename ConcreteOp::Adaptor adaptor,
	ConversionPatternRewriter &rewriter) const override {			ConversionPatternRewriter &rewriter) const override {
				// TODO: support 0-d corner case.
				if (xferOp.getTransferRank() == 0)
				return failure();
				dcaballe: Do these checks on GPU and ROCDL mean that 0-d tensors are not expected at all by these backends, or is this a TODO to be implemented in the future? If the latter, we should add a TODO comment stating that it will be implemented in the future.

	if (xferOp.getVectorType().getRank() > 1 ||			if (xferOp.getVectorType().getRank() > 1 ||
	llvm::size(xferOp.indices()) == 0)			llvm::size(xferOp.indices()) == 0)
	return failure();			return failure();

	if (!xferOp.permutation_map().isMinorIdentity())			if (!xferOp.permutation_map().isMinorIdentity())
	return failure();			return failure();

	// Have it handled in vector->llvm conversion pass.			// Have it handled in vector->llvm conversion pass.
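
Note on the 0-d guards: in the hunks shown here, rewrite patterns bail out with `return failure()` on a rank-0 transfer, while internal helpers `assert` that the rank is positive. The following is a minimal, self-contained sketch of the pattern-side guard in the order it appears in this hunk. It does not use MLIR; `MatchResult`, `ToyTransfer` and `matchToyTransfer` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for mlir::LogicalResult; not the real MLIR type.
enum class MatchResult { Failure, Success };

// Hypothetical summary of the properties the pattern inspects.
struct ToyTransfer {
  int64_t transferRank;  // number of permutation-map results
  int64_t vectorRank;    // rank of the transferred vector type
  int64_t numIndices;    // number of indices into the source
  bool minorIdentityMap; // whether the permutation map is a minor identity
};

// Mirrors the order of the checks in the hunk: the 0-d bail-out comes first,
// so the remaining checks may assume at least one transfer dimension.
inline MatchResult matchToyTransfer(const ToyTransfer &op) {
  if (op.transferRank == 0)
    return MatchResult::Failure; // 0-d corner case: not handled yet.
  if (op.vectorRank > 1 || op.numIndices == 0)
    return MatchResult::Failure;
  if (!op.minorIdentityMap)
    return MatchResult::Failure;
  return MatchResult::Success;
}
```

Returning a failure from a pattern simply leaves the op for another pattern or pass to handle, which is why it seems a safer choice there than an assertion.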

mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp

Show First 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	struct VectorToSCFPattern : public OpRewritePattern<OpTy> {
VectorTransferToSCFOptions options;		VectorTransferToSCFOptions options;
};		};

/// Given a vector transfer op, calculate which dimension of the `source`		/// Given a vector transfer op, calculate which dimension of the `source`
/// memref should be unpacked in the next application of TransferOpConversion.		/// memref should be unpacked in the next application of TransferOpConversion.
/// A return value of None indicates a broadcast.		/// A return value of None indicates a broadcast.
template <typename OpTy>		template <typename OpTy>
static Optional<int64_t> unpackedDim(OpTy xferOp) {		static Optional<int64_t> unpackedDim(OpTy xferOp) {
		// TODO: support 0-d corner case.
		assert(xferOp.getTransferRank() > 0 && "unexpected 0-d transfer");
auto map = xferOp.permutation_map();		auto map = xferOp.permutation_map();
if (auto expr = map.getResult(0).template dyn_cast<AffineDimExpr>()) {		if (auto expr = map.getResult(0).template dyn_cast<AffineDimExpr>()) {
return expr.getPosition();		return expr.getPosition();
}		}
assert(xferOp.isBroadcastDim(0) &&		assert(xferOp.isBroadcastDim(0) &&
"Expected AffineDimExpr or AffineConstantExpr");		"Expected AffineDimExpr or AffineConstantExpr");
return None;		return None;
}		}

/// Compute the permutation map for the new (N-1)-D vector transfer op. This		/// Compute the permutation map for the new (N-1)-D vector transfer op. This
/// map is identical to the current permutation map, but the first result is		/// map is identical to the current permutation map, but the first result is
/// omitted.		/// omitted.
template <typename OpTy>		template <typename OpTy>
static AffineMap unpackedPermutationMap(OpBuilder &b, OpTy xferOp) {		static AffineMap unpackedPermutationMap(OpBuilder &b, OpTy xferOp) {
		// TODO: support 0-d corner case.
		assert(xferOp.getTransferRank() > 0 && "unexpected 0-d transfer");
auto map = xferOp.permutation_map();		auto map = xferOp.permutation_map();
return AffineMap::get(map.getNumDims(), 0, map.getResults().drop_front(),		return AffineMap::get(map.getNumDims(), 0, map.getResults().drop_front(),
b.getContext());		b.getContext());
}		}

/// Calculate the indices for the new vector transfer op.		/// Calculate the indices for the new vector transfer op.
///		///
/// E.g.: transfer_read %A[%a, %b, %c, %d] ... : vector<5x4x3xf32> ...		/// E.g.: transfer_read %A[%a, %b, %c, %d] ... : vector<5x4x3xf32> ...
▲ Show 20 Lines • Show All 999 Lines • ▼ Show 20 Lines
/// part of TransferOp1dConversion. Return the memref dimension on which		/// part of TransferOp1dConversion. Return the memref dimension on which
/// the transfer is operating. A return value of None indicates a broadcast.		/// the transfer is operating. A return value of None indicates a broadcast.
template <typename OpTy>		template <typename OpTy>
static Optional<int64_t>		static Optional<int64_t>
get1dMemrefIndices(OpBuilder &b, OpTy xferOp, Value iv,		get1dMemrefIndices(OpBuilder &b, OpTy xferOp, Value iv,
SmallVector<Value, 8> &memrefIndices) {		SmallVector<Value, 8> &memrefIndices) {
auto indices = xferOp.indices();		auto indices = xferOp.indices();
auto map = xferOp.permutation_map();		auto map = xferOp.permutation_map();
		assert(xferOp.getTransferRank() > 0 && "unexpected 0-d transfer");

memrefIndices.append(indices.begin(), indices.end());		memrefIndices.append(indices.begin(), indices.end());
assert(map.getNumResults() == 1 &&		assert(map.getNumResults() == 1 &&
"Expected 1 permutation map result for 1D transfer");		"Expected 1 permutation map result for 1D transfer");
if (auto expr = map.getResult(0).template dyn_cast<AffineDimExpr>()) {		if (auto expr = map.getResult(0).template dyn_cast<AffineDimExpr>()) {
Location loc = xferOp.getLoc();		Location loc = xferOp.getLoc();
auto dim = expr.getPosition();		auto dim = expr.getPosition();
AffineExpr d0, d1;		AffineExpr d0, d1;
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines
/// }		/// }
/// ```		/// ```
template <typename OpTy>		template <typename OpTy>
struct TransferOp1dConversion : public VectorToSCFPattern<OpTy> {		struct TransferOp1dConversion : public VectorToSCFPattern<OpTy> {
using VectorToSCFPattern<OpTy>::VectorToSCFPattern;		using VectorToSCFPattern<OpTy>::VectorToSCFPattern;

LogicalResult matchAndRewrite(OpTy xferOp,		LogicalResult matchAndRewrite(OpTy xferOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (xferOp.getTransferRank() == 0)
		return failure();
auto map = xferOp.permutation_map();		auto map = xferOp.permutation_map();
auto memRefType = xferOp.getShapedType().template dyn_cast<MemRefType>();		auto memRefType = xferOp.getShapedType().template dyn_cast<MemRefType>();

if (!memRefType)		if (!memRefType)
return failure();		return failure();
if (xferOp.getVectorType().getRank() != 1)		if (xferOp.getVectorType().getRank() != 1)
return failure();		return failure();
if (map.isMinorIdentity() && isLastMemrefDimUnitStride(memRefType))		if (map.isMinorIdentity() && isLastMemrefDimUnitStride(memRefType))
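
Note on `unpackedDim` and `unpackedPermutationMap`: both helpers look at the first result of the permutation map, which is why they assert a positive transfer rank. Below is a rough sketch of that logic on a toy representation, assuming a map can be modeled as one optional memref dimension per vector dimension. `ToyPermutationMap`, `toyUnpackedDim` and `toyUnpackedPermutationMap` are hypothetical names, not MLIR APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Toy permutation map: one entry per vector dimension. A value is the memref
// dimension that entry reads from; std::nullopt models a broadcast result.
using ToyPermutationMap = std::vector<std::optional<int64_t>>;

// Memref dimension to unpack next, or std::nullopt for a broadcast.
inline std::optional<int64_t> toyUnpackedDim(const ToyPermutationMap &map) {
  assert(!map.empty() && "unexpected 0-d transfer");
  return map.front();
}

// Map for the (N-1)-D transfer: the same results with the first one dropped.
inline ToyPermutationMap
toyUnpackedPermutationMap(const ToyPermutationMap &map) {
  assert(!map.empty() && "unexpected 0-d transfer");
  return ToyPermutationMap(map.begin() + 1, map.end());
}
```

With an empty map there is no first result to inspect or drop, so a 0-d transfer has to be rejected before these helpers are reached.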

mlir/lib/Dialect/Linalg/ComprehensiveBufferize/VectorInterfaceImpl.cpp

Show First 20 Lines • Show All 95 Lines • ▼ Show 20 Lines	LogicalResult bufferize(Operation *op, OpBuilder &b,
// this point.		// this point.
assert(writeOp.getShapedType().isa<TensorType>() &&		assert(writeOp.getShapedType().isa<TensorType>() &&
"only tensor types expected");		"only tensor types expected");
Value resultBuffer = getResultBuffer(b, op->getResult(0), state);		Value resultBuffer = getResultBuffer(b, op->getResult(0), state);
if (!resultBuffer)		if (!resultBuffer)
return failure();		return failure();
b.create<vector::TransferWriteOp>(		b.create<vector::TransferWriteOp>(
writeOp.getLoc(), writeOp.vector(), resultBuffer, writeOp.indices(),		writeOp.getLoc(), writeOp.vector(), resultBuffer, writeOp.indices(),
writeOp.permutation_map(),		writeOp.permutation_mapAttr(), writeOp.in_boundsAttr());
writeOp.in_bounds() ? *writeOp.in_bounds() : ArrayAttr());
state.mapBuffer(op->getResult(0), resultBuffer);		state.mapBuffer(op->getResult(0), resultBuffer);

return success();		return success();
}		}
};		};

} // namespace vector_ext		} // namespace vector_ext
} // namespace comprehensive_bufferize		} // namespace comprehensive_bufferize

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp

Show First 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	struct VectorizationResult {
/// Replacement behavior is specified by `status`.		/// Replacement behavior is specified by `status`.
Operation *newOp;		Operation *newOp;
};		};

/// Return a vector type of the same shape and element type as the (assumed)		/// Return a vector type of the same shape and element type as the (assumed)
/// ShapedType of `v`.		/// ShapedType of `v`.
static VectorType extractVectorTypeFromShapedValue(Value v) {		static VectorType extractVectorTypeFromShapedValue(Value v) {
auto st = v.getType().cast<ShapedType>();		auto st = v.getType().cast<ShapedType>();
if (st.getShape().empty())
return VectorType();
return VectorType::get(st.getShape(), st.getElementType());		return VectorType::get(st.getShape(), st.getElementType());
}		}

static llvm::Optional<vector::CombiningKind>		static llvm::Optional<vector::CombiningKind>
getKindForOp(Operation *reductionOp) {		getKindForOp(Operation *reductionOp) {
if (!reductionOp)		if (!reductionOp)
return llvm::None;		return llvm::None;
return llvm::TypeSwitch<Operation *, llvm::Optional<vector::CombiningKind>>(		return llvm::TypeSwitch<Operation *, llvm::Optional<vector::CombiningKind>>(
▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	VectorType targetVectorType =
VectorType::get(shape, getElementTypeOrSelf(value));		VectorType::get(shape, getElementTypeOrSelf(value));
if (vector::isBroadcastableTo(value.getType(), targetVectorType) !=		if (vector::isBroadcastableTo(value.getType(), targetVectorType) !=
vector::BroadcastableToResult::Success)		vector::BroadcastableToResult::Success)
return value;		return value;
Location loc = b.getInsertionPoint()->getLoc();		Location loc = b.getInsertionPoint()->getLoc();
return b.createOrFold<vector::BroadcastOp>(loc, targetVectorType, value);		return b.createOrFold<vector::BroadcastOp>(loc, targetVectorType, value);
}		}

/// Build a vector.transfer_read from `source` at indices set to all `0`.
/// If source has rank zero, build a `vector<1xt> transfer_read + extract`.
/// Return the produced value.
static Value buildVectorRead(OpBuilder &b, Value source, Type readType,
AffineMap map) {
Location loc = source.getLoc();
auto shapedType = source.getType().cast<ShapedType>();
SmallVector<Value> indices(shapedType.getRank(),
b.create<arith::ConstantIndexOp>(loc, 0));
if (auto vectorType = readType.dyn_cast<VectorType>())
return b.create<vector::TransferReadOp>(loc, vectorType, source, indices,
map);
return vector::TransferReadOp::createScalarOp(b, loc, source, indices);
}

/// Create MultiDimReductionOp to compute the reduction for `reductionOp`. This		/// Create MultiDimReductionOp to compute the reduction for `reductionOp`. This
/// assumes that `reductionOp` has two operands and one of them is the reduction		/// assumes that `reductionOp` has two operands and one of them is the reduction
/// initial value.		/// initial value.
static Value buildMultiDimReduce(OpBuilder &b, Operation *reduceOp,		static Value buildMultiDimReduce(OpBuilder &b, Operation *reduceOp,
Value valueToReduce,		Value valueToReduce,
const SmallVector<bool> &reductionMask) {		const SmallVector<bool> &reductionMask) {
auto maybeKind = getKindForOp(reduceOp);		auto maybeKind = getKindForOp(reduceOp);
assert(maybeKind && "Failed precondition: could not get reduction kind");		assert(maybeKind && "Failed precondition: could not get reduction kind");
Show All 16 Lines
/// to all `0`; where `outputOperand` is an output operand of the LinalgOp		/// to all `0`; where `outputOperand` is an output operand of the LinalgOp
/// currently being vectorized. If `dest` has null rank, build an memref.store.		/// currently being vectorized. If `dest` has null rank, build an memref.store.
/// Return the produced value or null if no value is produced.		/// Return the produced value or null if no value is produced.
static Value buildVectorWrite(OpBuilder &b, Value value,		static Value buildVectorWrite(OpBuilder &b, Value value,
OpOperand *outputOperand) {		OpOperand *outputOperand) {
Operation *write;		Operation *write;
Location loc = value.getLoc();		Location loc = value.getLoc();
auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());		auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());
if (VectorType vectorType =		ArrayRef<int64_t> shape = linalgOp.getShape(outputOperand);
extractVectorTypeFromShapedValue(outputOperand->get())) {		auto vectorType = VectorType::get(
		shape, getElementTypeOrSelf(outputOperand->get().getType()));
		if (vectorType.getRank() > 0) {
		// 0-d case is still special: do not invert the reindexing map.
AffineMap map =		AffineMap map =
reindexIndexingMap(linalgOp.getTiedIndexingMap(outputOperand));		reindexIndexingMap(linalgOp.getTiedIndexingMap(outputOperand));
SmallVector<int64_t> transposeShape =		SmallVector<int64_t> transposeShape =
applyPermutationMap(inversePermutation(map), vectorType.getShape());		applyPermutationMap(inversePermutation(map), vectorType.getShape());
assert(!transposeShape.empty() && "unexpected empty transpose shape");		assert(!transposeShape.empty() && "unexpected empty transpose shape");
vectorType = VectorType::get(transposeShape, vectorType.getElementType());		vectorType = VectorType::get(transposeShape, vectorType.getElementType());
SmallVector<Value> indices(linalgOp.getRank(outputOperand),		SmallVector<Value> indices(linalgOp.getRank(outputOperand),
b.create<arith::ConstantIndexOp>(loc, 0));		b.create<arith::ConstantIndexOp>(loc, 0));
value = broadcastIfNeeded(b, value, vectorType.getShape());		value = broadcastIfNeeded(b, value, vectorType.getShape());
write = b.create<vector::TransferWriteOp>(loc, value, outputOperand->get(),		write = b.create<vector::TransferWriteOp>(loc, value, outputOperand->get(),
indices, map);		indices, map);
} else {		} else {
write = vector::TransferWriteOp::createScalarOp(		if (!value.getType().isa<VectorType>())
b, loc, value, outputOperand->get(), ValueRange{});		value = b.create<vector::BroadcastOp>(loc, vectorType, value);
		dcaballe: This looks like we are vectorizing the operand to be stored in place. Shouldn't this call some generic function that takes care of that (`vectorizeOneOp` or some other one)?
		nicolasvasilache (Author): This is a side-effect of the piece below that says:
		```
		// Not all ops support 0-d vectors, extract the scalar for now.
		// TODO: remove this.
		if (readValue.getType().cast<VectorType>().getRank() == 0)
		  readValue = b.create<vector::ExtractElementOp>(loc, readValue);
		```
		Both should go away once all ops support 0-d vectors.
		assert(value.getType() == vectorType && "incorrect type");
		write = b.create<vector::TransferWriteOp>(loc, value, outputOperand->get(),
		ValueRange{});
}		}
LDBG("vectorized op: " << *write);		LDBG("vectorized op: " << *write);
if (!write->getResults().empty())		if (!write->getResults().empty())
return write->getResult(0);		return write->getResult(0);
return Value();		return Value();
}		}

// Custom vectorization function type. Produce a vector form of Operation*		// Custom vectorization function type. Produce a vector form of Operation*
▲ Show 20 Lines • Show All 257 Lines • ▼ Show 20 Lines	if (linalgOp.getNumOutputs() == 0)
return failure();		return failure();

// TODO: the common vector shape is equal to the static loop sizes only when		// TODO: the common vector shape is equal to the static loop sizes only when
// all indexing maps are projected permutations. For convs and stencils the		// all indexing maps are projected permutations. For convs and stencils the
// logic will need to evolve.		// logic will need to evolve.
SmallVector<int64_t> commonVectorShape = linalgOp.computeStaticLoopSizes();		SmallVector<int64_t> commonVectorShape = linalgOp.computeStaticLoopSizes();

// 3. Turn all BBArgs into vector.transfer_read / load.		// 3. Turn all BBArgs into vector.transfer_read / load.
SmallVector<AffineMap> indexings;		Location loc = linalgOp.getLoc();
		Value zero = b.create<arith::ConstantIndexOp>(loc, 0);
for (OpOperand *opOperand : linalgOp.getInputAndOutputOperands()) {		for (OpOperand *opOperand : linalgOp.getInputAndOutputOperands()) {
BlockArgument bbarg = block->getArgument(opOperand->getOperandNumber());		BlockArgument bbarg = block->getArgument(opOperand->getOperandNumber());
if (linalgOp.isScalar(opOperand)) {		if (linalgOp.isScalar(opOperand)) {
bvm.map(bbarg, opOperand->get());		bvm.map(bbarg, opOperand->get());
continue;		continue;
}		}
// TODO: 0-d vectors.		VectorType readType;
Type readType;
AffineMap map;		AffineMap map;
if (linalgOp.getShape(opOperand).empty()) {		// TODO: can we keep this simplification?
readType = bbarg.getType();		// if (linalgOp.getShape(opOperand).empty()) {
} else {		// readType = VectorType::get({}, bbarg.getType());
		// } else {
if (opOperand->getOperandNumber() < linalgOp.getNumInputs()) {		if (opOperand->getOperandNumber() < linalgOp.getNumInputs()) {
map = inverseAndBroadcastProjectedPermuation(		map = inverseAndBroadcastProjectedPermuation(
linalgOp.getTiedIndexingMap(opOperand));		linalgOp.getTiedIndexingMap(opOperand));
readType = VectorType::get(commonVectorShape,		readType = VectorType::get(commonVectorShape,
getElementTypeOrSelf(opOperand->get()));		getElementTypeOrSelf(opOperand->get()));
} else {		} else {
map = inversePermutation(		map = inversePermutation(
reindexIndexingMap(linalgOp.getTiedIndexingMap(opOperand)));		reindexIndexingMap(linalgOp.getTiedIndexingMap(opOperand)));
readType = VectorType::get(map.compose(linalgOp.getShape(opOperand)),		readType = VectorType::get(map.compose(linalgOp.getShape(opOperand)),
getElementTypeOrSelf(opOperand->get()));		getElementTypeOrSelf(opOperand->get()));
}		}
}		// }
Value readValue = buildVectorRead(b, opOperand->get(), readType, map);
		auto shape = linalgOp.getShape(opOperand);
		SmallVector<Value> indices(shape.size(), zero);
		Value readValue = b.create<vector::TransferReadOp>(
		loc, readType, opOperand->get(), indices, map);
		// Not all ops support 0-d vectors, extract the scalar for now.
		// TODO: remove this.
		if (readValue.getType().cast<VectorType>().getRank() == 0)
		readValue = b.create<vector::ExtractElementOp>(loc, readValue);

LDBG("new vectorized bbarg(" << bbarg.getArgNumber() << "): " << readValue);		LDBG("new vectorized bbarg(" << bbarg.getArgNumber() << "): " << readValue);
bvm.map(bbarg, readValue);		bvm.map(bbarg, readValue);
bvm.map(opOperand->get(), readValue);		bvm.map(opOperand->get(), readValue);
}		}

SmallVector<CustomVectorizationHook> hooks;		SmallVector<CustomVectorizationHook> hooks;
// 4a. Register CustomVectorizationHook for yieldOp.		// 4a. Register CustomVectorizationHook for yieldOp.
CustomVectorizationHook vectorizeYield =		CustomVectorizationHook vectorizeYield =
▲ Show 20 Lines • Show All 195 Lines • ▼ Show 20 Lines	static LogicalResult tryVectorizeCopy(PatternRewriter &rewriter,
auto vecType = VectorType::get(vecShape, sourceType.getElementType());		auto vecType = VectorType::get(vecShape, sourceType.getElementType());

// Generate TransferReadOp.		// Generate TransferReadOp.
SmallVector<Value> readIndices(		SmallVector<Value> readIndices(
vecType.getRank(),		vecType.getRank(),
rewriter.create<arith::ConstantIndexOp>(padOp.getLoc(), 0));		rewriter.create<arith::ConstantIndexOp>(padOp.getLoc(), 0));
auto read = rewriter.create<vector::TransferReadOp>(		auto read = rewriter.create<vector::TransferReadOp>(
padOp.getLoc(), vecType, padOp.source(), readIndices, padValue,		padOp.getLoc(), vecType, padOp.source(), readIndices, padValue,
readInBounds);		ArrayRef<bool>{readInBounds});

// If `dest` is a FillOp and the TransferWriteOp would overwrite the entire		// If `dest` is a FillOp and the TransferWriteOp would overwrite the entire
// tensor, write directly to the FillOp's operand.		// tensor, write directly to the FillOp's operand.
if (llvm::equal(vecShape, resultType.getShape()) &&		if (llvm::equal(vecShape, resultType.getShape()) &&
llvm::all_of(writeInBounds, [](bool b) { return b; }))		llvm::all_of(writeInBounds, [](bool b) { return b; }))
if (auto fill = dest.getDefiningOp<FillOp>())		if (auto fill = dest.getDefiningOp<FillOp>())
dest = fill.output();		dest = fill.output();

// Generate TransferWriteOp.		// Generate TransferWriteOp.
auto writeIndices =		auto writeIndices =
ofrToIndexValues(rewriter, padOp.getLoc(), padOp.getMixedLowPad());		ofrToIndexValues(rewriter, padOp.getLoc(), padOp.getMixedLowPad());
rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(		rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
padOp, read, dest, writeIndices, writeInBounds);		padOp, read, dest, writeIndices, ArrayRef<bool>{writeInBounds});

return success();		return success();
}		}
};		};

/// Base pattern for rewriting PadTensorOps whose result is consumed by a given		/// Base pattern for rewriting PadTensorOps whose result is consumed by a given
/// operation type OpTy.		/// operation type OpTy.
template <typename OpTy>		template <typename OpTy>
▲ Show 20 Lines • Show All 96 Lines • ▼ Show 20 Lines
/// - Single, scalar padding value.		/// - Single, scalar padding value.
struct PadTensorOpVectorizationWithTransferWritePattern		struct PadTensorOpVectorizationWithTransferWritePattern
: public VectorizePadTensorOpUserPattern<vector::TransferWriteOp> {		: public VectorizePadTensorOpUserPattern<vector::TransferWriteOp> {
using VectorizePadTensorOpUserPattern<		using VectorizePadTensorOpUserPattern<
vector::TransferWriteOp>::VectorizePadTensorOpUserPattern;		vector::TransferWriteOp>::VectorizePadTensorOpUserPattern;

LogicalResult rewriteUser(PatternRewriter &rewriter, PadTensorOp padOp,		LogicalResult rewriteUser(PatternRewriter &rewriter, PadTensorOp padOp,
vector::TransferWriteOp xferOp) const override {		vector::TransferWriteOp xferOp) const override {
		// TODO: support 0-d corner case.
		if (xferOp.getTransferRank() == 0)
		return failure();

// Low padding must be static 0.		// Low padding must be static 0.
if (!padOp.hasZeroLowPad())		if (!padOp.hasZeroLowPad())
return failure();		return failure();
// Pad value must be a constant.		// Pad value must be a constant.
auto padValue = padOp.getConstantPaddingValue();		auto padValue = padOp.getConstantPaddingValue();
if (!padValue)		if (!padValue)
return failure();		return failure();
// TransferWriteOp result must be directly consumed by an ExtractSliceOp.		// TransferWriteOp result must be directly consumed by an ExtractSliceOp.
▲ Show 20 Lines • Show All 178 Lines • ▼ Show 20 Lines	LogicalResult rewriteUser(PatternRewriter &rewriter, PadTensorOp padOp,

// Generate TransferWriteOp: Write to InsertSliceOp's dest tensor at		// Generate TransferWriteOp: Write to InsertSliceOp's dest tensor at
// specified offsets. Write is fully in-bounds because a InsertSliceOp's		// specified offsets. Write is fully in-bounds because a InsertSliceOp's
// source must fit into the destination at the specified offsets.		// source must fit into the destination at the specified offsets.
auto writeIndices =		auto writeIndices =
ofrToIndexValues(rewriter, padOp.getLoc(), insertOp.getMixedOffsets());		ofrToIndexValues(rewriter, padOp.getLoc(), insertOp.getMixedOffsets());
SmallVector<bool> inBounds(vecRank, true);		SmallVector<bool> inBounds(vecRank, true);
rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(		rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
insertOp, read, insertOp.dest(), writeIndices, inBounds);		insertOp, read, insertOp.dest(), writeIndices,
		ArrayRef<bool>{inBounds});

return success();		return success();
}		}
};		};

void mlir::linalg::populatePadTensorOpVectorizationPatterns(		void mlir::linalg::populatePadTensorOpVectorizationPatterns(
RewritePatternSet &patterns, PatternBenefit baseBenefit) {		RewritePatternSet &patterns, PatternBenefit baseBenefit) {
patterns.add<GenericPadTensorOpVectorizationPattern>(patterns.getContext(),		patterns.add<GenericPadTensorOpVectorizationPattern>(patterns.getContext(),
▲ Show 20 Lines • Show All 177 Lines • ▼ Show 20 Lines	static memref::SubViewOp getSubViewUseIfUnique(Value v) {
return subViewOp;		return subViewOp;
}		}

/// TODO: use interfaces, side-effects and aliasing analysis as appropriate,		/// TODO: use interfaces, side-effects and aliasing analysis as appropriate,
/// when available.		/// when available.
LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(		LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(
vector::TransferReadOp xferOp, PatternRewriter &rewriter) const {		vector::TransferReadOp xferOp, PatternRewriter &rewriter) const {

		// TODO: support mask.
		if (xferOp.mask())
		return failure();

// Transfer into `view`.		// Transfer into `view`.
Value viewOrAlloc = xferOp.source();		Value viewOrAlloc = xferOp.source();
if (!viewOrAlloc.getDefiningOp<memref::ViewOp>() &&		if (!viewOrAlloc.getDefiningOp<memref::ViewOp>() &&
!viewOrAlloc.getDefiningOp<memref::AllocOp>())		!viewOrAlloc.getDefiningOp<memref::AllocOp>())
return failure();		return failure();

LDBG(viewOrAlloc);		LDBG(viewOrAlloc);

▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(
Value in = copyOp.input();		Value in = copyOp.input();

// linalg.copy + linalg.fill can be used to create a padded local buffer.		// linalg.copy + linalg.fill can be used to create a padded local buffer.
// The `masked` attribute is only valid on this padded buffer.		// The `masked` attribute is only valid on this padded buffer.
// When forwarding to vector.transfer_read, the attribute must be reset		// When forwarding to vector.transfer_read, the attribute must be reset
// conservatively.		// conservatively.
Value res = rewriter.create<vector::TransferReadOp>(		Value res = rewriter.create<vector::TransferReadOp>(
xferOp.getLoc(), xferOp.getVectorType(), in, xferOp.indices(),		xferOp.getLoc(), xferOp.getVectorType(), in, xferOp.indices(),
xferOp.permutation_map(), xferOp.padding(), ArrayAttr());		xferOp.permutation_mapAttr(), xferOp.padding(), xferOp.mask(),
		// in_bounds is explicitly reset
		/*inBoundsAttr=*/ArrayAttr());

if (maybeFillOp)		if (maybeFillOp)
rewriter.eraseOp(maybeFillOp);		rewriter.eraseOp(maybeFillOp);
rewriter.eraseOp(copyOp);		rewriter.eraseOp(copyOp);
rewriter.replaceOp(xferOp, res);		rewriter.replaceOp(xferOp, res);

return success();		return success();
}		}

/// TODO: use interfaces, side-effects and aliasing analysis as appropriate,		/// TODO: use interfaces, side-effects and aliasing analysis as appropriate,
/// when available.		/// when available.
LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(		LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(
vector::TransferWriteOp xferOp, PatternRewriter &rewriter) const {		vector::TransferWriteOp xferOp, PatternRewriter &rewriter) const {
		// TODO: support mask.
		if (xferOp.mask())
		return failure();

// Transfer into `viewOrAlloc`.		// Transfer into `viewOrAlloc`.
Value viewOrAlloc = xferOp.source();		Value viewOrAlloc = xferOp.source();
if (!viewOrAlloc.getDefiningOp<memref::ViewOp>() &&		if (!viewOrAlloc.getDefiningOp<memref::ViewOp>() &&
!viewOrAlloc.getDefiningOp<memref::AllocOp>())		!viewOrAlloc.getDefiningOp<memref::AllocOp>())
return failure();		return failure();

// Ensure there is exactly one subview of `viewOrAlloc` defining `subView`.		// Ensure there is exactly one subview of `viewOrAlloc` defining `subView`.
memref::SubViewOp subViewOp = getSubViewUseIfUnique(viewOrAlloc);		memref::SubViewOp subViewOp = getSubViewUseIfUnique(viewOrAlloc);
Show All 22 Lines	LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(

// Forward vector.transfer into copy.		// Forward vector.transfer into copy.
// linalg.copy + linalg.fill can be used to create a padded local buffer.		// linalg.copy + linalg.fill can be used to create a padded local buffer.
// The `masked` attribute is only valid on this padded buffer.		// The `masked` attribute is only valid on this padded buffer.
// When forwarding to vector.transfer_write, the attribute must be reset		// When forwarding to vector.transfer_write, the attribute must be reset
// conservatively.		// conservatively.
rewriter.create<vector::TransferWriteOp>(		rewriter.create<vector::TransferWriteOp>(
xferOp.getLoc(), xferOp.vector(), out, xferOp.indices(),		xferOp.getLoc(), xferOp.vector(), out, xferOp.indices(),
xferOp.permutation_map(), ArrayAttr());		xferOp.permutation_mapAttr(), xferOp.mask(),
		// in_bounds is explicitly reset
		/*inBoundsAttr=*/ArrayAttr());

rewriter.eraseOp(copyOp);		rewriter.eraseOp(copyOp);
rewriter.eraseOp(xferOp);		rewriter.eraseOp(xferOp);

return success();		return success();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
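
Note on the temporary 0-d round trip in Vectorization.cpp: the read side unwraps a 0-d vector to a scalar because not all ops accept 0-d vectors yet, and the write side re-wraps a scalar before emitting the 0-d `vector.transfer_write`. The sketch below models only that type-level behavior, under the assumption that a value can be described by a scalar/vector flag plus a shape. `ToyType`, `toyReadOperand` and `toyPrepareWrite` are hypothetical names, and the broadcasting done in the rank > 0 branch is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy type: either a scalar or a vector of a given shape. A vector with an
// empty shape models a 0-d vector.
struct ToyType {
  bool isVector;
  std::vector<int64_t> shape;
};

// Read side: the transfer always yields a vector; a 0-d result is unwrapped
// to a scalar (stand-in for the vector::ExtractElementOp in the diff).
inline ToyType toyReadOperand(const std::vector<int64_t> &shape) {
  ToyType read{true, shape};
  if (read.shape.empty())
    return ToyType{false, {}};
  return read;
}

// Write side: a 0-d destination expects a 0-d vector, so a scalar is wrapped
// again (stand-in for the vector::BroadcastOp in the diff).
inline ToyType toyPrepareWrite(ToyType value,
                               const std::vector<int64_t> &destShape) {
  if (destShape.empty() && !value.isVector)
    value = ToyType{true, destShape};
  return value;
}
```

As the author's inline reply says, both halves of this round trip are meant to disappear once all ops support 0-d vectors.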

mlir/lib/Dialect/MemRef/Transforms/FoldSubViewOps.cpp

Show First 20 Lines • Show All 97 Lines • ▼ Show 20 Lines

static Value getMemRefOperand(vector::TransferWriteOp op) {		static Value getMemRefOperand(vector::TransferWriteOp op) {
return op.source();		return op.source();
}		}

/// Given the permutation map of the original		/// Given the permutation map of the original
/// `vector.transfer_read`/`vector.transfer_write` operations compute the		/// `vector.transfer_read`/`vector.transfer_write` operations compute the
/// permutation map to use after the subview is folded with it.		/// permutation map to use after the subview is folded with it.
static AffineMap getPermutationMap(MLIRContext *context,		static AffineMapAttr getPermutationMapAttr(MLIRContext *context,
memref::SubViewOp subViewOp,		memref::SubViewOp subViewOp,
AffineMap currPermutationMap) {		AffineMap currPermutationMap) {
llvm::SmallDenseSet<unsigned> unusedDims = subViewOp.getDroppedDims();		llvm::SmallDenseSet<unsigned> unusedDims = subViewOp.getDroppedDims();
SmallVector<AffineExpr> exprs;		SmallVector<AffineExpr> exprs;
int64_t sourceRank = subViewOp.getSourceType().getRank();		int64_t sourceRank = subViewOp.getSourceType().getRank();
for (auto dim : llvm::seq<int64_t>(0, sourceRank)) {		for (auto dim : llvm::seq<int64_t>(0, sourceRank)) {
if (unusedDims.count(dim))		if (unusedDims.count(dim))
continue;		continue;
exprs.push_back(getAffineDimExpr(dim, context));		exprs.push_back(getAffineDimExpr(dim, context));
}		}
auto resultDimToSourceDimMap = AffineMap::get(sourceRank, 0, exprs, context);		auto resultDimToSourceDimMap = AffineMap::get(sourceRank, 0, exprs, context);
return currPermutationMap.compose(resultDimToSourceDimMap);		return AffineMapAttr::get(
		currPermutationMap.compose(resultDimToSourceDimMap));
}		}
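
Note on `getPermutationMapAttr`: the function builds a map from the subview's remaining dimensions to the source dimensions by skipping the dropped ones, then composes the transfer's current permutation map with it. A small worked sketch, assuming the maps are pure dimension selections that can be stored as index lists (`ToyDimMap` and `toyFoldedPermutationMap` are hypothetical names, not MLIR APIs):

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Toy map: result i holds the index of the input dimension it selects.
using ToyDimMap = std::vector<int64_t>;

// Skip the dropped source dimensions to get subview-dim -> source-dim, then
// rewrite each result of the current map in terms of source dimensions.
inline ToyDimMap toyFoldedPermutationMap(int64_t sourceRank,
                                         const std::set<int64_t> &droppedDims,
                                         const ToyDimMap &currPermutationMap) {
  ToyDimMap resultDimToSourceDim;
  for (int64_t dim = 0; dim < sourceRank; ++dim)
    if (!droppedDims.count(dim))
      resultDimToSourceDim.push_back(dim);

  ToyDimMap folded;
  for (int64_t subviewDim : currPermutationMap) {
    assert(subviewDim >= 0 &&
           subviewDim < static_cast<int64_t>(resultDimToSourceDim.size()));
    folded.push_back(resultDimToSourceDim[subviewDim]);
  }
  return folded;
}
```

For a rank-3 source whose dimension 1 is dropped, a transfer that read subview dimension 1 ends up reading source dimension 2 after folding.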

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Patterns		// Patterns
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {
/// Merges subview operation with load/transferRead operation.		/// Merges subview operation with load/transferRead operation.
...	void LoadOpOfSubViewFolder<memref::LoadOp>::replaceOp(
memref::LoadOp loadOp, memref::SubViewOp subViewOp,		memref::LoadOp loadOp, memref::SubViewOp subViewOp,
ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {		ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
rewriter.replaceOpWithNewOp<memref::LoadOp>(loadOp, subViewOp.source(),		rewriter.replaceOpWithNewOp<memref::LoadOp>(loadOp, subViewOp.source(),
sourceIndices);		sourceIndices);
}		}

template <>		template <>
void LoadOpOfSubViewFolder<vector::TransferReadOp>::replaceOp(		void LoadOpOfSubViewFolder<vector::TransferReadOp>::replaceOp(
vector::TransferReadOp loadOp, memref::SubViewOp subViewOp,		vector::TransferReadOp transferReadOp, memref::SubViewOp subViewOp,
ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {		ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
		// TODO: support 0-d corner case.
		if (transferReadOp.getTransferRank() == 0)
		return;
rewriter.replaceOpWithNewOp<vector::TransferReadOp>(		rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
loadOp, loadOp.getVectorType(), subViewOp.source(), sourceIndices,		transferReadOp, transferReadOp.getVectorType(), subViewOp.source(),
getPermutationMap(rewriter.getContext(), subViewOp,		sourceIndices,
loadOp.permutation_map()),		getPermutationMapAttr(rewriter.getContext(), subViewOp,
loadOp.padding(), loadOp.in_boundsAttr());		transferReadOp.permutation_map()),
		transferReadOp.padding(),
		/*mask=*/Value(), transferReadOp.in_boundsAttr());
}		}

template <>		template <>
void StoreOpOfSubViewFolder<memref::StoreOp>::replaceOp(		void StoreOpOfSubViewFolder<memref::StoreOp>::replaceOp(
memref::StoreOp storeOp, memref::SubViewOp subViewOp,		memref::StoreOp storeOp, memref::SubViewOp subViewOp,
ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {		ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
rewriter.replaceOpWithNewOp<memref::StoreOp>(		rewriter.replaceOpWithNewOp<memref::StoreOp>(
storeOp, storeOp.value(), subViewOp.source(), sourceIndices);		storeOp, storeOp.value(), subViewOp.source(), sourceIndices);
}		}

template <>		template <>
void StoreOpOfSubViewFolder<vector::TransferWriteOp>::replaceOp(		void StoreOpOfSubViewFolder<vector::TransferWriteOp>::replaceOp(
vector::TransferWriteOp transferWriteOp, memref::SubViewOp subViewOp,		vector::TransferWriteOp transferWriteOp, memref::SubViewOp subViewOp,
ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {		ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
		// TODO: support 0-d corner case.
		if (transferWriteOp.getTransferRank() == 0)
		return;
rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(		rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
transferWriteOp, transferWriteOp.vector(), subViewOp.source(),		transferWriteOp, transferWriteOp.vector(), subViewOp.source(),
sourceIndices,		sourceIndices,
getPermutationMap(rewriter.getContext(), subViewOp,		getPermutationMapAttr(rewriter.getContext(), subViewOp,
transferWriteOp.permutation_map()),		transferWriteOp.permutation_map()),
transferWriteOp.in_boundsAttr());		transferWriteOp.in_boundsAttr());
}		}
} // namespace		} // namespace

template <typename OpTy>		template <typename OpTy>
LogicalResult		LogicalResult
LoadOpOfSubViewFolder<OpTy>::matchAndRewrite(OpTy loadOp,		LoadOpOfSubViewFolder<OpTy>::matchAndRewrite(OpTy loadOp,
PatternRewriter &rewriter) const {		PatternRewriter &rewriter) const {
...
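
The composition performed by `getPermutationMapAttr` above can be illustrated without MLIR. The following is only a sketch under stated assumptions: a permutation map is modelled as a plain list of input-dimension indices, `-1` stands in for a broadcast (constant `0`) result, and the name `composeWithDroppedDims` is hypothetical rather than part of the patch. Each result of the current map, which indexes the rank-reduced subview, is redirected to the matching non-dropped dimension of the subview source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the map composition in getPermutationMapAttr.
// `currResults[j]` is the subview dimension read by vector dimension j,
// or -1 for a broadcast result (an assumption of this sketch).
std::vector<int64_t> composeWithDroppedDims(
    int64_t sourceRank, const std::set<int64_t> &droppedDims,
    const std::vector<int64_t> &currResults) {
  // Source dimensions that survive the rank-reducing subview, in order.
  std::vector<int64_t> kept;
  for (int64_t dim = 0; dim < sourceRank; ++dim)
    if (!droppedDims.count(dim))
      kept.push_back(dim);

  // Substitute each subview dimension with the source dimension it came from.
  std::vector<int64_t> composed;
  for (int64_t r : currResults) {
    assert(r < static_cast<int64_t>(kept.size()) && "result out of range");
    composed.push_back(r < 0 ? r : kept.at(static_cast<std::size_t>(r)));
  }
  return composed;
}
```

For instance, with a rank-4 source whose dimension 1 is dropped, a map `(d0, d1, d2) -> (d2, d1)` on the subview becomes `(d0, d1, d2, d3) -> (d3, d2)` on the source.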

mlir/lib/Dialect/Vector/VectorDropLeadUnitDim.cpp

	...
	// vector.shape_cast followed by vector.transfer_read on vector without leading			// vector.shape_cast followed by vector.transfer_read on vector without leading
	// 1 dimensions.			// 1 dimensions.
	struct CastAwayTransferReadLeadingOneDim			struct CastAwayTransferReadLeadingOneDim
	: public OpRewritePattern<vector::TransferReadOp> {			: public OpRewritePattern<vector::TransferReadOp> {
	using OpRewritePattern::OpRewritePattern;			using OpRewritePattern::OpRewritePattern;

	LogicalResult matchAndRewrite(vector::TransferReadOp read,			LogicalResult matchAndRewrite(vector::TransferReadOp read,
	PatternRewriter &rewriter) const override {			PatternRewriter &rewriter) const override {
				// TODO: support 0-d corner case.
				if (read.getTransferRank() == 0)
				return failure();

	if (read.mask())			if (read.mask())
	return failure();			return failure();

	auto shapedType = read.source().getType().cast<ShapedType>();			auto shapedType = read.source().getType().cast<ShapedType>();
	if (shapedType.getElementType() != read.getVectorType().getElementType())			if (shapedType.getElementType() != read.getVectorType().getElementType())
	return failure();			return failure();

	VectorType oldType = read.getVectorType();			VectorType oldType = read.getVectorType();
	VectorType newType = trimLeadingOneDims(oldType);			VectorType newType = trimLeadingOneDims(oldType);

	if (newType == oldType)			if (newType == oldType)
	return failure();			return failure();

	AffineMap oldMap = read.permutation_map();			AffineMap oldMap = read.permutation_map();
	ArrayRef<AffineExpr> newResults =			ArrayRef<AffineExpr> newResults =
	oldMap.getResults().take_back(newType.getRank());			oldMap.getResults().take_back(newType.getRank());
	AffineMap newMap =			AffineMap newMap =
	AffineMap::get(oldMap.getNumDims(), oldMap.getNumSymbols(), newResults,			AffineMap::get(oldMap.getNumDims(), oldMap.getNumSymbols(), newResults,
	rewriter.getContext());			rewriter.getContext());

	ArrayAttr inBounds;			ArrayAttr inBoundsAttr;
	if (read.in_bounds())			if (read.in_bounds())
	inBounds = rewriter.getArrayAttr(			inBoundsAttr = rewriter.getArrayAttr(
	read.in_boundsAttr().getValue().take_back(newType.getRank()));			read.in_boundsAttr().getValue().take_back(newType.getRank()));

	auto newRead = rewriter.create<vector::TransferReadOp>(			auto newRead = rewriter.create<vector::TransferReadOp>(
	read.getLoc(), newType, read.source(), read.indices(), newMap,			read.getLoc(), newType, read.source(), read.indices(),
	read.padding(), inBounds);			AffineMapAttr::get(newMap), read.padding(), /*mask=*/Value(),
				inBoundsAttr);
	rewriter.replaceOpWithNewOp<vector::BroadcastOp>(read, oldType, newRead);			rewriter.replaceOpWithNewOp<vector::BroadcastOp>(read, oldType, newRead);

	return success();			return success();
	}			}
	};			};

	// Turns vector.transfer_write on vector with leading 1 dimensions into			// Turns vector.transfer_write on vector with leading 1 dimensions into
	// vector.shape_cast followed by vector.transfer_write on vector without leading			// vector.shape_cast followed by vector.transfer_write on vector without leading
	// 1 dimensions.			// 1 dimensions.
	struct CastAwayTransferWriteLeadingOneDim			struct CastAwayTransferWriteLeadingOneDim
	: public OpRewritePattern<vector::TransferWriteOp> {			: public OpRewritePattern<vector::TransferWriteOp> {
	using OpRewritePattern::OpRewritePattern;			using OpRewritePattern::OpRewritePattern;

	LogicalResult matchAndRewrite(vector::TransferWriteOp write,			LogicalResult matchAndRewrite(vector::TransferWriteOp write,
	PatternRewriter &rewriter) const override {			PatternRewriter &rewriter) const override {
				// TODO: support 0-d corner case.
				if (write.getTransferRank() == 0)
				return failure();

	if (write.mask())			if (write.mask())
	return failure();			return failure();

	auto shapedType = write.source().getType().dyn_cast<ShapedType>();			auto shapedType = write.source().getType().dyn_cast<ShapedType>();
	if (shapedType.getElementType() != write.getVectorType().getElementType())			if (shapedType.getElementType() != write.getVectorType().getElementType())
	return failure();			return failure();

	VectorType oldType = write.getVectorType();			VectorType oldType = write.getVectorType();
	VectorType newType = trimLeadingOneDims(oldType);			VectorType newType = trimLeadingOneDims(oldType);
	if (newType == oldType)			if (newType == oldType)
	return failure();			return failure();
	int64_t dropDim = oldType.getRank() - newType.getRank();			int64_t dropDim = oldType.getRank() - newType.getRank();

	AffineMap oldMap = write.permutation_map();			AffineMap oldMap = write.permutation_map();
	ArrayRef<AffineExpr> newResults =			ArrayRef<AffineExpr> newResults =
	oldMap.getResults().take_back(newType.getRank());			oldMap.getResults().take_back(newType.getRank());
	AffineMap newMap =			AffineMap newMap =
	AffineMap::get(oldMap.getNumDims(), oldMap.getNumSymbols(), newResults,			AffineMap::get(oldMap.getNumDims(), oldMap.getNumSymbols(), newResults,
	rewriter.getContext());			rewriter.getContext());

	ArrayAttr inBounds;			ArrayAttr inBoundsAttr;
	if (write.in_bounds())			if (write.in_bounds())
	inBounds = rewriter.getArrayAttr(			inBoundsAttr = rewriter.getArrayAttr(
	write.in_boundsAttr().getValue().take_back(newType.getRank()));			write.in_boundsAttr().getValue().take_back(newType.getRank()));

	auto newVector = rewriter.create<vector::ExtractOp>(			auto newVector = rewriter.create<vector::ExtractOp>(
	write.getLoc(), write.vector(), splatZero(dropDim));			write.getLoc(), write.vector(), splatZero(dropDim));
	rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(			rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
	write, newVector, write.source(), write.indices(), newMap, inBounds);			write, newVector, write.source(), write.indices(),
				AffineMapAttr::get(newMap), inBoundsAttr);

	return success();			return success();
	}			}
	};			};

	class CastAwayElementwiseLeadingOneDim : public RewritePattern {			class CastAwayElementwiseLeadingOneDim : public RewritePattern {
	public:			public:
	CastAwayElementwiseLeadingOneDim(MLIRContext *context)			CastAwayElementwiseLeadingOneDim(MLIRContext *context)
	...
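
Both `CastAway...LeadingOneDim` patterns above shrink the vector type and then keep only the trailing `newType.getRank()` entries of the permutation map results and of `in_bounds`. A standalone sketch of that bookkeeping follows. It assumes that trimming always leaves at least one dimension; the helpers `trimLeadingOnes` and `takeBack` are hypothetical stand-ins for `trimLeadingOneDims` and `ArrayRef::take_back`, not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: drops leading unit dimensions from a shape.
// Assumption of this sketch: at least one dimension is always kept.
std::vector<int64_t> trimLeadingOnes(const std::vector<int64_t> &shape) {
  std::size_t first = 0;
  while (first + 1 < shape.size() && shape[first] == 1)
    ++first;
  return std::vector<int64_t>(
      shape.begin() + static_cast<std::ptrdiff_t>(first), shape.end());
}

// Hypothetical helper: returns the last `n` entries of `values`.
template <typename T>
std::vector<T> takeBack(const std::vector<T> &values, std::size_t n) {
  assert(n <= values.size() && "cannot take more entries than available");
  return std::vector<T>(values.end() - static_cast<std::ptrdiff_t>(n),
                        values.end());
}
```

With a `1x1x4` vector the trimmed shape is `4`, so only the last map result and the last `in_bounds` flag are carried over to the new transfer op.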

mlir/lib/Dialect/Vector/VectorOps.cpp

...	if (positionAttr.size() > static_cast<unsigned>(destVectorType.getRank()))
return op.emitOpError(		return op.emitOpError(
"expected position attribute of rank smaller than dest vector rank");		"expected position attribute of rank smaller than dest vector rank");
auto srcVectorType = op.getSourceType().dyn_cast<VectorType>();		auto srcVectorType = op.getSourceType().dyn_cast<VectorType>();
if (srcVectorType &&		if (srcVectorType &&
(static_cast<unsigned>(srcVectorType.getRank()) + positionAttr.size() !=		(static_cast<unsigned>(srcVectorType.getRank()) + positionAttr.size() !=
static_cast<unsigned>(destVectorType.getRank())))		static_cast<unsigned>(destVectorType.getRank())))
return op.emitOpError("expected position attribute rank + source rank to "		return op.emitOpError("expected position attribute rank + source rank to "
"match dest vector rank");		"match dest vector rank");
if (!srcVectorType && (positionAttr.size() !=		if (!srcVectorType &&
static_cast<unsigned>(destVectorType.getRank())))		(positionAttr.size() != static_cast<unsigned>(destVectorType.getRank())))
return op.emitOpError(		return op.emitOpError(
"expected position attribute rank to match the dest vector rank");		"expected position attribute rank to match the dest vector rank");
for (auto en : llvm::enumerate(positionAttr)) {		for (auto en : llvm::enumerate(positionAttr)) {
auto attr = en.value().dyn_cast<IntegerAttr>();		auto attr = en.value().dyn_cast<IntegerAttr>();
if (!attr || attr.getInt() < 0 ||		if (!attr || attr.getInt() < 0 ||
attr.getInt() >= destVectorType.getDimSize(en.index()))		attr.getInt() >= destVectorType.getDimSize(en.index()))
return op.emitOpError("expected position attribute #")		return op.emitOpError("expected position attribute #")
<< (en.index() + 1)		<< (en.index() + 1)
...	void ExtractStridedSliceOp::getCanonicalizationPatterns(
results.add<StridedSliceConstantMaskFolder, StridedSliceConstantFolder,		results.add<StridedSliceConstantMaskFolder, StridedSliceConstantFolder,
StridedSliceBroadcast, StridedSliceSplat>(context);		StridedSliceBroadcast, StridedSliceSplat>(context);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// TransferReadOp		// TransferReadOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		/// 1. Builder that sets padding to zero and an empty mask (variant with attrs).
		dcaballe (inline comment, marked Done): and an?
		void TransferReadOp::build(OpBuilder &builder, OperationState &result,
		VectorType vectorType, Value source,
		ValueRange indices, AffineMapAttr permutationMapAttr,
		/*optional*/ ArrayAttr inBoundsAttr) {
		Type elemType = source.getType().cast<ShapedType>().getElementType();
		Value padding = builder.create<arith::ConstantOp>(
		result.location, elemType, builder.getZeroAttr(elemType));
		build(builder, result, vectorType, source, indices, permutationMapAttr,
		padding, /*mask=*/Value(), inBoundsAttr);
		}

		/// 2. Builder that sets padding to zero and an empty mask (variant without attrs).
		void TransferReadOp::build(OpBuilder &builder, OperationState &result,
		VectorType vectorType, Value source,
		ValueRange indices, AffineMap permutationMap,
		Optional<ArrayRef<bool>> inBounds) {
		auto permutationMapAttr = AffineMapAttr::get(permutationMap);
		auto inBoundsAttr = (inBounds && !inBounds.getValue().empty())
		? builder.getBoolArrayAttr(inBounds.getValue())
		: ArrayAttr();
		build(builder, result, vectorType, source, indices, permutationMapAttr,
		inBoundsAttr);
		}

		/// 3. Builder that sets permutation map to 'getMinorIdentityMap'.
		void TransferReadOp::build(OpBuilder &builder, OperationState &result,
		VectorType vectorType, Value source,
		ValueRange indices, Value padding,
		Optional<ArrayRef<bool>> inBounds) {
		AffineMap permutationMap = getTransferMinorIdentityMap(
		source.getType().cast<ShapedType>(), vectorType);
		auto permutationMapAttr = AffineMapAttr::get(permutationMap);
		auto inBoundsAttr = (inBounds && !inBounds.getValue().empty())
		? builder.getBoolArrayAttr(inBounds.getValue())
		: ArrayAttr();
		build(builder, result, vectorType, source, indices, permutationMapAttr,
		padding,
		/*mask=*/Value(), inBoundsAttr);
		}

		/// 4. Builder that sets padding to zero and permutation map to
		/// 'getMinorIdentityMap'.
		void TransferReadOp::build(OpBuilder &builder, OperationState &result,
		VectorType vectorType, Value source,
		ValueRange indices,
		Optional<ArrayRef<bool>> inBounds) {
		Type elemType = source.getType().cast<ShapedType>().getElementType();
		Value padding = builder.create<arith::ConstantOp>(
		result.location, elemType, builder.getZeroAttr(elemType));
		build(builder, result, vectorType, source, indices, padding, inBounds);
		}

template <typename EmitFun>		template <typename EmitFun>
static LogicalResult verifyPermutationMap(AffineMap permutationMap,		static LogicalResult verifyPermutationMap(AffineMap permutationMap,
EmitFun emitOpError) {		EmitFun emitOpError) {
SmallVector<bool, 8> seen(permutationMap.getNumInputs(), false);		SmallVector<bool, 8> seen(permutationMap.getNumInputs(), false);
for (auto expr : permutationMap.getResults()) {		for (auto expr : permutationMap.getResults()) {
auto dim = expr.dyn_cast<AffineDimExpr>();		auto dim = expr.dyn_cast<AffineDimExpr>();
auto zero = expr.dyn_cast<AffineConstantExpr>();		auto zero = expr.dyn_cast<AffineConstantExpr>();
if (zero) {		if (zero) {
...
}		}
return success();		return success();
}		}

static LogicalResult		static LogicalResult
verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,		verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
VectorType vectorType, VectorType maskType,		VectorType vectorType, VectorType maskType,
AffineMap permutationMap, ArrayAttr inBounds) {		AffineMap permutationMap, ArrayAttr inBounds) {
if (shapedType.getRank() == 0 && !op.isZeroD())
return op->emitOpError("0-d transfer requires vector<1xt> shape and () -> "
"(0) permutation_map");

if (op->hasAttr("masked")) {		if (op->hasAttr("masked")) {
return op->emitOpError("masked attribute has been removed. "		return op->emitOpError("masked attribute has been removed. "
"Use in_bounds instead.");		"Use in_bounds instead.");
}		}

if (!shapedType.isa<MemRefType, RankedTensorType>())		if (!shapedType.isa<MemRefType, RankedTensorType>())
return op->emitOpError(		return op->emitOpError(
"requires source to be a memref or ranked tensor type");		"requires source to be a memref or ranked tensor type");

auto elementType = shapedType.getElementType();		auto elementType = shapedType.getElementType();
DataLayout dataLayout = DataLayout::closest(op);		DataLayout dataLayout = DataLayout::closest(op);
if (auto vectorElementType = elementType.dyn_cast<VectorType>()) {		if (auto vectorElementType = elementType.dyn_cast<VectorType>()) {
// Memref or tensor has vector element type.		// Memref or tensor has vector element type.
unsigned sourceVecSize =		unsigned sourceVecSize =
dataLayout.getTypeSizeInBits(vectorElementType.getElementType()) *		dataLayout.getTypeSizeInBits(vectorElementType.getElementType()) *
vectorElementType.getShape().back();		vectorElementType.getShape().back();
unsigned resultVecSize =		unsigned resultVecSize =
...
if (permutationMap.getNumResults() != rankOffset)		if (permutationMap.getNumResults() != rankOffset)
return op->emitOpError("requires a permutation_map with result dims of "		return op->emitOpError("requires a permutation_map with result dims of "
"the same rank as the vector type");		"the same rank as the vector type");

if (maskType)		if (maskType)
return op->emitOpError("does not support masks with vector element type");		return op->emitOpError("does not support masks with vector element type");
} else {		} else {
// Memref or tensor has scalar element type.		// Memref or tensor has scalar element type.
		unsigned minorSize =
		vectorType.getRank() == 0 ? 1 : vectorType.getShape().back();
unsigned resultVecSize =		unsigned resultVecSize =
dataLayout.getTypeSizeInBits(vectorType.getElementType()) *		dataLayout.getTypeSizeInBits(vectorType.getElementType()) * minorSize;
vectorType.getShape().back();
if (resultVecSize % dataLayout.getTypeSizeInBits(elementType) != 0)		if (resultVecSize % dataLayout.getTypeSizeInBits(elementType) != 0)
return op->emitOpError(		return op->emitOpError(
"requires the bitwidth of the minor 1-D vector to be an integral "		"requires the bitwidth of the minor 1-D vector to be an integral "
"multiple of the bitwidth of the source element type");		"multiple of the bitwidth of the source element type");

// Check that permutation map results match rank of vector type.		// Check that permutation map results match rank of vector type.
if (permutationMap.getNumResults() != vectorType.getRank())		if (permutationMap.getNumResults() != vectorType.getRank())
return op->emitOpError("requires a permutation_map with result dims of "		return op->emitOpError("requires a permutation_map with result dims of "
"the same rank as the vector type");		"the same rank as the vector type");

VectorType expectedMaskType =		VectorType expectedMaskType =
vector::detail::transferMaskType(vectorType, permutationMap);		vector::detail::transferMaskType(vectorType, permutationMap);
if (maskType && expectedMaskType != maskType)		if (maskType && expectedMaskType != maskType)
return op->emitOpError("expects mask type consistent with permutation "		return op->emitOpError("expects mask type consistent with permutation "
"map: ")		"map: ")
<< maskType;		<< maskType;
}		}

if (permutationMap.getNumSymbols() != 0)		if (permutationMap.getNumSymbols() != 0)
return op->emitOpError("requires permutation_map without symbols");		return op->emitOpError("requires permutation_map without symbols");
// TODO: implement 0-d vector corner cases.
if (!op.isZeroD() && permutationMap.getNumInputs() != shapedType.getRank())		if (permutationMap.getNumInputs() != shapedType.getRank())
return op->emitOpError("requires a permutation_map with input dims of the "		return op->emitOpError("requires a permutation_map with input dims of the "
"same rank as the source type");		"same rank as the source type");

if (inBounds) {		if (inBounds) {
if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))		if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
return op->emitOpError("expects the optional in_bounds attr of same rank "		return op->emitOpError("expects the optional in_bounds attr of same rank "
"as permutation_map results: ")		"as permutation_map results: ")
<< AffineMapAttr::get(permutationMap);		<< AffineMapAttr::get(permutationMap)
		<< " vs inBounds of size: " << inBounds.size();
for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)		for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
if (permutationMap.getResult(i).isa<AffineConstantExpr>() &&		if (permutationMap.getResult(i).isa<AffineConstantExpr>() &&
!inBounds.getValue()[i].cast<BoolAttr>().getValue())		!inBounds.getValue()[i].cast<BoolAttr>().getValue())
return op->emitOpError("requires broadcast dimensions to be in-bounds");		return op->emitOpError("requires broadcast dimensions to be in-bounds");
}		}

return success();		return success();
}		}
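
The scalar-element branch of `verifyTransferOp` above now treats a 0-d vector as having a minor size of 1 instead of calling `getShape().back()` on an empty shape. A minimal standalone sketch of that check follows. It assumes shapes are plain integer lists and bit widths are passed in directly; the function name `minorBitwidthIsMultiple` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the bitwidth check in verifyTransferOp: the minor
// 1-D slice of the vector must span a whole number of source elements.
// A rank-0 vector has an empty shape and counts as a single element.
bool minorBitwidthIsMultiple(const std::vector<int64_t> &vectorShape,
                             unsigned vectorElemBits,
                             unsigned sourceElemBits) {
  assert(sourceElemBits > 0 && "source element must have a non-zero width");
  int64_t minorSize = vectorShape.empty() ? 1 : vectorShape.back();
  int64_t resultVecBits = static_cast<int64_t>(vectorElemBits) * minorSize;
  return resultVecBits % static_cast<int64_t>(sourceElemBits) == 0;
}
```

Under these assumptions a 0-d `f32` vector read from an `f32` memref passes (32 is a multiple of 32), while a `3xi8` minor slice read from an `i32` source fails (24 is not a multiple of 32).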

/// Builder that sets padding to zero.
void TransferReadOp::build(OpBuilder &builder, OperationState &result,
VectorType vectorType, Value source,
ValueRange indices, AffineMap permutationMap,
ArrayRef<bool> inBounds) {
Type elemType = source.getType().cast<ShapedType>().getElementType();
Value padding = builder.create<arith::ConstantOp>(
result.location, elemType, builder.getZeroAttr(elemType));
if (inBounds.empty())
return build(builder, result, vectorType, source, indices, permutationMap,
padding, ArrayAttr());
ArrayAttr inBoundsArrayAttr = builder.getBoolArrayAttr(inBounds);
build(builder, result, vectorType, source, indices, permutationMap, padding,
inBoundsArrayAttr);
}

/// Builder that sets permutation map to 'getMinorIdentityMap'.
void TransferReadOp::build(OpBuilder &builder, OperationState &result,
VectorType vectorType, Value source,
ValueRange indices, Value padding,
ArrayRef<bool> inBounds) {
auto permMap = getTransferMinorIdentityMap(
source.getType().cast<ShapedType>(), vectorType);
if (inBounds.empty())
return build(builder, result, vectorType, source, indices, permMap, padding,
ArrayAttr());
ArrayAttr inBoundsArrayAttr = builder.getBoolArrayAttr(inBounds);
build(builder, result, vectorType, source, indices, permMap, padding,
inBoundsArrayAttr);
}

/// Builder that sets permutation map (resp. padding) to 'getMinorIdentityMap'
/// (resp. zero).
void TransferReadOp::build(OpBuilder &builder, OperationState &result,
VectorType vectorType, Value source,
ValueRange indices, ArrayRef<bool> inBounds) {
auto permMap = getTransferMinorIdentityMap(
source.getType().cast<ShapedType>(), vectorType);
build(builder, result, vectorType, source, indices, permMap, inBounds);
}

/// Builder that does not provide a mask.
void TransferReadOp::build(OpBuilder &builder, OperationState &result,
Type vectorType, Value source, ValueRange indices,
AffineMap permutationMap, Value padding,
ArrayAttr inBounds) {
build(builder, result, vectorType, source, indices, permutationMap, padding,
/*mask=*/Value(), inBounds);
}

/// Builder that does not provide a mask.
void TransferReadOp::build(OpBuilder &builder, OperationState &result,
Type vectorType, Value source, ValueRange indices,
AffineMapAttr permutationMap, Value padding,
ArrayAttr inBounds) {
build(builder, result, vectorType, source, indices, permutationMap, padding,
/*mask=*/Value(), inBounds);
}

Value TransferReadOp::createScalarOp(OpBuilder &builder, Location loc,
Value source, ValueRange indices,
ArrayRef<bool> inBounds) {
Type elemType = source.getType().cast<ShapedType>().getElementType();
auto vectorType = VectorType::get(ArrayRef<int64_t>{1}, elemType);
AffineMap map = AffineMap::get(/*numDims=*/0, /*numSymbols=*/0,
getAffineConstantExpr(0, loc.getContext()));
Value read = builder.create<vector::TransferReadOp>(loc, vectorType, source,
indices, map, inBounds);
return builder.create<vector::ExtractOp>(loc, read, ArrayRef<int64_t>{0});
}

static void printTransferAttrs(OpAsmPrinter &p, VectorTransferOpInterface op) {		static void printTransferAttrs(OpAsmPrinter &p, VectorTransferOpInterface op) {
SmallVector<StringRef, 3> elidedAttrs;		SmallVector<StringRef, 3> elidedAttrs;
elidedAttrs.push_back(TransferReadOp::getOperandSegmentSizeAttr());		elidedAttrs.push_back(TransferReadOp::getOperandSegmentSizeAttr());
if (op.permutation_map().isMinorIdentity())		if (op.permutation_map().isMinorIdentity())
elidedAttrs.push_back(op.getPermutationMapAttrName());		elidedAttrs.push_back(op.getPermutationMapAttrName());
bool elideInBounds = true;		bool elideInBounds = true;
if (auto inBounds = op.in_bounds()) {		if (auto inBounds = op.in_bounds()) {
for (auto attr : *inBounds) {		for (auto attr : *inBounds) {
...	if (!shapedType || !shapedType.isa<MemRefType, RankedTensorType>())
return parser.emitError(typesLoc, "requires memref or ranked tensor type");		return parser.emitError(typesLoc, "requires memref or ranked tensor type");
VectorType vectorType = types[1].dyn_cast<VectorType>();		VectorType vectorType = types[1].dyn_cast<VectorType>();
if (!vectorType)		if (!vectorType)
return parser.emitError(typesLoc, "requires vector type");		return parser.emitError(typesLoc, "requires vector type");
auto permutationAttrName = TransferReadOp::getPermutationMapAttrName();		auto permutationAttrName = TransferReadOp::getPermutationMapAttrName();
Attribute mapAttr = result.attributes.get(permutationAttrName);		Attribute mapAttr = result.attributes.get(permutationAttrName);
if (!mapAttr) {		if (!mapAttr) {
auto permMap = getTransferMinorIdentityMap(shapedType, vectorType);		auto permMap = getTransferMinorIdentityMap(shapedType, vectorType);
		// Update `mapAttr` that is used later to determine mask type.
mapAttr = AffineMapAttr::get(permMap);		mapAttr = AffineMapAttr::get(permMap);
result.attributes.set(permutationAttrName, mapAttr);		result.attributes.set(permutationAttrName, mapAttr);
}		}
if (parser.resolveOperand(sourceInfo, shapedType, result.operands) ||		if (parser.resolveOperand(sourceInfo, shapedType, result.operands) ||
parser.resolveOperands(indexInfo, indexType, result.operands) ||		parser.resolveOperands(indexInfo, indexType, result.operands) ||
parser.resolveOperand(paddingInfo, shapedType.getElementType(),		parser.resolveOperand(paddingInfo, shapedType.getElementType(),
result.operands))		result.operands))
return failure();		return failure();
...	static bool isInBounds(TransferOp op, int64_t resultIdx, int64_t indicesIdx) {
int64_t sourceSize = op.getShapedType().getDimSize(indicesIdx);		int64_t sourceSize = op.getShapedType().getDimSize(indicesIdx);
int64_t vectorSize = op.getVectorType().getDimSize(resultIdx);		int64_t vectorSize = op.getVectorType().getDimSize(resultIdx);

return cstOp.value() + vectorSize <= sourceSize;		return cstOp.value() + vectorSize <= sourceSize;
}		}

template <typename TransferOp>		template <typename TransferOp>
static LogicalResult foldTransferInBoundsAttribute(TransferOp op) {		static LogicalResult foldTransferInBoundsAttribute(TransferOp op) {
// TODO: Be less conservative once we have 0-d vectors.		// TODO: support 0-d corner case.
if (op.isZeroD())		// TODO: Be less conservative.
		if (op.getTransferRank() == 0)
return failure();		return failure();
AffineMap permutationMap = op.permutation_map();		AffineMap permutationMap = op.permutation_map();
bool changed = false;		bool changed = false;
SmallVector<bool, 4> newInBounds;		SmallVector<bool, 4> newInBounds;
newInBounds.reserve(op.getTransferRank());		newInBounds.reserve(op.getTransferRank());
for (unsigned i = 0; i < op.getTransferRank(); ++i) {		for (unsigned i = 0; i < op.getTransferRank(); ++i) {
// Already marked as in-bounds, nothing to see here.		// Already marked as in-bounds, nothing to see here.
if (op.isDimInBounds(i)) {		if (op.isDimInBounds(i)) {
...
/// ```		/// ```
struct FoldExtractSliceIntoTransferRead		struct FoldExtractSliceIntoTransferRead
: public OpRewritePattern<TransferReadOp> {		: public OpRewritePattern<TransferReadOp> {
public:		public:
using OpRewritePattern<TransferReadOp>::OpRewritePattern;		using OpRewritePattern<TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(TransferReadOp xferOp,		LogicalResult matchAndRewrite(TransferReadOp xferOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (xferOp.getTransferRank() == 0)
		return failure();
if (xferOp.hasOutOfBoundsDim())		if (xferOp.hasOutOfBoundsDim())
return failure();		return failure();
if (!xferOp.permutation_map().isIdentity())		if (!xferOp.permutation_map().isIdentity())
return failure();		return failure();
if (xferOp.mask())		if (xferOp.mask())
return failure();		return failure();
auto extractOp = xferOp.source().getDefiningOp<tensor::ExtractSliceOp>();		auto extractOp = xferOp.source().getDefiningOp<tensor::ExtractSliceOp>();
if (!extractOp)		if (!extractOp)
...	for (auto it : llvm::enumerate(xferOp.indices())) {
OpFoldResult offset =		OpFoldResult offset =
extractOp.getMixedOffsets()[it.index() + rankReduced];		extractOp.getMixedOffsets()[it.index() + rankReduced];
newIndices.push_back(rewriter.create<arith::AddIOp>(		newIndices.push_back(rewriter.create<arith::AddIOp>(
xferOp->getLoc(), it.value(),		xferOp->getLoc(), it.value(),
getValueOrCreateConstantIndexOp(rewriter, extractOp.getLoc(),		getValueOrCreateConstantIndexOp(rewriter, extractOp.getLoc(),
offset)));		offset)));
}		}
SmallVector<bool> inBounds(xferOp.getTransferRank(), true);		SmallVector<bool> inBounds(xferOp.getTransferRank(), true);
rewriter.replaceOpWithNewOp<TransferReadOp>(xferOp, xferOp.getVectorType(),		rewriter.replaceOpWithNewOp<TransferReadOp>(
extractOp.source(), newIndices,		xferOp, xferOp.getVectorType(), extractOp.source(), newIndices,
xferOp.padding(), inBounds);		xferOp.padding(), ArrayRef<bool>{inBounds});

return success();		return success();
}		}
};		};
} // namespace		} // namespace

void TransferReadOp::getCanonicalizationPatterns(RewritePatternSet &results,		void TransferReadOp::getCanonicalizationPatterns(RewritePatternSet &results,
MLIRContext *context) {		MLIRContext *context) {
results.add<FoldExtractSliceIntoTransferRead>(context);		results.add<FoldExtractSliceIntoTransferRead>(context);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// TransferWriteOp		// TransferWriteOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		/// 1. Builder with type inference.
void TransferWriteOp::build(OpBuilder &builder, OperationState &result,		void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
Value vector, Value dest, ValueRange indices,		Value vector, Value dest, ValueRange indices,
AffineMap permutationMap, ArrayRef<bool> inBounds) {		AffineMapAttr permutationMapAttr,
if (inBounds.empty())		/*optional*/ Value mask,
return build(builder, result, vector, dest, indices, permutationMap,		/*optional*/ ArrayAttr inBoundsAttr) {
/*mask=*/Value(), ArrayAttr());		Type resultType = dest.getType().dyn_cast<RankedTensorType>();
build(builder, result, vector, dest, indices, permutationMap,		build(builder, result, resultType, vector, dest, indices, permutationMapAttr,
/*mask=*/Value(), builder.getBoolArrayAttr(inBounds));		mask, inBoundsAttr);
}

/// Builder that sets permutation map to 'getMinorIdentityMap'.
void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
Value vector, Value source, ValueRange indices,
ArrayRef<bool> inBounds) {
auto vectorType = vector.getType().cast<VectorType>();
auto permMap = getTransferMinorIdentityMap(
source.getType().cast<ShapedType>(), vectorType);
if (inBounds.empty())
return build(builder, result, vector, source, indices, permMap,
ArrayAttr());
ArrayAttr inBoundsArrayAttr = builder.getBoolArrayAttr(inBounds);
build(builder, result, vector, source, indices, permMap, inBoundsArrayAttr);
}		}

		/// 2. Builder with type inference that sets an empty mask (variant with attrs).
void TransferWriteOp::build(OpBuilder &builder, OperationState &result,		void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
Value vector, Value source, ValueRange indices,		Value vector, Value dest, ValueRange indices,
AffineMapAttr permutationMap,		AffineMapAttr permutationMapAttr,
/*optional*/ ArrayAttr inBounds) {		/*optional*/ ArrayAttr inBoundsAttr) {
Type resultType = source.getType().dyn_cast<RankedTensorType>();		build(builder, result, vector, dest, indices, permutationMapAttr,
build(builder, result, resultType, vector, source, indices, permutationMap,		/*mask=*/Value(), inBoundsAttr);
/*mask=*/Value(), inBounds);
}		}

		/// 3. Builder with type inference that sets an empty mask (variant without
		/// attrs)
void TransferWriteOp::build(OpBuilder &builder, OperationState &result,		void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
Value vector, Value source, ValueRange indices,		Value vector, Value dest, ValueRange indices,
AffineMap permutationMap,		AffineMap permutationMap,
/*optional*/ ArrayAttr inBounds) {		Optional<ArrayRef<bool>> inBounds) {
Type resultType = source.getType().dyn_cast<RankedTensorType>();		auto permutationMapAttr = AffineMapAttr::get(permutationMap);
build(builder, result, resultType, vector, source, indices, permutationMap,		auto inBoundsAttr = (inBounds && !inBounds.getValue().empty())
/*mask=*/Value(), inBounds);		? builder.getBoolArrayAttr(inBounds.getValue())
		: ArrayAttr();
		build(builder, result, vector, dest, indices, permutationMapAttr,
		/*mask=*/Value(), inBoundsAttr);
}		}

		/// 4. Builder with type inference that sets an empty mask and sets permutation
		/// map to 'getMinorIdentityMap'.
void TransferWriteOp::build(OpBuilder &builder, OperationState &result,		void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
Value vector, Value source, ValueRange indices,		Value vector, Value dest, ValueRange indices,
AffineMap permutationMap, /*optional*/ Value mask,		Optional<ArrayRef<bool>> inBounds) {
/*optional*/ ArrayAttr inBounds) {		auto vectorType = vector.getType().cast<VectorType>();
Type resultType = source.getType().dyn_cast<RankedTensorType>();		AffineMap permutationMap = getTransferMinorIdentityMap(
build(builder, result, resultType, vector, source, indices, permutationMap,		dest.getType().cast<ShapedType>(), vectorType);
mask, inBounds);		build(builder, result, vector, dest, indices, permutationMap, inBounds);
}

Operation *TransferWriteOp::createScalarOp(OpBuilder &builder, Location loc,
Value value, Value dest,
ValueRange indices,
ArrayRef<bool> inBounds) {
Value vectorOfAScalar = value;
if (!value.getType().isa<VectorType>())
vectorOfAScalar = builder.create<vector::BroadcastOp>(
loc, VectorType::get({1}, value.getType()), value);
AffineMap map = AffineMap::get(/*numDims=*/0, /*numSymbols=*/0,
getAffineConstantExpr(0, loc.getContext()));
return builder.create<vector::TransferWriteOp>(loc, vectorOfAScalar, dest,
indices, map, inBounds);
}		}

static ParseResult parseTransferWriteOp(OpAsmParser &parser,		static ParseResult parseTransferWriteOp(OpAsmParser &parser,
OperationState &result) {		OperationState &result) {
auto &builder = parser.getBuilder();		auto &builder = parser.getBuilder();
llvm::SMLoc typesLoc;		llvm::SMLoc typesLoc;
OpAsmParser::OperandType vectorInfo, sourceInfo;		OpAsmParser::OperandType vectorInfo, sourceInfo;
SmallVector<OpAsmParser::OperandType, 8> indexInfo;		SmallVector<OpAsmParser::OperandType, 8> indexInfo;
[92 lines hidden]
/// %t0		/// %t0
/// ```		/// ```
///		///
/// The producer of t1 may or may not be DCE'd depending on whether it is a		/// The producer of t1 may or may not be DCE'd depending on whether it is a
/// block argument or has side effects.		/// block argument or has side effects.
static LogicalResult foldReadInitWrite(TransferWriteOp write,		static LogicalResult foldReadInitWrite(TransferWriteOp write,
ArrayRef<Attribute>,		ArrayRef<Attribute>,
SmallVectorImpl<OpFoldResult> &results) {		SmallVectorImpl<OpFoldResult> &results) {
		// TODO: support 0-d corner case.
		if (write.getTransferRank() == 0)
		return failure();
auto rankedTensorType = write.source().getType().dyn_cast<RankedTensorType>();		auto rankedTensorType = write.source().getType().dyn_cast<RankedTensorType>();
// If not operating on tensors, bail.		// If not operating on tensors, bail.
if (!rankedTensorType)		if (!rankedTensorType)
return failure();		return failure();
// If no read, bail.		// If no read, bail.
auto read = write.vector().getDefiningOp<vector::TransferReadOp>();		auto read = write.vector().getDefiningOp<vector::TransferReadOp>();
if (!read)		if (!read)
return failure();		return failure();
		// TODO: support 0-d corner case.
		if (read.getTransferRank() == 0)
		return failure();
// For now, only accept minor identity. Future: composition is minor identity.		// For now, only accept minor identity. Future: composition is minor identity.
if (!read.permutation_map().isMinorIdentity() ||		if (!read.permutation_map().isMinorIdentity() ||
!write.permutation_map().isMinorIdentity())		!write.permutation_map().isMinorIdentity())
return failure();		return failure();
// Bail on mismatching ranks.		// Bail on mismatching ranks.
if (read.getTransferRank() != write.getTransferRank())		if (read.getTransferRank() != write.getTransferRank())
return failure();		return failure();
// Bail on potential out-of-bounds accesses.		// Bail on potential out-of-bounds accesses.
[152 lines hidden]	struct FoldInsertSliceIntoTransferWrite
: public OpRewritePattern<tensor::InsertSliceOp> {		: public OpRewritePattern<tensor::InsertSliceOp> {
public:		public:
using OpRewritePattern<tensor::InsertSliceOp>::OpRewritePattern;		using OpRewritePattern<tensor::InsertSliceOp>::OpRewritePattern;

LogicalResult matchAndRewrite(tensor::InsertSliceOp insertOp,		LogicalResult matchAndRewrite(tensor::InsertSliceOp insertOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (!insertOp.hasUnitStride())		if (!insertOp.hasUnitStride())
return failure();		return failure();

auto xferOp = insertOp.source().getDefiningOp<TransferWriteOp>();		auto xferOp = insertOp.source().getDefiningOp<TransferWriteOp>();
if (!xferOp)		if (!xferOp)
return failure();		return failure();
		// TODO: support 0-d corner case.
		if (xferOp.getTransferRank() == 0)
		return failure();

if (xferOp.hasOutOfBoundsDim())		if (xferOp.hasOutOfBoundsDim())
return failure();		return failure();
if (xferOp.getVectorType().getRank() != xferOp.getShapedType().getRank())		if (xferOp.getVectorType().getRank() != xferOp.getShapedType().getRank())
return failure();		return failure();
if (xferOp.mask())		if (xferOp.mask())
return failure();		return failure();
// Fold only if the TransferWriteOp completely overwrites the `source` with		// Fold only if the TransferWriteOp completely overwrites the `source` with
// a vector. I.e., the result of the TransferWriteOp is a new tensor whose		// a vector. I.e., the result of the TransferWriteOp is a new tensor whose
// content is the data of the vector.		// content is the data of the vector.
if (!llvm::equal(xferOp.getVectorType().getShape(),		if (!llvm::equal(xferOp.getVectorType().getShape(),
xferOp.getShapedType().getShape()))		xferOp.getShapedType().getShape()))
return failure();		return failure();
if (!xferOp.permutation_map().isIdentity())		if (!xferOp.permutation_map().isIdentity())
return failure();		return failure();

SmallVector<Value> indices = getValueOrCreateConstantIndexOp(		SmallVector<Value> indices = getValueOrCreateConstantIndexOp(
rewriter, insertOp.getLoc(), insertOp.getMixedOffsets());		rewriter, insertOp.getLoc(), insertOp.getMixedOffsets());
SmallVector<bool> inBounds(xferOp.getTransferRank(), true);		SmallVector<bool> inBounds(xferOp.getTransferRank(), true);
rewriter.replaceOpWithNewOp<TransferWriteOp>(		rewriter.replaceOpWithNewOp<TransferWriteOp>(insertOp, xferOp.vector(),
insertOp, xferOp.vector(), insertOp.dest(), indices, inBounds);		insertOp.dest(), indices,
		ArrayRef<bool>{inBounds});
return success();		return success();
}		}
};		};
} // namespace		} // namespace

void TransferWriteOp::getCanonicalizationPatterns(RewritePatternSet &results,		void TransferWriteOp::getCanonicalizationPatterns(RewritePatternSet &results,
MLIRContext *context) {		MLIRContext *context) {
results.add<FoldWaw, FoldInsertSliceIntoTransferWrite>(context);		results.add<FoldWaw, FoldInsertSliceIntoTransferWrite>(context);
[814 lines hidden]
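
The four numbered TransferWriteOp builders in this file form a delegation chain: builder 4 computes a minor identity map and calls builder 3, which turns plain bools into an attribute and calls builder 2, which supplies an empty mask and calls builder 1. The sketch below models only that layering in standalone C++, outside of MLIR. Every name in it (`WriteOpState`, `minorIdentity`, `buildFull`, `buildNoMask`, `buildFromBools`, `buildMinorIdentity`) is hypothetical, and `std::optional` stands in for both `Optional<ArrayRef<bool>>` and a possibly-null `ArrayAttr`.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the state the real builders populate.
struct WriteOpState {
  std::vector<unsigned> permutationResults;   // result dims of the permutation map
  bool hasMask = false;                       // models the optional mask Value
  std::optional<std::vector<bool>> inBounds;  // nullopt models a null ArrayAttr
};

// Selects the last `numResults` of `numDims` dimensions, e.g. (3, 2) -> {1, 2}.
// With numResults == 0 (the 0-d case) the result list is empty.
std::vector<unsigned> minorIdentity(unsigned numDims, unsigned numResults) {
  assert(numResults <= numDims && "cannot select more dims than exist");
  std::vector<unsigned> results;
  for (unsigned d = numDims - numResults; d < numDims; ++d)
    results.push_back(d);
  return results;
}

// 1. Most general form: map, mask and in_bounds are all explicit.
WriteOpState buildFull(std::vector<unsigned> map, bool hasMask,
                       std::optional<std::vector<bool>> inBoundsAttr) {
  WriteOpState state;
  state.permutationResults = std::move(map);
  state.hasMask = hasMask;
  state.inBounds = std::move(inBoundsAttr);
  return state;
}

// 2. Same as 1 but always without a mask.
WriteOpState buildNoMask(std::vector<unsigned> map,
                         std::optional<std::vector<bool>> inBoundsAttr) {
  return buildFull(std::move(map), /*hasMask=*/false, std::move(inBoundsAttr));
}

// 3. Takes plain bools; an absent or empty list becomes a null attribute.
WriteOpState buildFromBools(std::vector<unsigned> map,
                            const std::optional<std::vector<bool>> &inBounds) {
  std::optional<std::vector<bool>> attr;
  if (inBounds && !inBounds->empty())
    attr = *inBounds;
  return buildNoMask(std::move(map), std::move(attr));
}

// 4. Derives the map from the ranks, then defers to 3.
WriteOpState buildMinorIdentity(unsigned destRank, unsigned vectorRank,
                                const std::optional<std::vector<bool>> &inBounds) {
  return buildFromBools(minorIdentity(destRank, vectorRank), inBounds);
}
```

Keeping the defaulting logic in one place per layer is what lets the revision drop the duplicated `if (inBounds.empty())` branches shown on the left-hand side.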

mlir/lib/Dialect/Vector/VectorTransferPermutationMapRewritePatterns.cpp

[25 lines hidden]
transposeInBoundsAttr(OpBuilder &builder, ArrayAttr attr,		transposeInBoundsAttr(OpBuilder &builder, ArrayAttr attr,
const SmallVector<unsigned> &permutation) {		const SmallVector<unsigned> &permutation) {
SmallVector<bool> newInBoundsValues;		SmallVector<bool> newInBoundsValues;
for (unsigned pos : permutation)		for (unsigned pos : permutation)
newInBoundsValues.push_back(		newInBoundsValues.push_back(
attr.getValue()[pos].cast<BoolAttr>().getValue());		attr.getValue()[pos].cast<BoolAttr>().getValue());
return builder.getBoolArrayAttr(newInBoundsValues);		return builder.getBoolArrayAttr(newInBoundsValues);
}		}

/// Lower transfer_read op with permutation into a transfer_read with a		/// Lower transfer_read op with permutation into a transfer_read with a
/// permutation map composed of leading zeros followed by a minor identity +		/// permutation map composed of leading zeros followed by a minor identity +
/// vector.transpose op.		/// vector.transpose op.
/// Ex:		/// Ex:
/// vector.transfer_read ...		/// vector.transfer_read ...
/// permutation_map: (d0, d1, d2) -> (0, d1)		/// permutation_map: (d0, d1, d2) -> (0, d1)
/// into:		/// into:
/// %v = vector.transfer_read ...		/// %v = vector.transfer_read ...
[9 lines hidden]
/// Note that an alternative is to transform it to linalg.transpose +		/// Note that an alternative is to transform it to linalg.transpose +
/// vector.transfer_read to do the transpose in memory instead.		/// vector.transfer_read to do the transpose in memory instead.
struct TransferReadPermutationLowering		struct TransferReadPermutationLowering
: public OpRewritePattern<vector::TransferReadOp> {		: public OpRewritePattern<vector::TransferReadOp> {
using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferReadOp op,		LogicalResult matchAndRewrite(vector::TransferReadOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (op.getTransferRank() == 0)
		return failure();

SmallVector<unsigned> permutation;		SmallVector<unsigned> permutation;
AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
if (map.getNumResults() == 0)		if (map.getNumResults() == 0)
return failure();		return failure();
if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))		if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))
return failure();		return failure();
AffineMap permutationMap =		AffineMap permutationMap =
map.getPermutationMap(permutation, op.getContext());		map.getPermutationMap(permutation, op.getContext());
[27 lines hidden]	if (op.mask()) {
maskTransposeIndices.push_back(expr.getPosition());		maskTransposeIndices.push_back(expr.getPosition());
}		}

newMask = rewriter.create<vector::TransposeOp>(op.getLoc(), op.mask(),		newMask = rewriter.create<vector::TransposeOp>(op.getLoc(), op.mask(),
maskTransposeIndices);		maskTransposeIndices);
}		}

// Transpose in_bounds attribute.		// Transpose in_bounds attribute.
ArrayAttr newInBounds =		ArrayAttr newInBoundsAttr =
op.in_bounds() ? transposeInBoundsAttr(		op.in_bounds() ? transposeInBoundsAttr(
rewriter, op.in_bounds().getValue(), permutation)		rewriter, op.in_bounds().getValue(), permutation)
: ArrayAttr();		: ArrayAttr();

// Generate new transfer_read operation.		// Generate new transfer_read operation.
VectorType newReadType =		VectorType newReadType =
VectorType::get(newVectorShape, op.getVectorType().getElementType());		VectorType::get(newVectorShape, op.getVectorType().getElementType());
Value newRead = rewriter.create<vector::TransferReadOp>(		Value newRead = rewriter.create<vector::TransferReadOp>(
op.getLoc(), newReadType, op.source(), op.indices(), newMap,		op.getLoc(), newReadType, op.source(), op.indices(),
op.padding(), newMask, newInBounds);		AffineMapAttr::get(newMap), op.padding(), newMask, newInBoundsAttr);

// Transpose result of transfer_read.		// Transpose result of transfer_read.
SmallVector<int64_t> transposePerm(permutation.begin(), permutation.end());		SmallVector<int64_t> transposePerm(permutation.begin(), permutation.end());
rewriter.replaceOpWithNewOp<vector::TransposeOp>(op, newRead,		rewriter.replaceOpWithNewOp<vector::TransposeOp>(op, newRead,
transposePerm);		transposePerm);
return success();		return success();
}		}
};		};
[15 lines hidden]
/// %v = vector.transfer_write %tmp ...		/// %v = vector.transfer_write %tmp ...
/// permutation_map: (d0, d1, d2, d3) -> (d2, d3)		/// permutation_map: (d0, d1, d2, d3) -> (d2, d3)
struct TransferWritePermutationLowering		struct TransferWritePermutationLowering
: public OpRewritePattern<vector::TransferWriteOp> {		: public OpRewritePattern<vector::TransferWriteOp> {
using OpRewritePattern<vector::TransferWriteOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferWriteOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferWriteOp op,		LogicalResult matchAndRewrite(vector::TransferWriteOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (op.isZeroD())		// TODO: support 0-d corner case.
		if (op.getTransferRank() == 0)
return failure();		return failure();

SmallVector<unsigned> permutation;		SmallVector<unsigned> permutation;
AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
if (map.isMinorIdentity())		if (map.isMinorIdentity())
return failure();		return failure();
if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))		if (!map.isPermutationOfMinorIdentityWithBroadcasting(permutation))
return failure();		return failure();
[10 lines hidden]	llvm::transform(comp.getResults(), std::back_inserter(indices),
});		});

// Transpose mask operand.		// Transpose mask operand.
Value newMask = op.mask() ? rewriter.create<vector::TransposeOp>(		Value newMask = op.mask() ? rewriter.create<vector::TransposeOp>(
op.getLoc(), op.mask(), indices)		op.getLoc(), op.mask(), indices)
: Value();		: Value();

// Transpose in_bounds attribute.		// Transpose in_bounds attribute.
ArrayAttr newInBounds =		ArrayAttr newInBoundsAttr =
op.in_bounds() ? transposeInBoundsAttr(		op.in_bounds() ? transposeInBoundsAttr(
rewriter, op.in_bounds().getValue(), permutation)		rewriter, op.in_bounds().getValue(), permutation)
: ArrayAttr();		: ArrayAttr();

// Generate new transfer_write operation.		// Generate new transfer_write operation.
Value newVec =		Value newVec =
rewriter.create<vector::TransposeOp>(op.getLoc(), op.vector(), indices);		rewriter.create<vector::TransposeOp>(op.getLoc(), op.vector(), indices);
auto newMap = AffineMap::getMinorIdentityMap(		auto newMap = AffineMap::getMinorIdentityMap(
map.getNumDims(), map.getNumResults(), rewriter.getContext());		map.getNumDims(), map.getNumResults(), rewriter.getContext());
rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(		rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
op, Type(), newVec, op.source(), op.indices(), newMap, newMask,		op, Type(), newVec, op.source(), op.indices(),
newInBounds);		AffineMapAttr::get(newMap), newMask, newInBoundsAttr);

return success();		return success();
}		}
};		};

/// Lower transfer_read op with broadcast in the leading dimensions into		/// Lower transfer_read op with broadcast in the leading dimensions into
/// transfer_read of lower rank + vector.broadcast.		/// transfer_read of lower rank + vector.broadcast.
/// Ex: vector.transfer_read ...		/// Ex: vector.transfer_read ...
/// permutation_map: (d0, d1, d2, d3) -> (0, d1, 0, d3)		/// permutation_map: (d0, d1, d2, d3) -> (0, d1, 0, d3)
/// into:		/// into:
/// %v = vector.transfer_read ...		/// %v = vector.transfer_read ...
/// permutation_map: (d0, d1, d2, d3) -> (d1, 0, d3)		/// permutation_map: (d0, d1, d2, d3) -> (d1, 0, d3)
/// vector.broadcast %v		/// vector.broadcast %v
struct TransferOpReduceRank : public OpRewritePattern<vector::TransferReadOp> {		struct TransferOpReduceRank : public OpRewritePattern<vector::TransferReadOp> {
using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferReadOp op,		LogicalResult matchAndRewrite(vector::TransferReadOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (op.getTransferRank() == 0)
		return failure();

AffineMap map = op.permutation_map();		AffineMap map = op.permutation_map();
unsigned numLeadingBroadcast = 0;		unsigned numLeadingBroadcast = 0;
for (auto expr : map.getResults()) {		for (auto expr : map.getResults()) {
auto dimExpr = expr.dyn_cast<AffineConstantExpr>();		auto dimExpr = expr.dyn_cast<AffineConstantExpr>();
if (!dimExpr || dimExpr.getValue() != 0)		if (!dimExpr || dimExpr.getValue() != 0)
break;		break;
numLeadingBroadcast++;		numLeadingBroadcast++;
}		}
[30 lines hidden]	LogicalResult matchAndRewrite(vector::TransferReadOp op,
}		}
SmallVector<int64_t> newShape = llvm::to_vector<4>(		SmallVector<int64_t> newShape = llvm::to_vector<4>(
originalVecType.getShape().take_back(reducedShapeRank));		originalVecType.getShape().take_back(reducedShapeRank));
// Vector rank cannot be zero. Handled by TransferReadToVectorLoadLowering.		// Vector rank cannot be zero. Handled by TransferReadToVectorLoadLowering.
if (newShape.empty())		if (newShape.empty())
return failure();		return failure();
VectorType newReadType =		VectorType newReadType =
VectorType::get(newShape, originalVecType.getElementType());		VectorType::get(newShape, originalVecType.getElementType());
ArrayAttr newInBounds =		ArrayAttr newInBoundsAttr =
op.in_bounds()		op.in_bounds()
? rewriter.getArrayAttr(		? rewriter.getArrayAttr(
op.in_boundsAttr().getValue().take_back(reducedShapeRank))		op.in_boundsAttr().getValue().take_back(reducedShapeRank))
: ArrayAttr();		: ArrayAttr();
Value newRead = rewriter.create<vector::TransferReadOp>(		Value newRead = rewriter.create<vector::TransferReadOp>(
op.getLoc(), newReadType, op.source(), op.indices(), newMap,		op.getLoc(), newReadType, op.source(), op.indices(),
op.padding(), op.mask(), newInBounds);		AffineMapAttr::get(newMap), op.padding(), op.mask(), newInBoundsAttr);
rewriter.replaceOpWithNewOp<vector::BroadcastOp>(op, originalVecType,		rewriter.replaceOpWithNewOp<vector::BroadcastOp>(op, originalVecType,
newRead);		newRead);
return success();		return success();
}		}
};		};

void mlir::vector::populateVectorTransferPermutationMapLoweringPatterns(		void mlir::vector::populateVectorTransferPermutationMapLoweringPatterns(
RewritePatternSet &patterns) {		RewritePatternSet &patterns) {
patterns.add<TransferReadPermutationLowering,		patterns.add<TransferReadPermutationLowering,
TransferWritePermutationLowering, TransferOpReduceRank>(		TransferWritePermutationLowering, TransferOpReduceRank>(
patterns.getContext());		patterns.getContext());
}		}
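
Both permutation lowerings above reorder the `in_bounds` flags with `transposeInBoundsAttr`, which reads the flag at each position named by the permutation. A minimal standalone sketch of that reordering follows, using `std::vector<bool>` in place of `ArrayAttr`; the function name `transposeInBounds` is hypothetical and not part of MLIR.

```cpp
#include <cassert>
#include <vector>

// Returns a new flag list whose i-th entry is inBounds[permutation[i]].
// This mirrors the loop in transposeInBoundsAttr, minus the attribute wrapping.
std::vector<bool> transposeInBounds(const std::vector<bool> &inBounds,
                                    const std::vector<unsigned> &permutation) {
  std::vector<bool> result;
  result.reserve(permutation.size());
  for (unsigned pos : permutation) {
    assert(pos < inBounds.size() && "permutation index out of range");
    result.push_back(inBounds[pos]);
  }
  return result;
}
```

For example, flags `{true, false, true}` under permutation `{2, 0, 1}` become `{true, true, false}`.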

mlir/lib/Dialect/Vector/VectorTransforms.cpp

[223 lines hidden]
struct UnrollTransferReadPattern		struct UnrollTransferReadPattern
: public OpRewritePattern<vector::TransferReadOp> {		: public OpRewritePattern<vector::TransferReadOp> {
UnrollTransferReadPattern(MLIRContext *context,		UnrollTransferReadPattern(MLIRContext *context,
const vector::UnrollVectorOptions &options)		const vector::UnrollVectorOptions &options)
: OpRewritePattern<vector::TransferReadOp>(context, /*benefit=*/1),		: OpRewritePattern<vector::TransferReadOp>(context, /*benefit=*/1),
options(options) {}		options(options) {}
LogicalResult matchAndRewrite(vector::TransferReadOp readOp,		LogicalResult matchAndRewrite(vector::TransferReadOp readOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (readOp.getTransferRank() == 0)
		return failure();
if (readOp.mask())		if (readOp.mask())
return failure();		return failure();
auto targetShape = getTargetShape(options, readOp);		auto targetShape = getTargetShape(options, readOp);
if (!targetShape)		if (!targetShape)
return failure();		return failure();
auto sourceVectorType = readOp.getVectorType();		auto sourceVectorType = readOp.getVectorType();
SmallVector<int64_t, 4> strides(targetShape->size(), 1);		SmallVector<int64_t, 4> strides(targetShape->size(), 1);
Location loc = readOp.getLoc();		Location loc = readOp.getLoc();
ArrayRef<int64_t> originalSize = readOp.getVectorType().getShape();		ArrayRef<int64_t> originalSize = readOp.getVectorType().getShape();
SmallVector<int64_t, 4> ratio = shapeRatio(originalSize, targetShape);		SmallVector<int64_t, 4> ratio = shapeRatio(originalSize, targetShape);
// Compute shape ratio of 'shape' and 'sizes'.		// Compute shape ratio of 'shape' and 'sizes'.
int64_t sliceCount = computeMaxLinearIndex(ratio);		int64_t sliceCount = computeMaxLinearIndex(ratio);
// Prepare the result vector;		// Prepare the result vector;
Value result = rewriter.create<arith::ConstantOp>(		Value result = rewriter.create<arith::ConstantOp>(
loc, sourceVectorType, rewriter.getZeroAttr(sourceVectorType));		loc, sourceVectorType, rewriter.getZeroAttr(sourceVectorType));
auto targetType =		auto targetType =
VectorType::get(*targetShape, sourceVectorType.getElementType());		VectorType::get(*targetShape, sourceVectorType.getElementType());
SmallVector<Value, 4> originalIndices(readOp.indices().begin(),		SmallVector<Value, 4> originalIndices(readOp.indices().begin(),
readOp.indices().end());		readOp.indices().end());
for (int64_t i = 0; i < sliceCount; i++) {		for (int64_t i = 0; i < sliceCount; i++) {
SmallVector<Value, 4> indices =		SmallVector<Value, 4> indices =
sliceTransferIndices(i, originalSize, *targetShape, originalIndices,		sliceTransferIndices(i, originalSize, *targetShape, originalIndices,
readOp.permutation_map(), loc, rewriter);		readOp.permutation_map(), loc, rewriter);
auto slicedRead = rewriter.create<vector::TransferReadOp>(		auto slicedRead = rewriter.create<vector::TransferReadOp>(
loc, targetType, readOp.source(), indices, readOp.permutation_map(),		loc, targetType, readOp.source(), indices,
readOp.padding(),		readOp.permutation_mapAttr(), readOp.padding(), readOp.mask(),
readOp.in_bounds() ? *readOp.in_bounds() : ArrayAttr());		readOp.in_boundsAttr());

SmallVector<int64_t, 4> elementOffsets =		SmallVector<int64_t, 4> elementOffsets =
getVectorOffset(originalSize, *targetShape, i);		getVectorOffset(originalSize, *targetShape, i);
result = rewriter.create<vector::InsertStridedSliceOp>(		result = rewriter.create<vector::InsertStridedSliceOp>(
loc, slicedRead, result, elementOffsets, strides);		loc, slicedRead, result, elementOffsets, strides);
}		}
rewriter.replaceOp(readOp, result);		rewriter.replaceOp(readOp, result);
return success();		return success();
}		}

private:		private:
vector::UnrollVectorOptions options;		vector::UnrollVectorOptions options;
};		};
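
The unroll pattern above derives its loop bound from the ratio between the original vector shape and the target shape: one sliced transfer is emitted per tile. The sketch below shows that arithmetic in standalone C++ under simplifying assumptions: both shapes have the same rank and the target divides the original evenly. The helper names `shapeRatioSameRank` and `countSlices` are hypothetical; the real code uses `shapeRatio` and `computeMaxLinearIndex`, whose exact behavior is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-dimension number of tiles, assuming equal ranks and exact divisibility.
std::vector<int64_t> shapeRatioSameRank(const std::vector<int64_t> &original,
                                        const std::vector<int64_t> &target) {
  assert(original.size() == target.size() && "sketch assumes equal ranks");
  std::vector<int64_t> ratio;
  ratio.reserve(original.size());
  for (std::size_t i = 0; i < original.size(); ++i) {
    assert(target[i] > 0 && original[i] % target[i] == 0 &&
           "sketch assumes the target shape divides the original shape");
    ratio.push_back(original[i] / target[i]);
  }
  return ratio;
}

// Total number of tiles: the product of the per-dimension ratios.
int64_t countSlices(const std::vector<int64_t> &ratio) {
  int64_t count = 1;
  for (int64_t r : ratio)
    count *= r;
  return count;
}
```

A `vector<4x8xf32>` unrolled to a target shape of `2x4` gives a ratio of `{2, 2}` and therefore four sliced reads.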

struct UnrollTransferWritePattern		struct UnrollTransferWritePattern
: public OpRewritePattern<vector::TransferWriteOp> {		: public OpRewritePattern<vector::TransferWriteOp> {
UnrollTransferWritePattern(MLIRContext *context,		UnrollTransferWritePattern(MLIRContext *context,
const vector::UnrollVectorOptions &options)		const vector::UnrollVectorOptions &options)
: OpRewritePattern<vector::TransferWriteOp>(context, /*benefit=*/1),		: OpRewritePattern<vector::TransferWriteOp>(context, /*benefit=*/1),
options(options) {}		options(options) {}
LogicalResult matchAndRewrite(vector::TransferWriteOp writeOp,		LogicalResult matchAndRewrite(vector::TransferWriteOp writeOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (writeOp.getTransferRank() == 0)
		return failure();

if (writeOp.mask())		if (writeOp.mask())
return failure();		return failure();
auto targetShape = getTargetShape(options, writeOp);		auto targetShape = getTargetShape(options, writeOp);
if (!targetShape)		if (!targetShape)
return failure();		return failure();
auto sourceVectorType = writeOp.getVectorType();		auto sourceVectorType = writeOp.getVectorType();
SmallVector<int64_t, 4> strides(targetShape->size(), 1);		SmallVector<int64_t, 4> strides(targetShape->size(), 1);
Location loc = writeOp.getLoc();		Location loc = writeOp.getLoc();
[10 lines hidden]	for (int64_t i = 0; i < sliceCount; i++) {
Value slicedVector = rewriter.create<vector::ExtractStridedSliceOp>(		Value slicedVector = rewriter.create<vector::ExtractStridedSliceOp>(
loc, writeOp.vector(), elementOffsets, *targetShape, strides);		loc, writeOp.vector(), elementOffsets, *targetShape, strides);

SmallVector<Value, 4> indices =		SmallVector<Value, 4> indices =
sliceTransferIndices(i, originalSize, *targetShape, originalIndices,		sliceTransferIndices(i, originalSize, *targetShape, originalIndices,
writeOp.permutation_map(), loc, rewriter);		writeOp.permutation_map(), loc, rewriter);
Operation *slicedWrite = rewriter.create<vector::TransferWriteOp>(		Operation *slicedWrite = rewriter.create<vector::TransferWriteOp>(
loc, slicedVector, resultTensor ? resultTensor : writeOp.source(),		loc, slicedVector, resultTensor ? resultTensor : writeOp.source(),
indices, writeOp.permutation_map(),		indices, writeOp.permutation_mapAttr(), writeOp.in_boundsAttr());
writeOp.in_bounds() ? *writeOp.in_bounds() : ArrayAttr());
// For the tensor case update the destination for the next transfer write.		// For the tensor case update the destination for the next transfer write.
if (!slicedWrite->getResults().empty())		if (!slicedWrite->getResults().empty())
resultTensor = slicedWrite->getResult(0);		resultTensor = slicedWrite->getResult(0);
}		}
if (resultTensor)		if (resultTensor)
rewriter.replaceOp(writeOp, resultTensor);		rewriter.replaceOp(writeOp, resultTensor);
else		else
rewriter.eraseOp(writeOp);		rewriter.eraseOp(writeOp);
[1,734 lines hidden]
///		///
/// Preconditions:		/// Preconditions:
/// 1. `xferOp.permutation_map()` must be a minor identity map		/// 1. `xferOp.permutation_map()` must be a minor identity map
/// 2. the rank of the `xferOp.memref()` and the rank of the `xferOp.vector()`		/// 2. the rank of the `xferOp.memref()` and the rank of the `xferOp.vector()`
/// must be equal. This will be relaxed in the future but requires		/// must be equal. This will be relaxed in the future but requires
/// rank-reducing subviews.		/// rank-reducing subviews.
static LogicalResult		static LogicalResult
splitFullAndPartialTransferPrecondition(VectorTransferOpInterface xferOp) {		splitFullAndPartialTransferPrecondition(VectorTransferOpInterface xferOp) {
		// TODO: support 0-d corner case.
		if (xferOp.getTransferRank() == 0)
		return failure();

// TODO: expand support to these 2 cases.		// TODO: expand support to these 2 cases.
if (!xferOp.permutation_map().isMinorIdentity())		if (!xferOp.permutation_map().isMinorIdentity())
return failure();		return failure();
// Must have some out-of-bounds dimension to be a candidate for splitting.		// Must have some out-of-bounds dimension to be a candidate for splitting.
if (!xferOp.hasOutOfBoundsDim())		if (!xferOp.hasOutOfBoundsDim())
return failure();		return failure();
// Don't split transfer operations directly under IfOp, this avoids applying		// Don't split transfer operations directly under IfOp, this avoids applying
// the pattern recursively.		// the pattern recursively.
[609 lines hidden]
/// memref<64x64x64xf32>, vector<2x4x1xf32>		/// memref<64x64x64xf32>, vector<2x4x1xf32>
/// ```		/// ```
struct TransferReadExtractPattern		struct TransferReadExtractPattern
: public OpRewritePattern<vector::TransferReadOp> {		: public OpRewritePattern<vector::TransferReadOp> {
TransferReadExtractPattern(MLIRContext *context)		TransferReadExtractPattern(MLIRContext *context)
: OpRewritePattern<vector::TransferReadOp>(context) {}		: OpRewritePattern<vector::TransferReadOp>(context) {}
LogicalResult matchAndRewrite(vector::TransferReadOp read,		LogicalResult matchAndRewrite(vector::TransferReadOp read,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (read.getTransferRank() == 0)
		return failure();

if (!read.getResult().hasOneUse())		if (!read.getResult().hasOneUse())
return failure();		return failure();
auto extract =		auto extract =
dyn_cast<vector::ExtractMapOp>(*read.getResult().getUsers().begin());		dyn_cast<vector::ExtractMapOp>(*read.getResult().getUsers().begin());
if (!extract)		if (!extract)
return failure();		return failure();
if (read.mask())		if (read.mask())
return failure();		return failure();
[13 lines hidden]	for (auto it :
unsigned vectorPos = std::get<1>(it).cast<AffineDimExpr>().getPosition();		unsigned vectorPos = std::get<1>(it).cast<AffineDimExpr>().getPosition();
auto scale = getAffineConstantExpr(		auto scale = getAffineConstantExpr(
extract.getResultType().getDimSize(vectorPos), read.getContext());		extract.getResultType().getDimSize(vectorPos), read.getContext());
indices[indexPos] = makeComposedAffineApply(		indices[indexPos] = makeComposedAffineApply(
rewriter, read.getLoc(), d0 + scale * d1,		rewriter, read.getLoc(), d0 + scale * d1,
{indices[indexPos], extract.ids()[idCount++]});		{indices[indexPos], extract.ids()[idCount++]});
}		}
Value newRead = lb.create<vector::TransferReadOp>(		Value newRead = lb.create<vector::TransferReadOp>(
extract.getType(), read.source(), indices, read.permutation_map(),		extract.getType(), read.source(), indices, read.permutation_mapAttr(),
read.padding(), read.in_boundsAttr());		read.padding(), read.mask(), read.in_boundsAttr());
Value dest = lb.create<arith::ConstantOp>(		Value dest = lb.create<arith::ConstantOp>(
read.getType(), rewriter.getZeroAttr(read.getType()));		read.getType(), rewriter.getZeroAttr(read.getType()));
newRead = lb.create<vector::InsertMapOp>(newRead, dest, extract.ids());		newRead = lb.create<vector::InsertMapOp>(newRead, dest, extract.ids());
rewriter.replaceOp(read, newRead);		rewriter.replaceOp(read, newRead);
return success();		return success();
}		}
};		};

struct TransferWriteInsertPattern		struct TransferWriteInsertPattern
: public OpRewritePattern<vector::TransferWriteOp> {		: public OpRewritePattern<vector::TransferWriteOp> {
TransferWriteInsertPattern(MLIRContext *context)		TransferWriteInsertPattern(MLIRContext *context)
: OpRewritePattern<vector::TransferWriteOp>(context) {}		: OpRewritePattern<vector::TransferWriteOp>(context) {}
LogicalResult matchAndRewrite(vector::TransferWriteOp write,		LogicalResult matchAndRewrite(vector::TransferWriteOp write,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (write.getTransferRank() == 0)
		return failure();

auto insert = write.vector().getDefiningOp<vector::InsertMapOp>();		auto insert = write.vector().getDefiningOp<vector::InsertMapOp>();
if (!insert)		if (!insert)
return failure();		return failure();
if (write.mask())		if (write.mask())
return failure();		return failure();
SmallVector<Value, 4> indices(write.indices().begin(),		SmallVector<Value, 4> indices(write.indices().begin(),
write.indices().end());		write.indices().end());
AffineMap indexMap = insert.map().compose(write.permutation_map());		AffineMap indexMap = insert.map().compose(write.permutation_map());
[11 lines not shown]	for (auto it :
auto scale = getAffineConstantExpr(		auto scale = getAffineConstantExpr(
insert.getSourceVectorType().getDimSize(vectorPos),		insert.getSourceVectorType().getDimSize(vectorPos),
write.getContext());		write.getContext());
indices[indexPos] =		indices[indexPos] =
makeComposedAffineApply(rewriter, loc, d0 + scale * d1,		makeComposedAffineApply(rewriter, loc, d0 + scale * d1,
{indices[indexPos], insert.ids()[idCount++]});		{indices[indexPos], insert.ids()[idCount++]});
}		}
rewriter.create<vector::TransferWriteOp>(		rewriter.create<vector::TransferWriteOp>(
loc, insert.vector(), write.source(), indices, write.permutation_map(),		loc, insert.vector(), write.source(), indices,
write.in_boundsAttr());		write.permutation_mapAttr(), write.in_boundsAttr());
rewriter.eraseOp(write);		rewriter.eraseOp(write);
return success();		return success();
}		}
};		};

/// Progressive lowering of transfer_read. This pattern supports lowering of		/// Progressive lowering of transfer_read. This pattern supports lowering of
/// `vector.transfer_read` to a combination of `vector.load` and		/// `vector.transfer_read` to a combination of `vector.load` and
/// `vector.broadcast` if all of the following hold:		/// `vector.broadcast` if all of the following hold:
/// - Stride of most minor memref dimension must be 1.		/// - Stride of most minor memref dimension must be 1.
/// - Out-of-bounds masking is not required.		/// - Out-of-bounds masking is not required.
/// - If the memref's element type is a vector type then it coincides with the		/// - If the memref's element type is a vector type then it coincides with the
/// result type.		/// result type.
/// - The permutation map doesn't perform permutation (broadcasting is allowed).		/// - The permutation map doesn't perform permutation (broadcasting is allowed).
struct TransferReadToVectorLoadLowering		struct TransferReadToVectorLoadLowering
: public OpRewritePattern<vector::TransferReadOp> {		: public OpRewritePattern<vector::TransferReadOp> {
TransferReadToVectorLoadLowering(MLIRContext *context,		TransferReadToVectorLoadLowering(MLIRContext *context,
llvm::Optional<unsigned> maxRank)		llvm::Optional<unsigned> maxRank)
: OpRewritePattern<vector::TransferReadOp>(context),		: OpRewritePattern<vector::TransferReadOp>(context),
maxTransferRank(maxRank) {}		maxTransferRank(maxRank) {}

LogicalResult matchAndRewrite(vector::TransferReadOp read,		LogicalResult matchAndRewrite(vector::TransferReadOp read,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (maxTransferRank && read.getVectorType().getRank() > *maxTransferRank)		if (maxTransferRank && read.getVectorType().getRank() > *maxTransferRank)
return failure();		return failure();

SmallVector<unsigned, 4> broadcastedDims;		SmallVector<unsigned, 4> broadcastedDims;
// Permutations are handled by VectorToSCF or		// Permutations are handled by VectorToSCF or
// populateVectorTransferPermutationMapLoweringPatterns.		// populateVectorTransferPermutationMapLoweringPatterns.
		// We let the 0-d corner case pass-through as it is supported.
if (!read.permutation_map().isMinorIdentityWithBroadcasting(		if (!read.permutation_map().isMinorIdentityWithBroadcasting(
		dcaballe (inline comment): Move comment before the if?
&broadcastedDims))		&broadcastedDims))
return failure();		return failure();

auto memRefType = read.getShapedType().dyn_cast<MemRefType>();		auto memRefType = read.getShapedType().dyn_cast<MemRefType>();
if (!memRefType)		if (!memRefType)
return failure();		return failure();

// Non-unit strides are handled by VectorToSCF.		// Non-unit strides are handled by VectorToSCF.
if (!vector::isLastMemrefDimUnitStride(memRefType))		if (!vector::isLastMemrefDimUnitStride(memRefType))
return failure();		return failure();

// If there is broadcasting involved then we first load the unbroadcasted		// If there is broadcasting involved then we first load the unbroadcasted
// vector, and then broadcast it with `vector.broadcast`.		// vector, and then broadcast it with `vector.broadcast`.
ArrayRef<int64_t> vectorShape = read.getVectorType().getShape();		ArrayRef<int64_t> vectorShape = read.getVectorType().getShape();
SmallVector<int64_t, 4> unbroadcastedVectorShape(vectorShape.begin(),		SmallVector<int64_t, 4> unbroadcastedVectorShape(vectorShape.begin(),
vectorShape.end());		vectorShape.end());
for (unsigned i : broadcastedDims)		for (unsigned i : broadcastedDims)
unbroadcastedVectorShape[i] = 1;		unbroadcastedVectorShape[i] = 1;
VectorType unbroadcastedVectorType = VectorType::get(		VectorType unbroadcastedVectorType = VectorType::get(
unbroadcastedVectorShape, read.getVectorType().getElementType());		unbroadcastedVectorShape, read.getVectorType().getElementType());

// `vector.load` supports vector types as memref's elements only when the		// `vector.load` supports vector types as memref's elements only when the
// resulting vector type is the same as the element type.		// resulting vector type is the same as the element type.
auto memrefElTy = memRefType.getElementType();		auto memrefElTy = memRefType.getElementType();
if (memrefElTy.isa<VectorType>() && memrefElTy != unbroadcastedVectorType)		if (memrefElTy.isa<VectorType>() && memrefElTy != unbroadcastedVectorType)
return failure();		return failure();

// Otherwise, element types of the memref and the vector must match.		// Otherwise, element types of the memref and the vector must match.
if (!memrefElTy.isa<VectorType>() &&		if (!memrefElTy.isa<VectorType>() &&
memrefElTy != read.getVectorType().getElementType())		memrefElTy != read.getVectorType().getElementType())
return failure();		return failure();

// Out-of-bounds dims are handled by MaterializeTransferMask.		// Out-of-bounds dims are handled by MaterializeTransferMask.
if (read.hasOutOfBoundsDim())		if (read.hasOutOfBoundsDim())
return failure();		return failure();
[21 lines not shown]	LogicalResult matchAndRewrite(vector::TransferReadOp read,
}		}

return success();		return success();
}		}

llvm::Optional<unsigned> maxTransferRank;		llvm::Optional<unsigned> maxTransferRank;
};		};

/// Replace a scalar vector.load with a memref.load.		/// Replace a 0-d vector.load with a memref.load + vector.broadcast.
		// TODO: we shouldn't cross the vector/scalar domains just for this
		// but atm we lack the infra to avoid it. Possible solutions include:
		// - go directly to LLVM + bitcast
		// - introduce a bitcast op and likely a new pointer dialect
		// - let memref.load/store additionally support the 0-d vector case
		// There are still deeper data layout issues lingering even in this
		// trivial case (for architectures for which this matters).
		dcaballe (inline comment): load?
struct VectorLoadToMemrefLoadLowering		struct VectorLoadToMemrefLoadLowering
: public OpRewritePattern<vector::LoadOp> {		: public OpRewritePattern<vector::LoadOp> {
using OpRewritePattern<vector::LoadOp>::OpRewritePattern;		using OpRewritePattern<vector::LoadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::LoadOp loadOp,		LogicalResult matchAndRewrite(vector::LoadOp loadOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
auto vecType = loadOp.getVectorType();		auto vecType = loadOp.getVectorType();
if (vecType.getNumElements() != 1)		if (vecType.getNumElements() != 1)
return failure();		return failure();
auto memrefLoad = rewriter.create<memref::LoadOp>(		auto memrefLoad = rewriter.create<memref::LoadOp>(
loadOp.getLoc(), loadOp.base(), loadOp.indices());		loadOp.getLoc(), loadOp.base(), loadOp.indices());
rewriter.replaceOpWithNewOp<vector::BroadcastOp>(		rewriter.replaceOpWithNewOp<vector::BroadcastOp>(loadOp, vecType,
loadOp, VectorType::get({1}, vecType.getElementType()), memrefLoad);		memrefLoad);
return success();		return success();
}		}
};		};

/// Replace a scalar vector.store with a memref.store.		/// Replace a 0-d vector.store with a vector.extractelement + memref.store.
		dcaballe (inline comment): Question about this lowering: why are we lowering 0-d vectors to the scalar world? I thought one of the goals discussed in the RFC was to keep everything within the vector world to avoid the scalar<->vector transition that could be very expensive for some targets. Shouldn't we lower 0-d vectors to a `vector<1xtype>` in LLVM instead?
		nicolasvasilache (author, inline comment): Note that this is not changing any behavior here; we were already doing this extraction for `vector<1x..x1xtype>`. Your point is very valid, but I think it is beyond this CL. We need to either: (1) go directly to LLVM + bitcast; (2) introduce a bitcast; or (3) let memref.load/store additionally support the 0-d vector case, which itself may be quite fraught with issues. There are still deeper data layout issues lingering even in this trivial case (for architectures for which this matters, that is). For now I don't think I can do better than adding a TODO.
struct VectorStoreToMemrefStoreLowering		struct VectorStoreToMemrefStoreLowering
: public OpRewritePattern<vector::StoreOp> {		: public OpRewritePattern<vector::StoreOp> {
using OpRewritePattern<vector::StoreOp>::OpRewritePattern;		using OpRewritePattern<vector::StoreOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::StoreOp storeOp,		LogicalResult matchAndRewrite(vector::StoreOp storeOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
auto vecType = storeOp.getVectorType();		auto vecType = storeOp.getVectorType();
if (vecType.getNumElements() != 1)		if (vecType.getNumElements() != 1)
return failure();		return failure();
		Value extracted;
		if (vecType.getRank() == 0) {
		// TODO: Unify once ExtractOp supports 0-d vectors.
		extracted = rewriter.create<vector::ExtractElementOp>(
		storeOp.getLoc(), storeOp.valueToStore());
		} else {
SmallVector<int64_t> indices(vecType.getRank(), 0);		SmallVector<int64_t> indices(vecType.getRank(), 0);
Value extracted = rewriter.create<vector::ExtractOp>(		extracted = rewriter.create<vector::ExtractOp>(
storeOp.getLoc(), storeOp.valueToStore(), indices);		storeOp.getLoc(), storeOp.valueToStore(), indices);
		}

rewriter.replaceOpWithNewOp<memref::StoreOp>(		rewriter.replaceOpWithNewOp<memref::StoreOp>(
storeOp, extracted, storeOp.base(), storeOp.indices());		storeOp, extracted, storeOp.base(), storeOp.indices());
return success();		return success();
}		}
};		};

/// Progressive lowering of transfer_write. This pattern supports lowering of		/// Progressive lowering of transfer_write. This pattern supports lowering of
/// `vector.transfer_write` to `vector.store` if all of the following hold:		/// `vector.transfer_write` to `vector.store` if all of the following hold:
[9 lines not shown]	TransferWriteToVectorStoreLowering(MLIRContext *context,
llvm::Optional<unsigned> maxRank)		llvm::Optional<unsigned> maxRank)
: OpRewritePattern<vector::TransferWriteOp>(context),		: OpRewritePattern<vector::TransferWriteOp>(context),
maxTransferRank(maxRank) {}		maxTransferRank(maxRank) {}

LogicalResult matchAndRewrite(vector::TransferWriteOp write,		LogicalResult matchAndRewrite(vector::TransferWriteOp write,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
if (maxTransferRank && write.getVectorType().getRank() > *maxTransferRank)		if (maxTransferRank && write.getVectorType().getRank() > *maxTransferRank)
return failure();		return failure();

// Permutations are handled by VectorToSCF or		// Permutations are handled by VectorToSCF or
// populateVectorTransferPermutationMapLoweringPatterns.		// populateVectorTransferPermutationMapLoweringPatterns.
if (!write.isZeroD() && !write.permutation_map().isMinorIdentity())		if ( // pass-through for the 0-d corner case.
		!write.permutation_map().isMinorIdentity())
return failure();		return failure();

auto memRefType = write.getShapedType().dyn_cast<MemRefType>();		auto memRefType = write.getShapedType().dyn_cast<MemRefType>();
if (!memRefType)		if (!memRefType)
return failure();		return failure();

// Non-unit strides are handled by VectorToSCF.		// Non-unit strides are handled by VectorToSCF.
if (!vector::isLastMemrefDimUnitStride(memRefType))		if (!vector::isLastMemrefDimUnitStride(memRefType))
return failure();		return failure();

// `vector.store` supports vector types as memref's elements only when the		// `vector.store` supports vector types as memref's elements only when the
// type of the vector value being written is the same as the element type.		// type of the vector value being written is the same as the element type.
auto memrefElTy = memRefType.getElementType();		auto memrefElTy = memRefType.getElementType();
if (memrefElTy.isa<VectorType>() && memrefElTy != write.getVectorType())		if (memrefElTy.isa<VectorType>() && memrefElTy != write.getVectorType())
return failure();		return failure();

// Otherwise, element types of the memref and the vector must match.		// Otherwise, element types of the memref and the vector must match.
if (!memrefElTy.isa<VectorType>() &&		if (!memrefElTy.isa<VectorType>() &&
memrefElTy != write.getVectorType().getElementType())		memrefElTy != write.getVectorType().getElementType())
return failure();		return failure();

// Out-of-bounds dims are handled by MaterializeTransferMask.		// Out-of-bounds dims are handled by MaterializeTransferMask.
if (write.hasOutOfBoundsDim())		if (write.hasOutOfBoundsDim())
return failure();		return failure();
if (write.mask()) {		if (write.mask()) {
rewriter.replaceOpWithNewOp<vector::MaskedStoreOp>(		rewriter.replaceOpWithNewOp<vector::MaskedStoreOp>(
write, write.source(), write.indices(), write.mask(), write.vector());		write, write.source(), write.indices(), write.mask(), write.vector());
} else {		} else {
rewriter.replaceOpWithNewOp<vector::StoreOp>(		rewriter.replaceOpWithNewOp<vector::StoreOp>(
[383 lines not shown]
};		};

// Drop inner most contiguous unit dimensions from transfer_read operand.		// Drop inner most contiguous unit dimensions from transfer_read operand.
class DropInnerMostUnitDims : public OpRewritePattern<vector::TransferReadOp> {		class DropInnerMostUnitDims : public OpRewritePattern<vector::TransferReadOp> {
using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;		using OpRewritePattern<vector::TransferReadOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TransferReadOp readOp,		LogicalResult matchAndRewrite(vector::TransferReadOp readOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
		// TODO: support 0-d corner case.
		if (readOp.getTransferRank() == 0)
		return failure();

		// TODO: support mask.
		if (readOp.mask())
		return failure();

auto srcType = readOp.source().getType().dyn_cast<MemRefType>();		auto srcType = readOp.source().getType().dyn_cast<MemRefType>();
if (!srcType || !srcType.hasStaticShape())		if (!srcType || !srcType.hasStaticShape())
return failure();		return failure();

if (!readOp.permutation_map().isMinorIdentity())		if (!readOp.permutation_map().isMinorIdentity())
return failure();		return failure();

auto targetType = readOp.getVectorType();		auto targetType = readOp.getVectorType();
[40 lines not shown]	if (srcType.getLayout().getAffineMap().isIdentity()) {
srcType.getShape().drop_back(dimsToDrop), srcType.getElementType(),		srcType.getShape().drop_back(dimsToDrop), srcType.getElementType(),
map, srcType.getMemorySpaceAsInt());		map, srcType.getMemorySpaceAsInt());
}		}

auto loc = readOp.getLoc();		auto loc = readOp.getLoc();
SmallVector<int64_t> offsets(srcType.getRank(), 0);		SmallVector<int64_t> offsets(srcType.getRank(), 0);
SmallVector<int64_t> strides(srcType.getRank(), 1);		SmallVector<int64_t> strides(srcType.getRank(), 1);

ArrayAttr inBounds =		ArrayAttr inBoundsAttr =
readOp.in_bounds()		readOp.in_bounds()
? rewriter.getArrayAttr(		? rewriter.getArrayAttr(
readOp.in_boundsAttr().getValue().drop_back(dimsToDrop))		readOp.in_boundsAttr().getValue().drop_back(dimsToDrop))
: ArrayAttr();		: ArrayAttr();
Value rankedReducedView = rewriter.create<memref::SubViewOp>(		Value rankedReducedView = rewriter.create<memref::SubViewOp>(
loc, resultMemrefType, readOp.source(), offsets, srcType.getShape(),		loc, resultMemrefType, readOp.source(), offsets, srcType.getShape(),
strides);		strides);
auto permMap = getTransferMinorIdentityMap(		auto permMap = getTransferMinorIdentityMap(
rankedReducedView.getType().cast<ShapedType>(), resultTargetVecType);		rankedReducedView.getType().cast<ShapedType>(), resultTargetVecType);
Value result = rewriter.create<vector::TransferReadOp>(		Value result = rewriter.create<vector::TransferReadOp>(
loc, resultTargetVecType, rankedReducedView,		loc, resultTargetVecType, rankedReducedView,
readOp.indices().drop_back(dimsToDrop), permMap, readOp.padding(),		readOp.indices().drop_back(dimsToDrop), AffineMapAttr::get(permMap),
inBounds);		readOp.padding(),
		// TODO: support mask.
		/*mask=*/Value(), inBoundsAttr);
rewriter.replaceOpWithNewOp<vector::ShapeCastOp>(readOp, targetType,		rewriter.replaceOpWithNewOp<vector::ShapeCastOp>(readOp, targetType,
result);		result);
return success();		return success();
}		}
};		};

void mlir::vector::populateVectorMaskMaterializationPatterns(		void mlir::vector::populateVectorMaskMaterializationPatterns(
RewritePatternSet &patterns, bool indexOptimizations) {		RewritePatternSet &patterns, bool indexOptimizations) {
[85 lines not shown]

mlir/lib/Interfaces/VectorInterfaces.cpp

[14 lines not shown]	VectorType mlir::vector::detail::transferMaskType(VectorType vecType,
auto i1Type = IntegerType::get(map.getContext(), 1);		auto i1Type = IntegerType::get(map.getContext(), 1);
SmallVector<int64_t, 8> shape;		SmallVector<int64_t, 8> shape;
for (int64_t i = 0; i < vecType.getRank(); ++i) {		for (int64_t i = 0; i < vecType.getRank(); ++i) {
// Only result dims have a corresponding dim in the mask.		// Only result dims have a corresponding dim in the mask.
if (map.getResult(i).template isa<AffineDimExpr>()) {		if (map.getResult(i).template isa<AffineDimExpr>()) {
shape.push_back(vecType.getDimSize(i));		shape.push_back(vecType.getDimSize(i));
}		}
}		}
return shape.empty() ? VectorType() : VectorType::get(shape, i1Type);		return VectorType::get(shape, i1Type);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// VectorUnroll Interfaces		// VectorUnroll Interfaces
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Include the definitions of the VectorUnroll interfaces.		/// Include the definitions of the VectorUnroll interfaces.
#include "mlir/Interfaces/VectorInterfaces.cpp.inc"		#include "mlir/Interfaces/VectorInterfaces.cpp.inc"
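The change to `transferMaskType` above drops the special case that returned a null type when no map result is a dimension expression. A minimal sketch of the before/after behavior on plain shapes, assuming an empty shape stands for a 0-d vector; the names `maskShapeBefore` and `maskShapeAfter`, and the use of a `std::vector<bool>` to model "is this map result an AffineDimExpr", are hypothetical and not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Shared loop: keep only the vector dims whose map result is a dim expression
// (resultIsDim[i] == true); constant results (broadcasts) get no mask dim.
static std::vector<int64_t> keepDimResults(const std::vector<int64_t> &vecShape,
                                           const std::vector<bool> &resultIsDim) {
  assert(vecShape.size() == resultIsDim.size());
  std::vector<int64_t> shape;
  for (std::size_t i = 0; i < vecShape.size(); ++i)
    if (resultIsDim[i])
      shape.push_back(vecShape[i]);
  return shape;
}

// Models the old behavior: an empty mask shape yields "no type".
std::optional<std::vector<int64_t>>
maskShapeBefore(const std::vector<int64_t> &vecShape,
                const std::vector<bool> &resultIsDim) {
  std::vector<int64_t> shape = keepDimResults(vecShape, resultIsDim);
  if (shape.empty())
    return std::nullopt;
  return shape;
}

// Models the new behavior: an empty shape is returned as-is and, in this
// sketch, is taken to denote a 0-d mask vector.
std::vector<int64_t> maskShapeAfter(const std::vector<int64_t> &vecShape,
                                    const std::vector<bool> &resultIsDim) {
  return keepDimResults(vecShape, resultIsDim);
}
```

In this model a 0-d transfer (empty vector shape, empty map) gets no mask type before the change and an empty-shape mask after it, while transfers with at least one dim result are unaffected.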

mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir

	// RUN: mlir-opt %s -convert-vector-to-scf -split-input-file -allow-unregistered-dialect | FileCheck %s			// RUN: mlir-opt %s -convert-vector-to-scf -split-input-file -allow-unregistered-dialect | FileCheck %s
	// RUN: mlir-opt %s -convert-vector-to-scf=full-unroll=true -split-input-file -allow-unregistered-dialect | FileCheck %s --check-prefix=FULL-UNROLL			// RUN: mlir-opt %s -convert-vector-to-scf=full-unroll=true -split-input-file -allow-unregistered-dialect | FileCheck %s --check-prefix=FULL-UNROLL

	// CHECK-LABEL: func @vector_transfer_ops_0d(			// CHECK-LABEL: func @vector_transfer_ops_0d(
	// CHECK-SAME: %[[MEM:.*]]: memref<f32>) {
	func @vector_transfer_ops_0d(%M: memref<f32>) {			func @vector_transfer_ops_0d(%M: memref<f32>) {
	%f0 = arith.constant 0.0 : f32			%f0 = arith.constant 0.0 : f32

	// CHECK: %[[V0:.*]] = arith.constant dense<0{{.*}}> : vector<1xf32>			// 0-d transfers are left untouched by vector-to-scf.
	// CHECK: %[[R0:.*]] = scf.for %[[I:.*]] = {{.*}} iter_args(%[[V0_ITER:.*]] = %[[V0]]) -> (vector<1xf32>) {			// They are independently lowered to the proper memref.load/store.
	// CHECK: %[[S:.*]] = memref.load %[[MEM]][] : memref<f32>			// CHECK: vector.transfer_read {{.*}}: memref<f32>, vector<f32>
	// CHECK: %[[R_ITER:.*]] = vector.insertelement %[[S]], %[[V0_ITER]][%[[I]] : index] : vector<1xf32>			%0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->()>} :
	// CHECK: scf.yield %[[R_ITER]] : vector<1xf32>			memref<f32>, vector<f32>
	%0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->(0)>} :
	memref<f32>, vector<1xf32>			// CHECK: vector.transfer_write {{.*}}: vector<f32>, memref<f32>
				vector.transfer_write %0, %M[] {permutation_map = affine_map<()->()>} :
	// CHECK: scf.for %[[J:.*]] = %{{.*}}			vector<f32>, memref<f32>
	// CHECK: %[[SS:.*]] = vector.extractelement %[[R0]][%[[J]] : index] : vector<1xf32>
	// CHECK: memref.store %[[SS]], %[[MEM]][] : memref<f32>
	vector.transfer_write %0, %M[] {permutation_map = affine_map<()->(0)>} :
	vector<1xf32>, memref<f32>

	return			return
	}			}

	// -----			// -----

	// CHECK-LABEL: func @materialize_read_1d() {			// CHECK-LABEL: func @materialize_read_1d() {
	func @materialize_read_1d() {			func @materialize_read_1d() {
	%f0 = arith.constant 0.0: f32			%f0 = arith.constant 0.0: f32
	%A = memref.alloc () : memref<7x42xf32>			%A = memref.alloc () : memref<7x42xf32>
	[457 lines not shown]

mlir/test/Dialect/Linalg/vectorization.mlir

[194 lines not shown]	func @test_vectorize_fill(%A : memref<8x16xf32>, %arg0 : f32) {
return		return
}		}

// -----		// -----

// CHECK-LABEL: func @test_vectorize_fill		// CHECK-LABEL: func @test_vectorize_fill
func @test_vectorize_fill_scalar(%A : memref<f32>, %arg0 : f32) {		func @test_vectorize_fill_scalar(%A : memref<f32>, %arg0 : f32) {
// CHECK-SAME: (%[[M:.*]]: memref<f32>, %[[val:.*]]: f32)		// CHECK-SAME: (%[[M:.*]]: memref<f32>, %[[val:.*]]: f32)
// CHECK: %[[VEC:.*]] = vector.broadcast %[[val]] : f32 to vector<1xf32>		// CHECK: %[[VEC:.*]] = vector.broadcast %[[val]] : f32 to vector<f32>
// CHECK: vector.transfer_write %[[VEC]], %[[M]][] {{.*}} : vector<1xf32>, memref<f32>		// CHECK: vector.transfer_write %[[VEC]], %[[M]][] : vector<f32>, memref<f32>
linalg.fill(%arg0, %A) : f32, memref<f32>		linalg.fill(%arg0, %A) : f32, memref<f32>
return		return
}		}

// -----		// -----

// CHECK-LABEL: func @test_vectorize_copy		// CHECK-LABEL: func @test_vectorize_copy
func @test_vectorize_copy(%A : memref<8x16xf32>, %B : memref<8x16xf32>) {		func @test_vectorize_copy(%A : memref<8x16xf32>, %B : memref<8x16xf32>) {
// CHECK: %[[V:.*]] = vector.transfer_read {{.*}} : memref<8x16xf32>, vector<8x16xf32>		// CHECK: %[[V:.*]] = vector.transfer_read {{.*}} : memref<8x16xf32>, vector<8x16xf32>
// CHECK: vector.transfer_write %[[V]], {{.*}} : vector<8x16xf32>, memref<8x16xf32>		// CHECK: vector.transfer_write %[[V]], {{.*}} : vector<8x16xf32>, memref<8x16xf32>
linalg.copy(%A, %B) : memref<8x16xf32>, memref<8x16xf32>		linalg.copy(%A, %B) : memref<8x16xf32>, memref<8x16xf32>
return		return
}		}

// -----		// -----

// CHECK-LABEL: func @test_vectorize_copy_scalar		// CHECK-LABEL: func @test_vectorize_copy_scalar
func @test_vectorize_copy_scalar(%A : memref<f32>, %B : memref<f32>) {		func @test_vectorize_copy_scalar(%A : memref<f32>, %B : memref<f32>) {
// CHECK-SAME: (%[[A:.*]]: memref<f32>, %[[B:.*]]: memref<f32>)		// CHECK-SAME: (%[[A:.*]]: memref<f32>, %[[B:.*]]: memref<f32>)
// CHECK: %[[V:.*]] = vector.transfer_read %[[A]][]{{.*}} : memref<f32>, vector<1xf32>		// CHECK: %[[V:.*]] = vector.transfer_read %[[A]][]{{.*}} : memref<f32>, vector<f32>
// CHECK: %[[val:.*]] = vector.extract %[[V]][0] : vector<1xf32>		// CHECK: %[[val:.*]] = vector.extractelement %[[V]][] : vector<f32>
// CHECK: %[[VV:.*]] = vector.broadcast %[[val]] : f32 to vector<1xf32>		// CHECK: %[[VV:.*]] = vector.broadcast %[[val]] : f32 to vector<f32>
// CHECK: vector.transfer_write %[[VV]], %[[B]][] {{.*}} : vector<1xf32>, memref<f32>		// CHECK: vector.transfer_write %[[VV]], %[[B]][] : vector<f32>, memref<f32>
linalg.copy(%A, %B) : memref<f32>, memref<f32>		linalg.copy(%A, %B) : memref<f32>, memref<f32>
return		return
}		}

// -----		// -----

// CHECK-LABEL: func @test_vectorize_trailing_index		// CHECK-LABEL: func @test_vectorize_trailing_index
// CHECK-SAME: (%[[ARG0:.*]]: memref<1x2x4x8xindex>)		// CHECK-SAME: (%[[ARG0:.*]]: memref<1x2x4x8xindex>)
[764 lines not shown]	func @fused_broadcast_red_2d(%arg0: tensor<4x4xf32>, %arg1: tensor<4x1xf32>) -> tensor<4xf32> {
return %red : tensor<4xf32>		return %red : tensor<4xf32>
}		}

// -----		// -----

// CHECK-LABEL: func @reduce_1d(		// CHECK-LABEL: func @reduce_1d(
// CHECK-SAME: %[[A:.*]]: tensor<32xf32>		// CHECK-SAME: %[[A:.*]]: tensor<32xf32>
func @reduce_1d(%arg0: tensor<32xf32>) -> tensor<f32> {		func @reduce_1d(%arg0: tensor<32xf32>) -> tensor<f32> {
// CHECK-DAG: %[[F0_v1:.*]] = arith.constant dense<0.000000e+00> : vector<1xf32>		// CHECK-DAG: %[[vF0:.*]] = arith.constant dense<0.000000e+00> : vector<f32>
// CHECK-DAG: %[[F0:.*]] = arith.constant 0.000000e+00 : f32		// CHECK-DAG: %[[F0:.*]] = arith.constant 0.000000e+00 : f32
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index		// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
%f0 = arith.constant 0.000000e+00 : f32		%f0 = arith.constant 0.000000e+00 : f32

// CHECK: %[[init:.*]] = linalg.init_tensor [] : tensor<f32>		// CHECK: %[[init:.*]] = linalg.init_tensor [] : tensor<f32>
%0 = linalg.init_tensor [] : tensor<f32>		%0 = linalg.init_tensor [] : tensor<f32>

// CHECK: %[[f:.*]] = vector.transfer_write %[[F0_v1]], %[[init]][]		// CHECK: %[[f:.*]] = vector.transfer_write %[[vF0]], %[[init]][]
// CHECK-SAME: : vector<1xf32>, tensor<f32>		// CHECK-SAME: : vector<f32>, tensor<f32>
%1 = linalg.fill(%f0, %0) : f32, tensor<f32> -> tensor<f32>		%1 = linalg.fill(%f0, %0) : f32, tensor<f32> -> tensor<f32>
// CHECK: %[[r:.*]] = vector.transfer_read %[[A]][%[[C0]]]		// CHECK: %[[r:.*]] = vector.transfer_read %[[A]][%[[C0]]]
// CHECK-SAME: : tensor<32xf32>, vector<32xf32>		// CHECK-SAME: : tensor<32xf32>, vector<32xf32>
		// CHECK: %[[f0:.*]] = vector.extractelement %[[vF0]][] : vector<f32>
// CHECK: %[[red:.*]] = vector.multi_reduction #vector.kind<add>, %[[r]] [0]		// CHECK: %[[red:.*]] = vector.multi_reduction #vector.kind<add>, %[[r]] [0]
// CHECK-SAME: : vector<32xf32> to f32		// CHECK-SAME: : vector<32xf32> to f32
// CHECK: %[[a:.*]] = arith.addf %[[red]], %[[F0]] : f32		// CHECK: %[[a:.*]] = arith.addf %[[red]], %[[f0]] : f32
// CHECK: %[[red_v1:.*]] = vector.broadcast %[[a]] : f32 to vector<1xf32>		// CHECK: %[[red_v1:.*]] = vector.broadcast %[[a]] : f32 to vector<f32>
// CHECK: %[[res:.*]] = vector.transfer_write %[[red_v1]], %[[f]][]		// CHECK: %[[res:.*]] = vector.transfer_write %[[red_v1]], %[[f]][]
// CHECK-SAME: : vector<1xf32>, tensor<f32>		// CHECK-SAME: : vector<f32>, tensor<f32>
%2 = linalg.generic {		%2 = linalg.generic {
indexing_maps = [affine_map<(d0) -> (d0)>,		indexing_maps = [affine_map<(d0) -> (d0)>,
affine_map<(d0) -> ()>],		affine_map<(d0) -> ()>],
iterator_types = ["reduction"]}		iterator_types = ["reduction"]}
ins(%arg0 : tensor<32xf32>)		ins(%arg0 : tensor<32xf32>)
outs(%1 : tensor<f32>) {		outs(%1 : tensor<f32>) {
^bb0(%a: f32, %b: f32): // no predecessors		^bb0(%a: f32, %b: f32): // no predecessors
%3 = arith.addf %a, %b : f32		%3 = arith.addf %a, %b : f32
linalg.yield %3 : f32		linalg.yield %3 : f32
} -> tensor<f32>		} -> tensor<f32>

return %2 : tensor<f32>		return %2 : tensor<f32>
}		}

mlir/test/Dialect/Vector/invalid.mlir

	[1,421 lines not shown]

	// -----			// -----

	func @insert_map_id(%v: vector<2x1xf32>, %v1: vector<4x32xf32>, %id : index) {			func @insert_map_id(%v: vector<2x1xf32>, %v1: vector<4x32xf32>, %id : index) {
	// expected-error@+1 {{'vector.insert_map' op expected number of ids must match the number of dimensions distributed}}			// expected-error@+1 {{'vector.insert_map' op expected number of ids must match the number of dimensions distributed}}
	%0 = vector.insert_map %v, %v1[%id] : vector<2x1xf32> into vector<4x32xf32>			%0 = vector.insert_map %v, %v1[%id] : vector<2x1xf32> into vector<4x32xf32>
	}			}

	// -----

	func @vector_transfer_ops_0d(%arg0: tensor<f32>)
	-> tensor<f32> {
	%f0 = arith.constant 0.0 : f32
	// expected-error@+1 {{0-d transfer requires vector<1xt> shape and () -> (0) permutation_map}}
	%0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<(d0)->(d0)>} :
	tensor<f32>, vector<1xf32>
	%1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->(0)>} :
	vector<1xf32>, tensor<f32>
	return %1: tensor<f32>
	}

mlir/test/Dialect/Vector/ops.mlir

	// RUN: mlir-opt %s | mlir-opt | FileCheck %s			// RUN: mlir-opt %s | mlir-opt | FileCheck %s

	// CHECK-LABEL: func @vector_transfer_ops_0d(			// CHECK-LABEL: func @vector_transfer_ops_0d(
	func @vector_transfer_ops_0d(%arg0: tensor<f32>, %arg1: memref<f32>)			func @vector_transfer_ops_0d(%arg0: tensor<f32>, %arg1: memref<f32>)
	-> tensor<f32> {			-> tensor<f32> {
	%f0 = arith.constant 0.0 : f32			%f0 = arith.constant 0.0 : f32
	%0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<()->(0)>} :			%0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<()->()>} :
	tensor<f32>, vector<1xf32>			tensor<f32>, vector<f32>
	%1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->(0)>} :			%1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->()>} :
	vector<1xf32>, tensor<f32>			vector<f32>, tensor<f32>
	%2 = vector.transfer_read %arg1[], %f0 {permutation_map = affine_map<()->(0)>} :			%2 = vector.transfer_read %arg1[], %f0 {permutation_map = affine_map<()->()>} :
	memref<f32>, vector<1xf32>			memref<f32>, vector<f32>
	vector.transfer_write %2, %arg1[] {permutation_map = affine_map<()->(0)>} :			vector.transfer_write %2, %arg1[] {permutation_map = affine_map<()->()>} :
	vector<1xf32>, memref<f32>			vector<f32>, memref<f32>
				ThomasRaoux (inline comment): Can a 0-D vector be read from a non-0-D tensor? It would be good to add a test for it.
	return %1: tensor<f32>			return %1: tensor<f32>
	}			}

				// CHECK-LABEL: func @vector_transfer_ops_0d_from_higher_d(
				func @vector_transfer_ops_0d_from_higher_d(%arg0: tensor<?xf32>, %arg1: memref<?x?xf32>)
				-> tensor<?xf32> {
				%c0 = arith.constant 0 : index
				%f0 = arith.constant 0.0 : f32
				%0 = vector.transfer_read %arg0[%c0], %f0 {permutation_map = affine_map<(d0)->()>} :
				tensor<?xf32>, vector<f32>
				%1 = vector.transfer_write %0, %arg0[%c0] {permutation_map = affine_map<(d0)->()>} :
				vector<f32>, tensor<?xf32>
				%2 = vector.transfer_read %arg1[%c0, %c0], %f0 {permutation_map = affine_map<(d0, d1)->()>} :
				memref<?x?xf32>, vector<f32>
				vector.transfer_write %2, %arg1[%c0, %c0] {permutation_map = affine_map<(d0, d1)->()>} :
				vector<f32>, memref<?x?xf32>
				return %1: tensor<?xf32>
				}
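
The added test above reads a 0-d vector from rank-1 and rank-2 sources using maps such as `affine_map<(d0)->()>` and `affine_map<(d0, d1)->()>`. The rank relationship these tests rely on can be sketched as a small Python model. This is only an illustration under stated assumptions: `PermutationMap` and `check_transfer` are hypothetical names, not MLIR APIs, and the real verifier checks more than ranks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermutationMap:
    """Hypothetical stand-in for affine_map<(d0, ..., dN) -> (results)>."""
    num_dims: int     # number of map inputs, e.g. 2 for (d0, d1)
    num_results: int  # number of map results, e.g. 0 for "-> ()"


def check_transfer(source_rank, vector_rank, pmap):
    """Return the rank mismatches found; an empty list means the ranks agree.

    Assumption: map inputs pair with source dimensions and map results pair
    with vector dimensions. Nothing else (element types, in_bounds, ...) is
    modeled here.
    """
    errors = []
    if pmap.num_dims != source_rank:
        errors.append("map inputs must match source rank")
    if pmap.num_results != vector_rank:
        errors.append("map results must match vector rank")
    return errors


# tensor<f32> -> vector<f32> with affine_map<()->()>
print(check_transfer(0, 0, PermutationMap(0, 0)))
# memref<?x?xf32> -> vector<f32> with affine_map<(d0, d1)->()>
print(check_transfer(2, 0, PermutationMap(2, 0)))
# tensor<f32> -> vector<1xf32> with affine_map<(d0)->(d0)>, as in invalid.mlir
print(check_transfer(0, 1, PermutationMap(1, 1)))
```

Under this model the old `vector<1xf32>` form with `affine_map<()->(0)>` and the new `vector<f32>` form with `affine_map<()->()>` both have consistent ranks; only the vector rank and the number of map results change together.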

	// CHECK-LABEL: func @vector_transfer_ops(			// CHECK-LABEL: func @vector_transfer_ops(
	func @vector_transfer_ops(%arg0: memref<?x?xf32>,			func @vector_transfer_ops(%arg0: memref<?x?xf32>,
	%arg1 : memref<?x?xvector<4x3xf32>>,			%arg1 : memref<?x?xvector<4x3xf32>>,
	%arg2 : memref<?x?xvector<4x3xi32>>,			%arg2 : memref<?x?xvector<4x3xi32>>,
	%arg3 : memref<?x?xvector<4x3xindex>>,			%arg3 : memref<?x?xvector<4x3xindex>>,
	%arg4 : memref<?x?x?xf32>) {			%arg4 : memref<?x?x?xf32>) {
	// CHECK: %[[C3:.*]] = arith.constant 3 : index			// CHECK: %[[C3:.*]] = arith.constant 3 : index
	%c3 = arith.constant 3 : index			%c3 = arith.constant 3 : index

mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir

	// RUN: mlir-opt %s -test-vector-transfer-lowering-patterns -canonicalize -split-input-file | FileCheck %s			// RUN: mlir-opt %s -test-vector-transfer-lowering-patterns -canonicalize -split-input-file | FileCheck %s

	// CHECK-LABEL: func @vector_transfer_ops_0d_memref(			// CHECK-LABEL: func @vector_transfer_ops_0d_memref(
	// CHECK-SAME: %[[MEM:.*]]: memref<f32>			// CHECK-SAME: %[[MEM:.*]]: memref<f32>
	// CHECK-SAME: %[[VV:.*]]: vector<1x1x1xf32>			// CHECK-SAME: %[[VV:.*]]: vector<1x1x1xf32>
	func @vector_transfer_ops_0d_memref(%M: memref<f32>, %v: vector<1x1x1xf32>) {			func @vector_transfer_ops_0d_memref(%M: memref<f32>, %v: vector<1x1x1xf32>) {
	%f0 = arith.constant 0.0 : f32			%f0 = arith.constant 0.0 : f32

	// CHECK-NEXT: %[[V:.*]] = memref.load %[[MEM]][] : memref<f32>			// CHECK-NEXT: %[[s:.*]] = memref.load %[[MEM]][] : memref<f32>
	%0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->(0)>} :			// CHECK-NEXT: %[[V:.*]] = vector.broadcast %[[s]] : f32 to vector<f32>
	memref<f32>, vector<1xf32>			%0 = vector.transfer_read %M[], %f0 : memref<f32>, vector<f32>

	// CHECK-NEXT: memref.store %[[V]], %[[MEM]][] : memref<f32>			// CHECK-NEXT: %[[ss:.*]] = vector.extractelement %[[V]][] : vector<f32>
	vector.transfer_write %0, %M[] {permutation_map = affine_map<()->(0)>} :			// CHECK-NEXT: memref.store %[[ss]], %[[MEM]][] : memref<f32>
	vector<1xf32>, memref<f32>			vector.transfer_write %0, %M[] : vector<f32>, memref<f32>

	// CHECK-NEXT: %[[VV:.*]] = vector.extract %arg1[0, 0, 0] : vector<1x1x1xf32>			// CHECK-NEXT: %[[VV:.*]] = vector.extract %arg1[0, 0, 0] : vector<1x1x1xf32>
	// CHECK-NEXT: memref.store %[[VV]], %[[MEM]][] : memref<f32>			// CHECK-NEXT: memref.store %[[VV]], %[[MEM]][] : memref<f32>
	vector.store %v, %M[] : memref<f32>, vector<1x1x1xf32>			vector.store %v, %M[] : memref<f32>, vector<1x1x1xf32>

	return			return
	}			}


Revision Contents

Diff 391048
