This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

mlir/include/mlir/Dialect/VectorOps/VectorOps.h (1/2)
mlir/lib/Dialect/VectorOps/VectorTransforms.cpp (6/7)
mlir/test/Dialect/VectorOps/vector-slices-transforms.mlir
mlir/test/lib/Transforms/TestVectorTransforms.cpp (2/2)

Differential D73295

[mlir] [VectorOps] Rewriting of vector.extract/insert_slices to other vector ops
ClosedPublic

Authored by aartbik on Jan 23 2020, 2:17 PM.

Details

Reviewers

nicolasvasilache
andydavis1
ftynse

Commits

rG303fddeeab10: [mlir] [VectorOps] Rewriting of vector.extract/insert_slices to other vector ops

Summary

Rewrites the extract/insert_slices operation in terms of
strided_slice/insert_strided_slice ops with intermediate
tuple uses (that should get optimized away with typical
usage). This is done in a separate "pass" to enable testing
this particular rewriting in isolation.
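
As a rough illustration of the rewriting described in the summary, the sketch below models a 2-D vector as a plain matrix and mimics the two lowerings: extracting slices as a "tuple" of unit-stride strided slices, then inserting them one by one into a zero-filled result. All names here (`Matrix`, `stridedSlice`, `insertStridedSlice`, `roundTrip`) are hypothetical stand-ins and not MLIR APIs; the sketch assumes unit strides and rank 2 only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a 2-D vector value (row-major); not an MLIR type.
using Matrix = std::vector<std::vector<float>>;

// Models a unit-stride strided slice: copies a rows x cols window of src.
Matrix stridedSlice(const Matrix &src, int rowOff, int colOff, int rows,
                    int cols) {
  Matrix out(rows, std::vector<float>(cols));
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < cols; ++c)
      out[r][c] = src[rowOff + r][colOff + c];
  return out;
}

// Models a unit-stride insert: returns dst with slice written at the offsets.
Matrix insertStridedSlice(const Matrix &slice, Matrix dst, int rowOff,
                          int colOff) {
  for (int r = 0; r < static_cast<int>(slice.size()); ++r)
    for (int c = 0; c < static_cast<int>(slice[r].size()); ++c)
      dst[rowOff + r][colOff + c] = slice[r][c];
  return dst;
}

// Extracts all tiles of src, then rebuilds the matrix from a zero "splat".
Matrix roundTrip(const Matrix &src, int tileRows, int tileCols) {
  assert(!src.empty() && tileRows > 0 && tileCols > 0);
  const int rows = static_cast<int>(src.size());
  const int cols = static_cast<int>(src[0].size());

  struct Piece {
    int rowOff, colOff;
    Matrix data;
  };
  // "extract_slices" step: one strided slice per tile; edge tiles are smaller.
  std::vector<Piece> tuple;
  for (int rowOff = 0; rowOff < rows; rowOff += tileRows)
    for (int colOff = 0; colOff < cols; colOff += tileCols)
      tuple.push_back({rowOff, colOff,
                       stridedSlice(src, rowOff, colOff,
                                    std::min(tileRows, rows - rowOff),
                                    std::min(tileCols, cols - colOff))});

  // "insert_slices" step: chain of inserts starting from an all-zero result.
  Matrix result(rows, std::vector<float>(cols, 0.0f));
  for (const Piece &p : tuple)
    result = insertStridedSlice(p.data, result, p.rowOff, p.colOff);
  return result;
}
```

For a 3x3 input tiled by 2x2, this produces four pieces of shapes 2x2, 2x1, 1x2 and 1x1, and the chain of inserts reproduces the input, which is the behavior an extract followed by an insert is expected to have.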

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aartbik created this revision. Jan 23 2020, 2:17 PM

Herald added a reviewer: nicolasvasilache. Jan 23 2020, 2:17 PM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, liufengdb, lucyrfox and 8 others.

aartbik added reviewers: andydavis1, ftynse. Jan 23 2020, 2:19 PM

Harbormaster failed remote builds in B44756: Diff 240008! Jan 23 2020, 2:31 PM

I really like how this has evolved from the original point, it is almost good to go on my end.

mlir/lib/Dialect/VectorOps/VectorTransforms.cpp
774	drop trivial braces plz, this is the MLIR style.
797	So I really like what you're doing re exposing and classifying patterns by intention, other places in the codebase should also do that and document it: "this set of patterns is useful for X"

Now, the selection of patterns you chose to add is a bit trickier IMO and I think we should:

1. name this `populateVectorSlicesLoweringPatterns` because it lowers out of these ops. Since it does not lower the type it is not a conversion so `LoweringPatterns` seems an appropriate name.
2. insert all the extra patterns (all the necessary tuple stuff) that ensure the Insert/ExtractSlices indeed go away, otherwise it will be surprising that the VectorSlicesLoweringPatterns are not enough to lower the vector slices.
3. explicitly list at the API doc level the patterns that are included (including the ) so people can easily look them up.

After we have enough of those, we will end up with pattern collections that implement behaviors. This will have a granularity somewhere in between (1) individual patterns and (2) full transformations. I expect this to be very powerful and independently testable like you do. I am particularly sensitive to this in light of https://reviews.llvm.org/D73145 in which I could not break the phase ordering/dependence for now.

@rriddle what's your take on this? Do you see a need / opportunity to have core infra support for collections of patterns and tests?

Side note: So far I have shelved the debugging of why https://reviews.llvm.org/D73145 does not work with fused patterns but we need to resolve it.
mlir/test/lib/Transforms/TestVectorTransforms.cpp
41	Re pattern selection, it would be great if `populateVectorSlicesLoweringPatterns` had everything it needs to make the test pass. So concretely, this line should go away (and we should hunt for other opportunities to improve other `populateXXX` methods by following your model).

oh yes and trivial braces everywhere plz, I just annotated one.

addressed ntv's comments

aartbik added inline comments. Jan 23 2020, 4:09 PM

mlir/lib/Dialect/VectorOps/VectorTransforms.cpp
774	you would think I knew that by now, but old habits die hard....
797	Done all (renamed and added doc), except that we don't need any specific tuple stuff anymore! Dead tuples are removed by DCE (in the greedy rewriter at least) while get-tuples-on-tuples are folded away automatically as well!
mlir/test/lib/Transforms/TestVectorTransforms.cpp
41	It has. I simply overlooked this line. It is not needed!

Unit tests: fail. 62144 tests passed, 6 failed and 811 were skipped.

failed: MLIR.Dialect/VectorOps/vector-slices-transforms.mlir
failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Harbormaster failed remote builds in B44779: Diff 240039! Jan 23 2020, 4:50 PM

rebase

Unit tests: fail. 62151 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B44857: Diff 240227! Jan 24 2020, 9:46 AM

nicolasvasilache added inline comments. Jan 24 2020, 11:45 AM

mlir/lib/Dialect/VectorOps/VectorTransforms.cpp
797	Sorry, I don't understand how the tuples get magically removed. Both rewrites insert tuple/tuple_get but I do not see which patterns remove them concretely. Maybe in these particular tests you wrote things degenerate into DCE (and that's great!), but I imagine it is possible to write tests where tuple ops will remain, right?
799	Also, let's plz rename to `XXXLowering` given the naming argument above.

renaming, added more comments

mlir/lib/Dialect/VectorOps/VectorTransforms.cpp

797

Well, it of course does not guarantee that all tuple ops are eliminated (it only deals with slices ops rewriting), since tuple values may "leak" going in already. Take for example

func @extract_slices(%arg0: vector<3x3xf32>) -> tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>> {
  %0 = vector.extract_slices %arg0, [2, 2], [1, 1]
    : vector<3x3xf32> into tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
  return %0 : tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
}

this will be lowered to

func @extract_slices(%arg0: vector<3x3xf32>) -> tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>> {

  %0 = vector.strided_slice %arg0 {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]} : vector<3x3xf32> to vector<2x2xf32>
  %1 = vector.strided_slice %arg0 {offsets = [0, 2], sizes = [2, 1], strides = [1, 1]} : vector<3x3xf32> to vector<2x1xf32>
  %2 = vector.strided_slice %arg0 {offsets = [2, 0], sizes = [1, 2], strides = [1, 1]} : vector<3x3xf32> to vector<1x2xf32>
  %3 = vector.strided_slice %arg0 {offsets = [2, 2], sizes = [1, 1], strides = [1, 1]} : vector<3x3xf32> to vector<1x1xf32>
  %4 = vector.tuple %0, %1, %2, %3 : vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>
  return %4 : tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
}

Here the tuple leaks, because there is no way to remove it. However, the lowering guarantees that any uses of slices where the tuple values are consumed, are lowered into something without tuples, even for the newly introduced operations.

I have extended the comment on the lowering API a bit to reflect this.
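
The offsets and sizes of the four strided slices in the lowered IR above can be reproduced with a small standalone sketch. It follows the same steps as the lowering in this revision (count slices per dimension with a ceiling division, de-linearize the tuple index over the slice grid, scale by the slice sizes, clamp at the boundary), but it is written against plain `std::vector` rather than MLIR types. The names `Slice`, `ceilDivPositive` and `computeSlices` are hypothetical, and the sketch assumes the tuple size equals the product of the per-dimension slice counts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one slice; not an MLIR type.
struct Slice {
  std::vector<int64_t> offsets; // element-space offsets into the source
  std::vector<int64_t> sizes;   // slice extent per dimension
};

// Ceiling division for positive operands.
static int64_t ceilDivPositive(int64_t a, int64_t b) {
  assert(a > 0 && b > 0 && "expected positive operands");
  return (a + b - 1) / b;
}

// Enumerates all slices of a vector of the given shape when tiled by `sizes`,
// in row-major order of the slice grid.
std::vector<Slice> computeSlices(const std::vector<int64_t> &shape,
                                 const std::vector<int64_t> &sizes) {
  assert(shape.size() == sizes.size());
  const int64_t rank = static_cast<int64_t>(shape.size());

  // Number of slices in each dimension, and the total slice count.
  std::vector<int64_t> counts(rank);
  int64_t total = 1;
  for (int64_t r = 0; r < rank; ++r) {
    counts[r] = ceilDivPositive(shape[r], sizes[r]);
    total *= counts[r];
  }

  // Row-major strides of the slice grid, used to de-linearize an index.
  std::vector<int64_t> basis(rank, 1);
  for (int64_t r = rank - 2; r >= 0; --r)
    basis[r] = basis[r + 1] * counts[r + 1];

  std::vector<Slice> slices;
  for (int64_t i = 0; i < total; ++i) {
    Slice s;
    int64_t rem = i;
    for (int64_t r = 0; r < rank; ++r) {
      int64_t gridIndex = rem / basis[r]; // position in the slice grid
      rem %= basis[r];
      int64_t offset = gridIndex * sizes[r]; // convert to element space
      s.offsets.push_back(offset);
      // Boundary slices are clamped to what is left of the source.
      s.sizes.push_back(std::min(sizes[r], shape[r] - offset));
    }
    slices.push_back(s);
  }
  return slices;
}
```

Calling `computeSlices({3, 3}, {2, 2})` yields four slices whose offsets and sizes line up with the `vector.strided_slice` attributes in the example: [0, 0]/[2, 2], [0, 2]/[2, 1], [2, 0]/[1, 2] and [2, 2]/[1, 1].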

added "leaking" tuple to test

PTAL

Unit tests: fail. 62151 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B44871: Diff 240271! Jan 24 2020, 12:55 PM

Unit tests: fail. 62151 tests passed, 5 failed and 811 were skipped.

failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B44872: Diff 240273! Jan 24 2020, 1:04 PM

rebase to master

Unit tests: pass. 62191 tests passed, 0 failed and 815 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44878: Diff 240282. Jan 24 2020, 1:59 PM

Well my point here is that if you added the patterns to fold the tuples, you would do as good a job as you could in the absence of region/function boundaries.
Here you created an example that leaks in an unfixable way (because we have consciously decided not to lower the tuple<vector> type to LLVMIR and take the strong position that it must canonicalize/fold/DCE away).

Really the only thing I am trying to get is to either have:

1. the pattern collection be this:

void mlir::vector::populateVectorSlicesLoweringPatterns(
    OwningRewritePatternList &patterns, MLIRContext *context) {
  patterns.insert<ExtractSlicesOpLowering, InsertSlicesOpLowering, TupleGetFolderOp, OtherRelevantTuplePatterns>(context);
}

2. a good justification why that would be a bad idea.

The outcome I am looking for is an API that makes it easy and unsurprising to add patterns that guarantee Slices are lowered away (and it should be a "compiler bug" if they are remaining).
All this because we refuse to have a representation for tuple<vector<4x8x16x32xf32>.... > in LLVMIR (unless strong data comes to suggest this intuition is wrong).

Does this make sense?

Hmm. I would argue that the tuple discussion makes less sense in the context of this particular rewriting (since it really deals with getting rid of extract_slices/insert_slices using other slice ops). Cleaning up tuple uses is a nice bonus, but not a requirement, since tuples are really a part of the vector dialect to start with, as the leaking example shows. If you want something that blocks any tuple uses, I would argue that we need another pattern population for that, and run that at places where we need to guarantee they are gone (such as the lowering to LLVM, although the "legality" part takes care of it there already).

Other than this comment, I am not sure what you would like to see added in this CL.

ok, fair enough, thank you Aart!

This revision is now accepted and ready to land. Jan 24 2020, 2:32 PM

Note: I was probably overfitting on the issues I have with https://reviews.llvm.org/D73145 :)

Closed by commit rG303fddeeab10: [mlir] [VectorOps] Rewriting of vector.extract/insert_slices to other vector ops (authored by aartbik). Jan 24 2020, 4:27 PM

This revision was automatically updated to reflect the committed changes.

rriddle added inline comments. Jan 24 2020, 4:35 PM

mlir/include/mlir/Dialect/VectorOps/VectorOps.h
51	I don't understand this comment. I don't see how this lowering removes tuples. The pattern driver performs some DCE, but I don't see how that is related to these patterns.

aartbik marked an inline comment as done. Jan 24 2020, 4:51 PM

aartbik added inline comments.

mlir/include/mlir/Dialect/VectorOps/VectorOps.h
51	Yes, this assumes the patterns are run through the greedy driver (or a similar driver that calls DCE and folds as part of the rewriting). I was trying to convey that rewriting of a typical extract_slices or insert_slices where the individual parts are always consumed somehow leaves no tuple behind. I can send out a follow-up CL with some rephrased comments to continue the discussion there.
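
The interplay described in this exchange (folding `tuple_get` of a `tuple`, then dead-code elimination of the unused `tuple`) can be sketched on a toy IR. This is only an illustrative model under simplifying assumptions: the `Op` struct, the string op kinds, and the functions `foldAndDce` and `countLive` are all hypothetical and do not correspond to the MLIR greedy driver or any MLIR API. The sketch shows why a tuple disappears when all of its elements are consumed through `tuple_get`, and why it remains when the tuple value itself is used, for instance by a return.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy operation; purely illustrative and not an MLIR class.
struct Op {
  std::string kind;          // "arg", "tuple", "tuple_get", "return"
  std::vector<int> operands; // ids (positions) of the defining ops
  int index;                 // element index, only meaningful for "tuple_get"
  bool erased;
};

// Returns true if any live op uses the value defined by op `id`.
static bool hasLiveUse(const std::vector<Op> &ops, int id) {
  for (const Op &user : ops)
    if (!user.erased)
      for (int operand : user.operands)
        if (operand == id)
          return true;
  return false;
}

// Repeats two steps until nothing changes:
//  1. fold tuple_get(tuple(v0, v1, ..), i) by forwarding v_i to its users;
//  2. erase tuple ops that no longer have any live use (a minimal DCE).
void foldAndDce(std::vector<Op> &ops) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (int id = 0; id < static_cast<int>(ops.size()); ++id) {
      Op &get = ops[id];
      if (get.erased || get.kind != "tuple_get")
        continue;
      const Op &def = ops[get.operands[0]];
      if (def.kind != "tuple")
        continue; // e.g. the tuple is a function argument: nothing to fold
      int forwarded = def.operands[get.index];
      for (Op &user : ops)
        if (!user.erased)
          for (int &operand : user.operands)
            if (operand == id)
              operand = forwarded;
      get.erased = true;
      changed = true;
    }
    for (int id = 0; id < static_cast<int>(ops.size()); ++id) {
      Op &op = ops[id];
      if (op.erased || op.kind != "tuple" || hasLiveUse(ops, id))
        continue;
      op.erased = true;
      changed = true;
    }
  }
}

// Counts the remaining (non-erased) ops of the given kind.
int countLive(const std::vector<Op> &ops, const std::string &kind) {
  int n = 0;
  for (const Op &op : ops)
    if (!op.erased && op.kind == kind)
      ++n;
  return n;
}
```

In this model, a function that builds a tuple and returns one element through `tuple_get` ends up with no tuple ops at all, whereas a function that returns the tuple itself keeps the `tuple` op, mirroring the "leaking" case discussed above.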

Revision Contents

Path / Size

mlir/include/mlir/Dialect/VectorOps/VectorOps.h: 11 lines
mlir/lib/Dialect/VectorOps/VectorTransforms.cpp: 132 lines
mlir/test/Dialect/VectorOps/vector-slices-transforms.mlir: 63 lines
mlir/test/lib/Transforms/TestVectorTransforms.cpp: 15 lines

Diff 240323

mlir/include/mlir/Dialect/VectorOps/VectorOps.h

	Show All 37 Lines
  /// Collect a set of vector-to-vector canonicalization patterns.
  void populateVectorToVectorCanonicalizationPatterns(
      OwningRewritePatternList &patterns, MLIRContext *context);

  /// Collect a set of vector-to-vector transformation patterns.
  void populateVectorToVectorTransformationPatterns(
      OwningRewritePatternList &patterns, MLIRContext *context);

+ /// Collect a set of vector slices transformation patterns:
+ /// ExtractSlicesOpLowering, InsertSlicesOpLowering
+ /// Useful for clients that want to express all vector "slices"
+ /// ops in terms of more elementary vector "slice" ops. If all
+ /// "produced" tuple values are "consumed" (the most common
+ /// use for "slices" ops), this lowering removes all tuple related
+ /// operations as well (through DCE and folding). If tuple values
+ /// "leak" coming in, however, some tuple related ops will remain.
+ void populateVectorSlicesLoweringPatterns(OwningRewritePatternList &patterns,
+                                           MLIRContext *context);
+
  /// Returns the integer type required for subscripts in the vector dialect.
  IntegerType getVectorSubscriptType(Builder &builder);

  /// Returns an integer array attribute containing the given values using
  /// the integer type required for subscripts in the vector dialect.
  ArrayAttr getVectorSubscriptAttr(Builder &b, ArrayRef<int64_t> values);

  #define GET_OP_CLASSES
  #include "mlir/Dialect/VectorOps/VectorOps.h.inc"

  } // end namespace vector
  } // end namespace mlir

  #endif // MLIR_DIALECT_VECTOROPS_VECTOROPS_H

mlir/lib/Dialect/VectorOps/VectorTransforms.cpp

  //===- VectorToLoops.cpp - Conversion within the Vector dialect -----------===//
  //
  // Part of the MLIR Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//
  //
  // This file implements target-independent rewrites as 1->N patterns.
  //
  //===----------------------------------------------------------------------===//

  #include <type_traits>

  #include "mlir/Dialect/AffineOps/AffineOps.h"
+ #include "mlir/Dialect/StandardOps/Ops.h"
  #include "mlir/Dialect/VectorOps/VectorOps.h"
  #include "mlir/Dialect/VectorOps/VectorTransforms.h"
  #include "mlir/Dialect/VectorOps/VectorUtils.h"
  #include "mlir/IR/AffineExpr.h"
  #include "mlir/IR/AffineMap.h"
  #include "mlir/IR/Attributes.h"
  #include "mlir/IR/Builders.h"
  #include "mlir/IR/Function.h"
  #include "mlir/IR/Location.h"
  #include "mlir/IR/Matchers.h"
  #include "mlir/IR/Module.h"
  #include "mlir/IR/OperationSupport.h"
  #include "mlir/IR/PatternMatch.h"
  #include "mlir/IR/Types.h"
  #include "mlir/Support/Functional.h"
+ #include "mlir/Support/MathExtras.h"
  #include "mlir/Support/STLExtras.h"

  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/Debug.h"
  #include "llvm/Support/raw_ostream.h"

  #define DEBUG_TYPE "vector-to-vector"

▲ Show 20 Lines • Show All 613 Lines • ▼ Show 20 Lines	PatternMatchResult matchAndRewrite(vector::TupleGetOp tupleGetOp,

      // Forward Value from 'tupleOp' at 'tupleGetOp.index'.
      Value tupleValue = tupleOp.getOperand(tupleGetOp.getIndex());
      rewriter.replaceOp(tupleGetOp, tupleValue);
      return matchSuccess();
    }
  };

+ /// Progressive lowering of ExtractSlicesOp to tuple of StridedSliceOp.
+ /// One:
+ /// %x = vector.extract_slices %0
+ /// is replaced by:
+ /// %a = vector.strided_slice %0
+ /// %b = vector.strided_slice %0
+ /// ..
+ /// %x = vector.tuple %a, %b, ..
+ class ExtractSlicesOpLowering
+     : public OpRewritePattern<vector::ExtractSlicesOp> {
+ public:
+   using OpRewritePattern<vector::ExtractSlicesOp>::OpRewritePattern;
+
+   // TODO(ajcbik): refactor slice utilities out into VectorUtils.h
+   PatternMatchResult matchAndRewrite(vector::ExtractSlicesOp op,
+                                      PatternRewriter &rewriter) const override {
+     auto loc = op.getLoc();
+
+     VectorType vectorType = op.getSourceVectorType();
+     int64_t rank = vectorType.getRank();
+     auto shape = vectorType.getShape();
+
+     SmallVector<int64_t, 4> sizes;
+     op.getSizes(sizes);
+     SmallVector<int64_t, 4> strides;
+     op.getStrides(strides); // all-ones at the moment
+
+     // Compute the number of slices in each dimension.
+     SmallVector<int64_t, 4> sliceDimCounts(rank);
+     for (int64_t r = 0; r < rank; ++r)
+       sliceDimCounts[r] = ceilDiv(shape[r], sizes[r]);
+
+     // For each element in the tuple, generate the proper strided slice.
+     auto basis = computeStrides(sliceDimCounts);
+     TupleType tupleType = op.getResultTupleType();
+     int64_t tupleSize = tupleType.size();
+     SmallVector<Value, 4> tupleValues(tupleSize);
+     for (int64_t i = 0; i < tupleSize; ++i) {
+       // De-linearize w.r.t. 'basis'.
+       auto vectorOffsets = delinearize(i, basis);
+       // Convert from unrolled vector-space offsets to element-space offsets.
+       auto elementOffsets = mlir::functional::zipMap(
+           [](int64_t v1, int64_t v2) { return v1 * v2; }, vectorOffsets, sizes);
+       // Compute the size of each slice.
+       SmallVector<int64_t, 4> sliceSizes(rank);
+       for (int64_t r = 0; r < rank; ++r)
+         sliceSizes[r] = std::min(sizes[r], shape[r] - elementOffsets[r]);
+       // Insert in tuple.
+       tupleValues[i] = rewriter.create<vector::StridedSliceOp>(
+           loc, op.vector(), elementOffsets, sliceSizes, strides);
+     }
+
+     rewriter.replaceOpWithNewOp<vector::TupleOp>(op, tupleType, tupleValues);
+     return matchSuccess();
+   }
+ };

+ /// Progressive lowering of InsertSlicesOp to series of InsertStridedSliceOp.
+ /// One:
+ /// %x = vector.insert_slices %0
+ /// is replaced by:
+ /// %r0 = vector.splat 0
+ // %t1 = vector.tuple_get %0, 0
+ /// %r1 = vector.insert_strided_slice %r0, %t1
+ // %t2 = vector.tuple_get %0, 1
+ /// %r2 = vector.insert_strided_slice %r1, %t2
+ /// ..
+ /// %x = ..
+ class InsertSlicesOpLowering : public OpRewritePattern<vector::InsertSlicesOp> {
+ public:
+   using OpRewritePattern<vector::InsertSlicesOp>::OpRewritePattern;
+
+   // TODO(ajcbik): refactor slice utilities out into VectorUtils.h
+   PatternMatchResult matchAndRewrite(vector::InsertSlicesOp op,
+                                      PatternRewriter &rewriter) const override {
+     auto loc = op.getLoc();
+
+     VectorType vectorType = op.getResultVectorType();
+     int64_t rank = vectorType.getRank();
+     auto shape = vectorType.getShape();
+
+     SmallVector<int64_t, 4> sizes;
+     op.getSizes(sizes);
+     SmallVector<int64_t, 4> strides;
+     op.getStrides(strides); // all-ones at the moment
+
+     // Compute the number of slices in each dimension.
+     SmallVector<int64_t, 4> sliceDimCounts(rank);
+     for (int64_t r = 0; r < rank; ++r)
+       sliceDimCounts[r] = ceilDiv(shape[r], sizes[r]);
+
+     // Prepare result.
+     auto elemType = vectorType.getElementType();
+     Value zero = rewriter.create<ConstantOp>(loc, elemType,
+                                              rewriter.getZeroAttr(elemType));
+     Value result = rewriter.create<SplatOp>(loc, vectorType, zero);
+
+     // For each element in the tuple, extract the proper strided slice.
+     auto basis = computeStrides(sliceDimCounts);
+     TupleType tupleType = op.getSourceTupleType();
+     int64_t tupleSize = tupleType.size();
+     SmallVector<Value, 4> tupleValues(tupleSize);
+     for (int64_t i = 0; i < tupleSize; ++i) {
+       // De-linearize w.r.t. 'basis'.
+       auto vectorOffsets = delinearize(i, basis);
+       // Convert from unrolled vector-space offsets to element-space offsets.
+       auto elementOffsets = mlir::functional::zipMap(
+           [](int64_t v1, int64_t v2) { return v1 * v2; }, vectorOffsets, sizes);
+       // Compute the size of each slice.
+       SmallVector<int64_t, 4> sliceSizes(rank);
+       for (int64_t r = 0; r < rank; ++r)
+         sliceSizes[r] = std::min(sizes[r], shape[r] - elementOffsets[r]);
+       // Extract from tuple into the result.
+       auto index = rewriter.getI64IntegerAttr(i);
+       auto tupleGet = rewriter.create<vector::TupleGetOp>(
+           loc, tupleType.getType(i), op.getOperand(), index);
+       result = rewriter.create<vector::InsertStridedSliceOp>(
+           loc, tupleGet, result, elementOffsets, strides);
+     }
+
+     rewriter.replaceOp(op, result);
+     return matchSuccess();
+   }
+ };

  } // namespace

  // TODO(andydavis) Add pattern to rewrite ExtractSlices(ConstantMaskOp).
  // TODO(andydavis) Add this as DRR pattern.
  void mlir::vector::populateVectorToVectorTransformationPatterns(
      OwningRewritePatternList &patterns, MLIRContext *context) {
    patterns.insert<SplitTransferReadOp, SplitTransferWriteOp, TupleGetFolderOp>(
        context);
  }

+ void mlir::vector::populateVectorSlicesLoweringPatterns(
+     OwningRewritePatternList &patterns, MLIRContext *context) {
+   patterns.insert<ExtractSlicesOpLowering, InsertSlicesOpLowering>(context);
+ }

mlir/test/Dialect/VectorOps/vector-slices-transforms.mlir

This file was added.

				// RUN: mlir-opt %s -test-vector-slices-conversion | FileCheck %s

				// CHECK-LABEL: func @extract_slices(%arg0: vector<3x3xf32>)
				// CHECK: %[[SS:.*]] = vector.strided_slice %arg0 {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]}
				// CHECK: return %[[SS]]

				func @extract_slices(%arg0: vector<3x3xf32>) -> vector<2x2xf32> {
				%0 = vector.extract_slices %arg0, [2, 2], [1, 1]
				: vector<3x3xf32> into tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
				%1 = vector.tuple_get %0, 0 : tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
				return %1 : vector<2x2xf32>
				}

				// CHECK-LABEL: func @insert_slices(%arg0: vector<2x2xf32>, %arg1: vector<2x1xf32>, %arg2: vector<1x2xf32>, %arg3: vector<1x1xf32>)
				// CHECK: %[[C0:.*]] = constant dense<0.000000e+00> : vector<3x3xf32>
				// CHECK: %[[I0:.*]] = vector.insert_strided_slice %arg0, %[[C0]] {offsets = [0, 0], strides = [1, 1]}
				// CHECK: %[[I1:.*]] = vector.insert_strided_slice %arg1, %[[I0]] {offsets = [0, 2], strides = [1, 1]}
				// CHECK: %[[I2:.*]] = vector.insert_strided_slice %arg2, %[[I1]] {offsets = [2, 0], strides = [1, 1]}
				// CHECK: %[[I3:.*]] = vector.insert_strided_slice %arg3, %[[I2]] {offsets = [2, 2], strides = [1, 1]}
				// CHECK: return %[[I3]]

				func @insert_slices(%arg0: vector<2x2xf32>,
				%arg1: vector<2x1xf32>,
				%arg2: vector<1x2xf32>,
				%arg3: vector<1x1xf32>) -> vector<3x3xf32> {
				%0 = vector.tuple %arg0, %arg1, %arg2, %arg3
				: vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>
				%1 = vector.insert_slices %0, [2, 2], [1, 1]
				: tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>> into vector<3x3xf32>
				return %1 : vector<3x3xf32>
				}

				// CHECK-LABEL: func @extract_insert_slices(%arg0: vector<3x3xf32>)
				// CHECK: %[[C:.*]] = constant dense<0.000000e+00> : vector<3x3xf32>
				// CHECK: %[[X0:.*]] = vector.strided_slice %arg0 {offsets = [0, 0], sizes = [2, 2], strides = [1, 1]}
				// CHECK: %[[X1:.*]] = vector.strided_slice %arg0 {offsets = [0, 2], sizes = [2, 1], strides = [1, 1]}
				// CHECK: %[[X2:.*]] = vector.strided_slice %arg0 {offsets = [2, 0], sizes = [1, 2], strides = [1, 1]}
				// CHECK: %[[X3:.*]] = vector.strided_slice %arg0 {offsets = [2, 2], sizes = [1, 1], strides = [1, 1]}
				// CHECK: %[[X4:.*]] = vector.insert_strided_slice %[[X0]], %[[C]] {offsets = [0, 0], strides = [1, 1]}
				// CHECK: %[[X5:.*]] = vector.insert_strided_slice %[[X1]], %[[X4]] {offsets = [0, 2], strides = [1, 1]}
				// CHECK: %[[X6:.*]] = vector.insert_strided_slice %[[X2]], %[[X5]] {offsets = [2, 0], strides = [1, 1]}
				// CHECK: %[[X7:.*]] = vector.insert_strided_slice %[[X3]], %[[X6]] {offsets = [2, 2], strides = [1, 1]}
				// CHECK: return %[[X7]]

				func @extract_insert_slices(%arg0: vector<3x3xf32>) -> vector<3x3xf32> {
				%0 = vector.extract_slices %arg0, [2, 2], [1, 1]
				: vector<3x3xf32> into tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
				%1 = vector.insert_slices %0, [2, 2], [1, 1]
				: tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>> into vector<3x3xf32>
				return %1 : vector<3x3xf32>
				}

				// CHECK-LABEL: func @extract_slices_tuple_leaks(%arg0: vector<4xf32>)
				// CHECK: %[[X0:.*]] = vector.strided_slice %arg0 {offsets = [0], sizes = [2], strides = [1]}
				// CHECK: %[[X1:.*]] = vector.strided_slice %arg0 {offsets = [2], sizes = [2], strides = [1]}
				// CHECK: %[[X2:.*]] = vector.tuple %[[X0]], %[[X1]]
				// CHECK: return %[[X2]]

				func @extract_slices_tuple_leaks(%arg0: vector<4xf32>) -> tuple<vector<2xf32>, vector<2xf32>> {
				%0 = vector.extract_slices %arg0, [2], [1] : vector<4xf32> into tuple<vector<2xf32>, vector<2xf32>>
				return %0 : tuple<vector<2xf32>, vector<2xf32>>
				}
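
The `@extract_insert_slices` test above extracts every slice and then inserts the slices one by one into a zero constant. A minimal sketch of that round trip on plain 2-D arrays follows. The names `Matrix`, `stridedSlice`, `insertStridedSlice`, and `roundTrip` are hypothetical and only model the ops; the sketch assumes unit strides and interleaves each extract with its insert, whereas the test performs all extracts first. Since the slices tile the whole vector, one would expect the result to equal the input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Models vector.strided_slice with unit strides: copies a rows x cols block
// of `src` starting at (r0, c0).
Matrix stridedSlice(const Matrix &src, std::size_t r0, std::size_t c0,
                    std::size_t rows, std::size_t cols) {
  Matrix out(rows, std::vector<float>(cols));
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      out[r][c] = src[r0 + r][c0 + c];
  return out;
}

// Models vector.insert_strided_slice with unit strides: returns a copy of
// `dst` in which `src` has been written at (r0, c0).
Matrix insertStridedSlice(const Matrix &src, Matrix dst, std::size_t r0,
                          std::size_t c0) {
  for (std::size_t r = 0; r < src.size(); ++r)
    for (std::size_t c = 0; c < src[r].size(); ++c)
      dst[r0 + r][c0 + c] = src[r][c];
  return dst;
}

// Extracts each slice of `input` and inserts it into a zero-initialized
// matrix of the same shape, clipping slices at the matrix boundary.
Matrix roundTrip(const Matrix &input, std::size_t sliceRows,
                 std::size_t sliceCols) {
  assert(sliceRows > 0 && sliceCols > 0 && "slice sizes must be positive");
  const std::size_t rows = input.size();
  const std::size_t cols = rows ? input[0].size() : 0;
  Matrix result(rows, std::vector<float>(cols, 0.0f)); // the zero constant
  for (std::size_t r0 = 0; r0 < rows; r0 += sliceRows) {
    for (std::size_t c0 = 0; c0 < cols; c0 += sliceCols) {
      const std::size_t h = std::min(sliceRows, rows - r0);
      const std::size_t w = std::min(sliceCols, cols - c0);
      result = insertStridedSlice(stridedSlice(input, r0, c0, h, w), result,
                                  r0, c0);
    }
  }
  return result;
}
```

For a 3x3 input and 2x2 slices, this performs four extract/insert pairs at offsets (0, 0), (0, 2), (2, 0), and (2, 2), the same offsets the CHECK lines expect.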

mlir/test/lib/Transforms/TestVectorTransforms.cpp

	#include "mlir/Dialect/VectorOps/VectorTransforms.h"
	#include "mlir/IR/PatternMatch.h"
	#include "mlir/Pass/Pass.h"

	using namespace mlir;
	using namespace mlir::vector;

	namespace {

	#include "TestVectorTransformPatterns.h.inc"

	struct TestVectorToVectorConversion
	: public FunctionPass<TestVectorToVectorConversion> {
	void runOnFunction() override {
	OwningRewritePatternList patterns;
	auto *context = &getContext();
	populateWithGenerated(context, &patterns);
	populateVectorToVectorCanonicalizationPatterns(patterns, context);
	populateVectorToVectorTransformationPatterns(patterns, context);
	applyPatternsGreedily(getFunction(), patterns);
	}
	};

				struct TestVectorSlicesConversion
				: public FunctionPass<TestVectorSlicesConversion> {
				void runOnFunction() override {
				OwningRewritePatternList patterns;
				populateVectorSlicesLoweringPatterns(patterns, &getContext());
				applyPatternsGreedily(getFunction(), patterns);
				nicolasvasilache: Re pattern selection, it would be great if `populateVectorSlicesLoweringPatterns` had everything it needs to make the test pass. So concretely, this line should go away (and we should hunt for other opportunities to improve other `populateXXX` methods by following your model).
				aartbik (Author): It has. I simply overlooked this line. It is not needed!
				}
				};

	} // end anonymous namespace

	static PassRegistration<TestVectorToVectorConversion>
	pass("test-vector-to-vector-conversion",
	"Test conversion patterns between ops in the vector dialect");

				static PassRegistration<TestVectorSlicesConversion> slices_pass(
				"test-vector-slices-conversion",
				"Test conversion patterns that lower slices ops in the vector dialect");

Revision Contents

Diff 240323