This is an archive of the discontinued LLVM Phabricator instance.


Differential D76889

[MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
ClosedPublic

Authored by andydavis1 on Mar 26 2020, 3:07 PM.


Details

Reviewers

nicolasvasilache
aartbik

Commits

rG31a346cc35c8: [MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and…

Summary

Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
Vector-to-vector transformations for unrolling and lowering to hardware vectors
can generate chains of structured vector operations (InsertSlicesOp,
ExtractSlicesOp and ShapeCastOp) between the producer of a hardware vector
value and its consumer. Because InsertSlicesOp, ExtractSlicesOp and ShapeCastOp
are structured, we can track the location (tuple index and vector offsets) of
the consumer vector value through the chain of structured operations to the
producer. This enables much more powerful producer-consumer forwarding of values
through structured ops and tuples, which in turn enables a more powerful
TupleGetOp folding transformation.
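
The index arithmetic behind "tuple index and vector offsets" can be sketched in isolation. The block below is a minimal, hedged illustration rather than the patch itself: `Index`, `sliceStrides` and `elementOffsets` are hypothetical stand-ins for the helpers in VectorUtils, and `std::vector` replaces `ArrayRef`/`SmallVector` so that it compiles without LLVM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ArrayRef/SmallVector<int64_t, 4>.
using Index = std::vector<int64_t>;

// Row-major strides of the slice grid obtained by cutting a vector of
// 'shape' into slices of 'sizes' (mirrors the idea of computeStrides).
Index sliceStrides(const Index &shape, const Index &sizes) {
  assert(shape.size() == sizes.size() && !shape.empty());
  const int64_t rank = static_cast<int64_t>(shape.size());
  Index counts(rank), strides(rank);
  for (int64_t r = 0; r < rank; ++r)
    counts[r] = (shape[r] + sizes[r] - 1) / sizes[r]; // ceiling division
  strides[rank - 1] = 1;
  for (int64_t r = rank - 2; r >= 0; --r)
    strides[r] = strides[r + 1] * counts[r + 1];
  return strides;
}

// Slice-grid coordinates -> tuple index.
int64_t linearize(const Index &offsets, const Index &basis) {
  assert(offsets.size() == basis.size());
  int64_t linear = 0;
  for (std::size_t i = 0; i < basis.size(); ++i)
    linear += offsets[i] * basis[i];
  return linear;
}

// Tuple index -> slice-grid coordinates.
Index delinearize(const Index &strides, int64_t index) {
  Index offsets(strides.size());
  for (std::size_t r = 0; r < strides.size(); ++r) {
    assert(strides[r] > 0);
    offsets[r] = index / strides[r];
    index %= strides[r];
  }
  return offsets;
}

// Slice-grid coordinates -> element offsets inside the large vector.
Index elementOffsets(const Index &sizes, const Index &vectorOffsets) {
  assert(sizes.size() == vectorOffsets.size());
  Index result(sizes.size());
  for (std::size_t i = 0; i < sizes.size(); ++i)
    result[i] = vectorOffsets[i] * sizes[i];
  return result;
}
```

Under these assumptions, a 4x16 vector cut into 2x4 slices has slice strides [4, 1], so tuple index 7 corresponds to slice coordinates [1, 3] and element offsets [2, 12], and linearizing [1, 3] gives back 7.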

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

andydavis1 created this revision. Mar 26 2020, 3:07 PM

Herald added a project: Restricted Project. Mar 26 2020, 3:07 PM

Herald added subscribers: llvm-commits, Joonsoo, liufengdb and 9 others.

rriddle retitled this revision from "BEGIN_PUBLIC [MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp." to "[MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.". Mar 26 2020, 3:24 PM

rriddle edited the summary of this revision.

rriddle added inline comments. Mar 26 2020, 3:28 PM

mlir/lib/Dialect/Vector/VectorTransforms.cpp
677	nit: // -> ///
677	Sorry, meant ///. (The formatting messed up here)
692	Please add a comment to this assert.
715	nit: Drop all trivial braces.
758	cast never returns null, did you intend to use dyn_cast here? Also, can you just merge these two predicates: if (value.getType() == consumerVectorType) ... ?
mlir/lib/Dialect/Vector/VectorUtils.cpp
31	We should not be doing this inside of .cpp files. It should be `using namespace mlir` with the functions explicitly qualified. https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
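
The namespace convention requested for VectorUtils.cpp can be illustrated with a small standalone sketch. The names `demo` and `scale` are hypothetical and not part of the patch; the declaration block stands in for a header and the rest for the .cpp file.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a header: the declaration lives inside the namespace.
// 'demo' and 'scale' are hypothetical names used only for illustration.
namespace demo {
int64_t scale(int64_t value, int64_t factor);
} // namespace demo

// Stand-in for the .cpp file: open the namespace for name lookup only.
using namespace demo;

// The qualified name ties this definition to the declaration above. If the
// signature drifted from the header, the compiler would reject it instead of
// silently introducing a new, unrelated overload.
int64_t demo::scale(int64_t value, int64_t factor) {
  assert(factor != 0 && "factor must be non-zero");
  return value * factor;
}
```

With the definition wrapped in `namespace demo { ... }` instead, a mismatched signature would compile as a second function, and the mistake would only surface later as a link error.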

Harbormaster completed remote builds in B50626: Diff 252989. Mar 26 2020, 3:47 PM

aartbik added inline comments. Mar 26 2020, 4:25 PM

mlir/lib/Dialect/Vector/VectorTransforms.cpp
822	just curious, this is part of VectorTransforms explicitly now. Does it make sense to do this as part of the TupleGetOp::fold() at some point?
mlir/test/Dialect/Vector/vector-transforms.mlir
317	I noticed I may have introduced this at L308 at some point, but in general it seems a bit cleaner to capture the arguments with CHECK-SAME: %[[A:.*0]]: vector<2x4xf32>, CHECK-SAME: %[[A:.*1]]: vector<2x4xf32>, ... and then check for [[D]] in the return (note that I use the 0, 1, etc. at the end to avoid some ambiguities for same-typed arguments).
347	same request

Nice foldings.

One request on my end: could we please beef up the test?
At the moment the cases tested are: (tupleIndex = -1, offsets = [4, 0]), (tupleIndex = -1, offsets = [0, 0]) and (tupleIndex >= 0, offsets = [0, 0]).

Can you reshuffle things a bit so that more (tupleIndex = x, offsets = [y, z]) combinations get tested?
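
To make such (tupleIndex, offsets) combinations concrete, the consumer-to-producer walk can be replayed by hand. The sketch below is an assumption-laden illustration, not the code under review: `Position`, `throughTupleGet`, `throughExtractSlices`, `throughInsertSlices` and `traceFirstTest` are hypothetical names, and the shapes are taken from the `@tuple_get_producer_consumer` test in the final diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Index = std::vector<int64_t>;

// Location of the consumer value inside the value currently being visited.
// tupleIndex == -1 means 'offsets' are relative to a vector, not a tuple.
struct Position {
  int64_t tupleIndex;
  Index offsets;
};

// Row-major strides of the slice grid for 'shape' cut into 'sizes' slices.
static Index sliceStrides(const Index &shape, const Index &sizes) {
  assert(shape.size() == sizes.size() && !shape.empty());
  Index strides(shape.size(), 1);
  for (std::size_t r = shape.size() - 1; r > 0; --r)
    strides[r - 1] = strides[r] * ((shape[r] + sizes[r] - 1) / sizes[r]);
  return strides;
}

// tuple_get: the vector result is element 'index' of the source tuple.
void throughTupleGet(Position &pos, int64_t index) {
  assert(pos.tupleIndex == -1);
  pos.tupleIndex = index;
}

// extract_slices: map the tuple element back into the source vector of
// 'shape' by adding the element offsets of slice 'tupleIndex'.
void throughExtractSlices(Position &pos, const Index &shape,
                          const Index &sizes) {
  assert(pos.tupleIndex >= 0);
  Index strides = sliceStrides(shape, sizes);
  int64_t index = pos.tupleIndex;
  for (std::size_t r = 0; r < strides.size(); ++r) {
    pos.offsets[r] += (index / strides[r]) * sizes[r];
    index %= strides[r];
  }
  pos.tupleIndex = -1;
}

// insert_slices: map offsets in the result vector of 'shape' back to the
// source tuple element that contains them.
void throughInsertSlices(Position &pos, const Index &shape,
                         const Index &sizes) {
  assert(pos.tupleIndex == -1);
  Index strides = sliceStrides(shape, sizes);
  pos.tupleIndex = 0;
  for (std::size_t r = 0; r < strides.size(); ++r) {
    int64_t sliceCoord = pos.offsets[r] / sizes[r];
    pos.tupleIndex += sliceCoord * strides[r];
    pos.offsets[r] -= sliceCoord * sizes[r];
  }
}

// Replays the chain of @tuple_get_producer_consumer from consumer to producer.
Position traceFirstTest() {
  Position pos{-1, {0, 0}};
  throughTupleGet(pos, 3);                    // offsets stay [0, 0]
  throughExtractSlices(pos, {4, 8}, {2, 4});  // offsets become [2, 4]
  throughTupleGet(pos, 1);
  throughExtractSlices(pos, {4, 16}, {4, 8}); // offsets become [2, 12]
  throughInsertSlices(pos, {4, 16}, {2, 4});  // tupleIndex 7, offsets [0, 0]
  return pos;
}
```

The walk ends at tuple index 7 with zero offsets, which matches the `%arg7` annotations in the test comments.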

Thanks. I've addressed comments and will update with a new patch soon...

mlir/lib/Dialect/Vector/VectorTransforms.cpp
822	Perhaps. What are the tradeoffs of moving it to a fold method?

andydavis1 marked an inline comment as done. Mar 27 2020, 8:58 AM

addressing feedback

Harbormaster completed remote builds in B50718: Diff 253184. Mar 27 2020, 1:10 PM

PTAL. Let me know if there are any more comments...Thanks!

Herald added a subscriber: grosul1. Mar 30 2020, 4:10 PM

aartbik accepted this revision. Mar 30 2020, 5:29 PM

This revision is now accepted and ready to land. Mar 30 2020, 5:29 PM

rebasing

Harbormaster failed remote builds in B51136: Diff 253893! Mar 31 2020, 8:52 AM

Closed by commit rG31a346cc35c8: [MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and… (authored by Andy Davis <andydavis@google.com>). Mar 31 2020, 8:53 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Vector/VectorUtils.h	3 lines
mlir/lib/Dialect/Vector/VectorTransforms.cpp	129 lines
mlir/lib/Dialect/Vector/VectorUtils.cpp	46 lines
mlir/test/Dialect/Vector/vector-transforms.mlir	89 lines

Diff 253903

mlir/include/mlir/Dialect/Vector/VectorUtils.h

[25 lines not shown]
	class Value;			class Value;
	class VectorType;			class VectorType;

	/// Given the shape and sizes of a vector, returns the corresponding			/// Given the shape and sizes of a vector, returns the corresponding
	/// strides for each dimension.			/// strides for each dimension.
	SmallVector<int64_t, 4> computeStrides(ArrayRef<int64_t> shape,			SmallVector<int64_t, 4> computeStrides(ArrayRef<int64_t> shape,
	ArrayRef<int64_t> sizes);			ArrayRef<int64_t> sizes);

				/// Computes and returns the linearized index of 'offsets' w.r.t. 'basis'.
				int64_t linearize(ArrayRef<int64_t> offsets, ArrayRef<int64_t> basis);

	/// Given the slice strides together with a linear index in the dimension			/// Given the slice strides together with a linear index in the dimension
	/// space, returns the vector-space offsets in each dimension for a			/// space, returns the vector-space offsets in each dimension for a
	/// de-linearized index.			/// de-linearized index.
	SmallVector<int64_t, 4> delinearize(ArrayRef<int64_t> sliceStrides,			SmallVector<int64_t, 4> delinearize(ArrayRef<int64_t> sliceStrides,
	int64_t linearIndex);			int64_t linearIndex);

	/// Given the target sizes of a vector, together with vector-space offsets,			/// Given the target sizes of a vector, together with vector-space offsets,
	/// returns the element-space offsets for each dimension.			/// returns the element-space offsets for each dimension.
[117 lines not shown]

mlir/lib/Dialect/Vector/VectorTransforms.cpp

[63 lines not shown; context: static int64_t computeMaxLinearIndex(ArrayRef<int64_t> basis) {]
if (basis.empty())		if (basis.empty())
return 0;		return 0;
int64_t res = 1;		int64_t res = 1;
for (auto b : basis)		for (auto b : basis)
res *= b;		res *= b;
return res;		return res;
}		}

/// Computes and returns the linearized index of 'offsets' w.r.t. 'basis'.
static int64_t linearize(ArrayRef<int64_t> offsets, ArrayRef<int64_t> basis) {
assert(offsets.size() == basis.size());
int64_t linearIndex = 0;
for (unsigned idx = 0, e = basis.size(); idx < e; ++idx)
linearIndex += offsets[idx] * basis[idx];
return linearIndex;
}

// Clones `op` into a new operations that takes `operands` and returns		// Clones `op` into a new operations that takes `operands` and returns
// `resultTypes`.		// `resultTypes`.
static Operation *cloneOpWithOperandsAndTypes(PatternRewriter &builder,		static Operation *cloneOpWithOperandsAndTypes(PatternRewriter &builder,
Location loc, Operation *op,		Location loc, Operation *op,
ArrayRef<Value> operands,		ArrayRef<Value> operands,
ArrayRef<Type> resultTypes) {		ArrayRef<Type> resultTypes) {
OperationState res(loc, op->getName().getStringRef(), operands, resultTypes,		OperationState res(loc, op->getName().getStringRef(), operands, resultTypes,
op->getAttrs());		op->getAttrs());
[589 lines not shown; context: LogicalResult matchAndRewrite(vector::ShapeCastOp shapeCastOp,]

// Replace 'shapeCastOp' with tuple of 'resultElements'.		// Replace 'shapeCastOp' with tuple of 'resultElements'.
rewriter.replaceOpWithNewOp<vector::TupleOp>(shapeCastOp, resultTupleType,		rewriter.replaceOpWithNewOp<vector::TupleOp>(shapeCastOp, resultTupleType,
resultElements);		resultElements);
return success();		return success();
}		}
};		};

		/// Returns the producer Value of the same type as 'consumerValue', by tracking
		/// the tuple index and offsets of the consumer vector value through the
		/// chain of operations (TupleGetOp, InsertSlicesOp, ExtractSlicesOp, TupleOp)
		/// from consumer to producer. Each operation in the chain is structured, and
		/// so the tuple index and offsets can be mapped from result to input, while
		/// visiting each operation in the chain.
		/// Returns nullptr on failure.
		static Value getProducerValue(Value consumerValue) {
		auto consumerVectorType = consumerValue.getType().cast<VectorType>();
		// A tupleIndex == -1 indicates that 'offsets' are w.r.t a vector type.
		int64_t tupleIndex = -1;
		SmallVector<int64_t, 4> offsets(consumerVectorType.getRank(), 0);
		auto *op = consumerValue.getDefiningOp();
		while (op != nullptr) {
		if (auto tupleGetOp = dyn_cast<vector::TupleGetOp>(op)) {
		assert(tupleIndex == -1 && "TupleGetOp must have vector result type");

		// Update 'tupleIndex' and next defining 'op' to visit.
		tupleIndex = tupleGetOp.getIndex();
		op = tupleGetOp.vectors().getDefiningOp();
		} else if (auto extractSlicesOp = dyn_cast<vector::ExtractSlicesOp>(op)) {
		assert(tupleIndex >= 0);

		// Compute slice strides for 'extractSlicesOp'.
		SmallVector<int64_t, 4> sizes;
		extractSlicesOp.getSizes(sizes);
		auto sliceStrides = computeStrides(
		extractSlicesOp.getSourceVectorType().getShape(), sizes);

		// Compute 'elementOffsets' into 'extractSlicesOp' input vector type,
		// of 'extractSlicesOp' result vector tuple element at 'tupleIndex'.
		auto vectorOffsets = delinearize(sliceStrides, tupleIndex);
		auto elementOffsets =
		computeElementOffsetsFromVectorSliceOffsets(sizes, vectorOffsets);

		// Add 'elementOffsets' to 'offsets' so that 'offsets' are now relative
		// to the 'extractSlicesOp' input vector type.
		assert(offsets.size() == elementOffsets.size());
		for (unsigned i = 0, e = offsets.size(); i < e; ++i)
		offsets[i] += elementOffsets[i];

		// Clear 'tupleIndex' and update next defining 'op' to visit.
		tupleIndex = -1;
		op = extractSlicesOp.vector().getDefiningOp();
		} else if (auto insertSlicesOp = dyn_cast<vector::InsertSlicesOp>(op)) {
		assert(tupleIndex == -1);

		// Compute slice strides for 'insertSlicesOp'.
		SmallVector<int64_t, 4> sizes;
		insertSlicesOp.getSizes(sizes);
		auto sliceStrides = computeStrides(
		insertSlicesOp.getResultVectorType().getShape(), sizes);

		// Compute 'vectorOffsets' of 'insertSlicesOp' input vector slice,
		// of 'insertSlicesOp' result vector type at 'offsets'.
		SmallVector<int64_t, 4> vectorOffsets(offsets.size());
		assert(offsets.size() == sizes.size());
		for (unsigned i = 0, e = offsets.size(); i < e; ++i)
		vectorOffsets[i] = offsets[i] / sizes[i];

		// Compute the source tuple element index.
		tupleIndex = linearize(vectorOffsets, sliceStrides);

		// Subtract 'elementOffsets' from 'offsets' so that 'offsets' are now
		// relative to input tuple element vector type at 'tupleIndex'.
		auto elementOffsets =
		computeElementOffsetsFromVectorSliceOffsets(sizes, vectorOffsets);
		assert(offsets.size() == elementOffsets.size());
		for (unsigned i = 0, e = offsets.size(); i < e; ++i) {
		offsets[i] -= elementOffsets[i];
		assert(offsets[i] >= 0);
		}

		// Update next defining 'op' to visit.
		op = insertSlicesOp.vectors().getDefiningOp();
		} else if (auto tupleOp = dyn_cast<vector::TupleOp>(op)) {
		assert(tupleIndex >= 0);

		// Return tuple element 'value' at 'tupleIndex' if it matches type.
		auto value = tupleOp.getOperand(tupleIndex);
		if (value.getType() == consumerVectorType)
		return value;

		// Update 'tupleIndex' and next defining 'op' to visit.
		tupleIndex = -1;
		op = value.getDefiningOp();
		} else {
		break;
		}
		}
		return nullptr;
		}

/// ShapeCastOpFolder folds cancelling ShapeCastOps away.		/// ShapeCastOpFolder folds cancelling ShapeCastOps away.
//		//
// Example:		// Example:
//		//
// The following MLIR with cancelling ShapeCastOps:		// The following MLIR with cancelling ShapeCastOps:
//		//
// %0 = source : vector<5x4x2xf32>		// %0 = source : vector<5x4x2xf32>
// %1 = shape_cast %0 : vector<5x4x2xf32> to vector<20x2xf32>		// %1 = shape_cast %0 : vector<5x4x2xf32> to vector<20x2xf32>
[36 lines not shown; context: LogicalResult matchAndRewrite(vector::ShapeCastOp shapeCastOp,]
rewriter.replaceOp(shapeCastOp, sourceShapeCastOp.source());		rewriter.replaceOp(shapeCastOp, sourceShapeCastOp.source());
return success();		return success();
}		}
};		};

// Patter rewrite which forward tuple elements to their users.		// Patter rewrite which forward tuple elements to their users.
// User(TupleGetOp(ExtractSlicesOp(InsertSlicesOp(TupleOp(Producer)))))		// User(TupleGetOp(ExtractSlicesOp(InsertSlicesOp(TupleOp(Producer)))))
// -> User(Producer)		// -> User(Producer)
struct TupleGetFolderOp : public OpRewritePattern<vector::TupleGetOp> {		struct TupleGetFolderOp : public OpRewritePattern<vector::TupleGetOp> {
using OpRewritePattern<vector::TupleGetOp>::OpRewritePattern;		using OpRewritePattern<vector::TupleGetOp>::OpRewritePattern;

LogicalResult matchAndRewrite(vector::TupleGetOp tupleGetOp,		LogicalResult matchAndRewrite(vector::TupleGetOp tupleGetOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
// Return if 'tupleGetOp.vectors' arg was not defined by ExtractSlicesOp.		if (auto producer = getProducerValue(tupleGetOp.getResult())) {
auto extractSlicesOp = dyn_cast_or_null<vector::ExtractSlicesOp>(		rewriter.replaceOp(tupleGetOp, producer);
tupleGetOp.vectors().getDefiningOp());
if (!extractSlicesOp)
return failure();

// Return if 'extractSlicesOp.vector' arg was not defined by InsertSlicesOp.
auto insertSlicesOp = dyn_cast_or_null<vector::InsertSlicesOp>(
extractSlicesOp.vector().getDefiningOp());
if (!insertSlicesOp)
return failure();

// Return if 'insertSlicesOp.vectors' arg was not defined by TupleOp.
auto tupleOp = dyn_cast_or_null<vector::TupleOp>(
insertSlicesOp.vectors().getDefiningOp());
if (!tupleOp)
return failure();

// Forward Value from 'tupleOp' at 'tupleGetOp.index'.
Value tupleValue = tupleOp.getOperand(tupleGetOp.getIndex());
rewriter.replaceOp(tupleGetOp, tupleValue);
return success();		return success();
}		}
		return failure();
		}
};		};

/// Progressive lowering of ExtractSlicesOp to tuple of StridedSliceOp.		/// Progressive lowering of ExtractSlicesOp to tuple of StridedSliceOp.
/// One:		/// One:
/// %x = vector.extract_slices %0		/// %x = vector.extract_slices %0
/// is replaced by:		/// is replaced by:
/// %a = vector.strided_slice %0		/// %a = vector.strided_slice %0
/// %b = vector.strided_slice %0		/// %b = vector.strided_slice %0
[647 lines not shown]

mlir/lib/Dialect/Vector/VectorUtils.cpp

[22 lines not shown]
#include "mlir/Support/MathExtras.h"		#include "mlir/Support/MathExtras.h"
#include "mlir/Support/STLExtras.h"		#include "mlir/Support/STLExtras.h"

#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"

using llvm::SetVector;		using llvm::SetVector;

namespace mlir {		using namespace mlir;

SmallVector<int64_t, 4> computeStrides(ArrayRef<int64_t> shape,		SmallVector<int64_t, 4> mlir::computeStrides(ArrayRef<int64_t> shape,
ArrayRef<int64_t> sizes) {		ArrayRef<int64_t> sizes) {
int64_t rank = shape.size();		int64_t rank = shape.size();
// Compute the count for each dimension.		// Compute the count for each dimension.
SmallVector<int64_t, 4> sliceDimCounts(rank);		SmallVector<int64_t, 4> sliceDimCounts(rank);
for (int64_t r = 0; r < rank; ++r)		for (int64_t r = 0; r < rank; ++r)
sliceDimCounts[r] = ceilDiv(shape[r], sizes[r]);		sliceDimCounts[r] = ceilDiv(shape[r], sizes[r]);
// Use that to compute the slice stride for each dimension.		// Use that to compute the slice stride for each dimension.
SmallVector<int64_t, 4> sliceStrides(rank);		SmallVector<int64_t, 4> sliceStrides(rank);
sliceStrides[rank - 1] = 1;		sliceStrides[rank - 1] = 1;
for (int64_t r = rank - 2; r >= 0; --r)		for (int64_t r = rank - 2; r >= 0; --r)
sliceStrides[r] = sliceStrides[r + 1] * sliceDimCounts[r + 1];		sliceStrides[r] = sliceStrides[r + 1] * sliceDimCounts[r + 1];
return sliceStrides;		return sliceStrides;
}		}

SmallVector<int64_t, 4> delinearize(ArrayRef<int64_t> sliceStrides,		int64_t mlir::linearize(ArrayRef<int64_t> offsets, ArrayRef<int64_t> basis) {
		assert(offsets.size() == basis.size());
		int64_t linearIndex = 0;
		for (unsigned idx = 0, e = basis.size(); idx < e; ++idx)
		linearIndex += offsets[idx] * basis[idx];
		return linearIndex;
		}

		SmallVector<int64_t, 4> mlir::delinearize(ArrayRef<int64_t> sliceStrides,
int64_t index) {		int64_t index) {
int64_t rank = sliceStrides.size();		int64_t rank = sliceStrides.size();
SmallVector<int64_t, 4> vectorOffsets(rank);		SmallVector<int64_t, 4> vectorOffsets(rank);
for (int64_t r = 0; r < rank; ++r) {		for (int64_t r = 0; r < rank; ++r) {
assert(sliceStrides[r] > 0);		assert(sliceStrides[r] > 0);
vectorOffsets[r] = index / sliceStrides[r];		vectorOffsets[r] = index / sliceStrides[r];
index %= sliceStrides[r];		index %= sliceStrides[r];
}		}
return vectorOffsets;		return vectorOffsets;
}		}

SmallVector<int64_t, 4>		SmallVector<int64_t, 4> mlir::computeElementOffsetsFromVectorSliceOffsets(
computeElementOffsetsFromVectorSliceOffsets(ArrayRef<int64_t> sizes,		ArrayRef<int64_t> sizes, ArrayRef<int64_t> vectorOffsets) {
ArrayRef<int64_t> vectorOffsets) {
return functional::zipMap([](int64_t v1, int64_t v2) { return v1 * v2; },		return functional::zipMap([](int64_t v1, int64_t v2) { return v1 * v2; },
vectorOffsets, sizes);		vectorOffsets, sizes);
}		}

SmallVector<int64_t, 4> computeSliceSizes(ArrayRef<int64_t> shape,		SmallVector<int64_t, 4>
ArrayRef<int64_t> sizes,		mlir::computeSliceSizes(ArrayRef<int64_t> shape, ArrayRef<int64_t> sizes,
ArrayRef<int64_t> elementOffsets) {		ArrayRef<int64_t> elementOffsets) {
int64_t rank = shape.size();		int64_t rank = shape.size();
SmallVector<int64_t, 4> sliceSizes(rank);		SmallVector<int64_t, 4> sliceSizes(rank);
for (unsigned r = 0; r < rank; ++r)		for (unsigned r = 0; r < rank; ++r)
sliceSizes[r] = std::min(sizes[r], shape[r] - elementOffsets[r]);		sliceSizes[r] = std::min(sizes[r], shape[r] - elementOffsets[r]);
return sliceSizes;		return sliceSizes;
}		}

Optional<SmallVector<int64_t, 4>> shapeRatio(ArrayRef<int64_t> superShape,		Optional<SmallVector<int64_t, 4>> mlir::shapeRatio(ArrayRef<int64_t> superShape,
ArrayRef<int64_t> subShape) {		ArrayRef<int64_t> subShape) {
if (superShape.size() < subShape.size()) {		if (superShape.size() < subShape.size()) {
return Optional<SmallVector<int64_t, 4>>();		return Optional<SmallVector<int64_t, 4>>();
}		}

// Starting from the end, compute the integer divisors.		// Starting from the end, compute the integer divisors.
// Set the boolean `divides` if integral division is not possible.		// Set the boolean `divides` if integral division is not possible.
std::vector<int64_t> result;		std::vector<int64_t> result;
result.reserve(superShape.size());		result.reserve(superShape.size());
[22 lines not shown; context: Optional<SmallVector<int64_t, 4>> mlir::shapeRatio(ArrayRef<int64_t> superShape,]

assert(result.size() == superShape.size() &&		assert(result.size() == superShape.size() &&
"super to sub shape ratio is not of the same size as the super rank");		"super to sub shape ratio is not of the same size as the super rank");

// Reverse again to get it back in the proper order and return.		// Reverse again to get it back in the proper order and return.
return SmallVector<int64_t, 4>{result.rbegin(), result.rend()};		return SmallVector<int64_t, 4>{result.rbegin(), result.rend()};
}		}

Optional<SmallVector<int64_t, 4>> shapeRatio(VectorType superVectorType,		Optional<SmallVector<int64_t, 4>> mlir::shapeRatio(VectorType superVectorType,
VectorType subVectorType) {		VectorType subVectorType) {
assert(superVectorType.getElementType() == subVectorType.getElementType() &&		assert(superVectorType.getElementType() == subVectorType.getElementType() &&
"vector types must be of the same elemental type");		"vector types must be of the same elemental type");
return shapeRatio(superVectorType.getShape(), subVectorType.getShape());		return shapeRatio(superVectorType.getShape(), subVectorType.getShape());
}		}

/// Constructs a permutation map from memref indices to vector dimension.		/// Constructs a permutation map from memref indices to vector dimension.
///		///
/// The implementation uses the knowledge of the mapping of enclosing loop to		/// The implementation uses the knowledge of the mapping of enclosing loop to
[69 lines not shown; context: static SetVector<Operation *> getParentsOfType(Operation *op) {]
return res;		return res;
}		}

/// Returns the enclosing AffineForOp, from closest to farthest.		/// Returns the enclosing AffineForOp, from closest to farthest.
static SetVector<Operation *> getEnclosingforOps(Operation *op) {		static SetVector<Operation *> getEnclosingforOps(Operation *op) {
return getParentsOfType<AffineForOp>(op);		return getParentsOfType<AffineForOp>(op);
}		}

AffineMap		AffineMap mlir::makePermutationMap(
makePermutationMap(Operation *op, ArrayRef<Value> indices,		Operation *op, ArrayRef<Value> indices,
const DenseMap<Operation *, unsigned> &loopToVectorDim) {		const DenseMap<Operation *, unsigned> &loopToVectorDim) {
DenseMap<Operation *, unsigned> enclosingLoopToVectorDim;		DenseMap<Operation *, unsigned> enclosingLoopToVectorDim;
auto enclosingLoops = getEnclosingforOps(op);		auto enclosingLoops = getEnclosingforOps(op);
for (auto *forInst : enclosingLoops) {		for (auto *forInst : enclosingLoops) {
auto it = loopToVectorDim.find(forInst);		auto it = loopToVectorDim.find(forInst);
if (it != loopToVectorDim.end()) {		if (it != loopToVectorDim.end()) {
enclosingLoopToVectorDim.insert(*it);		enclosingLoopToVectorDim.insert(*it);
}		}
}		}
return makePermutationMap(indices, enclosingLoopToVectorDim);		return ::makePermutationMap(indices, enclosingLoopToVectorDim);
}		}

bool matcher::operatesOnSuperVectorsOf(Operation &op,		bool matcher::operatesOnSuperVectorsOf(Operation &op,
VectorType subVectorType) {		VectorType subVectorType) {
// First, extract the vector type and distinguish between:		// First, extract the vector type and distinguish between:
// a. ops that must lower a super-vector (i.e. vector.transfer_read,		// a. ops that must lower a super-vector (i.e. vector.transfer_read,
// vector.transfer_write); and		// vector.transfer_write); and
// b. ops that may lower a super-vector (all other ops).		// b. ops that may lower a super-vector (all other ops).
[46 lines not shown; context: bool matcher::operatesOnSuperVectorsOf(Operation &op,]
// between parallel, reduction and possibly other cases.		// between parallel, reduction and possibly other cases.
if (!ratio.hasValue()) {		if (!ratio.hasValue()) {
return false;		return false;
}		}

return true;		return true;
}		}

} // namespace mlir

mlir/test/Dialect/Vector/vector-transforms.mlir

[307 lines not shown]
	// CHECK: return %arg1			// CHECK: return %arg1

	func @tuple_get(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<8xf32> {			func @tuple_get(%arg0: vector<4xf32>, %arg1: vector<8xf32>) -> vector<8xf32> {
	%0 = vector.tuple %arg0, %arg1 : vector<4xf32>, vector<8xf32>			%0 = vector.tuple %arg0, %arg1 : vector<4xf32>, vector<8xf32>
	%1 = vector.tuple_get %0, 1 : tuple<vector<4xf32>, vector<8xf32>>			%1 = vector.tuple_get %0, 1 : tuple<vector<4xf32>, vector<8xf32>>
	return %1 : vector<8xf32>			return %1 : vector<8xf32>
	}			}

				// CHECK-LABEL: func @tuple_get_producer_consumer
				// CHECK-SAME: %[[A0:.*0]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A1:.*1]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A2:.*2]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A3:.*3]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A4:.*4]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A5:.*5]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A6:.*6]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A7:.*7]]: vector<2x4xf32>
				// CHECK: return %[[A7]] : vector<2x4xf32>

				func @tuple_get_producer_consumer(
				%arg0 : vector<2x4xf32>, %arg1 : vector<2x4xf32>,
				%arg2 : vector<2x4xf32>, %arg3 : vector<2x4xf32>,
				%arg4 : vector<2x4xf32>, %arg5 : vector<2x4xf32>,
				%arg6 : vector<2x4xf32>, %arg7 : vector<2x4xf32>) -> vector<2x4xf32> {
				%0 = vector.tuple %arg0, %arg1, %arg2, %arg3, %arg4, %arg5, %arg6, %arg7
				: vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>,
				vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>
				// %arg7 == %0 at tupleIndex = 7, offsets = [0, 0]
				%1 = vector.insert_slices %0, [2, 4], [1, 1]
				: tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>,
				vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				into vector<4x16xf32>
				// %arg7 == %1 at tupleIndex = -1, offsets = [2, 12]
				%2 = vector.extract_slices %1, [4, 8], [1, 1]
				: vector<4x16xf32> into tuple<vector<4x8xf32>, vector<4x8xf32>>
				// %arg7 == %2 at tupleIndex = 1, offsets = [2, 4]
				%3 = vector.tuple_get %2, 1 : tuple<vector<4x8xf32>, vector<4x8xf32>>
				// %arg7 == %3 at tupleIndex = -1, offsets = [2, 4]
				%4 = vector.extract_slices %3, [2, 4], [1, 1]
				: vector<4x8xf32> into
				tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				// %arg7 == %4 at tupleIndex = 3, offsets = [0, 0]
				%5 = vector.tuple_get %4, 3
				: tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				// %arg7 == %5
				return %5 : vector<2x4xf32>
				}

				// CHECK-LABEL: func @tuple_get_producer_consumer_swizzle
				// CHECK-SAME: %[[A0:.*0]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A1:.*1]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A2:.*2]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A3:.*3]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A4:.*4]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A5:.*5]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A6:.*6]]: vector<2x4xf32>,
				// CHECK-SAME: %[[A7:.*7]]: vector<2x4xf32>
				// CHECK: return %[[A7]] : vector<2x4xf32>

				func @tuple_get_producer_consumer_swizzle(
				%arg0 : vector<2x4xf32>, %arg1 : vector<2x4xf32>,
				%arg2 : vector<2x4xf32>, %arg3 : vector<2x4xf32>,
				%arg4 : vector<2x4xf32>, %arg5 : vector<2x4xf32>,
				%arg6 : vector<2x4xf32>, %arg7 : vector<2x4xf32>) -> vector<2x4xf32> {
				%0 = vector.tuple %arg0, %arg1, %arg2, %arg3, %arg4, %arg5, %arg6, %arg7
				: vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>,
				vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>
				// %arg7 == %0 at tupleIndex = 7, offsets = [0, 0]
				%1 = vector.insert_slices %0, [2, 4], [1, 1]
				: tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>,
				vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				into vector<4x16xf32>
				// %arg7 == %1 at tupleIndex = -1, offsets = [2, 12]
				%2 = vector.extract_slices %1, [4, 8], [1, 1]
				: vector<4x16xf32> into tuple<vector<4x8xf32>, vector<4x8xf32>>
				// %arg7 == %2 at tupleIndex = 1, offsets = [2, 4]

				// Extract tuple elements.
				%3 = vector.tuple_get %2, 0 : tuple<vector<4x8xf32>, vector<4x8xf32>>
				%4 = vector.tuple_get %2, 1 : tuple<vector<4x8xf32>, vector<4x8xf32>>
				// %arg7 == %4 at tupleIndex = -1, offsets = [2, 4]

				// Swizzle tuple elements.
				%5 = vector.tuple %4, %3 : vector<4x8xf32>, vector<4x8xf32>
				// %arg7 == %5 at tupleIndex = 0, offsets = [2, 4]
				%6 = vector.tuple_get %5, 0 : tuple<vector<4x8xf32>, vector<4x8xf32>>
				// %arg7 == %6 at tupleIndex = -1, offsets = [2, 4]
				%7 = vector.extract_slices %6, [2, 4], [1, 1]
				: vector<4x8xf32> into
				tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				// %arg7 == %7 at tupleIndex = 3, offsets = [0, 0]
				%8 = vector.tuple_get %7, 3
				: tuple<vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>, vector<2x4xf32>>
				// %arg7 == %8
				return %8 : vector<2x4xf32>
				}

	// CHECK-LABEL: func @vector_transfers_vector_element_type			// CHECK-LABEL: func @vector_transfers_vector_element_type
	// CHECK: %[[C0:.*]] = constant 0 : index			// CHECK: %[[C0:.*]] = constant 0 : index
	// CHECK: %[[C1:.*]] = constant 1 : index			// CHECK: %[[C1:.*]] = constant 1 : index
	// CHECK: %[[VTR0:.*]] = vector.transfer_read %{{.*}}[%[[C0]], %[[C0]], %[[C0]]], %{{.*}} {permutation_map = #[[MAP1]]} : memref<6x2x1xvector<2x4xf32>>, vector<1x1x2x4xf32>			// CHECK: %[[VTR0:.*]] = vector.transfer_read %{{.*}}[%[[C0]], %[[C0]], %[[C0]]], %{{.*}} {permutation_map = #[[MAP1]]} : memref<6x2x1xvector<2x4xf32>>, vector<1x1x2x4xf32>
	// CHECK-NEXT: %[[VTR1:.*]] = vector.transfer_read %{{.*}}[%[[C0]], %[[C1]], %[[C0]]], %{{.*}} {permutation_map = #[[MAP1]]} : memref<6x2x1xvector<2x4xf32>>, vector<1x1x2x4xf32>			// CHECK-NEXT: %[[VTR1:.*]] = vector.transfer_read %{{.*}}[%[[C0]], %[[C1]], %[[C0]]], %{{.*}} {permutation_map = #[[MAP1]]} : memref<6x2x1xvector<2x4xf32>>, vector<1x1x2x4xf32>
	// CHECK-NEXT: vector.transfer_write %[[VTR0]], %{{.*}}[%[[C0]], %[[C0]], %[[C0]]] {permutation_map = #[[MAP1]]} : vector<1x1x2x4xf32>, memref<6x2x1xvector<2x4xf32>>			// CHECK-NEXT: vector.transfer_write %[[VTR0]], %{{.*}}[%[[C0]], %[[C0]], %[[C0]]] {permutation_map = #[[MAP1]]} : vector<1x1x2x4xf32>, memref<6x2x1xvector<2x4xf32>>
	// CHECK-NEXT: vector.transfer_write %[[VTR1]], %{{.*}}[%[[C0]], %[[C1]], %[[C0]]] {permutation_map = #[[MAP1]]} : vector<1x1x2x4xf32>, memref<6x2x1xvector<2x4xf32>>			// CHECK-NEXT: vector.transfer_write %[[VTR1]], %{{.*}}[%[[C0]], %[[C1]], %[[C0]]] {permutation_map = #[[MAP1]]} : vector<1x1x2x4xf32>, memref<6x2x1xvector<2x4xf32>>

[84 lines not shown]
