This is an archive of the discontinued LLVM Phabricator instance.

Differential D72808

[mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR
ClosedPublic

Authored by aartbik on Jan 15 2020, 1:55 PM.

Details

Reviewers

nicolasvasilache
andydavis1
rriddle

Commits

rG459cf6e5006a: [mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR

Summary

Uses progressive lowering to convert vector.extract_slices and vector.insert_slices to equivalent vector operations that can be subsequently lowered into LLVM.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aartbik created this revision. Jan 15 2020, 1:55 PM

Herald added a reviewer: nicolasvasilache. Jan 15 2020, 1:55 PM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, liufengdb, lucyrfox and 9 others.

aartbik edited the summary of this revision. Jan 15 2020, 1:56 PM

aartbik added reviewers: andydavis1, rriddle.

Unit tests: pass. 61804 tests passed, 0 failed and 781 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44097: Diff 238363. Jan 15 2020, 2:06 PM

Please provide much more explanation in the commit message because the progressive lowering behavior may be unexpected for most people.

Nothing profound in my comments, just some things I think we should reorganize to keep our house in order.
This looks great, thanks Aart for pushing this forward!

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
899	This looks more like a rewrite / folding pattern than a conversion to me (i.e. the types don't change). I would prob go for a folding pattern in VectorOps.cpp but we also need to make sure it is accessible. In any case please document with a before / after IR example.
900	Well on second inspection you already have that in the VectorToVectorOps conversion. So please don't duplicate here and just make the vectorToVector canonicalization patterns "insert" the "foldingPatterns". Structurally this is also an indication that VectorToLLVM should start including all the VectorToVector rewrites so that things compose properly.
910	We should assert that `operands[0].getDefiningOp()` is non-null. In the current design, tuples do not pass function boundaries and do not lower to LLVMIR so we want to fail hard so use cases pop up and we can reevaluate the decision with concrete evidence. Please also make sure the message is informative enough to surface the context I just explained. At that point, the whole thing should look like: `assert(...); rewriter.replaceOp(cast<...>); return matchSuccess();` i.e. no need to dyn_cast + check, this is guaranteed by the pattern rewrite infra.
919	Doc please, with before / after IR.
929	Yes that is hacky indeed :) You could just return failure if the uses are not empty: `if (!op->getResult()->use_empty()) return matchFailure();` The pattern will fail to apply until all canonicalizations have occurred, at which point it will apply. This will also error out gracefully by using the pattern rewrite infrastructure and fail a "full conversion" if things don't canonicalize.
952	Very cool, yay intuition :) ! Same comments as above apply still: bit more doc with before/after IR. I may not have set a good example myself here in the past, sorry about that! As the number of patterns grows we want short and intuitive examples to just get what these do without going deep in code. Also, this should go through VectorToVector patterns and be included in the VectorToLLVM patterns (e.g. what if the target is not LLVM, some of our friends at Xilinx seem to be in this case for example).
976	Strides has probably become overloaded, maybe we should rename `strided_slice` and `insert_strided_slice` as Insert/ExtractSlice and these ops as Insert/ExtractSliceTuple or something along those lines? In particular the `stride` attribute is not really a stride but a `step`. In any case, we should have helper functions for computing this. Could you please reuse/refactor/adapt the first functions in VectorTransforms.cpp and move them to VectorOps/Utils.h as appropriate? Thanks!
987	We should have helper functions for computing this. Could you please reuse/refactor/adapt the first functions in VectorTransforms.cpp and move them to VectorOps/Utils.h as appropriate? Thanks!
1007	Great, hooray for progressive lowering and pattern/canonicalization compositions !
mlir/lib/Dialect/VectorOps/VectorOps.cpp
1701 ↗	(On Diff #238363)	This can be called directly from other getXXXPatterns that want to use it, no need to duplicate the pattern.
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
548	This should be made into a VectorToVector test, that more directly tests "extract_slices" -> "strided_slices" + the removal of the tuple op. Atm it tests too much at a distance and involves multiple patterns coming together. Testing those patterns coming together could be done in a separate test pass where canonicalization, DCE and pattern interaction would kick in too and many things would get simplified.

addressed comments

aartbik marked 13 inline comments as done. Jan 16 2020, 4:04 PM

aartbik added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
899	Agreed in isolation, but since we introduce vector.tuple in other rewritings, I really need a ConversionPattern here, so that I see the newly introduced operands.
900	I avoided the duplication for now by just having the lowering change. Once we have a proper vector2vector rewriting, we can move it into folding again and populate.
929	Two complications I added to the comments. For newly introduced tuples, neither the op nor the op.getResult use_empty call is up-to-date (it is always empty). However, since this is a lowering pass, such new nodes need to be legalized anyway, so removing them is the right way. We still could assert if our rules somehow introduce a tuple we don't consume.
952	It sounds like we really need to move some of this into a vector to vector pass. That will also avoid some of the complications above.
976	I see similar TODOs elsewhere, so let's postpone that into one CL later.
mlir/lib/Dialect/VectorOps/VectorOps.cpp
1701 ↗	(On Diff #238363)	For now I do everything in lowering, leaving proper interaction with folding/canonicalization for later.
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
548	I agree. Once we have a proper vector to vector pass, we should test this (and others) in intermediate stages too.

rriddle added inline comments. Jan 16 2020, 4:06 PM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
899	// -> ///
924	getOperand(tupleGetOp.getIndex())
931	Same here.
940	Note, we don't run DCE during dialect conversion.
1016	replaceOpWithNewOp?

Unit tests: pass. 61909 tests passed, 0 failed and 782 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44219: Diff 238653. Jan 16 2020, 4:12 PM

addressed comments

aartbik added inline comments. Jan 16 2020, 4:46 PM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
940	I was under the impression we may introduce that in the future. Rephrased this a bit.

nicolasvasilache requested changes to this revision. Jan 20 2020, 7:39 PM

nicolasvasilache added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
976	I think with the changes and proto we discussed offline, this will change significantly before landing? I can't remember whether you'll still need to use existing utils. Marking as "request changes" until we see the new E2E impl.

This revision now requires changes to proceed. Jan 20 2020, 7:39 PM

Lowering of vector.extract/insert_slices to LLVM IR

aartbik retitled this revision from [mlir] [VectorOps] Lowering of vector.extract_slices to LLVM IR to [mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR. Jan 21 2020, 1:52 PM

aartbik edited the summary of this revision.

PTAL

This new approach directly uses the canonicalization for vector.tuple_get on a tuple (so no need to dup the rule!). Furthermore, by using the vector to vector rewriting prior to the lowering, the introduction of a vector tuple is completely transparent, since subsequent rewriting removes them again. An additional advantage is that we get DCE for free, so no need to introduce a DCE for just vector.tuple. The introduction of DCE forced me to fix a few test cases with dead code (good to do that anyway).

Note, since this code is much more structured, I also introduced vector.insert_slices in addition to vector.extract_slices, so that the final mechanism becomes more apparent.
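
To make the "DCE for free" remark concrete, here is a minimal sketch of what dead-code elimination does once every vector.tuple_get has been folded away. It is a toy model on a plain op list, not MLIR code: the `Op` struct and `eliminateDeadOps` are hypothetical names introduced only for illustration, and the real greedy rewriter is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy op list, not MLIR: each op names its result and the results it uses.
struct Op {
  std::string result;
  std::vector<std::string> operands;
  bool hasSideEffects; // e.g. a return or a print, which must be kept
};

// Repeatedly erase side-effect-free ops whose result no other op uses.
std::vector<Op> eliminateDeadOps(std::vector<Op> ops) {
  const std::size_t inputSize = ops.size();
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < ops.size(); ++i) {
      if (ops[i].hasSideEffects)
        continue;
      bool used = false;
      for (const Op &user : ops)
        for (const std::string &operand : user.operands)
          if (operand == ops[i].result)
            used = true;
      if (!used) {
        // Erasing one op may make its producers dead, so rescan from the top.
        ops.erase(ops.begin() + static_cast<std::ptrdiff_t>(i));
        changed = true;
        break;
      }
    }
  }
  assert(ops.size() <= inputSize && "DCE only erases ops");
  return ops;
}
```

With two slices `s0` and `s1` feeding a tuple `t`, and a return that (after folding) uses `s0` directly, the sketch first drops `t` and then `s1`, leaving only `s0` and the return.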

Unit tests: pass. 62017 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44524: Diff 239427. Jan 21 2020, 2:14 PM

fixed a few minor typos

Unit tests: pass. 62017 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44535: Diff 239447. Jan 21 2020, 4:04 PM

rriddle requested changes to this revision. Jan 21 2020, 4:33 PM

rriddle added inline comments.

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	This should just be in the 'fold' method instead of a canonicalization pattern.

This revision now requires changes to proceed. Jan 21 2020, 4:33 PM

rriddle added inline comments. Jan 21 2020, 4:35 PM

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	'fold' should be used whenever possible because it is applicable in many more places, e.g. during dialect conversion and OpBuilder::createOrFold.

rriddle added inline comments. Jan 21 2020, 4:36 PM

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	Also, please split this out into a different revision as this is unrelated.

sounds good, one question though

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	I followed the other "canonicalization" patterns in VectorOps. Just for my own understanding, should these strictly speaking have been written as folders too (the name "Folder" seems to imply that)?

sounds good, one question though

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	Other question, isn't fold for constants only? Here the tuple itself does not need to be constant, the only thing that matters is that get-on-tuple becomes a pass through of the value at the corresponding index?

andydavis1 added inline comments. Jan 22 2020, 11:36 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
1072	We need a TODO to move this code into a VectorOpsUtils class; we have various versions of it floating around in VectorOps.cpp and VectorTransforms.cpp

rriddle added inline comments. Jan 22 2020, 11:41 AM

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	fold has a slightly different contract than canonicalization patterns, and can be used (generally) for the following:
- In-place canonicalization
- Folding to an existing SSA value (does not have to be constant)
- Folding to an attribute value
So with that being said, if the canonicalization relies on creating new operations then it must use a pattern.
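
The "folding to an existing SSA value" case is the one that applies to a get on a tuple. A minimal sketch of that contract follows, written against a toy IR rather than the MLIR API: `Value`, `TupleOp`, `TupleGetOp` and `foldTupleGet` are hypothetical stand-ins, and the real `fold` hook has a different signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR, not MLIR: a Value is just an integer id.
using Value = int;

// Hypothetical stand-in for an op that bundles values into a tuple.
struct TupleOp {
  std::vector<Value> operands;
};

// Hypothetical stand-in for an op that reads element `index` of a tuple.
struct TupleGetOp {
  const TupleOp *source; // defining tuple op, or nullptr if not visible
  std::size_t index;
};

// Fold-style hook: it may only hand back a value that already exists.
// It never creates an op, which is why it can run in many contexts.
const Value *foldTupleGet(const TupleGetOp &op) {
  if (!op.source)
    return nullptr; // producer unknown, nothing to fold
  assert(op.index < op.source->operands.size() && "tuple index out of range");
  // Pass through the tuple operand; it does not need to be a constant.
  return &op.source->operands[op.index];
}
```

Note that nothing here requires the tuple operands to be constant, which matches the answer given to the question above. A rewrite that had to build a fresh operation could not be expressed this way and would need a pattern instead.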

Hold off on reviewing this CL, since it will be broken up in three.
The final remaining part for the lowering will be super small :=)

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
1072	The TODO is already there, see line 1038

mehdi_amini added inline comments. Jan 22 2020, 11:57 AM

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	River: do we have a doc on this? Seems like potentially a good entry for the FAQ otherwise?

aartbik marked an inline comment as done. Jan 22 2020, 1:57 PM

aartbik added inline comments.

mlir/lib/Dialect/VectorOps/VectorOps.cpp
1699 ↗	(On Diff #239447)	In particular, it was not immediately apparent to me that we always call fold(), passing in constants as arguments when available, but also allowing access to the non-constant operands through the regular getters (giving it potentially "two sources of truth"). Now that I have it working, it makes sense, but a bit more API doc would have been helpful.

integrate new lowering mechanism in LLVM lowering

PTAL (note, this will not build yet, but it shows how small this CL has become by defining the progressive lowering in separate patterns)

Unit tests: unknown.

clang-tidy: fail. clang-tidy found 2 errors and 0 warnings. 0 of them are added as review comments below.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt

Harbormaster failed remote builds in B44876: Diff 240278! Jan 24 2020, 1:13 PM

synced with latest update

\o/ yeah, it has been a long road, but PTAL

this CL is much, much smaller now; extra test added, and rewritten some existing tests to ensure code is not dead on entry

Unit tests: pass. 62194 tests passed, 0 failed and 815 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster completed remote builds in B44895: Diff 240325. Jan 24 2020, 5:12 PM

nicolasvasilache accepted this revision. Jan 27 2020, 10:13 AM

rriddle accepted this revision. Jan 27 2020, 10:35 AM

This revision is now accepted and ready to land. Jan 27 2020, 10:35 AM

Closed by commit rG459cf6e5006a: [mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR (authored by aartbik). Jan 27 2020, 10:39 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp	12 lines
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir	52 lines

Diff 240639

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

[890 preceding lines not shown]
 private:
   Operation *getPrintComma(Operation *op) const {
     return getPrint(op, lowering.getDialect(), "print_comma", {});
   }
   Operation *getPrintNewline(Operation *op) const {
     return getPrint(op, lowering.getDialect(), "print_newline", {});
   }
 };

 /// Progressive lowering of StridedSliceOp to either:
 /// 1. extractelement + insertelement for the 1-D case
 /// 2. extract + optional strided_slice + insert for the n-D case.
 class VectorStridedSliceOpConversion : public OpRewritePattern<StridedSliceOp> {
 public:
   using OpRewritePattern<StridedSliceOp>::OpRewritePattern;

   PatternMatchResult matchAndRewrite(StridedSliceOp op,
                                      PatternRewriter &rewriter) const override {
     auto dstType = op.getResult().getType().cast<VectorType>();

     assert(!op.offsets().getValue().empty() && "Unexpected empty offsets");

     int64_t offset =
         op.offsets().getValue().front().cast<IntegerAttr>().getInt();
     int64_t size = op.sizes().getValue().front().cast<IntegerAttr>().getInt();
     int64_t stride =
         op.strides().getValue().front().cast<IntegerAttr>().getInt();

     auto loc = op.getLoc();
     auto elemType = dstType.getElementType();
     assert(elemType.isIntOrIndexOrFloat());
     Value zero = rewriter.create<ConstantOp>(loc, elemType,
                                              rewriter.getZeroAttr(elemType));
     Value res = rewriter.create<SplatOp>(loc, dstType, zero);
     for (int64_t off = offset, e = offset + size * stride, idx = 0; off < e;
          off += stride, ++idx) {
       Value extracted = extractOne(rewriter, loc, op.vector(), off);
       if (op.offsets().getValue().size() > 1) {
         StridedSliceOp stridedSliceOp = rewriter.create<StridedSliceOp>(
             loc, extracted, getI64SubArray(op.offsets(), /* dropFront=*/1),
             getI64SubArray(op.sizes(), /* dropFront=*/1),
             getI64SubArray(op.strides(), /* dropFront=*/1));
         // Call matchAndRewrite recursively from within the pattern. This
         // circumvents the current limitation that a given pattern cannot
         // be called multiple times by the PatternRewrite infrastructure (to
         // avoid infinite recursion, but in this case, infinite recursion
         // cannot happen because the rank is strictly decreasing).
         // TODO(rriddle, nicolasvasilache) Implement something like a hook for
         // a potential function that must decrease and allow the same pattern
         // multiple times.
         auto success = matchAndRewrite(stridedSliceOp, rewriter);
         (void)success;
         assert(success && "Unexpected failure");
         extracted = stridedSliceOp;
       }
       res = insertOne(rewriter, loc, extracted, res, idx);
     }
     rewriter.replaceOp(op, {res});
     return matchSuccess();
   }
 };

 } // namespace

 /// Populate the given list with patterns that convert from Vector to LLVM.
 void mlir::populateVectorToLLVMConversionPatterns(
     LLVMTypeConverter &converter, OwningRewritePatternList &patterns) {
   MLIRContext *ctx = converter.getDialect()->getContext();
   patterns.insert<VectorInsertStridedSliceOpDifferentRankRewritePattern,
                   VectorInsertStridedSliceOpSameRankRewritePattern,
                   VectorStridedSliceOpConversion>(ctx);
   patterns.insert<VectorBroadcastOpConversion, VectorShuffleOpConversion,
                   VectorExtractElementOpConversion, VectorExtractOpConversion,
                   VectorInsertElementOpConversion, VectorInsertOpConversion,
                   VectorOuterProductOpConversion, VectorTypeCastOpConversion,
                   VectorPrintOpConversion>(ctx, converter);
 }

 namespace {
 struct LowerVectorToLLVMPass : public ModulePass<LowerVectorToLLVMPass> {
   void runOnModule() override;
 };
 } // namespace

 void LowerVectorToLLVMPass::runOnModule() {
-  // Convert to the LLVM IR dialect using the converter defined above.
-  OwningRewritePatternList patterns;
+  // Perform progressive lowering of operations on "slices".
+  // Folding and DCE get rid of all non-leaking tuple ops.
+  {
+    OwningRewritePatternList patterns;
+    populateVectorSlicesLoweringPatterns(patterns, &getContext());
+    applyPatternsGreedily(getModule(), patterns);
+  }
+
+  // Convert to the LLVM IR dialect.
   LLVMTypeConverter converter(&getContext());
+  OwningRewritePatternList patterns;
   populateVectorToLLVMConversionPatterns(converter, patterns);
   populateStdToLLVMConversionPatterns(converter, patterns);

   ConversionTarget target(getContext());
   target.addLegalDialect<LLVM::LLVMDialect>();
   target.addDynamicallyLegalOp<FuncOp>(
       [&](FuncOp op) { return converter.isSignatureLegal(op.getType()); });
   if (failed(
           applyPartialConversion(getModule(), target, patterns, &converter))) {
     signalPassFailure();
   }
 }

 OpPassBase<ModuleOp> *mlir::createLowerVectorToLLVMPass() {
   return new LowerVectorToLLVMPass();
 }

 static PassRegistration<LowerVectorToLLVMPass>
     pass("convert-vector-to-llvm",
          "Lower the operations from the vector dialect into the LLVM dialect");
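
For readers who want the 1-D case of VectorStridedSliceOpConversion without the rewriter plumbing, the loop can be sketched on a plain `std::vector`. This is an illustration under stated assumptions, not code from the revision: `stridedSlice1D` is a hypothetical helper, and element reads and writes stand in for the `extractOne` / `insertOne` calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a 1-D strided slice on a std::vector (not the MLIR API).
// Reads source positions offset, offset + stride, ... and writes them to
// result positions 0, 1, ..., mirroring the loop in the pattern.
std::vector<float> stridedSlice1D(const std::vector<float> &src,
                                  int64_t offset, int64_t size,
                                  int64_t stride) {
  assert(offset >= 0 && size >= 0 && stride > 0 && "invalid slice parameters");
  // Start from a zero-filled result, like the splat of zero in the pattern.
  std::vector<float> res(static_cast<std::size_t>(size), 0.0f);
  int64_t idx = 0;
  for (int64_t off = offset, e = offset + size * stride; off < e;
       off += stride, ++idx) {
    assert(off < static_cast<int64_t>(src.size()) && "slice reads out of bounds");
    res[static_cast<std::size_t>(idx)] = src[static_cast<std::size_t>(off)];
  }
  return res;
}
```

With offset 2, size 2 and stride 1 on a four-element input, this reads positions 2 and 3 and writes positions 0 and 1, which is the same index sequence the CHECK lines of the strided_slice1 test expect.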

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir

	Show First 20 Lines • Show All 418 Lines • ▼ Show 20 Lines
	// CHECK: llvm.call @print_comma() : () -> ()			// CHECK: llvm.call @print_comma() : () -> ()
	// CHECK: %[[x8:.*]] = llvm.mlir.constant(1 : index) : !llvm.i64			// CHECK: %[[x8:.*]] = llvm.mlir.constant(1 : index) : !llvm.i64
	// CHECK: %[[x9:.*]] = llvm.extractelement %[[x5]][%[[x8]] : !llvm.i64] : !llvm<"<2 x float>">			// CHECK: %[[x9:.*]] = llvm.extractelement %[[x5]][%[[x8]] : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK: llvm.call @print_f32(%[[x9]]) : (!llvm.float) -> ()			// CHECK: llvm.call @print_f32(%[[x9]]) : (!llvm.float) -> ()
	// CHECK: llvm.call @print_close() : () -> ()			// CHECK: llvm.call @print_close() : () -> ()
	// CHECK: llvm.call @print_close() : () -> ()			// CHECK: llvm.call @print_close() : () -> ()
	// CHECK: llvm.call @print_newline() : () -> ()			// CHECK: llvm.call @print_newline() : () -> ()

				func @strided_slice1(%arg0: vector<4xf32>) -> vector<2xf32> {
	func @strided_slice(%arg0: vector<4xf32>, %arg1: vector<4x8xf32>, %arg2: vector<4x8x16xf32>) {
	// CHECK-LABEL: llvm.func @strided_slice(
	%0 = vector.strided_slice %arg0 {offsets = [2], sizes = [2], strides = [1]} : vector<4xf32> to vector<2xf32>			%0 = vector.strided_slice %arg0 {offsets = [2], sizes = [2], strides = [1]} : vector<4xf32> to vector<2xf32>
				return %0 : vector<2xf32>
				}
				// CHECK-LABEL: llvm.func @strided_slice1
	// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float			// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
	// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2xf32>) : !llvm<"<2 x float>">			// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2xf32>) : !llvm<"<2 x float>">
	// CHECK: llvm.mlir.constant(2 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(2 : index) : !llvm.i64
	// CHECK: llvm.extractelement %{{.}}[%{{.}} : !llvm.i64] : !llvm<"<4 x float>">			// CHECK: llvm.extractelement %{{.}}[%{{.}} : !llvm.i64] : !llvm<"<4 x float>">
	// CHECK: llvm.mlir.constant(0 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(0 : index) : !llvm.i64
	// CHECK: llvm.insertelement %{{.}}, %{{.}}[%{{.*}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK: llvm.insertelement %{{.}}, %{{.}}[%{{.*}} : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK: llvm.mlir.constant(3 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(3 : index) : !llvm.i64
	// CHECK: llvm.extractelement %{{.}}[%{{.}} : !llvm.i64] : !llvm<"<4 x float>">			// CHECK: llvm.extractelement %{{.}}[%{{.}} : !llvm.i64] : !llvm<"<4 x float>">
	// CHECK: llvm.mlir.constant(1 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(1 : index) : !llvm.i64
	// CHECK: llvm.insertelement %{{.}}, %{{.}}[%{{.*}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK: llvm.insertelement %{{.}}, %{{.}}[%{{.*}} : !llvm.i64] : !llvm<"<2 x float>">
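The constants checked above (2, 0, 3, 1) follow from the 1-D slice rule that result element i is taken from source element offset + i * stride. The following is a minimal standalone sketch of that index mapping, not code from this patch; `stridedSlicePairs` is a hypothetical helper name and it does not use any MLIR API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MLIR): lists the (source index, result
// index) pairs touched by a 1-D strided slice, where
// result[i] = source[offset + i * stride].
std::vector<std::pair<int64_t, int64_t>>
stridedSlicePairs(int64_t sourceSize, int64_t offset, int64_t size,
                  int64_t stride) {
  assert(stride > 0 && size >= 0 && offset >= 0);
  // The last element read must still lie inside the source vector.
  assert(size == 0 || offset + (size - 1) * stride < sourceSize);
  std::vector<std::pair<int64_t, int64_t>> pairs;
  for (int64_t i = 0; i < size; ++i)
    pairs.emplace_back(offset + i * stride, i);
  return pairs;
}
```

For the `vector<4xf32>` case with offsets = [2], sizes = [2], strides = [1], calling `stridedSlicePairs(4, 2, 2, 1)` gives the pairs (2, 0) and (3, 1), which is the extractelement/insertelement order the CHECK lines expect.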

	%1 = vector.strided_slice %arg1 {offsets = [2], sizes = [2], strides = [1]} : vector<4x8xf32> to vector<2x8xf32>			func @strided_slice2(%arg0: vector<4x8xf32>) -> vector<2x8xf32> {
				%0 = vector.strided_slice %arg0 {offsets = [2], sizes = [2], strides = [1]} : vector<4x8xf32> to vector<2x8xf32>
				return %0 : vector<2x8xf32>
				}
				// CHECK-LABEL: llvm.func @strided_slice2
	// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float			// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
	// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2x8xf32>) : !llvm<"[2 x <8 x float>]">			// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2x8xf32>) : !llvm<"[2 x <8 x float>]">
	// CHECK: llvm.extractvalue %{{.*}}[2] : !llvm<"[4 x <8 x float>]">			// CHECK: llvm.extractvalue %{{.*}}[2] : !llvm<"[4 x <8 x float>]">
	// CHECK: llvm.insertvalue %{{.}}, %{{.}}[0] : !llvm<"[2 x <8 x float>]">			// CHECK: llvm.insertvalue %{{.}}, %{{.}}[0] : !llvm<"[2 x <8 x float>]">
	// CHECK: llvm.extractvalue %{{.*}}[3] : !llvm<"[4 x <8 x float>]">			// CHECK: llvm.extractvalue %{{.*}}[3] : !llvm<"[4 x <8 x float>]">
	// CHECK: llvm.insertvalue %{{.}}, %{{.}}[1] : !llvm<"[2 x <8 x float>]">			// CHECK: llvm.insertvalue %{{.}}, %{{.}}[1] : !llvm<"[2 x <8 x float>]">

	%2 = vector.strided_slice %arg1 {offsets = [2, 2], sizes = [2, 2], strides = [1, 1]} : vector<4x8xf32> to vector<2x2xf32>			func @strided_slice3(%arg0: vector<4x8xf32>) -> vector<2x2xf32> {
				%0 = vector.strided_slice %arg0 {offsets = [2, 2], sizes = [2, 2], strides = [1, 1]} : vector<4x8xf32> to vector<2x2xf32>
				return %0 : vector<2x2xf32>
				}
				// CHECK-LABEL: llvm.func @strided_slice3
	// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float			// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
	// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2x2xf32>) : !llvm<"[2 x <2 x float>]">			// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2x2xf32>) : !llvm<"[2 x <2 x float>]">
	//			//
	// Subvector vector<8xf32> @2			// Subvector vector<8xf32> @2
	// CHECK: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x <8 x float>]">			// CHECK: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x <8 x float>]">
	// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float			// CHECK: llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
	// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2xf32>) : !llvm<"<2 x float>">			// CHECK: llvm.mlir.constant(dense<0.000000e+00> : vector<2xf32>) : !llvm<"<2 x float>">
	// CHECK: llvm.mlir.constant(2 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(2 : index) : !llvm.i64
	[15 lines not shown]
	// CHECK: llvm.mlir.constant(0 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(0 : index) : !llvm.i64
	// CHECK: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK: llvm.mlir.constant(3 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(3 : index) : !llvm.i64
	// CHECK: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<8 x float>">			// CHECK: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<8 x float>">
	// CHECK: llvm.mlir.constant(1 : index) : !llvm.i64			// CHECK: llvm.mlir.constant(1 : index) : !llvm.i64
	// CHECK: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK: llvm.insertvalue {{.}}, {{.}}[1] : !llvm<"[2 x <2 x float>]">			// CHECK: llvm.insertvalue {{.}}, {{.}}[1] : !llvm<"[2 x <2 x float>]">

	return			func @insert_strided_slice1(%b: vector<4x4xf32>, %c: vector<4x4x4xf32>) -> vector<4x4x4xf32> {
	}

	func @insert_strided_slice(%a: vector<2x2xf32>, %b: vector<4x4xf32>, %c: vector<4x4x4xf32>) {
	// CHECK-LABEL: @insert_strided_slice

	%0 = vector.insert_strided_slice %b, %c {offsets = [2, 0, 0], strides = [1, 1]} : vector<4x4xf32> into vector<4x4x4xf32>			%0 = vector.insert_strided_slice %b, %c {offsets = [2, 0, 0], strides = [1, 1]} : vector<4x4xf32> into vector<4x4x4xf32>
				return %0 : vector<4x4x4xf32>
				}
				// CHECK-LABEL: @insert_strided_slice1
	// CHECK: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x [4 x <4 x float>]]">			// CHECK: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x [4 x <4 x float>]]">
	// CHECK-NEXT: llvm.insertvalue {{.}}, {{.}}[2] : !llvm<"[4 x [4 x <4 x float>]]">			// CHECK-NEXT: llvm.insertvalue {{.}}, {{.}}[2] : !llvm<"[4 x [4 x <4 x float>]]">

	%1 = vector.insert_strided_slice %a, %b {offsets = [2, 2], strides = [1, 1]} : vector<2x2xf32> into vector<4x4xf32>			func @insert_strided_slice2(%a: vector<2x2xf32>, %b: vector<4x4xf32>) -> vector<4x4xf32> {
				%0 = vector.insert_strided_slice %a, %b {offsets = [2, 2], strides = [1, 1]} : vector<2x2xf32> into vector<4x4xf32>
				return %0 : vector<4x4xf32>
				}
				// CHECK-LABEL: @insert_strided_slice2
	//			//
	// Subvector vector<2xf32> @0 into vector<4xf32> @2			// Subvector vector<2xf32> @0 into vector<4xf32> @2
	// CHECK: llvm.extractvalue {{.*}}[0] : !llvm<"[2 x <2 x float>]">			// CHECK: llvm.extractvalue {{.*}}[0] : !llvm<"[2 x <2 x float>]">
	// CHECK-NEXT: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x <4 x float>]">			// CHECK-NEXT: llvm.extractvalue {{.*}}[2] : !llvm<"[4 x <4 x float>]">
	// Element @0 -> element @2			// Element @0 -> element @2
	// CHECK-NEXT: llvm.mlir.constant(0 : index) : !llvm.i64			// CHECK-NEXT: llvm.mlir.constant(0 : index) : !llvm.i64
	// CHECK-NEXT: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK-NEXT: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK-NEXT: llvm.mlir.constant(2 : index) : !llvm.i64			// CHECK-NEXT: llvm.mlir.constant(2 : index) : !llvm.i64
	[15 lines not shown]
	// CHECK-NEXT: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<4 x float>">			// CHECK-NEXT: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<4 x float>">
	// Element @1 -> element @3			// Element @1 -> element @3
	// CHECK-NEXT: llvm.mlir.constant(1 : index) : !llvm.i64			// CHECK-NEXT: llvm.mlir.constant(1 : index) : !llvm.i64
	// CHECK-NEXT: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<2 x float>">			// CHECK-NEXT: llvm.extractelement {{.}}[{{.}} : !llvm.i64] : !llvm<"<2 x float>">
	// CHECK-NEXT: llvm.mlir.constant(3 : index) : !llvm.i64			// CHECK-NEXT: llvm.mlir.constant(3 : index) : !llvm.i64
	// CHECK-NEXT: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<4 x float>">			// CHECK-NEXT: llvm.insertelement {{.}}, {{.}}[{{.*}} : !llvm.i64] : !llvm<"<4 x float>">
	// CHECK-NEXT: llvm.insertvalue {{.}}, {{.}}[3] : !llvm<"[4 x <4 x float>]">			// CHECK-NEXT: llvm.insertvalue {{.}}, {{.}}[3] : !llvm<"[4 x <4 x float>]">
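The 2-D case above proceeds row by row: take a destination row out, overwrite the shifted elements, and put the row back. The sketch below mimics that on plain nested `std::vector`s, assuming unit strides as in the test. `Matrix` and `insertStridedSlice2D` are illustrative names, not part of the patch or of MLIR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical helper: inserts `src` into a copy of `dst` at
// (rowOffset, colOffset), assuming unit strides.
Matrix insertStridedSlice2D(const Matrix &src, Matrix dst,
                            std::size_t rowOffset, std::size_t colOffset) {
  for (std::size_t r = 0; r < src.size(); ++r) {
    assert(r + rowOffset < dst.size());
    // Analogous to llvm.extractvalue on the destination row.
    std::vector<float> row = dst[r + rowOffset];
    for (std::size_t c = 0; c < src[r].size(); ++c) {
      assert(c + colOffset < row.size());
      // Analogous to llvm.extractelement + llvm.insertelement.
      row[c + colOffset] = src[r][c];
    }
    // Analogous to llvm.insertvalue of the updated row.
    dst[r + rowOffset] = row;
  }
  return dst;
}
```

With a 2x2 source and offsets [2, 2] into a 4x4 destination, source row 0 lands in destination row 2 and source row 1 in row 3, with elements 0 and 1 mapped to positions 2 and 3, matching the "Element @0 -> element @2" and "Element @1 -> element @3" comments in the test.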

	return			func @extract_strides(%arg0: vector<3x3xf32>) -> vector<1x1xf32> {
	}			%0 = vector.extract_slices %arg0, [2, 2], [1, 1]
				: vector<3x3xf32> into tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
				%1 = vector.tuple_get %0, 3 : tuple<vector<2x2xf32>, vector<2x1xf32>, vector<1x2xf32>, vector<1x1xf32>>
				return %1 : vector<1x1xf32>
				}
				// CHECK-LABEL: extract_strides(%arg0: !llvm<"[3 x <3 x float>]">)
				// CHECK: %[[s0:.*]] = llvm.mlir.constant(dense<0.000000e+00> : vector<1x1xf32>) : !llvm<"[1 x <1 x float>]">
				// CHECK: %[[s1:.*]] = llvm.extractvalue %arg0[2] : !llvm<"[3 x <3 x float>]">
				// CHECK: %[[s3:.*]] = llvm.mlir.constant(dense<0.000000e+00> : vector<1xf32>) : !llvm<"<1 x float>">
				// CHECK: %[[s4:.*]] = llvm.mlir.constant(2 : index) : !llvm.i64
				// CHECK: %[[s5:.*]] = llvm.extractelement %[[s1]][%[[s4]] : !llvm.i64] : !llvm<"<3 x float>">
				// CHECK: %[[s6:.*]] = llvm.mlir.constant(0 : index) : !llvm.i64
				// CHECK: %[[s7:.*]] = llvm.insertelement %[[s5]], %[[s3]][%[[s6]] : !llvm.i64] : !llvm<"<1 x float>">
				nicolasvasilache: This should be made into a VectorToVector test, that more directly tests "extract_slices" -> "strided_slices" + the removal of the tuple op. Atm it tests too much at a distance and involves multiple patterns coming together. Testing those patterns coming together could be done in a separate test pass where canonicalization, DCE and pattern interaction would kick in too and many things would get simplified.
				aartbik (author): I agree. Once we have a proper vector to vector pass, we should test this (and others) in intermediate stages too.
				// CHECK: %[[s8:.*]] = llvm.insertvalue %[[s7]], %[[s0]][0] : !llvm<"[1 x <1 x float>]">
				// CHECK: llvm.return %[[s8]] : !llvm<"[1 x <1 x float>]">
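The tuple type in the `extract_strides` test (2x2, 2x1, 1x2, 1x1) is what results from tiling a 3x3 vector with 2x2 slices and clipping at the boundary. The sketch below reproduces that enumeration, assuming row-major slice order and unit strides; both assumptions are inferred from the test's tuple type rather than taken from the implementation, and `Slice` and `enumerateSlices` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical description of one slice: where it starts and how big it is.
struct Slice {
  std::vector<int64_t> offsets;
  std::vector<int64_t> shape;
};

// Enumerates slices of `shape` tiled by `sizes` in row-major order,
// clipping slices that would run past the vector boundary.
std::vector<Slice> enumerateSlices(const std::vector<int64_t> &shape,
                                   const std::vector<int64_t> &sizes) {
  assert(shape.size() == sizes.size());
  const std::size_t rank = shape.size();
  std::vector<int64_t> counts(rank);
  int64_t total = 1;
  for (std::size_t d = 0; d < rank; ++d) {
    assert(sizes[d] > 0);
    counts[d] = (shape[d] + sizes[d] - 1) / sizes[d]; // ceiling division
    total *= counts[d];
  }
  std::vector<Slice> slices;
  for (int64_t k = 0; k < total; ++k) {
    Slice s{std::vector<int64_t>(rank), std::vector<int64_t>(rank)};
    int64_t rem = k;
    // Delinearize k with the innermost dimension varying fastest.
    for (std::size_t d = rank; d-- > 0;) {
      const int64_t idx = rem % counts[d];
      rem /= counts[d];
      s.offsets[d] = idx * sizes[d];
      s.shape[d] = std::min(sizes[d], shape[d] - s.offsets[d]);
    }
    slices.push_back(std::move(s));
  }
  return slices;
}
```

Under these assumptions, `enumerateSlices({3, 3}, {2, 2})` yields four slices whose shapes match the tuple type in the test, and the slice at index 3 (the one selected by `vector.tuple_get %0, 3`) starts at offsets [2, 2], consistent with the `llvm.extractvalue %arg0[2]` and the constant index 2 in the CHECK lines.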

Revision Contents

Diff 240639

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
