This is an archive of the discontinued LLVM Phabricator instance.

Differential D118248

[mlir][Vector] Enable create_mask for scalable vectors
Closed, Public

Authored by jsetoain on Jan 26 2022, 7:08 AM.


Details

Reviewers

aartbik
nicolasvasilache
ftynse
c-rhodes
dcaballe

Commits

rGa75a46db89f3: [mlir][Vector] Enable create_mask for scalable vectors

Summary

The way vector.create_mask is currently lowered is
vector-length-dependent, and therefore incompatible with scalable vector
types. This patch adds an alternative lowering path for create_mask
operations that return a scalable vector mask.
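
To make the summary concrete, here is a minimal sketch in plain C++ (not MLIR) of what a 1-D `create_mask` computes when lowered through a step vector: lane indices 0, 1, 2, ... are compared lane by lane against the splatted bound. The names `createMask1D` and `numLanes` are illustrative assumptions and do not appear in the patch; `numLanes` stands in for the runtime vector length, which the scalable lowering never needs to know at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a 1-D create_mask: lane i is active iff i < bound.
// `numLanes` plays the role of the hardware vector length, which is only
// known at run time for scalable vectors.
std::vector<bool> createMask1D(std::size_t numLanes, std::int64_t bound) {
  assert(numLanes > 0 && "a vector has at least one lane");
  std::vector<bool> mask(numLanes);
  for (std::size_t lane = 0; lane < numLanes; ++lane) {
    // Element of the step vector: 0, 1, 2, ...
    const auto step = static_cast<std::int64_t>(lane);
    // Signed "less than" against the bound broadcast to every lane.
    mask[lane] = step < bound;
  }
  return mask;
}
```

Because the comparison is per lane, the same computation works for any vector length, which is why this formulation suits scalable vectors.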

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jsetoain created this revision. Jan 26 2022, 7:08 AM

Herald added a reviewer: aartbik. Jan 26 2022, 7:08 AM

Herald added subscribers: sdasgup3, wenzhicui, wrengr and 21 others.

jsetoain requested review of this revision. Jan 26 2022, 7:08 AM

Herald added a reviewer: nicolasvasilache. Jan 26 2022, 7:08 AM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Please block the patch for now; this is just something I'm thinking about. I'm also opening a post on Discourse to get some feedback.

This is the discussion thread: Vector.create_mask for scalable vectors

Harbormaster completed remote builds in B145740: Diff 403258. Jan 27 2022, 2:21 AM

jsetoain added a child revision: D118379: [mlir][Sparse] Add option for VLA sparsification. Jan 27 2022, 8:43 AM

jsetoain added a child revision: D104517: [mlir][Vector] Add integration tests for ArmSVE. Jan 28 2022, 9:55 AM

Fix create mask folder. Move create_mask on scalable vectors from
VectorTransforms to ConvertVectorToLLVM

Herald added a reviewer: ftynse. Feb 1 2022, 9:07 AM

Herald added subscribers: alextsao1999, awarzynski.

Harbormaster completed remote builds in B146927: Diff 404962. Feb 1 2022, 10:39 AM

c-rhodes added a subscriber: c-rhodes. Feb 3 2022, 1:40 AM

Switching to llvm.intr.experimental.stepvector for lowering. This fixes
consistency issues with fixed-length create_mask operations.

Harbormaster completed remote builds in B148443: Diff 407105. Feb 9 2022, 2:55 AM

jsetoain added a parent revision: D118371: [mlir][LLVM] Allow scalable vectors in ShuffleVectorOp. Feb 9 2022, 2:57 AM

c-rhodes added inline comments. Feb 9 2022, 7:30 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920	This optimisation seems odd: I can't imagine there's any hardware out there with vectors approaching 2^64 elements (or 2^32, for that matter). Can this option be removed so that it always defaults to i32?
923	`getScalableVectorType`?
928	I know this is based on the fixed lowering, but I wonder if this should be `ult`.

jsetoain marked 3 inline comments as done. Feb 9 2022, 9:11 AM

jsetoain added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920	The default is i32 (i.e.: optimize indices), but the fixed-length create_mask operation supports i64 indices so I believe it's only fair, for consistency, to support them for scalable vectors as well. Is it worth removing the option altogether? Probably, but that should be a different patch :-)
923	It's broken and I'm not sure how to fix it. It's in the backlog.
928	If you do `ult` you have the same "wrap around" problem. It has to be signed in case the index is negative. There's a discussion in D116069, "[mlir][vector] Allow values outside of [0; dim-size] in create_mask" (https://reviews.llvm.org/D116069), about why this is the preferred behavior.
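
The point about `slt` versus `ult` can be illustrated with a small stand-alone C++ sketch (plain C++ rather than MLIR; `laneActiveSigned` and `laneActiveUnsigned` are hypothetical names, not part of the patch). With a negative bound, the signed comparison leaves every lane inactive, whereas reinterpreting the same bits as unsigned turns the bound into a very large number and activates every lane.

```cpp
#include <cassert>
#include <cstdint>

// Signed comparison, modelling a signed less-than (slt) on 32-bit lanes.
bool laneActiveSigned(std::int32_t lane, std::int32_t bound) {
  assert(lane >= 0 && "step vector lanes are non-negative");
  return lane < bound;
}

// Unsigned comparison, modelling an unsigned less-than (ult) on the same
// bit patterns.
bool laneActiveUnsigned(std::int32_t lane, std::int32_t bound) {
  assert(lane >= 0 && "step vector lanes are non-negative");
  // Conversion to uint32_t is modular, so a bound of -1 becomes 4294967295.
  return static_cast<std::uint32_t>(lane) < static_cast<std::uint32_t>(bound);
}
```

For non-negative bounds the two predicates agree; they only diverge once the bound is negative, which is the case the signed predicate is meant to handle.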

Rebase on main

Harbormaster completed remote builds in B148500: Diff 407184. Feb 9 2022, 10:12 AM

Does this need a round-trip test in mlir/test/Dialect/Vector/ops.mlir?

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920	> The default is i32 (i.e.: optimize indices), but the fixed-length create_mask operation supports i64 indices so I believe it's only fair, for consistency, to support them for scalable vectors as well. Is it worth removing the option altogether? probably, but that should be a different patch :-)
	I agree, better to be consistent and remove in another patch. Just an observation :)
928	> If you do `ult` you have the same "wrap around" problem. It has to be signed in case the index is negative. There's a discussion in D116069, "[mlir][vector] Allow values outside of [0; dim-size] in create_mask" (https://reviews.llvm.org/D116069), about why this is the preferred behavior.
	Should the `vector.create_mask` -> `vector.constant_mask` canonicalization for negative values happen before this lowering?

Accept particular cases of scalable constant masks.

Harbormaster completed remote builds in B150273: Diff 409701. Feb 17 2022, 10:14 AM

jsetoain marked an inline comment as done. Feb 17 2022, 10:14 AM

jsetoain added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
928	I've included the right flow and constraints for create_mask -> constant_mask canonicalization.

jsetoain marked an inline comment as done. Feb 17 2022, 10:15 AM

jsetoain added a reviewer: c-rhodes. Feb 21 2022, 2:37 AM

c-rhodes added inline comments. Feb 21 2022, 5:43 AM

mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
94 ↗	(On Diff #409701)	nit: an
mlir/lib/Dialect/Vector/IR/VectorOps.cpp
4242	`== 0`? Or `to be 0`? From https://mlir.llvm.org/docs/Dialects/Vector/#vectorconstant_mask-mlirvectorconstantmaskop: "Each value of ‘mask_dim_sizes’ must be non-negative and not greater than the size of the corresponding vector dimension (as opposed to vector.create_mask which allows this)."
mlir/test/Dialect/Vector/ops.mlir
375	nit: unrelated change

Address comments

jsetoain marked 3 inline comments as done. Feb 21 2022, 8:39 AM

Harbormaster completed remote builds in B150703: Diff 410314. Feb 21 2022, 8:57 AM

LGTM, but I'm pretty new around these parts so I'll leave it to another reviewer to accept

@ftynse Hi Alex, I just wanted to kindly remind you about this patch. After the discussion on Discourse, nobody else seems to have anything against it, and it's currently blocking a stack of approved patches. Thank you!

Herald added a project: Restricted Project. Mar 4 2022, 8:39 AM

LGTM, let's wait for Alex.
Few remaining nits

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
1755	period at end
mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
607	period at end
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1466	Although this will match, genbool_1d_scalable is a better LABEL!

Address reviewer comments

Thanks for the review, Aart!

Harbormaster completed remote builds in B153313: Diff 414038. Mar 9 2022, 2:23 AM

jsetoain added a reviewer: dcaballe. Mar 23 2022, 9:06 AM

Thanks for the contribution, Javier! Some comments inline.

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
69	Add doc about what `indexOptimizations` is actually enabling?
mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
99 ↗	(On Diff #414038)	Now that we are making this utility public, I wonder if it would make more sense to move it somewhere else since it's not a vector specific transformation, right? Maybe we could move it to the builder class? It looks like a builder method to me.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905	Would it make sense to align this lowering with the non-scalable version (i.e., moving it to Vector Transforms)? I think `create_mask` is a relatively high-level op that would make sense to lower to something simpler before we lower it to LLVM. That would align with the approach that we follow for similar vector ops and would make the LLVM lowering simpler (which is complex already). (Hopefully I'm not missing any context. Feel free to ignore this comment if you already discussed this.)
919–920	+1 to removing this. It's a bit odd that we shrink the index type as part of the lowering. I think that kind of transformation should happen as a separate step before the lowering to LLVM.

Address reviewer comments

mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
99 ↗	(On Diff #414038)	That creates an awkward dependency between common MLIR and the Arithmetic dialect, but it does make a lot of sense to move this to Arithmetic Utils (I've also renamed it to match similar utilities).
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905	The reason to separate them was that the scalable lowering depends on LLVM IR much earlier than the fixed-length version. In a way, `create_mask` for scalable vectors is a lower-level operation than `create_mask` for fixed-length vectors. If we move scalable create_mask to Vector Transforms, we add an additional dependency there. If we want to unify both lowerings, I think it would make more sense to move `create_mask` to conversions. There was some discussion in the thread I created for this, and the conclusion was that having two lowerings for one operation was not an issue. If that is not the case any more, or you think it doesn't apply to this case, I'm happy to move things around. It's a trivial change :-)

Harbormaster completed remote builds in B156027: Diff 417883. Mar 24 2022, 4:51 AM

ftynse accepted this revision. Mar 24 2022, 6:54 AM

ftynse added inline comments.

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
66	Why not call it `assume32BitIndices` then?
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
911	Nit: no need for explicit `mlir::` here.
mlir/lib/Dialect/Vector/IR/VectorOps.cpp
4286–4287	Use `matchPattern(m_constantInt(...))` instead of explicitly matching for `arith.constant` here.

This revision is now accepted and ready to land. Mar 24 2022, 6:54 AM

ftynse added inline comments. Mar 24 2022, 6:57 AM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920	Do you all know that the bitwidth of index is parameterized through the data layout mechanism? You can also define this as a full-fledged `ConversionPattern`, get hold of a `TypeConverter` instance and make it convert the actual `IndexType` to get you the integer type of the expected bitwidth. Anything else will run into type mismatches sooner or later.

Address reviewer comments

jsetoain added inline comments. Mar 24 2022, 8:35 AM

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
66	The only reason is that it is the name used elsewhere, but I agree it's not a great name. Once this has landed, I will push an NFC patch changing the name everywhere, probably to something like `assume32BitVectorIndices` to avoid confusion.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920	I believe this is handled independently from that mechanism because you might want to have IndexType to i64 conversion, and yet generate i32 indices in this very particular case. While your loop indices might go well beyond 2^32, the length of your physical vector is unlikely to do so.
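
As a rough illustration of what "assume indices fit in 32 bits" implies, the following stand-alone C++ sketch models narrowing a 64-bit bound to 32 bits before the per-lane comparison. The helpers `narrowBound` and `laneActiveNarrow` are hypothetical and not part of the patch, and treating the narrowing as a plain truncation is an assumption of this sketch. The result is unchanged as long as the bound fits in a signed 32-bit integer; a larger bound would wrap.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Keep the low 32 bits of a 64-bit bound, modelling an integer truncation.
std::int32_t narrowBound(std::int64_t bound) {
  const auto low = static_cast<std::uint32_t>(static_cast<std::uint64_t>(bound));
  std::int32_t narrowed;
  static_assert(sizeof(narrowed) == sizeof(low), "both are 32 bits wide");
  std::memcpy(&narrowed, &low, sizeof(narrowed)); // bit-for-bit reinterpretation
  return narrowed;
}

// Lane test performed on 32-bit indices after narrowing the bound.
bool laneActiveNarrow(std::int32_t lane, std::int64_t bound) {
  assert(lane >= 0 && "step vector lanes are non-negative");
  return lane < narrowBound(bound);
}
```

Small bounds, including negative ones, survive the narrowing unchanged, which matches the argument that the length of a physical vector is unlikely to exceed 2^32 even when loop indices do.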

Harbormaster completed remote builds in B156068: Diff 417939. Mar 24 2022, 8:51 AM

jsetoain added a child revision: D122415: [mlir][vector][nfc] Rename index optimizations option. Mar 24 2022, 9:55 AM

dcaballe accepted this revision. Mar 24 2022, 12:01 PM

dcaballe added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905	If it was discussed already, it's fine then! We can align both lowerings in the future. We may have a better picture once we introduce further masking support.
	> The reason to separate them was that the scalable lowering depends on LLVM IR
	What is missing on the MLIR side to be able to do the lowering? StepVectorOp?

jsetoain marked 2 inline comments as done. Mar 24 2022, 12:48 PM

jsetoain added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905	StepVectorOp would unify the lowering, but, unless we find more use cases, I'm not sure it's worth the change. The documentation of llvm.experimental.stepvector says that the recommended way to do this for fixed-length vectors is by using a constant vector. This would lead to a unification of the lowering of create_mask, but then a dual lowering for StepVectorOp (to the intrinsic or the constant vector, depending on the scalability of the result). I'd say it's just marginally better, if that. If, as we expand mask creation and manipulation in MLIR, we find this op is useful, I'll take care of unifying the lowering of this op.

jsetoain marked 2 inline comments as done. Mar 25 2022, 2:34 AM

Closed by commit rGa75a46db89f3: [mlir][Vector] Enable create_mask for scalable vectors (authored by jsetoain). Mar 25 2022, 3:50 AM

This revision was automatically updated to reflect the committed changes.

jsetoain added a commit: rGa75a46db89f3: [mlir][Vector] Enable create_mask for scalable vectors.

Revision Contents

Path	Size
mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h	3 lines
mlir/include/mlir/Dialect/Arithmetic/Utils/Utils.h	6 lines
mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td	8 lines
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp	43 lines
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp	4 lines
mlir/lib/Dialect/Arithmetic/Utils/Utils.cpp	21 lines
mlir/lib/Dialect/Vector/IR/VectorOps.cpp	21 lines
mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp	36 lines
mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir	23 lines
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir	24 lines
mlir/test/Dialect/Vector/canonicalize.mlir	10 lines
mlir/test/Dialect/Vector/invalid.mlir	7 lines
mlir/test/Dialect/Vector/ops.mlir	2 lines

Diff 418177

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h

	Show First 20 Lines • Show All 57 Lines • ▼ Show 20 Lines

	/// Collect a set of patterns to convert from Vector contractions to LLVM Matrix			/// Collect a set of patterns to convert from Vector contractions to LLVM Matrix
	/// Intrinsics. To lower to assembly, the LLVM flag -lower-matrix-intrinsics			/// Intrinsics. To lower to assembly, the LLVM flag -lower-matrix-intrinsics
	/// will be needed when invoking LLVM.			/// will be needed when invoking LLVM.
	void populateVectorToLLVMMatrixConversionPatterns(LLVMTypeConverter &converter,			void populateVectorToLLVMMatrixConversionPatterns(LLVMTypeConverter &converter,
	RewritePatternSet &patterns);			RewritePatternSet &patterns);

	/// Collect a set of patterns to convert from the Vector dialect to LLVM.			/// Collect a set of patterns to convert from the Vector dialect to LLVM.
				/// If `indexOptimizations` is set, assume indices fit in 32-bit.
	void populateVectorToLLVMConversionPatterns(			void populateVectorToLLVMConversionPatterns(
	LLVMTypeConverter &converter, RewritePatternSet &patterns,			LLVMTypeConverter &converter, RewritePatternSet &patterns,
	bool reassociateFPReductions = false);			bool reassociateFPReductions = false, bool indexOptimizations = false);

	/// Create a pass to convert vector operations to the LLVMIR dialect.			/// Create a pass to convert vector operations to the LLVMIR dialect.
	std::unique_ptr<OperationPass<ModuleOp>> createConvertVectorToLLVMPass(			std::unique_ptr<OperationPass<ModuleOp>> createConvertVectorToLLVMPass(
	const LowerVectorToLLVMOptions &options = LowerVectorToLLVMOptions());			const LowerVectorToLLVMOptions &options = LowerVectorToLLVMOptions());

	} // namespace mlir			} // namespace mlir

	#endif // MLIR_CONVERSION_VECTORTOLLVM_CONVERTVECTORTOLLVM_H_			#endif // MLIR_CONVERSION_VECTORTOLLVM_CONVERTVECTORTOLLVM_H_

mlir/include/mlir/Dialect/Arithmetic/Utils/Utils.h

	Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines
	};			};

	/// Converts an OpFoldResult to a Value. Returns the fold result if it casts to			/// Converts an OpFoldResult to a Value. Returns the fold result if it casts to
	/// a Value or creates a ConstantIndexOp if it casts to an IntegerAttribute.			/// a Value or creates a ConstantIndexOp if it casts to an IntegerAttribute.
	/// Other attribute types are not supported.			/// Other attribute types are not supported.
	Value getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,			Value getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,
	OpFoldResult ofr);			OpFoldResult ofr);

				/// Create a cast from an index-like value (index or integer) to another
				/// index-like value. If the value type and the target type are the same, it
				/// returns the original value.
				Value getValueOrCreateCastToIndexLike(OpBuilder &b, Location loc,
				Type targetType, Value value);

	/// Similar to the other overload, but converts multiple OpFoldResults into			/// Similar to the other overload, but converts multiple OpFoldResults into
	/// Values.			/// Values.
	SmallVector<Value>			SmallVector<Value>
	getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,			getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,
	ArrayRef<OpFoldResult> valueOrAttrVec);			ArrayRef<OpFoldResult> valueOrAttrVec);

	/// Helper struct to build simple arithmetic quantities with minimal type			/// Helper struct to build simple arithmetic quantities with minimal type
	/// inference support.			/// inference support.
	Show All 17 Lines

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td

	Show First 20 Lines • Show All 1,746 Lines • ▼ Show 20 Lines
	def LLVM_masked_compressstore			def LLVM_masked_compressstore
	: LLVM_IntrOp<"masked.compressstore", [], [0], [], 0> {			: LLVM_IntrOp<"masked.compressstore", [], [0], [], 0> {
	let arguments = (ins LLVM_Type, LLVM_Type, LLVM_Type);			let arguments = (ins LLVM_Type, LLVM_Type, LLVM_Type);
	}			}

	/// Create a call to vscale intrinsic.			/// Create a call to vscale intrinsic.
	def LLVM_vscale : LLVM_IntrOp<"vscale", [0], [], [], 1>;			def LLVM_vscale : LLVM_IntrOp<"vscale", [0], [], [], 1>;

				/// Create a call to stepvector intrinsic.
				def LLVM_StepVectorOp
				: LLVM_IntrOp<"experimental.stepvector", [0], [], [NoSideEffect], 1> {
				let arguments = (ins);
				let results = (outs LLVM_Type:$res);
				let assemblyFormat = "attr-dict `:` type($res)";
				}

	// Atomic operations.			// Atomic operations.
	//			//

	def AtomicBinOpXchg : I64EnumAttrCase<"xchg", 0>;			def AtomicBinOpXchg : I64EnumAttrCase<"xchg", 0>;
	def AtomicBinOpAdd : I64EnumAttrCase<"add", 1>;			def AtomicBinOpAdd : I64EnumAttrCase<"add", 1>;
	def AtomicBinOpSub : I64EnumAttrCase<"sub", 2>;			def AtomicBinOpSub : I64EnumAttrCase<"sub", 2>;
	def AtomicBinOpAnd : I64EnumAttrCase<"_and", 3>;			def AtomicBinOpAnd : I64EnumAttrCase<"_and", 3>;
	def AtomicBinOpNand : I64EnumAttrCase<"nand", 4>;			def AtomicBinOpNand : I64EnumAttrCase<"nand", 4>;
	▲ Show 20 Lines • Show All 154 Lines • Show Last 20 Lines

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

//===- VectorToLLVM.cpp - Conversion from Vector to the LLVM dialect ------===//		//===- VectorToLLVM.cpp - Conversion from Vector to the LLVM dialect ------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h"		#include "mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h"

#include "mlir/Conversion/LLVMCommon/VectorPattern.h"		#include "mlir/Conversion/LLVMCommon/VectorPattern.h"
#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"		#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
		#include "mlir/Dialect/Arithmetic/Utils/Utils.h"
#include "mlir/Dialect/LLVMIR/FunctionCallUtils.h"		#include "mlir/Dialect/LLVMIR/FunctionCallUtils.h"
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"		#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h"		#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/Vector/Transforms/VectorTransforms.h"		#include "mlir/Dialect/Vector/Transforms/VectorTransforms.h"
#include "mlir/IR/BuiltinTypes.h"		#include "mlir/IR/BuiltinTypes.h"
#include "mlir/Support/MathExtras.h"		#include "mlir/Support/MathExtras.h"
#include "mlir/Target/LLVMIR/TypeToLLVM.h"		#include "mlir/Target/LLVMIR/TypeToLLVM.h"
#include "mlir/Transforms/DialectConversion.h"		#include "mlir/Transforms/DialectConversion.h"
▲ Show 20 Lines • Show All 874 Lines • ▼ Show 20 Lines	for (const auto &indexedSize :
desc.setStride(rewriter, loc, index, stride);		desc.setStride(rewriter, loc, index, stride);
}		}

rewriter.replaceOp(castOp, {desc});		rewriter.replaceOp(castOp, {desc});
return success();		return success();
}		}
};		};

		/// Conversion pattern for a `vector.create_mask` (1-D scalable vectors only).
		/// Non-scalable versions of this operation are handled in Vector Transforms.
		class VectorCreateMaskOpRewritePattern
		: public OpRewritePattern<vector::CreateMaskOp> {
		public:
		explicit VectorCreateMaskOpRewritePattern(MLIRContext *context,
		bool enableIndexOpt)
		: OpRewritePattern<vector::CreateMaskOp>(context),
		indexOptimizations(enableIndexOpt) {}

		LogicalResult matchAndRewrite(vector::CreateMaskOp op,
		PatternRewriter &rewriter) const override {
		auto dstType = op.getType();
		if (dstType.getRank() != 1 || !dstType.cast<VectorType>().isScalable())
		return failure();
		IntegerType idxType =
		indexOptimizations ? rewriter.getI32Type() : rewriter.getI64Type();
		auto loc = op->getLoc();
		Value indices = rewriter.create<LLVM::StepVectorOp>(
		loc, LLVM::getVectorType(idxType, dstType.getShape()[0],
		/*isScalable=*/true));
		auto bound = getValueOrCreateCastToIndexLike(rewriter, loc, idxType,
		op.getOperand(0));
		Value bounds = rewriter.create<SplatOp>(loc, indices.getType(), bound);
		Value comp = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt,
		indices, bounds);
		rewriter.replaceOp(op, comp);
		return success();
		}

		private:
		const bool indexOptimizations;
		};

class VectorPrintOpConversion : public ConvertOpToLLVMPattern<vector::PrintOp> {		class VectorPrintOpConversion : public ConvertOpToLLVMPattern<vector::PrintOp> {
public:		public:
using ConvertOpToLLVMPattern<vector::PrintOp>::ConvertOpToLLVMPattern;		using ConvertOpToLLVMPattern<vector::PrintOp>::ConvertOpToLLVMPattern;

// Proof-of-concept lowering implementation that relies on a small		// Proof-of-concept lowering implementation that relies on a small
// runtime support library, which only needs to provide a few		// runtime support library, which only needs to provide a few
// printing methods (single value for all data types, opening/closing		// printing methods (single value for all data types, opening/closing
// bracket, comma, newline). The lowering fully unrolls a vector		// bracket, comma, newline). The lowering fully unrolls a vector
▲ Show 20 Lines • Show All 241 Lines • ▼ Show 20 Lines	matchAndRewrite(SplatOp splatOp, OpAdaptor adaptor,
rewriter.replaceOp(splatOp, desc);		rewriter.replaceOp(splatOp, desc);
return success();		return success();
}		}
};		};

} // namespace		} // namespace

/// Populate the given list with patterns that convert from Vector to LLVM.		/// Populate the given list with patterns that convert from Vector to LLVM.
void mlir::populateVectorToLLVMConversionPatterns(		void mlir::populateVectorToLLVMConversionPatterns(LLVMTypeConverter &converter,
LLVMTypeConverter &converter, RewritePatternSet &patterns,		RewritePatternSet &patterns,
bool reassociateFPReductions) {		bool reassociateFPReductions,
		bool indexOptimizations) {
MLIRContext *ctx = converter.getDialect()->getContext();		MLIRContext *ctx = converter.getDialect()->getContext();
patterns.add<VectorFMAOpNDRewritePattern>(ctx);		patterns.add<VectorFMAOpNDRewritePattern>(ctx);
populateVectorInsertExtractStridedSliceTransforms(patterns);		populateVectorInsertExtractStridedSliceTransforms(patterns);
patterns.add<VectorReductionOpConversion>(converter, reassociateFPReductions);		patterns.add<VectorReductionOpConversion>(converter, reassociateFPReductions);
		patterns.add<VectorCreateMaskOpRewritePattern>(ctx, indexOptimizations);
patterns		patterns
.add<VectorBitCastOpConversion, VectorShuffleOpConversion,		.add<VectorBitCastOpConversion, VectorShuffleOpConversion,
VectorExtractElementOpConversion, VectorExtractOpConversion,		VectorExtractElementOpConversion, VectorExtractOpConversion,
VectorFMAOp1DConversion, VectorInsertElementOpConversion,		VectorFMAOp1DConversion, VectorInsertElementOpConversion,
VectorInsertOpConversion, VectorPrintOpConversion,		VectorInsertOpConversion, VectorPrintOpConversion,
VectorTypeCastOpConversion, VectorScaleOpConversion,		VectorTypeCastOpConversion, VectorScaleOpConversion,
VectorLoadStoreConversion<vector::LoadOp, vector::LoadOpAdaptor>,		VectorLoadStoreConversion<vector::LoadOp, vector::LoadOpAdaptor>,
VectorLoadStoreConversion<vector::MaskedLoadOp,		VectorLoadStoreConversion<vector::MaskedLoadOp,
[... 16 lines not shown ...]

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp

[... 74 lines not shown ...]	void LowerVectorToLLVMPass::runOnOperation() {
}		}

// Convert to the LLVM IR dialect.		// Convert to the LLVM IR dialect.
LLVMTypeConverter converter(&getContext());		LLVMTypeConverter converter(&getContext());
RewritePatternSet patterns(&getContext());		RewritePatternSet patterns(&getContext());
populateVectorMaskMaterializationPatterns(patterns, indexOptimizations);		populateVectorMaskMaterializationPatterns(patterns, indexOptimizations);
populateVectorTransferLoweringPatterns(patterns);		populateVectorTransferLoweringPatterns(patterns);
populateVectorToLLVMMatrixConversionPatterns(converter, patterns);		populateVectorToLLVMMatrixConversionPatterns(converter, patterns);
populateVectorToLLVMConversionPatterns(converter, patterns,		populateVectorToLLVMConversionPatterns(
reassociateFPReductions);		converter, patterns, reassociateFPReductions, indexOptimizations);
populateVectorToLLVMMatrixConversionPatterns(converter, patterns);		populateVectorToLLVMMatrixConversionPatterns(converter, patterns);

// Architecture specific augmentations.		// Architecture specific augmentations.
LLVMConversionTarget target(getContext());		LLVMConversionTarget target(getContext());
target.addLegalDialect<arith::ArithmeticDialect>();		target.addLegalDialect<arith::ArithmeticDialect>();
target.addLegalDialect<memref::MemRefDialect>();		target.addLegalDialect<memref::MemRefDialect>();
target.addLegalOp<UnrealizedConversionCastOp>();		target.addLegalOp<UnrealizedConversionCastOp>();
if (armNeon) {		if (armNeon) {
[... 27 lines not shown ...]

mlir/lib/Dialect/Arithmetic/Utils/Utils.cpp

[... 53 lines not shown ...]	Value mlir::getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,
OpFoldResult ofr) {		OpFoldResult ofr) {
if (auto value = ofr.dyn_cast<Value>())		if (auto value = ofr.dyn_cast<Value>())
return value;		return value;
auto attr = ofr.dyn_cast<Attribute>().dyn_cast<IntegerAttr>();		auto attr = ofr.dyn_cast<Attribute>().dyn_cast<IntegerAttr>();
assert(attr && "expect the op fold result casts to an integer attribute");		assert(attr && "expect the op fold result casts to an integer attribute");
return b.create<arith::ConstantIndexOp>(loc, attr.getValue().getSExtValue());		return b.create<arith::ConstantIndexOp>(loc, attr.getValue().getSExtValue());
}		}

		Value mlir::getValueOrCreateCastToIndexLike(OpBuilder &b, Location loc,
		Type targetType, Value value) {
		if (targetType == value.getType())
		return value;

		bool targetIsIndex = targetType.isIndex();
		bool valueIsIndex = value.getType().isIndex();
		if (targetIsIndex ^ valueIsIndex)
		return b.create<arith::IndexCastOp>(loc, targetType, value);

		auto targetIntegerType = targetType.dyn_cast<IntegerType>();
		auto valueIntegerType = value.getType().dyn_cast<IntegerType>();
		assert(targetIntegerType && valueIntegerType &&
		"unexpected cast between types other than integers and index");
		assert(targetIntegerType.getSignedness() == valueIntegerType.getSignedness());

		if (targetIntegerType.getWidth() > valueIntegerType.getWidth())
		return b.create<arith::ExtSIOp>(loc, targetIntegerType, value);
		return b.create<arith::TruncIOp>(loc, targetIntegerType, value);
		}

SmallVector<Value>		SmallVector<Value>
mlir::getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,		mlir::getValueOrCreateConstantIndexOp(OpBuilder &b, Location loc,
ArrayRef<OpFoldResult> valueOrAttrVec) {		ArrayRef<OpFoldResult> valueOrAttrVec) {
return llvm::to_vector<4>(		return llvm::to_vector<4>(
llvm::map_range(valueOrAttrVec, [&](OpFoldResult value) -> Value {		llvm::map_range(valueOrAttrVec, [&](OpFoldResult value) -> Value {
return getValueOrCreateConstantIndexOp(b, loc, value);		return getValueOrCreateConstantIndexOp(b, loc, value);
}));		}));
}		}
[... 27 lines not shown ...]

mlir/lib/Dialect/Vector/IR/VectorOps.cpp

[... 4,226 lines not shown ...]	LogicalResult ConstantMaskOp::verify() {
}		}
// Verify that if one mask dim size is zero, they all should be zero (because		// Verify that if one mask dim size is zero, they all should be zero (because
// the mask region is a conjunction of each mask dimension interval).		// the mask region is a conjunction of each mask dimension interval).
bool anyZeros = llvm::is_contained(maskDimSizes, 0);		bool anyZeros = llvm::is_contained(maskDimSizes, 0);
bool allZeros = llvm::all_of(maskDimSizes, [](int64_t s) { return s == 0; });		bool allZeros = llvm::all_of(maskDimSizes, [](int64_t s) { return s == 0; });
if (anyZeros && !allZeros)		if (anyZeros && !allZeros)
return emitOpError("expected all mask dim sizes to be zeros, "		return emitOpError("expected all mask dim sizes to be zeros, "
"as a result of conjunction with zero mask dim");		"as a result of conjunction with zero mask dim");
		// Verify that if the mask type is scalable, dimensions should be zero because
		// constant scalable masks can only be defined for the "none set" or "all set"
		// cases, and there is no VLA way to define an "all set" case for
		// `vector.constant_mask`. In the future, a convention could be established
		// to decide if a specific dimension value could be considered as "all set".
		if (resultType.isScalable() &&
		mask_dim_sizes()[0].cast<IntegerAttr>().getInt() != 0)
		return emitOpError("expected mask dim sizes for scalable masks to be 0");
		c-rhodes: `== 0`? Or `to be 0`? https://mlir.llvm.org/docs/Dialects/Vector/#vectorconstant_mask-mlirvectorconstantmaskop says: "Each value of 'mask_dim_sizes' must be non-negative and not greater than the size of the corresponding vector dimension (as opposed to vector.create_mask which allows this)."
return success();		return success();
}		}
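
The verifier rules in this hunk can be summarized in a standalone sketch. This is plain C++ rather than the MLIR API: `MaskCheck`, `verifyConstantMask`, and the vector-of-integers representation are hypothetical names chosen for illustration, and the sketch assumes rank >= 1 (the 0-D case is not modeled). The bounds rule is the one quoted from the `vector.constant_mask` documentation in the inline comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result codes; the real verifier emits op errors instead.
enum class MaskCheck { Ok, OutOfBounds, MixedZeros, ScalableNonZero };

// Illustrative model of the checks, assuming one mask size per dimension.
MaskCheck verifyConstantMask(const std::vector<int64_t> &maskDimSizes,
                             const std::vector<int64_t> &shape,
                             bool isScalable) {
  assert(!maskDimSizes.empty() && maskDimSizes.size() == shape.size());
  bool anyZeros = false;
  bool allZeros = true;
  for (std::size_t i = 0; i < maskDimSizes.size(); ++i) {
    // Each size must be non-negative and no larger than the dimension.
    if (maskDimSizes[i] < 0 || maskDimSizes[i] > shape[i])
      return MaskCheck::OutOfBounds;
    anyZeros = anyZeros || maskDimSizes[i] == 0;
    allZeros = allZeros && maskDimSizes[i] == 0;
  }
  // The mask region is a conjunction, so one zero forces all zeros.
  if (anyZeros && !allZeros)
    return MaskCheck::MixedZeros;
  // Scalable masks: only the "none set" case is accepted by this patch.
  if (isScalable && maskDimSizes[0] != 0)
    return MaskCheck::ScalableNonZero;
  return MaskCheck::Ok;
}
```

Under these assumptions, `[2]` on `vector<[8]xi1>` is rejected while `[0]` is accepted, matching the new case added to `invalid.mlir`.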

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// CreateMaskOp		// CreateMaskOp
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

LogicalResult CreateMaskOp::verify() {		LogicalResult CreateMaskOp::verify() {
[... 21 lines not shown ...]	public:
LogicalResult matchAndRewrite(CreateMaskOp createMaskOp,		LogicalResult matchAndRewrite(CreateMaskOp createMaskOp,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
// Return if any of 'createMaskOp' operands are not defined by a constant.		// Return if any of 'createMaskOp' operands are not defined by a constant.
auto isNotDefByConstant = [](Value operand) {		auto isNotDefByConstant = [](Value operand) {
return !isa_and_nonnull<arith::ConstantIndexOp>(operand.getDefiningOp());		return !isa_and_nonnull<arith::ConstantIndexOp>(operand.getDefiningOp());
};		};
if (llvm::any_of(createMaskOp.operands(), isNotDefByConstant))		if (llvm::any_of(createMaskOp.operands(), isNotDefByConstant))
return failure();		return failure();

		// CreateMaskOp for scalable vectors can be folded only if all dimensions
		// are negative or zero.
		if (auto vType = createMaskOp.getType().dyn_cast<VectorType>()) {
		if (vType.isScalable())
		for (auto opDim : createMaskOp.getOperands()) {
		APInt intVal;
		if (matchPattern(opDim, m_ConstantInt(&intVal)) &&
		ftynse: Use `matchPattern(m_ConstantInt(...))` instead of explicitly matching for `arith.constant` here.
		intVal.isStrictlyPositive())
		return failure();
		}
		}

// Gather constant mask dimension sizes.		// Gather constant mask dimension sizes.
SmallVector<int64_t, 4> maskDimSizes;		SmallVector<int64_t, 4> maskDimSizes;
for (auto it : llvm::zip(createMaskOp.operands(),		for (auto it : llvm::zip(createMaskOp.operands(),
createMaskOp.getType().getShape())) {		createMaskOp.getType().getShape())) {
auto *defOp = std::get<0>(it).getDefiningOp();		auto *defOp = std::get<0>(it).getDefiningOp();
int64_t maxDimSize = std::get<1>(it);		int64_t maxDimSize = std::get<1>(it);
int64_t dimSize = cast<arith::ConstantIndexOp>(defOp).value();		int64_t dimSize = cast<arith::ConstantIndexOp>(defOp).value();
dimSize = std::min(dimSize, maxDimSize);		dimSize = std::min(dimSize, maxDimSize);
[... 88 lines not shown ...]

mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp

[... 10 lines not shown ...]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/Dialect/Vector/Transforms/VectorTransforms.h"		#include "mlir/Dialect/Vector/Transforms/VectorTransforms.h"

#include <type_traits>		#include <type_traits>

#include "mlir/Dialect/Affine/IR/AffineOps.h"		#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"		#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
		#include "mlir/Dialect/Arithmetic/Utils/Utils.h"
		#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/Dialect/Linalg/IR/Linalg.h"		#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h"		#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/SCF/SCF.h"		#include "mlir/Dialect/SCF/SCF.h"
#include "mlir/Dialect/Utils/IndexingUtils.h"		#include "mlir/Dialect/Utils/IndexingUtils.h"
#include "mlir/Dialect/Utils/StructuredOpsUtils.h"		#include "mlir/Dialect/Utils/StructuredOpsUtils.h"
#include "mlir/Dialect/Vector/Utils/VectorUtils.h"		#include "mlir/Dialect/Vector/Utils/VectorUtils.h"
#include "mlir/IR/ImplicitLocOpBuilder.h"		#include "mlir/IR/ImplicitLocOpBuilder.h"
#include "mlir/IR/Matchers.h"		#include "mlir/IR/Matchers.h"
[... 570 lines not shown ...]	if (rank == 0) {
rewriter.replaceOpWithNewOp<arith::ConstantOp>(		rewriter.replaceOpWithNewOp<arith::ConstantOp>(
op, dstType,		op, dstType,
DenseIntElementsAttr::get(		DenseIntElementsAttr::get(
VectorType::get(ArrayRef<int64_t>{}, rewriter.getI1Type()),		VectorType::get(ArrayRef<int64_t>{}, rewriter.getI1Type()),
ArrayRef<bool>{value}));		ArrayRef<bool>{value}));
return success();		return success();
}		}

		// Scalable constant masks can only be lowered for the "none set" case.
		aartbik: period at end
		if (dstType.cast<VectorType>().isScalable()) {
		rewriter.replaceOpWithNewOp<arith::ConstantOp>(
		op, DenseElementsAttr::get(dstType, false));
		return success();
		}

int64_t trueDim = std::min(dstType.getDimSize(0),		int64_t trueDim = std::min(dstType.getDimSize(0),
dimSizes[0].cast<IntegerAttr>().getInt());		dimSizes[0].cast<IntegerAttr>().getInt());

if (rank == 1) {		if (rank == 1) {
// Express constant 1-D case in explicit vector form:		// Express constant 1-D case in explicit vector form:
// [T,..,T,F,..,F].		// [T,..,T,F,..,F].
SmallVector<bool, 4> values(dstType.getDimSize(0));		SmallVector<bool, 4> values(dstType.getDimSize(0));
for (int64_t d = 0; d < trueDim; d++)		for (int64_t d = 0; d < trueDim; d++)
[... 1,543 lines not shown ...]	LogicalResult matchAndRewrite(vector::BitCastOp bitcastOp,
rewriter.replaceOpWithNewOp<vector::InsertStridedSliceOp>(		rewriter.replaceOpWithNewOp<vector::InsertStridedSliceOp>(
bitcastOp, bitcastOp.getType(), newCastSrcOp, newCastDstOp, newOffsets,		bitcastOp, bitcastOp.getType(), newCastSrcOp, newCastDstOp, newOffsets,
insertOp.strides());		insertOp.strides());

return success();		return success();
}		}
};		};

static Value createCastToIndexLike(PatternRewriter &rewriter, Location loc,
Type targetType, Value value) {
if (targetType == value.getType())
return value;

bool targetIsIndex = targetType.isIndex();
bool valueIsIndex = value.getType().isIndex();
if (targetIsIndex ^ valueIsIndex)
return rewriter.create<arith::IndexCastOp>(loc, targetType, value);

auto targetIntegerType = targetType.dyn_cast<IntegerType>();
auto valueIntegerType = value.getType().dyn_cast<IntegerType>();
assert(targetIntegerType && valueIntegerType &&
"unexpected cast between types other than integers and index");
assert(targetIntegerType.getSignedness() == valueIntegerType.getSignedness());

if (targetIntegerType.getWidth() > valueIntegerType.getWidth())
return rewriter.create<arith::ExtSIOp>(loc, targetIntegerType, value);
return rewriter.create<arith::TruncIOp>(loc, targetIntegerType, value);
}

// Helper that returns a vector comparison that constructs a mask:		// Helper that returns a vector comparison that constructs a mask:
// mask = [0,1,..,n-1] + [o,o,..,o] < [b,b,..,b]		// mask = [0,1,..,n-1] + [o,o,..,o] < [b,b,..,b]
//		//
// If `dim == 0` then the result will be a 0-D vector.		// If `dim == 0` then the result will be a 0-D vector.
//		//
// NOTE: The LLVM::GetActiveLaneMaskOp intrinsic would provide an alternative,		// NOTE: The LLVM::GetActiveLaneMaskOp intrinsic would provide an alternative,
// much more compact, IR for this operation, but LLVM eventually		// much more compact, IR for this operation, but LLVM eventually
// generates more elaborate instructions for this intrinsic since it		// generates more elaborate instructions for this intrinsic since it
[... 19 lines not shown ...]	indicesAttr = rewriter.getI32VectorAttr(
llvm::to_vector<4>(llvm::seq<int32_t>(0, dim)));		llvm::to_vector<4>(llvm::seq<int32_t>(0, dim)));
} else {		} else {
indicesAttr = rewriter.getI64VectorAttr(		indicesAttr = rewriter.getI64VectorAttr(
llvm::to_vector<4>(llvm::seq<int64_t>(0, dim)));		llvm::to_vector<4>(llvm::seq<int64_t>(0, dim)));
}		}
Value indices = rewriter.create<arith::ConstantOp>(loc, indicesAttr);		Value indices = rewriter.create<arith::ConstantOp>(loc, indicesAttr);
// Add in an offset if requested.		// Add in an offset if requested.
if (off) {		if (off) {
Value o = createCastToIndexLike(rewriter, loc, idxType, *off);		Value o = getValueOrCreateCastToIndexLike(rewriter, loc, idxType, *off);
Value ov = rewriter.create<vector::SplatOp>(loc, indices.getType(), o);		Value ov = rewriter.create<vector::SplatOp>(loc, indices.getType(), o);
indices = rewriter.create<arith::AddIOp>(loc, ov, indices);		indices = rewriter.create<arith::AddIOp>(loc, ov, indices);
}		}
// Construct the vector comparison.		// Construct the vector comparison.
Value bound = createCastToIndexLike(rewriter, loc, idxType, b);		Value bound = getValueOrCreateCastToIndexLike(rewriter, loc, idxType, b);
Value bounds =		Value bounds =
rewriter.create<vector::SplatOp>(loc, indices.getType(), bound);		rewriter.create<vector::SplatOp>(loc, indices.getType(), bound);
return rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, indices,		return rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, indices,
bounds);		bounds);
}		}
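
As a scalar illustration of the helper's `[0,1,..,n-1] + [o,..,o] < [b,..,b]` comparison and of the `slt` versus `ult` discussion in the review, here is a minimal sketch in plain C++. It does not use the MLIR API; `makeMaskSigned` and `makeMaskUnsigned` are hypothetical names, and 64-bit indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane i is active when (offset + i) < bound, compared as signed
// integers. This models the arith.cmpi slt used by the lowering.
std::vector<bool> makeMaskSigned(int64_t lanes, int64_t offset,
                                 int64_t bound) {
  assert(lanes >= 0);
  std::vector<bool> mask(static_cast<std::size_t>(lanes));
  for (int64_t i = 0; i < lanes; ++i)
    mask[static_cast<std::size_t>(i)] = (offset + i) < bound;
  return mask;
}

// Same computation with an unsigned comparison (models arith.cmpi ult).
// A negative bound reinterpreted as unsigned becomes a huge value.
std::vector<bool> makeMaskUnsigned(int64_t lanes, int64_t offset,
                                   int64_t bound) {
  assert(lanes >= 0);
  std::vector<bool> mask(static_cast<std::size_t>(lanes));
  for (int64_t i = 0; i < lanes; ++i)
    mask[static_cast<std::size_t>(i)] =
        static_cast<uint64_t>(offset + i) < static_cast<uint64_t>(bound);
  return mask;
}
```

With a bound of -1, the signed version yields an all-false mask, whereas the unsigned version yields all-true because -1 wraps to the largest unsigned value. That is the behavior the signed predicate is meant to avoid.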

template <typename ConcreteOp>		template <typename ConcreteOp>
struct MaterializeTransferMask : public OpRewritePattern<ConcreteOp> {		struct MaterializeTransferMask : public OpRewritePattern<ConcreteOp> {
[... 53 lines not shown ...]	public:
explicit VectorCreateMaskOpConversion(MLIRContext *context,		explicit VectorCreateMaskOpConversion(MLIRContext *context,
bool enableIndexOpt)		bool enableIndexOpt)
: mlir::OpRewritePattern<vector::CreateMaskOp>(context),		: mlir::OpRewritePattern<vector::CreateMaskOp>(context),
indexOptimizations(enableIndexOpt) {}		indexOptimizations(enableIndexOpt) {}

LogicalResult matchAndRewrite(vector::CreateMaskOp op,		LogicalResult matchAndRewrite(vector::CreateMaskOp op,
PatternRewriter &rewriter) const override {		PatternRewriter &rewriter) const override {
auto dstType = op.getType();		auto dstType = op.getType();
		if (dstType.cast<VectorType>().isScalable())
		return failure();
int64_t rank = dstType.getRank();		int64_t rank = dstType.getRank();
if (rank > 1)		if (rank > 1)
return failure();		return failure();
rewriter.replaceOp(		rewriter.replaceOp(
op, buildVectorComparison(rewriter, op, indexOptimizations,		op, buildVectorComparison(rewriter, op, indexOptimizations,
rank == 0 ? 0 : dstType.getDimSize(0),		rank == 0 ? 0 : dstType.getDimSize(0),
op.getOperand(0)));		op.getOperand(0)));
return success();		return success();
[... 375 lines not shown ...]

mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir

	[... 18 lines not shown ...]
	// CMP64: %[[T4:.*]] = arith.cmpi slt, %[[T0]], %[[T3]] : vector<11xi64>			// CMP64: %[[T4:.*]] = arith.cmpi slt, %[[T0]], %[[T3]] : vector<11xi64>
	// CMP64: return %[[T4]] : vector<11xi1>			// CMP64: return %[[T4]] : vector<11xi1>

	func @genbool_var_1d(%arg0: index) -> vector<11xi1> {			func @genbool_var_1d(%arg0: index) -> vector<11xi1> {
	%0 = vector.create_mask %arg0 : vector<11xi1>			%0 = vector.create_mask %arg0 : vector<11xi1>
	return %0 : vector<11xi1>			return %0 : vector<11xi1>
	}			}

				// CMP32-LABEL: @genbool_var_1d_scalable(
				// CMP32-SAME: %[[ARG:.*]]: index)
				// CMP32: %[[T0:.*]] = llvm.intr.experimental.stepvector : vector<[11]xi32>
				// CMP32: %[[T1:.*]] = arith.index_cast %[[ARG]] : index to i32
				// CMP32: %[[T2:.*]] = llvm.insertelement %[[T1]], %{{.*}}[%{{.*}} : i32] : vector<[11]xi32>
				// CMP32: %[[T3:.*]] = llvm.shufflevector %[[T2]], %{{.*}} [0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<[11]xi32>, vector<[11]xi32>
				// CMP32: %[[T4:.*]] = arith.cmpi slt, %[[T0]], %[[T3]] : vector<[11]xi32>
				// CMP32: return %[[T4]] : vector<[11]xi1>

				// CMP64-LABEL: @genbool_var_1d_scalable(
				// CMP64-SAME: %[[ARG:.*]]: index)
				// CMP64: %[[T0:.*]] = llvm.intr.experimental.stepvector : vector<[11]xi64>
				// CMP64: %[[T1:.*]] = arith.index_cast %[[ARG]] : index to i64
				// CMP64: %[[T2:.*]] = llvm.insertelement %[[T1]], %{{.*}}[%{{.*}} : i32] : vector<[11]xi64>
				// CMP64: %[[T3:.*]] = llvm.shufflevector %[[T2]], %{{.*}} [0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<[11]xi64>, vector<[11]xi64>
				// CMP64: %[[T4:.*]] = arith.cmpi slt, %[[T0]], %[[T3]] : vector<[11]xi64>
				// CMP64: return %[[T4]] : vector<[11]xi1>

				func @genbool_var_1d_scalable(%arg0: index) -> vector<[11]xi1> {
				%0 = vector.create_mask %arg0 : vector<[11]xi1>
				return %0 : vector<[11]xi1>
				}

	// CMP32-LABEL: @transfer_read_1d			// CMP32-LABEL: @transfer_read_1d
	// CMP32: %[[MEM:.*]]: memref<?xf32>, %[[OFF:.*]]: index) -> vector<16xf32> {			// CMP32: %[[MEM:.*]]: memref<?xf32>, %[[OFF:.*]]: index) -> vector<16xf32> {
	// CMP32: %[[D:.*]] = memref.dim %[[MEM]], %{{.*}} : memref<?xf32>			// CMP32: %[[D:.*]] = memref.dim %[[MEM]], %{{.*}} : memref<?xf32>
	// CMP32: %[[S:.*]] = arith.subi %[[D]], %[[OFF]] : index			// CMP32: %[[S:.*]] = arith.subi %[[D]], %[[OFF]] : index
	// CMP32: %[[C:.*]] = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : vector<16xi32>			// CMP32: %[[C:.*]] = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : vector<16xi32>
	// CMP32: %[[B:.*]] = arith.index_cast %[[S]] : index to i32			// CMP32: %[[B:.*]] = arith.index_cast %[[S]] : index to i32
	// CMP32: %[[B0:.*]] = llvm.insertelement %[[B]], %{{.*}} : vector<16xi32>			// CMP32: %[[B0:.*]] = llvm.insertelement %[[B]], %{{.*}} : vector<16xi32>
	// CMP32: %[[BV:.*]] = llvm.shufflevector %[[B0]], {{.*}} : vector<16xi32>, vector<16xi32>			// CMP32: %[[BV:.*]] = llvm.shufflevector %[[B0]], {{.*}} : vector<16xi32>, vector<16xi32>
	[... 21 lines not shown ...]

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir

[... 1,453 lines not shown ...]	func @genbool_1d() -> vector<8xi1> {
return %0 : vector<8xi1>		return %0 : vector<8xi1>
}		}
// CHECK-LABEL: func @genbool_1d		// CHECK-LABEL: func @genbool_1d
// CHECK: %[[VAL_0:.*]] = arith.constant dense<[true, true, true, true, false, false, false, false]> : vector<8xi1>		// CHECK: %[[VAL_0:.*]] = arith.constant dense<[true, true, true, true, false, false, false, false]> : vector<8xi1>
// CHECK: return %[[VAL_0]] : vector<8xi1>		// CHECK: return %[[VAL_0]] : vector<8xi1>

// -----		// -----

		func @genbool_1d_scalable() -> vector<[8]xi1> {
		%0 = vector.constant_mask [0] : vector<[8]xi1>
		return %0 : vector<[8]xi1>
		}
		// CHECK-LABEL: func @genbool_1d_scalable
		aartbik: Although this will match, genbool_1d_scalable is a better LABEL!
		// CHECK: %[[VAL_0:.*]] = arith.constant dense<false> : vector<[8]xi1>
		// CHECK: return %[[VAL_0]] : vector<[8]xi1>

		// -----

func @genbool_2d() -> vector<4x4xi1> {		func @genbool_2d() -> vector<4x4xi1> {
%v = vector.constant_mask [2, 2] : vector<4x4xi1>		%v = vector.constant_mask [2, 2] : vector<4x4xi1>
return %v: vector<4x4xi1>		return %v: vector<4x4xi1>
}		}

// CHECK-LABEL: func @genbool_2d		// CHECK-LABEL: func @genbool_2d
// CHECK: %[[VAL_0:.*]] = arith.constant dense<[true, true, false, false]> : vector<4xi1>		// CHECK: %[[VAL_0:.*]] = arith.constant dense<[true, true, false, false]> : vector<4xi1>
// CHECK: %[[VAL_1:.*]] = arith.constant dense<false> : vector<4x4xi1>		// CHECK: %[[VAL_1:.*]] = arith.constant dense<false> : vector<4x4xi1>
[... 30 lines not shown ...]
// CHECK-SAME: %[[arg:.*]]: index		// CHECK-SAME: %[[arg:.*]]: index
// CHECK: %[[indices:.*]] = arith.constant dense<[0, 1, 2, 3]> : vector<4xi32>		// CHECK: %[[indices:.*]] = arith.constant dense<[0, 1, 2, 3]> : vector<4xi32>
// CHECK: %[[arg_i32:.*]] = arith.index_cast %[[arg]] : index to i32		// CHECK: %[[arg_i32:.*]] = arith.index_cast %[[arg]] : index to i32
// CHECK: %[[boundsInsert:.*]] = llvm.insertelement %[[arg_i32]]		// CHECK: %[[boundsInsert:.*]] = llvm.insertelement %[[arg_i32]]
// CHECK: %[[bounds:.*]] = llvm.shufflevector %[[boundsInsert]]		// CHECK: %[[bounds:.*]] = llvm.shufflevector %[[boundsInsert]]
// CHECK: %[[result:.*]] = arith.cmpi slt, %[[indices]], %[[bounds]] : vector<4xi32>		// CHECK: %[[result:.*]] = arith.cmpi slt, %[[indices]], %[[bounds]] : vector<4xi32>
// CHECK: return %[[result]] : vector<4xi1>		// CHECK: return %[[result]] : vector<4xi1>

		func @create_mask_1d_scalable(%a : index) -> vector<[4]xi1> {
		%v = vector.create_mask %a : vector<[4]xi1>
		return %v: vector<[4]xi1>
		}

		// CHECK-LABEL: func @create_mask_1d_scalable
		// CHECK-SAME: %[[arg:.*]]: index
		// CHECK: %[[indices:.*]] = llvm.intr.experimental.stepvector : vector<[4]xi32>
		// CHECK: %[[arg_i32:.*]] = arith.index_cast %[[arg]] : index to i32
		// CHECK: %[[boundsInsert:.*]] = llvm.insertelement %[[arg_i32]], {{.*}} : vector<[4]xi32>
		// CHECK: %[[bounds:.*]] = llvm.shufflevector %[[boundsInsert]], {{.*}} : vector<[4]xi32>, vector<[4]xi32>
		// CHECK: %[[result:.*]] = arith.cmpi slt, %[[indices]], %[[bounds]] : vector<[4]xi32>
		// CHECK: return %[[result]] : vector<[4]xi1>

// -----		// -----

func @flat_transpose(%arg0: vector<16xf32>) -> vector<16xf32> {		func @flat_transpose(%arg0: vector<16xf32>) -> vector<16xf32> {
%0 = vector.flat_transpose %arg0 { rows = 4: i32, columns = 4: i32 }		%0 = vector.flat_transpose %arg0 { rows = 4: i32, columns = 4: i32 }
: vector<16xf32> -> vector<16xf32>		: vector<16xf32> -> vector<16xf32>
return %0 : vector<16xf32>		return %0 : vector<16xf32>
}		}

[... 282 lines not shown ...]

mlir/test/Dialect/Vector/canonicalize.mlir

	// RUN: mlir-opt %s -pass-pipeline='func.func(canonicalize)' -split-input-file -allow-unregistered-dialect \| FileCheck %s			// RUN: mlir-opt %s -pass-pipeline='func.func(canonicalize)' -split-input-file -allow-unregistered-dialect \| FileCheck %s

	// -----			// -----

	// CHECK-LABEL: create_vector_mask_to_constant_mask			// CHECK-LABEL: create_vector_mask_to_constant_mask
	func @create_vector_mask_to_constant_mask() -> (vector<4x3xi1>) {			func @create_vector_mask_to_constant_mask() -> (vector<4x3xi1>) {
	%c2 = arith.constant 2 : index			%c2 = arith.constant 2 : index
	%c3 = arith.constant 3 : index			%c3 = arith.constant 3 : index
	// CHECK: vector.constant_mask [3, 2] : vector<4x3xi1>			// CHECK: vector.constant_mask [3, 2] : vector<4x3xi1>
	%0 = vector.create_mask %c3, %c2 : vector<4x3xi1>			%0 = vector.create_mask %c3, %c2 : vector<4x3xi1>
	return %0 : vector<4x3xi1>			return %0 : vector<4x3xi1>
	}			}

	// -----			// -----

				// CHECK-LABEL: create_scalable_vector_mask_to_constant_mask
				func @create_scalable_vector_mask_to_constant_mask() -> (vector<[8]xi1>) {
				%c-1 = arith.constant -1 : index
				// CHECK: vector.constant_mask [0] : vector<[8]xi1>
				%0 = vector.create_mask %c-1 : vector<[8]xi1>
				return %0 : vector<[8]xi1>
				}
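
The folding rule exercised by this test can be sketched outside MLIR. The following is a hedged model in plain C++: `foldCreateMask` is a hypothetical name, constants are passed as plain integers, and the step that zeroes every dimension when any clamped size is non-positive is an assumption that is consistent with the test above and with the `constant_mask` verifier, not a copy of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true and fills maskDimSizes when create_mask with constant
// operands can be rewritten to constant_mask; returns false otherwise.
bool foldCreateMask(const std::vector<int64_t> &operands,
                    const std::vector<int64_t> &shape, bool isScalable,
                    std::vector<int64_t> &maskDimSizes) {
  assert(operands.size() == shape.size());
  // Scalable masks fold only when every operand is negative or zero,
  // since only the "none set" constant mask is defined for them.
  if (isScalable)
    for (int64_t v : operands)
      if (v > 0)
        return false;
  maskDimSizes.clear();
  bool anyNonPositive = false;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    // Truncate to the dimension size, as in the existing pattern.
    int64_t dimSize = std::min(operands[i], shape[i]);
    anyNonPositive = anyNonPositive || dimSize <= 0;
    maskDimSizes.push_back(dimSize);
  }
  // Assumption: a non-positive size in any dimension empties the mask.
  if (anyNonPositive)
    maskDimSizes.assign(operands.size(), 0);
  return true;
}
```

In this model a constant `-1` on `vector<[8]xi1>` folds to `[0]`, while a positive constant on a scalable type is left as `vector.create_mask` for the lowering to handle.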

				// -----

	// CHECK-LABEL: create_vector_mask_to_constant_mask_truncation			// CHECK-LABEL: create_vector_mask_to_constant_mask_truncation
	func @create_vector_mask_to_constant_mask_truncation() -> (vector<4x3xi1>) {			func @create_vector_mask_to_constant_mask_truncation() -> (vector<4x3xi1>) {
	%c2 = arith.constant 2 : index			%c2 = arith.constant 2 : index
	%c5 = arith.constant 5 : index			%c5 = arith.constant 5 : index
	// CHECK: vector.constant_mask [4, 2] : vector<4x3xi1>			// CHECK: vector.constant_mask [4, 2] : vector<4x3xi1>
	%0 = vector.create_mask %c5, %c2 : vector<4x3xi1>			%0 = vector.create_mask %c5, %c2 : vector<4x3xi1>
	return %0 : vector<4x3xi1>			return %0 : vector<4x3xi1>
	}			}
	[... 1,245 lines not shown ...]

mlir/test/Dialect/Vector/invalid.mlir

	[... 938 lines not shown ...]

	func @constant_mask_with_zero_mask_dim_size() {			func @constant_mask_with_zero_mask_dim_size() {
	// expected-error@+1 {{expected all mask dim sizes to be zeros, as a result of conjunction with zero mask dim}}			// expected-error@+1 {{expected all mask dim sizes to be zeros, as a result of conjunction with zero mask dim}}
	%0 = vector.constant_mask [0, 2] : vector<4x3xi1>			%0 = vector.constant_mask [0, 2] : vector<4x3xi1>
	}			}

	// -----			// -----

				func @constant_mask_scalable_non_zero_dim_size() {
				// expected-error@+1 {{expected mask dim sizes for scalable masks to be 0}}
				%0 = vector.constant_mask [2] : vector<[8]xi1>
				}

				// -----

	func @print_no_result(%arg0 : f32) -> i32 {			func @print_no_result(%arg0 : f32) -> i32 {
	// expected-error@+1 {{cannot name an operation with no results}}			// expected-error@+1 {{cannot name an operation with no results}}
	%0 = vector.print %arg0 : f32			%0 = vector.print %arg0 : f32
	}			}

	// -----			// -----

	func @reshape_bad_input_shape(%arg0 : vector<3x2x4xf32>) {			func @reshape_bad_input_shape(%arg0 : vector<3x2x4xf32>) {
	[... 569 lines not shown ...]

mlir/test/Dialect/Vector/ops.mlir

	[... 366 lines not shown ...]
	// CHECK-LABEL: @create_vector_mask			// CHECK-LABEL: @create_vector_mask
	func @create_vector_mask() {			func @create_vector_mask() {
	// CHECK: %[[C2:.*]] = arith.constant 2 : index			// CHECK: %[[C2:.*]] = arith.constant 2 : index
	%c2 = arith.constant 2 : index			%c2 = arith.constant 2 : index
	// CHECK-NEXT: %[[C3:.*]] = arith.constant 3 : index			// CHECK-NEXT: %[[C3:.*]] = arith.constant 3 : index
	%c3 = arith.constant 3 : index			%c3 = arith.constant 3 : index
	// CHECK-NEXT: vector.create_mask %[[C3]], %[[C2]] : vector<4x3xi1>			// CHECK-NEXT: vector.create_mask %[[C3]], %[[C2]] : vector<4x3xi1>
	%0 = vector.create_mask %c3, %c2 : vector<4x3xi1>			%0 = vector.create_mask %c3, %c2 : vector<4x3xi1>

	c-rhodes: nit: unrelated change
	return			return
	}			}

	// CHECK-LABEL: @constant_vector_mask_0d			// CHECK-LABEL: @constant_vector_mask_0d
	func @constant_vector_mask_0d() {			func @constant_vector_mask_0d() {
	// CHECK: vector.constant_mask [0] : vector<i1>			// CHECK: vector.constant_mask [0] : vector<i1>
	%0 = vector.constant_mask [0] : vector<i1>			%0 = vector.constant_mask [0] : vector<i1>
	// CHECK: vector.constant_mask [1] : vector<i1>			// CHECK: vector.constant_mask [1] : vector<i1>
	%1 = vector.constant_mask [1] : vector<i1>			%1 = vector.constant_mask [1] : vector<i1>
	return			return
	}			}

	// CHECK-LABEL: @constant_vector_mask			// CHECK-LABEL: @constant_vector_mask
	func @constant_vector_mask() {			func @constant_vector_mask() {
	// CHECK: vector.constant_mask [3, 2] : vector<4x3xi1>			// CHECK: vector.constant_mask [3, 2] : vector<4x3xi1>
	%0 = vector.constant_mask [3, 2] : vector<4x3xi1>			%0 = vector.constant_mask [3, 2] : vector<4x3xi1>
				// CHECK: vector.constant_mask [0] : vector<[4]xi1>
				%1 = vector.constant_mask [0] : vector<[4]xi1>
	return			return
	}			}

	// CHECK-LABEL: @vector_print			// CHECK-LABEL: @vector_print
	func @vector_print(%arg0: vector<8x4xf32>) {			func @vector_print(%arg0: vector<8x4xf32>) {
	// CHECK: vector.print %{{.*}} : vector<8x4xf32>			// CHECK: vector.print %{{.*}} : vector<8x4xf32>
	vector.print %arg0 : vector<8x4xf32>			vector.print %arg0 : vector<8x4xf32>
	return			return
	[... 346 lines not shown ...]

Revision Contents

Diff 418177
