This is an archive of the discontinued LLVM Phabricator instance.

Differential D110854

[mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic
ClosedPublic

Authored by dcaballe on Sep 30 2021, 10:08 AM.

Details

Reviewers

pifon2a
ThomasRaoux
nicolasvasilache
herhut
aartbik
pravnar
ftynse

Commits

rGeaf2588a51bf: [mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic

Summary

This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating point and signed integer types. MINU/MAXU should be introduced in the future
for unsigned integer types.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 126603
Build 183946: arc lint + arc unit

Event Timeline

dcaballe created this revision. Sep 30 2021, 10:08 AM

Herald added subscribers: wenzhicui, wrengr, Chia-hungDuan and 18 others. Sep 30 2021, 10:08 AM

dcaballe requested review of this revision. Sep 30 2021, 10:08 AM

Harbormaster completed remote builds in B126603: Diff 376264. Sep 30 2021, 10:08 AM

Herald added a project: Restricted Project. Sep 30 2021, 10:08 AM

Herald added subscribers: limo1996, stephenneuendorffer.

ThomasRaoux added inline comments. Sep 30 2021, 10:18 AM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	nit: The naming seems a bit confusing. Is there any reason why we don't want to add separate minf/maxf? It doesn't look like it would add code/complexity?
mlir/lib/Dialect/Vector/VectorTransforms.cpp
868–870	This lowering doesn't propagate NaN while the new minf op is supposed to propagate NaN. I assume we want to match the semantics of minf. (same for maxf of course)

ThomasRaoux added inline comments. Sep 30 2021, 10:24 AM

mlir/lib/Dialect/Vector/VectorTransforms.cpp
868–870	The most natural would be to lower to minf/maxf ops in my opinion.
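
To illustrate the concern raised here, the following is a minimal standalone C++ sketch, not MLIR code and not part of the patch. The helper names `selectMin` and `propagatingMin` are hypothetical. It contrasts a compare-and-select minimum with a minimum that propagates NaN, which is the behavior the reviewers attribute to minf. Signed-zero ordering is ignored for brevity.

```cpp
#include <cmath>
#include <limits>

// Compare-and-select lowering: returns `a` only when `a < b` holds.
// Every ordered comparison with NaN is false, so the result depends on
// which operand holds the NaN.
inline float selectMin(float a, float b) { return a < b ? a : b; }

// NaN-propagating minimum (hypothetical helper): if either operand is
// NaN, the result is NaN.
inline float propagatingMin(float a, float b) {
  if (std::isnan(a))
    return a;
  if (std::isnan(b))
    return b;
  return a < b ? a : b;
}
```

With a NaN in the first operand, `selectMin` returns the other operand and the NaN is lost, while `propagatingMin` returns NaN regardless of operand order.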

pifon2a added inline comments. Sep 30 2021, 1:14 PM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	yes, i think with the current maxf, maxsi, maxui ops added we can actually have 1 to 1 map.
mlir/lib/Dialect/Vector/VectorTransforms.cpp
825–827	`create<MinSIOp>`?
830	`create<MaxSIOp>`?
868–870	+1 @ThomasRaoux , `create<MinFOp>`?

dcaballe added inline comments. Sep 30 2021, 3:16 PM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	What I inferred from the current implementation, and maybe I'm totally wrong, is that the actual type is taken from the reduction/contraction op itself. I understand that's why we only have `ADD`, `MUL`, etc., and not `ADDI`/`ADDF`, `MULI`/`MULF` etc. The only thing that is missing is the signedness for those operations where it makes a difference (min/max, in this case). I thought adding MINS/MAXS for signed integer and floating point (and eventually MINU/MAXU for unsigned integers) would avoid duplicating all the I/SI/UI/F variants. If we wanted to go with all the I/SI/UI/F variants, shouldn't we do it for all the ops to be consistent? We would also need verification rules to make sure the type of the combiner matches the type of the op. Again, this sounds a bit redundant to me since the type itself is in the operation. We just need the sign information. WDYT?
mlir/lib/Dialect/Vector/VectorTransforms.cpp
868–870	Yes, that makes sense, thanks! I thought @pifon2a's patch was taking care of that already. I see now that it's only expanding the min/max ops to the compare/select implementation.

ThomasRaoux added inline comments. Sep 30 2021, 4:16 PM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	I agree, that's how it is right now; however, having MAXS map to float max seems confusing to me. Overall, even for Add/Mul the float and integer versions are separate opcodes everywhere, so we don't really save a lot by having a single enum. It doesn't make sense to have ADD and MUL for I/SI/UI as they are equivalent, so I don't think it makes sense to have different versions for those (and we don't have different opcodes for them). Updating the verifier is just a matter of updating `isSupportedCombiningKind`, so it shouldn't be too bad. What do you think?

dcaballe added inline comments. Sep 30 2021, 6:05 PM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	> It doesn't make sense to have ADD and MUL for I/SI/UI as they are equivalent, so I don't think it makes sense to have different versions for those (and we don't have different opcodes for them).
Sorry, I wasn't clear here. Not all the variants make sense for all the ops, obviously. `I` and `SI/UI` are mutually exclusive. No strong opinion, though. I can start addressing the min/max ones.

ThomasRaoux added inline comments. Sep 30 2021, 9:02 PM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	I don't have a strong opinion but my preference is to have mins/maxs/minf/maxf in this case

nicolasvasilache added inline comments.

General comment (not for this CL), I find mins maxs etc quite worse than smin, smax etc.
Any reason we want to deviate from the LLVM names (e.g. https://llvm.org/docs/LangRef.html#llvm-smin-intrinsic) ?

I see that the attr names are consistent with the MLIR names but I don't know why we deviated in the first place.

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	The LLVM Langref has smin/smax/umin/umax/fmin/fmax; any reason we want to deviate from that ? @pifon2a I realized I hadn't checked that in your original PR, but would be nice to be consistent with LLVM unless there is a strong reason not to. I'd agree that using smin/smax for float would be a further step away from LLVM so I'd prefer we'd avoid that.

pifon2a added inline comments. Oct 1 2021, 12:17 AM

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	Because in MLIR we already have "postfix" style. We have `addf` and not `fadd`. Also the PR that adds Arith dialect (https://reviews.llvm.org/D110200) has `DivSIOp` and so on. I wanted to be consistent with the style.

dcaballe planned changes to this revision. Oct 1 2021, 10:09 AM

dcaballe added inline comments.

mlir/include/mlir/Dialect/Vector/VectorOps.td
41	Yes, I think it's important to keep things consistent within MLIR. If we want to align with LLVM, it's probably better to address it separately, at a broader level. I'll go with the S/U/F approach for min/max, then.
41	> The LLVM Langref has smin/smax/umin/umax/fmin/fmax; any reason we want to deviate from that ?

Note that we have more inconsistencies than just the prefix/postfix name. LLVM has the following intrinsics for fp min/max:

Scalar/vector max/min intrinsics:
- 'llvm.maxnum': https://llvm.org/docs/LangRef.html#llvm-maxnum-intrinsic - Nothing equivalent in MLIR.
- 'llvm.minnum': https://llvm.org/docs/LangRef.html#llvm-minnum-intrinsic - Nothing equivalent in MLIR.
- 'llvm.maximum': https://llvm.org/docs/LangRef.html#llvm-maximum-intrinsic - Same as MLIR's maxf.
- 'llvm.minimum': https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic - Same as MLIR's minf.

Horizontal vector reductions:
- llvm.vector.reduce.fmax - https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmax-intrinsic - It uses 'maxnum' semantics. We can't lower MLIR `vector.reduce #maxf` to this one!
- llvm.vector.reduce.fmin - https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fmin-intrinsic - It uses 'minnum' semantics. We can't lower MLIR `vector.reduce #minf` to this one!

So, to be aligned with LLVM, our minf/maxf ops should be renamed to minimum/maximum. Note as well that, AFAICT, there are no horizontal reductions with minimum/maximum semantics in LLVM so we can't use those! If we want to be aligned with LLVM, I think we should consider renaming minf/maxf -> minimum/maximum.
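
As a rough illustration of why a 'maxnum'-style horizontal reduction cannot stand in for a 'maximum'-style one, here is a small standalone C++ sketch. It is not MLIR or LLVM code, and the function names `reduceMaxNum` and `reduceMaximum` are hypothetical. It relies on `std::fmax`, which returns the non-NaN operand when exactly one operand is NaN, as a stand-in for 'maxnum' behavior.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Horizontal max with 'maxnum'-like semantics: std::fmax drops a NaN
// operand when the other operand is a number.
inline float reduceMaxNum(const std::vector<float> &values) {
  float acc = -std::numeric_limits<float>::infinity();
  for (float x : values)
    acc = std::fmax(acc, x);
  return acc;
}

// Horizontal max with 'maximum'-like semantics (hypothetical helper):
// any NaN element makes the whole reduction NaN.
inline float reduceMaximum(const std::vector<float> &values) {
  float acc = -std::numeric_limits<float>::infinity();
  for (float x : values) {
    if (std::isnan(x))
      return x;
    if (x > acc)
      acc = x;
  }
  return acc;
}
```

For an input containing a NaN, the two reductions disagree: the first yields the largest non-NaN element and the second yields NaN. On NaN-free input they agree.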

dcaballe added a reviewer: pravnar. Oct 1 2021, 11:23 AM

Addressed feedback.
Sorry for the delay but I went into a rabbit hole. For some reason,
the existing implementation lowered min/max ops on signless integers
assuming signed integers. It took me a while to realize what was going
on...
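
The signedness issue mentioned above can be sketched in plain C++, outside of MLIR. The helpers `minAsSigned` and `minAsUnsigned` are hypothetical names. They take the same 8-bit patterns and compute a minimum under each interpretation, which is the choice a lowering has to make for a signless integer type.

```cpp
#include <algorithm>
#include <cstdint>

// Interpret both bit patterns as signed 8-bit values and take the min.
// The uint8_t -> int8_t conversion is modular on mainstream compilers
// (and guaranteed to be so since C++20).
inline std::uint8_t minAsSigned(std::uint8_t a, std::uint8_t b) {
  const auto sa = static_cast<std::int8_t>(a);
  const auto sb = static_cast<std::int8_t>(b);
  return static_cast<std::uint8_t>(std::min(sa, sb));
}

// Interpret both bit patterns as unsigned 8-bit values and take the min.
inline std::uint8_t minAsUnsigned(std::uint8_t a, std::uint8_t b) {
  return std::min(a, b);
}
```

For the patterns 0xFF and 0x01, the signed interpretation picks 0xFF (that is, -1) while the unsigned interpretation picks 0x01, so assuming signed semantics for a signless type changes the result whenever the high bit is set.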

Herald added a reviewer: ftynse. Oct 4 2021, 8:09 PM

dcaballe added inline comments. Oct 4 2021, 8:14 PM

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
494 ↗	(On Diff #377075)	This is exactly the problem that I described above. This lowering is incorrect and it won't produce the right output if the input has NaNs. We need to add a new 'vector_reduce' intrinsic to LLVM or implement our own lowering to more basic vector instructions in MLIR. I'll follow up on this when we hit this limitation.

ThomasRaoux added inline comments. Oct 4 2021, 8:16 PM

mlir/lib/Dialect/Vector/VectorOps.cpp
100	should MAXF case only return true if the elementType is float?
296	can we call `isSupportedCombiningKind` here?

Harbormaster completed remote builds in B126961: Diff 377075. Oct 4 2021, 8:25 PM

Addressed feedback.
Thanks!

mlir/lib/Dialect/Vector/VectorOps.cpp
100	Good catch, thanks!
296	much better, thanks!
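
The two suggestions above (restrict the float kinds to float element types, and share one check between verifiers) could look roughly like the following standalone sketch. This is an assumption-laden model, not the committed MLIR code: the enums `Kind` and `ElemKind`, their member names, and the function `isSupported` are all hypothetical stand-ins for `CombiningKind`, the element `Type`, and `isSupportedCombiningKind`.

```cpp
// Hypothetical stand-in for the element type classification.
enum class ElemKind { Integer, Index, Float };

// Hypothetical stand-in for the combining kinds after an S/U/F split.
enum class Kind { Add, Mul, MinS, MinU, MinF, MaxS, MaxU, MaxF, And, Or, Xor };

// Single predicate that every verifier can call instead of repeating
// string comparisons on the kind.
inline bool isSupported(Kind kind, ElemKind elem) {
  const bool isIntOrIndex =
      elem == ElemKind::Integer || elem == ElemKind::Index;
  switch (kind) {
  case Kind::Add:
  case Kind::Mul:
    return true; // valid for integer, index, and float elements
  case Kind::MinS:
  case Kind::MinU:
  case Kind::MaxS:
  case Kind::MaxU:
  case Kind::And:
  case Kind::Or:
  case Kind::Xor:
    return isIntOrIndex;
  case Kind::MinF:
  case Kind::MaxF:
    return elem == ElemKind::Float;
  }
  return false;
}
```

Keeping the rule in one function means a new kind only needs one new `case`, and a reduction verifier and a contraction verifier cannot drift apart.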

Harbormaster completed remote builds in B126977: Diff 377093. Oct 4 2021, 11:20 PM

Great, thank you, Diego!

This revision is now accepted and ready to land. Oct 5 2021, 4:29 AM

ThomasRaoux accepted this revision. Oct 5 2021, 9:05 AM

ThomasRaoux added inline comments.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
494 ↗	(On Diff #377075)	As discussed, it is a bit scary to have a potential silent miscompile. Can we fail this pattern here?

dcaballe mentioned this in D111170: [mlir][linalg] Update OpDSL to use the newly introduced min and max ops. Oct 5 2021, 12:14 PM

SuperFoo42 added a subscriber: SuperFoo42. Oct 5 2021, 1:39 PM

Closed by commit rGeaf2588a51bf: [mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic (authored by dcaballe). Oct 5 2021, 3:51 PM

This revision was automatically updated to reflect the committed changes.

dcaballe added a commit: rGeaf2588a51bf: [mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Vector/VectorOps.td	15 lines
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp	57 lines
mlir/lib/Dialect/Vector/VectorOps.cpp	8 lines
mlir/lib/Dialect/Vector/VectorTransforms.cpp	16 lines
mlir/test/Dialect/Linalg/vectorization.mlir	51 lines
mlir/test/Dialect/Vector/ops.mlir	8 lines
mlir/test/Dialect/Vector/vector-contract-matvec-transforms.mlir	6 lines
mlir/test/Dialect/Vector/vector-multi-reduction-outer-lowering.mlir	4 lines

Diff 376264

mlir/include/mlir/Dialect/Vector/VectorOps.td

[... 31 lines not shown ...]	class Vector_Op<string mnemonic, list<OpTrait> traits = []> :
// * ParseResult parse${C++ class of Op}(OpAsmParser &parser,		// * ParseResult parse${C++ class of Op}(OpAsmParser &parser,
// OperationState &result)		// OperationState &result)
// functions.		// functions.
let printer = [{ return ::print(p, *this); }];		let printer = [{ return ::print(p, *this); }];
let verifier = [{ return ::verify(*this); }];		let verifier = [{ return ::verify(*this); }];
let parser = [{ return ::parse$cppClass(parser, result); }];		let parser = [{ return ::parse$cppClass(parser, result); }];
}		}

// The "kind" of combining function for contractions and reductions.		// The "kind" of combining function for contractions and reductions. Signed
		// kinds are used for floating point and signed integer types.
def COMBINING_KIND_ADD : BitEnumAttrCase<"ADD", 0x1, "add">;		def COMBINING_KIND_ADD : BitEnumAttrCase<"ADD", 0x1, "add">;
def COMBINING_KIND_MUL : BitEnumAttrCase<"MUL", 0x2, "mul">;		def COMBINING_KIND_MUL : BitEnumAttrCase<"MUL", 0x2, "mul">;
def COMBINING_KIND_MIN : BitEnumAttrCase<"MIN", 0x4, "min">;		def COMBINING_KIND_MINS : BitEnumAttrCase<"MINS", 0x4, "mins">;
def COMBINING_KIND_MAX : BitEnumAttrCase<"MAX", 0x8, "max">;		def COMBINING_KIND_MAXS : BitEnumAttrCase<"MAXS", 0x8, "maxs">;
def COMBINING_KIND_AND : BitEnumAttrCase<"AND", 0x10, "and">;		def COMBINING_KIND_AND : BitEnumAttrCase<"AND", 0x10, "and">;
def COMBINING_KIND_OR : BitEnumAttrCase<"OR", 0x20, "or">;		def COMBINING_KIND_OR : BitEnumAttrCase<"OR", 0x20, "or">;
def COMBINING_KIND_XOR : BitEnumAttrCase<"XOR", 0x40, "xor">;		def COMBINING_KIND_XOR : BitEnumAttrCase<"XOR", 0x40, "xor">;

def CombiningKind : BitEnumAttr<		def CombiningKind : BitEnumAttr<
"CombiningKind",		"CombiningKind",
"Kind of combining function for contractions and reductions",		"Kind of combining function for contractions and reductions",
[COMBINING_KIND_ADD, COMBINING_KIND_MUL, COMBINING_KIND_MIN,		[COMBINING_KIND_ADD, COMBINING_KIND_MUL, COMBINING_KIND_MINS,
COMBINING_KIND_MAX, COMBINING_KIND_AND, COMBINING_KIND_OR,		COMBINING_KIND_MAXS, COMBINING_KIND_AND, COMBINING_KIND_OR,
COMBINING_KIND_XOR]> {		COMBINING_KIND_XOR]> {
let cppNamespace = "::mlir::vector";		let cppNamespace = "::mlir::vector";
let genSpecializedAttr = 0;		let genSpecializedAttr = 0;
}		}

def Vector_CombiningKindAttr : DialectAttr<		def Vector_CombiningKindAttr : DialectAttr<
Vector_Dialect,		Vector_Dialect,
CPred<"$_self.isa<::mlir::vector::CombiningKindAttr>()">,		CPred<"$_self.isa<::mlir::vector::CombiningKindAttr>()">,
[... 270 lines not shown ...]	static SmallVector<bool> getReductionMask(
SmallVector<bool> res(sourceRank, false);		SmallVector<bool> res(sourceRank, false);
for (auto idx : reductionDims)		for (auto idx : reductionDims)
res[idx] = true;		res[idx] = true;
return res;		return res;
}		}

static SmallVector<int64_t> inferDestShape(		static SmallVector<int64_t> inferDestShape(
ArrayRef<int64_t> shape, ArrayRef<bool> reducedDimsMask) {		ArrayRef<int64_t> shape, ArrayRef<bool> reducedDimsMask) {
assert(shape.size() == reducedDimsMask.size() &&		assert(shape.size() == reducedDimsMask.size() &&
"shape and maks of different sizes");		"shape and maks of different sizes");
SmallVector<int64_t> res;		SmallVector<int64_t> res;
for (auto it : llvm::zip(reducedDimsMask, shape))		for (auto it : llvm::zip(reducedDimsMask, shape))
if (!std::get<0>(it))		if (!std::get<0>(it))
res.push_back(std::get<1>(it));		res.push_back(std::get<1>(it));
return res;		return res;
}		}
}];		}];
[... 1,963 lines not shown ...]

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp

	[... 105 lines not shown ...]
	/// ShapedType of `v`.			/// ShapedType of `v`.
	static VectorType extractVectorTypeFromShapedValue(Value v) {			static VectorType extractVectorTypeFromShapedValue(Value v) {
	auto st = v.getType().cast<ShapedType>();			auto st = v.getType().cast<ShapedType>();
	if (st.isa<MemRefType>() && st.getShape().empty())			if (st.isa<MemRefType>() && st.getShape().empty())
	return VectorType();			return VectorType();
	return VectorType::get(st.getShape(), st.getElementType());			return VectorType::get(st.getShape(), st.getElementType());
	}			}

				static llvm::Optional<vector::CombiningKind>
				getKindForOp(Operation *reductionOp) {
				if (!reductionOp)
				return llvm::None;
				return llvm::TypeSwitch<Operation *, llvm::Optional<vector::CombiningKind>>(
				reductionOp)
				.Case<AddIOp, AddFOp>([&](auto op) { return vector::CombiningKind::ADD; })
				.Case<MaxSIOp, MaxFOp>(
				[&](auto op) { return vector::CombiningKind::MAXS; })
				.Case<MinSIOp, MinFOp>(
				[&](auto op) { return vector::CombiningKind::MINS; })
				.Default([&](auto op) { return llvm::None; });
				}

	/// Check whether `outputOperand` is a reduction with a single combiner			/// Check whether `outputOperand` is a reduction with a single combiner
	/// operation. Return the combiner operation of the reduction, which is assumed			/// operation. Return the combiner operation kind of the reduction, if
	/// to be a binary operation. Multiple reduction operations would impose an			/// supported. Return llvm::None, otherwise. Multiple reduction operations would
	/// ordering between reduction dimensions and is currently unsupported in			/// impose an ordering between reduction dimensions and is currently unsupported
	/// Linalg. This limitation is motivated by the fact that e.g. min(max(X)) !=			/// in Linalg. This limitation is motivated by the fact that e.g. min(max(X)) !=
	/// max(min(X))			/// max(min(X))
	// TODO: use in LinalgOp verification, there is a circular dependency atm.			// TODO: use in LinalgOp verification, there is a circular dependency atm.
	static Operation *getSingleBinaryOpAssumedReduction(OpOperand *outputOperand) {			static llvm::Optional<vector::CombiningKind>
				matchLinalgReduction(OpOperand *outputOperand) {
	auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());			auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());
	unsigned outputPos =			unsigned outputPos =
	outputOperand->getOperandNumber() - linalgOp.getNumInputs();			outputOperand->getOperandNumber() - linalgOp.getNumInputs();
				// Only single combiner operatios are supported for now.
	SmallVector<Operation *, 4> combinerOps;			SmallVector<Operation *, 4> combinerOps;
	if (!matchReduction(linalgOp.getRegionOutputArgs(), outputPos, combinerOps) ||			if (!matchReduction(linalgOp.getRegionOutputArgs(), outputPos, combinerOps) ||
	combinerOps.size() != 1)			combinerOps.size() != 1)
	return nullptr;			return llvm::None;

	// TODO: also assert no other subsequent ops break the reduction.			// Return the combiner operation kind, if supported.
	return combinerOps[0];			return getKindForOp(combinerOps[0]);
	}			}

	/// If `value` of assumed VectorType has a shape different than `shape`, try to			/// If `value` of assumed VectorType has a shape different than `shape`, try to
	/// build and return a new vector.broadcast to `shape`.			/// build and return a new vector.broadcast to `shape`.
	/// Otherwise, just return `value`.			/// Otherwise, just return `value`.
	// TODO: this is best effort atm and there is currently no guarantee of			// TODO: this is best effort atm and there is currently no guarantee of
	// correctness for the broadcast semantics.			// correctness for the broadcast semantics.
	static Value broadcastIfNeeded(OpBuilder &b, Value value,			static Value broadcastIfNeeded(OpBuilder &b, Value value,
	ArrayRef<int64_t> shape) {			ArrayRef<int64_t> shape) {
	unsigned numDimsGtOne = std::count_if(shape.begin(), shape.end(),			unsigned numDimsGtOne = std::count_if(shape.begin(), shape.end(),
	[](int64_t val) { return val > 1; });			[](int64_t val) { return val > 1; });
	auto vecType = value.getType().dyn_cast<VectorType>();			auto vecType = value.getType().dyn_cast<VectorType>();
	if (shape.empty() ||			if (shape.empty() ||
	(vecType != nullptr &&			(vecType != nullptr &&
	(vecType.getShape() == shape || vecType.getRank() > numDimsGtOne)))			(vecType.getShape() == shape || vecType.getRank() > numDimsGtOne)))
	return value;			return value;
	auto newVecType = VectorType::get(shape, vecType ? vecType.getElementType()			auto newVecType = VectorType::get(shape, vecType ? vecType.getElementType()
	: value.getType());			: value.getType());
	return b.create<vector::BroadcastOp>(b.getInsertionPoint()->getLoc(),			return b.create<vector::BroadcastOp>(b.getInsertionPoint()->getLoc(),
	newVecType, value);			newVecType, value);
	}			}

	static llvm::Optional<vector::CombiningKind>
	getKindForOp(Operation *reductionOp) {
	if (!reductionOp)
	return llvm::None;
	return llvm::TypeSwitch<Operation *, llvm::Optional<vector::CombiningKind>>(
	reductionOp)
	.Case<AddIOp, AddFOp>([&](auto op) {
	return llvm::Optional<vector::CombiningKind>{
	vector::CombiningKind::ADD};
	})
	.Default([&](auto op) { return llvm::None; });
	}

	/// If value of assumed VectorType has a shape different than `shape`, build and			/// If value of assumed VectorType has a shape different than `shape`, build and
	/// return a new vector.broadcast to `shape`.			/// return a new vector.broadcast to `shape`.
	/// Otherwise, just return value.			/// Otherwise, just return value.
	static Value reduceIfNeeded(OpBuilder &b, VectorType targetVectorType,			static Value reduceIfNeeded(OpBuilder &b, VectorType targetVectorType,
	Value value, OpOperand *outputOperand) {			Value value, OpOperand *outputOperand) {
	auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());			auto linalgOp = cast<LinalgOp>(outputOperand->getOwner());
	auto vecType = value.getType().dyn_cast<VectorType>();			auto vecType = value.getType().dyn_cast<VectorType>();
	if (!vecType || vecType.getShape() == targetVectorType.getShape())			if (!vecType || vecType.getShape() == targetVectorType.getShape())
	return value;			return value;
	// At this point, we know we need to reduce. Detect the reduction operator.
	// TODO: Use the generic reduction detection util.
	Operation *reductionOp = getSingleBinaryOpAssumedReduction(outputOperand);
	unsigned pos = 0;			unsigned pos = 0;
	MLIRContext *ctx = b.getContext();			MLIRContext *ctx = b.getContext();
	SmallVector<AffineExpr> exprs;			SmallVector<AffineExpr> exprs;
	for (auto s : linalgOp.iterator_types())			for (auto s : linalgOp.iterator_types())
	if (isParallelIterator(s))			if (isParallelIterator(s))
	exprs.push_back(getAffineDimExpr(pos++, ctx));			exprs.push_back(getAffineDimExpr(pos++, ctx));
	auto loc = value.getLoc();			auto loc = value.getLoc();
	// TODO: reuse common CombiningKing logic and support more than add.
	auto maybeKind = getKindForOp(reductionOp);			// At this point, we know we need to reduce. Detect the reduction operator.
				auto maybeKind = matchLinalgReduction(outputOperand);
	assert(maybeKind && "Failed precondition: could not get reduction kind");			assert(maybeKind && "Failed precondition: could not get reduction kind");
	unsigned idx = 0;			unsigned idx = 0;
	SmallVector<bool> reductionMask(linalgOp.iterator_types().size(), false);			SmallVector<bool> reductionMask(linalgOp.iterator_types().size(), false);
	for (auto attr : linalgOp.iterator_types()) {			for (auto attr : linalgOp.iterator_types()) {
	if (isReductionIterator(attr))			if (isReductionIterator(attr))
	reductionMask[idx] = true;			reductionMask[idx] = true;
	++idx;			++idx;
	}			}
	[... 396 lines not shown ...]
	}			}

	// TODO: probably need some extra checks for reduction followed by consumer			// TODO: probably need some extra checks for reduction followed by consumer
	// ops that may not commute (e.g. linear reduction + non-linear instructions).			// ops that may not commute (e.g. linear reduction + non-linear instructions).
	static LogicalResult reductionPreconditions(LinalgOp op) {			static LogicalResult reductionPreconditions(LinalgOp op) {
	if (llvm::none_of(op.iterator_types(), isReductionIterator))			if (llvm::none_of(op.iterator_types(), isReductionIterator))
	return failure();			return failure();
	for (OpOperand *opOperand : op.getOutputOperands()) {			for (OpOperand *opOperand : op.getOutputOperands()) {
	Operation *reductionOp = getSingleBinaryOpAssumedReduction(opOperand);			if (!matchLinalgReduction(opOperand))
	if (!getKindForOp(reductionOp))
	return failure();			return failure();
	}			}
	return success();			return success();
	}			}

	LogicalResult mlir::linalg::vectorizeLinalgOpPrecondition(Operation *op) {			LogicalResult mlir::linalg::vectorizeLinalgOpPrecondition(Operation *op) {
	auto linalgOp = cast<linalg::LinalgOp>(op);			auto linalgOp = cast<linalg::LinalgOp>(op);
	// All types must be static shape to go to vector.			// All types must be static shape to go to vector.
	[... 746 lines not shown ...]

mlir/lib/Dialect/Vector/VectorOps.cpp

[... 86 lines not shown ...]
}		}

// Helper for verifying combining kinds in contractions and reductions.		// Helper for verifying combining kinds in contractions and reductions.
static bool isSupportedCombiningKind(CombiningKind combiningKind,		static bool isSupportedCombiningKind(CombiningKind combiningKind,
Type elementType) {		Type elementType) {
switch (combiningKind) {		switch (combiningKind) {
case CombiningKind::ADD:		case CombiningKind::ADD:
case CombiningKind::MUL:		case CombiningKind::MUL:
case CombiningKind::MIN:		case CombiningKind::MINS:
case CombiningKind::MAX:		case CombiningKind::MAXS:
return elementType.isIntOrIndexOrFloat();		return elementType.isIntOrIndexOrFloat();
case CombiningKind::AND:		case CombiningKind::AND:
case CombiningKind::OR:		case CombiningKind::OR:
case CombiningKind::XOR:		case CombiningKind::XOR:
return elementType.isIntOrIndex();		return elementType.isIntOrIndex();
}		}
return false;		return false;
}		}

/// Return true if the last dimension of the MemRefType has unit stride. Also		/// Return true if the last dimension of the MemRefType has unit stride. Also
/// return true for memrefs with no strides.		/// return true for memrefs with no strides.
bool mlir::vector::isLastMemrefDimUnitStride(MemRefType type) {		bool mlir::vector::isLastMemrefDimUnitStride(MemRefType type) {
[... 37 lines not shown ...]
CombiningKind CombiningKindAttr::getKind() const {		CombiningKind CombiningKindAttr::getKind() const {
return static_cast<CombiningKind>(getImpl()->value);		return static_cast<CombiningKind>(getImpl()->value);
}		}

static constexpr const CombiningKind combiningKindsList[] = {		static constexpr const CombiningKind combiningKindsList[] = {
// clang-format off		// clang-format off
CombiningKind::ADD,		CombiningKind::ADD,
CombiningKind::MUL,		CombiningKind::MUL,
CombiningKind::MIN,		CombiningKind::MINS,
CombiningKind::MAX,		CombiningKind::MAXS,
CombiningKind::AND,		CombiningKind::AND,
CombiningKind::OR,		CombiningKind::OR,
CombiningKind::XOR,		CombiningKind::XOR,
// clang-format on		// clang-format on
};		};

void CombiningKindAttr::print(DialectAsmPrinter &printer) const {		void CombiningKindAttr::print(DialectAsmPrinter &printer) const {
printer << "kind<";		printer << "kind<";
[...]	static LogicalResult verify(ReductionOp op) {
// Verify for 1-D vector.		// Verify for 1-D vector.
int64_t rank = op.getVectorType().getRank();		int64_t rank = op.getVectorType().getRank();
if (rank != 1)		if (rank != 1)
return op.emitOpError("unsupported reduction rank: ") << rank;		return op.emitOpError("unsupported reduction rank: ") << rank;

// Verify supported reduction kind.		// Verify supported reduction kind.
auto kind = op.kind();		auto kind = op.kind();
Type eltType = op.dest().getType();		Type eltType = op.dest().getType();
if (kind == "add" || kind == "mul" || kind == "min" || kind == "max") {		if (kind == "add" || kind == "mul" || kind == "min" || kind == "max") {
		ThomasRaoux: can we call `isSupportedCombiningKind` here?
		dcaballe: much better, thanks!
if (!eltType.isIntOrIndexOrFloat())		if (!eltType.isIntOrIndexOrFloat())
return op.emitOpError("unsupported reduction type");		return op.emitOpError("unsupported reduction type");
} else if (kind == "and" || kind == "or" || kind == "xor") {		} else if (kind == "and" || kind == "or" || kind == "xor") {
if (!eltType.isIntOrIndex())		if (!eltType.isIntOrIndex())
return op.emitOpError("unsupported reduction type");		return op.emitOpError("unsupported reduction type");
} else {		} else {
return op.emitOpError("unknown reduction kind: ") << kind;		return op.emitOpError("unknown reduction kind: ") << kind;
}		}
[...]
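
The two inline comments on VectorOps.cpp (restrict the float kinds to float element types, and reuse `isSupportedCombiningKind` in the verifier) can be illustrated with a minimal standalone sketch. It assumes the split into separate float kinds that the reviewer proposes: the `MINF`/`MAXF` enumerators and the `ElementKind` enum are hypothetical stand-ins for illustration only. They do not appear in the diff shown here, and the real code operates on `mlir::Type`.

```cpp
#include <cassert>

// Hypothetical stand-in for mlir::Type; it only distinguishes the
// categories that the check cares about.
enum class ElementKind { Integer, Index, Float };

// Hypothetical enum: MINF/MAXF are assumed here, following the review
// suggestion; the diff above only has MINS/MAXS.
enum class CombiningKind { ADD, MUL, MINS, MAXS, MINF, MAXF, AND, OR, XOR };

inline bool isIntOrIndex(ElementKind elt) {
  return elt == ElementKind::Integer || elt == ElementKind::Index;
}

inline bool isSupportedCombiningKind(CombiningKind kind, ElementKind elt) {
  switch (kind) {
  case CombiningKind::ADD:
  case CombiningKind::MUL:
    return true; // Valid for integer, index, and float elements.
  case CombiningKind::MINS:
  case CombiningKind::MAXS:
  case CombiningKind::AND:
  case CombiningKind::OR:
  case CombiningKind::XOR:
    return isIntOrIndex(elt); // Assumed integer-only in this sketch.
  case CombiningKind::MINF:
  case CombiningKind::MAXF:
    return elt == ElementKind::Float; // Float kinds accept only floats.
  }
  return false;
}

// Usage: a verifier could delegate to the single helper instead of
// repeating the per-kind type checks.
inline void checkCombiningKindExamples() {
  assert(isSupportedCombiningKind(CombiningKind::MAXF, ElementKind::Float));
  assert(!isSupportedCombiningKind(CombiningKind::MAXF, ElementKind::Integer));
  assert(!isSupportedCombiningKind(CombiningKind::XOR, ElementKind::Float));
}
```

Keeping the type rules in one helper means the enum-based and string-based verifiers cannot drift apart, which appears to be the point of the second comment.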

mlir/lib/Dialect/Vector/VectorTransforms.cpp

[...]	static Optional<Value> genMultI(Location loc, Value x, Value y, Value acc,
Value combinedResult;		Value combinedResult;
switch (kind) {		switch (kind) {
case CombiningKind::ADD:		case CombiningKind::ADD:
combinedResult = rewriter.create<AddIOp>(loc, mul, acc);		combinedResult = rewriter.create<AddIOp>(loc, mul, acc);
break;		break;
case CombiningKind::MUL:		case CombiningKind::MUL:
combinedResult = rewriter.create<MulIOp>(loc, mul, acc);		combinedResult = rewriter.create<MulIOp>(loc, mul, acc);
break;		break;
case CombiningKind::MIN:		case CombiningKind::MINS:
combinedResult = rewriter.create<SelectOp>(		combinedResult = rewriter.create<SelectOp>(
loc, rewriter.create<CmpIOp>(loc, CmpIPredicate::slt, mul, acc), mul,		loc, rewriter.create<CmpIOp>(loc, CmpIPredicate::slt, mul, acc), mul,
acc);		acc);
		pifon2a: `create<MinSIOp>`?
break;		break;
case CombiningKind::MAX:		case CombiningKind::MAXS:
combinedResult = rewriter.create<SelectOp>(		combinedResult = rewriter.create<SelectOp>(
		pifon2a: `create<MaxSIOp>`?
loc, rewriter.create<CmpIOp>(loc, CmpIPredicate::sge, mul, acc), mul,		loc, rewriter.create<CmpIOp>(loc, CmpIPredicate::sge, mul, acc), mul,
acc);		acc);
break;		break;
case CombiningKind::AND:		case CombiningKind::AND:
combinedResult = rewriter.create<AndOp>(loc, mul, acc);		combinedResult = rewriter.create<AndOp>(loc, mul, acc);
break;		break;
case CombiningKind::OR:		case CombiningKind::OR:
combinedResult = rewriter.create<OrOp>(loc, mul, acc);		combinedResult = rewriter.create<OrOp>(loc, mul, acc);
[...]	static Optional<Value> genMultF(Location loc, Value x, Value y, Value acc,
if (!acc)		if (!acc)
return Optional<Value>(mul);		return Optional<Value>(mul);

Value combinedResult;		Value combinedResult;
switch (kind) {		switch (kind) {
case CombiningKind::MUL:		case CombiningKind::MUL:
combinedResult = rewriter.create<MulFOp>(loc, mul, acc);		combinedResult = rewriter.create<MulFOp>(loc, mul, acc);
break;		break;
case CombiningKind::MIN:		case CombiningKind::MINS:
combinedResult = rewriter.create<SelectOp>(		combinedResult = rewriter.create<SelectOp>(
loc, rewriter.create<CmpFOp>(loc, CmpFPredicate::OLE, mul, acc), mul,		loc, rewriter.create<CmpFOp>(loc, CmpFPredicate::OLE, mul, acc), mul,
acc);		acc);
		ThomasRaoux: This lowering doesn't propagate NaN, while the new minf op is supposed to propagate NaN. I assume we want to match the semantics of minf (and likewise for maxf).
		ThomasRaoux: The most natural would be to lower to minf/maxf ops, in my opinion.
		pifon2a: +1 @ThomasRaoux, `create<MinFOp>`?
		dcaballe: Yes, that makes sense, thanks! I thought @pifon2a's patch was taking care of that already. I see now that it's only expanding the min/max ops to the compare/select implementation.
break;		break;
case CombiningKind::MAX:		case CombiningKind::MAXS:
combinedResult = rewriter.create<SelectOp>(		combinedResult = rewriter.create<SelectOp>(
loc, rewriter.create<CmpFOp>(loc, CmpFPredicate::OGT, mul, acc), mul,		loc, rewriter.create<CmpFOp>(loc, CmpFPredicate::OGT, mul, acc), mul,
acc);		acc);
break;		break;
case CombiningKind::ADD: // Already handled this special case above.		case CombiningKind::ADD: // Already handled this special case above.
case CombiningKind::AND: // Only valid for integer types.		case CombiningKind::AND: // Only valid for integer types.
case CombiningKind::OR: // Only valid for integer types.		case CombiningKind::OR: // Only valid for integer types.
case CombiningKind::XOR: // Only valid for integer types.		case CombiningKind::XOR: // Only valid for integer types.
[...]	for (int64_t i = 1; i < srcShape[0]; i++) {
result = rewriter.create<AddFOp>(loc, operand, result);		result = rewriter.create<AddFOp>(loc, operand, result);
break;		break;
case vector::CombiningKind::MUL:		case vector::CombiningKind::MUL:
if (elementType.isIntOrIndex())		if (elementType.isIntOrIndex())
result = rewriter.create<MulIOp>(loc, operand, result);		result = rewriter.create<MulIOp>(loc, operand, result);
else		else
result = rewriter.create<MulFOp>(loc, operand, result);		result = rewriter.create<MulFOp>(loc, operand, result);
break;		break;
case vector::CombiningKind::MIN:		case vector::CombiningKind::MINS:
if (elementType.isIntOrIndex())		if (elementType.isIntOrIndex())
condition =		condition =
rewriter.create<CmpIOp>(loc, CmpIPredicate::slt, operand, result);		rewriter.create<CmpIOp>(loc, CmpIPredicate::slt, operand, result);
else		else
condition =		condition =
rewriter.create<CmpFOp>(loc, CmpFPredicate::OLT, operand, result);		rewriter.create<CmpFOp>(loc, CmpFPredicate::OLT, operand, result);
result = rewriter.create<SelectOp>(loc, condition, operand, result);		result = rewriter.create<SelectOp>(loc, condition, operand, result);
break;		break;
case vector::CombiningKind::MAX:		case vector::CombiningKind::MAXS:
if (elementType.isIntOrIndex())		if (elementType.isIntOrIndex())
condition =		condition =
rewriter.create<CmpIOp>(loc, CmpIPredicate::sge, operand, result);		rewriter.create<CmpIOp>(loc, CmpIPredicate::sge, operand, result);
else		else
condition =		condition =
rewriter.create<CmpFOp>(loc, CmpFPredicate::OGE, operand, result);		rewriter.create<CmpFOp>(loc, CmpFPredicate::OGE, operand, result);
result = rewriter.create<SelectOp>(loc, condition, operand, result);		result = rewriter.create<SelectOp>(loc, condition, operand, result);
break;		break;
[...]	LogicalResult matchAndRewrite(vector::MultiDimReductionOp multiReductionOp,
// TODO: Add vector::CombiningKind attribute instead of string to		// TODO: Add vector::CombiningKind attribute instead of string to
// vector.reduction.		// vector.reduction.
auto getKindStr = [](vector::CombiningKind kind) {		auto getKindStr = [](vector::CombiningKind kind) {
switch (kind) {		switch (kind) {
case vector::CombiningKind::ADD:		case vector::CombiningKind::ADD:
return "add";		return "add";
case vector::CombiningKind::MUL:		case vector::CombiningKind::MUL:
return "mul";		return "mul";
case vector::CombiningKind::MIN:		case vector::CombiningKind::MINS:
return "min";		return "min";
case vector::CombiningKind::MAX:		case vector::CombiningKind::MAXS:
return "max";		return "max";
case vector::CombiningKind::AND:		case vector::CombiningKind::AND:
return "and";		return "and";
case vector::CombiningKind::OR:		case vector::CombiningKind::OR:
return "or";		return "or";
case vector::CombiningKind::XOR:		case vector::CombiningKind::XOR:
return "xor";		return "xor";
}		}
[...]
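
The NaN concern raised in the inline comments on `genMultF` can be reproduced outside MLIR. The sketch below is plain C++, not the actual lowering: `minViaOrderedSelect` mimics a select on an ordered less-than comparison, and `minPropagatingNaN` is a hypothetical helper that shows the NaN-propagating behaviour the reviewers expect from `minf`. Both function names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Mimics select(cmpf olt, a, b), a, b: an ordered comparison is false
// whenever either operand is NaN, so the result is always `b` in that case.
inline float minViaOrderedSelect(float a, float b) { return (a < b) ? a : b; }

// Hypothetical NaN-propagating minimum: if either operand is NaN, the
// result is NaN regardless of operand order.
inline float minPropagatingNaN(float a, float b) {
  if (std::isnan(a))
    return a;
  if (std::isnan(b))
    return b;
  return (a < b) ? a : b;
}

// Usage: the compare/select form silently drops a NaN in the first operand.
inline void checkNaNBehavior() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(minViaOrderedSelect(nan, 1.0f) == 1.0f);     // NaN is lost.
  assert(std::isnan(minViaOrderedSelect(1.0f, nan))); // NaN survives.
  assert(std::isnan(minPropagatingNaN(nan, 1.0f)));
  assert(std::isnan(minPropagatingNaN(1.0f, nan)));
}
```

With the compare/select form, whether a NaN reaches the result depends on which operand holds it, so a reduction lowered this way can give different answers for different operand orders.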

mlir/test/Dialect/Linalg/vectorization.mlir

[...]	^bb0(%arg0: f32, %arg1: f32, %arg2: f32): // no predecessors
%1 = math.exp %arg0 : f32		%1 = math.exp %arg0 : f32
%2 = math.exp %arg1 : f32		%2 = math.exp %arg1 : f32
%3 = addf %1, %2 : f32		%3 = addf %1, %2 : f32
%4 = addf %3, %arg2 : f32		%4 = addf %3, %arg2 : f32
linalg.yield %4 : f32		linalg.yield %4 : f32
} -> tensor<5x2xf32>		} -> tensor<5x2xf32>
return %0 : tensor<5x2xf32>		return %0 : tensor<5x2xf32>
}		}

		// -----

		// CHECK-LABEL: func @red_max_2d(
		func @red_max_2d(%arg0: tensor<4x4xf32>) -> tensor<4xf32> {
		// CHECK: linalg.init_tensor [4] : tensor<4xf32>
		// CHECK: vector.transfer_write {{.*}} : vector<4xf32>, tensor<4xf32>
		// CHECK: vector.transfer_read {{.*}} : tensor<4x4xf32>, vector<4x4xf32>
		// CHECK: vector.transfer_read {{.*}} : tensor<4xf32>, vector<4x4xf32>
		// CHECK: maxf {{.*}} : vector<4x4xf32>
		// CHECK: vector.multi_reduction #vector.kind<maxs>, {{.*}} [1] : vector<4x4xf32> to vector<4xf32>
		// CHECK: vector.transfer_write {{.*}} : vector<4xf32>, tensor<4xf32>
		%minf32 = constant -3.40282e+38 : f32
		%init = linalg.init_tensor [4] : tensor<4xf32>
		%fill = linalg.fill(%minf32, %init) : f32, tensor<4xf32> -> tensor<4xf32>
		%red = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
		affine_map<(d0, d1) -> (d0)>],
		iterator_types = ["parallel", "reduction"]}
		ins(%arg0 : tensor<4x4xf32>) outs(%fill : tensor<4xf32>) {
		^bb0(%in0: f32, %out0: f32): // no predecessors
		%max = maxf %in0, %out0 : f32
		linalg.yield %max : f32
		} -> tensor<4xf32>
		return %red : tensor<4xf32>
		}

		// -----

		// CHECK-LABEL: func @red_min_2d(
		func @red_min_2d(%arg0: tensor<4x4xf32>) -> tensor<4xf32> {
		// CHECK: linalg.init_tensor [4] : tensor<4xf32>
		// CHECK: vector.transfer_write {{.*}} : vector<4xf32>, tensor<4xf32>
		// CHECK: vector.transfer_read {{.*}} : tensor<4x4xf32>, vector<4x4xf32>
		// CHECK: vector.transfer_read {{.*}} : tensor<4xf32>, vector<4x4xf32>
		// CHECK: minf {{.*}} : vector<4x4xf32>
		// CHECK: vector.multi_reduction #vector.kind<mins>, {{.*}} [1] : vector<4x4xf32> to vector<4xf32>
		// CHECK: vector.transfer_write {{.*}} : vector<4xf32>, tensor<4xf32>
		%minf32 = constant -3.40282e+38 : f32
		%init = linalg.init_tensor [4] : tensor<4xf32>
		%fill = linalg.fill(%minf32, %init) : f32, tensor<4xf32> -> tensor<4xf32>
		%red = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
		affine_map<(d0, d1) -> (d0)>],
		iterator_types = ["parallel", "reduction"]}
		ins(%arg0 : tensor<4x4xf32>) outs(%fill : tensor<4xf32>) {
		^bb0(%in0: f32, %out0: f32): // no predecessors
		%max = minf %in0, %out0 : f32
		linalg.yield %max : f32
		} -> tensor<4xf32>
		return %red : tensor<4xf32>
		}
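
For reference, the computation that `@red_max_2d` expresses can be sketched as a plain scalar loop. `reduceMaxRows` is a hypothetical helper, not part of MLIR. It mirrors only the iteration structure of the test (parallel over d0, reduction over d1) and the lowest-finite-float initial value, and it makes no attempt to model the NaN semantics of `maxf`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Row-wise max reduction: out[i] = max over j of in[i][j].
template <std::size_t Rows, std::size_t Cols>
std::array<float, Rows>
reduceMaxRows(const std::array<std::array<float, Cols>, Rows> &in) {
  std::array<float, Rows> out;
  // Plays the role of the linalg.fill with the lowest finite f32 value.
  out.fill(std::numeric_limits<float>::lowest());
  for (std::size_t i = 0; i < Rows; ++i)   // "parallel" dimension d0
    for (std::size_t j = 0; j < Cols; ++j) // "reduction" dimension d1
      out[i] = std::max(in[i][j], out[i]);
  return out;
}

// Usage on a 4x4 input, matching the tensor<4x4xf32> shape of the test.
inline void reduceMaxRowsExample() {
  const std::array<std::array<float, 4>, 4> in = {{
      {{1.0f, 5.0f, 2.0f, 0.0f}},
      {{-1.0f, -2.0f, -3.0f, -4.0f}},
      {{7.0f, 7.0f, 7.0f, 7.0f}},
      {{0.0f, 9.0f, -9.0f, 3.0f}},
  }};
  const std::array<float, 4> out = reduceMaxRows(in);
  assert(out[0] == 5.0f && out[1] == -1.0f);
  assert(out[2] == 7.0f && out[3] == 9.0f);
}
```

The vectorized form checked by the test performs the same reduction on whole vectors: an elementwise `maxf` on `vector<4x4xf32>` followed by a `vector.multi_reduction` over dimension 1.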

mlir/test/Dialect/Vector/ops.mlir

[...]
#contraction_to_scalar_max_accesses = [		#contraction_to_scalar_max_accesses = [
affine_map<(i) -> (i)>,		affine_map<(i) -> (i)>,
affine_map<(i) -> (i)>,		affine_map<(i) -> (i)>,
affine_map<(i) -> ()>		affine_map<(i) -> ()>
]		]
#contraction_to_scalar_max_trait = {		#contraction_to_scalar_max_trait = {
indexing_maps = #contraction_to_scalar_max_accesses,		indexing_maps = #contraction_to_scalar_max_accesses,
iterator_types = ["reduction"],		iterator_types = ["reduction"],
kind = #vector.kind<max>		kind = #vector.kind<maxs>
}		}
// CHECK-LABEL: @contraction_to_scalar_with_max		// CHECK-LABEL: @contraction_to_scalar_with_max
func @contraction_to_scalar_with_max(%arg0: vector<10xf32>, %arg1: vector<10xf32>) -> f32 {		func @contraction_to_scalar_with_max(%arg0: vector<10xf32>, %arg1: vector<10xf32>) -> f32 {
// CHECK: %[[C0:.*]] = constant 0.000000e+00 : f32		// CHECK: %[[C0:.*]] = constant 0.000000e+00 : f32
%f0 = constant 0.0: f32		%f0 = constant 0.0: f32
// CHECK: %[[X:.]] = vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["reduction"], kind = #vector.kind<max>} %{{.}}, %{{.}}, %[[C0]] : vector<10xf32>, vector<10xf32> into f32		// CHECK: %[[X:.]] = vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["reduction"], kind = #vector.kind<maxs>} %{{.}}, %{{.}}, %[[C0]] : vector<10xf32>, vector<10xf32> into f32
%0 = vector.contract #contraction_to_scalar_max_trait %arg0, %arg1, %f0		%0 = vector.contract #contraction_to_scalar_max_trait %arg0, %arg1, %f0
: vector<10xf32>, vector<10xf32> into f32		: vector<10xf32>, vector<10xf32> into f32
// CHECK: return %[[X]] : f32		// CHECK: return %[[X]] : f32
return %0 : f32		return %0 : f32
}		}

#contraction_accesses0 = [		#contraction_accesses0 = [
affine_map<(b0, f0, f1, c0, c1) -> (c0, b0, c1, f0)>,		affine_map<(b0, f0, f1, c0, c1) -> (c0, b0, c1, f0)>,
[...]	#iterator_types1 = ["parallel", "parallel", "parallel", "parallel", "reduction",
"reduction"]		"reduction"]
#contraction_trait1 = {		#contraction_trait1 = {
indexing_maps = #contraction_accesses1,		indexing_maps = #contraction_accesses1,
iterator_types = #iterator_types1		iterator_types = #iterator_types1
}		}
#contraction_trait2 = {		#contraction_trait2 = {
indexing_maps = #contraction_accesses1,		indexing_maps = #contraction_accesses1,
iterator_types = #iterator_types1,		iterator_types = #iterator_types1,
kind = #vector.kind<max>		kind = #vector.kind<maxs>
}		}
// CHECK-LABEL: @contraction		// CHECK-LABEL: @contraction
func @contraction(%arg0 : vector<7x8x16x15xf32>, %arg1 : vector<8x16x7x5xf32>,		func @contraction(%arg0 : vector<7x8x16x15xf32>, %arg1 : vector<8x16x7x5xf32>,
%arg2 : vector<8x15x5xf32>, %arg3 : vector<8x8x15x5xf32>,		%arg2 : vector<8x15x5xf32>, %arg3 : vector<8x8x15x5xf32>,
%arg4 : vector<7x8x16x15xf16>, %arg5 : vector<8x16x7x5xf16>) {		%arg4 : vector<7x8x16x15xf16>, %arg5 : vector<8x16x7x5xf16>) {
// Test contraction with batch and contracting dims.		// Test contraction with batch and contracting dims.
// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<add>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x15x5xf32>		// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<add>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x15x5xf32>
%0 = vector.contract #contraction_trait0 %arg0, %arg1, %arg2		%0 = vector.contract #contraction_trait0 %arg0, %arg1, %arg2
[...]	func @contraction(%arg0 : vector<7x8x16x15xf32>, %arg1 : vector<8x16x7x5xf32>,
%2 = vector.contract #contraction_trait1 %arg0, %arg1, %arg3, %lhs_mask,		%2 = vector.contract #contraction_trait1 %arg0, %arg1, %arg3, %lhs_mask,
%rhs_mask		%rhs_mask
: vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>		: vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>
// Test contraction with mixed type.		// Test contraction with mixed type.
// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<add>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf16>, vector<8x16x7x5xf16> into vector<8x8x15x5xf32>		// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<add>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf16>, vector<8x16x7x5xf16> into vector<8x8x15x5xf32>
%3 = vector.contract #contraction_trait1 %arg4, %arg5, %arg3		%3 = vector.contract #contraction_trait1 %arg4, %arg5, %arg3
: vector<7x8x16x15xf16>, vector<8x16x7x5xf16> into vector<8x8x15x5xf32>		: vector<7x8x16x15xf16>, vector<8x16x7x5xf16> into vector<8x8x15x5xf32>
// Test contraction with "max" instead of "add".		// Test contraction with "max" instead of "add".
// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<max>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>		// CHECK: vector.contract {indexing_maps = [#{{.}}, #{{.}}, #{{.}}], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction"], kind = #vector.kind<maxs>} {{.}}, {{.}}, {{.}} : vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>
%4 = vector.contract #contraction_trait2 %arg0, %arg1, %arg3		%4 = vector.contract #contraction_trait2 %arg0, %arg1, %arg3
: vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>		: vector<7x8x16x15xf32>, vector<8x16x7x5xf32> into vector<8x8x15x5xf32>
return		return
}		}

// CHECK-LABEL: @create_vector_mask		// CHECK-LABEL: @create_vector_mask
func @create_vector_mask() {		func @create_vector_mask() {
// CHECK: %[[C2:.*]] = constant 2 : index		// CHECK: %[[C2:.*]] = constant 2 : index
[...]

mlir/test/Dialect/Vector/vector-contract-matvec-transforms.mlir

	// RUN: mlir-opt %s -test-vector-contraction-conversion=vector-outerproduct=1 | FileCheck %s			// RUN: mlir-opt %s -test-vector-contraction-conversion=vector-outerproduct=1 | FileCheck %s

	#matvec_accesses = [			#matvec_accesses = [
	affine_map<(i, j) -> (i, j)>,			affine_map<(i, j) -> (i, j)>,
	affine_map<(i, j) -> (j)>,			affine_map<(i, j) -> (j)>,
	affine_map<(i, j) -> (i)>			affine_map<(i, j) -> (i)>
	]			]
	#matvec_trait = {			#matvec_trait = {
	indexing_maps = #matvec_accesses,			indexing_maps = #matvec_accesses,
	iterator_types = ["parallel", "reduction"]			iterator_types = ["parallel", "reduction"]
	}			}
	#matvecmax_trait = {			#matvecmax_trait = {
	indexing_maps = #matvec_accesses,			indexing_maps = #matvec_accesses,
	iterator_types = ["parallel", "reduction"],			iterator_types = ["parallel", "reduction"],
	kind = #vector.kind<max>			kind = #vector.kind<maxs>
	}			}

	#mattransvec_accesses = [			#mattransvec_accesses = [
	affine_map<(i, j) -> (j, i)>,			affine_map<(i, j) -> (j, i)>,
	affine_map<(i, j) -> (j)>,			affine_map<(i, j) -> (j)>,
	affine_map<(i, j) -> (i)>			affine_map<(i, j) -> (i)>
	]			]
	#mattransvec_trait = {			#mattransvec_trait = {
	[...]
	// CHECK-SAME: %[[B:.*1]]: memref<vector<2xf32>>			// CHECK-SAME: %[[B:.*1]]: memref<vector<2xf32>>
	// CHECK-SAME: %[[C:.*2]]: memref<vector<2xf32>>			// CHECK-SAME: %[[C:.*2]]: memref<vector<2xf32>>
	// CHECK: %[[T0:.*]] = memref.load %[[A]][] : memref<vector<2x2xf32>>			// CHECK: %[[T0:.*]] = memref.load %[[A]][] : memref<vector<2x2xf32>>
	// CHECK: %[[T1:.*]] = memref.load %[[B]][] : memref<vector<2xf32>>			// CHECK: %[[T1:.*]] = memref.load %[[B]][] : memref<vector<2xf32>>
	// CHECK: %[[T2:.*]] = memref.load %[[C]][] : memref<vector<2xf32>>			// CHECK: %[[T2:.*]] = memref.load %[[C]][] : memref<vector<2xf32>>
	// CHECK: %[[T3:.*]] = vector.transpose %[[T0]], [1, 0] : vector<2x2xf32> to vector<2x2xf32>			// CHECK: %[[T3:.*]] = vector.transpose %[[T0]], [1, 0] : vector<2x2xf32> to vector<2x2xf32>
	// CHECK: %[[T4:.*]] = vector.extract %[[T3]][0] : vector<2x2xf32>			// CHECK: %[[T4:.*]] = vector.extract %[[T3]][0] : vector<2x2xf32>
	// CHECK: %[[T5:.*]] = vector.extract %[[T1]][0] : vector<2xf32>			// CHECK: %[[T5:.*]] = vector.extract %[[T1]][0] : vector<2xf32>
	// CHECK: %[[T6:.*]] = vector.outerproduct %[[T4]], %[[T5]], %[[T2]] {kind = #vector.kind<max>} : vector<2xf32>, f32			// CHECK: %[[T6:.*]] = vector.outerproduct %[[T4]], %[[T5]], %[[T2]] {kind = #vector.kind<maxs>} : vector<2xf32>, f32
	// CHECK: %[[T7:.*]] = vector.extract %[[T3]][1] : vector<2x2xf32>			// CHECK: %[[T7:.*]] = vector.extract %[[T3]][1] : vector<2x2xf32>
	// CHECK: %[[T8:.*]] = vector.extract %[[T1]][1] : vector<2xf32>			// CHECK: %[[T8:.*]] = vector.extract %[[T1]][1] : vector<2xf32>
	// CHECK: %[[T9:.*]] = vector.outerproduct %[[T7]], %[[T8]], %[[T6]] {kind = #vector.kind<max>} : vector<2xf32>, f32			// CHECK: %[[T9:.*]] = vector.outerproduct %[[T7]], %[[T8]], %[[T6]] {kind = #vector.kind<maxs>} : vector<2xf32>, f32
	// CHECK: memref.store %[[T9]], %[[C]][] : memref<vector<2xf32>>			// CHECK: memref.store %[[T9]], %[[C]][] : memref<vector<2xf32>>
	// CHECK: return			// CHECK: return
	func @matvecmax2x2(%arg0: memref<vector<2x2xf32>>, %arg1: memref<vector<2xf32>>,			func @matvecmax2x2(%arg0: memref<vector<2x2xf32>>, %arg1: memref<vector<2xf32>>,
	%arg2: memref<vector<2xf32>>) {			%arg2: memref<vector<2xf32>>) {
	%A = memref.load %arg0[] : memref<vector<2x2xf32>>			%A = memref.load %arg0[] : memref<vector<2x2xf32>>
	%x = memref.load %arg1[] : memref<vector<2xf32>>			%x = memref.load %arg1[] : memref<vector<2xf32>>
	%b = memref.load %arg2[] : memref<vector<2xf32>>			%b = memref.load %arg2[] : memref<vector<2xf32>>
	%0 = vector.contract #matvecmax_trait %A, %x, %b : vector<2x2xf32>, vector<2xf32> into vector<2xf32>			%0 = vector.contract #matvecmax_trait %A, %x, %b : vector<2x2xf32>, vector<2xf32> into vector<2xf32>
	[...]

mlir/test/Dialect/Vector/vector-multi-reduction-outer-lowering.mlir

	[...]
	// CHECK: %[[RV01:.+]] = mulf %[[V1]], %[[V0]] : vector<2xf32>			// CHECK: %[[RV01:.+]] = mulf %[[V1]], %[[V0]] : vector<2xf32>
	// CHECK: %[[V2:.+]] = vector.extract %[[TRANSPOSED]][2] : vector<4x2xf32>			// CHECK: %[[V2:.+]] = vector.extract %[[TRANSPOSED]][2] : vector<4x2xf32>
	// CHECK: %[[RV012:.+]] = mulf %[[V2]], %[[RV01]] : vector<2xf32>			// CHECK: %[[RV012:.+]] = mulf %[[V2]], %[[RV01]] : vector<2xf32>
	// CHECK: %[[V3:.+]] = vector.extract %[[TRANSPOSED]][3] : vector<4x2xf32>			// CHECK: %[[V3:.+]] = vector.extract %[[TRANSPOSED]][3] : vector<4x2xf32>
	// CHECK: %[[RESULT_VEC:.+]] = mulf %[[V3]], %[[RV012]] : vector<2xf32>			// CHECK: %[[RESULT_VEC:.+]] = mulf %[[V3]], %[[RV012]] : vector<2xf32>
	// CHECK: return %[[RESULT_VEC]] : vector<2xf32>			// CHECK: return %[[RESULT_VEC]] : vector<2xf32>

	func @vector_multi_reduction_min(%arg0: vector<2x4xf32>) -> vector<2xf32> {			func @vector_multi_reduction_min(%arg0: vector<2x4xf32>) -> vector<2xf32> {
	%0 = vector.multi_reduction #vector.kind<min>, %arg0 [1] : vector<2x4xf32> to vector<2xf32>			%0 = vector.multi_reduction #vector.kind<mins>, %arg0 [1] : vector<2x4xf32> to vector<2xf32>
	return %0 : vector<2xf32>			return %0 : vector<2xf32>
	}			}

	// CHECK-LABEL: func @vector_multi_reduction_min			// CHECK-LABEL: func @vector_multi_reduction_min
	// CHECK-SAME: %[[INPUT:.+]]: vector<2x4xf32>			// CHECK-SAME: %[[INPUT:.+]]: vector<2x4xf32>
	// CHECK: %[[TRANSPOSED:.+]] = vector.transpose %[[INPUT]], [1, 0] : vector<2x4xf32> to vector<4x2xf32>			// CHECK: %[[TRANSPOSED:.+]] = vector.transpose %[[INPUT]], [1, 0] : vector<2x4xf32> to vector<4x2xf32>
	// CHECK: %[[V0:.+]] = vector.extract %[[TRANSPOSED]][0] : vector<4x2xf32>			// CHECK: %[[V0:.+]] = vector.extract %[[TRANSPOSED]][0] : vector<4x2xf32>
	// CHECK: %[[V1:.+]] = vector.extract %[[TRANSPOSED]][1] : vector<4x2xf32>			// CHECK: %[[V1:.+]] = vector.extract %[[TRANSPOSED]][1] : vector<4x2xf32>
	// CHECK: %[[C0:.+]] = cmpf olt, %[[V1]], %[[V0]] : vector<2xf32>			// CHECK: %[[C0:.+]] = cmpf olt, %[[V1]], %[[V0]] : vector<2xf32>
	// CHECK: %[[RV01:.+]] = select %[[C0]], %[[V1]], %[[V0]] : vector<2xi1>, vector<2xf32>			// CHECK: %[[RV01:.+]] = select %[[C0]], %[[V1]], %[[V0]] : vector<2xi1>, vector<2xf32>
	// CHECK: %[[V2:.+]] = vector.extract %[[TRANSPOSED]][2] : vector<4x2xf32>			// CHECK: %[[V2:.+]] = vector.extract %[[TRANSPOSED]][2] : vector<4x2xf32>
	// CHECK: %[[C1:.+]] = cmpf olt, %[[V2]], %[[RV01]] : vector<2xf32>			// CHECK: %[[C1:.+]] = cmpf olt, %[[V2]], %[[RV01]] : vector<2xf32>
	// CHECK: %[[RV012:.+]] = select %[[C1]], %[[V2]], %[[RV01]] : vector<2xi1>, vector<2xf32>			// CHECK: %[[RV012:.+]] = select %[[C1]], %[[V2]], %[[RV01]] : vector<2xi1>, vector<2xf32>
	// CHECK: %[[V3:.+]] = vector.extract %[[TRANSPOSED]][3] : vector<4x2xf32>			// CHECK: %[[V3:.+]] = vector.extract %[[TRANSPOSED]][3] : vector<4x2xf32>
	// CHECK: %[[C2:.+]] = cmpf olt, %[[V3]], %[[RV012]] : vector<2xf32>			// CHECK: %[[C2:.+]] = cmpf olt, %[[V3]], %[[RV012]] : vector<2xf32>
	// CHECK: %[[RESULT_VEC:.+]] = select %[[C2]], %[[V3]], %[[RV012]] : vector<2xi1>, vector<2xf32>			// CHECK: %[[RESULT_VEC:.+]] = select %[[C2]], %[[V3]], %[[RV012]] : vector<2xi1>, vector<2xf32>
	// CHECK: return %[[RESULT_VEC]] : vector<2xf32>			// CHECK: return %[[RESULT_VEC]] : vector<2xf32>

	func @vector_multi_reduction_max(%arg0: vector<2x4xf32>) -> vector<2xf32> {			func @vector_multi_reduction_max(%arg0: vector<2x4xf32>) -> vector<2xf32> {
	%0 = vector.multi_reduction #vector.kind<max>, %arg0 [1] : vector<2x4xf32> to vector<2xf32>			%0 = vector.multi_reduction #vector.kind<maxs>, %arg0 [1] : vector<2x4xf32> to vector<2xf32>
	return %0 : vector<2xf32>			return %0 : vector<2xf32>
	}			}

	// CHECK-LABEL: func @vector_multi_reduction_max			// CHECK-LABEL: func @vector_multi_reduction_max
	// CHECK-SAME: %[[INPUT:.+]]: vector<2x4xf32>			// CHECK-SAME: %[[INPUT:.+]]: vector<2x4xf32>
	// CHECK: %[[TRANSPOSED:.+]] = vector.transpose %[[INPUT]], [1, 0] : vector<2x4xf32> to vector<4x2xf32>			// CHECK: %[[TRANSPOSED:.+]] = vector.transpose %[[INPUT]], [1, 0] : vector<2x4xf32> to vector<4x2xf32>
	// CHECK: %[[V0:.+]] = vector.extract %[[TRANSPOSED]][0] : vector<4x2xf32>			// CHECK: %[[V0:.+]] = vector.extract %[[TRANSPOSED]][0] : vector<4x2xf32>
	// CHECK: %[[V1:.+]] = vector.extract %[[TRANSPOSED]][1] : vector<4x2xf32>			// CHECK: %[[V1:.+]] = vector.extract %[[TRANSPOSED]][1] : vector<4x2xf32>
	[...]
