Diff 548576

mlir/include/mlir/Dialect/Arith/IR/Arith.h

[116 lines not shown]
bool applyCmpPredicate(arith::CmpIPredicate predicate, const APInt &lhs,		bool applyCmpPredicate(arith::CmpIPredicate predicate, const APInt &lhs,
const APInt &rhs);		const APInt &rhs);

/// Compute `lhs` `pred` `rhs`, where `pred` is one of the known floating point		/// Compute `lhs` `pred` `rhs`, where `pred` is one of the known floating point
/// comparison predicates.		/// comparison predicates.
bool applyCmpPredicate(arith::CmpFPredicate predicate, const APFloat &lhs,		bool applyCmpPredicate(arith::CmpFPredicate predicate, const APFloat &lhs,
const APFloat &rhs);		const APFloat &rhs);

/// Returns the identity value attribute associated with an AtomicRMWKind op.		/// Returns the identity value attribute associated with an AtomicRMWKind op.
		/// `useOnlyFiniteValue` defines whether the identity value should steer away
		dcaballe (Done): We use markdown style instead of `\p` in MLIR.
		/// from infinity representations or anything that is not a proper finite
		/// number.
		/// E.g., The identity value for maxf is in theory `-Inf`, but if we want to
		/// stay in the finite range, it would be `BiggestRepresentableNegativeFloat`.
		/// The purpose of this boolean is to offer constants that will play nice
		/// with fast math related optimizations.
TypedAttr getIdentityValueAttr(AtomicRMWKind kind, Type resultType,		TypedAttr getIdentityValueAttr(AtomicRMWKind kind, Type resultType,
OpBuilder &builder, Location loc);		OpBuilder &builder, Location loc,
		bool useOnlyFiniteValue = false);
		rengolin (Not Done): If we eventually add some or all of the fast-math flags from LLVM, we may want this to be a bitfield struct or something. Not necessarily for this patch, but it's good to start thinking about it.

/// Return the identity numeric value associated with the given op. Return		/// Return the identity numeric value associated with the given op. Return
/// std::nullopt if there is no known neutral element.		/// std::nullopt if there is no known neutral element.
		/// If `op` has `FastMathFlags::ninf`, only finite values will be used
		/// as neutral element.
std::optional<TypedAttr> getNeutralElement(Operation *op);		std::optional<TypedAttr> getNeutralElement(Operation *op);

/// Returns the identity value associated with an AtomicRMWKind op.		/// Returns the identity value associated with an AtomicRMWKind op.
		/// \see getIdentityValueAttr for a description of what `useOnlyFiniteValue`
		/// does.
Value getIdentityValue(AtomicRMWKind op, Type resultType, OpBuilder &builder,		Value getIdentityValue(AtomicRMWKind op, Type resultType, OpBuilder &builder,
Location loc);		Location loc, bool useOnlyFiniteValue = false);
		rengolin (Not Done): Can we make these getters set a default value? Then we don't need to override it on all calls, only when we know we need to, for instance when the front-end or pass knows what it's doing.
		qcolombet (Author, Done): Yes, we could. I decided against it because I wanted people to think about this problem when they use this API. Now, since this may be a temporary solution, it may be better to limit the changes. WDYT?
		rengolin (Done):
		> I wanted people to think about this problem when they use this API.
		This works for people adding new code, not for the existing code being changed by this patch, which the original authors are probably not going to revisit.
		> it may be better to limit the changes.
		Also, if the linked RFC goes ahead, the argument will change soon and lead to more mechanical changes. I'd keep the changes minimal (default value) for now and enter the RFC discussion for the long-term strategy.
		qcolombet (Author, Done): Used a default value that keeps the old behavior.

/// Returns the value obtained by applying the reduction operation kind		/// Returns the value obtained by applying the reduction operation kind
/// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.		/// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.
Value getReductionOp(AtomicRMWKind op, OpBuilder &builder, Location loc,		Value getReductionOp(AtomicRMWKind op, OpBuilder &builder, Location loc,
Value lhs, Value rhs);		Value lhs, Value rhs);

arith::CmpIPredicate invertPredicate(arith::CmpIPredicate pred);		arith::CmpIPredicate invertPredicate(arith::CmpIPredicate pred);
} // namespace arith		} // namespace arith
} // namespace mlir		} // namespace mlir

#endif // MLIR_DIALECT_ARITH_IR_ARITH_H_		#endif // MLIR_DIALECT_ARITH_IR_ARITH_H_
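
The `useOnlyFiniteValue` behaviour described in the doc comments above can be illustrated outside MLIR. This is a minimal sketch in standard C++, assuming `float` stands in for the result type; `identityForMaxF`, `identityForMinF`, and `reduceMax` are hypothetical helpers, not MLIR APIs. The finite identity for max is taken here to be the most negative finite float, following the `BiggestRepresentableNegativeFloat` wording of the comment.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for the maxf case of getIdentityValueAttr.
// The default argument keeps the old (infinite) behaviour.
inline float identityForMaxF(bool useOnlyFiniteValue = false) {
  return useOnlyFiniteValue ? std::numeric_limits<float>::lowest()
                            : -std::numeric_limits<float>::infinity();
}

// Hypothetical stand-in for the minf case.
inline float identityForMinF(bool useOnlyFiniteValue = false) {
  return useOnlyFiniteValue ? std::numeric_limits<float>::max()
                            : std::numeric_limits<float>::infinity();
}

// Max-reduce `values`, seeding the accumulator with the chosen identity.
inline float reduceMax(const std::vector<float> &values,
                       bool useOnlyFiniteValue = false) {
  float acc = identityForMaxF(useOnlyFiniteValue);
  for (float v : values)
    acc = std::max(acc, v);
  return acc;
}
```

For finite inputs both identities give the same reduction result; the finite one only matters when later optimizations are allowed to assume that no infinities occur.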

mlir/lib/Dialect/Arith/IR/ArithOps.cpp

	[2,381 lines not shown]
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Atomic Enum			// Atomic Enum
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	/// Returns the identity value attribute associated with an AtomicRMWKind op.			/// Returns the identity value attribute associated with an AtomicRMWKind op.
	TypedAttr mlir::arith::getIdentityValueAttr(AtomicRMWKind kind, Type resultType,			TypedAttr mlir::arith::getIdentityValueAttr(AtomicRMWKind kind, Type resultType,
	OpBuilder &builder, Location loc) {			OpBuilder &builder, Location loc,
				bool useOnlyFiniteValue) {
	switch (kind) {			switch (kind) {
	case AtomicRMWKind::maxf:			case AtomicRMWKind::maxf: {
	return builder.getFloatAttr(			const llvm::fltSemantics &semantic =
	resultType,			llvm::cast<FloatType>(resultType).getFloatSemantics();
	APFloat::getInf(llvm::cast<FloatType>(resultType).getFloatSemantics(),			APFloat identity = useOnlyFiniteValue
	/*Negative=*/true));			? APFloat::getSmallest(semantic, /*Negative=*/true)
				rengolin (Not Done): Later we could discuss whether it makes sense to expand fast-math logic into `APFloat`, so that `getSmallest`/`getLargest` would "know what to do" against the fast-math flags.
				: APFloat::getInf(semantic, /*Negative=*/true);
				return builder.getFloatAttr(resultType, identity);
				}
	case AtomicRMWKind::addf:			case AtomicRMWKind::addf:
	case AtomicRMWKind::addi:			case AtomicRMWKind::addi:
	case AtomicRMWKind::maxu:			case AtomicRMWKind::maxu:
	case AtomicRMWKind::ori:			case AtomicRMWKind::ori:
	return builder.getZeroAttr(resultType);			return builder.getZeroAttr(resultType);
	case AtomicRMWKind::andi:			case AtomicRMWKind::andi:
	return builder.getIntegerAttr(			return builder.getIntegerAttr(
	resultType,			resultType,
	APInt::getAllOnes(llvm::cast<IntegerType>(resultType).getWidth()));			APInt::getAllOnes(llvm::cast<IntegerType>(resultType).getWidth()));
	case AtomicRMWKind::maxs:			case AtomicRMWKind::maxs:
	return builder.getIntegerAttr(			return builder.getIntegerAttr(
	resultType, APInt::getSignedMinValue(			resultType, APInt::getSignedMinValue(
	llvm::cast<IntegerType>(resultType).getWidth()));			llvm::cast<IntegerType>(resultType).getWidth()));
	case AtomicRMWKind::minf:			case AtomicRMWKind::minf: {
	return builder.getFloatAttr(			const llvm::fltSemantics &semantic =
	resultType,			llvm::cast<FloatType>(resultType).getFloatSemantics();
	APFloat::getInf(llvm::cast<FloatType>(resultType).getFloatSemantics(),			APFloat identity = useOnlyFiniteValue
	/*Negative=*/false));			? APFloat::getLargest(semantic, /*Negative=*/false)
				: APFloat::getInf(semantic, /*Negative=*/false);

				return builder.getFloatAttr(resultType, identity);
				}
	case AtomicRMWKind::mins:			case AtomicRMWKind::mins:
	return builder.getIntegerAttr(			return builder.getIntegerAttr(
	resultType, APInt::getSignedMaxValue(			resultType, APInt::getSignedMaxValue(
	llvm::cast<IntegerType>(resultType).getWidth()));			llvm::cast<IntegerType>(resultType).getWidth()));
	case AtomicRMWKind::minu:			case AtomicRMWKind::minu:
	return builder.getIntegerAttr(			return builder.getIntegerAttr(
	resultType,			resultType,
	APInt::getMaxValue(llvm::cast<IntegerType>(resultType).getWidth()));			APInt::getMaxValue(llvm::cast<IntegerType>(resultType).getWidth()));
	case AtomicRMWKind::muli:			case AtomicRMWKind::muli:
	return builder.getIntegerAttr(resultType, 1);			return builder.getIntegerAttr(resultType, 1);
	case AtomicRMWKind::mulf:			case AtomicRMWKind::mulf:
	return builder.getFloatAttr(resultType, 1);			return builder.getFloatAttr(resultType, 1);
	// TODO: Add remaining reduction operations.			// TODO: Add remaining reduction operations.
	default:			default:
	(void)emitOptionalError(loc, "Reduction operation type not supported");			(void)emitOptionalError(loc, "Reduction operation type not supported");
	break;			break;
	}			}
	return nullptr;			return nullptr;
	}			}

	/// Return the identity numeric value associated with the given op.			/// Return the identity numeric value associated with the given op.
	std::optional<TypedAttr> mlir::arith::getNeutralElement(Operation *op) {			std::optional<TypedAttr> mlir::arith::getNeutralElement(Operation *op) {
	std::optional<AtomicRMWKind> maybeKind =			std::optional<AtomicRMWKind> maybeKind =
				Hardcode84 (Done): While we need to pass the flag explicitly to the `AtomicRMWKind` version, for the op version we can take the fastmath flag from the op itself (which already has fastmath flags support, I believe).
				qcolombet (Author, Done): I think the fastmath flags are only on arith operations, not on the generic ones. I.e., I'm not sure this is generally possible.
				Hardcode84 (Done): The current `getNeutralElement(op)` implementation supports only arith ops, so I don't see a problem here.
				qcolombet (Author, Done): Good point. Let me give it a try then.
				qcolombet (Author, Done): Please take a look @Hardcode84.
	llvm::TypeSwitch<Operation *, std::optional<AtomicRMWKind>>(op)			llvm::TypeSwitch<Operation *, std::optional<AtomicRMWKind>>(op)
	// Floating-point operations.			// Floating-point operations.
	.Case([](arith::AddFOp op) { return AtomicRMWKind::addf; })			.Case([](arith::AddFOp op) { return AtomicRMWKind::addf; })
	.Case([](arith::MulFOp op) { return AtomicRMWKind::mulf; })			.Case([](arith::MulFOp op) { return AtomicRMWKind::mulf; })
	.Case([](arith::MaxFOp op) { return AtomicRMWKind::maxf; })			.Case([](arith::MaxFOp op) { return AtomicRMWKind::maxf; })
	.Case([](arith::MinFOp op) { return AtomicRMWKind::minf; })			.Case([](arith::MinFOp op) { return AtomicRMWKind::minf; })
	// Integer operations.			// Integer operations.
	.Case([](arith::AddIOp op) { return AtomicRMWKind::addi; })			.Case([](arith::AddIOp op) { return AtomicRMWKind::addi; })
	.Case([](arith::OrIOp op) { return AtomicRMWKind::ori; })			.Case([](arith::OrIOp op) { return AtomicRMWKind::ori; })
	.Case([](arith::XOrIOp op) { return AtomicRMWKind::ori; })			.Case([](arith::XOrIOp op) { return AtomicRMWKind::ori; })
	.Case([](arith::AndIOp op) { return AtomicRMWKind::andi; })			.Case([](arith::AndIOp op) { return AtomicRMWKind::andi; })
	.Case([](arith::MaxUIOp op) { return AtomicRMWKind::maxu; })			.Case([](arith::MaxUIOp op) { return AtomicRMWKind::maxu; })
	.Case([](arith::MinUIOp op) { return AtomicRMWKind::minu; })			.Case([](arith::MinUIOp op) { return AtomicRMWKind::minu; })
	.Case([](arith::MaxSIOp op) { return AtomicRMWKind::maxs; })			.Case([](arith::MaxSIOp op) { return AtomicRMWKind::maxs; })
	.Case([](arith::MinSIOp op) { return AtomicRMWKind::mins; })			.Case([](arith::MinSIOp op) { return AtomicRMWKind::mins; })
	.Case([](arith::MulIOp op) { return AtomicRMWKind::muli; })			.Case([](arith::MulIOp op) { return AtomicRMWKind::muli; })
	.Default([](Operation *op) { return std::nullopt; });			.Default([](Operation *op) { return std::nullopt; });
	if (!maybeKind) {			if (!maybeKind) {
	op->emitError() << "Unknown neutral element for: " << *op;			op->emitError() << "Unknown neutral element for: " << *op;
	return std::nullopt;			return std::nullopt;
	}			}

				bool useOnlyFiniteValue = false;
				auto fmfOpInterface = dyn_cast<ArithFastMathInterface>(op);
				Hardcode84 (Done): There is `ArithFastMathInterface`; I think you can get the flags from it instead of iterating through all attrs.
				qcolombet (Author, Done): Ah thanks!
				if (fmfOpInterface) {
				arith::FastMathFlagsAttr fmfAttr = fmfOpInterface.getFastMathFlagsAttr();
				useOnlyFiniteValue =
				bitEnumContainsAny(fmfAttr.getValue(), arith::FastMathFlags::ninf);
				}

	// Builder only used as helper for attribute creation.			// Builder only used as helper for attribute creation.
	OpBuilder b(op->getContext());			OpBuilder b(op->getContext());
	Type resultType = op->getResult(0).getType();			Type resultType = op->getResult(0).getType();

	return getIdentityValueAttr(*maybeKind, resultType, b, op->getLoc());			return getIdentityValueAttr(*maybeKind, resultType, b, op->getLoc(),
				useOnlyFiniteValue);
	}			}

	/// Returns the identity value associated with an AtomicRMWKind op.			/// Returns the identity value associated with an AtomicRMWKind op.
	Value mlir::arith::getIdentityValue(AtomicRMWKind op, Type resultType,			Value mlir::arith::getIdentityValue(AtomicRMWKind op, Type resultType,
	OpBuilder &builder, Location loc) {			OpBuilder &builder, Location loc,
	auto attr = getIdentityValueAttr(op, resultType, builder, loc);			bool useOnlyFiniteValue) {
				auto attr =
				getIdentityValueAttr(op, resultType, builder, loc, useOnlyFiniteValue);
	return builder.create<arith::ConstantOp>(loc, attr);			return builder.create<arith::ConstantOp>(loc, attr);
	}			}

	/// Return the value obtained by applying the reduction operation kind			/// Return the value obtained by applying the reduction operation kind
	/// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.			/// associated with a binary AtomicRMWKind op to `lhs` and `rhs`.
	Value mlir::arith::getReductionOp(AtomicRMWKind op, OpBuilder &builder,			Value mlir::arith::getReductionOp(AtomicRMWKind op, OpBuilder &builder,
	Location loc, Value lhs, Value rhs) {			Location loc, Value lhs, Value rhs) {
	switch (op) {			switch (op) {
	[44 lines not shown]
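
The `getNeutralElement` change above derives `useOnlyFiniteValue` from the op's fast-math flags instead of taking it as a parameter. A rough sketch of that decision in standard C++ follows. The `FastMath` enum, its bit values, `containsAny`, and `useOnlyFiniteValue` are all hypothetical stand-ins for `arith::FastMathFlags`, `bitEnumContainsAny`, and the interface check; they are not the MLIR definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit-enum modelled on arith::FastMathFlags. The numeric values
// are illustrative only and are not the ones MLIR uses.
enum class FastMath : std::uint32_t {
  none = 0,
  nnan = 1u << 0,
  ninf = 1u << 1,
  nsz = 1u << 2,
};

constexpr FastMath operator|(FastMath a, FastMath b) {
  return static_cast<FastMath>(static_cast<std::uint32_t>(a) |
                               static_cast<std::uint32_t>(b));
}

// Same idea as bitEnumContainsAny: true if `flags` shares any bit with `bit`.
constexpr bool containsAny(FastMath flags, FastMath bit) {
  return (static_cast<std::uint32_t>(flags) &
          static_cast<std::uint32_t>(bit)) != 0;
}

// Mirrors the decision in the patch: an op that exposes fast-math flags and
// carries `ninf` gets a finite identity; anything else keeps the default.
constexpr bool useOnlyFiniteValue(bool hasFastMathInterface, FastMath flags) {
  return hasFastMathInterface && containsAny(flags, FastMath::ninf);
}
```

With this shape, an op tagged `fastmath<nnan,ninf>` selects the finite identity, while an op with only `nnan`, or one without the interface, keeps the infinite one.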

mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp

[2,486 lines not shown]
FailureOr<SmallVector<Value>> SoftmaxOp::decomposeOperation(OpBuilder &b) {
Type elementType = inputType.getElementType();		Type elementType = inputType.getElementType();
int64_t reductionDim = getDimension();		int64_t reductionDim = getDimension();
SmallVector<OpFoldResult> dims = tensor::getMixedSizes(b, loc, input);		SmallVector<OpFoldResult> dims = tensor::getMixedSizes(b, loc, input);
Value output = getOutput();		Value output = getOutput();
dims.erase(dims.begin() + reductionDim);		dims.erase(dims.begin() + reductionDim);
// Step 1: Compute max along dim.		// Step 1: Compute max along dim.
Value outputReduce = b.create<tensor::EmptyOp>(loc, dims, elementType);		Value outputReduce = b.create<tensor::EmptyOp>(loc, dims, elementType);
Value neutralForMaxF =		Value neutralForMaxF =
arith::getIdentityValue(arith::AtomicRMWKind::maxf, elementType, b, loc);		arith::getIdentityValue(arith::AtomicRMWKind::maxf, elementType, b, loc,
		/*useOnlyFiniteValue=*/true);
Value neutralForMaxFInit =		Value neutralForMaxFInit =
b.create<linalg::FillOp>(loc, Value{neutralForMaxF}, outputReduce)		b.create<linalg::FillOp>(loc, Value{neutralForMaxF}, outputReduce)
.result();		.result();
Value max =		Value max =
reduce<arith::MaxFOp>(b, loc, input, neutralForMaxFInit, reductionDim);		reduce<arith::MaxFOp>(b, loc, input, neutralForMaxFInit, reductionDim);

// Step 2: Subtract max from input and exponentiate.		// Step 2: Subtract max from input and exponentiate.
Value numerator = buildSubAndExpOp(b, loc, input, max, output, reductionDim);		Value numerator = buildSubAndExpOp(b, loc, input, max, output, reductionDim);

// Step 3: Compute sum along dim.		// Step 3: Compute sum along dim.
Value zero =		Value zero = arith::getIdentityValue(arith::AtomicRMWKind::addf, elementType,
arith::getIdentityValue(arith::AtomicRMWKind::addf, elementType, b, loc);		b, loc, /*useOnlyFiniteValue=*/true);
Value zeroInit =		Value zeroInit =
b.create<linalg::FillOp>(loc, Value{zero}, outputReduce).result();		b.create<linalg::FillOp>(loc, Value{zero}, outputReduce).result();
Value denominator =		Value denominator =
reduce<arith::AddFOp>(b, loc, numerator, zeroInit, reductionDim);		reduce<arith::AddFOp>(b, loc, numerator, zeroInit, reductionDim);

// Step 4: Compute softmax.		// Step 4: Compute softmax.
Value result =		Value result =
buildDivOp(b, loc, numerator, denominator, output, reductionDim);		buildDivOp(b, loc, numerator, denominator, output, reductionDim);
[18 lines not shown]
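
The four decomposition steps above can be sketched for a single row in standard C++. This is an illustration under stated assumptions, not the Linalg lowering: `softmaxRow` is a hypothetical helper, and seeding the max with `std::numeric_limits<float>::lowest()` is this sketch's choice of a finite stand-in for `-inf`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar version of the softmax decomposition for one row.
inline std::vector<float> softmaxRow(const std::vector<float> &row) {
  assert(!row.empty());

  // Step 1: compute the max, starting from a finite identity for max.
  float maxVal = std::numeric_limits<float>::lowest();
  for (float x : row)
    maxVal = std::max(maxVal, x);

  // Step 2: subtract the max from each input and exponentiate.
  std::vector<float> numerator(row.size());
  for (std::size_t i = 0; i < row.size(); ++i)
    numerator[i] = std::exp(row[i] - maxVal);

  // Step 3: compute the sum, starting from the identity for add (zero).
  float denominator = 0.0f;
  for (float n : numerator)
    denominator += n;

  // Step 4: divide each numerator by the sum.
  for (float &n : numerator)
    n /= denominator;
  return numerator;
}
```

Subtracting the row maximum first keeps every exponent at or below zero, so `std::exp` cannot overflow for finite inputs.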

mlir/test/Dialect/Linalg/transform-op-decompose.mlir

	[204 lines not shown]
	func.func @softmax(%arg0: tensor<2x16x32xf32>, %dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32> {			func.func @softmax(%arg0: tensor<2x16x32xf32>, %dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32> {
	%1 = linalg.softmax dimension(2) ins(%arg0 : tensor<2x16x32xf32>) outs(%dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32>			%1 = linalg.softmax dimension(2) ins(%arg0 : tensor<2x16x32xf32>) outs(%dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32>
	return %1 : tensor<2x16x32xf32>			return %1 : tensor<2x16x32xf32>
	}			}

	// CHECK-LABEL: func.func @softmax(			// CHECK-LABEL: func.func @softmax(
	// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]+]]: tensor<2x16x32xf32>, %[[DST:[a-zA-Z0-9_]+]]: tensor<2x16x32xf32>) -> tensor<2x16x32xf32> {			// CHECK-SAME: %[[ARG0:[a-zA-Z0-9_]+]]: tensor<2x16x32xf32>, %[[DST:[a-zA-Z0-9_]+]]: tensor<2x16x32xf32>) -> tensor<2x16x32xf32> {
	// CHECK-DAG: %[[D1:.+]] = tensor.empty() : tensor<2x16xf32>			// CHECK-DAG: %[[D1:.+]] = tensor.empty() : tensor<2x16xf32>
	// CHECK-DAG: %[[CST:.+]] = arith.constant 0xFF800000 : f32			// CHECK-DAG: %[[CST:.+]] = arith.constant -1.401300e-45 : f32
				rengolin (Done): With no way to currently pass the flag through `mlir-opt`, it's hard to test the `inf` case...
				qcolombet (Author, Done): Good point, maybe we'll want a sort of index-width option equivalent for fast math handling.
				qcolombet (Author, Done): Added a test through the reduction split test, following @Hardcode84's recommendation of using the fastmath flag attached to the operation when using `getNeutralElement`.
	// CHECK: %[[D2:.+]] = linalg.fill ins(%[[CST]] : f32) outs(%[[D1]] : tensor<2x16xf32>) -> tensor<2x16xf32>			// CHECK: %[[D2:.+]] = linalg.fill ins(%[[CST]] : f32) outs(%[[D1]] : tensor<2x16xf32>) -> tensor<2x16xf32>
	// CHECK: %[[D3:.+]] = linalg.generic {indexing_maps = [#[[$MAP]], #[[$MAP1]]], iterator_types = ["parallel",			// CHECK: %[[D3:.+]] = linalg.generic {indexing_maps = [#[[$MAP]], #[[$MAP1]]], iterator_types = ["parallel",
	// CHECK-SAME: "parallel", "reduction"]} ins(%[[ARG0]] : tensor<2x16x32xf32>) outs(%[[D2]] : tensor<2x16xf32>) {			// CHECK-SAME: "parallel", "reduction"]} ins(%[[ARG0]] : tensor<2x16x32xf32>) outs(%[[D2]] : tensor<2x16xf32>) {
	// CHECK: ^bb0(%[[IN:.+]]: f32, %[[OUT:.+]]: f32):			// CHECK: ^bb0(%[[IN:.+]]: f32, %[[OUT:.+]]: f32):
	// CHECK: %[[D8:.+]] = arith.maxf %[[IN]], %[[OUT]] : f32			// CHECK: %[[D8:.+]] = arith.maxf %[[IN]], %[[OUT]] : f32
	// CHECK: linalg.yield %[[D8]] : f32			// CHECK: linalg.yield %[[D8]] : f32
	// CHECK: } -> tensor<2x16xf32>			// CHECK: } -> tensor<2x16xf32>
	// CHECK: %[[D4:.+]] = linalg.generic {indexing_maps = [#[[$MAP]], #[[$MAP1]], #[[$MAP]]], iterator_types =			// CHECK: %[[D4:.+]] = linalg.generic {indexing_maps = [#[[$MAP]], #[[$MAP1]], #[[$MAP]]], iterator_types =
	[32 lines not shown]
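
The changed CHECK line above replaces `0xFF800000`, the f32 bit pattern of `-inf`, with `-1.401300e-45`. That value is the negated smallest-magnitude (denormal) f32, which is what `APFloat::getSmallest(semantic, /*Negative=*/true)` produces, since `getSmallest` is defined by magnitude. It is not the most negative finite f32 (about `-3.40282347E+38`), which in `APFloat` terms corresponds to `getLargest` with `Negative` set. The sketch below uses only standard C++, not MLIR or `APFloat`, to make the three values concrete; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Return the raw IEEE-754 bit pattern of a 32-bit float.
inline std::uint32_t bitsOf(float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch expects a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// -inf: the old expected constant (0xFF800000).
inline float negativeInfinity() {
  return -std::numeric_limits<float>::infinity();
}

// Negated smallest-magnitude denormal: prints as -1.401300e-45.
inline float negatedSmallestMagnitude() {
  return -std::numeric_limits<float>::denorm_min();
}

// Most negative finite float: about -3.40282347E+38.
inline float mostNegativeFinite() {
  return std::numeric_limits<float>::lowest();
}
```

A value is only neutral for max if `max(identity, x) == x` for every input of interest. That holds for `mostNegativeFinite()` over all finite inputs, but `negatedSmallestMagnitude()` wins against any input below it, for example `-1.0f`.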

mlir/test/Dialect/Linalg/transform-op-split-reduction.mlir

	[135 lines not shown]
	^bb1(%arg1: !transform.any_op):			^bb1(%arg1: !transform.any_op):
	%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op			%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
	%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2}			%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2}
	: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)			: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
	}			}

	// -----			// -----

				// Check that we don't use -inf as the neutral element for maxf when maxf has
				// ninf. Instead check that we use the smallest finite floating point value.
				// Also check that the fastmath flags are set on the created maxf
				// instructions.
				func.func @generic_split_3d_ninf(%input: tensor<32x2xf32>, %input_2: tensor<5x32xf32>, %output: tensor<5x2xf32>)
				-> tensor<5x2xf32>
				{
				%0 = linalg.generic {
				indexing_maps = [
				affine_map<(d0, d1, d2) -> (d1, d0)>,
				affine_map<(d0, d1, d2) -> (d2, d1)>,
				affine_map<(d0, d1, d2) -> (d2, d0)>
				],
				iterator_types = ["parallel", "reduction", "parallel"]
				} ins(%input, %input_2 : tensor<32x2xf32>, tensor<5x32xf32>) outs(%output : tensor<5x2xf32>) {
				^bb0(%arg0: f32, %arg1: f32, %arg2: f32):
				%3 = arith.addf %arg0, %arg1 : f32
				%4 = arith.maxf %3, %arg2 fastmath<nnan,ninf> : f32
				linalg.yield %4 : f32
				} -> tensor<5x2xf32>
				return %0 : tensor<5x2xf32>
				}

				// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d2, d1, d0)>
				// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d3, d2, d1)>
				// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1, d2, d3) -> (d3, d0, d2)>
				// CHECK-DAG: #[[$MAP3:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
				// CHECK-DAG: #[[$MAP4:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>
				// CHECK-LABEL: func @generic_split_3d_ninf
				// CHECK-DAG: %[[ID:.*]] = arith.constant -1.401300e-45 : f32
				// CHECK-DAG: %[[I1:.*]] = tensor.expand_shape %{{.*}}[0, 1], [2]] : tensor<32x2xf32> into tensor<4x8x2xf32>
				// CHECK-DAG: %[[I2:.*]] = tensor.expand_shape %{{.*}}[0], [1, 2]] : tensor<5x32xf32> into tensor<5x4x8xf32>
				// CHECK-DAG: %[[INI:.*]] = tensor.empty() : tensor<5x2x4xf32>
				// CHECK: %[[F:.*]] = linalg.fill ins(%[[ID]] : f32) outs(%[[INI]] : tensor<5x2x4xf32>) -> tensor<5x2x4xf32>
				// CHECK: %[[G:.*]] = linalg.generic {indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP2]]], iterator_types = ["parallel", "reduction", "parallel", "parallel"]}
				// CHECK-SAME: ins(%[[I1]], %[[I2]] : tensor<4x8x2xf32>, tensor<5x4x8xf32>) outs(%[[F]] : tensor<5x2x4xf32>) {
				// CHECK: arith.addf
				// CHECK: arith.maxf {{.*}} fastmath<nnan,ninf>
				// CHECK: linalg.yield
				// CHECK: } -> tensor<5x2x4xf32>
				// CHECK: %[[R:.*]] = linalg.generic {indexing_maps = [#[[$MAP3]], #[[$MAP4]]], iterator_types = ["parallel", "parallel", "reduction"]}
				// CHECK-SAME: ins(%[[G]] : tensor<5x2x4xf32>) outs(%{{.*}} : tensor<5x2xf32>) {
				// CHECK: arith.maxf {{.*}} fastmath<nnan,ninf>
				// CHECK: linalg.yield
				// CHECK: } -> tensor<5x2xf32>
				// CHECK: return %[[R]] : tensor<5x2xf32>

				transform.sequence failures(propagate) {
				^bb1(%arg1: !transform.any_op):
				%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
				%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2}
				: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
				}

				// -----

	func.func @matmul_split(%A : tensor<16x256xf32>, %B: tensor<256x32xf32>, %C: tensor<16x32xf32>) -> tensor<16x32xf32> {			func.func @matmul_split(%A : tensor<16x256xf32>, %B: tensor<256x32xf32>, %C: tensor<16x32xf32>) -> tensor<16x32xf32> {
	%0 = linalg.matmul ins(%A, %B: tensor<16x256xf32>, tensor<256x32xf32>)			%0 = linalg.matmul ins(%A, %B: tensor<16x256xf32>, tensor<256x32xf32>)
	outs(%C: tensor<16x32xf32>) -> tensor<16x32xf32>			outs(%C: tensor<16x32xf32>) -> tensor<16x32xf32>
	return %0: tensor<16x32xf32>			return %0: tensor<16x32xf32>
	}			}

	// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>			// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>
	// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d2, d3, d1)>			// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d2, d3, d1)>
	[122 lines not shown]
	// CHECK: return %[[R]] : tensor<5x2xf32>			// CHECK: return %[[R]] : tensor<5x2xf32>

	transform.sequence failures(propagate) {			transform.sequence failures(propagate) {
	^bb1(%arg1: !transform.any_op):			^bb1(%arg1: !transform.any_op):
	%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op			%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
	%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2, inner_parallel}			%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2, inner_parallel}
	: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)			: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
	}			}

				// -----

				// Check that we don't use +inf as the neutral element for minf when minf has
				// ninf. Instead check that we use the largest finite floating point value.
				// Also check that the fastmath flags are set on the created minf
				// instructions.
				func.func @generic_split_3d(%input: tensor<32x2xf32>, %input_2: tensor<5x32xf32>, %output: tensor<5x2xf32>)
				-> tensor<5x2xf32>
				{
				%0 = linalg.generic {
				indexing_maps = [
				affine_map<(d0, d1, d2) -> (d1, d0)>,
				affine_map<(d0, d1, d2) -> (d2, d1)>,
				affine_map<(d0, d1, d2) -> (d2, d0)>
				],
				iterator_types = ["parallel", "reduction", "parallel"]
				} ins(%input, %input_2 : tensor<32x2xf32>, tensor<5x32xf32>) outs(%output : tensor<5x2xf32>) {
				^bb0(%arg0: f32, %arg1: f32, %arg2: f32):
				%3 = arith.addf %arg0, %arg1 : f32
				%4 = arith.minf %3, %arg2 fastmath<ninf> : f32
				linalg.yield %4 : f32
				} -> tensor<5x2xf32>
				return %0 : tensor<5x2xf32>
				}

				// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d1, d2, d0)>
				// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1, d2, d3) -> (d3, d1, d2)>
				// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1, d2, d3) -> (d3, d0, d2)>
				// CHECK-DAG: #[[$MAP3:.*]] = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
				// CHECK-DAG: #[[$MAP4:.*]] = affine_map<(d0, d1, d2) -> (d0, d1)>
				// CHECK-LABEL: func @generic_split_3d
				// CHECK-DAG: %[[ID:.*]] = arith.constant 3.40282347E+38 : f32
				// CHECK-DAG: %[[I1:.*]] = tensor.expand_shape %{{.*}}[0, 1], [2]] : tensor<32x2xf32> into tensor<8x4x2xf32>
				// CHECK-DAG: %[[I2:.*]] = tensor.expand_shape %{{.*}}[0], [1, 2]] : tensor<5x32xf32> into tensor<5x8x4xf32>
				// CHECK-DAG: %[[INI:.*]] = tensor.empty() : tensor<5x2x4xf32>
				// CHECK: %[[F:.*]] = linalg.fill ins(%[[ID]] : f32) outs(%[[INI]] : tensor<5x2x4xf32>) -> tensor<5x2x4xf32>
				// CHECK: %[[G:.*]] = linalg.generic {indexing_maps = [#[[$MAP0]], #[[$MAP1]], #[[$MAP2]]], iterator_types = ["parallel", "reduction", "parallel", "parallel"]}
				// CHECK-SAME: ins(%[[I1]], %[[I2]] : tensor<8x4x2xf32>, tensor<5x8x4xf32>) outs(%[[F]] : tensor<5x2x4xf32>) {
				// CHECK: arith.addf
				// CHECK: arith.minf {{.*}} fastmath<ninf>
				// CHECK: linalg.yield
				// CHECK: } -> tensor<5x2x4xf32>
				// CHECK: %[[R:.*]] = linalg.generic {indexing_maps = [#[[$MAP3]], #[[$MAP4]]], iterator_types = ["parallel", "parallel", "reduction"]}
				// CHECK-SAME: ins(%[[G]] : tensor<5x2x4xf32>) outs(%{{.*}} : tensor<5x2xf32>) {
				// CHECK: arith.minf {{.*}} fastmath<ninf>
				// CHECK: linalg.yield
				// CHECK: } -> tensor<5x2xf32>
				// CHECK: return %[[R]] : tensor<5x2xf32>

				transform.sequence failures(propagate) {
				^bb1(%arg1: !transform.any_op):
				%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
				%1:4 = transform.structured.split_reduction %0 { split_factor = 4, insert_split_dimension = 2, inner_parallel}
				: (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op, !transform.any_op)
				}
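
The split-reduction tests above show why the neutral element matters: the transform fills a fresh tensor of partial accumulators with the identity (`linalg.fill`), reduces into those, and then reduces the partials into the original output. A scalar sketch of that shape in standard C++ follows; `reduceMaxDirect` and `reduceMaxSplit` are hypothetical helpers, and the interleaved assignment of elements to partials is only one possible split, not necessarily the one the transform uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Plain max-reduction into an existing accumulator value `init`.
inline float reduceMaxDirect(const std::vector<float> &values, float init) {
  float acc = init;
  for (float v : values)
    acc = std::max(acc, v);
  return acc;
}

// Split max-reduction: `splitFactor` partial accumulators are filled with
// `identity`, reduced independently, then folded into `init`.
inline float reduceMaxSplit(
    const std::vector<float> &values, std::size_t splitFactor, float init,
    float identity = std::numeric_limits<float>::lowest()) {
  assert(splitFactor > 0 && values.size() % splitFactor == 0);

  // Equivalent of the linalg.fill with the neutral element.
  std::vector<float> partial(splitFactor, identity);

  // First reduction: each element goes to one partial accumulator.
  for (std::size_t i = 0; i < values.size(); ++i)
    partial[i % splitFactor] = std::max(partial[i % splitFactor], values[i]);

  // Second reduction: fold the partials into the original accumulator.
  float acc = init;
  for (float p : partial)
    acc = std::max(acc, p);
  return acc;
}
```

The split form matches the direct one only if `identity` never beats a real input, which is why the choice between `-inf` and a finite value has to be a true neutral element for the inputs that can occur.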

This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Support fast-math friendly constants for identity value
ClosedPublic
