This is an archive of the discontinued LLVM Phabricator instance.


Differential D82624

[mlir] [VectorOps] Add the ability to mark FP reductions with "reassociate" attribute
ClosedPublic

Authored by aartbik on Jun 25 2020, 10:41 PM.


Details

Reviewers

ftynse
nicolasvasilache
reidtatge
mehdi_amini
sanjoy

Commits

rGceb1b327b53c: [mlir] [VectorOps] Add the ability to mark FP reductions with "reassociate"…

Summary

Rationale:
In general, passing "fastmath" from MLIR to the LLVM backend is not supported, and even just providing such a feature for experimentation is under debate. However, passing fine-grained fastmath-related attributes on individual operations is generally accepted. This CL introduces an option to instruct the vector-to-llvm lowering phase to annotate floating-point reductions with the "reassociate" fastmath attribute, which allows the LLVM backend to use SIMD implementations for such constructs. Other lowering passes can start using this mechanism right away in cases where reassociation is allowed.

Benefit:
For some microbenchmarks on x86-avx2, speedups of over 20x were observed for longer vectors (due to cleaner, spill-free, SIMD-exploiting code).

Usage:
mlir-opt --convert-vector-to-llvm="reassociate-fp-reductions"
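
The reason this is an opt-in flag is that floating-point addition is not associative, so a reordered (SIMD-friendly) reduction can produce a different result than the strictly ordered one. The following is a minimal, self-contained sketch of that effect in plain C++; it does not use MLIR or LLVM, and the helper names `reduceOrdered` and `reducePairwise` are made up for illustration. The pairwise order is just one of the orders that reassociation would permit, not necessarily the one a backend picks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strictly ordered reduction: ((acc + v[0]) + v[1]) + ...
// This is the order a reduction must follow when reassociation is not allowed.
float reduceOrdered(float acc, const std::vector<float> &v) {
  for (float x : v)
    acc += x;
  return acc;
}

// Pairwise (tree-shaped) reduction: adds neighbours, then neighbours of the
// partial sums, and so on. Hypothetical illustration of a reassociated order.
float reducePairwise(std::vector<float> v) {
  assert(!v.empty() && "sketch expects at least one element");
  while (v.size() > 1) {
    std::vector<float> next;
    for (std::size_t i = 0; i + 1 < v.size(); i += 2)
      next.push_back(v[i] + v[i + 1]);
    if (v.size() % 2 != 0)
      next.push_back(v.back()); // odd element is carried to the next round
    v = next;
  }
  return v[0];
}
```

With the input `{1e8f, 1.0f, -1e8f, 1.0f}` and IEEE single-precision arithmetic, the ordered sum absorbs the first `1.0f` into `1e8f` and ends at `1.0f`, while the pairwise sum absorbs both small terms and ends at `0.0f`. Whether such a difference is acceptable depends on the computation, which is why the lowering only sets "reassociate" on request.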

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aartbik created this revision. Jun 25 2020, 10:41 PM

Herald added a reviewer: ftynse. Jun 25 2020, 10:41 PM

Herald added a project: Restricted Project.

Herald added subscribers: msifontes, jurahul, Kayjukh and 13 others.

aartbik added reviewers: nicolasvasilache, reidtatge, mehdi_amini, sanjoy. Jun 25 2020, 10:42 PM

Harbormaster failed remote builds in B61867: Diff 273594! Jun 25 2020, 11:25 PM

ftynse requested changes to this revision. Jun 26 2020, 12:58 AM

ftynse added inline comments.

mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td
246	I'm generally reluctant to have LLVM dialect operations that don't exist in LLVM IR, as either operations or intrinsics. Could we rather add attributes to the already existing `LLVM_VectorReductionV2` for reassociation (and "fast", since there are only two that need to be supported)? This should be as simple as `Arguments<(ins LLVM_Type, LLVM_Type, DefaultValuedAttr<BoolAttr, "false">:$reassoc, DefaultValuedAttr<BoolAttr, "false">:$fast)>` and using them in the `llvmBuilder`. When we have a general mechanism for fast-math flags, we can rely on that instead.

This revision now requires changes to proceed. Jun 26 2020, 12:58 AM

aartbik marked an inline comment as done. Jun 26 2020, 9:11 AM

aartbik added inline comments.

mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td
246	Thanks. Agreed, an attribute is cleaner. Changed that (note that I just need $reassoc, not $fast). I had an internal prototype for passing on fast-math flags in general, but that stopped after an internal discussion, since it was deemed too dangerous. Hence the intrinsic-focused solution for now.

Used an attribute rather than naming for reassoc=true.

aartbik edited the summary of this revision. Jun 26 2020, 10:17 AM

aartbik edited the summary of this revision.

Thanks!

This revision is now accepted and ready to land. Jun 26 2020, 10:19 AM

Harbormaster failed remote builds in B61957: Diff 273773! Jun 26 2020, 10:55 AM

Nice! I agree the attribute is appropriate here.

To be clear: I'm not against a fast-math for experimenting, I wanted to make sure we build what you did here as a "proper" solution and we don't rely on the "big hammer" global fast-math instead.

Closed by commit rGceb1b327b53c: [mlir] [VectorOps] Add the ability to mark FP reductions with "reassociate"… (authored by aartbik). Jun 26 2020, 11:30 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
mlir/include/mlir/Conversion/Passes.td	5 lines
mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h	5 lines
mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td	28 lines
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp	26 lines
mlir/test/Conversion/VectorToLLVM/vector-reduction-to-llvm.mlir	42 lines
mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir	2 lines
mlir/test/Target/llvmir-intrinsics.mlir	4 lines

Diff 273795

mlir/include/mlir/Conversion/Passes.td

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VectorToLLVM			// VectorToLLVM
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def ConvertVectorToLLVM : Pass<"convert-vector-to-llvm", "ModuleOp"> {			def ConvertVectorToLLVM : Pass<"convert-vector-to-llvm", "ModuleOp"> {
	let summary = "Lower the operations from the vector dialect into the LLVM "			let summary = "Lower the operations from the vector dialect into the LLVM "
	"dialect";			"dialect";
	let constructor = "mlir::createConvertVectorToLLVMPass()";			let constructor = "mlir::createConvertVectorToLLVMPass()";
				let options = [
				Option<"reassociateFPReductions", "reassociate-fp-reductions",
				"bool", /*default=*/"false",
				"Allows llvm to reassociate floating-point reductions for speed">
				];
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// VectorToROCDL			// VectorToROCDL
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def ConvertVectorToROCDL : Pass<"convert-vector-to-rocdl", "ModuleOp"> {			def ConvertVectorToROCDL : Pass<"convert-vector-to-rocdl", "ModuleOp"> {
	let summary = "Lower the operations from the vector dialect into the ROCDL "			let summary = "Lower the operations from the vector dialect into the ROCDL "
	"dialect";			"dialect";
	let constructor = "mlir::createConvertVectorToROCDLPass()";			let constructor = "mlir::createConvertVectorToROCDLPass()";
	}			}

	#endif // MLIR_CONVERSION_PASSES			#endif // MLIR_CONVERSION_PASSES

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h


	/// Collect a set of patterns to convert from Vector contractions to LLVM Matrix			/// Collect a set of patterns to convert from Vector contractions to LLVM Matrix
	/// Intrinsics. To lower to assembly, the LLVM flag -lower-matrix-intrinsics			/// Intrinsics. To lower to assembly, the LLVM flag -lower-matrix-intrinsics
	/// will be needed when invoking LLVM.			/// will be needed when invoking LLVM.
	void populateVectorToLLVMMatrixConversionPatterns(			void populateVectorToLLVMMatrixConversionPatterns(
	LLVMTypeConverter &converter, OwningRewritePatternList &patterns);			LLVMTypeConverter &converter, OwningRewritePatternList &patterns);

	/// Collect a set of patterns to convert from the Vector dialect to LLVM.			/// Collect a set of patterns to convert from the Vector dialect to LLVM.
	void populateVectorToLLVMConversionPatterns(LLVMTypeConverter &converter,			void populateVectorToLLVMConversionPatterns(
	OwningRewritePatternList &patterns);			LLVMTypeConverter &converter, OwningRewritePatternList &patterns,
				bool reassociateFPReductions = false);

	/// Create a pass to convert vector operations to the LLVMIR dialect.			/// Create a pass to convert vector operations to the LLVMIR dialect.
	std::unique_ptr<OperationPass<ModuleOp>> createConvertVectorToLLVMPass();			std::unique_ptr<OperationPass<ModuleOp>> createConvertVectorToLLVMPass();

	} // namespace mlir			} // namespace mlir

	#endif // MLIR_CONVERSION_VECTORTOLLVM_CONVERTVECTORTOLLVM_H_			#endif // MLIR_CONVERSION_VECTORTOLLVM_CONVERTVECTORTOLLVM_H_

mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td

[...]	class LLVM_OneResultIntrOp<string mnem, list<int> overloadedResults = [],
list<OpTrait> traits = []>		list<OpTrait> traits = []>
: LLVM_IntrOp<mnem, overloadedResults, overloadedOperands, traits, 1>;		: LLVM_IntrOp<mnem, overloadedResults, overloadedOperands, traits, 1>;

// LLVM vector reduction over a single vector.		// LLVM vector reduction over a single vector.
class LLVM_VectorReduction<string mnem>		class LLVM_VectorReduction<string mnem>
: LLVM_OneResultIntrOp<"experimental.vector.reduce." # mnem, [], [0], []>,		: LLVM_OneResultIntrOp<"experimental.vector.reduce." # mnem, [], [0], []>,
Arguments<(ins LLVM_Type)>;		Arguments<(ins LLVM_Type)>;

// LLVM vector reduction over a single vector, with an initial value.		// LLVM vector reduction over a single vector, with an initial value,
		// and with permission to reassociate the reduction operations.
class LLVM_VectorReductionV2<string mnem>		class LLVM_VectorReductionV2<string mnem>
: LLVM_OneResultIntrOp<"experimental.vector.reduce.v2." # mnem,		: LLVM_OpBase<LLVM_Dialect, "intr.experimental.vector.reduce.v2." # mnem, []>,
[0], [1], []>,		Results<(outs LLVM_Type:$res)>,
Arguments<(ins LLVM_Type, LLVM_Type)>;		Arguments<(ins LLVM_Type, LLVM_Type,
		DefaultValuedAttr<BoolAttr, "false">:$reassoc)> {
		let llvmBuilder = [{
		llvm::Module *module = builder.GetInsertBlock()->getModule();
		llvm::Function *fn = llvm::Intrinsic::getDeclaration(
		module,
		llvm::Intrinsic::experimental_vector_reduce_v2_}] # mnem # [{,
		{ }] # StrJoin<!listconcat(
		ListIntSubst<LLVM_IntrPatterns.result, [0]>.lst,
		ListIntSubst<LLVM_IntrPatterns.operand, [1]>.lst)>.result # [{
		});
		auto operands = lookupValues(opInst.getOperands());
		llvm::FastMathFlags origFM = builder.getFastMathFlags();
		llvm::FastMathFlags tempFM = origFM;
		tempFM.setAllowReassoc($reassoc);
		builder.setFastMathFlags(tempFM); // set fastmath flag
		$res = builder.CreateCall(fn, operands);
		builder.setFastMathFlags(origFM); // restore fastmath flag
		}];
		}

#endif // LLVMIR_OP_BASE		#endif // LLVMIR_OP_BASE
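
The `llvmBuilder` body above saves the builder's fast-math flags, sets "reassoc" for a single call, and then restores the saved flags by hand. As a hedged illustration of the same save/set/restore idea, here is a self-contained sketch that uses a scope guard instead of manual restoration. `ToyFastMathFlags`, `ToyBuilder`, `ScopedFastMathFlags`, and `emitReduction` are hypothetical stand-ins invented for this sketch; they are not the LLVM classes and model only the pieces needed to show the pattern.

```cpp
#include <cassert>

// Hypothetical stand-in for a fast-math flag set; only "reassoc" is modeled.
struct ToyFastMathFlags {
  bool allowReassoc = false;
};

// Hypothetical stand-in for an IR builder whose current flags are stamped
// onto every call it creates.
struct ToyBuilder {
  ToyFastMathFlags flags;
  // Returns the flags that the newly created "call" would carry.
  ToyFastMathFlags createCall() const { return flags; }
};

// Installs temporary flags on construction and restores the previous flags
// on destruction, so the restore also happens on early exits.
class ScopedFastMathFlags {
public:
  ScopedFastMathFlags(ToyBuilder &b, ToyFastMathFlags temp)
      : builder(b), saved(b.flags) {
    builder.flags = temp;
  }
  ~ScopedFastMathFlags() { builder.flags = saved; }
  ScopedFastMathFlags(const ScopedFastMathFlags &) = delete;
  ScopedFastMathFlags &operator=(const ScopedFastMathFlags &) = delete;

private:
  ToyBuilder &builder;
  ToyFastMathFlags saved;
};

// Emits one "call" with reassociation set as requested and leaves the
// builder's own flags as they were before.
ToyFastMathFlags emitReduction(ToyBuilder &builder, bool reassoc) {
  const bool before = builder.flags.allowReassoc;
  ToyFastMathFlags call;
  {
    ToyFastMathFlags temp = builder.flags;
    temp.allowReassoc = reassoc;
    ScopedFastMathFlags guard(builder, temp);
    call = builder.createCall();
  } // guard restores the saved flags here
  assert(builder.flags.allowReassoc == before && "flags must be restored");
  return call;
}
```

The point of the guard is only that the flag change stays local to the one reduction call, which matches the intent of the "set fastmath flag" / "restore fastmath flag" comments in the diff. LLVM's `IRBuilderBase` offers a `FastMathFlagGuard` helper along similar lines; whether it would fit inside a TableGen `llvmBuilder` snippet is not something this review settles.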

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp

[...]	rewriter.replaceOpWithNewOp<LLVM::MatrixTransposeOp>(
adaptor.matrix(), transOp.rows(), transOp.columns());		adaptor.matrix(), transOp.rows(), transOp.columns());
return success();		return success();
}		}
};		};

class VectorReductionOpConversion : public ConvertToLLVMPattern {		class VectorReductionOpConversion : public ConvertToLLVMPattern {
public:		public:
explicit VectorReductionOpConversion(MLIRContext *context,		explicit VectorReductionOpConversion(MLIRContext *context,
LLVMTypeConverter &typeConverter)		LLVMTypeConverter &typeConverter,
		bool reassociateFP)
: ConvertToLLVMPattern(vector::ReductionOp::getOperationName(), context,		: ConvertToLLVMPattern(vector::ReductionOp::getOperationName(), context,
typeConverter) {}		typeConverter),
		reassociateFPReductions(reassociateFP) {}

LogicalResult		LogicalResult
matchAndRewrite(Operation *op, ArrayRef<Value> operands,		matchAndRewrite(Operation *op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
auto reductionOp = cast<vector::ReductionOp>(op);		auto reductionOp = cast<vector::ReductionOp>(op);
auto kind = reductionOp.kind();		auto kind = reductionOp.kind();
Type eltType = reductionOp.dest().getType();		Type eltType = reductionOp.dest().getType();
Type llvmType = typeConverter.convertType(eltType);		Type llvmType = typeConverter.convertType(eltType);
[...]	if (eltType.isSignlessInteger(32) || eltType.isSignlessInteger(64)) {
// Floating-point reductions: add/mul/min/max		// Floating-point reductions: add/mul/min/max
if (kind == "add") {		if (kind == "add") {
// Optional accumulator (or zero).		// Optional accumulator (or zero).
Value acc = operands.size() > 1 ? operands[1]		Value acc = operands.size() > 1 ? operands[1]
: rewriter.create<LLVM::ConstantOp>(		: rewriter.create<LLVM::ConstantOp>(
op->getLoc(), llvmType,		op->getLoc(), llvmType,
rewriter.getZeroAttr(eltType));		rewriter.getZeroAttr(eltType));
rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_v2_fadd>(		rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_v2_fadd>(
op, llvmType, acc, operands[0]);		op, llvmType, acc, operands[0],
		rewriter.getBoolAttr(reassociateFPReductions));
} else if (kind == "mul") {		} else if (kind == "mul") {
// Optional accumulator (or one).		// Optional accumulator (or one).
Value acc = operands.size() > 1		Value acc = operands.size() > 1
? operands[1]		? operands[1]
: rewriter.create<LLVM::ConstantOp>(		: rewriter.create<LLVM::ConstantOp>(
op->getLoc(), llvmType,		op->getLoc(), llvmType,
rewriter.getFloatAttr(eltType, 1.0));		rewriter.getFloatAttr(eltType, 1.0));
rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_v2_fmul>(		rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_v2_fmul>(
op, llvmType, acc, operands[0]);		op, llvmType, acc, operands[0],
		rewriter.getBoolAttr(reassociateFPReductions));
} else if (kind == "min")		} else if (kind == "min")
rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_fmin>(		rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_fmin>(
op, llvmType, operands[0]);		op, llvmType, operands[0]);
else if (kind == "max")		else if (kind == "max")
rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_fmax>(		rewriter.replaceOpWithNewOp<LLVM::experimental_vector_reduce_fmax>(
op, llvmType, operands[0]);		op, llvmType, operands[0]);
else		else
return failure();		return failure();
return success();		return success();
}		}
return failure();		return failure();
}		}

		private:
		const bool reassociateFPReductions;
};		};

class VectorShuffleOpConversion : public ConvertToLLVMPattern {		class VectorShuffleOpConversion : public ConvertToLLVMPattern {
public:		public:
explicit VectorShuffleOpConversion(MLIRContext *context,		explicit VectorShuffleOpConversion(MLIRContext *context,
LLVMTypeConverter &typeConverter)		LLVMTypeConverter &typeConverter)
: ConvertToLLVMPattern(vector::ShuffleOp::getOperationName(), context,		: ConvertToLLVMPattern(vector::ShuffleOp::getOperationName(), context,
typeConverter) {}		typeConverter) {}
[...]	public:
/// bounded as the rank is strictly decreasing.		/// bounded as the rank is strictly decreasing.
bool hasBoundedRewriteRecursion() const final { return true; }		bool hasBoundedRewriteRecursion() const final { return true; }
};		};

} // namespace		} // namespace

/// Populate the given list with patterns that convert from Vector to LLVM.		/// Populate the given list with patterns that convert from Vector to LLVM.
void mlir::populateVectorToLLVMConversionPatterns(		void mlir::populateVectorToLLVMConversionPatterns(
LLVMTypeConverter &converter, OwningRewritePatternList &patterns) {		LLVMTypeConverter &converter, OwningRewritePatternList &patterns,
		bool reassociateFPReductions) {
MLIRContext *ctx = converter.getDialect()->getContext();		MLIRContext *ctx = converter.getDialect()->getContext();
// clang-format off		// clang-format off
patterns.insert<VectorFMAOpNDRewritePattern,		patterns.insert<VectorFMAOpNDRewritePattern,
VectorInsertStridedSliceOpDifferentRankRewritePattern,		VectorInsertStridedSliceOpDifferentRankRewritePattern,
VectorInsertStridedSliceOpSameRankRewritePattern,		VectorInsertStridedSliceOpSameRankRewritePattern,
VectorStridedSliceOpConversion>(ctx);		VectorStridedSliceOpConversion>(ctx);
		patterns.insert<VectorReductionOpConversion>(
		ctx, converter, reassociateFPReductions);
patterns		patterns
.insert<VectorReductionOpConversion,		.insert<VectorShuffleOpConversion,
VectorShuffleOpConversion,
VectorExtractElementOpConversion,		VectorExtractElementOpConversion,
VectorExtractOpConversion,		VectorExtractOpConversion,
VectorFMAOp1DConversion,		VectorFMAOp1DConversion,
VectorInsertElementOpConversion,		VectorInsertElementOpConversion,
VectorInsertOpConversion,		VectorInsertOpConversion,
VectorPrintOpConversion,		VectorPrintOpConversion,
VectorTransferConversion<TransferReadOp>,		VectorTransferConversion<TransferReadOp>,
VectorTransferConversion<TransferWriteOp>,		VectorTransferConversion<TransferWriteOp>,
[...]	// all contraction operations. Also applies folding and DCE.
populateVectorContractLoweringPatterns(patterns, &getContext());		populateVectorContractLoweringPatterns(patterns, &getContext());
applyPatternsAndFoldGreedily(getOperation(), patterns);		applyPatternsAndFoldGreedily(getOperation(), patterns);
}		}

// Convert to the LLVM IR dialect.		// Convert to the LLVM IR dialect.
LLVMTypeConverter converter(&getContext());		LLVMTypeConverter converter(&getContext());
OwningRewritePatternList patterns;		OwningRewritePatternList patterns;
populateVectorToLLVMMatrixConversionPatterns(converter, patterns);		populateVectorToLLVMMatrixConversionPatterns(converter, patterns);
populateVectorToLLVMConversionPatterns(converter, patterns);		populateVectorToLLVMConversionPatterns(converter, patterns,
		reassociateFPReductions);
populateVectorToLLVMMatrixConversionPatterns(converter, patterns);		populateVectorToLLVMMatrixConversionPatterns(converter, patterns);
populateStdToLLVMConversionPatterns(converter, patterns);		populateStdToLLVMConversionPatterns(converter, patterns);

LLVMConversionTarget target(getContext());		LLVMConversionTarget target(getContext());
if (failed(applyPartialConversion(getOperation(), target, patterns))) {		if (failed(applyPartialConversion(getOperation(), target, patterns))) {
signalPassFailure();		signalPassFailure();
}		}
}		}

std::unique_ptr<OperationPass<ModuleOp>> mlir::createConvertVectorToLLVMPass() {		std::unique_ptr<OperationPass<ModuleOp>> mlir::createConvertVectorToLLVMPass() {
return std::make_unique<LowerVectorToLLVMPass>();		return std::make_unique<LowerVectorToLLVMPass>();
}		}
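
The floating-point "add" and "mul" cases in the conversion pattern above take an optional accumulator and fall back to the neutral element (zero for add, one for mul) when none is given. A minimal sketch of that convention in plain C++ follows; `neutralElement` and `reduce` are hypothetical names for illustration, and the sketch computes values directly rather than building LLVM dialect operations.

```cpp
#include <optional>
#include <string>
#include <vector>

// Neutral element per reduction kind, mirroring the "Optional accumulator
// (or zero)" and "(or one)" comments in the diff. Kinds without an
// accumulator in this sketch yield std::nullopt.
std::optional<double> neutralElement(const std::string &kind) {
  if (kind == "add")
    return 0.0;
  if (kind == "mul")
    return 1.0;
  return std::nullopt;
}

// Reduces `v` in order, starting from `acc` when provided and from the
// neutral element otherwise. Returns std::nullopt for unsupported kinds,
// loosely analogous to the pattern returning failure().
std::optional<double> reduce(const std::string &kind,
                             const std::vector<double> &v,
                             std::optional<double> acc = std::nullopt) {
  std::optional<double> neutral = neutralElement(kind);
  if (!neutral)
    return std::nullopt;
  double result = acc.value_or(*neutral);
  for (double x : v)
    result = (kind == "add") ? result + x : result * x;
  return result;
}
```

Starting from the neutral element means a reduction without an explicit accumulator gives the same value as one that simply combines the vector elements, which is why the lowering can always emit the two-operand `v2` intrinsic form.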

mlir/test/Conversion/VectorToLLVM/vector-reduction-to-llvm.mlir

This file was added.

				// RUN: mlir-opt %s -convert-vector-to-llvm | FileCheck %s
				// RUN: mlir-opt %s -convert-vector-to-llvm='reassociate-fp-reductions' | FileCheck %s --check-prefix=REASSOC

				//
				// CHECK-LABEL: llvm.func @reduce_add_f32(
				// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x float>">)
				// CHECK: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
				// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])
				// CHECK-SAME: {reassoc = false} : (!llvm.float, !llvm<"<16 x float>">) -> !llvm.float
				// CHECK: llvm.return %[[V]] : !llvm.float
				//
				// REASSOC-LABEL: llvm.func @reduce_add_f32(
				// REASSOC-SAME: %[[A:.*]]: !llvm<"<16 x float>">)
				// REASSOC: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
				// REASSOC: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])
				// REASSOC-SAME: {reassoc = true} : (!llvm.float, !llvm<"<16 x float>">) -> !llvm.float
				// REASSOC: llvm.return %[[V]] : !llvm.float
				//
				func @reduce_add_f32(%arg0: vector<16xf32>) -> f32 {
				%0 = vector.reduction "add", %arg0 : vector<16xf32> into f32
				return %0 : f32
				}

				//
				// CHECK-LABEL: llvm.func @reduce_mul_f32(
				// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x float>">)
				// CHECK: %[[C:.*]] = llvm.mlir.constant(1.000000e+00 : f32) : !llvm.float
				// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fmul"(%[[C]], %[[A]])
				// CHECK-SAME: {reassoc = false} : (!llvm.float, !llvm<"<16 x float>">) -> !llvm.float
				// CHECK: llvm.return %[[V]] : !llvm.float
				//
				// REASSOC-LABEL: llvm.func @reduce_mul_f32(
				// REASSOC-SAME: %[[A:.*]]: !llvm<"<16 x float>">)
				// REASSOC: %[[C:.*]] = llvm.mlir.constant(1.000000e+00 : f32) : !llvm.float
				// REASSOC: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fmul"(%[[C]], %[[A]])
				// REASSOC-SAME: {reassoc = true} : (!llvm.float, !llvm<"<16 x float>">) -> !llvm.float
				// REASSOC: llvm.return %[[V]] : !llvm.float
				//
				func @reduce_mul_f32(%arg0: vector<16xf32>) -> f32 {
				%0 = vector.reduction "mul", %arg0 : vector<16xf32> into f32
				return %0 : f32
				}

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir

	func @reduce_f32(%arg0: vector<16xf32>) -> f32 {			func @reduce_f32(%arg0: vector<16xf32>) -> f32 {
	%0 = vector.reduction "add", %arg0 : vector<16xf32> into f32			%0 = vector.reduction "add", %arg0 : vector<16xf32> into f32
	return %0 : f32			return %0 : f32
	}			}
	// CHECK-LABEL: llvm.func @reduce_f32(			// CHECK-LABEL: llvm.func @reduce_f32(
	// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x float>">)			// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x float>">)
	// CHECK: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float			// CHECK: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f32) : !llvm.float
	// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])			// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])
				// CHECK-SAME: {reassoc = false} : (!llvm.float, !llvm<"<16 x float>">) -> !llvm.float
	// CHECK: llvm.return %[[V]] : !llvm.float			// CHECK: llvm.return %[[V]] : !llvm.float

	func @reduce_f64(%arg0: vector<16xf64>) -> f64 {			func @reduce_f64(%arg0: vector<16xf64>) -> f64 {
	%0 = vector.reduction "add", %arg0 : vector<16xf64> into f64			%0 = vector.reduction "add", %arg0 : vector<16xf64> into f64
	return %0 : f64			return %0 : f64
	}			}
	// CHECK-LABEL: llvm.func @reduce_f64(			// CHECK-LABEL: llvm.func @reduce_f64(
	// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x double>">)			// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x double>">)
	// CHECK: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f64) : !llvm.double			// CHECK: %[[C:.*]] = llvm.mlir.constant(0.000000e+00 : f64) : !llvm.double
	// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])			// CHECK: %[[V:.*]] = "llvm.intr.experimental.vector.reduce.v2.fadd"(%[[C]], %[[A]])
				// CHECK-SAME: {reassoc = false} : (!llvm.double, !llvm<"<16 x double>">) -> !llvm.double
	// CHECK: llvm.return %[[V]] : !llvm.double			// CHECK: llvm.return %[[V]] : !llvm.double

	func @reduce_i32(%arg0: vector<16xi32>) -> i32 {			func @reduce_i32(%arg0: vector<16xi32>) -> i32 {
	%0 = vector.reduction "add", %arg0 : vector<16xi32> into i32			%0 = vector.reduction "add", %arg0 : vector<16xi32> into i32
	return %0 : i32			return %0 : i32
	}			}
	// CHECK-LABEL: llvm.func @reduce_i32(			// CHECK-LABEL: llvm.func @reduce_i32(
	// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x i32>">)			// CHECK-SAME: %[[A:.*]]: !llvm<"<16 x i32>">)

mlir/test/Target/llvmir-intrinsics.mlir

[...]	llvm.func @vector_reductions(%arg0: !llvm.float, %arg1: !llvm<"<8 x float>">, %arg2: !llvm<"<8 x i32>">) {
// CHECK: call i32 @llvm.experimental.vector.reduce.umax.v8i32		// CHECK: call i32 @llvm.experimental.vector.reduce.umax.v8i32
"llvm.intr.experimental.vector.reduce.umax"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32		"llvm.intr.experimental.vector.reduce.umax"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32
// CHECK: call i32 @llvm.experimental.vector.reduce.umin.v8i32		// CHECK: call i32 @llvm.experimental.vector.reduce.umin.v8i32
"llvm.intr.experimental.vector.reduce.umin"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32		"llvm.intr.experimental.vector.reduce.umin"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32
// CHECK: call float @llvm.experimental.vector.reduce.v2.fadd.f32.v8f32		// CHECK: call float @llvm.experimental.vector.reduce.v2.fadd.f32.v8f32
"llvm.intr.experimental.vector.reduce.v2.fadd"(%arg0, %arg1) : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float		"llvm.intr.experimental.vector.reduce.v2.fadd"(%arg0, %arg1) : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float
// CHECK: call float @llvm.experimental.vector.reduce.v2.fmul.f32.v8f32		// CHECK: call float @llvm.experimental.vector.reduce.v2.fmul.f32.v8f32
"llvm.intr.experimental.vector.reduce.v2.fmul"(%arg0, %arg1) : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float		"llvm.intr.experimental.vector.reduce.v2.fmul"(%arg0, %arg1) : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float
		// CHECK: call reassoc float @llvm.experimental.vector.reduce.v2.fadd.f32.v8f32
		"llvm.intr.experimental.vector.reduce.v2.fadd"(%arg0, %arg1) {reassoc = true} : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float
		// CHECK: call reassoc float @llvm.experimental.vector.reduce.v2.fmul.f32.v8f32
		"llvm.intr.experimental.vector.reduce.v2.fmul"(%arg0, %arg1) {reassoc = true} : (!llvm.float, !llvm<"<8 x float>">) -> !llvm.float
// CHECK: call i32 @llvm.experimental.vector.reduce.xor.v8i32		// CHECK: call i32 @llvm.experimental.vector.reduce.xor.v8i32
"llvm.intr.experimental.vector.reduce.xor"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32		"llvm.intr.experimental.vector.reduce.xor"(%arg2) : (!llvm<"<8 x i32>">) -> !llvm.i32
llvm.return		llvm.return
}		}

// CHECK-LABEL: @matrix_intrinsics		// CHECK-LABEL: @matrix_intrinsics
// 4x16 16x3		// 4x16 16x3
llvm.func @matrix_intrinsics(%A: !llvm<"<64 x float>">, %B: !llvm<"<48 x float>">,		llvm.func @matrix_intrinsics(%A: !llvm<"<64 x float>">, %B: !llvm<"<48 x float>">,