Details

Reviewers

aartbik
ftynse
nicolasvasilache

Commits

rG0b63e3222b2d: [mlir] X86Vector: Add AVX Rsqrt

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

cota created this revision. Apr 2 2021, 2:54 PM

Herald added a reviewer: ftynse. Apr 2 2021, 2:54 PM

Herald added subscribers: dcaballe, teijeong, rdzhabarov and 14 others.

cota requested review of this revision. Apr 2 2021, 2:54 PM

Herald added a reviewer: nicolasvasilache. Apr 2 2021, 2:54 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

cota added a subscriber: ezhulenev. Apr 2 2021, 2:55 PM

Harbormaster completed remote builds in B96953: Diff 335032. Apr 2 2021, 3:29 PM

aartbik added inline comments. Apr 2 2021, 3:56 PM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	So technically you are bringing AVX into the AVX512 dialect. That is not a big deal, but it makes me wonder: do we want any of the x86_avx512_rsqrtWIDTH versions as well? If so, should we introduce a generic op for this that takes an attribute? (Not saying we should, since I am not very familiar with what you are going to use this for; just thinking out loud here.)
43 ↗	(On Diff #335032)	Please use the section-separating comment used above consistently for new instructions.
mlir/test/Dialect/AVX512/legalize-for-llvm.mlir
43 ↗	(On Diff #335032)	This omission is pre-existing, but shouldn't we have a CHECK-LABEL at the start of each method (or split-input), to make sure CHECKs are not leaking between methods in the long run?
mlir/test/Dialect/AVX512/roundtrip.mlir
48 ↗	(On Diff #335032)	same question

ezhulenev added inline comments. Apr 5 2021, 11:08 AM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	Would it be a good idea to create a "generic" AVX dialect, and lower to a concrete AVX version based on vector length and availability? Or are too many instructions sufficiently different to make this impractical?

ftynse added inline comments. Apr 6 2021, 2:31 AM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	I find it very surprising to have AVX instructions in the AVX512 dialect: these print as `avx512.intr.rsqrt.ps.256`, but the instruction in question is _not_ defined in the avx512 set. Maybe we should consider renaming AVX512 to X86 (which would also be consistent with how LLVM has it) and allow any instruction set with several prefixes, i.e. `x86.avx512....`, `x86.avx....` As for the "generic" AVX dialect, why stop at AVX? Can't we just have an operation on a generic vector that lowers to the target-specific intrinsic depending on the target features and vector sizes? This sounds like the Vector dialect :)
mlir/test/Dialect/AVX512/roundtrip.mlir
52 ↗	(On Diff #335032)	Please add a newline
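
The idea raised in this thread, a generic vector operation that lowers to a target-specific intrinsic depending on target features and vector size, can be illustrated with a small selection function. This is only a sketch under stated assumptions: `RsqrtRequest` and `selectRsqrtIntrinsic` are hypothetical names that do not exist in MLIR, and the only intrinsic it knows is `llvm.x86.avx.rsqrt.ps.256`, the one this revision targets. Every other case returns an empty string, standing in for "use a generic lowering".

```cpp
#include <cassert>
#include <string>

// Hypothetical description of a request to lower a generic f32 rsqrt.
struct RsqrtRequest {
  unsigned lanes;     // number of f32 lanes in the vector
  bool targetHasAVX;  // whether the target advertises AVX
};

// Returns the LLVM intrinsic name to use, or an empty string when no
// target-specific intrinsic applies and a generic lowering is needed.
// Only the 8 x f32 AVX case from this revision is modeled here.
std::string selectRsqrtIntrinsic(const RsqrtRequest &req) {
  if (req.lanes == 8 && req.targetHasAVX)
    return "llvm.x86.avx.rsqrt.ps.256";
  return "";
}

// Illustrative self-check of the selection logic.
void checkSelection() {
  assert(selectRsqrtIntrinsic({8, true}) == "llvm.x86.avx.rsqrt.ps.256");
  assert(selectRsqrtIntrinsic({8, false}).empty());
}
```

In this sketch a `vector<8xf32>` on an AVX target picks the intrinsic, while any other width or a target without AVX falls through to the generic path.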

Address some reviewer comments

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	> Maybe we should consider renaming AVX512 to X86 (which would be also consistent with how LLVM has it) and allow any instruction set with several prefixes, i.e. x86.avx512...., x86.avx
This means more work for me, but I have to agree. I'll work on this and send it as a parent patch.
42 ↗	(On Diff #335032)	> do we want any of the x86_avx512_rsqrtWIDTH versions as well?
I have no use case for them since I am only interested in AVX's rsqrt, but I can probably squeeze this change in.

Harbormaster completed remote builds in B97380: Diff 335620. Apr 6 2021, 1:43 PM

aartbik added inline comments. Apr 6 2021, 1:59 PM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	Thanks. When given the choice between putting a generic op in the vector dialect (viz. VV) or a target-specific op (viz. HWV), we should always favor the VV approach; the HWV is a last (and often temporary) solution for experimenting with idiomatic instructions. As for the new name, X86 makes sense since it keeps a lot of specifics hidden (SSE vs. AVX, AVX2, AVX512, etc.), but it does not carry the "vector" semantics very strongly (like ARMNeon and ARMSVE do). So AVX would make some sense, or X86SIMD, X86Vector. Not to bikeshed too much, but what do you think?

cota added inline comments. Apr 6 2021, 2:11 PM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	> So AVX would make some sense, or X86SIMD, X86Vector.
I like this suggestion -- with just x86 in the name there is a risk this will become more of an incoherent bag than it needs to be. Plus, it would be confusing since there are other x86-related dialects (e.g. AMX). My vote is for X86Vector.

ftynse added inline comments. Apr 7 2021, 12:13 AM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	X86Vector works for me

cota mentioned this in D100119: [mlir] Rename AVX512 dialect to X86Vector. Apr 8 2021, 8:52 AM

cota added inline comments. Apr 8 2021, 8:55 AM

mlir/include/mlir/Dialect/AVX512/AVX512.td
42 ↗	(On Diff #335032)	I've introduced X86Vector in https://reviews.llvm.org/D100119. I'll update this patch to apply on top of X86Vector once the X86Vector patch is reviewed.

ftynse mentioned this in rG8508a63b887e: [mlir] Rename AVX512 dialect to X86Vector. Apr 12 2021, 10:20 AM

Can you please rebase so we can continue the review in the context of the new dialect name?

Rebase on X86Vector

cota retitled this revision from "[mlir] Add rsqrt to AVX512 dialect" to "[mlir] X86Vector: Add AVX Rsqrt". Apr 12 2021, 2:33 PM

cota edited the summary of this revision.

fix newline

Harbormaster completed remote builds in B98362: Diff 336966. Apr 12 2021, 3:15 PM

aartbik added inline comments. Apr 12 2021, 3:23 PM

mlir/test/Integration/Dialect/Vector/CPU/X86Vector/test-rsqrt.mlir
19	Note that this is old-style (I am responsible for setting the example, but that was in the context of testing the broadcast/insert). A much better and more readable style for such vector constants is %v = std.constant dense<[0.125, 0.25 ....]> : vector<8xf32>. In such cases, CHECKing the print is also not needed (it was only added to help the reader see what vectors were constructed).

Harbormaster completed remote builds in B98363: Diff 336968. Apr 12 2021, 3:32 PM

Use std.constant dense
Mention https://bugs.llvm.org/show_bug.cgi?id=49906 in test-rsqrt.mlir

mlir/test/Integration/Dialect/Vector/CPU/X86Vector/test-rsqrt.mlir
19	Much better! Done.

Harbormaster completed remote builds in B98403: Diff 337021. Apr 12 2021, 7:18 PM

aartbik accepted this revision. Apr 12 2021, 10:39 PM

This revision is now accepted and ready to land. Apr 12 2021, 10:39 PM

Closed by commit rG0b63e3222b2d: [mlir] X86Vector: Add AVX Rsqrt (authored by cota, committed by aartbik). Apr 13 2021, 8:44 AM

This revision was automatically updated to reflect the committed changes.

aartbik added a commit: rG0b63e3222b2d: [mlir] X86Vector: Add AVX Rsqrt.

Diff 337170

mlir/include/mlir/Dialect/X86Vector/X86Vector.td

[261 lines hidden]
 }

 def Vp2IntersectQIntrOp : AVX512_IntrOp<"vp2intersect.q.512", 2, [
   NoSideEffect]> {
   let arguments = (ins VectorOfLengthAndType<[8], [I64]>:$a,
                    VectorOfLengthAndType<[8], [I64]>:$b);
 }

+//===----------------------------------------------------------------------===//
+// AVX op definitions
+//===----------------------------------------------------------------------===//
+
+class AVX_Op<string mnemonic, list<OpTrait> traits = []> :
+  Op<X86Vector_Dialect, "avx." # mnemonic, traits> {}
+
+class AVX_IntrOp<string mnemonic, int numResults, list<OpTrait> traits = []> :
+  LLVM_IntrOpBase<X86Vector_Dialect, "avx.intr." # mnemonic,
+                  "x86_avx_" # !subst(".", "_", mnemonic),
+                  [], [], traits, numResults>;
+
+//----------------------------------------------------------------------------//
+// AVX Rsqrt
+//----------------------------------------------------------------------------//
+
+def RsqrtOp : AVX_Op<"rsqrt", [NoSideEffect, SameOperandsAndResultType]> {
+  let summary = "Rsqrt";
+  let arguments = (ins VectorOfLengthAndType<[8], [F32]>:$a);
+  let results = (outs VectorOfLengthAndType<[8], [F32]>:$b);
+  let assemblyFormat = "$a attr-dict `:` type($a)";
+}
+
+def RsqrtIntrOp : AVX_IntrOp<"rsqrt.ps.256", 1, [NoSideEffect,
+  SameOperandsAndResultType]> {
+  let arguments = (ins VectorOfLengthAndType<[8], [F32]>:$a);
+}

 #endif // X86VECTOR_OPS
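
The `AVX_IntrOp` class above derives two names from a single mnemonic: the MLIR op name, by prefixing `avx.intr.`, and the LLVM intrinsic enum name, by prefixing `x86_avx_` and replacing dots with underscores via `!subst`. The following C++ sketch mirrors that string derivation so the mapping can be checked by hand. The helper names (`substDots`, `avxIntrOpName`, `avxIntrEnumName`) are hypothetical and exist only for illustration; the `x86vector.` dialect prefix is taken from the tests in this revision.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical stand-in for TableGen's !subst(".", "_", mnemonic).
std::string substDots(std::string mnemonic) {
  std::replace(mnemonic.begin(), mnemonic.end(), '.', '_');
  return mnemonic;
}

// Op name as it prints in MLIR: dialect prefix + "avx.intr." + mnemonic.
std::string avxIntrOpName(const std::string &mnemonic) {
  return "x86vector.avx.intr." + mnemonic;
}

// Enum-style intrinsic name handed to LLVM_IntrOpBase.
std::string avxIntrEnumName(const std::string &mnemonic) {
  return "x86_avx_" + substDots(mnemonic);
}

// Illustrative self-check using the mnemonic from RsqrtIntrOp.
void checkRsqrtNames() {
  assert(avxIntrOpName("rsqrt.ps.256") == "x86vector.avx.intr.rsqrt.ps.256");
  assert(avxIntrEnumName("rsqrt.ps.256") == "x86_avx_rsqrt_ps_256");
}
```

For `rsqrt.ps.256` this gives the op name `x86vector.avx.intr.rsqrt.ps.256`, which is the name checked in legalize-for-llvm.mlir. It also gives the enum name `x86_avx_rsqrt_ps_256`, whose dotted form `llvm.x86.avx.rsqrt.ps.256` is the call checked in mlir/test/Target/LLVMIR/x86vector.mlir.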

mlir/lib/Dialect/X86Vector/Transforms/LegalizeForLLVMExport.cpp

[84 lines hidden; context: matchAndRewrite(MaskCompressOp op, ArrayRef<Value> operands,]

     rewriter.replaceOpWithNewOp<MaskCompressIntrOp>(op, opType, adaptor.a(),
                                                     src, adaptor.k());

     return success();
   }
 };

+struct RsqrtOpConversion : public ConvertOpToLLVMPattern<RsqrtOp> {
+  using ConvertOpToLLVMPattern<RsqrtOp>::ConvertOpToLLVMPattern;
+
+  LogicalResult
+  matchAndRewrite(RsqrtOp op, ArrayRef<Value> operands,
+                  ConversionPatternRewriter &rewriter) const override {
+    RsqrtOp::Adaptor adaptor(operands);
+
+    auto opType = adaptor.a().getType();
+    rewriter.replaceOpWithNewOp<RsqrtIntrOp>(op, opType, adaptor.a());
+    return success();
+  }
+};
+
 /// An entry associating the "main" AVX512 op with its instantiations for
 /// vectors of 32-bit and 64-bit elements.
 template <typename OpTy, typename Intr32OpTy, typename Intr64OpTy>
 struct RegEntry {
   using MainOp = OpTy;
   using Intr32Op = Intr32OpTy;
   using Intr64Op = Intr64OpTy;
 };
[25 lines hidden; context: using Registry = RegistryImpl<]
     RegEntry<Vp2IntersectOp, Vp2IntersectDIntrOp, Vp2IntersectQIntrOp>>;

 } // namespace

 /// Populate the given list with patterns that convert from X86Vector to LLVM.
 void mlir::populateX86VectorLegalizeForLLVMExportPatterns(
     LLVMTypeConverter &converter, RewritePatternSet &patterns) {
   Registry::registerPatterns(converter, patterns);
-  patterns.add<MaskCompressOpConversion>(converter);
+  patterns.add<MaskCompressOpConversion, RsqrtOpConversion>(converter);
 }

 void mlir::configureX86VectorLegalizeForExportTarget(
     LLVMConversionTarget &target) {
   Registry::configureTarget(target);
   target.addLegalOp<MaskCompressIntrOp>();
   target.addIllegalOp<MaskCompressOp>();
+  target.addLegalOp<RsqrtIntrOp>();
+  target.addIllegalOp<RsqrtOp>();
 }

mlir/test/Dialect/X86Vector/legalize-for-llvm.mlir

[36 lines hidden; context: func @avx512_vp2intersect(%a: vector<16xi32>, %b: vector<8xi64>)]
   -> (vector<16xi1>, vector<16xi1>, vector<8xi1>, vector<8xi1>)
 {
   // CHECK: x86vector.avx512.intr.vp2intersect.d.512
   %0, %1 = x86vector.avx512.vp2intersect %a, %a : vector<16xi32>
   // CHECK: x86vector.avx512.intr.vp2intersect.q.512
   %2, %3 = x86vector.avx512.vp2intersect %b, %b : vector<8xi64>
   return %0, %1, %2, %3 : vector<16xi1>, vector<16xi1>, vector<8xi1>, vector<8xi1>
 }

+// CHECK-LABEL: func @avx_rsqrt
+func @avx_rsqrt(%a: vector<8xf32>) -> (vector<8xf32>)
+{
+  // CHECK: x86vector.avx.intr.rsqrt.ps.256
+  %0 = x86vector.avx.rsqrt %a : vector<8xf32>
+  return %0 : vector<8xf32>
+}

mlir/test/Dialect/X86Vector/roundtrip.mlir

[40 lines hidden; context: func @avx512_mask_compress(%k1: vector<16xi1>, %a1: vector<16xf32>,]
   // CHECK: x86vector.avx512.mask.compress {{.*}} : vector<16xf32>
   %0 = x86vector.avx512.mask.compress %k1, %a1 : vector<16xf32>
   // CHECK: x86vector.avx512.mask.compress {{.*}} : vector<16xf32>
   %1 = x86vector.avx512.mask.compress %k1, %a1 {constant_src = dense<5.0> : vector<16xf32>} : vector<16xf32>
   // CHECK: x86vector.avx512.mask.compress {{.*}} : vector<8xi64>
   %2 = x86vector.avx512.mask.compress %k2, %a2, %a2 : vector<8xi64>, vector<8xi64>
   return %0, %1, %2 : vector<16xf32>, vector<16xf32>, vector<8xi64>
 }

+// CHECK-LABEL: func @avx_rsqrt
+func @avx_rsqrt(%a: vector<8xf32>) -> (vector<8xf32>)
+{
+  // CHECK: x86vector.avx.rsqrt {{.*}} : vector<8xf32>
+  %0 = x86vector.avx.rsqrt %a : vector<8xf32>
+  return %0 : vector<8xf32>
+}

mlir/test/Integration/Dialect/Vector/CPU/X86Vector/test-rsqrt.mlir

This file was added.

+// RUN: mlir-opt %s -convert-scf-to-std -convert-vector-to-llvm="enable-x86vector" -convert-std-to-llvm | \
+// RUN: mlir-translate --mlir-to-llvmir | \
+// RUN: %lli --jit-kind=mcjit --entry-function=entry --mattr="avx512bw" --dlopen=%mlir_integration_test_dir/libmlir_c_runner_utils%shlibext | \
+// RUN: FileCheck %s
+// TODO: drop lli's --jit-kind flag once PR#49906 (https://bugs.llvm.org/show_bug.cgi?id=49906) is fixed.
+
+func @entry() -> i32 {
+  %i0 = constant 0 : i32
+
+  %v = std.constant dense<[0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]> : vector<8xf32>
+  %r = x86vector.avx.rsqrt %v : vector<8xf32>
+  // CHECK: ( 2.82764, 1.99951, 1.41382, 0.999756, 0.706909, 0.499878, 0.353455, 0.249939 )
+  vector.print %r : vector<8xf32>
+
+  return %i0 : i32
+}
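
The values in the CHECK line of test-rsqrt.mlir are not exact reciprocal square roots (for example 2.82764 rather than 2.828427... for an input of 0.125), because the AVX rsqrt instruction returns an approximation. The sketch below compares the printed values against `1/sqrt(x)` computed in double precision. It assumes the relative-error bound of 1.5 * 2^-12 that Intel documents for RSQRTPS; that bound is not stated anywhere in this revision, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Inputs and printed results copied from test-rsqrt.mlir.
constexpr std::size_t kLanes = 8;
constexpr double kInputs[kLanes] = {0.125, 0.25, 0.5, 1.0,
                                    2.0,   4.0,  8.0, 16.0};
constexpr double kPrinted[kLanes] = {2.82764,  1.99951,  1.41382,  0.999756,
                                     0.706909, 0.499878, 0.353455, 0.249939};

// Assumption: 1.5 * 2^-12, the relative-error limit documented for RSQRTPS.
constexpr double kMaxRelError = 1.5 / 4096.0;

// Largest relative error between the printed values and exact 1/sqrt(x).
double maxRelativeError() {
  double worst = 0.0;
  for (std::size_t i = 0; i < kLanes; ++i) {
    const double exact = 1.0 / std::sqrt(kInputs[i]);
    worst = std::max(worst, std::fabs(kPrinted[i] - exact) / exact);
  }
  return worst;
}

// True when every printed lane is within the assumed bound.
bool printedValuesWithinBound() { return maxRelativeError() <= kMaxRelError; }

// Illustrative self-check.
void checkPrintedValues() { assert(printedValuesWithinBound()); }
```

Because the test matches the approximate values literally, it checks the output of the hardware instruction rather than a mathematically exact result.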

mlir/test/Target/LLVMIR/x86vector.mlir

[53 lines hidden]
 llvm.func @LLVM_x86_vp2intersect_q_512(%a: vector<8xi64>, %b: vector<8xi64>)
   -> !llvm.struct<(vector<8 x i1>, vector<8 x i1>)>
 {
   // CHECK: call { <8 x i1>, <8 x i1> } @llvm.x86.avx512.vp2intersect.q.512(<8 x i64>
   %0 = "x86vector.avx512.intr.vp2intersect.q.512"(%a, %b) :
     (vector<8xi64>, vector<8xi64>) -> !llvm.struct<(vector<8 x i1>, vector<8 x i1>)>
   llvm.return %0 : !llvm.struct<(vector<8 x i1>, vector<8 x i1>)>
 }

+// CHECK-LABEL: define <8 x float> @LLVM_x86_avx_rsqrt_ps_256
+llvm.func @LLVM_x86_avx_rsqrt_ps_256(%a: vector <8xf32>) -> vector<8xf32>
+{
+  // CHECK: call <8 x float> @llvm.x86.avx.rsqrt.ps.256(<8 x float>
+  %0 = "x86vector.avx.intr.rsqrt.ps.256"(%a) : (vector<8xf32>) -> (vector<8xf32>)
+  llvm.return %0 : vector<8xf32>
+}

This is an archive of the discontinued LLVM Phabricator instance.

[mlir] X86Vector: Add AVX Rsqrt
ClosedPublic

