llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
13341	Can we add an `isZerosVector` function that encapsulates `isBuildVectorAllZeros` and the splats/dups logic? It sounds generally useful to have a function that checks for the various ways a vector can be all zeros. `isConstantSplatVectorAllZeros` may also be better, as it may already handle the ISD::SPLAT_VECTOR part.

junparser added inline comments. Apr 1 2021, 3:18 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
13341	Sounds good to me; I'll update this.

junparser added a reviewer: dmgreen. Apr 1 2021, 3:18 AM

Address the comments.

Thanks for the patch @junparser! Just a few minor comments from me. 🙂

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2148–2150	Might be better to do this _before_ the above `if (ISD::isConstantSplatVectorAllZeros(N))`, in my opinion, to avoid running this loop twice (once in this function, once in `ISD::isConstantSplatVectorAllZeros`).
2152	A cursory glance at `ISD::isConstantSplatVectorAllZeros` suggests that the SPLAT_VECTOR case is entirely captured earlier on, I think. I reckon you can get away with just: `if (N->getOpcode() != AArch64ISD::DUP) return false;`
2158–2160	nit: `return (CINT && CINT->isNullValue()) || (CFP && CFP->isZero());`
13337	nit: superfluous newline.
13341	Following on from this, it might be a good idea to write `isZerosVector` as: `static bool isZerosVector(const SDNode *N) { return ISD::isConstantSplatVectorAllZeros(N) || isConstantDupVectorAllZeros(N); }` Might be a nicer separation of the logic here. Your call, though!
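
To make the suggested split concrete, here is a minimal, self-contained sketch of the same composition using toy types. `Node`, `Opcode`, `stripBitcasts`, `isConstantSplatAllZeros` and `isConstantDupAllZeros` are hypothetical stand-ins invented for illustration; they are not the SelectionDAG API, and a real BUILD_VECTOR has one operand per element rather than a single constant.

```cpp
#include <cassert>

// Hypothetical stand-ins for DAG concepts; these are NOT LLVM types.
enum class Opcode { BuildVector, SplatVector, Dup, Bitcast, Other };

struct Node {
  Opcode Op;
  bool HasConstant;    // true if the splatted operand is a known constant
  long long Constant;  // the constant value, meaningful only if HasConstant
  const Node *Operand; // source node of a Bitcast, otherwise nullptr
};

// Look through any chain of bitcasts to the underlying node.
static const Node *stripBitcasts(const Node *N) {
  while (N->Op == Opcode::Bitcast)
    N = N->Operand;
  return N;
}

// Models the generic check: a BUILD_VECTOR or SPLAT_VECTOR of constant zero.
static bool isConstantSplatAllZeros(const Node *N) {
  N = stripBitcasts(N);
  bool IsSplat = N->Op == Opcode::BuildVector || N->Op == Opcode::SplatVector;
  return IsSplat && N->HasConstant && N->Constant == 0;
}

// Models the target-specific check: a DUP of constant zero.
static bool isConstantDupAllZeros(const Node *N) {
  N = stripBitcasts(N);
  return N->Op == Opcode::Dup && N->HasConstant && N->Constant == 0;
}

// The composition suggested in the review: generic check OR target check.
static bool isZerosVector(const Node *N) {
  return isConstantSplatAllZeros(N) || isConstantDupAllZeros(N);
}
```

In this toy version each helper strips bitcasts itself, so either can be called on its own; the cost is that the bitcast walk may run twice, which is the duplication the 2148–2150 comment points at.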
llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	The new `CHECK:` lines introduced look simple enough to write out by hand, so I would prefer not to use `utils/update_llc_test_checks.py` in this case because it has made the diff a lot noisier (e.g. all of the tests have been changed syntactically in this patch, but AFAICT none of them have changed functionally). In this case, I think it should be sufficient to write your new tests like this:

    define <vscale x 2 x i64> @test_sdot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
    ; CHECK-LABEL: test_sdot_i64_zero:
    ; CHECK: sdot z0.d, z1.h, z2.h
    ; CHECK-NEXT: ret
    entry:
      %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
      %ret = add <vscale x 2 x i64> %vdot1.i, %a
      ret <vscale x 2 x i64> %ret
    }

    define <vscale x 4 x i32> @test_sdot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
    ; CHECK-LABEL: test_sdot_i32_zero:
    ; CHECK: sdot z0.s, z1.b, z2.b
    ; CHECK-NEXT: ret
    entry:
      %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
      %ret = add <vscale x 4 x i32> %vdot1.i, %a
      ret <vscale x 4 x i32> %ret
    }

    ...

and:

    define <vscale x 2 x i64> @test_udot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
    ; CHECK-LABEL: test_udot_i64_zero:
    ; CHECK: udot z0.d, z1.h, z2.h
    ; CHECK-NEXT: ret
    entry:
      %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
      %ret = add <vscale x 2 x i64> %vdot1.i, %a
      ret <vscale x 2 x i64> %ret
    }

    define <vscale x 4 x i32> @test_udot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
    ; CHECK-LABEL: test_udot_i32_zero:
    ; CHECK: udot z0.s, z1.b, z2.b
    ; CHECK-NEXT: ret
    entry:
      %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
      %ret = add <vscale x 4 x i32> %vdot1.i, %a
      ret <vscale x 4 x i32> %ret
    }

    ...

and then just revert all other changes in the file. 😄

Harbormaster completed remote builds in B96672: Diff 334637. Apr 1 2021, 4:00 AM

paulwalker-arm added inline comments. Apr 1 2021, 4:00 AM

llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	I think not using `update_llc_test_checks.py` was a mistake when landing the original file. Although using it now makes the patch look bigger, it will make the test easier to update in the future, which is a good thing.

david-arm added inline comments. Apr 1 2021, 4:06 AM

llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	I think the LLVM community prefers using utils/update_llc_test_checks.py where possible, in particular for small functions like those here.

junparser added inline comments. Apr 1 2021, 4:24 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2152	`ISD::isConstantSplatVectorAllZeros` does not check `ConstantFPSDNode` for SPLAT_VECTOR; that's why I added it here.
2152	I will add another patch to check `ConstantFPSDNode` for SPLAT_VECTOR in `ISD::isConstantSplatVectorAllZeros` and then remove the code here. Is that OK?
llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	I also prefer to use `update_llc_test_checks.py` here.

dmgreen added inline comments. Apr 1 2021, 4:33 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2152	I had not realized that `isConstantSplatVectorAllZeros` didn't already handle SPLAT_VECTOR with FP constants. Adding it there sounds like a good idea, either here or in a separate patch.
llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	Yeah, using the update script has benefits in terms of maintainability and consistency between tests, as well as not missing things in the generated assembly. Feel free to regenerate the existing test and commit that separately, so just the changes from the patch are shown here. I often go so far as to commit the new tests with the current trunk codegen, so that in the review it is obvious what is changing in the codegen.

address comments.

joechrisellis added inline comments. Apr 1 2021, 5:09 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2152	ACK -- I would have expected it to already handle SPLAT_VECTOR with float constants. @dmgreen's suggestion sounds good to me, hopefully that's a straightforward change! 🙂
llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll
41	ACK -- any which way is fine by me. I like @dmgreen's idea of regenerating the test as a separate commit, but no complaints from me if you just leave it as-is. 🙂

address comments.

@joechrisellis @david-arm I added FP constant handling in `ISD::isConstantSplatVector` and also split the test case.

Harbormaster completed remote builds in B96684: Diff 334651. Apr 1 2021, 5:45 AM

Harbormaster completed remote builds in B96689: Diff 334656. Apr 1 2021, 6:16 AM

Thanks. LGTM

This revision is now accepted and ready to land. Apr 2 2021, 4:40 AM

Closed by commit rG274ac9d40e79: [AArch64][SVE] Lowering sve.dot to DOT node (authored by junparser). Apr 2 2021, 5:17 AM

This revision was automatically updated to reflect the committed changes.

junparser added a commit: rG274ac9d40e79: [AArch64][SVE] Lowering sve.dot to DOT node.

paulwalker-arm added inline comments. Apr 6 2021, 4:48 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
149 ↗	(On Diff #334939)	@junparser This looks a little weird to me, specifically the `truncOrSelf` part, which to my mind is unlikely ever to be safe with a floating-point constant (even after bitcasting to an `APInt`). Looking at the equivalent code in `BuildVectorSDNode::isConstantSplat`, I can see that it stops after the `bitcastToAPInt()` stage. To be consistent, can this function follow the same pattern?
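
The concern about truncating a bitcast floating-point constant can be illustrated outside of LLVM. The sketch below is plain C++, not the SelectionDAG code, and assumes `float` is IEEE-754 binary32; the helper names `bitsOf`, `isZeroBits` and `isZeroAfterTrunc16` are made up for this example. It shows that a nonzero value such as `2.0f` (bit pattern `0x40000000`) looks like zero once its bit pattern is truncated to the low 16 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a 32-bit integer (the same idea as a
// bitcast). Assumes float is IEEE-754 binary32.
static std::uint32_t bitsOf(float F) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "float must be 32 bits wide");
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Zero test on the full bit pattern.
static bool isZeroBits(float F) { return bitsOf(F) == 0; }

// Zero test after truncating the bit pattern to its low 16 bits. This drops
// the sign, the exponent and the top of the mantissa.
static bool isZeroAfterTrunc16(float F) {
  return static_cast<std::uint16_t>(bitsOf(F)) == 0;
}
```

With these helpers, `isZeroBits(2.0f)` is false while `isZeroAfterTrunc16(2.0f)` is true, which is the kind of misclassification that stopping at the bitcast stage avoids.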

Diff 334651

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

...
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// AArch64 Lowering private implementation.		// AArch64 Lowering private implementation.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Lowering Code		// Lowering Code
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		/// isZerosVector - Check whether SDNode N is a zero-filled vector.
		static bool isZerosVector(const SDNode *N) {
		// Look through a bit convert.
		while (N->getOpcode() == ISD::BITCAST)
		N = N->getOperand(0).getNode();

		if (ISD::isConstantSplatVectorAllZeros(N))
		return true;

		if (N->getOpcode() != AArch64ISD::DUP && N->getOpcode() != ISD::SPLAT_VECTOR)
		return false;

		auto Opnd0 = N->getOperand(0);
		auto *CINT = dyn_cast<ConstantSDNode>(Opnd0);
		auto *CFP = dyn_cast<ConstantFPSDNode>(Opnd0);
		return (CINT && CINT->isNullValue()) || (CFP && CFP->isZero());
		}

/// changeIntCCToAArch64CC - Convert a DAG integer condition code to an AArch64		/// changeIntCCToAArch64CC - Convert a DAG integer condition code to an AArch64
/// CC		/// CC
static AArch64CC::CondCode changeIntCCToAArch64CC(ISD::CondCode CC) {		static AArch64CC::CondCode changeIntCCToAArch64CC(ISD::CondCode CC) {
switch (CC) {		switch (CC) {
default:		default:
llvm_unreachable("Unknown condition code!");		llvm_unreachable("Unknown condition code!");
case ISD::SETNE:		case ISD::SETNE:
return AArch64CC::NE;		return AArch64CC::NE;
...	SDValue AArch64TargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
case Intrinsic::aarch64_neon_sabd:		case Intrinsic::aarch64_neon_sabd:
case Intrinsic::aarch64_neon_uabd: {		case Intrinsic::aarch64_neon_uabd: {
unsigned Opcode = IntNo == Intrinsic::aarch64_neon_uabd ? AArch64ISD::UABD		unsigned Opcode = IntNo == Intrinsic::aarch64_neon_uabd ? AArch64ISD::UABD
: AArch64ISD::SABD;		: AArch64ISD::SABD;
return DAG.getNode(Opcode, dl, Op.getValueType(), Op.getOperand(1),		return DAG.getNode(Opcode, dl, Op.getValueType(), Op.getOperand(1),
Op.getOperand(2));		Op.getOperand(2));
}		}
case Intrinsic::aarch64_neon_sdot:		case Intrinsic::aarch64_neon_sdot:
case Intrinsic::aarch64_neon_udot: {		case Intrinsic::aarch64_neon_udot:
unsigned Opcode = IntNo == Intrinsic::aarch64_neon_udot ? AArch64ISD::UDOT		case Intrinsic::aarch64_sve_sdot:
		case Intrinsic::aarch64_sve_udot: {
		unsigned Opcode = (IntNo == Intrinsic::aarch64_neon_udot ||
		IntNo == Intrinsic::aarch64_sve_udot)
		? AArch64ISD::UDOT
: AArch64ISD::SDOT;		: AArch64ISD::SDOT;
return DAG.getNode(Opcode, dl, Op.getValueType(), Op.getOperand(1),		return DAG.getNode(Opcode, dl, Op.getValueType(), Op.getOperand(1),
Op.getOperand(2), Op.getOperand(3));		Op.getOperand(2), Op.getOperand(3));
}		}
}		}
}		}

bool AArch64TargetLowering::shouldExtendGSIndex(EVT VT, EVT &EltTy) const {		bool AArch64TargetLowering::shouldExtendGSIndex(EVT VT, EVT &EltTy) const {
if (VT.getVectorElementType() == MVT::i8 ||		if (VT.getVectorElementType() == MVT::i8 ||
...
// ADD(UDOT(zero, x, y), A) --> UDOT(A, x, y)		// ADD(UDOT(zero, x, y), A) --> UDOT(A, x, y)
static SDValue performAddDotCombine(SDNode *N, SelectionDAG &DAG) {		static SDValue performAddDotCombine(SDNode *N, SelectionDAG &DAG) {
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
if (N->getOpcode() != ISD::ADD)		if (N->getOpcode() != ISD::ADD)
return SDValue();		return SDValue();

SDValue Dot = N->getOperand(0);		SDValue Dot = N->getOperand(0);
SDValue A = N->getOperand(1);		SDValue A = N->getOperand(1);
// Handle commutivity		// Handle commutivity
auto isZeroDot = [](SDValue Dot) {		auto isZeroDot = [](SDValue Dot) {
return (Dot.getOpcode() == AArch64ISD::UDOT ||		return (Dot.getOpcode() == AArch64ISD::UDOT ||
Dot.getOpcode() == AArch64ISD::SDOT) &&		Dot.getOpcode() == AArch64ISD::SDOT) &&
ISD::isBuildVectorAllZeros(Dot.getOperand(0).getNode());		isZerosVector(Dot.getOperand(0).getNode());
};		};
if (!isZeroDot(Dot))		if (!isZeroDot(Dot))
std::swap(Dot, A);		std::swap(Dot, A);
if (!isZeroDot(Dot))		if (!isZeroDot(Dot))
return SDValue();		return SDValue();

return DAG.getNode(Dot.getOpcode(), SDLoc(N), VT, A, Dot.getOperand(1),		return DAG.getNode(Dot.getOpcode(), SDLoc(N), VT, A, Dot.getOperand(1),
Dot.getOperand(2));		Dot.getOperand(2));
...
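
The fold noted in the comment on `performAddDotCombine`, `ADD(UDOT(zero, x, y), A) --> UDOT(A, x, y)`, relies on the accumulator operand simply being added to the sum of products. Below is a minimal scalar sketch of one 32-bit lane, assuming the usual dot-product-accumulate meaning (accumulator plus four 8-bit by 8-bit products, wrapping modulo 2^32). The function `udotLane` is a hypothetical model written for this note, not an LLVM or ACLE API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of one 32-bit lane of an unsigned dot-product-accumulate:
// returns Acc plus the sum of four 8-bit x 8-bit products, with 32-bit
// wrap-around arithmetic.
static std::uint32_t udotLane(std::uint32_t Acc,
                              const std::array<std::uint8_t, 4> &X,
                              const std::array<std::uint8_t, 4> &Y) {
  for (std::size_t I = 0; I < 4; ++I)
    Acc += static_cast<std::uint32_t>(X[I]) * static_cast<std::uint32_t>(Y[I]);
  return Acc;
}
```

Because the accumulator only enters through addition, `udotLane(0, X, Y) + A` and `udotLane(A, X, Y)` agree for every `A`, including values where the sum wraps. That is why the combine can move `A` into the accumulator slot once the original accumulator is known to be all zeros.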

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

...	let Predicates = [HasSVE] in {
defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", "SDIV_ZPZZ", int_aarch64_sve_sdiv, DestructiveBinaryCommWithRev, "SDIVR_ZPmZ">;		defm SDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b100, "sdiv", "SDIV_ZPZZ", int_aarch64_sve_sdiv, DestructiveBinaryCommWithRev, "SDIVR_ZPmZ">;
defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", "UDIV_ZPZZ", int_aarch64_sve_udiv, DestructiveBinaryCommWithRev, "UDIVR_ZPmZ">;		defm UDIV_ZPmZ : sve_int_bin_pred_arit_2_div<0b101, "udiv", "UDIV_ZPZZ", int_aarch64_sve_udiv, DestructiveBinaryCommWithRev, "UDIVR_ZPmZ">;
defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", "SDIVR_ZPZZ", int_aarch64_sve_sdivr, DestructiveBinaryCommWithRev, "SDIV_ZPmZ", /*isReverseInstr*/ 1>;		defm SDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b110, "sdivr", "SDIVR_ZPZZ", int_aarch64_sve_sdivr, DestructiveBinaryCommWithRev, "SDIV_ZPmZ", /*isReverseInstr*/ 1>;
defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", "UDIVR_ZPZZ", int_aarch64_sve_udivr, DestructiveBinaryCommWithRev, "UDIV_ZPmZ", /*isReverseInstr*/ 1>;		defm UDIVR_ZPmZ : sve_int_bin_pred_arit_2_div<0b111, "udivr", "UDIVR_ZPZZ", int_aarch64_sve_udivr, DestructiveBinaryCommWithRev, "UDIV_ZPmZ", /*isReverseInstr*/ 1>;

defm SDIV_ZPZZ : sve_int_bin_pred_sd<AArch64sdiv_p>;		defm SDIV_ZPZZ : sve_int_bin_pred_sd<AArch64sdiv_p>;
defm UDIV_ZPZZ : sve_int_bin_pred_sd<AArch64udiv_p>;		defm UDIV_ZPZZ : sve_int_bin_pred_sd<AArch64udiv_p>;

defm SDOT_ZZZ : sve_intx_dot<0b0, "sdot", int_aarch64_sve_sdot>;		defm SDOT_ZZZ : sve_intx_dot<0b0, "sdot", AArch64sdot>;
defm UDOT_ZZZ : sve_intx_dot<0b1, "udot", int_aarch64_sve_udot>;		defm UDOT_ZZZ : sve_intx_dot<0b1, "udot", AArch64udot>;

defm SDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b0, "sdot", int_aarch64_sve_sdot_lane>;		defm SDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b0, "sdot", int_aarch64_sve_sdot_lane>;
defm UDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b1, "udot", int_aarch64_sve_udot_lane>;		defm UDOT_ZZZI : sve_intx_dot_by_indexed_elem<0b1, "udot", int_aarch64_sve_udot_lane>;

defm SXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b000, "sxtb", AArch64sxt_mt>;		defm SXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b000, "sxtb", AArch64sxt_mt>;
defm UXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b001, "uxtb", AArch64uxt_mt>;		defm UXTB_ZPmZ : sve_int_un_pred_arit_0_h<0b001, "uxtb", AArch64uxt_mt>;
defm SXTH_ZPmZ : sve_int_un_pred_arit_0_w<0b010, "sxth", AArch64sxt_mt>;		defm SXTH_ZPmZ : sve_int_un_pred_arit_0_w<0b010, "sxth", AArch64sxt_mt>;
defm UXTH_ZPmZ : sve_int_un_pred_arit_0_w<0b011, "uxth", AArch64uxt_mt>;		defm UXTH_ZPmZ : sve_int_un_pred_arit_0_w<0b011, "uxth", AArch64uxt_mt>;
...

llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll

	...
	; CHECK: abs z0.s, p0/m, z1.s			; CHECK: abs z0.s, p0/m, z1.s
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 4 x i32> @llvm.aarch64.sve.abs.nxv4i32(<vscale x 4 x i32> %a,			%out = call <vscale x 4 x i32> @llvm.aarch64.sve.abs.nxv4i32(<vscale x 4 x i32> %a,
	<vscale x 4 x i1> %pg,			<vscale x 4 x i1> %pg,
	<vscale x 4 x i32> %b)			<vscale x 4 x i32> %b)
	ret <vscale x 4 x i32> %out			ret <vscale x 4 x i32> %out
	}			}

	define <vscale x 2 x i64> @abs_i64(<vscale x 2 x i64> %a, <vscale x 2 x i1> %pg, <vscale x 2 x i64> %b) {			define <vscale x 2 x i64> @abs_i64(<vscale x 2 x i64> %a, <vscale x 2 x i1> %pg, <vscale x 2 x i64> %b) {
	; CHECK-LABEL: abs_i64:			; CHECK-LABEL: abs_i64:
	; CHECK: abs z0.d, p0/m, z1.d			; CHECK: abs z0.d, p0/m, z1.d
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 2 x i64> @llvm.aarch64.sve.abs.nxv2i64(<vscale x 2 x i64> %a,			%out = call <vscale x 2 x i64> @llvm.aarch64.sve.abs.nxv2i64(<vscale x 2 x i64> %a,
	<vscale x 2 x i1> %pg,			<vscale x 2 x i1> %pg,
	<vscale x 2 x i64> %b)			<vscale x 2 x i64> %b)
	ret <vscale x 2 x i64> %out			ret <vscale x 2 x i64> %out
	}			}
	...
	; CHECK: sdot z0.d, z1.h, z2.h			; CHECK: sdot z0.d, z1.h, z2.h
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> %a,			%out = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> %a,
	<vscale x 8 x i16> %b,			<vscale x 8 x i16> %b,
	<vscale x 8 x i16> %c)			<vscale x 8 x i16> %c)
	ret <vscale x 2 x i64> %out			ret <vscale x 2 x i64> %out
	}			}

				define <vscale x 2 x i64> @test_sdot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: test_sdot_i64_zero:
				; CHECK: sdot z0.d, z1.h, z2.h
				; CHECK-NEXT: ret
				entry:
				%vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
				%ret = add <vscale x 2 x i64> %vdot1.i, %a
				ret <vscale x 2 x i64> %ret
				}

				define <vscale x 4 x i32> @test_sdot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
				; CHECK-LABEL: test_sdot_i32_zero:
				; CHECK: sdot z0.s, z1.b, z2.b
				; CHECK-NEXT: ret
				entry:
				%vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
				%ret = add <vscale x 4 x i32> %vdot1.i, %a
				ret <vscale x 4 x i32> %ret
				}

	; SDOT (Indexed)			; SDOT (Indexed)

	define <vscale x 4 x i32> @sdot_lane_i32(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {			define <vscale x 4 x i32> @sdot_lane_i32(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
	; CHECK-LABEL: sdot_lane_i32:			; CHECK-LABEL: sdot_lane_i32:
	; CHECK: sdot z0.s, z1.b, z2.b[2]			; CHECK: sdot z0.s, z1.b, z2.b[2]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.lane.nxv4i32(<vscale x 4 x i32> %a,			%out = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.lane.nxv4i32(<vscale x 4 x i32> %a,
	<vscale x 16 x i8> %b,			<vscale x 16 x i8> %b,
	...
	; CHECK: udot z0.d, z1.h, z2.h			; CHECK: udot z0.d, z1.h, z2.h
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> %a,			%out = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> %a,
	<vscale x 8 x i16> %b,			<vscale x 8 x i16> %b,
	<vscale x 8 x i16> %c)			<vscale x 8 x i16> %c)
	ret <vscale x 2 x i64> %out			ret <vscale x 2 x i64> %out
	}			}

				define <vscale x 2 x i64> @test_udot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: test_udot_i64_zero:
				; CHECK: udot z0.d, z1.h, z2.h
				; CHECK-NEXT: ret
				entry:
				%vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
				%ret = add <vscale x 2 x i64> %vdot1.i, %a
				ret <vscale x 2 x i64> %ret
				}

				define <vscale x 4 x i32> @test_udot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
				; CHECK-LABEL: test_udot_i32_zero:
				; CHECK: udot z0.s, z1.b, z2.b
				; CHECK-NEXT: ret
				entry:
				%vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
				%ret = add <vscale x 4 x i32> %vdot1.i, %a
				ret <vscale x 4 x i32> %ret
				}

	; UDOT (Indexed)			; UDOT (Indexed)

	define <vscale x 4 x i32> @udot_lane_i32(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {			define <vscale x 4 x i32> @udot_lane_i32(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
	; CHECK-LABEL: udot_lane_i32:			; CHECK-LABEL: udot_lane_i32:
	; CHECK: udot z0.s, z1.b, z2.b[2]			; CHECK: udot z0.s, z1.b, z2.b[2]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%out = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.lane.nxv4i32(<vscale x 4 x i32> %a,			%out = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.lane.nxv4i32(<vscale x 4 x i32> %a,
	<vscale x 16 x i8> %b,			<vscale x 16 x i8> %b,
	...

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Lowering sve.dot to DOT node
ClosedPublic

