This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	Is there some reason you can't use the existing cpy_imm8_opt_lsl_i8?
353	This is a little inconsistent with the other patterns: I think we also need nxv2f32/nxv4f16/nxv2f16?
llvm/test/CodeGen/AArch64/sve-vector-splat.ll
303	It looks like splat_nxv4f32_imm is returning the integer 1, not the floating-point 1.0?

Update 2 out of 3 @efriedma comments.

cameron.mcinally marked 4 inline comments as done. Feb 19 2020, 3:08 PM

cameron.mcinally added inline comments.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	I'm not sure I understand this one. What should I replace with cpy_imm8_opt_lsl_i8? SVE8BitLslImm is looking for two i8 immediates (i8 value and i8 shift amount). cpy_imm8_opt_lsl_i8 is just checking for one i8 immediate, IINM.
353	Good catch. The f16 tests don't really work, so didn't catch the missing patterns. The nxv2f32 pattern was there. Added the missing patterns.
llvm/test/CodeGen/AArch64/sve-vector-splat.ll
303	Just a bad copy-and-paste. Updated.

Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	"class imm8_opt_lsl" has some code which looks like it's supposed to be used for matching. Granted, it's using ImmLeaf, so maybe it's just broken.

In D74856#1883971, @efriedma wrote:

> Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

Good idea. I'll do that.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	Oh, yeah, I see what you mean now. I'll dig into that...

cameron.mcinally marked an inline comment as done. Feb 20 2020, 9:57 AM

cameron.mcinally added inline comments.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	Looking closer, this seems to be in line with the other ComplexPattern uses (e.g. SVEAddSubImm8Pat). The SVE8BitLslImm ComplexPattern is used to match the two i8 operands, $imm and $shift. If I tried to use cpy_imm8_opt_lsl_i64 in its place, I don't see a way to get the individual operands by name. E.g. something like:

  def : Pat<(nxv2i64 (AArch64dup (i64 (cpy_imm8_opt_lsl_i64 i64:$imm)))), (DUP_ZI_D $???, $???)>;

efriedma added inline comments. Feb 20 2020, 10:51 AM

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	If you write something like `SVE8BitLslImm:$a`, it actually counts as two operands, I think, because that's what the ComplexPattern specifies? Not completely sure. Orthogonal to that, if we aren't going to use the predicates in cpy_imm8_opt_lsl_i64 etc., we should get rid of them.

cameron.mcinally added inline comments. Feb 20 2020, 11:52 AM

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	> If you write something like SVE8BitLslImm:$a, it actually counts as two operands, I think, because that's what the ComplexPattern specifies? Not completely sure.

`SVE8BitLslImm` usage would be something like this: `(SVE8BitLslImm i32:$a, i32:$b)` where `a` is an immediate value and `b` is a shift amount. I think that's necessary, having two separate named immediates, since `DUP_ZI_x` expects two immediate operands. E.g.

  def : Pat<(nxv2i64 (AArch64dup (i64 (SVE8BitLslImm i32:$a, i32:$b)))), (DUP_ZI_D $a, $b)>;

> Orthogonal to that, if we aren't going to use the predicates in cpy_imm8_opt_lsl_i64 etc., we should get rid of them.

We do use `cpy_imm8_opt_lsl_i64` in `DUP_ZI_x` right now, but it's slightly different:

  def : InstAlias<"mov $Zd, $imm", (!cast<Instruction>(NAME # _D) ZPR64:$Zd, cpy_imm8_opt_lsl_i64:$imm), 1>;

`cpy_imm8_opt_lsl_i64` will match to `(ops i32imm, i32imm)`, which are the two operands of the `DUP_ZI_x`. This seems desirable since the hardware wants a single immediate constructed from the 2 immediate parts. That is, we want a way to verify that the two immediates actually make sense together.

All this is really at the edge of my TblGen understanding, so take with some skepticism...

efriedma added inline comments. Feb 20 2020, 1:26 PM

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13	Looking a little more, I think I'm just wrong on the ComplexPattern thing.

> We do use cpy_imm8_opt_lsl_i64 in DUP_ZI_x right now, but it's slightly different:

That's not using the IntImm predicate; there's a separate asm operand parser function.

In D74856#1883971, @efriedma wrote:

> Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

Created D74965 for this, so that we don't have to touch the existing tests here.

Rebase to get +fullfp16 change, and update fp16 tests.

LGTM

This revision is now accepted and ready to land. Feb 21 2020, 10:27 AM

Closed by commit rG266959c0f72f: [AArch64][SVE] Add backend support for splats of immediates (authored by cameron.mcinally). Feb 21 2020, 11:27 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp	28 lines
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td	28 lines
llvm/test/CodeGen/AArch64/sve-vector-splat.ll	90 lines

Diff 245916

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp

[lines elided; enclosing context: private:]
   bool SelectCVTFixedPosOperand(SDValue N, SDValue &FixedPos) {
     return SelectCVTFixedPosOperand(N, FixedPos, RegWidth);
   }

   bool SelectCVTFixedPosOperand(SDValue N, SDValue &FixedPos, unsigned Width);

   bool SelectCMP_SWAP(SDNode *N);

+  bool SelectSVE8BitLslImm(SDValue N, SDValue &Imm, SDValue &Shift);
+
   bool SelectSVEAddSubImm(SDValue N, MVT VT, SDValue &Imm, SDValue &Shift);

   bool SelectSVELogicalImm(SDValue N, MVT VT, SDValue &Imm);

   bool SelectSVESignedArithImm(SDValue N, SDValue &Imm);

   bool SelectSVEArithImm(SDValue N, SDValue &Imm);
 };
[lines elided; enclosing context: bool AArch64DAGToDAGISel::SelectCMP_SWAP(SDNode *N) {]

   ReplaceUses(SDValue(N, 0), SDValue(CmpSwap, 0));
   ReplaceUses(SDValue(N, 1), SDValue(CmpSwap, 2));
   CurDAG->RemoveDeadNode(N);

   return true;
 }

+bool AArch64DAGToDAGISel::SelectSVE8BitLslImm(SDValue N, SDValue &Base,
+                                              SDValue &Offset) {
+  auto C = dyn_cast<ConstantSDNode>(N);
+  if (!C)
+    return false;
+
+  auto Ty = N->getValueType(0);
+
+  int64_t Imm = C->getSExtValue();
+  SDLoc DL(N);
+
+  if ((Imm >= -128) && (Imm <= 127)) {
+    Base = CurDAG->getTargetConstant(Imm, DL, Ty);
+    Offset = CurDAG->getTargetConstant(0, DL, Ty);
+    return true;
+  }
+
+  if (((Imm % 256) == 0) && (Imm >= -32768) && (Imm <= 32512)) {
+    Base = CurDAG->getTargetConstant(Imm/256, DL, Ty);
+    Offset = CurDAG->getTargetConstant(8, DL, Ty);
+    return true;
+  }
+
+  return false;
+}

 bool AArch64DAGToDAGISel::SelectSVEAddSubImm(SDValue N, MVT VT, SDValue &Imm, SDValue &Shift) {
   if (auto CNode = dyn_cast<ConstantSDNode>(N)) {
     const int64_t ImmVal = CNode->getZExtValue();
     SDLoc DL(N);

     switch (VT.SimpleTy) {
     case MVT::i8:
       if ((ImmVal & 0xFF) == ImmVal) {
[lines elided]

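The range checks in SelectSVE8BitLslImm above can be tried outside of LLVM with a small standalone model. This is only a sketch: the name `splitImm8OptLsl` and the `std::optional` return type are illustrative assumptions and not part of the patch, and the SelectionDAG plumbing (SDValue, getTargetConstant, the element type) is left out.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical standalone model of the checks in SelectSVE8BitLslImm.
// Returns {imm8, shift} when `value` fits a signed 8-bit immediate,
// optionally shifted left by 8; returns std::nullopt otherwise.
std::optional<std::pair<int64_t, unsigned>> splitImm8OptLsl(int64_t value) {
  // Case 1: the value itself fits in a signed 8-bit immediate, no shift.
  if (value >= -128 && value <= 127)
    return std::make_pair(value, 0u);

  // Case 2: a multiple of 256 whose quotient fits in a signed 8-bit
  // immediate, i.e. imm8 << 8 with imm8 in [-128, 127].
  if (value % 256 == 0 && value >= -32768 && value <= 32512)
    return std::make_pair(value / 256, 8u);

  // Anything else is not encodable in this form.
  return std::nullopt;
}
```

Under this model, 1 splits into (1, 0) and 256 into (1, 8), while 257 is rejected because it is neither in the 8-bit range nor a multiple of 256.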
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

 //=- AArch64SVEInstrInfo.td - AArch64 SVE Instructions -- tablegen ------=//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
 // AArch64 Scalable Vector Extension (SVE) Instruction definitions.
 //
 //===----------------------------------------------------------------------===//

+def SVE8BitLslImm : ComplexPattern<i32, 2, "SelectSVE8BitLslImm", [imm]>;

 // Non-faulting loads - node definitions
 //
 def SDT_AArch64_LDNF1 : SDTypeProfile<1, 3, [
   SDTCisVec<0>, SDTCisVec<1>, SDTCisPtrTy<2>,
   SDTCVecEltisVT<1,i1>, SDTCisSameNumEltsAs<0,1>
 ]>;

 def AArch64ldnf1 : SDNode<"AArch64ISD::LDNF1", SDT_AArch64_LDNF1, [SDNPHasChain, SDNPMayLoad, SDNPOptInGlue]>;
[lines elided; enclosing context: let Predicates = [HasSVE] in {]
   // Duplicate +0.0 into all vector elements
   def : Pat<(nxv8f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
   def : Pat<(nxv4f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
   def : Pat<(nxv2f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
   def : Pat<(nxv4f32 (AArch64dup (f32 fpimm0))), (DUP_ZI_S 0, 0)>;
   def : Pat<(nxv2f32 (AArch64dup (f32 fpimm0))), (DUP_ZI_S 0, 0)>;
   def : Pat<(nxv2f64 (AArch64dup (f64 fpimm0))), (DUP_ZI_D 0, 0)>;

+  // Duplicate Int immediate into all vector elements
+  def : Pat<(nxv16i8 (AArch64dup (i32 (SVE8BitLslImm i32:$a, i32:$b)))),
+            (DUP_ZI_B $a, $b)>;
+  def : Pat<(nxv8i16 (AArch64dup (i32 (SVE8BitLslImm i32:$a, i32:$b)))),
+            (DUP_ZI_H $a, $b)>;
+  def : Pat<(nxv4i32 (AArch64dup (i32 (SVE8BitLslImm i32:$a, i32:$b)))),
+            (DUP_ZI_S $a, $b)>;
+  def : Pat<(nxv2i64 (AArch64dup (i64 (SVE8BitLslImm i32:$a, i32:$b)))),
+            (DUP_ZI_D $a, $b)>;
+
+  // Duplicate FP immediate into all vector elements
+  let AddedComplexity = 2 in {
+    def : Pat<(nxv8f16 (AArch64dup fpimm16:$imm8)),
+              (FDUP_ZI_H fpimm16:$imm8)>;
+    def : Pat<(nxv4f16 (AArch64dup fpimm16:$imm8)),
+              (FDUP_ZI_H fpimm16:$imm8)>;
+    def : Pat<(nxv2f16 (AArch64dup fpimm16:$imm8)),
+              (FDUP_ZI_H fpimm16:$imm8)>;
+    def : Pat<(nxv4f32 (AArch64dup fpimm32:$imm8)),
+              (FDUP_ZI_S fpimm32:$imm8)>;
+    def : Pat<(nxv2f32 (AArch64dup fpimm32:$imm8)),
+              (FDUP_ZI_S fpimm32:$imm8)>;
+    def : Pat<(nxv2f64 (AArch64dup fpimm64:$imm8)),
+              (FDUP_ZI_D fpimm64:$imm8)>;
+  }

   // Select elements from either vector (predicated)
   defm SEL_ZPZZ : sve_int_sel_vvv<"sel", vselect>;

   defm SPLICE_ZPZ : sve_int_perm_splice<"splice", int_aarch64_sve_splice>;
   defm COMPACT_ZPZ : sve_int_perm_compact<"compact", int_aarch64_sve_compact>;
   defm INSR_ZR : sve_int_perm_insrs<"insr", AArch64insr>;
   defm INSR_ZV : sve_int_perm_insrv<"insr", AArch64insr>;
   defm EXT_ZZI : sve_int_perm_extract_i<"ext", AArch64ext>;
[lines elided]

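The `fpimm16`/`fpimm32`/`fpimm64` operands in the FDUP patterns above only match constants that fit an 8-bit floating-point immediate. Assuming the usual AArch64 rule for that encoding, the representable values are ±(16 + n)/16 * 2^r with n in [0, 15] and r in [-3, 4]. That set does not contain 0.0, which is consistent with +0.0 being handled by the separate `fpimm0` patterns that use DUP_ZI. The check below is a hypothetical standalone sketch; `isFP8Encodable` is an illustrative name, not the predicate LLVM actually uses.

```cpp
#include <cmath>

// Hypothetical helper: returns true if `v` equals
// +/-(16 + n)/16 * 2^r for some n in [0, 15] and r in [-3, 4],
// i.e. the value set assumed here for the 8-bit FP immediate.
bool isFP8Encodable(double v) {
  // Zero, infinities and NaN are outside the set.
  if (v == 0.0 || !std::isfinite(v))
    return false;

  const double mag = std::fabs(v); // the sign bit is encoded separately

  // Enumerate all 8 exponents and 16 mantissa values; every candidate
  // is exactly representable in double, so == is a safe comparison.
  for (int r = -3; r <= 4; ++r)
    for (int n = 0; n <= 15; ++n)
      if (mag == std::ldexp((16.0 + n) / 16.0, r))
        return true;

  return false;
}
```

With this model, 1.0 (the constant used by the new `*_imm` tests) is encodable, as are 0.125 and 31.0 at the two ends of the range, while 0.0, 0.1 and 32.0 are not.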
llvm/test/CodeGen/AArch64/sve-vector-splat.ll

[lines elided]
 ; CHECK-LABEL: @sve_splat_2xi64
 ; CHECK: mov z0.d, x0
 ; CHECK-NEXT: ret
   %ins = insertelement <vscale x 2 x i64> undef, i64 %val, i32 0
   %splat = shufflevector <vscale x 2 x i64> %ins, <vscale x 2 x i64> undef, <vscale x 2 x i32> zeroinitializer
   ret <vscale x 2 x i64> %splat
 }

+define <vscale x 16 x i8> @sve_splat_16xi8_imm() {
+; CHECK-LABEL: @sve_splat_16xi8_imm
+; CHECK: mov z0.b, #1
+; CHECK-NEXT: ret
+  %ins = insertelement <vscale x 16 x i8> undef, i8 1, i32 0
+  %splat = shufflevector <vscale x 16 x i8> %ins, <vscale x 16 x i8> undef, <vscale x 16 x i32> zeroinitializer
+  ret <vscale x 16 x i8> %splat
+}
+
+define <vscale x 8 x i16> @sve_splat_8xi16_imm() {
+; CHECK-LABEL: @sve_splat_8xi16_imm
+; CHECK: mov z0.h, #1
+; CHECK-NEXT: ret
+  %ins = insertelement <vscale x 8 x i16> undef, i16 1, i32 0
+  %splat = shufflevector <vscale x 8 x i16> %ins, <vscale x 8 x i16> undef, <vscale x 8 x i32> zeroinitializer
+  ret <vscale x 8 x i16> %splat
+}
+
+define <vscale x 4 x i32> @sve_splat_4xi32_imm() {
+; CHECK-LABEL: @sve_splat_4xi32_imm
+; CHECK: mov z0.s, #1
+; CHECK-NEXT: ret
+  %ins = insertelement <vscale x 4 x i32> undef, i32 1, i32 0
+  %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
+  ret <vscale x 4 x i32> %splat
+}
+
+define <vscale x 2 x i64> @sve_splat_2xi64_imm() {
+; CHECK-LABEL: @sve_splat_2xi64_imm
+; CHECK: mov z0.d, #1
+; CHECK-NEXT: ret
+  %ins = insertelement <vscale x 2 x i64> undef, i64 1, i32 0
+  %splat = shufflevector <vscale x 2 x i64> %ins, <vscale x 2 x i64> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x i64> %splat
+}

 ;; Promote splats of smaller illegal integer vector types

 define <vscale x 2 x i8> @sve_splat_2xi8(i8 %val) {
 ; CHECK-LABEL: @sve_splat_2xi8
 ; CHECK: mov z0.d, x0
 ; CHECK-NEXT: ret
   %ins = insertelement <vscale x 2 x i8> undef, i8 %val, i32 0
   %splat = shufflevector <vscale x 2 x i8> %ins, <vscale x 2 x i8> undef, <vscale x 2 x i32> zeroinitializer
[lines elided]
 }

 define <vscale x 2 x double> @splat_nxv2f64_zero() {
 ; CHECK-LABEL: splat_nxv2f64_zero:
 ; CHECK: mov z0.d, #0
 ; CHECK-NEXT: ret
   ret <vscale x 2 x double> zeroinitializer
 }

+define <vscale x 8 x half> @splat_nxv8f16_imm() {
+; CHECK-LABEL: splat_nxv8f16_imm:
+; CHECK: mov z0.h, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 8 x half> undef, half 1.0, i32 0
+  %2 = shufflevector <vscale x 8 x half> %1, <vscale x 8 x half> undef, <vscale x 8 x i32> zeroinitializer
+  ret <vscale x 8 x half> %2
+}
+
+define <vscale x 4 x half> @splat_nxv4f16_imm() {
+; CHECK-LABEL: splat_nxv4f16_imm:
+; CHECK: mov z0.h, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 4 x half> undef, half 1.0, i32 0
+  %2 = shufflevector <vscale x 4 x half> %1, <vscale x 4 x half> undef, <vscale x 4 x i32> zeroinitializer
+  ret <vscale x 4 x half> %2
+}
+
+define <vscale x 2 x half> @splat_nxv2f16_imm() {
+; CHECK-LABEL: splat_nxv2f16_imm:
+; CHECK: mov z0.h, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x half> undef, half 1.0, i32 0
+  %2 = shufflevector <vscale x 2 x half> %1, <vscale x 2 x half> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x half> %2
+}
+
+define <vscale x 4 x float> @splat_nxv4f32_imm() {
+; CHECK-LABEL: splat_nxv4f32_imm:
+; CHECK: mov z0.s, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 4 x float> undef, float 1.0, i32 0
+  %2 = shufflevector <vscale x 4 x float> %1, <vscale x 4 x float> undef, <vscale x 4 x i32> zeroinitializer
+  ret <vscale x 4 x float> %2
+}
+
+define <vscale x 2 x float> @splat_nxv2f32_imm() {
+; CHECK-LABEL: splat_nxv2f32_imm:
+; CHECK: mov z0.s, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x float> undef, float 1.0, i32 0
+  %2 = shufflevector <vscale x 2 x float> %1, <vscale x 2 x float> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x float> %2
+}
+
+define <vscale x 2 x double> @splat_nxv2f64_imm() {
+; CHECK-LABEL: splat_nxv2f64_imm:
+; CHECK: mov z0.d, #1.0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x double> undef, double 1.0, i32 0
+  %2 = shufflevector <vscale x 2 x double> %1, <vscale x 2 x double> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x double> %2
+}

[AArch64][SVE] Add backend support for splats of immediates (Closed, Public)
