
Details

Reviewers

sdesmalen
kmclaughlin
dancgr
rengolin
efriedma
huntergr

Commits

rG393173499099: [AArch64][SVE] Add initial backend support for FP splat_vector

Summary

This issue was found in D73711.

We'll also need separate changes to CodeGen wrt splat_vectors. There are some incorrect uses of scalable build_vectors (int and fp) scattered there. This is obvious when an assert to check for scalable build_vectors is added to DAG.getNode(...).

Notice the TODO in the f16 immediate splat tests. There seems to be a missing lowering for f16 constants. Instead of becoming ConstantFPs, the f16 constants are being explicitly loaded from the constant pool. Does anyone know where the special f16 constant lowerings live? I don't have a lot of experience with those.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

cameron.mcinally created this revision. Feb 14 2020, 10:57 AM

Herald added a project: Restricted Project. Feb 14 2020, 10:57 AM

Herald added subscribers: llvm-commits, psnobl, rkruppe and 3 others.

Fix bad copy-and-paste.

There seems to be a missing lowering for f16 constants. Instead of becoming ConstantFPs, the f16 constants are being explicitly loaded from the constant pool

See AArch64TargetLowering::isFPImmLegal ? We should generate an appropriate fmov with the right target features. I don't think "-mattr=+sve" implies those features, though.
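
As background for the fmov suggestion: a scalar fmov immediate uses an 8-bit encoding that, per the Arm architecture documentation, covers values of the form ±(n/16) × 2^r with 16 ≤ n ≤ 31 and -3 ≤ r ≤ 4. Notably, 0.0 is not in that set. The following is a standalone C++ sketch of that rule only. The names `isFMovImm8` and `checkExamples` are hypothetical, and this is not the code in AArch64TargetLowering::isFPImmLegal.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not an LLVM API): returns true if v has the form
// +/-(n/16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4, i.e. the value set
// of the 8-bit fmov floating-point immediate.
bool isFMovImm8(double v) {
  const double magnitude = std::fabs(v);
  for (int r = -3; r <= 4; ++r) {
    for (int n = 16; n <= 31; ++n) {
      // ldexp(x, r) computes x * 2^r exactly for these small operands.
      if (magnitude == std::ldexp(n / 16.0, r))
        return true;
    }
  }
  return false;
}

// Small self-check of the boundary cases.
void checkExamples() {
  assert(isFMovImm8(1.0));    // n = 16, r = 0
  assert(isFMovImm8(-0.125)); // smallest magnitude: n = 16, r = -3
  assert(isFMovImm8(31.0));   // largest magnitude: n = 31, r = 4
  assert(!isFMovImm8(32.0));  // just out of range
  assert(!isFMovImm8(0.0));   // zero is not encodable this way
}
```

Under this rule a zero constant has to be materialized some other way, which is consistent with the 0.0 case being the one that shows up in the tests here.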

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
300	I'm not sure adding a pseudo-instruction is worthwhile if the only benefit is avoiding an INSERT_SUBREG.
313	What do we end up generating for a non-zero float immediate? We might need a pattern to avoid an extra mov in the general case. In theory, we can generate other float immediates using the integer dup/dupm, but I guess most of them won't be useful for 32-bit or 64-bit floats. Some probably are, though; for example, you can generate 1.0 with dupm.
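
To illustrate the dupm remark: the bit pattern of 1.0f is 0x3F800000, a single contiguous run of ones, which is the shape a bitmask immediate can express. The following is a simplified standalone C++ check, assuming only the full-width 32-bit case (a rotated contiguous run of ones, excluding all-zeros and all-ones). Real bitmask immediates also allow smaller replicated element sizes, which this sketch ignores. The helper names `floatBits` and `isRotatedRunOfOnes32` are hypothetical and not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a float as its 32-bit IEEE-754 pattern.
uint32_t floatBits(float f) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Hypothetical helper: true if v is one contiguous run of ones, possibly
// wrapping around bit 31 -> bit 0, and is neither all-zeros nor all-ones.
bool isRotatedRunOfOnes32(uint32_t v) {
  if (v == 0 || v == 0xFFFFFFFFu)
    return false;
  // Rotate left by one so that bit i of 'rot' holds bit i-1 of 'v'.
  const uint32_t rot = (v << 1) | (v >> 31);
  // A bit is the start of a run if it is set and the bit below it is clear.
  const uint32_t runStarts = v & ~rot;
  // Exactly one run start means exactly one (rotated) run of ones.
  return (runStarts & (runStarts - 1)) == 0;
}

// Small self-check of the examples from the discussion.
void checkDupmExamples() {
  assert(floatBits(1.0f) == 0x3F800000u);
  assert(isRotatedRunOfOnes32(floatBits(1.0f)));  // bits 23..29 set
  assert(!isRotatedRunOfOnes32(floatBits(3.0f))); // 0x40400000: two runs
}
```

A value that fails this check may still be encodable through a smaller replicated element size, so the function is a sufficient test for the 32-bit case rather than a complete one.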

In D74632#1877178, @efriedma wrote:

There seems to be a missing lowering for f16 constants. Instead of becoming ConstantFPs, the f16 constants are being explicitly loaded from the constant pool

See AArch64TargetLowering::isFPImmLegal ? We should generate an appropriate fmov with the right target features. I don't think "-mattr=+sve" implies those features, though.

Thanks, I'll take a look.

I also just noticed that the test file I added was renamed upstream to sve-vector-splat.ll. I'll move these new tests to that file.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
300	Good point. Will update it. I'm not the original author (cherry-picked from D71712), so maybe I'm missing something though...
313	I'm fairly certain that support exists in D71712. I didn't include any of the shufflevector tests or patterns in this Diff though. My intention was to cherry-pick something small for easy reviewing. If you'd like to see that support included in this patch, I'll add it.

efriedma added inline comments. Feb 18 2020, 10:53 AM

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
313	Oh, that's fine, it doesn't need to be in the same patch. Just wanted to make sure you were considering it.

Update with @efriedma suggestions.

cameron.mcinally marked 5 inline comments as done. Feb 18 2020, 12:07 PM

efriedma added inline comments. Feb 18 2020, 1:08 PM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
878	I think `setOperationAction(ISD::ConstantFP, MVT::f16, Legal)` is going to cause a fatal error for some FP constants. Not that they would be impossible to lower appropriately, but I don't think we have the necessary patterns.

cameron.mcinally marked an inline comment as done. Feb 18 2020, 1:53 PM

cameron.mcinally added inline comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
878	Ah, good call. That was also from D71712, but yeah, there are probably missing f16 patterns. Are we ok with not folding the f16 constants for now? Or I could wait on this patch too. Any preference?

efriedma added inline comments. Feb 18 2020, 2:30 PM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
878	I'm okay if we miss the optimization for now... That said, I think making `+sve` imply `+fullfp16` solves the immediate problem you're running into, where constantfp 0.0 is getting lowered to a constant pool.

Don't mark f16 as a legal ConstantFP for SVE.

That said, I think making +sve imply +fullfp16 solves the immediate problem you're running into, where constantfp 0.0 is getting lowered to a constant pool.

I don't have a lot of intuition built up around fp16, so I'm hesitant to change it. I will open another Diff linking the two feature flags though, so we can start a discussion.

LGTM, assuming the regression test still passes.

This revision is now accepted and ready to land. Feb 18 2020, 3:04 PM

Closed by commit rG393173499099: [AArch64][SVE] Add initial backend support for FP splat_vector (authored by cameron.mcinally). Feb 19 2020, 8:23 AM

This revision was automatically updated to reflect the committed changes.

cameron.mcinally mentioned this in D73711: [AArch64][SVE] Add support for DestructiveBinary and DestructiveBinaryComm DestructiveInstTypes. Feb 19 2020, 2:11 PM

Diff 245413

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

@@ 868 lines hidden @@ if (Subtarget->hasSVE()) {
     // splat of 0 or undef) once vector selects supported in SVE codegen. See
     // D68877 for more details.
     for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
       if (isTypeLegal(VT))
         setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
     }
     setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i8, Custom);
     setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::i16, Custom);
+
+    for (MVT VT : MVT::fp_scalable_vector_valuetypes()) {
+      if (isTypeLegal(VT)) {
+        setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
+      }
+    }
   }

   PredictableSelectIsExpensive = Subtarget->predictableSelectIsExpensive();
 }

 void AArch64TargetLowering::addTypeForNEON(MVT VT, MVT PromotedBitwiseVT) {
   assert(VT.isVector() && "VT should be a vector type");

@@ 6,593 lines hidden @@ SDValue AArch64TargetLowering::LowerSPLAT_VECTOR(SDValue Op,
   EVT VT = Op.getValueType();
   EVT ElemVT = VT.getScalarType();

   SDValue SplatVal = Op.getOperand(0);

   // Extend input splat value where needed to fit into a GPR (32b or 64b only)
   // FPRs don't have this restriction.
   switch (ElemVT.getSimpleVT().SimpleTy) {
-  case MVT::i8:
-  case MVT::i16:
-  case MVT::i32:
-    SplatVal = DAG.getAnyExtOrTrunc(SplatVal, dl, MVT::i32);
-    return DAG.getNode(AArch64ISD::DUP, dl, VT, SplatVal);
-  case MVT::i64:
-    SplatVal = DAG.getAnyExtOrTrunc(SplatVal, dl, MVT::i64);
-    return DAG.getNode(AArch64ISD::DUP, dl, VT, SplatVal);
   case MVT::i1: {
     // The general case of i1. There isn't any natural way to do this,
     // so we use some trickery with whilelo.
     // TODO: Add special cases for splat of constant true/false.
     SplatVal = DAG.getAnyExtOrTrunc(SplatVal, dl, MVT::i64);
     SplatVal = DAG.getNode(ISD::SIGN_EXTEND_INREG, dl, MVT::i64, SplatVal,
                            DAG.getValueType(MVT::i1));
     SDValue ID = DAG.getTargetConstant(Intrinsic::aarch64_sve_whilelo, dl,
                                        MVT::i64);
     return DAG.getNode(ISD::INTRINSIC_WO_CHAIN, dl, VT, ID,
                        DAG.getConstant(0, dl, MVT::i64), SplatVal);
   }
-  // TODO: we can support float types, but haven't added patterns yet.
+  case MVT::i8:
+  case MVT::i16:
+  case MVT::i32:
+    SplatVal = DAG.getAnyExtOrTrunc(SplatVal, dl, MVT::i32);
+    break;
+  case MVT::i64:
+    SplatVal = DAG.getAnyExtOrTrunc(SplatVal, dl, MVT::i64);
+    break;
   case MVT::f16:
   case MVT::f32:
   case MVT::f64:
+    // Fine as is
+    break;
   default:
     report_fatal_error("Unsupported SPLAT_VECTOR input operand type");
   }
+
+  return DAG.getNode(AArch64ISD::DUP, dl, VT, SplatVal);
 }

 static bool resolveBuildVector(BuildVectorSDNode *BVN, APInt &CnstBits,
                                APInt &UndefBits) {
   EVT VT = BVN->getValueType(0);
   APInt SplatBits, SplatUndef;
   unsigned SplatBitSize;
   bool HasAnyUndefs;
@@ 6,084 lines hidden @@

llvm/lib/Target/AArch64/AArch64InstrFormats.td

@@ 60 lines hidden @@

 // Pseudo instructions (don't have encoding information)
 class Pseudo<dag oops, dag iops, list<dag> pattern, string cstr = "">
     : AArch64Inst<PseudoFrm, cstr> {
   dag OutOperandList = oops;
   dag InOperandList = iops;
   let Pattern = pattern;
   let isCodeGenOnly = 1;
+  let isPseudo = 1;
 }

 // Real instructions (have encoding information)
 class EncodedI<string cstr, list<dag> pattern> : AArch64Inst<NormalFrm, cstr> {
   let Pattern = pattern;
   let Size = 4;
 }

@@ 10,955 lines hidden @@

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

@@ 290 lines hidden @@ let Predicates = [HasSVE] in {
   // Splat scalar register (unpredicated, GPR or vector + element index)
   defm DUP_ZR : sve_int_perm_dup_r<"dup", AArch64dup>;
   defm DUP_ZZI : sve_int_perm_dup_i<"dup">;

   // Splat scalar register (predicated)
   defm CPY_ZPmR : sve_int_perm_cpy_r<"cpy", AArch64dup_pred>;
   defm CPY_ZPmV : sve_int_perm_cpy_v<"cpy", AArch64dup_pred>;

+  // Duplicate FP scalar into all vector elements
+  def : Pat<(nxv8f16 (AArch64dup (f16 FPR16:$src))),
+            (DUP_ZZI_H (INSERT_SUBREG (IMPLICIT_DEF), FPR16:$src, hsub), 0)>;
+  def : Pat<(nxv4f16 (AArch64dup (f16 FPR16:$src))),
+            (DUP_ZZI_H (INSERT_SUBREG (IMPLICIT_DEF), FPR16:$src, hsub), 0)>;
+  def : Pat<(nxv2f16 (AArch64dup (f16 FPR16:$src))),
+            (DUP_ZZI_H (INSERT_SUBREG (IMPLICIT_DEF), FPR16:$src, hsub), 0)>;
+  def : Pat<(nxv4f32 (AArch64dup (f32 FPR32:$src))),
+            (DUP_ZZI_S (INSERT_SUBREG (IMPLICIT_DEF), FPR32:$src, ssub), 0)>;
+  def : Pat<(nxv2f32 (AArch64dup (f32 FPR32:$src))),
+            (DUP_ZZI_S (INSERT_SUBREG (IMPLICIT_DEF), FPR32:$src, ssub), 0)>;
+  def : Pat<(nxv2f64 (AArch64dup (f64 FPR64:$src))),
+            (DUP_ZZI_D (INSERT_SUBREG (IMPLICIT_DEF), FPR64:$src, dsub), 0)>;
+
+  // Duplicate +0.0 into all vector elements
+  def : Pat<(nxv8f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
+  def : Pat<(nxv4f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
+  def : Pat<(nxv2f16 (AArch64dup (f16 fpimm0))), (DUP_ZI_H 0, 0)>;
+  def : Pat<(nxv4f32 (AArch64dup (f32 fpimm0))), (DUP_ZI_S 0, 0)>;
+  def : Pat<(nxv2f32 (AArch64dup (f32 fpimm0))), (DUP_ZI_S 0, 0)>;
+  def : Pat<(nxv2f64 (AArch64dup (f64 fpimm0))), (DUP_ZI_D 0, 0)>;
+
   // Select elements from either vector (predicated)
   defm SEL_ZPZZ : sve_int_sel_vvv<"sel", vselect>;

   defm SPLICE_ZPZ : sve_int_perm_splice<"splice", int_aarch64_sve_splice>;
   defm COMPACT_ZPZ : sve_int_perm_compact<"compact", int_aarch64_sve_compact>;
   defm INSR_ZR : sve_int_perm_insrs<"insr", AArch64insr>;
   defm INSR_ZV : sve_int_perm_insrv<"insr", AArch64insr>;
   defm EXT_ZZI : sve_int_perm_extract_i<"ext", AArch64ext>;
@@ 1,529 lines hidden @@

llvm/test/CodeGen/AArch64/sve-vector-splat.ll

@@ 127 lines hidden @@
 ; CHECK-LABEL: @sve_splat_16xi1
 ; CHECK: sbfx x8, x0, #0, #1
 ; CHECK-NEXT: whilelo p0.b, xzr, x8
 ; CHECK-NEXT: ret
   %ins = insertelement <vscale x 16 x i1> undef, i1 %val, i32 0
   %splat = shufflevector <vscale x 16 x i1> %ins, <vscale x 16 x i1> undef, <vscale x 16 x i32> zeroinitializer
   ret <vscale x 16 x i1> %splat
 }
+
+;; Splats of legal floating point vector types
+
+define <vscale x 8 x half> @splat_nxv8f16(half %val) {
+; CHECK-LABEL: splat_nxv8f16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 8 x half> undef, half %val, i32 0
+  %2 = shufflevector <vscale x 8 x half> %1, <vscale x 8 x half> undef, <vscale x 8 x i32> zeroinitializer
+  ret <vscale x 8 x half> %2
+}
+
+define <vscale x 4 x half> @splat_nxv4f16(half %val) {
+; CHECK-LABEL: splat_nxv4f16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 4 x half> undef, half %val, i32 0
+  %2 = shufflevector <vscale x 4 x half> %1, <vscale x 4 x half> undef, <vscale x 4 x i32> zeroinitializer
+  ret <vscale x 4 x half> %2
+}
+
+define <vscale x 2 x half> @splat_nxv2f16(half %val) {
+; CHECK-LABEL: splat_nxv2f16:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x half> undef, half %val, i32 0
+  %2 = shufflevector <vscale x 2 x half> %1, <vscale x 2 x half> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x half> %2
+}
+
+define <vscale x 4 x float> @splat_nxv4f32(float %val) {
+; CHECK-LABEL: splat_nxv4f32:
+; CHECK: mov z0.s, s0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 4 x float> undef, float %val, i32 0
+  %2 = shufflevector <vscale x 4 x float> %1, <vscale x 4 x float> undef, <vscale x 4 x i32> zeroinitializer
+  ret <vscale x 4 x float> %2
+}
+
+define <vscale x 2 x float> @splat_nxv2f32(float %val) {
+; CHECK-LABEL: splat_nxv2f32:
+; CHECK: mov z0.s, s0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x float> undef, float %val, i32 0
+  %2 = shufflevector <vscale x 2 x float> %1, <vscale x 2 x float> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x float> %2
+}
+
+define <vscale x 2 x double> @splat_nxv2f64(double %val) {
+; CHECK-LABEL: splat_nxv2f64:
+; CHECK: mov z0.d, d0
+; CHECK-NEXT: ret
+  %1 = insertelement <vscale x 2 x double> undef, double %val, i32 0
+  %2 = shufflevector <vscale x 2 x double> %1, <vscale x 2 x double> undef, <vscale x 2 x i32> zeroinitializer
+  ret <vscale x 2 x double> %2
+}
+
+; TODO: The f16 constant should be folded into the move.
+define <vscale x 8 x half> @splat_nxv8f16_zero() {
+; CHECK-LABEL: splat_nxv8f16_zero:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  ret <vscale x 8 x half> zeroinitializer
+}
+
+; TODO: The f16 constant should be folded into the move.
+define <vscale x 4 x half> @splat_nxv4f16_zero() {
+; CHECK-LABEL: splat_nxv4f16_zero:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  ret <vscale x 4 x half> zeroinitializer
+}
+
+; TODO: The f16 constant should be folded into the move.
+define <vscale x 2 x half> @splat_nxv2f16_zero() {
+; CHECK-LABEL: splat_nxv2f16_zero:
+; CHECK: mov z0.h, h0
+; CHECK-NEXT: ret
+  ret <vscale x 2 x half> zeroinitializer
+}
+
+define <vscale x 4 x float> @splat_nxv4f32_zero() {
+; CHECK-LABEL: splat_nxv4f32_zero:
+; CHECK: mov z0.s, #0
+; CHECK-NEXT: ret
+  ret <vscale x 4 x float> zeroinitializer
+}
+
+define <vscale x 2 x float> @splat_nxv2f32_zero() {
+; CHECK-LABEL: splat_nxv2f32_zero:
+; CHECK: mov z0.s, #0
+; CHECK-NEXT: ret
+  ret <vscale x 2 x float> zeroinitializer
+}
+
+define <vscale x 2 x double> @splat_nxv2f64_zero() {
+; CHECK-LABEL: splat_nxv2f64_zero:
+; CHECK: mov z0.d, #0
+; CHECK-NEXT: ret
+  ret <vscale x 2 x double> zeroinitializer
+}

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add initial backend support for FP splat_vector
Closed, Public
