This is an archive of the discontinued LLVM Phabricator instance.

Differential D99910

[RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.
Closed, Public

Authored by craig.topper on Apr 5 2021, 5:47 PM.

Details

Reviewers

frasercrmck
khchen
HsiangKai
arcbbb
evandro
kito-cheng

Commits

rGf087d7544a41: [RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.

Summary

This can't use our normal strategy of splatting the scalar and using
a .vv operation instead of .vx.

Instead this patch bitcasts the vector to the equivalent SEW=32
vector and inserts the scalar parts using two vslide1up/down. We
do that unmasked and apply the mask separately at the end with
a vmerge.
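
The strategy above can be checked with a small scalar model. This is only a sketch under stated assumptions, not the patch's code: `slide1down`, `slide1up`, `toI32`, `toI64`, `slide1downViaI32`, `slide1upViaI32` and `mergeMasked` are hypothetical names, vectors are plain `std::vector`s, and elements at or beyond `vl` are simply left unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of vslide1down.vx on the first vl elements: out[i] = in[i + 1] for
// i < vl - 1 and out[vl - 1] = scalar. Elements at or beyond vl are left
// unchanged (an assumption of this model, not a statement about vtype).
template <typename T>
std::vector<T> slide1down(const std::vector<T> &in, T scalar, std::size_t vl) {
  assert(vl <= in.size() && "vl must not exceed the modeled vector length");
  std::vector<T> out = in;
  for (std::size_t i = 0; i + 1 < vl; ++i)
    out[i] = in[i + 1];
  if (vl > 0)
    out[vl - 1] = scalar;
  return out;
}

// Model of vslide1up.vx: out[0] = scalar, out[i] = in[i - 1] for 0 < i < vl.
template <typename T>
std::vector<T> slide1up(const std::vector<T> &in, T scalar, std::size_t vl) {
  assert(vl <= in.size() && "vl must not exceed the modeled vector length");
  std::vector<T> out = in;
  for (std::size_t i = 1; i < vl; ++i)
    out[i] = in[i - 1];
  if (vl > 0)
    out[0] = scalar;
  return out;
}

// Bitcast to the SEW=32 view: the low half of each i64 element comes first.
std::vector<std::uint32_t> toI32(const std::vector<std::uint64_t> &v) {
  std::vector<std::uint32_t> out;
  for (std::uint64_t e : v) {
    out.push_back(static_cast<std::uint32_t>(e));
    out.push_back(static_cast<std::uint32_t>(e >> 32));
  }
  return out;
}

// Bitcast back to the SEW=64 view.
std::vector<std::uint64_t> toI64(const std::vector<std::uint32_t> &v) {
  std::vector<std::uint64_t> out;
  for (std::size_t i = 0; i + 1 < v.size(); i += 2)
    out.push_back(static_cast<std::uint64_t>(v[i]) |
                  (static_cast<std::uint64_t>(v[i + 1]) << 32));
  return out;
}

// SEW=64 slide1down built from two SEW=32 slides with a doubled VL.
std::vector<std::uint64_t> slide1downViaI32(const std::vector<std::uint64_t> &in,
                                            std::uint64_t scalar,
                                            std::size_t vl) {
  auto lo = static_cast<std::uint32_t>(scalar);
  auto hi = static_cast<std::uint32_t>(scalar >> 32);
  std::vector<std::uint32_t> v = toI32(in);
  v = slide1down(v, lo, 2 * vl); // low half goes in first
  v = slide1down(v, hi, 2 * vl); // high half ends up in the last slot
  return toI64(v);
}

// SEW=64 slide1up: the high half goes in first so the low half lands at 0.
std::vector<std::uint64_t> slide1upViaI32(const std::vector<std::uint64_t> &in,
                                          std::uint64_t scalar,
                                          std::size_t vl) {
  auto lo = static_cast<std::uint32_t>(scalar);
  auto hi = static_cast<std::uint32_t>(scalar >> 32);
  std::vector<std::uint32_t> v = toI32(in);
  v = slide1up(v, hi, 2 * vl);
  v = slide1up(v, lo, 2 * vl);
  return toI64(v);
}

// Apply the mask after the unmasked slides, in the way a vmerge would:
// active elements take the slide result, inactive ones keep maskedOff.
std::vector<std::uint64_t> mergeMasked(const std::vector<bool> &mask,
                                       const std::vector<std::uint64_t> &result,
                                       const std::vector<std::uint64_t> &maskedOff,
                                       std::size_t vl) {
  std::vector<std::uint64_t> out = maskedOff;
  for (std::size_t i = 0; i < vl; ++i)
    if (mask[i])
      out[i] = result[i];
  return out;
}
```

In this model the two-step SEW=32 version agrees with the direct SEW=64 slide for every `vl`, including `vl == 0`, where the doubled VL is also zero and nothing is written.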

For vslide1up there may be some other options here like getting
i64 into element 0 and using vslideup.vi with this vector as
vd and the original source as vs1. Masking would still need to
be done afterwards.
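
As a rough illustration of that alternative (which the patch does not implement), the scalar model below places the scalar in element 0 of the destination and then slides the source up by one. The names `slide1up`, `slideup` and `slide1upViaSlideup` are hypothetical, masking is left out, and the destination starts as a copy of the source only so that untouched tail elements compare equal in the model.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference model of vslide1up.vx: out[0] = scalar, out[i] = in[i - 1] for
// 0 < i < vl; elements at or beyond vl are left unchanged (model assumption).
std::vector<std::uint64_t> slide1up(const std::vector<std::uint64_t> &in,
                                    std::uint64_t scalar, std::size_t vl) {
  assert(vl <= in.size() && "vl must not exceed the modeled vector length");
  std::vector<std::uint64_t> out = in;
  for (std::size_t i = 1; i < vl; ++i)
    out[i] = in[i - 1];
  if (vl > 0)
    out[0] = scalar;
  return out;
}

// Model of a slide up by a constant offset: dest[i] = src[i - offset] for
// offset <= i < vl; destination elements below offset keep their old value.
std::vector<std::uint64_t> slideup(std::vector<std::uint64_t> dest,
                                   const std::vector<std::uint64_t> &src,
                                   std::size_t offset, std::size_t vl) {
  assert(dest.size() == src.size() && vl <= dest.size());
  for (std::size_t i = offset; i < vl; ++i)
    dest[i] = src[i - offset];
  return dest;
}

// vslide1up expressed as "scalar into element 0, then slide up by one".
std::vector<std::uint64_t>
slide1upViaSlideup(const std::vector<std::uint64_t> &src, std::uint64_t scalar,
                   std::size_t vl) {
  std::vector<std::uint64_t> dest = src; // tail convention of this model only
  if (vl > 0)
    dest[0] = scalar; // stands in for getting the i64 into element 0
  return slideup(dest, src, 1, vl);
}
```

Because the slide leaves element 0 of the destination alone, the scalar written there survives and the remaining active elements receive the shifted source.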

That idea doesn't work for vslide1down. We need to slidedown and
then insert a single scalar at vl-1 which we could do with a
vslideup, but that assumes vl > 0 which I don't think we can assume.

The i32 double slide1down implemented here is the best I could come
up with and I just made vslide1up consistent.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Apr 5 2021, 5:47 PM

Herald added subscribers: StephenFan, vkmr, luismarques and 24 others. Apr 5 2021, 5:47 PM

craig.topper requested review of this revision. Apr 5 2021, 5:47 PM

Herald added a project: Restricted Project. Apr 5 2021, 5:47 PM

Herald added a subscriber: MaskRay.

craig.topper added a reviewer: kito-cheng. Apr 5 2021, 5:47 PM

Harbormaster completed remote builds in B97199: Diff 335360. Apr 5 2021, 6:20 PM

Seems like a good strategy to me.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2986	I can't remember exactly how the intrinsics work but is it able to omit this if you're using `zero` as the vector length (i.e. VLMAX)?

This revision is now accepted and ready to land. Apr 7 2021, 7:35 AM

craig.topper added inline comments. Apr 7 2021, 10:01 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2986	Passing 0 to the intrinsic is really 0 not VLMAX. If you want VLMAX you would need to call one of the vsetvlmax_*() intrinsics before this and pass the return value which would be the real VLMAX for the vtype. I suppose we could try to figure out that the producing instruction is a vsetvlmax intrinsic with SEW=64 vtype and create a new vsetvlmax intrinsic with SEW=32 to pass here.
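
The arithmetic behind this exchange can be sketched with a simplified model of the vector-length calculation. Everything here is an assumption for illustration: `vlmax` and `setvl` are hypothetical helpers, only integer LMUL is modeled, and `vl = min(AVL, VLMAX)` is just one behaviour the vector specification permits. The point is that a requested length of 0 stays 0, and that VLMAX at SEW=32 is twice VLMAX at SEW=64 for the same LMUL, which is why doubling the SEW=64 VL is consistent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// VLMAX = LMUL * VLEN / SEW, for integer LMUL only (fractional LMUL is not
// modeled in this sketch).
std::uint64_t vlmax(std::uint64_t vlenBits, std::uint64_t sewBits,
                    std::uint64_t lmul) {
  assert(sewBits != 0 && "SEW must be non-zero");
  return lmul * vlenBits / sewBits;
}

// Simplified vector-length rule: the granted vl is the requested length
// capped at VLMAX. A request of 0 therefore yields 0, not VLMAX.
std::uint64_t setvl(std::uint64_t avl, std::uint64_t vlenBits,
                    std::uint64_t sewBits, std::uint64_t lmul) {
  return std::min(avl, vlmax(vlenBits, sewBits, lmul));
}
```

With VLEN=128 and LMUL=1 in this model, `vlmax` is 2 at SEW=64 and 4 at SEW=32, so a SEW=64 length shifted left by one never exceeds the SEW=32 maximum.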

Closed by commit rGf087d7544a41: [RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32. (authored by craig.topper). Apr 7 2021, 10:45 AM

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rGf087d7544a41: [RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.

Revision Contents

Path                                                  Size
llvm/lib/Target/RISCV/RISCVISelLowering.h             5 lines
llvm/lib/Target/RISCV/RISCVISelLowering.cpp           65 lines
llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td    6 lines
llvm/test/CodeGen/RISCV/rvv/vslide1down-rv32.ll       200 lines
llvm/test/CodeGen/RISCV/rvv/vslide1up-rv32.ll         200 lines

Diff 335867

llvm/lib/Target/RISCV/RISCVISelLowering.h

@@ enum NodeType : unsigned {
   VLEFF,
   VLEFF_MASK,
   // Matches the semantics of vslideup/vslidedown. The first operand is the
   // pass-thru operand, the second is the source vector, the third is the
   // XLenVT index (either constant or non-constant), the fourth is the mask
   // and the fifth the VL.
   VSLIDEUP_VL,
   VSLIDEDOWN_VL,
-  // Matches the semantics of vslide1up. The first operand is the source
-  // vector, the second is the XLenVT scalar value. The third and fourth
+  // Matches the semantics of vslide1up/slide1down. The first operand is the
+  // source vector, the second is the XLenVT scalar value. The third and fourth
   // operands are the mask and VL operands.
   VSLIDE1UP_VL,
+  VSLIDE1DOWN_VL,
   // Matches the semantics of the vid.v instruction, with a mask and VL
   // operand.
   VID_VL,
   // Matches the semantics of the vfcnvt.rod function (Convert double-width
   // float to single-width float, rounding towards odd). Takes a double-width
   // float vector and produces a single-width float vector. Also has a mask and
   // VL operand.
   VFNCVT_ROD_VL,

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

@@ case Intrinsic::riscv_vmv_s_x: {
     SDValue Mask = DAG.getNode(RISCVISD::VMSET_VL, DL, MaskVT, VL);
     SDValue VID = DAG.getNode(RISCVISD::VID_VL, DL, VT, Mask, VL);
     SDValue SelectCond =
         DAG.getNode(RISCVISD::SETCC_VL, DL, MaskVT, VID, SplattedIdx,
                     DAG.getCondCode(ISD::SETEQ), Mask, VL);
     return DAG.getNode(RISCVISD::VSELECT_VL, DL, VT, SelectCond, SplattedVal,
                        Vec, VL);
   }
+  case Intrinsic::riscv_vslide1up:
+  case Intrinsic::riscv_vslide1down:
+  case Intrinsic::riscv_vslide1up_mask:
+  case Intrinsic::riscv_vslide1down_mask: {
+    // We need to special case these when the scalar is larger than XLen.
+    unsigned NumOps = Op.getNumOperands();
+    bool IsMasked = NumOps == 6;
+    unsigned OpOffset = IsMasked ? 1 : 0;
+    SDValue Scalar = Op.getOperand(2 + OpOffset);
+    if (Scalar.getValueType().bitsLE(XLenVT))
+      break;
+
+    // Splatting a sign extended constant is fine.
+    if (auto *CVal = dyn_cast<ConstantSDNode>(Scalar))
+      if (isInt<32>(CVal->getSExtValue()))
+        break;
+
+    MVT VT = Op.getSimpleValueType();
+    assert(VT.getVectorElementType() == MVT::i64 &&
+           Scalar.getValueType() == MVT::i64 && "Unexpected VTs");
+
+    // Convert the vector source to the equivalent nxvXi32 vector.
+    MVT I32VT = MVT::getVectorVT(MVT::i32, VT.getVectorElementCount() * 2);
+    SDValue Vec = DAG.getBitcast(I32VT, Op.getOperand(1 + OpOffset));
+
+    SDValue ScalarLo = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, MVT::i32, Scalar,
+                                   DAG.getConstant(0, DL, XLenVT));
+    SDValue ScalarHi = DAG.getNode(ISD::EXTRACT_ELEMENT, DL, MVT::i32, Scalar,
+                                   DAG.getConstant(1, DL, XLenVT));
+
+    // Double the VL since we halved SEW.
+    SDValue VL = Op.getOperand(NumOps - 1);
+    SDValue I32VL =
+        DAG.getNode(ISD::SHL, DL, XLenVT, VL, DAG.getConstant(1, DL, XLenVT));
+
+    MVT I32MaskVT = MVT::getVectorVT(MVT::i1, I32VT.getVectorElementCount());
+    SDValue I32Mask = DAG.getNode(RISCVISD::VMSET_VL, DL, I32MaskVT, VL);
+
+    // Shift the two scalar parts in using SEW=32 slide1up/slide1down
+    // instructions.
+    if (IntNo == Intrinsic::riscv_vslide1up ||
+        IntNo == Intrinsic::riscv_vslide1up_mask) {
+      Vec = DAG.getNode(RISCVISD::VSLIDE1UP_VL, DL, I32VT, Vec, ScalarHi,
+                        I32Mask, I32VL);
+      Vec = DAG.getNode(RISCVISD::VSLIDE1UP_VL, DL, I32VT, Vec, ScalarLo,
+                        I32Mask, I32VL);
+    } else {
+      Vec = DAG.getNode(RISCVISD::VSLIDE1DOWN_VL, DL, I32VT, Vec, ScalarLo,
+                        I32Mask, I32VL);
+      Vec = DAG.getNode(RISCVISD::VSLIDE1DOWN_VL, DL, I32VT, Vec, ScalarHi,
+                        I32Mask, I32VL);
+    }
+
+    // Convert back to nxvXi64.
+    Vec = DAG.getBitcast(VT, Vec);
+
+    if (!IsMasked)
+      return Vec;
+
+    // Apply mask after the operation.
+    SDValue Mask = Op.getOperand(NumOps - 2);
+    SDValue MaskedOff = Op.getOperand(1);
+    return DAG.getNode(RISCVISD::VSELECT_VL, DL, VT, Mask, Vec, MaskedOff, VL);
+  }
   }

   return lowerVectorIntrinsicSplats(Op, DAG, Subtarget);
 }

 SDValue RISCVTargetLowering::LowerINTRINSIC_W_CHAIN(SDValue Op,
                                                     SelectionDAG &DAG) const {
   return lowerVectorIntrinsicSplats(Op, DAG, Subtarget);
@@ #define NODE_NAME_CASE(NODE) \
   NODE_NAME_CASE(SPLAT_VECTOR_I64)
   NODE_NAME_CASE(READ_VLENB)
   NODE_NAME_CASE(TRUNCATE_VECTOR_VL)
   NODE_NAME_CASE(VLEFF)
   NODE_NAME_CASE(VLEFF_MASK)
   NODE_NAME_CASE(VSLIDEUP_VL)
   NODE_NAME_CASE(VSLIDE1UP_VL)
   NODE_NAME_CASE(VSLIDEDOWN_VL)
+  NODE_NAME_CASE(VSLIDE1DOWN_VL)
   NODE_NAME_CASE(VID_VL)
   NODE_NAME_CASE(VFNCVT_ROD_VL)
   NODE_NAME_CASE(VECREDUCE_ADD_VL)
   NODE_NAME_CASE(VECREDUCE_UMAX_VL)
   NODE_NAME_CASE(VECREDUCE_SMAX_VL)
   NODE_NAME_CASE(VECREDUCE_UMIN_VL)
   NODE_NAME_CASE(VECREDUCE_SMIN_VL)
   NODE_NAME_CASE(VECREDUCE_AND_VL)

llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td

 def SDTRVVSlide1 : SDTypeProfile<1, 4, [
   SDTCisVec<0>, SDTCisSameAs<1, 0>, SDTCisInt<0>, SDTCisVT<2, XLenVT>,
   SDTCVecEltisVT<3, i1>, SDTCisSameNumEltsAs<0, 3>, SDTCisVT<4, XLenVT>
 ]>;

 def riscv_slideup_vl : SDNode<"RISCVISD::VSLIDEUP_VL", SDTRVVSlide, []>;
 def riscv_slide1up_vl : SDNode<"RISCVISD::VSLIDE1UP_VL", SDTRVVSlide1, []>;
 def riscv_slidedown_vl : SDNode<"RISCVISD::VSLIDEDOWN_VL", SDTRVVSlide, []>;
+def riscv_slide1down_vl : SDNode<"RISCVISD::VSLIDE1DOWN_VL", SDTRVVSlide1, []>;

 let Predicates = [HasStdExtV] in {

 foreach vti = AllIntegerVectors in {
   def : Pat<(vti.Vector (riscv_vid_vl (vti.Mask true_mask),
                                       (XLenVT (VLOp GPR:$vl)))),
             (!cast<Instruction>("PseudoVID_V_"#vti.LMul.MX) GPR:$vl, vti.SEW)>;

   def : Pat<(vti.Vector (riscv_slide1up_vl (vti.Vector vti.RegClass:$rs1),
                                            GPR:$rs2, (vti.Mask true_mask),
                                            (XLenVT (VLOp GPR:$vl)))),
             (!cast<Instruction>("PseudoVSLIDE1UP_VX_"#vti.LMul.MX)
                 vti.RegClass:$rs1, GPR:$rs2, GPR:$vl, vti.SEW)>;
+  def : Pat<(vti.Vector (riscv_slide1down_vl (vti.Vector vti.RegClass:$rs1),
+                                             GPR:$rs2, (vti.Mask true_mask),
+                                             (XLenVT (VLOp GPR:$vl)))),
+            (!cast<Instruction>("PseudoVSLIDE1DOWN_VX_"#vti.LMul.MX)
+                vti.RegClass:$rs1, GPR:$rs2, GPR:$vl, vti.SEW)>;
 }

 foreach vti = !listconcat(AllIntegerVectors, AllFloatVectors) in {
   def : Pat<(vti.Vector (riscv_slideup_vl (vti.Vector vti.RegClass:$rs3),
                                           (vti.Vector vti.RegClass:$rs1),
                                           uimm5:$rs2, (vti.Mask true_mask),
                                           (XLenVT (VLOp GPR:$vl)))),
             (!cast<Instruction>("PseudoVSLIDEUP_VI_"#vti.LMul.MX)

llvm/test/CodeGen/RISCV/rvv/vslide1down-rv32.ll

%a = call <vscale x 16 x i32> @llvm.riscv.vslide1down.mask.nxv16i32.i32(
<vscale x 16 x i32> %0,
<vscale x 16 x i32> %1,
i32 %2,
<vscale x 16 x i1> %3,
i32 %4)

ret <vscale x 16 x i32> %a
}

		declare <vscale x 1 x i64> @llvm.riscv.vslide1down.nxv1i64.i64(
		<vscale x 1 x i64>,
		i64,
		i32);

		define <vscale x 1 x i64> @intrinsic_vslide1down_vx_nxv1i64_nxv1i64_i64(<vscale x 1 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_vx_nxv1i64_nxv1i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m1,ta,mu
		; CHECK-NEXT: vslide1down.vx v25, v8, a0
		; CHECK-NEXT: vslide1down.vx v8, v25, a1
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 1 x i64> @llvm.riscv.vslide1down.nxv1i64.i64(
		<vscale x 1 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 1 x i64> %a
		}

		declare <vscale x 1 x i64> @llvm.riscv.vslide1down.mask.nxv1i64.i64(
		<vscale x 1 x i64>,
		<vscale x 1 x i64>,
		i64,
		<vscale x 1 x i1>,
		i32);

		define <vscale x 1 x i64> @intrinsic_vslide1down_mask_vx_nxv1i64_nxv1i64_i64(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, i64 %2, <vscale x 1 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_mask_vx_nxv1i64_nxv1i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m1,ta,mu
		; CHECK-NEXT: vslide1down.vx v25, v9, a0
		; CHECK-NEXT: vslide1down.vx v25, v25, a1
		; CHECK-NEXT: vsetvli a0, a2, e64,m1,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v25, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 1 x i64> @llvm.riscv.vslide1down.mask.nxv1i64.i64(
		<vscale x 1 x i64> %0,
		<vscale x 1 x i64> %1,
		i64 %2,
		<vscale x 1 x i1> %3,
		i32 %4)

		ret <vscale x 1 x i64> %a
		}

		declare <vscale x 2 x i64> @llvm.riscv.vslide1down.nxv2i64.i64(
		<vscale x 2 x i64>,
		i64,
		i32);

		define <vscale x 2 x i64> @intrinsic_vslide1down_vx_nxv2i64_nxv2i64_i64(<vscale x 2 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_vx_nxv2i64_nxv2i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m2,ta,mu
		; CHECK-NEXT: vslide1down.vx v26, v8, a0
		; CHECK-NEXT: vslide1down.vx v8, v26, a1
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 2 x i64> @llvm.riscv.vslide1down.nxv2i64.i64(
		<vscale x 2 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 2 x i64> %a
		}

		declare <vscale x 2 x i64> @llvm.riscv.vslide1down.mask.nxv2i64.i64(
		<vscale x 2 x i64>,
		<vscale x 2 x i64>,
		i64,
		<vscale x 2 x i1>,
		i32);

		define <vscale x 2 x i64> @intrinsic_vslide1down_mask_vx_nxv2i64_nxv2i64_i64(<vscale x 2 x i64> %0, <vscale x 2 x i64> %1, i64 %2, <vscale x 2 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_mask_vx_nxv2i64_nxv2i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m2,ta,mu
		; CHECK-NEXT: vslide1down.vx v26, v10, a0
		; CHECK-NEXT: vslide1down.vx v26, v26, a1
		; CHECK-NEXT: vsetvli a0, a2, e64,m2,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v26, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 2 x i64> @llvm.riscv.vslide1down.mask.nxv2i64.i64(
		<vscale x 2 x i64> %0,
		<vscale x 2 x i64> %1,
		i64 %2,
		<vscale x 2 x i1> %3,
		i32 %4)

		ret <vscale x 2 x i64> %a
		}

		declare <vscale x 4 x i64> @llvm.riscv.vslide1down.nxv4i64.i64(
		<vscale x 4 x i64>,
		i64,
		i32);

		define <vscale x 4 x i64> @intrinsic_vslide1down_vx_nxv4i64_nxv4i64_i64(<vscale x 4 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_vx_nxv4i64_nxv4i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m4,ta,mu
		; CHECK-NEXT: vslide1down.vx v28, v8, a0
		; CHECK-NEXT: vslide1down.vx v8, v28, a1
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 4 x i64> @llvm.riscv.vslide1down.nxv4i64.i64(
		<vscale x 4 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 4 x i64> %a
		}

		declare <vscale x 4 x i64> @llvm.riscv.vslide1down.mask.nxv4i64.i64(
		<vscale x 4 x i64>,
		<vscale x 4 x i64>,
		i64,
		<vscale x 4 x i1>,
		i32);

		define <vscale x 4 x i64> @intrinsic_vslide1down_mask_vx_nxv4i64_nxv4i64_i64(<vscale x 4 x i64> %0, <vscale x 4 x i64> %1, i64 %2, <vscale x 4 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_mask_vx_nxv4i64_nxv4i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m4,ta,mu
		; CHECK-NEXT: vslide1down.vx v28, v12, a0
		; CHECK-NEXT: vslide1down.vx v28, v28, a1
		; CHECK-NEXT: vsetvli a0, a2, e64,m4,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v28, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 4 x i64> @llvm.riscv.vslide1down.mask.nxv4i64.i64(
		<vscale x 4 x i64> %0,
		<vscale x 4 x i64> %1,
		i64 %2,
		<vscale x 4 x i1> %3,
		i32 %4)

		ret <vscale x 4 x i64> %a
		}

		declare <vscale x 8 x i64> @llvm.riscv.vslide1down.nxv8i64.i64(
		<vscale x 8 x i64>,
		i64,
		i32);

		define <vscale x 8 x i64> @intrinsic_vslide1down_vx_nxv8i64_nxv8i64_i64(<vscale x 8 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_vx_nxv8i64_nxv8i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m8,ta,mu
		; CHECK-NEXT: vslide1down.vx v8, v8, a0
		; CHECK-NEXT: vslide1down.vx v8, v8, a1
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 8 x i64> @llvm.riscv.vslide1down.nxv8i64.i64(
		<vscale x 8 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 8 x i64> %a
		}

		declare <vscale x 8 x i64> @llvm.riscv.vslide1down.mask.nxv8i64.i64(
		<vscale x 8 x i64>,
		<vscale x 8 x i64>,
		i64,
		<vscale x 8 x i1>,
		i32);

		define <vscale x 8 x i64> @intrinsic_vslide1down_mask_vx_nxv8i64_nxv8i64_i64(<vscale x 8 x i64> %0, <vscale x 8 x i64> %1, i64 %2, <vscale x 8 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1down_mask_vx_nxv8i64_nxv8i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m8,ta,mu
		; CHECK-NEXT: vslide1down.vx v16, v16, a0
		; CHECK-NEXT: vslide1down.vx v16, v16, a1
		; CHECK-NEXT: vsetvli a0, a2, e64,m8,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v16, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 8 x i64> @llvm.riscv.vslide1down.mask.nxv8i64.i64(
		<vscale x 8 x i64> %0,
		<vscale x 8 x i64> %1,
		i64 %2,
		<vscale x 8 x i1> %3,
		i32 %4)

		ret <vscale x 8 x i64> %a
		}

llvm/test/CodeGen/RISCV/rvv/vslide1up-rv32.ll

%a = call <vscale x 16 x i32> @llvm.riscv.vslide1up.mask.nxv16i32.i32(
<vscale x 16 x i32> %0,
<vscale x 16 x i32> %1,
i32 %2,
<vscale x 16 x i1> %3,
i32 %4)

ret <vscale x 16 x i32> %a
}

		declare <vscale x 1 x i64> @llvm.riscv.vslide1up.nxv1i64.i64(
		<vscale x 1 x i64>,
		i64,
		i32);

		define <vscale x 1 x i64> @intrinsic_vslide1up_vx_nxv1i64_nxv1i64_i64(<vscale x 1 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_vx_nxv1i64_nxv1i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m1,ta,mu
		; CHECK-NEXT: vslide1up.vx v25, v8, a1
		; CHECK-NEXT: vslide1up.vx v8, v25, a0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 1 x i64> @llvm.riscv.vslide1up.nxv1i64.i64(
		<vscale x 1 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 1 x i64> %a
		}

		declare <vscale x 1 x i64> @llvm.riscv.vslide1up.mask.nxv1i64.i64(
		<vscale x 1 x i64>,
		<vscale x 1 x i64>,
		i64,
		<vscale x 1 x i1>,
		i32);

		define <vscale x 1 x i64> @intrinsic_vslide1up_mask_vx_nxv1i64_nxv1i64_i64(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, i64 %2, <vscale x 1 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_mask_vx_nxv1i64_nxv1i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m1,ta,mu
		; CHECK-NEXT: vslide1up.vx v25, v9, a1
		; CHECK-NEXT: vslide1up.vx v26, v25, a0
		; CHECK-NEXT: vsetvli a0, a2, e64,m1,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v26, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 1 x i64> @llvm.riscv.vslide1up.mask.nxv1i64.i64(
		<vscale x 1 x i64> %0,
		<vscale x 1 x i64> %1,
		i64 %2,
		<vscale x 1 x i1> %3,
		i32 %4)

		ret <vscale x 1 x i64> %a
		}

		declare <vscale x 2 x i64> @llvm.riscv.vslide1up.nxv2i64.i64(
		<vscale x 2 x i64>,
		i64,
		i32);

		define <vscale x 2 x i64> @intrinsic_vslide1up_vx_nxv2i64_nxv2i64_i64(<vscale x 2 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_vx_nxv2i64_nxv2i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m2,ta,mu
		; CHECK-NEXT: vslide1up.vx v26, v8, a1
		; CHECK-NEXT: vslide1up.vx v8, v26, a0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 2 x i64> @llvm.riscv.vslide1up.nxv2i64.i64(
		<vscale x 2 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 2 x i64> %a
		}

		declare <vscale x 2 x i64> @llvm.riscv.vslide1up.mask.nxv2i64.i64(
		<vscale x 2 x i64>,
		<vscale x 2 x i64>,
		i64,
		<vscale x 2 x i1>,
		i32);

		define <vscale x 2 x i64> @intrinsic_vslide1up_mask_vx_nxv2i64_nxv2i64_i64(<vscale x 2 x i64> %0, <vscale x 2 x i64> %1, i64 %2, <vscale x 2 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_mask_vx_nxv2i64_nxv2i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m2,ta,mu
		; CHECK-NEXT: vslide1up.vx v26, v10, a1
		; CHECK-NEXT: vslide1up.vx v28, v26, a0
		; CHECK-NEXT: vsetvli a0, a2, e64,m2,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v28, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 2 x i64> @llvm.riscv.vslide1up.mask.nxv2i64.i64(
		<vscale x 2 x i64> %0,
		<vscale x 2 x i64> %1,
		i64 %2,
		<vscale x 2 x i1> %3,
		i32 %4)

		ret <vscale x 2 x i64> %a
		}

		declare <vscale x 4 x i64> @llvm.riscv.vslide1up.nxv4i64.i64(
		<vscale x 4 x i64>,
		i64,
		i32);

		define <vscale x 4 x i64> @intrinsic_vslide1up_vx_nxv4i64_nxv4i64_i64(<vscale x 4 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_vx_nxv4i64_nxv4i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m4,ta,mu
		; CHECK-NEXT: vslide1up.vx v28, v8, a1
		; CHECK-NEXT: vslide1up.vx v8, v28, a0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 4 x i64> @llvm.riscv.vslide1up.nxv4i64.i64(
		<vscale x 4 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 4 x i64> %a
		}

		declare <vscale x 4 x i64> @llvm.riscv.vslide1up.mask.nxv4i64.i64(
		<vscale x 4 x i64>,
		<vscale x 4 x i64>,
		i64,
		<vscale x 4 x i1>,
		i32);

		define <vscale x 4 x i64> @intrinsic_vslide1up_mask_vx_nxv4i64_nxv4i64_i64(<vscale x 4 x i64> %0, <vscale x 4 x i64> %1, i64 %2, <vscale x 4 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_mask_vx_nxv4i64_nxv4i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m4,ta,mu
		; CHECK-NEXT: vslide1up.vx v28, v12, a1
		; CHECK-NEXT: vslide1up.vx v12, v28, a0
		; CHECK-NEXT: vsetvli a0, a2, e64,m4,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v12, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 4 x i64> @llvm.riscv.vslide1up.mask.nxv4i64.i64(
		<vscale x 4 x i64> %0,
		<vscale x 4 x i64> %1,
		i64 %2,
		<vscale x 4 x i1> %3,
		i32 %4)

		ret <vscale x 4 x i64> %a
		}

		declare <vscale x 8 x i64> @llvm.riscv.vslide1up.nxv8i64.i64(
		<vscale x 8 x i64>,
		i64,
		i32);

		define <vscale x 8 x i64> @intrinsic_vslide1up_vx_nxv8i64_nxv8i64_i64(<vscale x 8 x i64> %0, i64 %1, i32 %2) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_vx_nxv8i64_nxv8i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a2, a2, 1
		; CHECK-NEXT: vsetvli a2, a2, e32,m8,ta,mu
		; CHECK-NEXT: vslide1up.vx v16, v8, a1
		; CHECK-NEXT: vslide1up.vx v8, v16, a0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 8 x i64> @llvm.riscv.vslide1up.nxv8i64.i64(
		<vscale x 8 x i64> %0,
		i64 %1,
		i32 %2)

		ret <vscale x 8 x i64> %a
		}

		declare <vscale x 8 x i64> @llvm.riscv.vslide1up.mask.nxv8i64.i64(
		<vscale x 8 x i64>,
		<vscale x 8 x i64>,
		i64,
		<vscale x 8 x i1>,
		i32);

		define <vscale x 8 x i64> @intrinsic_vslide1up_mask_vx_nxv8i64_nxv8i64_i64(<vscale x 8 x i64> %0, <vscale x 8 x i64> %1, i64 %2, <vscale x 8 x i1> %3, i32 %4) nounwind {
		; CHECK-LABEL: intrinsic_vslide1up_mask_vx_nxv8i64_nxv8i64_i64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: slli a3, a2, 1
		; CHECK-NEXT: vsetvli a3, a3, e32,m8,ta,mu
		; CHECK-NEXT: vslide1up.vx v24, v16, a1
		; CHECK-NEXT: vslide1up.vx v16, v24, a0
		; CHECK-NEXT: vsetvli a0, a2, e64,m8,ta,mu
		; CHECK-NEXT: vmerge.vvm v8, v8, v16, v0
		; CHECK-NEXT: jalr zero, 0(ra)
		entry:
		%a = call <vscale x 8 x i64> @llvm.riscv.vslide1up.mask.nxv8i64.i64(
		<vscale x 8 x i64> %0,
		<vscale x 8 x i64> %1,
		i64 %2,
		<vscale x 8 x i1> %3,
		i32 %4)

		ret <vscale x 8 x i64> %a
		}