This is an archive of the discontinued LLVM Phabricator instance.

llvm/include/llvm/IR/IntrinsicsAArch64.td
2028	What does `_x` mean here?
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
1222	`DL` ;-)
3595	`NumVecs` seems to always be 2 in this patch. Will we need this to work for other values in the future too? [Nit] `2` is a bit of a magic number here. What about `2` -> `/*NumVecs=*/2`?
llvm/test/CodeGen/AArch64/sve2-intrinsics-bit-permutation.ll
2	AFAIK, `-asm-verbose=0` is not currently needed here (and you don't use it in the other test). There are 2 options: (1) Leave `-asm-verbose=0` (guarantees that there are no comments in the assembly) and additionally decorate every function that you define with `nounwind` (guarantees that no CFI directives are added). This way you can safely replace every instance of `CHECK` with `CHECK-NEXT`. (2) Remove `-asm-verbose=0` and leave things as they are.
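
To illustrate why stray comment lines matter for `CHECK-NEXT`, here is a rough, line-granular model of the two directives. It is only a sketch, not FileCheck's implementation: patterns are plain substrings, and `Kind`, `Directive` and `matchesAll` are hypothetical names. The `// %bb.0:` line in the usage note is an assumed example of a verbose-assembly comment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for two FileCheck directives (substring patterns only).
enum class Kind { Check, CheckNext };

struct Directive {
  Kind K;
  std::string Pattern;
};

// Returns true if every directive matches the output lines in order.
bool matchesAll(const std::vector<std::string> &Lines,
                const std::vector<Directive> &Directives) {
  std::size_t Next = 0; // index of the line right after the previous match
  for (const Directive &D : Directives) {
    if (D.K == Kind::CheckNext) {
      // CHECK-NEXT: the pattern must be on the line directly after the
      // previous match; any intervening line makes the check fail.
      if (Next >= Lines.size() ||
          Lines[Next].find(D.Pattern) == std::string::npos)
        return false;
      ++Next;
    } else {
      // CHECK: the pattern may appear on any later line.
      std::size_t I = Next;
      while (I < Lines.size() &&
             Lines[I].find(D.Pattern) == std::string::npos)
        ++I;
      if (I == Lines.size())
        return false;
      Next = I + 1;
    }
  }
  return true;
}
```

With the lines `bdep_nxv16i8:`, `// %bb.0:`, `bdep z0.b, z0.b, z1.b`, `ret`, a plain `CHECK` for the instruction skips the comment line, while a `CHECK-NEXT` placed directly after the label check fails on it. Dropping the comment line makes the stricter sequence pass.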

Removed NumVecs parameter from SelectTableSVE2 as the value is always the same (2)
Removed unnecessary -asm-verbose=0 from the RUN line of sve2-intrinsics-bit-permutation.ll

Thanks for reviewing this, @andwar!

llvm/include/llvm/IR/IntrinsicsAArch64.td
2028	_x indicates that this is an unpredicated intrinsic.
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
3595	I agree that it's not very clear what 2 is used for here. As NumVecs will always be the same value for the tbl2 intrinsic and SelectTableSVE2 is unlikely to be used for anything else, I've removed it from the list of parameters & added a comment there to explain the value used.
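
On the `_x` question, a minimal sketch of what "unpredicated" means for a lane-wise operation. It assumes the usual description of BDEP: the low-order bits of the first operand are scattered to the bit positions set in the second operand's mask. `bdep8`, `bdepUnpredicated` and `bdepMerging` are hypothetical names; the merging form is shown only for contrast and is not an intrinsic added by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit deposit on one 8-bit lane: scatter the low-order bits of Src to the
// positions where Mask has a 1 bit; all other result bits are zero.
std::uint8_t bdep8(std::uint8_t Src, std::uint8_t Mask) {
  std::uint8_t Result = 0;
  unsigned SrcBit = 0; // next low-order bit of Src to place
  for (unsigned Pos = 0; Pos < 8; ++Pos) {
    if (Mask & (1u << Pos)) {
      if (Src & (1u << SrcBit))
        Result = static_cast<std::uint8_t>(Result | (1u << Pos));
      ++SrcBit;
    }
  }
  return Result;
}

// Unpredicated form: there is no predicate operand, so every lane is computed.
std::vector<std::uint8_t> bdepUnpredicated(const std::vector<std::uint8_t> &A,
                                           const std::vector<std::uint8_t> &B) {
  assert(A.size() == B.size());
  std::vector<std::uint8_t> Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = bdep8(A[I], B[I]);
  return Out;
}

// Hypothetical merging-predicated form, for contrast only: lanes whose
// predicate bit is false keep the value of the first operand.
std::vector<std::uint8_t> bdepMerging(const std::vector<bool> &Pg,
                                      const std::vector<std::uint8_t> &A,
                                      const std::vector<std::uint8_t> &B) {
  assert(Pg.size() == A.size() && A.size() == B.size());
  std::vector<std::uint8_t> Out(A);
  for (std::size_t I = 0; I < A.size(); ++I)
    if (Pg[I])
      Out[I] = bdep8(A[I], B[I]);
  return Out;
}
```

The unpredicated signature takes only the two data vectors, which matches the two-operand calls in the bit-permutation test.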

sdesmalen added inline comments. Feb 21 2020, 9:34 AM

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
1226	nit: You can just as well inline this value now.
1229	nit: given that we know NumVecs == 2, you can write `SmallVector<SDValue, 2>`. nit: How about `N->ops().slice(1, 2)`? https://llvm.org/doxygen/classllvm_1_1ArrayRef.html#ace7bdce94e806bb8870626657630dab0
1235	nit: Maybe just: `ReplaceNode(N, CurDAG->getMachineNode(Opc, DL, VT, { RegSeq, N->getOperand(NumVecs + 1) }));`
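
For reference, `slice(N, M)` on an `ArrayRef` drops the first N elements and keeps the next M. The sketch below mimics that on a `std::vector`, under the assumption that operand 0 of the node is the intrinsic ID, operands 1 and 2 are the table vectors, and operand 3 is the index vector. `sliceCopy` is a hypothetical helper, and unlike `ArrayRef` it copies instead of returning a view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ArrayRef<T>::slice(N, M): skip N elements, keep the next M.
// Copies the elements; the real ArrayRef returns a non-owning view.
template <typename T>
std::vector<T> sliceCopy(const std::vector<T> &Ops, std::size_t N,
                         std::size_t M) {
  assert(N + M <= Ops.size() && "slice out of range");
  auto First = Ops.begin() + static_cast<std::ptrdiff_t>(N);
  return std::vector<T>(First, First + static_cast<std::ptrdiff_t>(M));
}
```

Given operands `{ID, a, b, idx}`, `sliceCopy(Ops, 1, 2)` yields `{a, b}`, the same range as `N->op_begin() + 1` to `N->op_begin() + 1 + NumVecs` with `NumVecs == 2`.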
llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll
9	We should test this with operands that are not already consecutive. `%a` and `%b` will come in as `z0` and `z1` by definition of the calling convention. By adding a `%dummy` in between `%a` and `%b`, you can check that a `mov` is inserted to ensure both registers are consecutive.
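
A small sketch of the reasoning behind the `%dummy` suggestion. It assumes the I-th vector argument arrives in register zI (modelled only for the first eight arguments) and that a two-register list must be a register and its successor, modulo 32. All function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of the calling convention: the I-th scalable-vector
// argument is assumed to arrive in register zI.
unsigned argRegister(unsigned ArgIndex) {
  assert(ArgIndex < 8 && "sketch only models the first eight vector arguments");
  return ArgIndex;
}

// A two-register list { zN, zN+1 } needs numerically consecutive registers.
// Assumption: the second register is the first plus one, modulo 32.
bool isConsecutivePair(unsigned First, unsigned Second) {
  return (First + 1) % 32 == Second;
}

// True if a copy (mov) would be needed before the pair can be formed.
bool needsCopy(unsigned ArgA, unsigned ArgB) {
  return !isConsecutivePair(argRegister(ArgA), argRegister(ArgB));
}
```

With `%a` and `%b` as arguments 0 and 1, `needsCopy(0, 1)` is false, so the original test never exercises the copy. Putting `%dummy` between them makes them arguments 0 and 2, and `needsCopy(0, 2)` is true.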

efriedma added inline comments. Feb 21 2020, 10:13 AM

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
1220	Is it possible to write this as a TableGen pattern? We manage for other variants of tbl (for example, https://github.com/llvm/llvm-project/blob/bc7b26c333f51b4b534abb81d597c0b86123718c/llvm/lib/Target/ARM/ARMInstrNEON.td#L7059 ).

Addressed review comments:

Removed SelectTableSVE2 from AArch64ISelDAGToDAG.cpp and added tablegen patterns for the tbl2 intrinsic
Updated tests to use operands that are not consecutive to ensure that the result is still two consecutive registers

LGTM

This revision is now accepted and ready to land. Feb 25 2020, 3:09 PM

Closed by commit rG9c859fc54d92: [AArch64][SVE] Add SVE2 intrinsics for bit permutation & table lookup (authored by kmclaughlin). Feb 26 2020, 3:30 AM

This revision was automatically updated to reflect the committed changes.

Thanks for reviewing this, @sdesmalen & @efriedma!

c-rhodes mentioned this in D75197: [AArch64][SVE] Add intrinsics for bitwise permute instructions. Feb 27 2020, 1:48 AM

Revision Contents

Path	Size
llvm/include/llvm/IR/IntrinsicsAArch64.td	22 lines
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp	48 lines
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td	8 lines
llvm/lib/Target/AArch64/SVEInstrFormats.td	18 lines
llvm/test/CodeGen/AArch64/sve2-intrinsics-bit-permutation.ll	124 lines
llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll	167 lines

Diff 245835

llvm/include/llvm/IR/IntrinsicsAArch64.td

[1,047 lines not shown]	: Intrinsic<[llvm_i1_ty],
[IntrNoMem]>;		[IntrNoMem]>;

class AdvSIMD_SVE_TBL_Intrinsic		class AdvSIMD_SVE_TBL_Intrinsic
: Intrinsic<[llvm_anyvector_ty],		: Intrinsic<[llvm_anyvector_ty],
[LLVMMatchType<0>,		[LLVMMatchType<0>,
LLVMVectorOfBitcastsToInt<0>],		LLVMVectorOfBitcastsToInt<0>],
[IntrNoMem]>;		[IntrNoMem]>;

		class AdvSIMD_SVE2_TBX_Intrinsic
		: Intrinsic<[llvm_anyvector_ty],
		[LLVMMatchType<0>,
		LLVMMatchType<0>,
		LLVMVectorOfBitcastsToInt<0>],
		[IntrNoMem]>;

class SVE2_1VectorArg_Long_Intrinsic		class SVE2_1VectorArg_Long_Intrinsic
: Intrinsic<[llvm_anyvector_ty],		: Intrinsic<[llvm_anyvector_ty],
[LLVMSubdivide2VectorType<0>,		[LLVMSubdivide2VectorType<0>,
llvm_i32_ty],		llvm_i32_ty],
[IntrNoMem, ImmArg<1>]>;		[IntrNoMem, ImmArg<1>]>;

class SVE2_2VectorArg_Long_Intrinsic		class SVE2_2VectorArg_Long_Intrinsic
: Intrinsic<[llvm_anyvector_ty],		: Intrinsic<[llvm_anyvector_ty],
[938 lines not shown]
// SVE2 - Polynomial arithmetic		// SVE2 - Polynomial arithmetic
//		//

def int_aarch64_sve_eorbt : AdvSIMD_3VectorArg_Intrinsic;		def int_aarch64_sve_eorbt : AdvSIMD_3VectorArg_Intrinsic;
def int_aarch64_sve_eortb : AdvSIMD_3VectorArg_Intrinsic;		def int_aarch64_sve_eortb : AdvSIMD_3VectorArg_Intrinsic;
def int_aarch64_sve_pmullb_pair : AdvSIMD_2VectorArg_Intrinsic;		def int_aarch64_sve_pmullb_pair : AdvSIMD_2VectorArg_Intrinsic;
def int_aarch64_sve_pmullt_pair : AdvSIMD_2VectorArg_Intrinsic;		def int_aarch64_sve_pmullt_pair : AdvSIMD_2VectorArg_Intrinsic;

		//
		// SVE2 - Extended table lookup/permute
		//

		def int_aarch64_sve_tbl2 : AdvSIMD_SVE2_TBX_Intrinsic;
		def int_aarch64_sve_tbx : AdvSIMD_SVE2_TBX_Intrinsic;

		//
		// SVE2 - Optional bit permutation
		//

		def int_aarch64_sve_bdep_x : AdvSIMD_2VectorArg_Intrinsic;
		def int_aarch64_sve_bext_x : AdvSIMD_2VectorArg_Intrinsic;
		def int_aarch64_sve_bgrp_x : AdvSIMD_2VectorArg_Intrinsic;

}		}

llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp

[194 lines not shown]	public:
}		}

/// Form sequences of consecutive 64/128-bit registers for use in NEON		/// Form sequences of consecutive 64/128-bit registers for use in NEON
/// instructions making use of a vector-list (e.g. ldN, tbl). Vecs must have		/// instructions making use of a vector-list (e.g. ldN, tbl). Vecs must have
/// between 1 and 4 elements. If it contains a single element that is returned		/// between 1 and 4 elements. If it contains a single element that is returned
/// unchanged; otherwise a REG_SEQUENCE value is returned.		/// unchanged; otherwise a REG_SEQUENCE value is returned.
SDValue createDTuple(ArrayRef<SDValue> Vecs);		SDValue createDTuple(ArrayRef<SDValue> Vecs);
SDValue createQTuple(ArrayRef<SDValue> Vecs);		SDValue createQTuple(ArrayRef<SDValue> Vecs);
		// Same thing for SVE instructions making use of lists of Z registers
		SDValue createZTuple(ArrayRef<SDValue> Vecs);

/// Generic helper for the createDTuple/createQTuple		/// Generic helper for the createDTuple/createQTuple
/// functions. Those should almost always be called instead.		/// functions. Those should almost always be called instead.
SDValue createTuple(ArrayRef<SDValue> Vecs, const unsigned RegClassIDs[],		SDValue createTuple(ArrayRef<SDValue> Vecs, const unsigned RegClassIDs[],
const unsigned SubRegs[]);		const unsigned SubRegs[]);

void SelectTable(SDNode *N, unsigned NumVecs, unsigned Opc, bool isExt);		void SelectTable(SDNode *N, unsigned NumVecs, unsigned Opc, bool isExt);

		void SelectTableSVE2(SDNode *N, unsigned Opc);

bool tryIndexedLoad(SDNode *N);		bool tryIndexedLoad(SDNode *N);

bool trySelectStackSlotTagP(SDNode *N);		bool trySelectStackSlotTagP(SDNode *N);
void SelectTagP(SDNode *N);		void SelectTagP(SDNode *N);

void SelectLoad(SDNode *N, unsigned NumVecs, unsigned Opc,		void SelectLoad(SDNode *N, unsigned NumVecs, unsigned Opc,
unsigned SubRegIdx);		unsigned SubRegIdx);
void SelectPostLoad(SDNode *N, unsigned NumVecs, unsigned Opc,		void SelectPostLoad(SDNode *N, unsigned NumVecs, unsigned Opc,
[930 lines not shown]	SDValue AArch64DAGToDAGISel::createQTuple(ArrayRef<SDValue> Regs) {
static const unsigned RegClassIDs[] = {		static const unsigned RegClassIDs[] = {
AArch64::QQRegClassID, AArch64::QQQRegClassID, AArch64::QQQQRegClassID};		AArch64::QQRegClassID, AArch64::QQQRegClassID, AArch64::QQQQRegClassID};
static const unsigned SubRegs[] = {AArch64::qsub0, AArch64::qsub1,		static const unsigned SubRegs[] = {AArch64::qsub0, AArch64::qsub1,
AArch64::qsub2, AArch64::qsub3};		AArch64::qsub2, AArch64::qsub3};

return createTuple(Regs, RegClassIDs, SubRegs);		return createTuple(Regs, RegClassIDs, SubRegs);
}		}

		SDValue AArch64DAGToDAGISel::createZTuple(ArrayRef<SDValue> Regs) {
		static const unsigned RegClassIDs[] = {
		AArch64::ZPR2RegClassID, AArch64::ZPR3RegClassID, AArch64::ZPR4RegClassID};
		static const unsigned SubRegs[] = {AArch64::zsub0, AArch64::zsub1,
		AArch64::zsub2, AArch64::zsub3};

		return createTuple(Regs, RegClassIDs, SubRegs);
		}

SDValue AArch64DAGToDAGISel::createTuple(ArrayRef<SDValue> Regs,		SDValue AArch64DAGToDAGISel::createTuple(ArrayRef<SDValue> Regs,
const unsigned RegClassIDs[],		const unsigned RegClassIDs[],
const unsigned SubRegs[]) {		const unsigned SubRegs[]) {
// There's no special register-class for a vector-list of 1 element: it's just		// There's no special register-class for a vector-list of 1 element: it's just
// a vector.		// a vector.
if (Regs.size() == 1)		if (Regs.size() == 1)
return Regs[0];		return Regs[0];

[34 lines not shown]	void AArch64DAGToDAGISel::SelectTable(SDNode *N, unsigned NumVecs, unsigned Opc,
SmallVector<SDValue, 6> Ops;		SmallVector<SDValue, 6> Ops;
if (isExt)		if (isExt)
Ops.push_back(N->getOperand(1));		Ops.push_back(N->getOperand(1));
Ops.push_back(RegSeq);		Ops.push_back(RegSeq);
Ops.push_back(N->getOperand(NumVecs + ExtOff + 1));		Ops.push_back(N->getOperand(NumVecs + ExtOff + 1));
ReplaceNode(N, CurDAG->getMachineNode(Opc, dl, VT, Ops));		ReplaceNode(N, CurDAG->getMachineNode(Opc, dl, VT, Ops));
}		}

		void AArch64DAGToDAGISel::SelectTableSVE2(SDNode *N, unsigned Opc) {
		SDLoc DL(N);
		EVT VT = N->getValueType(0);

		// We are only using this to select the aarch64_sve_tbl2
		// intrinsic currently, where NumVecs is always 2
		unsigned NumVecs = 2;

		// Form a REG_SEQUENCE to force register allocation.
		SmallVector<SDValue, 4> Regs(N->op_begin() + 1, N->op_begin() + 1 + NumVecs);
		SDValue RegSeq = createZTuple(Regs);

		SmallVector<SDValue, 6> Ops;
		Ops.push_back(RegSeq);
		Ops.push_back(N->getOperand(NumVecs + 1));
		ReplaceNode(N, CurDAG->getMachineNode(Opc, DL, VT, Ops));
		}

bool AArch64DAGToDAGISel::tryIndexedLoad(SDNode *N) {		bool AArch64DAGToDAGISel::tryIndexedLoad(SDNode *N) {
LoadSDNode *LD = cast<LoadSDNode>(N);		LoadSDNode *LD = cast<LoadSDNode>(N);
if (LD->isUnindexed())		if (LD->isUnindexed())
return false;		return false;
EVT VT = LD->getMemoryVT();		EVT VT = LD->getMemoryVT();
EVT DstVT = N->getValueType(0);		EVT DstVT = N->getValueType(0);
ISD::MemIndexedMode AM = LD->getAddressingMode();		ISD::MemIndexedMode AM = LD->getAddressingMode();
bool IsPre = AM == ISD::PRE_INC || AM == ISD::PRE_DEC;		bool IsPre = AM == ISD::PRE_INC || AM == ISD::PRE_DEC;
[2,339 lines not shown]	case Intrinsic::aarch64_neon_tbx4:
: AArch64::TBXv16i8Four,		: AArch64::TBXv16i8Four,
true);		true);
return;		return;
case Intrinsic::aarch64_neon_smull:		case Intrinsic::aarch64_neon_smull:
case Intrinsic::aarch64_neon_umull:		case Intrinsic::aarch64_neon_umull:
if (tryMULLV64LaneV128(IntNo, Node))		if (tryMULLV64LaneV128(IntNo, Node))
return;		return;
break;		break;
		case Intrinsic::aarch64_sve_tbl2:
		if (VT == MVT::nxv16i8) {
		SelectTableSVE2(Node, AArch64::TBL_ZZZZ_B);
		return;
		}
		if (VT == MVT::nxv8i16 || VT == MVT::nxv8f16) {
		SelectTableSVE2(Node, AArch64::TBL_ZZZZ_H);
		return;
		}
		if (VT == MVT::nxv4i32 || VT == MVT::nxv4f32) {
		SelectTableSVE2(Node, AArch64::TBL_ZZZZ_S);
		return;
		}
		if (VT == MVT::nxv2i64 || VT == MVT::nxv2f64) {
		SelectTableSVE2(Node, AArch64::TBL_ZZZZ_D);
		return;
		}
}		}
break;		break;
}		}
case ISD::INTRINSIC_VOID: {		case ISD::INTRINSIC_VOID: {
unsigned IntNo = cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();		unsigned IntNo = cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();
if (Node->getNumOperands() >= 3)		if (Node->getNumOperands() >= 3)
VT = Node->getOperand(2)->getValueType(0);		VT = Node->getOperand(2)->getValueType(0);
switch (IntNo) {		switch (IntNo) {
[832 lines not shown]

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

[1,808 lines not shown]	let Predicates = [HasSVE2] in {

defm STNT1B_ZZR_D : sve2_mem_sstnt_vs<0b000, "stnt1b", Z_d, ZPR64>;		defm STNT1B_ZZR_D : sve2_mem_sstnt_vs<0b000, "stnt1b", Z_d, ZPR64>;
defm STNT1H_ZZR_D : sve2_mem_sstnt_vs<0b010, "stnt1h", Z_d, ZPR64>;		defm STNT1H_ZZR_D : sve2_mem_sstnt_vs<0b010, "stnt1h", Z_d, ZPR64>;
defm STNT1W_ZZR_D : sve2_mem_sstnt_vs<0b100, "stnt1w", Z_d, ZPR64>;		defm STNT1W_ZZR_D : sve2_mem_sstnt_vs<0b100, "stnt1w", Z_d, ZPR64>;
defm STNT1D_ZZR_D : sve2_mem_sstnt_vs<0b110, "stnt1d", Z_d, ZPR64>;		defm STNT1D_ZZR_D : sve2_mem_sstnt_vs<0b110, "stnt1d", Z_d, ZPR64>;

// SVE2 table lookup (three sources)		// SVE2 table lookup (three sources)
defm TBL_ZZZZ : sve2_int_perm_tbl<"tbl">;		defm TBL_ZZZZ : sve2_int_perm_tbl<"tbl">;
defm TBX_ZZZ : sve2_int_perm_tbx<"tbx">;		defm TBX_ZZZ : sve2_int_perm_tbx<"tbx", int_aarch64_sve_tbx>;

// SVE2 integer compare scalar count and limit		// SVE2 integer compare scalar count and limit
defm WHILEGE_PWW : sve_int_while4_rr<0b000, "whilege", int_aarch64_sve_whilege>;		defm WHILEGE_PWW : sve_int_while4_rr<0b000, "whilege", int_aarch64_sve_whilege>;
defm WHILEGT_PWW : sve_int_while4_rr<0b001, "whilegt", int_aarch64_sve_whilegt>;		defm WHILEGT_PWW : sve_int_while4_rr<0b001, "whilegt", int_aarch64_sve_whilegt>;
defm WHILEHS_PWW : sve_int_while4_rr<0b100, "whilehs", int_aarch64_sve_whilehs>;		defm WHILEHS_PWW : sve_int_while4_rr<0b100, "whilehs", int_aarch64_sve_whilehs>;
defm WHILEHI_PWW : sve_int_while4_rr<0b101, "whilehi", int_aarch64_sve_whilehi>;		defm WHILEHI_PWW : sve_int_while4_rr<0b101, "whilehi", int_aarch64_sve_whilehi>;

defm WHILEGE_PXX : sve_int_while8_rr<0b000, "whilege", int_aarch64_sve_whilege>;		defm WHILEGE_PXX : sve_int_while8_rr<0b000, "whilege", int_aarch64_sve_whilege>;
[31 lines not shown]

let Predicates = [HasSVE2SHA3] in {		let Predicates = [HasSVE2SHA3] in {
// SVE2 crypto constructive binary operations		// SVE2 crypto constructive binary operations
def RAX1_ZZZ_D : sve2_crypto_cons_bin_op<0b1, "rax1", ZPR64>;		def RAX1_ZZZ_D : sve2_crypto_cons_bin_op<0b1, "rax1", ZPR64>;
}		}

let Predicates = [HasSVE2BitPerm] in {		let Predicates = [HasSVE2BitPerm] in {
// SVE2 bitwise permute		// SVE2 bitwise permute
defm BEXT_ZZZ : sve2_misc_bitwise<0b1100, "bext">;		defm BEXT_ZZZ : sve2_misc_bitwise<0b1100, "bext", int_aarch64_sve_bext_x>;
defm BDEP_ZZZ : sve2_misc_bitwise<0b1101, "bdep">;		defm BDEP_ZZZ : sve2_misc_bitwise<0b1101, "bdep", int_aarch64_sve_bdep_x>;
defm BGRP_ZZZ : sve2_misc_bitwise<0b1110, "bgrp">;		defm BGRP_ZZZ : sve2_misc_bitwise<0b1110, "bgrp", int_aarch64_sve_bgrp_x>;
}		}

llvm/lib/Target/AArch64/SVEInstrFormats.td

[961 lines not shown]	: I<(outs zprty:$Zd), (ins zprty:$_Zd, zprty:$Zn, zprty:$Zm),
let Inst{20-16} = Zm;		let Inst{20-16} = Zm;
let Inst{15-10} = 0b001011;		let Inst{15-10} = 0b001011;
let Inst{9-5} = Zn;		let Inst{9-5} = Zn;
let Inst{4-0} = Zd;		let Inst{4-0} = Zd;

let Constraints = "$Zd = $_Zd";		let Constraints = "$Zd = $_Zd";
}		}

multiclass sve2_int_perm_tbx<string asm> {		multiclass sve2_int_perm_tbx<string asm, SDPatternOperator op> {
def _B : sve2_int_perm_tbx<0b00, asm, ZPR8>;		def _B : sve2_int_perm_tbx<0b00, asm, ZPR8>;
def _H : sve2_int_perm_tbx<0b01, asm, ZPR16>;		def _H : sve2_int_perm_tbx<0b01, asm, ZPR16>;
def _S : sve2_int_perm_tbx<0b10, asm, ZPR32>;		def _S : sve2_int_perm_tbx<0b10, asm, ZPR32>;
def _D : sve2_int_perm_tbx<0b11, asm, ZPR64>;		def _D : sve2_int_perm_tbx<0b11, asm, ZPR64>;

		def : SVE_3_Op_Pat<nxv16i8, op, nxv16i8, nxv16i8, nxv16i8, !cast<Instruction>(NAME # _B)>;
		def : SVE_3_Op_Pat<nxv8i16, op, nxv8i16, nxv8i16, nxv8i16, !cast<Instruction>(NAME # _H)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i32, nxv4i32, !cast<Instruction>(NAME # _S)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i64, nxv2i64, !cast<Instruction>(NAME # _D)>;

		def : SVE_3_Op_Pat<nxv8f16, op, nxv8f16, nxv8f16, nxv8i16, !cast<Instruction>(NAME # _H)>;
		def : SVE_3_Op_Pat<nxv4f32, op, nxv4f32, nxv4f32, nxv4i32, !cast<Instruction>(NAME # _S)>;
		def : SVE_3_Op_Pat<nxv2f64, op, nxv2f64, nxv2f64, nxv2i64, !cast<Instruction>(NAME # _D)>;
}		}

class sve_int_perm_reverse_z<bits<2> sz8_64, string asm, ZPRRegOp zprty>		class sve_int_perm_reverse_z<bits<2> sz8_64, string asm, ZPRRegOp zprty>
: I<(outs zprty:$Zd), (ins zprty:$Zn),		: I<(outs zprty:$Zd), (ins zprty:$Zn),
asm, "\t$Zd, $Zn",		asm, "\t$Zd, $Zn",
"",		"",
[]>, Sched<[]> {		[]>, Sched<[]> {
bits<5> Zd;		bits<5> Zd;
[1,963 lines not shown]	: I<(outs zprty1:$Zd), (ins zprty2:$Zn, zprty2:$Zm),
let Inst{21} = 0b0;		let Inst{21} = 0b0;
let Inst{20-16} = Zm;		let Inst{20-16} = Zm;
let Inst{15-14} = 0b10;		let Inst{15-14} = 0b10;
let Inst{13-10} = opc;		let Inst{13-10} = opc;
let Inst{9-5} = Zn;		let Inst{9-5} = Zn;
let Inst{4-0} = Zd;		let Inst{4-0} = Zd;
}		}

multiclass sve2_misc_bitwise<bits<4> opc, string asm> {		multiclass sve2_misc_bitwise<bits<4> opc, string asm, SDPatternOperator op> {
def _B : sve2_misc<0b00, opc, asm, ZPR8, ZPR8>;		def _B : sve2_misc<0b00, opc, asm, ZPR8, ZPR8>;
def _H : sve2_misc<0b01, opc, asm, ZPR16, ZPR16>;		def _H : sve2_misc<0b01, opc, asm, ZPR16, ZPR16>;
def _S : sve2_misc<0b10, opc, asm, ZPR32, ZPR32>;		def _S : sve2_misc<0b10, opc, asm, ZPR32, ZPR32>;
def _D : sve2_misc<0b11, opc, asm, ZPR64, ZPR64>;		def _D : sve2_misc<0b11, opc, asm, ZPR64, ZPR64>;

		def : SVE_2_Op_Pat<nxv16i8, op, nxv16i8, nxv16i8, !cast<Instruction>(NAME # _B)>;
		def : SVE_2_Op_Pat<nxv8i16, op, nxv8i16, nxv8i16, !cast<Instruction>(NAME # _H)>;
		def : SVE_2_Op_Pat<nxv4i32, op, nxv4i32, nxv4i32, !cast<Instruction>(NAME # _S)>;
		def : SVE_2_Op_Pat<nxv2i64, op, nxv2i64, nxv2i64, !cast<Instruction>(NAME # _D)>;
}		}

multiclass sve2_misc_int_addsub_long_interleaved<bits<2> opc, string asm,		multiclass sve2_misc_int_addsub_long_interleaved<bits<2> opc, string asm,
SDPatternOperator op> {		SDPatternOperator op> {
def _H : sve2_misc<0b01, { 0b00, opc }, asm, ZPR16, ZPR8>;		def _H : sve2_misc<0b01, { 0b00, opc }, asm, ZPR16, ZPR8>;
def _S : sve2_misc<0b10, { 0b00, opc }, asm, ZPR32, ZPR16>;		def _S : sve2_misc<0b10, { 0b00, opc }, asm, ZPR32, ZPR16>;
def _D : sve2_misc<0b11, { 0b00, opc }, asm, ZPR64, ZPR32>;		def _D : sve2_misc<0b11, { 0b00, opc }, asm, ZPR64, ZPR32>;

[4,090 lines not shown]

llvm/test/CodeGen/AArch64/sve2-intrinsics-bit-permutation.ll

This file was added.

				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2,+sve2-bitperm < %s | FileCheck %s

				;
				; BDEP
				;

				define <vscale x 16 x i8> @bdep_nxv16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b) {
				; CHECK-LABEL: bdep_nxv16i8:
				; CHECK: bdep z0.b, z0.b, z1.b
				; CHECK-NEXT: ret
				%out = call <vscale x 16 x i8> @llvm.aarch64.sve.bdep.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				ret <vscale x 16 x i8> %out
				}

				define <vscale x 8 x i16> @bdep_nxv8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b) {
				; CHECK-LABEL: bdep_nxv8i16:
				; CHECK: bdep z0.h, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.bdep.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 4 x i32> @bdep_nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
				; CHECK-LABEL: bdep_nxv4i32:
				; CHECK: bdep z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.bdep.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 2 x i64> @bdep_nxv2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
				; CHECK-LABEL: bdep_nxv2i64:
				; CHECK: bdep z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.bdep.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)
				ret <vscale x 2 x i64> %out
				}

				;
				; BEXT
				;

				define <vscale x 16 x i8> @bext_nxv16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b) {
				; CHECK-LABEL: bext_nxv16i8:
				; CHECK: bext z0.b, z0.b, z1.b
				; CHECK-NEXT: ret
				%out = call <vscale x 16 x i8> @llvm.aarch64.sve.bext.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				ret <vscale x 16 x i8> %out
				}

				define <vscale x 8 x i16> @bext_nxv8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b) {
				; CHECK-LABEL: bext_nxv8i16:
				; CHECK: bext z0.h, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.bext.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 4 x i32> @bext_nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
				; CHECK-LABEL: bext_nxv4i32:
				; CHECK: bext z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.bext.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 2 x i64> @bext_nxv2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
				; CHECK-LABEL: bext_nxv2i64:
				; CHECK: bext z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.bext.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)
				ret <vscale x 2 x i64> %out
				}

				;
				; BGRP
				;

				define <vscale x 16 x i8> @bgrp_nxv16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b) {
				; CHECK-LABEL: bgrp_nxv16i8:
				; CHECK: bgrp z0.b, z0.b, z1.b
				; CHECK-NEXT: ret
				%out = call <vscale x 16 x i8> @llvm.aarch64.sve.bgrp.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				ret <vscale x 16 x i8> %out
				}

				define <vscale x 8 x i16> @bgrp_nxv8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b) {
				; CHECK-LABEL: bgrp_nxv8i16:
				; CHECK: bgrp z0.h, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.bgrp.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 4 x i32> @bgrp_nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
				; CHECK-LABEL: bgrp_nxv4i32:
				; CHECK: bgrp z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.bgrp.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 2 x i64> @bgrp_nxv2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b) {
				; CHECK-LABEL: bgrp_nxv2i64:
				; CHECK: bgrp z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.bgrp.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)
				ret <vscale x 2 x i64> %out
				}

				declare <vscale x 16 x i8> @llvm.aarch64.sve.bdep.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				declare <vscale x 8 x i16> @llvm.aarch64.sve.bdep.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.bdep.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.bdep.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)

				declare <vscale x 16 x i8> @llvm.aarch64.sve.bext.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				declare <vscale x 8 x i16> @llvm.aarch64.sve.bext.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.bext.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.bext.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)

				declare <vscale x 16 x i8> @llvm.aarch64.sve.bgrp.x.nx16i8(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b)
				declare <vscale x 8 x i16> @llvm.aarch64.sve.bgrp.x.nx8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.bgrp.x.nx4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.bgrp.x.nx2i64(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)

llvm/test/CodeGen/AArch64/sve2-intrinsics-perm-tb.ll

This file was added.

				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s

				;
				; TBL2
				;

				define <vscale x 16 x i8> @tbl2_b(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
				; CHECK-LABEL: tbl2_b:
				; CHECK: tbl z0.b, { z0.b, z1.b }, z2.b
				; CHECK-NEXT: ret
				%out = call <vscale x 16 x i8> @llvm.aarch64.sve.tbl2.nxv16i8(<vscale x 16 x i8> %a,
				<vscale x 16 x i8> %b,
				<vscale x 16 x i8> %c)
				ret <vscale x 16 x i8> %out
				}

				define <vscale x 8 x i16> @tbl2_h(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: tbl2_h:
				; CHECK: tbl z0.h, { z0.h, z1.h }, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.tbl2.nxv8i16(<vscale x 8 x i16> %a,
				<vscale x 8 x i16> %b,
				<vscale x 8 x i16> %c)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 4 x i32> @tbl2_s(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
				; CHECK-LABEL: tbl2_s:
				; CHECK: tbl z0.s, { z0.s, z1.s }, z2.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.tbl2.nxv4i32(<vscale x 4 x i32> %a,
				<vscale x 4 x i32> %b,
				<vscale x 4 x i32> %c)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 2 x i64> @tbl2_d(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b, <vscale x 2 x i64> %c) {
				; CHECK-LABEL: tbl2_d:
				; CHECK: tbl z0.d, { z0.d, z1.d }, z2.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.tbl2.nxv2i64(<vscale x 2 x i64> %a,
				<vscale x 2 x i64> %b,
				<vscale x 2 x i64> %c)
				ret <vscale x 2 x i64> %out
				}

				define <vscale x 8 x half> @tbl2_fh(<vscale x 8 x half> %a, <vscale x 8 x half> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: tbl2_fh:
				; CHECK: tbl z0.h, { z0.h, z1.h }, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.tbl2.nxv8f16(<vscale x 8 x half> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x i16> %c)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @tbl2_fs(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x i32> %c) {
				; CHECK-LABEL: tbl2_fs:
				; CHECK: tbl z0.s, { z0.s, z1.s }, z2.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.tbl2.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 4 x float> %b,
				<vscale x 4 x i32> %c)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @tbl2_fd(<vscale x 2 x double> %a, <vscale x 2 x double> %b, <vscale x 2 x i64> %c) {
				; CHECK-LABEL: tbl2_fd:
				; CHECK: tbl z0.d, { z0.d, z1.d }, z2.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.tbl2.nxv2f64(<vscale x 2 x double> %a,
				<vscale x 2 x double> %b,
				<vscale x 2 x i64> %c)
				ret <vscale x 2 x double> %out
				}

				;
				; TBX
				;

				define <vscale x 16 x i8> @tbx_b(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
				; CHECK-LABEL: tbx_b:
				; CHECK: tbx z0.b, z1.b, z2.b
				; CHECK-NEXT: ret
				%out = call <vscale x 16 x i8> @llvm.aarch64.sve.tbx.nxv16i8(<vscale x 16 x i8> %a,
				<vscale x 16 x i8> %b,
				<vscale x 16 x i8> %c)
				ret <vscale x 16 x i8> %out
				}

				define <vscale x 8 x i16> @tbx_h(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: tbx_h:
				; CHECK: tbx z0.h, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.tbx.nxv8i16(<vscale x 8 x i16> %a,
				<vscale x 8 x i16> %b,
				<vscale x 8 x i16> %c)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 8 x half> @ftbx_h(<vscale x 8 x half> %a, <vscale x 8 x half> %b, <vscale x 8 x i16> %c) {
				; CHECK-LABEL: ftbx_h:
				; CHECK: tbx z0.h, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.tbx.nxv8f16(<vscale x 8 x half> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x i16> %c)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x i32> @tbx_s(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
				; CHECK-LABEL: tbx_s:
				; CHECK: tbx z0.s, z1.s, z2.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.tbx.nxv4i32(<vscale x 4 x i32> %a,
				<vscale x 4 x i32> %b,
				<vscale x 4 x i32> %c)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 4 x float> @ftbx_s(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x i32> %c) {
				; CHECK-LABEL: ftbx_s:
				; CHECK: tbx z0.s, z1.s, z2.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.tbx.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 4 x float> %b,
				<vscale x 4 x i32> %c)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x i64> @tbx_d(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b, <vscale x 2 x i64> %c) {
				; CHECK-LABEL: tbx_d:
				; CHECK: tbx z0.d, z1.d, z2.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.tbx.nxv2i64(<vscale x 2 x i64> %a,
				<vscale x 2 x i64> %b,
				<vscale x 2 x i64> %c)
				ret <vscale x 2 x i64> %out
				}

				define <vscale x 2 x double> @ftbx_d(<vscale x 2 x double> %a, <vscale x 2 x double> %b, <vscale x 2 x i64> %c) {
				; CHECK-LABEL: ftbx_d:
				; CHECK: tbx z0.d, z1.d, z2.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.tbx.nxv2f64(<vscale x 2 x double> %a,
				<vscale x 2 x double> %b,
				<vscale x 2 x i64> %c)
				ret <vscale x 2 x double> %out
				}

				declare <vscale x 16 x i8> @llvm.aarch64.sve.tbl2.nxv16i8(<vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8>)
				declare <vscale x 8 x i16> @llvm.aarch64.sve.tbl2.nxv8i16(<vscale x 8 x i16>, <vscale x 8 x i16>, <vscale x 8 x i16>)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.tbl2.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32>)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.tbl2.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i64>, <vscale x 2 x i64>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.tbl2.nxv8f16(<vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x i16>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.tbl2.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x i32>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.tbl2.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>, <vscale x 2 x i64>)

				declare <vscale x 16 x i8> @llvm.aarch64.sve.tbx.nxv16i8(<vscale x 16 x i8>, <vscale x 16 x i8>, <vscale x 16 x i8>)
				declare <vscale x 8 x i16> @llvm.aarch64.sve.tbx.nxv8i16(<vscale x 8 x i16>, <vscale x 8 x i16>, <vscale x 8 x i16>)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.tbx.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i32>)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.tbx.nxv2i64(<vscale x 2 x i64>, <vscale x 2 x i64>, <vscale x 2 x i64>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.tbx.nxv8f16(<vscale x 8 x half>, <vscale x 8 x half>, <vscale x 8 x i16>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.tbx.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x i32>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.tbx.nxv2f64(<vscale x 2 x double>, <vscale x 2 x double>, <vscale x 2 x i64>)

[AArch64][SVE] Add SVE2 intrinsics for bit permutation & table lookup (Closed, Public)
