This is an archive of the discontinued LLVM Phabricator instance.

llvm/include/llvm/IR/IntrinsicsAArch64.td
958	Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg<Op>` to the list of properties here but had some problems with it (i.e. `Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane`). I don't think this is too much of an issue here, as we have additional checks on the immediate with `VectorIndexH32b`, which ensures the immediate is in the correct range.

efriedma added a subscriber: efriedma. Nov 27 2019, 1:32 PM

efriedma added inline comments.

llvm/include/llvm/IR/IntrinsicsAArch64.td
958	The point of immarg markings isn't to assist the backend; it's to ensure IR optimizations don't break your intrinsic calls.
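
To illustrate that point, here is a minimal sketch that is not part of this patch. The declaration below is hypothetical: it assumes the lane index (operand 3) carries the `immarg` parameter attribute, which this revision does not yet add. With the attribute present, the IR verifier requires a constant at that position, so a transformation cannot legally turn the index into a variable, for example by merging two calls that differ only in their index. The function name `@lane_const` and the value `%idx` are made up for the example.

```llvm
; Hypothetical declaration: `immarg` on the lane index is an assumption here.
declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32 immarg)

; Valid: the lane index is a literal constant.
define <vscale x 4 x float> @lane_const(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
  %out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float> %a,
                                                                          <vscale x 8 x half> %b,
                                                                          <vscale x 8 x half> %c,
                                                                          i32 1)
  ret <vscale x 4 x float> %out
}

; Invalid once `immarg` is present: %idx is not a constant, so the verifier
; would reject a call of this shape.
;   %bad = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c, i32 %idx)
```

Range checking in the backend (such as `VectorIndexH32b`) only applies at instruction selection, after IR-level passes have already run, so it cannot provide this guarantee.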

sdesmalen added inline comments. Nov 29 2019, 4:55 AM

llvm/include/llvm/IR/IntrinsicsAArch64.td
958	The pattern is probably not matching because the immediate operand is a `TargetConstant`, whereas `AsmVectorIndexOpnd` derives from `ImmLeaf` rather than from `TImmLeaf` (introduced by D58232).

Herald added a reviewer: efriedma. Nov 29 2019, 4:55 AM

kmclaughlin added inline comments. Dec 2 2019, 3:17 AM

llvm/include/llvm/IR/IntrinsicsAArch64.td
958	Thanks for the suggestion, this was the reason why the patterns were not matching! As this also affects many of the existing intrinsics not added here or in D70437, I would prefer to address this fully in a separate patch - do you have objections to this?

Thanks @kmclaughlin, LGTM.

llvm/include/llvm/IR/IntrinsicsAArch64.td
958	Okay, I'm happy if you want to make that change in a separate patch. It will also be needed for several of the other SVE intrinsics.

This revision is now accepted and ready to land. Dec 2 2019, 9:40 AM

Closed by commit rG8881ac9c3986: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics (authored by kmclaughlin). Dec 3 2019, 5:45 AM

This revision was automatically updated to reflect the committed changes.

sdesmalen mentioned this in D71401: [AArch64][SVE] Add permutation and selection intrinsics. Dec 13 2019, 5:42 AM

kmclaughlin mentioned this in D72612: [AArch64][SVE] Add ImmArg property to intrinsics with immediates. Jan 13 2020, 5:54 AM

kmclaughlin mentioned this in rGfe3bb8ec9683: [AArch64][SVE] Add ImmArg property to intrinsics with immediates. Jan 17 2020, 3:02 AM

Allen added a subscriber: Allen. Dec 7 2022, 8:14 PM

Allen added inline comments.

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll

Hi kmclaughlin,

Sorry for the naive question: `flogb` is shown as a unary instruction in the assembly. Why do we need `%a` as an **input** operand in the intrinsic? Could the call look like this instead?

%a = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b)

Herald added a project: Restricted Project. Dec 7 2022, 8:14 PM

kmclaughlin added inline comments. Dec 16 2022, 7:33 AM

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll
31	Hi @Allen, the first input to this intrinsic is the passthru, which supplies the values used for the inactive lanes of the predicate `%pg`. The inactive lanes can be set to zero, merged with a separate vector, or set to unknown.
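
A minimal sketch of those three choices, assuming the intrinsic signature used in the test file of this revision. The function names are hypothetical, and no claim is made here about the exact assembly `llc` would emit for each variant.

```llvm
declare <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x double>)

; Merging: inactive lanes take the corresponding lanes of %a.
define <vscale x 2 x i64> @flogb_merge(<vscale x 2 x i64> %a, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> %a,
                                                                 <vscale x 2 x i1> %pg,
                                                                 <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}

; Zeroing: the passthru is all zeros, so inactive lanes are zero.
define <vscale x 2 x i64> @flogb_zero(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> zeroinitializer,
                                                                 <vscale x 2 x i1> %pg,
                                                                 <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}

; Unknown: the passthru is undef, so inactive lanes are unspecified.
define <vscale x 2 x i64> @flogb_undef(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> undef,
                                                                 <vscale x 2 x i1> %pg,
                                                                 <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}
```

Keeping the passthru as an explicit operand lets one intrinsic express all three behaviours, which a two-operand form could not do.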

kmclaughlin marked an inline comment as not done. Dec 16 2022, 7:33 AM

Revision Contents

Path	Size
llvm/include/llvm/IR/IntrinsicsAArch64.td	48 lines
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td	28 lines
llvm/lib/Target/AArch64/SVEInstrFormats.td	31 lines
llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll	39 lines
llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll	127 lines
llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll	191 lines

Diff 231886

llvm/include/llvm/IR/IntrinsicsAArch64.td

[...] class AdvSIMD_GatherLoad_64bitOffset_Intrinsic
: Intrinsic<[llvm_anyvector_ty],		: Intrinsic<[llvm_anyvector_ty],
[		[
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,		LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
LLVMPointerToElt<0>,		LLVMPointerToElt<0>,
LLVMScalarOrSameVectorWidth<0, llvm_i64_ty>		LLVMScalarOrSameVectorWidth<0, llvm_i64_ty>
],		],
[IntrReadMem, IntrArgMemOnly]>;		[IntrReadMem, IntrArgMemOnly]>;

		class SVE2_3VectorArg_Long_Intrinsic
		: Intrinsic<[llvm_anyvector_ty],
		[LLVMMatchType<0>,
		LLVMSubdivide2VectorType<0>,
		LLVMSubdivide2VectorType<0>],
		sdesmalen: I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so, you'll need to add `ImmArg<OpNo>` to the list of properties.
		[IntrNoMem]>;

		class SVE2_3VectorArgIndexed_Long_Intrinsic
		: Intrinsic<[llvm_anyvector_ty],
		[LLVMMatchType<0>,
		LLVMSubdivide2VectorType<0>,
		LLVMSubdivide2VectorType<0>,
		llvm_i32_ty],
		[IntrNoMem]>;

		// NOTE: There is no relationship between these intrinsics beyond an attempt
		// to reuse currently identical class definitions.
		class AdvSIMD_SVE_LOGB_Intrinsic : AdvSIMD_SVE_CNT_Intrinsic;

// This class of intrinsics are not intended to be useful within LLVM IR but		// This class of intrinsics are not intended to be useful within LLVM IR but
// are instead here to support some of the more regid parts of the ACLE.		// are instead here to support some of the more regid parts of the ACLE.
class Builtin_SVCVT<string name, LLVMType OUT, LLVMType IN>		class Builtin_SVCVT<string name, LLVMType OUT, LLVMType IN>
: GCCBuiltin<"__builtin_sve_" # name>,		: GCCBuiltin<"__builtin_sve_" # name>,
Intrinsic<[OUT], [OUT, llvm_nxv16i1_ty, IN], [IntrNoMem]>;		Intrinsic<[OUT], [OUT, llvm_nxv16i1_ty, IN], [IntrNoMem]>;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[...]
// Gather loads:		// Gather loads:
//		//

// scalar + vector, 64 bit unscaled offsets		// scalar + vector, 64 bit unscaled offsets
def int_aarch64_sve_ld1_gather : AdvSIMD_GatherLoad_64bitOffset_Intrinsic;		def int_aarch64_sve_ld1_gather : AdvSIMD_GatherLoad_64bitOffset_Intrinsic;

// scalar + vector, 64 bit scaled offsets		// scalar + vector, 64 bit scaled offsets
def int_aarch64_sve_ld1_gather_index : AdvSIMD_GatherLoad_64bitOffset_Intrinsic;		def int_aarch64_sve_ld1_gather_index : AdvSIMD_GatherLoad_64bitOffset_Intrinsic;

		//
		// SVE2 - Non-widening pairwise arithmetic
		//

		def int_aarch64_sve_faddp : AdvSIMD_Pred2VectorArg_Intrinsic;
		def int_aarch64_sve_fmaxp : AdvSIMD_Pred2VectorArg_Intrinsic;
		def int_aarch64_sve_fmaxnmp : AdvSIMD_Pred2VectorArg_Intrinsic;
		def int_aarch64_sve_fminp : AdvSIMD_Pred2VectorArg_Intrinsic;
		def int_aarch64_sve_fminnmp : AdvSIMD_Pred2VectorArg_Intrinsic;

		//
		// SVE2 - Floating-point widening multiply-accumulate
		//

		def int_aarch64_sve_fmlalb : SVE2_3VectorArg_Long_Intrinsic;
		def int_aarch64_sve_fmlalb_lane : SVE2_3VectorArgIndexed_Long_Intrinsic;
		def int_aarch64_sve_fmlalt : SVE2_3VectorArg_Long_Intrinsic;
		def int_aarch64_sve_fmlalt_lane : SVE2_3VectorArgIndexed_Long_Intrinsic;
		def int_aarch64_sve_fmlslb : SVE2_3VectorArg_Long_Intrinsic;
		def int_aarch64_sve_fmlslb_lane : SVE2_3VectorArgIndexed_Long_Intrinsic;
		def int_aarch64_sve_fmlslt : SVE2_3VectorArg_Long_Intrinsic;
		def int_aarch64_sve_fmlslt_lane : SVE2_3VectorArgIndexed_Long_Intrinsic;

		//
		// SVE2 - Floating-point integer binary logarithm
		//

		def int_aarch64_sve_flogb : AdvSIMD_SVE_LOGB_Intrinsic;
}		}

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

[...] let Predicates = [HasSVE2] in {

// SVE2 histogram generation (segment)		// SVE2 histogram generation (segment)
def HISTSEG_ZZZ : sve2_hist_gen_segment<"histseg">;		def HISTSEG_ZZZ : sve2_hist_gen_segment<"histseg">;

// SVE2 histogram generation (vector)		// SVE2 histogram generation (vector)
defm HISTCNT_ZPzZZ : sve2_hist_gen_vector<"histcnt">;		defm HISTCNT_ZPzZZ : sve2_hist_gen_vector<"histcnt">;

// SVE2 floating-point base 2 logarithm as integer		// SVE2 floating-point base 2 logarithm as integer
defm FLOGB_ZPmZ : sve2_fp_flogb<"flogb">;		defm FLOGB_ZPmZ : sve2_fp_flogb<"flogb", int_aarch64_sve_flogb>;

// SVE2 floating-point convert precision		// SVE2 floating-point convert precision
defm FCVTXNT_ZPmZ : sve2_fp_convert_down_odd_rounding_top<"fcvtxnt", "int_aarch64_sve_fcvtxnt">;		defm FCVTXNT_ZPmZ : sve2_fp_convert_down_odd_rounding_top<"fcvtxnt", "int_aarch64_sve_fcvtxnt">;
defm FCVTX_ZPmZ : sve2_fp_convert_down_odd_rounding<"fcvtx", "int_aarch64_sve_fcvtx">;		defm FCVTX_ZPmZ : sve2_fp_convert_down_odd_rounding<"fcvtx", "int_aarch64_sve_fcvtx">;
defm FCVTNT_ZPmZ : sve2_fp_convert_down_narrow<"fcvtnt", "int_aarch64_sve_fcvtnt">;		defm FCVTNT_ZPmZ : sve2_fp_convert_down_narrow<"fcvtnt", "int_aarch64_sve_fcvtnt">;
defm FCVTLT_ZPmZ : sve2_fp_convert_up_long<"fcvtlt", "int_aarch64_sve_fcvtlt">;		defm FCVTLT_ZPmZ : sve2_fp_convert_up_long<"fcvtlt", "int_aarch64_sve_fcvtlt">;

// SVE2 floating-point pairwise operations		// SVE2 floating-point pairwise operations
defm FADDP_ZPmZZ : sve2_fp_pairwise_pred<0b000, "faddp">;		defm FADDP_ZPmZZ : sve2_fp_pairwise_pred<0b000, "faddp", int_aarch64_sve_faddp>;
defm FMAXNMP_ZPmZZ : sve2_fp_pairwise_pred<0b100, "fmaxnmp">;		defm FMAXNMP_ZPmZZ : sve2_fp_pairwise_pred<0b100, "fmaxnmp", int_aarch64_sve_fmaxnmp>;
defm FMINNMP_ZPmZZ : sve2_fp_pairwise_pred<0b101, "fminnmp">;		defm FMINNMP_ZPmZZ : sve2_fp_pairwise_pred<0b101, "fminnmp", int_aarch64_sve_fminnmp>;
defm FMAXP_ZPmZZ : sve2_fp_pairwise_pred<0b110, "fmaxp">;		defm FMAXP_ZPmZZ : sve2_fp_pairwise_pred<0b110, "fmaxp", int_aarch64_sve_fmaxp>;
defm FMINP_ZPmZZ : sve2_fp_pairwise_pred<0b111, "fminp">;		defm FMINP_ZPmZZ : sve2_fp_pairwise_pred<0b111, "fminp", int_aarch64_sve_fminp>;

// SVE2 floating-point multiply-add long (indexed)		// SVE2 floating-point multiply-add long (indexed)
def FMLALB_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b00, "fmlalb">;		defm FMLALB_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b00, "fmlalb", int_aarch64_sve_fmlalb_lane>;
def FMLALT_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b01, "fmlalt">;		defm FMLALT_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b01, "fmlalt", int_aarch64_sve_fmlalt_lane>;
def FMLSLB_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b10, "fmlslb">;		defm FMLSLB_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b10, "fmlslb", int_aarch64_sve_fmlslb_lane>;
def FMLSLT_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b11, "fmlslt">;		defm FMLSLT_ZZZI_SHH : sve2_fp_mla_long_by_indexed_elem<0b11, "fmlslt", int_aarch64_sve_fmlslt_lane>;

// SVE2 floating-point multiply-add long		// SVE2 floating-point multiply-add long
def FMLALB_ZZZ_SHH : sve2_fp_mla_long<0b00, "fmlalb">;		defm FMLALB_ZZZ_SHH : sve2_fp_mla_long<0b00, "fmlalb", int_aarch64_sve_fmlalb>;
def FMLALT_ZZZ_SHH : sve2_fp_mla_long<0b01, "fmlalt">;		defm FMLALT_ZZZ_SHH : sve2_fp_mla_long<0b01, "fmlalt", int_aarch64_sve_fmlalt>;
def FMLSLB_ZZZ_SHH : sve2_fp_mla_long<0b10, "fmlslb">;		defm FMLSLB_ZZZ_SHH : sve2_fp_mla_long<0b10, "fmlslb", int_aarch64_sve_fmlslb>;
def FMLSLT_ZZZ_SHH : sve2_fp_mla_long<0b11, "fmlslt">;		defm FMLSLT_ZZZ_SHH : sve2_fp_mla_long<0b11, "fmlslt", int_aarch64_sve_fmlslt>;

// SVE2 bitwise ternary operations		// SVE2 bitwise ternary operations
defm EOR3_ZZZZ_D : sve2_int_bitwise_ternary_op<0b000, "eor3">;		defm EOR3_ZZZZ_D : sve2_int_bitwise_ternary_op<0b000, "eor3">;
defm BCAX_ZZZZ_D : sve2_int_bitwise_ternary_op<0b010, "bcax">;		defm BCAX_ZZZZ_D : sve2_int_bitwise_ternary_op<0b010, "bcax">;
def BSL_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b001, "bsl">;		def BSL_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b001, "bsl">;
def BSL1N_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b011, "bsl1n">;		def BSL1N_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b011, "bsl1n">;
def BSL2N_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b101, "bsl2n">;		def BSL2N_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b101, "bsl2n">;
def NBSL_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b111, "nbsl">;		def NBSL_ZZZZ_D : sve2_int_bitwise_ternary_op_d<0b111, "nbsl">;
[...]

llvm/lib/Target/AArch64/SVEInstrFormats.td

[...] : Pat<(vtd (op vt1:$Op1, vt2:$Op2, vt3:$Op3, vt4:$Op4)),
(inst $Op1, $Op2, $Op3, $Op4)>;		(inst $Op1, $Op2, $Op3, $Op4)>;

class SVE_3_Op_Imm_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,		class SVE_3_Op_Imm_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,
ValueType vt2, ValueType vt3, Operand ImmTy,		ValueType vt2, ValueType vt3, Operand ImmTy,
Instruction inst>		Instruction inst>
: Pat<(vtd (op vt1:$Op1, vt2:$Op2, (vt3 ImmTy:$Op3))),		: Pat<(vtd (op vt1:$Op1, vt2:$Op2, (vt3 ImmTy:$Op3))),
(inst $Op1, $Op2, ImmTy:$Op3)>;		(inst $Op1, $Op2, ImmTy:$Op3)>;

		class SVE_4_Op_Imm_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,
		ValueType vt2, ValueType vt3, ValueType vt4,
		Operand ImmTy, Instruction inst>
		: Pat<(vtd (op vt1:$Op1, vt2:$Op2, vt3:$Op3, (vt4 ImmTy:$Op4))),
		(inst $Op1, $Op2, $Op3, ImmTy:$Op4)>;

def SVEDup0Undef : ComplexPattern<i64, 0, "SelectDupZeroOrUndef", []>;		def SVEDup0Undef : ComplexPattern<i64, 0, "SelectDupZeroOrUndef", []>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SVE Predicate Misc Group		// SVE Predicate Misc Group
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class sve_int_pfalse<bits<6> opc, string asm>		class sve_int_pfalse<bits<6> opc, string asm>
: I<(outs PPR8:$Pd), (ins),		: I<(outs PPR8:$Pd), (ins),
[...] : I<(outs zprty:$Zdn), (ins PPR3bAny:$Pg, zprty:$_Zdn, zprty:$Zm),
let Inst{9-5} = Zm;		let Inst{9-5} = Zm;
let Inst{4-0} = Zdn;		let Inst{4-0} = Zdn;

let Constraints = "$Zdn = $_Zdn";		let Constraints = "$Zdn = $_Zdn";
let DestructiveInstType = Destructive;		let DestructiveInstType = Destructive;
let ElementSize = zprty.ElementSize;		let ElementSize = zprty.ElementSize;
}		}

multiclass sve2_fp_pairwise_pred<bits<3> opc, string asm> {		multiclass sve2_fp_pairwise_pred<bits<3> opc, string asm, SDPatternOperator op> {
def _H : sve2_fp_pairwise_pred<0b01, opc, asm, ZPR16>;		def _H : sve2_fp_pairwise_pred<0b01, opc, asm, ZPR16>;
def _S : sve2_fp_pairwise_pred<0b10, opc, asm, ZPR32>;		def _S : sve2_fp_pairwise_pred<0b10, opc, asm, ZPR32>;
def _D : sve2_fp_pairwise_pred<0b11, opc, asm, ZPR64>;		def _D : sve2_fp_pairwise_pred<0b11, opc, asm, ZPR64>;

		def : SVE_3_Op_Pat<nxv8f16, op, nxv8i1, nxv8f16, nxv8f16, !cast<Instruction>(NAME # _H)>;
		def : SVE_3_Op_Pat<nxv4f32, op, nxv4i1, nxv4f32, nxv4f32, !cast<Instruction>(NAME # _S)>;
		def : SVE_3_Op_Pat<nxv2f64, op, nxv2i1, nxv2f64, nxv2f64, !cast<Instruction>(NAME # _D)>;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SVE2 Floating Point Widening Multiply-Add - Indexed Group		// SVE2 Floating Point Widening Multiply-Add - Indexed Group
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class sve2_fp_mla_long_by_indexed_elem<bits<2> opc, string asm>		class sve2_fp_mla_long_by_indexed_elem<bits<2> opc, string asm>
: I<(outs ZPR32:$Zda), (ins ZPR32:$_Zda, ZPR16:$Zn, ZPR3b16:$Zm,		: I<(outs ZPR32:$Zda), (ins ZPR32:$_Zda, ZPR16:$Zn, ZPR3b16:$Zm,
VectorIndexH:$iop),		VectorIndexH32b:$iop),
asm, "\t$Zda, $Zn, $Zm$iop",		asm, "\t$Zda, $Zn, $Zm$iop",
"",		"",
[]>, Sched<[]> {		[]>, Sched<[]> {
bits<5> Zda;		bits<5> Zda;
bits<5> Zn;		bits<5> Zn;
bits<3> Zm;		bits<3> Zm;
bits<3> iop;		bits<3> iop;
let Inst{31-21} = 0b01100100101;		let Inst{31-21} = 0b01100100101;
let Inst{20-19} = iop{2-1};		let Inst{20-19} = iop{2-1};
let Inst{18-16} = Zm;		let Inst{18-16} = Zm;
let Inst{15-14} = 0b01;		let Inst{15-14} = 0b01;
let Inst{13} = opc{1};		let Inst{13} = opc{1};
let Inst{12} = 0b0;		let Inst{12} = 0b0;
let Inst{11} = iop{0};		let Inst{11} = iop{0};
let Inst{10} = opc{0};		let Inst{10} = opc{0};
let Inst{9-5} = Zn;		let Inst{9-5} = Zn;
let Inst{4-0} = Zda;		let Inst{4-0} = Zda;

let Constraints = "$Zda = $_Zda";		let Constraints = "$Zda = $_Zda";
let DestructiveInstType = Destructive;		let DestructiveInstType = Destructive;
let ElementSize = ElementSizeNone;		let ElementSize = ElementSizeNone;
}		}

		multiclass sve2_fp_mla_long_by_indexed_elem<bits<2> opc, string asm,
		SDPatternOperator op> {
		def NAME : sve2_fp_mla_long_by_indexed_elem<opc, asm>;
		def : SVE_4_Op_Imm_Pat<nxv4f32, op, nxv4f32, nxv8f16, nxv8f16, i32, VectorIndexH32b, !cast<Instruction>(NAME)>;
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SVE2 Floating Point Widening Multiply-Add Group		// SVE2 Floating Point Widening Multiply-Add Group
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class sve2_fp_mla_long<bits<2> opc, string asm>		class sve2_fp_mla_long<bits<2> opc, string asm>
: I<(outs ZPR32:$Zda), (ins ZPR32:$_Zda, ZPR16:$Zn, ZPR16:$Zm),		: I<(outs ZPR32:$Zda), (ins ZPR32:$_Zda, ZPR16:$Zn, ZPR16:$Zm),
asm, "\t$Zda, $Zn, $Zm",		asm, "\t$Zda, $Zn, $Zm",
"",		"",
[...] : I<(outs ZPR32:$Zda), (ins ZPR32:$_Zda, ZPR16:$Zn, ZPR16:$Zm),
let Inst{9-5} = Zn;		let Inst{9-5} = Zn;
let Inst{4-0} = Zda;		let Inst{4-0} = Zda;

let Constraints = "$Zda = $_Zda";		let Constraints = "$Zda = $_Zda";
let DestructiveInstType = Destructive;		let DestructiveInstType = Destructive;
let ElementSize = ElementSizeNone;		let ElementSize = ElementSizeNone;
}		}

		multiclass sve2_fp_mla_long<bits<2> opc, string asm, SDPatternOperator op> {
		def NAME : sve2_fp_mla_long<opc, asm>;
		def : SVE_3_Op_Pat<nxv4f32, op, nxv4f32, nxv8f16, nxv8f16, !cast<Instruction>(NAME)>;
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SVE Stack Allocation Group		// SVE Stack Allocation Group
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class sve_int_arith_vl<bit opc, string asm>		class sve_int_arith_vl<bit opc, string asm>
: I<(outs GPR64sp:$Rd), (ins GPR64sp:$Rn, simm6_32b:$imm6),		: I<(outs GPR64sp:$Rd), (ins GPR64sp:$Rn, simm6_32b:$imm6),
asm, "\t$Rd, $Rn, $imm6",		asm, "\t$Rd, $Rn, $imm6",
"",		"",
[...] multiclass sve_fp_2op_p_zd_HSD<bits<5> opc, string asm, SDPatternOperator op> {
def _S : sve_fp_2op_p_zd<{ 0b10, opc }, asm, ZPR32, ZPR32, ElementSizeS>;		def _S : sve_fp_2op_p_zd<{ 0b10, opc }, asm, ZPR32, ZPR32, ElementSizeS>;
def _D : sve_fp_2op_p_zd<{ 0b11, opc }, asm, ZPR64, ZPR64, ElementSizeD>;		def _D : sve_fp_2op_p_zd<{ 0b11, opc }, asm, ZPR64, ZPR64, ElementSizeD>;

def : SVE_3_Op_Pat<nxv8f16, op, nxv8f16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _H)>;		def : SVE_3_Op_Pat<nxv8f16, op, nxv8f16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _H)>;
def : SVE_3_Op_Pat<nxv4f32, op, nxv4f32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _S)>;		def : SVE_3_Op_Pat<nxv4f32, op, nxv4f32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _S)>;
def : SVE_3_Op_Pat<nxv2f64, op, nxv2f64, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _D)>;		def : SVE_3_Op_Pat<nxv2f64, op, nxv2f64, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _D)>;
}		}

multiclass sve2_fp_flogb<string asm> {		multiclass sve2_fp_flogb<string asm, SDPatternOperator op> {
def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>;		def _H : sve_fp_2op_p_zd<0b0011010, asm, ZPR16, ZPR16, ElementSizeH>;
def _S : sve_fp_2op_p_zd<0b0011100, asm, ZPR32, ZPR32, ElementSizeS>;		def _S : sve_fp_2op_p_zd<0b0011100, asm, ZPR32, ZPR32, ElementSizeS>;
def _D : sve_fp_2op_p_zd<0b0011110, asm, ZPR64, ZPR64, ElementSizeD>;		def _D : sve_fp_2op_p_zd<0b0011110, asm, ZPR64, ZPR64, ElementSizeD>;

		def : SVE_3_Op_Pat<nxv8i16, op, nxv8i16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _H)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _S)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _D)>;
}		}

multiclass sve2_fp_convert_down_odd_rounding<string asm, string op> {		multiclass sve2_fp_convert_down_odd_rounding<string asm, string op> {
def _DtoS : sve_fp_2op_p_zd<0b0001010, asm, ZPR64, ZPR32, ElementSizeD>;		def _DtoS : sve_fp_2op_p_zd<0b0001010, asm, ZPR64, ZPR32, ElementSizeD>;
def : SVE_3_Op_Pat<nxv4f32, !cast<SDPatternOperator>(op # _f32f64), nxv4f32, nxv16i1, nxv2f64, !cast<Instruction>(NAME # _DtoS)>;		def : SVE_3_Op_Pat<nxv4f32, !cast<SDPatternOperator>(op # _f32f64), nxv4f32, nxv16i1, nxv2f64, !cast<Instruction>(NAME # _DtoS)>;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[...]

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll

This file was added.

				;RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 -asm-verbose=0 < %s | FileCheck %s

				;
				; FLOGB
				;

				define <vscale x 8 x i16> @flogb_f16(<vscale x 8 x i16> %a, <vscale x 8 x i1> %pg, <vscale x 8 x half> %b) {
				; CHECK-LABEL: flogb_f16:
				; CHECK: flogb z0.h, p0/m, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x i16> @llvm.aarch64.sve.flogb.nxv8f16(<vscale x 8 x i16> %a,
				<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x i16> %out
				}

				define <vscale x 4 x i32> @flogb_f32(<vscale x 4 x i32> %a, <vscale x 4 x i1> %pg, <vscale x 4 x float> %b) {
				; CHECK-LABEL: flogb_f32:
				; CHECK: flogb z0.s, p0/m, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x i32> @llvm.aarch64.sve.flogb.nxv4f32(<vscale x 4 x i32> %a,
				<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x i32> %out
				}

				define <vscale x 2 x i64> @flogb_f64(<vscale x 2 x i64> %a, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
				; CHECK-LABEL: flogb_f64:
				; CHECK: flogb z0.d, p0/m, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> %a,
				<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x i64> %out
				}

				declare <vscale x 8 x i16> @llvm.aarch64.sve.flogb.nxv8f16(<vscale x 8 x i16>, <vscale x 8 x i1>, <vscale x 8 x half>)
				declare <vscale x 4 x i32> @llvm.aarch64.sve.flogb.nxv4f32(<vscale x 4 x i32>, <vscale x 4 x i1>, <vscale x 4 x float>)
				declare <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x double>)

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll

This file was added.

				;RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s

				;
				; FMLALB (Vectors)
				;

				define <vscale x 4 x float> @fmlalb_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlalb_h:
				; CHECK: fmlalb z0.s, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLALB (Indexed)
				;

				define <vscale x 4 x float> @fmlalb_lane_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlalb_lane_h:
				; CHECK: fmlalb z0.s, z1.h, z2.h[0]
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c,
				i32 0)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLALT (Vectors)
				;

				define <vscale x 4 x float> @fmlalt_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlalt_h:
				; CHECK: fmlalt z0.s, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalt.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLALT (Indexed)
				;

				define <vscale x 4 x float> @fmlalt_lane_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlalt_lane_h:
				; CHECK: fmlalt z0.s, z1.h, z2.h[1]
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalt.lane.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c,
				i32 1)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLSLB (Vectors)
				;

				define <vscale x 4 x float> @fmlslb_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlslb_h:
				; CHECK: fmlslb z0.s, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlslb.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLSLB (Indexed)
				;

				define <vscale x 4 x float> @fmlslb_lane_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlslb_lane_h:
				; CHECK: fmlslb z0.s, z1.h, z2.h[2]
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlslb.lane.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c,
				i32 2)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLSLT (Vectors)
				;

				define <vscale x 4 x float> @fmlslt_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlslt_h:
				; CHECK: fmlslt z0.s, z1.h, z2.h
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlslt.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c)
				ret <vscale x 4 x float> %out
				}

				;
				; FMLSLT (Indexed)
				;

				define <vscale x 4 x float> @fmlslt_lane_h(<vscale x 4 x float> %a, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
				; CHECK-LABEL: fmlslt_lane_h:
				; CHECK: fmlslt z0.s, z1.h, z2.h[3]
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlslt.lane.nxv4f32(<vscale x 4 x float> %a,
				<vscale x 8 x half> %b,
				<vscale x 8 x half> %c,
				i32 3)
				ret <vscale x 4 x float> %out
				}

				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalt.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalt.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32)

				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlslb.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlslb.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlslt.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmlslt.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32)

llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll

This file was added.

				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s

				;
				; FADDP
				;

				define <vscale x 8 x half> @faddp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
				; CHECK-LABEL: faddp_f16:
				; CHECK: faddp z0.h, p0/m, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.faddp.nxv8f16(<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %a,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @faddp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: faddp_f32:
				; CHECK: faddp z0.s, p0/m, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.faddp.nxv4f32(<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %a,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @faddp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: faddp_f64:
				; CHECK: faddp z0.d, p0/m, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.faddp.nxv2f64(<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %a,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x double> %out
				}

				;
				; FMAXP
				;

				define <vscale x 8 x half> @fmaxp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
				; CHECK-LABEL: fmaxp_f16:
				; CHECK: fmaxp z0.h, p0/m, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.fmaxp.nxv8f16(<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %a,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @fmaxp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: fmaxp_f32:
				; CHECK: fmaxp z0.s, p0/m, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmaxp.nxv4f32(<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %a,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @fmaxp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: fmaxp_f64:
				; CHECK: fmaxp z0.d, p0/m, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.fmaxp.nxv2f64(<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %a,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x double> %out
				}

				;
				; FMAXNMP
				;

				define <vscale x 8 x half> @fmaxnmp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
				; CHECK-LABEL: fmaxnmp_f16:
				; CHECK: fmaxnmp z0.h, p0/m, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.fmaxnmp.nxv8f16(<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %a,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @fmaxnmp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: fmaxnmp_f32:
				; CHECK: fmaxnmp z0.s, p0/m, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fmaxnmp.nxv4f32(<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %a,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @fmaxnmp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: fmaxnmp_f64:
				; CHECK: fmaxnmp z0.d, p0/m, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.fmaxnmp.nxv2f64(<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %a,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x double> %out
				}

				;
				; FMINP
				;

				define <vscale x 8 x half> @fminp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
				; CHECK-LABEL: fminp_f16:
				; CHECK: fminp z0.h, p0/m, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.fminp.nxv8f16(<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %a,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @fminp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: fminp_f32:
				; CHECK: fminp z0.s, p0/m, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fminp.nxv4f32(<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %a,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @fminp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: fminp_f64:
				; CHECK: fminp z0.d, p0/m, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.fminp.nxv2f64(<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %a,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x double> %out
				}

				;
				; FMINNMP
				;

				define <vscale x 8 x half> @fminnmp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
				; CHECK-LABEL: fminnmp_f16:
				; CHECK: fminnmp z0.h, p0/m, z0.h, z1.h
				; CHECK-NEXT: ret
				%out = call <vscale x 8 x half> @llvm.aarch64.sve.fminnmp.nxv8f16(<vscale x 8 x i1> %pg,
				<vscale x 8 x half> %a,
				<vscale x 8 x half> %b)
				ret <vscale x 8 x half> %out
				}

				define <vscale x 4 x float> @fminnmp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: fminnmp_f32:
				; CHECK: fminnmp z0.s, p0/m, z0.s, z1.s
				; CHECK-NEXT: ret
				%out = call <vscale x 4 x float> @llvm.aarch64.sve.fminnmp.nxv4f32(<vscale x 4 x i1> %pg,
				<vscale x 4 x float> %a,
				<vscale x 4 x float> %b)
				ret <vscale x 4 x float> %out
				}

				define <vscale x 2 x double> @fminnmp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: fminnmp_f64:
				; CHECK: fminnmp z0.d, p0/m, z0.d, z1.d
				; CHECK-NEXT: ret
				%out = call <vscale x 2 x double> @llvm.aarch64.sve.fminnmp.nxv2f64(<vscale x 2 x i1> %pg,
				<vscale x 2 x double> %a,
				<vscale x 2 x double> %b)
				ret <vscale x 2 x double> %out
				}

				declare <vscale x 8 x half> @llvm.aarch64.sve.faddp.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.faddp.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.faddp.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.fmaxp.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmaxp.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.fmaxp.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.fmaxnmp.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fmaxnmp.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.fmaxnmp.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.fminp.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fminp.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.fminp.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)

				declare <vscale x 8 x half> @llvm.aarch64.sve.fminnmp.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				declare <vscale x 4 x float> @llvm.aarch64.sve.fminnmp.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				declare <vscale x 2 x double> @llvm.aarch64.sve.fminnmp.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)

[AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
Closed, Public

Revision Contents

Diff 231886

llvm/include/llvm/IR/IntrinsicsAArch64.td

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

llvm/lib/Target/AArch64/SVEInstrFormats.td

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll

llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll

llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll
