Diff 290278

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
unsigned NewOpc = N->getOpcode();		unsigned NewOpc = N->getOpcode();
SDLoc dl(N);		SDLoc dl(N);

// If we're promoting a UINT to a larger size and the larger FP_TO_UINT is		// If we're promoting a UINT to a larger size and the larger FP_TO_UINT is
// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT		// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT
// and SINT conversions are Custom, there is no way to tell which is		// and SINT conversions are Custom, there is no way to tell which is
// preferable. We choose SINT because that's the right thing on PPC.)		// preferable. We choose SINT because that's the right thing on PPC.)
if (N->getOpcode() == ISD::FP_TO_UINT &&		if (!NVT.isScalableVector() && N->getOpcode() == ISD::FP_TO_UINT &&
efriedma: Why is isScalableVector() relevant here?
kmclaughlin (author): It's not really relevant here. When I first created the patch I thought it was better not to replace `FP_TO_UINT` with `FP_TO_SINT` when promoting the result, but I think it's fine to do this for scalable types as well. I've removed the isScalableVector() check from here and updated the affected tests (e.g. nxv2f64 -> nxv2i32) to check for `fcvtzs`.
!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&		!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&
TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))		TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))
NewOpc = ISD::FP_TO_SINT;		NewOpc = ISD::FP_TO_SINT;
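Note on the substitution above: a minimal standalone sketch (plain C++, not LLVM code) of why a signed conversion on the promoted type can stand in for the narrow unsigned one. The function names are hypothetical and the 16-bit/32-bit widths are chosen only for illustration. It assumes the input is in range for the narrow unsigned type; for out-of-range inputs the IR-level `fptoui` result is poison, so any value is acceptable there.

```cpp
#include <cstdint>

// Hypothetical model of the narrow conversion we actually want:
// float -> u16. Only meaningful for inputs in [0, 65535].
inline std::uint16_t fpToUint16Direct(float x) {
  return static_cast<std::uint16_t>(x);
}

// Hypothetical model of the promoted form: convert to a wider *signed*
// integer (standing in for FP_TO_SINT on the promoted type), then keep
// only the low 16 bits.
inline std::uint16_t fpToUint16ViaSint32(float x) {
  std::int32_t wide = static_cast<std::int32_t>(x);
  return static_cast<std::uint16_t>(wide);
}
```

Every value representable in the narrow unsigned type also fits in the wider signed type, so both paths agree for in-range inputs.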

if (N->getOpcode() == ISD::STRICT_FP_TO_UINT &&		if (N->getOpcode() == ISD::STRICT_FP_TO_UINT &&
!TLI.isOperationLegal(ISD::STRICT_FP_TO_UINT, NVT) &&		!TLI.isOperationLegal(ISD::STRICT_FP_TO_UINT, NVT) &&
TLI.isOperationLegalOrCustom(ISD::STRICT_FP_TO_SINT, NVT))		TLI.isOperationLegalOrCustom(ISD::STRICT_FP_TO_SINT, NVT))
NewOpc = ISD::STRICT_FP_TO_SINT;		NewOpc = ISD::STRICT_FP_TO_SINT;

llvm/lib/Target/AArch64/AArch64ISelLowering.h

enum NodeType : unsigned {
// Predicated instructions where inactive lanes produce undefined results.		// Predicated instructions where inactive lanes produce undefined results.
ADD_PRED,		ADD_PRED,
FADD_PRED,		FADD_PRED,
FDIV_PRED,		FDIV_PRED,
FMA_PRED,		FMA_PRED,
FMAXNM_PRED,		FMAXNM_PRED,
FMINNM_PRED,		FMINNM_PRED,
FMUL_PRED,		FMUL_PRED,
		FP_TO_UINT_PRED,
		FP_TO_SINT_PRED,
efriedma: Probably should be named FP_TO_UINT_SAT_MERGE_PASSTHRU or something like that, if you're going to use it to lower llvm.aarch64.sve.fcvtzu. (See D54749)
paulwalker-arm: I had a feeling you might say this. The problem is this is more than a rename because the SAT nodes take an additional parameter. This is not really a hurdle but there's a question as to where you draw the line. All the intrinsics expect to produce results that match the instructions on which they're based. This would suggest we can never use names based on similar ISD nodes because the common nodes do not define all corner cases. This seems like overkill considering the _PRED/_MERGE nodes are all under the AArch64ISD namespace, which to me suggests they follow AArch64 semantics.
efriedma: I think if there's some significant functionality difference, we should call it out in the name somehow. There's a very high likelihood that these nodes, or something like them, will eventually become target-independent. And even before that, people will likely assume they're equivalent based on the name. I don't want something to subtly break because someone didn't realize there was a difference. (It's particularly terrible in cases like fp->int conversions and shifts: if you confuse them, the result appears to mostly work, but you'll get a bug report years later.) If you're not comfortable naming FP_TO_UINT_SAT_MERGE_PASSTHRU without the matching width parameter, we could just call it FCVTZU_MERGE_PASSTHRU.
paulwalker-arm: Yes, I much prefer that naming strategy. Given this change I guess there are other nodes that should be renamed.
FSUB_PRED,		FSUB_PRED,
MUL_PRED,		MUL_PRED,
SDIV_PRED,		SDIV_PRED,
SHL_PRED,		SHL_PRED,
SMAX_PRED,		SMAX_PRED,
SMIN_PRED,		SMIN_PRED,
SRA_PRED,		SRA_PRED,
SRL_PRED,		SRL_PRED,
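Note on the naming thread above: a minimal standalone sketch of the corner cases under discussion. `fcvtzuModel` is a hypothetical helper, not LLVM or ACLE API. It models, as an assumption based on the architectural description of FCVTZU, a conversion that rounds toward zero, saturates out-of-range inputs, and maps NaN to 0. The target-independent FP_TO_UINT node leaves the out-of-range result undefined, which is the difference the reviewers want visible in the node name.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of a saturating fp -> u32 conversion.
inline std::uint32_t fcvtzuModel(double x) {
  if (std::isnan(x))
    return 0; // assumed: NaN converts to 0
  if (x <= 0.0)
    return 0; // negative inputs saturate to the minimum
  if (x >= 4294967296.0) // 2^32
    return std::numeric_limits<std::uint32_t>::max(); // saturate high
  return static_cast<std::uint32_t>(x); // in range: truncate toward zero
}
```

Code that treats the two as interchangeable works for every in-range input and only diverges on the saturating and NaN cases, which is why the confusion tends to surface late.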

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp


AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,

if (Subtarget->hasSVE()) {		if (Subtarget->hasSVE()) {
// FIXME: Add custom lowering of MLOAD to handle different passthrus (not a		// FIXME: Add custom lowering of MLOAD to handle different passthrus (not a
// splat of 0 or undef) once vector selects supported in SVE codegen. See		// splat of 0 or undef) once vector selects supported in SVE codegen. See
// D68877 for more details.		// D68877 for more details.
for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {		for (MVT VT : MVT::integer_scalable_vector_valuetypes()) {
if (isTypeLegal(VT)) {		if (isTypeLegal(VT)) {
setOperationAction(ISD::INSERT_SUBVECTOR, VT, Custom);		setOperationAction(ISD::INSERT_SUBVECTOR, VT, Custom);
		setOperationAction(ISD::FP_TO_UINT, VT, Custom);
		setOperationAction(ISD::FP_TO_SINT, VT, Custom);
setOperationAction(ISD::MUL, VT, Custom);		setOperationAction(ISD::MUL, VT, Custom);
setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);		setOperationAction(ISD::SPLAT_VECTOR, VT, Custom);
setOperationAction(ISD::SELECT, VT, Custom);		setOperationAction(ISD::SELECT, VT, Custom);
setOperationAction(ISD::SDIV, VT, Custom);		setOperationAction(ISD::SDIV, VT, Custom);
setOperationAction(ISD::UDIV, VT, Custom);		setOperationAction(ISD::UDIV, VT, Custom);
setOperationAction(ISD::SMIN, VT, Custom);		setOperationAction(ISD::SMIN, VT, Custom);
setOperationAction(ISD::UMIN, VT, Custom);		setOperationAction(ISD::UMIN, VT, Custom);
setOperationAction(ISD::SMAX, VT, Custom);		setOperationAction(ISD::SMAX, VT, Custom);
case AArch64ISD::FIRST_NUMBER:
MAKE_CASE(AArch64ISD::FMA_PRED)		MAKE_CASE(AArch64ISD::FMA_PRED)
MAKE_CASE(AArch64ISD::FMAXV_PRED)		MAKE_CASE(AArch64ISD::FMAXV_PRED)
MAKE_CASE(AArch64ISD::FMAXNM_PRED)		MAKE_CASE(AArch64ISD::FMAXNM_PRED)
MAKE_CASE(AArch64ISD::FMAXNMV_PRED)		MAKE_CASE(AArch64ISD::FMAXNMV_PRED)
MAKE_CASE(AArch64ISD::FMINV_PRED)		MAKE_CASE(AArch64ISD::FMINV_PRED)
MAKE_CASE(AArch64ISD::FMINNM_PRED)		MAKE_CASE(AArch64ISD::FMINNM_PRED)
MAKE_CASE(AArch64ISD::FMINNMV_PRED)		MAKE_CASE(AArch64ISD::FMINNMV_PRED)
MAKE_CASE(AArch64ISD::FMUL_PRED)		MAKE_CASE(AArch64ISD::FMUL_PRED)
		MAKE_CASE(AArch64ISD::FP_TO_UINT_PRED)
		MAKE_CASE(AArch64ISD::FP_TO_SINT_PRED)
MAKE_CASE(AArch64ISD::FSUB_PRED)		MAKE_CASE(AArch64ISD::FSUB_PRED)
MAKE_CASE(AArch64ISD::NOT)		MAKE_CASE(AArch64ISD::NOT)
MAKE_CASE(AArch64ISD::BIT)		MAKE_CASE(AArch64ISD::BIT)
MAKE_CASE(AArch64ISD::CBZ)		MAKE_CASE(AArch64ISD::CBZ)
MAKE_CASE(AArch64ISD::CBNZ)		MAKE_CASE(AArch64ISD::CBNZ)
MAKE_CASE(AArch64ISD::TBZ)		MAKE_CASE(AArch64ISD::TBZ)
MAKE_CASE(AArch64ISD::TBNZ)		MAKE_CASE(AArch64ISD::TBNZ)
MAKE_CASE(AArch64ISD::TC_RETURN)		MAKE_CASE(AArch64ISD::TC_RETURN)

SDValue AArch64TargetLowering::LowerVectorFP_TO_INT(SDValue Op,		SDValue AArch64TargetLowering::LowerVectorFP_TO_INT(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
// Warning: We maintain cost tables in AArch64TargetTransformInfo.cpp.		// Warning: We maintain cost tables in AArch64TargetTransformInfo.cpp.
// Any additional optimization in this function should be recorded		// Any additional optimization in this function should be recorded
// in the cost tables.		// in the cost tables.
EVT InVT = Op.getOperand(0).getValueType();		EVT InVT = Op.getOperand(0).getValueType();
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();

		if (VT.isScalableVector()) {
		unsigned Opcode = Op.getOpcode() == ISD::FP_TO_UINT
		? AArch64ISD::FP_TO_UINT_PRED
Lint (pre-merge checks): clang-format: please reformat the code.
		: AArch64ISD::FP_TO_SINT_PRED;
		return LowerToPredicatedOp(Op, DAG, Opcode);
		}

unsigned NumElts = InVT.getVectorNumElements();		unsigned NumElts = InVT.getVectorNumElements();

// f16 conversions are promoted to f32 when full fp16 is not supported.		// f16 conversions are promoted to f32 when full fp16 is not supported.
if (InVT.getVectorElementType() == MVT::f16 &&		if (InVT.getVectorElementType() == MVT::f16 &&
!Subtarget->hasFullFP16()) {		!Subtarget->hasFullFP16()) {
MVT NewVT = MVT::getVectorVT(MVT::f32, NumElts);		MVT NewVT = MVT::getVectorVT(MVT::f32, NumElts);
SDLoc dl(Op);		SDLoc dl(Op);
return DAG.getNode(		return DAG.getNode(
static SDValue combineSVEReductionFP(SDNode *N, unsigned Opc,

// SVE reductions set the whole vector register with the first element		// SVE reductions set the whole vector register with the first element
// containing the reduction result, which we'll now extract.		// containing the reduction result, which we'll now extract.
SDValue Zero = DAG.getConstant(0, DL, MVT::i64);		SDValue Zero = DAG.getConstant(0, DL, MVT::i64);
return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, N->getValueType(0), Reduce,		return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, N->getValueType(0), Reduce,
Zero);		Zero);
}		}

		static SDValue combineSVEConversionFP(SDNode *N, unsigned Opc,
		SelectionDAG &DAG) {
		SDLoc DL(N);

		SDValue Pred = N->getOperand(2);
		SDValue VecToConvert = N->getOperand(3);
		EVT ConvertVT = N->getOperand(1).getValueType();

		return DAG.getNode(Opc, DL, ConvertVT,
Lint (pre-merge checks): clang-format: please reformat the code:
- return DAG.getNode(Opc, DL, ConvertVT,
- DAG.getUNDEF(ConvertVT),
- Pred, VecToConvert);
+ return DAG.getNode(Opc, DL, ConvertVT, DAG.getUNDEF(ConvertVT), Pred,
+ VecToConvert);
		DAG.getUNDEF(ConvertVT),
		Pred, VecToConvert);
paulwalker-arm: This doesn't look correct. You are dropping the passthru value (i.e. Operand(1)) and always using UNDEF instead. It's a bit troubling that such a change hasn't broken any of the lit tests. I think you'll want to treat these like the other unary operations, which have matching _MERGE_PASSTHRU ISD nodes rather than _PRED ones.
		}
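Note on the passthru comment above: a minimal standalone sketch of what "merging" predication means for the conversion intrinsics, to show why replacing the passthru with UNDEF changes observable results. `convertMerging` is a hypothetical model, not LLVM code, and it assumes all three vectors have the same length.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical lane-wise model of a merging predicated fp -> int conversion:
// active lanes take the converted source value, inactive lanes keep the
// corresponding passthru lane.
std::vector<std::int32_t> convertMerging(const std::vector<std::int32_t> &passthru,
                                         const std::vector<bool> &pred,
                                         const std::vector<float> &src) {
  std::vector<std::int32_t> result(passthru); // start from the passthru lanes
  for (std::size_t i = 0; i < src.size(); ++i)
    if (pred[i])
      result[i] = static_cast<std::int32_t>(src[i]); // truncates toward zero
  return result;
}
```

If the passthru operand is dropped, the inactive lanes are no longer guaranteed to hold the caller's values, which is the behaviour change the comment describes.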

static SDValue combineSVEReductionOrderedFP(SDNode *N, unsigned Opc,		static SDValue combineSVEReductionOrderedFP(SDNode *N, unsigned Opc,
SelectionDAG &DAG) {		SelectionDAG &DAG) {
SDLoc DL(N);		SDLoc DL(N);

SDValue Pred = N->getOperand(1);		SDValue Pred = N->getOperand(1);
SDValue InitVal = N->getOperand(2);		SDValue InitVal = N->getOperand(2);
SDValue VecToReduce = N->getOperand(3);		SDValue VecToReduce = N->getOperand(3);
EVT ReduceVT = VecToReduce.getValueType();		EVT ReduceVT = VecToReduce.getValueType();
static SDValue performIntrinsicCombine(SDNode *N,
case Intrinsic::aarch64_sve_fmaxnmv:		case Intrinsic::aarch64_sve_fmaxnmv:
return combineSVEReductionFP(N, AArch64ISD::FMAXNMV_PRED, DAG);		return combineSVEReductionFP(N, AArch64ISD::FMAXNMV_PRED, DAG);
case Intrinsic::aarch64_sve_fmaxv:		case Intrinsic::aarch64_sve_fmaxv:
return combineSVEReductionFP(N, AArch64ISD::FMAXV_PRED, DAG);		return combineSVEReductionFP(N, AArch64ISD::FMAXV_PRED, DAG);
case Intrinsic::aarch64_sve_fminnmv:		case Intrinsic::aarch64_sve_fminnmv:
return combineSVEReductionFP(N, AArch64ISD::FMINNMV_PRED, DAG);		return combineSVEReductionFP(N, AArch64ISD::FMINNMV_PRED, DAG);
case Intrinsic::aarch64_sve_fminv:		case Intrinsic::aarch64_sve_fminv:
return combineSVEReductionFP(N, AArch64ISD::FMINV_PRED, DAG);		return combineSVEReductionFP(N, AArch64ISD::FMINV_PRED, DAG);
		case Intrinsic::aarch64_sve_fcvtzu:
		case Intrinsic::aarch64_sve_fcvtzu_i32f64:
		case Intrinsic::aarch64_sve_fcvtzu_i32f16:
		case Intrinsic::aarch64_sve_fcvtzu_i64f16:
		case Intrinsic::aarch64_sve_fcvtzu_i64f32:
		return combineSVEConversionFP(N, AArch64ISD::FP_TO_UINT_PRED, DAG);
		case Intrinsic::aarch64_sve_fcvtzs:
		case Intrinsic::aarch64_sve_fcvtzs_i32f64:
		case Intrinsic::aarch64_sve_fcvtzs_i64f32:
		case Intrinsic::aarch64_sve_fcvtzs_i32f16:
		case Intrinsic::aarch64_sve_fcvtzs_i64f16:
		return combineSVEConversionFP(N, AArch64ISD::FP_TO_SINT_PRED, DAG);
case Intrinsic::aarch64_sve_sel:		case Intrinsic::aarch64_sve_sel:
return DAG.getNode(ISD::VSELECT, SDLoc(N), N->getValueType(0),		return DAG.getNode(ISD::VSELECT, SDLoc(N), N->getValueType(0),
N->getOperand(1), N->getOperand(2), N->getOperand(3));		N->getOperand(1), N->getOperand(2), N->getOperand(3));
case Intrinsic::aarch64_sve_cmpeq_wide:		case Intrinsic::aarch64_sve_cmpeq_wide:
return tryConvertSVEWideCompare(N, ISD::SETEQ, DCI, DAG);		return tryConvertSVEWideCompare(N, ISD::SETEQ, DCI, DAG);
case Intrinsic::aarch64_sve_cmpne_wide:		case Intrinsic::aarch64_sve_cmpne_wide:
return tryConvertSVEWideCompare(N, ISD::SETNE, DCI, DAG);		return tryConvertSVEWideCompare(N, ISD::SETNE, DCI, DAG);
case Intrinsic::aarch64_sve_cmpge_wide:		case Intrinsic::aarch64_sve_cmpge_wide:
if (isMergePassthruOpcode(NewOp))
Operands.push_back(DAG.getUNDEF(ContainerVT));		Operands.push_back(DAG.getUNDEF(ContainerVT));

auto ScalableRes = DAG.getNode(NewOp, DL, ContainerVT, Operands);		auto ScalableRes = DAG.getNode(NewOp, DL, ContainerVT, Operands);
return convertFromScalableVector(DAG, VT, ScalableRes);		return convertFromScalableVector(DAG, VT, ScalableRes);
}		}

assert(VT.isScalableVector() && "Only expect to lower scalable vector op!");		assert(VT.isScalableVector() && "Only expect to lower scalable vector op!");

SmallVector<SDValue, 4> Operands = {Pg};		SmallVector<SDValue, 4> Operands;

		if (NewOp == AArch64ISD::FP_TO_UINT_PRED ||
		NewOp == AArch64ISD::FP_TO_SINT_PRED)
		Operands.push_back(DAG.getUNDEF(Op.getValueType()));

		Operands.push_back(Pg);
for (const SDValue &V : Op->op_values()) {		for (const SDValue &V : Op->op_values()) {
assert((isa<CondCodeSDNode>(V) || V.getValueType().isScalableVector()) &&		assert((isa<CondCodeSDNode>(V) || V.getValueType().isScalableVector()) &&
"Only scalable vectors are supported!");		"Only scalable vectors are supported!");
Operands.push_back(V);		Operands.push_back(V);
}		}

if (isMergePassthruOpcode(NewOp))		if (isMergePassthruOpcode(NewOp))
Operands.push_back(DAG.getUNDEF(VT));		Operands.push_back(DAG.getUNDEF(VT));
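Note on the operand order in this hunk: a minimal standalone sketch of the list the new code builds, using strings in place of SDValues. The enum and `buildOperands` are hypothetical stand-ins, and the `MergePassthru` case only mirrors the `isMergePassthruOpcode` branch visible in the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the opcode classes handled in this hunk.
enum class NewOp { FpToUintPred, FpToSintPred, OtherPred, MergePassthru };

// Builds the operand list in the order the patched code uses:
// conversions get a leading undef passthru, then the predicate, then the
// original operands; merge-passthru opcodes get a trailing undef instead.
std::vector<std::string> buildOperands(NewOp op,
                                       const std::vector<std::string> &srcOperands) {
  std::vector<std::string> operands;
  if (op == NewOp::FpToUintPred || op == NewOp::FpToSintPred)
    operands.push_back("undef"); // passthru comes first for the conversions
  operands.push_back("Pg");
  for (const std::string &v : srcOperands)
    operands.push_back(v);
  if (op == NewOp::MergePassthru)
    operands.push_back("undef"); // passthru comes last for these opcodes
  return operands;
}
```

The leading position matches SDT_AArch64FCVT in this patch, where the predicate (the i1 vector) is operand 2 rather than operand 1.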

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;		def AArch64lasta : SDNode<"AArch64ISD::LASTA", SDT_AArch64Reduce>;
def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;		def AArch64lastb : SDNode<"AArch64ISD::LASTB", SDT_AArch64Reduce>;

def SDT_AArch64Arith : SDTypeProfile<1, 3, [		def SDT_AArch64Arith : SDTypeProfile<1, 3, [
SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,		SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
SDTCVecEltisVT<1,i1>, SDTCisSameAs<0,2>, SDTCisSameAs<2,3>		SDTCVecEltisVT<1,i1>, SDTCisSameAs<0,2>, SDTCisSameAs<2,3>
]>;		]>;

		def SDT_AArch64FCVT : SDTypeProfile<1, 3, [
		SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>,
		SDTCVecEltisVT<2,i1>
		]>;

		def AArch64fcvtzu_p : SDNode<"AArch64ISD::FP_TO_UINT_PRED", SDT_AArch64FCVT>;
		def AArch64fcvtzs_p : SDNode<"AArch64ISD::FP_TO_SINT_PRED", SDT_AArch64FCVT>;

def SDT_AArch64FMA : SDTypeProfile<1, 4, [		def SDT_AArch64FMA : SDTypeProfile<1, 4, [
SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>, SDTCisVec<4>,		SDTCisVec<0>, SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3>, SDTCisVec<4>,
SDTCVecEltisVT<1,i1>, SDTCisSameAs<0,2>, SDTCisSameAs<2,3>, SDTCisSameAs<3,4>		SDTCVecEltisVT<1,i1>, SDTCisSameAs<0,2>, SDTCisSameAs<2,3>, SDTCisSameAs<3,4>
]>;		]>;

// Predicated operations with the result of inactive lanes being unspecified.		// Predicated operations with the result of inactive lanes being unspecified.
def AArch64add_p : SDNode<"AArch64ISD::ADD_PRED", SDT_AArch64Arith>;		def AArch64add_p : SDNode<"AArch64ISD::ADD_PRED", SDT_AArch64Arith>;
def AArch64asr_p : SDNode<"AArch64ISD::SRA_PRED", SDT_AArch64Arith>;		def AArch64asr_p : SDNode<"AArch64ISD::SRA_PRED", SDT_AArch64Arith>;
multiclass sve_prefetch<SDPatternOperator prefetch, ValueType PredTy, Instruction RegImmInst, Instruction RegRegInst, int scale, ComplexPattern AddrCP> {
defm ASR_ZPZZ : sve_int_bin_pred_bhsd<AArch64asr_p>;		defm ASR_ZPZZ : sve_int_bin_pred_bhsd<AArch64asr_p>;
defm LSR_ZPZZ : sve_int_bin_pred_bhsd<AArch64lsr_p>;		defm LSR_ZPZZ : sve_int_bin_pred_bhsd<AArch64lsr_p>;
defm LSL_ZPZZ : sve_int_bin_pred_bhsd<AArch64lsl_p>;		defm LSL_ZPZZ : sve_int_bin_pred_bhsd<AArch64lsl_p>;

defm ASR_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b000, "asr", int_aarch64_sve_asr_wide>;		defm ASR_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b000, "asr", int_aarch64_sve_asr_wide>;
defm LSR_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b001, "lsr", int_aarch64_sve_lsr_wide>;		defm LSR_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b001, "lsr", int_aarch64_sve_lsr_wide>;
defm LSL_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b011, "lsl", int_aarch64_sve_lsl_wide>;		defm LSL_WIDE_ZPmZ : sve_int_bin_pred_shift_wide<0b011, "lsl", int_aarch64_sve_lsl_wide>;

		defm FCVTZS_ZPmZ : sve_fp_2op_p_zd_signed <"fcvtzs", AArch64fcvtzs_p>;
		defm FCVTZU_ZPmZ : sve_fp_2op_p_zd_unsigned<"fcvtzu", AArch64fcvtzu_p>;
defm FCVT_ZPmZ_StoH : sve_fp_2op_p_zd<0b1001000, "fcvt", ZPR32, ZPR16, int_aarch64_sve_fcvt_f16f32, nxv8f16, nxv4i1, nxv4f32, ElementSizeS>;		defm FCVT_ZPmZ_StoH : sve_fp_2op_p_zd<0b1001000, "fcvt", ZPR32, ZPR16, int_aarch64_sve_fcvt_f16f32, nxv8f16, nxv4i1, nxv4f32, ElementSizeS>;
defm FCVT_ZPmZ_HtoS : sve_fp_2op_p_zd<0b1001001, "fcvt", ZPR16, ZPR32, int_aarch64_sve_fcvt_f32f16, nxv4f32, nxv4i1, nxv8f16, ElementSizeS>;		defm FCVT_ZPmZ_HtoS : sve_fp_2op_p_zd<0b1001001, "fcvt", ZPR16, ZPR32, int_aarch64_sve_fcvt_f32f16, nxv4f32, nxv4i1, nxv8f16, ElementSizeS>;
defm SCVTF_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0110010, "scvtf", ZPR16, ZPR16, int_aarch64_sve_scvtf, nxv8f16, nxv8i1, nxv8i16, ElementSizeH>;		defm SCVTF_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0110010, "scvtf", ZPR16, ZPR16, int_aarch64_sve_scvtf, nxv8f16, nxv8i1, nxv8i16, ElementSizeH>;
defm SCVTF_ZPmZ_StoS : sve_fp_2op_p_zd<0b1010100, "scvtf", ZPR32, ZPR32, int_aarch64_sve_scvtf, nxv4f32, nxv4i1, nxv4i32, ElementSizeS>;		defm SCVTF_ZPmZ_StoS : sve_fp_2op_p_zd<0b1010100, "scvtf", ZPR32, ZPR32, int_aarch64_sve_scvtf, nxv4f32, nxv4i1, nxv4i32, ElementSizeS>;
defm UCVTF_ZPmZ_StoS : sve_fp_2op_p_zd<0b1010101, "ucvtf", ZPR32, ZPR32, int_aarch64_sve_ucvtf, nxv4f32, nxv4i1, nxv4i32, ElementSizeS>;		defm UCVTF_ZPmZ_StoS : sve_fp_2op_p_zd<0b1010101, "ucvtf", ZPR32, ZPR32, int_aarch64_sve_ucvtf, nxv4f32, nxv4i1, nxv4i32, ElementSizeS>;
defm UCVTF_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0110011, "ucvtf", ZPR16, ZPR16, int_aarch64_sve_ucvtf, nxv8f16, nxv8i1, nxv8i16, ElementSizeH>;		defm UCVTF_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0110011, "ucvtf", ZPR16, ZPR16, int_aarch64_sve_ucvtf, nxv8f16, nxv8i1, nxv8i16, ElementSizeH>;
defm FCVTZS_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0111010, "fcvtzs", ZPR16, ZPR16, int_aarch64_sve_fcvtzs, nxv8i16, nxv8i1, nxv8f16, ElementSizeH>;
defm FCVTZS_ZPmZ_StoS : sve_fp_2op_p_zd<0b1011100, "fcvtzs", ZPR32, ZPR32, int_aarch64_sve_fcvtzs, nxv4i32, nxv4i1, nxv4f32, ElementSizeS>;
defm FCVTZU_ZPmZ_HtoH : sve_fp_2op_p_zd<0b0111011, "fcvtzu", ZPR16, ZPR16, int_aarch64_sve_fcvtzu, nxv8i16, nxv8i1, nxv8f16, ElementSizeH>;
defm FCVTZU_ZPmZ_StoS : sve_fp_2op_p_zd<0b1011101, "fcvtzu", ZPR32, ZPR32, int_aarch64_sve_fcvtzu, nxv4i32, nxv4i1, nxv4f32, ElementSizeS>;
defm FCVT_ZPmZ_DtoH : sve_fp_2op_p_zd<0b1101000, "fcvt", ZPR64, ZPR16, int_aarch64_sve_fcvt_f16f64, nxv8f16, nxv2i1, nxv2f64, ElementSizeD>;		defm FCVT_ZPmZ_DtoH : sve_fp_2op_p_zd<0b1101000, "fcvt", ZPR64, ZPR16, int_aarch64_sve_fcvt_f16f64, nxv8f16, nxv2i1, nxv2f64, ElementSizeD>;
defm FCVT_ZPmZ_HtoD : sve_fp_2op_p_zd<0b1101001, "fcvt", ZPR16, ZPR64, int_aarch64_sve_fcvt_f64f16, nxv2f64, nxv2i1, nxv8f16, ElementSizeD>;		defm FCVT_ZPmZ_HtoD : sve_fp_2op_p_zd<0b1101001, "fcvt", ZPR16, ZPR64, int_aarch64_sve_fcvt_f64f16, nxv2f64, nxv2i1, nxv8f16, ElementSizeD>;
defm FCVT_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1101010, "fcvt", ZPR64, ZPR32, int_aarch64_sve_fcvt_f32f64, nxv4f32, nxv2i1, nxv2f64, ElementSizeD>;		defm FCVT_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1101010, "fcvt", ZPR64, ZPR32, int_aarch64_sve_fcvt_f32f64, nxv4f32, nxv2i1, nxv2f64, ElementSizeD>;
defm FCVT_ZPmZ_StoD : sve_fp_2op_p_zd<0b1101011, "fcvt", ZPR32, ZPR64, int_aarch64_sve_fcvt_f64f32, nxv2f64, nxv2i1, nxv4f32, ElementSizeD>;		defm FCVT_ZPmZ_StoD : sve_fp_2op_p_zd<0b1101011, "fcvt", ZPR32, ZPR64, int_aarch64_sve_fcvt_f64f32, nxv2f64, nxv2i1, nxv4f32, ElementSizeD>;
defm SCVTF_ZPmZ_StoD : sve_fp_2op_p_zd<0b1110000, "scvtf", ZPR32, ZPR64, int_aarch64_sve_scvtf_f64i32, nxv2f64, nxv2i1, nxv4i32, ElementSizeD>;		defm SCVTF_ZPmZ_StoD : sve_fp_2op_p_zd<0b1110000, "scvtf", ZPR32, ZPR64, int_aarch64_sve_scvtf_f64i32, nxv2f64, nxv2i1, nxv4i32, ElementSizeD>;
defm UCVTF_ZPmZ_StoD : sve_fp_2op_p_zd<0b1110001, "ucvtf", ZPR32, ZPR64, int_aarch64_sve_ucvtf_f64i32, nxv2f64, nxv2i1, nxv4i32, ElementSizeD>;		defm UCVTF_ZPmZ_StoD : sve_fp_2op_p_zd<0b1110001, "ucvtf", ZPR32, ZPR64, int_aarch64_sve_ucvtf_f64i32, nxv2f64, nxv2i1, nxv4i32, ElementSizeD>;
defm UCVTF_ZPmZ_StoH : sve_fp_2op_p_zd<0b0110101, "ucvtf", ZPR32, ZPR16, int_aarch64_sve_ucvtf_f16i32, nxv8f16, nxv4i1, nxv4i32, ElementSizeS>;		defm UCVTF_ZPmZ_StoH : sve_fp_2op_p_zd<0b0110101, "ucvtf", ZPR32, ZPR16, int_aarch64_sve_ucvtf_f16i32, nxv8f16, nxv4i1, nxv4i32, ElementSizeS>;
defm SCVTF_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1110100, "scvtf", ZPR64, ZPR32, int_aarch64_sve_scvtf_f32i64, nxv4f32, nxv2i1, nxv2i64, ElementSizeD>;		defm SCVTF_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1110100, "scvtf", ZPR64, ZPR32, int_aarch64_sve_scvtf_f32i64, nxv4f32, nxv2i1, nxv2i64, ElementSizeD>;
defm SCVTF_ZPmZ_StoH : sve_fp_2op_p_zd<0b0110100, "scvtf", ZPR32, ZPR16, int_aarch64_sve_scvtf_f16i32, nxv8f16, nxv4i1, nxv4i32, ElementSizeS>;		defm SCVTF_ZPmZ_StoH : sve_fp_2op_p_zd<0b0110100, "scvtf", ZPR32, ZPR16, int_aarch64_sve_scvtf_f16i32, nxv8f16, nxv4i1, nxv4i32, ElementSizeS>;
defm SCVTF_ZPmZ_DtoH : sve_fp_2op_p_zd<0b0110110, "scvtf", ZPR64, ZPR16, int_aarch64_sve_scvtf_f16i64, nxv8f16, nxv2i1, nxv2i64, ElementSizeD>;		defm SCVTF_ZPmZ_DtoH : sve_fp_2op_p_zd<0b0110110, "scvtf", ZPR64, ZPR16, int_aarch64_sve_scvtf_f16i64, nxv8f16, nxv2i1, nxv2i64, ElementSizeD>;
defm UCVTF_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1110101, "ucvtf", ZPR64, ZPR32, int_aarch64_sve_ucvtf_f32i64, nxv4f32, nxv2i1, nxv2i64, ElementSizeD>;		defm UCVTF_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1110101, "ucvtf", ZPR64, ZPR32, int_aarch64_sve_ucvtf_f32i64, nxv4f32, nxv2i1, nxv2i64, ElementSizeD>;
defm UCVTF_ZPmZ_DtoH : sve_fp_2op_p_zd<0b0110111, "ucvtf", ZPR64, ZPR16, int_aarch64_sve_ucvtf_f16i64, nxv8f16, nxv2i1, nxv2i64, ElementSizeD>;		defm UCVTF_ZPmZ_DtoH : sve_fp_2op_p_zd<0b0110111, "ucvtf", ZPR64, ZPR16, int_aarch64_sve_ucvtf_f16i64, nxv8f16, nxv2i1, nxv2i64, ElementSizeD>;
defm SCVTF_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1110110, "scvtf", ZPR64, ZPR64, int_aarch64_sve_scvtf, nxv2f64, nxv2i1, nxv2i64, ElementSizeD>;		defm SCVTF_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1110110, "scvtf", ZPR64, ZPR64, int_aarch64_sve_scvtf, nxv2f64, nxv2i1, nxv2i64, ElementSizeD>;
defm UCVTF_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1110111, "ucvtf", ZPR64, ZPR64, int_aarch64_sve_ucvtf, nxv2f64, nxv2i1, nxv2i64, ElementSizeD>;		defm UCVTF_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1110111, "ucvtf", ZPR64, ZPR64, int_aarch64_sve_ucvtf, nxv2f64, nxv2i1, nxv2i64, ElementSizeD>;
defm FCVTZS_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1111000, "fcvtzs", ZPR64, ZPR32, int_aarch64_sve_fcvtzs_i32f64, nxv4i32, nxv2i1, nxv2f64, ElementSizeD>;
defm FCVTZU_ZPmZ_DtoS : sve_fp_2op_p_zd<0b1111001, "fcvtzu", ZPR64, ZPR32, int_aarch64_sve_fcvtzu_i32f64, nxv4i32, nxv2i1, nxv2f64, ElementSizeD>;
defm FCVTZS_ZPmZ_StoD : sve_fp_2op_p_zd<0b1111100, "fcvtzs", ZPR32, ZPR64, int_aarch64_sve_fcvtzs_i64f32, nxv2i64, nxv2i1, nxv4f32, ElementSizeD>;
defm FCVTZS_ZPmZ_HtoS : sve_fp_2op_p_zd<0b0111100, "fcvtzs", ZPR16, ZPR32, int_aarch64_sve_fcvtzs_i32f16, nxv4i32, nxv4i1, nxv8f16, ElementSizeS>;
defm FCVTZS_ZPmZ_HtoD : sve_fp_2op_p_zd<0b0111110, "fcvtzs", ZPR16, ZPR64, int_aarch64_sve_fcvtzs_i64f16, nxv2i64, nxv2i1, nxv8f16, ElementSizeD>;
defm FCVTZU_ZPmZ_HtoS : sve_fp_2op_p_zd<0b0111101, "fcvtzu", ZPR16, ZPR32, int_aarch64_sve_fcvtzu_i32f16, nxv4i32, nxv4i1, nxv8f16, ElementSizeS>;
defm FCVTZU_ZPmZ_HtoD : sve_fp_2op_p_zd<0b0111111, "fcvtzu", ZPR16, ZPR64, int_aarch64_sve_fcvtzu_i64f16, nxv2i64, nxv2i1, nxv8f16, ElementSizeD>;
defm FCVTZU_ZPmZ_StoD : sve_fp_2op_p_zd<0b1111101, "fcvtzu", ZPR32, ZPR64, int_aarch64_sve_fcvtzu_i64f32, nxv2i64, nxv2i1, nxv4f32, ElementSizeD>;
defm FCVTZS_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1111110, "fcvtzs", ZPR64, ZPR64, int_aarch64_sve_fcvtzs, nxv2i64, nxv2i1, nxv2f64, ElementSizeD>;
defm FCVTZU_ZPmZ_DtoD : sve_fp_2op_p_zd<0b1111111, "fcvtzu", ZPR64, ZPR64, int_aarch64_sve_fcvtzu, nxv2i64, nxv2i1, nxv2f64, ElementSizeD>;

defm FRINTN_ZPmZ : sve_fp_2op_p_zd_HSD<0b00000, "frintn", null_frag, AArch64frintn_mt>;		defm FRINTN_ZPmZ : sve_fp_2op_p_zd_HSD<0b00000, "frintn", null_frag, AArch64frintn_mt>;
defm FRINTP_ZPmZ : sve_fp_2op_p_zd_HSD<0b00001, "frintp", null_frag, AArch64frintp_mt>;		defm FRINTP_ZPmZ : sve_fp_2op_p_zd_HSD<0b00001, "frintp", null_frag, AArch64frintp_mt>;
defm FRINTM_ZPmZ : sve_fp_2op_p_zd_HSD<0b00010, "frintm", null_frag, AArch64frintm_mt>;		defm FRINTM_ZPmZ : sve_fp_2op_p_zd_HSD<0b00010, "frintm", null_frag, AArch64frintm_mt>;
defm FRINTZ_ZPmZ : sve_fp_2op_p_zd_HSD<0b00011, "frintz", null_frag, AArch64frintz_mt>;		defm FRINTZ_ZPmZ : sve_fp_2op_p_zd_HSD<0b00011, "frintz", null_frag, AArch64frintz_mt>;
defm FRINTA_ZPmZ : sve_fp_2op_p_zd_HSD<0b00100, "frinta", null_frag, AArch64frinta_mt>;		defm FRINTA_ZPmZ : sve_fp_2op_p_zd_HSD<0b00100, "frinta", null_frag, AArch64frinta_mt>;
defm FRINTX_ZPmZ : sve_fp_2op_p_zd_HSD<0b00110, "frintx", null_frag, AArch64frintx_mt>;		defm FRINTX_ZPmZ : sve_fp_2op_p_zd_HSD<0b00110, "frintx", null_frag, AArch64frintx_mt>;
defm FRINTI_ZPmZ : sve_fp_2op_p_zd_HSD<0b00111, "frinti", null_frag, AArch64frinti_mt>;		defm FRINTI_ZPmZ : sve_fp_2op_p_zd_HSD<0b00111, "frinti", null_frag, AArch64frinti_mt>;

llvm/lib/Target/AArch64/SVEInstrFormats.td


multiclass sve_fp_2op_p_zd<bits<7> opc, string asm,
RegisterOperand o_zprtype,		RegisterOperand o_zprtype,
SDPatternOperator op, ValueType vt1,		SDPatternOperator op, ValueType vt1,
ValueType vt2, ValueType vt3, ElementSizeEnum Sz> {		ValueType vt2, ValueType vt3, ElementSizeEnum Sz> {
def NAME : sve_fp_2op_p_zd<opc, asm, i_zprtype, o_zprtype, Sz>;		def NAME : sve_fp_2op_p_zd<opc, asm, i_zprtype, o_zprtype, Sz>;

def : SVE_3_Op_Pat<vt1, op, vt1, vt2, vt3, !cast<Instruction>(NAME)>;		def : SVE_3_Op_Pat<vt1, op, vt1, vt2, vt3, !cast<Instruction>(NAME)>;
}		}

		multiclass sve_fp_2op_p_zd_signed<string asm, SDPatternOperator op> {
		def _HtoH : sve_fp_2op_p_zd<0b0111010, asm, ZPR16, ZPR16, ElementSizeH>;
		def _HtoS : sve_fp_2op_p_zd<0b0111100, asm, ZPR16, ZPR32, ElementSizeS>;
		def _HtoD : sve_fp_2op_p_zd<0b0111110, asm, ZPR16, ZPR64, ElementSizeD>;
		def _StoS : sve_fp_2op_p_zd<0b1011100, asm, ZPR32, ZPR32, ElementSizeS>;
		def _StoD : sve_fp_2op_p_zd<0b1111100, asm, ZPR32, ZPR64, ElementSizeD>;
		def _DtoS : sve_fp_2op_p_zd<0b1111000, asm, ZPR64, ZPR32, ElementSizeS>;
		def _DtoD : sve_fp_2op_p_zd<0b1111110, asm, ZPR64, ZPR64, ElementSizeD>;

		def : SVE_3_Op_Pat<nxv8i16, op, nxv8i16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _HtoH)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv4f16, !cast<Instruction>(NAME # _HtoS)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv8f16, !cast<Instruction>(NAME # _HtoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f16, !cast<Instruction>(NAME # _HtoD)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv8f16, !cast<Instruction>(NAME # _HtoD)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _StoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f32, !cast<Instruction>(NAME # _StoD)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv4f32, !cast<Instruction>(NAME # _StoD)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _DtoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _DtoD)>;
		}

		multiclass sve_fp_2op_p_zd_unsigned<string asm, SDPatternOperator op> {
		def _HtoH : sve_fp_2op_p_zd<0b0111011, asm, ZPR16, ZPR16, ElementSizeH>;
		def _HtoS : sve_fp_2op_p_zd<0b0111101, asm, ZPR16, ZPR32, ElementSizeS>;
		def _HtoD : sve_fp_2op_p_zd<0b0111111, asm, ZPR16, ZPR64, ElementSizeD>;
		def _StoS : sve_fp_2op_p_zd<0b1011101, asm, ZPR32, ZPR32, ElementSizeS>;
		def _StoD : sve_fp_2op_p_zd<0b1111101, asm, ZPR32, ZPR64, ElementSizeD>;
		def _DtoS : sve_fp_2op_p_zd<0b1111001, asm, ZPR64, ZPR32, ElementSizeS>;
		def _DtoD : sve_fp_2op_p_zd<0b1111111, asm, ZPR64, ZPR64, ElementSizeD>;

		def : SVE_3_Op_Pat<nxv8i16, op, nxv8i16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _HtoH)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv4f16, !cast<Instruction>(NAME # _HtoS)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv8f16, !cast<Instruction>(NAME # _HtoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f16, !cast<Instruction>(NAME # _HtoD)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv8f16, !cast<Instruction>(NAME # _HtoD)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _StoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f32, !cast<Instruction>(NAME # _StoD)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv4f32, !cast<Instruction>(NAME # _StoD)>;
		def : SVE_3_Op_Pat<nxv4i32, op, nxv4i32, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _DtoS)>;
		def : SVE_3_Op_Pat<nxv2i64, op, nxv2i64, nxv2i1, nxv2f64, !cast<Instruction>(NAME # _DtoD)>;
		}

multiclass sve_fp_2op_p_zd_HSD<bits<5> opc, string asm, SDPatternOperator op_merge,
SDPatternOperator op_pt = null_frag> {
def _H : sve_fp_2op_p_zd<{ 0b01, opc }, asm, ZPR16, ZPR16, ElementSizeH>;
def _S : sve_fp_2op_p_zd<{ 0b10, opc }, asm, ZPR32, ZPR32, ElementSizeS>;
def _D : sve_fp_2op_p_zd<{ 0b11, opc }, asm, ZPR64, ZPR64, ElementSizeD>;

def : SVE_3_Op_Pat<nxv8f16, op_merge, nxv8f16, nxv8i1, nxv8f16, !cast<Instruction>(NAME # _H)>;
def : SVE_3_Op_Pat<nxv4f32, op_merge, nxv4f32, nxv4i1, nxv4f32, !cast<Instruction>(NAME # _S)>;

llvm/test/CodeGen/AArch64/sve-fcvt.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64--linux-gnu -mattr=+sve < %s | FileCheck %s

				;
				; FP_TO_SINT
				;

				define <vscale x 2 x i16> @fcvtzs_h_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzs_h_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x half> %a to <vscale x 2 x i16>
				ret <vscale x 2 x i16> %res
				}

				define <vscale x 4 x i16> @fcvtzs_h_nxv4f16(<vscale x 4 x half> %a) {
				; CHECK-LABEL: fcvtzs_h_nxv4f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzs z0.s, p0/m, z0.h
				paulwalker-arm (Done): Perhaps worth adding tests for: fcvtz[s,u]_h_nxv2f32, fcvtz[s,u]_h_nxv2f64, fcvtz[s,u]_h_nxv4f32
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 4 x half> %a to <vscale x 4 x i16>
				ret <vscale x 4 x i16> %res
				}

				define <vscale x 8 x i16> @fcvtzs_h_nxv8f16(<vscale x 8 x half> %a) {
				; CHECK-LABEL: fcvtzs_h_nxv8f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.h
				; CHECK-NEXT: fcvtzs z0.h, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 8 x half> %a to <vscale x 8 x i16>
				ret <vscale x 8 x i16> %res
				}

				define <vscale x 2 x i32> @fcvtzs_s_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x half> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 2 x i32> @fcvtzs_s_nxv2f32(<vscale x 2 x float> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv2f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x float> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 2 x i32> @fcvtzs_s_nxv2f64(<vscale x 2 x double> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv2f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.d
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x double> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 4 x i32> @fcvtzs_s_nxv4f16(<vscale x 4 x half> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv4f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzs z0.s, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 4 x half> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				define <vscale x 4 x i32> @fcvtzs_s_nxv4f32(<vscale x 4 x float> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv4f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzs z0.s, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 4 x float> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				define <vscale x 2 x i64> @fcvtzs_d_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzs_d_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x half> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}

				define <vscale x 2 x i64> @fcvtzs_d_nxv2f32(<vscale x 2 x float> %a) {
				; CHECK-LABEL: fcvtzs_d_nxv2f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x float> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}

				define <vscale x 2 x i64> @fcvtzs_d_nxv2f64(<vscale x 2 x double> %a) {
				; CHECK-LABEL: fcvtzs_d_nxv2f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.d
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 2 x double> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}

				;
				; FP_TO_UINT
				;

				define <vscale x 2 x i16> @fcvtzu_h_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzu_h_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				paulwalker-arm (Not Done): I know it's in the langref but it cannot hurt to add a comment here along the lines of `NOTE: Using fcvtzs is safe because fptoui overflow is considered poison and a 64bit signed value encompasses the entire range of a 16bit unsigned value.` What do you think?
				kmclaughlin (Author, Done): I've added your comment in here, it will definitely help me to remember why this is safe in future :)
				%res = fptoui <vscale x 2 x half> %a to <vscale x 2 x i16>
				ret <vscale x 2 x i16> %res
				}

				define <vscale x 4 x i16> @fcvtzu_h_nxv4f16(<vscale x 4 x half> %a) {
				; CHECK-LABEL: fcvtzu_h_nxv4f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzu z0.s, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 4 x half> %a to <vscale x 4 x i16>
				ret <vscale x 4 x i16> %res
				}

				define <vscale x 8 x i16> @fcvtzu_h_nxv8f16(<vscale x 8 x half> %a) {
				; CHECK-LABEL: fcvtzu_h_nxv8f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.h
				; CHECK-NEXT: fcvtzu z0.h, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 8 x half> %a to <vscale x 8 x i16>
				ret <vscale x 8 x i16> %res
				}

				define <vscale x 2 x i32> @fcvtzu_s_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x half> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 2 x i32> @fcvtzu_s_nxv2f32(<vscale x 2 x float> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv2f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x float> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 2 x i32> @fcvtzu_s_nxv2f64(<vscale x 2 x double> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv2f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.d
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x double> %a to <vscale x 2 x i32>
				ret <vscale x 2 x i32> %res
				}

				define <vscale x 4 x i32> @fcvtzu_s_nxv4f16(<vscale x 4 x half> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv4f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzu z0.s, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 4 x half> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				define <vscale x 4 x i32> @fcvtzu_s_nxv4f32(<vscale x 4 x float> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv4f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcvtzu z0.s, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 4 x float> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				define <vscale x 2 x i64> @fcvtzu_d_nxv2f16(<vscale x 2 x half> %a) {
				; CHECK-LABEL: fcvtzu_d_nxv2f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.h
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x half> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}

				define <vscale x 2 x i64> @fcvtzu_d_nxv2f32(<vscale x 2 x float> %a) {
				; CHECK-LABEL: fcvtzu_d_nxv2f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.s
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x float> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}

				define <vscale x 2 x i64> @fcvtzu_d_nxv2f64(<vscale x 2 x double> %a) {
				; CHECK-LABEL: fcvtzu_d_nxv2f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.d
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 2 x double> %a to <vscale x 2 x i64>
				ret <vscale x 2 x i64> %res
				}
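
The reviewer's note on fcvtzu_h_nxv2f16 (a signed convert is safe for a promoted fptoui because out-of-range inputs are poison and a 64-bit signed value covers the whole 16-bit unsigned range) can be checked with a small scalar model. This is only a sketch under stated assumptions: `float` stands in for `half`, the helper names are hypothetical, and inputs outside the defined range are skipped rather than modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// fptoui to i16 is only defined when the truncated value lies in [0, 65535].
// LLVM IR treats every other input as poison, so this model never feeds one in.
inline bool inRangeForU16(float x) { return x > -1.0f && x < 65536.0f; }

// Direct unsigned conversion, i.e. what fptoui to i16 yields for defined inputs.
inline std::uint16_t convertUnsigned(float x) {
  assert(inRangeForU16(x));
  return static_cast<std::uint16_t>(x);
}

// Promoted form: signed convert into a 64-bit lane, then keep the low 16 bits.
inline std::uint16_t convertViaSigned64(float x) {
  assert(inRangeForU16(x));
  std::int64_t wide = static_cast<std::int64_t>(x);
  return static_cast<std::uint16_t>(wide);
}

// Compares both forms over every integer in range plus a few fractional offsets.
inline bool conversionsAgree() {
  for (int i = 0; i <= 65535; ++i) {
    for (float frac : {0.0f, 0.25f, 0.5f}) {
      float x = static_cast<float>(i) + frac;
      if (convertUnsigned(x) != convertViaSigned64(x))
        return false;
    }
  }
  // Values in (-1, 0) truncate to zero and are therefore still defined.
  return convertUnsigned(-0.5f) == convertViaSigned64(-0.5f);
}
```

Calling `conversionsAgree()` returns true when both forms match on every sampled input that fptoui defines; the model says nothing about poison inputs, where any result is acceptable.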

llvm/test/CodeGen/AArch64/sve-split-fcvt.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64--linux-gnu -mattr=sve < %s | FileCheck %s

				; FP_TO_SINT

				; Split operand
				define <vscale x 4 x i32> @fcvtzs_s_nxv4f64(<vscale x 4 x double> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv4f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z1.d, p0/m, z1.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.d
				; CHECK-NEXT: uzp1 z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 4 x double> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				define <vscale x 8 x i16> @fcvtzs_h_nxv8f64(<vscale x 8 x double> %a) {
				; CHECK-LABEL: fcvtzs_h_nxv8f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzs z3.d, p0/m, z3.d
				; CHECK-NEXT: fcvtzs z2.d, p0/m, z2.d
				; CHECK-NEXT: fcvtzs z1.d, p0/m, z1.d
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z0.d
				; CHECK-NEXT: uzp1 z2.s, z2.s, z3.s
				; CHECK-NEXT: uzp1 z0.s, z0.s, z1.s
				; CHECK-NEXT: uzp1 z0.h, z0.h, z2.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 8 x double> %a to <vscale x 8 x i16>
				ret <vscale x 8 x i16> %res
				}

				; Split result
				define <vscale x 4 x i64> @fcvtzs_d_nxv4f32(<vscale x 4 x float> %a) {
				; CHECK-LABEL: fcvtzs_d_nxv4f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: uunpklo z1.d, z0.s
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: uunpkhi z2.d, z0.s
				; CHECK-NEXT: fcvtzs z0.d, p0/m, z1.s
				; CHECK-NEXT: fcvtzs z1.d, p0/m, z2.s
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 4 x float> %a to <vscale x 4 x i64>
				ret <vscale x 4 x i64> %res
				}

				define <vscale x 16 x i32> @fcvtzs_s_nxv16f16(<vscale x 16 x half> %a) {
				; CHECK-LABEL: fcvtzs_s_nxv16f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: uunpklo z2.s, z0.h
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: uunpkhi z3.s, z0.h
				; CHECK-NEXT: uunpklo z4.s, z1.h
				; CHECK-NEXT: uunpkhi z5.s, z1.h
				; CHECK-NEXT: fcvtzs z0.s, p0/m, z2.h
				; CHECK-NEXT: fcvtzs z1.s, p0/m, z3.h
				; CHECK-NEXT: fcvtzs z2.s, p0/m, z4.h
				; CHECK-NEXT: fcvtzs z3.s, p0/m, z5.h
				; CHECK-NEXT: ret
				%res = fptosi <vscale x 16 x half> %a to <vscale x 16 x i32>
				ret <vscale x 16 x i32> %res
				}

				; FP_TO_UINT

				; Split operand
				define <vscale x 4 x i32> @fcvtzu_s_nxv4f64(<vscale x 4 x double> %a) {
				; CHECK-LABEL: fcvtzu_s_nxv4f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcvtzu z1.d, p0/m, z1.d
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z0.d
				; CHECK-NEXT: uzp1 z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 4 x double> %a to <vscale x 4 x i32>
				ret <vscale x 4 x i32> %res
				}

				; Split result
				define <vscale x 4 x i64> @fcvtzu_d_nxv4f32(<vscale x 4 x float> %a) {
				; CHECK-LABEL: fcvtzu_d_nxv4f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: uunpklo z1.d, z0.s
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: uunpkhi z2.d, z0.s
				; CHECK-NEXT: fcvtzu z0.d, p0/m, z1.s
				; CHECK-NEXT: fcvtzu z1.d, p0/m, z2.s
				; CHECK-NEXT: ret
				%res = fptoui <vscale x 4 x float> %a to <vscale x 4 x i64>
				ret <vscale x 4 x i64> %res
				}
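
In the split-operand checks above, each half of the input is converted in `.d` lanes and the two results are then narrowed into one `.s` vector with `uzp1`. A rough scalar model of that narrowing step follows. It is a sketch, not the backend's implementation: it assumes a fixed 128-bit vector (two 64-bit lanes), and all type and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDLanes = 2;            // assumption: 128-bit vector
constexpr std::size_t kSLanes = 2 * kDLanes;  // same register viewed as .s lanes

using VecD = std::array<std::int64_t, kDLanes>;
using VecS = std::array<std::int32_t, kSLanes>;

// View a register of 64-bit lanes as 32-bit lanes: the low half of each
// doubleword lands in the even .s lane, the high half in the odd .s lane.
inline VecS asS(const VecD &v) {
  VecS out{};
  for (std::size_t i = 0; i < kDLanes; ++i) {
    const auto bits = static_cast<std::uint64_t>(v[i]);
    out[2 * i] = static_cast<std::int32_t>(static_cast<std::uint32_t>(bits));
    out[2 * i + 1] =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(bits >> 32));
  }
  return out;
}

// UZP1: even-numbered elements of zn, followed by even-numbered elements of zm.
inline VecS uzp1(const VecS &zn, const VecS &zm) {
  VecS out{};
  for (std::size_t i = 0; i < kSLanes / 2; ++i) {
    out[i] = zn[2 * i];
    out[i + kSLanes / 2] = zm[2 * i];
  }
  return out;
}

// Models "uzp1 z0.s, z0.s, z1.s": truncate both halves and concatenate them.
inline VecS narrowPair(const VecD &lo, const VecD &hi) {
  return uzp1(asS(lo), asS(hi));
}
```

For example, `narrowPair({1, -2}, {3, 4})` gives the `.s` lanes `{1, -2, 3, 4}`: picking the even `.s` elements keeps the low 32 bits of every 64-bit conversion result, which is the truncation the i32 result type asks for.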

This is an archive of the discontinued LLVM Phabricator instance.

[SVE][CodeGen] Lower floating point -> integer conversions
ClosedPublic


Diff 290278

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

llvm/lib/Target/AArch64/AArch64ISelLowering.h

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td

llvm/lib/Target/AArch64/SVEInstrFormats.td

llvm/test/CodeGen/AArch64/sve-fcvt.ll

llvm/test/CodeGen/AArch64/sve-split-fcvt.ll
