This is an archive of the discontinued LLVM Phabricator instance.

Differential D158053

[Legalizer] Expand fmaximum and fminimum
Abandoned, Public

Authored by qiucf on Aug 15 2023, 11:30 PM.

Details

Reviewers

nemanjai
shchenz
stefanp
tlively
RKSimon
e-kud
craig.topper
jcranmer-intel

Group Reviewers

Restricted Project

Summary

According to langref, llvm.maximum/minimum has -0.0 < +0.0 semantics and propagates NaN.

Expand the nodes on targets not supporting the operation, by adding extra check for NaN and using is_fpclass to check zero signs.
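
As a reference point for the semantics described above, here is a minimal scalar sketch in C++. The helper names `maximum_model`, `minimum_model` and `check_min_max_models` are hypothetical and not LLVM APIs. The sketch only restates the two LangRef properties quoted in the summary (NaN propagation and -0.0 < +0.0) and makes no claim about which NaN payload is returned.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar model of llvm.maximum: NaN in, NaN out; +0.0 beats -0.0.
double maximum_model(double lhs, double rhs) {
  if (std::isnan(lhs) || std::isnan(rhs))
    return std::numeric_limits<double>::quiet_NaN();
  if (lhs == 0.0 && rhs == 0.0)            // +0.0 == -0.0 compares equal
    return std::signbit(lhs) ? rhs : lhs;  // keep a +0.0 operand if there is one
  return lhs > rhs ? lhs : rhs;
}

// Hypothetical scalar model of llvm.minimum: NaN in, NaN out; -0.0 beats +0.0.
double minimum_model(double lhs, double rhs) {
  if (std::isnan(lhs) || std::isnan(rhs))
    return std::numeric_limits<double>::quiet_NaN();
  if (lhs == 0.0 && rhs == 0.0)
    return std::signbit(lhs) ? lhs : rhs;  // keep a -0.0 operand if there is one
  return lhs < rhs ? lhs : rhs;
}

// Usage examples; call from a test or from main().
void check_min_max_models() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(std::isnan(maximum_model(nan, 1.0)));
  assert(std::isnan(minimum_model(2.0, nan)));
  assert(!std::signbit(maximum_model(-0.0, 0.0)));
  assert(std::signbit(minimum_model(0.0, -0.0)));
}
```

A plain `lhs > rhs ? lhs : rhs` gets neither property right on its own, which is why the expansion needs the two extra checks.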

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

qiucf created this revision. Aug 15 2023, 11:30 PM

Herald added a project: Restricted Project. Aug 15 2023, 11:30 PM

Herald added subscribers: steven.zhang, pengfei, kbarton and 2 others.

qiucf requested review of this revision. Aug 15 2023, 11:30 PM

Herald added a subscriber: llvm-commits.

NaN handling is not the only difference between these intrinsics; they also have different signed-zero semantics.

Also, I really don't think another target should implement custom lowering for this without adding generic legalization support first. Please add generic legalization first, and if that's not optimal for ppc, add custom lowering. It makes no sense that every target is re-implementing this using custom lowering right now.

Harbormaster completed remote builds in B252868: Diff 550628. Aug 16 2023, 3:13 AM

This should be done as a generic legalization and is missing the -0.0 < +0.0 semantic detailed here: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic

This revision now requires changes to proceed. Aug 16 2023, 6:12 AM

Move to generic expansion.

Some targets do not set the legal status of fmaximum and fminimum properly, so once the expansion is added, the legalizer takes the expansion path on them. The RISCV settings looked suspicious, so I marked the operations as legal there, but I am uncertain about ARM.

This call also needs to be added to VectorLegalizer::Expand, which changes a large number of test cases. I think that is also caused by the wrong action settings for fminimum/fmaximum.

PowerPC codegen is not optimal on VSX targets; a follow-up patch will optimize it.

Herald added subscribers: wangpc, luke, sunshaoce and 23 others. Aug 18 2023, 3:15 AM

asb added a reviewer: craig.topper. Aug 18 2023, 3:20 AM

I think we should either define the existing FMINNUM_IEEE/FMAXNUM_IEEE to have the correct IEEE 2019 signed zero ordering (I can't name a target that doesn't have this behavior), or we have to add a pair of DAG nodes that do

Harbormaster completed remote builds in B253447: Diff 551451. Aug 18 2023, 4:30 AM

craig.topper added inline comments. Aug 18 2023, 7:49 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
441	Can you reverse the `if` and `else` here to remove the `!` from the condition?

e-kud added inline comments. Aug 18 2023, 9:07 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
8183	Could it be simplified for `maximum` as follows?
	Tmp = select(is_fpclass(PosZero, LHS), LHS, MinMax)
	Res = select(is_fpclass(PosZero, RHS), RHS, Tmp)
	MinMax = select(IsZero, Res, MinMax)
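
To make the select chain in this comment easier to follow, here is a scalar walk-through in C++ of the `maximum` expansion as it is discussed in this review. It is only a sketch: `is_pos_zero` is a hypothetical stand-in for `is_fpclass` with the positive-zero class, `expand_maximum_model` is an illustrative name, and step 1 assumes the plain compare-and-select fallback rather than an FMAXNUM or FMAXNUM_IEEE node.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar stand-in for is_fpclass(x, fcPosZero).
static bool is_pos_zero(float x) { return x == 0.0f && !std::signbit(x); }

// Scalar walk-through of the three steps of the `maximum` expansion.
float expand_maximum_model(float lhs, float rhs) {
  // Step 1: plain compare-and-select (models the SETGT fallback).
  float min_max = lhs > rhs ? lhs : rhs;

  // Step 2: an unordered compare (either operand is NaN) forces a NaN result.
  if (std::isnan(lhs) || std::isnan(rhs))
    min_max = std::numeric_limits<float>::quiet_NaN();

  // Step 3: if the result is a zero, prefer whichever operand is +0.0.
  const bool is_zero = (min_max == 0.0f);
  const float tmp = is_pos_zero(lhs) ? lhs : min_max;
  const float res = is_pos_zero(rhs) ? rhs : tmp;
  return is_zero ? res : min_max;
}

// Usage examples; call from a test or from main().
void check_expand_maximum_model() {
  // Step 1 alone would return -0.0 here, because +0.0 > -0.0 is false.
  assert(!std::signbit(expand_maximum_model(0.0f, -0.0f)));
  assert(std::signbit(expand_maximum_model(-0.0f, -0.0f)));
  assert(expand_maximum_model(-1.0f, 2.0f) == 2.0f);
}
```

The `is_zero` guard matters: without it, an input such as (+0.0, 5.0) would be overwritten with +0.0 by the first select.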

arsenm added a reviewer: jcranmer-intel. Aug 18 2023, 11:45 AM

I don't believe any of the existing Arm/AArch64 tests should change with this patch. The SVE test looks like it has just regenerated check lines without any changes? Can you include this change to make the operations always legal when we have NEON? It will be better than expanding them:

diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 80a147666094..27f12f3a1cc8 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -1543,15 +1543,11 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
 
   if (Subtarget->hasNEON()) {
     // vmin and vmax aren't available in a scalar form, so we can use
-    // a NEON instruction with an undef lane instead.  This has a performance
-    // penalty on some cores, so we don't do this unless we have been
-    // asked to by the core tuning model.
-    if (Subtarget->useNEONForSinglePrecisionFP()) {
-      setOperationAction(ISD::FMINIMUM, MVT::f32, Legal);
-      setOperationAction(ISD::FMAXIMUM, MVT::f32, Legal);
-      setOperationAction(ISD::FMINIMUM, MVT::f16, Legal);
-      setOperationAction(ISD::FMAXIMUM, MVT::f16, Legal);
-    }
+    // a NEON instruction with an undef lane instead.
+    setOperationAction(ISD::FMINIMUM, MVT::f32, Legal);
+    setOperationAction(ISD::FMAXIMUM, MVT::f32, Legal);
+    setOperationAction(ISD::FMINIMUM, MVT::f16, Legal);
+    setOperationAction(ISD::FMAXIMUM, MVT::f16, Legal);
     setOperationAction(ISD::FMINIMUM, MVT::v2f32, Legal);
     setOperationAction(ISD::FMAXIMUM, MVT::v2f32, Legal);
     setOperationAction(ISD::FMINIMUM, MVT::v4f32, Legal);

RKSimon mentioned this in rG1b956616162f: [AArch64] Regenerate sve-fixed-length-fp-minmax.ll. Aug 20 2023, 3:46 AM

tra added a subscriber: tra. Aug 21 2023, 1:47 PM

tra added inline comments. Aug 21 2023, 2:13 PM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
8161	Are we expected to return a NaN or the specific NaN value of one of the arguments? APFloat's implementation returns one of the argument values. X86ISelLowering.cpp does the same.

qiucf updated this revision to Diff 552239. Aug 22 2023, 12:33 AM

qiucf marked 2 inline comments as done.

In D158053#4598440, @arsenm wrote:

I think we should either define the existing FMINNUM_IEEE/FMAXNUM_IEEE to have the correct IEEE 2019 signed zero ordering (I can't name a target that doesn't have this behavior), or we have to add a pair of DAG nodes that do

Some targets have instructions legal for fminnum_ieee. It's better to add new ones.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
8161	Does the standard specify if the exact NaN should be returned? Choosing one NaN will produce more complex code.

Harbormaster completed remote builds in B254007: Diff 552239. Aug 22 2023, 1:17 AM

tra added inline comments. Aug 22 2023, 10:48 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
8161	I don't have access to the standard text. All I could find online are these notes https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/minNum_maxNum_Removal_Demotion_v3.pdf outlining the state of the min/max and the recommendations for the standard changes. They do seem to care about NaN associativity/commutativity, but I have no idea what exactly ended up specified by the standard. I assume that the authors of the implementations in APFloat and X86 did implement it according to the standard. @tlively -- would you happen to know the details?

tlively added inline comments. Aug 22 2023, 12:00 PM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
8161	Looked into it, and yes, the standard says (in section 6.2) that the output NaN should have the same payload as one of the input NaNs, explicitly not specifying which input NaN the payload is from if there are multiple. Furthermore, the output NaN must be canonical even if the input is a non-canonical NaN.
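
For illustration, here is a small C++ sketch of what "return one of the input NaNs" could look like, as opposed to materializing a fresh NaN constant with `APFloat::getNaN` as the diff under review does. It works on raw binary32 bit patterns so the example does not depend on how a platform passes signaling NaNs around. The names `is_nan_bits`, `propagate_nan_bits` and `check_propagate_nan_bits` are hypothetical, and treating the top fraction bit as the quiet bit is an assumption that matches the common binary32 convention, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// binary32 layout constants. Assumption: a set top fraction bit means "quiet".
constexpr std::uint32_t kExpMask = 0x7F800000u;
constexpr std::uint32_t kFracMask = 0x007FFFFFu;
constexpr std::uint32_t kQuietBit = 0x00400000u;

// True if the bit pattern encodes a NaN (all-ones exponent, nonzero fraction).
bool is_nan_bits(std::uint32_t bits) {
  return (bits & kExpMask) == kExpMask && (bits & kFracMask) != 0;
}

// Picks the first NaN operand, keeps its sign and payload, and sets the quiet
// bit. Precondition: at least one operand encodes a NaN.
std::uint32_t propagate_nan_bits(std::uint32_t lhs, std::uint32_t rhs) {
  assert(is_nan_bits(lhs) || is_nan_bits(rhs));
  const std::uint32_t src = is_nan_bits(lhs) ? lhs : rhs;
  return src | kQuietBit;
}

// Usage examples; call from a test or from main().
void check_propagate_nan_bits() {
  // Signaling NaN with payload 0x123 becomes a quiet NaN with the same payload.
  assert(propagate_nan_bits(0x7F800123u, 0x3F800000u) == 0x7FC00123u);
  // An already quiet NaN on the right-hand side is returned unchanged.
  assert(propagate_nan_bits(0x3F800000u, 0xFFC00042u) == 0xFFC00042u);
}
```

As noted in the thread, choosing an input NaN costs extra selects compared with a single constant, which is the trade-off being weighed here.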

In D158053#4605852, @qiucf wrote:

In D158053#4598440, @arsenm wrote:

I think we should either define the existing FMINNUM_IEEE/FMAXNUM_IEEE to have the correct IEEE 2019 signed zero ordering (I can't name a target that doesn't have this behavior), or we have to add a pair of DAG nodes that do

Some targets have instructions legal for fminnum_ieee. It's better to add new ones.

I thought AMDGPU was the only one using it, but I see PPC and LoongArch have started. AMDGPU definitely has the correct signed zero behavior; do the other two?

dcaballe mentioned this in D158618: [mlir][vector] Rename vector reductions: `maxf` → `maximumf`, `minf` → `minimumf`. Aug 23 2023, 9:40 AM

In D158053#4607756, @arsenm wrote:

In D158053#4605852, @qiucf wrote:

In D158053#4598440, @arsenm wrote:

I think we should either define the existing FMINNUM_IEEE/FMAXNUM_IEEE to have the correct IEEE 2019 signed zero ordering (I can't name a target that doesn't have this behavior), or we have to add a pair of DAG nodes that do

Some targets have instructions legal for fminnum_ieee. It's better to add new ones.

I thought AMDGPU was the only one using it, but I see PPC and LoongArch have started. AMDGPU definitely has the correct signed zero behavior; do the other two?

The PowerPC VSX max/min instructions (1) respect zero sign ordering; (2) give max(QNaN, non-NaN) = non-NaN; (3) give max(SNaN, *) = QNaN, so they match fmaximum_ieee semantics (Power ISA 3.1, page 735).
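
A scalar C++ sketch of properties (1) and (2) may help contrast this behavior with llvm.maximum, which returns NaN whenever either operand is NaN. The names `maxnum_model` and `check_maxnum_model` are hypothetical, and property (3), the quieting of signaling NaNs, is deliberately left out because it needs bit-level handling that would obscure the point.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of a maxNum-style operation: a single quiet NaN operand
// is ignored, and +0.0 is ordered above -0.0.
float maxnum_model(float a, float b) {
  if (std::isnan(a)) return b;            // max(QNaN, x) = x
  if (std::isnan(b)) return a;            // max(x, QNaN) = x
  if (a == 0.0f && b == 0.0f)
    return std::signbit(a) ? b : a;       // keep a +0.0 operand if there is one
  return a > b ? a : b;
}

// Usage examples; call from a test or from main().
void check_maxnum_model() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(maxnum_model(nan, 3.0f) == 3.0f);        // NaN is not propagated
  assert(!std::signbit(maxnum_model(-0.0f, 0.0f)));
  assert(std::isnan(maxnum_model(nan, nan)));     // NaN only if both are NaN
}
```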

The LoongArch ISA says 'the operation of these two instructions follows the specification of maxNum(x,y) operation in the IEEE 754-2008 standard', so I think it is also right. (CC @SixWeining)

There seems to be no way to get fmaximum_ieee/fminimum_ieee through IR?

Migrated to https://github.com/llvm/llvm-project/pull/67301

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/TargetLowering.h	3 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	6 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	7 lines
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp	58 lines
llvm/lib/Target/ARM/ARMISelLowering.cpp	14 lines
llvm/lib/Target/RISCV/RISCVISelLowering.cpp	10 lines
llvm/test/CodeGen/ARM/minnum-maxnum-intrinsics.ll	28 lines
llvm/test/CodeGen/PowerPC/fminimum-fmaximum-f128.ll	97 lines
llvm/test/CodeGen/PowerPC/fminimum-fmaximum.ll	847 lines

Diff 552239

llvm/include/llvm/CodeGen/TargetLowering.h

	/// \param Chain output chain after conversion			/// \param Chain output chain after conversion
	/// \returns True, if the expansion was successful, false otherwise			/// \returns True, if the expansion was successful, false otherwise
	bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SDValue &Chain,			bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SDValue &Chain,
	SelectionDAG &DAG) const;			SelectionDAG &DAG) const;

	/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.			/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.
	SDValue expandFMINNUM_FMAXNUM(SDNode *N, SelectionDAG &DAG) const;			SDValue expandFMINNUM_FMAXNUM(SDNode *N, SelectionDAG &DAG) const;

				/// Expand fminimum/fmaximum into multiple comparison with selects.
				SDValue expandFMINIMUM_FMAXIMUM(SDNode *N, SelectionDAG &DAG) const;

	/// Expand FP_TO_[US]INT_SAT into FP_TO_[US]INT and selects or min/max.			/// Expand FP_TO_[US]INT_SAT into FP_TO_[US]INT and selects or min/max.
	/// \param N Node to expand			/// \param N Node to expand
	/// \returns The expansion result			/// \returns The expansion result
	SDValue expandFP_TO_INT_SAT(SDNode *N, SelectionDAG &DAG) const;			SDValue expandFP_TO_INT_SAT(SDNode *N, SelectionDAG &DAG) const;

	/// Expand check for floating point class.			/// Expand check for floating point class.
	/// \param ResultVT The type of intrinsic call result.			/// \param ResultVT The type of intrinsic call result.
	/// \param Op The tested value.			/// \param Op The tested value.

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

	break;			break;
	}			}
	case ISD::FMINNUM:			case ISD::FMINNUM:
	case ISD::FMAXNUM: {			case ISD::FMAXNUM: {
	if (SDValue Expanded = TLI.expandFMINNUM_FMAXNUM(Node, DAG))			if (SDValue Expanded = TLI.expandFMINNUM_FMAXNUM(Node, DAG))
	Results.push_back(Expanded);			Results.push_back(Expanded);
	break;			break;
	}			}
				case ISD::FMINIMUM:
				case ISD::FMAXIMUM: {
				if (SDValue Expanded = TLI.expandFMINIMUM_FMAXIMUM(Node, DAG))
				Results.push_back(Expanded);
				break;
				}
	case ISD::FSIN:			case ISD::FSIN:
	case ISD::FCOS: {			case ISD::FCOS: {
	EVT VT = Node->getValueType(0);			EVT VT = Node->getValueType(0);
	// Turn fsin / fcos into ISD::FSINCOS node if there are a pair of fsin /			// Turn fsin / fcos into ISD::FSINCOS node if there are a pair of fsin /
	// fcos which share the same operand and both are used.			// fcos which share the same operand and both are used.
	if ((TLI.isOperationLegalOrCustom(ISD::FSINCOS, VT) ||			if ((TLI.isOperationLegalOrCustom(ISD::FSINCOS, VT) ||
	isSinCosLibcallAvailable(Node, TLI))			isSinCosLibcallAvailable(Node, TLI))
	&& useSinCos(Node)) {			&& useSinCos(Node)) {

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

	break;			break;
	case ISD::FMINNUM:			case ISD::FMINNUM:
	case ISD::FMAXNUM:			case ISD::FMAXNUM:
	if (SDValue Expanded = TLI.expandFMINNUM_FMAXNUM(Node, DAG)) {			if (SDValue Expanded = TLI.expandFMINNUM_FMAXNUM(Node, DAG)) {
	Results.push_back(Expanded);			Results.push_back(Expanded);
	return;			return;
	}			}
	break;			break;
				case ISD::FMINIMUM:
				case ISD::FMAXIMUM:
				if (SDValue Expanded = TLI.expandFMINIMUM_FMAXIMUM(Node, DAG)) {
				Results.push_back(Expanded);
				return;
				}
				break;
	case ISD::SMIN:			case ISD::SMIN:
	case ISD::SMAX:			case ISD::SMAX:
	case ISD::UMIN:			case ISD::UMIN:
	case ISD::UMAX:			case ISD::UMAX:
	if (SDValue Expanded = TLI.expandIntMINMAX(Node, DAG)) {			if (SDValue Expanded = TLI.expandIntMINMAX(Node, DAG)) {
	Results.push_back(Expanded);			Results.push_back(Expanded);
	return;			return;
	}			}

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

	}			}

	if (SDValue SelCC = createSelectForFMINNUM_FMAXNUM(Node, DAG))			if (SDValue SelCC = createSelectForFMINNUM_FMAXNUM(Node, DAG))
	return SelCC;			return SelCC;

	return SDValue();			return SDValue();
	}			}

				SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N,
				SelectionDAG &DAG) const {
				SDLoc DL(N);
				SDValue LHS = N->getOperand(0);
				SDValue RHS = N->getOperand(1);
				unsigned Opc = N->getOpcode();
				EVT VT = N->getValueType(0);
				EVT CCVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
				bool NoNaN = (N->getFlags().hasNoNaNs() ||
				(DAG.isKnownNeverNaN(LHS) && DAG.isKnownNeverNaN(RHS)));
				bool NoZeroSign =
				(N->getFlags().hasNoSignedZeros() || DAG.isKnownNeverZeroFloat(LHS) ||
				DAG.isKnownNeverZeroFloat(RHS));
				bool IsMax = Opc == ISD::FMAXIMUM;

				if (VT.isVector() &&
				isOperationLegalOrCustomOrPromote(Opc, VT.getScalarType()))
				return SDValue();

				SDValue MinMax;
				if (isOperationLegalOrCustom(IsMax ? ISD::FMAXNUM_IEEE : ISD::FMINNUM_IEEE,
				VT))
				MinMax = DAG.getNode(IsMax ? ISD::FMAXNUM_IEEE : ISD::FMINNUM_IEEE, DL, VT,
				LHS, RHS);
				else if (isOperationLegalOrCustom(IsMax ? ISD::FMAXNUM : ISD::FMINNUM, VT))
				MinMax = DAG.getNode(IsMax ? ISD::FMAXNUM : ISD::FMINNUM, DL, VT, LHS, RHS);
				else
				MinMax = DAG.getSelect(
				DL, VT,
				DAG.getSetCC(DL, CCVT, LHS, RHS, IsMax ? ISD::SETGT : ISD::SETLT), LHS,
				RHS);

				// Propagate any NaN of both operands
				if (!NoNaN) {
				ConstantFP *FPNaN = ConstantFP::get(
				*DAG.getContext(), APFloat::getNaN(DAG.EVTToAPFloatSemantics(VT)));
				MinMax = DAG.getSelect(DL, VT, DAG.getSetCC(DL, CCVT, LHS, RHS, ISD::SETUO),
				DAG.getConstantFP(*FPNaN, DL, VT), MinMax);
				}

				// fminimum/fmaximum requires -0.0 less than +0.0
				if (!NoZeroSign) {
				SDValue IsZero = DAG.getSetCC(DL, CCVT, MinMax,
				DAG.getConstantFP(0.0, DL, VT), ISD::SETEQ);
				SDValue TestZero =
				DAG.getTargetConstant(IsMax ? fcPosZero : fcNegZero, DL, MVT::i32);
				SDValue LCmp = DAG.getSelect(
				DL, VT, DAG.getNode(ISD::IS_FPCLASS, DL, CCVT, LHS, TestZero), LHS,
				MinMax);
				SDValue RCmp = DAG.getSelect(
				DL, VT, DAG.getNode(ISD::IS_FPCLASS, DL, CCVT, RHS, TestZero), RHS,
				LCmp);
				MinMax = DAG.getSelect(DL, VT, IsZero, RCmp, MinMax);
				}

				return MinMax;
				}

	/// Returns a true value if if this FPClassTest can be performed with an ordered			/// Returns a true value if if this FPClassTest can be performed with an ordered
	/// fcmp to 0, and a false value if it's an unordered fcmp to 0. Returns			/// fcmp to 0, and a false value if it's an unordered fcmp to 0. Returns
	/// std::nullopt if it cannot be performed as a compare with 0.			/// std::nullopt if it cannot be performed as a compare with 0.
	static std::optional<bool> isFCmpEqualZero(FPClassTest Test,			static std::optional<bool> isFCmpEqualZero(FPClassTest Test,
	const fltSemantics &Semantics,			const fltSemantics &Semantics,
	const MachineFunction &MF) {			const MachineFunction &MF) {
	FPClassTest OrderedMask = Test & ~fcNan;			FPClassTest OrderedMask = Test & ~fcNan;
	FPClassTest NanTest = Test & fcNan;			FPClassTest NanTest = Test & fcNan;

llvm/lib/Target/ARM/ARMISelLowering.cpp

	setOperationAction(ISD::FLOG10, MVT::f16, Promote);			setOperationAction(ISD::FLOG10, MVT::f16, Promote);
	setOperationAction(ISD::FLOG2, MVT::f16, Promote);			setOperationAction(ISD::FLOG2, MVT::f16, Promote);

	setOperationAction(ISD::FROUND, MVT::f16, Legal);			setOperationAction(ISD::FROUND, MVT::f16, Legal);
	}			}

	if (Subtarget->hasNEON()) {			if (Subtarget->hasNEON()) {
	// vmin and vmax aren't available in a scalar form, so we can use			// vmin and vmax aren't available in a scalar form, so we can use
	// a NEON instruction with an undef lane instead. This has a performance			// a NEON instruction with an undef lane instead.
	// penalty on some cores, so we don't do this unless we have been			setOperationAction(ISD::FMINIMUM, MVT::f32, Legal);
	// asked to by the core tuning model.			setOperationAction(ISD::FMAXIMUM, MVT::f32, Legal);
	if (Subtarget->useNEONForSinglePrecisionFP()) {			setOperationAction(ISD::FMINIMUM, MVT::f16, Legal);
	setOperationAction(ISD::FMINIMUM, MVT::f32, Legal);			setOperationAction(ISD::FMAXIMUM, MVT::f16, Legal);
	setOperationAction(ISD::FMAXIMUM, MVT::f32, Legal);
	setOperationAction(ISD::FMINIMUM, MVT::f16, Legal);
	setOperationAction(ISD::FMAXIMUM, MVT::f16, Legal);
	}
	setOperationAction(ISD::FMINIMUM, MVT::v2f32, Legal);			setOperationAction(ISD::FMINIMUM, MVT::v2f32, Legal);
	setOperationAction(ISD::FMAXIMUM, MVT::v2f32, Legal);			setOperationAction(ISD::FMAXIMUM, MVT::v2f32, Legal);
	setOperationAction(ISD::FMINIMUM, MVT::v4f32, Legal);			setOperationAction(ISD::FMINIMUM, MVT::v4f32, Legal);
	setOperationAction(ISD::FMAXIMUM, MVT::v4f32, Legal);			setOperationAction(ISD::FMAXIMUM, MVT::v4f32, Legal);

	if (Subtarget->hasFullFP16()) {			if (Subtarget->hasFullFP16()) {
	setOperationAction(ISD::FMINNUM, MVT::v4f16, Legal);			setOperationAction(ISD::FMINNUM, MVT::v4f16, Legal);
	setOperationAction(ISD::FMAXNUM, MVT::v4f16, Legal);			setOperationAction(ISD::FMAXNUM, MVT::v4f16, Legal);

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

	ISD::STRICT_FROUND, ISD::STRICT_FROUNDEVEN,			ISD::STRICT_FROUND, ISD::STRICT_FROUNDEVEN,
	ISD::STRICT_FTRUNC},			ISD::STRICT_FTRUNC},
	MVT::f16, Promote);			MVT::f16, Promote);

	// We need to custom promote this.			// We need to custom promote this.
	if (Subtarget.is64Bit())			if (Subtarget.is64Bit())
	setOperationAction(ISD::FPOWI, MVT::i32, Custom);			setOperationAction(ISD::FPOWI, MVT::i32, Custom);

	if (!Subtarget.hasStdExtZfa())			setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f16,
	setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f16, Custom);			Subtarget.hasStdExtZfa() ? Legal : Custom);
	}			}

	if (Subtarget.hasStdExtFOrZfinx()) {			if (Subtarget.hasStdExtFOrZfinx()) {
	setOperationAction(FPLegalNodeTypes, MVT::f32, Legal);			setOperationAction(FPLegalNodeTypes, MVT::f32, Legal);
	setOperationAction(FPRndMode, MVT::f32,			setOperationAction(FPRndMode, MVT::f32,
	Subtarget.hasStdExtZfa() ? Legal : Custom);			Subtarget.hasStdExtZfa() ? Legal : Custom);
	setCondCodeAction(FPCCToExpand, MVT::f32, Expand);			setCondCodeAction(FPCCToExpand, MVT::f32, Expand);
	setOperationAction(ISD::SELECT_CC, MVT::f32, Expand);			setOperationAction(ISD::SELECT_CC, MVT::f32, Expand);
	setOperationAction(ISD::SELECT, MVT::f32, Custom);			setOperationAction(ISD::SELECT, MVT::f32, Custom);
	setOperationAction(ISD::BR_CC, MVT::f32, Expand);			setOperationAction(ISD::BR_CC, MVT::f32, Expand);
	setOperationAction(FPOpToExpand, MVT::f32, Expand);			setOperationAction(FPOpToExpand, MVT::f32, Expand);
	setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::f16, Expand);			setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::f16, Expand);
	setTruncStoreAction(MVT::f32, MVT::f16, Expand);			setTruncStoreAction(MVT::f32, MVT::f16, Expand);
	setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::bf16, Expand);			setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::bf16, Expand);
	setTruncStoreAction(MVT::f32, MVT::bf16, Expand);			setTruncStoreAction(MVT::f32, MVT::bf16, Expand);
	setOperationAction(ISD::IS_FPCLASS, MVT::f32, Custom);			setOperationAction(ISD::IS_FPCLASS, MVT::f32, Custom);
	setOperationAction(ISD::BF16_TO_FP, MVT::f32, Custom);			setOperationAction(ISD::BF16_TO_FP, MVT::f32, Custom);
	setOperationAction(ISD::FP_TO_BF16, MVT::f32,			setOperationAction(ISD::FP_TO_BF16, MVT::f32,
	Subtarget.isSoftFPABI() ? LibCall : Custom);			Subtarget.isSoftFPABI() ? LibCall : Custom);
	setOperationAction(ISD::FP_TO_FP16, MVT::f32, Custom);			setOperationAction(ISD::FP_TO_FP16, MVT::f32, Custom);
	setOperationAction(ISD::FP16_TO_FP, MVT::f32, Custom);			setOperationAction(ISD::FP16_TO_FP, MVT::f32, Custom);

	if (Subtarget.hasStdExtZfa())			if (Subtarget.hasStdExtZfa()) {
	setOperationAction(ISD::FNEARBYINT, MVT::f32, Legal);			setOperationAction(ISD::FNEARBYINT, MVT::f32, Legal);
	else			setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f32, Legal);
				} else
	setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f32, Custom);			setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f32, Custom);
	}			}

	if (Subtarget.hasStdExtFOrZfinx() && Subtarget.is64Bit())			if (Subtarget.hasStdExtFOrZfinx() && Subtarget.is64Bit())
	setOperationAction(ISD::BITCAST, MVT::i32, Custom);			setOperationAction(ISD::BITCAST, MVT::i32, Custom);

	if (Subtarget.hasStdExtDOrZdinx()) {			if (Subtarget.hasStdExtDOrZdinx()) {
	setOperationAction(FPLegalNodeTypes, MVT::f64, Legal);			setOperationAction(FPLegalNodeTypes, MVT::f64, Legal);

	if (Subtarget.hasStdExtZfa()) {			if (Subtarget.hasStdExtZfa()) {
	setOperationAction(FPRndMode, MVT::f64, Legal);			setOperationAction(FPRndMode, MVT::f64, Legal);
	setOperationAction(ISD::FNEARBYINT, MVT::f64, Legal);			setOperationAction(ISD::FNEARBYINT, MVT::f64, Legal);
	setOperationAction(ISD::BITCAST, MVT::i64, Custom);			setOperationAction(ISD::BITCAST, MVT::i64, Custom);
	setOperationAction(ISD::BITCAST, MVT::f64, Custom);			setOperationAction(ISD::BITCAST, MVT::f64, Custom);
				setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f64, Legal);
	} else {			} else {
	if (Subtarget.is64Bit())			if (Subtarget.is64Bit())
	setOperationAction(FPRndMode, MVT::f64, Custom);			setOperationAction(FPRndMode, MVT::f64, Custom);

	setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f64, Custom);			setOperationAction({ISD::FMAXIMUM, ISD::FMINIMUM}, MVT::f64, Custom);
	}			}

	setOperationAction(ISD::STRICT_FP_ROUND, MVT::f32, Legal);			setOperationAction(ISD::STRICT_FP_ROUND, MVT::f32, Legal);

llvm/test/CodeGen/ARM/minnum-maxnum-intrinsics.ll

; ARMV8M-NEXT: bx lr		; ARMV8M-NEXT: bx lr
%a = call nnan float @llvm.minnum.f32(float %x, float %y)		%a = call nnan float @llvm.minnum.f32(float %x, float %y)
ret float %a		ret float %a
}		}

define float @fminnum32_nsz_intrinsic(float %x, float %y) {		define float @fminnum32_nsz_intrinsic(float %x, float %y) {
; ARMV7-LABEL: fminnum32_nsz_intrinsic:		; ARMV7-LABEL: fminnum32_nsz_intrinsic:
; ARMV7: @ %bb.0:		; ARMV7: @ %bb.0:
; ARMV7-NEXT: vmov s0, r0		; ARMV7-NEXT: vmov s0, r1
; ARMV7-NEXT: vmov s2, r1		; ARMV7-NEXT: vmov s2, r0
; ARMV7-NEXT: vcmp.f32 s0, s2		; ARMV7-NEXT: vmin.f32 d0, d1, d0
; ARMV7-NEXT: vmrs APSR_nzcv, fpscr		; ARMV7-NEXT: vmov r0, s0
; ARMV7-NEXT: vmovlt.f32 s2, s0
; ARMV7-NEXT: vmov r0, s2
; ARMV7-NEXT: bx lr		; ARMV7-NEXT: bx lr
;		;
; ARMV8-LABEL: fminnum32_nsz_intrinsic:		; ARMV8-LABEL: fminnum32_nsz_intrinsic:
; ARMV8: @ %bb.0:		; ARMV8: @ %bb.0:
; ARMV8-NEXT: vmov s0, r1		; ARMV8-NEXT: vmov s0, r1
; ARMV8-NEXT: vmov s2, r0		; ARMV8-NEXT: vmov s2, r0
; ARMV8-NEXT: vminnm.f32 s0, s2, s0		; ARMV8-NEXT: vminnm.f32 s0, s2, s0
; ARMV8-NEXT: vmov r0, s0		; ARMV8-NEXT: vmov r0, s0
...	; ARMV8M-NEXT: bx lr
ret float %a		ret float %a
}		}

define float @fminnum32_non_zero_intrinsic(float %x) {		define float @fminnum32_non_zero_intrinsic(float %x) {
; ARMV7-LABEL: fminnum32_non_zero_intrinsic:		; ARMV7-LABEL: fminnum32_non_zero_intrinsic:
; ARMV7: @ %bb.0:		; ARMV7: @ %bb.0:
; ARMV7-NEXT: vmov.f32 s0, #-1.000000e+00		; ARMV7-NEXT: vmov.f32 s0, #-1.000000e+00
; ARMV7-NEXT: vmov s2, r0		; ARMV7-NEXT: vmov s2, r0
; ARMV7-NEXT: vcmp.f32 s2, s0		; ARMV7-NEXT: vmin.f32 d0, d1, d0
; ARMV7-NEXT: vmrs APSR_nzcv, fpscr
; ARMV7-NEXT: vmovlt.f32 s0, s2
; ARMV7-NEXT: vmov r0, s0		; ARMV7-NEXT: vmov r0, s0
; ARMV7-NEXT: bx lr		; ARMV7-NEXT: bx lr
;		;
; ARMV8-LABEL: fminnum32_non_zero_intrinsic:		; ARMV8-LABEL: fminnum32_non_zero_intrinsic:
; ARMV8: @ %bb.0:		; ARMV8: @ %bb.0:
; ARMV8-NEXT: vmov.f32 s0, #-1.000000e+00		; ARMV8-NEXT: vmov.f32 s0, #-1.000000e+00
; ARMV8-NEXT: vmov s2, r0		; ARMV8-NEXT: vmov s2, r0
; ARMV8-NEXT: vminnm.f32 s0, s2, s0		; ARMV8-NEXT: vminnm.f32 s0, s2, s0
...
; ARMV8M-NEXT: bx lr		; ARMV8M-NEXT: bx lr
%a = call nnan float @llvm.maxnum.f32(float %x, float %y)		%a = call nnan float @llvm.maxnum.f32(float %x, float %y)
ret float %a		ret float %a
}		}

define float @fmaxnum32_nsz_intrinsic(float %x, float %y) {		define float @fmaxnum32_nsz_intrinsic(float %x, float %y) {
; ARMV7-LABEL: fmaxnum32_nsz_intrinsic:		; ARMV7-LABEL: fmaxnum32_nsz_intrinsic:
; ARMV7: @ %bb.0:		; ARMV7: @ %bb.0:
; ARMV7-NEXT: vmov s0, r0		; ARMV7-NEXT: vmov s0, r1
; ARMV7-NEXT: vmov s2, r1		; ARMV7-NEXT: vmov s2, r0
; ARMV7-NEXT: vcmp.f32 s0, s2		; ARMV7-NEXT: vmax.f32 d0, d1, d0
; ARMV7-NEXT: vmrs APSR_nzcv, fpscr		; ARMV7-NEXT: vmov r0, s0
; ARMV7-NEXT: vmovgt.f32 s2, s0
; ARMV7-NEXT: vmov r0, s2
; ARMV7-NEXT: bx lr		; ARMV7-NEXT: bx lr
;		;
; ARMV8-LABEL: fmaxnum32_nsz_intrinsic:		; ARMV8-LABEL: fmaxnum32_nsz_intrinsic:
; ARMV8: @ %bb.0:		; ARMV8: @ %bb.0:
; ARMV8-NEXT: vmov s0, r1		; ARMV8-NEXT: vmov s0, r1
; ARMV8-NEXT: vmov s2, r0		; ARMV8-NEXT: vmov s2, r0
; ARMV8-NEXT: vmaxnm.f32 s0, s2, s0		; ARMV8-NEXT: vmaxnm.f32 s0, s2, s0
; ARMV8-NEXT: vmov r0, s0		; ARMV8-NEXT: vmov r0, s0
...	; ARMV8M-NEXT: .long 0x00000000 @ float 0
ret float %a		ret float %a
}		}

define float @fmaxnum32_non_zero_intrinsic(float %x) {		define float @fmaxnum32_non_zero_intrinsic(float %x) {
; ARMV7-LABEL: fmaxnum32_non_zero_intrinsic:		; ARMV7-LABEL: fmaxnum32_non_zero_intrinsic:
; ARMV7: @ %bb.0:		; ARMV7: @ %bb.0:
; ARMV7-NEXT: vmov.f32 s0, #1.000000e+00		; ARMV7-NEXT: vmov.f32 s0, #1.000000e+00
; ARMV7-NEXT: vmov s2, r0		; ARMV7-NEXT: vmov s2, r0
; ARMV7-NEXT: vcmp.f32 s2, s0		; ARMV7-NEXT: vmax.f32 d0, d1, d0
; ARMV7-NEXT: vmrs APSR_nzcv, fpscr
; ARMV7-NEXT: vmovgt.f32 s0, s2
; ARMV7-NEXT: vmov r0, s0		; ARMV7-NEXT: vmov r0, s0
; ARMV7-NEXT: bx lr		; ARMV7-NEXT: bx lr
;		;
; ARMV8-LABEL: fmaxnum32_non_zero_intrinsic:		; ARMV8-LABEL: fmaxnum32_non_zero_intrinsic:
; ARMV8: @ %bb.0:		; ARMV8: @ %bb.0:
; ARMV8-NEXT: vmov.f32 s0, #1.000000e+00		; ARMV8-NEXT: vmov.f32 s0, #1.000000e+00
; ARMV8-NEXT: vmov s2, r0		; ARMV8-NEXT: vmov s2, r0
; ARMV8-NEXT: vmaxnm.f32 s0, s2, s0		; ARMV8-NEXT: vmaxnm.f32 s0, s2, s0

llvm/test/CodeGen/PowerPC/fminimum-fmaximum-f128.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
				; RUN: llc -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 < %s | FileCheck %s

				define fp128 @f128_minimum(fp128 %a, fp128 %b) {
				; CHECK-LABEL: f128_minimum:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscmpuqp 0, 2, 3
				; CHECK-NEXT: vmr 4, 2
				; CHECK-NEXT: bge 0, .LBB0_8
				; CHECK-NEXT: # %bb.1: # %entry
				; CHECK-NEXT: bun 0, .LBB0_9
				; CHECK-NEXT: .LBB0_2: # %entry
				; CHECK-NEXT: xststdcqp 0, 2, 4
				; CHECK-NEXT: bc 4, 2, .LBB0_10
				; CHECK-NEXT: .LBB0_3: # %entry
				; CHECK-NEXT: xststdcqp 0, 3, 4
				; CHECK-NEXT: bc 12, 2, .LBB0_5
				; CHECK-NEXT: .LBB0_4: # %entry
				; CHECK-NEXT: vmr 3, 2
				; CHECK-NEXT: .LBB0_5: # %entry
				; CHECK-NEXT: addis 3, 2, .LCPI0_1@toc@ha
				; CHECK-NEXT: addi 3, 3, .LCPI0_1@toc@l
				; CHECK-NEXT: lxv 34, 0(3)
				; CHECK-NEXT: xscmpuqp 0, 4, 2
				; CHECK-NEXT: beq 0, .LBB0_7
				; CHECK-NEXT: # %bb.6: # %entry
				; CHECK-NEXT: vmr 3, 4
				; CHECK-NEXT: .LBB0_7: # %entry
				; CHECK-NEXT: vmr 2, 3
				; CHECK-NEXT: blr
				; CHECK-NEXT: .LBB0_8: # %entry
				; CHECK-NEXT: vmr 4, 3
				; CHECK-NEXT: bnu 0, .LBB0_2
				; CHECK-NEXT: .LBB0_9:
				; CHECK-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-NEXT: addi 3, 3, .LCPI0_0@toc@l
				; CHECK-NEXT: lxv 36, 0(3)
				; CHECK-NEXT: xststdcqp 0, 2, 4
				; CHECK-NEXT: bc 12, 2, .LBB0_3
				; CHECK-NEXT: .LBB0_10: # %entry
				; CHECK-NEXT: vmr 2, 4
				; CHECK-NEXT: xststdcqp 0, 3, 4
				; CHECK-NEXT: bc 4, 2, .LBB0_4
				; CHECK-NEXT: b .LBB0_5
				entry:
				%m = call fp128 @llvm.minimum.f128(fp128 %a, fp128 %b)
				ret fp128 %m
				}

				define fp128 @f128_maximum(fp128 %a, fp128 %b) {
				; CHECK-LABEL: f128_maximum:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: xscmpuqp 0, 2, 3
				; CHECK-NEXT: vmr 4, 2
				; CHECK-NEXT: ble 0, .LBB1_8
				; CHECK-NEXT: # %bb.1: # %entry
				; CHECK-NEXT: bun 0, .LBB1_9
				; CHECK-NEXT: .LBB1_2: # %entry
				; CHECK-NEXT: xststdcqp 0, 2, 8
				; CHECK-NEXT: bc 4, 2, .LBB1_10
				; CHECK-NEXT: .LBB1_3: # %entry
				; CHECK-NEXT: xststdcqp 0, 3, 8
				; CHECK-NEXT: bc 12, 2, .LBB1_5
				; CHECK-NEXT: .LBB1_4: # %entry
				; CHECK-NEXT: vmr 3, 2
				; CHECK-NEXT: .LBB1_5: # %entry
				; CHECK-NEXT: addis 3, 2, .LCPI1_1@toc@ha
				; CHECK-NEXT: addi 3, 3, .LCPI1_1@toc@l
				; CHECK-NEXT: lxv 34, 0(3)
				; CHECK-NEXT: xscmpuqp 0, 4, 2
				; CHECK-NEXT: beq 0, .LBB1_7
				; CHECK-NEXT: # %bb.6: # %entry
				; CHECK-NEXT: vmr 3, 4
				; CHECK-NEXT: .LBB1_7: # %entry
				; CHECK-NEXT: vmr 2, 3
				; CHECK-NEXT: blr
				; CHECK-NEXT: .LBB1_8: # %entry
				; CHECK-NEXT: vmr 4, 3
				; CHECK-NEXT: bnu 0, .LBB1_2
				; CHECK-NEXT: .LBB1_9:
				; CHECK-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-NEXT: addi 3, 3, .LCPI1_0@toc@l
				; CHECK-NEXT: lxv 36, 0(3)
				; CHECK-NEXT: xststdcqp 0, 2, 8
				; CHECK-NEXT: bc 12, 2, .LBB1_3
				; CHECK-NEXT: .LBB1_10: # %entry
				; CHECK-NEXT: vmr 2, 4
				; CHECK-NEXT: xststdcqp 0, 3, 8
				; CHECK-NEXT: bc 4, 2, .LBB1_4
				; CHECK-NEXT: b .LBB1_5
				entry:
				%m = call fp128 @llvm.maximum.f128(fp128 %a, fp128 %b)
				ret fp128 %m
				}

				declare fp128 @llvm.minimum.f128(fp128, fp128)
				declare fp128 @llvm.maximum.f128(fp128, fp128)

llvm/test/CodeGen/PowerPC/fminimum-fmaximum.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
				; RUN: llc -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 -mattr=-vsx < %s | FileCheck %s --check-prefix=NOVSX
				; RUN: llc -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 < %s | FileCheck %s --check-prefix=VSX
				; RUN: llc -mtriple=powerpc64-ibm-aix -mcpu=pwr8 < %s | FileCheck %s --check-prefix=AIX

				define float @f32_minimum(float %a, float %b) {
				; NOVSX-LABEL: f32_minimum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 2
				; NOVSX-NEXT: fmr 0, 1
				; NOVSX-NEXT: stfs 2, -8(1)
				; NOVSX-NEXT: stfs 1, -4(1)
				; NOVSX-NEXT: bc 12, 0, .LBB0_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 0, 2
				; NOVSX-NEXT: .LBB0_2: # %entry
				; NOVSX-NEXT: lwz 3, -4(1)
				; NOVSX-NEXT: bc 4, 3, .LBB0_4
				; NOVSX-NEXT: # %bb.3:
				; NOVSX-NEXT: addis 4, 2, .LCPI0_0@toc@ha
				; NOVSX-NEXT: lfs 0, .LCPI0_0@toc@l(4)
				; NOVSX-NEXT: .LBB0_4: # %entry
				; NOVSX-NEXT: xoris 3, 3, 32768
				; NOVSX-NEXT: cmplwi 3, 0
				; NOVSX-NEXT: lwz 3, -8(1)
				; NOVSX-NEXT: bc 12, 2, .LBB0_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 0
				; NOVSX-NEXT: .LBB0_6: # %entry
				; NOVSX-NEXT: xoris 3, 3, 32768
				; NOVSX-NEXT: cmplwi 3, 0
				; NOVSX-NEXT: bc 12, 2, .LBB0_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 2, 1
				; NOVSX-NEXT: .LBB0_8: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI0_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI0_1@toc@l(3)
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB0_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: .LBB0_10: # %entry
				; NOVSX-NEXT: fmr 1, 2
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: f32_minimum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: fcmpu 0, 1, 2
				; VSX-NEXT: xscvdpspn 0, 1
				; VSX-NEXT: xscvdpspn 3, 2
				; VSX-NEXT: mffprwz 3, 0
				; VSX-NEXT: bc 12, 3, .LBB0_2
				; VSX-NEXT: # %bb.1: # %entry
				; VSX-NEXT: xsmindp 0, 1, 2
				; VSX-NEXT: b .LBB0_3
				; VSX-NEXT: .LBB0_2:
				; VSX-NEXT: addis 4, 2, .LCPI0_0@toc@ha
				; VSX-NEXT: lfs 0, .LCPI0_0@toc@l(4)
				; VSX-NEXT: .LBB0_3: # %entry
				; VSX-NEXT: xoris 3, 3, 32768
				; VSX-NEXT: cmplwi 3, 0
				; VSX-NEXT: mffprwz 3, 3
				; VSX-NEXT: bc 12, 2, .LBB0_5
				; VSX-NEXT: # %bb.4: # %entry
				; VSX-NEXT: fmr 1, 0
				; VSX-NEXT: .LBB0_5: # %entry
				; VSX-NEXT: xoris 3, 3, 32768
				; VSX-NEXT: cmplwi 3, 0
				; VSX-NEXT: bc 12, 2, .LBB0_7
				; VSX-NEXT: # %bb.6: # %entry
				; VSX-NEXT: fmr 2, 1
				; VSX-NEXT: .LBB0_7: # %entry
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: fcmpu 0, 0, 1
				; VSX-NEXT: bc 12, 2, .LBB0_9
				; VSX-NEXT: # %bb.8: # %entry
				; VSX-NEXT: fmr 2, 0
				; VSX-NEXT: .LBB0_9: # %entry
				; VSX-NEXT: fmr 1, 2
				; VSX-NEXT: blr
				;
				; AIX-LABEL: f32_minimum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: fcmpu 0, 1, 2
				; AIX-NEXT: xscvdpspn 0, 1
				; AIX-NEXT: xscvdpspn 3, 2
				; AIX-NEXT: mffprwz 3, 0
				; AIX-NEXT: bc 12, 3, L..BB0_2
				; AIX-NEXT: # %bb.1: # %entry
				; AIX-NEXT: xsmindp 0, 1, 2
				; AIX-NEXT: b L..BB0_3
				; AIX-NEXT: L..BB0_2:
				; AIX-NEXT: ld 4, L..C0(2) # %const.0
				; AIX-NEXT: lfs 0, 0(4)
				; AIX-NEXT: L..BB0_3: # %entry
				; AIX-NEXT: xoris 3, 3, 32768
				; AIX-NEXT: cmplwi 3, 0
				; AIX-NEXT: mffprwz 3, 3
				; AIX-NEXT: bc 12, 2, L..BB0_5
				; AIX-NEXT: # %bb.4: # %entry
				; AIX-NEXT: fmr 1, 0
				; AIX-NEXT: L..BB0_5: # %entry
				; AIX-NEXT: xoris 3, 3, 32768
				; AIX-NEXT: cmplwi 3, 0
				; AIX-NEXT: bc 12, 2, L..BB0_7
				; AIX-NEXT: # %bb.6: # %entry
				; AIX-NEXT: fmr 2, 1
				; AIX-NEXT: L..BB0_7: # %entry
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: fcmpu 0, 0, 1
				; AIX-NEXT: bc 12, 2, L..BB0_9
				; AIX-NEXT: # %bb.8: # %entry
				; AIX-NEXT: fmr 2, 0
				; AIX-NEXT: L..BB0_9: # %entry
				; AIX-NEXT: fmr 1, 2
				; AIX-NEXT: blr
				entry:
				%m = call float @llvm.minimum.f32(float %a, float %b)
				ret float %m
				}

				define float @f32_maximum(float %a, float %b) {
				; NOVSX-LABEL: f32_maximum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 2
				; NOVSX-NEXT: fmr 0, 1
				; NOVSX-NEXT: stfs 2, -8(1)
				; NOVSX-NEXT: stfs 1, -4(1)
				; NOVSX-NEXT: bc 12, 1, .LBB1_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 0, 2
				; NOVSX-NEXT: .LBB1_2: # %entry
				; NOVSX-NEXT: lwz 3, -4(1)
				; NOVSX-NEXT: bc 4, 3, .LBB1_4
				; NOVSX-NEXT: # %bb.3:
				; NOVSX-NEXT: addis 4, 2, .LCPI1_0@toc@ha
				; NOVSX-NEXT: lfs 0, .LCPI1_0@toc@l(4)
				; NOVSX-NEXT: .LBB1_4: # %entry
				; NOVSX-NEXT: cmpwi 3, 0
				; NOVSX-NEXT: lwz 3, -8(1)
				; NOVSX-NEXT: bc 12, 2, .LBB1_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 0
				; NOVSX-NEXT: .LBB1_6: # %entry
				; NOVSX-NEXT: cmpwi 3, 0
				; NOVSX-NEXT: bc 12, 2, .LBB1_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 2, 1
				; NOVSX-NEXT: .LBB1_8: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI1_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI1_1@toc@l(3)
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB1_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: .LBB1_10: # %entry
				; NOVSX-NEXT: fmr 1, 2
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: f32_maximum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: fcmpu 0, 1, 2
				; VSX-NEXT: xscvdpspn 0, 1
				; VSX-NEXT: xscvdpspn 3, 2
				; VSX-NEXT: mffprwz 3, 0
				; VSX-NEXT: bc 12, 3, .LBB1_2
				; VSX-NEXT: # %bb.1: # %entry
				; VSX-NEXT: xsmaxdp 0, 1, 2
				; VSX-NEXT: b .LBB1_3
				; VSX-NEXT: .LBB1_2:
				; VSX-NEXT: addis 4, 2, .LCPI1_0@toc@ha
				; VSX-NEXT: lfs 0, .LCPI1_0@toc@l(4)
				; VSX-NEXT: .LBB1_3: # %entry
				; VSX-NEXT: cmpwi 3, 0
				; VSX-NEXT: mffprwz 3, 3
				; VSX-NEXT: bc 12, 2, .LBB1_5
				; VSX-NEXT: # %bb.4: # %entry
				; VSX-NEXT: fmr 1, 0
				; VSX-NEXT: .LBB1_5: # %entry
				; VSX-NEXT: cmpwi 3, 0
				; VSX-NEXT: bc 12, 2, .LBB1_7
				; VSX-NEXT: # %bb.6: # %entry
				; VSX-NEXT: fmr 2, 1
				; VSX-NEXT: .LBB1_7: # %entry
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: fcmpu 0, 0, 1
				; VSX-NEXT: bc 12, 2, .LBB1_9
				; VSX-NEXT: # %bb.8: # %entry
				; VSX-NEXT: fmr 2, 0
				; VSX-NEXT: .LBB1_9: # %entry
				; VSX-NEXT: fmr 1, 2
				; VSX-NEXT: blr
				;
				; AIX-LABEL: f32_maximum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: fcmpu 0, 1, 2
				; AIX-NEXT: xscvdpspn 0, 1
				; AIX-NEXT: xscvdpspn 3, 2
				; AIX-NEXT: mffprwz 3, 0
				; AIX-NEXT: bc 12, 3, L..BB1_2
				; AIX-NEXT: # %bb.1: # %entry
				; AIX-NEXT: xsmaxdp 0, 1, 2
				; AIX-NEXT: b L..BB1_3
				; AIX-NEXT: L..BB1_2:
				; AIX-NEXT: ld 4, L..C1(2) # %const.0
				; AIX-NEXT: lfs 0, 0(4)
				; AIX-NEXT: L..BB1_3: # %entry
				; AIX-NEXT: cmpwi 3, 0
				; AIX-NEXT: mffprwz 3, 3
				; AIX-NEXT: bc 12, 2, L..BB1_5
				; AIX-NEXT: # %bb.4: # %entry
				; AIX-NEXT: fmr 1, 0
				; AIX-NEXT: L..BB1_5: # %entry
				; AIX-NEXT: cmpwi 3, 0
				; AIX-NEXT: bc 12, 2, L..BB1_7
				; AIX-NEXT: # %bb.6: # %entry
				; AIX-NEXT: fmr 2, 1
				; AIX-NEXT: L..BB1_7: # %entry
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: fcmpu 0, 0, 1
				; AIX-NEXT: bc 12, 2, L..BB1_9
				; AIX-NEXT: # %bb.8: # %entry
				; AIX-NEXT: fmr 2, 0
				; AIX-NEXT: L..BB1_9: # %entry
				; AIX-NEXT: fmr 1, 2
				; AIX-NEXT: blr
				entry:
				%m = call float @llvm.maximum.f32(float %a, float %b)
				ret float %m
				}

				define double @f64_minimum(double %a, double %b) {
				; NOVSX-LABEL: f64_minimum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 2
				; NOVSX-NEXT: fmr 0, 1
				; NOVSX-NEXT: stfd 2, -16(1)
				; NOVSX-NEXT: stfd 1, -8(1)
				; NOVSX-NEXT: bc 12, 0, .LBB2_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 0, 2
				; NOVSX-NEXT: .LBB2_2: # %entry
				; NOVSX-NEXT: ld 3, -8(1)
				; NOVSX-NEXT: bc 4, 3, .LBB2_4
				; NOVSX-NEXT: # %bb.3:
				; NOVSX-NEXT: addis 4, 2, .LCPI2_0@toc@ha
				; NOVSX-NEXT: lfs 0, .LCPI2_0@toc@l(4)
				; NOVSX-NEXT: .LBB2_4: # %entry
				; NOVSX-NEXT: li 4, 1
				; NOVSX-NEXT: rldic 4, 4, 63, 0
				; NOVSX-NEXT: cmpd 3, 4
				; NOVSX-NEXT: ld 3, -16(1)
				; NOVSX-NEXT: bc 12, 2, .LBB2_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 0
				; NOVSX-NEXT: .LBB2_6: # %entry
				; NOVSX-NEXT: cmpd 3, 4
				; NOVSX-NEXT: bc 12, 2, .LBB2_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 2, 1
				; NOVSX-NEXT: .LBB2_8: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI2_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI2_1@toc@l(3)
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB2_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: .LBB2_10: # %entry
				; NOVSX-NEXT: fmr 1, 2
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: f64_minimum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: fcmpu 0, 1, 2
				; VSX-NEXT: mffprd 3, 1
				; VSX-NEXT: bc 12, 3, .LBB2_2
				; VSX-NEXT: # %bb.1: # %entry
				; VSX-NEXT: xsmindp 0, 1, 2
				; VSX-NEXT: b .LBB2_3
				; VSX-NEXT: .LBB2_2:
				; VSX-NEXT: addis 4, 2, .LCPI2_0@toc@ha
				; VSX-NEXT: lfs 0, .LCPI2_0@toc@l(4)
				; VSX-NEXT: .LBB2_3: # %entry
				; VSX-NEXT: li 4, 1
				; VSX-NEXT: rldic 4, 4, 63, 0
				; VSX-NEXT: cmpd 3, 4
				; VSX-NEXT: mffprd 3, 2
				; VSX-NEXT: bc 12, 2, .LBB2_5
				; VSX-NEXT: # %bb.4: # %entry
				; VSX-NEXT: fmr 1, 0
				; VSX-NEXT: .LBB2_5: # %entry
				; VSX-NEXT: cmpd 3, 4
				; VSX-NEXT: bc 12, 2, .LBB2_7
				; VSX-NEXT: # %bb.6: # %entry
				; VSX-NEXT: fmr 2, 1
				; VSX-NEXT: .LBB2_7: # %entry
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: fcmpu 0, 0, 1
				; VSX-NEXT: bc 12, 2, .LBB2_9
				; VSX-NEXT: # %bb.8: # %entry
				; VSX-NEXT: fmr 2, 0
				; VSX-NEXT: .LBB2_9: # %entry
				; VSX-NEXT: fmr 1, 2
				; VSX-NEXT: blr
				;
				; AIX-LABEL: f64_minimum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: fcmpu 0, 1, 2
				; AIX-NEXT: mffprd 3, 1
				; AIX-NEXT: bc 12, 3, L..BB2_2
				; AIX-NEXT: # %bb.1: # %entry
				; AIX-NEXT: xsmindp 0, 1, 2
				; AIX-NEXT: b L..BB2_3
				; AIX-NEXT: L..BB2_2:
				; AIX-NEXT: ld 4, L..C2(2) # %const.0
				; AIX-NEXT: lfs 0, 0(4)
				; AIX-NEXT: L..BB2_3: # %entry
				; AIX-NEXT: li 4, 1
				; AIX-NEXT: rldic 4, 4, 63, 0
				; AIX-NEXT: cmpd 3, 4
				; AIX-NEXT: mffprd 3, 2
				; AIX-NEXT: bc 12, 2, L..BB2_5
				; AIX-NEXT: # %bb.4: # %entry
				; AIX-NEXT: fmr 1, 0
				; AIX-NEXT: L..BB2_5: # %entry
				; AIX-NEXT: cmpd 3, 4
				; AIX-NEXT: bc 12, 2, L..BB2_7
				; AIX-NEXT: # %bb.6: # %entry
				; AIX-NEXT: fmr 2, 1
				; AIX-NEXT: L..BB2_7: # %entry
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: fcmpu 0, 0, 1
				; AIX-NEXT: bc 12, 2, L..BB2_9
				; AIX-NEXT: # %bb.8: # %entry
				; AIX-NEXT: fmr 2, 0
				; AIX-NEXT: L..BB2_9: # %entry
				; AIX-NEXT: fmr 1, 2
				; AIX-NEXT: blr
				entry:
				%m = call double @llvm.minimum.f64(double %a, double %b)
				ret double %m
				}

				define double @f64_maximum(double %a, double %b) {
				; NOVSX-LABEL: f64_maximum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 2
				; NOVSX-NEXT: fmr 0, 1
				; NOVSX-NEXT: stfd 2, -16(1)
				; NOVSX-NEXT: stfd 1, -8(1)
				; NOVSX-NEXT: bc 12, 1, .LBB3_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 0, 2
				; NOVSX-NEXT: .LBB3_2: # %entry
				; NOVSX-NEXT: ld 3, -8(1)
				; NOVSX-NEXT: bc 4, 3, .LBB3_4
				; NOVSX-NEXT: # %bb.3:
				; NOVSX-NEXT: addis 4, 2, .LCPI3_0@toc@ha
				; NOVSX-NEXT: lfs 0, .LCPI3_0@toc@l(4)
				; NOVSX-NEXT: .LBB3_4: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: ld 3, -16(1)
				; NOVSX-NEXT: bc 12, 2, .LBB3_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 0
				; NOVSX-NEXT: .LBB3_6: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: bc 12, 2, .LBB3_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 2, 1
				; NOVSX-NEXT: .LBB3_8: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI3_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI3_1@toc@l(3)
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB3_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: .LBB3_10: # %entry
				; NOVSX-NEXT: fmr 1, 2
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: f64_maximum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: fcmpu 0, 1, 2
				; VSX-NEXT: mffprd 3, 1
				; VSX-NEXT: bc 12, 3, .LBB3_2
				; VSX-NEXT: # %bb.1: # %entry
				; VSX-NEXT: xsmaxdp 0, 1, 2
				; VSX-NEXT: b .LBB3_3
				; VSX-NEXT: .LBB3_2:
				; VSX-NEXT: addis 4, 2, .LCPI3_0@toc@ha
				; VSX-NEXT: lfs 0, .LCPI3_0@toc@l(4)
				; VSX-NEXT: .LBB3_3: # %entry
				; VSX-NEXT: cmpdi 3, 0
				; VSX-NEXT: mffprd 3, 2
				; VSX-NEXT: bc 12, 2, .LBB3_5
				; VSX-NEXT: # %bb.4: # %entry
				; VSX-NEXT: fmr 1, 0
				; VSX-NEXT: .LBB3_5: # %entry
				; VSX-NEXT: cmpdi 3, 0
				; VSX-NEXT: bc 12, 2, .LBB3_7
				; VSX-NEXT: # %bb.6: # %entry
				; VSX-NEXT: fmr 2, 1
				; VSX-NEXT: .LBB3_7: # %entry
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: fcmpu 0, 0, 1
				; VSX-NEXT: bc 12, 2, .LBB3_9
				; VSX-NEXT: # %bb.8: # %entry
				; VSX-NEXT: fmr 2, 0
				; VSX-NEXT: .LBB3_9: # %entry
				; VSX-NEXT: fmr 1, 2
				; VSX-NEXT: blr
				;
				; AIX-LABEL: f64_maximum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: fcmpu 0, 1, 2
				; AIX-NEXT: mffprd 3, 1
				; AIX-NEXT: bc 12, 3, L..BB3_2
				; AIX-NEXT: # %bb.1: # %entry
				; AIX-NEXT: xsmaxdp 0, 1, 2
				; AIX-NEXT: b L..BB3_3
				; AIX-NEXT: L..BB3_2:
				; AIX-NEXT: ld 4, L..C3(2) # %const.0
				; AIX-NEXT: lfs 0, 0(4)
				; AIX-NEXT: L..BB3_3: # %entry
				; AIX-NEXT: cmpdi 3, 0
				; AIX-NEXT: mffprd 3, 2
				; AIX-NEXT: bc 12, 2, L..BB3_5
				; AIX-NEXT: # %bb.4: # %entry
				; AIX-NEXT: fmr 1, 0
				; AIX-NEXT: L..BB3_5: # %entry
				; AIX-NEXT: cmpdi 3, 0
				; AIX-NEXT: bc 12, 2, L..BB3_7
				; AIX-NEXT: # %bb.6: # %entry
				; AIX-NEXT: fmr 2, 1
				; AIX-NEXT: L..BB3_7: # %entry
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: fcmpu 0, 0, 1
				; AIX-NEXT: bc 12, 2, L..BB3_9
				; AIX-NEXT: # %bb.8: # %entry
				; AIX-NEXT: fmr 2, 0
				; AIX-NEXT: L..BB3_9: # %entry
				; AIX-NEXT: fmr 1, 2
				; AIX-NEXT: blr
				entry:
				%m = call double @llvm.maximum.f64(double %a, double %b)
				ret double %m
				}

				define <4 x float> @v4f32_minimum(<4 x float> %a, <4 x float> %b) {
				; NOVSX-LABEL: v4f32_minimum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: vcmpeqfp 5, 3, 3
				; NOVSX-NEXT: vspltisb 4, -1
				; NOVSX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; NOVSX-NEXT: vcmpeqfp 0, 2, 2
				; NOVSX-NEXT: addi 3, 3, .LCPI4_0@toc@l
				; NOVSX-NEXT: vcmpgtfp 1, 3, 2
				; NOVSX-NEXT: vslw 4, 4, 4
				; NOVSX-NEXT: vnot 5, 5
				; NOVSX-NEXT: vnot 0, 0
				; NOVSX-NEXT: vsel 1, 3, 2, 1
				; NOVSX-NEXT: vor 5, 0, 5
				; NOVSX-NEXT: lvx 0, 0, 3
				; NOVSX-NEXT: vsel 5, 1, 0, 5
				; NOVSX-NEXT: vcmpequw 0, 2, 4
				; NOVSX-NEXT: vcmpequw 4, 3, 4
				; NOVSX-NEXT: vsel 2, 5, 2, 0
				; NOVSX-NEXT: vxor 0, 0, 0
				; NOVSX-NEXT: vsel 2, 2, 3, 4
				; NOVSX-NEXT: vcmpeqfp 3, 5, 0
				; NOVSX-NEXT: vsel 2, 5, 2, 3
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: v4f32_minimum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: xxleqv 36, 36, 36
				; VSX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; VSX-NEXT: xvcmpeqsp 0, 35, 35
				; VSX-NEXT: addi 3, 3, .LCPI4_0@toc@l
				; VSX-NEXT: xvcmpeqsp 1, 34, 34
				; VSX-NEXT: lxvd2x 3, 0, 3
				; VSX-NEXT: vslw 4, 4, 4
				; VSX-NEXT: xvminsp 2, 34, 35
				; VSX-NEXT: xxlnor 0, 0, 0
				; VSX-NEXT: xxlnor 1, 1, 1
				; VSX-NEXT: vcmpequw 5, 2, 4
				; VSX-NEXT: xxlor 0, 1, 0
				; VSX-NEXT: vcmpequw 4, 3, 4
				; VSX-NEXT: xxsel 0, 2, 3, 0
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: xxsel 2, 0, 34, 37
				; VSX-NEXT: xvcmpeqsp 1, 0, 1
				; VSX-NEXT: xxsel 2, 2, 35, 36
				; VSX-NEXT: xxsel 34, 0, 2, 1
				; VSX-NEXT: blr
				;
				; AIX-LABEL: v4f32_minimum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: xxleqv 36, 36, 36
				; AIX-NEXT: ld 3, L..C4(2) # %const.0
				; AIX-NEXT: xvcmpeqsp 0, 35, 35
				; AIX-NEXT: xvcmpeqsp 1, 34, 34
				; AIX-NEXT: vslw 4, 4, 4
				; AIX-NEXT: lxvw4x 3, 0, 3
				; AIX-NEXT: xvminsp 2, 34, 35
				; AIX-NEXT: xxlnor 0, 0, 0
				; AIX-NEXT: xxlnor 1, 1, 1
				; AIX-NEXT: vcmpequw 5, 2, 4
				; AIX-NEXT: xxlor 0, 1, 0
				; AIX-NEXT: vcmpequw 4, 3, 4
				; AIX-NEXT: xxsel 0, 2, 3, 0
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: xxsel 2, 0, 34, 37
				; AIX-NEXT: xvcmpeqsp 1, 0, 1
				; AIX-NEXT: xxsel 2, 2, 35, 36
				; AIX-NEXT: xxsel 34, 0, 2, 1
				; AIX-NEXT: blr
				entry:
				%m = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
				ret <4 x float> %m
				}

				define <4 x float> @v4f32_maximum(<4 x float> %a, <4 x float> %b) {
				; NOVSX-LABEL: v4f32_maximum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: vcmpeqfp 4, 3, 3
				; NOVSX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; NOVSX-NEXT: vcmpeqfp 5, 2, 2
				; NOVSX-NEXT: addi 3, 3, .LCPI5_0@toc@l
				; NOVSX-NEXT: vcmpgtfp 0, 2, 3
				; NOVSX-NEXT: lvx 1, 0, 3
				; NOVSX-NEXT: vnot 4, 4
				; NOVSX-NEXT: vnot 5, 5
				; NOVSX-NEXT: vsel 0, 3, 2, 0
				; NOVSX-NEXT: vor 4, 5, 4
				; NOVSX-NEXT: vxor 5, 5, 5
				; NOVSX-NEXT: vsel 4, 0, 1, 4
				; NOVSX-NEXT: vcmpequw 0, 2, 5
				; NOVSX-NEXT: vsel 2, 4, 2, 0
				; NOVSX-NEXT: vcmpequw 0, 3, 5
				; NOVSX-NEXT: vsel 2, 2, 3, 0
				; NOVSX-NEXT: vcmpeqfp 3, 4, 5
				; NOVSX-NEXT: vsel 2, 4, 2, 3
				; NOVSX-NEXT: blr
				;
				; VSX-LABEL: v4f32_maximum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: xvcmpeqsp 0, 35, 35
				; VSX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; VSX-NEXT: xvcmpeqsp 1, 34, 34
				; VSX-NEXT: addi 3, 3, .LCPI5_0@toc@l
				; VSX-NEXT: xvmaxsp 2, 34, 35
				; VSX-NEXT: lxvd2x 3, 0, 3
				; VSX-NEXT: xxlxor 36, 36, 36
				; VSX-NEXT: vcmpequw 5, 2, 4
				; VSX-NEXT: xxlnor 0, 0, 0
				; VSX-NEXT: xxlnor 1, 1, 1
				; VSX-NEXT: vcmpequw 0, 3, 4
				; VSX-NEXT: xxlor 0, 1, 0
				; VSX-NEXT: xxsel 0, 2, 3, 0
				; VSX-NEXT: xxsel 1, 0, 34, 37
				; VSX-NEXT: xvcmpeqsp 2, 0, 36
				; VSX-NEXT: xxsel 1, 1, 35, 32
				; VSX-NEXT: xxsel 34, 0, 1, 2
				; VSX-NEXT: blr
				;
				; AIX-LABEL: v4f32_maximum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: xvcmpeqsp 0, 35, 35
				; AIX-NEXT: ld 3, L..C5(2) # %const.0
				; AIX-NEXT: xvcmpeqsp 1, 34, 34
				; AIX-NEXT: xvmaxsp 2, 34, 35
				; AIX-NEXT: xxlxor 36, 36, 36
				; AIX-NEXT: lxvw4x 3, 0, 3
				; AIX-NEXT: vcmpequw 5, 2, 4
				; AIX-NEXT: xxlnor 0, 0, 0
				; AIX-NEXT: xxlnor 1, 1, 1
				; AIX-NEXT: vcmpequw 0, 3, 4
				; AIX-NEXT: xxlor 0, 1, 0
				; AIX-NEXT: xxsel 0, 2, 3, 0
				; AIX-NEXT: xxsel 1, 0, 34, 37
				; AIX-NEXT: xvcmpeqsp 2, 0, 36
				; AIX-NEXT: xxsel 1, 1, 35, 32
				; AIX-NEXT: xxsel 34, 0, 1, 2
				; AIX-NEXT: blr
				entry:
				%m = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
				ret <4 x float> %m
				}

				define <2 x double> @v2f64_minimum(<2 x double> %a, <2 x double> %b) {
				; NOVSX-LABEL: v2f64_minimum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 3
				; NOVSX-NEXT: fmr 6, 1
				; NOVSX-NEXT: stfd 4, -16(1)
				; NOVSX-NEXT: stfd 2, -8(1)
				; NOVSX-NEXT: stfd 3, -32(1)
				; NOVSX-NEXT: stfd 1, -24(1)
				; NOVSX-NEXT: bc 12, 0, .LBB6_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 6, 3
				; NOVSX-NEXT: .LBB6_2: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI6_0@toc@ha
				; NOVSX-NEXT: ld 4, -24(1)
				; NOVSX-NEXT: lfs 0, .LCPI6_0@toc@l(3)
				; NOVSX-NEXT: fmr 5, 0
				; NOVSX-NEXT: bc 12, 3, .LBB6_4
				; NOVSX-NEXT: # %bb.3: # %entry
				; NOVSX-NEXT: fmr 5, 6
				; NOVSX-NEXT: .LBB6_4: # %entry
				; NOVSX-NEXT: li 3, 1
				; NOVSX-NEXT: rldic 3, 3, 63, 0
				; NOVSX-NEXT: cmpd 4, 3
				; NOVSX-NEXT: ld 4, -32(1)
				; NOVSX-NEXT: bc 12, 2, .LBB6_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 5
				; NOVSX-NEXT: .LBB6_6: # %entry
				; NOVSX-NEXT: cmpd 4, 3
				; NOVSX-NEXT: bc 12, 2, .LBB6_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 3, 1
				; NOVSX-NEXT: .LBB6_8: # %entry
				; NOVSX-NEXT: addis 4, 2, .LCPI6_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI6_1@toc@l(4)
				; NOVSX-NEXT: fcmpu 0, 5, 1
				; NOVSX-NEXT: bc 12, 2, .LBB6_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 3, 5
				; NOVSX-NEXT: .LBB6_10: # %entry
				; NOVSX-NEXT: fcmpu 0, 2, 4
				; NOVSX-NEXT: fmr 5, 2
				; NOVSX-NEXT: bc 12, 0, .LBB6_12
				; NOVSX-NEXT: # %bb.11: # %entry
				; NOVSX-NEXT: fmr 5, 4
				; NOVSX-NEXT: .LBB6_12: # %entry
				; NOVSX-NEXT: ld 4, -8(1)
				; NOVSX-NEXT: bc 12, 3, .LBB6_14
				; NOVSX-NEXT: # %bb.13: # %entry
				; NOVSX-NEXT: fmr 0, 5
				; NOVSX-NEXT: .LBB6_14: # %entry
				; NOVSX-NEXT: cmpd 4, 3
				; NOVSX-NEXT: ld 4, -16(1)
				; NOVSX-NEXT: bc 4, 2, .LBB6_19
				; NOVSX-NEXT: # %bb.15: # %entry
				; NOVSX-NEXT: cmpd 4, 3
				; NOVSX-NEXT: bc 4, 2, .LBB6_20
				; NOVSX-NEXT: .LBB6_16: # %entry
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB6_18
				; NOVSX-NEXT: .LBB6_17: # %entry
				; NOVSX-NEXT: fmr 4, 0
				; NOVSX-NEXT: .LBB6_18: # %entry
				; NOVSX-NEXT: fmr 1, 3
				; NOVSX-NEXT: fmr 2, 4
				; NOVSX-NEXT: blr
				; NOVSX-NEXT: .LBB6_19: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: cmpd 4, 3
				; NOVSX-NEXT: bc 12, 2, .LBB6_16
				; NOVSX-NEXT: .LBB6_20: # %entry
				; NOVSX-NEXT: fmr 4, 2
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 4, 2, .LBB6_17
				; NOVSX-NEXT: b .LBB6_18
				;
				; VSX-LABEL: v2f64_minimum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: xvcmpeqdp 36, 35, 35
				; VSX-NEXT: addis 3, 2, .LCPI6_1@toc@ha
				; VSX-NEXT: xvcmpeqdp 37, 34, 34
				; VSX-NEXT: addi 3, 3, .LCPI6_1@toc@l
				; VSX-NEXT: xvmindp 0, 34, 35
				; VSX-NEXT: lxvd2x 32, 0, 3
				; VSX-NEXT: addis 3, 2, .LCPI6_0@toc@ha
				; VSX-NEXT: addi 3, 3, .LCPI6_0@toc@l
				; VSX-NEXT: lxvd2x 1, 0, 3
				; VSX-NEXT: vcmpequd 1, 2, 0
				; VSX-NEXT: xxlnor 36, 36, 36
				; VSX-NEXT: xxlnor 37, 37, 37
				; VSX-NEXT: xxlor 2, 37, 36
				; VSX-NEXT: vcmpequd 4, 3, 0
				; VSX-NEXT: xxsel 0, 0, 1, 2
				; VSX-NEXT: xxlxor 1, 1, 1
				; VSX-NEXT: xxsel 2, 0, 34, 33
				; VSX-NEXT: xvcmpeqdp 34, 0, 1
				; VSX-NEXT: xxsel 1, 2, 35, 36
				; VSX-NEXT: xxsel 34, 0, 1, 34
				; VSX-NEXT: blr
				;
				; AIX-LABEL: v2f64_minimum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: xvcmpeqdp 36, 35, 35
				; AIX-NEXT: ld 3, L..C6(2) # %const.1
				; AIX-NEXT: xvcmpeqdp 37, 34, 34
				; AIX-NEXT: xvmindp 0, 34, 35
				; AIX-NEXT: lxvd2x 32, 0, 3
				; AIX-NEXT: ld 3, L..C7(2) # %const.0
				; AIX-NEXT: xxlnor 36, 36, 36
				; AIX-NEXT: lxvd2x 1, 0, 3
				; AIX-NEXT: xxlnor 37, 37, 37
				; AIX-NEXT: vcmpequd 1, 2, 0
				; AIX-NEXT: xxlor 2, 37, 36
				; AIX-NEXT: vcmpequd 4, 3, 0
				; AIX-NEXT: xxsel 0, 0, 1, 2
				; AIX-NEXT: xxlxor 1, 1, 1
				; AIX-NEXT: xxsel 2, 0, 34, 33
				; AIX-NEXT: xvcmpeqdp 34, 0, 1
				; AIX-NEXT: xxsel 1, 2, 35, 36
				; AIX-NEXT: xxsel 34, 0, 1, 34
				; AIX-NEXT: blr
				entry:
				%m = call <2 x double> @llvm.minimum.v2f64(<2 x double> %a, <2 x double> %b)
				ret <2 x double> %m
				}

				define <2 x double> @v2f64_maximum(<2 x double> %a, <2 x double> %b) {
				; NOVSX-LABEL: v2f64_maximum:
				; NOVSX: # %bb.0: # %entry
				; NOVSX-NEXT: fcmpu 0, 1, 3
				; NOVSX-NEXT: fmr 6, 1
				; NOVSX-NEXT: stfd 4, -16(1)
				; NOVSX-NEXT: stfd 2, -8(1)
				; NOVSX-NEXT: stfd 3, -32(1)
				; NOVSX-NEXT: stfd 1, -24(1)
				; NOVSX-NEXT: bc 12, 1, .LBB7_2
				; NOVSX-NEXT: # %bb.1: # %entry
				; NOVSX-NEXT: fmr 6, 3
				; NOVSX-NEXT: .LBB7_2: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI7_0@toc@ha
				; NOVSX-NEXT: lfs 0, .LCPI7_0@toc@l(3)
				; NOVSX-NEXT: ld 3, -24(1)
				; NOVSX-NEXT: fmr 5, 0
				; NOVSX-NEXT: bc 12, 3, .LBB7_4
				; NOVSX-NEXT: # %bb.3: # %entry
				; NOVSX-NEXT: fmr 5, 6
				; NOVSX-NEXT: .LBB7_4: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: ld 3, -32(1)
				; NOVSX-NEXT: bc 12, 2, .LBB7_6
				; NOVSX-NEXT: # %bb.5: # %entry
				; NOVSX-NEXT: fmr 1, 5
				; NOVSX-NEXT: .LBB7_6: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: bc 12, 2, .LBB7_8
				; NOVSX-NEXT: # %bb.7: # %entry
				; NOVSX-NEXT: fmr 3, 1
				; NOVSX-NEXT: .LBB7_8: # %entry
				; NOVSX-NEXT: addis 3, 2, .LCPI7_1@toc@ha
				; NOVSX-NEXT: lfs 1, .LCPI7_1@toc@l(3)
				; NOVSX-NEXT: fcmpu 0, 5, 1
				; NOVSX-NEXT: bc 12, 2, .LBB7_10
				; NOVSX-NEXT: # %bb.9: # %entry
				; NOVSX-NEXT: fmr 3, 5
				; NOVSX-NEXT: .LBB7_10: # %entry
				; NOVSX-NEXT: fcmpu 0, 2, 4
				; NOVSX-NEXT: fmr 5, 2
				; NOVSX-NEXT: bc 12, 1, .LBB7_12
				; NOVSX-NEXT: # %bb.11: # %entry
				; NOVSX-NEXT: fmr 5, 4
				; NOVSX-NEXT: .LBB7_12: # %entry
				; NOVSX-NEXT: ld 3, -8(1)
				; NOVSX-NEXT: bc 12, 3, .LBB7_14
				; NOVSX-NEXT: # %bb.13: # %entry
				; NOVSX-NEXT: fmr 0, 5
				; NOVSX-NEXT: .LBB7_14: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: ld 3, -16(1)
				; NOVSX-NEXT: bc 4, 2, .LBB7_19
				; NOVSX-NEXT: # %bb.15: # %entry
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: bc 4, 2, .LBB7_20
				; NOVSX-NEXT: .LBB7_16: # %entry
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 12, 2, .LBB7_18
				; NOVSX-NEXT: .LBB7_17: # %entry
				; NOVSX-NEXT: fmr 4, 0
				; NOVSX-NEXT: .LBB7_18: # %entry
				; NOVSX-NEXT: fmr 1, 3
				; NOVSX-NEXT: fmr 2, 4
				; NOVSX-NEXT: blr
				; NOVSX-NEXT: .LBB7_19: # %entry
				; NOVSX-NEXT: fmr 2, 0
				; NOVSX-NEXT: cmpdi 3, 0
				; NOVSX-NEXT: bc 12, 2, .LBB7_16
				; NOVSX-NEXT: .LBB7_20: # %entry
				; NOVSX-NEXT: fmr 4, 2
				; NOVSX-NEXT: fcmpu 0, 0, 1
				; NOVSX-NEXT: bc 4, 2, .LBB7_17
				; NOVSX-NEXT: b .LBB7_18
				;
				; VSX-LABEL: v2f64_maximum:
				; VSX: # %bb.0: # %entry
				; VSX-NEXT: xvcmpeqdp 37, 35, 35
				; VSX-NEXT: addis 3, 2, .LCPI7_0@toc@ha
				; VSX-NEXT: xvcmpeqdp 32, 34, 34
				; VSX-NEXT: addi 3, 3, .LCPI7_0@toc@l
				; VSX-NEXT: xvmaxdp 0, 34, 35
				; VSX-NEXT: lxvd2x 1, 0, 3
				; VSX-NEXT: xxlxor 36, 36, 36
				; VSX-NEXT: vcmpequd 1, 2, 4
				; VSX-NEXT: xxlnor 37, 37, 37
				; VSX-NEXT: xxlnor 32, 32, 32
				; VSX-NEXT: xxlor 2, 32, 37
				; VSX-NEXT: vcmpequd 5, 3, 4
				; VSX-NEXT: xxsel 0, 0, 1, 2
				; VSX-NEXT: xxsel 1, 0, 34, 33
				; VSX-NEXT: xvcmpeqdp 34, 0, 36
				; VSX-NEXT: xxsel 1, 1, 35, 37
				; VSX-NEXT: xxsel 34, 0, 1, 34
				; VSX-NEXT: blr
				;
				; AIX-LABEL: v2f64_maximum:
				; AIX: # %bb.0: # %entry
				; AIX-NEXT: xvcmpeqdp 36, 35, 35
				; AIX-NEXT: ld 3, L..C8(2) # %const.0
				; AIX-NEXT: xvcmpeqdp 37, 34, 34
				; AIX-NEXT: xvmaxdp 0, 34, 35
				; AIX-NEXT: xxlxor 32, 32, 32
				; AIX-NEXT: lxvd2x 1, 0, 3
				; AIX-NEXT: vcmpequd 1, 2, 0
				; AIX-NEXT: xxlnor 36, 36, 36
				; AIX-NEXT: xxlnor 37, 37, 37
				; AIX-NEXT: xxlor 2, 37, 36
				; AIX-NEXT: vcmpequd 4, 3, 0
				; AIX-NEXT: xxsel 0, 0, 1, 2
				; AIX-NEXT: xxsel 1, 0, 34, 33
				; AIX-NEXT: xvcmpeqdp 34, 0, 32
				; AIX-NEXT: xxsel 1, 1, 35, 36
				; AIX-NEXT: xxsel 34, 0, 1, 34
				; AIX-NEXT: blr
				entry:
				%m = call <2 x double> @llvm.maximum.v2f64(<2 x double> %a, <2 x double> %b)
				ret <2 x double> %m
				}

				declare float @llvm.maximum.f32(float, float)
				declare double @llvm.maximum.f64(double, double)
				declare <4 x float> @llvm.maximum.v4f32(<4 x float>, <4 x float>)
				declare <2 x double> @llvm.maximum.v2f64(<2 x double>, <2 x double>)

				declare float @llvm.minimum.f32(float, float)
				declare double @llvm.minimum.f64(double, double)
				declare <4 x float> @llvm.minimum.v4f32(<4 x float>, <4 x float>)
				declare <2 x double> @llvm.minimum.v2f64(<2 x double>, <2 x double>)
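
Reading the scalar NOVSX checks for f32_minimum and f32_maximum above, the expanded sequence appears to have three steps: a plain compare-and-select, a replacement with NaN when the compare is unordered, and a zero-sign fixup that only applies when the selected value compares equal to 0.0. The C++ below is a rough scalar model of that reading, not code from the patch. The names `bitsOf` and `expandMinMax` are hypothetical, and the raw bit comparison is only a stand-in for the is_fpclass zero-sign test (the checks compare the operand bits against 0x80000000 for minimum and against 0 for maximum).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Raw IEEE-754 bits of a float. Assumption: stands in for the
// is_fpclass negative-zero / positive-zero test used by the expansion.
std::uint32_t bitsOf(float x) {
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);
  return u;
}

// Hypothetical scalar model of the generic expansion (illustration only).
float expandMinMax(float a, float b, bool isMax) {
  // Step 1: ordinary compare-and-select. This alone mishandles NaN
  // and cannot tell -0.0 from +0.0.
  float r = isMax ? (a > b ? a : b) : (a < b ? a : b);

  // Step 2: an unordered compare means at least one operand is NaN,
  // so the result becomes NaN.
  if (std::isnan(a) || std::isnan(b))
    r = std::numeric_limits<float>::quiet_NaN();

  // Step 3: only when the selected value is a zero, prefer whichever
  // operand is the "winning" zero: -0.0 for minimum, +0.0 for maximum.
  const std::uint32_t preferred = isMax ? 0x00000000u : 0x80000000u;
  if (r == 0.0f) {
    float fixed = r;
    if (bitsOf(a) == preferred) fixed = a;
    if (bitsOf(b) == preferred) fixed = b;
    r = fixed;
  }
  return r;
}
```

With this model, minimum of +0.0 and -0.0 yields -0.0 in either operand order, maximum yields +0.0, and any NaN operand yields NaN, which matches the semantics quoted from the langref in the summary.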

Revision Contents

Diff 552239