Index: docs/LangRef.rst =================================================================== --- docs/LangRef.rst +++ docs/LangRef.rst @@ -9530,6 +9530,151 @@ %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c +'``llvm.umin.*``' and '``llvm.smin.*``' Intrinsics +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" +These are overloaded intrinsics. You can use ``llvm.umin`` or ``llvm.smin`` +on any integer bit width. + +.. code-block:: llvm + + declare i8 @llvm.umin.i8(i8 %a, i8 %b) + declare i16 @llvm.umin.i16(i16 %a, i16 %b) + declare i32 @llvm.umin.i32(i32 %a, i32 %b) + declare i64 @llvm.umin.i64(i64 %a, i64 %b) + declare i8 @llvm.smin.i8(i8 %a, i8 %b) + declare i16 @llvm.smin.i16(i16 %a, i16 %b) + declare i32 @llvm.smin.i32(i32 %a, i32 %b) + declare i64 @llvm.smin.i64(i64 %a, i64 %b) + + +Overview: +""""""""" + +The ``llvm.umin`` intrinsic returns the minimum of the two operands, +treating them both as unsigned integers. + +The ``llvm.smin`` intrinsic returns the minimum of the two operands, +treating them both as signed integers. + +.. note:: + + These intrinsics are primarily used during late-stage code generation. + + They are generated by the compiler for its own use, and creating them + manually is not recommended. Identity and constant folding handle these + intrinsics less effectively than the canonical, and recommended, + ``icmp; select`` sequence given below. + +Arguments: +"""""""""" + +Both intrinsics take two integer arguments of the same bit width. + +Semantics: +"""""""""" + +The expression:: + + call i8 @llvm.umin.i8(i8 %a, i8 %b) + +is equivalent to:: + + %1 = icmp ult i8 %a, %b + %2 = select i1 %1, i8 %a, i8 %b + +Similarly, the expression:: + + call i8 @llvm.smin.i8(i8 %a, i8 %b) + +is equivalent to:: + + %1 = icmp slt i8 %a, %b + %2 = select i1 %1, i8 %a, i8 %b + +Examples: +""""""""" + +.. 
code-block:: llvm + + %1 = call i8 @llvm.smin.i8(i8 42, i8 -24) ; Yields -24 + %2 = call i8 @llvm.umin.i8(i8 42, i8 -24) ; Yields 42 (-24 is 232 when read as unsigned) + + +'``llvm.umax.*``' and '``llvm.smax.*``' Intrinsics +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" +These are overloaded intrinsics. You can use ``llvm.umax`` or ``llvm.smax`` +on any integer bit width. + +.. code-block:: llvm + + declare i8 @llvm.umax.i8(i8 %a, i8 %b) + declare i16 @llvm.umax.i16(i16 %a, i16 %b) + declare i32 @llvm.umax.i32(i32 %a, i32 %b) + declare i64 @llvm.umax.i64(i64 %a, i64 %b) + declare i8 @llvm.smax.i8(i8 %a, i8 %b) + declare i16 @llvm.smax.i16(i16 %a, i16 %b) + declare i32 @llvm.smax.i32(i32 %a, i32 %b) + declare i64 @llvm.smax.i64(i64 %a, i64 %b) + + +Overview: +""""""""" + +The ``llvm.umax`` intrinsic returns the maximum of the two operands, +treating them both as unsigned integers. + +The ``llvm.smax`` intrinsic returns the maximum of the two operands, +treating them both as signed integers. + +.. note:: + + These intrinsics are primarily used during late-stage code generation. + + They are generated by the compiler for its own use, and creating them + manually is not recommended. Identity and constant folding handle these + intrinsics less effectively than the canonical, and recommended, + ``icmp; select`` sequence given below. + +Arguments: +"""""""""" + +Both intrinsics take two integer arguments of the same bit width. + +Semantics: +"""""""""" + +The expression:: + + call i8 @llvm.umax.i8(i8 %a, i8 %b) + +is equivalent to:: + + %1 = icmp ugt i8 %a, %b + %2 = select i1 %1, i8 %a, i8 %b + +Similarly, the expression:: + + call i8 @llvm.smax.i8(i8 %a, i8 %b) + +is equivalent to:: + + %1 = icmp sgt i8 %a, %b + %2 = select i1 %1, i8 %a, i8 %b + +Examples: +""""""""" + +.. 
code-block:: llvm + + %1 = call i8 @llvm.smax.i8(i8 42, i8 -24) ; Yields 42 + %2 = call i8 @llvm.umax.i8(i8 42, i8 -24) ; Yields -24 (232 when read as unsigned) + Half Precision Floating Point Intrinsics ---------------------------------------- Index: include/llvm/CodeGen/ISDOpcodes.h =================================================================== --- include/llvm/CodeGen/ISDOpcodes.h +++ include/llvm/CodeGen/ISDOpcodes.h @@ -308,6 +308,10 @@ /// part. MULHU, MULHS, + /// [US]{MIN/MAX} - Binary minimum or maximum of signed or unsigned + /// integers. + SMIN, SMAX, UMIN, UMAX, + /// Bitwise operators - logical and, logical or, logical xor. AND, OR, XOR, Index: include/llvm/IR/Intrinsics.td =================================================================== --- include/llvm/IR/Intrinsics.td +++ include/llvm/IR/Intrinsics.td @@ -330,6 +330,15 @@ [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>]>; + def int_umin : Intrinsic<[llvm_anyint_ty], + [LLVMMatchType<0>, LLVMMatchType<0>]>; + def int_smin : Intrinsic<[llvm_anyint_ty], + [LLVMMatchType<0>, LLVMMatchType<0>]>; + def int_umax : Intrinsic<[llvm_anyint_ty], + [LLVMMatchType<0>, LLVMMatchType<0>]>; + def int_smax : Intrinsic<[llvm_anyint_ty], + [LLVMMatchType<0>, LLVMMatchType<0>]>; + // These functions do not read memory, but are sensitive to the // rounding mode. LLVM purposely does not model changes to the FP // environment so they can be treated as readnone. 
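The LangRef text in this patch defines each intrinsic as a compare followed by a select. As a rough illustration only, the following standalone C++ sketch models those semantics for the ``i8`` overloads and reproduces the documented examples with operands 42 and -24. The helper names (``smin_i8`` and so on) are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical model of the documented i8 semantics: each intrinsic is a
// comparison (icmp) followed by a select between the two operands.
constexpr int8_t smin_i8(int8_t a, int8_t b) { return a < b ? a : b; }  // icmp slt
constexpr int8_t smax_i8(int8_t a, int8_t b) { return a > b ? a : b; }  // icmp sgt

// The unsigned forms compare the same bit patterns reinterpreted as uint8_t,
// then return the original (unmodified) operand.
constexpr int8_t umin_i8(int8_t a, int8_t b) {
  return static_cast<uint8_t>(a) < static_cast<uint8_t>(b) ? a : b;  // icmp ult
}
constexpr int8_t umax_i8(int8_t a, int8_t b) {
  return static_cast<uint8_t>(a) > static_cast<uint8_t>(b) ? a : b;  // icmp ugt
}

// The examples from the documentation, checked at compile time.
static_assert(smin_i8(42, -24) == -24, "signed: -24 < 42");
static_assert(umin_i8(42, -24) == 42, "unsigned: -24 is 232, so 42 is smaller");
static_assert(smax_i8(42, -24) == 42, "signed: 42 > -24");
static_assert(umax_i8(42, -24) == -24, "unsigned: 232 > 42");
```

The unsigned variants differ from the signed ones only in the predicate, which is why the same operand pair gives opposite answers for ``umin`` and ``smin``.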
Index: include/llvm/Target/TargetSelectionDAG.td =================================================================== --- include/llvm/Target/TargetSelectionDAG.td +++ include/llvm/Target/TargetSelectionDAG.td @@ -370,6 +370,10 @@ [SDNPOutGlue]>; def sube : SDNode<"ISD::SUBE" , SDTIntBinOp, [SDNPOutGlue, SDNPInGlue]>; +def smin : SDNode<"ISD::SMIN" , SDTIntBinOp>; +def smax : SDNode<"ISD::SMAX" , SDTIntBinOp>; +def umin : SDNode<"ISD::UMIN" , SDTIntBinOp>; +def umax : SDNode<"ISD::UMAX" , SDTIntBinOp>; def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>; def bswap : SDNode<"ISD::BSWAP" , SDTIntUnaryOp>; Index: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp =================================================================== --- lib/CodeGen/SelectionDAG/LegalizeDAG.cpp +++ lib/CodeGen/SelectionDAG/LegalizeDAG.cpp @@ -3281,6 +3281,26 @@ Results.push_back(Tmp1); break; } + case ISD::SMIN: + case ISD::SMAX: + case ISD::UMIN: + case ISD::UMAX: { + // Expand Y = MAX(A, B) -> Y = (A > B) ? A : B + ISD::CondCode Pred; + switch (Node->getOpcode()) { + default: llvm_unreachable("How did we get here?"); + case ISD::SMAX: Pred = ISD::SETGT; break; + case ISD::SMIN: Pred = ISD::SETLT; break; + case ISD::UMAX: Pred = ISD::SETUGT; break; + case ISD::UMIN: Pred = ISD::SETULT; break; + } + Tmp1 = Node->getOperand(0); + Tmp2 = Node->getOperand(1); + Tmp1 = DAG.getSelectCC(dl, Tmp1, Tmp2, Tmp1, Tmp2, Pred); + Results.push_back(Tmp1); + break; + } + case ISD::FMINNUM: Results.push_back(ExpandFPLibCall(Node, RTLIB::FMIN_F32, RTLIB::FMIN_F64, RTLIB::FMIN_F80, RTLIB::FMIN_F128, Index: lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp =================================================================== --- lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp +++ lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp @@ -71,6 +71,10 @@ case ISD::VSELECT: Res = PromoteIntRes_VSELECT(N); break; case ISD::SELECT_CC: Res = PromoteIntRes_SELECT_CC(N); break; case ISD::SETCC: Res = 
PromoteIntRes_SETCC(N); break; + case ISD::SMIN: + case ISD::SMAX: + case ISD::UMIN: + case ISD::UMAX: Res = PromoteIntRes_MINMAX(N, N->getOpcode()); break; case ISD::SHL: Res = PromoteIntRes_SHL(N); break; case ISD::SIGN_EXTEND_INREG: Res = PromoteIntRes_SIGN_EXTEND_INREG(N); break; @@ -555,6 +559,18 @@ LHS.getValueType(), Mask, LHS, RHS); } +SDValue DAGTypeLegalizer::PromoteIntRes_MINMAX(SDNode *N, unsigned Opcode) { + // The high bits of a promoted value are unspecified, so sign- or + // zero-extend the operands to match the signedness of the comparison. + bool IsSigned = Opcode == ISD::SMIN || Opcode == ISD::SMAX; + SDValue LHS = IsSigned ? SExtPromotedInteger(N->getOperand(0)) + : ZExtPromotedInteger(N->getOperand(0)); + SDValue RHS = IsSigned ? SExtPromotedInteger(N->getOperand(1)) + : ZExtPromotedInteger(N->getOperand(1)); + return DAG.getNode(Opcode, SDLoc(N), + LHS.getValueType(), LHS, RHS); +} + SDValue DAGTypeLegalizer::PromoteIntRes_SELECT_CC(SDNode *N) { SDValue LHS = GetPromotedInteger(N->getOperand(2)); SDValue RHS = GetPromotedInteger(N->getOperand(3)); Index: lib/CodeGen/SelectionDAG/LegalizeTypes.h =================================================================== --- lib/CodeGen/SelectionDAG/LegalizeTypes.h +++ lib/CodeGen/SelectionDAG/LegalizeTypes.h @@ -264,6 +264,7 @@ SDValue PromoteIntRes_UNDEF(SDNode *N); SDValue PromoteIntRes_VAARG(SDNode *N); SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo); + SDValue PromoteIntRes_MINMAX(SDNode *N, unsigned Opcode); // Integer Operand Promotion. 
bool PromoteIntegerOperand(SDNode *N, unsigned OperandNo); Index: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp =================================================================== --- lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp +++ lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp @@ -321,6 +321,10 @@ case ISD::ANY_EXTEND_VECTOR_INREG: case ISD::SIGN_EXTEND_VECTOR_INREG: case ISD::ZERO_EXTEND_VECTOR_INREG: + case ISD::SMIN: + case ISD::SMAX: + case ISD::UMIN: + case ISD::UMAX: QueryType = Node->getValueType(0); break; case ISD::FP_ROUND_INREG: Index: lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp =================================================================== --- lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp +++ lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp @@ -668,6 +668,10 @@ case ISD::UREM: case ISD::SREM: case ISD::FREM: + case ISD::SMIN: + case ISD::SMAX: + case ISD::UMIN: + case ISD::UMAX: SplitVecRes_BinOp(N, Lo, Hi); break; case ISD::FMA: Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp =================================================================== --- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -5456,6 +5456,31 @@ case Intrinsic::eh_begincatch: case Intrinsic::eh_endcatch: llvm_unreachable("begin/end catch intrinsics not lowered in codegen"); + case Intrinsic::smin: + setValue(&I, DAG.getNode(ISD::SMIN, sdl, + getValue(I.getArgOperand(0)).getValueType(), + getValue(I.getArgOperand(0)), + getValue(I.getArgOperand(1)))); + return nullptr; + case Intrinsic::smax: + setValue(&I, DAG.getNode(ISD::SMAX, sdl, + getValue(I.getArgOperand(0)).getValueType(), + getValue(I.getArgOperand(0)), + getValue(I.getArgOperand(1)))); + return nullptr; + case Intrinsic::umin: + setValue(&I, DAG.getNode(ISD::UMIN, sdl, + getValue(I.getArgOperand(0)).getValueType(), + getValue(I.getArgOperand(0)), + getValue(I.getArgOperand(1)))); + return nullptr; + case Intrinsic::umax: + setValue(&I, 
DAG.getNode(ISD::UMAX, sdl, + getValue(I.getArgOperand(0)).getValueType(), + getValue(I.getArgOperand(0)), + getValue(I.getArgOperand(1)))); + return nullptr; + } } Index: lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp =================================================================== --- lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp +++ lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp @@ -193,6 +193,10 @@ case ISD::FCOPYSIGN: return "fcopysign"; case ISD::FGETSIGN: return "fgetsign"; case ISD::FPOW: return "fpow"; + case ISD::SMIN: return "smin"; + case ISD::SMAX: return "smax"; + case ISD::UMIN: return "umin"; + case ISD::UMAX: return "umax"; case ISD::FPOWI: return "fpowi"; case ISD::SETCC: return "setcc"; Index: lib/CodeGen/TargetLoweringBase.cpp =================================================================== --- lib/CodeGen/TargetLoweringBase.cpp +++ lib/CodeGen/TargetLoweringBase.cpp @@ -803,6 +803,10 @@ setOperationAction(ISD::FMINNUM, VT, Expand); setOperationAction(ISD::FMAXNUM, VT, Expand); setOperationAction(ISD::FMAD, VT, Expand); + setOperationAction(ISD::SMIN, VT, Expand); + setOperationAction(ISD::SMAX, VT, Expand); + setOperationAction(ISD::UMIN, VT, Expand); + setOperationAction(ISD::UMAX, VT, Expand); // These library functions default to expand. setOperationAction(ISD::FROUND, VT, Expand); Index: test/CodeGen/AArch64/maxmin-expand.ll =================================================================== --- /dev/null +++ test/CodeGen/AArch64/maxmin-expand.ll @@ -0,0 +1,44 @@ +; RUN: llc < %s -mtriple=aarch64-linux-gnu -O3 | FileCheck %s + +; Check that expansion of llvm.[su]{min,max} intrinsics produces sensible code. 
+ +; CHECK-LABEL: test_umin_i8 +; CHECK: cmp +; CHECK-NEXT: csel +; CHECK-NEXT: ret +define i8 @test_umin_i8(i8 %a, i8 %b) { + %1 = call i8 @llvm.umin.i8(i8 %a, i8 %b) + ret i8 %1 +} + +; CHECK-LABEL: test_smin_i16 +; CHECK: cmp +; CHECK-NEXT: csel +; CHECK-NEXT: ret +define i16 @test_smin_i16(i16 %a, i16 %b) { + %1 = call i16 @llvm.smin.i16(i16 %a, i16 %b) + ret i16 %1 +} + +; CHECK-LABEL: test_smax_i32 +; CHECK: cmp +; CHECK-NEXT: csel +; CHECK-NEXT: ret +define i32 @test_smax_i32(i32 %a, i32 %b) { + %1 = call i32 @llvm.smax.i32(i32 %a, i32 %b) + ret i32 %1 +} + +; CHECK-LABEL: test_umax_i64 +; CHECK: cmp +; CHECK-NEXT: csel +; CHECK-NEXT: ret +define i64 @test_umax_i64(i64 %a, i64 %b) { + %1 = call i64 @llvm.umax.i64(i64 %a, i64 %b) + ret i64 %1 +} + +declare i8 @llvm.umin.i8(i8, i8) +declare i16 @llvm.smin.i16(i16, i16) +declare i32 @llvm.smax.i32(i32, i32) +declare i64 @llvm.umax.i64(i64, i64)
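
The ``i8`` and ``i16`` tests above go through integer promotion (``PromoteIntRes_MINMAX`` in this patch), where the min/max is performed on a wider type. The standalone C++ sketch below is a model, not LLVM code, of how the result of a widened min depends on what ends up in the upper bits of the promoted operands. All helper names are hypothetical, and ``junk`` stands in for whatever unspecified bits an any-extended value might carry.

```cpp
#include <cstdint>

// Hypothetical model: three ways to widen an 8-bit pattern to 32 bits.
constexpr uint32_t zext8(uint8_t v) { return v; }
constexpr int32_t sext8(uint8_t v) {
  return (static_cast<int32_t>(v) ^ 0x80) - 0x80;  // replicate bit 7 upward
}
// "Any-extend": the upper 24 bits are unspecified; junk models their content.
constexpr uint32_t aext8(uint8_t v, uint32_t junk) { return (junk << 8) | v; }

// Min on the wide type, truncated back to 8 bits afterwards.
constexpr uint8_t umin32_trunc(uint32_t a, uint32_t b) {
  return static_cast<uint8_t>(a < b ? a : b);
}
constexpr uint8_t smin32_trunc(int32_t a, int32_t b) {
  return static_cast<uint8_t>(a < b ? a : b);
}

// umin(42, 200) on i8 should be 42. Zero-extension preserves that.
static_assert(umin32_trunc(zext8(42), zext8(200)) == 42, "zext keeps order");
// With stray upper bits on one operand, the wide comparison picks 200.
static_assert(umin32_trunc(aext8(42, 1), aext8(200, 0)) == 200, "junk flips it");

// smin(-24, 42) on i8 should be -24 (bit pattern 232). Sign-extension works.
static_assert(smin32_trunc(sext8(232), sext8(42)) == 232, "sext keeps order");
// Zero-extended operands compare as 232 vs 42, so the wide smin picks 42.
static_assert(smin32_trunc(232, 42) == 42, "zext is wrong for signed min");
```

Under this model, a widened unsigned min/max is only reliable on zero-extended operands and a widened signed one only on sign-extended operands, which is worth keeping in mind when reading the ``i8`` and ``i16`` test expectations.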