Index: docs/LangRef.rst =================================================================== --- docs/LangRef.rst +++ docs/LangRef.rst @@ -12024,6 +12024,275 @@ Returns another pointer that aliases its argument but which is considered different for the purposes of ``load``/``store`` ``invariant.group`` metadata. +Constrained Floating Point Intrinsics +------------------------------------- + +These intrinsics are used to provide special handling of floating point +operations when specific rounding mode or floating point exception behavior is +required. By default, LLVM optimization passes assume that the rounding mode is +round-to-nearest and that floating point exceptions will not be monitored. +Constrained FP intrinsics are used to support non-default rounding modes and +accurately preserve exception behavior without compromising LLVM's ability to +optimize FP code when the default behavior is used. + +Each of these intrinsics corresponds to a normal floating point operation. The +first two arguments and the return value are the same as the corresponding FP +operation. + +The third argument is a metadata argument specifying the rounding mode to be +assumed. This argument must be one of the following strings: + +:: + "LLVM_ROUND_DYNAMIC" + "LLVM_ROUND_TONEAREST" + "LLVM_ROUND_DOWNWARD" + "LLVM_ROUND_UPWARD" + "LLVM_ROUND_TOWARDZERO" + +If this argument is "LLVM_ROUND_DYNAMIC" optimization passes must assume that +the rounding mode is unknown and may change at runtime. No transformations that +depend on rounding mode may be performed in this case. + +The other possible values for the rounding mode argument correspond to the +similarly named IEEE rounding modes. If the argument is any of these values +optimization passes may perform transformations as long as they are consistent +with the specified rounding mode. 
+ +For example, 'x-0'->'x' is not a valid transformation if the rounding mode is +"LLVM_ROUND_DOWNWARD" or "LLVM_ROUND_DYNAMIC" because if the value of 'x' is +0 +then 'x-0' should evaluate to '-0' when rounding downward. However, this +transformation is legal for all other rounding modes. + +For values other than "LLVM_ROUND_DYNAMIC" optimization passes may assume that +the actual runtime rounding mode (as defined in a target-specific manner) +matches the specified rounding mode, but this is not guaranteed. Using a +specific non-dynamic rounding mode which does not match the actual rounding +mode at runtime results in undefined behavior. + +The fourth argument to the constrained floating point intrinsics specifies the +required exception behavior. This argument must be one of the following +strings: + +:: + "LLVM_FPEXCEPT_IGNORE" + "LLVM_FPEXCEPT_MAYTRAP" + "LLVM_FPEXCEPT_STRICT" + +If this argument is "LLVM_FPEXCEPT_IGNORE" optimization passes may assume that +the exception status flags will not be read and that floating point exceptions +will not be unmasked. This allows transformations to be performed that may +change the exception semantics of the original code. For example, FP operations +may be speculatively executed in this case whereas they must not be for either +of the other possible values of this argument. + +If the exception behavior argument is "LLVM_FPEXCEPT_MAYTRAP" optimization +passes must avoid transformations that may raise exceptions that would not +have been raised by the original code (such as speculatively executing FP +operations), but passes are not required to preserve all exceptions that are +implied by the original code. For example, exceptions may be potentially hidden +by constant folding. + +If the exception behavior argument is "LLVM_FPEXCEPT_STRICT" all transformations +must strictly preserve the floating point exception semantics of the original +code. 
Any FP exception that would have been raised by the original code must be +raised by the transformed code, and the transformed code must not raise any FP +exceptions that would not have been raised by the original code. This is the +exception behavior argument that will be used if the code being compiled reads +the FP exception status flags, but this mode can also be used with code that +unmasks FP exceptions. + +The number and order of floating point exceptions is NOT guaranteed. For +example, a series of FP operations that each may raise exceptions may be +vectorized into a single instruction that raises each unique exception a single +time. + + +'``llvm.experimental.constrained.fadd``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <type> + @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>, + metadata <rounding mode>, + metadata <exception behavior>) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its +two operands. + + +Arguments: +"""""""""" + +The first two arguments to the '``llvm.experimental.constrained.fadd``' +intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` +of floating point values. Both arguments must have identical types. + +The third and fourth arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +The value produced is the floating point sum of the two value operands and has +the same type as the operands. + + +'``llvm.experimental.constrained.fsub``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <type> + @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>, + metadata <rounding mode>, + metadata <exception behavior>) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference +of its two operands. + + +Arguments: +"""""""""" + +The first two arguments to the '``llvm.experimental.constrained.fsub``' +intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` +of floating point values.
Both arguments must have identical types. + +The third and fourth arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +The value produced is the floating point difference of the two value operands +and has the same type as the operands. + + +'``llvm.experimental.constrained.fmul``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <type> + @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>, + metadata <rounding mode>, + metadata <exception behavior>) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of +its two operands. + + +Arguments: +"""""""""" + +The first two arguments to the '``llvm.experimental.constrained.fmul``' +intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` +of floating point values. Both arguments must have identical types. + +The third and fourth arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +The value produced is the floating point product of the two value operands and +has the same type as the operands. + + +'``llvm.experimental.constrained.fdiv``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <type> + @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>, + metadata <rounding mode>, + metadata <exception behavior>) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of +its two operands. + + +Arguments: +"""""""""" + +The first two arguments to the '``llvm.experimental.constrained.fdiv``' +intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` +of floating point values. Both arguments must have identical types. + +The third and fourth arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +The value produced is the floating point quotient of the two value operands and +has the same type as the operands.
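Putting the pieces together, a call to one of these intrinsics looks like the following IR fragment (the function name ``@foo`` is hypothetical; the intrinsic signature and metadata strings are the ones defined above):

```llvm
declare double @llvm.experimental.constrained.fdiv.f64(double, double,
                                                       metadata, metadata)

; Computes %a / %b under an unknown dynamic rounding mode while strictly
; preserving the FP exception semantics of the original code.
define double @foo(double %a, double %b) {
entry:
  %q = call double @llvm.experimental.constrained.fdiv.f64(
           double %a, double %b,
           metadata !"LLVM_ROUND_DYNAMIC",
           metadata !"LLVM_FPEXCEPT_STRICT")
  ret double %q
}
```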
+ + +'``llvm.experimental.constrained.frem``' Intrinsic +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare <type> + @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>, + metadata <rounding mode>, + metadata <exception behavior>) + +Overview: +""""""""" + +The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder +from the division of its two operands. + + +Arguments: +"""""""""" + +The first two arguments to the '``llvm.experimental.constrained.frem``' +intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>` +of floating point values. Both arguments must have identical types. + +The third and fourth arguments specify the rounding mode and exception +behavior as described above. + +Semantics: +"""""""""" + +The value produced is the floating point remainder from the division of the two +value operands and has the same type as the operands. The remainder has the +same sign as the dividend. + + General Intrinsics ------------------ Index: include/llvm/CodeGen/SelectionDAGNodes.h =================================================================== --- include/llvm/CodeGen/SelectionDAGNodes.h +++ include/llvm/CodeGen/SelectionDAGNodes.h @@ -340,6 +340,22 @@ /// TODO: This data structure should be shared by the IR optimizer and the /// the backend. struct SDNodeFlags { +public: + enum RoundingModeTy { + rmDefault = 0, + rmDynamic = 1, + rmToNearest = 2, + rmDownward = 4, + rmUpward = 8, + rmTowardZero = 16 + }; + + enum ExceptionBehaviorTy { + ebIgnore = 0, + ebMayTrap = 1, + ebStrict = 2 + }; + private: bool NoUnsignedWrap : 1; bool NoSignedWrap : 1; @@ -350,6 +366,8 @@ bool NoSignedZeros : 1; bool AllowReciprocal : 1; bool VectorReduction : 1; + RoundingModeTy RoundingMode : 5; + ExceptionBehaviorTy ExceptionBehavior : 2; public: /// Default constructor turns off all optimization flags.
@@ -363,6 +381,8 @@ NoSignedZeros = false; AllowReciprocal = false; VectorReduction = false; + RoundingMode = rmDefault; + ExceptionBehavior = ebIgnore; } // These are mutators for each flag. @@ -375,6 +395,8 @@ void setNoSignedZeros(bool b) { NoSignedZeros = b; } void setAllowReciprocal(bool b) { AllowReciprocal = b; } void setVectorReduction(bool b) { VectorReduction = b; } + void setRoundingMode(RoundingModeTy rm) { RoundingMode = rm; } + void setExceptionBehavior(ExceptionBehaviorTy eb) { ExceptionBehavior = eb; } // These are accessors for each flag. bool hasNoUnsignedWrap() const { return NoUnsignedWrap; } @@ -386,6 +408,14 @@ bool hasNoSignedZeros() const { return NoSignedZeros; } bool hasAllowReciprocal() const { return AllowReciprocal; } bool hasVectorReduction() const { return VectorReduction; } + RoundingModeTy getRoundingMode() const { + // For flag merging purposes, we need to recognize when no explicit rounding + // mode has been set (rmDefault), but the default is rmToNearest. + if (RoundingMode == rmDefault) + return rmToNearest; + return RoundingMode; + } + ExceptionBehaviorTy getExceptionBehavior() const { return ExceptionBehavior; } /// Clear any flags in this flag set that aren't also set in Flags. void intersectWith(const SDNodeFlags *Flags) { @@ -397,6 +427,23 @@ NoInfs &= Flags->NoInfs; NoSignedZeros &= Flags->NoSignedZeros; AllowReciprocal &= Flags->AllowReciprocal; + // If either RoundingMode is rmDefault, we can use the other RoundingMode. + // If neither is rmDefault and they are different, we must assume rmDynamic. + if (RoundingMode == rmDefault) + RoundingMode = Flags->RoundingMode; + else if (RoundingMode != Flags->RoundingMode && + Flags->RoundingMode != rmDefault) + RoundingMode = rmDynamic; + // ExceptionBehavior is progressive. If the current flags specify ebIgnore + // we should use whatever the merged flags specify.
If the current flags + // specify ebMayTrap, we can update to the more restrictive ebStrict but not + // to the less restrictive ebIgnore. If the current flags specify ebStrict + // we must keep that setting. + if (ExceptionBehavior == ebIgnore) + ExceptionBehavior = Flags->ExceptionBehavior; + else if (ExceptionBehavior == ebMayTrap && + Flags->ExceptionBehavior != ebIgnore) + ExceptionBehavior = Flags->ExceptionBehavior; } }; Index: include/llvm/IR/IntrinsicInst.h =================================================================== --- include/llvm/IR/IntrinsicInst.h +++ include/llvm/IR/IntrinsicInst.h @@ -137,6 +137,43 @@ } }; + /// This is the common base class for constrained floating point intrinsics. + class ConstrainedFPIntrinsic : public IntrinsicInst { + public: + enum RoundingMode { + rmDynamic, + rmToNearest, + rmDownward, + rmUpward, + rmTowardZero + }; + + enum ExceptionBehavior { + ebIgnore, + ebMayTrap, + ebStrict + }; + + RoundingMode getRoundingMode() const; + ExceptionBehavior getExceptionBehavior() const; + + // Methods for support type inquiry through isa, cast, and dyn_cast: + static inline bool classof(const IntrinsicInst *I) { + switch (I->getIntrinsicID()) { + case Intrinsic::experimental_constrained_fadd: + case Intrinsic::experimental_constrained_fsub: + case Intrinsic::experimental_constrained_fmul: + case Intrinsic::experimental_constrained_fdiv: + case Intrinsic::experimental_constrained_frem: + return true; + default: return false; + } + } + static inline bool classof(const Value *V) { + return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V)); + } + }; + + /// This is the common base class for memset/memcpy/memmove.
class MemIntrinsic : public IntrinsicInst { public: Index: include/llvm/IR/Intrinsics.td =================================================================== --- include/llvm/IR/Intrinsics.td +++ include/llvm/IR/Intrinsics.td @@ -442,6 +442,39 @@ [IntrNoMem]>, GCCBuiltin<"__builtin_object_size">; +//===--------------- Constrained Floating Point Intrinsics ----------------===// +// + +let IntrProperties = [IntrInaccessibleMemOnly] in { + def int_experimental_constrained_fadd : Intrinsic<[ llvm_anyfloat_ty ], + [ LLVMMatchType<0>, + LLVMMatchType<0>, + llvm_metadata_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_fsub : Intrinsic<[ llvm_anyfloat_ty ], + [ LLVMMatchType<0>, + LLVMMatchType<0>, + llvm_metadata_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_fmul : Intrinsic<[ llvm_anyfloat_ty ], + [ LLVMMatchType<0>, + LLVMMatchType<0>, + llvm_metadata_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ], + [ LLVMMatchType<0>, + LLVMMatchType<0>, + llvm_metadata_ty, + llvm_metadata_ty ]>; + def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ], + [ LLVMMatchType<0>, + LLVMMatchType<0>, + llvm_metadata_ty, + llvm_metadata_ty ]>; +} +// FIXME: Add intrinsic for fcmp, fptrunc, fpext, fptoui and fptosi. + + //===------------------------- Expect Intrinsics --------------------------===// // def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, Index: lib/CodeGen/SelectionDAG/SelectionDAG.cpp =================================================================== --- lib/CodeGen/SelectionDAG/SelectionDAG.cpp +++ lib/CodeGen/SelectionDAG/SelectionDAG.cpp @@ -3950,39 +3950,80 @@ // Constant fold FP operations. 
bool HasFPExceptions = TLI->hasFloatingPointExceptions(); + bool MustBeExact = false; + bool MustPreserveFPExceptions = false; + APFloat::roundingMode RoundingMode = APFloat::rmNearestTiesToEven; + if (Flags) { + switch (Flags->getRoundingMode()) { + case SDNodeFlags::rmDefault: + case SDNodeFlags::rmToNearest: + break; + case SDNodeFlags::rmDynamic: + MustBeExact = true; + break; + case SDNodeFlags::rmDownward: + RoundingMode = APFloat::rmTowardNegative; + break; + case SDNodeFlags::rmUpward: + RoundingMode = APFloat::rmTowardPositive; + break; + case SDNodeFlags::rmTowardZero: + RoundingMode = APFloat::rmTowardZero; + break; + } + // If the exception behavior is ebIgnore or ebMayTrap we may perform + // constant folding that hides FP exceptions that would otherwise have + // been raised, but if it is ebStrict we may not. + if (Flags->getExceptionBehavior() == SDNodeFlags::ebStrict) + MustPreserveFPExceptions = true; + } + if (N1CFP) { if (N2CFP) { APFloat V1 = N1CFP->getValueAPF(), V2 = N2CFP->getValueAPF(); APFloat::opStatus s; switch (Opcode) { case ISD::FADD: - s = V1.add(V2, APFloat::rmNearestTiesToEven); - if (!HasFPExceptions || s != APFloat::opInvalidOp) + s = V1.add(V2, RoundingMode); + if (s == APFloat::opOK || + (!MustPreserveFPExceptions && + (!MustBeExact || s != APFloat::opInexact) && + (!HasFPExceptions || s != APFloat::opInvalidOp))) return getConstantFP(V1, DL, VT); break; case ISD::FSUB: - s = V1.subtract(V2, APFloat::rmNearestTiesToEven); - if (!HasFPExceptions || s!=APFloat::opInvalidOp) + s = V1.subtract(V2, RoundingMode); + if (s == APFloat::opOK || + (!MustPreserveFPExceptions && + (!MustBeExact || s != APFloat::opInexact) && + (!HasFPExceptions || s != APFloat::opInvalidOp))) return getConstantFP(V1, DL, VT); break; case ISD::FMUL: - s = V1.multiply(V2, APFloat::rmNearestTiesToEven); - if (!HasFPExceptions || s!=APFloat::opInvalidOp) + s = V1.multiply(V2, RoundingMode); + if (s == APFloat::opOK || + (!MustPreserveFPExceptions && + 
(!MustBeExact || s != APFloat::opInexact) && + (!HasFPExceptions || s != APFloat::opInvalidOp))) return getConstantFP(V1, DL, VT); break; case ISD::FDIV: - s = V1.divide(V2, APFloat::rmNearestTiesToEven); - if (!HasFPExceptions || (s!=APFloat::opInvalidOp && - s!=APFloat::opDivByZero)) { + s = V1.divide(V2, RoundingMode); + if (s == APFloat::opOK || + (!MustPreserveFPExceptions && + (!MustBeExact || s != APFloat::opInexact) && + (!HasFPExceptions || + (s != APFloat::opInvalidOp && s != APFloat::opDivByZero)))) return getConstantFP(V1, DL, VT); - } break; case ISD::FREM : s = V1.mod(V2); - if (!HasFPExceptions || (s!=APFloat::opInvalidOp && - s!=APFloat::opDivByZero)) { + if (s == APFloat::opOK || + (!MustPreserveFPExceptions && + (!MustBeExact || s != APFloat::opInexact) && + (!HasFPExceptions || + (s != APFloat::opInvalidOp && s != APFloat::opDivByZero)))) return getConstantFP(V1, DL, VT); - } break; case ISD::FCOPYSIGN: V1.copySign(V2); @@ -3997,7 +4038,7 @@ // This can return overflow, underflow, or inexact; we don't care. // FIXME need to be more flexible about rounding mode. 
(void)V.convert(EVTToAPFloatSemantics(VT), - APFloat::rmNearestTiesToEven, &ignored); + RoundingMode, &ignored); return getConstantFP(V, DL, VT); } } Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h =================================================================== --- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h +++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h @@ -900,6 +900,7 @@ void visitInlineAsm(ImmutableCallSite CS); const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic); void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic); + void visitConstrainedFPIntrinsic(const CallInst &I, unsigned Intrinsic); void visitVAStart(const CallInst &I); void visitVAArg(const VAArgInst &I); Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp =================================================================== --- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -5290,6 +5290,13 @@ getValue(I.getArgOperand(1)), getValue(I.getArgOperand(2)))); return nullptr; + case Intrinsic::experimental_constrained_fadd: + case Intrinsic::experimental_constrained_fsub: + case Intrinsic::experimental_constrained_fmul: + case Intrinsic::experimental_constrained_fdiv: + case Intrinsic::experimental_constrained_frem: + visitConstrainedFPIntrinsic(I, Intrinsic); + return nullptr; case Intrinsic::fmuladd: { EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType()); if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict && @@ -5738,6 +5745,57 @@ } } +void SelectionDAGBuilder::visitConstrainedFPIntrinsic(const CallInst &I, + unsigned Intrinsic) { + // FIXME: Do something to prevent unwanted code motion. + SDLoc sdl = getCurSDLoc(); + unsigned Opcode; + switch (Intrinsic) { + default: llvm_unreachable("Impossible intrinsic"); // Can't reach here. 
+ case Intrinsic::experimental_constrained_fadd: Opcode = ISD::FADD; break; + case Intrinsic::experimental_constrained_fsub: Opcode = ISD::FSUB; break; + case Intrinsic::experimental_constrained_fmul: Opcode = ISD::FMUL; break; + case Intrinsic::experimental_constrained_fdiv: Opcode = ISD::FDIV; break; + case Intrinsic::experimental_constrained_frem: Opcode = ISD::FREM; break; + } + + const ConstrainedFPIntrinsic *FPI = cast<ConstrainedFPIntrinsic>(&I); + SDNodeFlags Flags; + switch (FPI->getRoundingMode()) { + case ConstrainedFPIntrinsic::rmDynamic: + Flags.setRoundingMode(SDNodeFlags::rmDynamic); + break; + case ConstrainedFPIntrinsic::rmToNearest: + Flags.setRoundingMode(SDNodeFlags::rmToNearest); + break; + case ConstrainedFPIntrinsic::rmDownward: + Flags.setRoundingMode(SDNodeFlags::rmDownward); + break; + case ConstrainedFPIntrinsic::rmUpward: + Flags.setRoundingMode(SDNodeFlags::rmUpward); + break; + case ConstrainedFPIntrinsic::rmTowardZero: + Flags.setRoundingMode(SDNodeFlags::rmTowardZero); + break; + } + switch (FPI->getExceptionBehavior()) { + case ConstrainedFPIntrinsic::ebIgnore: + Flags.setExceptionBehavior(SDNodeFlags::ebIgnore); + break; + case ConstrainedFPIntrinsic::ebMayTrap: + Flags.setExceptionBehavior(SDNodeFlags::ebMayTrap); + break; + case ConstrainedFPIntrinsic::ebStrict: + Flags.setExceptionBehavior(SDNodeFlags::ebStrict); + break; + } + SDValue FPNode = DAG.getNode(Opcode, sdl, + getValue(I.getArgOperand(0)).getValueType(), + getValue(I.getArgOperand(0)), + getValue(I.getArgOperand(1)), &Flags); + setValue(&I, FPNode); +} + std::pair<SDValue, SDValue> SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI, const BasicBlock *EHPadBB) { Index: lib/IR/IntrinsicInst.cpp =================================================================== --- lib/IR/IntrinsicInst.cpp +++ lib/IR/IntrinsicInst.cpp @@ -93,3 +93,38 @@ LLVMContext &Context = M->getContext(); return ConstantInt::get(Type::getInt64Ty(Context), 1); } + +ConstrainedFPIntrinsic::RoundingMode
+ConstrainedFPIntrinsic::getRoundingMode() const { + Metadata *RoundingMD = cast<MetadataAsValue>(getOperand(2))->getMetadata(); + StringRef RoundingArg = cast<MDString>(RoundingMD)->getString(); + + // For dynamic rounding mode, we use round to nearest but we will set the + // 'exact' SDNodeFlag so that the value will not be rounded. + if (RoundingArg.equals("LLVM_ROUND_DYNAMIC")) + return rmDynamic; + else if (RoundingArg.equals("LLVM_ROUND_TONEAREST")) + return rmToNearest; + else if (RoundingArg.equals("LLVM_ROUND_DOWNWARD")) + return rmDownward; + else if (RoundingArg.equals("LLVM_ROUND_UPWARD")) + return rmUpward; + else if (RoundingArg.equals("LLVM_ROUND_TOWARDZERO")) + return rmTowardZero; + + llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!"); +} + +ConstrainedFPIntrinsic::ExceptionBehavior +ConstrainedFPIntrinsic::getExceptionBehavior() const { + Metadata *ExceptionMD = cast<MetadataAsValue>(getOperand(3))->getMetadata(); + StringRef ExceptionArg = cast<MDString>(ExceptionMD)->getString(); + if (ExceptionArg.equals("LLVM_FPEXCEPT_IGNORE")) + return ebIgnore; + else if (ExceptionArg.equals("LLVM_FPEXCEPT_MAYTRAP")) + return ebMayTrap; + else if (ExceptionArg.equals("LLVM_FPEXCEPT_STRICT")) + return ebStrict; + + llvm_unreachable("Unexpected exception behavior argument in FP intrinsic!"); +} Index: lib/IR/Verifier.cpp =================================================================== --- lib/IR/Verifier.cpp +++ lib/IR/Verifier.cpp @@ -444,6 +444,7 @@ void visitUserOp1(Instruction &I); void visitUserOp2(Instruction &I) { visitUserOp1(I); } void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS); + void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI); template <class DbgIntrinsicTy> void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII); void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI); @@ -3907,6 +3908,14 @@ "constant int", CS); break; + case Intrinsic::experimental_constrained_fadd: + case Intrinsic::experimental_constrained_fsub: + case Intrinsic::experimental_constrained_fmul: + case
Intrinsic::experimental_constrained_fdiv: + case Intrinsic::experimental_constrained_frem: + visitConstrainedFPIntrinsic( + cast<ConstrainedFPIntrinsic>(*CS.getInstruction())); + break; case Intrinsic::dbg_declare: // llvm.dbg.declare Assert(isa<MetadataAsValue>(CS.getArgOperand(0)), "invalid llvm.dbg.declare intrinsic call 1", CS); @@ -4246,6 +4255,33 @@ return nullptr; } +void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) { + Assert(isa<MetadataAsValue>(FPI.getOperand(2)), + "invalid rounding mode argument", &FPI); + Metadata *RoundingMD = + cast<MetadataAsValue>(FPI.getOperand(2))->getMetadata(); + Assert(isa<MDString>(RoundingMD), "invalid rounding mode argument", &FPI); + StringRef RoundingArg = cast<MDString>(RoundingMD)->getString(); + Assert(RoundingArg.equals("LLVM_ROUND_DYNAMIC") || + RoundingArg.equals("LLVM_ROUND_TONEAREST") || + RoundingArg.equals("LLVM_ROUND_DOWNWARD") || + RoundingArg.equals("LLVM_ROUND_UPWARD") || + RoundingArg.equals("LLVM_ROUND_TOWARDZERO"), + "invalid rounding mode argument", &FPI); + + Assert(isa<MetadataAsValue>(FPI.getOperand(3)), + "invalid exception behavior argument", &FPI); + Metadata *ExceptionMD = + cast<MetadataAsValue>(FPI.getOperand(3))->getMetadata(); + Assert(isa<MDString>(ExceptionMD), "invalid exception behavior argument", + &FPI); + StringRef ExceptionArg = cast<MDString>(ExceptionMD)->getString(); + Assert(ExceptionArg.equals("LLVM_FPEXCEPT_IGNORE") || + ExceptionArg.equals("LLVM_FPEXCEPT_MAYTRAP") || + ExceptionArg.equals("LLVM_FPEXCEPT_STRICT"), + "invalid exception behavior argument", &FPI); +} + template <class DbgIntrinsicTy> void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) { auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata(); Index: test/CodeGen/X86/fp-intrinsics.ll =================================================================== --- test/CodeGen/X86/fp-intrinsics.ll +++ test/CodeGen/X86/fp-intrinsics.ll @@ -0,0 +1,112 @@ +; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s + +; Verify that constants aren't folded to inexact results when the rounding mode +; is unknown.
+; +; double f1() { +; // Because 0.1 cannot be represented exactly, this shouldn't be folded. +; return 1.0/10.0; +; } +; +; CHECK-LABEL: f1 +; CHECK: divsd +define double @f1() { +entry: + %div = call double @llvm.experimental.constrained.fdiv.f64( + double 1.000000e+00, + double 1.000000e+01, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %div +} + +; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown. +; However, transforming this to 'a + (-0)' is OK. +; +; double f2(double a) { +; // Because the result of '0 - 0' is negative zero if rounding mode is +; // downward, this shouldn't be simplified. +; return a - 0; +; } +; +; CHECK-LABEL: f2 +; CHECK: addsd +define double @f2(double %a) { +entry: + %div = call double @llvm.experimental.constrained.fsub.f64( + double %a, + double 0.000000e+00, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %div +} + +; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is +; unknown. +; +; double f3(double a, double b) { +; // Because the intermediate value involved in this calculation may require +; // rounding, this shouldn't be simplified. +; return -((-a)*b); +; } +; +; CHECK-LABEL: f3: +; CHECK: subsd +; CHECK: mulsd +; CHECK: subsd +define double @f3(double %a, double %b) { +entry: + %sub = call double @llvm.experimental.constrained.fsub.f64( + double -0.000000e+00, double %a, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + %mul = call double @llvm.experimental.constrained.fmul.f64( + double %sub, double %b, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + %ret = call double @llvm.experimental.constrained.fsub.f64( + double -0.000000e+00, + double %mul, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %ret +} + +; Verify that FP operations are not performed speculatively when FP exceptions +; are not being ignored.
+; +; double f4(int n, double a) { +; // Because a + 1 may overflow, this should not be simplified. +; if (n > 0) +; return a + 1.0; +; return a; +; } +; +; +; CHECK-LABEL: f4: +; CHECK: testl +; CHECK: jle +; CHECK: addsd +define double @f4(i32 %n, double %a) { +entry: + %cmp = icmp sgt i32 %n, 0 + br i1 %cmp, label %if.then, label %if.end + +if.then: + %add = call double @llvm.experimental.constrained.fadd.f64( + double 1.000000e+00, double %a, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + br label %if.end + +if.end: + %a.0 = phi double [%add, %if.then], [ %a, %entry ] + ret double %a.0 +} + + +@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata" +declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata) Index: test/Feature/fp-intrinsics.ll =================================================================== --- test/Feature/fp-intrinsics.ll +++ test/Feature/fp-intrinsics.ll @@ -0,0 +1,102 @@ +; RUN: opt -O3 -S < %s | FileCheck %s + +; Test to verify that constants aren't folded when the rounding mode is unknown. +; CHECK-LABEL: @f1 +; CHECK: call double @llvm.experimental.constrained.fdiv.f64 +define double @f1() { +entry: + %div = call double @llvm.experimental.constrained.fdiv.f64( + double 1.000000e+00, + double 1.000000e+01, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %div +} + +; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown. +; +; double f2(double a) { +; // Because the result of '0 - 0' is negative zero if rounding mode is +; // downward, this shouldn't be simplified. 
+; return a - 0.0; +; } +; +; CHECK-LABEL: @f2 +; CHECK: call double @llvm.experimental.constrained.fsub.f64 +define double @f2(double %a) { +entry: + %div = call double @llvm.experimental.constrained.fsub.f64( + double %a, double 0.000000e+00, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %div +} + +; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is +; unknown. +; +; double f3(double a, double b) { +; // Because the intermediate value involved in this calculation may require +; // rounding, this shouldn't be simplified. +; return -((-a)*b); +; } +; +; CHECK-LABEL: @f3 +; CHECK: call double @llvm.experimental.constrained.fsub.f64 +; CHECK: call double @llvm.experimental.constrained.fmul.f64 +; CHECK: call double @llvm.experimental.constrained.fsub.f64 +define double @f3(double %a, double %b) { +entry: + %sub = call double @llvm.experimental.constrained.fsub.f64( + double -0.000000e+00, double %a, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + %mul = call double @llvm.experimental.constrained.fmul.f64( + double %sub, double %b, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + %ret = call double @llvm.experimental.constrained.fsub.f64( + double -0.000000e+00, + double %mul, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %ret +} + +; Verify that FP operations are not performed speculatively when FP exceptions +; are not being ignored. +; +; double f4(int n, double a) { +; // Because a + 1 may overflow, this should not be simplified. 
+; if (n > 0) +; return a + 1.0; +; return a; +; } +; +; +; CHECK-LABEL: @f4 +; CHECK-NOT: select +; CHECK: br i1 %cmp +define double @f4(i32 %n, double %a) { +entry: + %cmp = icmp sgt i32 %n, 0 + br i1 %cmp, label %if.then, label %if.end + +if.then: + %add = call double @llvm.experimental.constrained.fadd.f64( + double 1.000000e+00, double %a, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + br label %if.end + +if.end: + %a.0 = phi double [%add, %if.then], [ %a, %entry ] + ret double %a.0 +} + + +@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata" +declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) +declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata) Index: test/Verifier/fp-intrinsics.ll =================================================================== --- test/Verifier/fp-intrinsics.ll +++ test/Verifier/fp-intrinsics.ll @@ -0,0 +1,43 @@ +; RUN: opt -verify -S < %s 2>&1 | FileCheck --check-prefix=CHECK1 %s +; RUN: sed -e s/.T2:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK2 %s +; RUN: sed -e s/.T3:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK3 %s + +; Common declaration used for all runs. +declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) + +; Test that the verifier accepts legal code, and that the correct attributes are +; attached to the FP intrinsic. 
+; CHECK1: declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) #[[ATTR:[0-9]+]] +; CHECK1: attributes #[[ATTR]] = { inaccessiblememonly nounwind } +; Note: FP exceptions aren't usually caught through normal unwind mechanisms, +; but we may want to revisit this for asynchronous exception handling. +define double @f1(double %a, double %b) { +entry: + %fadd = call double @llvm.experimental.constrained.fadd.f64( + double %a, double %b, + metadata !"LLVM_ROUND_DYNAMIC", + metadata !"LLVM_FPEXCEPT_STRICT") + ret double %fadd +} + +; Test an illegal value for the rounding mode argument. +; CHECK2: invalid rounding mode argument +;T2: define double @f2(double %a, double %b) { +;T2: entry: +;T2: %fadd = call double @llvm.experimental.constrained.fadd.f64( +;T2: double %a, double %b, +;T2: metadata !"LLVM_ROUND_DYNOMITE", +;T2: metadata !"LLVM_FPEXCEPT_STRICT") +;T2: ret double %fadd +;T2: } + +; Test an illegal value for the exception behavior argument. +; CHECK3: invalid exception behavior argument +;T3: define double @f2(double %a, double %b) { +;T3: entry: +;T3: %fadd = call double @llvm.experimental.constrained.fadd.f64( +;T3: double %a, double %b, +;T3: metadata !"LLVM_ROUND_DYNAMIC", +;T3: metadata !"LLVM_FPEXCEPT_RESTRICT") +;T3: ret double %fadd +;T3: }