Index: docs/LangRef.rst
===================================================================
--- docs/LangRef.rst
+++ docs/LangRef.rst
@@ -12024,6 +12024,277 @@
 Returns another pointer that aliases its argument but which is considered different
 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
 
+Constrained Floating Point Intrinsics
+-------------------------------------
+
+These intrinsics are used to provide special handling of floating point
+operations when specific rounding mode or floating point exception behavior is
+required. By default, LLVM optimization passes assume that the rounding mode is
+round-to-nearest and that floating point exceptions will not be monitored.
+Constrained FP intrinsics are used to support non-default rounding modes and
+accurately preserve exception behavior without compromising LLVM's ability to
+optimize FP code when the default behavior is used.
+
+Each of these intrinsics corresponds to a normal floating point operation. The
+first two arguments and the return value are the same as the corresponding FP
+operation.
+
+The third argument is a metadata argument specifying the rounding mode to be
+assumed. This argument must be one of the following strings:
+
+::
+      "round.dynamic"
+      "round.tonearest"
+      "round.downward"
+      "round.upward"
+      "round.towardzero"
+
+If this argument is "round.dynamic" optimization passes must assume that the
+rounding mode is unknown and may change at runtime. No transformations that
+depend on rounding mode may be performed in this case.
+
+The other possible values for the rounding mode argument correspond to the
+similarly named IEEE rounding modes. If the argument is any of these values
+optimization passes may perform transformations as long as they are consistent
+with the specified rounding mode.
+
+For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
+"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
+'x-0' should evaluate to '-0' when rounding downward. However, this
+transformation is legal for all other rounding modes.
+
+For values other than "round.dynamic" optimization passes may assume that the
+actual runtime rounding mode (as defined in a target-specific manner) matches
+the specified rounding mode, but this is not guaranteed. Using a specific
+non-dynamic rounding mode which does not match the actual rounding mode at
+runtime results in undefined behavior.
+
+The fourth argument to the constrained floating point intrinsics specifies the
+required exception behavior. This argument must be one of the following
+strings:
+
+::
+      "fpexcept.ignore"
+      "fpexcept.maytrap"
+      "fpexcept.strict"
+
+If this argument is "fpexcept.ignore" optimization passes may assume that the
+exception status flags will not be read and that floating point exceptions will
+be masked. This allows transformations to be performed that may change the
+exception semantics of the original code. For example, FP operations may be
+speculatively executed in this case whereas they must not be for either of the
+other possible values of this argument.
+
+If the exception behavior argument is "fpexcept.maytrap" optimization passes
+must avoid transformations that may raise exceptions that would not have been
+raised by the original code (such as speculatively executing FP operations), but
+passes are not required to preserve all exceptions that are implied by the
+original code. For example, exceptions may be potentially hidden by constant
+folding.
+
+If the exception behavior argument is "fpexcept.strict" all transformations must
+strictly preserve the floating point exception semantics of the original code.
+Any FP exception that would have been raised by the original code must be raised
+by the transformed code, and the transformed code must not raise any FP
+exceptions that would not have been raised by the original code. This is the
+exception behavior argument that will be used if the code being compiled reads
+the FP exception status flags, but this mode can also be used with code that
+unmasks FP exceptions.
+
+The number and order of floating point exceptions is NOT guaranteed. For
+example, a series of FP operations that each may raise exceptions may be
+vectorized into a single instruction that raises each unique exception a single
+time.
+
+
+'``llvm.experimental.constrained.fadd``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
+                                          metadata <rounding mode>,
+                                          metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
+two operands.
+
+
+Arguments:
+""""""""""
+
+The first two arguments to the '``llvm.experimental.constrained.fadd``'
+intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
+of floating point values. Both arguments must have identical types.
+
+The third and fourth arguments specify the rounding mode and exception
+behavior as described above.
+
+Semantics:
+""""""""""
+
+The value produced is the floating point sum of the two value operands and has
+the same type as the operands.
+
+
+'``llvm.experimental.constrained.fsub``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
+                                          metadata <rounding mode>,
+                                          metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
+of its two operands.
+
+
+Arguments:
+""""""""""
+
+The first two arguments to the '``llvm.experimental.constrained.fsub``'
+intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
+of floating point values. Both arguments must have identical types.
+
+The third and fourth arguments specify the rounding mode and exception
+behavior as described above.
+
+Semantics:
+""""""""""
+
+The value produced is the floating point difference of the two value operands
+and has the same type as the operands.
+
+
+'``llvm.experimental.constrained.fmul``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
+                                          metadata <rounding mode>,
+                                          metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
+its two operands.
+
+
+Arguments:
+""""""""""
+
+The first two arguments to the '``llvm.experimental.constrained.fmul``'
+intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
+of floating point values. Both arguments must have identical types.
+
+The third and fourth arguments specify the rounding mode and exception
+behavior as described above.
+
+Semantics:
+""""""""""
+
+The value produced is the floating point product of the two value operands and
+has the same type as the operands.
+
+
+'``llvm.experimental.constrained.fdiv``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
+                                          metadata <rounding mode>,
+                                          metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
+its two operands.
+
+
+Arguments:
+""""""""""
+
+The first two arguments to the '``llvm.experimental.constrained.fdiv``'
+intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
+of floating point values. Both arguments must have identical types.
+
+The third and fourth arguments specify the rounding mode and exception
+behavior as described above.
+
+Semantics:
+""""""""""
+
+The value produced is the floating point quotient of the two value operands and
+has the same type as the operands.
+
+
+'``llvm.experimental.constrained.frem``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <type>
+      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
+                                          metadata <rounding mode>,
+                                          metadata <exception behavior>)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
+from the division of its two operands.
+
+
+Arguments:
+""""""""""
+
+The first two arguments to the '``llvm.experimental.constrained.frem``'
+intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
+of floating point values. Both arguments must have identical types.
+
+The third and fourth arguments specify the rounding mode and exception
+behavior as described above. The rounding mode argument has no effect, since
+the result of frem is never rounded, but the argument is included for
+consistency with the other constrained floating point intrinsics.
+
+Semantics:
+""""""""""
+
+The value produced is the floating point remainder from the division of the two
+value operands and has the same type as the operands. The remainder has the
+same sign as the dividend.
+
+
 General Intrinsics
 ------------------
 
Index: include/llvm/CodeGen/ISDOpcodes.h
===================================================================
--- include/llvm/CodeGen/ISDOpcodes.h
+++ include/llvm/CodeGen/ISDOpcodes.h
@@ -245,6 +245,12 @@
     /// Simple binary floating point operators.
     FADD, FSUB, FMUL, FDIV, FREM,
 
+    /// Constrained versions of the binary floating point operators.
+    /// These will be lowered to the simple operators before final selection.
+    /// They are used to limit optimizations while the DAG is being
+    /// optimized.
+    STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,
+
     /// FMA - Perform a * b + c with no intermediate rounding step.
     FMA,
 
Index: include/llvm/CodeGen/SelectionDAGISel.h
===================================================================
--- include/llvm/CodeGen/SelectionDAGISel.h
+++ include/llvm/CodeGen/SelectionDAGISel.h
@@ -270,6 +270,8 @@
   SDNode *MorphNode(SDNode *Node, unsigned TargetOpc, SDVTList VTs,
                     ArrayRef<SDValue> Ops, unsigned EmitNodeInfo);
 
+  SDNode *MutateStrictFPToFP(SDNode *Node, unsigned NewOpc);
+
   /// Prepares the landing pad to take incoming values or do other EH
   /// personality specific tasks. Returns true if the block should be
   /// instruction selected, false if no code should be emitted for it.
Index: include/llvm/IR/IntrinsicInst.h
===================================================================
--- include/llvm/IR/IntrinsicInst.h
+++ include/llvm/IR/IntrinsicInst.h
@@ -137,6 +137,45 @@
     }
   };
 
+  /// This is the common base class for constrained floating point intrinsics.
+  class ConstrainedFPIntrinsic : public IntrinsicInst {
+  public:
+    enum RoundingMode {
+      rmInvalid,
+      rmDynamic,
+      rmToNearest,
+      rmDownward,
+      rmUpward,
+      rmTowardZero
+    };
+
+    enum ExceptionBehavior {
+      ebInvalid,
+      ebIgnore,
+      ebMayTrap,
+      ebStrict
+    };
+
+    RoundingMode getRoundingMode() const;
+    ExceptionBehavior getExceptionBehavior() const;
+
+    // Methods for support type inquiry through isa, cast, and dyn_cast:
+    static inline bool classof(const IntrinsicInst *I) {
+      switch (I->getIntrinsicID()) {
+      case Intrinsic::experimental_constrained_fadd:
+      case Intrinsic::experimental_constrained_fsub:
+      case Intrinsic::experimental_constrained_fmul:
+      case Intrinsic::experimental_constrained_fdiv:
+      case Intrinsic::experimental_constrained_frem:
+        return true;
+      default: return false;
+      }
+    }
+    static inline bool classof(const Value *V) {
+      return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+    }
+  };
+
   /// This is the common base class for memset/memcpy/memmove.
   class MemIntrinsic : public IntrinsicInst {
   public:
Index: include/llvm/IR/Intrinsics.td
===================================================================
--- include/llvm/IR/Intrinsics.td
+++ include/llvm/IR/Intrinsics.td
@@ -389,6 +389,9 @@
                  llvm_i32_ty, llvm_i1_ty],
                 [IntrArgMemOnly, NoCapture<0>, WriteOnly<0>]>;
 
+// FIXME: Add version of these floating point intrinsics which allow non-default
+// rounding modes and FP exception handling.
+
 let IntrProperties = [IntrNoMem] in {
   def int_fma : Intrinsic<[llvm_anyfloat_ty],
                           [LLVMMatchType<0>, LLVMMatchType<0>,
@@ -442,6 +445,39 @@
                                [IntrNoMem]>,
                                GCCBuiltin<"__builtin_object_size">;
 
+//===--------------- Constrained Floating Point Intrinsics ----------------===//
+//
+
+let IntrProperties = [IntrInaccessibleMemOnly] in {
+  def int_experimental_constrained_fadd : Intrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
+  def int_experimental_constrained_fsub : Intrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
+  def int_experimental_constrained_fmul : Intrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
+  def int_experimental_constrained_fdiv : Intrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
+  def int_experimental_constrained_frem : Intrinsic<[ llvm_anyfloat_ty ],
+                                                    [ LLVMMatchType<0>,
+                                                      LLVMMatchType<0>,
+                                                      llvm_metadata_ty,
+                                                      llvm_metadata_ty ]>;
+}
+// FIXME: Add intrinsic for fcmp, fptrunc, fpext, fptoui and fptosi.
+
+
 //===------------------------- Expect Intrinsics --------------------------===//
 //
 def int_expect : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>,
Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
===================================================================
--- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
+++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
@@ -900,6 +900,7 @@
   void visitInlineAsm(ImmutableCallSite CS);
   const char *visitIntrinsicCall(const CallInst &I, unsigned Intrinsic);
   void visitTargetIntrinsic(const CallInst &I, unsigned Intrinsic);
+  void visitConstrainedFPIntrinsic(const CallInst &I, unsigned Intrinsic);
 
   void visitVAStart(const CallInst &I);
   void visitVAArg(const VAArgInst &I);
Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -5290,6 +5290,13 @@
                              getValue(I.getArgOperand(1)),
                              getValue(I.getArgOperand(2))));
     return nullptr;
+  case Intrinsic::experimental_constrained_fadd:
+  case Intrinsic::experimental_constrained_fsub:
+  case Intrinsic::experimental_constrained_fmul:
+  case Intrinsic::experimental_constrained_fdiv:
+  case Intrinsic::experimental_constrained_frem:
+    visitConstrainedFPIntrinsic(I, Intrinsic);
+    return nullptr;
   case Intrinsic::fmuladd: {
     EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
     if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
@@ -5738,6 +5745,46 @@
   }
 }
 
+void SelectionDAGBuilder::visitConstrainedFPIntrinsic(const CallInst &I,
+                                                      unsigned Intrinsic) {
+  SDLoc sdl = getCurSDLoc();
+  unsigned Opcode;
+  switch (Intrinsic) {
+  default: llvm_unreachable("Impossible intrinsic");  // Can't reach here.
+  case Intrinsic::experimental_constrained_fadd:
+    Opcode = ISD::STRICT_FADD;
+    break;
+  case Intrinsic::experimental_constrained_fsub:
+    Opcode = ISD::STRICT_FSUB;
+    break;
+  case Intrinsic::experimental_constrained_fmul:
+    Opcode = ISD::STRICT_FMUL;
+    break;
+  case Intrinsic::experimental_constrained_fdiv:
+    Opcode = ISD::STRICT_FDIV;
+    break;
+  case Intrinsic::experimental_constrained_frem:
+    Opcode = ISD::STRICT_FREM;
+    break;
+  }
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  SDValue Chain = getRoot();
+  SDValue Ops[3] = { Chain, getValue(I.getArgOperand(0)),
+                     getValue(I.getArgOperand(1)) };
+  SmallVector<EVT, 4> ValueVTs;
+  ComputeValueVTs(TLI, DAG.getDataLayout(), I.getType(), ValueVTs);
+  ValueVTs.push_back(MVT::Other); // Out chain
+
+  SDVTList VTs = DAG.getVTList(ValueVTs);
+  SDValue Result = DAG.getNode(Opcode, sdl, VTs, Ops);
+
+  assert(Result.getNode()->getNumValues() == 2);
+  SDValue OutChain = Result.getValue(1);
+  DAG.setRoot(OutChain);
+  SDValue FPResult = Result.getValue(0);
+  setValue(&I, FPResult);
+}
+
 std::pair<SDValue, SDValue>
 SelectionDAGBuilder::lowerInvokable(TargetLowering::CallLoweringInfo &CLI,
                                     const BasicBlock *EHPadBB) {
Index: lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -925,6 +925,50 @@
 };
 } // end anonymous namespace
 
+static bool isStrictFPOp(SDNode *Node, unsigned &NewOpc) {
+  unsigned OrigOpc = Node->getOpcode();
+  switch (OrigOpc) {
+    case ISD::STRICT_FADD: NewOpc = ISD::FADD; return true;
+    case ISD::STRICT_FSUB: NewOpc = ISD::FSUB; return true;
+    case ISD::STRICT_FMUL: NewOpc = ISD::FMUL; return true;
+    case ISD::STRICT_FDIV: NewOpc = ISD::FDIV; return true;
+    case ISD::STRICT_FREM: NewOpc = ISD::FREM; return true;
+    default: return false;
+  }
+}
+
+SDNode* SelectionDAGISel::MutateStrictFPToFP(SDNode *Node, unsigned NewOpc) {
+  assert(((Node->getOpcode() == ISD::STRICT_FADD && NewOpc == ISD::FADD) ||
+          (Node->getOpcode() == ISD::STRICT_FSUB && NewOpc == ISD::FSUB) ||
+          (Node->getOpcode() == ISD::STRICT_FMUL && NewOpc == ISD::FMUL) ||
+          (Node->getOpcode() == ISD::STRICT_FDIV && NewOpc == ISD::FDIV) ||
+          (Node->getOpcode() == ISD::STRICT_FREM && NewOpc == ISD::FREM)) &&
+          "Unexpected StrictFP opcode!");
+
+  // We're taking this node out of the chain, so we need to re-link things.
+  SDValue InputChain = Node->getOperand(0);
+  SDValue OutputChain = SDValue(Node, 1);
+  CurDAG->ReplaceAllUsesOfValueWith(OutputChain, InputChain);
+
+  SDVTList VTs = CurDAG->getVTList(Node->getOperand(1).getValueType());
+  SDValue Ops[2] = { Node->getOperand(1), Node->getOperand(2) };
+  SDNode *Res = CurDAG->MorphNodeTo(Node, NewOpc, VTs, Ops);
+
+  // MorphNodeTo can operate in two ways: if an existing node with the
+  // specified operands exists, it can just return it. Otherwise, it
+  // updates the node in place to have the requested operands.
+  if (Res == Node) {
+    // If we updated the node in place, reset the node ID. To the isel,
+    // this should be just like a newly allocated machine node.
+    Res->setNodeId(-1);
+  } else {
+    CurDAG->ReplaceAllUsesWith(Node, Res);
+    CurDAG->RemoveDeadNode(Node);
+  }
+
+  return Res;
+}
+
 void SelectionDAGISel::DoInstructionSelection() {
   DEBUG(dbgs() << "===== Instruction selection begins: BB#"
         << FuncInfo->MBB->getNumber()
@@ -960,7 +1004,23 @@
       if (Node->use_empty())
         continue;
 
+      // When we are using non-default rounding modes or FP exception behavior
+      // FP operations are represented by StrictFP pseudo-operations. They
+      // need to be simplified here so that the target-specific instruction
+      // selectors know how to handle them.
+      //
+      // If the current node is a strict FP pseudo-op, the isStrictFPOp()
+      // function will provide the corresponding normal FP opcode to which the
+      // node should be mutated.
+      unsigned NormalFPOpc = ISD::UNDEF;
+      bool IsStrictFPOp = isStrictFPOp(Node, NormalFPOpc);
+      if (IsStrictFPOp)
+        Node = MutateStrictFPToFP(Node, NormalFPOpc);
+
       Select(Node);
+
+      // FIXME: Add code here to attach an implicit def and use of
+      // target-specific FP environment registers.
     }
 
     CurDAG->setRoot(Dummy.getValue());
Index: lib/IR/IntrinsicInst.cpp
===================================================================
--- lib/IR/IntrinsicInst.cpp
+++ lib/IR/IntrinsicInst.cpp
@@ -21,6 +21,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/GlobalVariable.h"
@@ -93,3 +94,34 @@
   LLVMContext &Context = M->getContext();
   return ConstantInt::get(Type::getInt64Ty(Context), 1);
 }
+
+ConstrainedFPIntrinsic::RoundingMode
+ConstrainedFPIntrinsic::getRoundingMode() const {
+  Metadata *MD = dyn_cast<MetadataAsValue>(getOperand(2))->getMetadata();
+  if (!MD || !isa<MDString>(MD))
+    return rmInvalid;
+  StringRef RoundingArg = cast<MDString>(MD)->getString();
+
+  // For dynamic rounding mode, we use round to nearest but we will set the
+  // 'exact' SDNodeFlag so that the value will not be rounded.
+  return StringSwitch<RoundingMode>(RoundingArg)
+    .Case("round.dynamic", rmDynamic)
+    .Case("round.tonearest", rmToNearest)
+    .Case("round.downward", rmDownward)
+    .Case("round.upward", rmUpward)
+    .Case("round.towardzero", rmTowardZero)
+    .Default(rmInvalid);
+}
+
+ConstrainedFPIntrinsic::ExceptionBehavior
+ConstrainedFPIntrinsic::getExceptionBehavior() const {
+  Metadata *MD = dyn_cast<MetadataAsValue>(getOperand(3))->getMetadata();
+  if (!MD || !isa<MDString>(MD))
+    return ebInvalid;
+  StringRef ExceptionArg = cast<MDString>(MD)->getString();
+  return StringSwitch<ExceptionBehavior>(ExceptionArg)
+    .Case("fpexcept.ignore", ebIgnore)
+    .Case("fpexcept.maytrap", ebMayTrap)
+    .Case("fpexcept.strict", ebStrict)
+    .Default(ebInvalid);
+}
Index: lib/IR/Verifier.cpp
===================================================================
--- lib/IR/Verifier.cpp
+++ lib/IR/Verifier.cpp
@@ -444,6 +444,7 @@
   void visitUserOp1(Instruction &I);
   void visitUserOp2(Instruction &I) { visitUserOp1(I); }
   void visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS);
+  void visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI);
   template <class DbgIntrinsicTy>
   void visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII);
   void visitAtomicCmpXchgInst(AtomicCmpXchgInst &CXI);
@@ -3907,6 +3908,14 @@
            "constant int",
            CS);
     break;
+  case Intrinsic::experimental_constrained_fadd:
+  case Intrinsic::experimental_constrained_fsub:
+  case Intrinsic::experimental_constrained_fmul:
+  case Intrinsic::experimental_constrained_fdiv:
+  case Intrinsic::experimental_constrained_frem:
+    visitConstrainedFPIntrinsic(
+        cast<ConstrainedFPIntrinsic>(*CS.getInstruction()));
+    break;
   case Intrinsic::dbg_declare: // llvm.dbg.declare
     Assert(isa<MetadataAsValue>(CS.getArgOperand(0)),
            "invalid llvm.dbg.declare intrinsic call 1", CS);
@@ -4246,6 +4255,15 @@
   return nullptr;
 }
 
+void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
+  Assert(isa<MetadataAsValue>(FPI.getOperand(2)),
+         "invalid rounding mode argument", &FPI);
+  Assert(FPI.getRoundingMode() != ConstrainedFPIntrinsic::rmInvalid,
+         "invalid rounding mode argument", &FPI);
+  Assert(FPI.getExceptionBehavior() != ConstrainedFPIntrinsic::ebInvalid,
+         "invalid exception behavior argument", &FPI);
+}
+
 template <class DbgIntrinsicTy>
 void Verifier::visitDbgIntrinsic(StringRef Kind, DbgIntrinsicTy &DII) {
   auto *MD = cast<MetadataAsValue>(DII.getArgOperand(0))->getMetadata();
Index: test/CodeGen/X86/fp-intrinsics.ll
===================================================================
--- test/CodeGen/X86/fp-intrinsics.ll
+++ test/CodeGen/X86/fp-intrinsics.ll
@@ -0,0 +1,111 @@
+; RUN: llc -O3 -mtriple=x86_64-pc-linux < %s | FileCheck %s
+
+; Verify that constants aren't folded to inexact results when the rounding mode
+; is unknown.
+;
+; double f1() {
+;   // Because 0.1 cannot be represented exactly, this shouldn't be folded.
+;   return 1.0/10.0;
+; }
+;
+; CHECK-LABEL: f1
+; CHECK: divsd
+define double @f1() {
+entry:
+  %div = call double @llvm.experimental.constrained.fdiv.f64(
+                                               double 1.000000e+00,
+                                               double 1.000000e+01,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %div
+}
+
+; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
+;
+; double f2(double a) {
+;   // Because the result of '0 - 0' is negative zero if rounding mode is
+;   // downward, this shouldn't be simplified.
+;   return a - 0;
+; }
+;
+; CHECK-LABEL: f2
+; CHECK: subsd
+define double @f2(double %a) {
+entry:
+  %div = call double @llvm.experimental.constrained.fsub.f64(
+                                               double %a,
+                                               double 0.000000e+00,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %div
+}
+
+; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
+; unknown.
+;
+; double f3(double a, double b) {
+;   // Because the intermediate value involved in this calculation may require
+;   // rounding, this shouldn't be simplified.
+;   return -((-a)*b);
+; }
+;
+; CHECK-LABEL: f3:
+; CHECK: subsd
+; CHECK: mulsd
+; CHECK: subsd
+define double @f3(double %a, double %b) {
+entry:
+  %sub = call double @llvm.experimental.constrained.fsub.f64(
+                                               double -0.000000e+00, double %a,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  %mul = call double @llvm.experimental.constrained.fmul.f64(
+                                               double %sub, double %b,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  %ret = call double @llvm.experimental.constrained.fsub.f64(
+                                               double -0.000000e+00,
+                                               double %mul,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %ret
+}
+
+; Verify that FP operations are not performed speculatively when FP exceptions
+; are not being ignored.
+;
+; double f4(int n, double a) {
+;   // Because a + 1 may overflow, this should not be simplified.
+;   if (n > 0)
+;     return a + 1.0;
+;   return a;
+; }
+;
+;
+; CHECK-LABEL: f4:
+; CHECK: testl
+; CHECK: jle
+; CHECK: addsd
+define double @f4(i32 %n, double %a) {
+entry:
+  %cmp = icmp sgt i32 %n, 0
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:
+  %add = call double @llvm.experimental.constrained.fadd.f64(
+                                               double 1.000000e+00, double %a,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  br label %if.end
+
+if.end:
+  %a.0 = phi double [%add, %if.then], [ %a, %entry ]
+  ret double %a.0
+}
+
+
+@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
+declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
Index: test/Feature/fp-intrinsics.ll
===================================================================
--- test/Feature/fp-intrinsics.ll
+++ test/Feature/fp-intrinsics.ll
@@ -0,0 +1,102 @@
+; RUN: opt -O3 -S < %s | FileCheck %s
+
+; Test to verify that constants aren't folded when the rounding mode is unknown.
+; CHECK-LABEL: @f1
+; CHECK: call double @llvm.experimental.constrained.fdiv.f64
+define double @f1() {
+entry:
+  %div = call double @llvm.experimental.constrained.fdiv.f64(
+                                               double 1.000000e+00,
+                                               double 1.000000e+01,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %div
+}
+
+; Verify that 'a - 0' isn't simplified to 'a' when the rounding mode is unknown.
+;
+; double f2(double a) {
+;   // Because the result of '0 - 0' is negative zero if rounding mode is
+;   // downward, this shouldn't be simplified.
+;   return a - 0.0;
+; }
+;
+; CHECK-LABEL: @f2
+; CHECK: call double @llvm.experimental.constrained.fsub.f64
+define double @f2(double %a) {
+entry:
+  %div = call double @llvm.experimental.constrained.fsub.f64(
+                                               double %a, double 0.000000e+00,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %div
+}
+
+; Verify that '-((-a)*b)' isn't simplified to 'a*b' when the rounding mode is
+; unknown.
+;
+; double f3(double a, double b) {
+;   // Because the intermediate value involved in this calculation may require
+;   // rounding, this shouldn't be simplified.
+;   return -((-a)*b);
+; }
+;
+; CHECK-LABEL: @f3
+; CHECK: call double @llvm.experimental.constrained.fsub.f64
+; CHECK: call double @llvm.experimental.constrained.fmul.f64
+; CHECK: call double @llvm.experimental.constrained.fsub.f64
+define double @f3(double %a, double %b) {
+entry:
+  %sub = call double @llvm.experimental.constrained.fsub.f64(
+                                               double -0.000000e+00, double %a,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  %mul = call double @llvm.experimental.constrained.fmul.f64(
+                                               double %sub, double %b,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  %ret = call double @llvm.experimental.constrained.fsub.f64(
+                                               double -0.000000e+00,
+                                               double %mul,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %ret
+}
+
+; Verify that FP operations are not performed speculatively when FP exceptions
+; are not being ignored.
+;
+; double f4(int n, double a) {
+;   // Because a + 1 may overflow, this should not be simplified.
+;   if (n > 0)
+;     return a + 1.0;
+;   return a;
+; }
+;
+;
+; CHECK-LABEL: @f4
+; CHECK-NOT: select
+; CHECK: br i1 %cmp
+define double @f4(i32 %n, double %a) {
+entry:
+  %cmp = icmp sgt i32 %n, 0
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:
+  %add = call double @llvm.experimental.constrained.fadd.f64(
+                                               double 1.000000e+00, double %a,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  br label %if.end
+
+if.end:
+  %a.0 = phi double [%add, %if.then], [ %a, %entry ]
+  ret double %a.0
+}
+
+
+@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
+declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
Index: test/Verifier/fp-intrinsics.ll
===================================================================
--- test/Verifier/fp-intrinsics.ll
+++ test/Verifier/fp-intrinsics.ll
@@ -0,0 +1,43 @@
+; RUN: opt -verify -S < %s 2>&1 | FileCheck --check-prefix=CHECK1 %s
+; RUN: sed -e s/.T2:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK2 %s
+; RUN: sed -e s/.T3:// %s | not opt -verify -disable-output 2>&1 | FileCheck --check-prefix=CHECK3 %s
+
+; Common declaration used for all runs.
+declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
+
+; Test that the verifier accepts legal code, and that the correct attributes are
+; attached to the FP intrinsic.
+; CHECK1: declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata) #[[ATTR:[0-9]+]]
+; CHECK1: attributes #[[ATTR]] = { inaccessiblememonly nounwind }
+; Note: FP exceptions aren't usually caught through normal unwind mechanisms,
+; but we may want to revisit this for asynchronous exception handling.
+define double @f1(double %a, double %b) {
+entry:
+  %fadd = call double @llvm.experimental.constrained.fadd.f64(
+                                               double %a, double %b,
+                                               metadata !"round.dynamic",
+                                               metadata !"fpexcept.strict")
+  ret double %fadd
+}
+
+; Test an illegal value for the rounding mode argument.
+; CHECK2: invalid rounding mode argument
+;T2: define double @f2(double %a, double %b) {
+;T2: entry:
+;T2:   %fadd = call double @llvm.experimental.constrained.fadd.f64(
+;T2:                                          double %a, double %b,
+;T2:                                          metadata !"round.dynomite",
+;T2:                                          metadata !"fpexcept.strict")
+;T2:   ret double %fadd
+;T2: }
+
+; Test an illegal value for the exception behavior argument.
+; CHECK3: invalid exception behavior argument
+;T3: define double @f2(double %a, double %b) {
+;T3: entry:
+;T3:   %fadd = call double @llvm.experimental.constrained.fadd.f64(
+;T3:                                          double %a, double %b,
+;T3:                                          metadata !"round.dynamic",
+;T3:                                          metadata !"fpexcept.restrict")
+;T3:   ret double %fadd
+;T3: }
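
Illustration only, not part of the patch: the LangRef text and the f2 tests rely on the fact that 'x - 0' evaluates to -0 when 'x' is +0 and the rounding mode is downward. The sketch below checks that claim on the host using the standard <cfenv> interface. The helper name subZeroIsNegative is hypothetical, and the sketch assumes an IEEE 754 host whose C++ library honors fesetround; the volatile operands are there so the compiler performs the subtraction at run time instead of folding it.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper (not from the patch): evaluates x - 0.0 under the given
// <cfenv> rounding mode and reports whether the result has its sign bit set.
bool subZeroIsNegative(double x, int roundingMode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(roundingMode);
  assert(rc == 0 && "requested rounding mode is not supported on this host");
  (void)rc;

  // volatile keeps the compiler from constant-folding or simplifying
  // 'x - 0' to 'x', which is exactly the transformation under discussion.
  volatile double lhs = x;
  volatile double rhs = 0.0;
  volatile double diff = lhs - rhs; // stored before the mode is restored

  std::fesetround(saved);
  const double result = diff;
  return std::signbit(result);
}
```

Under these assumptions, subZeroIsNegative(0.0, FE_DOWNWARD) should be true and the other three standard modes should give false, which is why the simplification has to be blocked for "round.downward" and "round.dynamic" but remains legal elsewhere.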