Index: llvm/trunk/docs/LangRef.rst =================================================================== --- llvm/trunk/docs/LangRef.rst +++ llvm/trunk/docs/LangRef.rst @@ -2272,11 +2272,11 @@ Fast-Math Flags --------------- -LLVM IR floating-point binary ops (:ref:`fadd `, +LLVM IR floating-point operations (:ref:`fadd `, :ref:`fsub `, :ref:`fmul `, :ref:`fdiv `, :ref:`frem `, :ref:`fcmp `) and :ref:`call ` -instructions have the following flags that can be set to enable -otherwise unsafe floating point transformations. +may use the following flags to enable otherwise unsafe +floating-point transformations. ``nnan`` No NaNs - Allow optimizations to assume the arguments and result are not @@ -2300,10 +2300,17 @@ Allow floating-point contraction (e.g. fusing a multiply followed by an addition into a fused multiply-and-add). +``afn`` + Approximate functions - Allow substitution of approximate calculations for + functions (sin, log, sqrt, etc). See floating-point intrinsic definitions + for places where this can apply to LLVM's intrinsic math functions. + +``reassoc`` + Allow reassociation transformations for floating-point instructions. + This may dramatically change results in floating point. + ``fast`` - Fast - Allow algebraically equivalent transformations that may - dramatically change results in floating point (e.g. reassociate). This - flag implies all the others. + This flag implies all of the others. .. _uselistorder: @@ -10483,7 +10490,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10497,20 +10504,22 @@ Overview: """"""""" -The '``llvm.sqrt``' intrinsics return the square root of the specified value, -returning the same value as the libm '``sqrt``' functions would, but without -trapping or setting ``errno``. +The '``llvm.sqrt``' intrinsics return the square root of the specified value. Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the square root of the operand if it is a nonnegative -floating point number. +Return the same value as a corresponding libm '``sqrt``' function but without +trapping or setting ``errno``. For types specified by IEEE-754, the result +matches a conforming libm implementation. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.powi.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10557,7 +10566,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.sin`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10576,14 +10585,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the sine of the specified operand, returning the -same values as the libm ``sin`` functions would, and handles error -conditions in the same way. +Return the same value as a corresponding libm '``sin``' function but without +trapping or setting ``errno``. 
+ +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.cos.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10592,7 +10603,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.cos`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10611,14 +10622,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the cosine of the specified operand, returning the -same values as the libm ``cos`` functions would, and handles error -conditions in the same way. +Return the same value as a corresponding libm '``cos``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.pow.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10627,7 +10640,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.pow`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10647,15 +10660,16 @@ Arguments: """""""""" -The second argument is a floating point power, and the first is a value -to raise to that power. +The arguments and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the first value raised to the second power, -returning the same values as the libm ``pow`` functions would, and -handles error conditions in the same way. +Return the same value as a corresponding libm '``pow``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.exp.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10664,7 +10678,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.exp`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10684,13 +10698,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the same values as the libm ``exp`` functions -would, and handles error conditions in the same way. +Return the same value as a corresponding libm '``exp``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.exp2.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10699,7 +10716,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.exp2`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10719,13 +10736,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. 
Semantics: """""""""" -This function returns the same values as the libm ``exp2`` functions -would, and handles error conditions in the same way. +Return the same value as a corresponding libm '``exp2``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.log.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10734,7 +10754,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.log`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10754,13 +10774,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the same values as the libm ``log`` functions -would, and handles error conditions in the same way. +Return the same value as a corresponding libm '``log``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.log10.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10769,7 +10792,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.log10`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10789,13 +10812,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the same values as the libm ``log10`` functions -would, and handles error conditions in the same way. +Return the same value as a corresponding libm '``log10``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.log2.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10804,7 +10830,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.log2`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10824,13 +10850,16 @@ Arguments: """""""""" -The argument and return value are floating point numbers of the same type. +The argument and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the same values as the libm ``log2`` functions -would, and handles error conditions in the same way. +Return the same value as a corresponding libm '``log2``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.fma.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -10839,7 +10868,7 @@ """"""" This is an overloaded intrinsic. You can use ``llvm.fma`` on any -floating point or vector of floating point type. Not all targets support +floating-point or vector of floating-point type. Not all targets support all types however. :: @@ -10853,20 +10882,21 @@ Overview: """"""""" -The '``llvm.fma.*``' intrinsics perform the fused multiply-add -operation. 
+The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation. Arguments: """""""""" -The argument and return value are floating point numbers of the same -type. +The arguments and return value are floating-point numbers of the same type. Semantics: """""""""" -This function returns the same values as the libm ``fma`` functions -would, and does not set errno. +Return the same value as a corresponding libm '``fma``' function but without +trapping or setting ``errno``. + +When specified with the fast-math-flag 'afn', the result may be approximated +using a less accurate calculation. '``llvm.fabs.*``' Intrinsic ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Index: llvm/trunk/include/llvm/IR/Instruction.h =================================================================== --- llvm/trunk/include/llvm/IR/Instruction.h +++ llvm/trunk/include/llvm/IR/Instruction.h @@ -308,10 +308,15 @@ /// Determine whether the exact flag is set. bool isExact() const; - /// Set or clear the unsafe-algebra flag on this instruction, which must be an + /// Set or clear all fast-math-flags on this instruction, which must be an /// operator which supports this flag. See LangRef.html for the meaning of /// this flag. - void setHasUnsafeAlgebra(bool B); + void setFast(bool B); + + /// Set or clear the reassociation flag on this instruction, which must be + /// an operator which supports this flag. See LangRef.html for the meaning of + /// this flag. + void setHasAllowReassoc(bool B); /// Set or clear the no-nans flag on this instruction, which must be an /// operator which supports this flag. See LangRef.html for the meaning of @@ -333,6 +338,11 @@ /// this flag. void setHasAllowReciprocal(bool B); + /// Set or clear the approximate-math-functions flag on this instruction, + /// which must be an operator which supports this flag. See LangRef.html for + /// the meaning of this flag. + void setHasApproxFunc(bool B); + /// Convenience function for setting multiple fast-math flags on this /// instruction, which must be an operator which supports these flags. See /// LangRef.html for the meaning of these flags. @@ -343,8 +353,11 @@ /// LangRef.html for the meaning of these flags. void copyFastMathFlags(FastMathFlags FMF); - /// Determine whether the unsafe-algebra flag is set. - bool hasUnsafeAlgebra() const; + /// Determine whether all fast-math-flags are set. + bool isFast() const; + + /// Determine whether the allow-reassociation flag is set. + bool hasAllowReassoc() const; /// Determine whether the no-NaNs flag is set. bool hasNoNaNs() const; @@ -361,6 +374,9 @@ /// Determine whether the allow-contract flag is set. bool hasAllowContract() const; + /// Determine whether the approximate-math-functions flag is set. + bool hasApproxFunc() const; + /// Convenience function for getting all the fast-math flags, which must be an /// operator which supports these flags. See LangRef.html for the meaning of /// these flags. Index: llvm/trunk/include/llvm/IR/Operator.h =================================================================== --- llvm/trunk/include/llvm/IR/Operator.h +++ llvm/trunk/include/llvm/IR/Operator.h @@ -163,52 +163,61 @@ unsigned Flags = 0; - FastMathFlags(unsigned F) : Flags(F) { } + FastMathFlags(unsigned F) { + // If all 7 bits are set, turn this into -1. If the number of bits grows, + // this must be updated. This is intended to provide some forward binary + // compatibility insurance for the meaning of 'fast' in case bits are added. 
+ if (F == 0x7F) Flags = ~0U; + else Flags = F; + } public: - /// This is how the bits are used in Value::SubclassOptionalData so they - /// should fit there too. + // This is how the bits are used in Value::SubclassOptionalData so they + // should fit there too. + // WARNING: We're out of space. SubclassOptionalData only has 7 bits. New + // functionality will require a change in how this information is stored. enum { - UnsafeAlgebra = (1 << 0), + AllowReassoc = (1 << 0), NoNaNs = (1 << 1), NoInfs = (1 << 2), NoSignedZeros = (1 << 3), AllowReciprocal = (1 << 4), - AllowContract = (1 << 5) + AllowContract = (1 << 5), + ApproxFunc = (1 << 6) }; FastMathFlags() = default; - /// Whether any flag is set bool any() const { return Flags != 0; } + bool none() const { return Flags == 0; } + bool all() const { return Flags == ~0U; } - /// Set all the flags to false void clear() { Flags = 0; } + void set() { Flags = ~0U; } /// Flag queries + bool allowReassoc() const { return 0 != (Flags & AllowReassoc); } bool noNaNs() const { return 0 != (Flags & NoNaNs); } bool noInfs() const { return 0 != (Flags & NoInfs); } bool noSignedZeros() const { return 0 != (Flags & NoSignedZeros); } bool allowReciprocal() const { return 0 != (Flags & AllowReciprocal); } - bool allowContract() const { return 0 != (Flags & AllowContract); } - bool unsafeAlgebra() const { return 0 != (Flags & UnsafeAlgebra); } + bool allowContract() const { return 0 != (Flags & AllowContract); } + bool approxFunc() const { return 0 != (Flags & ApproxFunc); } + /// 'Fast' means all bits are set. + bool isFast() const { return all(); } /// Flag setters + void setAllowReassoc() { Flags |= AllowReassoc; } void setNoNaNs() { Flags |= NoNaNs; } void setNoInfs() { Flags |= NoInfs; } void setNoSignedZeros() { Flags |= NoSignedZeros; } void setAllowReciprocal() { Flags |= AllowReciprocal; } + // TODO: Change the other set* functions to take a parameter? void setAllowContract(bool B) { Flags = (Flags & ~AllowContract) | B * AllowContract; } - void setUnsafeAlgebra() { - Flags |= UnsafeAlgebra; - setNoNaNs(); - setNoInfs(); - setNoSignedZeros(); - setAllowReciprocal(); - setAllowContract(true); - } + void setApproxFunc() { Flags |= ApproxFunc; } + void setFast() { set(); } void operator&=(const FastMathFlags &OtherFlags) { Flags &= OtherFlags.Flags; @@ -221,18 +230,21 @@ private: friend class Instruction; - void setHasUnsafeAlgebra(bool B) { - SubclassOptionalData = - (SubclassOptionalData & ~FastMathFlags::UnsafeAlgebra) | - (B * FastMathFlags::UnsafeAlgebra); + /// 'Fast' means all bits are set. + void setFast(bool B) { + setHasAllowReassoc(B); + setHasNoNaNs(B); + setHasNoInfs(B); + setHasNoSignedZeros(B); + setHasAllowReciprocal(B); + setHasAllowContract(B); + setHasApproxFunc(B); + } - // Unsafe algebra implies all the others - if (B) { - setHasNoNaNs(true); - setHasNoInfs(true); - setHasNoSignedZeros(true); - setHasAllowReciprocal(true); - } + void setHasAllowReassoc(bool B) { + SubclassOptionalData = + (SubclassOptionalData & ~FastMathFlags::AllowReassoc) | + (B * FastMathFlags::AllowReassoc); } void setHasNoNaNs(bool B) { @@ -265,6 +277,12 @@ (B * FastMathFlags::AllowContract); } + void setHasApproxFunc(bool B) { + SubclassOptionalData = + (SubclassOptionalData & ~FastMathFlags::ApproxFunc) | + (B * FastMathFlags::ApproxFunc); + } + /// Convenience function for setting multiple fast-math flags. /// FMF is a mask of the bits to set. 
void setFastMathFlags(FastMathFlags FMF) { @@ -278,42 +296,53 @@ } public: - /// Test whether this operation is permitted to be - /// algebraically transformed, aka the 'A' fast-math property. - bool hasUnsafeAlgebra() const { - return (SubclassOptionalData & FastMathFlags::UnsafeAlgebra) != 0; + /// Test if this operation allows all non-strict floating-point transforms. + bool isFast() const { + return ((SubclassOptionalData & FastMathFlags::AllowReassoc) != 0 && + (SubclassOptionalData & FastMathFlags::NoNaNs) != 0 && + (SubclassOptionalData & FastMathFlags::NoInfs) != 0 && + (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0 && + (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0 && + (SubclassOptionalData & FastMathFlags::AllowContract) != 0 && + (SubclassOptionalData & FastMathFlags::ApproxFunc) != 0); + } + + /// Test if this operation may be simplified with reassociative transforms. + bool hasAllowReassoc() const { + return (SubclassOptionalData & FastMathFlags::AllowReassoc) != 0; } - /// Test whether this operation's arguments and results are to be - /// treated as non-NaN, aka the 'N' fast-math property. + /// Test if this operation's arguments and results are assumed not-NaN. bool hasNoNaNs() const { return (SubclassOptionalData & FastMathFlags::NoNaNs) != 0; } - /// Test whether this operation's arguments and results are to be - /// treated as NoN-Inf, aka the 'I' fast-math property. + /// Test if this operation's arguments and results are assumed not-infinite. bool hasNoInfs() const { return (SubclassOptionalData & FastMathFlags::NoInfs) != 0; } - /// Test whether this operation can treat the sign of zero - /// as insignificant, aka the 'S' fast-math property. + /// Test if this operation can ignore the sign of zero. bool hasNoSignedZeros() const { return (SubclassOptionalData & FastMathFlags::NoSignedZeros) != 0; } - /// Test whether this operation is permitted to use - /// reciprocal instead of division, aka the 'R' fast-math property. + /// Test if this operation can use reciprocal multiply instead of division. bool hasAllowReciprocal() const { return (SubclassOptionalData & FastMathFlags::AllowReciprocal) != 0; } - /// Test whether this operation is permitted to - /// be floating-point contracted. + /// Test if this operation can be floating-point contracted (FMA). bool hasAllowContract() const { return (SubclassOptionalData & FastMathFlags::AllowContract) != 0; } + /// Test if this operation allows approximations of math library functions or + /// intrinsics. + bool hasApproxFunc() const { + return (SubclassOptionalData & FastMathFlags::ApproxFunc) != 0; + } + /// Convenience function for getting all the fast-math flags FastMathFlags getFastMathFlags() const { return FastMathFlags(SubclassOptionalData); Index: llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h =================================================================== --- llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h +++ llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h @@ -331,15 +331,13 @@ /// not have the "fast-math" property. Such operation requires a relaxed FP /// mode. bool hasUnsafeAlgebra() { - return InductionBinOp && - !cast(InductionBinOp)->hasUnsafeAlgebra(); + return InductionBinOp && !cast(InductionBinOp)->isFast(); } /// Returns induction operator that does not have "fast-math" property /// and requires FP unsafe mode. 
Instruction *getUnsafeAlgebraInst() { - if (!InductionBinOp || - cast(InductionBinOp)->hasUnsafeAlgebra()) + if (!InductionBinOp || cast(InductionBinOp)->isFast()) return nullptr; return InductionBinOp; } Index: llvm/trunk/lib/AsmParser/LLLexer.cpp =================================================================== --- llvm/trunk/lib/AsmParser/LLLexer.cpp +++ llvm/trunk/lib/AsmParser/LLLexer.cpp @@ -552,6 +552,8 @@ KEYWORD(nsz); KEYWORD(arcp); KEYWORD(contract); + KEYWORD(reassoc); + KEYWORD(afn); KEYWORD(fast); KEYWORD(nuw); KEYWORD(nsw); Index: llvm/trunk/lib/AsmParser/LLParser.h =================================================================== --- llvm/trunk/lib/AsmParser/LLParser.h +++ llvm/trunk/lib/AsmParser/LLParser.h @@ -193,7 +193,7 @@ FastMathFlags FMF; while (true) switch (Lex.getKind()) { - case lltok::kw_fast: FMF.setUnsafeAlgebra(); Lex.Lex(); continue; + case lltok::kw_fast: FMF.setFast(); Lex.Lex(); continue; case lltok::kw_nnan: FMF.setNoNaNs(); Lex.Lex(); continue; case lltok::kw_ninf: FMF.setNoInfs(); Lex.Lex(); continue; case lltok::kw_nsz: FMF.setNoSignedZeros(); Lex.Lex(); continue; @@ -202,6 +202,8 @@ FMF.setAllowContract(true); Lex.Lex(); continue; + case lltok::kw_reassoc: FMF.setAllowReassoc(); Lex.Lex(); continue; + case lltok::kw_afn: FMF.setApproxFunc(); Lex.Lex(); continue; default: return FMF; } return FMF; Index: llvm/trunk/lib/AsmParser/LLToken.h =================================================================== --- llvm/trunk/lib/AsmParser/LLToken.h +++ llvm/trunk/lib/AsmParser/LLToken.h @@ -102,6 +102,8 @@ kw_nsz, kw_arcp, kw_contract, + kw_reassoc, + kw_afn, kw_fast, kw_nuw, kw_nsw, Index: llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp =================================================================== --- llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp +++ llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp @@ -1046,8 +1046,8 @@ static FastMathFlags getDecodedFastMathFlags(unsigned Val) { FastMathFlags FMF; - if (0 != (Val & FastMathFlags::UnsafeAlgebra)) - FMF.setUnsafeAlgebra(); + if (0 != (Val & FastMathFlags::AllowReassoc)) + FMF.setAllowReassoc(); if (0 != (Val & FastMathFlags::NoNaNs)) FMF.setNoNaNs(); if (0 != (Val & FastMathFlags::NoInfs)) @@ -1058,6 +1058,8 @@ FMF.setAllowReciprocal(); if (0 != (Val & FastMathFlags::AllowContract)) FMF.setAllowContract(true); + if (0 != (Val & FastMathFlags::ApproxFunc)) + FMF.setApproxFunc(); return FMF; } Index: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp =================================================================== --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp @@ -1321,8 +1321,8 @@ if (PEO->isExact()) Flags |= 1 << bitc::PEO_EXACT; } else if (const auto *FPMO = dyn_cast(V)) { - if (FPMO->hasUnsafeAlgebra()) - Flags |= FastMathFlags::UnsafeAlgebra; + if (FPMO->hasAllowReassoc()) + Flags |= FastMathFlags::AllowReassoc; if (FPMO->hasNoNaNs()) Flags |= FastMathFlags::NoNaNs; if (FPMO->hasNoInfs()) @@ -1333,6 +1333,8 @@ Flags |= FastMathFlags::AllowReciprocal; if (FPMO->hasAllowContract()) Flags |= FastMathFlags::AllowContract; + if (FPMO->hasApproxFunc()) + Flags |= FastMathFlags::ApproxFunc; } return Flags; Index: llvm/trunk/lib/CodeGen/ExpandReductions.cpp =================================================================== --- llvm/trunk/lib/CodeGen/ExpandReductions.cpp +++ llvm/trunk/lib/CodeGen/ExpandReductions.cpp @@ -95,7 +95,7 @@ // and it can't be handled by generating this shuffle sequence. 
// TODO: Implement scalarization of ordered reductions here for targets // without native support. - if (!II->getFastMathFlags().unsafeAlgebra()) + if (!II->getFastMathFlags().isFast()) continue; Vec = II->getArgOperand(1); break; Index: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp =================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -2585,7 +2585,7 @@ case Instruction::FAdd: case Instruction::FMul: if (const FPMathOperator *FPOp = dyn_cast(Inst)) - if (FPOp->getFastMathFlags().unsafeAlgebra()) + if (FPOp->getFastMathFlags().isFast()) break; LLVM_FALLTHROUGH; default: @@ -2631,7 +2631,7 @@ if (Inst->getOpcode() == OpCode || isa(U)) { if (const FPMathOperator *FPOp = dyn_cast(Inst)) - if (!isa(FPOp) && !FPOp->getFastMathFlags().unsafeAlgebra()) + if (!isa(FPOp) && !FPOp->getFastMathFlags().isFast()) return false; UsersToVisit.push_back(U); } else if (const ShuffleVectorInst *ShufInst = @@ -2725,7 +2725,7 @@ Flags.setNoInfs(FMF.noInfs()); Flags.setNoNaNs(FMF.noNaNs()); Flags.setNoSignedZeros(FMF.noSignedZeros()); - Flags.setUnsafeAlgebra(FMF.unsafeAlgebra()); + Flags.setUnsafeAlgebra(FMF.isFast()); SDValue BinNodeValue = DAG.getNode(OpCode, getCurSDLoc(), Op1.getValueType(), Op1, Op2, Flags); @@ -7959,13 +7959,13 @@ switch (Intrinsic) { case Intrinsic::experimental_vector_reduce_fadd: - if (FMF.unsafeAlgebra()) + if (FMF.isFast()) Res = DAG.getNode(ISD::VECREDUCE_FADD, dl, VT, Op2); else Res = DAG.getNode(ISD::VECREDUCE_STRICT_FADD, dl, VT, Op1, Op2); break; case Intrinsic::experimental_vector_reduce_fmul: - if (FMF.unsafeAlgebra()) + if (FMF.isFast()) Res = DAG.getNode(ISD::VECREDUCE_FMUL, dl, VT, Op2); else Res = DAG.getNode(ISD::VECREDUCE_STRICT_FMUL, dl, VT, Op1, Op2); Index: llvm/trunk/lib/IR/AsmWriter.cpp =================================================================== --- llvm/trunk/lib/IR/AsmWriter.cpp +++ llvm/trunk/lib/IR/AsmWriter.cpp @@ -1108,10 +1108,12 @@ static void WriteOptimizationInfo(raw_ostream &Out, const User *U) { if (const FPMathOperator *FPO = dyn_cast(U)) { - // Unsafe algebra implies all the others, no need to write them all out - if (FPO->hasUnsafeAlgebra()) + // 'Fast' is an abbreviation for all fast-math-flags. 
+ if (FPO->isFast()) Out << " fast"; else { + if (FPO->hasAllowReassoc()) + Out << " reassoc"; if (FPO->hasNoNaNs()) Out << " nnan"; if (FPO->hasNoInfs()) @@ -1122,6 +1124,8 @@ Out << " arcp"; if (FPO->hasAllowContract()) Out << " contract"; + if (FPO->hasApproxFunc()) + Out << " afn"; } } Index: llvm/trunk/lib/IR/Instruction.cpp =================================================================== --- llvm/trunk/lib/IR/Instruction.cpp +++ llvm/trunk/lib/IR/Instruction.cpp @@ -146,9 +146,14 @@ return cast(this)->isExact(); } -void Instruction::setHasUnsafeAlgebra(bool B) { +void Instruction::setFast(bool B) { assert(isa(this) && "setting fast-math flag on invalid op"); - cast(this)->setHasUnsafeAlgebra(B); + cast(this)->setFast(B); +} + +void Instruction::setHasAllowReassoc(bool B) { + assert(isa(this) && "setting fast-math flag on invalid op"); + cast(this)->setHasAllowReassoc(B); } void Instruction::setHasNoNaNs(bool B) { @@ -171,6 +176,11 @@ cast(this)->setHasAllowReciprocal(B); } +void Instruction::setHasApproxFunc(bool B) { + assert(isa(this) && "setting fast-math flag on invalid op"); + cast(this)->setHasApproxFunc(B); +} + void Instruction::setFastMathFlags(FastMathFlags FMF) { assert(isa(this) && "setting fast-math flag on invalid op"); cast(this)->setFastMathFlags(FMF); @@ -181,9 +191,14 @@ cast(this)->copyFastMathFlags(FMF); } -bool Instruction::hasUnsafeAlgebra() const { +bool Instruction::isFast() const { assert(isa(this) && "getting fast-math flag on invalid op"); - return cast(this)->hasUnsafeAlgebra(); + return cast(this)->isFast(); +} + +bool Instruction::hasAllowReassoc() const { + assert(isa(this) && "getting fast-math flag on invalid op"); + return cast(this)->hasAllowReassoc(); } bool Instruction::hasNoNaNs() const { @@ -211,6 +226,11 @@ return cast(this)->hasAllowContract(); } +bool Instruction::hasApproxFunc() const { + assert(isa(this) && "getting fast-math flag on invalid op"); + return cast(this)->hasApproxFunc(); +} + FastMathFlags Instruction::getFastMathFlags() const { assert(isa(this) && "getting fast-math flag on invalid op"); return cast(this)->getFastMathFlags(); @@ -579,7 +599,7 @@ switch (Opcode) { case FMul: case FAdd: - return cast(this)->hasUnsafeAlgebra(); + return cast(this)->isFast(); default: return false; } Index: llvm/trunk/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp =================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp @@ -400,7 +400,7 @@ return false; FastMathFlags FMF = FPOp->getFastMathFlags(); - bool UnsafeDiv = HasUnsafeFPMath || FMF.unsafeAlgebra() || + bool UnsafeDiv = HasUnsafeFPMath || FMF.isFast() || FMF.allowReciprocal(); // With UnsafeDiv node will be optimized to just rcp and mul. 
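
The header changes above redefine 'fast' as the union of seven independent bits rather than a flag of its own. A minimal sketch of what this means for clients of the FastMathFlags API, assuming only the post-patch methods shown in the Operator.h hunk (the include path and the standalone main are illustrative, not part of the patch):

    // Sketch only: exercises the post-patch FastMathFlags API.
    #include "llvm/IR/Operator.h"
    #include <cassert>

    int main() {
      llvm::FastMathFlags FMF;          // default-constructed: no flags set
      assert(FMF.none());

      // 'reassoc' and 'afn' are independent bits and do not imply each other.
      FMF.setAllowReassoc();
      FMF.setApproxFunc();
      assert(FMF.allowReassoc() && FMF.approxFunc());
      assert(!FMF.isFast());            // only 2 of the 7 bits are set

      // 'fast' is no longer a separate bit; it simply means "all bits set".
      FMF.setFast();
      assert(FMF.all() && FMF.isFast());
      return 0;
    }

Note that callers which previously guarded transforms with unsafeAlgebra() are mapped to the equally strong isFast() throughout this patch (see the Reassociate and SimplifyLibCalls hunks below), not to the weaker allowReassoc() query.
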
Index: llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp =================================================================== --- llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp +++ llvm/trunk/lib/Target/AMDGPU/AMDGPULibCalls.cpp @@ -487,7 +487,7 @@ bool AMDGPULibCalls::isUnsafeMath(const CallInst *CI) const { if (auto Op = dyn_cast(CI)) - if (Op->hasUnsafeAlgebra()) + if (Op->isFast()) return true; const Function *F = CI->getParent()->getParent(); Attribute Attr = F->getFnAttribute("unsafe-fp-math"); Index: llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp =================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineAddSub.cpp @@ -482,7 +482,7 @@ return nullptr; FastMathFlags Flags; - Flags.setUnsafeAlgebra(); + Flags.setFast(); if (I0) Flags &= I->getFastMathFlags(); if (I1) Flags &= I->getFastMathFlags(); @@ -511,7 +511,7 @@ } Value *FAddCombine::simplify(Instruction *I) { - assert(I->hasUnsafeAlgebra() && "Should be in unsafe mode"); + assert(I->isFast() && "Expected 'fast' instruction"); // Currently we are not able to handle vector type. if (I->getType()->isVectorTy()) @@ -1386,7 +1386,7 @@ if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS)) return replaceInstUsesWith(I, V); - if (I.hasUnsafeAlgebra()) { + if (I.isFast()) { if (Value *V = FAddCombine(Builder).simplify(&I)) return replaceInstUsesWith(I, V); } @@ -1736,7 +1736,7 @@ if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1)) return replaceInstUsesWith(I, V); - if (I.hasUnsafeAlgebra()) { + if (I.isFast()) { if (Value *V = FAddCombine(Builder).simplify(&I)) return replaceInstUsesWith(I, V); } Index: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp =================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp @@ -2017,7 +2017,7 @@ } case Intrinsic::fmuladd: { // Canonicalize fast fmuladd to the separate fmul + fadd. - if (II->hasUnsafeAlgebra()) { + if (II->isFast()) { BuilderTy::FastMathFlagGuard Guard(Builder); Builder.setFastMathFlags(II->getFastMathFlags()); Value *Mul = Builder.CreateFMul(II->getArgOperand(0), Index: llvm/trunk/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp =================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp @@ -487,7 +487,7 @@ IntrinsicInst *II = dyn_cast(Op); if (!II) return; - if (II->getIntrinsicID() != Intrinsic::log2 || !II->hasUnsafeAlgebra()) + if (II->getIntrinsicID() != Intrinsic::log2 || !II->isFast()) return; Log2 = II; @@ -498,7 +498,8 @@ Instruction *I = dyn_cast(OpLog2Of); if (!I) return; - if (I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra()) + + if (I->getOpcode() != Instruction::FMul || !I->isFast()) return; if (match(I->getOperand(0), m_SpecificFP(0.5))) @@ -601,7 +602,7 @@ } if (R) { - R->setHasUnsafeAlgebra(true); + R->setFast(true); InsertNewInstWith(R, *InsertBefore); } @@ -622,7 +623,7 @@ SQ.getWithInstruction(&I))) return replaceInstUsesWith(I, V); - bool AllowReassociate = I.hasUnsafeAlgebra(); + bool AllowReassociate = I.isFast(); // Simplify mul instructions with a constant RHS. 
if (isa(Op1)) { @@ -1341,7 +1342,7 @@ if (Instruction *R = FoldOpIntoSelect(I, SI)) return R; - bool AllowReassociate = I.hasUnsafeAlgebra(); + bool AllowReassociate = I.isFast(); bool AllowReciprocal = I.hasAllowReciprocal(); if (Constant *Op1C = dyn_cast(Op1)) { Index: llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp =================================================================== --- llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp +++ llvm/trunk/lib/Transforms/Scalar/Reassociate.cpp @@ -145,8 +145,7 @@ static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode) { if (V->hasOneUse() && isa(V) && cast(V)->getOpcode() == Opcode && - (!isa(V) || - cast(V)->hasUnsafeAlgebra())) + (!isa(V) || cast(V)->isFast())) return cast(V); return nullptr; } @@ -156,8 +155,7 @@ if (V->hasOneUse() && isa(V) && (cast(V)->getOpcode() == Opcode1 || cast(V)->getOpcode() == Opcode2) && - (!isa(V) || - cast(V)->hasUnsafeAlgebra())) + (!isa(V) || cast(V)->isFast())) return cast(V); return nullptr; } @@ -565,7 +563,7 @@ assert((!isa(Op) || cast(Op)->getOpcode() != Opcode || (isa(Op) && - !cast(Op)->hasUnsafeAlgebra())) && + !cast(Op)->isFast())) && "Should have been handled above!"); assert(Op->hasOneUse() && "Has uses outside the expression tree!"); @@ -2017,8 +2015,8 @@ if (I->isCommutative()) canonicalizeOperands(I); - // Don't optimize floating point instructions that don't have unsafe algebra. - if (I->getType()->isFPOrFPVectorTy() && !I->hasUnsafeAlgebra()) + // Don't optimize floating-point instructions unless they are 'fast'. + if (I->getType()->isFPOrFPVectorTy() && !I->isFast()) return; // Do not reassociate boolean (i1) expressions. We want to preserve the Index: llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp =================================================================== --- llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp +++ llvm/trunk/lib/Transforms/Utils/LoopUtils.cpp @@ -432,7 +432,7 @@ InstDesc &Prev, bool HasFunNoNaNAttr) { bool FP = I->getType()->isFloatingPointTy(); Instruction *UAI = Prev.getUnsafeAlgebraInst(); - if (!UAI && FP && !I->hasUnsafeAlgebra()) + if (!UAI && FP && !I->isFast()) UAI = I; // Found an unsafe (unvectorizable) algebra instruction. switch (I->getOpcode()) { @@ -660,11 +660,11 @@ break; } - // We only match FP sequences with unsafe algebra, so we can unconditionally + // We only match FP sequences that are 'fast', so we can unconditionally // set it on any generated instructions. IRBuilder<>::FastMathFlagGuard FMFG(Builder); FastMathFlags FMF; - FMF.setUnsafeAlgebra(); + FMF.setFast(); Builder.setFastMathFlags(FMF); Value *Cmp; @@ -768,7 +768,7 @@ // Floating point operations had to be 'fast' to enable the induction. FastMathFlags Flags; - Flags.setUnsafeAlgebra(); + Flags.setFast(); Value *MulExp = B.CreateFMul(StepValue, Index); if (isa(MulExp)) @@ -1338,7 +1338,7 @@ static Value *addFastMathFlag(Value *V) { if (isa(V)) { FastMathFlags Flags; - Flags.setUnsafeAlgebra(); + Flags.setFast(); cast(V)->setFastMathFlags(Flags); } return V; @@ -1401,7 +1401,7 @@ RD::MinMaxRecurrenceKind MinMaxKind = RD::MRK_Invalid; // TODO: Support creating ordered reductions. 
FastMathFlags FMFUnsafe; - FMFUnsafe.setUnsafeAlgebra(); + FMFUnsafe.setFast(); switch (Opcode) { case Instruction::Add: Index: llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp =================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp +++ llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp @@ -1111,7 +1111,7 @@ // Example: x = 1000, y = 0.001. // pow(exp(x), y) = pow(inf, 0.001) = inf, whereas exp(x*y) = exp(1). auto *OpC = dyn_cast(Op1); - if (OpC && OpC->hasUnsafeAlgebra() && CI->hasUnsafeAlgebra()) { + if (OpC && OpC->isFast() && CI->isFast()) { LibFunc Func; Function *OpCCallee = OpC->getCalledFunction(); if (OpCCallee && TLI->getLibFunc(OpCCallee->getName(), Func) && @@ -1136,7 +1136,7 @@ LibFunc_sqrtl)) { // If -ffast-math: // pow(x, -0.5) -> 1.0 / sqrt(x) - if (CI->hasUnsafeAlgebra()) { + if (CI->isFast()) { IRBuilder<>::FastMathFlagGuard Guard(B); B.setFastMathFlags(CI->getFastMathFlags()); @@ -1157,7 +1157,7 @@ LibFunc_sqrtl)) { // In -ffast-math, pow(x, 0.5) -> sqrt(x). - if (CI->hasUnsafeAlgebra()) { + if (CI->isFast()) { IRBuilder<>::FastMathFlagGuard Guard(B); B.setFastMathFlags(CI->getFastMathFlags()); @@ -1196,7 +1196,7 @@ return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Op1, "powrecip"); // In -ffast-math, generate repeated fmul instead of generating pow(x, n). - if (CI->hasUnsafeAlgebra()) { + if (CI->isFast()) { APFloat V = abs(Op2C->getValueAPF()); // We limit to a max of 7 fmul(s). Thus max exponent is 32. // This transformation applies to integer exponents only. @@ -1284,9 +1284,9 @@ IRBuilder<>::FastMathFlagGuard Guard(B); FastMathFlags FMF; - if (CI->hasUnsafeAlgebra()) { - // Unsafe algebra sets all fast-math-flags to true. - FMF.setUnsafeAlgebra(); + if (CI->isFast()) { + // If the call is 'fast', then anything we create here will also be 'fast'. + FMF.setFast(); } else { // At a minimum, no-nans-fp-math must be true. if (!CI->hasNoNaNs()) @@ -1317,13 +1317,13 @@ if (UnsafeFPShrink && hasFloatVersion(Name)) Ret = optimizeUnaryDoubleFP(CI, B, true); - if (!CI->hasUnsafeAlgebra()) + if (!CI->isFast()) return Ret; Value *Op1 = CI->getArgOperand(0); auto *OpC = dyn_cast(Op1); - // The earlier call must also be unsafe in order to do these transforms. - if (!OpC || !OpC->hasUnsafeAlgebra()) + // The earlier call must also be 'fast' in order to do these transforms. + if (!OpC || !OpC->isFast()) return Ret; // log(pow(x,y)) -> y*log(x) @@ -1333,7 +1333,7 @@ IRBuilder<>::FastMathFlagGuard Guard(B); FastMathFlags FMF; - FMF.setUnsafeAlgebra(); + FMF.setFast(); B.setFastMathFlags(FMF); LibFunc Func; @@ -1365,11 +1365,11 @@ Callee->getIntrinsicID() == Intrinsic::sqrt)) Ret = optimizeUnaryDoubleFP(CI, B, true); - if (!CI->hasUnsafeAlgebra()) + if (!CI->isFast()) return Ret; Instruction *I = dyn_cast(CI->getArgOperand(0)); - if (!I || I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra()) + if (!I || I->getOpcode() != Instruction::FMul || !I->isFast()) return Ret; // We're looking for a repeated factor in a multiplication tree, @@ -1391,8 +1391,7 @@ Value *OtherMul0, *OtherMul1; if (match(Op0, m_FMul(m_Value(OtherMul0), m_Value(OtherMul1)))) { // Pattern: sqrt((x * y) * z) - if (OtherMul0 == OtherMul1 && - cast(Op0)->hasUnsafeAlgebra()) { + if (OtherMul0 == OtherMul1 && cast(Op0)->isFast()) { // Matched: sqrt((x * x) * z) RepeatOp = OtherMul0; OtherOp = Op1; @@ -1437,8 +1436,8 @@ if (!OpC) return Ret; - // Both calls must allow unsafe optimizations in order to remove them. 
- if (!CI->hasUnsafeAlgebra() || !OpC->hasUnsafeAlgebra()) + // Both calls must be 'fast' in order to remove them. + if (!CI->isFast() || !OpC->isFast()) return Ret; // tan(atan(x)) -> x @@ -2167,10 +2166,10 @@ // Command-line parameter overrides instruction attribute. // This can't be moved to optimizeFloatingPointLibCall() because it may be - // used by the intrinsic optimizations. + // used by the intrinsic optimizations. if (EnableUnsafeFPShrink.getNumOccurrences() > 0) UnsafeFPShrink = EnableUnsafeFPShrink; - else if (isa(CI) && CI->hasUnsafeAlgebra()) + else if (isa(CI) && CI->isFast()) UnsafeFPShrink = true; // First, check for intrinsics. Index: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp =================================================================== --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -385,7 +385,7 @@ static Value *addFastMathFlag(Value *V) { if (isa(V)) { FastMathFlags Flags; - Flags.setUnsafeAlgebra(); + Flags.setFast(); cast(V)->setFastMathFlags(Flags); } return V; @@ -2720,7 +2720,7 @@ // Floating point operations had to be 'fast' to enable the induction. FastMathFlags Flags; - Flags.setUnsafeAlgebra(); + Flags.setFast(); Value *MulOp = Builder.CreateFMul(Cv, Step); if (isa(MulOp)) @@ -5396,7 +5396,7 @@ // operations, shuffles, or casts, as they don't change precision or // semantics. } else if (I.getType()->isFloatingPointTy() && (CI || I.isBinaryOp()) && - !I.hasUnsafeAlgebra()) { + !I.isFast()) { DEBUG(dbgs() << "LV: Found FP op with unsafe algebra.\n"); Hints->setPotentiallyUnsafe(); } Index: llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp =================================================================== --- llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -4880,7 +4880,7 @@ case RK_Min: case RK_Max: return Opcode == Instruction::ICmp || - cast(I->getOperand(0))->hasUnsafeAlgebra(); + cast(I->getOperand(0))->isFast(); case RK_UMin: case RK_UMax: assert(Opcode == Instruction::ICmp && @@ -5232,7 +5232,7 @@ Value *VectorizedTree = nullptr; IRBuilder<> Builder(ReductionRoot); FastMathFlags Unsafe; - Unsafe.setUnsafeAlgebra(); + Unsafe.setFast(); Builder.setFastMathFlags(Unsafe); unsigned i = 0; Index: llvm/trunk/test/Assembler/fast-math-flags.ll =================================================================== --- llvm/trunk/test/Assembler/fast-math-flags.ll +++ llvm/trunk/test/Assembler/fast-math-flags.ll @@ -7,6 +7,8 @@ @vec = external global <3 x float> @arr = external global [3 x float] +declare float @foo(float) + define float @none(float %x, float %y) { entry: ; CHECK: %vec = load <3 x float>, <3 x float>* @vec @@ -86,6 +88,28 @@ ret float %c } +; CHECK: @reassoc( +define float @reassoc(float %x, float %y) { +; CHECK: %a = fsub reassoc float %x, %y + %a = fsub reassoc float %x, %y +; CHECK: %b = fmul reassoc float %x, %y + %b = fmul reassoc float %x, %y +; CHECK: %c = call reassoc float @foo(float %b) + %c = call reassoc float @foo(float %b) + ret float %c +} + +; CHECK: @afn( +define float @afn(float %x, float %y) { +; CHECK: %a = fdiv afn float %x, %y + %a = fdiv afn float %x, %y +; CHECK: %b = frem afn float %x, %y + %b = frem afn float %x, %y +; CHECK: %c = call afn float @foo(float %b) + %c = call afn float @foo(float %b) + ret float %c +} + ; CHECK: no_nan_inf define float @no_nan_inf(float %x, float %y) { entry: @@ -130,10 +154,10 @@ ; CHECK: %arr = load [3 x float], [3 x float]* @arr %arr = 
load [3 x float], [3 x float]* @arr -; CHECK: %a = fadd nnan ninf float %x, %y - %a = fadd ninf nnan float %x, %y -; CHECK: %a_vec = fadd nnan <3 x float> %vec, %vec - %a_vec = fadd nnan <3 x float> %vec, %vec +; CHECK: %a = fadd nnan ninf afn float %x, %y + %a = fadd ninf nnan afn float %x, %y +; CHECK: %a_vec = fadd reassoc nnan <3 x float> %vec, %vec + %a_vec = fadd reassoc nnan <3 x float> %vec, %vec ; CHECK: %b = fsub fast float %x, %y %b = fsub nnan nsz fast float %x, %y ; CHECK: %b_vec = fsub nnan <3 x float> %vec, %vec Index: llvm/trunk/test/Bitcode/compatibility-3.6.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-3.6.ll +++ llvm/trunk/test/Bitcode/compatibility-3.6.ll @@ -612,7 +612,9 @@ %f.arcp = fadd arcp float %op1, %op2 ; CHECK: %f.arcp = fadd arcp float %op1, %op2 %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2 ret void } Index: llvm/trunk/test/Bitcode/compatibility-3.7.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-3.7.ll +++ llvm/trunk/test/Bitcode/compatibility-3.7.ll @@ -656,7 +656,9 @@ %f.arcp = fadd arcp float %op1, %op2 ; CHECK: %f.arcp = fadd arcp float %op1, %op2 %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2 ret void } Index: llvm/trunk/test/Bitcode/compatibility-3.8.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-3.8.ll +++ llvm/trunk/test/Bitcode/compatibility-3.8.ll @@ -687,7 +687,9 @@ %f.arcp = fadd arcp float %op1, %op2 ; CHECK: %f.arcp = fadd arcp float %op1, %op2 %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2 ret void } @@ -700,7 +702,9 @@ ; CHECK-LABEL: fastMathFlagsForCalls( define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) { %call.fast = call fast float @fmf1() - ; CHECK: %call.fast = call fast float @fmf1() + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'aml' bits set, so this is not fully 'fast'. + ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1() ; Throw in some other attributes to make sure those stay in the right places. Index: llvm/trunk/test/Bitcode/compatibility-3.9.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-3.9.ll +++ llvm/trunk/test/Bitcode/compatibility-3.9.ll @@ -758,7 +758,9 @@ %f.arcp = fadd arcp float %op1, %op2 ; CHECK: %f.arcp = fadd arcp float %op1, %op2 %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. 
+ ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2 ret void } @@ -771,7 +773,9 @@ ; CHECK-LABEL: fastMathFlagsForCalls( define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) { %call.fast = call fast float @fmf1() - ; CHECK: %call.fast = call fast float @fmf1() + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1() ; Throw in some other attributes to make sure those stay in the right places. Index: llvm/trunk/test/Bitcode/compatibility-4.0.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-4.0.ll +++ llvm/trunk/test/Bitcode/compatibility-4.0.ll @@ -757,8 +757,10 @@ ; CHECK: %f.nsz = fadd nsz float %op1, %op2 %f.arcp = fadd arcp float %op1, %op2 ; CHECK: %f.arcp = fadd arcp float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp float %op1, %op2 ret void } @@ -771,7 +773,9 @@ ; CHECK-LABEL: fastMathFlagsForCalls( define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) { %call.fast = call fast float @fmf1() - ; CHECK: %call.fast = call fast float @fmf1() + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'contract' and 'afn' bits set, so this is not fully 'fast'. + ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp float @fmf1() ; Throw in some other attributes to make sure those stay in the right places. Index: llvm/trunk/test/Bitcode/compatibility-5.0.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility-5.0.ll +++ llvm/trunk/test/Bitcode/compatibility-5.0.ll @@ -765,7 +765,9 @@ %f.contract = fadd contract float %op1, %op2 ; CHECK: %f.contract = fadd contract float %op1, %op2 %f.fast = fadd fast float %op1, %op2 - ; CHECK: %f.fast = fadd fast float %op1, %op2 + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'. + ; CHECK: %f.fast = fadd reassoc nnan ninf nsz arcp contract float %op1, %op2 ret void } @@ -778,7 +780,9 @@ ; CHECK-LABEL: fastMathFlagsForCalls( define void @fastMathFlagsForCalls(float %f, double %d1, <4 x double> %d2) { %call.fast = call fast float @fmf1() - ; CHECK: %call.fast = call fast float @fmf1() + ; 'fast' used to be its own bit, but this changed in Oct 2017. + ; The binary test file does not have the newer 'afn' bit set, so this is not fully 'fast'. + ; CHECK: %call.fast = call reassoc nnan ninf nsz arcp contract float @fmf1() ; Throw in some other attributes to make sure those stay in the right places. 
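
The compatibility tests above all follow from the same decoding rule: bitcode produced before this change cannot carry the new 'afn' bit (and pre-5.0 bitcode also lacks 'contract'), while the old UnsafeAlgebra bit (bit 0) now reads back as 'reassoc'. A hedged sketch of that corner for a 5.0-era 'fast' operation, again using the post-patch API with an illustrative standalone main:

    // Sketch only: why pre-patch 'fast' bitcode no longer round-trips as 'fast'.
    #include "llvm/IR/Operator.h"
    #include <cassert>

    int main() {
      // Flags recovered from a 5.0-era 'fast' operation: bit 0 decodes as
      // 'reassoc', and 'afn' cannot be present in the old file.
      llvm::FastMathFlags Old;
      Old.setAllowReassoc();
      Old.setNoNaNs();
      Old.setNoInfs();
      Old.setNoSignedZeros();
      Old.setAllowReciprocal();
      Old.setAllowContract(true);

      // Not all seven bits are set, so the printer emits the individual flags
      // ("reassoc nnan ninf nsz arcp contract") rather than "fast".
      assert(Old.any() && !Old.isFast());
      return 0;
    }
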
Index: llvm/trunk/test/Bitcode/compatibility.ll =================================================================== --- llvm/trunk/test/Bitcode/compatibility.ll +++ llvm/trunk/test/Bitcode/compatibility.ll @@ -775,6 +775,10 @@ ; CHECK: %f.arcp = fadd arcp float %op1, %op2 %f.contract = fadd contract float %op1, %op2 ; CHECK: %f.contract = fadd contract float %op1, %op2 + %f.afn = fadd afn float %op1, %op2 + ; CHECK: %f.afn = fadd afn float %op1, %op2 + %f.reassoc = fadd reassoc float %op1, %op2 + ; CHECK: %f.reassoc = fadd reassoc float %op1, %op2 %f.fast = fadd fast float %op1, %op2 ; CHECK: %f.fast = fadd fast float %op1, %op2 ret void Index: llvm/trunk/unittests/IR/IRBuilderTest.cpp =================================================================== --- llvm/trunk/unittests/IR/IRBuilderTest.cpp +++ llvm/trunk/unittests/IR/IRBuilderTest.cpp @@ -144,17 +144,40 @@ FastMathFlags FMF; Builder.setFastMathFlags(FMF); + // By default, no flags are set. F = Builder.CreateFAdd(F, F); EXPECT_FALSE(Builder.getFastMathFlags().any()); + ASSERT_TRUE(isa(F)); + FAdd = cast(F); + EXPECT_FALSE(FAdd->hasNoNaNs()); + EXPECT_FALSE(FAdd->hasNoInfs()); + EXPECT_FALSE(FAdd->hasNoSignedZeros()); + EXPECT_FALSE(FAdd->hasAllowReciprocal()); + EXPECT_FALSE(FAdd->hasAllowContract()); + EXPECT_FALSE(FAdd->hasAllowReassoc()); + EXPECT_FALSE(FAdd->hasApproxFunc()); - FMF.setUnsafeAlgebra(); + // Set all flags in the instruction. + FAdd->setFast(true); + EXPECT_TRUE(FAdd->hasNoNaNs()); + EXPECT_TRUE(FAdd->hasNoInfs()); + EXPECT_TRUE(FAdd->hasNoSignedZeros()); + EXPECT_TRUE(FAdd->hasAllowReciprocal()); + EXPECT_TRUE(FAdd->hasAllowContract()); + EXPECT_TRUE(FAdd->hasAllowReassoc()); + EXPECT_TRUE(FAdd->hasApproxFunc()); + + // All flags are set in the builder. + FMF.setFast(); Builder.setFastMathFlags(FMF); F = Builder.CreateFAdd(F, F); EXPECT_TRUE(Builder.getFastMathFlags().any()); + EXPECT_TRUE(Builder.getFastMathFlags().all()); ASSERT_TRUE(isa(F)); FAdd = cast(F); EXPECT_TRUE(FAdd->hasNoNaNs()); + EXPECT_TRUE(FAdd->isFast()); // Now, try it with CreateBinOp F = Builder.CreateBinOp(Instruction::FAdd, F, F); @@ -162,21 +185,23 @@ ASSERT_TRUE(isa(F)); FAdd = cast(F); EXPECT_TRUE(FAdd->hasNoNaNs()); + EXPECT_TRUE(FAdd->isFast()); F = Builder.CreateFDiv(F, F); - EXPECT_TRUE(Builder.getFastMathFlags().any()); - EXPECT_TRUE(Builder.getFastMathFlags().UnsafeAlgebra); + EXPECT_TRUE(Builder.getFastMathFlags().all()); ASSERT_TRUE(isa(F)); FDiv = cast(F); EXPECT_TRUE(FDiv->hasAllowReciprocal()); + // Clear all FMF in the builder. Builder.clearFastMathFlags(); F = Builder.CreateFDiv(F, F); ASSERT_TRUE(isa(F)); FDiv = cast(F); EXPECT_FALSE(FDiv->hasAllowReciprocal()); - + + // Try individual flags. FMF.clear(); FMF.setAllowReciprocal(); Builder.setFastMathFlags(FMF); @@ -225,7 +250,25 @@ FAdd = cast(FC); EXPECT_TRUE(FAdd->hasAllowContract()); + FMF.setApproxFunc(); + Builder.clearFastMathFlags(); + Builder.setFastMathFlags(FMF); + // Now 'aml' and 'contract' are set. + F = Builder.CreateFMul(F, F); + FAdd = cast(F); + EXPECT_TRUE(FAdd->hasApproxFunc()); + EXPECT_TRUE(FAdd->hasAllowContract()); + EXPECT_FALSE(FAdd->hasAllowReassoc()); + + FMF.setAllowReassoc(); Builder.clearFastMathFlags(); + Builder.setFastMathFlags(FMF); + // Now 'aml' and 'contract' and 'reassoc' are set. + F = Builder.CreateFMul(F, F); + FAdd = cast(F); + EXPECT_TRUE(FAdd->hasApproxFunc()); + EXPECT_TRUE(FAdd->hasAllowContract()); + EXPECT_TRUE(FAdd->hasAllowReassoc()); // Test a call with FMF. auto CalleeTy = FunctionType::get(Type::getFloatTy(Ctx),