This change implements constant folding for constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 44949 Build 46514: arc lint + arc unit
Event Timeline
llvm/lib/IR/FPEnv.cpp | ||
---|---|---|
82 | rmToNearest should map to APFloat::rmNearestTiesToEven. |
llvm/lib/IR/FPEnv.cpp | ||
---|---|---|
82 |
Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of two two IEEE-754 rounding modes corresponds to 'nearest'. Description of round however contains definite requirement (7.12.9.6): The round functions round their argument to the nearest integer value in floating-point format, rounding halfway cases away from zero, regardless of the current rounding direction. In this case rounding mode is roundTiesToAway. Why in other cases it must be roundTiesToEven? |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1767 | I thought rint could raise an inexact exception? Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first? |
llvm/lib/IR/FPEnv.cpp | ||
---|---|---|
82 | As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1767 | rint may raise the inexact floating-point exception but does not have to do it. Result of rounding never requires rounding, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas. If we don't know rounding mode, the corresponding intrinsic should have "round.dynamic" as argument, getAPFloatRoundingMode returns None in this case and the expression is not folded. Yes, it is clang that must put "round.dynamic". In D69272 there is a test that checks such behavior. | |
llvm/lib/IR/FPEnv.cpp | ||
82 | Yes, I see that ConstantFolding.cpp uses mainly roundTiesToEven and will change the patch. It would be nice to understand why this is so. What make me worry is unusual behavior of roundTiesToEven. For instance, 10.5 is rounded to 10.0. Behavior of round is more familiar. |
llvm/lib/IR/FPEnv.cpp | ||
---|---|---|
82 | The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations. More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods. |
llvm/lib/IR/FPEnv.cpp | ||
---|---|---|
82 | roundTiestoAway is more familiar because its what we're taught in school when we're young. But it introduces bias in the results. Rounding to even can cancel out some bias. See the "round half to even" section here https://en.wikipedia.org/wiki/Rounding |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1752 | Does moving this up change the behavior or non-constrained intrinsics? |
Updated patch
- Implemented constfolding for non-finite numbers
- Do not affect processing of non-constrained intrinsics.
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1752 | Changed the patch so that behavior of non-constrained intrinsics remains unaffected. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1767 | But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't? What's the definition of "may"? Is it:
If it's #1 then we're dropping an exception we should be issuing. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1767 | Inexact exception rises when result of an operation cannot be represented without loss of precision. Result of any rounding operation is an integer, it always can be represented without loss of significant digits. The reason why rint exists is faster implementation than nearbyint in some cases. I don't know these particular cases but the cited defect report mentions it. So rint is just a loose version of nearbyint. There are no cases when this exception would make sense. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1767 | After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception. Objection withdrawn, and sorry I wasn't quicker to come to this conclusion. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1406 | fabs is allowed irrespective of isStrictFP() , so it should be processed here. As for functions like ceil, they cannot be found in strictfp function, corresponding operations are represented by constrained intrinsics. | |
2525 | The same thing. If an operation depends on or changes current floating point environment, it is represented by corresponding constrained intrinsics in strictfp function. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1406 | I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here. Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day. | |
1478–1484 | This is a reasonable suggestion. | |
1821 | For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer. We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag. | |
2525 | This is probably OK. I think these changes will always end up going through the other function for FP calls. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1406 | Reorganized this function. | |
1821 |
None of rounding functions raises exception, if argument is not SNaN. All big numbers are already integers, so overflow cannot occur. Result of rounding is integer number, so underflow or inexact exceptions also absent. Mentioning of inexact exception in the description of rint in C standard is somewhat misleading, it is discusses earlier: https://reviews.llvm.org/D72930#inline-676354. |
I may have overstated. The exception behavior of the
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1821 | My thoughts here: (1) We aren't implementing the C standard. LLVM IR is language-independent and needs to be suitable for any language. We should clearly define the semantics of the constrained intrinsics (ideally more clearly than the C standard does for the corresponding functions), and those semantics should be suitable for any language. Depending on what we decide, that may mean languages that want stricter behavior might need to avoid using the intrinsics and instead provide a library call that does what they need. (2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version. (3) The point of the strict exception semantics mode is that any exception that would be raised by a literal translation of the code will be raised when the code is optimized in the strict mode. If a call to one of these functions (in the case where this was a function call) or the instructions to which they would be lowered in the case where the intrinsic is lowered directly to instructions, would raise an exception, then we should not fold the exception away in this mode. (4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x864-64 GNU math library only raises the exception for rint. Other implementations may behave differently. I certainly understand why we would want to be able to constant fold in all these cases. However, I'm not convinced that doing so is always the right thing to do. At the very least, I think we'd need either some option to control this behavior or perhaps a new interface in TTI to query the behavior of these functions in the target math library. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1821 | This is rather complicated topic. Let's consider it in details. IEEE 754 This standard defines the cases when operation should raise inexact exception (7.6): Unless stated otherwise, if the rounded result of an operation is inexact - that is, it differs from what would have been computed were both exponent range and precision unbounded - then the inexact exception shall be signaled. Result of rounding operation is an integer value in the same floating point format as its argument. Integers are always exact, so one could expect that inexact exception are never raised in rounding operations. However the standard treats inexactness in wider sense when dealing with rounding. The standard defines set of rounding operations that produce values in floating point formats in 5.9:
It states that: These operations shall not signal any exception except for signaling NaN input. It also defines operation: For the following operation, the rounding direction is the applicable rounding-direction attribute. This operation signals the invalid operation exception for a signaling NaN operand, and for a numerical operand, signals the inexact exception if the result does not have the same numerical value as x. - sourceFormat roundToIntegralExact(source) roundToIntegralExact(x) rounds x to an integral value according to the applicable rounding direction attribute. C standard Most recent draft (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf) defines functions ceil, floor, round, roundeven and trunc, that have strict correspondence with the rounding operations defined in IEEE Std 754-2008. nearbyint does not have strict counterpart in IEEE 754, but dynamically selects one of the standard operations depending on the current rounding mode. As for rint, its description in 7.12.9.4 states that it may signal and does not specifies when it may signal. In the recent draft cited above (but not in the previous versions of the C standard) it is stated (F.10p6): The functions bound to operations in IEC 60559 (F.3) are fully specified by IEC 60559, including rounding behaviors and floating-point exceptions. F.3p1 specifies that rint is bound to roundToIntegralExact, ceil - to roundToIntegralTowardPositive and so on. So depending on the viewpoint, rint either may rise inexact exception (because of 7.12.9.4) or must do it (because of F.10p6). The former reflects perception of rounding as an operation, see the defect report http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm, although it probably is outdated. LLVM LLVM IR is language-independent. Intrinsic functions ceil, floor, round, roundeven and trunc are defined by C standard but they correspond to rounding operations defined by IEEE 754, which is also language-independent. It looks reasonable to implement them according to their IEEEE 754 counterparts. This means that all these intrinsics but rint do not raise inexact exception, rint raises inexact exception if its result differs from the argument.
Exactly. In C11 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) there is a statement about ceil (F.10.6.1): The ceil functions may, but are not required to, raise the ‘‘inexact’’ floating-point exception for finite non-integer arguments, as this implementation does. Similar statements existed for floor, round` and trunc. The recent draft does not provide such option, inexact is not allowed for these functions.
I believe this is caused by evolution of C standard. GCC implemented special option -fno-fp-int-builtin-inexact deal with compatibility issues (https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02060.html). We could implement this option in clang if necessary. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1821 | Thank you for the very thorough answer! Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions. That would satisfy my objection. I have verified that the latest Fortran standard also defines these functions in terms of their IEEE 754 counterparts, and I think that's a reasonable expectation for any front end that wants strict floating point semantics. The more precisely specific exception semantics do present a potential issue for backends. The older "may, but are not required to, raise inexact" allows for a wider variety of implementations. Semantics that do not permit exceptions may require special handling. For example, the x87 FRNDINT instruction raises the inexact exception and so could only be used with rint. Because the non-constrained versions of these functions are defined to assume a floating point environment with no side effects and the constrained versions are still considered experimental, I think we are free to just declare their semantics to be bound to the IEEE-754 operations. If we are agreed on this, then only the rint folding here would need to handle the inexact condition. |
llvm/lib/Analysis/ConstantFolding.cpp | ||
---|---|---|
1821 |
I like this idea. +1 |
Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions.
I started preparing documentation patch but got problem with description of rint and nearbyint: we should describe non-constrained variants as returning argument rounded toward nearest even, because they are usable only when default FP environment is set. Asked question in maillist about that.
If we are agreed on this, then only the rint folding here would need to handle the inexact condition.
Great!
Added changes for PowerPC tests
Rounding intrinsics now can be constfolded, even if they are represented
by constrained intrinsics. As a result a PowerPC test that checks code
generation for constrained intrinsics needs update.
@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.
@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.
I'll take a look. I'm not a PowerPC expert but I'll do what I can.
It looks like a whole bunch of instructions got converted into simple loads. Which is exactly what was expected.
LGTM.
llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll | ||
---|---|---|
6444 ↗ | (On Diff #253132) | Now that I look at this, it might have been nice if all along we'd been writing test cases that gave some different values for the vector elements. Oh well. I doubt it's worth changing now. |
Should these cases be true even when isStrictFP() is true?