This change implements constant folding for constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of the two IEEE-754 rounding modes corresponds to 'nearest'. The description of round, however, contains a definite requirement (7.12.9.6):
The round functions round their argument to the nearest integer value in floating-point format, rounding halfway cases away from zero, regardless of the current rounding direction.
In this case the rounding mode is roundTiesToAway. Why must it be roundTiesToEven in the other cases?
I thought rint could raise an inexact exception?
Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first?
As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior.
rint may raise the inexact floating-point exception but does not have to. The result of rounding to an integer is always representable, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas.
If we don't know the rounding mode, the corresponding intrinsic should have "round.dynamic" as its argument. In this case getAPFloatRoundingMode returns None and the expression is not folded.
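To illustrate the "unknown rounding mode means no folding" rule described above, here is a minimal standalone sketch. It does not use LLVM's APFloat; the `RoundingMode` enum and the `tryFoldRint` helper are hypothetical stand-ins invented for this example, and an empty `std::optional` plays the role of the None result for "round.dynamic".

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical stand-in for a statically known rounding mode.
enum class RoundingMode {
    TowardZero,
    NearestTiesToEven,
    TowardPositive,
    TowardNegative,
    NearestTiesToAway
};

// Hypothetical helper: fold rint(x) only when the rounding mode is known.
// An empty 'mode' models "round.dynamic"; the call is then left unfolded.
std::optional<double> tryFoldRint(double x, std::optional<RoundingMode> mode) {
    if (!mode)
        return std::nullopt; // mode unknown at compile time: do not fold
    switch (*mode) {
    case RoundingMode::TowardZero:
        return std::trunc(x);
    case RoundingMode::TowardPositive:
        return std::ceil(x);
    case RoundingMode::TowardNegative:
        return std::floor(x);
    case RoundingMode::NearestTiesToAway:
        return std::round(x);
    case RoundingMode::NearestTiesToEven: {
        // std::round breaks ties away from zero, so fix up exact halves.
        double r = std::round(x);
        if (std::fabs(x - std::trunc(x)) == 0.5)
            r = 2.0 * std::round(x / 2.0); // nearest even integer
        return r;
    }
    }
    return std::nullopt;
}
```

The arithmetic is done without touching the floating-point environment, which is the property a compile-time folder needs: the result depends only on the operand and the mode named in the intrinsic.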
Yes, it is clang that must put "round.dynamic". In D69272 there is a test that checks such behavior.
Yes, I see that ConstantFolding.cpp uses mainly roundTiesToEven and will change the patch. It would be nice to understand why this is so.
What worries me is the unusual behavior of roundTiesToEven. For instance, 10.5 is rounded to 10.0. The behavior of round is more familiar.
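The difference between the two tie-breaking rules can be checked with the standard library alone. The sketch below assumes a host where FE_TONEAREST means ties-to-even (as on x86); the helper names `nearestTiesToEven` and `nearestTiesToAway` are made up for the example.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Round with nearbyint under round-to-nearest, then restore the caller's mode.
// The volatile variables discourage the compiler from folding the call itself.
double nearestTiesToEven(double x) {
    const int saved = std::fegetround();
    std::fesetround(FE_TONEAREST);
    volatile double in = x;
    volatile double out = std::nearbyint(in);
    std::fesetround(saved);
    return out;
}

// std::round ignores the current rounding direction and rounds halfway
// cases away from zero.
double nearestTiesToAway(double x) {
    volatile double in = x;
    return std::round(in);
}
```

With these helpers, 10.5 and -2.5 go to 10.0 and -2.0 under ties-to-even, but to 11.0 and -3.0 under ties-away; values that are not exactly halfway (for example 10.6) round the same way in both.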
The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations.
More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods.
But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't?
What's the definition of "may"? Is it:
If it's #1 then we're dropping an exception we should be issuing.
The inexact exception is raised when the result of an operation cannot be represented without loss of precision. The result of any rounding operation is an integer, so it can always be represented without loss of significant digits. The reason rint exists is that it can be implemented faster than nearbyint in some cases. I don't know these particular cases, but the cited defect report mentions them.
So rint is just a loose version of nearbyint. There are no cases when this exception would make sense.
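Whether a given libm actually sets the flag can be observed with `<cfenv>`. This is only a probe, not a statement about what any particular implementation does: the `raisesInexact` helper is hypothetical, and it relies on volatile accesses rather than `#pragma STDC FENV_ACCESS ON`, since not every compiler honors that pragma.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Returns true if calling f(x) leaves FE_INEXACT set in the status flags.
template <typename F>
bool raisesInexact(F f, double x) {
    volatile double in = x;            // keep the call from being folded away
    std::feclearexcept(FE_ALL_EXCEPT); // start from a clean flag state
    volatile double out = f(in);
    (void)out;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

For example, `raisesInexact([](double v) { return std::nearbyint(v); }, 10.5)` should be false, because nearbyint is specified not to raise inexact. The same probe with `std::rint` and 10.5 may return either value, which is exactly the latitude being discussed here.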
After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception.
Objection withdrawn, and sorry I wasn't quicker to come to this conclusion.
fabs is allowed irrespective of isStrictFP(), so it should be processed here. As for functions like ceil, they cannot be found in a strictfp function; the corresponding operations are represented by constrained intrinsics.
The same thing. If an operation depends on or changes the current floating-point environment, it is represented by the corresponding constrained intrinsic in a strictfp function.
I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here.
Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day.
This is a reasonable suggestion.
For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer.
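A rough sketch of the check being requested, written as standalone C++ rather than against the real LLVM types. `ExceptionBehavior`, `Folded`, `foldFloor` and `canRemoveCall` are hypothetical names for this example; the inexact status is modeled as "the operand was not already an integer", and NaN handling (including the invalid flag for signaling NaNs) is deliberately left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the exception-behavior argument of a
// constrained intrinsic.
enum class ExceptionBehavior { Ignore, MayTrap, Strict };

// Folded value plus whether the rounding would raise the inexact flag.
struct Folded {
    double value;
    bool inexact;
};

// Model a floor-style fold: inexact is reported when the value changed.
Folded foldFloor(double x) {
    const double r = std::floor(x);
    return {r, !std::isnan(x) && r != x};
}

// The call may be removed only if no flag would be raised, or if the
// intrinsic says floating-point exceptions are ignored.
bool canRemoveCall(const Folded &f, ExceptionBehavior eb) {
    return !f.inexact || eb == ExceptionBehavior::Ignore;
}
```

Under this model floor(3.0) can always be removed, while floor(3.5) can be removed only when the exception behavior is Ignore; in the other cases the folded value is still known, but the call has to stay so the flag is raised.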
We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag.
This is probably OK. I think these changes will always end up going through the other function for FP calls.