This is an archive of the discontinued LLVM Phabricator instance.

[FEnv] Constfold some unary constrained operations
ClosedPublic

Authored by sepavloff on Jan 17 2020, 8:29 AM.

Details

Summary

This change implements constant folding for constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Diff Detail

Event Timeline

sepavloff created this revision.Jan 17 2020, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2020, 8:29 AM
craig.topper added inline comments.Jan 17 2020, 8:45 AM
llvm/lib/IR/FPEnv.cpp
98

rmToNearest should map to APFloat::rmNearestTiesToEven.

sepavloff marked an inline comment as done.Jan 17 2020, 9:29 AM
sepavloff added inline comments.
llvm/lib/IR/FPEnv.cpp
98

rmToNearest should map to APFloat::rmNearestTiesToEven

Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of two two IEEE-754 rounding modes corresponds to 'nearest'. Description of round however contains definite requirement (7.12.9.6):

The round functions round their argument to the nearest integer value in floating-point
format, rounding halfway cases away from zero, regardless of the current rounding
direction.

In this case rounding mode is roundTiesToAway. Why in other cases it must be roundTiesToEven?

kpn added inline comments.Jan 17 2020, 10:25 AM
llvm/lib/Analysis/ConstantFolding.cpp
1786

I thought rint could raise an inexact exception?

Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first?

craig.topper added inline comments.Jan 17 2020, 10:28 AM
llvm/lib/IR/FPEnv.cpp
98

As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior.

sepavloff marked 2 inline comments as done.Jan 17 2020, 11:29 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1786

rint may raise the inexact floating-point exception but does not have to do it. Result of rounding never requires rounding, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas.

If we don't know rounding mode, the corresponding intrinsic should have "round.dynamic" as argument, getAPFloatRoundingMode returns None in this case and the expression is not folded.

Yes, it is clang that must put "round.dynamic". In D69272 there is a test that checks such behavior.

llvm/lib/IR/FPEnv.cpp
98

Yes, I see that ConstantFolding.cpp uses mainly roundTiesToEven and will change the patch. It would be nice to understand why this is so.

What make me worry is unusual behavior of roundTiesToEven. For instance, 10.5 is rounded to 10.0. Behavior of round is more familiar.

llvm/lib/IR/FPEnv.cpp
98

The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations.

More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods.

craig.topper added inline comments.Jan 17 2020, 11:59 AM
llvm/lib/IR/FPEnv.cpp
98

roundTiestoAway is more familiar because its what we're taught in school when we're young. But it introduces bias in the results. Rounding to even can cancel out some bias. See the "round half to even" section here https://en.wikipedia.org/wiki/Rounding

Andrew, Craig, thank you for explanations!

sepavloff updated this revision to Diff 238952.Jan 18 2020, 8:53 AM

Changed rounding to nearest from roundTiesToAway to roundTiesToEven

craig.topper added inline comments.Jan 23 2020, 9:43 PM
llvm/lib/Analysis/ConstantFolding.cpp
1771

Does moving this up change the behavior or non-constrained intrinsics?

sepavloff updated this revision to Diff 240428.Jan 26 2020, 3:55 AM

Updated patch

  • Implemented constfolding for non-finite numbers
  • Do not affect processing of non-constrained intrinsics.
sepavloff marked an inline comment as done.Jan 26 2020, 3:59 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1771

Changed the patch so that behavior of non-constrained intrinsics remains unaffected.

sepavloff updated this revision to Diff 242254.Feb 3 2020, 9:13 PM

Rebase and ping.

kpn added inline comments.Feb 11 2020, 9:29 AM
llvm/lib/Analysis/ConstantFolding.cpp
1786

But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't?

What's the definition of "may"? Is it:

  1. If conditions warrant, an exception _shall_ be issued OR
  2. If conditions warrant, the decision to raise an exception is _implementation defined_.

If it's #1 then we're dropping an exception we should be issuing.

sepavloff marked an inline comment as done.Feb 11 2020, 10:30 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1786

Inexact exception rises when result of an operation cannot be represented without loss of precision. Result of any rounding operation is an integer, it always can be represented without loss of significant digits. The reason why rint exists is faster implementation than nearbyint in some cases. I don't know these particular cases but the cited defect report mentions it.

So rint is just a loose version of nearbyint. There are no cases when this exception would make sense.

kpn added inline comments.Feb 12 2020, 5:40 AM
llvm/lib/Analysis/ConstantFolding.cpp
1786

After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception.

Objection withdrawn, and sorry I wasn't quicker to come to this conclusion.

evandro added inline comments.Feb 13 2020, 12:09 PM
llvm/lib/Analysis/ConstantFolding.cpp
1402–1403

Should these cases be true even when isStrictFP() is true?

1481–1502

Or should...

return !Call->isStrictFP();

be inserted here?

2647

Again, not sure about the impact of doing this...

sepavloff marked 2 inline comments as done.Feb 13 2020, 10:24 PM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1402–1403

fabs is allowed irrespective of isStrictFP() , so it should be processed here. As for functions like ceil, they cannot be found in strictfp function, corresponding operations are represented by constrained intrinsics.

2647

The same thing. If an operation depends on or changes current floating point environment, it is represented by corresponding constrained intrinsics in strictfp function.

llvm/lib/Analysis/ConstantFolding.cpp
1402–1403

I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here.

Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day.

1481–1502

This is a reasonable suggestion.

1840

For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer.

We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag.

2647

This is probably OK. I think these changes will always end up going through the other function for FP calls.

Updated patch according to review notes

  • Reorganized canConstantFoldCallTo.
sepavloff marked 4 inline comments as done.Feb 25 2020, 10:40 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1402–1403

Reorganized this function.

1840

For all of these except nearbyint, you need to check the return code

None of rounding functions raises exception, if argument is not SNaN. All big numbers are already integers, so overflow cannot occur. Result of rounding is integer number, so underflow or inexact exceptions also absent. Mentioning of inexact exception in the description of rint in C standard is somewhat misleading, it is discusses earlier: https://reviews.llvm.org/D72930#inline-676354.

I may have overstated. The exception behavior of the

llvm/lib/Analysis/ConstantFolding.cpp
1840

My thoughts here:

(1) We aren't implementing the C standard. LLVM IR is language-independent and needs to be suitable for any language. We should clearly define the semantics of the constrained intrinsics (ideally more clearly than the C standard does for the corresponding functions), and those semantics should be suitable for any language. Depending on what we decide, that may mean languages that want stricter behavior might need to avoid using the intrinsics and instead provide a library call that does what they need.

(2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version.

(3) The point of the strict exception semantics mode is that any exception that would be raised by a literal translation of the code will be raised when the code is optimized in the strict mode. If a call to one of these functions (in the case where this was a function call) or the instructions to which they would be lowered in the case where the intrinsic is lowered directly to instructions, would raise an exception, then we should not fold the exception away in this mode.

(4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x864-64 GNU math library only raises the exception for rint. Other implementations may behave differently.

I certainly understand why we would want to be able to constant fold in all these cases. However, I'm not convinced that doing so is always the right thing to do. At the very least, I think we'd need either some option to control this behavior or perhaps a new interface in TTI to query the behavior of these functions in the target math library.

sepavloff updated this revision to Diff 246960.Feb 27 2020, 8:32 AM

Updated patch according to reviewer's notes

  • rint now can raise inexact exception.
sepavloff marked an inline comment as done.Feb 27 2020, 8:52 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1840

This is rather complicated topic. Let's consider it in details.

IEEE 754

This standard defines the cases when operation should raise inexact exception (7.6):

Unless stated otherwise, if the rounded result of an operation is inexact - that is, it differs from what would have been computed were both exponent range and precision unbounded - then the inexact exception shall be signaled.

Result of rounding operation is an integer value in the same floating point format as its argument. Integers are always exact, so one could expect that inexact exception are never raised in rounding operations. However the standard treats inexactness in wider sense when dealing with rounding.

The standard defines set of rounding operations that produce values in floating point formats in 5.9:

  • roundToIntegralTiesToEven
  • roundToIntegralTowardZero
  • roundToIntegralTowardPositive
  • roundToIntegralTowardNegative
  • roundToIntegralTiesToAway

It states that:

These operations shall not signal any exception except for signaling NaN input.

It also defines operation:

For the following operation, the rounding direction is the applicable rounding-direction attribute. This operation signals the invalid operation exception for a signaling NaN operand, and for a numerical operand, signals the inexact exception if the result does not have the same numerical value as x.
- sourceFormat roundToIntegralExact(source)
roundToIntegralExact(x) rounds x to an integral value according to the applicable rounding direction attribute.

C standard

Most recent draft (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf) defines functions ceil, floor, round, roundeven and trunc, that have strict correspondence with the rounding operations defined in IEEE Std 754-2008. nearbyint does not have strict counterpart in IEEE 754, but dynamically selects one of the standard operations depending on the current rounding mode.

As for rint, its description in 7.12.9.4 states that it may signal and does not specifies when it may signal. In the recent draft cited above (but not in the previous versions of the C standard) it is stated (F.10p6):

The functions bound to operations in IEC 60559 (F.3) are fully specified by IEC 60559, including rounding behaviors and floating-point exceptions.

F.3p1 specifies that rint is bound to roundToIntegralExact, ceil - to roundToIntegralTowardPositive and so on.

So depending on the viewpoint, rint either may rise inexact exception (because of 7.12.9.4) or must do it (because of F.10p6). The former reflects perception of rounding as an operation, see the defect report http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm, although it probably is outdated.

LLVM

LLVM IR is language-independent. Intrinsic functions ceil, floor, round, roundeven and trunc are defined by C standard but they correspond to rounding operations defined by IEEE 754, which is also language-independent. It looks reasonable to implement them according to their IEEEE 754 counterparts. This means that all these intrinsics but rint do not raise inexact exception, rint raises inexact exception if its result differs from the argument.

(2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version.

Exactly. In C11 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) there is a statement about ceil (F.10.6.1):

The ceil functions may, but are not required to, raise the ‘‘inexact’’ floating-point exception for finite non-integer arguments, as this implementation does.

Similar statements existed for floor, round` and trunc. The recent draft does not provide such option, inexact is not allowed for these functions.

(4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x864-64 GNU math library only raises the exception for rint. Other implementations may behave differently.

I believe this is caused by evolution of C standard. GCC implemented special option -fno-fp-int-builtin-inexact deal with compatibility issues (https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02060.html). We could implement this option in clang if necessary.

llvm/lib/Analysis/ConstantFolding.cpp
1840

Thank you for the very thorough answer!

Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions. That would satisfy my objection. I have verified that the latest Fortran standard also defines these functions in terms of their IEEE 754 counterparts, and I think that's a reasonable expectation for any front end that wants strict floating point semantics.

The more precisely specific exception semantics do present a potential issue for backends. The older "may, but are not required to, raise inexact" allows for a wider variety of implementations. Semantics that do not permit exceptions may require special handling. For example, the x87 FRNDINT instruction raises the inexact exception and so could only be used with rint.

Because the non-constrained versions of these functions are defined to assume a floating point environment with no side effects and the constrained versions are still considered experimental, I think we are free to just declare their semantics to be bound to the IEEE-754 operations.

If we are agreed on this, then only the rint folding here would need to handle the inexact condition.

kpn added inline comments.Feb 28 2020, 5:43 AM
llvm/lib/Analysis/ConstantFolding.cpp
1840

Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions.

I like this idea. +1

Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions.

I started preparing documentation patch but got problem with description of rint and nearbyint: we should describe non-constrained variants as returning argument rounded toward nearest even, because they are usable only when default FP environment is set. Asked question in maillist about that.

If we are agreed on this, then only the rint folding here would need to handle the inexact condition.

Great!

Is there something that prevents this patch from approval?

andrew.w.kaylor accepted this revision.Mar 26 2020, 1:41 PM

lgtm

Sorry for the delay in approval.

This revision is now accepted and ready to land.Mar 26 2020, 1:41 PM
sepavloff updated this revision to Diff 253132.Mar 27 2020, 8:42 AM

Added changes for PowerPC tests

Rounding intrinsics now can be constfolded, even if they are represented
by constrained intrinsics. As a result a PowerPC test that checks code
generation for constrained intrinsics needs update.

@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.

kpn added a comment.Mar 27 2020, 9:35 AM

@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.

I'll take a look. I'm not a PowerPC expert but I'll do what I can.

kpn added a comment.Mar 27 2020, 11:39 AM

It looks like a whole bunch of instructions got converted into simple loads. Which is exactly what was expected.

LGTM.

llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll
6444

Now that I look at this, it might have been nice if all along we'd been writing test cases that gave some different values for the vector elements. Oh well. I doubt it's worth changing now.

This revision was automatically updated to reflect the committed changes.