Page MenuHomePhabricator

[FEnv] Constfold some unary constrained operations
Needs ReviewPublic

Authored by sepavloff on Jan 17 2020, 8:29 AM.

Details

Summary

This change implements constant folding for constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Event Timeline

sepavloff created this revision.Jan 17 2020, 8:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2020, 8:29 AM
craig.topper added inline comments.Jan 17 2020, 8:45 AM
llvm/lib/IR/FPEnv.cpp
82

rmToNearest should map to APFloat::rmNearestTiesToEven.

sepavloff marked an inline comment as done.Jan 17 2020, 9:29 AM
sepavloff added inline comments.
llvm/lib/IR/FPEnv.cpp
82

rmToNearest should map to APFloat::rmNearestTiesToEven

Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of two two IEEE-754 rounding modes corresponds to 'nearest'. Description of round however contains definite requirement (7.12.9.6):

The round functions round their argument to the nearest integer value in floating-point
format, rounding halfway cases away from zero, regardless of the current rounding
direction.

In this case rounding mode is roundTiesToAway. Why in other cases it must be roundTiesToEven?

kpn added inline comments.Jan 17 2020, 10:25 AM
llvm/lib/Analysis/ConstantFolding.cpp
1768

I thought rint could raise an inexact exception?

Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first?

craig.topper added inline comments.Jan 17 2020, 10:28 AM
llvm/lib/IR/FPEnv.cpp
82

As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior.

sepavloff marked 2 inline comments as done.Jan 17 2020, 11:29 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1768

rint may raise the inexact floating-point exception but does not have to do it. Result of rounding never requires rounding, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas.

If we don't know rounding mode, the corresponding intrinsic should have "round.dynamic" as argument, getAPFloatRoundingMode returns None in this case and the expression is not folded.

Yes, it is clang that must put "round.dynamic". In D69272 there is a test that checks such behavior.

llvm/lib/IR/FPEnv.cpp
82

Yes, I see that ConstantFolding.cpp uses mainly roundTiesToEven and will change the patch. It would be nice to understand why this is so.

What make me worry is unusual behavior of roundTiesToEven. For instance, 10.5 is rounded to 10.0. Behavior of round is more familiar.

llvm/lib/IR/FPEnv.cpp
82

The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations.

More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods.

craig.topper added inline comments.Jan 17 2020, 11:59 AM
llvm/lib/IR/FPEnv.cpp
82

roundTiestoAway is more familiar because its what we're taught in school when we're young. But it introduces bias in the results. Rounding to even can cancel out some bias. See the "round half to even" section here https://en.wikipedia.org/wiki/Rounding

Andrew, Craig, thank you for explanations!

sepavloff updated this revision to Diff 238952.Jan 18 2020, 8:53 AM

Changed rounding to nearest from roundTiesToAway to roundTiesToEven

craig.topper added inline comments.Jan 23 2020, 9:43 PM
llvm/lib/Analysis/ConstantFolding.cpp
1753

Does moving this up change the behavior or non-constrained intrinsics?

sepavloff updated this revision to Diff 240428.Jan 26 2020, 3:55 AM

Updated patch

  • Implemented constfolding for non-finite numbers
  • Do not affect processing of non-constrained intrinsics.
sepavloff marked an inline comment as done.Jan 26 2020, 3:59 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1753

Changed the patch so that behavior of non-constrained intrinsics remains unaffected.

sepavloff updated this revision to Diff 242254.Mon, Feb 3, 9:13 PM

Rebase and ping.

kpn added inline comments.Tue, Feb 11, 9:29 AM
llvm/lib/Analysis/ConstantFolding.cpp
1768

But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't?

What's the definition of "may"? Is it:

  1. If conditions warrant, an exception _shall_ be issued OR
  2. If conditions warrant, the decision to raise an exception is _implementation defined_.

If it's #1 then we're dropping an exception we should be issuing.

sepavloff marked an inline comment as done.Tue, Feb 11, 10:30 AM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1768

Inexact exception rises when result of an operation cannot be represented without loss of precision. Result of any rounding operation is an integer, it always can be represented without loss of significant digits. The reason why rint exists is faster implementation than nearbyint in some cases. I don't know these particular cases but the cited defect report mentions it.

So rint is just a loose version of nearbyint. There are no cases when this exception would make sense.

kpn added inline comments.Wed, Feb 12, 5:40 AM
llvm/lib/Analysis/ConstantFolding.cpp
1768

After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception.

Objection withdrawn, and sorry I wasn't quicker to come to this conclusion.

evandro added inline comments.Thu, Feb 13, 12:09 PM
llvm/lib/Analysis/ConstantFolding.cpp
1407

Should these cases be true even when isStrictFP() is true?

1479

Or should...

return !Call->isStrictFP();

be inserted here?

2526

Again, not sure about the impact of doing this...

sepavloff marked 2 inline comments as done.Thu, Feb 13, 10:24 PM
sepavloff added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1407

fabs is allowed irrespective of isStrictFP() , so it should be processed here. As for functions like ceil, they cannot be found in strictfp function, corresponding operations are represented by constrained intrinsics.

2526

The same thing. If an operation depends on or changes current floating point environment, it is represented by corresponding constrained intrinsics in strictfp function.

llvm/lib/Analysis/ConstantFolding.cpp
1407

I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here.

Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day.

1479

This is a reasonable suggestion.

1822

For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer.

We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag.

2526

This is probably OK. I think these changes will always end up going through the other function for FP calls.