This is an archive of the discontinued LLVM Phabricator instance.

[LangRef] improve documentation of SNaN in the default FP environment
Closed, Public

Authored by spatel on Feb 1 2023, 7:41 AM.

Details

Summary

Make it explicit that SNaN is not handled differently than QNaN in the LLVM default floating-point environment.

Note that an IEEE-754-compliant model disallows transforms like "X * 1.0 -> X". That is because math operations are expected to convert SNaN to QNaN (set the signaling bit).

But LLVM has had those kinds of transforms from the beginning:
https://alive2.llvm.org/ce/z/igb55y
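
To make the difference concrete, here is a minimal sketch that models binary64 values as raw 64-bit integers, so the outcome does not depend on how the host FPU treats signaling NaNs. The helper names (`ieee_mul_by_one`, `folded_mul_by_one`) and the chosen bit patterns are assumptions made for this illustration; they are not LLVM APIs.

```python
# Bit-level model of binary64 NaNs, assuming the IEEE 754-2008 encoding:
# the top mantissa bit is the "quiet" bit (1 = quiet, 0 = signaling).
EXP_MASK = 0x7FF0_0000_0000_0000
MANT_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_snan(bits: int) -> bool:
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def ieee_mul_by_one(bits: int) -> int:
    """Hypothetical model of an IEEE-754 `x * 1.0`: a NaN input comes back quieted."""
    return bits | QUIET_BIT if is_nan(bits) else bits


def folded_mul_by_one(bits: int) -> int:
    """Model of the fold `x * 1.0 -> x`: the operand passes through untouched."""
    return bits


SNAN = 0x7FF0_0000_0000_0001  # a signaling NaN with payload 1
ONE = 0x3FF0_0000_0000_0000   # 1.0

print(is_snan(ieee_mul_by_one(SNAN)))    # False: the multiply quieted the NaN
print(is_snan(folded_mul_by_one(SNAN)))  # True: the fold leaves it signaling
```

For every non-NaN input the two models agree, which is why the fold is only observable when an SNaN reaches it.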

We should be IEEE-754-compliant under strict-FP (the logic is implemented with a helper named canIgnoreSNaN()), but I don't think there is any demand to do that with default optimization.

See issue #43070 for earlier draft/discussion about this change.

Event Timeline

spatel created this revision.Feb 1 2023, 7:41 AM
Herald added a project: Restricted Project.Feb 1 2023, 7:41 AM
spatel requested review of this revision.Feb 1 2023, 7:41 AM

"undefined" is a dangerous word. I hope you didn't mean that it is literally undef.

Also, it's not exceptions that are cleared, it's the sNaN bit, right? So you might want to keep the original sentence about exceptions, and then add a new one saying
"There is also no attempt to treat the signaling bit as specified by IEEE-754; in particular, arithmetic operations can produce signaling NaNs."

arsenm added a comment.Feb 1 2023, 8:17 AM

"There is also no attempt to treat the signaling bit as specified by IEEE-754; in particular, arithmetic operations can produce signaling NaNs."

No, arithmetic operations cannot produce signaling nans. Signaling nans only come from initialization.

spatel added a comment.Feb 1 2023, 8:29 AM

"undefined" is a dangerous word. I hope you didn't mean that it is literally undef.

I think that is how it would be specified in Alive2 terminology, but I'm happy to avoid the controversy if possible!

Also, it's not exceptions that are cleared, it's the sNaN bit, right?

The signaling bit of SNaN is set (not cleared) to make a QNaN (although the spec was vague enough that at least one target implemented it backwards in hardware).
IEEE says:
"Signaling NaNs shall be reserved operands that, under default exception handling, signal the invalid operation exception"
...so I was trying to make it explicit that none of it matters. But I'm trying to write as little as possible about IEEE-754 exceptions, and still make it clear what optimizations are allowed/expected. :)

RalfJung added a comment.EditedFeb 1 2023, 8:32 AM

No, arithmetic operations cannot produce signaling nans. Signaling nans only come from initialization.

In LLVM, arithmetic operations can produce signaling nans: x * 1.0 can be optimized to x, so if x was a signalling NaN that will be the output of the multiplication.
(Maybe we are using the term "produce" differently here? I meant it as "can be the output of".)

Maybe we want a guarantee that if no input is signaling then the output is not signaling either? I am not sure if that is a guarantee LLVM provides currently. Generally the vibe seems to be that LLVM will just produce *some* NaN, arbitrarily chosen from the set of all qNaN and sNaN, and completely independent of which kind of NaN (if any) was in the input.

I think that is how it would be specified in Alive2 terminology, but I'm happy to avoid the controversy if possible!

No, in Alive2 this would be non-determinism, which is more like freeze undef than undef.

arsenm added a comment.Feb 1 2023, 8:37 AM

No, arithmetic operations cannot produce signaling nans. Signaling nans only come from initialization.

In LLVM, arithmetic operations can produce signaling nans: x * 1.0 can be optimized to x, so if x was a signalling NaN that will be the output of the multiplication.
(Maybe we are using the term "produce" differently here? I meant it as "can be the output of".)

I would interpret “produce” as being the origin of the snan. Propagation of what was originally a signaling nan isn’t production

Maybe we want a guarantee that if no input is signaling then the output is not signaling either? I am not sure if that is a guarantee LLVM provides currently.

I think this should be preserved. The one possible issue is I think the old style MIPS snan encoding probably doesn’t work in general. APFloat assumes the newer quiet bit encoding every other target uses
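
A small sketch of why the encoding matters, assuming that the legacy (pre-2008) MIPS convention inverts the meaning of the top mantissa bit relative to IEEE 754-2008. The function names are hypothetical; nothing here is taken from APFloat itself.

```python
EXP_MASK = 0x7FF0_0000_0000_0000
MANT_MASK = 0x000F_FFFF_FFFF_FFFF
TOP_MANT_BIT = 0x0008_0000_0000_0000


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_signaling_2008(bits: int) -> bool:
    """IEEE 754-2008 convention: top mantissa bit clear means signaling."""
    return is_nan(bits) and (bits & TOP_MANT_BIT) == 0


def is_signaling_legacy(bits: int) -> bool:
    """Assumed legacy MIPS convention: top mantissa bit set means signaling."""
    return is_nan(bits) and (bits & TOP_MANT_BIT) != 0


# The usual default quiet NaN under the 2008 encoding.
DEFAULT_QNAN_2008 = 0x7FF8_0000_0000_0000

print(is_signaling_2008(DEFAULT_QNAN_2008))    # False
print(is_signaling_legacy(DEFAULT_QNAN_2008))  # True
```

Under that assumption, a constant-folded "quiet" NaN produced with the 2008 encoding would read as signaling on hardware that uses the legacy interpretation.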

RalfJung added a comment.EditedFeb 1 2023, 8:42 AM

Okay, so the rules would be something like: when a floating point operation outputs a NaN, that is

  • either any of its input NaNs (even if that is signaling, which violates IEEE-754)
  • or an arbitrary quiet NaN

(with exceptions for copysign and a few other operations that are defined to operate on the bit representation)
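
The two-case rule above can be sketched as a predicate over raw binary64 bit patterns. This is only an illustration of the rule as worded in this comment; `nan_output_allowed` is a hypothetical name, and the bitwise operations such as copysign that the rule excludes are not modeled.

```python
EXP_MASK = 0x7FF0_0000_0000_0000
MANT_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_quiet_nan(bits: int) -> bool:
    return is_nan(bits) and (bits & QUIET_BIT) != 0


def nan_output_allowed(inputs: list[int], output: int) -> bool:
    """True if a NaN `output` is permitted: it is either one of the
    input NaNs (even a signaling one) or some quiet NaN."""
    if not is_nan(output):
        raise ValueError("the rule only constrains NaN outputs")
    return output in (i for i in inputs if is_nan(i)) or is_quiet_nan(output)


SNAN = 0x7FF0_0000_0000_0001
QNAN = 0x7FF8_0000_0000_0000
ONE = 0x3FF0_0000_0000_0000

print(nan_output_allowed([SNAN, ONE], SNAN))  # True: an input NaN is passed through
print(nan_output_allowed([ONE, ONE], QNAN))   # True: an arbitrary quiet NaN
print(nan_output_allowed([ONE, ONE], SNAN))   # False: an sNaN appearing from nowhere
```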

However given how LLVM pretty much ignores the qNaN vs sNaN distinction and already violates IEEE-754, I am not sure what the point is of making the spec more complicated just to forbid LLVM from producing sNaNs when no input was an sNaN. Also as you said this statement is only true for targets where the APFloat idea of what is a qNaN matches the hardware, which is not true for all MIPS chips. Is this an APFloat bug that should be fixed or is it fine?

spatel updated this revision to Diff 493975.Feb 1 2023, 8:51 AM

Updated:

  1. Avoid controversy by not using the term "undef".
  2. Make it clearer that math ops may not quiet an SNaN (intentionally avoiding the use of "set" or "clear" of bits).
spatel updated this revision to Diff 494000.Feb 1 2023, 10:14 AM

Soften the wording - rather than "no attempt", we offer "no guarantee" about the value of the signaling bit.

kpn added a comment.Feb 1 2023, 10:40 AM

I like the new wording. Taking the list of exceptions out was also a good idea.

Muon added a subscriber: Muon.EditedFeb 1 2023, 6:44 PM

The text repeats itself, since the only signaling bit changes in IEEE 754 are quieting a signaling NaN. "May not" is also ambiguous. Also, at the risk of appearing pedantic, the earlier text conflates exceptions and their associated status flags. In IEEE 754, when exceptions are signaled, the default handling raises the corresponding status flags. (Status flags are only lowered when expressly requested.)

Perhaps the following would be clearer and more accurate:
"Therefore, there is no attempt to ensure that exceptions are signaled or that status flags are changed or preserved. There is also no guarantee that signaling NaNs are handled as specified by IEEE 754. In particular, there is no guarantee that arithmetic operations quiet a signaling NaN, or that operating on a signaling NaN causes an exception to be signaled."

EDIT: Actually, this is not strictly accurate either since it is default exception handling which causes special values to be generated in the first place. More accurate would be the following:

"Therefore, there is no attempt to ensure that status flags are raised as a result of exceptions. There is also no guarantee that signaling NaNs are handled as specified by IEEE 754. In particular, there is no guarantee that operating on a signaling NaN causes an exception to be signaled, and thus no guarantee that arithmetic operations quiet a signaling NaN."

"Therefore, there is no attempt to ensure that status flags are raised as a result of exceptions. There is also no guarantee that signaling NaNs are handled as specified by IEEE 754. In particular, there is no guarantee that operating on a signaling NaN causes an exception to be signaled, and thus no guarantee that arithmetic operations quiet a signaling NaN."

Can we reduce to this without losing meaning/coverage?

"The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. There is no attempt to set status flags as a result of
exceptions. Signaling NaNs are not handled as specified by IEEE-754. In
particular, math operations are not guaranteed to quiet a signaling NaN."

arsenm added a comment.Feb 2 2023, 1:24 PM

I wonder if we should apply an exception to llvm.canonicalize and require it quiet signaling nans

Muon added a comment.Feb 2 2023, 5:04 PM

I think "no attempt to set status flags" is a bit confusing, since they will in fact be set on compliant architectures whenever an operation is not optimized out. Similarly for "Signaling NaNs are not handled as specified by IEEE-754", since a compliant architecture will in fact follow spec whenever the operation on a signaling NaN is not optimized out. This is why I was hedging in my earlier version with "no attempt to ensure" and "no guarantee".

That said, if we're instead just trying to specify what the LLVM abstract machine does, with no regard to lowering, it would be more like:

"The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. Operations assume default exception handling according to IEEE 754, with the following changes:

  1. Status flags are assumed not to exist.
  2. Signaling NaNs are assumed not to signal exceptions.
    • As such, arithmetic operations are not guaranteed to quiet a signaling NaN."
spatel added a comment.Feb 3 2023, 8:39 AM

I think "no attempt to set status flags" is a bit confusing, since they will in fact be set on compliant architectures whenever an operation is not optimized out. Similarly for "Signaling NaNs are not handled as specified by IEEE-754", since a compliant architecture will in fact follow spec whenever the operation on a signaling NaN is not optimized out. This is why I was hedging in my earlier version with "no attempt to ensure" and "no guarantee".

That said, if we're instead just trying to specify what the LLVM abstract machine does, with no regard to lowering, it would be more like:

Yes, this describes the IR-level semantics only. We could make a note about codegen semantics, but I don't think that's needed here.

"The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. Operations assume default exception handling according to IEEE 754, with the following changes:

I don't understand the line about default exception handling - LLVM doesn't model exceptions or flags in this mode. We should create values as specified by IEEE-754, but that's it? The instructions are defined like this:
https://llvm.org/docs/LangRef.html#fmul-instruction

Muon added a comment.EditedFeb 3 2023, 12:41 PM

The way it goes in IEEE 754 is that infinities and NaNs are only generated by exceptions. So when you compute 1/0, that signals the divide by zero exception, and default exception handling for that is to return the infinity of appropriate sign and raise the divide by zero status flag. Similarly, 0/0, inf/inf, inf - inf, and operations on an sNaN (other than bitwise ops like negate, abs, and copysign) all signal invalid operation, for which the default handling is to return a qNaN and raise the invalid operation flag.

EDIT: just to clarify, exceptions don't trap in IEEE 754 under default handling. Trapping functionality is recommended to be available, but is not specified in any way.
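
As a runnable analogy, Python's `decimal` module exposes a very similar model of signals, status flags, and optional traps. It implements decimal rather than binary arithmetic (following the General Decimal Arithmetic Specification), so treat this as a sketch of default exception handling, not as a description of any binary FPU or of LLVM. The helper `run_untrapped` is a name invented for this example.

```python
from decimal import Context, Decimal, DivisionByZero, InvalidOperation


def run_untrapped(method: str, a: str, b: str):
    """Apply a Context method with every trap disabled and report
    the result together with two of the status flags."""
    ctx = Context(traps=[])  # no traps: default (non-trapping) handling
    result = getattr(ctx, method)(Decimal(a), Decimal(b))
    return result, ctx.flags[DivisionByZero], ctx.flags[InvalidOperation]


cases = [
    ("divide", "1", "0"),   # divide-by-zero: returns an infinity, raises a flag
    ("divide", "0", "0"),   # invalid operation: returns a quiet NaN, raises a flag
    ("add", "sNaN", "0"),   # signaling NaN operand: invalid operation, quiet NaN result
    ("add", "NaN", "0"),    # quiet NaN operand: propagates, no flag raised
]
for method, a, b in cases:
    result, div_by_zero, invalid = run_untrapped(method, a, b)
    print(f"{method}({a}, {b}) -> {result}, "
          f"div_by_zero={bool(div_by_zero)}, invalid={bool(invalid)}")
```

Nothing traps in any of the four cases; the only observable difference between the sNaN and qNaN operands is the status flag.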

Overall, I think the points we need to convey are here:

  1. It's assumed that the rounding mode is round-to-nearest, and all floating-point traps are disabled.
  2. The status flags may be set to arbitrary values.
  3. Floating-point math operations assume that all NaNs are quiet.

For point 3: I've made this statement stronger than you had; we actually need the stronger statement, because we have other canonicalizations like pow(1.0, <anything>) -> 1.0 and pow(<anything>, 0) -> 1.0, which I think should ALSO be kept in the default modes.

Q: What about the denormal fp environment settings (ftz/daz)? Should we say anything about those? They are frequently set globally (e.g. with -ffast-math), so presumably it's permissible to execute default-compiled code with them set, at least in some sense...

Muon added a comment.EditedFeb 3 2023, 4:30 PM

If we're only talking about IR semantics, there's no need to mention traps. Also, LLVM's pow intrinsic is specified to work like C's, which works like what you've mentioned. IEEE 754 only mandates the behavior of basic functions (+, -, *, /, √, FMA, rem, copy, abs, neg, copysign).

Perhaps it would be simpler to say "Operations are assumed to behave according to IEEE 754, with the following differences:"?

EDIT: regarding ftz/daz being set, does LLVM even do anything special in that instance? It certainly breaks assumptions and makes some transformations invalid.

RalfJung added a comment.EditedFeb 4 2023, 6:50 AM

Floating-point math operations assume that all NaNs are quiet.

That could be read as "if you produce a signaling NaN (e.g. via a bitcast) and feed that to a math operation, that violates the assumption, and hence is UB".

I think it'd be better to explicitly say that when a math operation produces a NaN, it non-deterministically picks an arbitrary (quiet or signaling) NaN.

"Operations are assumed to behave according to IEEE 754, with the following differences:"

This is not about assuming anything though (whose job would it be to satisfy these assumptions?), it is about *defining* what the behavior of these operations in LLVM IR (in the Abstract Machine) is.

Floating-point math operations assume that all NaNs are quiet.

That could be read as "if you produce a signaling NaN (e.g. via a bitcast) and feed that to a math operation, that violates the assumption, and hence is UB".

I'm happy to have as little "undefinedness" as possible...but...

I think it'd be better to explicitly say that when a math operation produces a NaN, it non-deterministically picks an arbitrary (quiet or signaling) NaN.

What I was trying to state is that this is insufficient (but failed to do so clearly, sorry). Yet, for basically the same reason, it's also too permissive: we cannot allow LLVM to spuriously introduce sNaNs when the original code did not use any.

As a general rule, when an operation gets an sNaN as input, it raises an invalid exception immediately. The default behavior upon that exception is to set the invalid bit in the status flags AND to trigger an immediate return of a qNaN -- even when a qNaN input value would've resulted in some other output.

Taking pow in particular, the correct results are:

pow(1.0, sNaN) -> qNaN
pow(1.0, qNaN) -> 1.0
pow(sNaN, 0) -> qNaN
pow(qNaN, 0) -> 1.0

If we canonicalize pow(1.0, <anything>) -> 1.0 and pow(<anything>, 0) -> 1.0, then pow(sNaN, 0) -> 1.0, instead of qNaN.

Of course, we may also get the wrong answer in the other direction: pow(1.0 * x, y) should result in 1.0 when passed x = sNaN, y = 0 (because 1.0 * sNaN -> qNaN, and pow(qNaN, 0) -> 1.0); but if we canonicalize away the multiplication to pow(x, y), we may end up with qNaN as the result.

Along the same lines, if we were to allow any operation to return sNaN instead of qNaN -- even when no sNaN has been provided as input -- then we'd allow something like pow(2.0 * qNaN, 0.0) to non-deterministically result in qNaN or 1.0, which is not OK.
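
The pow cases above can be replayed in a small bit-level model. It assumes the semantics described in this comment (any sNaN operand yields a qNaN; otherwise the C-style pow rules apply, for which Python's `math.pow` stands in). `model_pow` and `model_mul_by_one` are hypothetical helpers, not LLVM or libm functions.

```python
import math
import struct

EXP_MASK = 0x7FF0_0000_0000_0000
MANT_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000

QNAN = 0x7FF8_0000_0000_0000
SNAN = 0x7FF0_0000_0000_0001
ONE = 0x3FF0_0000_0000_0000
ZERO = 0x0000_0000_0000_0000


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_snan(bits: int) -> bool:
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def to_float(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def to_bits(value: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def model_mul_by_one(bits: int) -> int:
    """Model of an IEEE-754 `1.0 * x`: NaNs come back quieted."""
    return bits | QUIET_BIT if is_nan(bits) else bits


def model_pow(x: int, y: int) -> int:
    """Assumed sNaN-aware pow: an sNaN operand short-circuits to a qNaN."""
    if is_snan(x) or is_snan(y):
        return QNAN
    return to_bits(math.pow(to_float(x), to_float(y)))


# The four reference cases.
print(model_pow(ONE, SNAN) == QNAN, model_pow(ONE, QNAN) == ONE)    # True True
print(model_pow(SNAN, ZERO) == QNAN, model_pow(QNAN, ZERO) == ONE)  # True True

# pow(1.0 * x, y) with x = sNaN, y = 0: with and without the multiply folded away.
print(model_pow(model_mul_by_one(SNAN), ZERO) == ONE)  # True: multiply kept
print(model_pow(SNAN, ZERO) == QNAN)                   # True: multiply folded
```

In this model, folding the multiplication changes the final result from 1.0 to a NaN, which is the long-distance effect being discussed.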

arsenm added a comment.Feb 4 2023, 2:42 PM

The snan problem is essentially the same as the question of whether denormals are flushed. We have llvm.canonicalize for stronger guarantees about flushing behavior than an arbitrary IR instruction. I think canonicalize should get a note that it must quiet signaling nans.

APFloat also assumes new IEEE signaling representation. MIPS old style nans have to be all kinds of broken but I’m not sure how much we should really care

Muon added a comment.Feb 4 2023, 5:38 PM

This is not about assuming anything though (whose job would it be to satisfy these assumptions?), it is about *defining* what the behavior of these operations in LLVM IR (in the Abstract Machine) is.

I was just trying to use the same language as the original text. I'm guessing it really means that "these assumptions are made by optimizations". Regarding what LLVM actually does, I think it would be more accurate to say that floating-point IR instructions are lowered to the closest target-specific counterpart, but optimizations assume that the lowering conforms to IEEE 754 semantics (except as mentioned). Remember that LLVM doesn't put in the extra work to actually make the x87 produce conformant results for float or double, or to ensure that subnormals work on FPUs that don't implement them. (In any case, IR semantics don't seem to be described in a strict definition style in the rest of the LangRef.)

Taking pow in particular, the correct results are:

pow(1.0, sNaN) -> qNaN
pow(1.0, qNaN) -> 1.0
pow(sNaN, 0) -> qNaN
pow(qNaN, 0) -> 1.0

Only if pow follows the IEEE 754 recommendation. However, llvm.pow is defined to conform to the C standard library's semantics, which state that pow(1.0, NaN) = 1.0, for any NaN. Similarly for the rest of those intrinsics; they're just shortcuts for the C math library functions, none of which follow the IEEE 754 recommendation in that regard.

The snan problem is essentially the same as the question of whether denormals are flushed. We have llvm.canonicalize for stronger guarantees about flushing behavior than an arbitrary IR instruction. I think canonicalize should get a note that it must quiet signaling nans.

APFloat also assumes new IEEE signaling representation. MIPS old style nans have to be all kinds of broken but I’m not sure how much we should really care

llvm.canonicalize already says "SNaNs must be quieted per the usual methods". APFloat is no doubt broken on MIPS in that regard.

C standard library's semantics, which state that pow(1.0, NaN) = 1.0, for any NaN

That's not the case.

In C17, the behavior of signaling NaNs is not defined at all, "This specification does not define the behavior of signaling NaNs. It generally uses the term NaN to denote quiet NaNs" (section F.2.1). On the other hand, C2x draft "recommends" the behavior I describe, and requires it if FE_SNANS_ALWAYS_SIGNAL is defined (section F.2.1 paragraphs 6-8).

The glibc math routines have implemented that recommended practice since 2017, for glibc 2.25, and its fenv.h header sets FE_SNANS_ALWAYS_SIGNAL if the compiler defines __SUPPORT_SNAN__, which GCC sets when -fsignaling-nans is passed.

RalfJung added a comment.EditedFeb 5 2023, 9:34 AM

Regarding what LLVM actually does, I think it would be more accurate to say that floating-point IR instructions are lowered to the closest target-specific counterpart, but optimizations assume that the lowering conforms to IEEE 754 semantics

The point of the LangRef is to describe the effective behavior of LLVM in the abstract, which then puts a boundary on the allowed optimizations. Exhaustively listing the optimizations is not going to work very well.

we cannot allow LLVM to spuriously introduce sNaNs when the original code did not use any.

That is one of the things being discussed here, right? Right now at least on old MIPS LLVM does *not* satisfy this requirement (apfloat will produce sNaN for that platform). The LangRef says "No floating-point exception state is maintained in this environment" which I read as "we don't care at all about the sNaN vs qNaN distinction and will just do whatever".

As a general rule, when an operation gets an sNaN as input, it raises an invalid exception immediately. The default behavior upon that exception is to set the invalid bit in the status flags AND to trigger an immediate return of a qNaN -- even when a qNaN input value would've resulted in some other output.

I thought people were saying LLVM cares about none of that. Probably that was different people. ;)

then we'd allow something like pow(2.0 * qNaN, 0.0) to non-deterministically result in qNaN or 1.0, which is not OK.

Right now it is the case that pow(1.0 * sNaN, 0.0) non-deterministically returns a qNaN or 1.0 (depending on whether the multiplication returned an sNaN or qNaN, and assuming I understood correctly that pow on an sNaN returns a qNaN). That also sounds pretty bad?

I would expect that LLVM pow treats sNaN like qNaN, because otherwise these sNaN-returning arithmetic operations could have rather surprising long-distance effects. If that is *not* the case then for sure just saying "anything may return an sNaN whenever it returns any NaN" does not work. (I honestly find the behavior of pow that you describe extremely surprising. I would not expect an operation to have such wildly different behavior based on whether the input is an sNaN or qNaN. The vast majority of people calling pow will not even know that there are two kinds of NaN; the spec here will almost make sure that non-experts form a wrong mental model of what is happening. If the spec explicitly says that pow(_, 0.0) is always 1, then it better be always 1 -- everything else is actively asking for trouble. You can't just then say in a completely different part of the document "btw when we say 'always' we actually don't mean 'always', we mean 'most of the time'". That's a terrible spec.)

The property of SNaN like SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment also.

Some targets do not support SNaNs, and they could use the optimization x * 1.0 -> x. On targets that support SNaNs this transformation is invalid in the general case, as it violates the requirements of IEEE-754 and does not agree with hardware behavior. LLVM could probably support a flag that indicates whether the semantics of SNaN should be preserved. Until such a flag is implemented, LLVM should always honor SNaNs, because ignoring SNaNs on a target that supports them is incorrect behavior, whereas preserving SNaN behavior on a target that does not support them is only a missed optimization.

we cannot allow LLVM to spuriously introduce sNaNs when the original code did not use any.

That is one of the things being discussed here, right? Right now at least on old MIPS LLVM does *not* satisfy this requirement (apfloat will produce sNaN for that platform). The LangRef says "No floating-point exception state is maintained in this environment" which I read as "we don't care at all about the sNaN vs qNaN distinction and will just do whatever".

Then, LLVM is broken on old MIPS. There's just no way it's okay to spuriously introduce sNaNs when the original program didn't contain sNaNs in the first place. It results in incorrect results, without the original user code breaking any assumptions. (I have no idea if anyone still cares about floating-point on old MIPS or not, but I think it'll be up to someone who does to fix this, if there is someone who cares...)

As a general rule, when an operation gets an sNaN as input, it raises an invalid exception immediately. The default behavior upon that exception is to set the invalid bit in the status flags AND to trigger an immediate return of a qNaN -- even when a qNaN input value would've resulted in some other output.

I thought people were saying LLVM cares about none of that. Probably that was different people. ;)

Above I was describing the general rule per IEEE semantics. As it relates to LLVM, we do care to get the complete semantics correct in strictfp modes. In non-strictfp mode, we care only about a subset of the semantics -- and exactly what that subset contains is what we're trying to refine the definition of here.

Right now it is the case that pow(1.0 * sNaN, 0.0) non-deterministically returns a qNaN or 1.0 (depending on whether the multiplication returned an sNaN or qNaN, and assuming I understood correctly that pow on an sNaN returns a qNaN). That also sounds pretty bad?

Correct, and that is why my recommendation is: "Floating-point math operations assume that all NaNs are quiet." If you violate that, by using an sNaN, you will get potentially unexpected results. (But I'm not sure if using sNaN has to be full-on UB/poison in order to explain the optimizations, or, can we make the constraint violation produce incorrect results which are bounded in some manner.)

I would expect that LLVM pow treats sNaN like qNaN, because otherwise these sNaN-returning arithmetic operations could have rather surprising long-distance effects. If that is *not* the case then for sure just saying "anything may return an sNaN whenever it returns any NaN" does not work.

Yes, precisely, that's what I've been trying to say! But, LLVM cannot redefine pow.

(I honestly find the behavior of pow that you describe extremely surprising. I would not expect an operation to have such wildly different behavior based on whether the input is an sNaN or qNaN.

The mental model you need is to remember that an sNaN triggers an immediate exception for any operation. The default exception handler will abort the operation, and return qNaN immediately. The actual operation doesn't even matter: the exception occurs, just from looking at the arguments. (If you have traps enabled, this seems like it could actually be useful. With traps disabled, I'm not sure there's really much point.)

qNaN is different: it doesn't invoke an immediate exception handler, instead it propagates through the operation. That typically results in a qNaN output, but not always. I believe the rationale for why pow(qNaN, 0) does NOT return qNaN is that every possible finite or infinite value which could be substituted there would produce the same answer: 1.0. Therefore, the result is fully defined, even though the input is not. (Just for clarity, this property is not true for multiplication by 0, because of fmul 0, inf not being 0.)
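
The rationale can be checked directly with ordinary Python floats, whose `math.pow` follows the C-style rule that `pow(x, 0.0)` is 1.0 for every `x`. The sample values below are arbitrary choices for illustration.

```python
import math

samples = [-math.inf, -2.5, -0.0, 0.0, 1.0, 1e308, math.inf]

# Every finite or infinite value raised to the power 0 gives 1.0 ...
print(all(math.pow(v, 0.0) == 1.0 for v in samples))  # True
# ... so defining pow(qNaN, 0) as 1.0 is consistent with all of them.
print(math.pow(math.nan, 0.0))  # 1.0

# Multiplication by zero has no such uniform answer: 0 * inf is a NaN,
# so 0 * qNaN cannot be folded to 0 on the same grounds.
print(0.0 * 2.0 == 0.0)              # True
print(math.isnan(0.0 * math.inf))    # True
```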

In some ways, I'd say qNaN operates similarly to LLVM-IR "undef" and sNaN similarly to LLVM-IR "poison" (But please don't try to take that analogy too far, it's certainly not fully accurate!)

kpn added a comment.Feb 6 2023, 8:04 AM

The property of SNaN like SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment also.

Agreed. Your example of SNaN + 0.0 -> QNaN is an "operation". It affects the environment (raises status flags, for example) but is not a part of the environment. That's my understanding, anyway. I'm happy to be corrected.

The property of SNaN like SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment also.

Neither GCC nor Clang have considered sNaN-related semantics important to provide by default thus far, and I don't think we ought to start now, either. We should continue to support them on an opt-in basis -- at least as far as the end-to-end behavior goes for Clang users. (If there's some good reason to do so, it'd be fine to contemplate changing the IR-level defaults.)

The property of SNaN like SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment also.

Neither GCC nor Clang have considered sNaN-related semantics important to provide by default thus far, and I don't think we ought to start now, either. We should continue to support them on an opt-in basis -- at least as far as the end-to-end behavior goes for Clang users. (If there's some good reason to do so, it'd be fine to contemplate changing the IR-level defaults.)

GCC has the option -fsignaling-nans, which may be used to turn on standard treatment of SNaNs. Clang does not have a similar option; currently the only way to handle SNaNs in the standard way is to turn on exception handling, which is not suitable in some cases. LLVM as a low-level component must support SNaN on an opt-in basis, but the request for such support must be more selective. Anyway, SNaN treatment is not a part of strict exception handling semantics.

GCC has the option -fsignaling-nans, which may be used to turn on standard treatment of SNaNs. Clang does not have a similar option; currently the only way to handle SNaNs in the standard way is to turn on exception handling, which is not suitable in some cases. LLVM as a low-level component must support SNaN on an opt-in basis, but the request for such support must be more selective. Anyway, SNaN treatment is not a part of strict exception handling semantics.

That's an enhancement request. Support for sNaN currently _is_ part of the strictfp mode in LLVM, and that's functionally sufficient for all cases. That said, I do agree it may be overkill (with an unnecessary performance cost) if you only wanted sNaN support but don't care about exception support. I'm not sure if there's really a use-case for sNaN without exceptions, but if someone has such a use, I wouldn't be opposed to seeing such a mode added in the future.

spatel added a comment.Feb 7 2023, 8:36 AM

The property of SNaN like SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment also.

Neither GCC nor Clang have considered sNaN-related semantics important to provide by default thus far, and I don't think we ought to start now, either.

Agreed - we're not changing the default LLVM behavior with this patch, and I have not heard any reasons why we should. IEEE-754-compliant SNaN handling isn't important enough to the majority of FP users to justify FP performance regressions.

GCC has the option -fsignaling-nans, which may be used to turn on standard treatment of SNaNs. Clang does not have a similar option; currently the only way to handle SNaNs in the standard way is to turn on exception handling, which is not suitable in some cases. LLVM as a low-level component must support SNaN on an opt-in basis, but the request for such support must be more selective. Anyway, SNaN treatment is not a part of strict exception handling semantics.

I think every combination of IEEE-754-(non-)compliance is covered by existing Clang flags at this point:
https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior

If -fsignaling-nans is important enough, it could be added as an alias of some combo of those flags. If the optimizer is not behaving as specified by those flags, that's a bug. Here's a bug fix proposal:
https://reviews.llvm.org/D143505

Here's a test program/playground for messing with the optimization settings:
https://godbolt.org/z/EWrKdYx1W

Muon added a comment.Feb 8 2023, 12:26 AM

Agreed - we're not changing the default LLVM behavior with this patch, and I have not heard any reasons why we should. IEEE-754-compliant SNaN handling isn't important enough to the majority of FP users to justify FP performance regressions.

Although I am uncertain about the consequences of changing LLVM's defaults, I would like to point out that there may not actually be a performance impact to handling signaling NaNs correctly. Replacing x * 1.0 (and the like) with x is sound whenever the only uses of x are in quieting floating-point operations (that is, +, -, *, /, fma, rem, sqrt, but not bitwise ops such as negate, abs or copySign). It would pretty much only affect cases where someone was just performing an isolated multiplication by 1 and returning that. (Such as, say, someone trying to quiet a possible-signaling value.)
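
A rough sketch of that argument, using a bit-level model in which only NaN handling is idealized: quieting operations return a quieted copy of the first NaN operand, and bitwise operations touch only the sign bit. The helpers `model_fadd`, `model_fabs`, and `model_mul_by_one` are assumptions for illustration, not LLVM code.

```python
import struct

EXP_MASK = 0x7FF0_0000_0000_0000
MANT_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000
ABS_MASK = 0x7FFF_FFFF_FFFF_FFFF

SNAN = 0x7FF0_0000_0000_0001
TWO = 0x4000_0000_0000_0000


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def to_float(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def to_bits(value: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def model_mul_by_one(a: int) -> int:
    """Quieting operation `a * 1.0`."""
    return a | QUIET_BIT if is_nan(a) else a


def model_fadd(a: int, b: int) -> int:
    """Quieting operation: a NaN operand is returned with its quiet bit set."""
    for operand in (a, b):
        if is_nan(operand):
            return operand | QUIET_BIT
    return to_bits(to_float(a) + to_float(b))


def model_fabs(a: int) -> int:
    """Bitwise operation: clears the sign bit and nothing else."""
    return a & ABS_MASK


# Consumer is a quieting op: dropping the multiply is unobservable.
print(model_fadd(model_mul_by_one(SNAN), TWO) == model_fadd(SNAN, TWO))  # True
# Consumer is a bitwise op: dropping the multiply changes the result.
print(model_fabs(model_mul_by_one(SNAN)) == model_fabs(SNAN))            # False
```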

spatel updated this revision to Diff 495823.Feb 8 2023, 6:34 AM

Patch updated:
Minimized word count and tightened the behavior/expectations for SNaN.

kpn added a comment.Feb 8 2023, 6:35 AM

Agreed - we're not changing the default LLVM behavior with this patch, and I have not heard any reasons why we should. IEEE-754-compliant SNaN handling isn't important enough to the majority of FP users to justify FP performance regressions.

Although I am uncertain about the consequences of changing LLVM's defaults, I would like to point out that there may not actually be a performance impact to handling signaling NaNs correctly. Replacing x * 1.0 (and the like) with x is sound whenever the only uses of x are in quieting floating-point operations (that is, +, -, *, /, fma, rem, sqrt, but not bitwise ops such as negate, abs or copySign). It would pretty much only affect cases where someone was just performing an isolated multiplication by 1 and returning that. (Such as, say, someone trying to quiet a possible-signaling value.)

That's a project that nobody is working on, though, and it would require widespread work in passes like InstSimplify, InstCombine, and others to add the necessary analysis to detect when it is safe and when it is not. There's also the issue that even if it is safe to make the transform, subsequent optimizations that aren't aware the transform happened might make the transform unsafe after the fact.

The property of SNaN that SNaN + 0.0 -> QNaN is not related to exceptions and must be preserved in the default environment as well.

Neither GCC nor Clang have considered sNaN-related semantics important to provide by default thus far, and I don't think we ought to start now, either.

Agreed - we're not changing the default LLVM behavior with this patch, and I have not heard any reasons why we should. IEEE-754-compliant SNaN handling isn't important enough to the majority of FP users to justify FP performance regressions.

We don't know how important SNaN support is, or how large the performance gain from dropping it is. The safe solution in this case is to support both strategies and let the user choose the one they need.

GCC has the option -fsignaling-nans, which may be used to turn on standard treatment of SNaNs. Clang does not have a similar option, so currently the only way to handle SNaNs in the standard way is to turn on exception handling, which is not suitable in some cases. LLVM as a low-level component must support SNaN on an opt-in basis, but the request for such support must be more selective. In any case, SNaN treatment is not a part of strict exception handling semantics.

I think every combination of IEEE-754-(non-)compliance is covered by existing Clang flags at this point:
https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior

If -fsignaling-nans is important enough, it could be added as an alias of some combo of those flags.

According to the proposed wording, SNaNs will be supported with strict exception handling only. It is not natural to couple SNaN support with the FP environment because these are orthogonal things. Keeping the property SNaN + 0.0 -> QNaN does not require access to the rounding mode or status flags, so it can be available in the default environment as well. Strict exception behavior is associated with a substantial performance drop, which can make it inappropriate for users who rely on SNaNs.

spatel added a comment.Feb 8 2023, 9:56 AM

We don't know how important SNaN support is, or how large the performance gain from dropping it is. The safe solution in this case is to support both strategies and let the user choose the one they need.
According to the proposed wording, SNaNs will be supported with strict exception handling only. It is not natural to couple SNaN support with the FP environment because these are orthogonal things. Keeping the property SNaN + 0.0 -> QNaN does not require access to the rounding mode or status flags, so it can be available in the default environment as well. Strict exception behavior is associated with a substantial performance drop, which can make it inappropriate for users who rely on SNaNs.

No substantial performance drop is implied by using strict FP; someone just needs to do the work to make it fast(er) while remaining (some subset of) strict. Previous comments said that would be a welcome enhancement if anyone wants to implement it. It's just not appropriate to make that enhancement a pre-condition for clarifying the existing default behavior.

For default FP, we actually do have an indication of how important IEEE-754-compliant SNaN handling is for most users: neither LLVM nor GCC has ever had that mode on by default, and there are no LLVM bug reports asking for SNaN handling while ignoring exceptions AFAIK. I'd be interested in seeing a real program where that mode of operation makes sense.

kpn accepted this revision.Feb 8 2023, 11:14 AM

I think that about covers it. LGTM.

This revision is now accepted and ready to land.Feb 8 2023, 11:14 AM
Muon added a comment.Feb 8 2023, 4:55 PM

Hold on, that still says that operations assume that all NaNs are quiet. Doesn't that mean that passing a signaling NaN to an operation is potentially undefined behavior? Can we instead say that math operations treat all NaNs as if they were quiet NaNs? Does that run into issues with pow() and friends?

Also, "status flags may be set to arbitrary values" by what exactly? Is it assuming that the initial state is arbitrary? If it's talking about operations, it certainly doesn't imply that they are side effect-free. Quite the opposite, since it implies that operations do in fact have side effects and can do arbitrary things to the status flags, and therefore can never be reordered. This is worse than the normal semantics, since those at least ensure that operations can be freely reordered as long as the reordering doesn't cross a read or clear of the status flags.

In order to actually be side effect-free you need to declare that status flags don't exist, are treated as nonexistent, or that any attempt to read the status flags may return arbitrary values. As soon as operations are declared to interact (or not interact) with status flags, you have the problem of making sure they are (or aren't) modified by them.

arsenm accepted this revision.Feb 8 2023, 5:07 PM
programmerjake added a comment. Edited Feb 8 2023, 5:47 PM

Hold on, that still says that operations assume that all NaNs are quiet. Doesn't that mean that passing a signaling NaN to an operation is potentially undefined behavior? Can we instead say that math operations treat all NaNs as if they were quiet NaNs? Does that run into issues with pow() and friends?

What about saying that for every non-bit-copy fp operation LLVM may arbitrarily treat any input NaNs as if they are quiet NaNs and any output NaNs have arbitrary quietness?

RalfJung added a comment. Edited Feb 9 2023, 2:29 AM

Then, LLVM is broken on old MIPS. There's just no way it's okay to spuriously introduce sNaNs when the original program didn't contain sNaNs in the first place. It results in incorrect results, without the original user code breaking any assumptions.

I think I am trying to understand why this is the case. For my own use of floating points, until fairly recently I didn't even know about the sNaN vs qNaN distinction. Most programmers won't know, and they won't care. Maybe it is reasonable to expect programmers that do care to pass a strictfp flag?

OTOH, even programmers that do not know about sNaN vs qNaN might be very surprised if pow(x, 0.0) can return NaN despite the docs saying it won't... so that cursed behavior of pow might be a forcing function for ensuring LLVM will never introduce new sNaN.

What about saying that for every non-bit-copy fp operation LLVM may arbitrarily treat any input NaNs as if they are quiet NaNs and any output NaNs have arbitrary quietness?

That would give LLVM license to introduce sNaN into programs that don't originally have any sNaN. Whether that is okay seems to be the main remaining contentious point in this discussion.

So the alternative is to say that if any input is an sNaN, then it may be treated as if it was a qNaN and output NaN have arbitrary quietness, but if there are no sNaN inputs then all output NaN will be quiet.

Then, LLVM is broken on old MIPS. There's just no way it's okay to spuriously introduce sNaNs when the original program didn't contain sNaNs in the first place. It results in incorrect results, without the original user code breaking any assumptions.

I think I am trying to understand why this is the case.

because, from what I understand, before IEEE 754 specified how to encode quietness for NaNs, MIPS (and PA-RISC) arbitrarily chose the opposite encoding to what IEEE 754-2008 specifies, so LLVM generating quiet NaNs following IEEE 754-2008 produces NaNs that are actually signalling NaNs for old MIPS. MIPS later added a mode bit allowing swapping its interpretation of signalling/quiet NaNs to fall in line with the IEEE 754-2008 spec -- new MIPS has that set to IEEE 754-2008 mode.
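
The encoding mismatch described here can be illustrated with a short Python sketch over binary32 bit patterns. The function names are hypothetical, and 0x7FC00000 is used only as a commonly seen quiet-NaN pattern under the 2008 convention, not as a statement about what any particular LLVM pass emits.

```python
# binary32 NaN classification under the two quiet-bit conventions.
# Function names are hypothetical; the bit layout is standard binary32.
EXP_MASK = 0x7F800000
FRAC_MASK = 0x007FFFFF
TOP_FRAC_BIT = 0x00400000  # most significant bit of the trailing significand


def is_nan(bits):
    """True if the 32-bit pattern encodes a NaN."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_signaling_2008(bits):
    """IEEE 754-2008 convention: top fraction bit set means quiet."""
    return is_nan(bits) and (bits & TOP_FRAC_BIT) == 0


def is_signaling_legacy(bits):
    """Legacy MIPS / PA-RISC convention: top fraction bit set means signaling."""
    return is_nan(bits) and (bits & TOP_FRAC_BIT) != 0


QNAN_2008 = 0x7FC00000    # quiet under IEEE 754-2008
QNAN_LEGACY = 0x7FBFFFFF  # quiet under the legacy convention

# The same bit pattern flips meaning between the two conventions.
same_bits_disagree = (
    not is_signaling_2008(QNAN_2008) and is_signaling_legacy(QNAN_2008)
)
```

A compiler that constant-folds to QNAN_2008 therefore hands a legacy-mode target a value that target classifies as signaling.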

So the alternative is to say that if any input is an sNaN, then it may be treated as if it was a qNaN and output NaN have arbitrary quietness, but if there are no sNaN inputs then all output NaN will be quiet.

That works, except for old MIPS, where LLVM's idea of what's quiet/signalling doesn't currently match.

because, from what I understand, before IEEE 754 specified how to encode quietness for NaNs, MIPS (and PA-RISC) arbitrarily chose the opposite encoding to what IEEE 754-2008 specifies, so LLVM generating quiet NaNs following IEEE 754-2008 produces NaNs that are actually signalling NaNs for old MIPS. MIPS later added a mode bit allowing swapping its interpretation of signalling/quiet NaNs to fall in line with the IEEE 754-2008 spec -- new MIPS has that set to IEEE 754-2008 mode.

Sorry, I should have clarified, I was asking specifically about this part:
"There's just no way it's okay to spuriously introduce sNaNs when the original program didn't contain sNaNs in the first place."

kpn added a comment.Feb 9 2023, 5:32 AM

Then, LLVM is broken on old MIPS. There's just no way it's okay to spuriously introduce sNaNs when the original program didn't contain sNaNs in the first place. It results in incorrect results, without the original user code breaking any assumptions.

I think I am trying to understand why this is the case.

because, from what I understand, before IEEE 754 specified how to encode quietness for NaNs, MIPS (and PA-RISC) arbitrarily chose the opposite encoding to what IEEE 754-2008 specifies, so LLVM generating quiet NaNs following IEEE 754-2008 produces NaNs that are actually signalling NaNs for old MIPS. MIPS later added a mode bit allowing swapping its interpretation of signalling/quiet NaNs to fall in line with the IEEE 754-2008 spec -- new MIPS has that set to IEEE 754-2008 mode.

So the alternative is to say that if any input is an sNaN, then it may be treated as if it was a qNaN and output NaN have arbitrary quietness, but if there are no sNaN inputs then all output NaN will be quiet.

That works, except for old MIPS, where LLVM's idea of what's quiet/signalling doesn't currently match.

I really think that old MIPS shouldn't be a part of this analysis. We should treat old MIPS as "broken" and not bother discussing it here. If someone wants to fix it by, for example, adding support for it to the APFloat class then they can. Currently it just makes this discussion more complicated and I don't see the benefit.

kpn added a comment.Feb 9 2023, 6:09 AM

Hold on, that still says that operations assume that all NaNs are quiet. Doesn't that mean that passing a signaling NaN to an operation is potentially undefined behavior? Can we instead say that math operations treat all NaNs as if they were quiet NaNs? Does that run into issues with pow() and friends?

How many of these sNaN vs qNaN cases that matter are there? I've seen pow() mentioned in this ticket. What are the other cases?

Maybe a sentence or two that states we may treat sNaN and qNaN differently if required by a standards document but otherwise don't? Except we probably don't right now and shouldn't promise we do or will.

Also, "status flags may be set to arbitrary values" by what exactly? Is it assuming that the initial state is arbitrary? If it's talking about operations, it certainly doesn't imply that they are side effect-free. Quite the opposite, since it implies that operations do in fact have side effects and can do arbitrary things to the status flags, and therefore can never be reordered. This is worse than the normal semantics, since those at least ensure that operations can be freely reordered as long as the reordering doesn't cross a read or clear of the status flags.

There aren't going to be any reads of the status flags because in the default environment we define them to never be observed by the program. So no reads, and we can reorder as we wish. At program start all flags are lowered, and they may be changed during the execution of the program but the program won't see it.

The default environment is what we assume when "FENV_ACCESS OFF" (or similar with compiler flags and maybe other pragmas), and we support accessing the floating point environment and possibly enabling Unix signals when using "FENV_ACCESS ON". The constrained floating point intrinsics are for the latter case. We try to get sNaN handling correct with the constrained intrinsics, and those are described in a different part of the document. The constrained intrinsics are still marked experimental.

In order to actually be side effect-free you need to declare that status flags don't exist, are treated as nonexistent, or that any attempt to read the status flags may return arbitrary values. As soon as operations are declared to interact (or not interact) with status flags, you have the problem of making sure they are (or aren't) modified by them.

If you are going to be interacting with the status flags then you need to be using the constrained floating point intrinsics where you are allowed to access the FP environment and to be in an alternate floating point environment. With the constrained intrinsics you are allowed to observe the FP status flags. No constrained intrinsics, no observing the status flags. Period.

Hold on, that still says that operations assume that all NaNs are quiet. Doesn't that mean that passing a signaling NaN to an operation is potentially undefined behavior? Can we instead say that math operations treat all NaNs as if they were quiet NaNs? Does that run into issues with pow() and friends?

Passing SNaN is not undefined behavior. Either it's going to pass through the optimizer undetected, or it's going to be noticed and be optimized as if it were a QNaN. It seems similar to our description of the "nsz" fast-math-flag:
"No Signed Zeros - Allow optimizations to treat the sign of a zero argument or zero result as insignificant. This does not imply that -0.0 is poison and/or guaranteed to not exist in the operation."

C2X draft N3047 says this in "F.2.1":
"This annex does not require the full support for signaling NaNs specified in IEC 60559. This annex uses the term NaN, unless explicitly qualified, to denote quiet NaNs. Where specification of signaling NaNs is not provided, the behavior of signaling NaNs is implementation-defined (either treated as an IEC 60559 quiet NaN or treated as an IEC 60559 signaling NaN). 438)
438)Since NaNs created by IEC 60559 arithmetic operations are always quiet, quiet NaNs (along with infinities) are sufficient for closure of the arithmetic."

Is there something we can adapt from either of those to make the text clearer?

How many of these sNaN vs qNaN cases that matter are there? I've seen pow() mentioned in this ticket. What are the other cases?

Maybe a sentence or two that states we may treat sNaN and qNaN differently if required by a standards document but otherwise don't? Except we probably don't right now and shouldn't promise we do or will.

We do not treat them differently in any optimization that I know of, and we don't want to start. We want to operate as we are currently - that lines up with the C spec quoted above.

How many of these sNaN vs qNaN cases that matter are there? I've seen pow() mentioned in this ticket. What are the other cases?

There's pow, hypot, fmin, and fmax: they all are expected to return a qNaN when passed an sNaN input, and will return a non-NaN value for qNaN input, depending on the other argument. I'm not 100% sure that's a complete list, but I think so.
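
The quiet-NaN half of this can be checked with Python's math module, which follows the C Annex F conventions for these special cases. This is only a sketch: it assumes float("nan") is a quiet NaN (as it is in CPython on common platforms), and it does not exercise the sNaN half, which Python does not expose portably.

```python
import math

qnan = float("nan")  # assumed to be a quiet NaN

# pow ignores a quiet NaN when the other argument already fixes the result.
pow_zero_exp = math.pow(qnan, 0.0)
pow_one_base = math.pow(1.0, qnan)

# hypot with an infinite argument is infinite even if the other is a NaN.
hyp = math.hypot(math.inf, qnan)

# When the other argument does not fix the result, the NaN propagates.
propagated = math.pow(qnan, 1.0)
```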

OTOH, even programmers that do not know about sNaN vs qNaN might be very surprised if pow(x, 0.0) can return NaN despite the docs saying it won't... so that cursed behavior of pow might be a forcing function for ensuring LLVM will never introduce new sNaN.

Yes. We do not want to break the semantics of a correct program which does not use any sNaNs. We support qNaN, so I don't see how pow(1.0, expr-resulting-in-qNaN) -> qNaN (instead of 1.0) could be considered anything other than a miscompile -- unless the user has done something forbidden like using an sNaN in the computation of expr-resulting-in-qNaN.

So the alternative is to say that if any input is an sNaN, then it may be treated as if it was a qNaN and output NaN have arbitrary quietness, but if there are no sNaN inputs then all output NaN will be quiet.

This sounds right. We may:

  1. Treat an sNaN input value as if it had been a qNaN (and thus e.g. return a 1.0 instead of a qNaN from pow 1.0, sNaN), or
  2. Pass an sNaN through an operation without quieting it first (and thus e.g. return a sNaN instead of qNaN from fadd sNaN, 1.0).
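
The two permissions listed above can be written down as a small nondeterministic model. This is a sketch of the rule as worded in this thread, not of any LLVM implementation; both function names are made up, and the pow model covers only the two outcomes named in the list.

```python
# Hypothetical model of the proposed rule; not an LLVM API.

def allowed_nan_results(input_kinds):
    """Kinds of NaN an arithmetic op (e.g. fadd) may return when its result is a NaN.

    input_kinds: iterable of 'num', 'qnan', or 'snan', one per operand.
    """
    if "snan" in input_kinds:
        # The sNaN may be quieted (IEEE-754) or passed through unquieted.
        return {"qnan", "snan"}
    # No sNaN in, so no sNaN out.
    return {"qnan"}


def allowed_pow_one_base(exponent_kind):
    """Outcomes of pow(1.0, y) named in the list above, by the kind of y."""
    if exponent_kind == "snan":
        # Treated as signaling -> qNaN; treated as quiet -> 1.0.
        return {"qnan", 1.0}
    return {1.0}
```

For example, allowed_nan_results(["snan", "num"]) contains both kinds, while allowed_nan_results(["qnan", "num"]) contains only "qnan", which is the guarantee that a program without sNaN inputs never sees one appear.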

The part I am unsure of, is whether it is possible to put such a limited bound on the undefinedness. I don't feel like I have a good enough understanding/intuition of this sort of thing to do anything other than express a worry, so I hope someone else can clarify this aspect for me, and either reassure me that's an unfounded worry, or confirm it.

My worry is: Does having such an indeterminate output value, combined with other optimization passes, trigger unbounded UB from the system as-a-whole? E.g., because we can duplicate and coalesce FP math instructions, and make a different optimization decision for each duplicated instance separately, a single fadd with an sNaN input could appear to be a qNaN to some of its uses and an sNaN for others. Which then as discussed changes the results of finite values from FP computations too. Could that cause problems in downstream optimization passes?

My worry is: Does having such an indeterminate output value, combined with other optimization passes, trigger unbounded UB from the system as-a-whole? E.g., because we can duplicate and coalesce FP math instructions, and make a different optimization decision for each duplicated instance separately, a single fadd with an sNaN input could appear to be a qNaN to some of its uses and an sNaN for others. Which then as discussed changes the results of finite values from FP computations too. Could that cause problems in downstream optimization passes?

We need to be a little careful with operations that produce non-deterministic results, sure; in particular, we can't rematerialize/LoopSink/etc. them, and certain peepholes aren't legal if a value has multiple uses. (A simple example of such an instruction is "freeze poison".) This is something we have to deal with anyway for floating-point ops, though; fast-math is defined to be non-deterministic.

My worry is: Does having such an indeterminate output value, combined with other optimization passes, trigger unbounded UB from the system as-a-whole? E.g., because we can duplicate and coalesce FP math instructions, and make a different optimization decision for each duplicated instance separately, a single fadd with an sNaN input could appear to be a qNaN to some of its uses and an sNaN for others. Which then as discussed changes the results of finite values from FP computations too. Could that cause problems in downstream optimization passes?

Note that this is already a concern even without the special sNaN handling: the exact qNaN being produced by 0.0/0.0 is non-deterministic, and can differ depending on which optimizations are applied. (See for instance this Rust issue.)

So it is already the case that optimizations may not duplicate FP math instructions (unless they can prove that the result is not a NaN). Coalescing is fine though.

spatel updated this revision to Diff 496500.Feb 10 2023, 8:52 AM

Patch updated:

  1. Specify the SNaN behavior more exactly and use examples.
  2. Adjusted wording about the status flags.
jyknight accepted this revision.Feb 14 2023, 7:04 AM

Given the reassurance regarding optimizations of non-deterministic results, I'm satisfied.

Only one final nit: "Floating-point math operations treat all NaNs as quiet NaNs." sounds like they _always_ do, so how about: "Floating-point math operations are permitted to treat all NaNs as if they were quiet NaNs." instead?

Thanks!

Only one final nit: "Floating-point math operations treat all NaNs as quiet NaNs." sounds like they _always_ do, so how about: "Floating-point math operations are permitted to treat all NaNs as if they were quiet NaNs." instead?

Yes, that's a better description. If the optimizer doesn't know if something is a NaN at compile-time, then it's going to pass through to the backend and asm, so the final behavior depends on the target.

spatel updated this revision to Diff 497322.Feb 14 2023, 7:26 AM

Softened the wording for NaN handling - we may treat SNaN as QNaN, but we don't have to (and D143505 tries to make that more IEEE-compliant even with default FP).

This revision was landed with ongoing or failed builds.Feb 15 2023, 5:58 AM
This revision was automatically updated to reflect the committed changes.