This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] preserve signbit semantics of NAN with fold to fabs
ClosedPublic

Authored by spatel on Dec 11 2022, 7:13 AM.

Details

Summary

As discussed in issue #59279, we want fneg/fabs to conform to the IEEE-754 spec for signbit operations - quoting from section 5.5.1 of IEEE-754-2008:
"negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit"
"abs(x) copies a floating-point operand x to a destination in the same format, setting the sign bit to 0 (positive)"
"The operations treat floating-point numbers and NaNs alike."

So we gate this transform with "nnan" in addition to "nsz":
(X > 0.0) ? X : -X --> fabs(X)

Without that restriction, we could have for example:
(+NaN > 0.0) ? +NaN : -NaN --> -NaN (because an ordered compare with NaN is always false)
That would be different than fabs(+NaN) --> +NaN.

More fabs/fneg patterns demonstrated here:
https://godbolt.org/z/h8ecc659d
(without any FMF, these are correct independently of this patch - no fabs should be created)

The code change is a one-liner, but we have lots of tests diffs because there are many variations of the basic pattern.

Diff Detail

Event Timeline

spatel created this revision.Dec 11 2022, 7:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 11 2022, 7:13 AM
spatel requested review of this revision.Dec 11 2022, 7:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 11 2022, 7:13 AM
spatel edited the summary of this revision. (Show Details)Dec 11 2022, 7:14 AM
RalfJung added a comment.EditedDec 11 2022, 8:52 AM

I am a bit confused by the explanation. The signbit of NaN is "insignificant" for *some* operations like + in the sense that they just pick an arbitrary sign bit if the result is NaN (and so the sign bit of a NaN input has no effect). But it affects the behavior of >, fneg, and fabs (the operations involved here) in totally deterministic ways, right? When only >, fneg, fabs are used (and copies of the floating point value from one register to another), the sign bit is fully significant, like all the others.

The actual issue is that X > 0.0 is false for negative NaNs, so the LHS returns X in that case -- meaning the LHS can sometimes return a value with the sign bit set, but the RHS will never have the sign bit set. Hence the transformation sometimes produces a wrong sign bit. And that bit is significant enough to make this an incorrect transformation.

I think it is hence very confusing to claim that the sign bit of a NaN is insignificant/meaningless in general.

Or did I misunderstand something?

The actual issue is that X > 0.0 is false for negative NaNs

Comparisons don't look at the sign of NaN. They are not bitwise operations like fabs/fneg.

Comparisons don't look at the sign of NaN. They are not bitwise operations like fabs/fneg.

Yeah, and hence -NaN > 0.0 is false, so that's what I said, no?

My confusion is with this comment

// Note: This requires nnan to preserve signbit semantics even though the
//       signbit of a NAN is insignificant.

since "signbit of a NAN is insignificant" is not correct in general.

I would be less confused by something like

// Note: This requires nnan to preserve signbit semantics since fcmp ignores
//       the signbit of a NAN.

Comparisons don't look at the sign of NaN. They are not bitwise operations like fabs/fneg.

Yeah, and hence -NaN > 0.0 is false, so that's what I said, no?

My confusion is with this comment

// Note: This requires nnan to preserve signbit semantics even though the
//       signbit of a NAN is insignificant.

since "signbit of a NAN is insignificant" is not correct in general.

I would be less confused by something like

// Note: This requires nnan to preserve signbit semantics since fcmp ignores
//       the signbit of a NAN.

The confusion comes from the IEEE spec itself - the signbit of NAN is never meaningful in real code, but the spec requires that the neg/abs/copysign operations treat the signbit of NAN deterministically anyway. This is what we went back and forth on in the bug report. Ie, we could implement the less strict interpretation, and it should not affect functionality of any real program. The code/test comments were intending to suggest that alternative interpretation if we ever decided that this way was overkill. I'm ok changing the comment if that's preferred.

the signbit of NAN is never meaningful in real code,

This is the statement I am still objecting to. It *is* meaningful in the sense that (a) code can look at it if it wants, and (b) some operations guarantee to treat it in a particular way (in particular, float Y = true ? X1 : X2; guarantees that Y has the same signbit as X1).

The signbit is never meaningful in code that only treats float values as approximation to mathematical real numbers, but LLVM must work on other code, too. For that reason I think the mindset of "it is not meaningful but ..." is confusing, and easily leads to misunderstandings and bugs such as the one this change is fixing.

The signbit is never meaningful in code that only treats float values as approximation to mathematical real numbers, but LLVM must work on other code, too. For that reason I think the mindset of "it is not meaningful but ..." is confusing, and easily leads to misunderstandings and bugs such as the one this change is fixing.

I'm not convinced that this is really a bug, but I'm ok changing the wording. :)

By default, LLVM IR implements a subset of IEEE FP to allow optimizations that IEEE does not:
https://llvm.org/docs/LangRef.html#floating-point-environment
(also note that we support targets that are even less IEEE-compliant than the IR spec, so we don't do some transforms that would be allowed if we could assume an IEEE-compliant target)

So it's up to us to decide if this quirk of NAN and signbit-op behavior is worth implementing. This will make Alive2 more useful, and I'm hoping it won't harm optimization in real code. If it does harm optimization, we could switch back (leave this transform as-is). In other words, we would make it explicit that the signbit of NAN is meaningless as it was (briefly) implemented in Alive2. But then we would lose some other optimization as noted in the bug report.

spatel updated this revision to Diff 481932.Dec 11 2022, 11:03 AM
spatel edited the summary of this revision. (Show Details)

Patch updated:
Adjusted code comment - we are implementing a model where the signbit is specified by fabs/fneg, so don't say it is meaningless.

RalfJung added a comment.EditedDec 11 2022, 11:13 AM

Note that semantics of fneg/fabs is irrelevant for this optimization to be a problem. That's why I keep using float Y = true ? X1 : X2; as the example. To my knowledge nobody disputes that NaN < 0.0 must deterministically return false, so X < 0.0 ? -X : X will behave the same as false ? -X : X when X is a NAN, which is the same as just X -- I hope this much is uncontroversial. The only potential question then is whether float Y = X must preserve the NAN sign bit of X.

float X = /* NaN with sign bit set */;
float Y1 = X < 0.0 ? -X : X; // just a convoluted way to write `Y1 = false ? ... : X`, aka `Y1 = X`.
float Y2 = X;
// Clearly Y1 and Y2 must be bit-equal here, since both are just copies of X?
// But making Y1 be fabs(X) will destroy that property no matter whether or
// not we specify `fabs` to behave deterministically on NaNs.

I personally would be extremely surprised if just copying a value was allowed to alter any of its bits. Is that the alternative NAN semantics you meant, or am I misunderstanding something?

Note that semantics of fneg/fabs is irrelevant for this optimization to be a problem. That's why I keep using float Y = true ? X1 : X2; as the example. To my knowledge nobody disputes that NaN < 0.0 must deterministically return false, so X < 0.0 ? -X : X will behave the same as false ? -X : X when X is a NAN, which is the same as just X -- I hope this much is uncontroversial. The only potential question then is whether float Y = X must preserve the NAN sign bit of X.

float X = /* NaN with sign bit set */;
float Y1 = X < 0.0 ? -X : X; // just a convoluted way to write `Y1 = false ? ... : X`, aka `Y1 = X`.
float Y2 = X;
// Clearly Y1 and Y2 must be bit-equal here, since both are just copies of X?

I think there's still some ambiguity in this example. How was X initialized/observed with -NaN? Testing for bit-equality requires casting to integer. There's no way to do that comparison with FP values?

Note that we also don't guarantee that NaN payload bits propagate. That part is less well-specified in IEEE-754, so maybe it's less controversial?

I personally would be extremely surprised if just copying a value was allowed to alter any of its bits. Is that the alternative NAN semantics you meant, or am I misunderstanding something?

Yes, I think that's right. But as I suggested above, this is already the case and continues independently of this patch. NaN payload bits are part of this problem/discussion too, right?
https://discourse.llvm.org/t/semantics-of-nan/66729/11

How was X initialized/observed with -NaN? Testing for bit-equality requires casting to integer. There's no way to do that comparison with FP values?

Yes, this assumes we can convert between float and i32 freely, and non-poison/undef values are preserved perfectly with all their bits when doing that. If this is not true, the LangRef surely must contain big fat warnings about that, and I doubt Rust would be the only frontend that would be broken badly by such a decision.

Note that we also don't guarantee that NaN payload bits propagate. That part is less well-specified in IEEE-754, so maybe it's less controversial?

They do propagate perfectly in copy, copysign, fneg, fabs according to IEEE, is my understanding.

Ping.

This patch gets us closer to the desired behavior of IEEE-compliant fneg/fabs/copysign (there is no "copy" in IR) semantics with the default LLVM FP environment, and the code comment is hopefully clear now.

If there are other open questions, those should be added to the discourse thread or in bug reports.

Could you add a relevant quote of the IEE spec into the description,
and add an example with +NaN and -NaN and describe the outcome (kinda like alive2 counter-example)?

spatel edited the summary of this revision. (Show Details)Dec 26 2022, 12:41 PM

Could you add a relevant quote of the IEE spec into the description,
and add an example with +NaN and -NaN and describe the outcome (kinda like alive2 counter-example)?

Sure - see updated text. Let me know if you'd like to see something else or different.

Could you add a relevant quote of the IEE spec into the description,
and add an example with +NaN and -NaN and describe the outcome (kinda like alive2 counter-example)?

Sure - see updated text. Let me know if you'd like to see something else or different.

Consider also adding https://godbolt.org/z/h9aefnKhE

lebedev.ri added a comment.EditedDec 26 2022, 1:09 PM

Could you add a relevant quote of the IEE spec into the description,
and add an example with +NaN and -NaN and describe the outcome (kinda like alive2 counter-example)?

Sure - see updated text. Let me know if you'd like to see something else or different.

Consider also adding https://godbolt.org/z/h9aefnKhE

Err, more like https://godbolt.org/z/1h161jEEj (or even https://godbolt.org/z/fe3sKa9Yc)

spatel edited the summary of this revision. (Show Details)Dec 27 2022, 8:24 AM

Thanks! Added a similar link and explanation to the patch summary.

lebedev.ri accepted this revision.Dec 27 2022, 8:40 AM

I'm not very good with FP, but this does seem well-justified.
Thanks.

This revision is now accepted and ready to land.Dec 27 2022, 8:40 AM