Previously we assumed that the result of sqrt(x) always had 0 as its
sign bit. But sqrt(-0) == -0.
Details
Diff Detail
- Build Status
Buildable 3100 Build 3100: arc lint + arc unit
Event Timeline
LGTM with extra changes dropped
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2676–2678 | Unrelated changes |
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2676–2678 | I introduced the variable "CI" above, and I did not want to shadow it here. I could split out this variable rename into a separate patch if you really think that's necessary? |
I think this is actually incorrect, see https://llvm.org/bugs/show_bug.cgi?id=31702#c1.
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2676–2678 | OK, that's fine |
Update based on my understanding of what fabs really means, see https://llvm.org/bugs/show_bug.cgi?id=31702#c1.
Basically, I had my "assumption directions" wrong. IEEE 754 says the sign bit
of NaNs is unspecified, but the behavior of fabs on a NaN is to unset the sign
bit unconditionally. Therefore the correct thing is to assume that any NaN we
get has the sign bit set to 1! IOW the NaN sign bit is "might be anything",
not "undefined, we can pretend it's whatever we want."
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2673 | You need to prove that the result isn't a NaN and isn't -0. Using nsz to prove the result isn't -0 is suspicious. For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0. Actually, sqrt(0) could produce -0 under that definition. Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt(). I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about. We can split floating-point numbers into categories: +inf, -inf, +0, -0, positive finite, negative finite, negative nan, and positive nan. CannotBeOrderedLessThanZero needs to prove that the input is not in the category -inf or the category negative finite. SignBitMustBeZero needs to prove that the input is not in one of the categories -0, -inf, negative finite, or negative nan. To transform nnan fabs(x) -> x, we need to prove that the input is not in one of the categories -0, -inf, or negative finite. To transform nnan nsz fabs(x) -> x, we need to prove that the input is not in one of the categories -inf, or negative finite. To transform fast fabs(x) -> x, we need to prove the input is not in the category negative finite. Does this make sense, or is it completely crazy? :) |
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2673 |
What I understand it means by "not significant" is that anywhere we have an nsz operation that produces or consumes a zero, we can assume that zero is positive or negative zero, at our convenience. So I agree nsz sqrt(+0) could return -0 under this definition, but I think the point is that we get to choose. Reading it the opposite way -- that nsz sqrt(+0) might return -0 and we have to account for this as a possibility -- would mean that nsz *limits* our optimization opportunities, which seems clearly not what it's intended to do.
I think this would be an additional correct optimization, but per above I don't think that what we have here is wrong.
Makes a ton of sense! Just a few things to add, which we may already agree on:
|
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
2673 |
I agree. If the user passes -fno-signed-zeros, then they're saying that we can ignore the possibility of signed zeros when determining optimization legality. They might still exist, and the fact that they still exist might still be observable, but the user is promising that they don't care that this might be observed. One key question here is: Are we allowed to observe signed zeros inconsistently? If x == -0, for example, can if (x != -0) print(x); print -0? I think the answer must be yes. This code cannot be sensibly compiled under -fno-signed-zeros. |
Reading it the opposite way -- that nsz sqrt(+0) might return -0 and we have to account for this as a possibility -- would mean that nsz *limits* our optimization opportunities, which seems clearly not what it's intended to do.
Well, if everything is nsz, that doesn't really come up; the user is under the same restriction as the sqrt() itself, so it doesn't limit optimization. The weird edge-case is where you have an nsz instruction which feeds into a non-nsz instruction: either we restrict optimization like you describe, or users of an nsz instruction are effectively nsz themselves. The latter seems worse... for example, if we started using SignBitMustBeZero to feed into integer KnownBits computations, we could have an integer value 0x80000000 whose top bit is "known" to be zero, which would lead to undefined behavior.
If x == -0, for example, can if (x != -0) print(x); print -0?
No, because 0. == -0. is true. :)
! In D28928#656636, @efriedma wrote:
The weird edge-case is where you have an nsz instruction which feeds into a non-nsz instruction: either we restrict optimization like you describe, or users of an nsz instruction are effectively nsz themselves.
Agreed, these are the two options. But I would amend the second option to say that the users of these instructions are "halfway nsz" themselves. Their inputs are nsz, but their outputs are not.
The latter seems worse... for example, if we started using SignBitMustBeZero to feed into integer KnownBits computations, we could have an integer value 0x80000000 whose top bit is "known" to be zero, which would lead to undefined behavior.
Are you worried about the case where we have something like
float f = nsz some_op; int i = bitcast f to int bool b = top bit of i
? I think as soon as we leave the fp space we can no longer play fast and loose with signed zeroes. Like, the result of the bitcast must have some value for its sign bit -- doesn't matter what it is because nsz -- and then we must stick with that, consistently.
If x == -0, for example, can if (x != -0) print(x); print -0?
No, because 0. == -0. is true. :)
Not sure if kidding around; if not, I think Hal was trying to ask: Can`if (x == 0 && !signbit(x)) print (x);` print "-0"? He's saying yes (if the whole program is compiled with nsz), and I think I agree.
I think as soon as we leave the fp space we can no longer play fast and loose with signed zeroes.
Agreed.
Can`if (x == 0 && !signbit(x)) print (x);` print "-0"? He's saying yes (if the whole program is compiled with nsz), and I think I agree.
I think it depends on how exactly "signbit" is implemented; if signbit and print are both integer operations, we probably have to be conservative because we're no longer in FP space. I think we can be more flexible with an nsz copysign, as in "if (x == 0 && copysign(1, x) == 1)".
llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll | ||
---|---|---|
237 | I'm not convinced this transform is legal without an nsz flag on the fabs... Patch is fine otherwise. |
llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll | ||
---|---|---|
237 | I thought we talked about this above and established that, because fabs is a floating-point op and here its input comes from an nsz op, we can assume that the sign of the zero is "insignificant". My understanding was that only if we have a non-fp op involved do things become complicated. Maybe this would be easier on IRC? Please ping me and Hal. |
llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll | ||
---|---|---|
237 | Here's what I understand how you're looking at this from this thread:
I think 3 might be a bit too aggressive; it seems like a good idea to enforce the invariant that in non-fast-math code, fabs(x) never has the sign bit set. With this patch, that breaks if you use a library compiled with fast-math to produce x. That said, mixing fast-math and non-fast-math code is likely to be pretty infrequent, so maybe we shouldn't worry about it. Feel free to commit as-is. |
llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll | ||
---|---|---|
237 | Okay, will do, thanks. FWIW I totally see your point, and personally I don't feel particularly strongly about this. But I also don't think we have a particularly clear idea about how we want this to work at the moment. So...yeah. Thanks again. |
You need to prove that the result isn't a NaN and isn't -0. Using nsz to prove the result isn't -0 is suspicious. For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0. Actually, sqrt(0) could produce -0 under that definition.
Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt().
I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about. We can split floating-point numbers into categories: +inf, -inf, +0, -0, positive finite, negative finite, negative nan, and positive nan. CannotBeOrderedLessThanZero needs to prove that the input is not in the category -inf or the category negative finite. SignBitMustBeZero needs to prove that the input is not in one of the categories -0, -inf, negative finite, or negative nan. To transform nnan fabs(x) -> x, we need to prove that the input is not in one of the categories -0, -inf, or negative finite. To transform nnan nsz fabs(x) -> x, we need to prove that the input is not in one of the categories -inf, or negative finite. To transform fast fabs(x) -> x, we need to prove the input is not in the category negative finite.
Does this make sense, or is it completely crazy? :)