When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load.
However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles.
Details
Diff Detail
Event Timeline
I agree this seems broken; we shouldn't shrink SNaN constants. However, I still have two comments:
1.) Does this actually have to be platform-specific? I mean, is there any platform where a SNaN constant *can* be stored in converted form? I would have expected that the type-conversion operation needed to load a "shrunk" SNaN constant would be required to actually trigger the "signaling" part of the SNaN (i.e. raise an invalid operation exception) on all platforms ...
2.) To detect an SNaN, you're testing for bitwise-equality against one specific SNaN value. However, in general there can be many different SNaN values. I think you should just test for APF.isSignaling().
This was my initial thought too. It doesn't seem to be a problem on x86 (we get the right value printed and no exceptions), which suggests that it's being handled somehow. I'm not sure about other platforms however.
I left this fix as platform-specific to reduce the possible impact, but if there are no expected drawbacks from disabling it globally, then I'm willing to do so.
2.) To detect an SNaN, you're testing for bitwise-equality against one specific SNaN value. However, in general there can be many different SNaN values. I think you should just test for APF.isSignaling().
Thanks, I've made this change.
Hmm, I've had a bit of a closer look into x86 here. First of all, it seems we're only "shrinking" constants in the x86 extended format. And for this format, there's special code in APFloat::convert that marks any conversion of an x86 extended format SNaN to double as "inexact", and therefore the generic shrinking code doesn't do it (because isValueValidForType returns false).
The current code just has a somewhat cryptic comment:
// x86 has some unusual NaNs which cannot be represented in any other // format; note them here.
but the original code as checked in here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20081006/068140.html
explicitly says:
// We mark this as Inexact if we are losing information. This happens // if are shifting out something other than 0s, or if the x87 long // double input did not have its integer bit set (pseudo-NaN), or if the // x87 long double input did not have its QNan bit set (because the x87 // hardware sets this bit when converting a lower-precision NaN to // x87 long double).
The remark in parenthesis seems to match exactly the problem on System z.
I actually haven't seen any case where LLVM would in fact generate loads of a "shrunk" SNaN constant. Do you have any such case?
B.t.w. the original patch had a test case which seems to have gotten lost in this iteration :-)
Hmm, that definitely explains why x86 isn't running into this issue.
In that case, is the general approach of not shrinking SNaNs in the first place better, or would it make sense to add more special code to APFloat to avoid "inexact" conversions on SystemZ as well?
My feeling is that more backend-specific in APFloat is a bit messy, though it has the potential to fix other cases where we're converting a SNaN (not that I'm currently aware of any).
Here's some example code that results in a "shrunk" SNaN load:
#include <iostream>
#include <limits>
int main()
{
   double snan = std::numeric_limits<double>::signaling_NaN();
   long long *p = (long long *)(&snan);
   std::cout << std::hex << *p << '\n';
   return 0;
}This will print 7ffc000000000000 with trunk Clang, but will print 7ff4000000000000 with GCC or with my patch applied.
Thinking about that, I'm not sure whether the APFloat.convert behavior is really correct. In many cases, it is possible to convert a SNaN in to another SNaN in a different format exactly. (Not all cases, of course. Also, the x86 extended format "pseudo-NaNs" are really special, and can never be converted exactly.)
But simply never doing constant shrinking of a signaling NaN seems the best way to me right now ...
Here's some example code that results in a "shrunk" SNaN load:
[snip]
This will print 7ffc000000000000 with trunk Clang, but will print 7ff4000000000000 with GCC or with my patch applied.
Well, I can reproduce thie 7ffc... result, but only with -m32 and with no optimization. But under those circumstances, I see it with both clang and gcc.
What happens here is that the (non-inlined) call to signaling_NaN() returns an SNaN in the 80-bit extended format in %fp(0) according to the 32-bit floating-point ABI. Code then stores this value to a 64-bit stack slot using FSTPL. *This* instruction is what converts the extended SNaN into a double-precision QNaN (and also sets the inexact exception flag). But that's completely separate issue from constant shrinking ...
Ah, sorry, I should have been more clear -- that code snippet causes the SNaN to be shrunk and converted to a QNaN specifically on SystemZ. I don't have any code that reproduces the issue on other platforms (on x86, the APFloat behaviour is presumably preventing the shrinking from happening).
To be more specific, I was trying to say the APFloat.convert behavior for non-x86 architectures seems correct, so we should't be adding any SNaN-specific code there. (I think the behavior on *x86* may not be fully correct for non-pseudo SNaNs in the extended format, but that's a different story ...)
But simply never doing constant shrinking of a signaling NaN seems the best way to me right now ...
That still seems the right fix to me: simply never shrink SNaN values on any platform.
Since this is then a cross-platform change, you might want to add some floating-point experts to the approvers list.
OK, understood.
This now looks good to me.
Maybe one minor request: Please add a comment before the if (!APF.isSignaling()) check explaining why this is necessary. Otherwise, I think you can go ahead and check this in now.
A build bot appears to be failing the test case:
http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/27427
It looks like shrinking is somehow still happening, so presumably APFloat::isSignaling is giving a false negative. I'm not sure what would be causing it to happen on that one specific bot though.
The builder is configured as -m32.
I guess it would be x87fp-related issue.
On i686 host, NaN might not be preserved.
I re-added the test and it looks like it hasn't run into any issues overnight. Thanks!