There is a FIXME in APFloat that is saying ppc_fp128 is supported through IEEE 128 bits semantic, not IBM double-double semantic. Add a flag to turn of the constand folding to leave the computation to runtime.
Diff Detail
Event Timeline
This looks super suspicious. Why can't we fix this for real in APFloat?
I'd be more comfortable with a patch that didn't attempt to fold in the face of PPCDoubleDouble without a flag to control it.
In the past, we've been comfortable leaving this as an open known issue, especially given that the semantics of PPC long double are not IEEE anyway - the real fix involves creating an APFloat-based double-double implementation which can be used to evaluate constants. How important is it that we make this kind of change now?
We (me and iteratee) don't mind removing the flags. @hfinkel and @kbarton?
The problem this patch fixes is not the IEEE conformance, but the consistency. Currently libstdc++ std::numeric_limits<long double>::epsilon() returns an epsilon value that may be suitable for double-double semantic, but not suitable for the compile-time IEEE 128 semantic.
How important is it that we make this kind of change now?
By "this kind of change" you mean being able to turn off constant folding for ppc_fp128? We observe internal test failures that due to the mix of using std::numeric_limits<long double>::epsilon() and LLVM constant folding, and we'd like to make the tests green, especially when they seem use the standard library correctly. :)
Agreed. We should either do this or not.
The problem this patch fixes is not the IEEE conformance, but the consistency. Currently libstdc++ std::numeric_limits<long double>::epsilon() returns an epsilon value that may be suitable for double-double semantic, but not suitable for the compile-time IEEE 128 semantic.
Yes, I completely understand that.
How important is it that we make this kind of change now?
By "this kind of change" you mean being able to turn off constant folding for ppc_fp128? We observe internal test failures that due to the mix of using std::numeric_limits<long double>::epsilon() and LLVM constant folding, and we'd like to make the tests green, especially when they seem use the standard library correctly. :)
Yes. I'm certainly sympathetic to wanting correctness tests to pass. However, this has been a known issue in LLVM for nearly a decade, and we could have made this change at any time. We did not because we judged the benefit to be not worth the performance cost to applications that benefit from the constant folding. I'm not against doing this - I suspect few applications are depending on long-double calculations on PPC for high performance computation, but I think we need to understand the rationale in light of the history here.
Currently Google is transitioning from GCC to LLVM, and cares about PowerPC, and we care about correctness over performance gain due to ppc long double constant folding. This makes a difference from the past decade, from our perspective.
If there seems to be few applications are depending on this optimization, I wonder if we can turn off the constant folding for now, and fix APFloat later if someone really cares about it?
Just wanted to note that the front-end would still encounter the issue with APFloat, and I would be concerned if the (front-end) constant expression evaluation for ppc_fp128 is disabled entirely.
Yes, but Hubert is right. To fully make this change, you'll need to disable constexpr evaluation of long doubles in Clang also (which I suspect might make it non-conforming).
In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time to making an APFloat-based double-double implementation and use that for constant folding.
If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.
Yes, but Hubert is right. To fully make this change, you'll need to disable constexpr evaluation of long doubles in Clang also (which I suspect might make it non-conforming).
An incorrect frontend behavior isn't a good justification for keeping an incorrect backend behavior.
In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time to making an APFloat-based double-double implementation and use that for constant folding.
Correctness is its own reward.
As a plus, this patch doesn't interfere with someone else adding double double support to APFloat and using that in both the frontend and the backend.
If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.
Yes, that would be the way forward.
I think the important question here is if this patch is an improvement on the way to the best solution you've outlined here.
Tim and I both think that is.
I didn't say that it was. My point is that I think you'll end up needing the full solution regardless, because the behavior will surface in places, such as constexpr evaluation in Clang, that you can't (as easily) turn off.
In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time to making an APFloat-based double-double implementation and use that for constant folding.
Correctness is its own reward.
As a plus, this patch doesn't interfere with someone else adding double double support to APFloat and using that in both the frontend and the backend.
It is the "someone else" I have a problem with here. I'm not okay with this patch as the desired end state of the work on this. Sure, you can disable constant folding, but that doesn't change the fact that LLVM has an APFloat with a PPCDoubleDouble mode which does not match the runtime implementation, and other uses of the LLVM libraries (i.e. Clang) will still use that implementation.
If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.
Yes, that would be the way forward.
Yes, *if* they exactly match the behavior of the GCC implementations? Or do they just need to be consistent with the properties implied by numeric_limits<long double>? What are your requirements here?
I think the important question here is if this patch is an improvement on the way to the best solution you've outlined here.
Tim and I both think that is.
It leaves the overall infrastructure only more-subtly broken. We might do it to make our lives easier, but it is not an improvement. As a community, sometimes we decide to say, "this has been a known problem for a long time, and so we don't want the band-aid, we'll wait for a real solution." This has certainly been a known problem for a long time, and an important question here is: Do we need to leave this as it is to motivate work on a real solution?
Here's another possible short-term approach: Actually make the APFloat representation used for PPC long double large enough to represent ((1.0 + epsilon) - 1.0), where epsilon == 4.94066e-324. I'm under the impression that this is property that is failing for you, and it is indicative of the fact that the current 128-bit representation used cannot represent all of the numbers that the runtime long double type can represent.
As a quick experiment, I'm assuming that we don't want constant folding to change the output of this program:
$ cat ~/eps.cpp #include <iostream> #include <limits> using namespace std; int main() { long double x = 1.0; cout << (x + numeric_limits<long double>::epsilon()) - x << "\n"; }
which can be accomplished by this patch:
diff --git a/lib/Support/APFloat.cpp b/lib/Support/APFloat.cpp index f9370b8..81cef0e 100644 --- a/lib/Support/APFloat.cpp +++ b/lib/Support/APFloat.cpp @@ -76,7 +76,7 @@ namespace llvm { to represent all possible values held by a PPC double-double number, for example: (long double) 1.0 + (long double) 0x1p-106 Should this be replaced by a full emulation of PPC double-double? */ - const fltSemantics APFloat::PPCDoubleDouble = { 1023, -1022 + 53, 53 + 53, 128 }; + const fltSemantics APFloat::PPCDoubleDouble = { 1023, -1022 + 53, 20*(53 + 53), 128 }; /* A tight upper bound on number of parts required to hold the value pow(5, power) is @@ -2926,7 +2926,7 @@ APInt APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const { assert(semantics == (const llvm::fltSemantics*)&PPCDoubleDouble); - assert(partCount()==2); + // assert(partCount()==2); uint64_t words[2]; opStatus fs; @@ -2947,7 +2947,7 @@ APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const APFloat u(extended); fs = u.convert(IEEEdouble, rmNearestTiesToEven, &losesInfo); - assert(fs == opOK || fs == opInexact); + // assert(fs == opOK || fs == opInexact); (void)fs; words[0] = *u.convertDoubleAPFloatToAPInt().getRawData(); @@ -2963,7 +2963,7 @@ APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const APFloat v(extended); v.subtract(u, rmNearestTiesToEven); fs = v.convert(IEEEdouble, rmNearestTiesToEven, &losesInfo); - assert(fs == opOK && !losesInfo); + // assert(fs == opOK && !losesInfo); (void)fs; words[1] = *v.convertDoubleAPFloatToAPInt().getRawData(); } else {
and with this patch applied, even when compiled with optimizations enabled, the test program still prints "4.94066e-324" like it should. This is obviously also a work-around, but:
- It should only cause APFloat answers to differ from runtime answers by being more accurate, not less.
- It isolates the work-around to the APFloat implementation, instead of trying to spread the knowledge of the APFloat deficiencies around the rest of the infrastructure.
P.S. With this patch applied, all Clang regression tests pass, and all LLVM regression tests pass except for the ADT/ADTTests/APFloatTest.PPCDoubleDouble unit test (for a few checks that are checking the representation itself).
Somewhat sadly I think the best move forward is going to be getting ppc_fp128 support in APFloat. Moving forward with IEEE754r support in Power9 it might make sense to switch the platform to support that by default, but we're not there yet.
-eric