- User Since
- Jun 10 2014, 9:32 AM (345 w, 2 d)
Nov 19 2020
I'm fine with this.
Nov 10 2020
Strictly speaking, fp-contract=fast probably should have been a separate flag entirely (since there's no _expression_ being contracted in fast). Unfortunately, that ship has sailed, and it does constrain our ability to choose an accurate name somewhat.
Nov 5 2020
I do not much like faststd, as there's nothing "standard" about it. I do not, however, have a better suggestion off the top of my head. Let's pause and consider the name a little bit longer, please?
Nov 3 2020
(If you tell GCC to respect the pragma via -std=c17 or similar, then -ffp-contract=fast overrides it just like clang's current behavior: https://godbolt.org/z/5dxxGb)
GCC doesn't respect the pragma, so "what other compilers do" is not a particularly useful metric.
Oct 19 2020
I guess the counterargument here would be that .x does not produce an extvector(1), and there is at least a plausible argument that .x should be the same as .lo for a two-element vector. I'm not really convinced by this, but it's not totally outrageous.
I'm fairly certain that this will cause some breaks internally at Apple, but I'm also pretty sure that it's a step in the right direction, and we should just sign up to fix any issues it causes.
Sep 17 2020
If we can do it without complication, it would be best to preserve signaling-ness, because that's the more faithful interpretation of IEEE 754 (even though it _doesn't_ match what the HW does, because the HW can signal and APFloat can't). A general principle (imperfectly adhered to) of IEEE 754 is that conversions on signaling NaNs should _either_ signal and produce a quiet NaN (if possible), or should produce a signaling NaN if no signal is possible.
Aug 31 2020
May 12 2020
Prior to this change, contract was never generated in the case of in-statement contraction only; instead, clang emitted llvm.fmuladd to inform the backend that only those operations were eligible for contraction. From a correctness perspective I think this was perfectly fine.
Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such a mechanism is in place, I think that generating the contract fast-math flag for in-statement contraction as well is wrong, because it breaks the original program semantics.
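To make the semantic concern concrete, here is a minimal sketch of why contraction is observable at all: a fused multiply-add rounds once, while a separate multiply and add round twice. The helper names are hypothetical, and the volatile store is only an assumption-laden way to stop the compiler from contracting the "unfused" version on its own.

```cpp
#include <cassert>
#include <cmath>

// Unfused: the product is rounded to double before the add.
// The volatile store is only here to keep the compiler from contracting it.
inline double unfusedMulAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused: one rounding of the exact a*b + c, which is what contraction yields.
inline double fusedMulAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact product is 1 - 2^-60, which rounds to 1.0 as a double. The unfused form therefore returns 0, while the fused form returns -2^-60. Whether a given multiply and add may be fused is exactly what the in-statement restriction controls.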
May 6 2020
TS18661-5 is quite vague on what the intended semantics for the pragma are.
May 5 2020
(Please get one additional sign off before committing; I'm mainly signing off on the numerics model aspect).
My concerns have been addressed. Thanks for bearing with me, Melanie!
I don't think the C standard is likely to ever bless reassociative FP math with an expression-local restriction. Steve, do you actually think that would be a useful optimization mode?
May 4 2020
Apr 27 2020
Apr 23 2020
Mar 10 2020
Mar 9 2020
Feb 7 2020
Dec 4 2019
Nov 20 2019
Sep 25 2019
Backing up what everyone says here: logb doesn't define the sign of NaN results, and 754 explicitly says not to interpret the sign of NaN as having any meaning except in the copySign, absoluteValue, negate, and copy operations. (That's a semantically meaningless statement, since those operations do not exist in a vacuum, which means that you can't actually say anything about the sign of NaN from a formal perspective, but, well, it's the standard we have.)
Sep 5 2019
"pow(x, ±0) returns 1 for any x, even a NaN"
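A small check of the quoted rule, assuming the platform's math library follows the Annex F behaviour for pow (the helper name is made up for illustration):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when pow(x, +0) and pow(x, -0) both return exactly 1 for this x.
inline bool powOfZeroExponentIsOne(double x) {
  return std::pow(x, 0.0) == 1.0 && std::pow(x, -0.0) == 1.0;
}
```

On a conforming library this holds for a quiet NaN as well as for zeros, infinities, and ordinary finite values.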
Sep 4 2019
I believe that the code can still be simplified somewhat, but that it's correct as-is for float, double, and long double. I'll take an AI to follow up on future improvements, and let's get this in.
Aug 28 2019
Eric showed me this link https://godbolt.org/z/AjBHYqv
I would tend to write this function in the following form:
// set up lower bound and upper bound
if (r > upperBound) r = upperBound;
if (!(r >= lowerBound)) r = lowerBound; // NaN is mapped to l.b.
return static_cast<IntType>(r);
I prefer to avoid the explicit trunc call, since that's the defined behavior of the static_cast once the value is in-range, anyway.
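Filled in for one concrete pair of types, the shape above might look like the following sketch. The function name is hypothetical, and double to int32_t is chosen because both bounds are exactly representable as doubles; for a 64-bit integer target the upper bound is not exactly representable in double, so the bounds would need more care.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Saturating double -> int32_t conversion; NaN maps to the lower bound.
inline std::int32_t saturatingToInt32(double r) {
  // Both bounds are exactly representable in double for a 32-bit target.
  const double lowerBound =
      static_cast<double>(std::numeric_limits<std::int32_t>::min());
  const double upperBound =
      static_cast<double>(std::numeric_limits<std::int32_t>::max());
  if (r > upperBound) r = upperBound;
  if (!(r >= lowerBound)) r = lowerBound; // also catches NaN
  // In range now, so the cast is defined and truncates toward zero.
  return static_cast<std::int32_t>(r);
}
```

The negated comparison is what routes NaN to the lower bound: every ordered comparison with NaN is false, so `!(r >= lowerBound)` is true.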
Aug 8 2019
It's not that NaN is rare in normal programs, or that it indicates a bug in the code. It's that testing for NaN is usually an indication that you're testing for an exceptional case, and it makes sense to move those off the hot path (i.e. NaN is actually pretty common, but the likelihood of handling it on the normal-value path through code is small).
Aug 6 2019
(Ideally we would just call them e.g. __builtin_floor, but that would be source-breaking. __builtin_tgmath_floor seems like a good compromise.)
Strongly agree with what @rjmccall said. If we can make these generic builtins instead of ending up with O(100) variants of each math operation, that would make life immensely nicer.
Jul 26 2019
Jul 24 2019
Jul 22 2019
Reviewers: what do we need to get this across the finish line?
Jul 16 2019
LGTM. Please get at least one additional reviewer's approval before merging, as this is a corner of clang that I don't work on often.
Jul 15 2019
May 30 2019
Mar 11 2019
Ah, now I see what you're talking about. And in fact, because of the way divide works out, there's a little gap of results that aren't even possible to achieve just below each binade boundary, so the code you have here will work out fine. We *should* add a comment to clarify this somewhat, but I'm happy to do that in a separate commit. LGTM.
These results *are* tiny in the before rounding sense.
In the parlance of IEEE 754, there are two ways to "detect tininess": "before rounding" and "after rounding". The standard doesn't define how to flush subnormal results, but in practice most HW flushes results that are "tiny". The existing code flushes as though tininess is detected before rounding. This proposed update flushes as though tininess were detected after rounding.
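The two detection rules can be sketched for a float result as follows. This is illustrative only and not the code under review: `exact` is a double standing in for the infinitely precise result, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Tiny before rounding: the exact result is nonzero and below the
// smallest normal float in magnitude.
inline bool tinyBeforeRounding(double exact) {
  return exact != 0.0 && std::fabs(exact) < static_cast<double>(FLT_MIN);
}

// Tiny after rounding: round to float precision as if the exponent range
// were unbounded, then compare. Scaling by 2^64 keeps the value in the
// normal float range, so the conversion rounds at full 24-bit precision.
inline bool tinyAfterRounding(double exact) {
  const float rounded = static_cast<float>(std::ldexp(exact, 64));
  return exact != 0.0 && std::fabs(rounded) < std::ldexp(FLT_MIN, 64);
}
```

The rules disagree only in a narrow band just below the smallest normal number. For example, 2^-126 - 2^-152 is tiny before rounding but rounds up to 2^-126, so it is not tiny after rounding.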
Jan 25 2019
do we want to support _Float16 anywhere else?
ARM is the only in-tree target with a defined ABI that I'm aware of.
Jan 11 2019
Dec 17 2018
Can you also add a check for .double infinity? It looks like that's likely missing too.
Dec 7 2018
Oct 9 2018
Are these documented anywhere? I haven't seen documentation in any of the patches so far. What do they return for NaN inputs?
Oct 3 2018
Sep 24 2018
I suspect that we'd rather use ilogb in the long term, but as a like-for-like replacement this looks OK.
Jul 19 2018
Jul 18 2018
May 22 2018
IIRC the optimization of divide-by-power-of-two --> multiply-by-inverse does not occur at -O0, so it would be better to multiply by 2^(-fbits) instead.
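As a sketch of the suggestion, assuming `fbits` is the number of fractional bits in a fixed-point value (the function name and the int32_t input type are made up for illustration):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a fixed-point value with `fbits` fractional bits to double by
// multiplying by 2^(-fbits) instead of dividing by 2^fbits.
inline double fixedToDouble(std::int32_t raw, int fbits) {
  // 2^(-fbits) is exactly representable, so the multiply gives the same
  // result as the divide without relying on the optimizer to rewrite it.
  const double scale = std::ldexp(1.0, -fbits);
  return static_cast<double>(raw) * scale;
}
```

Because the scale factor is an exact power of two, the multiply and the divide agree bit for bit; the only difference is that the cheaper form no longer depends on the optimization level.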
May 10 2018
May 9 2018
One more question: the caller is responsible for closing the file when they're done, right?
Two quick notes as someone who has never used these functions:
- Name is really the *path* of the file to open, right?
- What happens to ResultFD if the function fails?
May 8 2018
May 4 2018
Apr 26 2018
I like Chandler's wording. Something like:
Apr 25 2018
@gottesmm can you take a look at this? You're more familiar with the APFloat API than I am.
Apr 20 2018
Tangential question: Do we have an intrinsic floating -> integer conversion with defined semantics for out-of-range values?
Apr 6 2018
I also wonder whether requiring fast-math to allow tree reductions is overkill. Tree reductions can be implemented reasonably efficiently on many architectures, while linearly ordered reductions appear to me to be more of a niche.
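For context on why the two orders are treated differently at all, a minimal sketch (helper names are hypothetical) showing that a linearly ordered sum and a pairwise tree sum can give different results on the same input:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly ordered reduction: ((((0 + v0) + v1) + v2) + ...).
inline double linearSum(const std::vector<double>& v) {
  double acc = 0.0;
  for (double x : v) acc += x;
  return acc;
}

// Tree reduction over [lo, hi): split in half, sum each half, then combine.
inline double treeSum(const std::vector<double>& v, std::size_t lo,
                      std::size_t hi) {
  if (hi == lo) return 0.0;
  if (hi - lo == 1) return v[lo];
  const std::size_t mid = lo + (hi - lo) / 2;
  return treeSum(v, lo, mid) + treeSum(v, mid, hi);
}
```

For the input {1e16, 0.5, -1e16, 0.5}, the linear order absorbs the first 0.5 into 1e16, cancels, and then adds the second 0.5, giving 0.5. The tree order absorbs both 0.5 terms before cancelling and gives 0. Neither is more "correct"; they are simply different association orders, which is why a tree reduction needs some form of permission to reassociate.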
Mar 26 2018
Two questions, to which I do not know the answer:
This is the class of optimizations that I would call "formally allowed by the standard, but extremely likely to break things and surprise people." Which isn't to say that we shouldn't do it, just ... be prepared.
Mar 20 2018
I'm OK with this.
Mar 16 2018
Mar 5 2018
IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.
Mar 2 2018
Works for me.
Mar 1 2018
We should be able to go farther and just do a fptrunc if SrcSize > DstSize.
Feb 7 2018
Feb 6 2018
Underflow or overflow doesn't change sign, so 0 < C < inf && X >= 0 --> C/X >= 0.
Jan 25 2018
(After checking the architecture manual)
My recollection is that on 386 and later unnormals with a zero significand are treated as an invalid operand; should these print as 0 or as NaN?
Dec 4 2017
Yup, there's likely something more general that we could match, but it's also worth taking this as is.
Nov 27 2017
IEEE 754 rules are that everything canonicalizes except bitwise operations (copy, abs, negate, copysign) and decimal re-encoding operations (which you don't care about).
Oct 30 2017
mod is not bound to the IEEE 754 remainder operation. It binds the C fmod operation. You're looking for the remainder operation.
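The difference between the two operations is easy to see with the C library functions; the wrappers below are thin, hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cmath>

// C fmod: the quotient is truncated toward zero, so the result takes the
// sign of x.
inline double cFmod(double x, double y) { return std::fmod(x, y); }

// IEEE 754 remainder: the quotient is rounded to nearest (ties to even),
// so the result can have either sign.
inline double ieeeRemainder(double x, double y) {
  return std::remainder(x, y);
}
```

For x = 5 and y = 3, fmod uses the quotient 1 and returns 2, while remainder uses the quotient 2 and returns -1.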
Sep 13 2017
Sep 12 2017
Sep 6 2017
LGTM. Sorry that I didn't see this earlier.
Aug 21 2017
... except, please add another test-case where the other component is not an integer as well.