User Details
- User Since
- Jun 10 2014, 9:32 AM (486 w, 13 h)
Oct 5 2022
Oct 4 2022
Sep 7 2022
Wearing my compiler user hat, I would much rather use additive -mfeature than have to specify these as -march+feature, even when using a build system that nominally handles this stuff, because I frequently want to be able to compile one specific file with "whatever the prevailing options are, but also enable this one feature." Most build systems make this possible somehow (track down the arch variable, append +feature to it, etc), but it's considerably simpler if you can just append -mfeature to the list of flags and call it a day.
Aug 11 2022
In any event, from the standpoint of the C(23) language, these operations do not set inexact, so I believe that it is appropriate to optimize them as if they do not set inexact.
Looking at implementations of these functions, it looks like GNU libm doesn't raise inexact, but the bionic libm does. I think I'm leaning towards marking all of them as "fng" as it's the more cautious of the two.
Hmm, bionic's behavior sounds a bit surprising. In the committed version, I kept them fnc to keep the previous behavior for now. I'll double-check with @scanon and potentially upload a follow-up patch.
Aug 8 2022
The CFP working group and C23 have since clarified this in Annex F:
Aug 5 2022
Feb 25 2022
There's a lot of churn around proposed "solutions" on this and related PRs, but not a very clear analysis of what problem we're trying to solve.
Jan 17 2022
Oct 19 2021
Two minor questions, but also LGTM as is.
What's the rationale for making abs undefined on the minimum value? AFAIK every actual simd implementation defines the result and they agree (and even if one didn't, it would be pretty easy to get the "right" result). Introducing UB here just seems like punishing users for no reason.
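To make the "right" result concrete, here is a minimal scalar sketch of the wrapping behavior that SIMD absolute-value instructions are described as agreeing on, where the minimum value maps to itself. The helper name `wrapping_abs` is hypothetical and is not taken from the patch under review.

```cpp
#include <cstdint>

// Hypothetical helper: absolute value with wrapping semantics, so that
// wrapping_abs(INT32_MIN) == INT32_MIN instead of being undefined.
inline std::int32_t wrapping_abs(std::int32_t x) {
  // Negate in unsigned arithmetic, where overflow is well defined (mod 2^32).
  const std::uint32_t u = static_cast<std::uint32_t>(x);
  const std::uint32_t r = (x < 0) ? (0u - u) : u;
  // Conversion back to signed is modular: guaranteed since C++20, and
  // implementation-defined (but modular on mainstream compilers) before that.
  return static_cast<std::int32_t>(r);
}
```

For every input other than the minimum value this matches the ordinary absolute value, so defining the edge case costs nothing on the common path.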
Oct 13 2021
I'm happy with this now.
Oct 11 2021
May 14 2021
Mar 10 2021
Is there a mechanism to instruct the sanitizer to ignore a specific expression or function? From a cursory reading, I am mildly concerned about a deluge of false positives from primitives that compute exact (or approximate) residuals; these are acting to eliminate or precisely control floating-point errors, but tend to show up as "unstable" in a naive analysis that isn't aware of them.
Nov 19 2020
I'm fine with this.
Nov 10 2020
Strictly speaking, fp-contract=fast probably should have been a separate flag entirely (since there's no _expression_ being contracted in fast). Unfortunately, that ship has sailed, and it does constrain our ability to choose an accurate name somewhat.
Nov 5 2020
I do not much like faststd, as there's nothing "standard" about it. I do not, however, have a better suggestion off the top of my head. Let's pause and consider the name a little bit longer, please?
Nov 3 2020
(If you tell GCC to respect the pragma via -std=c17 or similar, then -ffp-contract=fast overrides it just like clang's current behavior: https://godbolt.org/z/5dxxGb)
GCC doesn't respect the pragma, so "what other compilers do" is not a particularly useful metric.
Oct 19 2020
I guess the counterargument here would be that .x does not produce an extvector(1), and there is at least a plausible argument that .x should be the same as .lo for a two-element vector. I'm not really convinced by this, but it's not totally outrageous.
I'm fairly certain that this will cause some breaks internally at Apple, but I'm also pretty sure that it's a step in the right direction, and we should just sign up to fix any issues it causes.
Sep 17 2020
If we can do it without complication, it would be best to preserve signaling-ness, because that's the more faithful interpretation of IEEE 754 (even though it _doesn't_ match what the HW does, because the HW can signal and APFloat can't). A general principle (imperfectly adhered to) of IEEE 754 is that conversions on signaling NaNs should _either_ signal and produce a quiet NaN (if possible), or should produce a signaling NaN if no signal is possible.
Aug 31 2020
May 12 2020
Prior to this change, contract was never generated in the case of in-statement contraction only; instead, clang was emitting llvm.fmuladd to inform the backend that only those were eligible for contraction. From a correctness perspective I think this was perfectly fine.
Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such a mechanism is in place, I think that generating the contract fast-math flag also for in-statement contraction is wrong because it breaks the original program semantics.
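As an illustration of why the scope of contraction matters for program semantics, the following sketch contrasts a multiply-then-add with two roundings against a fused multiply-add with one rounding. The function names are hypothetical, and the `volatile` is only there to keep the compiler from contracting the first version itself.

```cpp
#include <cmath>

// Two roundings: the product is rounded to double, then the sum is rounded.
inline double mul_then_add(double a, double b, double c) {
  volatile double p = a * b;  // volatile forces the rounded product to be materialized
  return p + c;
}

// One rounding: std::fma rounds the exact value of a*b + c once.
inline double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27, and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 in double. The unfused version therefore returns 0, while the fused version returns -2^-54. Whether a statement boundary blocks contraction decides which of these two values a program observes.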
May 6 2020
TS18661-5 is quite vague on what the intended semantics for the pragma are.
May 5 2020
(Please get one additional sign off before committing; I'm mainly signing off on the numerics model aspect).
My concerns have been addressed. Thanks for bearing with me, Melanie!
I don't think the C standard is likely to ever bless reassociative FP math with an expression-local restriction. Steve, do you actually think that would be a useful optimization mode?
May 4 2020
Apr 27 2020
Apr 23 2020
Mar 10 2020
Mar 9 2020
Feb 7 2020
Dec 4 2019
Nov 20 2019
Sep 25 2019
Backing up what everyone says here: logb doesn't define the sign of NaN results, and 754 explicitly says not to interpret the sign of NaN as having any meaning except in the copySign, absoluteValue, negate, and copy operations. (That's a semantically meaningless statement, since those operations do not exist in a vacuum, which means that you can't actually say anything about the sign of NaN from a formal perspective, but, well, it's the standard we have.)
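A small sketch of the operations named above, which are the ones where the sign bit of a NaN is specified. It assumes default floating-point settings (no fast-math), and the helper name `negative_quiet_nan` is made up for illustration.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: build a quiet NaN whose sign bit is set, using
// copysign (one of the operations for which 754 specifies the NaN sign).
inline double negative_quiet_nan() {
  return std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);
}
```

`std::signbit` reports the bit that `std::copysign` set, `std::fabs` clears it, and unary minus flips it. The sign of a NaN returned by an operation such as logb should not be relied on.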
Sep 5 2019
"pow(x, ±0) returns 1 for any x, even a NaN"
Sep 4 2019
I believe that the code can still be simplified somewhat, but that it's correct as-is for float, double, and long double. I'll take an AI to follow up on future improvements, and let's get this in.
Aug 28 2019
Eric showed me this link https://godbolt.org/z/AjBHYqv
I would tend to write this function in the following form:
    // set up lower bound and upper bound
    if (r > upperBound) r = upperBound;
    if (!(r >= lowerBound)) r = lowerBound; // NaN is mapped to l.b.
    return static_cast<IntType>(r);
I prefer to avoid the explicit trunc call, since that's the defined behavior of the static_cast once the value is in-range, anyway.
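A self-contained version of that shape, specialized to double to int32_t as an assumption so that both bounds are exactly representable in the source type. The function name `saturating_to_int32` is hypothetical. For a float source or a 64-bit destination, the upper bound is not exactly representable and would need to be chosen more carefully.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: saturating double -> int32_t conversion.
inline std::int32_t saturating_to_int32(double r) {
  // Both bounds are exactly representable in double.
  const double lowerBound =
      static_cast<double>(std::numeric_limits<std::int32_t>::min());
  const double upperBound =
      static_cast<double>(std::numeric_limits<std::int32_t>::max());
  if (r > upperBound) r = upperBound;
  // NaN fails every ordered comparison, so the negated test maps it
  // to the lower bound.
  if (!(r >= lowerBound)) r = lowerBound;
  // r is now in range, so the cast is defined and truncates toward zero.
  return static_cast<std::int32_t>(r);
}
```

Writing the second test as `!(r >= lowerBound)` rather than `r < lowerBound` is what handles NaN without a separate branch.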
Aug 8 2019
It's not that NaN is rare in normal programs, or that it indicates a bug in the code. It's that testing for NaN is usually an indication that you're testing for an exceptional case, and it makes sense to move those off the hot path (i.e. NaN is actually pretty common, but the likelihood of handling it on the normal-value path through code is small).
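For code where the author wants to state this expectation explicitly rather than rely on a compiler heuristic, a sketch using the GCC/Clang builtin `__builtin_expect` might look like the following. The function `scaled_or_fallback` is a made-up example, not something from the review.

```cpp
#include <cmath>

// Hypothetical example: the NaN check guards an exceptional case, so it is
// hinted as unlikely and the normal-value computation stays on the hot path.
inline double scaled_or_fallback(double x, double fallback) {
  if (__builtin_expect(std::isnan(x), 0)) {
    return fallback;  // exceptional path
  }
  return 2.0 * x;     // normal-value path
}
```

The hint only affects code layout and branch weighting; the results are the same with or without it.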
Aug 6 2019
(Ideally we would just call them e.g. __builtin_floor, but that would be source-breaking. __builtin_tgmath_floor seems like a good compromise.)
Strongly agree with what @rjmccall said. If we can make these generic builtins instead of ending up with O(100) variants of each math operation, that would make life immensely nicer.
Jul 26 2019
Jul 24 2019
LGTM
Jul 22 2019
Reviewers: what do we need to get this across the finish line?
Jul 16 2019
LGTM. Please get at least one additional reviewer's approval before merging, as this is a corner of clang that I don't work on often.
Jul 15 2019
May 30 2019
Mar 11 2019
Ah, now I see what you're talking about. And in fact, because of the way divide works out, there's a little gap of results that are even possible to achieve just below each binade boundary, so the code you have here will work out fine. We *should* add a comment to clarify this somewhat, but I'm happy to do that in a separate commit. LGTM.
These results *are* tiny in the before rounding sense.
In the parlance of IEEE 754, there are two ways to "detect tininess": "before rounding" and "after rounding". The standard doesn't define how to flush subnormal results, but in practice most HW flushes results that are "tiny". The existing code flushes as though tininess is detected before rounding. This proposed update flushes as though tininess were detected after rounding.
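The two definitions differ only for results just below the smallest normal number. The sketch below checks both for a single-precision result, assuming the infinitely precise value fits in a double and that the default round-to-nearest-even mode is in effect; the function names are hypothetical.

```cpp
#include <cfloat>
#include <cmath>

// Tiny before rounding: the exact nonzero result is below the smallest
// normal float in magnitude.
inline bool tiny_before_rounding(double exact) {
  return exact != 0.0 && std::fabs(exact) < FLT_MIN;
}

// Tiny after rounding: round the exact result to 24 significant bits as if
// the exponent range were unbounded, then compare with the smallest normal.
inline bool tiny_after_rounding(double exact) {
  if (exact == 0.0) return false;
  int e = 0;
  const double m = std::frexp(exact, &e);                // exact == m * 2^e, 0.5 <= |m| < 1
  const double sig = std::nearbyint(std::ldexp(m, 24));  // 24-bit significand, current rounding mode
  const double rounded = std::ldexp(sig, e - 24);        // double's range stands in for "unbounded"
  return std::fabs(rounded) < FLT_MIN;
}
```

For example, the exact value FLT_MIN * (1 - 2^-25) is tiny before rounding, but it rounds up to FLT_MIN itself and so is not tiny after rounding. A flush-to-zero implementation would flush it under the first definition and keep it under the second.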
Jan 25 2019
do we want to support _Float16 anywhere else?
ARM is the only in-tree target with a defined ABI that I'm aware of.
Jan 11 2019
Dec 17 2018
Can you also add a check for .double infinity? It looks like that's likely missing too.
Dec 7 2018
Oct 9 2018
Are these documented anywhere? I haven't seen it in any of the patches so far. What do they return for NaN inputs?
Oct 3 2018
Sep 24 2018
I suspect that we'd rather use ilogb in the long term, but as a like-for-like replacement this looks OK.
Jul 19 2018
Jul 18 2018
May 22 2018
IIRC the optimization of divide-by-power-of-two --> multiply-by-inverse does not occur at -O0, so it would be better to multiply by 2^(-fbits) instead.
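A minimal sketch of the suggestion, assuming fbits is the number of fractional bits in a fixed-point value and is known at compile time. The template `fixed_to_double` is hypothetical and not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical fixed-point -> double conversion with FBits fractional bits.
template <int FBits>
inline double fixed_to_double(std::int32_t raw) {
  static_assert(FBits >= 0 && FBits < 63, "FBits must fit in a 64-bit shift");
  // 2^-FBits, computed at compile time. Scaling by a power of two is exact
  // here, so multiplying by it matches dividing by 2^FBits without relying
  // on the optimizer to rewrite the divide.
  constexpr double scale =
      1.0 / static_cast<double>(std::uint64_t{1} << FBits);
  return static_cast<double>(raw) * scale;
}
```

Because the scale factor is a constant, the generated code contains a multiply even at -O0.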
May 10 2018
LGTM, thanks!
May 9 2018
One more question: the caller is responsible for closing the file when they're done, right?
Two quick notes as someone who has never used these functions:
- Name is really the *path* of the file to open, right?
- What happens to ResultFD if the function fails?
May 8 2018
LGTM
May 4 2018
Apr 26 2018
I like Chandler's wording. Something like:
Apr 25 2018
@gottesmm can you take a look at this? You're more familiar with the APFloat API than I am.
Apr 20 2018
Tangential question: Do we have an intrinsic floating -> integer conversion with defined semantics for out-of-range values?
Apr 6 2018
I also wonder whether requiring fast-math to allow tree reductions is overkill. Tree reductions can be implemented reasonably efficiently on many architectures, while linearly ordered reductions appear to me to be more of a niche.
Mar 26 2018
Two questions, to which I do not know the answer:
This is the class of optimizations that I would call "formally allowed by the standard, but extremely likely to break things and surprise people." Which isn't to say that we shouldn't do it, just ... be prepared.
Mar 20 2018
I'm OK with this.
Mar 16 2018
Mar 5 2018
IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.