Create a new section for documenting the floating point options. Move all the floating point options into this section, and add new entries for the floating point options that exist but weren't previously described in the UsersManual.
Details
- Reviewers
rjmccall scanon andrew.w.kaylor kpn - Group Reviewers
Restricted Project - Commits
- rGf124ab9fe115: Recommit -r372180
rL372229: Recommit -r372180
rC372229: Recommit -r372180
rGa08d5a4b0ebd: Create UsersManual section entitled 'Controlling Floating Point Behavior'
rL372180: Create UsersManual section entitled 'Controlling Floating Point
rC372180: Create UsersManual section entitled 'Controlling Floating Point
Diff Detail
- Repository
- rL LLVM
Event Timeline
clang/docs/UsersManual.rst | ||
---|---|---|
1180 | I'm not certain this is correct description of the Default. Previously in the documentation the default ws not described. I looked at the code and in Target.h it is initialized to ieee but possibly other specializations would replace the initial value. | |
1213 | Both honor-nans and honor-infinities appear to have no effect at the moment, previously they were mapped to -menable-no-infs -menable-no-nans |
clang/docs/UsersManual.rst | ||
---|---|---|
1213 | I believe others have the same problem too. All probably need to be reassessed. E.g. I think -f[no-]trapping-math isn't hooked up to anything in LLVM. This may also get confusing when the -fp-model=/etc options are added, since they really do the same thing in different ways. Is there a plan in place for how to handle this? Will the GCC options alias the Clang options? |
Thanks! Some wording suggestions.
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | To be clear, fast math is not *equivalent* to just setting all of these separately, right? | |
1156 | This isn't how these two options are spelled below. | |
1168 | Typo | |
1172 | I know this is existing text, but what does it mean to "require" a denormal number? The values make it seem like this changes basic behavior, like a sort of rounding mode except for denormals. | |
1180 | If it's target-specific, that seems relevant to describe. | |
1188 | "According to the C standard, when a floating-point value is converted to Ideally this would then specify that behavior for various targets. | |
1193 | "`-fno-math-errno` allows optimizations that might cause standard For example, on some systems, the math function `sqrt` is specified On some targets, math library functions never set `errno`, and so | |
1203 | "Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA). Fused operations are permitted to produce more precise results than performing the same operations separately. The C standard permits intermediate floating-point results within an expression to be computed with more precision than their type would normally allow. This permits operation fusing, and Clang takes advantage of this by default. This behavior can be controlled with the FP_CONTRACT pragma. Valid values for this option are:" The second paragraph should probably describe how the pragma interacts with the option. | |
1213 | If we accept these options, we should document them, even if just to say that they currently have no effect. But I suspect that what this means is that they have no effect *separately* and we currently only enable NaN/Inf-unsafe optimizations given a more aggressive option like fast-math. | |
1229 | "Allow division operations to be transformed into multiplication by a reciprocal. This can be significantly faster than an ordinary division but can also have significantly less precision." |
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | I think in fact ffast-math is the same as selecting all the invidivdual settings, part of the logic checks if all the features are enabled and then also sets the fast-math macro. |
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | Just checked for crtfastmath.o on the link line and they do not seem to be "equivalent", unless I made a mistake. That said, I think the language "implies" is acceptable here. |
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | Even if the only difference was the macro, that would be quite important — the macros can result in substantially different behavior from math libraries, IIRC. So I would appreciate a wording that carries less of a connotation that this is equivalent to this fixed set of other options. Maybe "Some of the individual optimizations performed by fast-math can be separately enabled or disabled with the following options:"? Melanie, are you saying that __FAST_MATH__ is defined only if *all* of these options are set? I think that's important to cover as well; perhaps: "``-ffast-math`` causes the macro ``__FAST_MATH__`` to be defined. Some math libraries recognize this macro and change their behavior. Using any of the options below to disable any of the individual optimizations in ``-ffast-math`` will cause ``__FAST_MATH__`` to no longer be set." |
clang/docs/UsersManual.rst | ||
---|---|---|
1168 | Is there not an option corresponding to the "afn" fast math flag? | |
1180 | It looks like this is currently only supported for ARM targets. I also saw a CUDA-specific flag for this. X86 targets won't be able to support "positive-zero" mode and "preserve-sign" (which we typically just call "ftz") would only take effect for SSE instructions (i.e. not x87), but we would like to add "denormals-are-zero" (or "daz") which means that denormal operands will be treated as if they were zero by FP instructions. Also "ftz' and "daz" can be used together so I guess we'll need a "daz+ftz" option. All of this depends on having a backend implementation of course, but we're going to want to do this soon. If we can figure out the command line interface, I can talk to someone about getting the x86 codegen part implemented. | |
1199 | I think for clang the default is "-fno-trapping-math", isn't it? Whether or not it is currently setting internal flags that way "-fno-trapping-math" is the behavior you get from the LLVM optimizer if you don't generate constrained intrinsics. | |
1213 | We have a fast-math flag for this (ninf and nnan). The front end should be connecting these options to those flags. |
clang/docs/UsersManual.rst | ||
---|---|---|
1180 |
This is currently handled for -ffast-math/-Ofast by including crtfastmath.o on the link line. IIRC, all it does is set FTZ+DAZ. Can we piggyback off that? That's done because we want FTZ+DAZ to be toggled at start-up, so static initializers and such get the benefit. I suppose it would be possible to do in CodeGen, but sounds tricky... |
clang/docs/UsersManual.rst | ||
---|---|---|
1168 | That's probably -fno-math-errno. I don't know for sure, though. |
clang/docs/UsersManual.rst | ||
---|---|---|
1213 | Okay. Let's keep this patch about documentation, and if we find semantic problems, we can address those in follow-ups. So if these currently do nothing, we should document them that way. |
clang/docs/UsersManual.rst | ||
---|---|---|
1213 | Sounds good. |
clang/docs/UsersManual.rst | ||
---|---|---|
1180 | I found this discussion useful: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053011.html It refers to this bug: https://bugs.llvm.org//show_bug.cgi?id=14024 On some platforms FTZ and DAZ will be set if -ffast-math is used with -Ofast or -ffast-math if crtfastmath.o can be found. For the sake of the current optimization update, I suppose we need to document that fact and mention that -fdenormal-fp-math doesn't actually control it (except on ARM?). I don't think this is the behavior we want though because they're either both on or both left in the system default state. I'll see if I can get a discussion going about getting this implemented in a consistent (or at least consistently sensible) way across architectures and operating systems. |
clang/docs/UsersManual.rst | ||
---|---|---|
1180 | Good idea about the discussion.
I could see the purists wanting to toggle both separately, but I'm not sure it has a lot of practical value. E.g. even if you produced a denorm with no-FTZ, the first use of it will flush it to zero with DAZ. Unless the user is only doing one calculation and then writing to output, that is. On the other hand, if you FTZ, there would be no need to respect no-DAZ since you can't create denorms. Well, that's not entirely true if someone really cares about the precision of their input. But, yeah, those users probably won't be throwing out denorms anyway. I'm also guessing the performance gains of toggling one alone isn't that great. If we were to separate the two though, it might make sense to create two compiler-rt functions to achieve that. Assuming neither wipe out the MXCSR (or equivalent), like crtfastmath.o does. |
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | i made a correction to the text, the code doesn't check of -ffp-contract=fast, it checks if the other bits are set a certain way, then it pushes -ffast-math into the args. | |
1168 | -ffast-math does enable HasApproxFuncs but there doesn't seem to be a command line option to control it. FMF are initialized via "FMF.setFast" which sets that bit. I don't see a gcc option for this eitiher. | |
1172 | In this thread discussing the option--also referred to by Andy-- http://clang-developers.42468.n3.nabble.com/fdenormal-fp-math-td4055684.html , "I believe that flag is taken as a statement about what | |
1180 | Reviewers, can you suggest improvement to the current documentation for option `-fdenormal-fp-math=` | |
1188 | i think it would be difficult to determine the behavior of the target arch float-to-int conversion instructions for all supported targets. Maybe documenting this detaill should be part of adding a new target to clang/llvm. | |
1199 | -ffast-math sets no-trapping-math, and the default is -fno-fast-math, so the default is trapping-math. However, currently trapping-math is not functional, the option has no effect other than possibly disrupting the action of the ffast-math option. I mean, if you have ffast-math on the command line and you also have -ftrapping-math, the FAST_MATH macro would not be defined, and ffast-math would not be pushed into the Args. | |
1203 | I made the text changes you suggested, but i think that the interaction with the pragma should be described in the pragma section. currently there's already information about the option in the pragma section. I'll add a sentence referring the reader to the pragma documentation. | |
1213 | yes, if both the options are on the command line, then -ffinite-math-only is pushed into Args; I'll put this into the next upload. If used individually then these 2 options have no effect. // Handle __FINITE_MATH_ONLY__ similarly. if (!HonorINFs && !HonorNaNs) CmdArgs.push_back("-ffinite-math-only"); | |
1213 | @cameron.mcinally fyi @rjmccall asked me to create a separate floating point options section [this review], then after it's committed, create another patch where we can consider adding fp-model options |
clang/docs/UsersManual.rst | ||
---|---|---|
1152 | Thanks. Let me suggest a reorganization of the new wording; it's currently a bit misleading because it leads with the bit about __FAST_MATH__, which is of course not the primary function of the option. .. option:: -ffast-math Enable fast-math mode. This option allows the compiler to make a set of aggressive assumptions about floating-point math; the resulting code may execute faster but may also have decreased precision or even incorrect behavior for corner cases such as ```-0``. These assumptions include: * Floating-point math obeys regular algebraic rules for real numbers (e.g. ``+`` and ``*`` are associative, ``x/y == x * (1/y)``, and ``(a + b) * c == a * c + b * c``), * operands to floating-point operations are not equal to ``NaN`` and ``Inf``, and * ``+0`` and ``-0`` are interchangeable. ``-ffast-math`` also defines the ``__FAST_MATH__`` preprocessor macro. Some math libraries recognize this macro and change their behavior. With the exception of ``-ffp-contract=fast``, disabling any of the individual optimizations in ``-ffast-math`` will cause ``__FAST_MATH__`` to no longer be set. ``-ffast-math`` implies the following options: | |
1188 | I think it would good to add that documentation for the existing targets, but I don't want to hold up this patch for it, so we can leave that out. | |
1203 | Sure, a cross-reference is fine. | |
1264 | Missing period. |
@rjmccall I don't have commit rights, if you think it's ready can you submit? thanks for all
clang/docs/UsersManual.rst | ||
---|---|---|
1180 | I think it's fine to leave it as you have it for now (assuming this is copying existing documentation). We'd need to figure out what's currently happening in all cases to document that, and I don't think the current behavior is simple. It's probably best to leave cleaning this up as a product of the discussion to figure out what this should be doing. |
ready to commit? anything else needed?
clang/docs/UsersManual.rst | ||
---|---|---|
1180 | I made 1 modification, 'defaults to ieee'. The option in clang actually doesn't have a default, but the corresponding variable in llvm is initialized to ieee. |
clang/docs/UsersManual.rst | ||
---|---|---|
1134 | Not enough - |
Not enough -