Page MenuHomePhabricator

Create UsersManual section entitled 'Controlling Floating Point Behavior'
ClosedPublic

Authored by mibintc on Sep 12 2019, 1:24 PM.

Details

Summary

Create a new section for documenting the floating point options. Move all the floating point options into this section, and add new entries for the floating point options that exist but weren't previously described in the UsersManual.

Diff Detail

Repository
rL LLVM

Event Timeline

mibintc created this revision.Sep 12 2019, 1:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2019, 1:24 PM
mibintc retitled this revision from Create UsersManual section entitled 'Controlling Floating Poit Behavior' to Create UsersManual section entitled 'Controlling Floating Point Behavior'.Sep 12 2019, 1:24 PM
mibintc added inline comments.Sep 12 2019, 1:33 PM
clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

I'm not certain this is correct description of the Default. Previously in the documentation the default ws not described. I looked at the code and in Target.h it is initialized to ieee but possibly other specializations would replace the initial value.

1213 ↗(On Diff #219989)

Both honor-nans and honor-infinities appear to have no effect at the moment, previously they were mapped to -menable-no-infs -menable-no-nans

cameron.mcinally added inline comments.
clang/docs/UsersManual.rst
1213 ↗(On Diff #219989)

I believe others have the same problem too. All probably need to be reassessed. E.g. I think -f[no-]trapping-math isn't hooked up to anything in LLVM.

This may also get confusing when the -fp-model=/etc options are added, since they really do the same thing in different ways.

Is there a plan in place for how to handle this? Will the GCC options alias the Clang options?

Thanks! Some wording suggestions.

clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

To be clear, fast math is not *equivalent* to just setting all of these separately, right?

1156 ↗(On Diff #219989)

This isn't how these two options are spelled below.

1168 ↗(On Diff #219989)

Typo

1172 ↗(On Diff #219989)

I know this is existing text, but what does it mean to "require" a denormal number? The values make it seem like this changes basic behavior, like a sort of rounding mode except for denormals.

1180 ↗(On Diff #219989)

If it's target-specific, that seems relevant to describe.

1188 ↗(On Diff #219989)

"According to the C standard, when a floating-point value is converted to
integer type, the fractional component of the value is discarded, and if the
resulting integer is not expressible in the destination type, the conversion
has undefined behavior. By default, Clang follows this rule strictly and does
not guarantee any particular result. In non-strict mode, this behavior
becomes well-defined, and Clang uses the standard behavior for the target:
typically, the behavior of the target architecture's float-to-int conversion
instructions."

Ideally this would then specify that behavior for various targets.

1193 ↗(On Diff #219989)

"`-fno-math-errno` allows optimizations that might cause standard
C math functions to not set `errno`.

For example, on some systems, the math function `sqrt` is specified
as setting `errno to EDOM` when the input is negative. On these
systems, the compiler cannot normally optimize a call to `sqrt` to use
inline code (e.g. the x86 `sqrtsd` instruction) without additional checking
to ensure that `errno is set appropriately. -fno-math-errno` permits
these transformations.

On some targets, math library functions never set `errno`, and so
`-fno-math-errno` is the default. This includes most BSD-derived
systems, including Darwin."

1203 ↗(On Diff #219989)

"Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA). Fused operations are permitted to produce more precise results than performing the same operations separately.

The C standard permits intermediate floating-point results within an expression to be computed with more precision than their type would normally allow. This permits operation fusing, and Clang takes advantage of this by default. This behavior can be controlled with the FP_CONTRACT pragma.

Valid values for this option are:"

The second paragraph should probably describe how the pragma interacts with the option.

1213 ↗(On Diff #219989)

If we accept these options, we should document them, even if just to say that they currently have no effect. But I suspect that what this means is that they have no effect *separately* and we currently only enable NaN/Inf-unsafe optimizations given a more aggressive option like fast-math.

1229 ↗(On Diff #219989)

"Allow division operations to be transformed into multiplication by a reciprocal. This can be significantly faster than an ordinary division but can also have significantly less precision."

mibintc marked an inline comment as done.Sep 13 2019, 6:30 AM
mibintc added inline comments.
clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

I think in fact ffast-math is the same as selecting all the invidivdual settings, part of the logic checks if all the features are enabled and then also sets the fast-math macro.

cameron.mcinally marked an inline comment as done.Sep 13 2019, 7:38 AM
cameron.mcinally added inline comments.
clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

Just checked for crtfastmath.o on the link line and they do not seem to be "equivalent", unless I made a mistake.

That said, I think the language "implies" is acceptable here.

rjmccall added inline comments.Sep 13 2019, 12:47 PM
clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

Even if the only difference was the macro, that would be quite important — the macros can result in substantially different behavior from math libraries, IIRC. So I would appreciate a wording that carries less of a connotation that this is equivalent to this fixed set of other options. Maybe "Some of the individual optimizations performed by fast-math can be separately enabled or disabled with the following options:"?

Melanie, are you saying that __FAST_MATH__ is defined only if *all* of these options are set? I think that's important to cover as well; perhaps:

"``-ffast-math`` causes the macro ``__FAST_MATH__`` to be defined.  Some math libraries recognize this macro and change their behavior.  Using any of the options below to disable any of the individual optimizations in ``-ffast-math`` will cause ``__FAST_MATH__`` to no longer be set."
clang/docs/UsersManual.rst
1168 ↗(On Diff #219989)

Is there not an option corresponding to the "afn" fast math flag?

1180 ↗(On Diff #219989)

It looks like this is currently only supported for ARM targets. I also saw a CUDA-specific flag for this.

X86 targets won't be able to support "positive-zero" mode and "preserve-sign" (which we typically just call "ftz") would only take effect for SSE instructions (i.e. not x87), but we would like to add "denormals-are-zero" (or "daz") which means that denormal operands will be treated as if they were zero by FP instructions. Also "ftz' and "daz" can be used together so I guess we'll need a "daz+ftz" option. All of this depends on having a backend implementation of course, but we're going to want to do this soon. If we can figure out the command line interface, I can talk to someone about getting the x86 codegen part implemented.

1199 ↗(On Diff #219989)

I think for clang the default is "-fno-trapping-math", isn't it? Whether or not it is currently setting internal flags that way "-fno-trapping-math" is the behavior you get from the LLVM optimizer if you don't generate constrained intrinsics.

1213 ↗(On Diff #219989)

We have a fast-math flag for this (ninf and nnan). The front end should be connecting these options to those flags.

clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

All of this depends on having a backend implementation of course, but we're going to want to do this soon. If we can figure out the command line interface, I can talk to someone about getting the x86 codegen part implemented.

This is currently handled for -ffast-math/-Ofast by including crtfastmath.o on the link line. IIRC, all it does is set FTZ+DAZ. Can we piggyback off that?

That's done because we want FTZ+DAZ to be toggled at start-up, so static initializers and such get the benefit. I suppose it would be possible to do in CodeGen, but sounds tricky...

clang/docs/UsersManual.rst
1168 ↗(On Diff #219989)

That's probably -fno-math-errno. I don't know for sure, though.

rjmccall added inline comments.Sep 13 2019, 2:11 PM
clang/docs/UsersManual.rst
1213 ↗(On Diff #219989)

Okay. Let's keep this patch about documentation, and if we find semantic problems, we can address those in follow-ups. So if these currently do nothing, we should document them that way.

clang/docs/UsersManual.rst
1213 ↗(On Diff #219989)

Sounds good.

clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

I found this discussion useful: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053011.html

It refers to this bug: https://bugs.llvm.org//show_bug.cgi?id=14024

On some platforms FTZ and DAZ will be set if -ffast-math is used with -Ofast or -ffast-math if crtfastmath.o can be found. For the sake of the current optimization update, I suppose we need to document that fact and mention that -fdenormal-fp-math doesn't actually control it (except on ARM?).

I don't think this is the behavior we want though because they're either both on or both left in the system default state. I'll see if I can get a discussion going about getting this implemented in a consistent (or at least consistently sensible) way across architectures and operating systems.

cameron.mcinally marked an inline comment as done.Sep 13 2019, 4:29 PM
cameron.mcinally added inline comments.
clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

Good idea about the discussion.

I don't think this is the behavior we want though because they're either both on or both left in the system default state.

I could see the purists wanting to toggle both separately, but I'm not sure it has a lot of practical value. E.g. even if you produced a denorm with no-FTZ, the first use of it will flush it to zero with DAZ. Unless the user is only doing one calculation and then writing to output, that is.

On the other hand, if you FTZ, there would be no need to respect no-DAZ since you can't create denorms. Well, that's not entirely true if someone really cares about the precision of their input. But, yeah, those users probably won't be throwing out denorms anyway.

I'm also guessing the performance gains of toggling one alone isn't that great.

If we were to separate the two though, it might make sense to create two compiler-rt functions to achieve that. Assuming neither wipe out the MXCSR (or equivalent), like crtfastmath.o does.

mibintc marked 12 inline comments as done.Mon, Sep 16, 10:40 AM
mibintc added inline comments.
clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

i made a correction to the text, the code doesn't check of -ffp-contract=fast, it checks if the other bits are set a certain way, then it pushes -ffast-math into the args.

1168 ↗(On Diff #219989)

-ffast-math does enable HasApproxFuncs but there doesn't seem to be a command line option to control it. FMF are initialized via "FMF.setFast" which sets that bit. I don't see a gcc option for this eitiher.

1172 ↗(On Diff #219989)

In this thread discussing the option--also referred to by Andy-- http://clang-developers.42468.n3.nabble.com/fdenormal-fp-math-td4055684.html , "I believe that flag is taken as a statement about what
the compiler can assume about the FP environment. It is not something we
attempt to ensure at startup."

1180 ↗(On Diff #219989)

Reviewers, can you suggest improvement to the current documentation for option `-fdenormal-fp-math=`

1188 ↗(On Diff #219989)

i think it would be difficult to determine the behavior of the target arch float-to-int conversion instructions for all supported targets. Maybe documenting this detaill should be part of adding a new target to clang/llvm.

1199 ↗(On Diff #219989)

-ffast-math sets no-trapping-math, and the default is -fno-fast-math, so the default is trapping-math. However, currently trapping-math is not functional, the option has no effect other than possibly disrupting the action of the ffast-math option. I mean, if you have ffast-math on the command line and you also have -ftrapping-math, the FAST_MATH macro would not be defined, and ffast-math would not be pushed into the Args.

1203 ↗(On Diff #219989)

I made the text changes you suggested, but i think that the interaction with the pragma should be described in the pragma section. currently there's already information about the option in the pragma section. I'll add a sentence referring the reader to the pragma documentation.

1213 ↗(On Diff #219989)

yes, if both the options are on the command line, then -ffinite-math-only is pushed into Args; I'll put this into the next upload. If used individually then these 2 options have no effect.

// Handle __FINITE_MATH_ONLY__ similarly.
if (!HonorINFs && !HonorNaNs)
  CmdArgs.push_back("-ffinite-math-only");
1213 ↗(On Diff #219989)

@cameron.mcinally fyi @rjmccall asked me to create a separate floating point options section [this review], then after it's committed, create another patch where we can consider adding fp-model options

mibintc updated this revision to Diff 220359.Mon, Sep 16, 10:42 AM
rjmccall added inline comments.Mon, Sep 16, 12:24 PM
clang/docs/UsersManual.rst
1152 ↗(On Diff #219989)

Thanks. Let me suggest a reorganization of the new wording; it's currently a bit misleading because it leads with the bit about __FAST_MATH__, which is of course not the primary function of the option.

.. option:: -ffast-math

   Enable fast-math mode.  This option allows the compiler to make a set of
   aggressive assumptions about floating-point math; the resulting code may
   execute faster but may also have decreased precision or even incorrect
   behavior for corner cases such as ```-0``.  These assumptions include:

   * Floating-point math obeys regular algebraic rules for real numbers (e.g.
     ``+`` and ``*`` are associative, ``x/y == x * (1/y)``, and
     ``(a + b) * c == a * c + b * c``),
   * operands to floating-point operations are not equal to ``NaN`` and
     ``Inf``, and
   * ``+0`` and ``-0`` are interchangeable.

   ``-ffast-math`` also defines the ``__FAST_MATH__`` preprocessor
   macro. Some math libraries recognize this macro and change their behavior.
   With the exception of ``-ffp-contract=fast``, disabling any of the individual
   optimizations in ``-ffast-math`` will cause ``__FAST_MATH__`` to no longer
   be set.

   ``-ffast-math`` implies the following options:
1188 ↗(On Diff #219989)

I think it would good to add that documentation for the existing targets, but I don't want to hold up this patch for it, so we can leave that out.

1203 ↗(On Diff #219989)

Sure, a cross-reference is fine.

1264 ↗(On Diff #220359)

Missing period.

mibintc updated this revision to Diff 220373.Mon, Sep 16, 12:44 PM

Respond to @rjmccall 's review

mibintc marked 2 inline comments as done.Mon, Sep 16, 12:47 PM
rjmccall accepted this revision.Mon, Sep 16, 12:51 PM

LGTM, thank you.

This revision is now accepted and ready to land.Mon, Sep 16, 12:51 PM

@rjmccall I don't have commit rights, if you think it's ready can you submit? thanks for all

clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

I think it's fine to leave it as you have it for now (assuming this is copying existing documentation). We'd need to figure out what's currently happening in all cases to document that, and I don't think the current behavior is simple. It's probably best to leave cleaning this up as a product of the discussion to figure out what this should be doing.

mibintc marked an inline comment as done.Tue, Sep 17, 7:22 AM

ready to commit? anything else needed?

clang/docs/UsersManual.rst
1180 ↗(On Diff #219989)

I made 1 modification, 'defaults to ieee'. The option in clang actually doesn't have a default, but the corresponding variable in llvm is initialized to ieee.

lebedev.ri added inline comments.
clang/docs/UsersManual.rst
1134 ↗(On Diff #220373)

Not enough -

mibintc updated this revision to Diff 220508.Tue, Sep 17, 8:15 AM

fix underline

mibintc marked an inline comment as done.Tue, Sep 17, 8:16 AM
This revision was automatically updated to reflect the committed changes.
andrew.w.kaylor reopened this revision.Tue, Sep 17, 3:08 PM

Need to fix Sphinx warnings

This revision is now accepted and ready to land.Tue, Sep 17, 3:08 PM
Closed by commit rL372229: Recommit -r372180 (authored by erichkeane, committed by ). · Explain WhyWed, Sep 18, 8:08 AM
This revision was automatically updated to reflect the committed changes.