[RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=: Specify floating point behavior
Needs Review (Public)

Authored by mibintc on May 31 2019, 7:01 AM.

Details

Summary

Intel would like to contribute a patch to implement support for these Intel and Microsoft floating-point options. This message describes the options and requests feedback from the community.
-frounding-math (supported by gcc and ICC)
-fp-model=[precise|strict|fast] and -fp-exception-behavior=[ignore|maytrap|strict]

This contribution dovetails with the llvm patch "Teach the IRBuilder about constrained fadd and friends". The motivation is that an umbrella option such as -fp-model=, which controls most basic FP settings, is easier for users to understand than a collection of individual flags.

The option settings -fp-model=[precise|strict|fast] are supported by both ICC and CL. The CL and ICC -fp-model option is documented on these pages:

https://docs.microsoft.com/en-us/cpp/build/reference/fp-specify-floating-point-behavior?view=vs-2019
https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-fp-model-fp

Currently, clang's default behavior corresponds to -fp-model=precise. Clang/llvm support for -fp-model=strict and -fp-exception-behavior= was developed in the D53157 patch, and there is current llvm support for the fast settings by using the fast math flags llvm::FastMathFlags. Note: the clang-cl wrapper to support Microsoft options has simplified support for these options by mapping /fp:except to -ftrapping-math, /fp:fast to -ffast-math, and /fp:precise and /fp:strict to -fno-fast-math (see clang/Driver/CLCompatOptions.td).

These are the settings for -fp-model=

precise - Disables optimizations that are not value-safe on floating-point data, although FP contraction is enabled.
strict - Enables precise and except, disables contractions (FMA), and enables #pragma STDC FENV_ACCESS.   [Note: FENV_ACCESS is not currently supported in clang]
fast - Equivalent to -ffast-math

What follows here is Microsoft /fp documentation from the msdn site. It's copied here solely for the purposes of saving information for the future in case the information is removed from the msdn site:

/fp (Specify floating-point behavior)

Specifies how the compiler treats floating-point expressions, optimizations, and exceptions. The /fp options specify whether the generated code allows floating-point environment changes to the rounding mode, exception masks, and subnormal behavior, and whether floating-point status checks return current, accurate results. It controls whether the compiler generates code that maintains source operation and expression ordering and conforms to the standard for NaN propagation, or if it instead generates more efficient code that may reorder or combine operations and use simplifying algebraic transformations that are not allowed by the standard.

Syntax

/fp:[precise | strict | fast | except[-]]

Arguments

precise

By default, the compiler uses /fp:precise behavior.

Under /fp:precise the compiler preserves the source expression ordering and rounding properties of floating-point code when it generates and optimizes object code for the target machine. The compiler rounds to source code precision at four specific points during expression evaluation: at assignments, at typecasts, when a floating-point argument is passed to a function call, and when a floating-point value is returned from a function call. Intermediate computations may be performed at machine precision. Typecasts can be used to explicitly round intermediate computations.

The compiler does not perform algebraic transformations on floating-point expressions, such as reassociation or distribution, unless the transformation is guaranteed to produce a bitwise identical result.
Expressions that involve special values (NaN, +infinity, -infinity, -0.0) are processed according to IEEE-754 specifications. For example, x != x evaluates to true if x is NaN. Floating-point *contractions*, that is, machine instructions that combine floating-point operations, may be generated under /fp:precise.

The compiler generates code intended to run in the default floating-point environment and assumes that the floating-point environment is not accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

If your floating-point code does not depend on the order of operations and expressions in your floating-point statements (for example, if you don't care whether a * b + a * c is computed as (b + c) * a or 2 * a as a + a), consider the /fp:fast option, which can produce faster, more efficient code. If your code both depends on the order of operations and expressions, and accesses or alters the floating-point environment (for example, to change rounding modes or to trap floating-point exceptions), use /fp:strict.
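To make the ordering point concrete, here is a minimal C++ sketch (not part of the patch; the function names are made up for illustration). It shows why a precise model cannot reassociate additions, and it shows the x != x test for NaN mentioned above. It assumes IEEE-754 doubles and a build without fast-math flags.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: evaluates (a + b) + c, grouping the first two operands.
double sumLeft(double a, double b, double c) { return (a + b) + c; }

// Hypothetical helper: evaluates a + (b + c), grouping the last two operands.
double sumRight(double a, double b, double c) { return a + (b + c); }

// Under IEEE-754 comparison rules this is true only when x is NaN.
bool differsFromItself(double x) { return x != x; }
```

With a = 1e16, b = -1e16, c = 1.0, sumLeft returns 1.0 while sumRight returns 0.0, because -1e16 + 1.0 is not representable as a double and rounds back to -1e16. A fast model is allowed to pick either grouping; a precise or strict model has to keep the one written in the source.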

strict

/fp:strict has behavior similar to /fp:precise, that is, the compiler preserves the source ordering and rounding properties of floating-point code when it generates and optimizes object code for the target machine, and observes the standard when handling special values. In addition, the program may safely access or modify the floating-point environment at runtime.

Under /fp:strict, the compiler generates code that allows the program to safely unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes. It rounds to source code precision at four specific points during expression evaluation: at assignments, at typecasts, when a floating-point argument is passed to a function call, and when a floating-point value is returned from a function call. Intermediate computations may be performed at machine precision. Typecasts can be used to explicitly round intermediate computations. The compiler does not perform algebraic transformations on floating-point expressions, such as reassociation or distribution, unless the transformation is guaranteed to produce a bitwise identical result. Expressions that involve special values (NaN, +infinity, -infinity, -0.0) are processed according to IEEE-754 specifications. For example, x != x evaluates to true if x is NaN. Floating-point contractions are not generated under /fp:strict.
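As an illustration of why contraction matters for value safety, the following C++ sketch (not taken from the patch; helper names are hypothetical) compares a multiply followed by an add, which rounds twice, against std::fma, which rounds once. The volatile temporary is only there to keep a compiler from fusing the two operations on its own.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: a * b is rounded to double, then c is added (two roundings).
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;  // forces the product to be rounded and stored
  return product + c;
}

// Hypothetical helper: a * b + c computed with a single rounding at the end.
double fusedMulAdd(double a, double b, double c) { return std::fma(a, b, c); }
```

Take e = 2^-27, a = 1 + e, b = 1 - e, c = -1. The exact product is 1 - 2^-54, which rounds to 1.0 as a double, so mulThenAdd returns 0.0, while fusedMulAdd keeps the exact product and returns -2^-54. Both answers are legitimate; the option only decides whether the compiler may switch between them.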

/fp:strict is computationally more expensive than /fp:precise because the compiler must insert additional instructions to trap exceptions and allow programs to access or modify the floating-point environment at runtime. If your code doesn’t use this capability, but requires source code ordering and rounding, or relies on special values, use /fp:precise. Otherwise, consider using /fp:fast, which can produce faster and smaller code.
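For reference, this is the kind of code that needs the strict model: it reads the floating-point status flags after an operation. The sketch below is illustrative only (the function name is hypothetical) and uses the standard <cfenv> calls. Since FENV_ACCESS support varies between compilers, it leans on volatile operands to keep the division from being folded away or moved, which is a workaround rather than a guarantee.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: returns true if dividing 1.0 by 0.0 sets the
// divide-by-zero status flag in the floating-point environment.
bool divideByZeroRaisesFlag() {
  volatile double numerator = 1.0;
  volatile double denominator = 0.0;
  std::feclearexcept(FE_ALL_EXCEPT);            // start from a clean set of flags
  volatile double quotient = numerator / denominator;
  (void)quotient;                               // the value itself is not needed
  return std::fetestexcept(FE_DIVBYZERO) != 0;  // inspect the status flags
}
```

Under a precise or fast model the compiler is allowed to assume nobody looks at these flags, so without the volatile qualifiers the division could legally be evaluated at compile time and the flag might never be set at run time.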

fast

The /fp:fast option allows the compiler to reorder, combine, or simplify floating-point operations to optimize floating-point code for speed and space. The compiler may omit rounding at assignment statements, typecasts, or function calls. It may reorder operations or perform algebraic transforms, for example, by use of associative and distributive laws, even if such transformations result in observably different rounding behavior. Because of this enhanced optimization, the result of some floating-point computations may differ from those produced by other /fp options. Special values (NaN, +infinity, -infinity, -0.0) may not be propagated or behave strictly according to the IEEE-754 standard. Floating-point contractions may be generated under /fp:fast. The compiler is still bound by the underlying architecture under /fp:fast, and additional optimizations may be available through use of the /arch option.

Under /fp:fast, the compiler generates code intended to run in the default floating-point environment and assumes that the floating-point environment isn’t accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

/fp:fast is intended for programs that do not require strict source code ordering and rounding of floating-point expressions, and do not rely on the standard rules for handling special values such as NaN. If your floating-point code requires preservation of source code ordering and rounding, or relies on standard behavior of special values, use /fp:precise. If your code accesses or modifies the floating-point environment to change rounding modes, unmask floating-point exceptions, or check floating-point status, use /fp:strict.

except

The /fp:except option generates code to ensure that any unmasked floating-point exceptions are raised at the exact point at which they occur, and that no additional floating-point exceptions are raised. By default, the /fp:strict option enables /fp:except, and /fp:precise does not. The /fp:except option is not compatible with /fp:fast. The option can be explicitly disabled by use of /fp:except-.

Note that /fp:except does not enable any floating-point exceptions by itself, but it is required for programs to enable floating-point exceptions. See _controlfp for information on how to enable floating-point exceptions.

Remarks

Multiple /fp options can be specified in the same compiler command line. Only one of /fp:strict, /fp:fast, and /fp:precise options can be in effect at a time. If more than one of these options is specified on the command line, the later option takes precedence and the compiler generates a warning.
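The "later option wins, with a warning" rule can be sketched as follows. This is a toy model in C++, not the actual cl or clang driver logic: the struct, the function name, and the choice to count a warning for every override (including a repeat of the same model) are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the model in effect and how many warnings were issued.
struct FpModelChoice {
  std::string model = "precise";  // /fp:precise is the documented default
  int warnings = 0;
};

// Hypothetical helper: scans the arguments left to right; the last of
// /fp:precise, /fp:strict, /fp:fast takes effect. Other arguments are ignored.
FpModelChoice pickFpModel(const std::vector<std::string>& args) {
  FpModelChoice choice;
  bool seenModel = false;
  for (const std::string& arg : args) {
    if (arg != "/fp:precise" && arg != "/fp:strict" && arg != "/fp:fast")
      continue;
    if (seenModel)
      ++choice.warnings;         // assumption: every override produces one warning
    choice.model = arg.substr(4);  // drop the leading "/fp:"
    seenModel = true;
  }
  return choice;
}
```

For the arguments /fp:fast /W4 /fp:strict this yields the model "strict" with one warning, and an empty command line yields "precise" with none.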

Diff Detail

Repository
rL LLVM

Event Timeline

mibintc created this revision. May 31 2019, 7:01 AM

Documentation missing. All the blurb from patch description should be in doc/

pengfei added a subscriber: pengfei. Jun 2 2019, 5:40 PM
wuzish added a subscriber: wuzish. Jul 3 2019, 7:35 PM
mibintc updated this revision to Diff 211770. Jul 25 2019, 9:11 AM

The IRBuilder now has been taught about constrained fadd and friends. I simply updated my patches to work with the committed revision. Note that this diff now contains what was formerly being reviewed separately in clang+llvm. Also let's discuss the new llvm file fpState.h. In this revision I didn't spend a lot of time pondering the best way to support this in the current world, i just made a simple update so it would build.

I could put the entire patch under clang and not create new files in llvm. I think it would be an advantage to have the interpretation of the floating point command line switches occur in llvm itself. For one reason, because Intel plans to contribute the same options in the Fortran compiler as well as the clang compiler. This puts the switch logic into a single place. Other languages may want to add support for the options too. Note, we also plan to implement inline pragma's for clang which allow the fp modes to be set for the duration of a code block.

I think it would be convenient to have an "unset" setting for the different constrained modes, otherwise you need a boolean that says "no value was provided for this option". But i'm a frontend person so I may need to have my attitude adjusted.

When I was coding this before, I needed to pull FPState.h out of IRBuilder, because otherwise I had to sprinkle additional include directives across many clang files. With the header pulled out like this to be more or less standalone, that problem is resolved.

kpn added a comment. Jul 25 2019, 10:06 AM

I think it would be convenient to have an "unset" setting for the different constrained modes, otherwise you need a boolean that says "no value was provided for this option". But i'm a frontend person so I may need to have my attitude adjusted.

What "different constrained modes"? The IRBuilder is either in constrained mode or it isn't. In constrained mode the exception behavior and rounding mode both have defaults, and those defaults can be changed individually without affecting the other setting. The current defaults can also be retrieved if you need something for a function call where you don't want to change it but need an argument anyway. When do you need this "no value provided" setting?

Oh, I'd check the tools/clang/CODE_OWNERS.txt file and add additional appropriate reviewers. Perhaps John McCall and Richard Smith? I don't know who has opinions on how command line options should be handled.

Do we want the Unix driver to be compatible with gcc? Maybe, maybe not. Opinions, anyone?

The documentation request from lebedev.ri isn't in this ticket yet.

Also, for future historical research purposes I'd cut and paste the relevant portions of outside web pages (like intel.com's) and post them somewhere llvm-ish where they are findable. This ticket, for example, is a good place. Web sites gets reorganized or vanish in full or in part. It's helpful to have some insulation from that over time. I've had to fix compiler bugs that actually were 25 years old before. Yes, 25 years old. Being able to do the research is very helpful.

Oh, it may be useful to know that constrained floating point and the FastMathFlags are not mutually exclusive. I don't know if that matters here or not, but you did mention FastMathFlags.

llvm/lib/IR/FPState.cpp
78 ↗(On Diff #211770)

The IRBuilder already has defaults for exception behavior and rounding. Is it a good idea to duplicate that knowledge here? Worse, what's here is different from what's in the IRBuilder. Why not ask the IRBuilder what its current setting is and use that?

Would it make sense to have setters/getters, and then have a separate updateBuilder() method? We still don't have a good way to get the #pragma down to the lower levels of clang. The current, unfinished, attempt doesn't work for C++ templates and I'm told other cases as well. An updateBuilder() method could be helpful when moving from one scope to another. But keep in mind that if any constrained fp math is used in a function then the entire function has to be constrained.

Given that last bit, it may make more sense to have the support somewhere higher level than as a wrapper around the IRBuilder. Maybe in CodeGenFunction or CodeGenModule? I've not spent much time in clang so I'm not sure if that makes sense or not.

arsenm added a subscriber: arsenm. Jul 25 2019, 7:30 PM
arsenm added inline comments.
llvm/include/llvm/IR/FPState.h
1 ↗(On Diff #211770)

Missing license header and c++ mode comment

llvm/lib/IR/FPState.cpp
1 ↗(On Diff #211770)

Missing license header

73 ↗(On Diff #211770)

eb?

llvm/unittests/IR/IRBuilderTest.cpp
205 ↗(On Diff #211770)

ASSERT_FALSE with no !

217–218 ↗(On Diff #211770)

Most of these should not be using ASSERT_, and instead EXPECT_EQ

258 ↗(On Diff #211770)

This already would have crashed from the cast<> before

259–260 ↗(On Diff #211770)

EXPECT_EQ

mibintc marked 3 inline comments as done. Edited Jul 26 2019, 12:01 PM

Thanks for your review!

In D62731#1601408, @kpn wrote:

I think it would be convenient to have an "unset" setting for the different constrained modes, otherwise you need a boolean that says "no value was provided for this option". But i'm a frontend person so I may need to have my attitude adjusted.

What "different constrained modes"? The IRBuilder is either in constrained mode or it isn't. In constrained mode the exception behavior and rounding mode both have defaults, and those defaults can be changed individually without affecting the other setting. The current defaults can also be retrieved if you need something for a function call where you don't want to change it but need an argument anyway. When do you need this "no value provided" setting?

I'm going to rewrite this

Oh, I'd check the tools/clang/CODE_OWNERS.txt file and add additional appropriate reviewers. Perhaps John McCall and Richard Smith? I don't know who has opinions on how command line options should be handled.

I'd like to fix it more before I add more reviewers

Do we want the Unix driver to be compatible with gcc? Maybe, maybe not. Opinions, anyone?

Oh, I think you mean something more like the gnuish -fno-except, or maybe -fp-model=no-except instead of -fp-model=except-? OK, we can get that sorted.

The documentation request from lebedev.ri isn't in this ticket yet.

Also, for future historical research purposes I'd cut and paste the relevant portions of outside web pages (like intel.com's) and post them somewhere llvm-ish where they are findable. This ticket, for example, is a good place. Web sites gets reorganized or vanish in full or in part. It's helpful to have some insulation from that over time. I've had to fix compiler bugs that actually were 25 years old before. Yes, 25 years old. Being able to do the research is very helpful.

That's a good idea thanks

Oh, it may be useful to know that constrained floating point and the FastMathFlags are not mutually exclusive. I don't know if that matters here or not, but you did mention FastMathFlags.

Yes, I'm not sure how the fast math command line options should interact with the fp-model options; I'll have to dig into that.

llvm/lib/IR/FPState.cpp
1 ↗(On Diff #211770)

Thanks @arsenm in the end i believe i won't be adding this file

73 ↗(On Diff #211770)

I mean, if the user requested constrained exception via the option -fp-model=except but no rounding mode has been requested (again via command line options) then the rounding mode should be set to "round to nearest". I'm following a description of how the icc compiler works. I'm afraid that your concise comment "eb?" doesn't convey enough information for me to understand what you mean. With these further remarks is it clear now?

78 ↗(On Diff #211770)

Yes I absolutely don't want to duplicate, and I will submit another version of this patch and i'll be removing fpstate*. I do want to be able to make the fp-model options match the same behavior as icc is using. One reason i wanted to keep track of the state within a separate object is because i was uncertain if stuff would be going on in the IR builder which would be changing the settings, for whatever reason, and i'd want to put them back into settings specified by the command line options before creating the constrained intrinsics in clang/codegen.

let me work on this patch more, i just did a quick update to the latest before sending this up.

As far as pragmas versus templates, this is a concern. Is there something I can read to learn more about the issue? Pragma's are used in OpenMP so there must be a way to have pragma's interact politely with templates? Knowing very little, I thought that the pragma would be held as a _Pragma, sort of like a function call, within the intermediate representation e.g. as opposed to a token stream from the preprocessor and I didn't think there would be a problem with templates per se. I'll check with other folks here at Intel. There was a concern about inlining constrained intrinsics into a function because of your rule about whole function body but nobody mentioned a problem with templates.

kpn added a comment. Jul 26 2019, 12:38 PM

I actually don't have much of an opinion on what the command line argument form should be. It may be helpful for it to be the same as one of the commonly deployed compilers. The worst I think would be pretty close but with subtle differences. So if it can be made to work like Intel's compiler I'm fine with that. But I'm hoping that more people in the community chime in since having a consensus would be best. Personally, I'm not yet giving any final sign-offs to tickets since I don't think I've been here long enough.

As far as the rounding metadata argument, the language reference says this:

For values other than “round.dynamic” optimization passes may assume that the actual runtime rounding mode (as defined in a target-specific manner) matches the specified rounding mode, but this is not guaranteed. Using a specific non-dynamic rounding mode which does not match the actual rounding mode at runtime results in undefined behavior.

Be aware that currently neither of the metadata arguments does anything. They get dropped when llvm reaches the SelectionDAG. And none of the optimization passes that run before that know anything about constrained intrinsics at all. This means they treat code that has them conservatively. Preserving and using that metadata in the optimization passes and getting it down and used by the MI layer is planned but hasn't happened yet. So the full set of arguments may not make sense yet, but an on/off switch for strict mode hopefully does.

llvm/lib/IR/FPState.cpp
78 ↗(On Diff #211770)

See "https://reviews.llvm.org/D52839", "Inform AST's UnaryOperator of FENV_ACCESS". It was there that Richard Smith brought up the issue of templates. I've been prioritizing work on the llvm end and haven't had a chance to get to understand how the relevant parts on the clang side work myself. My hope is that maybe command line arguments can go in to enable strict FP on a compilation-wide basis, with support for the #pragma coming later. But I don't know if it will work out that way.

mibintc updated this revision to Diff 214003. Aug 7 2019, 2:32 PM
mibintc edited the summary of this revision.

Compared to the 2nd revision, this patch moves all the changes into clang, removing the FPState file.

In the summary, I've copied information from Microsoft about the fp-model options into the differential Summary, as requested by Kevin, to create a legacy for future maintainers in case that information disappears from msdn.

In D62731#1601408, @kpn wrote:

I think it would be convenient to have an "unset" setting for the different constrained modes, otherwise you need a boolean that says "no value was provided for this option". But i'm a frontend person so I may need to have my attitude adjusted.

What "different constrained modes"? The IRBuilder is either in constrained mode or it isn't. In constrained mode the exception behavior and rounding mode both have defaults, and those defaults can be changed individually without affecting the other setting. The current defaults can also be retrieved if you need something for a function call where you don't want to change it but need an argument anyway. When do you need this "no value provided" setting?

Oh, I'd check the tools/clang/CODE_OWNERS.txt file and add additional appropriate reviewers. Perhaps John McCall and Richard Smith? I don't know who has opinions on how command line options should be handled.

Do we want the Unix driver to be compatible with gcc? Maybe, maybe not. Opinions, anyone?

Yes we need opinions on this.

The documentation request from lebedev.ri isn't in this ticket yet.

I'm not yet sure what I need for this, the requested documentation is still missing.

Also, for future historical research purposes I'd cut and paste the relevant portions of outside web pages (like intel.com's) and post them somewhere llvm-ish where they are findable. This ticket, for example, is a good place. Web sites gets reorganized or vanish in full or in part. It's helpful to have some insulation from that over time. I've had to fix compiler bugs that actually were 25 years old before. Yes, 25 years old. Being able to do the research is very helpful.

Oh, it may be useful to know that constrained floating point and the FastMathFlags are not mutually exclusive. I don't know if that matters here or not, but you did mention FastMathFlags.

Yes they aren't mutually exclusive. One of the fp-model options implies enabling fast math.

Compared to the 2nd revision, this patch moves all the changes into clang, removing the FPState file.

In the summary, I've copied information from Microsoft about the fp-model options into the differential Summary, as requested by Kevin, to create a legacy for future maintainers in case that information disappears from msdn.

The documentation should be documentation; it should reside somewhere in clang/docs/.

In D62731#1603030, @kpn wrote:

I actually don't have much of an opinion on what the command line argument form should be. It may be helpful for it to be the same as one of the commonly deployed compilers. The worst I think would be pretty close but with subtle differences. So if it can be made to work like Intel's compiler I'm fine with that. But I'm hoping that more people in the community chime in since having a consensus would be best. Personally, I'm not yet giving any final sign-offs to tickets since I don't think I've been here long enough.

As far as the rounding metadata argument, the language reference says this:

For values other than “round.dynamic” optimization passes may assume that the actual runtime rounding mode (as defined in a target-specific manner) matches the specified rounding mode, but this is not guaranteed. Using a specific non-dynamic rounding mode which does not match the actual rounding mode at runtime results in undefined behavior.

Be aware that currently neither of the metadata arguments does anything. They get dropped when llvm reaches the SelectionDAG. And none of the optimization passes that run before that know anything about constrained intrinsics at all. This means they treat code that has them conservatively. Preserving and using that metadata in the optimization passes and getting it down and used by the MI layer is planned but hasn't happened yet. So the full set of arguments may not make sense yet, but an on/off switch for strict mode hopefully does.

@andrew.w.kaylor Can you please check over Kevin's comments about metadata?

The documentation should be documentation; it should reside somewhere in clang/docs/.

Got it thanks

I'm not entirely caught up on this review. I've only read the most recent comments, but I think I've got enough context to comment on the metadata arguments.

Based only on the fp-model command line options, the front end should only ever use "round.dynamic" (for strict mode) or "round.tonearest" (where full fenv access isn't permitted but we want to enable strict exception semantics). There are some pragmas, which I believe are in some draft of a standards document but not yet approved, which can declare any of the other rounding modes, but I don't know that we have plans to implement those yet. I'm also hoping that at some point we'll have a pass that finds calls to fesetround() and changes the metadata argument when it can prove what the rounding mode will be.

For the fp exception argument, the only values that can be implied by the -fp-model option are "fpexcept.strict" and "fpexcept.ignore". In icc, we have a separate option that can prevent speculation (equivalent to "fpexcept.maytrap"). I think gcc's -ftrapping-math has a similar function (though the default may be reversed). I don't think we've talked about how (or if) clang should ever get into the "fpexcept.maytrap" state.

So for now, I think both arguments only need to support one of two states, depending on the -fp-model arguments.
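To illustrate why strict mode has to use "round.dynamic": a program may call fesetround() at run time, and the same source-level division then produces different results. The C++ sketch below is only an illustration (the helper name is hypothetical). It uses volatile operands as a stand-in for real FENV_ACCESS support, so that the division is performed at run time between the two fesetround() calls.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: divides n by d with the given rounding mode installed,
// then restores whatever rounding mode was active before the call.
double divideWithRounding(int mode, double n, double d) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double vn = n;       // volatile keeps the division at run time
  volatile double vd = d;
  volatile double quotient = vn / vd;
  std::fesetround(saved);
  return quotient;
}
```

Calling divideWithRounding(FE_DOWNWARD, 1.0, 3.0) gives a value strictly smaller than divideWithRounding(FE_UPWARD, 1.0, 3.0), since 1/3 is not exactly representable. An optimizer that assumed round-to-nearest for this code would fold both calls to the same constant, which is the undefined-behavior case the language reference warns about.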

@andrew.w.kaylor Thanks Andy. Reminder -- in a private document you indicated to me that -fp-speculation=safe corresponds to the maytrap setting for the exception argument. This patch is implemented with those semantics.

Right. This is a clear indication of why I shouldn't try to squeeze in a review just before heading out the door at the end of the day. The "separate option" in icc that I was referring to is -fp-speculation. Somehow I overlooked the fact that you were adding that option here. I saw your note and read the recent comments without looking over even the patch description. Sorry about that.

Here's a summary of how the rm and eb options are set

if user requests fp-model=strict, then the ConstrainedIntrinsic will be built with ( rmDynamic, ebStrict )

if user requests fp-model=except or fp-model=noexcept then the ConstrainedIntrinsic will be built with eb set to Strict or Ignore, respectively. In all cases, if the user options don't otherwise request a value for the rounding mode, but the user options have requested a value for exception behavior, then the ConstrainedIntrinsic will use rmToNearest.

The fp-speculation option controls only the eb setting. There are 3 possible values: fast, strict, safe and the ConstrainedIntrinsic will be built with eb set to ebIgnore, ebStrict, ebMayTrap respectively.
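The summary above can be written down as a small standalone C++ model. This is a sketch for discussion, not the code in the patch: the enum and function names are hypothetical, the explicit Unset enumerator mirrors the "unset" setting discussed earlier in the thread, and letting a later option override an earlier one is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the rm/eb values named in the summary.
enum class RoundingMode { Unset, Dynamic, ToNearest };
enum class ExceptionBehavior { Unset, Ignore, MayTrap, Strict };

struct ConstrainedArgs {
  RoundingMode rm = RoundingMode::Unset;
  ExceptionBehavior eb = ExceptionBehavior::Unset;
};

// Hypothetical helper: applies the options in order (assumption: later wins),
// then fills in rmToNearest when only the exception behavior was requested.
ConstrainedArgs resolveConstrainedArgs(const std::vector<std::string>& options) {
  ConstrainedArgs args;
  for (const std::string& opt : options) {
    if (opt == "fp-model=strict") {
      args.rm = RoundingMode::Dynamic;
      args.eb = ExceptionBehavior::Strict;
    } else if (opt == "fp-model=except" || opt == "fp-speculation=strict") {
      args.eb = ExceptionBehavior::Strict;
    } else if (opt == "fp-model=noexcept" || opt == "fp-speculation=fast") {
      args.eb = ExceptionBehavior::Ignore;
    } else if (opt == "fp-speculation=safe") {
      args.eb = ExceptionBehavior::MayTrap;
    }
  }
  if (args.eb != ExceptionBehavior::Unset && args.rm == RoundingMode::Unset)
    args.rm = RoundingMode::ToNearest;
  return args;
}
```

With no options both fields stay Unset, which corresponds to not building constrained intrinsics at all. fp-speculation=safe alone resolves to (ToNearest, MayTrap), and fp-model=strict resolves to (Dynamic, Strict).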

mibintc updated this revision to Diff 215225.Aug 14 2019, 1:24 PM

I added documentation for the new floating point options into clang/docs

kpn added inline comments.Aug 15 2019, 11:53 AM
clang/docs/UsersManual.rst
1307

Extra spaces?

clang/lib/CodeGen/CodeGenFunction.cpp
123

Wait, so "fast" and "precise" are the same thing? That doesn't sound like where the documentation you put in the ticket says "the compiler preserves the source expression ordering and rounding properties of floating-point".

(Yes, I saw below where "fast" turns on the fast math flags but "precise" doesn't. That doesn't affect my point here.)

clang/lib/Frontend/CompilerInvocation.cpp
3087

Shouldn't this be a call to Diags.Report() like in the code just above it and below? Same question for _some_ other uses of llvm_unreachable().

clang/test/CodeGen/fpconstrained.c
23

This is another case of "fast" and "precise" doing the same thing. If we're using the regular fadd then it cannot be that "the compiler preserves the source expression ordering and rounding properties of floating-point".

mibintc marked 3 inline comments as done.Aug 15 2019, 1:05 PM
mibintc added inline comments.
clang/lib/CodeGen/CodeGenFunction.cpp
123

"precise" doesn't necessitate the use of Constrained Intrinsics, and likewise for "fast". The words "compiler preserves the source expression ordering" were copied from the msdn documentation for /fp:precise, since you explained it would be useful to have the msdn documentation for the option in case it goes offline in, say, 30 years. The ICL Intel compiler also provides equivalent floating point options. The Intel documentation for precise is phrased differently: "Disables optimizations that are not value-safe on floating-point data."

fp-model=precise should enable contractions, if that's not true at default (I mean, clang -c) then this patch is missing that.

fp-model=fast is the same as requesting ffast-math

clang/lib/Frontend/CompilerInvocation.cpp
3087

I put it in as unreachable because the clang driver shouldn't build this combination, but that's a good point I can just switch it to match the other code in this function, thanks.

clang/test/CodeGen/fpconstrained.c
23

I need an fp wizard to address this point, @andrew.w.kaylor ??

The msdn documentation says that strict and precise both preserve ...

mibintc marked 3 inline comments as not done.Aug 16 2019, 1:30 PM

I added an inline reply and unmarked some things that had been inadvertently marked done.

clang/lib/CodeGen/CodeGenFunction.cpp
123

Well, we haven't heard from Andy yet, but he told me some time ago that /fp:precise corresponds more or less (there was wiggle room) to clang's default behavior. It sounds like you think the description in the msdn of /fp:precise isn't describing clang's default behavior, @kpn can you say more about that, and do you think that ConstrainedIntrinsics should be created to provide the semantics of /fp:precise?

mibintc updated this revision to Diff 215666.Aug 16 2019, 1:53 PM
mibintc marked an inline comment as not done.
mibintc edited the summary of this revision. (Show Details)

I addressed some comments from @kpn: I corrected the documentation formatting and added some details, and used Diag instead of llvm_unreachable.

I decided to change -fp-model=except- to -fp-model=noexcept

When the user requests -fp-model=strict I explicitly set the FMA Contraction mode to off.

mibintc marked 2 inline comments as done.Aug 16 2019, 1:56 PM
clang/lib/CodeGen/CodeGenFunction.cpp
123

"Precise" means that no value unsafe optimizations will be performed. That's what LLVM does by default. As long as no fast math flags are set, we will not perform optimizations that are not value safe.

kpn added a comment.Mon, Aug 19, 5:54 AM

I don't believe I have any further comments. What do the front-end guys say?

clang/lib/CodeGen/CodeGenFunction.cpp
123

OK, I stand corrected.

ping, looking for a code review, especially from front end folks. thank you!

rjmccall added inline comments.Mon, Aug 26, 10:00 PM
clang/docs/UsersManual.rst
1299

This should be something like -fp-model=<value>. Square brackets mean optional elements in these docs.

1305

Combined how? With a comma?

This option seems to have two independent dimensions. Is that necessary for command-line compatibility with ICC, or can we separate it into two options?

The documentation should mention the default behavior along both dimensions. Is it possible to override a prior instance of this option to get this default behavior back?

You mention that this -fp-model=fast is equivalent to -ffast-math. How does this option interact with that one if both are given on a command line?

Please put option text in backticks wherever it appears.

Most of these comments apply to -fp-speculation as well.

mibintc marked 2 inline comments as done.Tue, Aug 27, 10:09 AM
mibintc added inline comments.
clang/docs/UsersManual.rst
1305

Combined how? With a comma?

This option seems to have two independent dimensions. Is that necessary for command-line compatibility with ICC, or can we separate it into two options?

Yes that's right, there are 2 dimensions. I wrote it like this for identical compatibility with icc, and cl.exe also defines the option this way, to specify multiple values simultaneously. However I think it would be reasonable and good to split them into separate options. I will discuss this with the folks back home.

The documentation should mention the default behavior along both dimensions.

I added this info into the doc

Is it possible to override a prior instance of this option to get this default behavior back?

The 3 values along one dimension, precise, strict, fast if they appear multiple times in the command line, the last value will be the setting along that dimension. Ditto with the other dimension, the rightmost occurrence of except or noexcept will be the setting.

You mention that this -fp-model=fast is equivalent to -ffast-math. How does this option interact with that one if both are given on a command line?

The idea is that they are synonyms so if either or both appeared on the command line, the effect would be identical.

I'll upload another patch with a few documentation updates and get back to you about splitting the fp-model option into multiple options. (Longer term, there are 2 other dimensions to fp-model)

And thanks for the review

mibintc updated this revision to Diff 217426.Tue, Aug 27, 10:12 AM

I made a couple of changes to the UsersManual in response to @rjmccall's review

rjmccall added inline comments.Tue, Aug 27, 11:34 AM
clang/docs/UsersManual.rst
1305

Yes that's right, there are 2 dimensions. I wrote it like this for identical compatibility with icc, and cl.exe also defines the option this way, to specify multiple values simultaneously. However I think it would be reasonable and good to split them into separate options. I will discuss this with the folks back home.

Okay. There's certainly some value in imitating existing compilers, but it sounds like a lot has been forced into one option, so maybe we should take the opportunity to split it up. If we do split it, though, I think the different dimensions should have different base spellings, rather than being repeated uses of -fp-model.

The 3 values along one dimension, precise, strict, fast if they appear multiple times in the command line, the last value will be the setting along that dimension.

Okay. This wasn't clear to me from the code, since the code also has an "off" option.

The idea is that they are synonyms so if either or both appeared on the command line, the effect would be identical.

Right, but compiler options are allowed to conflict with each other, with the general rule being that the last option "wins". So what I'm asking is if that works correctly with this option and -ffast-math, so that e.g. -ffast-math -fp-model=strict leaves you with strict FP but -fp-model=strict -ffast-math leaves you with fast FP. (That is another reason why it's best to have one aspect settled in each option: because you don't have to merge information from different uses of the option.)
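
To make the "last option wins" expectation concrete, here is a minimal sketch of a left-to-right scan. It is not how the clang driver is implemented; `FPModel`, `resolveModel`, and `checkLastWins` are hypothetical names, and treating `-ffast-math` as a synonym for the fast model is the assumption being discussed in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class FPModel { Precise, Strict, Fast };

// Scans the arguments left to right; each recognized option overwrites the
// previous choice, so the rightmost one decides the result.
FPModel resolveModel(const std::vector<std::string> &args) {
  FPModel model = FPModel::Precise;  // assumed default, per the patch summary
  for (const std::string &arg : args) {
    if (arg == "-ffast-math" || arg == "-fp-model=fast")
      model = FPModel::Fast;
    else if (arg == "-fp-model=strict")
      model = FPModel::Strict;
    else if (arg == "-fp-model=precise")
      model = FPModel::Precise;
    // Unrelated options are ignored in this sketch.
  }
  return model;
}

// Usage: the two orderings from the comment above give different results.
void checkLastWins() {
  std::vector<std::string> fastThenStrict = {"-ffast-math", "-fp-model=strict"};
  std::vector<std::string> strictThenFast = {"-fp-model=strict", "-ffast-math"};
  assert(resolveModel(fastThenStrict) == FPModel::Strict);
  assert(resolveModel(strictThenFast) == FPModel::Fast);
}
```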

At any rate, the documentation should be clear about how this interacts with -ffast-math. You might even consider merging this into the documentation for -ffast-math, or at least revising that option's documentation. Does -fp-model=fast cause __FAST_MATH__ to be defined?

Also, strictly speaking, this should be -ffp-model, right?

1324

These are exclusive, right? So the documentation should be <value>, not <values>.

clang/docs/UsersManual.rst
1305

I think the ICC interface includes the exception option for compatibility/consistency with Microsoft's /fp option. We can handle that in clang-cl. So, I agree that it makes sense to split that out in clang.

ICC's implementation of this actually has four dimensions, only two of which are being taken on here. Frankly, I think it's a bit of a mess. The core concept which I think we should bring into clang with this option is to have a single option that manages all the various settings to control floating point behavior to produce the primary expected modes of operation so users don't have to find all the flags and remember the default settings for each one.

The way I'd suggest this should work is that we provide just the primary "models" and allow other options to modify the base behavior, regardless of the order in which the options appear. So, for example,

-fp-model=precise -fp-speculation=safe

and

-fp-speculation=safe -fp-model=precise

would both mean the same thing: disable value-unsafe optimizations and prevent speculative execution of floating point operations. I don't know how painful that is from a driver perspective or how obvious it would be to "most users" but to me it seems to be the logical result of fp-model being an umbrella setting and other options being able to modify it.
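
One way to get that order independence is to resolve the options in two passes: first pick the base model, then apply the modifiers on top of it. The sketch below assumes exactly that; `Model`, `Speculation`, `Settings`, `takeValue`, `resolve`, and `checkOrderIndependent` are hypothetical names, and the base speculation implied by each model is an assumption made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Model { Precise, Strict, Fast };
enum class Speculation { Fast, Safe, Strict };

struct Settings {
  Model model;
  Speculation speculation;
};

bool operator==(const Settings &a, const Settings &b) {
  return a.model == b.model && a.speculation == b.speculation;
}

// True when `arg` starts with `prefix`; `value` then receives the remainder.
static bool takeValue(const std::string &arg, const std::string &prefix,
                      std::string &value) {
  if (arg.compare(0, prefix.size(), prefix) != 0)
    return false;
  value = arg.substr(prefix.size());
  return true;
}

Settings resolve(const std::vector<std::string> &args) {
  std::string v;
  // Pass 1: the last -fp-model= selects the base model.
  Model model = Model::Precise;
  for (const std::string &arg : args) {
    if (!takeValue(arg, "-fp-model=", v)) continue;
    if (v == "precise") model = Model::Precise;
    else if (v == "strict") model = Model::Strict;
    else if (v == "fast") model = Model::Fast;
  }
  // Base speculation implied by the model (an assumption of this sketch).
  Speculation spec =
      model == Model::Strict ? Speculation::Strict : Speculation::Fast;
  // Pass 2: the last -fp-speculation= modifies the base, wherever it appeared.
  for (const std::string &arg : args) {
    if (!takeValue(arg, "-fp-speculation=", v)) continue;
    if (v == "fast") spec = Speculation::Fast;
    else if (v == "safe") spec = Speculation::Safe;
    else if (v == "strict") spec = Speculation::Strict;
  }
  return {model, spec};
}

// Usage: both orderings from the example above resolve to the same settings.
void checkOrderIndependent() {
  std::vector<std::string> a = {"-fp-model=precise", "-fp-speculation=safe"};
  std::vector<std::string> b = {"-fp-speculation=safe", "-fp-model=precise"};
  assert(resolve(a) == resolve(b));
}
```

The cost of this design is that the driver can no longer treat the command line as a simple left-to-right stream, which is the tension with the usual "rightmost wins" rule.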

clang/docs/UsersManual.rst
1309

There's a bit of ambiguity here because FP contraction isn't an on/off switch in LLVM. It has three settings: on, off, and fast. What you've done in this patch sets it to 'on' for precise, 'off' for strict, and 'fast' for fast. That sounds reasonable, but it's not what ICC and MSVC do. ICC and MSVC both have a behavior equivalent to -ffp-contract=fast in the precise model.

The idea behind this is that FMA operations are actually more precise than the non-contracted operations. They don't always give the same result, but they give a more precise result. The problem with this is that if we adopt this approach it leaves us with no fp model that corresponds to the default compiler behavior if you don't specify an -fp-model at all.
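
A small numeric example shows why a contracted operation can differ from the separate multiply and add. The inputs below are chosen for illustration so that the product needs one more bit than a double holds; `Results`, `compareContraction`, and `checkContraction` are hypothetical names. The `volatile` store is there only to force the product to be rounded to double before the add, so the compiler cannot contract the expression itself. The sketch assumes IEEE 754 doubles and the default round-to-nearest mode.

```cpp
#include <cassert>
#include <cmath>

struct Results {
  double separate;  // multiply, round, then add
  double fused;     // single rounding via std::fma
};

Results compareContraction() {
  const double a = 1.0 + std::ldexp(1.0, -27);  // 1 + 2^-27
  const double b = 1.0 - std::ldexp(1.0, -27);  // 1 - 2^-27
  const double c = -1.0;

  // The exact product is 1 - 2^-54, which rounds to 1.0 in double.
  volatile double product = a * b;
  const double separate = product + c;

  // fma rounds only once, so the 2^-54 term survives.
  const double fused = std::fma(a, b, c);

  return {separate, fused};
}

// Usage: the two evaluation strategies disagree on these inputs.
void checkContraction() {
  const Results r = compareContraction();
  assert(r.separate == 0.0);
  assert(r.fused == -std::ldexp(1.0, -54));
}
```

Here the fused result is the exact answer, while the separate evaluation loses it entirely, which is the sense in which contraction is "more precise" without being value-identical.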

mibintc marked 3 inline comments as done.Mon, Sep 9, 8:39 AM
mibintc added inline comments.
clang/docs/UsersManual.rst
1305

Thanks for the review. I'm going to upload another patch which drops -fp-model=[no-]except. This will clean up the command line for the fp-model setting because now it cannot have 2 settings simultaneously. The new patch will drop the fp-speculation option, and add a new option fp-exception-behavior. The fp-exception-behavior option allows access to the "eb" exception behavior setting of the LLVM constrained floating point intrinsics. The patch is pseudo code at this point because I want to get @rjmccall's response to this proposal before finalizing. Since fp-model is an umbrella option, there are conflicts between it and existing options. I added pseudo code into RenderFloatingPointOptions to detect and report the conflicts, and rewrote the part that detects inter-option conflicts.

mibintc updated this revision to Diff 219364.Mon, Sep 9, 8:51 AM
mibintc retitled this revision from [RFC] Add support for options -fp-model= and -fp-speculation= : specify floating point behavior to [RFC] Add support for options -frounding-math -fp-model= and -fp-exception-behavior= : specify floating point behavior.
mibintc edited the summary of this revision. (Show Details)

Addressed comments from @rjmccall and @andrew.w.kaylor. Added -frounding-math option and -ffp-exception-behavior= option. Dropped -ffp-speculation= option. Added details about how the new options conflict with existing floating point options. Rewrote RenderFloatingPointOptions using pseudo code to show how option conflicts will be detected.

mibintc retitled this revision from [RFC] Add support for options -frounding-math -fp-model= and -fp-exception-behavior= : specify floating point behavior to [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior.Mon, Sep 9, 8:52 AM

I think this is a step in the right direction, thank you. I'd like @scanon to weigh in on the evolving design here.

clang/docs/UsersManual.rst
1314

What you should document here are the semantics and how the option interacts with other options, not how code gets translated into LLVM. I'm not sure what the FIXME question here is; are you asking whether providing -frounding-math should *imply* an FP model?

The notes about each of the options should probably be structured into a bullet list.

1336

This is basically incomprehensible. :) I don't know if the problem is the behavior or just how it's being described, but I have no idea what "conflict" means — does it mean the option gets overridden, ignored, or causes an error? I think what you're trying to say is:

  • Basic FP behavior can be broken down along two dimensions: the FP strictness model and the FP exceptions model.
  • There are many existing options for controlling FP behavior.
  • Some of these existing options are equivalent to setting one (or both?) of these dimensions. These options should generally be treated as synonyms for the purposes of deciding the ultimate setting; for example, -ffp-model=fast -fno-fast-math should basically leave the setting in its default state (right?).
  • Other existing options only make sense in combination with certain basic models. For example, -ffp-contract=fast (note the spelling) is only allowed when using the fast FP model (right?).

As a specific note, you break out the options into a list below; the entry for fast is the place to add things like "Equivalent to -ffast-math, including defining __FAST_MATH__".

mibintc marked 2 inline comments as done.Mon, Sep 9, 1:28 PM
mibintc added inline comments.
clang/docs/UsersManual.rst
1314

I'll remove the FIXME and assert that frounding-math uses dynamic-rounding and strict exception behavior. This will make frounding-math synonymous with fp-model=strict. I'll reformat to put notes into bullet lists.

1336

Conflict was a poor choice of words. I meant to say that the umbrella options like fp-model=strict overlap with some of the other floating-point settings; in that case the rightmost option takes precedence and overrides the setting. I want the new options to behave in the same way as other clang options: the rightmost option has precedence.
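
As an illustration of an umbrella option overlapping with an individual setting, the sketch below tracks only the FP contraction mode. It uses the per-model contraction values described earlier in this review (on for precise, off for strict, fast for fast) and lets `-ffp-contract=` override them when it appears later. `Contract`, `resolveContract`, and `checkOverlap` are hypothetical names, and the starting value is an assumption; this is not the driver code.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Contract { Off, On, Fast };

// Each option that touches contraction overwrites the current value, whether
// it is the umbrella option or the dedicated one; the rightmost one wins.
Contract resolveContract(const std::vector<std::string> &args) {
  Contract mode = Contract::On;  // assumed starting point (the precise model)
  for (const std::string &arg : args) {
    if (arg == "-fp-model=precise" || arg == "-ffp-contract=on")
      mode = Contract::On;
    else if (arg == "-fp-model=strict" || arg == "-ffp-contract=off")
      mode = Contract::Off;
    else if (arg == "-fp-model=fast" || arg == "-ffp-contract=fast")
      mode = Contract::Fast;
  }
  return mode;
}

// Usage: the same two options in opposite orders give different results.
void checkOverlap() {
  std::vector<std::string> modelFirst = {"-fp-model=strict", "-ffp-contract=fast"};
  std::vector<std::string> modelLast = {"-ffp-contract=fast", "-fp-model=strict"};
  assert(resolveContract(modelFirst) == Contract::Fast);
  assert(resolveContract(modelLast) == Contract::Off);
}
```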

Hmm, you know, there are enough different FP options that I think we should probably split them all out into their own section in the manual instead of just listing them under "code generation". That will also give us an obvious place to describe the basic model, i.e. all the stuff about it mostly coming down to different strictness and exception models. Could you prepare a patch that *just* does that reorganization without adding any new features, and then we can add the new options on top of that?

Yes I'll do that

I uploaded a patch to move floating point options to a new documentation section here, https://reviews.llvm.org/D67517