This is an archive of the discontinued LLVM Phabricator instance.

Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior
ClosedPublic

Authored by mibintc on May 31 2019, 7:01 AM.

Details

Summary

Intel would like to contribute a patch to implement support for these Intel- and Microsoft -fp options. This message is to describe the options and request feedback from the community.
-frounding-math (supported by gcc and ICC)
-fp-model=[precise|strict|fast] and -fp-exception-behavior=[ignore|maytrap|strict]

This contribution dovetails with the llvm patch "Teach the IRBuilder about constrained fadd and friends". The motivation for providing these is that having umbrella options such as -fp-model= to control most basic FP options is better and easier to understand for users.

The option settings -fp-model=[precise|strict|fast] are supported by both ICC and CL. The CL and ICC -fp-model option is documented on these pages:

https://docs.microsoft.com/en-us/cpp/build/reference/fp-specify-floating-point-behavior?view=vs-2019
https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-fp-model-fp

Currently, clang's default behavior corresponds to -fp-model=precise. Clang/llvm support for -fp-model=strict and
-fp-exception-behavior= was developed in the D53157 patch, and there is current llvm support for the fast settings by using the fast math flags llvm::FastMathFlags. Note: the clang-cl wrapper to support Microsoft options has simplified support for these options by mapping /fp-model=except to ftrapping-math, fp-mdel=fast to ffast-math, fp-model=precise and fp-model=strict to fno-fast-math (see clang/Driver/CLCompatOptions.td).

These are the settings for -fp-model=

precise - Disables optimizations that are not value-safe on floating-point data, although FP contraction is enabled.
strict - Enables precise and except, disables contractions (FMA), and enables pragma stdc fenv_access.   [Note: fenv_access not currently supported in clang]
fast - Equivalent to -ffast-math

What follows here is Microsoft /fp documentation from the msdn site. It's copied here solely for the purposes of saving
information for the future in case the information is removed from the msdn site:

/fp (Specify floating-point behavior)

Specifies how the compiler treats floating-point expressions, optimizations, and exceptions. The /fp options specify whether the generated code allows floating-point environment changes to the rounding mode, exception masks, and subnormal behavior, and whether floating-point status checks return current, accurate results. It controls whether the compiler generates code that maintains source operation and expression ordering and conforms to the standard for NaN propagation, or if it instead generates more efficient code that may reorder or combine operations and use simplifying algebraic transformations that are not allowed by the standard.

Syntax

/fp:[precise | strict | fast | except[-]]

Arguments

precise

By default, the compiler uses /fp:precise behavior.

Under /fp:precise the compiler preserves the source expression ordering and rounding properties of floating-point code when it generates and optimizes object code for the target machine. The compiler rounds to source code precision at four specific points during expression evaluation: at assignments, at typecasts, when a floating-point argument is passed to a function call, and when a floating-point value is returned from a function call. Intermediate computations may be performed at machine precision. Typecasts can be used to explicitly round intermediate computations.

The compiler does not perform algebraic transformations on floating-point expressions, such as reassociation or distribution, unless the transformation is guaranteed to produce a bitwise identical result.
Expressions that involve special values (NaN, +infinity, -infinity, -0.0) are processed according to IEEE-754 specifications. For example, x != x evaluates to true if x is NaN. Floating-point *contractions*, that is, machine instructions that combine floating-point operations, may be generated under /fp:precise.

The compiler generates code intended to run in the [default floating-point environment](#the-default-floating-point-environment) and assumes that the floating-point environment is not accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

If your floating-point code does not depend on the order of operations and expressions in your floating-point statements (for example, if you don't care whether a * b + a * c is computed as (b + c) * a or 2 * a as a + a), consider the [/fp:fast](#fast) option, which can produce faster, more efficient code. If your code both depends on the order of operations and expressions, and accesses or alters the floating-point environment (for example, to change rounding modes or to trap floating-point exceptions), use [/fp:strict](#strict).

strict

/fp:strict has behavior similar to /fp:precise, that is, the compiler preserves the source ordering and rounding properties of floating-point code when it generates and optimizes object code for the target machine, and observes the standard when handling special values. In addition, the program may safely access or modify the floating-point environment at runtime.

Under /fp:strict, the compiler generates code that allows the program to safely unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes. It rounds to source code precision at four specific points during expression evaluation: at assignments, at typecasts, when a floating-point argument is passed to a function call, and when a floating-point value is returned from a function call. Intermediate computations may be performed at machine precision. Typecasts can be used to explicitly round intermediate computations. The compiler does not perform algebraic transformations on floating-point expressions, such as reassociation or distribution, unless the transformation is guaranteed to produce a bitwise identical result. Expressions that involve special values (NaN, +infinity, -infinity, -0.0) are processed according to IEEE-754 specifications. For example, x != x evaluates to true if x is NaN. Floating-point contractions are not generated under /fp:strict.

/fp:strict is computationally more expensive than /fp:precise because the compiler must insert additional instructions to trap exceptions and allow programs to access or modify the floating-point environment at runtime. If your code doesn’t use this capability, but requires source code ordering and rounding, or relies on special values, use /fp:precise. Otherwise, consider using /fp:fast, which can produce faster and smaller code.

fast

The /fp:fast option allows the compiler to reorder, combine, or simplify floating-point operations to optimize floating-point code for speed and space. The compiler may omit rounding at assignment statements, typecasts, or function calls. It may reorder operations or perform algebraic transforms, for example, by use of associative and distributive laws, even if such transformations result in observably different rounding behavior. Because of this enhanced optimization, the result of some floating-point computations may differ from those produced by other /fp options. Special values (NaN, +infinity, -infinity, -0.0) may not be propagated or behave strictly according to the IEEE-754 standard. Floating-point contractions may be generated under /fp:fast. The compiler is still bound by the underlying architecture under /fp:fast, and additional optimizations may be available through use of the [/arch](arch-minimum-cpu-architecture.md) option.

Under /fp:fast, the compiler generates code intended to run in the default floating-point environment and assumes that the floating-point environment isn’t accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

/fp:fast is intended for programs that do not require strict source code ordering and rounding of floating-point expressions, and do not rely on the standard rules for handling special values such as NaN. If your floating-point code requires preservation of source code ordering and rounding, or relies on standard behavior of special values, use [/fp:precise](#precise). If your code accesses or modifies the floating-point environment to change rounding modes, unmask floating-point exceptions, or check floating-point status, use [/fp:strict](#strict).

except

The /fp:except option generates code to ensures that any unmasked floating-point exceptions are raised at the exact point at which they occur, and that no additional floating-point exceptions are raised. By default, the /fp:strict option enables /fp:except, and /fp:precise does not. The /fp:except option is not compatible with /fp:fast. The option can be explicitly disabled by us of /fp:except-.

Note that /fp:except does not enable any floating-point exceptions by itself, but it is required for programs to enable floating-point exceptions. See [_controlfp](../../c-runtime-library/reference/control87-controlfp-control87-2.md) for information on how to enable floating-point exceptions.

Remarks

Multiple /fp options can be specified in the same compiler command line. Only one of /fp:strict, /fp:fast, and /fp:precise options can be in effect at a time. If more than one of these options is specified on the command line, the later option takes precedence and the compiler generates a warning.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I think this is a step in the right direction, thank you. I'd like @scanon to weigh in on the evolving design here.

clang/docs/UsersManual.rst
1314

What you should document here are the semantics and how the option interacts with other options, not how code gets translated into LLVM. I'm not sure what the FIXME question here is; are you asking whether providing -frounding-math should *imply* an FP model?

The notes about each of the options should probably be structured into a bullet list.

1336

This is basically incomprehensible. :) I don't know if the problem is the behavior or just how it's being described, but I have no idea what "conflict" means — does it mean the option gets overridden, ignored, or causes an error? I think what you're trying to say is:

  • Basic FP behavior can be broken down along two dimensions: the FP strictness model and the FP exceptions model.
  • There are many existing options for controlling FP behavior.
  • Some of these existing options are equivalent to setting one (or both?) of these dimensions. These options should generally be treated as synonyms for the purposes of deciding the ultimate setting; for example, -ffp-model=fast -fno-fast-math should basically leave the setting in its default state (right?).
  • Other existing options only make sense in combination with certain basic models. For example, -ffp-contract=fast (note the spelling) is only allowed when using the fast FP model (right?).

As a specific note, you break out the options into a list below; the entry for fast is the place to add things like "Equivalent to -ffast-math, including defining __FAST_MATH__)".

mibintc marked 2 inline comments as done.Sep 9 2019, 1:28 PM
mibintc added inline comments.
clang/docs/UsersManual.rst
1314

I'll remove the FIXME and assert that frounding-math uses dynamic-rounding and strict exception behavior. This will make frounding-math synonymous with fp-model=strict. I'll reformat to put notes into bullet lists.

1336

Conflict was a poor choice of words. I meant to say that the umbrella options like fp-model=strict overlap with some of the other floating-point settings, in that case the rightmost option takes precedence and overrides the setting. I want the new options to behave in the same way that other clang options: rightmost option has precedence.

Hmm, you know, there are enough different FP options that I think we should probably split them all out into their own section in the manual instead of just listing them under "code generation". That will also give us an obvious place to describe the basic model, i.e. all the stuff about it mostly coming down to different strictness and exception models. Could you prepare a patch that *just* does that reorganization without adding any new features, and then we can add the new options on top of that?

Hmm, you know, there are enough different FP options that I think we should probably split them all out into their own section in the manual instead of just listing them under "code generation". That will also give us an obvious place to describe the basic model, i.e. all the stuff about it mostly coming down to different strictness and exception models. Could you prepare a patch that *just* does that reorganization without adding any new features, and then we can add the new options on top of that?

Yes I'll do that

Hmm, you know, there are enough different FP options that I think we should probably split them all out into their own section in the manual instead of just listing them under "code generation". That will also give us an obvious place to describe the basic model, i.e. all the stuff about it mostly coming down to different strictness and exception models. Could you prepare a patch that *just* does that reorganization without adding any new features, and then we can add the new options on top of that?

I uploaded a patch to move floating point options to a new documentation section here, https://reviews.llvm.org/D67517

mibintc updated this revision to Diff 223090.Oct 3 2019, 2:14 PM

In the previous review, @rjmccall asked me to redo the existing floating point option documentation before submitting this patch. I got the floating point documentation update committed, and I've worked on this patch more to get the floating point "render options" checking implemented. This patch needs more test cases, and there might be a bug or 2 in the "render options" checking. I wanted to show you this work especially to get your reaction to the changes to the floating point options

This patch adds support for frounding-math and ftrapping-math and new options fp-model= and fp-exception-behavior=; fp-model is an "umbrella" option.

Thank you, this looks very clean now.

clang/docs/UsersManual.rst
1139

"represent *the* corresponding IEEE rounding rules"

1151

"provided by other, single-purpose floating point options."

1162

That's not typical driver behavior; why this choice?

simoll added a subscriber: simoll.Oct 8 2019, 10:52 AM
mibintc updated this revision to Diff 223908.Oct 8 2019, 11:24 AM

I made a couple wording changes suggested by @rjmccall

mibintc marked 2 inline comments as done.Oct 8 2019, 11:26 AM
mibintc added inline comments.
clang/docs/UsersManual.rst
1162

The rationale for the warnings is that the floating point options are sufficiently complicated that it makes sense to warn the uses that one of the later options supplied on the command line is undoing a choice made earlier. It's not obvious that e.g. the setting for fassociative-math is also controlled by -fp-model=strict

mibintc marked 2 inline comments as done.Oct 8 2019, 11:35 AM
mibintc added inline comments.
clang/include/clang/Driver/Options.td
918

The ffp-model= option is just a Driver option, it is rewritten into combinations of lower level options like ffp-exception-behavior and frounding-math: it's not a cc1 option.

clang/lib/Driver/ToolChains/Clang.cpp
2463–2464

By default, floating point exceptions are masked. Previously this was set to true, but the value wasn't used. This patch implements support for trapping-math

mibintc updated this revision to Diff 223911.Oct 8 2019, 11:36 AM

clean up some dead code

mibintc updated this revision to Diff 223940.Oct 8 2019, 2:03 PM

I added a test case to show the warning diagnostics when options conflicting with fp-model are provided. I fixed a couple bugs in RenderFloatingPointOptions when issueing diagnostics. still owe a test case showing how the fp-model, rounding, and trapping options are rendered by the Driver for cc1

rjmccall added inline comments.Oct 8 2019, 8:28 PM
clang/docs/UsersManual.rst
1151

I don't know why you keep including "clang" as a modifier here; this is the clang documentation, and all of these options are clang options no matter where they might have been borrowed from.

1162

Okay. Well, it's a new option, so new behavior is alright, but if you're worried about the collisions having arbitrary effects that you'll have to maintain compatibility with, you should consider making it an error instead, because a warning still means it's permitted.

mibintc updated this revision to Diff 224136.Oct 9 2019, 1:11 PM
mibintc retitled this revision from [RFC] Add support for options -frounding-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior to Add support for options -frounding-math, ftrapping-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior.

I added a new test case fp-model.c to test RenderFloatingPointOptions, I also fixed a few issues that I spotted while working through this test case. I responded to couple documentation comments from @rjmccall I still owe a more deluxe version of the test fp-constrained.c to be sure all the option values come through as expected

mibintc marked 5 inline comments as done.Oct 9 2019, 1:18 PM
mibintc added inline comments.
clang/docs/UsersManual.rst
1151

thanks for explicitly pointing out use of 'clang', i fixed it

1162

@andrew.w.kaylor What do you think about making the diagnostics error vs. warning?

clang/include/clang/Basic/LangOptions.h
187

Currently there's no way to get at any of these values besides ToNearest and Dynamic, but I put all the supported values here to support future work

203

-fno-trapping-math implemented by selecting -ffp-exception-behavior=ignore and -ftrapping-math is implemented by selecting -ffp-exception-behavior=strict. What do you think about making ftrapping-math a Driver only option, so that Driver converts the values like this. Otherwise let's make fp-exception-behavior take precedence, in llvm, over ftrapping-math (trapping math is t/f but exception behavior, in the llvm Constrained Floating Point Intrinsics, can take 3 values)

mibintc marked an inline comment as done.Oct 10 2019, 8:07 AM

I looked over the codegen testcase fpconstrained.c and it looks pretty good, so i think this is ready for your review comments. I'll be off the grid for a couple days but looking forward to receiving your feedback.

I inserted a couple of inline remarks in the code review to highlight some areas and questions.

rjmccall added inline comments.Oct 16 2019, 3:22 PM
clang/include/clang/Basic/LangOptions.h
203

If your new option subsumes existing ones, I think making it the frontend option is sensible.

Ping. Hoping for code review so I can move this forward. Affirmative or negative, please let me know. Thank you! --Melanie

rjmccall added inline comments.Oct 18 2019, 11:26 AM
clang/docs/UsersManual.rst
1139

A few points about this documentation that occurred to me since the last time I looked at it:

  • It's weird to talk about LLVM here, since this is the Clang documentation. Clang's behavior is not specified in terms of the IR it generates; it's specified in terms of the formal behavior of the source code. Therefore this documentation should talk about things using concepts from an appropriate language standard whenever possible; in this case, C99 works.
  • It's weird to bring up all these different rounding modes when this option doesn't actually let you do anything with them. If you want to talk about rounding modes in general that's fine as a way of informing the programmer, but we shouldn't give them information they can't use.
  • I don't think -fno-rounding-math is actually equivalent to forcing the use of the tonearest rounding mode; I think it *assumes* that the rounding mode is set to tonearest. (Or am I wrong and this is actually guaranteed by ABI?)
  • I don't think we want to *define* -frounding-math as exactly equivalent to -ffp-model=strict. That might be a convenient implementation for now, but it seems to me that -frounding-math still allows some optimizations that -ffp-model=strict wouldn't.

With that in mind, I'd suggest something like this:

Force floating-point operations to honor the dynamically-set rounding mode by default.

The result of a floating-point operation often cannot be exactly represented in the result type and therefore must be rounded. IEEE 754 describes different rounding modes that control how to perform this rounding, not all of which are supported by all implementations. C provides interfaces (`fesetround and fesetenv`) for dynamically controlling the rounding mode, and while it also recommends certain conventions for changing the rounding mode, these conventions are not typically enforced in the ABI. Since the rounding mode changes the numerical result of operations, the compiler must understand something about it in order to optimize floating point operations.

Note that floating-point operations performed as part of constant initialization are formally performed prior to the start of the program and are therefore not subject to the current rounding mode. This includes the initialization of global variables and local `static variables. Floating-point operations in these contexts will be rounded using FE_TONEAREST`.

  • The option `-fno-rounding-math allows the compiler to assume that the rounding mode is set to FE_TONEAREST`. This is the default.
  • The option `-frounding-math` forces the compiler to honor the dynamically-set rounding mode. This prevents optimizations which might affect results if the rounding mode changes or is different from the default; for example, it prevents floating-point operations from being reordered across most calls and prevents constant-folding when the result is not exactly representable.
mibintc added inline comments.Oct 20 2019, 6:29 PM
clang/docs/UsersManual.rst
1139

Thank you, I will work on another patch

mibintc updated this revision to Diff 226738.Oct 28 2019, 1:19 PM

I adopted the language that @rjmccall recommended for documenting frounding-math., also adding a sentence to describe effects on exception behavior control.

Is the exception-strictness of -frounding-math actually considered to be specified behavior, or is it just a consequence of the current implementation? There are definitely some optimizations that we can't do under strict FP exceptions that we can still do in principle while respecting a dynamic FP rounding mode; for example, the rounding mode can only be changed by a call (or inline assembly), so you can still reorder FP operations around "lesser" side effects, like stores. We can document it even if it's not required behavior, but we should be clear about what it is.

My suggested wording started with a sentence briefly summarizing what the option did that I think you accidentally dropped.

mibintc updated this revision to Diff 226907.Oct 29 2019, 8:13 AM

In response to comments from @rjmccall I inserted into the UsersManual the one-line summary of frounding-math that had been omitted and changed the semantics of frounding-math to not also set exception-behavior to strict

Is the exception-strictness of -frounding-math actually considered to be specified behavior, or is it just a consequence of the current implementation? There are definitely some optimizations that we can't do under strict FP exceptions that we can still do in principle while respecting a dynamic FP rounding mode; for example, the rounding mode can only be changed by a call (or inline assembly), so you can still reorder FP operations around "lesser" side effects, like stores. We can document it even if it's not required behavior, but we should be clear about what it is.

I had thought that it was intended behavior, but I re-checked my notes and realize I was wrong about that. So I've changed the document and the driver, and updated the test. Thanks again for your careful reading.

My suggested wording started with a sentence briefly summarizing what the option did that I think you accidentally dropped.

Yes, it's there now.

Thanks. A few things about the functionality parts of the patch now.

clang/include/clang/Basic/CodeGenOptions.def
238 ↗(On Diff #226907)

Why do we need both a code-gen option and a language option?

clang/include/clang/Basic/LangOptions.h
346

Everything here is a "setting", and in the context of this type they're all FP. Please name these methods something like getRoundingMode().

Does this structure really need to exist as opposed to tracking the dimensions separately? Don't we already track some of this somewhere? We should subsume that state into these values rather than tracking them separately.

clang/lib/CodeGen/CodeGenFunction.cpp
108

Code style: please use () instead of (void), and please place open-braces on the same line as the declaration.

clang/lib/CodeGen/CodeGenFunction.h
4152

Don't use (void), please.

mibintc added inline comments.Oct 29 2019, 12:54 PM
clang/include/clang/Basic/CodeGenOptions.def
238 ↗(On Diff #226907)

The main reason i added it to LangOptions.h is because I saw the FPContract support in there and I thought I'd get on that bandwagon. My ultimate goal, after committing the command line options, is to add support for controlling rounding mode and exception behavior with pragma's embedded in the functions, similar to https://reviews.llvm.org/D69272.

There's a patch here that I like, to add rounding-mode and exception-behavior to FPOptions https://reviews.llvm.org/D65994, but it hasn't been committed yet.

mibintc updated this revision to Diff 227283.Oct 31 2019, 8:23 AM

I followed up on some code review remarks from @rjmccall. I dropped the CODEGEN option and fixed some code formatting. I changed the spelling of the enumeration values for RoundingMode and ExceptionMode to match those proposed in https://reviews.llvm.org/D65994

mibintc marked 5 inline comments as done.Oct 31 2019, 8:27 AM
mibintc added inline comments.
clang/include/clang/Basic/CodeGenOptions.def
238 ↗(On Diff #226907)

I dropped the code-gen option.

clang/include/clang/Basic/LangOptions.h
346

I fixed the spelling, I also dropped the structure and used the ENUM_OPT macro instead of writing out the setter and getter. Look OK now?

Yes, thanks, looks a lot better. Just a few tweaks now.

clang/include/clang/Basic/LangOptions.h
325

Spurious change.

clang/include/clang/Driver/Options.td
1139

It looks like both of these can now be written with BooleanFFlag.

clang/lib/CodeGen/CodeGenFunction.cpp
148

Please make functions that do these translations, and please make them use exhaustive switches with llvm_unreachable at the end.

clang/test/Driver/clang_f_opts.c
323 ↗(On Diff #227283)

Looks like the intent of this test is that you pull this to the lines above, to test that we don't emit an error on it. You should also test -ffp-model.

mibintc marked an inline comment as done.Oct 31 2019, 12:12 PM
mibintc added inline comments.
clang/include/clang/Driver/Options.td
1139

BooleanFFlag doesn't work, there's a FIXME message saying that prefixes don't work, currently they are only being used for unimplemented options.
llvm/clang/lib/Driver/ToolChains/Clang.cpp:2301:17: error: ‘OPT_frounding_math’ is not a member of ‘clang::driver::options’

optID = options::OPT_frounding_math;
        ^
mibintc updated this revision to Diff 227470.Nov 1 2019, 9:51 AM

Respond to recent code review from @rjmccall ; I modified the test cases and added functions for translating between the LangOptions enumeration and llvm enumeration for rounding-mode and exception-behavior. I wasn't able to use BooleanFFlag because at the moment that is only usable for unsupported options.

mibintc added inline comments.Nov 1 2019, 9:52 AM
clang/lib/CodeGen/CodeGenFunction.cpp
133

I added these 2 functions, is this what you have in mind or do you want me to write them differently?

rjmccall added inline comments.Nov 1 2019, 1:15 PM
clang/lib/CodeGen/CodeGenFunction.cpp
133

Slightly differently, yes, please.

static llvm::ConstrainedFPIntrinsic::ExceptionBehavior getConstrainedExceptionBehavior(LangOptions;:FPExceptionModeKind kind) {
  switch (kind) {
  case LangOptions::FPE_Ignore:
    return llvm::ConstrainedFPIntrinsic::ebIgnore;
  // ...rest of cases here...
  // no default: should be exhaustive over the enum
  }
  llvm_unreachable("bad kind");
}
mibintc updated this revision to Diff 227527.Nov 1 2019, 1:44 PM

Recoded ToConstrainedRoundingMD and ToConstrainedExceptionMD as requested by @rjmccall

rjmccall added inline comments.Nov 1 2019, 1:50 PM
clang/lib/CodeGen/CodeGenFunction.cpp
137

Sorry for dragging this out, but is there a reason these need to be member functions on CodeGenFunction rather than just static functions in this file?

mibintc updated this revision to Diff 227532.Nov 1 2019, 2:03 PM

Made a couple functions static per @rjmccall request

mibintc marked an inline comment as done.Nov 1 2019, 2:04 PM
mibintc added inline comments.
clang/lib/CodeGen/CodeGenFunction.cpp
133

sorry i missed that detail (static) the first time around

rjmccall accepted this revision.Nov 4 2019, 9:40 AM

LGTM; thanks for your patience during all the rounds of review.

This revision is now accepted and ready to land.Nov 4 2019, 9:40 AM
mibintc updated this revision to Diff 227960.Nov 5 2019, 1:47 PM
mibintc retitled this revision from Add support for options -frounding-math, ftrapping-math, -fp-model=, and -fp-exception-behavior=, : Specify floating point behavior to Add support for options -frounding-math, -ftrapping-math, -ffp-model=, and -ffp-exception-behavior=, : Specify floating point behavior.

When doing final testing before commit, I used the Debug build with assertions enabled and found that a couple tests failed with assertion in Clang.cpp. I made some modifications there so the assert wouldn't be triggered. Also I fixed some spelling errors in the comments. (option name misspelled: missing the first f)

when "make check-all" with Debug build, I see lit test failure llvm/test/Object/macho-invalid.test; I think that fail is not related to my change.

@rjmccall thanks for all your help developing this patch

Looks okay to me.

mibintc closed this revision.Nov 11 2019, 10:44 AM

Don't know why the commit id didn't get linked when I pushed the change. Here's the closure info:

commit af57dbf12e54f3a8ff48534bf1078f4de104c1cd
Author: Melanie Blower <melanie.blower@intel.com>
Date: Tue Nov 5 13:41:21 2019 -0800

Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=
jgorbe added a subscriber: jgorbe.Nov 18 2019, 1:02 AM

Hi,

I found a clang crash that seems to be caused by this patch. Here's a reduced test case:

template <class>
class a {
 public:
  ~a();
  void b();
};

template <class c>
a<c>::~a() try {
  b();
} catch (...) {
}

class d {
 public:
  d(const char *, int);
  a<int> e;
}

d("", 1);

Building it with clang -c -frounding-math results in the following crash:

Stack dump:
0.      Program arguments: /usr/local/google/home/jgorbe/code/llvm-build/bin/clang-10 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-fi
le-name repro.cc -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -frounding-math -masm-verbose -mconstructor-aliases -munwind-tables -fu
se-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -resource-dir /usr/local/google/home/jgorbe/code/llvm-build/lib/clang/10.0.0 -internal-isystem 
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem
 /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -intern
al-isystem /usr/local/include -internal-isystem /usr/local/google/home/jgorbe/code/llvm-build/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-lin
ux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir /usr/local/google/home/jgorbe/repro4 -ferror
-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -faddrsig -o /tmp/repro
-c9cae0.o -x c++ repro.cc 
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      repro.cc:4:3: Generating code for declaration 'a<int>::~a'
 #0 0x0000000007fb2b27 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /usr/local/google/home/jgorbe/code/llvm/llvm/lib/Support/Unix/Signals.inc:532:11 
 #1 0x0000000007fb2cc9 PrintStackTraceSignalHandler(void*) /usr/local/google/home/jgorbe/code/llvm/llvm/lib/Support/Unix/Signals.inc:593:1             
 #2 0x0000000007fb160b llvm::sys::RunSignalHandlers() /usr/local/google/home/jgorbe/code/llvm/llvm/lib/Support/Signals.cpp:67:5                        
 #3 0x0000000007fb3415 SignalHandler(int) /usr/local/google/home/jgorbe/code/llvm/llvm/lib/Support/Unix/Signals.inc:384:1                        
 #4 0x00007f6789ec03a0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x123a0)                                                                                     
 #5 0x000000000739af45 llvm::AttributeList::hasFnAttribute(llvm::Attribute::AttrKind) const /usr/local/google/home/jgorbe/code/llvm/llvm/lib/IR/Attributes.cpp:1315:10
 #6 0x00000000051a1964 llvm::Function::hasFnAttribute(llvm::Attribute::AttrKind) const /usr/local/google/home/jgorbe/code/llvm/llvm/include/llvm/IR/Function.h:324:5    
 #7 0x00000000052421a1 llvm::IRBuilderBase::setConstrainedFPFunctionAttr() /usr/local/google/home/jgorbe/code/llvm/llvm/include/llvm/IR/IRBuilder.h:262:9
 #8 0x0000000005241b60 llvm::IRBuilderBase::setConstrainedFPCallAttr(llvm::CallInst*) /usr/local/google/home/jgorbe/code/llvm/llvm/include/llvm/IR/IRBuilder.h:271:3
 #9 0x00000000084a516b llvm::IRBuilder<llvm::ConstantFolder, clang::CodeGen::CGBuilderInserter>::CreateCall(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::MDNode*) /usr/local/google/home/jgorbe/code/llvm/llvm/include/llvm/IR/IRBuilder.h:2274:9                                                                                                                                                                 
#10 0x00000000084a4488 llvm::IRBuilder<llvm::ConstantFolder, clang::CodeGen::CGBuilderInserter>::CreateCall(llvm::FunctionCallee, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&, llvm::MDNode*) /usr/local/google/home/jgorbe/code/llvm/llvm/include/llvm/IR/IRBuilder.h:2288:5
#11 0x000000000849288c clang::CodeGen::CodeGenFunction::EmitRuntimeCall(llvm::FunctionCallee, llvm::ArrayRef<llvm::Value*>, llvm::Twine const&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCall.cpp:3726:26
#12 0x000000000849278d clang::CodeGen::CodeGenFunction::EmitNounwindRuntimeCall(llvm::FunctionCallee, llvm::ArrayRef<llvm::Value*>, llvm::Twine const&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCall.cpp:3691:19                                                                                                            
#13 0x000000000897b3e4 (anonymous namespace)::ItaniumCXXABI::emitTerminateForUnexpectedException(clang::CodeGen::CodeGenFunction&, llvm::Value*) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/ItaniumCXXABI.cpp:4364:5                       
#14 0x000000000865698d clang::CodeGen::CodeGenFunction::getTerminateHandler() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGException.cpp:1504:19
#15 0x00000000086561de clang::CodeGen::CodeGenFunction::getEHDispatchBlock(clang::CodeGen::EHScopeStack::stable_iterator) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGException.cpp:630:21                                                                                                                                      
#16 0x000000000864e7d8 clang::CodeGen::CodeGenFunction::PopCleanupBlock(bool) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCleanup.cpp:970:23
#17 0x000000000864d053 clang::CodeGen::CodeGenFunction::PopCleanupBlocks(clang::CodeGen::EHScopeStack::stable_iterator, std::initializer_list<llvm::Value**>) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCleanup.cpp:423:3
#18 0x000000000864eae3 clang::CodeGen::CodeGenFunction::PopCleanupBlocks(clang::CodeGen::EHScopeStack::stable_iterator, unsigned long, std::initializer_list<llvm::Value**>) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCleanup.cpp:479:19                                                                                     
#19 0x000000000864404d clang::CodeGen::CodeGenFunction::RunCleanupsScope::ForceCleanup(std::initializer_list<llvm::Value**>) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenFunction.h:756:11
#20 0x0000000008655cd4 clang::CodeGen::CodeGenFunction::ExitCXXTryStmt(clang::CXXTryStmt const&, bool) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGException.cpp:1234:16
#21 0x000000000868bf3c clang::CodeGen::CodeGenFunction::EmitDestructorBody(clang::CodeGen::FunctionArgList&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/C
GClass.cpp:1531:1
#22 0x0000000008673ce5 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenFunction.cpp:1256:5
#23 0x00000000087aea54 clang::CodeGen::CodeGenModule::codegenCXXStructor(clang::GlobalDecl) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CGCXX.cpp:214:3
#24 0x0000000008980a5e (anonymous namespace)::ItaniumCXXABI::emitCXXStructor(clang::GlobalDecl) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/ItaniumCXXABI.cpp:4003:19
#25 0x000000000853c297 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:2816:9       
#26 0x0000000008532fb9 clang::CodeGen::CodeGenModule::EmitDeferred() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:2132:5                 
#27 0x0000000008533004 clang::CodeGen::CodeGenModule::EmitDeferred() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:2138:7
#28 0x0000000008533004 clang::CodeGen::CodeGenModule::EmitDeferred() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:2138:7
#29 0x0000000008533004 clang::CodeGen::CodeGenModule::EmitDeferred() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:2138:7
#30 0x0000000008531692 clang::CodeGen::CodeGenModule::Release() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenModule.cpp:393:3
#31 0x0000000008e96d12 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/ModuleBuilder.cpp:0:18                                                                                                                                                     
#32 0x0000000008e90d99 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenAction.cpp:242:14                                                                                                                                                                     
#33 0x000000000af2c3ce clang::ParseAST(clang::Sema&, bool, bool) /usr/local/google/home/jgorbe/code/llvm/clang/lib/Parse/ParseAST.cpp:178:12                        
#34 0x0000000008cea5c2 clang::ASTFrontendAction::ExecuteAction() /usr/local/google/home/jgorbe/code/llvm/clang/lib/Frontend/FrontendAction.cpp:1044:1                   
#35 0x0000000008e8e72b clang::CodeGenAction::ExecuteAction() /usr/local/google/home/jgorbe/code/llvm/clang/lib/CodeGen/CodeGenAction.cpp:1089:1                         
#36 0x0000000008ce9f88 clang::FrontendAction::Execute() /usr/local/google/home/jgorbe/code/llvm/clang/lib/Frontend/FrontendAction.cpp:939:7                             
#37 0x0000000008c193b3 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /usr/local/google/home/jgorbe/code/llvm/clang/lib/Frontend/CompilerInstance.cpp:964:23                                                                                                                                                                    
#38 0x0000000008e78e19 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /usr/local/google/home/jgorbe/code/llvm/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:290:8                                                                        
#39 0x000000000517a8d2 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /usr/local/google/home/jgorbe/code/llvm/clang/tools/driver/cc1_main.cpp:250:13         
#40 0x000000000516e5ff ExecuteCC1Tool(llvm::ArrayRef<char const*>, llvm::StringRef) /usr/local/google/home/jgorbe/code/llvm/clang/tools/driver/driver.cpp:309:5         
#41 0x000000000516d9c9 main /usr/local/google/home/jgorbe/code/llvm/clang/tools/driver/driver.cpp:382:5                                                                 
#42 0x00007f678914e52b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2352b)
#43 0x000000000516d1ba _start (/usr/local/google/home/jgorbe/code/llvm-build/bin/clang-10+0x516d1ba)                                                           
clang-10: error: unable to execute command: Segmentation fault                                                                                                          
clang-10: error: clang frontend command failed due to signal (use -v to see invocation)                                                                                 
clang version 10.0.0 (https://github.com/llvm/llvm-project.git f2e65447b3cb6340883957e033e77095a025ebdc)

Thanks I see it, I'm working on a patch. Previously there was no support for frounding-math (unimplemented). This patch enables the option. In the IR builder, there's a call to a runtime function in the exception handler which is unexpectedly null. I start by adding a null pointer check.

Thanks I see it, I'm working on a patch. Previously there was no support for frounding-math (unimplemented). This patch enables the option. In the IR builder, there's a call to a runtime function in the exception handler which is unexpectedly null. I start by adding a null pointer check.

Had a crash on valid here for days, let's revert and then get a fix when recommitting. I'll respond to the thread when reverting. Thanks :)

The incorrect code is actually in the IRBuilder which is part of a different patch...

Thanks I see it, I'm working on a patch. Previously there was no support for frounding-math (unimplemented). This patch enables the option. In the IR builder, there's a call to a runtime function in the exception handler which is unexpectedly null. I start by adding a null pointer check.

Had a crash on valid here for days, let's revert and then get a fix when recommitting. I'll respond to the thread when reverting. Thanks :)

Thanks I see it, I'm working on a patch. Previously there was no support for frounding-math (unimplemented). This patch enables the option. In the IR builder, there's a call to a runtime function in the exception handler which is unexpectedly null. I start by adding a null pointer check.

Had a crash on valid here for days, let's revert and then get a fix when recommitting. I'll respond to the thread when reverting. Thanks :)

@echristo I just saw the bug was reported today, is the "crash on valid" visible on the bots? Can you provide url showing same? Thanks
I opened https://bugs.llvm.org/show_bug.cgi?id=44048 for @jgorbe

mibintc reopened this revision.Nov 18 2019, 12:00 PM

Reopening since patch was reverted

This revision is now accepted and ready to land.Nov 18 2019, 12:00 PM
mibintc updated this revision to Diff 229893.Nov 18 2019, 12:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 18 2019, 12:02 PM

I added a nullptr check in IRBuilder.h and a test case to cover the segfault reported in https://bugs.llvm.org/show_bug.cgi?id=44048

kpn added a comment.Nov 18 2019, 12:18 PM

Does anyone think a warning is appropriate because the new flags are exercising experimental, incomplete code in both clang and llvm? The warning would be removed when we believe the feature is complete and ready to use.

llvm/include/llvm/IR/IRBuilder.h
262 ↗(On Diff #229893)

This looks reasonable to me.

It smells like there's a larger strictfp IRBuilder issue, but that's not an issue for this patch here. The larger issue won't be hit since the new options affect the entire compilation. It therefore shouldn't block this patch.

rjmccall added inline comments.Nov 18 2019, 12:23 PM
llvm/include/llvm/IR/IRBuilder.h
262 ↗(On Diff #229893)

Does IRBuilder actually support inserting into an unparented basic block? I feel like this is exposing a much more serious mis-use of IRBuilder.

In D62731#1750412, @kpn wrote:

Does anyone think a warning is appropriate because the new flags are exercising experimental, incomplete code in both clang and llvm? The warning would be removed when we believe the feature is complete and ready to use.

@kpn Can you say more about "incomplete code in ... clang". I don't know what's missing from clang.

kpn added a comment.Nov 18 2019, 1:11 PM
In D62731#1750412, @kpn wrote:

Does anyone think a warning is appropriate because the new flags are exercising experimental, incomplete code in both clang and llvm? The warning would be removed when we believe the feature is complete and ready to use.

@kpn Can you say more about "incomplete code in ... clang". I don't know what's missing from clang.

See D70256. Calls to clang's math builtins that are llvm intrinsics need to be changed to be calls to the constrained intrinsics. I've been waiting to submit the patches because they weren't testable without these command line options.

There may be other issues as well. I'm not sure.

michele.scandale added inline comments.
clang/lib/Driver/ToolChains/Clang.cpp
2483

Isn't the default -fno-rounding-math?

2491

Shouldn't this be set to true similarly to what you do for TrappingMathPresent to track whether there is an explicit option related to rounding math?

2496

Running clang -### -ftrapping-math -ffp-exception-behavior=ignore lead to this assertion to fail. As far as I can see TrappingMath is not changed in the case FPExceptionBehavior is "ignore" or "maytrap".
Clearly in the "ignore" case it should be safe to just set TrappingMath to false, but I'm not sure about the "maytrap" case.
It seems that -ffp-exception-behavior is more general than -f{,no-}trapping-math, so it seems natural to me to see ftrapping-math and foo-trapping-math as aliases for ffp-exception-behavior=strict and ffp-exception-behavior=ignore respectively. If we agree on this, then I would expect the reasoning inside the compiler only in terms of FPExceptionBehavior.

2498

With this change if I run clang -### -ffast-math test.c I don't see -fno-trapping-math passed to the CC1.
This is changing the changes the value of the function level attribute "no-trapping-math" (see lib/CodeGen/CGCall.cpp : 1747).
Is this an intended change?

Moreover since with this patch the default value for trapping math changed, the "no-trapping-math" function level attribute is incorrect also for default case.

kpn added inline comments.Nov 19 2019, 6:06 AM
llvm/include/llvm/IR/IRBuilder.h
262 ↗(On Diff #229893)

I suspect you are correct. If we let this "F && " change go in we'll have a situation where whether or not a block is currently in a function when a function call is emitted will affect whether or not the eventual function definition gets the strictfp attribute. That seems like an unfortunate inconsistency.

I'm still looking into it. I hope to have an IRBuilder review up today or tomorrow.

mibintc marked 4 inline comments as done.Nov 19 2019, 12:02 PM

inline replies to Michele, will upload a new patch shortly

clang/lib/Driver/ToolChains/Clang.cpp
2483

Yes the default is no rounding math, I'll remove the comment. Thank you.

2491

There's a switch statement above this that interprets the command line option -fp-model=strict as though frounding had appeared on the command line by assigning a new value to optID so that's why there is a discrepancy. Also I'm using the *Present boolean variables to preserve the output from Driver so that pre-existing driver test cases don't need to be changed.

2496

Thanks for pointing out this assertion failure, I will upload a patch with fix. Yes we could entirely express ftrapping-math and fno-trapping-math via the ffp-exception-behavior= option. That would probably be better--currently the trapping option becomes effective via the exception behavior parameter to the llvm floating point constrained intrinsics, and it can take 3 values. I thought it would be too radical at the moment, so I didn't propose that in this patch.

In the patch I'm about to add, I added a test case for the assertion that you saw.

2498

Before this patch, ftrapping-math was added to the Driver and also a bitfield, `NoTrappingFPMath was created in the LLVM to describe the state of trapping-math, but otherwise that bit wasn't consulted and the option had no effect. Gcc describes ftrapping-math as the default, but in llvm by default floating point exceptions are masked and this corresponds to the floating point Constrained Intrinsics having exception behavior set to ignored. This patch changed the llvm constructor to set the trapping bit to "no trap". In fact I'd like to get rid of the NoTrappingFPMath` bitfield since it's not being used, but I didn't make that change at this point.

If I remember correctly, there are a bunch of driver tests that failed if fno-trapping-math is output to cc1. I'd have to reconstruct the details. Since fno-trapping-math is the default, it isn't passed through on the cc1 command line: the Clang.cpp driver doesn't pass through the positive and negative for each existing option.

Thanks for pointing out the line in CGCall.cpp, it seems the CodeGenOpts aren't getting set up perfectly I'll fix that in CompilerInvocation.cpp; I don't see anything setting trapping-math as part of function level attribute, @michele.scandale did I overlook that/can you point out where that is?

llvm/include/llvm/IR/IRBuilder.h
262 ↗(On Diff #229893)

As I just commented on the related patch @kpn posted, it appears that IRBuilder doesn't entirely support inserting into an unparented block. I was surprised by this, but there are places that need to be able to get to the Module from the BasicBlock. So, I think something problematic may be happening in the failing case.

mibintc updated this revision to Diff 230128.Nov 19 2019, 12:41 PM

Here's an update in response to comments from @michele.scandale
I fixed the assertion error and added a test case
I fixed the setting of ftrapping-math in CodeGenOpts
I deleted an incorrect comment
I added a diagnostic when -fp-exception-behavior= is overridden on the command line by f[no-]trapping-math
I updated one of the test cases to work with some small modifications that will be made to IRBuilder.h

clang/lib/Driver/ToolChains/Clang.cpp
2498

I guess you are referring to the code in TargetMachine.cpp where the function level attributes are used to reset the TargetOptions state whenever we initiate the backend codegen for a given function. Considering that the trapping math option as stated in the documentation did not have any effect, I'm not surprised to see not many uses. The only one I can see is in llvm/lib/Target/ARM/ARMAsmPrinter.cpp : 687 where the function level attribute affects the emission of some ARM specific attributes.

My only concern was that the change of the default value for trapping math was not propagated entirely causing this function level attribute to be initialized incorrectly.
Fixing the logic in CompilerInvocation.cpp considering the change of default it is fine for me.

Given that ffp-exception-behavior={ignore,maytrap,strict} supersedes -f{,no-}trapping-math I would expect long term to see the internal state of the compiler frontend to only care about the new state FPExceptionBehavior for both language and code generation options. And I guess the same would apply to the backend stage as well.

clang/lib/Driver/ToolChains/Clang.cpp
2504

Here it seems you are changing optID to OPT_ffast_math to reuse the logic specified below for that case to reset the state of the floating point options.

2512

Here the state of the floating point options seems unchanged except for FPContract. If I run clang -ffp-model=fast -ffp-model=precise, I would expect the state of the floating point options to match the one of -fno-fast-math except for FPContract which you want to be set to "fast".

I think you might need to replicate the reset for all the option here as well, so at this point I don't know how much worth is to use the optID reset trick for the "fast" case only.

mibintc updated this revision to Diff 231758.Dec 2 2019, 12:20 PM

This commit was reverted in 30e7ee3c4bac because a null deref error occurred in IRBuilder.h when setting strictfp attribute, see https://bugs.llvm.org/show_bug.cgi?id=44048 for information about that bug.
This patch moves setting strictfp from IRBuilder into clang/CodeGen. Also addresses some code review comments that were received after the revert. I'll add some inline comments next.

mibintc marked 9 inline comments as done.Dec 2 2019, 12:38 PM

I added inline comments describing what I did in this version of the patch to address the bug https://bugs.llvm.org/show_bug.cgi?id=44048

clang/include/clang/AST/DeclBase.h
1539 ↗(On Diff #231758)

This corresponds to "strictfp" LLVM attribute. I add this here because I want to collect the information during Sema and set the attribute during CodeGen. The next thing I want to do is to add support for modifying float_control via pragma within function bodies (enable floating point control at the block level). If I wasn't preparing to support floating_control via statement-level pragma then setting the bit could be accomplished entirely within CodeGen.

1560 ↗(On Diff #231758)

Need to adjust the number of bits here, because it's at the threshold of overrunning 64 bits.

clang/include/clang/Basic/DiagnosticDriverKinds.td
444 ↗(On Diff #231758)

@kpn thought it would be a good idea to add a Warning that the implementation of float control is experimental and partially implemented. That's what this is for.

clang/lib/Driver/ToolChains/Clang.cpp
2512

@michele.scandale Thanks for your helpful review, I think I fixed the things that you remarked on. I also added a test case for the assertion fail that you saw.

clang/test/CodeGen/fpconstrained.cpp
10 ↗(On Diff #231758)

This is the test case from the bug report (null deref/segfault/in IRBuilder)

llvm/include/llvm/IR/IRBuilder.h
268 ↗(On Diff #231758)

@kpn I got rid of this line because the function attribute is being set in CodeGen

llvm/unittests/IR/IRBuilderTest.cpp
186 ↗(On Diff #231758)

@kpn I changed the test to create the function attribute a priori since it will be set in CodeGen before passing to IRBuilder

kpn added inline comments.Dec 2 2019, 12:52 PM
llvm/include/llvm/IR/IRBuilder.h
268 ↗(On Diff #231758)

Makes sense.

llvm/unittests/IR/IRBuilderTest.cpp
186 ↗(On Diff #231758)

Right, of course.

I'm not going to quibble over the use of auto. It's fine I think.

clang/lib/Driver/ToolChains/Clang.cpp
2512

Thanks!

mibintc marked an inline comment as done.Dec 4 2019, 11:34 AM

I've pushed the updated patch,
commit cdbed2dd856c14687efd741c2d8321686102acb8

thakis added a subscriber: thakis.Dec 4 2019, 12:18 PM

The tests fail on Windows, http://45.33.8.238/win/3405/step_6.txt

Please take a look, and if it takes a while to fix please revert while you investigte.

I reverted it again because build break on windows

The tests fail on Windows, http://45.33.8.238/win/3405/step_6.txt

Please take a look, and if it takes a while to fix please revert while you investigte.

It appears that only the 1st failure there is the fault of this patch. The 2nd seems to have come from some openmp patch (that didn't consider dso_local on windows).

The first (fpconstrained.cpp) likely just needs the check-lines to NOT explicitly say the %4/%5 and capture the loads of those with a wildcard instead.

thakis added a comment.Dec 4 2019, 4:17 PM

It appears that only the 1st failure there is the fault of this patch. The 2nd seems to have come from some openmp patch (that didn't consider dso_local on windows).

The first (fpconstrained.cpp) likely just needs the check-lines to NOT explicitly say the %4/%5 and capture the loads of those with a wildcard instead.

Yes, sorry, I pasted the wrong link. http://45.33.8.238/win/3402/step_6.txt is the one for this commit, and it has just one of the two failures.

I fixed the lit test problem and pushed it again

commit 7f9b5138470db1dc58f3bc05631284c653c9ed7a
Author: Melanie Blower <melanie.blower@intel.com>

I've noticed you removed the change for CompilerInvocation.cpp about the initialization of the codegen option NoTrappingMath. Was that an accident?

I've noticed you removed the change for CompilerInvocation.cpp about the initialization of the codegen option NoTrappingMath. Was that an accident?

I checked the old and new version of the patch and it seems like initialization of NoTrappingMath is unchanged, the definition of the option has it default to 0, and CompilerInvocation.cpp sets it like this in ParseLangArgs, was it something else you were looking at?
Opts.NoTrappingMath = Args.hasArg(OPT_fno_trapping_math);

michele.scandale added a comment.EditedDec 10 2019, 4:28 PM

I've noticed you removed the change for CompilerInvocation.cpp about the initialization of the codegen option NoTrappingMath. Was that an accident?

I checked the old and new version of the patch and it seems like initialization of NoTrappingMath is unchanged, the definition of the option has it default to 0, and CompilerInvocation.cpp sets it like this in ParseLangArgs, was it something else you were looking at?
Opts.NoTrappingMath = Args.hasArg(OPT_fno_trapping_math);

In the driver code you don't always pass -fno-trapping-math, therefore when when the compiler setup the CodeGen options in ParseCodeGenArgs you will end up executing Opts.NoTrappingMath = Args.hasArg(OPT_fno_trapping_math); hence you will have that Opt.NoTrappingMath = false. This is inconsistent with the state of the compiler driver where no-trapping-math is enabled default.

If you want to keep the default of the CC1 different than the default of the compiler driver, that's fine to me, but in that case the compiler driver needs to pass -fno-trapping-math to the CC1.
If we want the same new default, then the logic in ParseCodeGenArgs must be updated.

I've noticed you removed the change for CompilerInvocation.cpp about the initialization of the codegen option NoTrappingMath. Was that an accident?

Thanks again Michele. I'd like to get rid of Opts.NoTrappingMath, but I haven't been bold enough yet. NoTrappingMath is not expressive enough because it can hold only 2 values, whereas the Exception behavior can be ignore, strict or maytrap. So I'd get rid of that Opts field, and the only place where I see it actually being used is in llvm/lib/Target/ARM/ARMAsmPrinter.cpp and the change in this patch doesn't seem to affect the ARM logic so I think if I got rid of it, it would be OK. All the other instances of the string are in llvm test cases.

I've noticed you removed the change for CompilerInvocation.cpp about the initialization of the codegen option NoTrappingMath. Was that an accident?

Thanks again Michele. I'd like to get rid of Opts.NoTrappingMath, but I haven't been bold enough yet.

I don't see a problem with this, but it would be nice to make the -f[no-]trapping-math command line option work. GNU compatibility is good.

rupprecht added inline comments.
clang/lib/Sema/SemaExpr.cpp
13047 ↗(On Diff #231758)

Looks like this is leftover debugging? I'm seeing log spam compiling some files -- this message repeated hundreds of times. I'll go ahead and create a patch that nukes this.

rupprecht marked an inline comment as done.Dec 11 2019, 4:55 PM
rupprecht added inline comments.
clang/lib/Sema/SemaExpr.cpp
13047 ↗(On Diff #231758)

Sorry for the noise, looks like f4a7d5659df7cb56c1baa34a39e9fe2639472741 already did this.

I don't see a problem with this, but it would be nice to make the -f[no-]trapping-math command line option work. GNU compatibility is good.

Thanks Cameron, I'll go that way

rupprecht added inline comments.Dec 12 2019, 12:51 PM
clang/include/clang/Basic/DiagnosticDriverKinds.td
444 ↗(On Diff #231758)

Instead of adding a warning, I'd like to propose -frounding-math not be enabled unless an additional flag (e.g. -fexperimental-float-control) is passed. Or maybe this feature should be called -f[no-]experimental-rounding-math instead.

There are plenty of builds that are already specifying -frounding-math (e.g. they also support building w/ a compiler such as gcc that implements this), and adding this experimental/incomplete implementation is a surprise to those builds.

If I'm wrong and it's completely safe to ignore the warning (maybe it's incomplete but not in any unsafe way), we should just not have it at all.

clang/include/clang/Basic/DiagnosticDriverKinds.td
444 ↗(On Diff #231758)

You raise an interesting point about people who might be using -frounding-math already. However, if they are using this flag because they also sometimes build with a compiler such as gcc that supports the flag, they are currently getting incorrect behavior from clang. Without this patch, clang silently ignores the option and the optimizer silently ignores the fact that the program may be changing the rounding mode dynamically. The user may or may not be aware of this.

With this patch such a user is likely to observe two effects: (1) their code will suddenly get slower, and (2) it will probably start behaving correctly with regard to rounding mode changes. The rounding behavior will definitely not get worse. I think the warning is useful as an indication that something has changed. I don't think requiring an additional option should be necessary.

rupprecht added inline comments.Dec 12 2019, 4:19 PM
clang/include/clang/Basic/DiagnosticDriverKinds.td
444 ↗(On Diff #231758)

However, if they are using this flag because they also sometimes build with a compiler such as gcc that supports the flag, they are currently getting incorrect behavior from clang

Incorrect, yes, but also likely valid behavior. "experimental" seems to imply a miscompile when using this option should not be unexpected.

As I suggested before: if I'm wrong, and this behavior is only going to make the code more correct (as you suggest), can we remove the warning that this must be ack'd explicitly by adding -Wno-experimental-float-control to builds? I don't understand the motivation for the warning.

Currently we emit a warning if you use -frounding-math, saying that the option is ignored. That should have alerted users that they're not getting the correct behavior now. This patch removes the warning and (IIUC) implements the correct behavior but is over-conservative. If that's correct, I think that's acceptable and we don't need an "experimental" flag or a warning.

Currently we emit a warning if you use -frounding-math, saying that the option is ignored. That should have alerted users that they're not getting the correct behavior now. This patch removes the warning and (IIUC) implements the correct behavior but is over-conservative. If that's correct, I think that's acceptable and we don't need an "experimental" flag or a warning.

Oh, I didn't realize we were already warning about that. In theory, we should handle rounding math correctly with this change. It's possible we've missed some things, but I suppose that's always true. I think there are a few general intrinsics left that need constrained versions but don't have them, and we don't have any solution yet for target-specific intrinsics. If any of those have special handling that assumes the default rounding mode we will get it wrong. I don't think most users would be likely to encounter a problem.

Currently we emit a warning if you use -frounding-math, saying that the option is ignored. That should have alerted users that they're not getting the correct behavior now. This patch removes the warning and (IIUC) implements the correct behavior but is over-conservative. If that's correct, I think that's acceptable and we don't need an "experimental" flag or a warning.

Oh, I didn't realize we were already warning about that. In theory, we should handle rounding math correctly with this change. It's possible we've missed some things, but I suppose that's always true. I think there are a few general intrinsics left that need constrained versions but don't have them, and we don't have any solution yet for target-specific intrinsics. If any of those have special handling that assumes the default rounding mode we will get it wrong. I don't think most users would be likely to encounter a problem.

Hmm. The target-specific intrinsics thing does concern me, since I assume many targets have a bunch of vector intrinsics that may be sensitive to rounding. Do we have an idea of how we'd fix that? If it's a short-term incorrectness that we have a plan to fix, I don't mind the risk, but if we don't know how we'd even start to address it...

Hmm. The target-specific intrinsics thing does concern me, since I assume many targets have a bunch of vector intrinsics that may be sensitive to rounding. Do we have an idea of how we'd fix that? If it's a short-term incorrectness that we have a plan to fix, I don't mind the risk, but if we don't know how we'd even start to address it...

I see two potential problem cases with the target-specific intrinsics:

  1. Some optimization pass recognizes the intrinisic and uses its semantics to perform some optimization such as constant folding
  2. Some optimization performs code motion that moves the intrinsic (or, in the backend, the instruction it represents) across an operation that changes the rounding mode

I don't know if there are any instances of the first case in the public repository. Downstream users could be doing it. Those will need special handling if they exist (checking for the the strictfp attribute).

The second case should be handled in IR by fesetround() or other such intrinsics being marked as having side effects. It's possible that there are target-specific intrinsics to change the rounding mode that aren't marked as having side effects, but if so that's simply a bug. The other part of this is that the intrinsic might be lowered to MC and the MC instructions in a way that neglects rounding mode. Many targets have instructions with forms that take an explicit rounding mode argument and the backends may be using that with the default rounding mode. I am not aware of any such case, but it's definitely possible.

Finally, our design for handling strict fp mode in the backends is that rounding mode control will be handled by explicitly modeling the dependency between the relevant control registers and instructions that implicitly use the rounding mode controled by those registers. X86 only recently started doing this. There may be other backends that have not implemented it. Some may never do so.

I don't have a strong preference about what to do with the warning. I have a slight preference for replacing the existing warning with a more specific warning saying that floating math support is a work in progress. Eventually we need a way for backends to indicate that they believe their support is complete.

I believe your analysis on the second point is unfortunately missing half the problem. Functions like fesetround will be treated as having arbitrary side-effects by default, which mean arbitrary code can't be reordered with calls to them. It'd be good to model that more precisely — to flag that the *only* side-effect they have is changing the FP state — but that's not really the big problem; that's just enabling better optimization by e.g. allowing them to be reordered with non-FP code. The big problem is that intrinsic calls are not arbitrary code: the vast majority of intrinsics (e.g. all the ARM vector intrinsics, many of which can be floating-point) are marked IntrNoMem (which I believe corresponds to readnone), which means calls to those intrinsics can be reordered with other code whether or not that code has arbitrary side-effects. It's good that people are looking at achieving better modeling for the x86 backend, but we need to have a plan that doesn't require heroic effort just to get basic correctness.

I would suggest that we need a function/call attribute roughly on the level of readonly / readnone, maybe readfponly, that says that a function has no side-effects and no dependencies on anything *except* the FP state. Basic queries like Instruction::mayReadMemory() that are supposed to be used generically in code-motion transforms would then return true for calls marked that way only if they're FP-constrained functions. So outside of an FP-constrained function we'd preserve optimizability, and inside one we'd be appropriately conservative. The generic backend could similarly just default to treating those intrinsic calls as having side-effects in FP-constrained functions, and there'd just be some way for a backend to opt out of that when it provides precise modeling of FP state. It'd then be a fairly straightforward change to go through the target intrinsic tables and mark which ones have dependencies on FP state.

... The big problem is that intrinsic calls are not arbitrary code: the vast majority of intrinsics (e.g. all the ARM vector intrinsics, many of which can be floating-point) are marked IntrNoMem (which I believe corresponds to readnone), which means calls to those intrinsics can be reordered with other code whether or not that code has arbitrary side-effects.

Oh, you're right. With the constrained intrinsics we are currently handling that by using IntrInaccessibleMemOnly as a proxy for access to the FP environment, but that's stronger than we'd want for architecture-specific intrinsics in the default case. We have talked about an fpenv-specific attribute, but nothing has been done. So, I guess that does leave us in the situation where rounding controls might not be correctly respected if target-specific intrinsics are used.

It's good that people are looking at achieving better modeling for the x86 backend, but we need to have a plan that doesn't require heroic effort just to get basic correctness.

Do you mean in the backend? If so, I don't think that's possible. The backends just don't have any sort of feature that could be used to get conservatively correct behavior for cheap the way intrinsics give it to us in the middle end. Once you go into instruction selection things get very low level in a hurry.

clang/include/clang/Basic/DiagnosticDriverKinds.td
444 ↗(On Diff #231758)

The "experimental" code won't be incorrect in any way that the code generated when we ignore the option is. The things that have been implemented will work correctly. The things that are not implemented will have the potential to disregard runtime changes to the rounding mode. Currently, dynamic changes to the rounding mode always have the potential of being ignored.

It's good that people are looking at achieving better modeling for the x86 backend, but we need to have a plan that doesn't require heroic effort just to get basic correctness.

Do you mean in the backend? If so, I don't think that's possible. The backends just don't have any sort of feature that could be used to get conservatively correct behavior for cheap the way intrinsics give it to us in the middle end. Once you go into instruction selection things get very low level in a hurry.

I'm looking for simple ways to modeling X86 intrinsics, but haven't find better one than modeling it one by one.

I would suggest that we need a function/call attribute roughly on the level of readonly / readnone, maybe readfponly, that says that a function has no side-effects and no dependencies on anything *except* the FP state.

Do you mean mark it at the declaration of intrinsics? Is it reasonable to mark except on dependent intrinsics?

Basic queries like Instruction::mayReadMemory() that are supposed to be used generically in code-motion transforms would then return true for calls marked that way only if they're FP-constrained functions.

Middle end or back end? I think in middle end you may need to change all releated passes to get such information to prevent optimization. And in back end, I think we can simply chain intrinsics marked except with other FP nodes like what common code doing.

It's good that people are looking at achieving better modeling for the x86 backend, but we need to have a plan that doesn't require heroic effort just to get basic correctness.

Do you mean in the backend? If so, I don't think that's possible. The backends just don't have any sort of feature that could be used to get conservatively correct behavior for cheap the way intrinsics give it to us in the middle end. Once you go into instruction selection things get very low level in a hurry.

I'm looking for simple ways to modeling X86 intrinsics, but haven't find better one than modeling it one by one.

I would suggest that we need a function/call attribute roughly on the level of readonly / readnone, maybe readfponly, that says that a function has no side-effects and no dependencies on anything *except* the FP state.

Do you mean mark it at the declaration of intrinsics? Is it reasonable to mark except on dependent intrinsics?

Basic queries like Instruction::mayReadMemory() that are supposed to be used generically in code-motion transforms would then return true for calls marked that way only if they're FP-constrained functions.

Middle end or back end? I think in middle end you may need to change all releated passes to get such information to prevent optimization. And in back end, I think we can simply chain intrinsics marked except with other FP nodes like what common code doing.

We don't want to do this all the time. So we need a new property for the intrinsics that means we should prevent code motion in the middle end when the calling function has the strictfp attribute. Similarly SelectionDAGBuilder should use INTRINSIC_W_CHAIN instead of INTRINSIC_WO_CHAIN for any of these intrinsics when strictp is enabled.

It seems the discussion of whether or not this is incomplete died out -- I'd prefer to assume it is incomplete if there is no consensus. Mailed D71635 to rename -frounding-math to -fexperimental-rounding-math.

Alternatively we could remove the warning. I still don't see a good argument for the middle ground of having it called -frounding-math but also generate a warning.

It seems the discussion of whether or not this is incomplete died out -- I'd prefer to assume it is incomplete if there is no consensus. Mailed D71635 to rename -frounding-math to -fexperimental-rounding-math.

Alternatively we could remove the warning. I still don't see a good argument for the middle ground of having it called -frounding-math but also generate a warning.

It's definitely incomplete but the results will not be any worse than you get when -frounding-math is ignored.

My preference would be to change the text of the warning that is issued but allow -frounding-math to be enabled by this commit without requiring an additional option.

I would also very much like to see this patch re-committed. It's currently in the "approved" state. If anyone objects to this being committed, please use the "request changes" action to indicate this.

rupprecht closed this revision.Dec 17 2019, 7:38 PM

It seems the discussion of whether or not this is incomplete died out -- I'd prefer to assume it is incomplete if there is no consensus. Mailed D71635 to rename -frounding-math to -fexperimental-rounding-math.

Alternatively we could remove the warning. I still don't see a good argument for the middle ground of having it called -frounding-math but also generate a warning.

It's definitely incomplete but the results will not be any worse than you get when -frounding-math is ignored.

My preference would be to change the text of the warning that is issued but allow -frounding-math to be enabled by this commit without requiring an additional option.

If other reviewers agree, then let's just remove the warning. I can send a patch tomorrow unless someone else wants to do that.

I would also very much like to see this patch re-committed. It's currently in the "approved" state. If anyone objects to this being committed, please use the "request changes" action to indicate this.

It is already re-committed. 7f9b5138470db1dc58f3bc05631284c653c9ed7a reapplied it, but IIUC it was not closed in phabricator due to leading whitespace in the commit message:

Reapply af57dbf12e54 "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior="
...
        Differential Revision: https://reviews.llvm.org/D62731

The "Differential" needs to be the first thing, whitespace cannot come before it.

It seems the discussion of whether or not this is incomplete died out -- I'd prefer to assume it is incomplete if there is no consensus. Mailed D71635 to rename -frounding-math to -fexperimental-rounding-math.

Alternatively we could remove the warning. I still don't see a good argument for the middle ground of having it called -frounding-math but also generate a warning.

It's definitely incomplete but the results will not be any worse than you get when -frounding-math is ignored.

My preference would be to change the text of the warning that is issued but allow -frounding-math to be enabled by this commit without requiring an additional option.

If other reviewers agree, then let's just remove the warning. I can send a patch tomorrow unless someone else wants to do that.

Mailed D71671 to do this.

If neither patch is acceptable, then I would like to revert this commit instead, as we are having issues with this patch.

It seems the discussion of whether or not this is incomplete died out -- I'd prefer to assume it is incomplete if there is no consensus. Mailed D71635 to rename -frounding-math to -fexperimental-rounding-math.

Alternatively we could remove the warning. I still don't see a good argument for the middle ground of having it called -frounding-math but also generate a warning.

It's definitely incomplete but the results will not be any worse than you get when -frounding-math is ignored.

My preference would be to change the text of the warning that is issued but allow -frounding-math to be enabled by this commit without requiring an additional option.

If other reviewers agree, then let's just remove the warning. I can send a patch tomorrow unless someone else wants to do that.

Mailed D71671 to do this.

If neither patch is acceptable, then I would like to revert this commit instead, as we are having issues with this patch.

I think we should stick with this patch and remove the warning like you proposed in D71671

I have found bug in clang-cl (win32 clang), related to recent inroduction of ffp-exception-behavior.
Unfortunately, I don't have a working patch yet, and since LLVM bugtracker registration is closed, I can not even submit a bug.

So, if it is not a trouble for you, I will email the bug description here.

Please let me know if it isn't appropriate. Bug description:

Windows: clang-cl is generating call to non-existing lib function for win32 with /fp:except option.
With recent ffp-exception-behavior=maytrap/strict, fp:except in clang-cl became generate FPE aware code.

But in case of floorf and ceilf it generates call to non-existing library function.

clang-cl.exe -m32 /Ox /fp:except testFloor.cpp /FA
testFloor.cpp:

#include <math.h>
float ret(float v) {  return floorf(v); }

resulting assember:

push    eax
movss    xmm0, dword ptr [esp + 8]
movss    dword ptr [esp], xmm0
call    _floorf #no such function!!!
pop    eax
ret

Expected behaviour:

there is no floorf lib function. Like with cosf and other math functions, floorf in MSVC is implemented as inline function.

So, it should be call to _floor (with apropriate conversion first).

I have found bug in clang-cl (win32 clang), related to recent inroduction of ffp-exception-behavior.
Unfortunately, I don't have a working patch yet, and since LLVM bugtracker registration is closed, I can not even submit a bug.

So, if it is not a trouble for you, I will email the bug description here.

Please let me know if it isn't appropriate. Bug description:

Windows: clang-cl is generating call to non-existing lib function for win32 with /fp:except option.
With recent ffp-exception-behavior=maytrap/strict, fp:except in clang-cl became generate FPE aware code.

But in case of floorf and ceilf it generates call to non-existing library function.

clang-cl.exe -m32 /Ox /fp:except testFloor.cpp /FA
testFloor.cpp:

#include <math.h>
float ret(float v) {  return floorf(v); }

resulting assember:

push    eax
movss    xmm0, dword ptr [esp + 8]
movss    dword ptr [esp], xmm0
call    _floorf #no such function!!!
pop    eax
ret

Expected behaviour:

there is no floorf lib function. Like with cosf and other math functions, floorf in MSVC is implemented as inline function.

So, it should be call to _floor (with apropriate conversion first).

Hopefully fixed by 53ee806d93e8d2371726ec5ce59b0e68b309c258

Found some issue when looking at this code: -ftrapping_math and -fno_trapping_math will never have effect

clang/lib/Frontend/CompilerInvocation.cpp
3084

Calling 'Opts.setFPExceptionMode(xx)' here has no effect, as it will be overruled later on, on line 3174 !
Same is true on line 3159