Page MenuHomePhabricator

[Clang][Driver] Fix -ffast-math/-ffp-contract interaction
Needs ReviewPublic

Authored by wristow on Jan 13 2020, 7:19 PM.

Details

Summary

Fused Multiply Add (FMA) was not always being disabled when the switch -ffp-contract=off was used. More specifically, FMA is enabled when -ffp-contract=fast is used, and it also is enabled implicitly with -ffast-math. The combination:

-ffast-math -ffp-contract=off

is intended to leave most of fast-math enabled (for example, leave reassociation, reciprocal transformations, etc.) enabled, but disable the use of FMA. However, FMA was incorrectly left enabled with the above switch combination. This commit fixes this, allowing users to enable most of the fast-math optimizations, while disabling the FMA feature.

Diff Detail

Event Timeline

wristow created this revision.Jan 13 2020, 7:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2020, 7:19 PM
hfinkel added inline comments.
clang/lib/Driver/ToolChains/Clang.cpp
2792

I think that you just need

&& !FPContract.equals("off")

or

&& !(FPContract.equals("off") || FPContract.equals("on"))

of which I think the latter. fp-contract=no is also not a fast-math-compatible setting in that regard, right?

2823

So now we pass it twice, because we also pass it here?

2845

You want the check here, I think, and not below so that if parts of fast-math are disabled __FAST_MATH__ is not set. That seems to be what the comment/code currently do, although what does GCC do in this regard?

wristow marked 3 inline comments as done.Jan 14 2020, 2:38 PM

Thanks for the quick feedback @hfinkel

clang/lib/Driver/ToolChains/Clang.cpp
2792

Eliminating the FPContractDisabled variable, and instead checking that the value isn't "off" (or maybe isn't ("off" or "on")), sounds good. I'm not 100% sure of whether just "off" or both "off" and "on" ought to be checked. More on that in the discussion about __FAST_MATH__ in one of your other comments.

2823

Whoops. This one was here originally, so I'll leave it here, and delete the one I needlessly added, above.

2845

I had originally included the check with the set of conditions of line 2763, but then the warn_drv_overriding_flag_option warning (below) was missed in the case of -ffp-model=fast -ffp-contract=off. So I think the check does need to remain inside the above conditional.

Regarding the __FAST_MATH__ aspect and whether ffp-contract=on also ought to suppress the -ffast-math emission (and what GCC does), this is somewhat fuzzy for me. After getting your comments, I'm leaning toward including both the "off" and "on" in the check (rather than only disabling it in the case of "off", as the current patch does). But I want to at least note my observations here, to make the current situation with both Clang and GCC clear, in case others want to chime in.

Somewhat surprisingly to me, GCC's behavior wrt __FAST_MATH__ seems buggy in general, so it's not a good base to compare against. For example, the following set of switches result in the __FAST_MATH__ macro being defined (although I expected they would have disabled it):

-ffast-math -fno-associative-math -fno-reciprocal-math -frounding-math

(I've just tested GCC 7.4.0.)

I could have sworn that in the past that for GCC I had seen this sort of disabling of part of the "fast-math set" result in the disabling of the def of __FAST_MATH__. As you'd expect with Clang (from the comment in our code code), the above switches do suppress the definition of __FAST_MATH__.

That said, GCC does suppress the __FAST_MATH__ def in some situations. E.g., both GCC and Clang suppress it with the following switches:

-ffast-math -fmath-errno
-ffast-math -ftrapping-math
-ffast-math -fsigned-zeros

In any case, for the -ffp-contract switch, none of the settings suppress the def of __FAST_MATH__ with GCC. That is, all of the following result in __FAST_MATH__ being defined:

-ffast-math -ffp-contract=fast
-ffast-math -ffp-contract=on
-ffast-math -ffp-contract=off

I think that's buggy (but hard to know if it's -ffp-contract specific, or part of the general buggyness of GCC and __FAST_MATH__). Clang (prior to the change described here) also defines __FAST_MATH__ in all of these -ffp-contract situations. But I think that's just part of the bug I'm trying to address here.

Bottom line: The intended semantics are somewhat fuzzy to me. I'd say:

-ffast-math -ffp-contract=fast      // should leave __FAST_MATH__ defined
-ffast-math -ffp-contract=off       // should not define __FAST_MATH__
-ffast-math -ffp-contract=on        // unsure ????

For that last one ("unsure ???"), in the current patch I was leaving __FAST_MATH__ defined. But I'm now leaning toward also suppressing the __FAST_MATH__ def in that case. Unless anyone disagrees, I'll include that change when I update the patch.

wristow updated this revision to Diff 238143.Jan 14 2020, 4:52 PM
wristow retitled this revision from ix -ffast-math/-ffp-contract interaction to Fix -ffast-math/-ffp-contract interaction.

Addressed comments from @hfinkel .

wristow marked 2 inline comments as done.Jan 14 2020, 4:54 PM
wristow marked an inline comment as done.Jan 14 2020, 4:58 PM
spatel added inline comments.Jan 15 2020, 11:41 AM
llvm/test/CodeGen/PowerPC/fmf-propagation.ll
201–203 ↗(On Diff #238143)

I was ok with the reasoning up to here, but this example lost me.
Why does 'contract' alone allow splitting an fma?
Is 'contract' relevant on anything besides fadd (with an fmul operand)?

wristow marked an inline comment as done.Jan 15 2020, 5:32 PM
wristow added inline comments.
llvm/test/CodeGen/PowerPC/fmf-propagation.ll
201–203 ↗(On Diff #238143)

I have to admit I'm handwaving here. I don't really understand why 'contract' alone allows this simpliciation:
fma(X, 7.0, X * 42.0) --> X * 49.0

to happen. But either that's correct, or it's a separate bug (and I waved my hands about explaining more detail).

The short story is that even without my proposed change, having only 'contract' on the FMA intrinsic results in this being simplified to X * 49 (and also having only 'reassoc' did). The original comment:

; This is the minimum FMF needed for this transform - the FMA allows reassociation.

is incomplete/misleading. A more complete comment (relevant before my change) would have been:

; This is the minimum FMF needed for this transform - either 'reassoc' or 'contract' applied to the FMA intrinsic allows reassociation.

And with my change, the result is that 'reassoc' is no longer relevant. But since I don't really understand why it should be simplified with only contract, I should put a TODO comment in. That is, to me it seems like both 'contract' and 'reassoc' should be needed for this simplification to happen (whereas previously it worked with either of them individually). So maybe change this comment to:

; This is the minimum FMF needed for this transform - applying 'contract' to the FMA intrinsic allows reassociation.
; TODO: It seems 'contract' and 'reassoc' combined should be needed for this transform.  Why does it work with only 'contract'?
mcberg2017 added a comment.EditedJan 15 2020, 5:44 PM

We crossed that bridge internally at Apple a while ago, meaning I have some code debt for cleaning up open source for fma formation that uses contract and reassoc differently than we do today, both together and separately, case by case.

We crossed that bridge internally at Apple a while ago, meaning I have some code debt for cleaning up open source for fma formation that uses contract and reassoc differently than we do today, both together and separately, case by case.

Ah, great. I don't have a preference about the order of patches (ie, is it better to make this change first or do the backend cleanup first), so whatever works better.

But it would be better to have all of the baseline tests in place, so we know where we stand currently. So I'd prefer to have the PPC and x86 tests pre-committed to master (change test comments as needed to match current/future reality).

We could also decouple the clang part of this patch from the backend change IIUC. But do we need another change (or at least tests) to show how the driver difference translates in the clang front-end to LLVM FMF and/or function attributes?

clang/test/Driver/fast-math.c
183–184

Do we need another pair of tests for -ffp-contract=on?

I should do the other DAG combiner fma changes after this is wrapped up.

This all sounds good to me.

So to make sure we're all on the same page, my understanding is that the plan forward is:

  1. Make the Clang change first (including adding another pair of tests for -ffp-contract=on).
  2. Update the LLVM tests illustrating the current baseline state.
  3. Make the LLVM change shown here, and update the tests with the fixes.
  4. Bring in the DAG combiner work that Michael has done internally at Apple.

Points 1, 2, and 3 are essentially the points in the patch posted here, so I'll do that. And of course Michael will then take on point 4.

Is that the plan? If yes, I'll transition this item to just be the Clang pieces, and I'll spin off a new one to do the LLVM portion of what is posted here.

Sanjay, regarding:

But it would be better to have all of the baseline tests in place, so we know where we stand currently.

I'm interpreting that as applying to the LLVM tests. That is, the Clang change, along with the updated tests, can be in one commit, rather than first updating the Clang driver tests with the current state. If you'd prefer two commits on the Clang side as well, let me know.

This all sounds good to me.

So to make sure we're all on the same page, my understanding is that the plan forward is:

  1. Make the Clang change first (including adding another pair of tests for -ffp-contract=on).
  2. Update the LLVM tests illustrating the current baseline state.
  3. Make the LLVM change shown here, and update the tests with the fixes.
  4. Bring in the DAG combiner work that Michael has done internally at Apple.

    Points 1, 2, and 3 are essentially the points in the patch posted here, so I'll do that. And of course Michael will then take on point 4.

    Is that the plan? If yes, I'll transition this item to just be the Clang pieces, and I'll spin off a new one to do the LLVM portion of what is posted here.

SGTM

Sanjay, regarding:

But it would be better to have all of the baseline tests in place, so we know where we stand currently.

I'm interpreting that as applying to the LLVM tests. That is, the Clang change, along with the updated tests, can be in one commit, rather than first updating the Clang driver tests with the current state. If you'd prefer two commits on the Clang side as well, let me know.

One commit for the clang changes should be ok; it's a very small diff. But I'm still not sure if the driver change induces frontend diffs that we should make visible via tests.

One commit for the clang changes should be ok; it's a very small diff. But I'm still not sure if the driver change induces frontend diffs that we should make visible via tests.

The only thing I can think of is that it changes whether/when __FAST_MATH__ is defined. But that'll be indirectly tested, via the updated tests in "Driver/fast-math.c", which will verify that "-ffast-math" is passed appropriately (and the __FAST_MATH__ dependency on "-ffast-math" is already tested in "Preprocessor/predefined-macros.c").

In addition to adding the second pair of driver tests you suggested for -ffp-contract=on, I'll also add another pair for -ffp-contract=fast to verify that "-ffast-math" is passed in that case (no change in behavior for that pair, but just confirming it continues to work correctly).

wristow updated this revision to Diff 238595.Jan 16 2020, 1:19 PM
wristow retitled this revision from Fix -ffast-math/-ffp-contract interaction to [Clang][Driver] Fix -ffast-math/-ffp-contract interaction.

Changing this to address only the Clang driver aspect, as discussed in the comments. Will spin off a separate patch for the LLVM side.

spatel added a comment.EditedJan 17 2020, 5:56 AM

This follows the reasoning from the earlier discussion, but after re-reading the gcc comment in particular, I'm wondering whether this is what we really want to do...
If FAST_MATH is purely here for compatibility with gcc, then should we mimic gcc behavior in setting that macro even if we think it's buggy?
Ie, when we translate these settings to LLVM's FMF, we can still override the -ffast-math flag by checking the -ffp-contract flag (if I'm seeing it correctly, the existing code will pass that alongside -ffast-math when contract is set to on/off).

clang/lib/Driver/ToolChains/Clang.cpp
2788

This comment doesn't match the code any more. Should be "If -ffp-contract=off/on, then..."

clang/test/Driver/fast-math.c
184

typo: conteact

Adding potential FP reviewers for question of gcc- (potentially-buggy-) compatibility.

wristow updated this revision to Diff 238933.EditedJan 17 2020, 5:26 PM

Updated patch to correct a comment and fix a typo.

Regarding the point from @spatel :

This follows the reasoning from the earlier discussion, but after re-reading the gcc comment in particular, I'm wondering whether this is what we really want to do...
If __FAST_MATH__ is purely here for compatibility with gcc, then should we mimic gcc behavior in setting that macro even if we think it's buggy?

Thinking about it, yes, I think we should mimic GCC. FTR, I've tried some older GCCs via Compiler Explorer, and found them to be consistent in that removing some aspects of -ffast-math disables the generation of __FAST_MATH__ and removing other aspects of it does not. So my previous recollection ("I could have sworn that in the past that for GCC I had seen this sort of disabling of part of the "fast-math set" result in the disabling of the def of __FAST_MATH__.") is wrong. Bottom line, maybe it's not a bug in GCC, and instead it's the desired GCC behavior (just that I cannot find specific documentation of precisely which settings disable the generation of it).

However, I'd call that a bigger question than this -ffp-contract aspect of __FAST_MATH__, and a separate bug, that can be fixed separately.

Ie, when we translate these settings to LLVM's FMF, we can still override the -ffast-math flag by checking the -ffp-contract flag (if I'm seeing it correctly, the existing code will pass that alongside -ffast-math when contract is set to on/off).

How to handle this seems like an implementation question. The code here is just deciding whether or not we intend to pass "-ffast-math" to cc1 (it isn't directly defining __FAST_MATH__). If we do pass it to cc1, then "clang/lib/Frontend/InitPreprocessor.cpp" will pre-define __FAST_MATH__:

if (LangOpts.FastMath)
  Builder.defineMacro("__FAST_MATH__");

It seems to me that a straightforward way to deal with this question is to make the above test more elaborate (that is, if LangOpts.FastMath is set, OR whatever the appropriate subset of fast-math switches are set). If we do that, we don't need to deal with the "umbrella" aspect of "-ffast-math", which gets messy.

Which is to say the approach here can stay the same as it currently is (-ffp-contract=off/on suppressing the passing of "-ffast-math" to cc1). Although the comment about __FAST_MATH__ here in "Clang.cpp" should be changed, if we take this approach.

How to handle this seems like an implementation question. The code here is just deciding whether or not we intend to pass "-ffast-math" to cc1 (it isn't directly defining __FAST_MATH__). If we do pass it to cc1, then "clang/lib/Frontend/InitPreprocessor.cpp" will pre-define __FAST_MATH__:

if (LangOpts.FastMath)
  Builder.defineMacro("__FAST_MATH__");

It seems to me that a straightforward way to deal with this question is to make the above test more elaborate (that is, if LangOpts.FastMath is set, OR whatever the appropriate subset of fast-math switches are set). If we do that, we don't need to deal with the "umbrella" aspect of "-ffast-math", which gets messy.

Which is to say the approach here can stay the same as it currently is (-ffp-contract=off/on suppressing the passing of "-ffast-math" to cc1). Although the comment about __FAST_MATH__ here in "Clang.cpp" should be changed, if we take this approach.

That sounds fine to me, so could update that comment/this patch.

Let's give this a couple of days after updating to see if anyone else has feedback though. I'm never exactly sure what our responsibility is with respect to gcc compatibility.

I've had a quick look at GCC, and it seems there's a couple of different issues.

First of all, -ffast-math is a "collective" flag that has no separate meaning except setting a bunch of other flags. Currently, these flags are: -fno-math-errno, -funsafe-math-optimizations (which is itself a collective flag enabling -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and -freciprocal-math), -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans, -fcx-limited-range, and -fexcess-precision=fast.

When deciding whether to define __FAST_MATH__, GCC will not specifically look at whether or not the -ffast-math option itself was given, but whether the status of those other flags matches what would have been set by -ffast-math. However, it does not include all of the flags; currently only -fmath-errno, -funsafe-math-optimizations, -fsigned-zeros, -ftrapping-math, -ffinite-math-only, and -fexcess-precision are checked, while -fassociative-math, -freciprocal-math, -frounding-math, -fsignaling-nans, and -fcx-limited-range are ignored.

I'm not sure whether this is deliberate (but it seems weird) or just a bug. I can ask the GCC developers ...

A separate question is the interaction of -ffast-math with -ffp-contract=. Currently, there is no such interaction whatsoever in GCC: -ffast-math does not imply any particular -ffp-contract= setting, and vice versa the -ffp-contract= setting is not considered at all when defining __FAST_MATH__. This seems at least internally consistent.

I'm not sure whether this is deliberate (but it seems weird) or just a bug. I can ask the GCC developers ...

Please do. If there's a rationale, we should know.

A separate question is the interaction of -ffast-math with -ffp-contract=. Currently, there is no such interaction whatsoever in GCC: -ffast-math does not imply any particular -ffp-contract= setting, and vice versa the -ffp-contract= setting is not considered at all when defining __FAST_MATH__. This seems at least internally consistent.

That's interesting, and as you said, internally consistent behavior in GCC. I think it would be a fine idea for us to do the same thing.

Looking into this point, I see that (ignoring fastmath for the moment) our default -ffp-contract behavior is different than GCC's. GCC enables FMA by default when optimization is high enough ('-O2') without any special switch needed. For example, taking an architecture that supports FMA (Haswell), GCC has the following behavior:

float test(float a, float b, float c)
{
  // FMA is enabled by default for GCC (on Haswell), so done at -O2:
  //    gcc -S -O2 -march=haswell test.c  # FMA happens
  //    $ gcc -S -march=haswell test.c ; egrep 'mul|add' test.s
  //            vmulss  -12(%rbp), %xmm0, %xmm0
  //            vaddss  -4(%rbp), %xmm0, %xmm0
  //    $ gcc -S -O2 -march=haswell test.c ; egrep 'mul|add' test.s
  //            vfmadd231ss     %xmm2, %xmm1, %xmm0
  //    $
  return a + (b * c);
}

As we'd expect, GCC does disable FMA with -ffp-contract=off (this is irrespective of whether -ffast-math was specified). Loosely, GCC's behavior can be summarized very simply on this point as:
Suppress FMA when -ffp-contract=off.
(As an aside, GCC's behavior with -ffp-contract=on is non-intuitive to me. It relates to the FP_CONTRACT pragma, which as far as I can see is ignored by GCC.)

In contrast, we do not enable FMA by default (via general optimization, such as '-O2'). For example:

$ clang -S -O2 -march=haswell test.c ; egrep 'mul|add' test.s
        vmulss  %xmm2, %xmm1, %xmm1
        vaddss  %xmm0, %xmm1, %xmm0
        .addrsig
$

I think that whether we want to continue doing that (or instead, enable it at '-O2', like GCC does), is a separate issue. I can see arguments either way.

We do enable FMA with -ffp-contract=fast, as desired (and also with -ffp-contract=on). And we do "leave it disabled" with -ffp-contract=off (as expected).

Now, bringing fastmath back into the discussion, we do enable FMA with -fffast-math. If we decide to continue leaving it disabled by default, then enabling it with -ffast-math seems perfectly sensible. (If we decide to enable it by default when optimization is high enough, like GCC, then turning on -ffast-math should not disable it of course.)

The problem I want to address here is that if the compiler is a mode where FMA is enabled (whether that's at '-O2' "by default", or whether it's because the user turned on -ffast-math), then appending -ffp-contract=off should disable FMA. I think this patch (along with the small change to "DAGCombiner.cpp", in an earlier version of this patch) is a reasonable approach to solving that. I'd say this patch/review/discussion has raised two additional questions:

  1. Under what conditions should __FAST_MATH__ be defined?
  2. Should we enable FMA "by default" at (for example) '-O2'?

I think these additional questions are best addressed separately. My 2 cents are that for (1), mimicking GCC's behavior seems reasonable (although that's assuming we don't find out that GCC's __FAST_MATH__ behavior is a bug). And for (2), I don't have a strong preference.

  1. Should we enable FMA "by default" at (for example) '-O2'?

We recently introduced a new option "-ffp-model=[precise|fast|strict], which is supposed to serve as an umbrella option for the most common combination of options. I think our default behavior should be equivalent to -ffp-model=precise, which enables contraction. It currently seems to enable -ffp-contract=fast in the precise model (https://godbolt.org/z/c6w8mJ). I'm not sure that's what it should be doing, but whatever the precise model does should be our default.

  1. Should we enable FMA "by default" at (for example) '-O2'?

We recently introduced a new option "-ffp-model=[precise|fast|strict], which is supposed to serve as an umbrella option for the most common combination of options. I think our default behavior should be equivalent to -ffp-model=precise, which enables contraction. It currently seems to enable -ffp-contract=fast in the precise model (https://godbolt.org/z/c6w8mJ). I'm not sure that's what it should be doing, but whatever the precise model does should be our default.

That makes sense to me. So -ffp-model/-ffp-contract interaction should be examined, and possibly adjusted. If an adjustment is needed, I think it makes sense to handle that interaction separately.

I'm going to update this patch to adjust the comment that @spatel and I discussed earlier, so it no longer implies that __FAST_MATH__ is defined only when "all of" -ffast-math is enabled.

wristow updated this revision to Diff 240305.Jan 24 2020, 3:18 PM

Update a comment to remove the discussion about enabling the __FAST_MATH__ preprocessor macro. The handling of the setting of __FAST_MATH__ is done in "clang/lib/Frontend/InitPreprocessor.cpp", and once we decide on the set of conditions that enable that definition, we can add an appropriate comment there.

clang/lib/Driver/ToolChains/Clang.cpp
2792

I think this would read better as "... && !FPContract.equals("off") && !FPContract.equals("on")" The '||' in the middle of all these '&&'s combined with the extra parens from the function call trips me up.

2846–2847

As above, I'd prefer "!off && !on".

2854

This is a bit of an oddity in our handling. The FPContract local variable starts out initialized to an empty string. So, what we're saying here is that if you've used all the individual controls to set the rest of the fast math flags, we'll turn on fp-contract=fast also. That seems wrong. If you use "-ffast-math", FPContract will have been set to "fast" so that's not applicable here.

I believe this means

-fno-honor-infinities -fno-honor-nans -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fno-trapping-math -fno-rounding-math

will imply "-ffp-contract=fast". Even worse:

-ffp-contract=off -fno-fast-math -fno-honor-infinities -fno-honor-nans -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fno-trapping-math -fno-rounding-math

will imply "-ffp-contract=fast" because "-fno-fast-math" resets FPContract to empty.

I think we should initialize FPContract to whatever we want to be the default (on?) and remove the special case for empty here. Also, -fno-fast-math should either not change FPContract at all or set it to "off". Probably the latter since we're saying -ffast-math includes -ffp-contract=on.

clang/test/Driver/fast-math.c
196

What about "-ffp-contract=off -ffast-math"? The way the code is written that will override the -ffp-contract option. That's probably what we want, though it might be nice to have a warning.

wristow marked 4 inline comments as done.Jan 24 2020, 7:18 PM
wristow added inline comments.
clang/lib/Driver/ToolChains/Clang.cpp
2792

Sounds good. Will do.

2846–2847

As above, will do.

2854

This is a bit of an oddity in our handling.

Yes it is!

This is certainly getting more complicated than I had originally expected. I feel there isn't a clear spec on what we want in terms of whether FMA should be enabled "automatically" at (for example) '-O2', and/or whether it should be enabled by -ffast-math. I'm very willing to make whatever change is needed here, to meet whatever spec we all ultimately agree on.

So I think this patch should be put on hold until we decide on these wider aspects.

One minor note about:

... Probably the latter since we're saying -ffast-math includes -ffp-contract=on.

I think that is intended to say "-ffast-math includes -ffp-contract=fast". The semantics of -ffp-contract=on are another part that seems unclear (or at least, more complicated, since it then gets into the FP_CONTRACT pragma).

clang/test/Driver/fast-math.c
196

Yes, currently -fffast-math will override that earlier -ffp-contract=off settings. It's unclear to me whether we're ultimately intending for that to be the behavior (because GCC doesn't do that, as @uweigand noted). I guess this is another reason to hold off for a bit, until we figure out the wider spec.

About:

This is a bit of an oddity in our handling.

Yes it is!

This is certainly getting more complicated than I had originally expected. I feel there isn't a clear spec on what we want in terms of whether FMA should be enabled "automatically" at (for example) '-O2', and/or whether it should be enabled by -ffast-math. I'm very willing to make whatever change is needed here, to meet whatever spec we all ultimately agree on.

So I think this patch should be put on hold until we decide on these wider aspects.

Thinking about this a bit more, one thing that I believe there is a clear spec on, is that when -ffp-contract=off is "the last switch related to FMA", then FMA ought to be disabled. So for example, -ffast-math -ffp-contract=off should result in FMA being disabled, irrrspective of how we decide on the "wider aspects" of the spec mentioned above. (FTR, this inability to enable FastMath in general, but explicitly disable FMA, was a problem one of our customers reported, which is what initially motivated this patch.)

Note that the points about the "oddity" of "-ffp-contract=fast" being implied by the two long sets of switches in this comment are unrelated to the change proposed here. That is, that oddity is the current behavior of Clang.

So one approach we could take is to first fix the problem of when -ffp-contract=off is the "the last switch related to FMA", it isn't disabling FMA in some cases. And then as a followup, address the oddities that have been noted here. If we decide to take this two-step approach, I'm happy to take on the followup work (once we decide what the spec should be).

wristow updated this revision to Diff 240689.Mon, Jan 27, 2:50 PM

Used the clearer '!off && !on' (rather than '!(off || on)') in the checks.

Added more tests that note the current situation that -ffast-math enables FMA. overriding an earlier switch that had disabled it (included a "TODO" comment in these tests, since it isn't clear if that behavior is what we will ultimately stick with).

wristow marked 4 inline comments as done.Mon, Jan 27, 2:54 PM
wristow added inline comments.
clang/test/Driver/fast-math.c
196

Added more tests that illustrate this behavior of "-ffp-contract=off/on -ffast-math" overriding and enabling FMA. (I haven't added a warning. Holding off until we decide on other questions discussed in this review.)

I'm not sure whether this is deliberate (but it seems weird) or just a bug. I can ask the GCC developers ...

Please do. If there's a rationale, we should know.

Sorry for the delay ... I've raised that question on the GCC list, and the opinion seems to be that the current behavior is indeed a bug -- however there's still discussion ongoing on what the proper fix ought to be. I'll report back once we've agreed on a solution on the GCC side ...