This is an archive of the discontinued LLVM Phabricator instance.

Fix `unsafe-fp-math` attribute emission.
ClosedPublic

Authored by michele.scandale on Oct 26 2022, 1:43 PM.

Details

Summary

The conditions for which Clang emits the unsafe-fp-math function
attribute has been modified as part of
84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7.
In the backend code generators "unsafe-fp-math"="true" enable floating
point contraction for the whole function.
The intent of the change in 84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7
was to prevent backend code generators performing contractions when that
is not expected.
However the change is inaccurate and incomplete because it allows
unsafe-fp-math to be set also when only in-statement contraction is
allowed.

Consider the following example

float foo(float a, float b, float c) {
  float tmp = a * b;
  return tmp + c;
}

and compile it with the command line

clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
  -O2 -mavx512f -S -o -

The resulting assembly has a vfmadd213ss instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.

The optimized IR is:

define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
  %mul = fmul reassoc nsz arcp afn float %b, %a
  %add = fadd reassoc nsz arcp afn float %mul, %c
  ret float %add
}

attributes #0 = {
  [...]
  "no-signed-zeros-fp-math"="true"
  "no-trapping-math"="true"
  [...]
  "unsafe-fp-math"="true"
}

The "unsafe-fp-math"="true" function attribute allows the backend code
generator to perform (fadd (fmul a, b), c) -> (fmadd a, b, c).

In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this for in-statement only contraction the generated IR
doesn't have instructions with the contract fast-math flag and
llvm.fmuladd is being used to represent contractions opportunities
that occur within a single statement.
Therefore "unsafe-fp-math"="true" can only be emitted when contraction
across statements is allowed.

Moreover the change in 84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7 doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see CGFPOptionsRAII).
For consistency unsafe-fp-math needs to be disabled if the contraction
mode for any scope/operation is not fast.
Similarly for consistency reason the initialization of UnsafeFPMath of
in TargetOptions for the backend code generation should take into
account the contraction mode as well.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 1:43 PM
Herald added a subscriber: ormris. · View Herald Transcript
michele.scandale requested review of this revision.Oct 26 2022, 1:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 1:43 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
clang/lib/CodeGen/CGCall.cpp
1865–1870

I've found quite confusing the (FastMath || (!FastMath && ... )).

Using directly UnsafeFPMath seems more compact, however it also causes to taking into account the value for MathErrno -- which it might be not relevant.

If MathErrno shouldn't affect this, then I would rewrite this as:

if (LangOpts.AllowFPReassoc && LangOpts.AllowRecip &&
     LangOpts.NoSignedZero && LangOpts.ApproxFunc &&
     (LangOpts.getDefaultFPContractMode() ==
         LangOptions::FPModeKind::FPM_Fast ||
     LangOpts.getDefaultFPContractMode() ==
         LangOptions::FPModeKind::FPM_FastHonorPragmas))
  FuncAttrs.addAttribute("unsafe-fp-math", "true");

so that only the relevant options are considered and there is no need to think about what is implied by FastMath.

clang/test/CodeGen/func-attr.c
5–15

See comment about MathErrno in CGCall.cpp

clang/lib/CodeGen/CGCall.cpp
1865–1870

I do wonder if in https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3089-L3091 is actually correct to have !MathErrno. In the GCC documentation of -funsafe-math-optimizations I don't see any connection to the -fmath-errno or -fno-math-errno options.

zahiraam added inline comments.Oct 27 2022, 12:43 PM
clang/lib/CodeGen/CGCall.cpp
1865–1870

Interesting! Using on the command line 'funsafe-math-optimization' implies 'math-errno'. But 'funsafe-math-optimizations' is triggered when this is true:
!MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros && ApproxFunc && !TrappingMath

See here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3089

1865–1870

When the ‘funsafe-math-optimizations’ is used on the command line LangOpts.UnsafeFP is ‘0’.
The function attribute ‘funnsafe-fp-math’ is not set. Is that what’s wanted? I would have thought that we want the attribute enabled for unsafe mode.

If we call
SpecialOperations = AllowFPReassoc && NoSignedZero && AllowRecip && ApproxFunc && !RoundingMath.

‘ffast-math’ = !MathErrno && FiniteMathOnly && SpecialOperations
‘funsafe-math-optimizations’ = MathErrno && !FiniteMathOnly && SpecialOperations
(it's true that there is no mention of MathErrno in the GCC doc, but the default is fmath-errno so presumably using this option on the command line implies MathErrno).

The function attribute UnsafeFPMath is enabled for
(‘ffast-math’ || ‘funsafe-math-optimization’).
That will lead to this condition:
(SpecialOperations && MathErrNo && "-ffast-math" && -ffp-contract=fast") || (SpecialOperations && "-fmath-errno" && "-ffp-contract=on") .

clang/lib/CodeGen/CGCall.cpp
1865–1870

The compiler driver has a default value for MathErrno which depends on the toolchain properties. When -funsafe-math-optimizations is processed MathErrno is not affected, but then the compiler driver specify -funsafe-math-optimizations to the CC1 command line only if MathErrno is false.
The way the compiler driver mutate the floating point options state when processing -funsafe-math-optimizations seems to match the GCC documentation, but then the condition for the forwarding expect something more.
It is not clear to me if the intention is to match GCC documented behavior, or if instead the Clang definition of -funsafe-math-optimizations implies -fno-math-errno.

1865–1870

When the ‘funsafe-math-optimizations’ is used on the command line LangOpts.UnsafeFP is ‘0’.

The function attribute ‘funnsafe-fp-math’ is not set. Is that what’s wanted? I would have thought that we want the attribute enabled for unsafe mode.

I believe this is caused by MathErrno state in the compiler driver being not affected when processing -funsafe-math-optimizations, but considered for the forwarding to the CC1.

If we call
SpecialOperations = AllowFPReassoc && NoSignedZero && AllowRecip && ApproxFunc && !RoundingMath.

‘ffast-math’ = !MathErrno && FiniteMathOnly && SpecialOperations
‘funsafe-math-optimizations’ = MathErrno && !FiniteMathOnly && SpecialOperations
(it's true that there is no mention of MathErrno in the GCC doc, but the default is fmath-errno so presumably using this option on the command line implies MathErrno).

The function attribute UnsafeFPMath is enabled for
(‘ffast-math’ || ‘funsafe-math-optimization’).
That will lead to this condition:
(SpecialOperations && MathErrNo && "-ffast-math" && -ffp-contract=fast") || (SpecialOperations && "-fmath-errno" && "-ffp-contract=on") .

In GCC -funsafe-math-optimizations does not seem to affect the state of math-errno -- see https://godbolt.org/z/jzW6zqxeM and https://godbolt.org/z/axPb63z3d.

I'm not following entirely, but -funsafe-math-optimizations is just a subset of -ffast-math, so checking only the properties that contribute to -funsafe-math-optimizations should be enough. I think it is way simpler to reason in these terms than enumerating all the possible scenarios.

Now, in https://reviews.llvm.org/D135097#3885466 you said that additional condition on the floating point contraction mode was intended due to the semantic of the unsafe-fp-math from the backends perspective -- i.e. if "unsafe-fp-math"="true" is specified as a function attribute, then the backend can perform any form of contraction on that function. As shown in the example in the description if you only check that LangOpts.getDefaultFPContractMode() != LangOptions::FPModeKind::FPM_Off, then you get unexpected contraction. Therefore if the intention is to ensure the behavior is correct, then you can only set "unsafe-fp-math"="true" only if you have fast contraction mode.

I'm not following entirely, but -funsafe-math-optimizations is just a subset of -ffast-math, so checking only the properties that contribute to -funsafe-math-optimizations should be enough.
I think it is way simpler to reason in these terms than enumerating all the possible scenarios.

Exactly my point Since -funsafe-math-optimizations is a subset of -ffast-math, shouldn't it be the "minimal" option to trigger the unsafe-fp-math attribute for functions?
I agree with the second part of the condition; it should be
updated with (LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_Fast || LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_FastHonorPragmas)
but I am still hesitant about the change you are proposing for the first part of the condition.
@andrew.w.kaylor Can you please weigh in on this?

clang/lib/CodeGen/CGCall.cpp
1865–1870

I'd really like to see us move away from fast-math and unsafe-fp-math and encourage use of individual semantic modes. Both fast-math and unsafe-fp-math are composite settings and I'm sure there are users that depend on them, just as there are optimizations that are checking the function attribute. We need to support the command-line options for the users who have existing build systems using it. We can start fixing the uses within clang and LLVM wherever we can do so without changing behavior (or maybe even changing behavior just a little in reasonable ways).

The main thing I have in mind to start with is the interface between the driver and the front end. We have code in clang/lib/Driver/ToolChains/Clang.cpp that does a bunch of analysis of various options to decide whether or not to pass -ffast-math to the compiler invocation. I don't understand why it can't just pass the individual semantic mode flags and let the front end figure out the composite settings if it needs them. Then if we can ever wean people off of the composite settings, we can get rid of that. I don't know if there is similar handling in the other Driver tool chains files, but I'd view it the same way.

The composite options are much less useful to users if they don't have consistent meanings across all compilers that support them. I think our handling of them should be consistent with gcc. We should also consider OpenCL's -cl-unsafe-math-optimizations option. At a glance, it looks to me like it's consistent with the gcc meaning.

I think we should drop math-errno from our definition of unsafe-fp-math. The errno variable is a very C-specific concept and it shouldn't creep into language-independent handling. The optimizer has no business knowing anything about math errno or being able to deduce it from any setting in the IR. The optimizer only needs to know "I'm allowed to replace this function call" or "I'm not allowed to replace this function call". Setting math errno is a side effect of the library call and the optimizer only needs to know that the call has unmodeled side effects. Generally we handle this by choosing the LLVM intrinsics when the side effects aren't needed, right?

clang/lib/CodeGen/CGCall.cpp
1865–1870

The main thing I have in mind to start with is the interface between the driver and the front end. We have code in clang/lib/Driver/ToolChains/Clang.cpp that does a bunch of analysis of various options to decide whether or not to pass -ffast-math to the compiler invocation. I don't understand why it can't just pass the individual semantic mode flags and let the front end figure out the composite settings if it needs them. Then if we can ever wean people off of the composite settings, we can get rid of that. I don't know if there is similar handling in the other Driver tool chains files, but I'd view it the same way.

Yes, I agree 100% with this.
I would expect that the CC1 to only expose flags to control each individual low level floating point options, and let the compiler driver (which represents the public interface with the user) to one responsible of translating the high level aggregate/options in a combination of the low-level options.
For example, I really would like that when the compiler driver processes

clang -ffast-math -c foo.c

would generate something like the following as a CC1 command line

clang -cc1 \
  -fassociative-math \ # controls the `reassoc` FMF
  -freciprocal-math \ # controls the `arcp` FMF
  -fapprox-func \ # controls the `afn` FMF
  -fno-signed-zeros \ # controls the `nsz` FMF
  -fno-honor-infs \ # controls the `ninf` FMF
  -fno-honor-nans \ # controls the `nnan` FMF
  -ffp-contract=fast \ # controls the `contract` FMF
  -fno-rounding-math \ 
  -ffp-exception-behavior=ignore \
  -fno-math-errno

I'm happy to restructure the condition to set "unsafe-fp-math"="true" to not use LangOpts.UnsafeFPMath, but I'd also expect not to use LangOpts.FastMath either.

I think we should drop math-errno from our definition of unsafe-fp-math.

I'm happy to fix the Clang driver code to remove the !MathErrno in the check that guards forwarding -funsafe-math-optimizations to the CC1. I would think this should be fixed separately from this change.

I'm not following entirely, but -funsafe-math-optimizations is just a subset of -ffast-math, so checking only the properties that contribute to -funsafe-math-optimizations should be enough.
I think it is way simpler to reason in these terms than enumerating all the possible scenarios.

Exactly my point Since -funsafe-math-optimizations is a subset of -ffast-math, shouldn't it be the "minimal" option to trigger the unsafe-fp-math attribute for functions?
I agree with the second part of the condition; it should be
updated with (LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_Fast || LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_FastHonorPragmas)
but I am still hesitant about the change you are proposing for the first part of the condition.
@andrew.w.kaylor Can you please weigh in on this?

I talked to Zahira about this offline, and the current state is even messier than I realized. I think we need to start by making sure we agree on the behavior we want and then figure out how to get there. There are some messy complications in the different ways things are handled in the driver, the front end, and the backend. There are more overlapping settings than I would like, but I guess it's best to look for the best way we can incrementally improve that situation.

As a first principle, I'd like to clarify a "big picture" item that I was trying to get at in my earlier comment. I'd like to be able to reason about this from the starting point of individual semantic modes. There is a set of modes defined in the clang user's manual (https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior). I think this is reasonably complete:

exception_behavior
fenv_access
rounding_mode
fp-contract
denormal_fp_math
denormal_fp_math_32
support_math_errno
no_honor_nans
no_honor_infinities
no_signed_zeros
allow_reciprocal
allow_approximate_fns
allow_reassociation

These are the modes from clang's perspective. They get communicated to the back end in different ways. The backend handling of this isn't nearly as consistent as I'd like, but that's a different problem. I think these basic modes can be thought of as language-independent and should be used as the basic building blocks for reasoning about floating point behavior across all LLVM-based compilers.

On top of these modes, we have two concepts "fast-math" and "unsafe-math-optimizations" which are essentially composites of one or more of the above semantic modes. I say "essentially" because up to this point there has been some fuzzy handling of that. I'd like to say they are absolutely composites of the above modes, and I think we can make it so, starting here.

If we agree that fast-math and unsafe-math-optimizations/unsafe-fp-math are composites of some of the other modes, then we can apply a symmetry property that I think will be helpful in the problem we're trying to solve here and will lead to better reasoning about these settings in the future.

I am proposing that passing "-ffast-math" should have *exactly* the same effects as passing all of the individual command line options that are implied by -ffast-math. Likewise, passing "-funsafe-math-optimizations" should have *exactly* the same effects as passing all of the individual options that are implied by -funsafe-math-optimizations. And, of course, any combination of options that turn various states on and off can be strung together and we can just evaluate the final state those options leave us in to see if we are in the "fast-math" state or the "unsafe-fp-math" state.

I'm going to ignore fast-math right now, because I think the current handling is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" function attribute should all mean exactly the same thing. The function attribute has a different scope, so it might not always have the same value as the other two, but it should at least mean the same thing.

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

There are a couple of things we need to talk about here:

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function is looking for !MathErrno before deciding to pass "-funsafe-math-optimizations" to the front end, so LangOpts.UnsafeFPMath will not be set unless math-errno support is off. However, the RenderFloatingPointOptions() handling of -funsafe-math-optimizations does not change the MathErrno setting. Clearly, there's a bug here one way or the other. In GCC -funsafe-math-optimizations does not affect the math errno handling, so I think that's the correct behavior.
  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function does not consider the FPContract setting when deciding whether to pass "-funsafe-math-optimizations" to the front end or set the fp-contract value when handling the -funsafe-math-optimizations option. However, there are places in the backend that interpret the "unsafe-fp-math" function attribute (and I think also the TargetOptions.UnsafeFPMath flag) as allowing FMA exactly as fp-contract(fast) would. I think this is a case where for some reason the meaning of the "unsafe-fp-math" function attribute has diverged from the meaning of the -funsafe-math-optimizations option, but I can't think of any reason that the -funsafe-math-optimizations option should not allow fast fp-contract, so I think we should make it do so. This option derives from gcc, but gcc doesn't connect the fp-contract behavior to the fast-math options at all, and frankly, I think their fp-contract handling is fairly shoddy, so I think we should diverge from them on this point.

As a first principle, I'd like to clarify a "big picture" item that I was trying to get at in my earlier comment. I'd like to be able to reason about this from the starting point of individual semantic modes. There is a set of modes defined in the clang user's manual (https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior). I think this is reasonably complete:

exception_behavior
fenv_access
rounding_mode
fp-contract
denormal_fp_math
denormal_fp_math_32
support_math_errno
no_honor_nans
no_honor_infinities
no_signed_zeros
allow_reciprocal
allow_approximate_fns
allow_reassociation

These are the modes from clang's perspective. They get communicated to the back end in different ways. The backend handling of this isn't nearly as consistent as I'd like, but that's a different problem. I think these basic modes can be thought of as language-independent and should be used as the basic building blocks for reasoning about floating point behavior across all LLVM-based compilers.

On top of these modes, we have two concepts "fast-math" and "unsafe-math-optimizations" which are essentially composites of one or more of the above semantic modes. I say "essentially" because up to this point there has been some fuzzy handling of that. I'd like to say they are absolutely composites of the above modes, and I think we can make it so, starting here.

If we agree that fast-math and unsafe-math-optimizations/unsafe-fp-math are composites of some of the other modes, then we can apply a symmetry property that I think will be helpful in the problem we're trying to solve here and will lead to better reasoning about these settings in the future.

I am proposing that passing "-ffast-math" should have *exactly* the same effects as passing all of the individual command line options that are implied by -ffast-math. Likewise, passing "-funsafe-math-optimizations" should have *exactly* the same effects as passing all of the individual options that are implied by -funsafe-math-optimizations. And, of course, any combination of options that turn various states on and off can be strung together and we can just evaluate the final state those options leave us in to see if we are in the "fast-math" state or the "unsafe-fp-math" state.

Yes, I agree that this should be the long term goal.

I'm going to ignore fast-math right now, because I think the current handling is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" function attribute should all mean exactly the same thing. The function attribute has a different scope, so it might not always have the same value as the other two, but it should at least mean the same thing.

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

There are a couple of things we need to talk about here:

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function is looking for !MathErrno before deciding to pass "-funsafe-math-optimizations" to the front end, so LangOpts.UnsafeFPMath will not be set unless math-errno support is off. However, the RenderFloatingPointOptions() handling of -funsafe-math-optimizations does not change the MathErrno setting. Clearly, there's a bug here one way or the other. In GCC -funsafe-math-optimizations does not affect the math errno handling, so I think that's the correct behavior.

I agree there is a bug. My resolution for this would be to remove the !MathErrno from:

if (!MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros &&
    ApproxFunc && !TrappingMath)
  CmdArgs.push_back("-funsafe-math-optimizations");

so that we match GCC behavior and definition.

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function does not consider the FPContract setting when deciding whether to pass "-funsafe-math-optimizations" to the front end or set the fp-contract value when handling the -funsafe-math-optimizations option. However, there are places in the backend that interpret the "unsafe-fp-math" function attribute (and I think also the TargetOptions.UnsafeFPMath flag) as allowing FMA exactly as fp-contract(fast) would. I think this is a case where for some reason the meaning of the "unsafe-fp-math" function attribute has diverged from the meaning of the -funsafe-math-optimizations option, but I can't think of any reason that the -funsafe-math-optimizations option should not allow fast fp-contract, so I think we should make it do so. This option derives from gcc, but gcc doesn't connect the fp-contract behavior to the fast-math options at all, and frankly, I think their fp-contract handling is fairly shoddy, so I think we should diverge from them on this point.

The fact that "unsafe-fp-math"="true" implies -ffp-contract=fast is quite unfortunate, and it is something that in principle should be addressed.
My understanding is that the floating point math function attributes are a quite old concept that predates the per-instruction fast-math flags. Ideally we should get rid of the flooding point math function attribute since we do have a finer grain representation.
A while back the main issue was that all the backends code (e.g. DAGCombiner) were only using the TargetOptions properties (hence the function attributes) to control the floating point related transformations, hence there would have been regressions.

IMO being more conservative w.r.t. conditions for which the frontend emits the "unsafe-fp-math" attribute is a small step in the direction of phasing out the floating point math function attributes. However I don't know if the backend code has been enhanced to consider the per instruction fast-math flags to control transformations. If the backend code generators are now fast-math-flags aware, then there shouldn't be any significant regression due to missing *legal* transformations.

At high level, I think that we need to understand how important is to match GCC behavior in this particular case. We can change the way Clang defines -funsafe-math-optimizations to imply -ffp-contract=fast, but that seems the wrong solution as it feels like promoting a bug to a feature.

Use low level properties to set "unsafe-fp-math"="true".

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

I've updated the patch to use the low level options as you suggested. In the current version of the patch I'm not currently checking explicitly exception_behavior(ignore) && fenv_access(off) given that the other modes of these properties seems to work only if overall there is the "precise" profile. I don't know if that's acceptable, or if instead the CC1 shouldn't make any assumption about the compiler driver doing the right thing.

I'm not following entirely, but -funsafe-math-optimizations is just a subset of -ffast-math, so checking only the properties that contribute to -funsafe-math-optimizations should be enough.
I think it is way simpler to reason in these terms than enumerating all the possible scenarios.

Exactly my point Since -funsafe-math-optimizations is a subset of -ffast-math, shouldn't it be the "minimal" option to trigger the unsafe-fp-math attribute for functions?
I agree with the second part of the condition; it should be
updated with (LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_Fast || LangOpts.getDefaultFPContractMode() == LangOptions::FPModeKind::FPM_FastHonorPragmas)
but I am still hesitant about the change you are proposing for the first part of the condition.
@andrew.w.kaylor Can you please weigh in on this?

I talked to Zahira about this offline, and the current state is even messier than I realized. I think we need to start by making sure we agree on the behavior we want and then figure out how to get there. There are some messy complications in the different ways things are handled in the driver, the front end, and the backend. There are more overlapping settings than I would like, but I guess it's best to look for the best way we can incrementally improve that situation.

As a first principle, I'd like to clarify a "big picture" item that I was trying to get at in my earlier comment. I'd like to be able to reason about this from the starting point of individual semantic modes. There is a set of modes defined in the clang user's manual (https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior). I think this is reasonably complete:

exception_behavior
fenv_access
rounding_mode
fp-contract
denormal_fp_math
denormal_fp_math_32
support_math_errno
no_honor_nans
no_honor_infinities
no_signed_zeros
allow_reciprocal
allow_approximate_fns
allow_reassociation

These are the modes from clang's perspective. They get communicated to the back end in different ways. The backend handling of this isn't nearly as consistent as I'd like, but that's a different problem. I think these basic modes can be thought of as language-independent and should be used as the basic building blocks for reasoning about floating point behavior across all LLVM-based compilers.

On top of these modes, we have two concepts "fast-math" and "unsafe-math-optimizations" which are essentially composites of one or more of the above semantic modes. I say "essentially" because up to this point there has been some fuzzy handling of that. I'd like to say they are absolutely composites of the above modes, and I think we can make it so, starting here.

If we agree that fast-math and unsafe-math-optimizations/unsafe-fp-math are composites of some of the other modes, then we can apply a symmetry property that I think will be helpful in the problem we're trying to solve here and will lead to better reasoning about these settings in the future.

I am proposing that passing "-ffast-math" should have *exactly* the same effects as passing all of the individual command line options that are implied by -ffast-math. Likewise, passing "-funsafe-math-optimizations" should have *exactly* the same effects as passing all of the individual options that are implied by -funsafe-math-optimizations. And, of course, any combination of options that turn various states on and off can be strung together and we can just evaluate the final state those options leave us in to see if we are in the "fast-math" state or the "unsafe-fp-math" state.

Strongly agree.

I'm going to ignore fast-math right now, because I think the current handling is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" function attribute should all mean exactly the same thing. The function attribute has a different scope, so it might not always have the same value as the other two, but it should at least mean the same thing.

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

Agree. This is not currently the case. -funsafe-math-optimizations should set the FPContract=fast the same way it's done for -ffast-math. I think it makes sense to fix this in this patch. @michele.scandale WDYT?

There are a couple of things we need to talk about here:

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function is looking for !MathErrno before deciding to pass "-funsafe-math-optimizations" to the front end, so LangOpts.UnsafeFPMath will not be set unless math-errno support is off. However, the RenderFloatingPointOptions() handling of -funsafe-math-optimizations does not change the MathErrno setting. Clearly, there's a bug here one way or the other. In GCC -funsafe-math-optimizations does not affect the math errno handling, so I think that's the correct behavior.

Yep! This can be fixed in an upcoming patch. @michele.scandale Do you want to do it? If not, I can.

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function does not consider the FPContract setting when deciding whether to pass "-funsafe-math-optimizations" to the front end or set the fp-contract value when handling the -funsafe-math-optimizations option. However, there are places in the backend that interpret the "unsafe-fp-math" function attribute (and I think also the TargetOptions.UnsafeFPMath flag) as allowing FMA exactly as fp-contract(fast) would. I think this is a case where for some reason the meaning of the "unsafe-fp-math" function attribute has diverged from the meaning of the -funsafe-math-optimizations option, but I can't think of any reason that the -funsafe-math-optimizations option should not allow fast fp-contract, so I think we should make it do so. This option derives from gcc, but gcc doesn't connect the fp-contract behavior to the fast-math options at all, and frankly, I think their fp-contract handling is fairly shoddy, so I think we should diverge from them on this point.

If we stick to the meaning of unsafe math calculations as being what's defined above, then it makes sense to me that unsafe math calculation mode implies FPContract=fast.

I'm going to ignore fast-math right now, because I think the current handling is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" function attribute should all mean exactly the same thing. The function attribute has a different scope, so it might not always have the same value as the other two, but it should at least mean the same thing.

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

Agree. This is not currently the case. -funsafe-math-optimizations should set the FPContract=fast the same way it's done for -ffast-math. I think it makes sense to fix this in this patch. @michele.scandale WDYT?

(*) I understand that Clang has a different definition of -ffast-math compared to the GCC one (i.e. in Clang -ffast-math implies -ffp-contract=fast), and that this can be considered an indication that there isn't a strong desire to be compatible with GCC. If that's the generally accepted direction, I won't oppose it.

Given that we all agreed that we should be using only low level options inside the compiler, I would keep changes to the compiler driver separate from this IR code generation change, and the two should be completely independent.

There are a couple of things we need to talk about here:

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function is looking for !MathErrno before deciding to pass "-funsafe-math-optimizations" to the front end, so LangOpts.UnsafeFPMath will not be set unless math-errno support is off. However, the RenderFloatingPointOptions() handling of -funsafe-math-optimizations does not change the MathErrno setting. Clearly, there's a bug here one way or the other. In GCC -funsafe-math-optimizations does not affect the math errno handling, so I think that's the correct behavior.

Yep! This can be fixed in an upcoming patch. @michele.scandale Do you want to do it? If not, I can.

I can prepare a patch for that, and if there is a general consensus about (*) then we can have a single change that fixes the compiler driver code.

I'm going to ignore fast-math right now, because I think the current handling is mostly OK, and this review is about unsafe-fp-math. The unsafe-fp-math case is a little simpler in concept, but I think we've got more problems with it.

Another fundamental principle I'd like to declare here is that LangOpts.UnsafeFPMath, TargetOptions.UnsafeFPMath, and the "unsafe-fp-math" function attribute should all mean exactly the same thing. The function attribute has a different scope, so it might not always have the same value as the other two, but it should at least mean the same thing.

My understanding is this:

unsafe-fp-math = exception_behavior(ignore) +
                 fenv_access(off) +
                 no_signed_zeros(on) +
                 allow_reciprocal(on) +
                 allow_approximate_fns(on) +
                 allow_reassociation(on) +
                 fp_contract(fast)

The first two of those settings are the default behavior. The others can be turned on by individual command line options. If all of those semantic modes are in the states above, then we are in "unsafe-fp-math" mode. That's my proposal.

Agree. This is not currently the case. -funsafe-math-optimizations should set the FPContract=fast the same way it's done for -ffast-math. I think it makes sense to fix this in this patch. @michele.scandale WDYT?

(*) I understand that Clang has a different definition of -ffast-math compared to the GCC one (i.e. in Clang -ffast-math implies -ffp-contract=fast), and that this can be considered an indication that there isn't a strong desire to be compatible with GCC. If that's the generally accepted direction, I won't oppose it.

This is not new. It's already done this way in the current clang. So, it looks like the decision has been taken already.

Given that we all agreed that we should be using only low level options inside the compiler, I would keep changes to the compiler driver separate from this IR code generation change, and the two should be completely independent.

Okay.

There are a couple of things we need to talk about here:

  1. Currently, the Driver/ToolChains/Clang.cpp RenderFloatingPointOptions() function is looking for !MathErrno before deciding to pass "-funsafe-math-optimizations" to the front end, so LangOpts.UnsafeFPMath will not be set unless math-errno support is off. However, the RenderFloatingPointOptions() handling of -funsafe-math-optimizations does not change the MathErrno setting. Clearly, there's a bug here one way or the other. In GCC -funsafe-math-optimizations does not affect the math errno handling, so I think that's the correct behavior.

Yep! This can be fixed in an upcoming patch. @michele.scandale Do you want to do it? If not, I can.

I can prepare a patch for that, and if there is a general consensus about (*) then we can have a single change that fixes the compiler driver code.

zahiraam added inline comments.Oct 31 2022, 9:41 AM
clang/lib/CodeGen/CodeGenFunction.cpp
181

Shouldn't then this also check for FPFeatures.getFPContractMode() ?

The fact that "unsafe-fp-math"="true" implies -ffp-contract=fast is quite unfortunate, and it is something that in principle should be addressed.
My understanding is that the floating point math function attributes are a quite old concept that predates the per-instruction fast-math flags. Ideally we should get rid of the flooding point math function attribute since we do have a finer grain representation.
A while back the main issue was that all the backends code (e.g. DAGCombiner) were only using the TargetOptions properties (hence the function attributes) to control the floating point related transformations, hence there would have been regressions.

Yes, the function attributes are a vestige of the time before the fast-math flags were introduced, but the use of them hasn't been entirely eliminated and I think there are a couple of cases where some people think it's necessary. I posted an RFC and a patch about a year ago to start cleaning this up, but I got pulled away before it landed and honestly I had forgotten about it. https://discourse.llvm.org/t/rfc-eliminating-non-ir-floating-point-controls-in-the-selection-dag/59173

At high level, I think that we need to understand how important is to match GCC behavior in this particular case. We can change the way Clang defines -funsafe-math-optimizations to imply -ffp-contract=fast, but that seems the wrong solution as it feels like promoting a bug to a feature.

I wouldn't consider the fact that unsafe-fp-math (as a concept) implies fp-contract=fast to be a bug. Yes, it may appear to deviate from the gcc behavior, but the reasons for that are complicated. Mostly they stem from the fact that gcc doesn't support fp-contract in the way that the C standard describes it. In gcc, fp-contract is fast or off, and it defaults to fast. If you pass -ffp-contract=on to gcc, it behaves exactly like -ffp-contract=off. If you pass -std=c99, the default for fp-contract changes to on, but because gcc doesn't support fp-contract=on, FMA isn't formed. https://godbolt.org/z/K86Kv8W7h

My take on this is that fp-contract=on is a mode that conforms to the language standard and fp-contract=fast allows value changing optimizations that do not conform to the standard. Value-changing optimizations that do not conform to the standard are exactly what the definition of -funsafe-math-optimizations allow, so I can't see any reason that -funsafe-math-optimizations shouldn't imply fp-contract=fast.

So I was going to say that gcc is wrong, even by its own rules, to not allow fp-contract=fast behavior under -funsafe-math-optimizations, but then I double-checked my understanding and made a very happy discovery ... -funsafe-math-optimizations DOES imply fp-contract=fast in gcc! You just need to follow a convoluted path to see that (i.e. change the default to something other than fast using -std=c99 and then add the unsafe math option). https://godbolt.org/z/K1GonvGdT

clang/lib/CodeGen/CodeGenFunction.cpp
181

This should be covered by FPFeatures.allowFPContractAcrossStatement() which is basically getFPContractMode() == LangOptions::FPM_Fast.

michele.scandale marked an inline comment as done.Nov 1 2022, 3:09 PM
michele.scandale added inline comments.
clang/test/CodeGen/func-attr.c
5–15

Replaced command line with %clang_cc1 to avoid depending on compiler driver behavior.

The changes in this patch look good to me.
I talked to @andrew.w.kaylor offline: I was thinking that it might be necessary to make the two driver changes we talked about, before merging this patch. But if the tests pass then I think it's okay to implement the driver changes in an upcoming patch.
@michele.scandale please make sure not to drop the driver changes that we agreed upon in this patch. Thanks.

zahiraam accepted this revision.Nov 2 2022, 2:53 PM
This revision is now accepted and ready to land.Nov 2 2022, 2:53 PM

The changes in this patch look good to me.
@michele.scandale please make sure not to drop the driver changes that we agreed upon in this patch. Thanks.

I started looking at the change needed to have unsafe-math => fp-contract=fast.

If I look how -ffast-math behave today, I see that in the driver code -ffast-math changes the state for FPContract (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3032), but then the condition for which -ffast-math is forwarded to the CC1 (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3149) doesn't consider FPContract. This seems related to the fact that GCC does emit __FAST_MATH__ even if the contraction mode is not fast.
From a quick look the FastMath language options is used mainly to guard some macro definition and a codegen path for complex floating point values, so this seems ok in practice.

It seems intended that the semantic of -ffast-math for the CC1 is different than the semantic of -ffast-math for the compiler driver. Based on this, I'd expect a similar solution for -funsafe-math-optimizations, i.e. -funsafe-math-optimizations => -ffp-contract=fast only at the compiler driver level. Does this sound good?

If so, then the driver change for the -funsafe-math-optimizations -> -ffp-contract=fast could be done separately without affecting the code here.

I talked to @andrew.w.kaylor offline: I was thinking that it might be necessary to make the two driver changes we talked about, before merging this patch. But if the tests pass then I think it's okay to implement the driver changes in an upcoming patch.

In this patch the only suboptimal test change is the change to clang/test/CodeGenOpenCL/relaxed-fpmath.cl, where despite the presence of -cl-unsafe-math-optimizations the "unsafe-fp-math"="true" function attributes is not generated due to the contraction mode not being fast.
My understanding is that -cl-fast-relaxed-math should be equivalent to -ffast-math, and -cl-unsafe-math-optimizations should be equivalent to -funsafe-math-optimizations from the user perspective.

From what I see the -cl-* options are simply forwarded as-is to the CC1, and this seems to be the desired behavior for the compiler driver w.r.t. OpenCL specific options. From what I see https://github.com/llvm/llvm-project/blob/main/clang/lib/Frontend/CompilerInvocation.cpp#L3803 implements -cl-fast-relaxed-math => -ffast-math, but this solution doesn't play well if on the same command line you also have -ffp-contract=VAL as the relative order of the options is not taken into account. I'd expect a similar change for the -cl-unsafe-math-optimizations case.

I can either put a TODO comment in the clang/test/CodeGenOpenCL/relaxed-fpmath.cl test, and make the changes with the driver changes for -funsafe-math-optimizations, otherwise the change for the -cl-unsafe-math-optimizations options needs to be done in this patch.

Any preference?

The changes in this patch look good to me.
@michele.scandale please make sure not to drop the driver changes that we agreed upon in this patch. Thanks.

I started looking at the change needed to have unsafe-math => fp-contract=fast.

If I look how -ffast-math behave today, I see that in the driver code -ffast-math changes the state for FPContract (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3032), but then the condition for which -ffast-math is forwarded to the CC1 (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3149) doesn't consider FPContract. This seems related to the fact that GCC does emit __FAST_MATH__ even if the contraction mode is not fast.
From a quick look the FastMath language options is used mainly to guard some macro definition and a codegen path for complex floating point values, so this seems ok in practice.

It seems intended that the semantic of -ffast-math for the CC1 is different than the semantic of -ffast-math for the compiler driver. Based on this, I'd expect a similar solution for -funsafe-math-optimizations, i.e. -funsafe-math-optimizations => -ffp-contract=fast only at the compiler driver level. Does this sound good?

If so, then the driver change for the -funsafe-math-optimizations -> -ffp-contract=fast could be done separately without affecting the code here.

I talked to @andrew.w.kaylor offline: I was thinking that it might be necessary to make the two driver changes we talked about, before merging this patch. But if the tests pass then I think it's okay to implement the driver changes in an upcoming patch.

In this patch the only suboptimal test change is the change to clang/test/CodeGenOpenCL/relaxed-fpmath.cl, where despite the presence of -cl-unsafe-math-optimizations the "unsafe-fp-math"="true" function attributes is not generated due to the contraction mode not being fast.
My understanding is that -cl-fast-relaxed-math should be equivalent to -ffast-math, and -cl-unsafe-math-optimizations should be equivalent to -funsafe-math-optimizations from the user perspective.

From what I see the -cl-* options are simply forwarded as-is to the CC1, and this seems to be the desired behavior for the compiler driver w.r.t. OpenCL specific options. From what I see https://github.com/llvm/llvm-project/blob/main/clang/lib/Frontend/CompilerInvocation.cpp#L3803 implements -cl-fast-relaxed-math => -ffast-math, but this solution doesn't play well if on the same command line you also have -ffp-contract=VAL as the relative order of the options is not taken into account. I'd expect a similar change for the -cl-unsafe-math-optimizations case.

I can either put a TODO comment in the clang/test/CodeGenOpenCL/relaxed-fpmath.cl test, and make the changes with the driver changes for -funsafe-math-optimizations, otherwise the change for the -cl-unsafe-math-optimizations options needs to be done in this patch.

Any preference?

That's exactly what I was worried about.
I think it would be best to create another patch to make the 'funsafe-math-optimization' => FpContract=fast and make sure we have coherence with 'ffast-math' and the cl options, then check in this patch once that's done.

The changes in this patch look good to me.
@michele.scandale please make sure not to drop the driver changes that we agreed upon in this patch. Thanks.

I started looking at the change needed to have unsafe-math => fp-contract=fast.

If I look how -ffast-math behave today, I see that in the driver code -ffast-math changes the state for FPContract (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3032), but then the condition for which -ffast-math is forwarded to the CC1 (https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L3149) doesn't consider FPContract. This seems related to the fact that GCC does emit __FAST_MATH__ even if the contraction mode is not fast.
From a quick look the FastMath language options is used mainly to guard some macro definition and a codegen path for complex floating point values, so this seems ok in practice.

It seems intended that the semantic of -ffast-math for the CC1 is different than the semantic of -ffast-math for the compiler driver. Based on this, I'd expect a similar solution for -funsafe-math-optimizations, i.e. -funsafe-math-optimizations => -ffp-contract=fast only at the compiler driver level. Does this sound good?

If so, then the driver change for the -funsafe-math-optimizations -> -ffp-contract=fast could be done separately without affecting the code here.

I talked to @andrew.w.kaylor offline: I was thinking that it might be necessary to make the two driver changes we talked about, before merging this patch. But if the tests pass then I think it's okay to implement the driver changes in an upcoming patch.

In this patch the only suboptimal test change is the change to clang/test/CodeGenOpenCL/relaxed-fpmath.cl, where despite the presence of -cl-unsafe-math-optimizations the "unsafe-fp-math"="true" function attributes is not generated due to the contraction mode not being fast.
My understanding is that -cl-fast-relaxed-math should be equivalent to -ffast-math, and -cl-unsafe-math-optimizations should be equivalent to -funsafe-math-optimizations from the user perspective.

From what I see the -cl-* options are simply forwarded as-is to the CC1, and this seems to be the desired behavior for the compiler driver w.r.t. OpenCL specific options. From what I see https://github.com/llvm/llvm-project/blob/main/clang/lib/Frontend/CompilerInvocation.cpp#L3803 implements -cl-fast-relaxed-math => -ffast-math, but this solution doesn't play well if on the same command line you also have -ffp-contract=VAL as the relative order of the options is not taken into account. I'd expect a similar change for the -cl-unsafe-math-optimizations case.

I can either put a TODO comment in the clang/test/CodeGenOpenCL/relaxed-fpmath.cl test, and make the changes with the driver changes for -funsafe-math-optimizations, otherwise the change for the -cl-unsafe-math-optimizations options needs to be done in this patch.

Any preference?

That's exactly what I was worried about.
I think it would be best to create another patch to make the 'funsafe-math-optimization' => FpContract=fast and make sure we have coherence with 'ffast-math' and the cl options, then check in this patch once that's done.

@andrew.w.kaylor , @michele.scandale I created a draft patch https://reviews.llvm.org/D137578 that fixes the 2 issues mentioned above. Let me know if the entry is visible to you (it's a private entry for now)? Thanks.

Rebase on top of compiler driver changes for -funsafe-math-optimizations

This revision was automatically updated to reflect the committed changes.