This fixes the mapping between CC1 command line options to the
properties in LangOptions describing the floating point optimizations
configuration.
The default fast-math flags for the IR builder are now derived from such
properties to avoid inconsistencies.
Finally some of the CodeGenOptions floating point optimizations
properties have been removed since they now exist in LangOptions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/test/CodeGenOpenCL/relaxed-fpmath.cl | ||
---|---|---|
14 | This change is based on the following:
See https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/clBuildProgram.html and https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html |
clang/test/CodeGenOpenCL/relaxed-fpmath.cl | ||
---|---|---|
14 | Makes sense! |
The code cleanups all seems reasonable. The actual changes in code-generation changes are because we were failing to set these reliably?
Most of them yes.
In the CUDA test we there is now contract because we honor the default contraction mode that for CUDA is set to fast.
In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line option because -ffast-math at the CC1 level does not change the default contraction mode. We might want to treat -ffast-math similarly to -cl-fast-relaxed-math, i.e. it implies by default the "fast" contraction mode. Before this change there behavior was accidentally the same as "-ffast-math changes the default contraction mode" because there was:
if (CGM.getLangOpts().FastMath) FMF.setFast()
and so the bit for AllowContract was enabled in the fast-math flags.
Right, and we weren't honoring that mode before?
In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line option because -ffast-math at the CC1 level does not change the default contraction mode.
Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.
Not in the setup of base fast-math flags inside CodeGenFunction. However when emitting code for expression with FPOptions stored in the AST nodes then contract was set correctly.
In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line option because -ffast-math at the CC1 level does not change the default contraction mode.
Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.
Do you think we should handle -ffast-math as -cl-fast-relaxed-math, i.e. it implies the default contraction mode being "fast"?
Okay, that seems justifiable.
In clang/test/CodeGen/complex-math.c I've added -ffp-contract=fast to the command line option because -ffast-math at the CC1 level does not change the default contraction mode.
Oh, I see, makes sense. So there was inconsistent treatment of the options, where one thing observed it but others didn't, and that's been fixed now.
Do you think we should handle -ffast-math as -cl-fast-relaxed-math, i.e. it implies the default contraction mode being "fast"?
I'm actually surprised it doesn't. I can't imagine why someone enabling fast math would want contraction to be disabled.
Just to be clear the clang driver does the right thing.
If you run clang -ffast-math the CC1 invocation has both -ffast-math and -ffp-contract=fast (and other options as well)
Here specifically I'm just considering the behavior of clang -cc1 -ffast-math.
I'm actually surprised it doesn't. I can't imagine why someone enabling fast math would want contraction to be disabled.
Just to be clear the clang driver does the right thing.
If you run clang -ffast-math the CC1 invocation has both -ffast-math and -ffp-contract=fast (and other options as well)Here specifically I'm just considering the behavior of clang -cc1 -ffast-math.
Okay. It is probably best for testing purposes if flags like this stay as close as possible to their driver behavior. The driver can just canonicalize different spellings down to the interface that -cc1 wants.
I've just realized it might be incorrect to have the CC1 option -ffast-math changing the default contraction mode. The clang driver generates -ffast-math based on conditions that do not involve the contraction mode state at all:
// -ffast-math enables the __FAST_MATH__ preprocessor macro, but check for the // individual features enabled by -ffast-math instead of the option itself as // that's consistent with gcc's behaviour. if (!HonorINFs && !HonorNaNs && !MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros && !TrappingMath && !RoundingFPMath) { CmdArgs.push_back("-ffast-math"); if (FPModel.equals("fast")) { if (FPContract.equals("fast")) // All set, do nothing. ; else if (FPContract.empty()) // Enable -ffp-contract=fast CmdArgs.push_back(Args.MakeArgString("-ffp-contract=fast")); else D.Diag(clang::diag::warn_drv_overriding_flag_option) << "-ffp-model=fast" << Args.MakeArgString("-ffp-contract=" + FPContract); } }
For example the running the following clang -### -funsafe-math-optimizations -ffinite-math-only -x c - lead to a CC1 command line without any -ffp-contract= option relying on the fact that the default value for the contraction mode in the compiler is OFF.
I will revert the modification on this aspect.
Revert last change about -ffast-math imply "fast" contraction mode by default in CC1.
Rebase.
Per a private discussion, it seems like the right thing to do here would be stop recognizing -ffast-math flag in cc1 and just put the driver in charge of all these individual flags. That may necessitate adding a -fdefine-fast-math cc1 option to control the #define, but the logic for that seems complex enough that it ought to be left to the driver.
Separately, we should consider making some more driver options imply -ffp-contract=fast. Fast contraction is the default behavior in GCC, and while Clang has chosen not to follow that precedent, it seems to be in keeping with the spirit of options like -funsafe-math-optimizations that they should broadly enable contraction if that policy isn't otherwise specified.
Those sorts of changes should be kept separate from this kind of refactor, though. This is still approved.
For CUDA the default FP contract mode is fast, therefore the contract FMF is emitted.