This patch changes the ffp-model=precise to enables -ffp-contract=on (previously -ffp-model=precise enabled -ffp-contract=fast). This is a follow-up to Andy Kaylor's comments in the llvm-dev discussion "Floating Point semantic modes". From the same email thread, I put Andy's distillation of floating point options and floating point modes into UsersManual.rst
Details
- Reviewers
andrew.w.kaylor rjmccall sepavloff zahiraam lebedev.ri - Commits
- rG48ad446a0fb2: [clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp…
rGb9b696bba670: [clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp…
rGa1449a10dbcf: [clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp…
rG8daac3714083: [clang][FPEnv] Clang floatng point model ffp-model=precise enables ffp…
rGabd09053bc7a: Revert "Revert "Change clang option -ffp-model=precise to select ffp…
rG3fcdf2fa945a: Change clang option -ffp-model=precise to select ffp-contract=on
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/docs/UsersManual.rst | ||
---|---|---|
1459 | I'm confused. Where in this patch the This patch establishes the default option for -ffp-model to select "precise". happens? LHS of diff says it is already default |
I agree that the default mode should be standards-compliant, which I believe permits fp-contract=on (contraction within expressions) but not fp-contract=fast (arbitrary contraction).
clang/docs/UsersManual.rst | ||
---|---|---|
1459 | The comments said that it was the default, but the actual default was something that didn't quite match any of the fp-models -- precise but with fp contraction off. | |
clang/lib/Driver/ToolChains/Clang.cpp | ||
2666–2672 | I think this always gets changed to fast, on, or off below, but making it empty here looks wrong. | |
2783–2787 | Does this mean that "-ffp-model=precise -ffp-contract=off" will leave FP contraction on? That doesn't seem right. | |
2890–2895 | We should never get here with FPContract empty. |
some replies to Andy. I'll upload another patch here which passed check-all locally. then i'll re-commit it.
clang/docs/UsersManual.rst | ||
---|---|---|
1459 | yes you're right. sorry for the confusion. i changed the title and description | |
clang/lib/Driver/ToolChains/Clang.cpp | ||
2783–2787 | No. When -ffp-model=precise is on the command line, we end up here because the code above changes optID to ffp_contract, and sets FPContract to "on". The Val string will be "precise", and it shouldn't be checked. I'll change the comment. |
FYI this caused failed builds for aarch64 (when building with -fno-signed-zeros), hitting unreachable statements in TargetLowering::getNegatedExpression - see https://bugs.llvm.org/show_bug.cgi?id=44892. I guess that's a bug in the aarch64 backend and not with this change in itself, but it was only uncovered by this change...
Looks like this patch regressed the test-suite:
https://lnt.llvm.org/db_default/v4/nts/recent_activity
Now 45min+
This is also breaking test-suites in our internal buildbots ..
This change is actually changing the default behavior: if user does NOT supply -ffp-contract, it becomes -ffp-contract=on.
So this means any code without explicitly add -ffp-contract=off will have -ffp-contract=on.
eg: -O0 will also generate FMA.
Is this intended behavior change? @mibintc @andrew.w.kaylor
You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.
You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.
Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math
Just a suggestion, I'm not sure whether that would be the best breakdown. Perhaps we can also see what the defaults are for GCC and unify with those?
@jsji Will your backend tolerate -ffp-contract=on if optimizations are not disabled, e.g. -O2? We are proposing that the default floating point model be -ffp-model=precise, and with precise model, the ffp-contract=on. However you are right we don't want the frontend to create FMA when optimizations are disabled -O0
It looks like something strange happened with the revert:
clang-11: warning: overriding '-ffp-model=strict' option with '-ffp-model=strict' [-Woverriding-t-option]
I believe the problem is that the original change that was being reverted contained this:
clang/lib/Driver/ToolChains/Clang.cpp @@ -2768,7 +2766,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, !AssociativeMath && !ReciprocalMath && SignedZeros && TrappingMath && RoundingFPMath && DenormalFPMath != llvm::DenormalMode::getIEEE() && + FPContract.empty()) - (FPContract.equals("off") || FPContract.empty()))
But sometime in the land-revert-land-revert cycle the line above that changed, causing the merge to miss this change in the most recent revert. I see that @MaskRay has since re-landed this change set, but it's going to cause problems for PowerPC. If someone needs to revert this yet again, I think it can be safely done by recovering the change above.
Apologies for the mess!
We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).
But yes, I think we are OK to accept -ffp-contract=on (not -ffp-contract=fast) at -O2 if there are strong justifications to do so on clang.
I've been discussing this with @scanon on the cfe-dev mailing list, and he has convinced me that we should create FMA options at -O0 if the fp-contract setting is "on" -- including when it is on by default. The reasoning that persuaded me was, "preserving FMA formation at O0 _helps_ debuggability, because it means that numerical behavior is more likely to match what a user observed at Os, allowing them to debug the problem."
Yeah, I agree with Steve here. The great virtue of -ffp-contract=on over the more aggressive modes is that FMA formation is purely local to a single source expression, which means there's really no obstacle to treating it as implementation-guaranteed semantics. Such behavior should be done consistently across compiler settings rather than potentially making the numeric result of an expression subject to optimizer settings and choices; the latter is fine for "fast" modes but not really acceptable as the default compiler behavior. If a backend needs to legalize the intrinsic back to separate multiply + add instructions, that's okay, but it should do so consistently rather than only under a specific optimization level.
We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).
Correction: XL on PowerPC default to generate FMA at -O0 as well. The corresponding option is -qfloat=maf.
https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.xlcpp1611.lelinux.doc/compiler_ref/opt_float.html
GCC doesn't support "on" so I'm not sure it's possible to distinguish their intentions. I think it would be better to define what we think is best for clang and encourage GCC to unify with that.
IIUC the patch is currently reverted (in 78654e8511cf16d49f6680d782f3771a767ba942), due to ~20 llvm-test-suite failures.
The failures on X86 are listed below. Unfortunately most tests just compare the hash of the output, so it is not obvious in what way the results are changed. Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.
Failed Tests (22): test-suite :: MultiSource/Applications/oggenc/oggenc.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
Presumably it means that whatever optimization options those test specify, are now either more strict or less strict.
I think this is really missing some codegen tests that show how codegen actually changes by this.
Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.
Failed Tests (22): test-suite :: MultiSource/Applications/oggenc/oggenc.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
And by codegen changes i mostly mean newly-set/now-unset fp fast-math instruction flags.
I rebased and enhanced the test case clang/test/CodeGen/ffp-contract-option.c to show the effect of various ffp-contract={on,fast,off} * ffast-math=on,off in response to the request from @lebedev.ri ; sorry for leaving this hanging for so long, I wasn't sure what to do about the optimization discrepancies but I have a plan now. We think it could be caused by a particular nuance on Broadwell.
There are 3 clang settings for ffp-contract (on, off, fast) but the FMF bits have only "allow contract". Clang sets the "allow contract" bit in the IRBuilder only when ffp-contract=fast
I added some codegen tests to show you what ffp-contract is doing: the tests added are: ffp-contract={on,off,fast} * {ffast-math * fno-fast-math}
I created https://reviews.llvm.org/D102861 to make changes to the failing LNT tests. Hoping to push this commit after the test changes in D102862 are approved.
No real comments from me.
I assume, the errors are because -ffp-contract=on actually results in *less* error?
clang/docs/UsersManual.rst | ||
---|---|---|
1501 | Maybe you should add short info to Release Notes as well? |
clang/docs/UsersManual.rst | ||
---|---|---|
1501 |
Good idea thanks. Can I do that in a separate patch? |
clang/docs/UsersManual.rst | ||
---|---|---|
1501 | Yes, maybe wait a few days in case of the revert of this patch. |
The bot is showing a fail due to this patch, see https://lab.llvm.org/buildbot#builders/110/builds/4007
It looks like my updates to LNT earlier this week haven't been migrated to the bot, is that right?
See https://reviews.llvm.org/rTccc86f938839d02de3a6564b2d80fb47a60aa069
In the json log I can see ffp-contract=off but it should be overridden to ffp-contract=off on the command line due to the LNT change.
Is there something else I need to do to update those sources?
I’ve been trying to commit this patch. However, when I commit, the tests fail some 21 floating point tests in LNT. Also there is both performance improvement as well as regression which shouldn’t happen because the sum total of my clang patch + my LNT patch should effectively be a no-op.
It appears that the correction I made to LNT is not effective, but I’ve tested the patch on my Linux box and it does work. My patch unconditionally overrides the command line to restore the original LNT behavior. I verified the commit revision to LNT and the bot is running my patched version of LNT. It looks like the bot is using a different rule to build the test cases.
It’s hard to see exactly what the bot is doing, there isn’t enough information. It would be nice if the bot Makefile was built with ‘verbose’ so I could see all the compile commands that the bot is using.
One of the bot failures is here: http://lnt.llvm.org/db_default/v4/nts/147167
I can see the bot owner is Yin Yang, I will reach out.
I had to revert again, it's still failing on Intel buildbots. Cannot reproduce on Intel-internal Broadwell server.
We are seeing float related unit tests failing in Fuchsia after this patch. What flags should we add to our build to match llvm's previous float point behavior?
Seems like lnt related things are stable now so maybe it is a good time to update release notes?
We're seeing float related tests fail in Chromium too (crbug.com/1235145). I haven't read the background llvm-dev thread about this, but is changing the default behavior here really intentional? Does it match what the standards require and what other compilers do?
It looks like this patch is also causing mis-compares in both SPEC2006 and SPEC2017 on AArch64, e.g. in External/SPEC/CFP2006/453.povray/.
Can we revert this patch please? If I understood correctly, this patch changed the default behavior of clang when generating code for float point arithmetic. After this patch -ffp-contract was set to on if this flag was not specified. In theory this makes clang use FMA instructions that should be slightly more efficient, but there are cases in the field that certain programs relies on the stable rounding behavior of float point arithmetic, and this patch broke that. And in fact we already seen breakages in Fuchsia, Chromium and SPEC benchmark suite and there might be other people seeing the same breakages but still working on bisecting this breaking change.
Even if this breaking change is reasonable to proceed and receive condense from major clang users, this patch lacks enough test coverage for codegen of float point arithmetic on major CPU platforms without related flag specified. Nobody will discover this is a default behavior breaking patch by just reading this patch. This should be reverted and reworked.
There was a long discussion on cfe-dev about this issue approximately January-February 2020 including @scanon @andrew.w.kaylor and many others. While there wasn't 100% consensus to move to ffp-contract=on there are many reasons to do it including better numerical results and debuggability. I'm hoping that @andrew.w.kaylor @kbsmith1 @rjmccall and others will join in here.
I see that the UserManual has been updated, but this behavior change should probably also be mentioned in the Release Notes.
Can someone file a bug for this and put release-13.0.0 in the blocks field so we can track it?
Filed bug to edit the release not as suggested by @xbolva00 here: https://bugs.llvm.org/show_bug.cgi?id=51347
FWIW, fp-contract=on has been the documented default for clang since version 5.
This change just brought the behavior into conformance with the documentation.
This patch had to be reverted because SPEC2006 and SPEC2017 are failing.
More information here:
https://bugs.llvm.org/show_bug.cgi?id=51346
I'm confused. Where in this patch the This patch establishes the default option for -ffp-model to select "precise". happens? LHS of diff says it is already default