Page MenuHomePhabricator

Change clang option -ffp-model=precise to select ffp-contract=on
ClosedPublic

Authored by mibintc on Feb 11 2020, 12:50 PM.

Details

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

The revert of this breaks tests everywhere, as far as I can tell.

It looks like something strange happened with the revert:

clang-11: warning: overriding '-ffp-model=strict' option with '-ffp-model=strict' [-Woverriding-t-option]

I believe the problem is that the original change that was being reverted contained this:

clang/lib/Driver/ToolChains/Clang.cpp 
@@ -2768,7 +2766,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
        !AssociativeMath && !ReciprocalMath &&
        SignedZeros && TrappingMath && RoundingFPMath &&
        DenormalFPMath != llvm::DenormalMode::getIEEE() &&
+        FPContract.empty())
-        (FPContract.equals("off") || FPContract.empty()))

But sometime in the land-revert-land-revert cycle the line above that changed, causing the merge to miss this change in the most recent revert. I see that @MaskRay has since re-landed this change set, but it's going to cause problems for PowerPC. If someone needs to revert this yet again, I think it can be safely done by recovering the change above.

Apologies for the mess!

jsji added a comment.Feb 14 2020, 8:31 AM

@jsji Will your backend tolerate -ffp-contract=on if optimizations are not disabled, e.g. -O2? We are proposing that the default floating point model be -ffp-model=precise, and with precise model, the ffp-contract=on. However you are right we don't want the frontend to create FMA when optimizations are disabled -O0

We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).
But yes, I think we are OK to accept -ffp-contract=on (not -ffp-contract=fast) at -O2 if there are strong justifications to do so on clang.

However you are right we don't want the frontend to create FMA when optimizations are disabled -O0

I've been discussing this with @scanon on the cfe-dev mailing list, and he has convinced me that we should create FMA options at -O0 if the fp-contract setting is "on" -- including when it is on by default. The reasoning that persuaded me was, "preserving FMA formation at O0 _helps_ debuggability, because it means that numerical behavior is more likely to match what a user observed at Os, allowing them to debug the problem."

Yeah, I agree with Steve here. The great virtue of -ffp-contract=on over the more aggressive modes is that FMA formation is purely local to a single source expression, which means there's really no obstacle to treating it as implementation-guaranteed semantics. Such behavior should be done consistently across compiler settings rather than potentially making the numeric result of an expression subject to optimizer settings and choices; the latter is fine for "fast" modes but not really acceptable as the default compiler behavior. If a backend needs to legalize the intrinsic back to separate multiply + add instructions, that's okay, but it should do so consistently rather than only under a specific optimization level.

jsji added a comment.Feb 14 2020, 11:19 AM

We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).

Correction: XL on PowerPC default to generate FMA at -O0 as well. The corresponding option is -qfloat=maf.
https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.xlcpp1611.lelinux.doc/compiler_ref/opt_float.html

You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.

Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math

Just a suggestion, I'm not sure whether that would be the best breakdown. Perhaps we can also see what the defaults are for GCC and unify with those?

GCC doesn't support "on" so I'm not sure it's possible to distinguish their intentions. I think it would be better to define what we think is best for clang and encourage GCC to unify with that.

Perhaps we can also see what the defaults are for GCC and unify with those?

+1

fhahn reopened this revision.Aug 3 2020, 3:48 AM
fhahn added a subscriber: fhahn.

IIUC the patch is currently reverted (in 78654e8511cf16d49f6680d782f3771a767ba942), due to ~20 llvm-test-suite failures.

The failures on X86 are listed below. Unfortunately most tests just compare the hash of the output, so it is not obvious in what way the results are changed. Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.

Failed Tests (22):
  test-suite :: MultiSource/Applications/oggenc/oggenc.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test
  test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test
  test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test
  test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test
  test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
This revision is now accepted and ready to land.Aug 3 2020, 3:48 AM
lebedev.ri requested changes to this revision.Aug 3 2020, 4:23 AM

IIUC the patch is currently reverted (in 78654e8511cf16d49f6680d782f3771a767ba942), due to ~20 llvm-test-suite failures.

The failures on X86 are listed below. Unfortunately most tests just compare the hash of the output, so it is not obvious in what way the results are changed.

Presumably it means that whatever optimization options those test specify, are now either more strict or less strict.
I think this is really missing some codegen tests that show how codegen actually changes by this.

Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.

Failed Tests (22):
  test-suite :: MultiSource/Applications/oggenc/oggenc.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test
  test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test
  test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test
  test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test
  test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
This revision now requires changes to proceed.Aug 3 2020, 4:23 AM

<...>

And by codegen changes i mostly mean newly-set/now-unset fp fast-math instruction flags.

mibintc updated this revision to Diff 345526.May 14 2021, 12:17 PM

I rebased and enhanced the test case clang/test/CodeGen/ffp-contract-option.c to show the effect of various ffp-contract={on,fast,off} * ffast-math=on,off in response to the request from @lebedev.ri ; sorry for leaving this hanging for so long, I wasn't sure what to do about the optimization discrepancies but I have a plan now. We think it could be caused by a particular nuance on Broadwell.

There are 3 clang settings for ffp-contract (on, off, fast) but the FMF bits have only "allow contract". Clang sets the "allow contract" bit in the IRBuilder only when ffp-contract=fast

Matt added a subscriber: Matt.May 19 2021, 11:13 AM

<...>

And by codegen changes i mostly mean newly-set/now-unset fp fast-math instruction flags.

I added some codegen tests to show you what ffp-contract is doing: the tests added are: ffp-contract={on,off,fast} * {ffast-math * fno-fast-math}

I created https://reviews.llvm.org/D102861 to make changes to the failing LNT tests. Hoping to push this commit after the test changes in D102862 are approved.

Hoping @lebedev.ri will take a look since he requested changes, thanks!

lebedev.ri resigned from this revision.May 27 2021, 1:45 PM

No real comments from me.
I assume, the errors are because -ffp-contract=on actually results in *less* error?

This revision is now accepted and ready to land.May 27 2021, 1:45 PM

No real comments from me.
I assume, the errors are because -ffp-contract=on actually results in *less* error?

Yes, FMA improves accuracy, thank you!

xbolva00 added inline comments.Jun 10 2021, 6:36 AM
clang/docs/UsersManual.rst
1501

Maybe you should add short info to Release Notes as well?

mibintc added inline comments.Jun 10 2021, 6:40 AM
clang/docs/UsersManual.rst
1501

Maybe you should add short info to Release Notes as well?

Good idea thanks. Can I do that in a separate patch?

xbolva00 added inline comments.Jun 10 2021, 6:44 AM
clang/docs/UsersManual.rst
1501

Yes, maybe wait a few days in case of the revert of this patch.

The bot is showing a fail due to this patch, see https://lab.llvm.org/buildbot#builders/110/builds/4007

It looks like my updates to LNT earlier this week haven't been migrated to the bot, is that right?

See https://reviews.llvm.org/rTccc86f938839d02de3a6564b2d80fb47a60aa069

In the json log I can see ffp-contract=off but it should be overridden to ffp-contract=off on the command line due to the LNT change.

Is there something else I need to do to update those sources?

I’ve been trying to commit this patch. However, when I commit, the tests fail some 21 floating point tests in LNT. Also there is both performance improvement as well as regression which shouldn’t happen because the sum total of my clang patch + my LNT patch should effectively be a no-op.

It appears that the correction I made to LNT is not effective, but I’ve tested the patch on my Linux box and it does work. My patch unconditionally overrides the command line to restore the original LNT behavior. I verified the commit revision to LNT and the bot is running my patched version of LNT. It looks like the bot is using a different rule to build the test cases.

It’s hard to see exactly what the bot is doing, there isn’t enough information. It would be nice if the bot Makefile was built with ‘verbose’ so I could see all the compile commands that the bot is using.

One of the bot failures is here: http://lnt.llvm.org/db_default/v4/nts/147167

I can see the bot owner is Yin Yang, I will reach out.

mibintc reopened this revision.Jun 19 2021, 5:26 AM
This revision is now accepted and ready to land.Jun 19 2021, 5:26 AM
mibintc reopened this revision.Jul 22 2021, 6:45 AM

I had to revert again, it's still failing on Intel buildbots. Cannot reproduce on Intel-internal Broadwell server.

This revision is now accepted and ready to land.Jul 22 2021, 6:45 AM
haowei added a subscriber: haowei.Jul 30 2021, 12:16 PM

We are seeing float related unit tests failing in Fuchsia after this patch. What flags should we add to our build to match llvm's previous float point behavior?

We are seeing float related unit tests failing in Fuchsia after this patch. What flags should we add to our build to match llvm's previous float point behavior?

This option: ffp-contract=off

Seems like lnt related things are stable now so maybe it is a good time to update release notes?

hans added a subscriber: hans.Aug 2 2021, 1:08 AM

We are seeing float related unit tests failing in Fuchsia after this patch. What flags should we add to our build to match llvm's previous float point behavior?

This option: ffp-contract=off

We're seeing float related tests fail in Chromium too (crbug.com/1235145). I haven't read the background llvm-dev thread about this, but is changing the default behavior here really intentional? Does it match what the standards require and what other compilers do?

fhahn added a comment.Aug 3 2021, 7:53 AM

It looks like this patch is also causing mis-compares in both SPEC2006 and SPEC2017 on AArch64, e.g. in External/SPEC/CFP2006/453.povray/.

haowei added a comment.Aug 3 2021, 9:51 AM

Can we revert this patch please? If I understood correctly, this patch changed the default behavior of clang when generating code for float point arithmetic. After this patch -ffp-contract was set to on if this flag was not specified. In theory this makes clang use FMA instructions that should be slightly more efficient, but there are cases in the field that certain programs relies on the stable rounding behavior of float point arithmetic, and this patch broke that. And in fact we already seen breakages in Fuchsia, Chromium and SPEC benchmark suite and there might be other people seeing the same breakages but still working on bisecting this breaking change.

Even if this breaking change is reasonable to proceed and receive condense from major clang users, this patch lacks enough test coverage for codegen of float point arithmetic on major CPU platforms without related flag specified. Nobody will discover this is a default behavior breaking patch by just reading this patch. This should be reverted and reworked.

hans added a subscriber: tstellar.Aug 4 2021, 1:31 AM

Also +@tstellar since I believe this went in before the 13 branch point.

There was a long discussion on cfe-dev about this issue approximately January-February 2020 including @scanon @andrew.w.kaylor and many others. While there wasn't 100% consensus to move to ffp-contract=on there are many reasons to do it including better numerical results and debuggability. I'm hoping that @andrew.w.kaylor @kbsmith1 @rjmccall and others will join in here.

I see that the UserManual has been updated, but this behavior change should probably also be mentioned in the Release Notes.

Can someone file a bug for this and put release-13.0.0 in the blocks field so we can track it?

haowei added a comment.Aug 4 2021, 1:59 PM

Can someone file a bug for this and put release-13.0.0 in the blocks field so we can track it?

Filed https://bugs.llvm.org/show_bug.cgi?id=51346 .

zahiraam added a comment.EditedAug 4 2021, 2:03 PM

Filed bug to edit the release not as suggested by @xbolva00 here: https://bugs.llvm.org/show_bug.cgi?id=51347

FWIW, fp-contract=on has been the documented default for clang since version 5.

https://releases.llvm.org/5.0.1/tools/clang/docs/ClangCommandLineReference.html#cmdoption-clang-ffp-contract

This change just brought the behavior into conformance with the documentation.

This patch had to be reverted because SPEC2006 and SPEC2017 are failing.
More information here:
https://bugs.llvm.org/show_bug.cgi?id=51346