Page MenuHomePhabricator

Change clang option -ffp-model=precise to select ffp-contract=on
Needs RevisionPublic

Authored by mibintc on Feb 11 2020, 12:50 PM.

Details

Summary

This patch changes the ffp-model=precise to enables -ffp-contract=on (previously -ffp-model=precise enabled -ffp-contract=fast). This is a follow-up to Andy Kaylor's comments in the llvm-dev discussion "Floating Point semantic modes". From the same email thread, I put Andy's distillation of floating point options and floating point modes into UsersManual.rst

Diff Detail

Event Timeline

mibintc created this revision.Feb 11 2020, 12:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2020, 12:50 PM
lebedev.ri added inline comments.Feb 11 2020, 12:58 PM
clang/docs/UsersManual.rst
1388

I'm confused. Where in this patch the This patch establishes the default option for -ffp-model to select "precise". happens? LHS of diff says it is already default

mibintc retitled this revision from Change clang default to -ffp-model=precise to Change clang option -ffp-model=precise to select ffp-contract=on.Feb 11 2020, 1:04 PM
mibintc edited the summary of this revision. (Show Details)

I agree that the default mode should be standards-compliant, which I believe permits fp-contract=on (contraction within expressions) but not fp-contract=fast (arbitrary contraction).

rjmccall accepted this revision.Feb 11 2020, 1:54 PM
This revision is now accepted and ready to land.Feb 11 2020, 1:54 PM
clang/docs/UsersManual.rst
1388

The comments said that it was the default, but the actual default was something that didn't quite match any of the fp-models -- precise but with fp contraction off.

clang/lib/Driver/ToolChains/Clang.cpp
2552–2553

I think this always gets changed to fast, on, or off below, but making it empty here looks wrong.

2663–2667

Does this mean that "-ffp-model=precise -ffp-contract=off" will leave FP contraction on? That doesn't seem right.

2775

We should never get here with FPContract empty.

This revision was automatically updated to reflect the committed changes.
mibintc marked 2 inline comments as done.
mibintc marked 2 inline comments as done.Feb 12 2020, 7:29 AM

some replies to Andy. I'll upload another patch here which passed check-all locally. then i'll re-commit it.

clang/docs/UsersManual.rst
1388

yes you're right. sorry for the confusion. i changed the title and description

clang/lib/Driver/ToolChains/Clang.cpp
2663–2667

No. When -ffp-model=precise is on the command line, we end up here because the code above changes optID to ffp_contract, and sets FPContract to "on". The Val string will be "precise", and it shouldn't be checked. I'll change the comment.

mibintc updated this revision to Diff 244178.Feb 12 2020, 7:30 AM

FYI this caused failed builds for aarch64 (when building with -fno-signed-zeros), hitting unreachable statements in TargetLowering::getNegatedExpression - see https://bugs.llvm.org/show_bug.cgi?id=44892. I guess that's a bug in the aarch64 backend and not with this change in itself, but it was only uncovered by this change...

Looks like this patch regressed the test-suite:
https://lnt.llvm.org/db_default/v4/nts/recent_activity

Now 45min+

jsji added a subscriber: jsji.EditedFeb 13 2020, 2:46 PM

This is also breaking test-suites in our internal buildbots ..

This change is actually changing the default behavior: if user does NOT supply -ffp-contract, it becomes -ffp-contract=on.

So this means any code without explicitly add -ffp-contract=off will have -ffp-contract=on.
eg: -O0 will also generate FMA.

Is this intended behavior change? @mibintc @andrew.w.kaylor

This is also breaking test-suites in our internal buildbots ..

This change is actually changing the default behavior: if user does NOT supply -ffp-contract, it becomes -ffp-contract=on.

So this means any code without explicitly add -ffp-contract=off will have -ffp-contract=on.
eg: -O0 will also generate FMA.

Is this intended behavior change? @mibintc @andrew.w.kaylor

You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.

You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.

Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math

Just a suggestion, I'm not sure whether that would be the best breakdown. Perhaps we can also see what the defaults are for GCC and unify with those?

@jsji Will your backend tolerate -ffp-contract=on if optimizations are not disabled, e.g. -O2? We are proposing that the default floating point model be -ffp-model=precise, and with precise model, the ffp-contract=on. However you are right we don't want the frontend to create FMA when optimizations are disabled -O0

The revert of this breaks tests everywhere, as far as I can tell.

The revert of this breaks tests everywhere, as far as I can tell.

It looks like something strange happened with the revert:

clang-11: warning: overriding '-ffp-model=strict' option with '-ffp-model=strict' [-Woverriding-t-option]

I believe the problem is that the original change that was being reverted contained this:

clang/lib/Driver/ToolChains/Clang.cpp 
@@ -2768,7 +2766,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
        !AssociativeMath && !ReciprocalMath &&
        SignedZeros && TrappingMath && RoundingFPMath &&
        DenormalFPMath != llvm::DenormalMode::getIEEE() &&
+        FPContract.empty())
-        (FPContract.equals("off") || FPContract.empty()))

But sometime in the land-revert-land-revert cycle the line above that changed, causing the merge to miss this change in the most recent revert. I see that @MaskRay has since re-landed this change set, but it's going to cause problems for PowerPC. If someone needs to revert this yet again, I think it can be safely done by recovering the change above.

Apologies for the mess!

jsji added a comment.Feb 14 2020, 8:31 AM

@jsji Will your backend tolerate -ffp-contract=on if optimizations are not disabled, e.g. -O2? We are proposing that the default floating point model be -ffp-model=precise, and with precise model, the ffp-contract=on. However you are right we don't want the frontend to create FMA when optimizations are disabled -O0

We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).
But yes, I think we are OK to accept -ffp-contract=on (not -ffp-contract=fast) at -O2 if there are strong justifications to do so on clang.

However you are right we don't want the frontend to create FMA when optimizations are disabled -O0

I've been discussing this with @scanon on the cfe-dev mailing list, and he has convinced me that we should create FMA options at -O0 if the fp-contract setting is "on" -- including when it is on by default. The reasoning that persuaded me was, "preserving FMA formation at O0 _helps_ debuggability, because it means that numerical behavior is more likely to match what a user observed at Os, allowing them to debug the problem."

Yeah, I agree with Steve here. The great virtue of -ffp-contract=on over the more aggressive modes is that FMA formation is purely local to a single source expression, which means there's really no obstacle to treating it as implementation-guaranteed semantics. Such behavior should be done consistently across compiler settings rather than potentially making the numeric result of an expression subject to optimizer settings and choices; the latter is fine for "fast" modes but not really acceptable as the default compiler behavior. If a backend needs to legalize the intrinsic back to separate multiply + add instructions, that's okay, but it should do so consistently rather than only under a specific optimization level.

jsji added a comment.Feb 14 2020, 11:19 AM

We normally default to -ffp-contract=on only for -O3 and above (for legacy compilers XL/gcc on PowerPC).

Correction: XL on PowerPC default to generate FMA at -O0 as well. The corresponding option is -qfloat=maf.
https://www.ibm.com/support/knowledgecenter/SSXVZZ_16.1.1/com.ibm.xlcpp1611.lelinux.doc/compiler_ref/opt_float.html

You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.

Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math

Just a suggestion, I'm not sure whether that would be the best breakdown. Perhaps we can also see what the defaults are for GCC and unify with those?

GCC doesn't support "on" so I'm not sure it's possible to distinguish their intentions. I think it would be better to define what we think is best for clang and encourage GCC to unify with that.

Perhaps we can also see what the defaults are for GCC and unify with those?

+1

fhahn reopened this revision.Aug 3 2020, 3:48 AM
fhahn added a subscriber: fhahn.

IIUC the patch is currently reverted (in 78654e8511cf16d49f6680d782f3771a767ba942), due to ~20 llvm-test-suite failures.

The failures on X86 are listed below. Unfortunately most tests just compare the hash of the output, so it is not obvious in what way the results are changed. Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.

Failed Tests (22):
  test-suite :: MultiSource/Applications/oggenc/oggenc.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test
  test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test
  test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test
  test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test
  test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
This revision is now accepted and ready to land.Aug 3 2020, 3:48 AM
lebedev.ri requested changes to this revision.Aug 3 2020, 4:23 AM

IIUC the patch is currently reverted (in 78654e8511cf16d49f6680d782f3771a767ba942), due to ~20 llvm-test-suite failures.

The failures on X86 are listed below. Unfortunately most tests just compare the hash of the output, so it is not obvious in what way the results are changed.

Presumably it means that whatever optimization options those test specify, are now either more strict or less strict.
I think this is really missing some codegen tests that show how codegen actually changes by this.

Any ideas on how to move forward on this? I'll update the diff with a rebased version on current trunk, which I used to reproduce the failures.

Failed Tests (22):
  test-suite :: MultiSource/Applications/oggenc/oggenc.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/CLAMR/CLAMR.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG.test
  test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test
  test-suite :: MultiSource/Benchmarks/Rodinia/srad/srad.test
  test-suite :: SingleSource/Benchmarks/Linpack/linpack-pc.test
  test-suite :: SingleSource/Benchmarks/Misc-C++/Large/sphereflake.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.test
  test-suite :: SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/symm/symm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trmm/trmm.test
  test-suite :: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.test
  test-suite :: SingleSource/Benchmarks/Polybench/stencils/adi/adi.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.expandfft.test
  test-suite :: SingleSource/UnitTests/Vector/SSE/Vector-sse.stepfft.test
This revision now requires changes to proceed.Aug 3 2020, 4:23 AM

<...>

And by codegen changes i mostly mean newly-set/now-unset fp fast-math instruction flags.