Page MenuHomePhabricator

Add support for pragma float_control, to control precision and exception behavior at the source level
ClosedPublic

Authored by mibintc on Jan 16 2020, 6:38 AM.

Details

Summary

Intel would like to support #pragma float_control which allows control over precision and exception behavior at the source level. This pragma is supported by both the Microsoft compiler and the Intel compiler, and our customers have found it useful. This message is to describe the pragma, provide the patch for review, and request feedback from the community.

As the pragma was originally defined in the Microsoft compiler, the pragma supports a stack of settings, so that you can use push and pop to save and restore the state. That functionality is already in clang and this patch piggy-backs on that support (reference PragmaStack and PragmaMsStackAction). The Microsoft compiler supports it only at file scope, but the Intel compiler, and this patch, supports the pragma at either file scope or at the beginning of a compound statement. Clang currently provides support for pragma fp_contract which enables floating point contraction--also at file scope and compound statement, and uses FPFeatures on certain expression nodes to hold the pragma settings. This patch piggy-backs FPFeatures, adding fields to show the setting of "float_control(precise)" and "float_control(except)"

This patch includes an update to the pragma documentation, to summarize, using pragma float_control(precise) is like enabling ffp-model=precise for a region of the program. pragma float_control(except) is like enabling ffp-exception-behavior=strict/ignore for a region of the program.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
erichkeane added inline comments.Apr 27 2020, 10:25 AM
clang/include/clang/Basic/LangOptions.h
388–389

Is this the same logic as getFromOpaqueInt? If so, we should probably just call that.

clang/include/clang/Sema/Sema.h
558

This comment is really oddly phrased and uses the 'stack'-noun as a verb? Something like: (please feel free to wordsmith): "This stack tracks the current state of Sema.CurFPFeatures."?

clang/lib/AST/Expr.cpp
1309

Is there a reason this isn't a member initializer?

clang/lib/Parse/ParsePragma.cpp
667

Oh boy these are some magic lookin numbers... Can you document these two lines?

2539

Replace this with BalancedDelimiterTracker instead, it gives consistent errors and are a bit easier to use. Additionally, I think it does some fixups that allow us to recover better.

I'd also suggest some refactoring with the PragmaFloatControlKind if/elses below. Perhaps handle the closing paren at the end, and do a switch-case for that handling.

clang/lib/Sema/SemaAttr.cpp
428

I guess I don't get why you're switching on both here? Can the two just be combined? I don't know if the 'NewValue = CurFPFeatures.getAsOpaqueInt();

FpPragmaStack.Act(Loc, Action, StringRef(), NewValue);'

part is sufficiently motivated to do 2 separate switches.

1020

Should we still be setting this even if there was an error?

clang/lib/Sema/SemaStmt.cpp
382

unrelated change?

398

Don't use curleys for single liners, both of these probably shouldn't need curleys at all. Comment could be at the top for clarity.

clang/lib/Serialization/ASTReaderStmt.cpp
684

Rather than this variable, why not just ask 'E' below?

mibintc marked 5 inline comments as done.Apr 27 2020, 12:29 PM

A couple replies to @erichkeane

clang/include/clang/AST/Expr.h
2254

It's only used during serialization (ASTReader); I guess the node has already been allocated by then so it's superfluous, because the allocation point could set this field.

2271

John suggested the name getStored hasStored as "less tempting" names. The getStored and hasStored are only used during Serialization. John suggested the getFPFeatures function as the public interface and it uses the LangOptions parameter. The features are saved in the node if they can't be recreated from the command line floating point options (due to presence of floating point pragma)

2702

whoops i meant that to go to the CallExprBits

clang/include/clang/Basic/LangOptions.h
197

it's a NFC the llvm:: prefix wasn't needed. maybe the clang formatter did that?

clang/lib/Serialization/ASTReaderStmt.cpp
684

yes i could do that. it would be a function call

mibintc updated this revision to Diff 260957.Apr 29 2020, 10:49 AM
mibintc marked 7 inline comments as done.

I dropped FPOptions from CallExpr -- if it's needed I will add it using TrailingStorage in a separate patch; I responded to @erichkeane 's code review

mibintc marked an inline comment as done.Apr 29 2020, 10:51 AM
mibintc added inline comments.
clang/include/clang/AST/Expr.h
2702

Adding FPOptions to CallExprBits makes the field too large, I'm going to drop adding FPOptions to CallExpr, i'll add it via trailingstorage in a separate patch if it's needed.

clang/lib/Parse/ParsePragma.cpp
2539

BalancedDelimiterTracker doesn't work here because there's no access to the Parser object. Rewriting it would be an extensive change and I'm doubtful about making this change. PragmaHandler is defined in Lex. I think there are 60 pragma's that use the PragmaHandler.

clang/lib/Sema/SemaAttr.cpp
1020

Should we still be setting this even if there was an error?

It's not harmful to set it, if there's an error diagnostic then there is no codegen right?

clang/lib/Serialization/ASTReaderStmt.cpp
684

i made some changes in this area, eliminating setHasStoredFPFeatures

2 small things, @rjmccall and @sepavloff , anything else?

clang/include/clang/Basic/DiagnosticSemaKinds.td
870

The last 4 can be done via selects as well! Save a couple more spaces before we have to up the diagnostic id size :)

clang/include/clang/Sema/Sema.h
558

Just needs a period at the end.

clang/lib/Parse/ParsePragma.cpp
2539

Thats unfortunate :/ That type does some nice fixup work.

mibintc updated this revision to Diff 261226.Apr 30 2020, 7:41 AM

I changed a comment to add a period, rebased, and used clang-format

mibintc marked an inline comment as done.Apr 30 2020, 7:43 AM
mibintc added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
870

The last 4 can be done via selects as well

Combining these 4 into 1 diagnostic is doable but it's ugly.

erichkeane added inline comments.Apr 30 2020, 8:00 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
870

Concur, I spent some time on it and don't really like it.

erichkeane accepted this revision.Apr 30 2020, 11:10 AM

Looks OK to me. Please give @sepavloff and @rjmccall a day or so to make final comments before committing.

This revision is now accepted and ready to land.Apr 30 2020, 11:10 AM
This revision was automatically updated to reflect the committed changes.
uabelho added a subscriber: uabelho.May 6 2020, 2:41 AM

Hi @mibintc ,

I think I'm seeing some oddities with this patch (current trunk, 0054c46095).
With

clang -S -Xclang -emit-llvm bbi-42553.c -o -

on the input

float rintf(float x);
#pragma STDC FENV_ACCESS ON

void t()
{
    (void)rintf(0.0f);
}

I get

bbi-42553.c:2:14: warning: pragma STDC FENV_ACCESS ON is not supported, ignoring pragma [-Wunknown-pragmas]
#pragma STDC FENV_ACCESS ON
             ^
; ModuleID = 'bbi-42553.c'
source_filename = "bbi-42553.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind optnone strictfp uwtable
define dso_local void @t() #0 {
entry:
  %0 = call float @llvm.experimental.constrained.rint.f32(float 0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.ignore") #2
  ret void
}

and if I remove the pragma I instead get

define dso_local void @t() #0 {
entry:
  %0 = call float @llvm.rint.f32(float 0.000000e+00)
  ret void
}

Before this patch I got the call to llvm.rint.f32 in both cases.

It seems to be this change in SemaStmt.cpp that makes the FENV-pragma have some effect regardless of the warning saying that the pragma doesn't have any effect:

index aa0d89ac09c3..f76994a6dab3 100644
@@ -394,6 +394,11 @@ StmtResult Sema::ActOnCompoundStmt(SourceLocation L, SourceLocation R,
                                    ArrayRef<Stmt *> Elts, bool isStmtExpr) {
   const unsigned NumElts = Elts.size();
 
+  // Mark the current function as usng floating point constrained intrinsics
+  if (getCurFPFeatures().isFPConstrained())
+    if (FunctionDecl *F = dyn_cast<FunctionDecl>(CurContext))
+      F->setUsesFPIntrin(true);
+

It seems to be this change in SemaStmt.cpp that makes the FENV-pragma have some effect regardless of the warning saying that the pragma doesn't have any effect:

index aa0d89ac09c3..f76994a6dab3 100644
@@ -394,6 +394,11 @@ StmtResult Sema::ActOnCompoundStmt(SourceLocation L, SourceLocation R,
                                    ArrayRef<Stmt *> Elts, bool isStmtExpr) {
   const unsigned NumElts = Elts.size();
 
+  // Mark the current function as usng floating point constrained intrinsics
+  if (getCurFPFeatures().isFPConstrained())
+    if (FunctionDecl *F = dyn_cast<FunctionDecl>(CurContext))
+      F->setUsesFPIntrin(true);
+

Thank you I will follow up!

We're also seeing test failures in Apple's Clang fork, e.g.

test/CodeGen/finite-math.c:12:10: error: NSZ: expected string not found in input
 // NSZ: fadd nsz
         ^
<stdin>:11:20: note: scanning from here
define void @foo() #0 {
                   ^
<stdin>:15:9: note: possible intended match here
 %add = fadd float %0, %1
        ^

I haven't checked yet whether we can reproduce upstream.

I posted a patch to fix the bug reported by @uabelho here https://reviews.llvm.org/D79510

@rjmccall I used check-clang and check-all on D71841 from my linux x86-64 server before submitting, and the testing was clear. Maybe your branch has some calls to create BinaryOperator that isn't passing FPFeatures with the correct value--related to D76384. My sandbox is showing this

%add = fadd nsz float %0, %1

I got a report that this patch was causing a problem with Windows <numeric> header because #pragma float_control should be supported in namespace context. I've posted a patch for review here https://reviews.llvm.org/D79631

plotfi added a subscriber: plotfi.May 8 2020, 2:59 PM

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

We're also seeing test failures in Apple's Clang fork, e.g.

test/CodeGen/finite-math.c:12:10: error: NSZ: expected string not found in input
 // NSZ: fadd nsz
         ^
<stdin>:11:20: note: scanning from here
define void @foo() #0 {
                   ^
<stdin>:15:9: note: possible intended match here
 %add = fadd float %0, %1
        ^

I haven't checked yet whether we can reproduce upstream.

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

Oh, thank you for figuring that out. Yeah, it's reasonable for code-gen option parsing to depend on language-option parsing, which means that this patch is wrong. The right fix is that we need to stop parsing these as code-gen options, which is reasonable since they have direct language-semantics impact. If we still need the code-gen option flags, we should be able to recreate them from the language options, I think.

@rjmccall @mibintc Can we revert this patch for now then, and re-land when this patch is reworked? It would be good to get those bots passing. @rjmccall are the bots that you see failing on your end public?

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

Oh, thank you for figuring that out. Yeah, it's reasonable for code-gen option parsing to depend on language-option parsing, which means that this patch is wrong. The right fix is that we need to stop parsing these as code-gen options, which is reasonable since they have direct language-semantics impact. If we still need the code-gen option flags, we should be able to recreate them from the language options, I think.

This comment was removed by mibintc.
mibintc marked 2 inline comments as done.May 11 2020, 5:52 AM

Some inline replies/comments to @rjmccall and @plotfi

clang/lib/Frontend/CompilerInvocation.cpp
3185

@rjmccall @plotfi These earlier patches are also deriving the value of LangOpts from the settings of CG opts

3187

@rjmccall @plotfi here the codegen args are evaluated first. Perhaps we could have a "reconcile codegen and lang args" function which would resolve the floating point settings into a final setting? so that codegen or lang could be parsed in either order?

@rjmccall Uncertain how to proceed, can you recommend? If I recall correctly, I added the lines in CompilerOptions because there were many failing lit tests, i could have fixed the lit fails by adding the lang options to the lit tests. (of course that change could have other repercussions)

mibintc added inline comments.May 11 2020, 10:16 AM
clang/lib/Frontend/CompilerInvocation.cpp
3192

@rjmccall I could set these by using Args.hasArg instead of CGOpts, would that be acceptabel?

rjmccall added inline comments.May 11 2020, 10:32 AM
clang/lib/Frontend/CompilerInvocation.cpp
3185

I don't know what you mean here; the code you're quoting just seems to be looking at Args. It's fine to re-parse arguments in both places if that makes something easier. The problem is that you're looking at the CodeGenOptions structure itself.

rjmccall added inline comments.May 11 2020, 10:49 AM
clang/lib/Frontend/CompilerInvocation.cpp
3192

I think so, yes. Ideally the CG options would then be set based on the earlier values, or replaced with uses of the language options structure, but not having a direct dependency at all may be simpler.

plotfi added a subscriber: ab.May 11 2020, 11:01 AM

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

@rjmccall Uncertain how to proceed, can you recommend? If I recall correctly, I added the lines in CompilerOptions because there were many failing lit tests, i could have fixed the lit fails by adding the lang options to the lit tests. (of course that change could have other repercussions)

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

That was a good fix. I am pretty sure this does mean the diagnostics-order.c will fail on apple's bots. The same diagnostics lines print, but in the wrong order. I haven't root caused that yet.

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

@rjmccall @mibintc I think the diagnostics-order.c test is still behaving correctly technically. The note lines are still printing with the associated error lines, it just happens that one of the warning lines prints at the end instead of in the middle. ie:

error: invalid value '-foo' in '-verify='
note: -verify prefixes must start with a letter and contain only alphanumeric characters, hyphens, and underscores
error: invalid value 'bogus' in '-std=bogus'
note: use 'c89', 'c90', or 'iso9899:1990' for 'ISO C 1990' standard
...
warning: optimization level '-O999' is not supported; using '-O3' instead

instead of

error: invalid value '-foo' in '-verify='
note: -verify prefixes must start with a letter and contain only alphanumeric characters, hyphens, and underscores
warning: optimization level '-O999' is not supported; using '-O3' instead
error: invalid value 'bogus' in '-std=bogus'
note: use 'c89', 'c90', or 'iso9899:1990' for 'ISO C 1990' standard
...

That was a good fix. I am pretty sure this does mean the diagnostics-order.c will fail on apple's bots. The same diagnostics lines print, but in the wrong order. I haven't root caused that yet.

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

I think we might have had to change that test on our fork when we changed the parsing order.

michele.scandale added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
227

I'm not convinced it correct to set contract when allowFPContractWithinStatement return true. Can someone clarify this?

If I compile the following example with -ffp-contract=on:

float test1(float a, float b, float c) {
  float x = a * b;
  return x + c;
}

float test2(float a, float b, float c) {
  return a * b + c;
}

Before this change the generated code was:

define float @test1(float %a, float %b, float %c) {
  %0 = fmul float %a, %b
  %1 = fadd float %0, %c
  ret float %1
}

define float @test2(float %a, float %b, float %c) {
  %0 = call float @llvm.fmuladd.f32(float %a, float%b, float %c)
  ret float %0
}

And my understanding is that the in-statement contraction is implemented by emitting the llvm.fmuladd call that a backend might decide to implement as fmul + fadd or as fma.

With this change the generated code is:

define float @test1(float %a, float %b, float %c) {
  %0 = fmul contract float %a, %b
  %1 = fadd contract float %0, %c
  ret float %1
}

define float @test2(float %a, float %b, float %c) {
  %0 = call contract float @llvm.fmuladd.f32(float %a, float%b, float %c)
  ret float %0
}

and it seems to me that in test1 (where multiple statements where explicitly used) the optimizer is now allowed to perform the contraction, violating the original program semantic where only "in-statement" contraction was allowed.

IIUC, the way within-statement contraction is supposed to work is that there are supposed to be blocking intrinsics inserted at various places. I don't remember the details, though, or if anyone's thought about how it interacts with cross-statement contraction, which this pragma permits within the same function (and could occur with inlining regardless).

clang/include/clang/Basic/LangOptions.h
396–401

Same comment on LangOpts.FastMath || as the one for CompilerInvocation.cpp.

clang/lib/Frontend/CompilerInvocation.cpp
3190–3196

Why do we need Opts.FastMath || here? The code in the compiler driver clang/lib/Driver/ToolChains/Clang.cpp (https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Clang.cpp#L2510) already takes care of generating the right flags for the CC1 to configure the floating point rules.

Moreover, if we ignore what the compiler driver does, the fact that Args.hasArg(OPT_ffast_math) is not considered in the definition of the codegen options such as NoInfsFPMath, NoNaNsFPMath, NoSignedZeros, Reassociate, so the you have already two distinct options for the same abstract property that might not match.

I think that at the CC1 level the reasoning should be done in terms of the fine grain options, and let the compiler driver makes life easy for the users -- i.e. LangOpts.FastMath should just control whether the macro __FAST_MATH__ is defined or not.

IIUC, the way within-statement contraction is supposed to work is that there are supposed to be blocking intrinsics inserted at various places. I don't remember the details, though, or if anyone's thought about how it interacts with cross-statement contraction, which this pragma permits within the same function (and could occur with inlining regardless).

Prior to this change contract was never generated in the case of in-statement contraction only, instead clang was emitting llvm.fmuladd to inform the backend that only those were eligible for contraction. From a correctness perspective I think this was perfectly fine.

Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such mechanism is in place, I think that generating the contract fast-math flag also for in-statement contraction is wrong because it breaks the original program semantic.

Am I missing something?

No, if that's how we handle that, then you're absolutely right that we shouldn't set the contract flag.

scanon added a subscriber: scanon.May 12 2020, 10:17 AM

Prior to this change contract was never generated in the case of in-statement contraction only, instead clang was emitting llvm.fmuladd to inform the backend that only those were eligible for contraction. From a correctness perspective I think this was perfectly fine.

Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such mechanism is in place, I think that generating the contract fast-math flag also for in-statement contraction is wrong because it breaks the original program semantic.

This is exactly right. If we are going to have new in-statement optimizations, then we probably do need to add some blocking intrinsic (which would be elidable given suitable fast-math flags); the system of generating fmuladd works well for FMA contraction, but doesn't really generalize to other optimizations of this sort.

mibintc added inline comments.May 12 2020, 10:51 AM
clang/lib/CodeGen/CGExprScalar.cpp
227

Thanks @michele.scandale i will work on a patch for this

yaxunl added a subscriber: yaxunl.May 12 2020, 5:04 PM
yaxunl added inline comments.
clang/lib/Serialization/ASTReader.cpp
7899

This changes the behavior regarding AST reader and seems to be too hash restriction. Essentially this requires a pch can only be used with the same fp options with which the pch is generated. Since there are lots of fp options, it is impractical to generate pch for all the combinations.

We have seen regressions due to this assertion.

Can this assertion be dropped or done under some options?

Thanks.

mibintc marked an inline comment as done.May 13 2020, 8:14 AM
mibintc added inline comments.
clang/lib/Serialization/ASTReader.cpp
7899

@yaxunl Can you please send me a reproducer, I'd like to see what's going on, not sure if just getting rid of the assertion will give the desired outcome.

yaxunl added inline comments.May 13 2020, 9:51 AM
clang/lib/Serialization/ASTReader.cpp
7899

Pls apply the patch.

Thanks.

mibintc marked 2 inline comments as done.May 13 2020, 1:46 PM
mibintc added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
227

@michele.scandale I posted a patch for 'contract' here, https://reviews.llvm.org/D79903

clang/lib/Serialization/ASTReader.cpp
7899

@rjmccall In the example supplied by @yaxunl, the floating point options in the pch file when created are default, and the floating point options in the use have no-signed-zeros flag. The discrepancy causes an error diagnostic when the pch is used. I added the FMF flags into FPFeatures in this patch, I made them COMPATIBLE_LANGOPT which is the encoding also being used for FastMath, FiniteMathOnly, and UnsafeFPMath. Do you have some advice about this issue?

rjmccall added inline comments.May 13 2020, 3:00 PM
clang/lib/Serialization/ASTReader.cpp
7899

A couple things are going on here.

First: a PCH can only end at the top level, not in the middle of a declaration, but otherwise Sema can be in an arbitrary semantic configuration. That definitely includes arbitrary pragmas being in effect, so in general the end state might not match the default FP state, so this assertion is bogus. When loading a PCH, you need to restore the pragma stack and current FP state to the configuration it was in at the end of the PCH.

Second: if you restore the pragma stack and FP state naively given the current representation of FP state, you will completely overwrite the FP settings of the current translation unit with the FP settings that were in effect when the PCH was built, which is obviously not okay. This is one way (among several) that the current representation is not really living up to the statement that these language options are "compatible". The better way to do this would be for the pragma stack and Expr nodes to record the current set of overrides in effect rather than the absolute current state; this could then be easily applied to an arbitrary global FP state.

clang/lib/CodeGen/CGExprScalar.cpp
227

Thanks!

I documented the issue reported by @yaxunl here, https://bugs.llvm.org/show_bug.cgi?id=46166, and take ownership of the bug. Thanks for the report.

cfang added a subscriber: cfang.Jun 23 2020, 3:12 PM

-ffast-math flag got lost in the Builder after this change.

FMF.isFast() is true before updateFastMathFlags(FMF, FPFeatures), but turns false after.
It seems the Builder.FMF has been correctly set before, but I am not clear what FPFeatures should be at this point:

+static void setBuilderFlagsFromFPFeatures(CGBuilderTy &Builder,
+ CodeGenFunction &CGF,
+ FPOptions FPFeatures) {
+ auto NewRoundingBehavior = FPFeatures.getRoundingMode();
+ Builder.setDefaultConstrainedRounding(NewRoundingBehavior);
+ auto NewExceptionBehavior =
+ ToConstrainedExceptMD(FPFeatures.getExceptionMode());
+ Builder.setDefaultConstrainedExcept(NewExceptionBehavior);
+ auto FMF = Builder.getFastMathFlags();
+ updateFastMathFlags(FMF, FPFeatures);
+ Builder.setFastMathFlags(FMF);
+ assert((CGF.CurFuncDecl == nullptr || Builder.getIsFPConstrained() ||
+ isa<CXXConstructorDecl>(CGF.CurFuncDecl) ||
+ isa<CXXDestructorDecl>(CGF.CurFuncDecl) ||
+ (NewExceptionBehavior == llvm::fp::ebIgnore &&
+ NewRoundingBehavior == llvm::RoundingMode::NearestTiesToEven)) &&
+ "FPConstrained should be enabled on entire function");
+}