This is an archive of the discontinued LLVM Phabricator instance.

Add support for pragma float_control, to control precision and exception behavior at the source level
ClosedPublic

Authored by mibintc on Jan 16 2020, 6:38 AM.

Details

Summary

Intel would like to support #pragma float_control which allows control over precision and exception behavior at the source level. This pragma is supported by both the Microsoft compiler and the Intel compiler, and our customers have found it useful. This message is to describe the pragma, provide the patch for review, and request feedback from the community.

As the pragma was originally defined in the Microsoft compiler, the pragma supports a stack of settings, so that you can use push and pop to save and restore the state. That functionality is already in clang and this patch piggy-backs on that support (reference PragmaStack and PragmaMsStackAction). The Microsoft compiler supports it only at file scope, but the Intel compiler, and this patch, supports the pragma at either file scope or at the beginning of a compound statement. Clang currently provides support for pragma fp_contract which enables floating point contraction--also at file scope and compound statement, and uses FPFeatures on certain expression nodes to hold the pragma settings. This patch piggy-backs FPFeatures, adding fields to show the setting of "float_control(precise)" and "float_control(except)"

This patch includes an update to the pragma documentation, to summarize, using pragma float_control(precise) is like enabling ffp-model=precise for a region of the program. pragma float_control(except) is like enabling ffp-exception-behavior=strict/ignore for a region of the program.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
clang/include/clang/Basic/DiagnosticParseKinds.td
1123

This isn't quite accurate. The pop case has no comma-separated arguments. It might be better to print the full syntax here if that's feasible.

clang/include/clang/Basic/LangOptions.h
382

It seems like fp_precise describes too many things to be a single option. Even within this set of options it overlaps with fp_contract.

clang/test/CodeGen/fp-floatcontrol-stack.cpp
3

Can you add run lines for -ffast-math and (separately) "-fno-honor-nans -fno-honor-infinities"?

18

There should be a constrained fadd here also, right?

53

Why is the constrained intrinsic generated in this case? If we've got both constraints set to the defaults at the file scope I would have expected that to turn off the constrained mode.

125

Are there also fast-math flags set here? If not, why not?

mibintc marked 6 inline comments as done.Jan 28 2020, 10:05 AM

It's not clear to me from reading this how the "precise" control is going to work with relation to the fast math flags. I don't think MSVC allows the granularity of control over these flags that we have in clang, so maybe they don't have this problem.

You're right, MSVC only allows the pragma at file scope.

Consider this code: https://godbolt.org/z/mHiLCm

With the options "-ffp-model=precise -fno-honor-infinities -fno-honor-nans" the math operations here are generated with the "nnan ninf contract" flags. That's correct. What will happen when I use the pragma to turn precise off? Does it enable all fast math flags? Will the subsequent "pop" leave the "ninf nnan" fast math flags enabled?

This patch doesn't have support for turning precise off, that's a bug, I will revisit. This is my plan for how to handle enabling/disabling fast math: IRBuilder.h has settings FMF, and also supplies clearFastMathFlags, setFastMathFlags(flags) and FastMathFlagGuard. When the expression is walked that alters the FMF, use FastMathFlagGuard to save the current state of FMF, modify the settings using the clear or set functions, walk the expression. After the expression is walked, the FMF settings will be restored to previous state by the FastMathFlagGuard destructor.

As I said, I don't think you can get into this situation with MSVC. I believe that icc will go into full fast math mode with the "precise, off, push" pragma but will go back to "nnan ninf contract" mode with the pop. At least, that's what the design docs say. I haven't looked at the code to verify this. It seems like the correct behavior in any case. I think the clang FPOptions needs individual entries for all of the fast math flags to handle this case.

Thanks I'll investigate this and add test cases. I think possibly since IRBuilder has the FMF like I described above it might work with current support. Is there currently a way to modify nan and inf at the source level or only by compiler option? BTW I've been asked to implement a pragma to control fp "reassoc" at the source level. I'm planning to do that after this patch is complete.

clang/docs/LanguageExtensions.rst
3211

Oh, I looked back at the patch for -ffp-model and precise is documented to set ffp-contract=fast. Not sure why I thought that was right. I'll have to redo it.

3219

No, the stack that tracks the float control pragma settings is a pair, roughly (IsPreciseEnabled, IsExceptEnabled)

clang/include/clang/Basic/LangOptions.h
382

I see your point. I wanted it to reflect the current pragma setting that's why I kept it intact. I'll rethink this.

clang/test/CodeGen/fp-floatcontrol-stack.cpp
3

OK. i'll add pragma's to set precise off too.

18

yes there's a constrained add following. i can add that pattern into the file check.

53

The "run" line in this case uses ffp-exception-behavior=struct; i was trying to address https://bugs.llvm.org/show_bug.cgi?id=44571 by checking the command line options to see if strict was enabled. That's why constrained intrinsics are enabled. Evidently that's incorrect.

125

that's a bug. thanks

You could check the scope where the pragma appears is the same way as pragma clang fp does. The code

case tok::annot_pragma_fp:
  ProhibitAttributes(Attrs);
  Diag(Tok, diag::err_pragma_fp_scope);
  ConsumeAnnotationToken();
  return StmtError();

is put into Parser::ParseStatementOrDeclarationAfterAttributes and it produces errors on misplaced pragma. Probably the same way may work for you.

mibintc updated this revision to Diff 243888.Feb 11 2020, 8:50 AM

This patch is a work in progress.

The problem that I want to work on next is that the scope is wrong when the pragma token is seen following the right brace of a function body. In this patch I worked around that problem by modifying the test case to insert bogus "class ResetScope;" to force the scope back to file-level before each pragma that occurs at file scope. You can see that workaround in the test case clang/test/CodeGen/fp-floatcontrol-stack.cpp

These are the main differences with previous version of this patch:

  1. per Andy Kaylor's comments in the llvm-dev discussion "Floating Point semantic modes", I changed the -ffp-model=precise to be the default. I changed ffp-model=precise to set -ffp-contract=on [previously it was -ffp-contract=fast]. This caused the need to change some tests because the LLVM IR now has the "contract" bit enabled.
  2. I put FMF into the FPOptions struct. As I mentioned in a previous comment, I have a request to allow changing the REASSOC bit via source pragma. I plan to submit that in a separate patch. Putting FMF into FPOptions allows the fast bits to be set or disabled via pragma float_control precise, Andy had suggested this change in his code review.
  3. I put Andy's distillation of floating point options and floating point modes into UsersManual.rst
  4. I changed how the strictfp attribute was set on functions, now it's set if the strict mode is enabled in the function body e.g. either by the option level or a float_control pragma

Still needs to be done:

  1. The problem with the file scope I mentioned above
  2. I assume that the 2 denormal settings need to be added to FPOptions.
  3. Required diagnostics a la MSVC that Andy mentioned in llvm-dev discussion "Floating Point semantic modes"
  4. What else?

I would think contract change can be separated from the rest of the changes, and therefore should be a separate review (to reduce noise)?

I would think contract change can be separated from the rest of the changes, and therefore should be a separate review (to reduce noise)?

I split off that change to https://reviews.llvm.org/D74436 and added you as reviewer, thanks.

mibintc updated this revision to Diff 244894.Feb 16 2020, 12:04 PM

I found the problem in the #pragma float_control (push/pop) stack, it was just a dumb bug.

I also added -include-pch test cases, and added code to ASTWriter ASTReader to preserve the floatcontrol pragma stack

For 2 of the new floatcontrol test cases, I temporarily added XFAIL because the patch to default -ffp-precise=fast and ffp-contract=on has been withdrawn (will re-commit soon).

mibintc updated this revision to Diff 246285.Feb 24 2020, 1:00 PM
mibintc marked an inline comment as done.

This patch is code complete and ready for your review and consideration.

rjmccall added inline comments.Mar 1 2020, 11:02 PM
clang/docs/LanguageExtensions.rst
3207

pragma float_control should be in backticks.

Throughout this documentation, when referring to command-line options, please spell them the way they're actually spelled on the command line, i.e. with a dash.

clang/include/clang/AST/Stmt.h
1118

What's happening here is exactly what this assertion is supposed to prevent. If you need more bits in one of these classes (I assume it's CXXOperatorCallExpr), you need to either make a field in the actual class or investigate more arcane mechanisms like trailing storage to reduce the normal impact. The latter is probably unnecessary for CXXOperatorCallExpr.

clang/include/clang/Basic/DiagnosticParseKinds.td
1126

Maybe "operation" would be a better user-facing name than "kind"? Also, this diagnostic is more specific but less helpful than the diagnostic just above.

clang/include/clang/Basic/LangOptions.def
199

Please align the commas.

Would it make more sense to just store an FPOptions in LangOptions instead of breaking all of the bits down separately?

We may need to reconsider at some point whether any of these are really "compatible" language options. Headers can contain inline code, and we shouldn't compile that incorrectly just because we reused a module we built under different language settings. Although... maybe we can figure out a way to store just the ways that an expression's context overrides the default semantics and then merge those semantics into the default set for the translation unit; that would make them actually compatible. Of course, it would also require more bits in expressions where it matters, and you might need to investigate trailing storage at that point.

clang/include/clang/Basic/LangOptions.h
479

Somewhere in this type, it should be obvious where I can go in order to understand what any of these flags means precisely. Ideally that would be reinforced by the method names, instead of using non-term-of-art abbreviations like "reassoc".

515

The more conventional method names here would an instance method called something like getAsOpaqueInt and then a static method called something like getFromOpaqueInt.

clang/lib/CodeGen/CGExprScalar.cpp
482

You can override VisitBinOp and just do this in that case. But why does it need to be done at this level at all, setting global state on the builder for all emissions, instead of in the leaves where we know we're emitting floating-point operations? This is adding a lot of overhead in some of the most commonly-exercised code paths in IRGen, but building FP expressions is relatively uncommon. I would definitely prefer a little bit of repetitive code over burdening the common case this much. It might also be nice to figure out when this is unnecessary.

Also, please extract a function to make FastMathFlags from FPOptions; you'll need it elsewhere, e.g. in CGExprComplex.

mibintc updated this revision to Diff 248274.Mar 4 2020, 1:06 PM

Responding to @rjmccall 's review. Previously there was a FIXME about adding FPFeatures to UnaryOperator. This seems like the right time to do that, so that change is included there. I'll add inline replies too.

mibintc marked 12 inline comments as done.Mar 4 2020, 1:33 PM

some inline replies and comments

clang/include/clang/AST/Stmt.h
1118

@rjmccall The reason i changed the assertion is because FPOptions is now wider, so I had to change the assertion. See line 609 above. Is there something I need to do differently?

clang/include/clang/Basic/DiagnosticParseKinds.td
1126

I got rid of the diagnostic with the unhelpful string and just using the single diagnostic which has full information about how to form the pragma

clang/include/clang/Basic/LangOptions.def
199

I aligned the commas. I didn't put FPOptions into LangOptions, would you like me to make that change too? I don't know about trailing storage. I see that term in the code but I didn't see details about what that is/how that works.

clang/include/clang/Basic/LangOptions.h
479

I put the comments on the field declarations in the private part. I changed the names of the accessor methods to be more descriptive. (Previously I was using the same names as LLVM uses for those fields).

515

I changed the names like you suggested but not using the static method, is this OK?

clang/lib/CodeGen/CGExprScalar.cpp
482

I removed it from here and pushed this work towards the leaves. I decided that I should put FPFeatures onto UnaryOperator nodes which was left as a FIXME by an earlier author in this area. I added the FastMathFlags function like you suggested but i suppose it needs to be moved out of this file.

4173

In the previous rendition of this patch, when the Builder.FMF settings were modified at Visit(BinaryExpression), the assign is seen as a binary expression and so the FPFeatures was passed into IRBuilder. I'm not confident this patch is in the right place, I'd really like to put FPFeatures onto the CallExpr node, because if you call a builtin intrinsic function, and the mode is set to float_control(except, on), the call node for the intrinsic doesn't have the FPFeature bits, so it isn't marked as expected. Before I make that change I want @rjmccall to take another look; If FPFeatures was on CallExpr then I'd remove it here and modify IRBuilder.FMF when visiting CallExpr

clang/test/CodeGenOpenCL/builtins-r600.cl
5 ↗(On Diff #248274)

OpenCL CompilerInvocation always sets fp_contract to "on"; inside clang I check if either fp_contract==on or fp_contract==fast, that expression is used to set the IRBuilder.FMF.contract bit.

CUDA CompilerInvocation always sets fp_contract to "fast"

rjmccall added inline comments.Mar 4 2020, 2:11 PM
clang/include/clang/AST/Stmt.h
1118

Because Stmt is a common base class for so much of the AST but only needs to store a small amount of state itself, we have a complicated system for optimizing space usage in subclasses by allocating bit-fields into Stmt. Letting an individual subclass's bit-field usage run over the expected size and therefore inflate Stmt for all subclasses would be counter-productive, hence the static_assert and why it shouldn't be changed. You need to move the storage of FPOptions into the appropriate subclass wherever it would cause the static_assert to fail.

mibintc updated this revision to Diff 248536.Mar 5 2020, 10:21 AM
mibintc marked 2 inline comments as done.

@rjmccall suggested that I needed to remove FPOptions from the Stmt class since the sizeof assertion failed. I moved that information into the relevant expression nodes and fixed a few small things that I didn't like in the previous version.

mibintc marked 2 inline comments as done.Mar 5 2020, 10:29 AM
mibintc added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
615

The call expr might be a call to an intrinsic, the floating point intrinsic calls need to be marked properly with information from FPFeatures

4173

I got rid of the bogus code here and moved it into VisitCallExpr where it belongs.

I got an email like this "Harbormaster failed remote builds in B48237: Diff 248536!" but there is no further information. It builds OK from my workstation. I did have to paste the review because an upload to Phabricator exceeded the size limit, could that be why? Is there a way to get more information about the build failure?

martong resigned from this revision.Mar 6 2020, 2:43 AM

@rjmccall suggested that I needed to remove FPOptions from the Stmt class since the sizeof assertion failed. I moved that information into the relevant expression nodes and fixed a few small things that I didn't like in the previous version.

You only need to do this for the expression nodes where it causes an overflow in the size of the bit-field; I don't think you're overflowing the capacity of UnaryOperatorBitfields, for example.

@rjmccall suggested that I needed to remove FPOptions from the Stmt class since the sizeof assertion failed. I moved that information into the relevant expression nodes and fixed a few small things that I didn't like in the previous version.

You only need to do this for the expression nodes where it causes an overflow in the size of the bit-field; I don't think you're overflowing the capacity of UnaryOperatorBitfields, for example.

I added unsigned FPFeatures : 14; to class UnaryOperatorBitfields but I encountered the same assertion failure in the Stmt constructor, In constructor ‘clang::Stmt::Stmt(clang::Stmt::StmtClass)’:
/iusers/sandbox/pragma-ws/llvm-project/clang/include/clang/AST/Stmt.h:1095:5: error: static assertion failed: changing bitfields changed sizeof(Stmt)

static_assert(sizeof(*this) <= 8

@rjmccall suggested that I needed to remove FPOptions from the Stmt class since the sizeof assertion failed. I moved that information into the relevant expression nodes and fixed a few small things that I didn't like in the previous version.

You only need to do this for the expression nodes where it causes an overflow in the size of the bit-field; I don't think you're overflowing the capacity of UnaryOperatorBitfields, for example.

I added unsigned FPFeatures : 14; to class UnaryOperatorBitfields but I encountered the same assertion failure in the Stmt constructor, In constructor ‘clang::Stmt::Stmt(clang::Stmt::StmtClass)’:
/iusers/sandbox/pragma-ws/llvm-project/clang/include/clang/AST/Stmt.h:1095:5: error: static assertion failed: changing bitfields changed sizeof(Stmt)

static_assert(sizeof(*this) <= 8

Oh, okay. It's actually pretty unfortunate that we need to increase the memory use of every UnaryOperator, BinaryOperator, and CallExpr in the AST just in case these pragmas are in use. Most instances of these expressions provably have nothing to do with floating-point, and in practice the pragmas are very rarely used.

I mentioned up-thread that it would be nice to make serialized AST actually agnostic about the basic language settings. That would require us to track not just the active option set, but what overrides were actually in place when parsing any given expression; we could then merge the two pretty easily in IRGen. (We could represent this with masks: we could then merge the information by taking the default-settings mask, bitwise-and'ing with the override mask to clear any values that will be overridden, and bitwise-or'ing the override values on top.) But if we did that, then it would be easy for us to identify when creating an AST node that no overrides are in effect; that would let us just store a single flag in the class indicating whether there were overrides, then use trailing storage for the actual overrides if present. This would relieve almost all of the pressure to keep the storage size down.

We could take it a step further by trying to recognize when building certain AST nodes that they're not floating-point operations and just building them with no overrides.

@rjmccall Since CompoundAssignmentOperator derives from BinaryOperator, it's not simple to add Trailing storage here. I think I will have to fold CompoundAssignmentOperator into BinaryOperator and then add the 2 extra fields needed by CompoundAssignmentOperator into Trailing storage. Can you think of a better way? I worked on Trailing storage for UnaryOperator first and that wasn't too bad, but Binary is a different story.

@rjmccall Since CompoundAssignmentOperator derives from BinaryOperator, it's not simple to add Trailing storage here. I think I will have to fold CompoundAssignmentOperator into BinaryOperator and then add the 2 extra fields needed by CompoundAssignmentOperator into Trailing storage. Can you think of a better way? I worked on Trailing storage for UnaryOperator first and that wasn't too bad, but Binary is a different story.

It's something we deal with occasionally, but it's definitely annoying. You basically have to test for which concrete class you have and then ask that class for its trailing storage.

Collapsing the types might be okay but could get involved.

@rjmccall Since CompoundAssignmentOperator derives from BinaryOperator, it's not simple to add Trailing storage here. I think I will have to fold CompoundAssignmentOperator into BinaryOperator and then add the 2 extra fields needed by CompoundAssignmentOperator into Trailing storage. Can you think of a better way? I worked on Trailing storage for UnaryOperator first and that wasn't too bad, but Binary is a different story.

It's something we deal with occasionally, but it's definitely annoying. You basically have to test for which concrete class you have and then ask that class for its trailing storage.

Collapsing the types might be okay but could get involved.

To be clear, we do this in the Clang AST by doing a lot of hand-rolled trailing storage rather than by using llvm::TrailingStorage. If you're willing to do the refactors necessary to use the latter, though, that's great; but as mentioned, you'll need to either collapse CompoundAssignmentOperator into BinaryOperator or introduce a concrete subclass for the non-compound operators and make BinaryOperator an abstract node.

mibintc updated this revision to Diff 259353.Apr 22 2020, 12:11 PM
mibintc retitled this revision from [RFC] Add support for pragma float_control, to control precision and exception behavior at the source level to Add support for pragma float_control, to control precision and exception behavior at the source level.

The previous version of this patch was changed to move BinaryOperator.FPFeatures into TrailingStorage. This patch is rebased on that change. Looking for code review, (check-clang passes) thank you!

mibintc marked 2 inline comments as done.Apr 22 2020, 1:03 PM

added a couple inline explanatory comments

clang/include/clang/Basic/LangOptions.h
307

I added this boolean as part of validating the correctness of the pragma's that access the FP environment, according to the Microsoft checks.. Copying from the Microsoft doc: "There are restrictions on the ways you can use the fenv_access pragma in combination with other floating-point settings:

You can't enable fenv_access unless precise semantics are enabled. Precise semantics can be enabled either by the float_control pragma, or by using the /fp:precise or /fp:strict compiler options. The compiler defaults to /fp:precise if no other floating-point command-line option is specified.

You can't use float_control to disable precise semantics when fenv_access(on) is set."
This is copied from https://docs.microsoft.com/en-us/cpp/preprocessor/fenv-access?view=vs-2019

clang/test/CodeGen/constrained-math-builtins.c
157–161

Since this patch constructs the FPFeatures using the floating point settings from the command line versus the default FPOptions() constructor, several tests need to be changed. Some of the changes I made showing the flags on the IR, other tests I changed by adding ffp-contract to the RUN line to match the expected IR.

noted bug in an inline comment. i will upload a fix

mibintc updated this revision to Diff 260320.Apr 27 2020, 7:32 AM

corrected the bitfield width in Stmt.h for FPFeatures on cxx operator call

mibintc marked an inline comment as done.Apr 27 2020, 9:41 AM
mibintc added inline comments.
clang/docs/LanguageExtensions.rst
3216

there's an extra whitespace here, i'll get rid of it

@rjmccall : can you comment on the CallExpr's trailing storage issue? And give advice as to what you meant in the last review?

clang/include/clang/AST/Expr.h
2122

needs a newline between functions.

2254

Is this really useful/usable at all? It seems to me that since this would require re-allocating this object that noone should be able to set this after construction.

2258

Why is this a separate function from getTrailingFPFeatures?

2259

If they have to be separate, the assert here doesn't really make sense, since getTrailingFPFeatures has the same assert.

2264

For the asserts, we should probably prefer using the hasStoredFPFeatures function.

2271

Is there use in having both this AND the get-stored, as opposed to just making everyone access via the same function? At least having 2 public versions aren't very clear what the difference is to me.

2702

This type already has trailing-storage type stuff. I think in the past review @rjmccall encouraged you to put this in the fake-trailing-storage like above.

clang/include/clang/Basic/LangOptions.h
197

Is this an unrelated change? What is the purpose for this?

388–389

Is this the same logic as getFromOpaqueInt? If so, we should probably just call that.

clang/include/clang/Sema/Sema.h
558

This comment is really oddly phrased and uses the 'stack'-noun as a verb? Something like: (please feel free to wordsmith): "This stack tracks the current state of Sema.CurFPFeatures."?

clang/lib/AST/Expr.cpp
1309

Is there a reason this isn't a member initializer?

erichkeane added inline comments.Apr 27 2020, 10:25 AM
clang/lib/Parse/ParsePragma.cpp
667

Oh boy these are some magic lookin numbers... Can you document these two lines?

2539

Replace this with BalancedDelimiterTracker instead, it gives consistent errors and are a bit easier to use. Additionally, I think it does some fixups that allow us to recover better.

I'd also suggest some refactoring with the PragmaFloatControlKind if/elses below. Perhaps handle the closing paren at the end, and do a switch-case for that handling.

clang/lib/Sema/SemaAttr.cpp
428

I guess I don't get why you're switching on both here? Can the two just be combined? I don't know if the 'NewValue = CurFPFeatures.getAsOpaqueInt();

FpPragmaStack.Act(Loc, Action, StringRef(), NewValue);'

part is sufficiently motivated to do 2 separate switches.

1020

Should we still be setting this even if there was an error?

clang/lib/Sema/SemaStmt.cpp
382

unrelated change?

398

Don't use curleys for single liners, both of these probably shouldn't need curleys at all. Comment could be at the top for clarity.

clang/lib/Serialization/ASTReaderStmt.cpp
684

Rather than this variable, why not just ask 'E' below?

mibintc marked 5 inline comments as done.Apr 27 2020, 12:29 PM

A couple replies to @erichkeane

clang/include/clang/AST/Expr.h
2254

It's only used during serialization (ASTReader); I guess the node has already been allocated by then so it's superfluous, because the allocation point could set this field.

2271

John suggested the name getStored hasStored as "less tempting" names. The getStored and hasStored are only used during Serialization. John suggested the getFPFeatures function as the public interface and it uses the LangOptions parameter. The features are saved in the node if they can't be recreated from the command line floating point options (due to presence of floating point pragma)

2702

whoops i meant that to go to the CallExprBits

clang/include/clang/Basic/LangOptions.h
197

it's a NFC the llvm:: prefix wasn't needed. maybe the clang formatter did that?

clang/lib/Serialization/ASTReaderStmt.cpp
684

yes i could do that. it would be a function call

mibintc updated this revision to Diff 260957.Apr 29 2020, 10:49 AM
mibintc marked 7 inline comments as done.

I dropped FPOptions from CallExpr -- if it's needed I will add it using TrailingStorage in a separate patch; I responded to @erichkeane 's code review

mibintc marked an inline comment as done.Apr 29 2020, 10:51 AM
mibintc added inline comments.
clang/include/clang/AST/Expr.h
2702

Adding FPOptions to CallExprBits makes the field too large, I'm going to drop adding FPOptions to CallExpr, i'll add it via trailingstorage in a separate patch if it's needed.

clang/lib/Parse/ParsePragma.cpp
2539

BalancedDelimiterTracker doesn't work here because there's no access to the Parser object. Rewriting it would be an extensive change and I'm doubtful about making this change. PragmaHandler is defined in Lex. I think there are 60 pragma's that use the PragmaHandler.

clang/lib/Sema/SemaAttr.cpp
1020

Should we still be setting this even if there was an error?

It's not harmful to set it, if there's an error diagnostic then there is no codegen right?

clang/lib/Serialization/ASTReaderStmt.cpp
684

i made some changes in this area, eliminating setHasStoredFPFeatures

2 small things, @rjmccall and @sepavloff , anything else?

clang/include/clang/Basic/DiagnosticSemaKinds.td
870

The last 4 can be done via selects as well! Save a couple more spaces before we have to up the diagnostic id size :)

clang/include/clang/Sema/Sema.h
558

Just needs a period at the end.

clang/lib/Parse/ParsePragma.cpp
2539

Thats unfortunate :/ That type does some nice fixup work.

mibintc updated this revision to Diff 261226.Apr 30 2020, 7:41 AM

I changed a comment to add a period, rebased, and used clang-format

mibintc marked an inline comment as done.Apr 30 2020, 7:43 AM
mibintc added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
870

The last 4 can be done via selects as well

Combining these 4 into 1 diagnostic is doable but it's ugly.

erichkeane added inline comments.Apr 30 2020, 8:00 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
870

Concur, I spent some time on it and don't really like it.

erichkeane accepted this revision.Apr 30 2020, 11:10 AM

Looks OK to me. Please give @sepavloff and @rjmccall a day or so to make final comments before committing.

This revision is now accepted and ready to land.Apr 30 2020, 11:10 AM
This revision was automatically updated to reflect the committed changes.
uabelho added a subscriber: uabelho.May 6 2020, 2:41 AM

Hi @mibintc ,

I think I'm seeing some oddities with this patch (current trunk, 0054c46095).
With

clang -S -Xclang -emit-llvm bbi-42553.c -o -

on the input

float rintf(float x);
#pragma STDC FENV_ACCESS ON

void t()
{
    (void)rintf(0.0f);
}

I get

bbi-42553.c:2:14: warning: pragma STDC FENV_ACCESS ON is not supported, ignoring pragma [-Wunknown-pragmas]
#pragma STDC FENV_ACCESS ON
             ^
; ModuleID = 'bbi-42553.c'
source_filename = "bbi-42553.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind optnone strictfp uwtable
define dso_local void @t() #0 {
entry:
  %0 = call float @llvm.experimental.constrained.rint.f32(float 0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.ignore") #2
  ret void
}

and if I remove the pragma I instead get

define dso_local void @t() #0 {
entry:
  %0 = call float @llvm.rint.f32(float 0.000000e+00)
  ret void
}

Before this patch I got the call to llvm.rint.f32 in both cases.

It seems to be this change in SemaStmt.cpp that makes the FENV-pragma have some effect regardless of the warning saying that the pragma doesn't have any effect:

index aa0d89ac09c3..f76994a6dab3 100644
@@ -394,6 +394,11 @@ StmtResult Sema::ActOnCompoundStmt(SourceLocation L, SourceLocation R,
                                    ArrayRef<Stmt *> Elts, bool isStmtExpr) {
   const unsigned NumElts = Elts.size();
 
+  // Mark the current function as usng floating point constrained intrinsics
+  if (getCurFPFeatures().isFPConstrained())
+    if (FunctionDecl *F = dyn_cast<FunctionDecl>(CurContext))
+      F->setUsesFPIntrin(true);
+

It seems to be this change in SemaStmt.cpp that makes the FENV-pragma have some effect regardless of the warning saying that the pragma doesn't have any effect:

index aa0d89ac09c3..f76994a6dab3 100644
@@ -394,6 +394,11 @@ StmtResult Sema::ActOnCompoundStmt(SourceLocation L, SourceLocation R,
                                    ArrayRef<Stmt *> Elts, bool isStmtExpr) {
   const unsigned NumElts = Elts.size();
 
+  // Mark the current function as usng floating point constrained intrinsics
+  if (getCurFPFeatures().isFPConstrained())
+    if (FunctionDecl *F = dyn_cast<FunctionDecl>(CurContext))
+      F->setUsesFPIntrin(true);
+

Thank you I will follow up!

We're also seeing test failures in Apple's Clang fork, e.g.

test/CodeGen/finite-math.c:12:10: error: NSZ: expected string not found in input
 // NSZ: fadd nsz
         ^
<stdin>:11:20: note: scanning from here
define void @foo() #0 {
                   ^
<stdin>:15:9: note: possible intended match here
 %add = fadd float %0, %1
        ^

I haven't checked yet whether we can reproduce upstream.

I posted a patch to fix the bug reported by @uabelho here https://reviews.llvm.org/D79510

@rjmccall I used check-clang and check-all on D71841 from my linux x86-64 server before submitting, and the testing was clear. Maybe your branch has some calls to create BinaryOperator that isn't passing FPFeatures with the correct value--related to D76384. My sandbox is showing this

%add = fadd nsz float %0, %1

I got a report that this patch was causing a problem with Windows <numeric> header because #pragma float_control should be supported in namespace context. I've posted a patch for review here https://reviews.llvm.org/D79631

plotfi added a subscriber: plotfi.May 8 2020, 2:59 PM

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

We're also seeing test failures in Apple's Clang fork, e.g.

test/CodeGen/finite-math.c:12:10: error: NSZ: expected string not found in input
 // NSZ: fadd nsz
         ^
<stdin>:11:20: note: scanning from here
define void @foo() #0 {
                   ^
<stdin>:15:9: note: possible intended match here
 %add = fadd float %0, %1
        ^

I haven't checked yet whether we can reproduce upstream.

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

Oh, thank you for figuring that out. Yeah, it's reasonable for code-gen option parsing to depend on language-option parsing, which means that this patch is wrong. The right fix is that we need to stop parsing these as code-gen options, which is reasonable since they have direct language-semantics impact. If we still need the code-gen option flags, we should be able to recreate them from the language options, I think.

@rjmccall @mibintc Can we revert this patch for now then, and re-land when this patch is reworked? It would be good to get those bots passing. @rjmccall are the bots that you see failing on your end public?

Hi @rjmccall, I am also seeing similar failures. It is failing on apple's master (and the swift branches as well) because ParseLangArgs and ParseCodeGenArgs are getting called in the opposite order in apple/master from the order they are called in llvm/master. I posted a PR to fix those failures here: https://github.com/apple/llvm-project/pull/1202

but I don't know if this is the most correct approach.

Oh, thank you for figuring that out. Yeah, it's reasonable for code-gen option parsing to depend on language-option parsing, which means that this patch is wrong. The right fix is that we need to stop parsing these as code-gen options, which is reasonable since they have direct language-semantics impact. If we still need the code-gen option flags, we should be able to recreate them from the language options, I think.

This comment was removed by mibintc.
mibintc marked 2 inline comments as done.May 11 2020, 5:52 AM

Some inline replies/comments to @rjmccall and @plotfi

clang/lib/Frontend/CompilerInvocation.cpp
3185

@rjmccall @plotfi These earlier patches are also deriving the value of LangOpts from the settings of CG opts

3187

@rjmccall @plotfi here the codegen args are evaluated first. Perhaps we could have a "reconcile codegen and lang args" function which would resolve the floating point settings into a final setting? so that codegen or lang could be parsed in either order?

@rjmccall Uncertain how to proceed, can you recommend? If I recall correctly, I added the lines in CompilerOptions because there were many failing lit tests, i could have fixed the lit fails by adding the lang options to the lit tests. (of course that change could have other repercussions)

mibintc added inline comments.May 11 2020, 10:16 AM
clang/lib/Frontend/CompilerInvocation.cpp
3192

@rjmccall I could set these by using Args.hasArg instead of CGOpts, would that be acceptabel?

rjmccall added inline comments.May 11 2020, 10:32 AM
clang/lib/Frontend/CompilerInvocation.cpp
3185

I don't know what you mean here; the code you're quoting just seems to be looking at Args. It's fine to re-parse arguments in both places if that makes something easier. The problem is that you're looking at the CodeGenOptions structure itself.

rjmccall added inline comments.May 11 2020, 10:49 AM
clang/lib/Frontend/CompilerInvocation.cpp
3192

I think so, yes. Ideally the CG options would then be set based on the earlier values, or replaced with uses of the language options structure, but not having a direct dependency at all may be simpler.

plotfi added a subscriber: ab.May 11 2020, 11:01 AM

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

@rjmccall Uncertain how to proceed, can you recommend? If I recall correctly, I added the lines in CompilerOptions because there were many failing lit tests, i could have fixed the lit fails by adding the lang options to the lit tests. (of course that change could have other repercussions)

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

That was a good fix. I am pretty sure this does mean the diagnostics-order.c will fail on apple's bots. The same diagnostics lines print, but in the wrong order. I haven't root caused that yet.

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

@rjmccall @mibintc I think the diagnostics-order.c test is still behaving correctly technically. The note lines are still printing with the associated error lines, it just happens that one of the warning lines prints at the end instead of in the middle. ie:

error: invalid value '-foo' in '-verify='
note: -verify prefixes must start with a letter and contain only alphanumeric characters, hyphens, and underscores
error: invalid value 'bogus' in '-std=bogus'
note: use 'c89', 'c90', or 'iso9899:1990' for 'ISO C 1990' standard
...
warning: optimization level '-O999' is not supported; using '-O3' instead

instead of

error: invalid value '-foo' in '-verify='
note: -verify prefixes must start with a letter and contain only alphanumeric characters, hyphens, and underscores
warning: optimization level '-O999' is not supported; using '-O3' instead
error: invalid value 'bogus' in '-std=bogus'
note: use 'c89', 'c90', or 'iso9899:1990' for 'ISO C 1990' standard
...

That was a good fix. I am pretty sure this does mean the diagnostics-order.c will fail on apple's bots. The same diagnostics lines print, but in the wrong order. I haven't root caused that yet.

@ab @rjmccall @mibintc Posted D79730 for consideration.
@mibintc can you produce a version of _this_ diff that works with D79730 applied. Currently the following fail, as they do on Apple Master:

@rjmccall accepted the proposed patch https://reviews.llvm.org/D79735, so I pushed that. I also tried your patch and the 3 CodeGen tests pass but the diagnostics-order.c test fails

Failing Tests (4):
  Clang :: CodeGen/finite-math.c
  Clang :: CodeGen/fp-floatcontrol-stack.cpp
  Clang :: CodeGenOpenCL/relaxed-fpmath.cl
  Clang :: Frontend/diagnostics-order.c

I think we might have had to change that test on our fork when we changed the parsing order.

michele.scandale added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
227

I'm not convinced it correct to set contract when allowFPContractWithinStatement return true. Can someone clarify this?

If I compile the following example with -ffp-contract=on:

float test1(float a, float b, float c) {
  float x = a * b;
  return x + c;
}

float test2(float a, float b, float c) {
  return a * b + c;
}

Before this change the generated code was:

define float @test1(float %a, float %b, float %c) {
  %0 = fmul float %a, %b
  %1 = fadd float %0, %c
  ret float %1
}

define float @test2(float %a, float %b, float %c) {
  %0 = call float @llvm.fmuladd.f32(float %a, float%b, float %c)
  ret float %0
}

And my understanding is that the in-statement contraction is implemented by emitting the llvm.fmuladd call that a backend might decide to implement as fmul + fadd or as fma.

With this change the generated code is:

define float @test1(float %a, float %b, float %c) {
  %0 = fmul contract float %a, %b
  %1 = fadd contract float %0, %c
  ret float %1
}

define float @test2(float %a, float %b, float %c) {
  %0 = call contract float @llvm.fmuladd.f32(float %a, float%b, float %c)
  ret float %0
}

and it seems to me that in test1 (where multiple statements where explicitly used) the optimizer is now allowed to perform the contraction, violating the original program semantic where only "in-statement" contraction was allowed.

IIUC, the way within-statement contraction is supposed to work is that there are supposed to be blocking intrinsics inserted at various places. I don't remember the details, though, or if anyone's thought about how it interacts with cross-statement contraction, which this pragma permits within the same function (and could occur with inlining regardless).

clang/include/clang/Basic/LangOptions.h
396–401

Same comment on LangOpts.FastMath || as the one for CompilerInvocation.cpp.

clang/lib/Frontend/CompilerInvocation.cpp
3190–3196

Why do we need Opts.FastMath || here? The code in the compiler driver clang/lib/Driver/ToolChains/Clang.cpp (https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Clang.cpp#L2510) already takes care of generating the right flags for the CC1 to configure the floating point rules.

Moreover, if we ignore what the compiler driver does, the fact that Args.hasArg(OPT_ffast_math) is not considered in the definition of the codegen options such as NoInfsFPMath, NoNaNsFPMath, NoSignedZeros, Reassociate, so the you have already two distinct options for the same abstract property that might not match.

I think that at the CC1 level the reasoning should be done in terms of the fine grain options, and let the compiler driver makes life easy for the users -- i.e. LangOpts.FastMath should just control whether the macro __FAST_MATH__ is defined or not.

IIUC, the way within-statement contraction is supposed to work is that there are supposed to be blocking intrinsics inserted at various places. I don't remember the details, though, or if anyone's thought about how it interacts with cross-statement contraction, which this pragma permits within the same function (and could occur with inlining regardless).

Prior to this change contract was never generated in the case of in-statement contraction only, instead clang was emitting llvm.fmuladd to inform the backend that only those were eligible for contraction. From a correctness perspective I think this was perfectly fine.

Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such mechanism is in place, I think that generating the contract fast-math flag also for in-statement contraction is wrong because it breaks the original program semantic.

Am I missing something?

No, if that's how we handle that, then you're absolutely right that we shouldn't set the contract flag.

scanon added a subscriber: scanon.May 12 2020, 10:17 AM

Prior to this change contract was never generated in the case of in-statement contraction only, instead clang was emitting llvm.fmuladd to inform the backend that only those were eligible for contraction. From a correctness perspective I think this was perfectly fine.

Currently I don't see any logic to generate "blocking intrinsics" (I guess to define a region around the instructions emitted for the given statement). Until such mechanism is in place, I think that generating the contract fast-math flag also for in-statement contraction is wrong because it breaks the original program semantic.

This is exactly right. If we are going to have new in-statement optimizations, then we probably do need to add some blocking intrinsic (which would be elidable given suitable fast-math flags); the system of generating fmuladd works well for FMA contraction, but doesn't really generalize to other optimizations of this sort.

mibintc added inline comments.May 12 2020, 10:51 AM
clang/lib/CodeGen/CGExprScalar.cpp
227

Thanks @michele.scandale i will work on a patch for this

yaxunl added a subscriber: yaxunl.May 12 2020, 5:04 PM
yaxunl added inline comments.
clang/lib/Serialization/ASTReader.cpp
7899

This changes the behavior regarding AST reader and seems to be too hash restriction. Essentially this requires a pch can only be used with the same fp options with which the pch is generated. Since there are lots of fp options, it is impractical to generate pch for all the combinations.

We have seen regressions due to this assertion.

Can this assertion be dropped or done under some options?

Thanks.

mibintc marked an inline comment as done.May 13 2020, 8:14 AM
mibintc added inline comments.
clang/lib/Serialization/ASTReader.cpp
7899

@yaxunl Can you please send me a reproducer, I'd like to see what's going on, not sure if just getting rid of the assertion will give the desired outcome.

yaxunl added inline comments.May 13 2020, 9:51 AM
clang/lib/Serialization/ASTReader.cpp
7899

Pls apply the patch.

Thanks.

mibintc marked 2 inline comments as done.May 13 2020, 1:46 PM
mibintc added inline comments.
clang/lib/CodeGen/CGExprScalar.cpp
227

@michele.scandale I posted a patch for 'contract' here, https://reviews.llvm.org/D79903

clang/lib/Serialization/ASTReader.cpp
7899

@rjmccall In the example supplied by @yaxunl, the floating point options in the pch file when created are default, and the floating point options in the use have no-signed-zeros flag. The discrepancy causes an error diagnostic when the pch is used. I added the FMF flags into FPFeatures in this patch, I made them COMPATIBLE_LANGOPT which is the encoding also being used for FastMath, FiniteMathOnly, and UnsafeFPMath. Do you have some advice about this issue?

rjmccall added inline comments.May 13 2020, 3:00 PM
clang/lib/Serialization/ASTReader.cpp
7899

A couple things are going on here.

First: a PCH can only end at the top level, not in the middle of a declaration, but otherwise Sema can be in an arbitrary semantic configuration. That definitely includes arbitrary pragmas being in effect, so in general the end state might not match the default FP state, so this assertion is bogus. When loading a PCH, you need to restore the pragma stack and current FP state to the configuration it was in at the end of the PCH.

Second: if you restore the pragma stack and FP state naively given the current representation of FP state, you will completely overwrite the FP settings of the current translation unit with the FP settings that were in effect when the PCH was built, which is obviously not okay. This is one way (among several) that the current representation is not really living up to the statement that these language options are "compatible". The better way to do this would be for the pragma stack and Expr nodes to record the current set of overrides in effect rather than the absolute current state; this could then be easily applied to an arbitrary global FP state.

clang/lib/CodeGen/CGExprScalar.cpp
227

Thanks!

I documented the issue reported by @yaxunl here, https://bugs.llvm.org/show_bug.cgi?id=46166, and take ownership of the bug. Thanks for the report.

cfang added a subscriber: cfang.Jun 23 2020, 3:12 PM

-ffast-math flag got lost in the Builder after this change.

FMF.isFast() is true before updateFastMathFlags(FMF, FPFeatures), but turns false after.
It seems the Builder.FMF has been correctly set before, but I am not clear what FPFeatures should be at this point:

+static void setBuilderFlagsFromFPFeatures(CGBuilderTy &Builder,
+ CodeGenFunction &CGF,
+ FPOptions FPFeatures) {
+ auto NewRoundingBehavior = FPFeatures.getRoundingMode();
+ Builder.setDefaultConstrainedRounding(NewRoundingBehavior);
+ auto NewExceptionBehavior =
+ ToConstrainedExceptMD(FPFeatures.getExceptionMode());
+ Builder.setDefaultConstrainedExcept(NewExceptionBehavior);
+ auto FMF = Builder.getFastMathFlags();
+ updateFastMathFlags(FMF, FPFeatures);
+ Builder.setFastMathFlags(FMF);
+ assert((CGF.CurFuncDecl == nullptr || Builder.getIsFPConstrained() ||
+ isa<CXXConstructorDecl>(CGF.CurFuncDecl) ||
+ isa<CXXDestructorDecl>(CGF.CurFuncDecl) ||
+ (NewExceptionBehavior == llvm::fp::ebIgnore &&
+ NewRoundingBehavior == llvm::RoundingMode::NearestTiesToEven)) &&
+ "FPConstrained should be enabled on entire function");
+}

I got a bug report about this patch, see https://bugs.llvm.org/show_bug.cgi?id=49479. I put a patch to fix it here, https://reviews.llvm.org/D98211