
Enable '#pragma STDC FENV_ACCESS' in frontend
Abandoned · Public

Authored by sepavloff on Oct 21 2019, 10:49 AM.

Details

Summary

This change implements support of '#pragma STDC FENV_ACCESS' in
clang. Use of the pragma in any block of a function results in using
constrained intrinsics to represent floating point operations in the
whole function. The function is marked by the attribute 'StrictFP' in
AST and 'strictfp' in IR.
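
As a rough illustration of the kind of source code the pragma is aimed at (a sketch, not taken from the patch or its tests), the function below changes the dynamic rounding mode around a single division. The helper name `divideRoundedUp` is hypothetical. The `volatile` qualifiers are an assumption-driven safeguard for compilers that ignore the pragma and might otherwise fold or move the division.

```cpp
#include <cassert>
#include <cfenv>

// Announces that the code below may read or modify the floating-point
// environment. Compilers without support may ignore the pragma.
#pragma STDC FENV_ACCESS ON

// Hypothetical helper: computes num / den rounded toward +infinity, then
// restores the rounding mode that was active on entry.
double divideRoundedUp(double num, double den) {
  const int savedMode = std::fegetround();
  const int status = std::fesetround(FE_UPWARD);
  assert(status == 0 && "FE_UPWARD is not supported on this target");
  (void)status;  // avoids an unused-variable warning when NDEBUG is defined

  // volatile keeps the loads, the division, and the store between the two
  // fesetround calls even if the compiler does not honor the pragma.
  volatile double n = num;
  volatile double d = den;
  volatile double quotient = n / d;

  std::fesetround(savedMode);
  return quotient;
}
```

With a compiler that honors the pragma, the `volatile` qualifiers should not be needed, since the division may then neither be folded at translation time nor moved across the `fesetround` calls.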

Event Timeline

sepavloff created this revision. Oct 21 2019, 10:49 AM
Herald added a project: Restricted Project. Oct 21 2019, 10:49 AM
kpn added a comment. Oct 21 2019, 11:03 AM

Does this work for C++? C++ templates? I only see C tests.

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

rjmccall added inline comments. Oct 21 2019, 11:40 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

What's the purpose of this restriction? Whether inline really has much to do with inlining depends a lot on the exact language settings. (Also, even if this restriction were okay, the diagnostic is quite bad given that there are three separate conditions that can lead to it firing.)

Also, I thought we were adding instruction-level annotations for this kind of thing to LLVM IR. Was that not in service of implementing this pragma?

I'm not categorically opposed to taking patches that only partially implement a feature, but I do want to feel confident that there's a reasonable technical path forward to the full implementation. In this case, it feels like the function-level attribute is a dead end technically.

Thanks for the patch! I don't have time to review this in detail this week, but I'm very happy to see this functionality.

clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

I'm guessing this is intended to avoid the optimization problems that would occur (currently) if a function with strictfp were inlined into a function without it. I'm just guessing though, so correct me if I'm wrong.

As I've said elsewhere, I hope this is a temporary problem. It is a real problem though (as is the fact that the inliner isn't currently handling this case correctly).

What would you think of a new command line option that caused us to mark functions with strictfp as noinline? We'd still need an error somewhat like this, but I feel like that would be more likely to accomplish what we want on a broad scale.

hfinkel added inline comments. Oct 21 2019, 2:05 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

We would not want to prevent all inlining, just inlining where the attributes don't match. We should fix this first. I think we just need to add a CompatRule to include/llvm/IR/Attributes.td (or something like that).

sepavloff marked an inline comment as done. Oct 22 2019, 12:27 AM

I'll try to organize my replies for easier reference.

Background

According to the current design, if floating point operations are represented by constrained intrinsics somewhere in a function, constrained intrinsics must be used in the entire function, including inlined function calls (http://lists.llvm.org/pipermail/cfe-dev/2017-August/055325.html). As constrained intrinsics do not participate in optimizations, this may lead to a performance drop (discussed in http://lists.llvm.org/pipermail/llvm-dev/2019-August/134641.html). There was an attempt to alleviate this issue using basic block attributes, but this approach was rejected (http://lists.llvm.org/pipermail/llvm-dev/2019-October/135623.html).

In this case, allowing #pragma STDC FENV_ACCESS at function level seems a good compromise between performance and flexibility. Users get the ability to use a non-standard floating point environment, and the performance impact is limited to the scope of one function, a scope that is controlled by the user.

The more general solution, where #pragma STDC FENV_ACCESS is allowed in any block inside a function, as the standard requires, would require extending the scope where constrained intrinsics are used. It would result in a performance drop, which is unacceptable in many cases. There are ideas to implement the general case using outlining (http://lists.llvm.org/pipermail/llvm-dev/2019-October/135628.html); it could be a promising way to extend the functionality of this solution.

Inlining

Inlining is an issue for functions with #pragma STDC FENV_ACCESS, because if such a function is inlined, the host function must use constrained intrinsics, which can result in a performance drop. The simplest approach is to prohibit inlining of such functions; this is the approach used in this patch. A more elaborate solution would try to merge attributes where possible. For instance, if the host function does not use floating point operations, the function with #pragma STDC FENV_ACCESS may be inlined. However, that does not look like a job for the frontend.

In D69272#1717021, @kpn wrote:

Does this work for C++? C++ templates? I only see C tests.

Will add C++ tests soon.

In D69272#1717021, @kpn wrote:

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

Function outlining may be a general solution.

clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

As Andrew already said, the noinline attribute is a means to limit the negative performance impact. Of course, to inline or not to inline - such a decision is made by the backend. However, if a user requested a function to be inline, a warning looks useful.

When constrained intrinsics get full support in optimizations, this restriction will become unnecessary.

Outlining is one way to turn this solution into a full implementation of the pragma. Another is implementing support for constrained intrinsics in optimization transformations.

As for a new command line option that causes us to mark functions with strictfp as noinline, it looks like a good idea, but we must adapt the inliner first, so that it can convert ordinary floating point operations to constrained intrinsics during inlining.

In the case of #pragma STDC FENV_ACCESS we cannot in general say whether attributes are compatible so that a function can be inlined into another. The pragma only says that the user modifies the floating point environment. One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining. More fine-grained pragmas, like the one proposed in https://reviews.llvm.org/D65997, could enable more flexible inlining.

kpn added a comment. Oct 22 2019, 7:33 AM

Baking into the front end the fact that the backend implementation is not yet complete doesn't strike me as a good idea.

And the metadata arguments to the constrained intrinsics are designed to allow for correctly marked constrained intrinsics to be eventually treated pretty close to the same as non-constrained math instructions. Once the implementation is further along, of course.

I think the issue with the inliner not being smart enough yet is an issue for llvm to deal with and not front ends like clang. It would be straightforward enough for llvm to mark functions that have the strictfp attribute so they also are marked noinline. As a temporary measure, of course. This is a case where llvm hasn't caught up with well-formed IR, so it would be llvm's job to work around its own incompleteness.

See D43142 for code to convert all floating point in a function into constrained intrinsics. Updated versions of this code with support for intrinsics that didn't exist at the time also exist. I don't see why this pass couldn't be reworked a bit to be used by the inliner. And it would only be needed when inlining into a strictfp function a function that wasn't strictfp.

You mentioned that extending the scope of the #pragma may result in a "performance drop, which is unacceptable in many cases". But the only difference between allowing the #pragma only at the top of a function, and allowing it everywhere the standard allows, is that the user knows about the potential loss of performance. The performance loss happens in both cases. Again, I don't think baking into clang the current state of llvm is a good idea.

A warning from clang that strictfp code doesn't perform very well today is probably a good idea, and it would be ripped out easily when the day comes. The warning would only fire when the #pragma is seen, and that code is small, self-contained, and actually already exists in clang now but with different text.

hfinkel added inline comments. Oct 22 2019, 9:31 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

Of course, to inline or not to inline - such a decision is made by the backend. However, if a user requested a function to be inline, a warning looks useful.

We need to be careful. inline is not just an optimization hint. It also affects function linkage. Also, when inlining into functions with similar constraints, there's no problem with the inlining.

One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining.

Can you please explain this? It does not seem like that should block inlining when both the caller and callee are marked as fenv_access-enabled.

In D69272#1717967, @kpn wrote:

Baking into the front end the fact that the backend implementation is not yet complete doesn't strike me as a good idea.

I don't expect that this patch would pass review quickly. But it could be used to collect requests for what must be done to implement this functionality. Now I see that we need to add functionality to the inliner. Do you have ideas about what else should be done to implement this variant of the pragma?

I think the issue with the inliner not being smart enough yet is an issue for llvm to deal with and not front ends like clang. It would be straightforward enough for llvm to mark functions that have the strictfp attribute so they also are marked noinline. As a temporary measure, of course. This is a case where llvm hasn't caught up with well-formed IR, so it would be llvm's job to work around its own incompleteness.

That's true. But if the user specifies inline for a function that contains the pragma, he has requested contradictory attributes. Should the compiler emit a warning? If yes, this is a job for the frontend.

See D43142 for code to convert all floating point in a function into constrained intrinsics. Updated versions of this code with support for intrinsics that didn't exist at the time also exist. I don't see why this pass couldn't be reworked a bit to be used by the inliner. And it would only be needed when inlining into a strictfp function a function that wasn't strictfp.

I think this patch can help in adapting the inliner. The transformation must be rewritten as a function or even built into the logic of the inliner.

You mentioned that extending the scope of the #pragma may result in a "performance drop, which is unacceptable in many cases". But the only difference between allowing the #pragma only at the top of a function, and allowing it everywhere the standard allows, is that the user knows about the potential loss of performance. The performance loss happens in both cases. Again, I don't think baking into clang the current state of llvm is a good idea.

A user may be convinced that using the pragma is expensive. He would carefully implement the function with the pragma, and if the function is small and called rarely, the performance drop can be minimized. Such a solution does not work for all cases, but for some it is acceptable. If using the pragma in a small region results in loss of optimization in the entire function, this is counterintuitive and in many cases unacceptable.

A warning from clang that strictfp code doesn't perform very well today is probably a good idea, and it would be ripped out easily when the day comes. The warning would only fire when the #pragma is seen, and that code is small, self-contained, and actually already exists in clang now but with different text.

I don't see how such a warning can help a user. A note about the impact of the pragma on performance can be put into the documentation. Issuing a warning on every use of the pragma may be annoying.

The main advantage of this restricted variant of the pragma, IMHO, is the possibility of providing an implementation which can be developed in reasonable time and can be used in production code. Users will file bugs, we will fix them, and the implementation will progress. If the pragma causes substantial performance loss, the number of its users will be lower and development will slow down.

I don't see how such a warning can help a user. A note about the impact of the pragma on performance can be put into the documentation. Issuing a warning on every use of the pragma may be annoying.

I definitely agree. Performance may be fine in many cases, and performance may not be *relatively* important where the pragma is used. I don't believe that a warning should be added for this.

sepavloff marked an inline comment as done. Oct 22 2019, 10:09 AM
sepavloff added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

We need to be careful. inline is not just an optimization hint. It also affects function linkage. Also, when inlining into functions with similar constraints, there's no problem with the inlining.

This is an argument in favor of the warning on conflicting attributes.

One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining.

Can you please explain this? It does not seem like that should block inlining when both the caller and callee are marked as fenv_access-enabled.

I was wrong. If a function correctly restores FP state, it can be inlined into another fenv_access-enabled function.

hfinkel added inline comments. Oct 22 2019, 11:43 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

This is an argument in favor of the warning on conflicting attributes.

No, because they're not conflicting (as you note, we can still inline), and because a function might need inline linkage even if the user doesn't care about actual function inlining (this is an unfortunate side effect of the C/C++ specifications providing one keyword for both an optimization hint and a linkage modifier).

Okay. If the optimizer cannot correctly handle a mix of constrained and unconstrained FP operations, then the optimizer should protect itself, either by refusing to inline across such boundaries or by adding constraints as necessary before inlining. If it does that, I don't think it's particularly appropriate for the frontend to warn about the combination of constrained FP and supposedly inlining-related attributes (which only really seems true of always_inline). We should just mention in the documentation that this isn't particularly performant at the moment.
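
The inlining positions discussed so far (inline freely when the attributes match, add constraints before inlining ordinary code into a strictfp caller, otherwise refuse) can be summarized in a small toy model. Everything in it is hypothetical: `FunctionInfo`, `InlineAction` and `decideInlining` are illustration-only names and do not correspond to LLVM's inliner or its attribute compatibility rules. Refusing to inline a strictfp callee into an ordinary caller is an assumption of this sketch, chosen as the conservative option.

```cpp
#include <cassert>

// Toy model only: hypothetical types that do not mirror LLVM's API.
struct FunctionInfo {
  bool isStrictFP;  // the function carries the strictfp attribute
};

enum class InlineAction {
  InlineAsIs,                 // caller and callee agree, nothing to fix up
  ConstrainCalleeThenInline,  // rewrite the callee's FP operations first
  DoNotInline                 // leave the call in place
};

// Decides what a self-protecting optimizer could do for one call site.
InlineAction decideInlining(const FunctionInfo &caller,
                            const FunctionInfo &callee,
                            bool canConstrainCallee) {
  if (caller.isStrictFP == callee.isStrictFP)
    return InlineAction::InlineAsIs;
  if (caller.isStrictFP)  // ordinary callee, strictfp caller
    return canConstrainCallee ? InlineAction::ConstrainCalleeThenInline
                              : InlineAction::DoNotInline;
  // strictfp callee, ordinary caller: conservative choice in this sketch
  return InlineAction::DoNotInline;
}

// Small usage example of the rules above.
void checkToyRules() {
  const FunctionInfo strict{true};
  const FunctionInfo plain{false};
  assert(decideInlining(strict, strict, false) == InlineAction::InlineAsIs);
  assert(decideInlining(strict, plain, true) ==
         InlineAction::ConstrainCalleeThenInline);
  assert(decideInlining(plain, strict, true) == InlineAction::DoNotInline);
}
```

A less conservative variant could instead convert the whole caller to constrained operations in the last case, at the performance cost described earlier in the thread.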

I feel quite confident that we can solve the engineering problem of efficiently figuring out that there's a constrained scope somewhere in a function and therefore we need constrained intrinsics.

sepavloff updated this revision to Diff 227690. Nov 4 2019, 4:09 AM

Removed diagnostics on inline functions

As pointed out in review, the strictfp attribute does not prevent inlining;
it only restricts the cases where inlining is possible.

rjmccall added inline comments. Nov 5 2019, 3:30 PM
clang/include/clang/Basic/DiagnosticParseKinds.td
1119

"'#pragma STDC FENV_ACCESS ON' is only supported in the outermost block in a function, ignoring"

Although, as mentioned, it would be better if we can just support this, if necessary by pessimizing the rest of the function. You're already marking the definition with an attribute when you see that there's a pragma within it. Why don't we just (1) add that attribute (once) whenever FENV_ACCESS is on at any point within a function and (2) make sure that we use FP constraints on every FP operation inside a function with the attribute?

clang/lib/Parse/ParsePragma.cpp
668

This is not ignoring the pragma; this is treating it as if it said OFF.

In D69272#1717021, @kpn wrote:

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

Please clarify: I understand that the backend wants all floating point operations to be built using Floating Point Constrained Intrinsics if any operations use constrained intrinsic. But I thought that, if the constraint were only to apply to a block within the function, that the operations outside the block would be written with the default setting for rounding mode and exception behavior, and the operations inside the constrained block would be created with different settings. Like this
float f(float a, float b) {
  a*b; // this operation is written with floating point constrained intrinsic, rounding mode tonearest, exception behavior ignore
  {
    #pragma float_control ...
    // set exception behavior to strict
    a*b; // this operation is written with floating point constrained intrinsic, rounding mode tonearest, exception behavior strict
  }
}

kpn added a comment. Dec 20 2019, 8:03 AM
In D69272#1717021, @kpn wrote:

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

Please clarify: I understand that the backend wants all floating point operations to be built using Floating Point Constrained Intrinsics if any operations use constrained intrinsic. But I thought that, if the constraint were only to apply to a block within the function, that the operations outside the block would be written with the default setting for rounding mode and exception behavior, and the operations inside the constrained block would be created with different settings. Like this

Your understanding is correct.

My understanding of this patch is that it only allows the #pragma at the top of each function. It doesn't allow it in blocks inside the function. So if a function has a block inside it that uses strict FP the patch doesn't change the rest of the function to use constrained FP with the settings like you said. And my question was asking if there was a way forward with this patch to a full implementation.

In D69272#1792877, @kpn wrote:

My understanding of this patch is that it only allows the #pragma at the top of each function. It doesn't allow it in blocks inside the function. So if a function has a block inside it that uses strict FP the patch doesn't change the rest of the function to use constrained FP with the settings like you said. And my question was asking if there was a way forward with this patch to a full implementation.

@hfinkel proposed to use outlining to extract a block with the #pragma into a separate function. It could be a basis for a full implementation.

In D69272#1792877, @kpn wrote:

My understanding of this patch is that it only allows the #pragma at the top of each function. It doesn't allow it in blocks inside the function. So if a function has a block inside it that uses strict FP the patch doesn't change the rest of the function to use constrained FP with the settings like you said. And my question was asking if there was a way forward with this patch to a full implementation.

@hfinkel proposed to use outlining to extract a block with the #pragma into a separate function. It could be a basis for a full implementation.

I don't think outlining is a reasonable approach. Outlining has a *lot* of other performance consequences, and we'd have to support arbitrary control flow (e.g. goto) in and out of the outlined function, which adds a lot of frontend complexity in pursuit of something that ought to be handled at the optimizer level.

If a function has any blocks with the #pragma, we just need to emit the whole function as strictfp. I believe the constrained FP intrinsics can take arguments that make them semantically equivalent to the default rule. If we don't emit code outside of those blocks as efficiently as we would've before, well, that seems like a solvable optimization problem.
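
The "whole function as strictfp" rule can be sketched with a toy model. This is only an illustration under stated assumptions: `Block`, `containsFenvAccess`, `emitBlock` and `emitFunction` are hypothetical names unrelated to clang's AST or IRGen, and the emitted strings are descriptive labels rather than real IR. The model assumes that a pragma at the start of a block also covers the blocks nested inside it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical types unrelated to clang's AST.
struct Block {
  bool fenvAccessOn;            // block begins with '#pragma STDC FENV_ACCESS ON'
  int fpOperations;             // number of FP operations directly in this block
  std::vector<Block> children;  // nested blocks, visited after the direct operations
};

// True if the pragma is on in this block or in any block nested inside it.
bool containsFenvAccess(const Block &block) {
  if (block.fenvAccessOn)
    return true;
  for (const Block &child : block.children)
    if (containsFenvAccess(child))
      return true;
  return false;
}

// Appends one descriptive label per FP operation (labels are not real IR).
void emitBlock(const Block &block, bool functionIsStrict, bool pragmaInherited,
               std::vector<std::string> &out) {
  assert(block.fpOperations >= 0);
  const bool pragmaActive = pragmaInherited || block.fenvAccessOn;
  std::string label = "plain fmul";
  if (functionIsStrict)
    label = pragmaActive
                ? "constrained fmul (strict exceptions)"
                : "constrained fmul (default rounding, exceptions ignored)";
  for (int i = 0; i < block.fpOperations; ++i)
    out.push_back(label);
  for (const Block &child : block.children)
    emitBlock(child, functionIsStrict, pragmaActive, out);
}

// The whole function becomes strict as soon as any block contains the pragma.
std::vector<std::string> emitFunction(const Block &body) {
  std::vector<std::string> out;
  emitBlock(body, containsFenvAccess(body), /*pragmaInherited=*/false, out);
  return out;
}
```

In this model a function without the pragma keeps its plain operations, while a single inner block with the pragma switches every operation of the function to a constrained form, with default-equivalent settings outside that block.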

@hfinkel proposed to use outlining to extract a block with the #pragma into a separate function. It could be a basis for a full implementation.

I don't think outlining is a reasonable approach. Outlining has a *lot* of other performance consequences, and we'd have to support arbitrary control flow (e.g. goto) in and out of the outlined function, which adds a lot of frontend complexity in pursuit of something that ought to be handled at the optimizer level.

The restriction on using the #pragma at the top level only may be considered a request for 'manual outlining'. The user has to extract the piece of code that uses the pragma, solving problems like data passing and control flow.

Arbitrary control flow creates problems even with full support of the pragma. If an exception occurs inside a strictfp function call, the exception handler would run in an unknown FP environment, and the compiler in general cannot restore it. This is a problem to be solved.

Anyway, outlining would be a temporary solution. When the compiler can process constrained intrinsics as efficiently as regular nodes, the need for outlining will disappear.

If a function has any blocks with the #pragma, we just need to emit the whole function as strictfp. I believe the constrained FP intrinsics can take arguments that make them semantically equivalent to the default rule. If we don't emit code outside of those blocks as efficiently as we would've before, well, that seems like a solvable optimization problem.

Eventually we should do this. But for now, using constrained intrinsics means a substantial performance drop, down to an O0-like level. This pragma is interesting for users who want to use it as a way to optimize their programs. For example, a user wants to enable exceptions in some part of the program to avoid multiple checks. For such users, poor performance is an unacceptable price. The fact that use of the pragma in a small block kills the performance of the entire function is counterintuitive and can create a lot of misunderstanding.
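
One possible reading of "avoid multiple checks" is to test the floating-point exception flags once after a batch of operations instead of validating every intermediate result. The sketch below assumes that reading; `scaleAll` is a hypothetical helper, and the `volatile` temporaries only guard against compilers that ignore the pragma, at a cost that a compiler honoring the pragma would not require.

```cpp
#include <cassert>
#include <cfenv>
#include <cstddef>

#pragma STDC FENV_ACCESS ON

// Hypothetical helper: multiplies every element by 'factor' in place and
// returns true if no multiplication raised FE_OVERFLOW or FE_INVALID.
// The flags are tested once after the loop, not after each operation.
bool scaleAll(double *values, std::size_t count, double factor) {
  assert(values != nullptr || count == 0);
  std::feclearexcept(FE_OVERFLOW | FE_INVALID);
  for (std::size_t i = 0; i < count; ++i) {
    // volatile keeps each multiplication between the clear and the test
    // even if the compiler does not honor the pragma.
    volatile double input = values[i];
    volatile double product = input * factor;
    values[i] = product;
  }
  return std::fetestexcept(FE_OVERFLOW | FE_INVALID) == 0;
}
```

A caller can then treat a `false` result as "at least one element overflowed or was invalid" and fall back to a slower, checked path.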

Putting a restriction on use of the pragma is, of course, a temporary solution; it is not usable in all cases. But for some cases it is usable in production code. Where small pieces of code can be extracted into separate functions, this solution can provide a tolerable performance loss, as long as most of the program doesn't use constrained intrinsics. The warning protects users from false expectations. Use in production code ensures further development of the feature.

A full-fledged solution requires full support of constrained intrinsics in optimizations. It is not clear how large this work is, but odds are that it would require substantial effort. In this case a usable implementation of '#pragma STDC FENV_ACCESS' would be postponed. The restricted solution can be implemented much faster, and it does not impede development of the full-fledged one.

@hfinkel proposed to use outlining to extract a block with the #pragma into a separate function. It could be a basis for a full implementation.

I don't think outlining is a reasonable approach. Outlining has a *lot* of other performance consequences, and we'd have to support arbitrary control flow (e.g. goto) in and out of the outlined function, which adds a lot of frontend complexity in pursuit of something that ought to be handled at the optimizer level.

The restriction on using the #pragma at the top level only may be considered a request for 'manual outlining'. The user has to extract the piece of code that uses the pragma, solving problems like data passing and control flow.

Imposing the restriction is reasonable. Doing a massive amount of throw-away work in IRGen in order to outline arbitrary sub-functions, control flow and all, just because optimizer support isn't ideal, is not. Just don't lift the restriction until you get the optimizer to a tenable position.

If a function has any blocks with the #pragma, we just need to emit the whole function as strictfp. I believe the constrained FP intrinsics can take arguments that make them semantically equivalent to the default rule. If we don't emit code outside of those blocks as efficiently as we would've before, well, that seems like a solvable optimization problem.

Eventually we should do this. But for now, using constrained intrinsics means a substantial performance drop, down to an O0-like level. This pragma is interesting for users who want to use it as a way to optimize their programs. For example, a user wants to enable exceptions in some part of the program to avoid multiple checks. For such users, poor performance is an unacceptable price. The fact that use of the pragma in a small block kills the performance of the entire function is counterintuitive and can create a lot of misunderstanding.

This argument is just as much an argument against outlining if not more. Nobody would expect the compiler to implement things that way, and it is very likely to have a major performance impact on the rest of the function by preventing optimization of any variable referenced from the outlined function. LLVM is generally terrible at interprocedural optimization if it can't just inline, and you're proposing to block inlining for the same reason you want to inline.

Anyway, I don't think I agree with the premise that this pragma is interesting for users who want to optimize their programs. Users who don't care about FP precision usually just use fast-math, or learn to isolate "fast-mathable" code into a particular translation unit; yes, they'd appreciate having finer-grained control, but it's not the major influence. I would expect the pragma to be much more interesting to people who want to request specific rounding behavior and/or honor a requirement to respect the dynamic rounding mode, which is to say, to people adding *more* constraints to their program and thus making it harder to optimize.

A full-fledged solution requires full support of constrained intrinsics in optimizations. It is not clear how large this work is, but odds are that it would require substantial effort. In this case a usable implementation of '#pragma STDC FENV_ACCESS' would be postponed. The restricted solution can be implemented much faster, and it does not impede development of the full-fledged one.

If the only tenable option is outlining, I'm fine with living with the restricted pragma.

kpn added a comment. Dec 23 2019, 10:13 AM

Putting a restriction on use of the pragma is, of course, a temporary solution; it is not usable in all cases. But for some cases it is usable in production code. Where small pieces of code can be extracted into separate functions, this solution can provide a tolerable performance loss, as long as most of the program doesn't use constrained intrinsics. The warning protects users from false expectations. Use in production code ensures further development of the feature.

A full-fledged solution requires full support of constrained intrinsics in optimizations. It is not clear how large this work is, but odds are that it would require substantial effort. In this case a usable implementation of '#pragma STDC FENV_ACCESS' would be postponed. The restricted solution can be implemented much faster, and it does not impede development of the full-fledged one.

It depends on the definition of "usable" or "tolerable", and the definition that matters is the one users have.

If #pragma is restricted to no smaller than whole functions then a function that uses it inside the body won't compile. We made the decision instead of the user.

If #pragma is not restricted, but triggers worse performance in the rest of the function, then users can decide for themselves if that is acceptable.

We should leave the decision up to the end-user. The idea of always using constrained intrinsics if any constrained intrinsic is used in a function is our best bet at getting something in the hands of users sooner rather than later and without them having to do any rewriting of existing code.

Updated patch

Removed the previous limitation on use of the pragma, which restricted the
pragma to the topmost block only. This should favor users who are not concerned
about performance but want the usage to be as defined by the Standard.

sepavloff retitled this revision from Restricted variant of '#pragma STDC FENV_ACCESS' to Enable '#pragma STDC FENV_ACCESS' in frontend. Dec 26 2019, 11:55 PM
sepavloff edited the summary of this revision.

Avoid using custom attribute, use Function::useFPIntrin instead.

rjmccall added inline comments. Feb 6 2020, 1:48 PM
clang/lib/Sema/SemaStmt.cpp
386

There isn't necessarily a current function declaration; this is usable in blocks, ObjC methods, etc. It's even usable in global initializers because of statement-expressions. An attribute is really a much more flexible way of setting this information.

Sema has a concept of the current function scope; I think you should probably track this bit there and then copy that as appropriate to the FunctionDecl / ObjCMethodDecl / BlockDecl / lambda invocation function / whatever when the function body is done.

You should also handle templates, which means you'll need to set the FPOptions in the current function scope to the appropriate baseline when starting to instantiate a function definition. constexpr-if means you can't just propagate the bit from the template. I would suggest a rule where, for template patterns, the attribute is used solely to represent the baseline rather than whether the function actually contains any local pragmas. Or maybe you already need to record the baseline for the outermost scope?

rsmith requested changes to this revision. Feb 7 2020, 5:07 PM

I don't see any changes to the constant evaluator here. You need to properly handle constant evaluation within FENV_ACCESS ON contexts, somehow, or you'll miscompile floating-point operations with constant operands. Probably the easiest thing would be to treat all rounded FP operations as non-constant in an FENV_ACCESS ON region, though in C++ constexpr evaluations we could permit rounded FP operations if the evaluation began in an FENV_ACCESS OFF region (that way we know the computations should be done in the default FP environment because feset* are not constexpr functions).

This revision now requires changes to proceed. Feb 7 2020, 5:07 PM
rjmccall added a comment. Edited Feb 7 2020, 6:54 PM

I don't see any changes to the constant evaluator here. You need to properly handle constant evaluation within FENV_ACCESS ON contexts, somehow, or you'll miscompile floating-point operations with constant operands. Probably the easiest thing would be to treat all rounded FP operations as non-constant in an FENV_ACCESS ON region, though in C++ constexpr evaluations we could permit rounded FP operations if the evaluation began in an FENV_ACCESS OFF region (that way we know the computations should be done in the default FP environment because feset* are not constexpr functions).

FWIW, C does actually specify behavior here:

C2x (n2454)
F.8.4 Constant expressions
p1. An arithmetic constant expression of floating type, other than one in an initializer for an object that has static or thread storage duration, is evaluated (as if) during execution; thus, it is affected by any operative floating-point control modes and raises floating-point exceptions as required by IEC 60559 (provided the state for the FENV_ACCESS pragma is "on").

p2. Example:

#include <fenv.h>
#pragma STDC FENV_ACCESS ON
void f(void)
{
  float w[] = { 0.0/0.0 };  // raises an exception
  static float x = 0.0/0.0; // does not raise an exception
  float y = 0.0/0.0;        // raises an exception
  double z = 0.0/0.0;       // raises an exception
  /* ... */
}

p3. For the static initialization, the division is done at translation time, raising no (execution-time) floating-point exceptions. On the other hand, for the three automatic initializations the invalid division occurs at execution time.
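
A minimal C++ sketch of the distinction drawn in F.8.4, not taken from the patch. The names `kThird`, `runtime_division_raises_invalid`, and `reading_translation_time_constant_raises` are hypothetical. The operands are `volatile` so that the division stays at execution time even on compilers that ignore the pragma, and the sketch assumes a target with IEEE 754 exception flags.

```cpp
#include <cfenv>

// Evaluated during translation; it cannot raise an execution-time exception.
constexpr double kThird = 1.0 / 3.0;

// Divides zero by zero at execution time and reports whether FE_INVALID was raised.
bool runtime_division_raises_invalid() {
#pragma STDC FENV_ACCESS ON
    volatile double zero = 0.0;         // volatile: forces an execution-time division
    std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean set of flags
    volatile double y = zero / zero;    // invalid operation, produces a NaN
    (void)y;
    return std::fetestexcept(FE_INVALID) != 0;
}

// Copies a translation-time constant and reports whether any flag was raised.
bool reading_translation_time_constant_raises() {
#pragma STDC FENV_ACCESS ON
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double copy = kThird;      // plain copy, no arithmetic at execution time
    (void)copy;
    return std::fetestexcept(FE_ALL_EXCEPT) != 0;
}
```

On such a target the first function is expected to return true and the second false, mirroring the automatic versus static initializations in the standard's example.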

F.8.2 Translation
p1. During translation, constant rounding direction modes are in effect where specified. Elsewhere, during translation the IEC 60559 default modes are in effect:

  • The rounding direction mode is rounding to nearest.
  • The rounding precision mode (if supported) is set so that results are not shortened.
  • Trapping or stopping (if supported) is disabled on all floating-point exceptions.

p2. (Recommended practice) The implementation should produce a diagnostic message for each translation-time floating-point exception, other than "inexact"; the implementation should then proceed with the translation of the program.

Some of this translates naturally to C++: e.g. constexpr evaluations are always "during translation" and therefore should honor rounding modes but not the dynamic environment. I guess constexpr functions in surrounding FENV_ACCESS ON contexts would ignore the pragma, and the pragma would be disallowed within them.
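
As a rough illustration of "during translation" versus the dynamic environment, here is a sketch under stated assumptions: `kThirdNearest` and `third_with_rounding` are made-up names, the target is assumed to support `FE_UPWARD`, and `volatile` is used so that compilers without FENV_ACCESS support neither fold the division nor move it across the `fesetround` calls.

```cpp
#include <cfenv>

// Folded by the constant evaluator in the default (round-to-nearest) mode.
constexpr double kThirdNearest = 1.0 / 3.0;

// Computes 1.0 / 3.0 at execution time under the requested rounding mode.
double third_with_rounding(int mode) {
#pragma STDC FENV_ACCESS ON
    volatile double one = 1.0, three = 3.0;  // keep the operands opaque to the optimizer
    const int saved = std::fegetround();     // remember the caller's rounding mode
    std::fesetround(mode);
    volatile double q = one / three;         // volatile store pins the division here
    std::fesetround(saved);                  // restore the caller's rounding mode
    return q;
}
```

With IEEE 754 doubles, `third_with_rounding(FE_UPWARD)` should compare greater than `kThirdNearest`, while `third_with_rounding(FE_TONEAREST)` should equal it. The constexpr value never sees the run-time mode.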

Of course, C++ also allows static initialization to be non-constant, and we:

  1. want to do such initialization at translation time as much as possible,
  2. want to be consistent with C as much as possible, and
  3. don't want the obscure decision of whether to perform initialization statically or dynamically to affect the formal semantics of the program, since it can be both (a) difficult for users to reason about and (b) potentially unstable across compiler versions (e.g. if we start constant-folding a builtin).

IMO, the best threading of this needle is to say that expressions with FENV_ACCESS ON can't be constant-folded, but that the pragma only applies to "local" code (function bodies or initializers for non-static data members). This would mean that, if you want static initialization to be environment-aware, you need to move it into a function where the pragma is active. It would also naturally mean that FENV_ACCESS *would* apply to the initialization of static local variables (unless they're constinit); that would be a C/C++ inconsistency, but hopefully it'd at least be an understandable one.

Hello, I rebased this and made a few changes in https://reviews.llvm.org/D87528. I added a question about floating-point constant folding in that review, which I'm duplicating here:

My question is about constant folding. I am working on a task to ensure that clang does floating-point constant folding correctly. I thought the constant folding was in AST/ExprConstant.cpp, and indeed constant folding does occur there. But sometimes, if the floating-point semantics are set to 'strict', even though folding has occurred successfully in ExprConstant.cpp, when I look at the -emit-llvm output there is arithmetic emitted for the floating-point expression. For example, if you compile this function with the command-line option -ffp-exception-behavior=strict, it will emit the add instruction, but without the option you will see the folded expression. Either way, if you put a breakpoint inside ExprConstant.cpp, the calculation of the floating-point sum does occur. The function is float myAdd(void) { return 1.0 + 2.0; }

So where is the decision made that backs out the fold?
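
The sketch below does not locate the decision being asked about. It only illustrates, under assumptions, why the sum in myAdd is a benign case: 1.0 + 2.0 is exact, so it raises no exception and gives the same result in every rounding mode, whereas an operation such as 1.0 / 3.0 raises FE_INEXACT. The helper names `add_is_inexact` and `div_is_inexact` are hypothetical, and `volatile` keeps the arithmetic at execution time even where the pragma is ignored.

```cpp
#include <cfenv>

// Adds a and b at execution time and reports whether FE_INEXACT was raised.
bool add_is_inexact(float a, float b) {
#pragma STDC FENV_ACCESS ON
    volatile float x = a, y = b;        // opaque operands, so no compile-time fold
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float sum = x + y;
    (void)sum;
    return std::fetestexcept(FE_INEXACT) != 0;
}

// Divides a by b at execution time and reports whether FE_INEXACT was raised.
bool div_is_inexact(float a, float b) {
#pragma STDC FENV_ACCESS ON
    volatile float x = a, y = b;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float quotient = x / y;
    (void)quotient;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

On an IEEE 754 target, `add_is_inexact(1.0f, 2.0f)` should be false and `div_is_inexact(1.0f, 3.0f)` should be true.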

kpn added a comment.Sep 11 2020, 10:31 AM

Say, in D80952 I added support for disabling strictfp support when a target doesn't support it. But it only applies to command line arguments.

Is there any chance at all that relevant pragmas can also be disabled with the warning in the same cases?

I am working on a task to ensure that clang is doing floating point constant folding correctly.

Could you please share your plans for it? I also recently started implementing constant folding in ExprConstant.cpp. I have not done anything substantial yet, so I can easily switch to another task. Do you have an estimate of when you could prepare the first version of the patch?

But sometimes, if the floating-point semantics are set to 'strict', even though folding has occurred successfully in ExprConstant.cpp, when I look at the -emit-llvm output there is arithmetic emitted for the floating-point expression.

I used a slightly different approach; maybe it could be useful for you too. An initializer for a global variable must be a constant, so things like const xxx = 1.0 + 2.0 are evaluated. No LLVM arithmetic occurs in the resulting .ll file. With #pragma STDC FENV_ROUND, the floating-point environment may be set to a non-default state, which the constant evaluator must use.

In D69272#2268387, @kpn wrote:

Say, in D80952 I added support for disabling strictfp support when a target doesn't support it. But it only applies to command line arguments.

Is there any chance at all that relevant pragmas can also be disabled with the warning in the same cases?

This is definitely a good idea.

I am working on a task to ensure that clang is doing floating point constant folding correctly.

Could you please share your plans for it? I also recently started implementing constant folding in ExprConstant.cpp. I have not done anything substantial yet, so I can easily switch to another task. Do you have an estimate of when you could prepare the first version of the patch?

I've been given a vague assignment, something along the lines of "investigate floating-point constant folding and make sure that the semantics are correct." In the Intel ICL compiler, there were some circumstances in which the semantics were not correct. I saw Richard's comments in this review, and Intel also needs FENV_ACCESS implemented, so I thought I'd start here. I'm not a floating-point expert, but of course some of my colleagues at Intel are! I am pretty slow, but it's my area of focus.

But sometimes, if the floating-point semantics are set to 'strict', even though folding has occurred successfully in ExprConstant.cpp, when I look at the -emit-llvm output there is arithmetic emitted for the floating-point expression.

I used a slightly different approach; maybe it could be useful for you too. An initializer for a global variable must be a constant, so things like const xxx = 1.0 + 2.0 are evaluated. No LLVM arithmetic occurs in the resulting .ll file. With #pragma STDC FENV_ROUND, the floating-point environment may be set to a non-default state, which the constant evaluator must use.

When I implemented clang's #pragma float_control, I noticed that initialization expressions in classes were not subject to the pragmas that are active in the source file. Those expressions are pulled out and processed differently from the function bodies. Later today I'll upload a patch that uses Expr->getFPFeaturesInEffect() to inhibit constant folding in ExprConstant.cpp.

In D69272#2268387, @kpn wrote:

Say, in D80952 I added support for disabling strictfp support when a target doesn't support it. But it only applies to command line arguments.

Is there any chance at all that relevant pragmas can also be disabled with the warning in the same cases?

This is definitely a good idea.

I'll look into it, thank you

sepavloff abandoned this revision.Sun, Nov 15, 10:07 PM

Implemented in D87528.