Page MenuHomePhabricator

Restricted variant of '#pragma STDC FENV_ACCESS'
Needs ReviewPublic

Authored by sepavloff on Mon, Oct 21, 10:49 AM.

Details

Summary

This change implements support of '#pragma STDC FENV_ACCESS' in
frontend. The pragma is supported only at namespace level and in
topmost block in functions.

Use of the pragma results in using constrained intrinsics to represent
floating point operations in the function to which it applies. Such
function is marked by the attribute 'FPEnvironment' in AST and
'fenv_access' in IR.

Event Timeline

sepavloff created this revision.Mon, Oct 21, 10:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMon, Oct 21, 10:49 AM
kpn added a comment.Mon, Oct 21, 11:03 AM

Does this work for C++? C++ templates? I only see C tests.

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

rjmccall added inline comments.Mon, Oct 21, 11:40 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

What's the purpose of this restriction? Whether inline really has much to do with inlining depends a lot on the exact language settings. (Also, even if this restriction were okay, the diagnostic is quite bad given that there are three separate conditions that can lead to it firing.)

Also, I thought we were adding instruction-level annotations for this kind of thing to LLVM IR. Was that not in service of implementing this pragma?

I'm not categorically opposed to taking patches that only partially implement a feature, but I do want to feel confident that there's a reasonable technical path forward to the full implementation. In this case, it feels like the function-level attribute is a dead end technically.

Thanks for the patch! I don't have time to review this in detail this week, but I'm very happy to see this functionality.

clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

I'm guessing this is intended to avoid the optimization problems that would occur (currently) if a function with strictfp were inlined into a function without it. I'm just guessing though, so correct me if I'm wrong.

As I've said elsewhere, I hope this is a temporary problem. It is a real problem though (as is the fact that the inliner isn't currently handling this case correctly).

What would you think of a new command line option that caused us to mark functions with strictfp as noinline? We'd still need an error somewhat like this, but I feel like that would be more likely to accomplish what we want on a broad scale.

hfinkel added inline comments.Mon, Oct 21, 2:05 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

We would not want to prevent all inlining, just inlining where the attributes don't match. We should fix his first. I think we just need to add a CompatRule to include/llvm/IR/Attributes.td (or something like that).

sepavloff marked an inline comment as done.Tue, Oct 22, 12:27 AM

Try to organize replies for better references.

Background

According to the current design, if floating point operations are represented by constrained intrinsics somewhere in a function, constrained intrinsics must be used in entire function, including inlined function calls (http://lists.llvm.org/pipermail/cfe-dev/2017-August/055325.html). As constrained intrunsics do not participate in optimizations, this may lead to performance drop (discussed in http://lists.llvm.org/pipermail/llvm-dev/2019-August/134641.html). There was an attempt to alleviate this issue using basic block attributes, but this approach was rejected (http://lists.llvm.org/pipermail/llvm-dev/2019-October/135623.html).

In this case allowing #pragma STDC FENV_ACCESS at function level seems a good compromise between performance and flexibility. Users get possibility to use non-standard floating point environment and performance impact is limited to the scope of one function and that scope is controlled by user.

The more general solution, where #pragma STDC FENV_ACCESS is allowed in any block inside a function, as the standard requires, would require extending the scope where constrained intrinsics are used. It would result in performance drop, which is unacceptable in many cases. There are ideas to implement the general case using outlining (http://lists.llvm.org/pipermail/llvm-dev/2019-October/135628.html), it could be a promising way to extend functionality of this solution.

Inlining

Inlining in an issue for functions with #pragma STDC FENV_ACCESS, because if such function is inlined, the host function must use constrained intrinsics, which can result in performance drop. The simplest case is to prohibit inlining of such functions, this way is used in this patch. More elaborated solution would try to merge attributes if possible. For instance, if the host function does not use floating point operations, the function with #pragma STDC FENV_ACCESS may be inlined. However it does not look like a job of frontend.

In D69272#1717021, @kpn wrote:

Does this work for C++? C++ templates? I only see C tests.

Will add C++ tests soon.

In D69272#1717021, @kpn wrote:

Is there a way forward to support having the #pragma at the start of any block inside a function? The effect won't be restricted to that block, true, but the standard does say the #pragma is allowed.

Function outlining may be a general solution.

clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

As Andrew already said, noinline attribute is a mean to limit negative performance impact. Of course, to inline or not to inline - such decision is made by backend. However it a user requested a function to be inline, a warning looks useful.

When constrained intrinsics get full support in optimizations, this restriction will become unnecessary.

Outlining is one of the ways that converts this solution into full pragma implementation. Another is implementation of constrained intrinsics support in optimization transformations.

As for a new command line option that caused us to mark functions with strictfp as noinline, it loos a good idea, but we must adapt inliner first, so that it can convert ordinary floating operations to constrained intrinsics during inlining.

In the case of #pragma STDC FENV_ACCESS we cannot in general say if attributes are compatible so a function can be inlined into another. The pragma only says that user modified floating point environment. One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining. More fine grained pragmas, like that proposed in https://reviews.llvm.org/D65997 could enable more flexible inlining.

kpn added a comment.Tue, Oct 22, 7:33 AM

Baking into the front end the fact that the backend implementation is not yet complete doesn't strike me as a good idea.

And the metadata arguments to the constrained intrinsics are designed to allow for correctly marked constrained intrinsics to be eventually treated pretty close to the same as non-constrained math instructions. Once the implementation is further along, of course.

I think the issue with the inliner not being smart enough yet is an issue for llvm to deal with and not front ends like clang. It would be straightforward enough for llvm to mark functions that have the strictfp attribute so they also are marked noinline. As a temporary measure, of course. This is a case where llvm hasn't caught up with well-formed IR, so it would be llvm's job to work around its own incompleteness.

See D43142 for code to convert all floating point in a function into constrained intrinsics. Updated versions of this code with support for intrinsics that didn't exist at the time also exist. I don't see why this pass couldn't be reworked a bit to be used by the inliner. And it would only be needed when inlining into a strictfp function a function that wasn't strictfp.

You mentioned that extending the scope of the #pragma may result in a "performance drop, which is unacceptable in many cases". But the only difference between allowing the #pragma only at the top of a function, and allowing it everywhere the standard allows, is that the user knows about the potential loss of performance. The performance loss happens in both cases. Again, I don't think baking into clang the current state of llvm is a good idea.

A warning from clang that strictfp code doesn't perform very well today is probably a good idea, and it would be ripped out easily when the day comes. The warning would only fire when the #pragma is seen, and that code is small, self-contained, and actually already exists in clang now but with different text.

hfinkel added inline comments.Tue, Oct 22, 9:31 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

Of course, to inline or not to inline - such decision is made by backend. However it a user requested a function to be inline, a warning looks useful.

We need to be careful. inline is not just an optimizaiton hint. It also affects function linkage. Also, when inlining into functions with similar constraints, there's no problem with the inlining.

One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining.

Can you please explain this? It does not seem like that should block inlining when both the caller and callee are marked as fenv_access-enabled.

In D69272#1717967, @kpn wrote:

Baking into the front end the fact that the backend implementation is not yet complete doesn't strike me as a good idea.

I don't expect that this patch would pass review quickly. But it could be used to organize requests what must be done to implement this functionality. Now I see that we need to add functionality to inliner. Do you have ideas what else should be done to implement this variant of the pragma?

I think the issue with the inliner not being smart enough yet is an issue for llvm to deal with and not front ends like clang. It would be straightforward enough for llvm to mark functions that have the strictfp attribute so they also are marked noinline. As a temporary measure, of course. This is a case where llvm hasn't caught up with well-formed IR, so it would be llvm's job to work around its own incompleteness.

That's true. But if user specifies inline for a function that contains the pragma, he requested contradictory attributes. Should compiler emit a warning? If yes, this is a job of frontend.

See D43142 for code to convert all floating point in a function into constrained intrinsics. Updated versions of this code with support for intrinsics that didn't exist at the time also exist. I don't see why this pass couldn't be reworked a bit to be used by the inliner. And it would only be needed when inlining into a strictfp function a function that wasn't strictfp.

I think this patch can help in adaptation of the inliner. The transformation must be rewritten as a function or even built into the logic of inliner.

You mentioned that extending the scope of the #pragma may result in a "performance drop, which is unacceptable in many cases". But the only difference between allowing the #pragma only at the top of a function, and allowing it everywhere the standard allows, is that the user knows about the potential loss of performance. The performance loss happens in both cases. Again, I don't think baking into clang the current state of llvm is a good idea.

A user may be convinced that using the pragma is expensive. He would carefully implement the function with the pragma and if the function is small and called rarely, performance drop can be minimized. Such solution does not work for all cases but for some it is acceptable. If using the pragma in a small region results in loss of optimization in entire function, this is counterintuitive and in many cases unacceptable.

A warning from clang that strictfp code doesn't perform very well today is probably a good idea, and it would be ripped out easily when the day comes. The warning would only fire when the #pragma is seen, and that code is small, self-contained, and actually already exists in clang now but with different text.

I don't see how such warning can help a user. A note about impact of the pragma on performance can be put into documentation. Issuing a warning on every use of the pragma may be annoying.

The main advantage of this restricted variant of the pragma IMHO is the possibility to provide implementation which can be developed in reasonable time and can be used in production code. Users will file bugs, we will fix them and the implementation will progress. If the pragma causes substantial performance loss, number of its users will be lower and the development will be slowed down.

I don't see how such warning can help a user. A note about impact of the pragma on performance can be put into documentation. Issuing a warning on every use of the pragma may be annoying.

I definitely agree. Performance may be fine in many cases, and performance may not be *relatively* important where the pragma is used. I don't believe that a warning should be added for this.

sepavloff marked an inline comment as done.Tue, Oct 22, 10:09 AM
sepavloff added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

We need to be careful. inline is not just an optimizaiton hint. It also affects function linkage. Also, when inlining into functions with similar constraints, there's no problem with the inlining.

This is an argument in favor of the warning on conflicting attributes.

One function may set only rounding mode and another use different exception handling, in this case we cannot do inlining.

Can you please explain this? It does not seem like that should block inlining when both the caller and callee are marked as fenv_access-enabled.

I was wrong. If a function correctly restores FP state, it can be inlined into another fenv_access-enabled function.

hfinkel added inline comments.Tue, Oct 22, 11:43 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
890 ↗(On Diff #225919)

This is an argument in favor of the warning on conflicting attributes.

No, because they're not conflicting (as you note, we can still inline), and because the a function might need inline linkage even if the user doesn't care about actual function inlining (this is an unfortunate side effect of the C/C++ specifications providing one keyword for both an optimization hint and a linkage modifier).

Okay. If the optimizer cannot correctly handle a mix of constrained and unconstrained FP operations, then the optimizer should protect itself, either by refusing to inline across such boundaries or by adding constraints as necessary before inlining. If it does that, I don't think it's particularly appropriate for the frontend to warn about the combination of constrained FP and supposedly inlining-related attributes (which only really seems true of always_inline). We should just mention in the documentation that this isn't particularly performant at the moment.

I feel quite confident that we can solve the engineering problem of efficiently figuring out that there's a constrained scope somewhere in a function and therefore we need constrained intrinsics.

sepavloff updated this revision to Diff 227690.Mon, Nov 4, 4:09 AM

Removed diagnostics on inline functions

As pointed out in review, strictfp attribute does not prevent from inlining,
it only restrict cases where the inlining is possible.

rjmccall added inline comments.Tue, Nov 5, 3:30 PM
clang/include/clang/Basic/DiagnosticParseKinds.td
1068

"'#pragma STDC FENV_ACCESS ON' is only supported in the outermost block in a function, ignoring"

Although, as mentioned, it would be better if we can just support this, if necessary by pessimizing the rest of the function. You're already marking the definition with an attribute when you see that there's a pragma within it. Why don't we just (1) add that attribute (once) whenever FENV_ACCESS is on at any point within a function and (2) make sure that we use FP constraints on every FP operation inside a function with the attribute?

clang/lib/Parse/ParsePragma.cpp
644

This is not ignoring the pragma; this is treating it as if it said OFF.