Page MenuHomePhabricator

[Clang][CodeGen] Set FP options of builder at entry to compound statement
Needs ReviewPublic

Authored by sepavloff on Jul 10 2022, 11:49 PM.

Details

Summary

Previously compilation of a few tests produced incorrect code. In them FP
pragmas were ignored and the resulting code contained constrained intrinsics
with rounding mode and/or exception behavior determined by command line
options only. Compiler creates correct AST for this tests, but FP options
were ignored in code generator because builder object was not set up properly.
To fix code generation builder object now is updated according to FP options
stored in CompoundStmt.

Diff Detail

Event Timeline

sepavloff created this revision.Jul 10 2022, 11:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2022, 11:49 PM
Herald added subscribers: jsji, pengfei. · View Herald Transcript
sepavloff requested review of this revision.Jul 10 2022, 11:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2022, 11:49 PM

Thank you for looking into this! I happened to run into this same issue with #pragma float_control not behaving the way I'd expect. (FWIW, we also ran into an interesting issue where the floating point options were pushed but never popped in the TU were delayed template instantiation behaved differently than typical template instantiation.)

clang/test/CodeGen/pragma-fenv_access.cpp
35

There are some extra test cases I'd like to see coverage for because there are some interesting edge cases to consider.

template <typename Ty>
float func1(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}

#pragma float_control(precise, on, push)
template float func1<int>(int); 
#pragma float_control(pop)

#pragma float_control(precise, on, push)
template <typename Ty>
float func2(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}
#pragma float_control(pop)

template float func2<int>(int);

void bar() {
    func1(1.1);
    func2(1.1);
}

This gets especially interesting when you think about delayed template instantiation as happens by default on Windows targets. Consider this code with the *driver level* -ffast-math flag enabled (not the cc1 option, which is different).

I think that func1<int> SHOULD be precise, because the explicit instantiation is, while func1<double> SHOULD NOT be precise, because the definition is not. func2<int> SHOULD NOT be precise, because the explicit instantiation is not, while func2<double> SHOULD be precise, because the definition is.

Partial specializations are a similar situation where the primary template and its related code made have different options.

WDYT?

sepavloff added inline comments.Jul 15 2022, 1:04 AM
clang/test/CodeGen/pragma-fenv_access.cpp
35

Standard FP pragmas are defined only in C standard, so interaction of them with C++ specific features is actually implementation-defined. The cases presented in your example are reasonable solutions with one exception: IMO func2<int> should be precise, because its template is precise. It is equivalent to:

template <typename Ty>
float func2(Ty) {
#pragma float_control(precise, on)
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}

so instantiation of it would produce function with precise operations.

Implementation of correct mechanism of the interaction requires substantial efforts and should be made in a separate patch, I think. In particular, we need to invent a way to associate a point of instantiation with the FPOptions in that point, so that delayed instantiation could be made with correct set of options.

In this patch the change in SemaTemplateInstantiateDecl.cpp prevents from compiler crash. Without it codegen tries to create a call to constrained intrinsic in the function that do not have attribute StrictFP, because flag FEnvAccess is set at the end of translation unit in pragma-fenv_access.cpp.

This property adheres to a function definition, so it seems to me that an explicit *instantiation* ought to preserve it from the instantiated template definition, but an explicit *specialization* ought to be independent.

i.e.

#pragma float_control(precise, on, push)
template <typename Ty>
float func2(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}
#pragma float_control(pop)

template float func2<int>(int); // precise
template <> float func2<long>(long) { ... } // not precise

Interaction of pragmas and templates is implemented in https://reviews.llvm.org/D131143.

As specific interaction of FP pragmas and template instantiation was not supported, this patch is actual without additional changes.

sepavloff updated this revision to Diff 454822.Aug 23 2022, 6:43 AM
sepavloff edited the summary of this revision. (Show Details)

Rebase and ping