Page MenuHomePhabricator

[FPEnv] Generate constrained FP comparisons from clang
ClosedPublic

Authored by uweigand on Dec 13 2019, 7:05 AM.

Details

Summary

Update the IRBuilder to generate constrained FP comparisons in CreateFCmp when IsFPConstrained is true, similar to the other places in the IRBuilder.

Also, add a new CreateFCmpS to emit signaling FP comparisons, and use it in clang where comparisons are supposed to be signaling (currently, only when emitting code for the <, <=, >, >= operators). Most other places are supposed to emit quiet comparisons, including the equality comparisons, the various builtins like isless, and uses of floating-point values in boolean contexts. A few places that I haven't touched may need some extra thought (e.g. are comparisons implicitly generated to implement sanitizer checks supposed to be signaling?).

I've noticed two potential problems while implementing this:

  • There is currently no way to add fast-math flags to a constrained FP comparison, since this is implemented as an intrinsic call that returns a boolean type, and FMF are only allowed for calls returning a floating-point type. However, given the discussion around https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself really shouldn't have any FMF either, so this is probably OK.
  • Using builtins like __builtin_isless on a "float" type will implicitly convert the float argument to a double; apparently this is because the builtin is declared as having a variable argument list? In any case, that means that even though a quiet comparison is generated, the semantics still isn't correct for quiet NaNs, as that implicit conversion will already signal an exception. This probably needs to be fixed, but I guess that can be done as a separate patch.

Diff Detail

Event Timeline

uweigand created this revision.Dec 13 2019, 7:05 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 13 2019, 7:05 AM

The bug with __builtin_isless should be a really easy fix; the builtin just needs to be flagged as having custom type-checking, and then we need to make sure we do appropriate promotions on the arguments (but we probably do).

The bug with __builtin_isless should be a really easy fix; the builtin just needs to be flagged as having custom type-checking, and then we need to make sure we do appropriate promotions on the arguments (but we probably do).

I think I convinced @erichkeane to look at it on Monday.

The bug with __builtin_isless should be a really easy fix; the builtin just needs to be flagged as having custom type-checking, and then we need to make sure we do appropriate promotions on the arguments (but we probably do).

I think I convinced @erichkeane to look at it on Monday.

Everything but fpclassify is pretty trivial, I just needed to write a test but needed to go home. I'll commit that Monday when I get to it. Fpclassify will take a touch longer since the int arguments need to be dealt with, but that shouldn't be more than a little work.

The bug with __builtin_isless should be a really easy fix; the builtin just needs to be flagged as having custom type-checking, and then we need to make sure we do appropriate promotions on the arguments (but we probably do).

I think I convinced @erichkeane to look at it on Monday.

Everything but fpclassify is pretty trivial, I just needed to write a test but needed to go home. I'll commit that Monday when I get to it. Fpclassify will take a touch longer since the int arguments need to be dealt with, but that shouldn't be more than a little work.

I did the compare operators that didn't work right, and will do a separate patch for the fp-classification type ones: f02d6dd6c7afc08f871a623c0411f2d77ed6acf8

uweigand updated this revision to Diff 234096.Dec 16 2019, 9:39 AM

Added float (f32) test cases.

I did the compare operators that didn't work right, and will do a separate patch for the fp-classification type ones: f02d6dd6c7afc08f871a623c0411f2d77ed6acf8

Thanks! Now I'm getting the correct output for the float test cases as well, and I've added them to the patch.

As to fp-classification, I think there is an additional complication here: according to IEEE and the proposed C2x standard, these builtins should never raise any exception, not even when receiving a signaling NaN as input. Strictly speaking, this means that they cannot possibly be implemented in terms of any comparison operation.

Now, on SystemZ (and many other platforms, I think) there are in fact specialized instructions that will implement the required semantics without raising any exceptions, but there seems to be no way to represent those at the LLVM IR level. We'll probably need some extensions here (some new IR-level builtins?) ...

(But I'd say that problem is unrelated to this patch, so I'd prefer to decouple that problem from the question of whether this patch is the right solution for comparisons.)

I did the compare operators that didn't work right, and will do a separate patch for the fp-classification type ones: f02d6dd6c7afc08f871a623c0411f2d77ed6acf8

Thanks! Now I'm getting the correct output for the float test cases as well, and I've added them to the patch.

As to fp-classification, I think there is an additional complication here: according to IEEE and the proposed C2x standard, these builtins should never raise any exception, not even when receiving a signaling NaN as input. Strictly speaking, this means that they cannot possibly be implemented in terms of any comparison operation.

Now, on SystemZ (and many other platforms, I think) there are in fact specialized instructions that will implement the required semantics without raising any exceptions, but there seems to be no way to represent those at the LLVM IR level. We'll probably need some extensions here (some new IR-level builtins?) ...

(But I'd say that problem is unrelated to this patch, so I'd prefer to decouple that problem from the question of whether this patch is the right solution for comparisons.)

__builtin_fpclassify/isfinite/isinf/isinf_sign/isnan/isnormal/signbit are all implemented the same as the OTHER ones, except there is a strange fixup step in SEMA that removes the float->double cast. It is IMO the wrong way to do it.

I don't think it would modify the IR at all or the AST, but I'm also working on removing that hack (which is what I meant by the fp-classification type ones).

I hope the work I've done already is sufficient to unblock this patch.

__builtin_fpclassify/isfinite/isinf/isinf_sign/isnan/isnormal/signbit are all implemented the same as the OTHER ones, except there is a strange fixup step in SEMA that removes the float->double cast. It is IMO the wrong way to do it.

I don't think it would modify the IR at all or the AST, but I'm also working on removing that hack (which is what I meant by the fp-classification type ones).

I hope the work I've done already is sufficient to unblock this patch.

Yes, this patch is no longer blocked, thanks!

What I was trying to say is that there is a fundamental difference between the comparison builtins like isless, isgreater, etc. and the classification builtins like isinf, isnan, etc.

The former should result in comparison instructions being generated, the only difference between the builtin and a regular "<" operator is that the builtin emits a quiet compare while the operator emits a signaling compare in strict mode.

However, the latter (classification macros) should not actually emit any comparison instructions in strict mode, because the classification macros may never trap, but all comparison instructions do. So the basic idea of implementing e.g. isinf(x) as "fabs(x) == infinity" (like the comment in CGBuiltin.cpp currently says) is fundamentally wrong in strict mode.

Ping again.

This revision is now accepted and ready to land.Jan 9 2020, 6:41 PM
This revision was automatically updated to reflect the committed changes.

Is this approach going to work with scope-local strictness? We need a way to do a comparison that has the non-strict properties but appears in a function that enables strictness elsewhere.

llvm/include/llvm/IR/IRBuilder.h
2385

Can you make a helper method for the common code in the non-constrained paths here?

Please document the difference between these two methods.

Is this approach going to work with scope-local strictness? We need a way to do a comparison that has the non-strict properties but appears in a function that enables strictness elsewhere.

Well, just like for all the other FP builder methods, you can use the setIsFPConstrained method on the builder object to switch between strict and non-strict mode. Does this not suffice, or is there anything particular about the comparisons that would require anything extra?

Please document the difference between these two methods.

OK, checked in header file comments as 6aca3e8.

Can you make a helper method for the common code in the non-constrained paths here?

Would you prefer something like

private:
  Value *CreateFCmpHelper(CmpInst::Predicate P, Value *LHS, Value *RHS,
                          const Twine &Name, MDNode *FPMathTag) {
    if (auto *LC = dyn_cast<Constant>(LHS))
      if (auto *RC = dyn_cast<Constant>(RHS))
        return Insert(Folder.CreateFCmp(P, LC, RC), Name);
    return Insert(setFPAttrs(new FCmpInst(P, LHS, RHS), FPMathTag, FMF), Name);
  }

public:
  Value *CreateFCmp(CmpInst::Predicate P, Value *LHS, Value *RHS,
                    const Twine &Name = "", MDNode *FPMathTag = nullptr) {
    if (IsFPConstrained)
      return CreateConstrainedFPCmp(Intrinsic::experimental_constrained_fcmp,
                                    P, LHS, RHS, Name);

    return CreateFCmpHelper(P, LHS, RHS, Name, FPMathTag);
  }
  [...]

or rather something like:

private:
  Value *CreateFCmpHelper(CmpInst::Predicate P, Value *LHS, Value *RHS,
                          bool IsSignaling, const Twine &Name, MDNode *FPMathTag) {
    if (IsFPConstrained)
      return CreateConstrainedFPCmp(IsSignaling ? Intrinsic::experimental_constrained_fcmps
                                                : Intrinsic::experimental_constrained_fcmp,
                                    P, LHS, RHS, Name);

    if (auto *LC = dyn_cast<Constant>(LHS))
      if (auto *RC = dyn_cast<Constant>(RHS))
        return Insert(Folder.CreateFCmp(P, LC, RC), Name);
    return Insert(setFPAttrs(new FCmpInst(P, LHS, RHS), FPMathTag, FMF), Name);
  }

public:
  Value *CreateFCmp(CmpInst::Predicate P, Value *LHS, Value *RHS,
                    const Twine &Name = "", MDNode *FPMathTag = nullptr) {
    return CreateFCmpHelper(P, LHS, RHS, false, Name, FPMathTag);
  }
  [...]

or maybe simply have CreateFCmpS call CreateFCmp directly in the non-strict case?

Well, just like for all the other FP builder methods, you can use the setIsFPConstrained method on the builder object to switch between strict and non-strict mode. Does this not suffice, or is there anything particular about the comparisons that would require anything extra?

Ah, sorry, I forgot that IsFPConstrained is about whether we emit the intrinsics at all and not whether the intrinsics are currently recording real constraints.

I think I have a slight preference for the second option, where there's a single method that does all the work for the two cases.

I think I have a slight preference for the second option, where there's a single method that does all the work for the two cases.

OK, now checked in as 870137d .