This is an archive of the discontinued LLVM Phabricator instance.

Intrinsic for checking floating point class
ClosedPublic

Authored by sepavloff on Oct 18 2021, 11:49 AM.

Details

Summary

This change introduces a new intrinsic, llvm.is.fclass, which checks
if the provided floating point number belongs to the specifies value
class. It implements the checks made by C standard library functions
isnan, isinf, isfinite, isnormal, issubnormal, issignaling
and corresponding IEEE-754 operations.

The primary motivation for this intrinsic is the support of strict FP
mode. In this mode using compare instructions or other FP operations is
not possible, because if the value is a signaling NaN, floating point
exception Invalid is raised, but the aforementioned functions must
never raise exceptions.

Currently there are two solutions for this problem, both are
implemented partially. One of them is using integer operations to
implement the check. It was implemented in https://reviews.llvm.org/D95948
for isnan. It solves the problem of exceptions, but offers one
solution for all targets, although some can do the check in more
efficient way.

The other, implemented in https://reviews.llvm.org/D96568, introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target
specific code into IR to implement isnan. It is convenient for
targets that has dedicated instruction to determine FP data class.
However using target-specific intrinsic complicates analysis and can
prevent some optimizations.

Using a special intrinsic to represent a value class check allows
representing data class test with enough flexibility. During IR
transformations it represents the check in target-independent way and
prevents it from undesired transformations. In the instruction selector
it allows efficient lowering depending on the target and used mode.

This implementation is an extended variant of llvm.isnan introduced
in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic
support. Target-specific treatment will be implemented in separate
patches.

Diff Detail

Event Timeline

sepavloff created this revision.Oct 18 2021, 11:49 AM
sepavloff requested review of this revision.Oct 18 2021, 11:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2021, 11:49 AM

The patch was tested in runtime using extended version of tests https://reviews.llvm.org/D106804 and temporary clang patch that implements classification builtins by calls to llvm.is.fclass.

arsenm added a subscriber: arsenm.Oct 18 2021, 1:31 PM
arsenm added inline comments.
llvm/docs/LangRef.rst
21984–21993

AMDGPU has a class instruction but the fields are a bit more expansive than this. Specifically, it has positive and negative versions of each of the non-nan fields. e.g. you can check specifically +inf separately from -inf, and the same for the rest.

I guess we could identify and (signbit check), (class %x, !foo)

Make test specification more flexible

Now test data class specification is not limited by the test necessary for
implementation of C classification functions, but is made more close to
the facility provided by data class inquiry instruction existing on many
targets.

The intrinsic is renamed to llvm.is_fpclass to use the similar naming as
in fpclassify.

The patch is rebased.

sepavloff added inline comments.Oct 22 2021, 12:02 AM
llvm/docs/LangRef.rst
21984–21993

Updated the patch to better represent targets with such instructions.

sepavloff updated this revision to Diff 383799.Nov 1 2021, 7:17 AM

Updated patch

  • Check for inverted test was moved into a separate function.
  • Fixed code generation for negative zeros and normals check.
  • Fixed code generation for snan checks.
  • Number of codegen tests were reduced.

This patch was tested using a patches for clang https://reviews.llvm.org/D112932 and test-suite https://reviews.llvm.org/D112933. Clang build with the patch were used to compile llvm-test-suite/SingleSource/UnitTests/Float/classify.c. Running the obtained executable executes the unit tests.

arsenm added inline comments.Nov 11 2021, 6:19 AM
llvm/include/llvm/IR/Intrinsics.td
723

The word "is" here reads weird to me, and you dropped it from the ISD node name. How about just llvm.fpclass? Also the description says "fclass"

sepavloff added inline comments.Nov 11 2021, 8:17 AM
llvm/include/llvm/IR/Intrinsics.td
723

I agree that this name is a bit awkward. Initially it was llvm.fclass, as you noticed, but then it was changed because:

  • it is inconsistent with C function fpclassify, which also may get a special llvm intrinsic,
  • llvm.fclass might be understood as if the intrinsic calculated the value class. Actually it only tests the value against the specified classes.
sepavloff updated this revision to Diff 410259.Feb 21 2022, 2:47 AM

Updated patch

  • Use integer constants instead of metadata to represent fp class test.
  • Added support for fp80.
  • Renamed test, dropped the prefix 'x86-'.
  • Prepared code for sharing with GlobalISel.

This patch with the set of dependent patches provides support of the intrinsic in IR and CodeGen. Default lowering is implemented using integer operations, IEEE and Intel 80-bit extended precision formats are supported in it. Custom lowering for non-IEEE types is provided in separate patches (D113908 and D113414). Targets that support FCLASS or similar instruction can have especially efficient implementation, a patch for SystemZ represents an example (D114695). Patches that implement support for this intrinsic in GlobalISel are also published (D121454 and D121296).

The patch includes large number of tests. To facilitate validation, a preliminary clang patch was prepared (D112932), which enables runtime testing. Appropriate change for test suite is also provided (D112933). The implementation was tested on X86 and AArch64 hardware and PowerPC emulator.

The intrinsic has evolved to rather generic form and a question may rise if it would be profitable to implement an intrinsic for fpclassify and then use it to implement classification functions. The drawback of the solution based on fpclassify is that this function returns a numeric value which is then exists in runtime. If a target provides instruction FCLASS, it would be convenient to have numerical representation of floating-point classes consistent with the output of FCLASS. Different targets use different layouts for FCLASS output, for example RISC-V, SystemZ and AMDGPU all have such instruction and all use different encoding. It is not possible to use FP class numbering that would be convenient for all targets. There is no such problem for is_fpclass. FP class numbers in this case are used in compile-time constants only.

Herald added a project: Restricted Project. · View Herald TranscriptMar 18 2022, 4:43 AM
thopre added inline comments.Apr 20 2022, 4:35 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7130

Does them refer to unsupported values? So an exponent of 0 and a int_bit set is a supported value. But the equality check on line 7392 will return true for that case so I'm a bit confused.

7139
sepavloff updated this revision to Diff 424089.Apr 20 2022, 9:45 PM

Fixed comments

sepavloff added inline comments.Apr 20 2022, 9:48 PM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7130

You are right, the comment describes supported values. Thank you for the catch.

7139

Here you are also right. Fixed.

LGTM but I'd prefer someone else to approve.

LGTM but I'd prefer someone else to approve.

Thanks!
I'll wait for more feedback.

efriedma added inline comments.Apr 21 2022, 11:35 AM
llvm/docs/LangRef.rst
21991

*Quiet

21999

*Negative

22018

Maybe give an example here, e.g. 0x104 tests for a normal number.

Is there some reason the bits are arranged in this particular order? Not that I really have a better suggestion.

llvm/include/llvm/IR/Intrinsics.td
726

Did you mean to add ImmArg<ArgIndex<1>>?

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7012

You could try to handle more cases here, e.g. fcPosNormal is two comparisons. Not sure how worthwhile that is.

sepavloff updated this revision to Diff 424883.Apr 25 2022, 5:31 AM

Addressed review notes. Rebased.

sepavloff marked 2 inline comments as done.Apr 25 2022, 5:54 AM
sepavloff added inline comments.
llvm/docs/LangRef.rst
22018

I put a small example explaining the use of 'test' argument.

As for bit representation of particular tests, it is practically arbitrary. The sequence of bits agrees with the sequence of classes specified by IEEE-754 in 5.7.2. If there were a better sequence, it would be no problems to change the bits.

llvm/include/llvm/IR/Intrinsics.td
726

Exactly! Thank you!

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7012

These cases can be implemented with a single comparison operation. Other cases require more than one comparison. For example, the test fcPosNormal can be implemented as (x >= SmallestNormalized) && (x <= Largest). Also floating-point constant materialization may have some cost. The resulting gain is not obvious. For x86 floating point operations produce better code if SSE is available, but much worse code if x87 is used. So the advantage of floating point comparisons is target-dependent.

This revision is now accepted and ready to land.Apr 25 2022, 9:33 AM
This revision was landed with ongoing or failed builds.Apr 25 2022, 11:20 PM
This revision was automatically updated to reflect the committed changes.
nlopes added a subscriber: nlopes.May 2 2022, 6:12 AM

Is the second argument required to be a constant? If so, it would be great to document that. Thanks!

Is the second argument required to be a constant? If so, it would be great to document that. Thanks!

Fixed in https://github.com/llvm/llvm-project/commit/d9b5544e0f99656ac18e39daf31fa8231ea0d9ef.
Thanks!

nlopes added a comment.May 3 2022, 5:51 AM

Is the second argument required to be a constant? If so, it would be great to document that. Thanks!

Fixed in https://github.com/llvm/llvm-project/commit/d9b5544e0f99656ac18e39daf31fa8231ea0d9ef.
Thanks!

Thank you!
Will add support for this in Alive2 soonish.

nlopes added a subscriber: regehr.May 20 2022, 1:44 AM

We believe that there's a bug when the 2nd argument is 0.
See here: https://gcc.godbolt.org/z/9735rYbqP

LLVM produces 1, but it should be 0. Found by @regehr.

We believe that there's a bug when the 2nd argument is 0.
See here: https://gcc.godbolt.org/z/9735rYbqP

LLVM produces 1, but it should be 0. Found by @regehr.

Although this is a degenerated case and any result is possible, treating 2nd argument as a set of FP classes is more consistent with zero return value.
Thanks!