Page MenuHomePhabricator

Disable _Float16 for non ARM/SPIR Targets
ClosedPublic

Authored by erichkeane on Jan 24 2019, 1:44 PM.

Details

Summary

As Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129543.html

There are problems exposing the _Float16 type on architectures that
haven't defined the ABI/ISel for the type yet, so we're temporarily
disabling the type and making it opt-in.

Diff Detail

Repository
rL LLVM

Event Timeline

erichkeane created this revision.Jan 24 2019, 1:44 PM
rjmccall added inline comments.Jan 24 2019, 3:38 PM
docs/LanguageExtensions.rst
497 ↗(On Diff #183391)

This entire documentation section really just needs to be reworked. Here's a starting point:

Clang supports two half-precision (16-bit) floating point types: ``__fp16`` and
``_Float16``.  These types are supported in all language modes.

``__fp16`` is supported on every target, as it is purely a storage format; see below.
``_Float16`` is currently only supported on the following targets:
- 32-bit ARM
- 64-bit ARM (AArch64)
- SPIR
``_Float16`` will be supported on more targets as they define ABIs for them.

``__fp16`` is a storage and interchange format only.  This means that values of
``__fp16`` are immediately promoted to (at least) ``float`` when used in arithmetic
operations, so that e.g. the result of adding two ``__fp16`` values has type ``float``.
The behavior of ``__fp16`` is specified by the ARM C Language Extensions (`ACLE <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf>`_).
Clang uses the ``binary16`` format from IEEE 754-2008 for ``__fp16``, not the ARM
alternative format.

``_Float16`` is an extended floating-point type.  This means that, just like arithmetic on
``float`` or ``double``, arithmetic on ``_Float16`` operands is formally performed in the
``_Float16`` type, so that e.g. the result of adding two ``_Float16`` values has type
``_Float16``.  The behavior of ``_Float16`` is specified by ISO/IEC TS 18661-3:2015
("Floating-point extensions for C").  As with ``__fp16``, Clang uses the ``binary16``
format from IEEE 754-2008 for ``_Float16``.

``_Float16`` arithmetic will be performed using native half-precision support
when available on the target (e.g. on ARMv8.2a); otherwise it will be performed
at a higher precision (currently always ``float``) and then truncated down to
``_Float16``.  Note that C and C++ allow intermediate floating-point operands
of an expression to be computed with greater precision than is expressible in
their type, so Clang may avoid intermediate truncations in certain cases; this may
lead to results that are inconsistent with native arithmetic.

It is recommended that portable code use ``_Float16`` instead of ``__fp16``,
as it has been defined by the C standards committee and has behavior that is
more familiar to most programmers.

Because ``__fp16`` operands are always immediately promoted to ``float``, the
common real type of ``__fp16`` and ``_Float16`` for the purposes of the usual
arithmetic conversions is ``float``.

A literal can be given ``_Float16`` type using the suffix ``f16``; for example:
```
  3.14f16
```

Because default argument promotion only applies to the standard floating-point
types, ``_Float16`` values are not promoted to ``double`` when passed as variadic
or untyped arguments.  As a consequence, some caution must be taken when using
certain library facilities with ``_Float16``; for example, there is no ``printf`` format
specifier for ``_Float16``, and (unlike ``float``) it will not be implicitly promoted to
``double`` when passed to ``printf``, so the programmer must explicitly cast it to
``double`` before using it with an ``%f`` or similar specifier.
lib/Basic/Targets/AArch64.cpp
52 ↗(On Diff #183391)

spacing

lib/Basic/Targets/SPIR.h
50 ↗(On Diff #183391)

spacing

SjoerdMeijer added inline comments.
include/clang/Basic/TargetInfo.h
66 ↗(On Diff #183391)

I think this is the same as HasLegalHalfType, and we can (re)use that. Or, at least, don't think we need both HasLegalHalfType and HasFloat16. For context, I needed HasLegalHalfType for argument passing, but it looks like it can serve another purpose now.

Out of curiousity, I was wondering if specifying:

KEYWORD(_Float16                    , HALFSUPPORT)

in TokenKids.def is an alternative approach (it is currently set to KEYALL). Thus, enable the keyword when LangOpts.Half is set. By adding this HasFloat16 property here in clang's targetinfo, we're sort of defining again how targets support different types. I.e., if you throw a half type at the backend, the TypeLegalizer will deal with it in one way or another. Perhaps disabling _Float16 can be achieved by disabling the keyword. But I do see that the big advantage of this patch is the much nicer error message (otherwise we would get something like "unknown type name '_Float16'").

521 ↗(On Diff #183391)

Similar remark: the same as hasLegalHalfType()?

erichkeane marked 5 inline comments as done.Jan 25 2019, 7:01 AM
erichkeane added inline comments.
include/clang/Basic/TargetInfo.h
66 ↗(On Diff #183391)

LangOpts.Half (and thus HALFSUPPORT) seems to be specific to OpenCL. It seems to me that the two ARE different, the Half type and the _Float16 type are different semantically and from different standards, so I think that isn't sufficient.

Additionally, from a diagnostics perspective I think it is worse to do it in the TokenKinds.def. If you do it in TokenKinds.def, you get "Type Expected" or "Unknown Type Name", which could encourage someone to write their own. The error message in this patch (Not supported on this target) strongly states that it is a reserved word and is not to be defined. This is particularly important for the targets that intend to add support for this in the future.

Additionally, HasLegalHalfType has slightly different semantics in the ARM case. ARM seems to still want _Float16 to work even without the +fullfp16 support. It seems to be OK (based on CodeGen and the tests written to validate this) with allowing the _Float16 support to work with different code generation.

This LGTM with one minor revision; feel free to commit with that.

For follow-up commit consideration: @scanon, do we want to support _Float16 anywhere else? Do we need to lock down an ABI here for i386/x86_64 in advance of those gears turning in the outer world?

docs/LanguageExtensions.rst
497 ↗(On Diff #183391)

Hmm. Here's a better way of expressing part of that:

``_Float16`` is currently only supported on the following targets, with further
targets pending ABI standardization:
- 32-bit ARM
- 64-bit ARM (AArch64)
- SPIR
484 ↗(On Diff #183536)

"them" should be "it" here.

scanon added a subscriber: ab.Jan 25 2019, 9:28 AM

do we want to support _Float16 anywhere else?

ARM is the only in-tree target with a defined ABI that I'm aware of.

Do we need to lock down an ABI here for i386/x86_64 in advance of those gears turning in the outer world?

We definitely want to push for one to be defined (and make sure that it makes sense), but I don't think we don't need to rush ahead of everyone, rather get to preliminary agreement. I think @ab was going to follow-up on that?

rjmccall added inline comments.Jan 25 2019, 9:28 AM
docs/LanguageExtensions.rst
484 ↗(On Diff #183536)

Sorry, apparently Phabricator was sitting on a comment I made yesterday; ignore it, please, and just go with s/them/it/.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 25 2019, 9:30 AM
Closed by commit rL352221: Disable _Float16 for non ARM/SPIR Targets (authored by erichkeane, committed by ). · Explain Why
This revision was automatically updated to reflect the committed changes.

"it" is the grammatically correct word.

yaxunl added a subscriber: yaxunl.Jan 28 2019, 6:37 PM

This change causes regressions for CUDA/HIP. As single-source language, CUDA/HIP code contains both device and host code. It has separate compilation for host and device.
In host compilation, device function is parsed but not emitted in IR. The device function may have _Float16 argument, which is fine if device target supports it. Host compilation
should not diagnose use of _Float16 in device functions. However, current implementation diagnose any _Float16 usage in host compilation.

This change causes regressions for CUDA/HIP. As single-source language, CUDA/HIP code contains both device and host code. It has separate compilation for host and device.
In host compilation, device function is parsed but not emitted in IR. The device function may have _Float16 argument, which is fine if device target supports it. Host compilation
should not diagnose use of _Float16 in device functions. However, current implementation diagnose any _Float16 usage in host compilation.

Can this be reasonably delayed using the existing diagnostic-delay mechanism, or is there a problem where the diagnostics aren't necessarily bound to a function definition?

Regardless, I think it would be fine if you wanted to add a CUDA/HIP exception to this check in the short term.