This is an archive of the discontinued LLVM Phabricator instance.

Teach the AArch64 backend about half-precision floating point
ClosedPublic

Authored by olista01 on Aug 13 2014, 2:22 AM.

Details

Reviewers
olista01
Summary

This patch makes half and vectors of half valid types for the AArch64 backend (half is known as f16 in the backend). This is mostly a case of adding f16 to all instruction selection patterns that can use it, but also adds some target-independent logic for promoting arithmetic operations to a wider floating-point type, and fixes up some AArch64 custom lowering which assumes that the smallest floating-point type is f32.

The motivation for this is that the ACLE (ARM C Language Extensions) allows __fp16 to be used as a function argument or return type, and requires it to be passed in floating-point registers. Previously, __fp16 was converted to i16, so the backend had no way of knowing that an __fp16 must be passed in a different register from a short.

Event Timeline

olista01 updated this revision to Diff 12437. Aug 13 2014, 2:22 AM
olista01 retitled this revision from to Teach the AArch64 backend about half-precision floating point.
olista01 updated this object.
olista01 edited the test plan for this revision.
olista01 set the repository for this revision to rL LLVM.
olista01 added a subscriber: Unknown Object (MLST).

Hi Oliver,

Thanks for working on this, it looks like it might really be a set of patches that should be analysed separately.

First, there's the "deal with vNf16 natively in AArch64 where possible". This looks largely fine, though even that would be a bit better sub-divided into function-calls, loads, casts, ... for commit.

Then there's the promotion. I think that's rather more problematic. In general trunc . OP32 . extend != OP16. Apparently, it *does* work for add, sub, div, mul & sqrt, but explicitly not for fma. Personally, I'd be extremely surprised if it worked for the transcendental functions, but I haven't tried to prove it.
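The double-rounding concern can be illustrated with a minimal sketch. It assumes promotion is modelled as: take the exact result, round it to f32, then round that to f16. The helper names `to_half` and `to_single` and the input values are hypothetical, chosen only for illustration; Python's `struct` format codes `e` and `f` stand in for rounding to half and to float. For these fma inputs the f32 rounding lands exactly on an f16 midpoint, so the second rounding breaks the tie in the wrong direction:

```python
import struct

def to_half(x):
    """Round a Python float (double) to the nearest f16 value, ties to even."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_single(x):
    """Round a Python float (double) to the nearest f32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

# Hypothetical inputs; all three are exactly representable in f16.
a = 2 - 2**-9                 # 1.998046875
b = (1 + 2**-10) * 2**-12
c = 1 + 2**-10                # 1.0009765625

# a*b + c needs fewer than 53 significant bits here, so the double
# arithmetic below yields the exact mathematical result of the fma.
exact = a * b + c

direct = to_half(exact)               # what a native f16 fma would return
promoted = to_half(to_single(exact))  # trunc . fma32 . extend

print("direct  :", direct)
print("promoted:", promoted)
print("agree   :", direct == promoted)
```

The exact result sits just below the midpoint between two adjacent f16 values. Rounding to f32 first moves it onto the midpoint, and the tie-to-even rule then rounds up where a single rounding would have rounded down.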

So we want to be very careful in that area. If we really want to support half as a native type, we'll probably need to add libcalls for some operations.

Now, this clearly ties in with D4456, and I'm guessing you added the promotion logic because it makes clang emit fptrunc/fpext which then gets optimised. In that case, the operations that *do* get optimised should be safe, so a more limited set of promotions is probably workable.

Cheers.

Tim.

Then there's the promotion. I think that's rather more problematic. In general trunc . OP32 . extend != OP16. Apparently, it *does* work for add, sub, div, mul & sqrt, but explicitly not for fma. Personally, I'd be extremely surprised if it worked for the transcendental functions, but I haven't tried to prove it.

My original intention when adding the promotion was to make the backend capable of handling any operation that can be expressed by IR, so that it could be robust against optimisations introducing half-precision operations. However, if this could give the wrong results, I agree it would be better to fail the compilation.

Would it be OK to simply remove all of the promotions to f32, leaving them to fail at instruction selection, or is there a better way to express that an operation is not supported?

Should I leave the promotions in for add, sub, div, mul & sqrt, or do you think it would be better to be consistent?

So we want to be very careful in that area. If we really want to support half as a native type, we'll probably need to add libcalls for some operations.

I'm not currently aware of any source language other than OpenCL which allows operations on the half type (ACLE promotes to float first), so adding libcalls seems like overkill at the moment.

Tim's email, which isn't showing up in Phabricator:

Hi Oliver,

On 14 August 2014 16:35, Oliver Stannard <oliver.stannard@arm.com> wrote:

Would it be OK to simply remove all of the promotions to f32, leaving them to fail at instruction selection, or is there a better way to express that an operation is not supported?

I think that's probably the best we've got for now.

Should I leave the promotions in for add, sub, div, mul & sqrt, or do you think it would be better to be consistent?

I'd be happy if you left those in, actually. I have an ongoing cunning
plan to get rid of @llvm.convert.to.fp16 in favour of fptrunc, and
doing those promotions will be a necessary step along the way, I
think.

So we want to be very careful in that area. If we really want to support half as a native type, we'll probably need to add libcalls for some operations.

I'm not currently aware of any source language other than OpenCL which allows operations on the half type (ACLE promotes to float first), so adding libcalls seems like overkill at the moment.

Agreed. Quite a bit of work without much gain at the moment.

Cheers.

olista01 updated this revision to Diff 12616. Aug 18 2014, 7:01 AM

This review now just covers the scalar part of the patch; I will upload a second patch with the vector parts.

Removed most of the type promotions, except for add, sub, mul, div, fp_round and fp_extend.
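As a rough sanity check on the promotions that were kept, the sketch below compares a single rounding to f16 against rounding through f32 for add, sub, mul and div over a sample of positive finite f16 inputs. All names here (`to_half`, `to_single`, `half_from_bits`, `count_mismatches`) are hypothetical, and the model rests on assumptions: sums, differences and products of f16 values are exact in a Python double, and the double quotient is taken as a stand-in for the exact quotient. A sampled check like this is not a proof.

```python
import math
import operator
import struct

def to_half(x):
    """Round a double to the nearest f16 value; overflow becomes infinity."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        # struct refuses values that round past the largest finite f16.
        return math.copysign(math.inf, x)

def to_single(x):
    """Round a double to the nearest f32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

def half_from_bits(bits):
    """Reinterpret a 16-bit pattern as an f16 value."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]

OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}

def count_mismatches(samples):
    """Count input pairs where promotion via f32 changes the f16 result."""
    mismatches = {name: 0 for name in OPS}
    for a in samples:
        for b in samples:
            for name, op in OPS.items():
                wide = op(a, b)                       # computed in double
                direct = to_half(wide)                # one rounding to f16
                promoted = to_half(to_single(wide))   # via f32, then f16
                if direct != promoted:
                    mismatches[name] += 1
    return mismatches

# Positive, non-zero, finite f16 bit patterns (subnormals included),
# sampled with an arbitrary stride to keep the run short.
samples = [half_from_bits(n) for n in range(1, 0x7C00, 257)]
print(count_mismatches(samples))
```

The stride of 257 is arbitrary; lowering it widens the sample at the cost of a quadratically longer run.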

Hi Oliver,

This looks fine, apart from one nit. If you don't need FP_ROUND/FP_EXTEND, feel free to remove those lines and just commit.

Cheers.

Tim.

lib/Target/AArch64/AArch64ISelLowering.cpp
289–290

Are these needed? I don't see any code in LegalizeDAG to promote FP_ROUND or FP_EXTEND (and doing so sounds a bit dodgy, given that they're what we'll be using to *do* the promotion in general).

olista01 accepted this revision. Aug 18 2014, 7:32 AM
olista01 added a reviewer: olista01.

Good catch, those promotions were unnecessary.

Committed revision 215891.

This revision is now accepted and ready to land. Aug 18 2014, 7:32 AM
olista01 closed this revision. Aug 18 2014, 7:33 AM