This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add support for half-precision floats
ClosedPublic

Authored by luismarques on Oct 21 2019, 2:57 AM.

Details

Summary

Most fp16 operations are automatically supported by promoting the half-precision values to single-precision ones. This patch completes fp16 support by ensuring that load extension / truncate store operations are properly expanded.

The tests included in the patch check the load ext / trunc store behavior, and add a few sanity checks for promoted fp16 operations. Testing with riscv32 using the ilp32d ABI is enough to check the 4 ext/trunc cases, and the riscv64 output doesn't differ in any important way, so the tests target only riscv32.

Event Timeline

luismarques created this revision. Oct 21 2019, 2:57 AM
Herald added a project: Restricted Project. Oct 21 2019, 2:57 AM

This is looking good to me. I'd like you to precommit the tests.

I find the libcall ABI slightly odd (__gnu_h2f_ieee takes the half-precision arg in a0 and returns the result in fa0, it seems, which strikes me as a bit odd, but maybe I'm missing something). That said, that probably isn't an issue with this patch specifically.

This is looking good to me. I'd like you to precommit the tests.

What do you mean? (Don't forget that the load ext / trunc store tests require this patch's code changes)

I find the libcall ABI slightly odd (__gnu_h2f_ieee takes the half-precision arg in a0 and returns the result in fa0, it seems, which strikes me as a bit odd, but maybe I'm missing something). That said, that probably isn't an issue with this patch specifically.

That makes sense. The half-precision float isn't a floating-point value as understood by the FP unit, so it has to go in a GPR, where ALU operations can slice up the fields and build the normal IEEE 754 32-bit representation. That result can then be returned as a regular float, which for ilp32f is of course returned in an FPR.
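As a rough illustration of the "slice up the fields" step described above, here is a minimal C sketch of a half-to-single conversion done purely with integer operations. The function name half_bits_to_float is hypothetical, and this is not the compiler-rt or libgcc implementation of __gnu_h2f_ieee; it only assumes the IEEE 754 binary16 and binary32 layouts and a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the real __gnu_h2f_ieee): takes the raw 16-bit
 * pattern of an IEEE 754 half, as it would arrive in a GPR, and returns the
 * equivalent single-precision value. */
float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* sign moves to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;             /* 5-bit exponent, bias 15 */
    uint32_t frac = h & 0x3FFu;                    /* 10-bit fraction */
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, fraction widened to 23 bits. */
        bits = sign | 0x7F800000u | (frac << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 (add 112). */
        bits = sign | ((exp + 112u) << 23) | (frac << 13);
    } else if (frac != 0) {
        /* Half subnormal: normalize it, since it is a normal float. */
        uint32_t e = 113; /* biased float exponent for 2^-14 */
        while ((frac & 0x400u) == 0) {
            frac <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((frac & 0x3FFu) << 13);
    } else {
        /* Signed zero. */
        bits = sign;
    }

    memcpy(&f, &bits, sizeof f); /* reinterpret the bits as a float */
    return f;
}
```

For example, half_bits_to_float(0x3C00) yields 1.0f. Under a hard-float ABI, a function shaped like this naturally takes its integer argument in a GPR and returns its float in an FPR, which matches the a0/fa0 pairing discussed above.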

No exception for half-precision floats has been included in the RISC-V ELF psABI (one would expect they should use the FP calling convention, as they are an FP real value). This should perhaps be rectified, but separately to this patch. I'll make a note.

It's probably a good idea to clarify the psABI's stance on half floats (possibly including not having any opinion about that), but do beware that this issue transcends the ABI. This half-float is being promoted at the LLVM IR level to a single-precision float, and once that happens everything behaves normally. What happens before the promotion is an LLVM implementation detail, so probably outside the scope of the ABI docs.

Just to clarify, because I used somewhat sloppy terminology. Consider this:

%r = fadd half %a, %b
->
; ... magic ...
; CHECK-NEXT:    fadd.s fa0, fs0, fa0
; ... magic ...

We are conceptually implementing half-float addition, but in reality promoting and expanding that to single-precision floating-point addition. If we consider that all of the promotion/expansion happens at a level that is not visible to the user and does not have to interoperate with other compilers then I would argue that it transcends the ABI requirements. But if it becomes visible then that's another story. Then I guess the answer depends on whether you want to standardize the ABI for a non-standard C type, and thus ensure interoperability across compilers even for that case.
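The promote-then-truncate pattern described above can be sketched in C. This is only an illustration of the idea, not the code LLVM emits: the names half_bits_to_float, float_to_half_bits and half_add are hypothetical, halves are modelled as raw uint16_t bit patterns, and the truncation is assumed to round to nearest, ties to even.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical extension step: raw half bits -> float. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu, frac = h & 0x3FFu, bits;
    float f;

    if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (frac << 13);          /* Inf / NaN */
    } else if (exp != 0) {
        bits = sign | ((exp + 112u) << 23) | (frac << 13); /* normal */
    } else if (frac != 0) {
        uint32_t e = 113;                                  /* subnormal */
        while ((frac & 0x400u) == 0) { frac <<= 1; e--; }
        bits = sign | (e << 23) | ((frac & 0x3FFu) << 13);
    } else {
        bits = sign;                                       /* zero */
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical truncation step: float -> raw half bits, assuming
 * round-to-nearest, ties-to-even. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x, half, rem, halfway;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t mag = x & 0x7FFFFFFFu;
    uint32_t e = mag >> 23, mant = mag & 0x7FFFFFu;

    if (mag > 0x7F800000u) return (uint16_t)(sign | 0x7E00u); /* NaN */
    if (e >= 143) return (uint16_t)(sign | 0x7C00u); /* >= 65536 or Inf */

    if (e >= 113) {                 /* result is a normal half */
        half = ((e - 112u) << 10) | (mant >> 13);
        rem = mant & 0x1FFFu;
        halfway = 0x1000u;
    } else if (e >= 102) {          /* result is a subnormal half */
        uint32_t shift = 126u - e;  /* 14..24 */
        mant |= 0x800000u;          /* make the implicit leading 1 explicit */
        half = mant >> shift;
        rem = mant & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    } else {
        return (uint16_t)sign;      /* too small: rounds to signed zero */
    }
    /* Round to nearest even; a carry may bump the exponent, up to Inf. */
    if (rem > halfway || (rem == halfway && (half & 1u))) half++;
    return (uint16_t)(sign | half);
}

/* Conceptual "fadd half": extend both operands, add in single precision
 * (the fadd.s in the CHECK line above), then truncate the result. */
static uint16_t half_add(uint16_t a, uint16_t b)
{
    return float_to_half_bits(half_bits_to_float(a) + half_bits_to_float(b));
}
```

For instance, half_add(0x3C00, 0x4000) adds 1.0 and 2.0 and returns 0x4200, the half encoding of 3.0. None of the intermediate float values are visible to the caller, which is the sense in which the promotion is an implementation detail.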

lenary added a comment. Edited Oct 24 2019, 9:23 AM

If we consider that all of the promotion/expansion happens at a level that is not visible to the user and does not have to interoperate with other compilers then I would argue that it transcends the ABI requirements. But if it becomes visible then that's another story.

I think most standards that even mention them say that half-precision floats are a storage-only format, as these platforms cannot directly compute with them.

I do think we should care about how they are passed in the calling convention, as that is perhaps the most "visible" part of using them. At the moment, LLVM is doing something reasonable that is compatible with GCC, so it would be good to codify it.

Edit: Further discussion here about the psABI isn't useful, as this patch is about ensuring we don't abort when expanding half-precision to single-precision.

Removed the align 2 from the tests' loads and stores.

lenary accepted this revision. Oct 25 2019, 2:49 AM

LGTM

This revision is now accepted and ready to land. Oct 25 2019, 2:49 AM
This revision was automatically updated to reflect the committed changes.