This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Do not promote f16 when subtarget HasFullFP16
ClosedPublic

Authored by SjoerdMeijer on Aug 7 2017, 7:49 AM.

Details

Summary

ARMv8.2a adds FP16 support , i.e. f16 is not only a storage-only type, but it also supports performing data processing on 16-bit floating-point quantities. All
all the necessary groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done D15014. To take advantage of this, we do not want to promote f16 to f32 when the subtarget supports FullFP16, which will allow instruction selection of these FP16 instructions.

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Aug 7 2017, 7:49 AM
samparker edited edge metadata.Aug 8 2017, 4:03 AM

Hi Sjoerd,

These changes look fine to me, but don't we need to test how the backend handles immediate values in these cases?

cheers,
sam

Probably the subject/description of this patch is a bit confusing/misleading. I should have mentioned in the description that all the groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done D15014, which added all the instructions, and also e.g. an fpimm16 immediate type, and the required regression tests for these instructions. This patch is just a tweak to avoid promotions for FP16 instructions that support f16 operands.

SjoerdMeijer edited the summary of this revision. (Show Details)Aug 8 2017, 5:55 AM
olista01 edited edge metadata.Aug 9 2017, 5:54 AM

I think we could still do with tests for how half-precision constants are code-generated from IR (using FMOV or constant pools). I don't see any existing tests for that.

There are also a few classes of instruction where we still promote to 32-bit float, but it looks like we have the instructions to do them directly in half-precision. If they turn out to be more complicated that changing a setOperationAction call, then it would make sense to do them as separate patches though.

test/CodeGen/AArch64/f16-instructions.ll
517

Could this be selected as "fcvtzs w0, h0"?

903

Could this be done without the FCVTs, by changing the constant?

Hi Sam, Oliver, thanks for checking and reviewing. Your gut feelings were right: some more testing showed that target hook isFPImmLegal was not allowing f16. So I've modified that function, and added more tests.

samparker added inline comments.Aug 15 2017, 7:05 AM
test/CodeGen/AArch64/f16-imm.ll
2

A pedantic native speaker may point out that's its 'illegal'... ;)

21

I think it would be a good idea to add the tests that cover the edge cases too, so that the minimal and maximum values are shown to be accepted correctly.

I forgot to address some of Oliver's earlier comments:

  • I have added fixes for the conversion functions, and modified the tests,
  • I will address the codegen for copysign in a follow up patch, because it it requires a bit of custom lowering.

About the newly added immediate tests:

  • I have added (even) more tests. Edge cases is a bit meaningless for 8-bit fp immediates, because there are many (the ARMARM shows a table with accepted immediates), but this now covers minimum/maximum values.
  • I have fixed the spell error, just only for those pedantic natives. ;-)
olista01 accepted this revision.Aug 15 2017, 9:04 AM

LGTM.

This revision is now accepted and ready to land.Aug 15 2017, 9:04 AM
This revision was automatically updated to reflect the committed changes.