ARMv8.2a adds FP16 support , i.e. f16 is not only a storage-only type, but it also supports performing data processing on 16-bit floating-point quantities. All
all the necessary groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done D15014. To take advantage of this, we do not want to promote f16 to f32 when the subtarget supports FullFP16, which will allow instruction selection of these FP16 instructions.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi Sjoerd,
These changes look fine to me, but don't we need to test how the backend handles immediate values in these cases?
cheers,
sam
Probably the subject/description of this patch is a bit confusing/misleading. I should have mentioned in the description that all the groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done D15014, which added all the instructions, and also e.g. an fpimm16 immediate type, and the required regression tests for these instructions. This patch is just a tweak to avoid promotions for FP16 instructions that support f16 operands.
I think we could still do with tests for how half-precision constants are code-generated from IR (using FMOV or constant pools). I don't see any existing tests for that.
There are also a few classes of instruction where we still promote to 32-bit float, but it looks like we have the instructions to do them directly in half-precision. If they turn out to be more complicated that changing a setOperationAction call, then it would make sense to do them as separate patches though.
test/CodeGen/AArch64/f16-instructions.ll | ||
---|---|---|
517 ↗ | (On Diff #109992) | Could this be selected as "fcvtzs w0, h0"? |
903 ↗ | (On Diff #109992) | Could this be done without the FCVTs, by changing the constant? |
Hi Sam, Oliver, thanks for checking and reviewing. Your gut feelings were right: some more testing showed that target hook isFPImmLegal was not allowing f16. So I've modified that function, and added more tests.
I forgot to address some of Oliver's earlier comments:
- I have added fixes for the conversion functions, and modified the tests,
- I will address the codegen for copysign in a follow up patch, because it it requires a bit of custom lowering.
About the newly added immediate tests:
- I have added (even) more tests. Edge cases is a bit meaningless for 8-bit fp immediates, because there are many (the ARMARM shows a table with accepted immediates), but this now covers minimum/maximum values.
- I have fixed the spell error, just only for those pedantic natives. ;-)