Page MenuHomePhabricator

[ARM] Make FP16 vectors legal

Authored by miyuki on Jul 30 2018, 7:49 AM.



On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:

void foo(int, ...);

typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
void bar(float16x4_t x) {
  foo(42, x);

According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.

This patch makes FP16 vectors legal in the backend, to that they can
be marshalled correctly when passed as parameters. All operations
(except for loads and stores) on FP16 vectors get expanded. The change
required several adjustments in SelectionDAG and in ARM FP16 tests.

Diff Detail

Event Timeline

miyuki created this revision.Jul 30 2018, 7:49 AM

Please make sure there's test coverage for illegal half operations (fadd <4 x half> etc.)

miyuki added a comment.EditedJul 31 2018, 7:22 AM

I tried compiling a fadd <4 x half> and it actually does not work: the type gets legalized into v4i16 and we get an unknown libcall. So, the problem is that we need to soften v4f16 to v4i16 when passing it as a function parameter, but at the same time expand to f32 when performing arithmetics on it. Do you have any suggestion how to implement this correctly? Do any other targets face a similar problem?

Can we make <4 x half> a legal type on all subtargets with NEON, and just mark all the operations "expand" for subtargets which don't support math on it?

miyuki updated this revision to Diff 158541.Aug 1 2018, 8:51 AM
miyuki retitled this revision from [ARM] Allow half-precision FP softening to [ARM] Make FP16 vectors legal.
miyuki edited the summary of this revision. (Show Details)

This approach worked, but required handling two more operations in PromoteFloatOperand: PromoteFloatOp_BUILD_VECTOR and PromoteFloatOp_INSERT_VECTOR_ELT, which in my implementation don't actually promote anything but rather do some FP softening. Not sure if there is a better solution.

Needs tests for hardfloat ABI, in addition to soft-float.

Does half also need to be legal to get passed/returned correctly?

It looks like clang already has special handling for half types in some cases; how does that interact with this patch?

miyuki updated this revision to Diff 159496.EditedAug 7 2018, 6:23 AM

I have added a hardfp test. Clang has some logic to handle scalar FP16 values, it performs softening when needed (e.g. tools/clang/test/CodeGen/arm-fp16-arguments.c), whereas NEON vectors are lowered into LLVM vector types. So we don't need additional changes neither in Clang nor in the LLVM scalar FP16 handling.

We should handle f16 and vectors of f16 in a consistent manner, for the sake of maintaining the code in the future. (Either handle both in the backend, or handle both in clang.)

miyuki abandoned this revision.Sep 11 2018, 2:34 AM

Abandoning in favor of