ACLE 2.0 allows __fp16 to be used as a function argument or return type. This patch enables that for AArch64.
I have not enabled this for 32-bit ARM targets yet, as we expect to release an updated version of the 32-bit AAPCS soon, which changes the handling of __fp16 in a non-backwards-compatible way.
This patch also fixes an existing bug that caused clang to reject homogeneous floating-point aggregates with a base type of __fp16. Such aggregates are valid for AAPCS64, but not for AAPCS-VFP.
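
For illustration, here is a minimal sketch of the kind of code this is meant to accept on AArch64: __fp16 passed and returned by value, and a struct whose members are all __fp16 (a homogeneous floating-point aggregate) passed by value. The names half_t, scale, quad and sum_quad are hypothetical and do not come from the patch or its tests. The float fallback is only an assumption so that the sketch still compiles on targets where __fp16 cannot be passed by value.

```c
/* Assumption: on targets other than AArch64, fall back to float so the
   sketch still builds. Only the __aarch64__ branch exercises __fp16. */
#if defined(__aarch64__)
typedef __fp16 half_t;
#else
typedef float half_t;
#endif

/* __fp16 used as both a parameter type and a return type.
   The arithmetic is done in float and narrowed back on return. */
half_t scale(half_t x, half_t factor) {
    return (half_t)((float)x * (float)factor);
}

/* Four members of the same floating-point type: a homogeneous
   floating-point aggregate with base type __fp16 on AArch64. */
struct quad {
    half_t a, b, c, d;
};

/* Takes the aggregate by value and sums its members in float. */
float sum_quad(struct quad q) {
    return (float)q.a + (float)q.b + (float)q.c + (float)q.d;
}
```

A call such as scale(1.5, 2.0) or sum_quad with members 1, 2, 3 and 4 uses only values that are exactly representable in half precision, so the results do not depend on which branch of the typedef is taken.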