These currently use _u32, but they should instead use _f32 or _f16, the types of the accumulator, and of the multiplication.
I'm starting with _f16 (because that seems to match the various integer vmlal variants), but either seems fine.
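To illustrate what the suffix describes, here is a minimal sketch, not code from this patch. It assumes the post-change ACLE name `vfmlal_low_f16`, which takes a `float32x2_t` accumulator and two `float16x4_t` multiplicands, and it is guarded by the ACLE feature macro `__ARM_FEATURE_FP16_FML`. The helper name `fmlal_low_demo` and the scalar fallback are hypothetical and exist only so the example builds on targets without FP16FML.

```c
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_FP16_FML)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of the patch): acc[i] += a[i] * b[i]
 * for the two low lanes of a and b. The multiplicands are f16 and the
 * accumulator is f32, which is why _f16 or _f32 fits better than _u32. */
void fmlal_low_demo(float acc[2], const float a[4], const float b[4])
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_FP16_FML)
    float32x2_t r  = vld1_f32(acc);
    float16x4_t va = vcvt_f16_f32(vld1q_f32(a)); /* narrow inputs to f16 */
    float16x4_t vb = vcvt_f16_f32(vld1q_f32(b));
    r = vfmlal_low_f16(r, va, vb);               /* _f16 suffix after this change */
    vst1_f32(acc, r);
#else
    /* Assumed scalar model of the same operation, kept in float. */
    for (size_t i = 0; i < 2; ++i)
        acc[i] += a[i] * b[i];
#endif
}
```

The naming mirrors the integer case mentioned above: `vmlal_s16` is suffixed with the type of its `int16x4_t` multiplicands, not with the widened `int32x4_t` accumulator.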
Differential D58306
[AArch64] Change size suffix for FP16FML intrinsics.
Closed, Public
Authored by ab on Feb 15 2019, 2:04 PM.
Event Timeline

I am discussing this with our GCC team, as we would like the Clang and GCC implementations to be the same. But you're right that _f16 looks to be the more consistent choice. I will let you know as soon as I know more.

LGTM. The ACLE has been updated, and a new version with the change included will be released soon.

This revision is now accepted and ready to land. Feb 19 2019, 5:08 AM

Closed by commit rL354538: [AArch64] Change size suffix for FP16FML intrinsics. (authored by ab). Feb 20 2019, 5:13 PM

This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 187696
cfe/trunk/include/clang/Basic/arm_neon.td
cfe/trunk/test/CodeGen/aarch64-neon-fp16fml.c