This is an archive of the discontinued LLVM Phabricator instance.

[ARM] FP16: vector VMUL variants
ClosedPublic

Authored by SjoerdMeijer on Aug 6 2018, 3:17 AM.

Details

Summary

This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants.

Diff Detail

Repository
rL LLVM

Event Timeline

SjoerdMeijer created this revision. Aug 6 2018, 3:17 AM
samparker added inline comments. Aug 8 2018, 1:17 AM
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll
996 ↗(On Diff #159268)

Should there not also be a test where both inputs are v8f16?

SjoerdMeijer added inline comments. Aug 8 2018, 1:28 AM
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll
839 ↗(On Diff #159268)

Yes, they are here :)

SjoerdMeijer added inline comments. Aug 8 2018, 1:49 AM
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll
996 ↗(On Diff #159268)

Ah, I now see what you mean. This is the test and IR for the ACLE intrinsic:

float16x8_t vmulq_lane_f16 (float16x8_t a, float16x4_t v, const int lane)

But yes, the pattern would also match when the second operand is a v8f16.
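
For readers unfamiliar with the lane variants, here is a rough scalar model of what vmulq_lane_f16 computes: every element of the 8-lane vector is multiplied by one selected lane of the 4-lane vector. This is only a sketch in Python, not code from the patch; the names to_half and vmulq_lane_f16_model are hypothetical, and half precision is approximated by rounding through the standard library's struct "e" format.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def vmulq_lane_f16_model(a, v, lane):
    """Hypothetical scalar model: multiply all 8 elements of a by v[lane]."""
    if len(a) != 8 or len(v) != 4:
        raise ValueError("expected an 8-element a and a 4-element v")
    if not 0 <= lane < 4:
        raise ValueError("lane must index into the 4-element vector")
    scalar = to_half(v[lane])
    # The product of two half values is exact in double precision,
    # so a single rounding back to half models the lane-wise multiply.
    return [to_half(to_half(x) * scalar) for x in a]


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
v = [0.5, 1.5, 2.0, 4.0]
result = vmulq_lane_f16_model(a, v, 2)  # multiplies every element by v[2]
```

The model only describes the arithmetic; it says nothing about which instruction the backend selects.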

SjoerdMeijer added inline comments. Aug 8 2018, 2:23 AM
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll
996 ↗(On Diff #159268)

The shufflevector creates an 8 x half vector here, with half of the elements undef because we pass in a 4 x half, so it actually looks okay here?
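
To illustrate how a shufflevector can produce an 8-element result from 4-element inputs, here is a small Python model of its semantics: the two inputs are concatenated, and each mask entry picks one element of that concatenation. The masks below are assumptions chosen for illustration, not the ones from the test file, and UNDEF is a hypothetical stand-in for an undef element.

```python
UNDEF = None  # hypothetical stand-in for an undef element or mask entry


def shufflevector(v1, v2, mask):
    """Model of LLVM shufflevector: index into the concatenation of v1 and v2."""
    if len(v1) != len(v2):
        raise ValueError("both input vectors must have the same length")
    combined = list(v1) + list(v2)
    # An undef mask entry yields an undef result element; selecting from an
    # undef input element also yields undef, because that element is UNDEF.
    return [UNDEF if i is UNDEF else combined[i] for i in mask]


b = [0.5, 1.5, 2.0, 4.0]   # the 4 x half operand
undef4 = [UNDEF] * 4       # an undef second operand

# Splat lane 2 of the 4-element vector across all 8 result lanes.
splat = shufflevector(b, undef4, [2] * 8)

# Widen to 8 elements, leaving the upper half undefined.
widened = shufflevector(b, undef4, [0, 1, 2, 3, UNDEF, UNDEF, UNDEF, UNDEF])
```

In both cases the result length is set by the mask, not by the inputs, which is why a 4 x half operand can feed an 8 x half multiply.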

samparker accepted this revision. Aug 8 2018, 3:03 AM

Cheers, shufflevector always confuses me. LGTM.

This revision is now accepted and ready to land. Aug 8 2018, 3:03 AM

Thanks for the reviews!

This revision was automatically updated to reflect the committed changes.