This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants.
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll | ||
---|---|---|
996 | Should there not also be a test where both inputs are v8f16? |
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll | ||
---|---|---|
839 | Yes, they are here :) |
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll | ||
---|---|---|
996 | Ah, I now see what you mean. This is the test and IR for the ACLE intrinsic float16x8_t vmulq_lane_f16 (float16x8_t a, float16x4_t v, const int lane), but yes, the pattern would also match when the 2nd operand is a v8f16. |
test/CodeGen/ARM/armv8.2a-fp16-vector-intrinsics.ll | ||
---|---|---|
996 | The shufflevector is creating an 8 x half vector here, with half the elements undef because we pass in a 4 x half, so it actually looks all okay here? |
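
To make the point about the shufflevector concrete, here is a minimal Python sketch of the lane-multiply semantics discussed above. It is a model, not the patch's code: the helper names `shufflevector` and `vmulq_lane_model` are hypothetical, `None` stands in for an undef element, and the splat mask (every index equal to `lane`) is an assumption about the shape of the IR in the test. Python floats are doubles, so this illustrates the lane selection only, not half-precision rounding.

```python
UNDEF = None  # stand-in for an LLVM undef element (assumption for this model)


def shufflevector(v1, v2, mask):
    """Model LLVM's shufflevector: mask indices select from v1 followed by v2."""
    joined = list(v1) + list(v2)
    return [joined[i] for i in mask]


def vmulq_lane_model(a, v, lane):
    """Model of vmulq_lane_f16(a, v, lane): multiply each a[i] by v[lane]."""
    assert len(a) == 8 and len(v) == 4 and 0 <= lane < 4
    # Widen the 4-element input to 8 lanes by splatting v[lane].
    # The second shuffle input is entirely undef, but a mask that only
    # contains `lane` never selects from it.
    splat = shufflevector(v, [UNDEF] * 4, [lane] * 8)
    return [x * s for x, s in zip(a, splat)]


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
v = [0.5, 1.5, 2.5, 3.5]
result = vmulq_lane_model(a, v, 3)
```

Under these assumptions, the undef half of the shuffle input never reaches the multiply, which is why a 4 x half second operand is fine for the q-form intrinsic.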