We currently generate inefficient code for vector conversions from short int to half: the input is promoted to int, converted, and then demoted to half. With this patch, when the fp16 feature is enabled, we generate a direct fp16 vector conversion instead.
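As a quick sanity check of why the direct conversion can stand in for the promote/convert/demote sequence, the sketch below compares the two paths for every 16-bit signed value. It is not part of the patch and says nothing about the instructions actually emitted; it only models the arithmetic. The helper names `to_half_via_float` and `to_half_direct` are hypothetical, and the sketch assumes the intermediate type is a 32-bit float and that both paths round to nearest-even, which is what Python's `struct` `"e"` format implements.

```python
import struct


def to_half_via_float(x: int) -> bytes:
    """Model of the old path: widen, convert to float32, then narrow to half."""
    widened = int(x)  # promotion to int leaves the value unchanged
    # Round-trip through a 32-bit float (assumed intermediate type).
    as_f32 = struct.unpack("<f", struct.pack("<f", float(widened)))[0]
    return struct.pack("<e", as_f32)  # demote to IEEE 754 binary16


def to_half_direct(x: int) -> bytes:
    """Model of the new path: a single conversion to half."""
    return struct.pack("<e", float(x))


# Compare the two paths over the full int16 range.
mismatches = [
    x for x in range(-32768, 32768)
    if to_half_via_float(x) != to_half_direct(x)
]
print(len(mismatches))
```

Every int16 value is exactly representable in a 32-bit float, so the intermediate step never rounds and the only rounding in either path is the final one to half.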
Details
Diff Detail