HomePhabricator

[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors

Authored by miyuki on Jun 22 2020, 10:15 AM.

Description

[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors

Currently, in order to extract an element from a bf16 vector, we cast
the vector to an i16 vector, perform the extraction, and cast the result to
bfloat. This behavior was copied from the old fp16 implementation.

The goal of this patch is to achieve optimal code generation for lane
copying intrinsics in a subsequent patch (LLVM fails to fold certain
combinations of bitcast, insertelement, extractelement and
shufflevector instructions leading to the generation of suboptimal code).

Differential Revision: https://reviews.llvm.org/D82206