This patch adds codegen for the following bfloat16 (bf16)
operations to the ARM backend:
- concatenation of bf16 vectors
- bf16 vector element extraction
- bf16 vector element insertion
- duplication of a bf16 value into each lane of a vector
- duplication of a bf16 vector lane into each lane
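
For reference, the lane-level semantics of the five operations above can be sketched as a plain software model. This is only an illustration, not the code the backend emits: the type aliases and function names (`bf16x4`, `concat`, `extract_lane`, `insert_lane`, `dup`, `dup_lane`) are hypothetical, a bf16 value is assumed to be stored as the upper 16 bits of an IEEE-754 binary32, and `to_bf16` truncates rather than rounds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model: a bf16 value is kept as its raw 16-bit pattern.
using bf16 = std::uint16_t;
using bf16x4 = std::array<bf16, 4>;
using bf16x8 = std::array<bf16, 8>;

// Truncating float -> bf16 conversion (keeps the top 16 bits, no rounding).
inline bf16 to_bf16(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return static_cast<bf16>(bits >> 16);
}

// Concatenation: lo fills lanes 0..3, hi fills lanes 4..7.
inline bf16x8 concat(const bf16x4& lo, const bf16x4& hi) {
    bf16x8 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = lo[i];
        out[i + 4] = hi[i];
    }
    return out;
}

// Element extraction: read one lane.
inline bf16 extract_lane(const bf16x4& v, std::size_t lane) {
    assert(lane < v.size());
    return v[lane];
}

// Element insertion: return a copy of v with one lane replaced.
inline bf16x4 insert_lane(bf16x4 v, std::size_t lane, bf16 x) {
    assert(lane < v.size());
    v[lane] = x;
    return v;
}

// Duplicate a scalar bf16 value into every lane.
inline bf16x4 dup(bf16 x) {
    bf16x4 out{};
    out.fill(x);
    return out;
}

// Duplicate one lane of v into every lane.
inline bf16x4 dup_lane(const bf16x4& v, std::size_t lane) {
    return dup(extract_lane(v, lane));
}
```

None of these operations does arithmetic on the bf16 value; each one only moves 16-bit lanes around, which is why the model can treat bf16 as an opaque bit pattern.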
Does VMOVRH require fullfp16? Am I right in saying that bf16 doesn't require the set of instructions we put into HasFPRegs16? If so, that sounds like a pain.