Currently, {extract,insert}-element has zero cost at lane 0 [1]. However, moving a value from a SIMD register to a GPR does have a cost (an fmov instruction at lane 0 [2], or more generally an ext/ins instruction) when the element is used explicitly as an integer.
See https://godbolt.org/z/faPE1nTn8, where fmov is generated for the d* register -> x* register conversion.
Implementation-wise, this adds a private method AArch64TTIImpl::getVectorInstrCostHelper as a helper function. This way, the instruction-based method can share the core logic (e.g., returning zero cost if the type is legalized to scalar).
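The shared-helper shape described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual LLVM code: the class name CostModelSketch, the boolean parameters, the method getVectorInstrCostForInstr, and the placeholder cost value are all assumptions made for the example. Only the helper's name mirrors the one in this patch, and its real signature differs.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a target cost model.
// It does not use or reproduce the LLVM TargetTransformInfo API.
class CostModelSketch {
public:
  // Type-based query: nothing is known about how the element is used,
  // so a move to a GPR is not assumed.
  int getVectorInstrCost(bool LegalizedToScalar, unsigned Index) const {
    return getVectorInstrCostHelper(LegalizedToScalar, Index,
                                    /*MovesToGPR=*/false);
  }

  // Instruction-based query (hypothetical name): the caller has looked at
  // the instruction and knows whether the element is used as an integer.
  int getVectorInstrCostForInstr(bool LegalizedToScalar, unsigned Index,
                                 bool UsedAsInteger) const {
    return getVectorInstrCostHelper(LegalizedToScalar, Index, UsedAsInteger);
  }

private:
  // Assumed placeholder; a real target would take this from its own tables.
  static constexpr int BaseCost = 2;

  // Core logic shared by both public entry points.
  int getVectorInstrCostHelper(bool LegalizedToScalar, unsigned Index,
                               bool MovesToGPR) const {
    // A vector type legalized to a scalar needs no insert/extract at all.
    if (LegalizedToScalar)
      return 0;
    // Lane 0 is free only when the value stays in the SIMD register file.
    if (Index == 0 && !MovesToGPR)
      return 0;
    return BaseCost;
  }
};
```

With this layout, both entry points agree on the common cases (scalarized types, lane 0 without a register-file move), and only the instruction-based query can charge for the SIMD-to-GPR move at lane 0.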
[1] https://github.com/llvm/llvm-project/blob/2cf320d41ed708679e01eeeb93f58d6c5c88ba7a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp#L1853
[2] https://github.com/llvm/llvm-project/blob/2cf320d41ed708679e01eeeb93f58d6c5c88ba7a/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L8150-L8157
What is the purpose of the using statement here?