A pattern for moving data from a Neon ACLE type into an SVE ACLE type
involves extracting the two double-lanes of the Neon register and
inserting them into an SVE register using two DUPs with VL1 and VL2.
This must compile to a NOP.
To achieve this, this patch teaches DAGCombine to perform the
INSERT_VECTOR_ELT => BUILD_VECTOR combine. Since BUILD_VECTOR does not
support scalable vectors, the insertions are pushed into a fixed
BUILD_VECTOR through an INSERT_SUBVECTOR to make it scalable again.
With this DAGCombine in place, existing BUILD_VECTOR combines are able
to neatly optimize away bitcast/extractelement/shuffle etc.
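The equivalence behind this combine can be sketched with a small scalar model. This is not LLVM code: the types and helper names below (ScalableVec, FixedVec2, insert_vector_elt, build_vector, insert_subvector, before_combine, after_combine) are hypothetical stand-ins for the SelectionDAG nodes, and the model assumes an nxv2f64-like vector holding 2 * vscale doubles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a scalable vector of doubles whose run-time length
// is 2 * vscale. The real DAG nodes operate on SDValues, not containers.
using ScalableVec = std::vector<double>;
using FixedVec2 = std::array<double, 2>;

ScalableVec make_scalable(std::size_t vscale, double fill) {
    return ScalableVec(2 * vscale, fill);
}

// Models INSERT_VECTOR_ELT: replace one lane, keep all others.
ScalableVec insert_vector_elt(ScalableVec vec, double value, std::size_t index) {
    assert(index < vec.size());
    vec[index] = value;
    return vec;
}

// Models a fixed-width BUILD_VECTOR of two lanes.
FixedVec2 build_vector(double e0, double e1) {
    return {e0, e1};
}

// Models INSERT_SUBVECTOR: overwrite sub.size() lanes starting at index.
ScalableVec insert_subvector(ScalableVec vec, const FixedVec2& sub, std::size_t index) {
    assert(index + sub.size() <= vec.size());
    for (std::size_t i = 0; i < sub.size(); ++i) {
        vec[index + i] = sub[i];
    }
    return vec;
}

// Shape before the combine: a chain of element insertions.
ScalableVec before_combine(ScalableVec vec, double e0, double e1) {
    return insert_vector_elt(insert_vector_elt(vec, e1, 1), e0, 0);
}

// Shape after the combine: one fixed BUILD_VECTOR inserted at lane 0.
ScalableVec after_combine(ScalableVec vec, double e0, double e1) {
    return insert_subvector(vec, build_vector(e0, e1), 0);
}
```

Both functions produce the same lanes for any vscale of at least 1, which is the property the combine relies on. The conditions under which the actual combine fires are not modeled here.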
Since not all scalable vector types are supported for INSERT_SUBVECTOR,
I introduce TargetLoweringInfo::isInsertSubvectorLegal, which is queried
to decide whether to perform the combine.
Two dup => insertelement patterns are added in instCombineSVEDup:
  (dup vec VL1 elem0)
    => (insertelement vec elem0 0)
  (dup (dup vec VL2 elem1) VL1 elem0)
    => (insertelement (insertelement vec elem1 1) elem0 0)
... which enable the BUILD_VECTOR optimization to work.
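Why the two dup patterns are equivalent to insertelement can be checked with a scalar model of a merging, predicated dup. This is only a sketch: Vec, kLanes, dup_vl and insert_element are hypothetical names, the 4-lane width is an arbitrary choice, and the model assumes that a VLn predicate activates exactly the first n lanes (and no lanes at all if the vector has fewer than n).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size stand-in for an SVE register of doubles.
constexpr std::size_t kLanes = 4;
using Vec = std::array<double, kLanes>;

// Models a merging dup under a VLn predicate: the first `vl` lanes take
// `value`, the remaining lanes keep the contents of `inactive`.
Vec dup_vl(Vec inactive, std::size_t vl, double value) {
    if (vl > kLanes) {
        return inactive;  // assumed: pattern does not fit, so no lane is active
    }
    for (std::size_t i = 0; i < vl; ++i) {
        inactive[i] = value;
    }
    return inactive;
}

// Models insertelement: replace one lane, keep all others.
Vec insert_element(Vec vec, double value, std::size_t index) {
    assert(index < kLanes);
    vec[index] = value;
    return vec;
}
```

In this model, dup_vl(dup_vl(vec, 2, elem1), 1, elem0) first writes elem1 to lanes 0 and 1, then overwrites lane 0 with elem0, which leaves the same lanes as insert_element(insert_element(vec, elem1, 1), elem0, 0).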
Reference:
"Move data between Advanced SIMD (Neon) and SVE ACLE types" (KA004612)
https://developer.arm.com/documentation/ka004612/latest
I think getVectorMinNumElements can be used here.