Further improvement of the cost model for the scalars used in buildvector sequences. The main functionality is outlined into a separate function.
The cost is calculated in the following way:
- If the Base vector is not an undef vector, resize the very first mask to a common VF and perform the action for 2 input vectors (including the non-undef Base). The other shuffle masks are combined with the result of the first stage and processed as a shuffle of 2 vectors.
- If the Base is an undef vector and there is only 1 shuffle mask, perform the action for only 1 vector with the given mask, unless it is the identity mask.
- If more than 2 masks are used, perform a series of shuffle actions for 2 vectors, combining the masks properly between the steps.
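The three cases above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code from the patch: `Mask`, `ShuffleAction`, `resizeToVF`, `combineMasks` and `performShuffleActions` are hypothetical names, and the sketch makes three assumptions: `-1` marks an undefined lane, lanes taken from the second input are offset by VF, and with an undef Base and several masks the first two masks form the first 2-vector shuffle.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A shuffle mask; -1 marks an undefined lane (assumed convention).
using Mask = std::vector<int>;

// Hypothetical callback: invoked once per shuffle with the mask and the
// number of input vectors (1 or 2). A cost model would add a cost here.
using ShuffleAction = std::function<void(const Mask &, unsigned)>;

static bool isIdentityMask(const Mask &M) {
  for (std::size_t I = 0; I < M.size(); ++I)
    if (M[I] != -1 && M[I] != static_cast<int>(I))
      return false;
  return true;
}

// Pads a mask with undefined lanes so that it has exactly VF lanes.
static Mask resizeToVF(Mask M, std::size_t VF) {
  assert(M.size() <= VF && "mask is wider than the common VF");
  M.resize(VF, -1);
  return M;
}

// Builds a two-input mask: lanes defined in Second select from the second
// input (index + VF); all other lanes follow First (the first input).
static Mask combineMasks(const Mask &First, const Mask &Second,
                         std::size_t VF) {
  Mask Result = resizeToVF(First, VF);
  Mask Resized = resizeToVF(Second, VF);
  for (std::size_t I = 0; I < VF; ++I)
    if (Resized[I] != -1)
      Result[I] = Resized[I] + static_cast<int>(VF);
  return Result;
}

// Walks the masks as described in the list above and returns the number of
// shuffle actions that were performed.
static unsigned performShuffleActions(bool BaseIsUndef,
                                      const std::vector<Mask> &Masks,
                                      std::size_t VF,
                                      const ShuffleAction &Act) {
  if (Masks.empty())
    return 0;
  Mask Identity(VF);
  for (std::size_t I = 0; I < VF; ++I)
    Identity[I] = static_cast<int>(I);

  unsigned NumActions = 0;
  std::size_t Next = 0;
  if (!BaseIsUndef) {
    // Non-undef Base: resize the first mask and shuffle it with the Base.
    Act(combineMasks(Identity, Masks[0], VF), 2);
    NumActions = 1;
    Next = 1;
  } else if (Masks.size() == 1) {
    // Undef Base, single mask: one 1-vector shuffle unless it is identity.
    if (isIdentityMask(Masks[0]))
      return 0;
    Act(Masks[0], 1);
    return 1;
  } else {
    // Undef Base, several masks: the first two form the first shuffle.
    Act(combineMasks(Masks[0], Masks[1], VF), 2);
    NumActions = 1;
    Next = 2;
  }
  // Each remaining mask is combined with the result of the previous stage.
  for (; Next < Masks.size(); ++Next) {
    Act(combineMasks(Identity, Masks[Next], VF), 2);
    ++NumActions;
  }
  return NumActions;
}
```

In this sketch a cost model would accumulate the shuffle cost inside the callback, while a codegen path could pass a callback that emits the shuffle instead.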
The original implementation misses the very first analysis for the Base vector, so the cost might be too optimistic in some cases. But it improves the cost for the insertelements which are part of the current SLP graph.
Part of D107966.
Do you intend to use performExtractsShuffleAction more than once in the future? Otherwise some of these function_ref seem superfluous.