When estimating the cost of in-tree vectorized scalars in
buildvector sequences, we need to take the vectorized insertelement
instruction into account. The top of a buildvector sequence is the
topmost vectorized insertelement instruction, because it will have
more than one use after vectorization.
For the affected test case, this improves throughput from 21 to 16
(per llvm-mca).