If a vector cast gets split, it's quite possible that the resulting casts are legal and cheap.
So, instead of pessimistically assuming scalarization, we use the costs the concrete TTI provides for the split vector.
This looks like it does the right thing for AVX (a lot of overblown costs drop dramatically) - but I'm less sure about ARM.
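To make the idea concrete, here is a minimal standalone sketch of the costing change described above. It is a toy model, not the actual TTI code: `LegalCastCost`, `scalarizedCost`, and `splitCastCost` are hypothetical names, and the "one extract plus one insert per lane" scalarization overhead is an assumption chosen purely for illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the concrete target's cost hook: returns true and
// sets Cost when a cast over NumElts lanes is legal on the target.
using LegalCastCost = std::function<bool(unsigned NumElts, unsigned &Cost)>;

// Pessimistic model: every lane is extracted, cast as a scalar, and
// re-inserted (assumed overhead: 2 per lane, for illustration only).
unsigned scalarizedCost(unsigned NumElts, unsigned ScalarCastCost) {
  return NumElts * (ScalarCastCost + 2);
}

// Split model: if the cast is not legal at this width, halve the vector and
// ask the target about each half, falling back to scalarization only when
// no further split is possible.
unsigned splitCastCost(unsigned NumElts, unsigned ScalarCastCost,
                       const LegalCastCost &Target) {
  assert(NumElts > 0 && "vector must have at least one lane");
  unsigned Cost = 0;
  if (Target(NumElts, Cost))
    return Cost; // Legal as-is: trust the target's number.
  if (NumElts > 1 && NumElts % 2 == 0)
    return 2 * splitCastCost(NumElts / 2, ScalarCastCost, Target);
  return scalarizedCost(NumElts, ScalarCastCost);
}
```

Under these assumptions, a 16-lane cast on a target where casts of up to 4 lanes cost 1 comes out at 4 with the split model versus 48 with the scalarized one, which is the kind of drop mentioned for AVX.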
LGTM, but it looks to me like this should be adding 0 rather than increasing by 1?