We ask TTI.getAddressComputationCost() for the cost of computing a vector address,
and then multiply it by the vector width. This doesn't make any sense:
it implies that we'd do a vector GEP and then scalarize the vector of pointers,
but there is no such thing in the vectorized IR; we perform scalar GEPs.
This is *especially* bad on X86, and was effectively prohibiting any scalarized
vectorization of gathers/scatters, because X86TTIImpl::getAddressComputationCost()
says that the cost of a vector address computation is 10, as compared to 1 for scalar.
The computed costs are similar to the ones with D111222+D111220,
but we end up without masked memory intrinsics that we'd then have to
expand later on, without much luck. (D111363)
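
To make the cost mismatch described above concrete, here is a minimal standalone sketch. It is not LLVM code: the two constants simply mirror the 10-vs-1 figures quoted above for X86, and the helper names oldAddressCost/newAddressCost are hypothetical. It also assumes the fix amounts to charging the scalar address cost once per lane.

```cpp
#include <cassert>

// Assumed values, mirroring what the text says
// X86TTIImpl::getAddressComputationCost() reports.
constexpr unsigned VectorAddrCost = 10; // cost of a vector address computation
constexpr unsigned ScalarAddrCost = 1;  // cost of a scalar address computation

// Hypothetical helper: the old model, which charged the *vector* address
// cost once per lane, as if a vector GEP were emitted and then scalarized.
constexpr unsigned oldAddressCost(unsigned VF) { return VF * VectorAddrCost; }

// Hypothetical helper: the model matching the emitted IR, one scalar GEP
// per lane.
constexpr unsigned newAddressCost(unsigned VF) { return VF * ScalarAddrCost; }

// For VF = 4 the address overhead drops from 40 to 4.
static_assert(oldAddressCost(4) == 40, "old model: 4 lanes * 10");
static_assert(newAddressCost(4) == 4, "new model: 4 lanes * 1");
```

Under these assumptions the old model overstates the address cost by a factor of 10 at every vector width, which is consistent with scalarized gathers/scatters never looking profitable.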
It would be more productive if it included the reason why we can’t change it right away.
The lack of such an explanation led to the original version of the patch, which repeats D93129.