VectorCombine turns splats of scalar loads into a vector load and
splat if it can determine that reading the extra bytes won't page
fault, that the load isn't volatile or atomic, etc. I'm not sure how
useful this is for us. It's especially annoying because vrgather.vi
has an early clobber constraint, so this can force us to use an extra
register. If there happen to be splats of neighboring scalar
loads, VectorCombine seems to create multiple vector loads starting
from the address of each scalar.
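
To make the two forms concrete, here is a plain C++ model (not LLVM
code). The function names and the 4-lane width are assumptions made
for this sketch. Both functions return the same vector, but the
second one reads lanes that the splat never uses, which is why the
combine first has to prove the extra bytes are safe to read.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4; // assumed vector width for this sketch
using Vec = std::array<int, kLanes>;

// Original form: one scalar load, broadcast to every lane.
Vec splatOfScalarLoad(const int *p) {
  Vec v;
  v.fill(*p);
  return v;
}

// Form after the combine: load kLanes elements starting at p, then
// splat lane 0. Only valid when p[0..kLanes-1] is dereferenceable.
Vec vectorLoadThenSplat(const int *p) {
  Vec loaded;
  for (std::size_t i = 0; i < kLanes; ++i)
    loaded[i] = p[i]; // p[1..kLanes-1] are read but never used
  Vec v;
  v.fill(loaded[0]);
  return v;
}
```
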
This patch reverses the transform by turning it back into a splat of
a scalar load. I'm also peeking through concat_vectors because I saw
VectorCombine pad with undefs when the scalar was too close to the
end of an array to load the full vector size.
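
A rough sketch of the matching logic described above, using a toy
node model rather than LLVM's real SelectionDAG types. Everything
here is hypothetical and for illustration only: the `Node` struct,
the `Kind` enum, the `simple` flag, the lane bound check, and the
function name. The actual patch may check different conditions.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical, simplified node kinds; not LLVM's SelectionDAG.
enum class Kind { VectorLoad, ConcatWithUndef, Other };

struct Node {
  Kind kind;
  const Node *firstOperand = nullptr; // ConcatWithUndef: the defined part
  std::size_t baseOffset = 0;         // VectorLoad: element offset of lane 0
  std::size_t numLanes = 0;           // lanes produced by this node
  bool simple = true;                 // false models a volatile/atomic load
};

// If a splat of lane `lane` of `src` can become a splat of a scalar
// load, return the element offset that the scalar load should read.
std::optional<std::size_t> scalarLoadOffsetForSplat(const Node &src,
                                                    std::size_t lane) {
  const Node *n = &src;
  // Peek through padding of the form concat_vectors(load, undef, ...).
  if (n->kind == Kind::ConcatWithUndef && n->firstOperand)
    n = n->firstOperand;
  if (n->kind != Kind::VectorLoad || !n->simple)
    return std::nullopt;
  // Lanes beyond the loaded part are undef padding; do not rewrite.
  if (lane >= n->numLanes)
    return std::nullopt;
  return n->baseOffset + lane;
}
```

In this model a splat of lane 0 of a padded load maps back to a
scalar load at the vector load's own base address, which corresponds
to the case where each vector load starts at the address of its
scalar.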