This patch extends performMaskedGatherScatterCombine to find gathers
and scatters whose indices have a stride of two. These can be converted
to a pair of contiguous loads or stores, using zips and uzps with the
appropriate predicates.
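
For the gather case, the equivalence can be illustrated with a small scalar model. This is only a sketch and not the code in the patch: the lane count N, the helper names (gatherStride2, maskedLoad, zipLo, zipHi, uzp1, gatherViaLoads) and the choice to build the load predicates by zipping the gather predicate with an all-false predicate are assumptions made for illustration. Inactive lanes are modelled as zero.

```cpp
#include <array>
#include <cstddef>

// Hypothetical lane count for this model; real vectors are scalable.
constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Pred = std::array<bool, N>;

// Reference: masked gather with indices 0, 2, 4, ... (stride two).
Vec gatherStride2(const int *base, const Pred &p) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    if (p[i])
      r[i] = base[2 * i];
  return r;
}

// Masked contiguous load: inactive lanes are zero and never touch memory.
Vec maskedLoad(const int *base, const Pred &p) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    if (p[i])
      r[i] = base[i];
  return r;
}

// Interleave the low halves of a and b (models zip1 on predicates).
Pred zipLo(const Pred &a, const Pred &b) {
  Pred r{};
  for (std::size_t i = 0; i < N / 2; ++i) {
    r[2 * i] = a[i];
    r[2 * i + 1] = b[i];
  }
  return r;
}

// Interleave the high halves of a and b (models zip2 on predicates).
Pred zipHi(const Pred &a, const Pred &b) {
  Pred r{};
  for (std::size_t i = 0; i < N / 2; ++i) {
    r[2 * i] = a[N / 2 + i];
    r[2 * i + 1] = b[N / 2 + i];
  }
  return r;
}

// Take the even-indexed elements of the concatenation a:b (models uzp1).
Vec uzp1(const Vec &a, const Vec &b) {
  Vec r{};
  for (std::size_t i = 0; i < N / 2; ++i) {
    r[i] = a[2 * i];
    r[N / 2 + i] = b[2 * i];
  }
  return r;
}

// Stride-two gather expressed as two contiguous masked loads plus uzp1.
Vec gatherViaLoads(const int *base, const Pred &p) {
  Pred none{}; // all-false: the odd memory elements are never needed
  Vec lo = maskedLoad(base, zipLo(p, none));
  Vec hi = maskedLoad(base + N, zipHi(p, none));
  return uzp1(lo, hi);
}
```

In this model the two loads cover 2N consecutive elements, and the zipped predicates keep only the even elements whose gather lane was active, so gatherViaLoads matches gatherStride2 for any predicate.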
No performance improvement was found when using this combine for scatter
stores of 64-bit data, so the combine just returns SDValue() in that case.
nit: Can you fix all the formatting issues in your new code before merging, please?