This patch tries to improve vectorization of the following code:
void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}
Currently this code cannot be vectorized because the very first stored value is not a binary add like the other three, but just a load.
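
To illustrate why the mismatch matters, here is a minimal sketch, not taken from the patch itself. It assumes the plain load can be read as an add with 0, so that all four lanes share one opcode. The names add1_uniform, offsets and forms_agree are hypothetical and exist only for this example.

```c
#include <string.h>

/* Original form: lane 0 stores a plain load, lanes 1..3 store an add. */
void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}

/* Hypothetical uniform form (an assumption for illustration):
 * every lane is an add, and lane 0 simply adds 0. */
void add1_uniform(int * __restrict dst, const int * __restrict src) {
  static const int offsets[4] = {0, 1, 2, 3};
  for (int i = 0; i < 4; ++i)
    dst[i] = src[i] + offsets[i];
}

/* Returns 1 if both forms write the same four values for this input. */
int forms_agree(const int *src) {
  int a[4], b[4];
  add1(a, src);
  add1_uniform(b, src);
  return memcmp(a, b, sizeof a) == 0;
}
```

Both functions compute the same result. The second one only makes explicit that the four stores could be expressed as a single 4-wide add of src and the constant vector {0, 1, 2, 3}.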