This patch fixes the problem described in https://llvm.org/bugs/show_bug.cgi?id=23217
Loop unrolling in the loop vectorization pass has two kinds of benefits: 1. For a loop that needs to be both vectorized and unrolled, unrolling integrated with the loop vectorization pass can generate less prologue/epilogue code. 2. Unrolling in loop vectorization generates a memory boundary check for the unrolled loop version, which is useful for better scheduling on some architectures.
However, x86 performance is not very sensitive to compile-time scheduling. Unrolling in loop vectorization when VF==1 therefore introduces the extra cost of an overflow check and a memory boundary check, and sometimes extra prologue/epilogue code when the regular unroller unrolls the loop a second time. These costs are harmful for performance on x86.
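To make the extra cost concrete, here is a hand-written approximation of what a scalar loop looks like after being interleaved by 2 at VF==1. It is only a sketch: the function names are hypothetical, the overlap test stands in for the memory boundary check, and the overflow check is not modeled. It is not the code the vectorizer actually emits.

```cpp
#include <cstddef>
#include <cstdint>

// Plain scalar loop: the form the regular unroller would start from.
void addOne(int *dst, const int *src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i] + 1;
}

// Hypothetical illustration of the same loop interleaved by 2 with VF == 1.
void addOneInterleaved(int *dst, const int *src, std::size_t n) {
  // Memory boundary check: the interleaved body reorders loads and stores,
  // so it is only safe when the two ranges do not overlap.
  auto d = reinterpret_cast<std::uintptr_t>(dst);
  auto s = reinterpret_cast<std::uintptr_t>(src);
  std::size_t bytes = n * sizeof(int);
  bool overlap = d < s + bytes && s < d + bytes;

  std::size_t i = 0;
  if (!overlap) {
    // Interleaved body: two independent scalar iterations per trip.
    std::size_t evenEnd = n - n % 2;
    for (; i < evenEnd; i += 2) {
      int a = src[i];
      int b = src[i + 1];
      dst[i] = a + 1;
      dst[i + 1] = b + 1;
    }
  }
  // Scalar epilogue; also the fallback path when the check fails.
  for (; i < n; ++i)
    dst[i] = src[i] + 1;
}
```

The runtime check and the epilogue loop are paid for on every call, even though no vector instructions are used. If the regular unroller later unrolls the interleaved body again, it can add a further remainder loop on top.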
The patch disables unrolling in the loop vectorization pass when VF==1 on the x86 architecture by setting MaxInterleaveFactor to 1.
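The idea can be sketched as a target hook that reports the maximum interleave factor for a given VF. The function below is a standalone illustration, not the actual LLVM code: the name maxInterleaveFactor and the vectorDefault parameter are assumptions made for this example.

```cpp
// Hypothetical stand-in for a target cost-model hook. The name and the
// vectorDefault parameter are illustrative assumptions, not LLVM's API.
unsigned maxInterleaveFactor(unsigned VF, unsigned vectorDefault) {
  // VF == 1 means the loop stays scalar. Returning 1 tells the vectorizer
  // not to interleave, which leaves unrolling to the regular unroller and
  // avoids the overflow and memory boundary checks.
  if (VF == 1)
    return 1;
  // For vectorized loops the usual target limit still applies.
  return vectorDefault;
}
```

With this shape, loops that are actually vectorized keep their interleaving behavior, and only the scalar case changes.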
Performance is neutral for SPEC2000. On Google internal benchmarks, detection improved by 5% on Sandy Bridge and 9% on Westmere, and saw improved by 1.5% on both platforms.