LoopVectorizer creates a loop guard based on the SCEV expression for the ExitCount, and a complex ExitCount introduces redundant instructions. This patch simplifies the ExitCount by offsetting the trip count and the induction variable, so that fewer redundant conditions remain after vectorization (e.g., the llvm.smax.i64 call shown in the test).
Before the optimization:

    int initial = 0;
    for (int i = 0; i < nOut; i++) {
      double temp_value = 0.0;
      for (int j = initial; j < initial + nIn; j++) {
        temp_value += values[j] * x[idx[j]];
      }
      initial += nIn;
      b[i] = temp_value;
    }
After the optimization:

    int initial = 0;
    for (int i = 0; i < nOut; i++) {
      double temp_value = 0.0;
      for (int k = 0; k < nIn; k++) {
        temp_value += values[k + initial] * x[idx[k + initial]];
      }
      initial += nIn;
      b[i] = temp_value;
    }
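As a sanity check on the rewrite, here is a minimal source-level sketch comparing the two loop forms. It is not the patch itself, which operates on LLVM IR. The function names rowSumsBefore and rowSumsAfter and the use of std::vector are assumptions made for illustration only; the loop bodies follow the snippets in this description.

```cpp
#include <vector>

// Original form: the inner induction variable j starts at `initial`,
// so the exit condition compares against `initial + nIn`.
std::vector<double> rowSumsBefore(const std::vector<double> &values,
                                  const std::vector<int> &idx,
                                  const std::vector<double> &x,
                                  int nOut, int nIn) {
  std::vector<double> b(nOut);
  int initial = 0;
  for (int i = 0; i < nOut; i++) {
    double temp_value = 0.0;
    for (int j = initial; j < initial + nIn; j++)
      temp_value += values[j] * x[idx[j]];
    initial += nIn;
    b[i] = temp_value;
  }
  return b;
}

// Offset form: the inner induction variable k starts at 0 and the
// offset `initial` moves into the address computation, so the exit
// condition compares against `nIn` alone.
std::vector<double> rowSumsAfter(const std::vector<double> &values,
                                 const std::vector<int> &idx,
                                 const std::vector<double> &x,
                                 int nOut, int nIn) {
  std::vector<double> b(nOut);
  int initial = 0;
  for (int i = 0; i < nOut; i++) {
    double temp_value = 0.0;
    for (int k = 0; k < nIn; k++)
      temp_value += values[k + initial] * x[idx[k + initial]];
    initial += nIn;
    b[i] = temp_value;
  }
  return b;
}
```

Both functions visit the same elements in the same order, so they should produce identical results whenever initial + nIn does not overflow. Only the shape of the inner loop's bounds differs.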
L->isInnermost()