Hi Hal,

Please find attached a patch that enables interchange of loops containing reductions. The logic for detecting reductions and inductions is borrowed from the loop vectorizer.

With this change, we can now interchange matrix multiplication code such as the following:

for (int i = 1; i < 2048; i++)
  for (int j = 1; j < 2048; j++)
    for (int k = 1; k < 2048; k++)
      A[i][j] += B[i][k] * C[k][j];

into:

for (int k = 1; k < 2048; k++)
  for (int i = 1; i < 2048; i++)
    for (int j = 1; j < 2048; j++)
      A[i][j] += B[i][k] * C[k][j];

which now gets vectorized.

We observe a ~3x improvement in execution time for the above code.
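
For reference, here is a minimal standalone sketch (not part of the patch) that checks that the two loop orders above compute the same result. The size N, the helper names and the fill values are made up for illustration, and N is kept small so the check runs quickly.

```c
#include <string.h>

#define N 64 /* hypothetical size, much smaller than the 2048 used above */

static int A1[N][N], A2[N][N], B[N][N], C[N][N];

/* Original order: i, j, k. The innermost k loop is a reduction into A[i][j]. */
static void matmul_ijk(int A[N][N]) {
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            for (int k = 1; k < N; k++)
                A[i][j] += B[i][k] * C[k][j];
}

/* Interchanged order: k, i, j. The innermost j loop walks A and C row-wise. */
static void matmul_kij(int A[N][N]) {
    for (int k = 1; k < N; k++)
        for (int i = 1; i < N; i++)
            for (int j = 1; j < N; j++)
                A[i][j] += B[i][k] * C[k][j];
}

/* Returns 1 if both loop orders produce identical output, 0 otherwise. */
int interchange_matches(void) {
    /* Arbitrary small fill values, chosen so the sums stay well inside int. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            B[i][j] = i + j;
            C[i][j] = i - j;
        }
    memset(A1, 0, sizeof A1);
    memset(A2, 0, sizeof A2);
    matmul_ijk(A1);
    matmul_kij(A2);
    return memcmp(A1, A2, sizeof A1) == 0;
}
```

Calling interchange_matches() should return 1. Each A[i][j] accumulates its terms in the same ascending k order in both versions, so the results match exactly. In the interchanged form, the innermost j loop reads C[k][j] and updates A[i][j] with unit stride while B[i][k] stays invariant, which is presumably what makes the loop amenable to vectorization.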

Please let me know your feedback on the patch.

Thanks and regards,

Karthik Bhat