Hi Hal,
Please find attached the patch to enable interchange of loops that contain reductions. The logic to detect a reduction/induction is borrowed from the loop vectorizer code.
With this change we are now able to interchange matrix multiplication code such as the one below:

    for (int i = 1; i < 2048; i++)
      for (int j = 1; j < 2048; j++)
        for (int k = 1; k < 2048; k++)
          A[i][j] += B[i][k] * C[k][j];

into:

    for (int k = 1; k < 2048; k++)
      for (int i = 1; i < 2048; i++)
        for (int j = 1; j < 2048; j++)
          A[i][j] += B[i][k] * C[k][j];

which now gets vectorized.
We observe a ~3X execution time improvement in the above code.
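
For reference, the equivalence of the two loop orders can be checked with a small standalone C sketch. It is only an illustration, not part of the patch: the size N = 64, the fill values, and the names matmul_ijk, matmul_kij and interchange_matches are all assumptions made for this example, and integer elements are used so the comparison is exact.

```c
#include <string.h>

#define N 64 /* assumed small size for a quick check; the example above uses 2048 */

static int B[N][N], C[N][N];
static int A_ijk[N][N], A_kij[N][N];

/* Original order: the reduction over k is the innermost loop. */
static void matmul_ijk(int A[N][N]) {
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            for (int k = 1; k < N; k++)
                A[i][j] += B[i][k] * C[k][j];
}

/* Interchanged order: k is outermost, so the innermost j loop walks
 * A[i][j] and C[k][j] contiguously while B[i][k] stays invariant. */
static void matmul_kij(int A[N][N]) {
    for (int k = 1; k < N; k++)
        for (int i = 1; i < N; i++)
            for (int j = 1; j < N; j++)
                A[i][j] += B[i][k] * C[k][j];
}

/* Returns 1 if both loop orders produce identical results, 0 otherwise. */
static int interchange_matches(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            B[i][j] = i + j; /* arbitrary illustrative inputs */
            C[i][j] = i - j;
        }
    memset(A_ijk, 0, sizeof A_ijk);
    memset(A_kij, 0, sizeof A_kij);
    matmul_ijk(A_ijk);
    matmul_kij(A_kij);
    return memcmp(A_ijk, A_kij, sizeof A_ijk) == 0;
}
```

Calling interchange_matches() from a main function should return 1, since each A[i][j] accumulates the same products over k in both versions; only the traversal order of the iteration space differs.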
Please let me know your thoughts on this.
Thanks and Regards
Karthik Bhat