The new pass manager (NPM) runs loop passes over loops in forward
program order, rather than the legacy loop pass manager's reverse
program order. This seems to produce better results, as shown here.
I verified that changing the NPM's loop order to reverse program order
results in the same IR as the legacy PM.