Currently LSR runs before PPCCTRLoops. This order causes a problem for the icmp inside a loop that compares the iteration index with the loop trip count. LSR always treats that icmp as a valid ICmpZero-type LSRUse. That is not always correct, because the later PPCCTRLoops pass may replace the icmp with the CTR loop instruction bdnz. As a result, this pass order can produce suboptimal code.
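For illustration only (this is not taken from the patch), a loop of roughly the following shape is the kind of case described here: the compare of the index against the trip count is used only by the exit branch, so on PowerPC it is a candidate for conversion to a CTR-based loop ending in bdnz. The function name and types are hypothetical, and whether the conversion actually happens depends on the compiler version and options.

```c
#include <stddef.h>

/* Hypothetical example: a simple counted loop.
 * The exit test (i < n) compares the iteration index with the trip
 * count and has no other use. A compiler targeting PowerPC may load
 * the trip count into CTR and end the loop with bdnz instead of
 * keeping the compare, in which case the compare no longer needs to
 * be considered when choosing induction variables. */
long sum_array(const int *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```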
Reordering the passes so that PPCCTRLoops runs before LSR lets LSR know precisely whether the icmp was replaced or kept by PPCCTRLoops.
This improves most SPEC CPU2017 benchmarks on Power9. The biggest gain is xz, about 3.5%. No degradation was found.