isProfitableChain was written on the assumption that register count is the major cost. However, some targets, such as X86 and PowerPC, now treat instruction count as the major cost.
On these targets, isProfitableChain wrongly eliminates some LSRUses based on register count while chains/LSRUses are being collected. The eliminated LSRUses may have a large effect on the instruction count in the later cost-rating phase, so we end up with a suboptimal loop code sequence.
This patch improves some cpu2017 benchmarks on PowerPC.
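To make the intent concrete, here is a minimal, self-contained sketch of the idea. It is a toy model, not the actual LLVM code: `MajorCost`, `ChainElement`, `Chain`, and `isProfitableChainSketch` are hypothetical names invented for illustration, and the register heuristic is reduced to a single precomputed number. It assumes that a chain reported as profitable has its uses dropped from the later cost rating, as described above.

```cpp
#include <vector>

// Hypothetical stand-in for the target's cost preference; not an LLVM type.
enum class MajorCost { NumRegs, NumInsts };

// Toy model of one element of an IV chain.
struct ChainElement {
  // Stand-in for the answer a per-element target hook such as
  // isProfitableLSRChainElement would give for this element.
  bool TargetPrefersChaining;
};

// Toy model of a chain; RegisterCost < 0 means the chain saves registers.
struct Chain {
  std::vector<ChainElement> Elements;
  int RegisterCost;
};

// Returns true if the chain should be kept as a chain.
bool isProfitableChainSketch(const Chain &C, MajorCost Major) {
  // Ask the per-element hook first, so that the cost-preference gate
  // below cannot override a target's explicit request to chain.
  for (const ChainElement &E : C.Elements)
    if (E.TargetPrefersChaining)
      return true;

  // The register-count heuristic only makes sense when register count
  // is the major cost. Otherwise report "not profitable" so the uses
  // stay visible to the later cost-rating phase.
  if (Major != MajorCost::NumRegs)
    return false;

  return C.RegisterCost < 0;
}
```

With this ordering, a chain that would save registers is still rejected when the target rates by instruction count, unless one of its elements is explicitly marked as worth chaining.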
This could interfere with the calls to isProfitableLSRChainElement, which isn't specifically concerned with register usage either. It will take more cycles to process, but how about moving this to after the for loop below?