Because this optimization replaces two loads with one load plus two shift instructions,
it could hurt performance in loops that are arithmetic-intensive.
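To make the trade-off concrete, here is a minimal source-level sketch of the shape of the transformation. It is not code from the patch, which works on machine instructions. The function names `load_two_narrow` and `load_one_wide` are hypothetical, and a little-endian target is assumed.

```cpp
#include <cstdint>
#include <cstring>

// Before: two separate 16-bit loads from adjacent addresses.
inline void load_two_narrow(const unsigned char *p, uint16_t &lo, uint16_t &hi) {
    std::memcpy(&lo, p, sizeof lo);      // first narrow load
    std::memcpy(&hi, p + 2, sizeof hi);  // second narrow load
}

// After: one 32-bit load followed by two extract operations.
// Assumption: little-endian layout, so the lower-addressed halfword
// ends up in the low 16 bits of the wide value.
inline void load_one_wide(const unsigned char *p, uint16_t &lo, uint16_t &hi) {
    uint32_t wide;
    std::memcpy(&wide, p, sizeof wide);          // single wide load
    lo = static_cast<uint16_t>(wide & 0xFFFFu);  // extract low halfword
    hi = static_cast<uint16_t>(wide >> 16);      // extract high halfword
}
```

The second form trades one memory operation for two extra ALU operations, which is why a loop already limited by arithmetic throughput might not benefit.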
Repository: rL LLVM
Based on our results and feedback from our SD colleagues, I'm fine with approving this patch. I know the performance results were neutral for Spec2006. Did you do any additional testing on Spec2000 or EEMBC by chance?
Approving, but feel free to wait for feedback from Tim, James, or others before committing.
Hi Jun,
Have you considered deciding this as a MachineCombiner pattern? This would be a good place to know if the loop is arithmetic or load/store heavy.
Cheers,
James
> Have you considered deciding this as a MachineCombiner pattern? This would be a good place to know if the loop is arithmetic or load/store heavy.
Let me take a look at whether we can move this to MachineCombiner.
Thanks James!
I ran EEMBC multiple times because my scores were somewhat unstable. Overall, I wasn't able to see any reproducible regressions. Please feel free to run performance tests and share your results. I will commit this at the end of this week if there are no objections.
> Have you considered deciding this as a MachineCombiner pattern? This would be a good place to know if the loop is arithmetic or load/store heavy.
I think MachineCombiner is also a good place to perform this optimization, with minor changes to the profitability check. As of now, however, I don't have any cases that are impacted by this optimization, so I will deprioritize this until I can find some.