Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from sinking the addressing mode computation of memory instructions back to its uses. The problem comes from the insertion of PHIs, which confuse CGP and make it bail.
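To illustrate the shape of the problem, here is a minimal sketch in C. It is a hypothetical reduction, not code taken from sqlite, and the function name `bump_if` is made up for this example. The address of `a[i + 1]` is computed on the conditional path and again at the join, so it is only partially redundant there. A PRE pass may make it fully available by inserting the computation on the other path and merging the two copies with a PHI, which leaves the final store addressing memory through a PHI rather than through a GEP.

```c
/* Hypothetical reduction (not from sqlite): the address &a[i + 1] is
 * needed on the conditional path and again at the join point. */
void bump_if(long *a, long i, int c) {
  long r = 0;
  if (c)
    r = a[i + 1];   /* address computed only on this path */
  a[i + 1] = r + 1; /* same address at the join: partially redundant */
}
```

Whether the PHI is actually created depends on the pipeline and the target, so the generated IR should be inspected rather than assumed.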
I found this problem when looking at the sqlite amalgamation from https://www.sqlite.org/download.html.
We could teach CGP to look through PHI nodes in FindAllMemoryUses, but this would increase compilation time (scanning is currently limited to 20 memory instructions, and sqlite needs 6 times more). Moreover, CGP still wouldn't be able to handle GEPs that have a different base and offset but correspond to the same Value Number (as in the regression test).
This looks good for performance and code size. I am posting some performance numbers targeting Cortex-A57 AArch64, reported by LNT for llvm-test-suite, spec2000, and spec2006 at -O3, using a recent LLVM trunk revision with my patch applied.
Performance Improvements - execution_time
MultiSource/Benchmarks/FreeBench/mason/mason -15.28%
External/SPEC/CINT2000/253.perlbmk/253.perlbmk -4.07%
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk -3.38%
External/SPEC/CINT2006/401.bzip2/401.bzip2 -2.82%
MultiSource/Benchmarks/Olden/em3d/em3d -2.81%
SingleSource/Benchmarks/Shootout-C++/Shootout-C++-heapsort -2.67%
SingleSource/Benchmarks/Shootout/Shootout-heapsort -2.24%
MultiSource/Benchmarks/Bullet/bullet -1.37%
SingleSource/Benchmarks/Adobe-C++/stepanov_vector -1.15%
Performance Regressions - execution_time
External/SPEC/CINT2006/400.perlbench/400.perlbench 1.45%
Performance Improvements - mem_bytes
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan -2.68%
MultiSource/Benchmarks/Olden/tsp/tsp -2.14%
MultiSource/Benchmarks/FreeBench/mason/mason -1.27%
The version of this file that I see in trunk doesn't have these autogenerated check lines. Is there intended to be a commit before this that adjusts the test?