This patch moves the LowerConstantIntrinsics pass so that it runs earlier
in the pipeline than it does today.
My main motivation for running it earlier is to improve codegen for code
using __builtin___mem*_chk / __builtin_object_size combinations.
At the moment, we miss out on various memory-related optimizations when
such builtins are involved, because they are not transformed to
llvm.memset/llvm.memcpy calls early enough.
By running it before DSE/MemCpyOpt, we will be able to perform DSE on
code such as the following:

    void memset_chk(int** ptr, int *foo) {
      __builtin___memset_chk (ptr, 0, 128, __builtin_object_size (ptr, 0));
      ptr[1] = foo;
      ptr[2] = ((int *)0);
    }

    void use(char*);

    void memcpy_chk(int* ptr, int *bar) {
      char buf[128];
      use(&buf[0]);
      bar[0] = 10;
      __builtin___memcpy_chk (bar, &buf[0], 128, __builtin_object_size (bar, 0));
    }

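As I read the memset_chk example, once the object-size query folds to
"unknown" and the _chk call becomes a plain memset, the later store of a
null pointer to ptr[2] rewrites bytes that the memset has already zeroed,
so DSE can drop it. The sketch below is a hand-written C illustration of
that outcome, not compiler output. memset_after_dse is a hypothetical
name, and the sketch assumes a null pointer is represented as all-zero
bytes, as on common targets.

```c
#include <stddef.h>
#include <string.h>

/* Original form, as in the patch description. */
void memset_chk(int **ptr, int *foo) {
    __builtin___memset_chk(ptr, 0, 128, __builtin_object_size(ptr, 0));
    ptr[1] = foo;
    ptr[2] = (int *)0;
}

/* Hypothetical equivalent after lowering and DSE (illustration only):
   the checked call is a plain memset, and the redundant null store to
   ptr[2] is gone because the memset already zeroed those bytes. */
void memset_after_dse(int **ptr, int *foo) {
    memset(ptr, 0, 128);
    ptr[1] = foo;
}
```

Both functions leave the 128-byte buffer in the same state, which is
what makes removing the extra store legal.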
It would be great if people familiar with the pass could chime in if
there are any concerns about adjusting the placement. I *think* the
placement chosen here should not impact the LowerConstantIntrinsics pass
results too much, as most reasoning/simplification passes have run already.
Note that the placement for -O1 is a bit odd, because SCCP gets run
*after* MemCpyOpt, not before as for other optimization levels. For -O1
the pass is placed after SCCP, because SCCP can help simplify the IR
feeding LowerConstantIntrinsics.
I collected stats for building SPEC2006, SPEC2017 and MultiSource with
-O3 and a stdlib that uses the _chk builtins. The results are below
and show a few notable improvements with respect to memory
optimizations.

Metric: memcpyopt.NumCpyToSet
Program                                          base     patch      diff
test-suite...6.blender_r/526.blender_r.test     41.00     42.00      2.4%
Geomean difference                                                   2.4%

Metric: memcpyopt.NumMemCpyInstr
Program                                          base     patch      diff
test-suite.../CINT2006/403.gcc/403.gcc.test     48.00     51.00      6.2%
Geomean difference                                                   6.2%

Metric: memcpyopt.NumMemSetInfer
Program                                          base     patch      diff
test-suite...lications/sqlite3/sqlite3.test     39.00    136.00    248.7%
test-suite...plications/d/make_dparser.test      1.00      2.00    100.0%
test-suite...ications/JM/ldecod/ldecod.test     17.00     22.00     29.4%
test-suite...6/482.sphinx3/482.sphinx3.test      9.00     10.00     11.1%
test-suite...chmarks/MallocBench/gs/gs.test     10.00     11.00     10.0%
test-suite...nsumer-lame/consumer-lame.test     13.00     14.00      7.7%
test-suite...0.perlbench/400.perlbench.test     39.00     41.00      5.1%
test-suite...ications/JM/lencod/lencod.test     56.00     58.00      3.6%
test-suite...017rate/557.xz_r/557.xz_r.test     37.00     38.00      2.7%
test-suite...lications/ClamAV/clamscan.test     44.00     45.00      2.3%
test-suite...7rate/502.gcc_r/502.gcc_r.test    605.00    617.00      2.0%
test-suite...ate/525.x264_r/525.x264_r.test     71.00     72.00      1.4%
test-suite...6.blender_r/526.blender_r.test    751.00    758.00      0.9%
test-suite.../CINT2006/403.gcc/403.gcc.test    148.00    149.00      0.7%
test-suite...marks/7zip/7zip-benchmark.test    416.00    417.00      0.2%
Geomean difference                                                  19.4%

Metric: dse.NumFastOther
Program                                          base     patch      diff
test-suite...lications/viterbi/viterbi.test      0.00     12.00      inf%
test-suite...lications/sqlite3/sqlite3.test     12.00     43.00    258.3%
test-suite...T2006/445.gobmk/445.gobmk.test      4.00      7.00     75.0%
test-suite...0.perlbench/400.perlbench.test     49.00     75.00     53.1%
test-suite...abench/jpeg/jpeg-6a/cjpeg.test      4.00      5.00     25.0%
test-suite...chmarks/MallocBench/gs/gs.test      9.00     11.00     22.2%
test-suite...nsumer-lame/consumer-lame.test      7.00      8.00     14.3%
test-suite...plications/d/make_dparser.test     12.00     13.00      8.3%
test-suite...nsumer-jpeg/consumer-jpeg.test     13.00     14.00      7.7%
test-suite...7rate/502.gcc_r/502.gcc_r.test    258.00    271.00      5.0%
test-suite...rlbench_r/500.perlbench_r.test     42.00     44.00      4.8%
test-suite...pplications/oggenc/oggenc.test     22.00     23.00      4.5%
test-suite...6.blender_r/526.blender_r.test    503.00    519.00      3.2%
test-suite.../CINT2006/403.gcc/403.gcc.test     78.00     80.00      2.6%

Metric: dse.NumFastStores
Program                                          base     patch      diff
test-suite...chmarks/MallocBench/gs/gs.test     55.00     59.00      7.3%
test-suite...pplications/oggenc/oggenc.test    129.00    132.00      2.3%
test-suite...7rate/502.gcc_r/502.gcc_r.test   1491.00   1524.00      2.2%
test-suite...rlbench_r/500.perlbench_r.test    235.00    238.00      1.3%
test-suite...lications/ClamAV/clamscan.test    214.00    216.00      0.9%
test-suite...0.perlbench/400.perlbench.test    177.00    178.00      0.6%
test-suite...6.blender_r/526.blender_r.test   3683.00   3703.00      0.5%
test-suite...ate/525.x264_r/525.x264_r.test    410.00    412.00      0.5%

Metric: dse.NumRedundantStores
Program                                          base     patch      diff
test-suite...lications/viterbi/viterbi.test      1.00     15.00   1400.0%
test-suite...0.perlbench/400.perlbench.test      3.00     17.00    466.7%
test-suite...lications/sqlite3/sqlite3.test      7.00     27.00    285.7%
test-suite...pplications/oggenc/oggenc.test      8.00     18.00    125.0%
test-suite...017rate/557.xz_r/557.xz_r.test      1.00      2.00    100.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test      1.00      2.00    100.0%
test-suite...7rate/502.gcc_r/502.gcc_r.test     14.00     22.00     57.1%
test-suite...nsumer-jpeg/consumer-jpeg.test      2.00      3.00     50.0%
test-suite...plications/d/make_dparser.test     22.00     32.00     45.5%
test-suite...rlbench_r/500.perlbench_r.test      4.00      5.00     25.0%
test-suite.../CINT2006/403.gcc/403.gcc.test     11.00     13.00     18.2%
test-suite...6.blender_r/526.blender_r.test     79.00     90.00     13.9%
test-suite...nsumer-lame/consumer-lame.test      8.00      9.00     12.5%
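
The diff column above is consistent with a relative change of
(patch - base) / base, with a zero base reported as inf. That formula is
an assumption inferred from the numbers, not something the output states.
A minimal sketch, using a hypothetical helper name:

```c
#include <math.h>   /* INFINITY macro only; no libm functions are called */

/* Hypothetical helper: relative change from base to patch, in percent.
   Assumes non-negative counter values; a zero base with a non-zero
   patch value is reported as infinity, matching the "inf%" rows. */
double rel_diff_percent(double base, double patch) {
    if (base == 0.0)
        return patch == 0.0 ? 0.0 : INFINITY;
    return (patch - base) / base * 100.0;
}
```

For the sqlite3 row of memcpyopt.NumMemSetInfer, rel_diff_percent(39, 136)
comes out at roughly 248.7.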