This is early work, posted here to track changes. Feedback is most welcome.
This patch implements Global Code Motion (GCM), a compiler optimization that schedules congruent
instructions across the program. It is an extension of GVNHoist. Besides saving code size, GCM exposes
redundancies in some cases, exposes more instruction-level parallelism in the
basic block to which instructions are moved, and enables other passes such as
loop-invariant code motion to remove more redundancies. The cost model that drives the
code motion is based on liveness analysis on the SSA representation, so that
(virtual) register pressure does not increase. This results in 2% fewer spills on
the SPEC-2006 benchmark suite when compiled for x86_64-linux.
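As a source-level illustration of the hoisting case only, the sketch below shows two congruent multiplications, one in each arm of a branch, and the equivalent function after one copy is hoisted to the common dominator and the other is removed. It is hand-written for this description and is not output of the pass; the function names `before` and `after` are hypothetical.

```cpp
// Before: both arms of the branch compute the congruent expression a * b.
int before(bool cond, int a, int b, int c) {
  if (cond) {
    int t = a * b;  // congruent instruction, first copy
    return t + c;
  } else {
    int t = a * b;  // congruent instruction, second copy
    return t - c;
  }
}

// After: a single a * b sits in the block that dominates both arms.
// In terms of the pass statistics, one instruction is hoisted and the
// now-redundant copies are removed.
int after(bool cond, int a, int b, int c) {
  int t = a * b;  // hoisted copy
  return cond ? t + c : t - c;
}
```

Both functions return the same value for every input, but the second contains one multiplication instead of two, which is where the code size saving comes from.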
Experimental results show a 1% reduction in total compilation time on SPEC. GCM enables more
inlining and exposes more loop-invariant code motion opportunities in the majority
of the benchmarks. We have also seen execution-time improvements in a few
SPEC benchmarks, namely mcf (3%) and sjeng (2%).
Stats on llvm-testsuite:
```
2854 instructions hoisted
2867 instructions removed
1361 loads hoisted
1369 loads removed
74 stores hoisted
74 stores removed
10 instructions sunk
```
Codesize measurements:
```
python3 ../utils/compare.py --filter-short --metric=size..text results_gvnhoist.json results_gvnhoist_base.json
Tests: 200
Metric: size..text
Program                                      results_gvnhoist  results_gvnhoist_base  diff
test-suite...ks/VersaBench/8b10b/8b10b.test  1122              1218                   8.6%
test-suite...Source/Benchmarks/sim/sim.test  16130             16658                  3.3%
test-suite...ve-susan/automotive-susan.test  26338             26994                  2.5%
test-suite...oxyApps-C/XSBench/XSBench.test  13378             13698                  2.4%
test-suite...oxyApps-C/RSBench/rsbench.test  20946             21282                  1.6%
test-suite...nchmarks/McCat/18-imp/imp.test  12770             12946                  1.4%
test-suite...rks/tramp3d-v4/tramp3d-v4.test  804082            814498                 1.3%
test-suite...langs-C/unix-tbl/unix-tbl.test  31954             32338                  1.2%
test-suite...enchmarks/Olden/em3d/em3d.test  4370              4418                   1.1%
test-suite...ks/Prolangs-C/cdecl/cdecl.test  16194             16354                  1.0%
test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test  21330             21506                  0.8%
test-suite...nchmarks/McCat/09-vor/vor.test  9522              9586                   0.7%
test-suite...marks/SciMark2-C/scimark2.test  13090             13170                  0.6%
test-suite...s/FreeBench/neural/neural.test  7874              7826                   -0.6%
test-suite.../Trimaran/enc-rc4/enc-rc4.test  2818              2834                   0.6%
Geomean difference                                                                    0.2%
```
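A note on reading the diff column: the integer sizes listed above are consistent with the right-hand column divided by the left-hand column, minus one. Under that reading, a positive diff means the build with this patch (left column) is smaller than the base. This is inferred from the numbers in the table, not taken from the `compare.py` source, and the helper name `diffPercent` below is hypothetical.

```cpp
// Relative difference in percent, assuming diff = rhs / lhs - 1,
// where lhs is the results_gvnhoist value and rhs is the
// results_gvnhoist_base value.
double diffPercent(double lhs, double rhs) {
  return (rhs / lhs - 1.0) * 100.0;
}
```

For the 8b10b row, `diffPercent(1122, 1218)` is about 8.56, which rounds to the 8.6% shown. For the neural row, `diffPercent(7874, 7826)` is negative, matching the one size regression in the list.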
Performance measurements (Ubuntu 17.10, Intel(R) Core(TM) i7-4770 CPU, 8x 3.40GHz, frequency scaling disabled):
```
test-suite/build$ python3 ../utils/compare.py --filter-short results_gvnhoist.json results_gvnhoist_base.json
Tests: 200
Short Running: 114 (filtered out)
Remaining: 86
Metric: exec_time
Program                                      results_gvnhoist  results_gvnhoist_base  diff
test-suite...hmarks/VersaBench/bmm/bmm.test  1.51              1.26                   -17.0%
test-suite...mbolics-flt/Symbolics-flt.test  0.69              0.74                   6.3%
test-suite...ce/Benchmarks/Olden/bh/bh.test  0.93              0.99                   6.2%
test-suite...lications/SIBsim4/SIBsim4.test  1.87              1.76                   -6.2%
test-suite...mbolics-dbl/Symbolics-dbl.test  1.96              1.86                   -5.3%
test-suite...lications/sqlite3/sqlite3.test  1.44              1.49                   3.7%
test-suite...ing-dbl/Equivalencing-dbl.test  1.37              1.32                   -3.6%
test-suite...nchmarks/llubenchmark/llu.test  3.93              3.81                   -3.1%
test-suite.../Applications/spiff/spiff.test  1.00              1.04                   3.1%
test-suite...ow-dbl/GlobalDataFlow-dbl.test  2.12              2.18                   2.8%
test-suite...CI_Purple/SMG2000/smg2000.test  1.33              1.37                   2.8%
test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test  4.45              4.57                   2.7%
test-suite...ow-flt/GlobalDataFlow-flt.test  0.86              0.88                   2.5%
test-suite...lications/obsequi/Obsequi.test  1.03              1.05                   2.3%
test-suite...Source/Benchmarks/sim/sim.test  2.27              2.31                   1.8%
```
TODO:
Investigate regressions: https://github.com/google/hashtable-benchmarks
Potential bugs: https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-hoist&list_id=173451