Page MenuHomePhabricator

Global code motion of congruent computations
Needs ReviewPublic

Authored by hiraditya on Apr 17 2017, 2:34 PM.

Details

Reviewers
sebpop
Summary

This is an early work here for tracking changes. Feedback are most welcome.

This patch implements Global Code Motion (GCM) compiler optimization which schedules congruent
instructions across the program. This is an extension of GVNHoist. Not only GCM saves code size, it exposes
redundancies in some cases, it exposes more instruction level parallelism in the
basic-block to which instructions are moved, and it enables other passes like
loop invariant motion to remove more redundancies. The cost model to drive the
code motion is based on liveness analysis on SSA representation such that the
(virtual) register pressure does not increase resulting in 2% fewer spills on
the SPEC-2006 benchmark suite when compiled for x86_64-linux.

The experimental results show reduction in the total compilation time by 1% on SPEC. GCM enables more
inlining and exposes more loop invariant code motion opportunities in majority
of the benchmarks. We have also seen execution time improvements in a few of
SPEC benchmarks viz. mcf (3%) and sjeng(2%).

Stats on llvm-testsuite:

2854 instructions hoisted
2867 instructions removed
1361 loads hoisted
1369 loads removed
74 stores hoisted
74 stores removed
10 instructions sunk

Codesize measurements:

python3 ../utils/compare.py --filter-short --metric=size..text results_gvnhoist.json results_gvnhoist_base.json 
Tests: 200
Metric: size..text

Program                                        results_gvnhoist results_gvnhoist_base diff 
 test-suite...ks/VersaBench/8b10b/8b10b.test    1122             1218                  8.6%
 test-suite...Source/Benchmarks/sim/sim.test    16130            16658                 3.3%
 test-suite...ve-susan/automotive-susan.test    26338            26994                 2.5%
 test-suite...oxyApps-C/XSBench/XSBench.test    13378            13698                 2.4%
 test-suite...oxyApps-C/RSBench/rsbench.test    20946            21282                 1.6%
 test-suite...nchmarks/McCat/18-imp/imp.test    12770            12946                 1.4%
 test-suite...rks/tramp3d-v4/tramp3d-v4.test    804082           814498                1.3%
 test-suite...langs-C/unix-tbl/unix-tbl.test    31954            32338                 1.2%
 test-suite...enchmarks/Olden/em3d/em3d.test    4370             4418                  1.1%
 test-suite...ks/Prolangs-C/cdecl/cdecl.test    16194            16354                 1.0%
 test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test    21330            21506                 0.8%
 test-suite...nchmarks/McCat/09-vor/vor.test    9522             9586                  0.7%
 test-suite...marks/SciMark2-C/scimark2.test    13090            13170                 0.6%
 test-suite...s/FreeBench/neural/neural.test    7874             7826                 -0.6%
 test-suite.../Trimaran/enc-rc4/enc-rc4.test    2818             2834                  0.6%
 Geomean difference                                                                    0.2%

Performance measurements: (Ubuntu 17.10 Intel(R) Core(TM) i7-4770 CPU 8x 3.40GHz with frequency scaling disabled)

test-suite/build$ python3 ../utils/compare.py --filter-short results_gvnhoist.json results_gvnhoist_base.json 
Tests: 200
Short Running: 114 (filtered out)
Remaining: 86
Metric: exec_time

Program                                        results_gvnhoist results_gvnhoist_base diff  
 test-suite...hmarks/VersaBench/bmm/bmm.test     1.51             1.26                -17.0% // This is bogus as the final binary for both base and diff were same. will rerun.
 test-suite...mbolics-flt/Symbolics-flt.test     0.69             0.74                 6.3% 
 test-suite...ce/Benchmarks/Olden/bh/bh.test     0.93             0.99                 6.2% 
 test-suite...lications/SIBsim4/SIBsim4.test     1.87             1.76                -6.2% 
 test-suite...mbolics-dbl/Symbolics-dbl.test     1.96             1.86                -5.3% 
 test-suite...lications/sqlite3/sqlite3.test     1.44             1.49                 3.7% 
 test-suite...ing-dbl/Equivalencing-dbl.test     1.37             1.32                -3.6% 
 test-suite...nchmarks/llubenchmark/llu.test     3.93             3.81                -3.1% 
 test-suite.../Applications/spiff/spiff.test     1.00             1.04                 3.1% 
 test-suite...ow-dbl/GlobalDataFlow-dbl.test     2.12             2.18                 2.8% 
 test-suite...CI_Purple/SMG2000/smg2000.test     1.33             1.37                 2.8% 
 test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test     4.45             4.57                 2.7% 
 test-suite...ow-flt/GlobalDataFlow-flt.test     0.86             0.88                 2.5% 
 test-suite...lications/obsequi/Obsequi.test     1.03             1.05                 2.3% 
 test-suite...Source/Benchmarks/sim/sim.test     2.27             2.31                 1.8%

SPEC2k

Spec2006Int	ratio
400.perlbench	0.9969512195
401.bzip2	0.9900793651
403.gcc	1.027950311
429.mcf	1
445.gobmk	1.006315789
456.hmmer	1.005449591
458.sjeng	1.005825243
462.libquantum	1.028037383
464.h264ref	1.024667932
471.omnetpp	0.9437652812
Geomean 1.002625904
	

Spec2006FP	ratio
433.milc	1.151770658
444.namd	1.018518519
450.soplex	1.030888031
453.povray	1.028571429
470.lbm	1.069414317
482.sphinx3	0.988179669
Geomean	1.046631497

TODO:
Investigate regressions: https://github.com/google/hashtable-benchmarks
Potantial bugs: https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-hoist&list_id=173451

Diff Detail

Event Timeline

hiraditya created this revision.Apr 17 2017, 2:34 PM
hiraditya updated this revision to Diff 96800.Apr 26 2017, 11:59 AM

Sink to immediate successors, update memory SSA when sinking.

hiraditya edited the summary of this revision. (Show Details)Apr 26 2017, 12:07 PM
mehdi_amini added inline comments.
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
140

Is it intended to be part of this patch?

482

Can you clarify why is it inserted here? It seems like a strange place to me

hiraditya added inline comments.Apr 27 2017, 11:43 AM
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
140

No, I'll remove this.

482

I'll remove this as well, this was for my testing.

Plesse try to add more reviewers. This seems to be very promising.

Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2019, 4:00 PM

Working on porting this to latest llvm, will push the latest changes soon.

Rebase against master and use profitability to reduce liveness

hiraditya updated this revision to Diff 226660.Oct 28 2019, 6:54 AM

Sink iteratively

lkail added a subscriber: lkail.Oct 28 2019, 7:11 AM

Sink iteratively.

The algorithm for sink instruction can be made more efficient
by having a worklist in reverse-DFSIn number and update after each sink.
The data structure to handle that may be inefficient but I'm not sure.

Remove VNs already handled to reduce redundant lookups.

Make MaxSinkChainLength a flag.

Remove unused variable and use function for updating local stats.

hiraditya updated this revision to Diff 226789.Oct 28 2019, 5:30 PM

Add testcase for sinking

hiraditya edited the summary of this revision. (Show Details)Oct 28 2019, 9:44 PM
hiraditya edited the summary of this revision. (Show Details)Oct 28 2019, 9:47 PM
hiraditya edited the summary of this revision. (Show Details)
lkail added a comment.Oct 29 2019, 2:24 AM

Hi @hiraditya , thanks for your work. I want to test your patch on PowerPC, however current patch seems unable to be applied to current master branch.

fhahn added a subscriber: fhahn.Oct 29 2019, 6:37 AM

Two high level comments after a quick glance

  1. gnv-sink & gvn-hoist are still disabled by default AFAIK and there are still a few known bugs that need to be addressed (https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-hoist, https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-sink). It would be good to guard this change by a new flag that's off by default, otherwise it might hinder work towards weeding out the existing bugs.
  2. Do you think you would be able to use the MergeSet implementation from D57123 (I will rebase the patches ASAP)? Separating out the MergeSet changes should reduce the diff a bit, making reviewing it easier.

Hi @hiraditya , thanks for your work. I want to test your patch on PowerPC, however current patch seems unable to be applied to current master branch.

I'll rebase to master in coming days. Thanks!

Two high level comments after a quick glance

  1. gnv-sink & gvn-hoist are still disabled by default AFAIK and there are still a few known bugs that need to be addressed (https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-hoist, https://bugs.llvm.org/buglist.cgi?quicksearch=gvn-sink). It would be good to guard this change by a new flag that's off by default, otherwise it might hinder work towards weeding out the existing bugs.

Thanks for the link, I'll look at the bugs and try to fix relevant ones

  1. Do you think you would be able to use the MergeSet implementation from D57123 (I will rebase the patches ASAP)? Separating out the MergeSet changes should reduce the diff a bit, making reviewing it easier.

Sure, that would help simplify the code here. I haven't looked at the implementation (D57123) in detail but I'll follow up after it is rebased.

GVNHoist was recently reverted maily due to big regressions:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181001/591519.html

Thanks for the link, i'll see if we can tune this pass based on the benchmarks posted there. This optimization does have cost model that would reduce register pressure so it shouldn't be worse compared to the current GVNHoist.

hiraditya edited the summary of this revision. (Show Details)Oct 29 2019, 1:44 PM
hiraditya edited the summary of this revision. (Show Details)Oct 30 2019, 10:01 AM
hiraditya updated this revision to Diff 227372.Oct 31 2019, 4:32 PM

Rebase against master.