[LV] Allow large RT checks, if they are a fraction of the scalar cost (WIP)
Needs Review · Public

Authored by fhahn on Mar 11 2020, 4:05 AM.

Details

Summary

This patch estimates the cost of the generated runtime checks and
relaxes the limit on the number of runtime checks if their cost is a
small fraction (0.5%) of the expected scalar loop runtime. The
threshold (and other details) are not set in stone yet and require
further benchmarking/analysis. Also, ExpectedTC returns the max of the
induction variable for loops without known constant trip counts, which
means we largely overestimate the cost of loops with variable trip
counts.

The current version also keeps a hard limit of 2 *
NumRuntimePointerChecks, but that limit also needs a closer look.
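
To make the proposed decision concrete, here is a minimal standalone sketch of the heuristic as described in the summary. It is not the code from the patch: the LoopEstimate struct, allowRuntimeChecks, and all field names are hypothetical, PointerCheckLimit stands in for the existing NumRuntimePointerChecks limit, and the exact comparison operators at the boundaries are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified inputs; none of these names come from the patch.
struct LoopEstimate {
  uint64_t ScalarIterationCost; // estimated cost of one scalar iteration
  uint64_t ExpectedTripCount;   // expected number of scalar iterations
  uint64_t RuntimeCheckCost;    // estimated total cost of the runtime checks
  unsigned NumPointerChecks;    // number of pointer checks required
};

// Returns true if vectorizing with runtime checks would be allowed under the
// rule from the summary. PointerCheckLimit models the existing
// NumRuntimePointerChecks limit; MaxCheckFraction models the 0.5% threshold.
bool allowRuntimeChecks(const LoopEstimate &L, unsigned PointerCheckLimit,
                        double MaxCheckFraction = 0.005) {
  assert(MaxCheckFraction >= 0.0 && "fraction must be non-negative");

  // At or below the existing limit, checks are allowed as before.
  if (L.NumPointerChecks <= PointerCheckLimit)
    return true;

  // Hard cap (assumed inclusive): never exceed twice the existing limit.
  if (L.NumPointerChecks > 2 * PointerCheckLimit)
    return false;

  // Between the two limits, allow the checks only if their cost is a small
  // fraction of the expected scalar loop runtime.
  double ScalarLoopCost = static_cast<double>(L.ScalarIterationCost) *
                          static_cast<double>(L.ExpectedTripCount);
  return static_cast<double>(L.RuntimeCheckCost) <=
         MaxCheckFraction * ScalarLoopCost;
}
```

For example, with a limit of 8, a loop needing 12 checks at cost 40 and an expected scalar cost of 10 * 1000 would be accepted (40 is below 0.5% of 10000), while the same loop with check cost 60 would not. Note that an overestimated trip count inflates the scalar cost and therefore makes this test easier to pass.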

If the general direction is agreed upon, I will hash out the final
details.

Fixes PR44662 (modulo potential adjustments for unknown trip counts)

Event Timeline

fhahn created this revision. Mar 11 2020, 4:05 AM
Herald added a project: Restricted Project. Mar 11 2020, 4:05 AM
fhahn edited the summary of this revision. Mar 11 2020, 4:42 AM
lebedev.ri added inline comments. Mar 11 2020, 3:19 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9308

Please make 0.005 an option.

Reverse-ping, thanks.
Anything to get this going?

fhahn updated this revision to Diff 282932. Aug 4 2020, 8:43 AM

Rebase.

With the linked dependent patches, this should now successfully build test-suite with MultiSource/SPEC2000/SPEC2006.

This leads to additional vectorization with runtime checks in a few more cases:

Same hash: 223 (filtered out)
Remaining: 14
Metric: loop-vectorize.LoopsVectorized

Program                                      patch1 patch2 diff
 test-suite...Source/Benchmarks/sim/sim.test   5.00   8.00 60.0%
 test-suite...rks/FreeBench/pifft/pifft.test  33.00  47.00 42.4%
 test-suite...chmarks/Rodinia/srad/srad.test   3.00   4.00 33.3%
 test-suite...CFP2000/177.mesa/177.mesa.test 379.00 417.00 10.0%
 test-suite...CI_Purple/SMG2000/smg2000.test  78.00  84.00  7.7%
 test-suite...pps-C/SimpleMOC/SimpleMOC.test  39.00  42.00  7.7%
 test-suite...oxyApps-C/miniGMG/miniGMG.test  42.00  44.00  4.8%
 test-suite.../CINT2000/176.gcc/176.gcc.test  97.00 100.00  3.1%
 test-suite...006/450.soplex/450.soplex.test  88.00  90.00  2.3%
 test-suite...lications/ClamAV/clamscan.test  91.00  93.00  2.2%
 test-suite...pplications/oggenc/oggenc.test 130.00 132.00  1.5%
 test-suite...006/447.dealII/447.dealII.test 958.00 970.00  1.3%
test-suite...006/447.dealII/447.dealII.test 958.00 970.00 1.3%

fhahn updated this revision to Diff 314337. Mon, Jan 4, 2:12 AM

Rebased on top of current trunk. This version can now build MultiSource/SPEC2006/SPEC2000 with -O3 -flto without crashing.

The current version leads to a few more vectorized loops in some benchmarks:

Tests: 236
Same hash: 200 (filtered out)
Remaining: 36
Metric: loop-vectorize.LoopsVectorized

Program                                        base   patch.lv-mem-cost diff
 test-suite...Source/Benchmarks/sim/sim.test     5.00   8.00            60.0%
 test-suite...chmarks/Rodinia/srad/srad.test     3.00   4.00            33.3%
 test-suite...rks/FreeBench/pifft/pifft.test    33.00  43.00            30.3%
 test-suite...CFP2000/177.mesa/177.mesa.test   386.00 424.00             9.8%
 test-suite...CI_Purple/SMG2000/smg2000.test    77.00  83.00             7.8%
 test-suite...pps-C/SimpleMOC/SimpleMOC.test    39.00  42.00             7.7%
 test-suite...oxyApps-C/miniGMG/miniGMG.test    44.00  46.00             4.5%
 test-suite.../CINT2000/176.gcc/176.gcc.test    99.00 102.00             3.0%
 test-suite...006/450.soplex/450.soplex.test    88.00  90.00             2.3%
 test-suite...lications/ClamAV/clamscan.test    97.00  99.00             2.1%
 test-suite...pplications/oggenc/oggenc.test   151.00 153.00             1.3%
 test-suite...006/447.dealII/447.dealII.test   970.00 982.00             1.2%
rkruppe removed a subscriber: rkruppe. Mon, Jan 4, 2:53 AM
fhahn updated this revision to Diff 315353. Fri, Jan 8, 5:00 AM

Rebase on top of the recent changes.