This is an archive of the discontinued LLVM Phabricator instance.

[DSE,MSSA] Cache accesses with/without reachable read-clobbers.
Closed, Public

Authored by fhahn on Feb 23 2020, 12:45 PM.

Details

Summary

Currently we repeatedly check the same uses for read clobbers in some
cases. We can avoid unnecessary checks by keeping track of the memory
accesses we already found read clobbers for. To do so, we just add
memory accesses causing read-clobbers to a set. Note that marking all
visited accesses as read-clobbers would be too pessimistic, as that might
include accesses not on any path to the actual read clobber.

If we do not find any read-clobbers, we can add all visited instructions
to another set and use that to skip the same accesses in the next call.
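
The two sets described above can be sketched in a small standalone model. This is only an illustration, not the code from the patch: `Access`, `ReadClobberCache` and `hasReachableReadClobber` are hypothetical names, accesses are plain indices rather than MemorySSA objects, and the sketch assumes a single fixed killed location, so that "is a read clobber" is a property of the access alone.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a memory access; not an LLVM type.
struct Access {
  bool ReadsKilledLocation = false; // true if this access is a read clobber
  std::vector<int> Users;           // indices of accesses that use this one
};

// The two caches from the summary (names are assumptions).
struct ReadClobberCache {
  std::unordered_set<int> KnownReadClobbers;   // accesses that are read clobbers
  std::unordered_set<int> KnownNoReadClobbers; // accesses with no reachable read clobber
};

// Returns true if a read clobber is reachable from Start by following users.
// NumChecks counts accesses that actually had to be inspected.
bool hasReachableReadClobber(const std::vector<Access> &Accesses, int Start,
                             ReadClobberCache &Cache, std::size_t &NumChecks) {
  std::vector<int> Worklist{Start};
  std::unordered_set<int> Visited;
  while (!Worklist.empty()) {
    int Cur = Worklist.back();
    Worklist.pop_back();
    assert(Cur >= 0 && static_cast<std::size_t>(Cur) < Accesses.size());
    if (!Visited.insert(Cur).second)
      continue; // already handled in this walk
    if (Cache.KnownReadClobbers.count(Cur))
      return true; // cached read clobber, no re-check needed
    if (Cache.KnownNoReadClobbers.count(Cur))
      continue; // everything reachable from Cur was proven clean earlier
    ++NumChecks;
    const Access &A = Accesses[static_cast<std::size_t>(Cur)];
    if (A.ReadsKilledLocation) {
      // Record only the clobbering access itself. Marking every visited
      // access would also include accesses not on a path to the clobber.
      Cache.KnownReadClobbers.insert(Cur);
      return true;
    }
    for (int U : A.Users)
      Worklist.push_back(U);
  }
  // No read clobber found: all visited accesses can be skipped next time.
  Cache.KnownNoReadClobbers.insert(Visited.begin(), Visited.end());
  return false;
}
```

In this model, a second query that starts at an access already visited by a clobber-free walk returns without inspecting anything, and a repeated query that runs into a cached read clobber stops as soon as it reaches it.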

I did not yet measure compile-time, but below is the impact on the
number of iterations in getDomMemoryDef:

Metric: dse.NumDomMemDefChecks

Program base patch diff
test-suite...000/183.equake/183.equake.test 132580.00 26961.00 -79.7%
test-suite...T95/147.vortex/147.vortex.test 881946.00 297521.00 -66.3%
test-suite...000/255.vortex/255.vortex.test 882090.00 297594.00 -66.3%
test-suite...T2006/445.gobmk/445.gobmk.test 700940.00 247624.00 -64.7%
test-suite...ications/JM/ldecod/ldecod.test 990956.00 357584.00 -63.9%
test-suite...C/CFP2000/179.art/179.art.test 23014.00 8364.00 -63.7%
test-suite...marks/SciMark2-C/scimark2.test 20939.00 8230.00 -60.7%
test-suite.../CINT2006/403.gcc/403.gcc.test 2412386.00 951674.00 -60.6%
test-suite...006/447.dealII/447.dealII.test 1850445.00 796042.00 -57.0%
test-suite...006/453.povray/453.povray.test 1735262.00 753271.00 -56.6%
test-suite...ProxyApps-C++/CLAMR/CLAMR.test 393888.00 172514.00 -56.2%
test-suite...ProxyApps-C++/HPCCG/HPCCG.test 42350.00 18931.00 -55.3%
test-suite...000/186.crafty/186.crafty.test 371608.00 167669.00 -54.9%
test-suite...langs-C/unix-tbl/unix-tbl.test 10263.00 4763.00 -53.6%
test-suite.../CINT2000/176.gcc/176.gcc.test 1641688.00 763696.00 -53.5%
test-suite...ications/JM/lencod/lencod.test 1459213.00 679907.00 -53.4%
test-suite...FreeBench/distray/distray.test 10477.00 5113.00 -51.2%
test-suite.../Trimaran/enc-md5/enc-md5.test 2651.00 1295.00 -51.2%
test-suite...s/Rodinia/hotspot/hotspot.test 4031.00 1989.00 -50.7%
test-suite...T2006/401.bzip2/401.bzip2.test 171479.00 85496.00 -50.1%
test-suite...lowfish/security-blowfish.test 6217.00 3143.00 -49.4%
test-suite...rks/FreeBench/mason/mason.test 1386.00 712.00 -48.6%
test-suite...yApps-C++/PENNANT/PENNANT.test 135316.00 71201.00 -47.4%
test-suite...ks/McCat/04-bisect/bisect.test 3353.00 1801.00 -46.3%
test-suite...6/464.h264ref/464.h264ref.test 1500143.00 810226.00 -46.0%
test-suite...marks/7zip/7zip-benchmark.test 1278779.00 711387.00 -44.4%
test-suite...lications/viterbi/viterbi.test 11564.00 6497.00 -43.8%
test-suite...plications/d/make_dparser.test 54338.00 31158.00 -42.7%
test-suite...ternal/HMMER/hmmcalibrate.test 75317.00 43258.00 -42.6%
test-suite...lications/ClamAV/clamscan.test 447833.00 258126.00 -42.4%
test-suite...lications/SIBsim4/SIBsim4.test 32896.00 19381.00 -41.1%
test-suite...6/482.sphinx3/482.sphinx3.test 62177.00 37137.00 -40.3%
test-suite...nia/pathfinder/pathfinder.test 1322.00 795.00 -39.9%
test-suite...math/automotive-basicmath.test 146.00 88.00 -39.7%
test-suite...ngs-C/simulator/simulator.test 18187.00 10982.00 -39.6%
test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test 3040.00 1839.00 -39.5%
test-suite...T2006/473.astar/473.astar.test 37795.00 22910.00 -39.4%
test-suite...CFP2006/444.namd/444.namd.test 97627.00 59229.00 -39.3%
test-suite...-typeset/consumer-typeset.test 802759.00 488393.00 -39.2%
test-suite...T2006/456.hmmer/456.hmmer.test 96748.00 58957.00 -39.1%
test-suite...quoia/CrystalMk/CrystalMk.test 9157.00 5669.00 -38.1%
test-suite.../Benchmarks/Olden/mst/mst.test 1054.00 653.00 -38.0%
test-suite :: External/Nurbs/nurbs.test 14763.00 9177.00 -37.8%
test-suite...CFP2000/188.ammp/188.ammp.test 84072.00 53009.00 -36.9%
test-suite...chmarks/MallocBench/gs/gs.test 98221.00 61993.00 -36.9%
test-suite...rks/FreeBench/pifft/pifft.test 15560.00 9888.00 -36.5%
test-suite...ks/McCat/01-qbsort/qbsort.test 497.00 323.00 -35.0%
test-suite...INT2000/164.gzip/164.gzip.test 30328.00 19767.00 -34.8%
test-suite...lications/obsequi/Obsequi.test 44239.00 29195.00 -34.0%
test-suite.../CINT2000/252.eon/252.eon.test 340333.00 224748.00 -34.0%
test-suite.../Prolangs-C++/trees/trees.test 2573.00 1715.00 -33.3%
test-suite...TimberWolfMC/timberwolfmc.test 190315.00 126890.00 -33.3%
test-suite...nch/fourinarow/fourinarow.test 1487.00 994.00 -33.2%
test-suite...SPEC/CINT95/099.go/099.go.test 256287.00 173249.00 -32.4%
test-suite...s-C/Pathfinder/PathFinder.test 8366.00 5708.00 -31.8%
test-suite...5/124.m88ksim/124.m88ksim.test 62116.00 42425.00 -31.7%
test-suite...arks/McCat/17-bintr/bintr.test 154.00 107.00 -30.5%
test-suite...langs-C/football/football.test 17864.00 12416.00 -30.5%
test-suite...lications/minisat/minisat.test 9504.00 6676.00 -29.8%
test-suite...rks/tramp3d-v4/tramp3d-v4.test 1275472.00 896318.00 -29.7%
test-suite...s/FreeBench/neural/neural.test 1534.00 1087.00 -29.1%
test-suite...arks/mafft/pairlocalalign.test 259748.00 185421.00 -28.6%
test-suite...comm-adpcm/telecomm-adpcm.test 53.00 38.00 -28.3%
test-suite...adpcm/rawdaudio/rawdaudio.test 53.00 38.00 -28.3%
test-suite.../Benchmarks/Olden/tsp/tsp.test 2522.00 1834.00 -27.3%
test-suite...:: External/Povray/povray.test 650268.00 476041.00 -26.8%

In some cases this also increases the number of eliminated stores,
because we can explore further. Note that there is a small regression
which I should track down.

Metric: dse.NumFastStores

Program base patch diff
test-suite...T2006/445.gobmk/445.gobmk.test 82.00 123.00 50.0%
test-suite...C/CFP2000/179.art/179.art.test 6.00 7.00 16.7%
test-suite...math/automotive-basicmath.test 7.00 8.00 14.3%
test-suite...ngs-C/assembler/assembler.test 8.00 9.00 12.5%
test-suite...ks/Prolangs-C/gnugo/gnugo.test 9.00 10.00 11.1%
test-suite...INT95/132.ijpeg/132.ijpeg.test 18.00 20.00 11.1%
test-suite...langs-C/football/football.test 10.00 11.00 10.0%
test-suite...ce/Benchmarks/Olden/bh/bh.test 13.00 14.00 7.7%
test-suite...ications/JM/ldecod/ldecod.test 382.00 402.00 5.2%
test-suite...000/183.equake/183.equake.test 40.00 38.00 -5.0%
test-suite...6/482.sphinx3/482.sphinx3.test 22.00 23.00 4.5%
test-suite...T95/147.vortex/147.vortex.test 215.00 224.00 4.2%
test-suite...000/255.vortex/255.vortex.test 217.00 226.00 4.1%
test-suite...SPEC/CINT95/099.go/099.go.test 63.00 65.00 3.2%
test-suite.../Benchmarks/nbench/nbench.test 76.00 78.00 2.6%
test-suite...lications/sqlite3/sqlite3.test 153.00 157.00 2.6%
test-suite...INT2000/164.gzip/164.gzip.test 39.00 40.00 2.6%
test-suite...ications/JM/lencod/lencod.test 840.00 854.00 1.7%
test-suite...marks/7zip/7zip-benchmark.test 1211.00 1231.00 1.7%
test-suite...6/464.h264ref/464.h264ref.test 730.00 741.00 1.5%
test-suite...006/453.povray/453.povray.test 1417.00 1437.00 1.4%
test-suite...lications/ClamAV/clamscan.test 230.00 233.00 1.3%
test-suite.../Applications/SPASS/SPASS.test 156.00 158.00 1.3%
test-suite...0.perlbench/400.perlbench.test 861.00 871.00 1.2%
test-suite.../CINT2000/176.gcc/176.gcc.test 879.00 889.00 1.1%
test-suite...nsumer-lame/consumer-lame.test 100.00 101.00 1.0%
test-suite...:: External/Povray/povray.test 1220.00 1231.00 0.9%
test-suite...ocBench/espresso/espresso.test 115.00 116.00 0.9%
test-suite...chmarks/MallocBench/gs/gs.test 116.00 117.00 0.9%
test-suite...5/124.m88ksim/124.m88ksim.test 116.00 117.00 0.9%
test-suite...CI_Purple/SMG2000/smg2000.test 158.00 159.00 0.6%
test-suite...000/186.crafty/186.crafty.test 158.00 159.00 0.6%
test-suite...0/253.perlbmk/253.perlbmk.test 500.00 503.00 0.6%
test-suite.../CINT2006/403.gcc/403.gcc.test 1178.00 1185.00 0.6%
test-suite...CFP2000/188.ammp/188.ammp.test 181.00 182.00 0.6%
test-suite.../CINT2000/252.eon/252.eon.test 2672.00 2685.00 0.5%
test-suite...006/447.dealII/447.dealII.test 2117.00 2127.00 0.5%
test-suite...-typeset/consumer-typeset.test 1047.00 1051.00 0.4%
test-suite...rks/tramp3d-v4/tramp3d-v4.test 814.00 816.00 0.2%
test-suite...3.xalancbmk/483.xalancbmk.test 1265.00 1267.00 0.2%
test-suite.../Prolangs-C++/vcirc/vcirc.test 11.00 11.00 0.0%

Event Timeline

fhahn created this revision. Feb 23 2020, 12:45 PM
Herald added a project: Restricted Project. Feb 23 2020, 12:45 PM
fhahn updated this revision to Diff 246123. Feb 23 2020, 1:28 PM

Fix default value for EnableMemorySSA.

fhahn added a comment. Aug 18 2020, 7:01 AM

Unfortunately the improvements are against a large regression over legacy DSE. But I've been working on a further set of patches, which includes this one, to limit the gap between legacy DSE & MemorySSA-backed DSE, hopefully to a point where it is acceptable to switch (and improve compile-time further afterwards). Numbers: http://llvm-compile-time-tracker.com/compare.php?from=4cc20aa74339812c3e4daa4138ed40cb98521bfc&to=504153dc6efd29ad37ffb6ffb2e80d4101b56e5b&stat=instructions

More details: http://llvm-compile-time-tracker.com/compare.php?from=4cc20aa74339812c3e4daa4138ed40cb98521bfc&to=504153dc6efd29ad37ffb6ffb2e80d4101b56e5b&stat=instructions

fhahn updated this revision to Diff 287427. Aug 24 2020, 9:52 AM

Rebase and ping.

This improves compile time, while also improving the number of removed stores in some cases (due to cutting off bad candidates earlier, rather than spending time and scan limit on them).

Improves the geomean: -O3 -0.15%, ReleaseThinLTO -0.29%, ReleaseThinLTO (link only) -0.37%

http://llvm-compile-time-tracker.com/compare.php?from=dbab061a49f82c7d3bce4b3d601742d17df3daee&to=13e39ded7f8f772f9fd0d53aff8849e0a449509b&stat=instructions

asbirlea accepted this revision. Aug 24 2020, 1:16 PM
asbirlea added inline comments.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1740

Delete "we".

1762

Remove "is"

This revision is now accepted and ready to land. Aug 24 2020, 1:16 PM
fhahn updated this revision to Diff 287584. Aug 25 2020, 12:43 AM

Rebase, address comments, thanks!

This revision was landed with ongoing or failed builds. Aug 25 2020, 12:49 AM
This revision was automatically updated to reflect the committed changes.