Using the optimized access enables additional optimizations in cases
where the defining access is a non-aliasing store.
Alternatively, we could walk upwards and skip non-aliasing defs
here, but my experiments so far show that this noticeably
increases compile time for little extra gain compared to just using the
optimized access.
Improvements of dse.NumRedundantStores on MultiSource/CINT2006/CFP2006
on X86 with -O3:
Program                                      before   after     diff
test-suite...-typeset/consumer-typeset.test    1.00   76.00  7500.0%
test-suite.../Benchmarks/Bullet/bullet.test    3.00   12.00   300.0%
test-suite...006/453.povray/453.povray.test    3.00    6.00   100.0%
test-suite...telecomm-gsm/telecomm-gsm.test    1.00    2.00   100.0%
test-suite...ediabench/gsm/toast/toast.test    1.00    2.00   100.0%
test-suite...marks/7zip/7zip-benchmark.test    1.00    2.00   100.0%
test-suite...ications/JM/lencod/lencod.test    7.00   10.00    42.9%
test-suite...6/464.h264ref/464.h264ref.test    6.00    8.00    33.3%
test-suite...ications/JM/ldecod/ldecod.test    6.00    7.00    16.7%
test-suite...006/447.dealII/447.dealII.test   33.00   33.00     0.0%
test-suite...6/471.omnetpp/471.omnetpp.test     NaN    1.00     nan%
test-suite...006/450.soplex/450.soplex.test     NaN    2.00     nan%
test-suite.../CINT2006/403.gcc/403.gcc.test     NaN    7.00     nan%
test-suite...lications/ClamAV/clamscan.test     NaN    1.00     nan%
test-suite...CI_Purple/SMG2000/smg2000.test     NaN    3.00     nan%
Follow-up to D111727.
Can you please add a comment to document this choice?
This is unusual ... the typical way to do this is: upperDef = getClobberingMemoryAccess which internally returns the optimized access if already optimized, or does the walk if not.
You're looking to bypass that because of the compile-time increase from the walk: you want the optimized access if it is available, and otherwise to skip the walk and fall back to the defining access.
Would adding a new API (e.g. getOptimizedIfAvailable) for this sort of info be useful?
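To make the suggestion concrete, here is a rough sketch of the behavior such a helper might have. It uses a toy stand-in type rather than the real MemorySSA classes: `ToyMemoryDef` and its fields are hypothetical, `getOptimizedIfAvailable` is only the name proposed above and not an existing API, and the real `isOptimized()` check is more involved than a null test.

```cpp
#include <cassert>

// Toy stand-in for a MemorySSA def; NOT the real llvm::MemoryDef.
struct ToyMemoryDef {
  ToyMemoryDef *DefiningAccess = nullptr; // nearest def above this one
  ToyMemoryDef *Optimized = nullptr;      // cached clobber, if already computed

  // Simplification: treat "has a cached clobber" as "is optimized".
  bool isOptimized() const { return Optimized != nullptr; }
  ToyMemoryDef *getOptimized() const { return Optimized; }
  ToyMemoryDef *getDefiningAccess() const { return DefiningAccess; }
};

// Hypothetical helper: use the cached optimized access when one exists,
// otherwise fall back to the defining access. It never starts a walk.
inline ToyMemoryDef *getOptimizedIfAvailable(const ToyMemoryDef *Def) {
  assert(Def && "expected a non-null access");
  return Def->isOptimized() ? Def->getOptimized() : Def->getDefiningAccess();
}
```

The point of the sketch is that the fallback is a constant-time lookup, so a caller gets the benefit of earlier optimization without paying for a walk when none has been done yet.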