Page MenuHomePhabricator

[SCEV] LoopsUsed memoization
Needs RevisionPublic

Authored by rtereshin on Aug 20 2018, 11:41 AM.

Details

Summary

Currently ScalarEvolution::getUsedLoops traverses a SCEV expression upon each call
w/o caching the results. As it's called by addToLoopUseLists for every new SCEV node created
it is time consuming. This patch adds a memoization map to speed up the calls and tries
to do so in the least invasive manner possible.

This partially addresses https://bugs.llvm.org/show_bug.cgi?id=32731

On the large-SCEVs-shallow-getSCEV-stack.ll test case attached to the Bugzilla bug I see
~70% reduction in overall (wall) time of

time ./bin/opt -slsr large-SCEVs-shallow-getSCEV-stack.ll -o /dev/null

run (the rest of the time (about 98% give or take) is still spent in checkValidity).

CTMark shows either a small improvement or is within noise, it's hard to tell even
on a 100 runs (x86):

NamePrevCurrent%ΔMADPrevCurrent%ΔMAD
CTMark/ClamAV/clamscan9.82029.8004-0.20%-0.01980.07549.90549.8899-0.16%-0.01550.0754
CTMark/kimwitu++/kc11.044011.0182-0.23%-0.02580.026811.080511.08020.00%-0.00030.0268
CTMark/tramp3d-v4/tramp3d-v411.619911.5813-0.33%-0.03860.038811.690611.69030.00%-0.00030.0388
CTMark/7zip/7zip-benchmark27.307527.2473-0.22%-0.06020.151127.894227.89410.00%-0.00010.1511
CTMark/sqlite3/sqlite34.75154.7446-0.15%-0.00690.02784.78944.78930.00%-0.00010.0278
CTMark/7zip/7zip-benchmark-link0.04430.0438-1.13%-0.00050.00100.04570.04570.00%0.00000.0010
CTMark/Bullet/bullet-link0.03200.0319-0.31%-0.00010.00060.03290.03290.00%0.00000.0006
CTMark/ClamAV/clamscan-link0.02340.02340.00%0.00000.00030.02380.02380.00%0.00000.0003
CTMark/SPASS/SPASS-link0.02270.02270.00%0.00000.00020.02300.02300.00%0.00000.0002
CTMark/consumer-typeset/consumer-typeset-link0.02390.02390.00%0.00000.00030.02460.02460.20%0.00000.0003
CTMark/kimwitu++/kc-link0.05280.05280.00%0.00000.00030.05340.05340.09%0.00000.0003
CTMark/lencod/lencod-link0.02380.02380.00%0.00000.00030.02460.02460.00%0.00000.0003
CTMark/mafft/pairlocalalign-link0.01740.01740.00%0.00000.00010.01760.01760.28%0.00000.0001
CTMark/sqlite3/sqlite3-link0.01410.01410.00%0.00000.00010.01430.01430.00%0.00000.0001
CTMark/tramp3d-v4/tramp3d-v4-link0.02380.02380.00%0.00000.00010.02400.02400.00%0.00000.0001
CTMark/consumer-typeset/consumer-typeset7.67727.6499-0.36%-0.02730.03437.70517.70580.01%0.00060.0343
CTMark/mafft/pairlocalalign4.90204.8927-0.19%-0.00930.01994.93494.93550.01%0.00060.0199
CTMark/SPASS/SPASS9.03869.0233-0.17%-0.01530.07169.11679.11810.02%0.00140.0716
CTMark/lencod/lencod8.55708.5427-0.17%-0.01430.03978.59448.59600.02%0.00160.0397
CTMark/Bullet/bullet20.332720.3012-0.15%-0.03150.142020.736620.74450.04%0.00790.1420

(the left half of the table uses minimum as an aggregate function, the right half - median, the underlying data are the same 100 samples,
rows are sorted by the absolute delta of medians)

If the memory consumption becomes a concern we can switch from having a full set of loops referenced attached to every SCEV node
to a mirrored tree (DAG really) structure in a skip-list fashion so every node contains a set of unique references to all closest AddRec
nodes. This way getUsedLoops will be able to traverse a compressed tree (DAG) containing AddRec's only for any SCEV expression.

Diff Detail

Repository
rL LLVM

Event Timeline

rtereshin created this revision.Aug 20 2018, 11:41 AM
mkazantsev requested changes to this revision.Aug 20 2018, 7:28 PM

I have a lot of doubts that it is correct. It might be, but I would request you to add some tests which aggressively exercise scenarios like "cached something -- called forgetMemoizedResults/forgetLoop -- attempted to get cached results". I am specifically worried about CFG changes and loop deletions.

lib/Analysis/ScalarEvolution.cpp
11763

I don't think it is sufficient for correctness. Imagine the situation: A = {1,+,1}<some_loop>, B = A + 1. Something has changed for A, and it is no longer using some_loop. How do we say that B is not using it as well?

I also think that something needs to be done about it in forgetLoop. This method is called, for example, when a loop becomes non-existent, or when the set of its blocks changes. And you may end up with many cache bundles keeping references to this loop.

This revision now requires changes to proceed.Aug 20 2018, 7:28 PM

Hi Maxim,

Thank you for looking into this!

Given my current understanding I believe that as soon as (1) LoopsRefd is a proper inverse (or undefined) map of LoopUsers we should be fine, unless (2) some passes mis-use Scalar Evolution's APIs, for instance, do not call forgetLoop or forgetTopmostLoop on changed or deleted Loops.

For (1), I see LoopUsers getting changed only in a few places:
a) https://github.com/llvm-mirror/llvm/blob/5c1bd30b863cf271dbe70219b0bb5717d1e2ec7e/lib/Analysis/ScalarEvolution.cpp#L11813
b) https://github.com/llvm-mirror/llvm/blob/5c1bd30b863cf271dbe70219b0bb5717d1e2ec7e/lib/Analysis/ScalarEvolution.cpp#L6778

I believe in both cases the changes are reflected in LoopsRefd.

As for (2) if that's the case we already have a problem regardless applying this patch or not, which is also not going to be surfaced by testing Scalar Evolution's APIs, assuming such tests use the APIs as intended, which is probably a safe assumption.

I could also add that the CTMark compiles generated exactly the same binaries before and after this patch.

lib/Analysis/ScalarEvolution.cpp
11763

I don't think it is sufficient for correctness. Imagine the situation: A = {1,+,1}<some_loop>, B = A + 1. Something has changed for A, and it is no longer using some_loop. How do we say that B is not using it as well?

The loop is part of the SCEV node's identity: https://github.com/llvm-mirror/llvm/blob/5c1bd30b863cf271dbe70219b0bb5717d1e2ec7e/lib/Analysis/ScalarEvolution.cpp#L3423

Therefore A can not just stop using some_loop, it could only be recreated with a different loop attached and it will be a different (by reference) node A'.
Same for B, it will be a different node B' = A' + 1 with no cached results for either B' or A'. At least, this is my current understanding of what's going on here.

I also think that something needs to be done about it in forgetLoop. This method is called, for example, when a loop becomes non-existent, or when the set of its blocks changes.

I believe it's already done: https://github.com/llvm-mirror/llvm/blob/5c1bd30b863cf271dbe70219b0bb5717d1e2ec7e/lib/Analysis/ScalarEvolution.cpp#L6774-L6779

And you may end up with many cache bundles keeping references to this loop.

I think this is only possible if a pass using Scalar Evolution and changing loops violates the API and doesn't call forgetLoop on an invalidated Loop. If so, it's that pass' issue, not Scalar Evolution's, and heavy-testing Scalar Evolution's APIs won't surface such bugs at all.

If forgetLoop is properly called however, it would clean the cache for every SCEV referencing the Loop (as soon as LoopsRefd is a proper inverse map of LoopUsers), so no stale bundles I believe.
Also, given that every SCEV B referencing a SCEV A references all the loops referenced by A (and potentially more), it will end up clearing this cache for every SCEV node (transitively) dependent on a directly affected SCEV.
It's probably worth noting also that forgetLoop "forgets" all the contained loops as well, not just the one explicitly specified.