CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter.
Differential D26389
[SCEV] limit recursion depth of CompareSCEVComplexity dfukalov on Nov 8 2016, 3:22 AM. Authored by
Details CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time. Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter.
Diff Detail Event TimelineComment Actions current test input .ll is quite big (~1000 lines). I'm trying to reduce it, but not sure I'll be able to. Should I add it's big version to the change? Comment Actions Hi Dmitry, Please upload the patch with context (http://llvm.org/docs/Phabricator.html) Have you checked if this is solely due to the depth of the expressions, or if this is due to some exponential behavior? Can this be solved using a cache, like in CompareValueComplexity (though it also has a Depth so perhaps it isn't a good example). As far as the test case goes, you can always add a C++ test case to unittests/ that generates the IR or SCEV expression in memory. I agree with you in that we should avoid adding excessively large test files to tests/ Comment Actions Can you check if this fixes the tests cases from these bugs: Comment Actions Hi Sanjoy, the issue is in SLSR pass when it tries to construct with getAddExpr and getMulExpr on a source that is similar to test case from https://llvm.org/bugs/show_bug.cgi?id=28721 but much more bigger: >1000 lines of muls+adds in main block and has 8 phis instead of 2. Tom, I'll try to create a test case by expanding one from bug 28721. Comment Actions Hi Sanjoy, I've reduced testcase to 400 lines. Then I tried to use cache as you suggested and it fixes the issue indeed. I'm going to re-test and up[date patch. But I would add recursion depth control since there is no guarantee the cache will help in all possible cases. For example, during test case reduction I had input variants of input that caused hangs in CompareSCEVComplexity called from other (not SLSR) passes. What do you think? If you agree with recursion depth, should it be new parameter (as in my first implementation) or a hard-coded value as for CompareValueComplexity? Comment Actions Hi Daniil, Great! As I said before, if the test case is repetitive then it is probably better to generate the IR in memory using IRBuilder and put the test case in unittests/. Otherwise a 400 line file does not sound that bad.
I think a recursion depth as a stopgap measure is fine. For now I'd add a max depth as a cl::opt and use the same depth in both CompareValueComplexity and CompareSCEVComplexity (that is, don't "forget" the depth when CompareSCEVComplexity calls into CompareValueComplexity). I'd make this max depth fairly high by default (perhaps 64 or something) since, as I understand it, the caching should take care of avoiding any exponential behavior, and if the we're spending lots of time in CompareSCEVComplexity then the SCEV expression itself is very complex. In this (latter) case it is defensible for CompareSCEVComplexity to take time (it simply has more work to do), but only up to a point.
Comment Actions Hi Sanjoy, I've updated the change and added unittest. I made it since reduced to 400 lines testcase can be compiled by ~10 sec on a fast machine without the fix, but the suggested unittest has "softhang" behavior. For my initial big testcase recursion depth was up to 50+ an I never completed the compilation on a quite fast CPU. So I suggest to set it 32 by default. Comment Actions lgtm
|
IMO a slightly better pattern would have been to have EqCache be a SmallSet of std::pair<PointerUnion<Value *, const SCEV *>, PointerUnion<Value *, const SCEV *>> and to re-use the same cache in CompareValueComplexity.
If you agree, can you do that as a followup change?