What is it about this check which is a problem? Or put another way, why is this not okay but the call to isImpliedCond on line 6956 is fine? The problem is recursion through isImpliedCond->getSCEV->..., right?
The problem is that the check wasn't covering assume loop which caused big hang.
Stacktrace looked more like this
I don't think this is an infinite loop (Piotr, can you please verify
this?), it is probably an O(n!) recursion where n == number of the
ScalarEvolution::isImpliedCond already guards for infinite loops via
MarkPendingLoopPredicate. However, if you have
then the recursive call to isImpliedCond(assume_2, X) may end up
calling isImpliedCond(assume_1, Y) and that may end up calling
isImpliedCond(assume_0, Y) and that may end up calling
isImpliedCond(assume_3, Y). This way, even though we're protected
against full on infinite recursion, we'll still explore all 4! = 24
I think the check with LoopContinuePredicate is fine since it only
calls isImpliedCond if there is exactly one latch in the loop. This
means that the above n! possibility is really only a 1! = 1
possibility for LoopContinuePredicate.
But this from memory, so please double check.
I checked it, it is of course not infinite. I am not sure about n! in case of assumes. I having 10 or 20 assumes will not make it slow. @rsmith was helping me with it, and he thinks that for assumes it is O(n^2), because results are memorized.
So it this case, 1000 assumes is enough to make difference
I don't think so. I think the results are memorized, so calling sImpliedCond in 6956 will not cause lag.
Our initial analysis of the 'assume' problem appeared to be O(n!), but we don't have a reduced testcase for that. The cycle there was isLoopBackedgeGuardedByCond -> isImpliedCond -> getZeroExtendExpr -> isLoopBackedgeGuardedByCond. In the testcase in this patch, getZeroExtendExpr memoizes its result, but there are paths through it that do not appear to do so (in particular, the isLoopBackedgeGuardedByCond test in the isKnownPositive / isKnownNegative cases can lead to a return with no memoization of the getZeroExtendExpr result).
I could easily believe there are testcases for both loops that lead to O(n!) performance. (If not, we are emitting /vastly/ too many assumes...)