This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Properly solve quadratic equations
ClosedPublic

Authored by kparzysz on Jun 18 2018, 9:12 AM.

Details

Summary

Commit r334318 exposed a problem in SolveQuadraticEquation: an add-rec was created with coefficients in i1, and division by 2 became equivalent to division by zero. See http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180611/561033.html.

This also fixes https://bugs.llvm.org/show_bug.cgi?id=38024.

Diff Detail

Repository
rL LLVM

Event Timeline

kparzysz created this revision.Jun 18 2018, 9:12 AM
kparzysz edited the summary of this revision. (Show Details)

See also https://bugs.llvm.org//show_bug.cgi?id=37211 .

I'm tempted to just delete the code instead, given it's screwed up in multiple ways: we shouldn't be dividing by two at all, even if the bitwidth isn't one. (It looks like someone took the quadratic formula for real numbers, and blindly translated it without considering that SCEV expressions aren't real numbers.)

I've no idea what the best fix is but at least the current patch avoids the crash I saw. Thanks!

kparzysz updated this revision to Diff 152471.Jun 22 2018, 6:54 AM
kparzysz retitled this revision from [SCEV] Properly solve quadratic equations with coefficients in i1 to [SCEV] Properly solve quadratic equations.
kparzysz edited the summary of this revision. (Show Details)

Rewrite SolveQuadraticEquation. This fixes https://bugs.llvm.org//show_bug.cgi?id=37211.

Needs more test coverage of the overflow cases.

lib/Analysis/ScalarEvolution.cpp
8517

I don't like hardcoded numbers without an explanation... why is 32 special? Why is 1 special?

8545

I think you can overflow here, and a few other places where you multiply numbers. Not sure exactly how many more bits you need.

10839–10840

Would it make sense to just return getCouldNotCompute() here? Given the SolveQuadraticEquation() API, if RootVal is in the range, something overflowed, I think.

kparzysz updated this revision to Diff 154658.Jul 9 2018, 11:38 AM
kparzysz edited the summary of this revision. (Show Details)
  1. Added solving for arbitrary overflow, i.e. q(x) = 2^n for n \in Z-{0}, (or mod 2^n).
  2. Expanded comments.
  3. Added more testcases (incl. from PR38024).
kparzysz marked 3 inline comments as done.Jul 9 2018, 11:39 AM
efriedma added inline comments.Jul 16 2018, 6:13 PM
lib/Analysis/ScalarEvolution.cpp
8268

Please clarify what happens if the addrec has signed overflow. Please document the conditions under which the algorithm fails to find a solution.

8522

The algorithm should still work even if you treat "i1 true" as -1, I think. Actually, you end up solving the exact same equation, because of the A.isNegative() check.

10839–10840

Please make sure we have a testcase that hits this "getCouldNotCompute".

10841–10842

Please make sure we have a testcase that hits this "getCouldNotCompute".

test/Analysis/ScalarEvolution/solve-quadratic-overflow.ll
3

Please integrate this into the other testcase, with the same comments/CHECK lines.

test/Analysis/ScalarEvolution/solve-quadratic.ll
2

debug-only needs "REQUIRES: asserts".

kparzysz marked 5 inline comments as done.Jul 18 2018, 8:23 AM
kparzysz added inline comments.Jul 18 2018, 8:23 AM
lib/Analysis/ScalarEvolution.cpp
8522

You're right. The check for A.isNegative() was added later.

10841–10842

This shouldn't really happen. I've changed it to an assert.

kparzysz updated this revision to Diff 156082.Jul 18 2018, 8:24 AM
kparzysz marked 3 inline comments as done.

Addressed the comments.

efriedma added inline comments.Jul 18 2018, 11:04 AM
lib/Analysis/ScalarEvolution.cpp
8269

"A signed overflow is ignored" contradicts the implementation; you're treating the coefficients as signed values (by sign-extending them).

kparzysz added inline comments.Jul 18 2018, 11:21 AM
lib/Analysis/ScalarEvolution.cpp
8269

I'm solving q(x) = k * 2^BW. The reason why I'm sign-extending the values is to keep the behavior around 0 the same. I am avoiding any overflows in the calculations and I simulate the unsigned one via "2kR".

Following hits an assertion failure with "opt -analyze -scalar-evolution" with the latest patch, I think due to the inconsistent handling of signed-ness.

define signext i32 @func() {
entry:
  br label %loop

loop:
  %ivr = phi i32 [ 0, %entry ], [ %ivr1, %loop ]
  %inc = phi i32 [ 0, %entry ], [ %inc1, %loop ]
  %acc = phi i32 [ -1, %entry ], [ %acc1, %loop ]
  %ivr1 = add i32 %ivr, %inc
  %inc1 = add i32 %inc, -1                 ; M = inc1 = inc + N = X + N
  %acc1 = add i32 %acc, %inc              ; L = acc1 = X + Y
  %and  = and i32 %acc1, -1              ; iW
  %cond = icmp ule i32 %and, -3
  br i1 %cond, label %exit, label %loop

exit:
  %rv = phi i32 [ %acc1, %loop ]
  ret i32 %rv
}

Well, for one 2/2 seems to be -1...

Code:

APInt TT, UU;
APInt::udivrem(APInt(72, 2), APInt(72, 2), TT, UU);
dbgs() << "TT=" << TT << '\n';

Output:

TT=-1

This is because we handle the special case of LHS==RHS via "Quotient=1", which sets the bitwidth to 1...

But there is also something else going on with this test. The code in getNumIterationsInRange assumes that it's the upper bound of the range that will be crossed, while this case crosses the lower bound. We should really solve for both and eliminate the wrong solution (the one for which the assertion fails).

The code in getNumIterationsInRange assumes that it's the upper bound of the range that will be crossed

The current getNumIterationsInRange implementation is consistent with "equals 0 or wraps around 2^BW in the unsigned modulo 2^BW arithmetic" (but that's not what SolveQuadraticEquation actually computes at the moment).

The current getNumIterationsInRange implementation is consistent with "equals 0 or wraps around 2^BW in the unsigned modulo 2^BW arithmetic" (but that's not what SolveQuadraticEquation actually computes at the moment).

Generic comments like this are not very helpful. Please be more specific in explaining your concerns.

The addrec in your example is {x,+,-1,+,-1}, where x is some starting value. In unsigned arithmetic this expression has an unsigned overflow right at the first iteration (regardless of x), but it doesn't mean that it goes out of the range in the first step.

I can change the comment to something like "returns the smallest positive x such that q(x-1) and q(x) belong to different intervals in the form of [k*2^BW, (k+1)*2^BW)".

kparzysz updated this revision to Diff 156328.Jul 19 2018, 11:33 AM

Check both ends of the range in getNumIterationsInRange, change comment for SolveQuadraticEquation.

The general problem we're trying to solve in getNumIterationsInRange is how many iterations the SCEV {L,+,M,+,N} is in the range [R,S]. There's a simple, inefficient algorithm for this: just call evaluateAtIteration repeatedly until the value isn't in range. But that's obviously way too expensive, and I don't know how to solve the general problem more efficiently.

So let's take a subset of the problem. The simplest subset is SCEVs where the last value in range is before the point of the first unsigned overflow. We can solve this by normalizing the range to [R, UINT_MAX], then finding the first iteration where {L,+,M,+,N} has unsigned overflow, then checking whether the iteration after the overflow is in range. This can be computed either using a straight binary search, or by converting it to an inequality over real numbers and using algebra. I thought this is what you were aiming for, since it's very close to what you've implemented?

We can generalize that a bit by adding some additional code in the case where we fail to compute a solution: apply a binary NOT to both the SCEV and the range, and try again.

Another possible subset is SCEVs where the last value in range is before the point of the first signed overflow. Basically, normalize the range to [R, INT_MAX], treat the coefficients and range as signed numbers, solve the system of inequalities over real numbers, then verify the solution applies to the original modular problem. This is a little more complicated because the vertex of the parabola could be inside the range.

Another crash:

define signext i32 @func() {
entry:
  br label %loop

loop:
  %ivr = phi i32 [ 0, %entry ], [ %ivr1, %loop ]
  %inc = phi i32 [ -100000, %entry ], [ %inc1, %loop ]
  %acc = phi i32 [ 100000, %entry ], [ %acc1, %loop ]
  %ivr1 = add i32 %ivr, %inc
  %inc1 = add i32 %inc, 1                 ; M = inc1 = inc + N = X + N
  %acc1 = add i32 %acc, %inc              ; L = acc1 = X + Y
  %and  = and i32 %acc1, -1            ; iW
  %cond = icmp sgt i32 %and, 5
  br i1 %cond, label %exit, label %loop

exit:
  %rv = phi i32 [ %acc1, %loop ]
  ret i32 %rv
}
lib/Analysis/ScalarEvolution.cpp
8273

Please explicitly state this is treating the coefficients L, M, and N as signed integers.

Yeah, this one is a bug in updating of the coefficients.

My approach here, in a single sentence, is to solve the quadratic equation q(n) = 0 using the formula for real numbers. Doing modulo arithmetic is tricky and it's easy to make a false assumption. For example: in modulo arithmetic there is no linear order, or if someone insists on an order, then we lose n+1>n for each n (so it will have different properties from what holds in R or Z). Additionally, a square root will now have multiple values: in modulo 16, 2^2 = 4, but also 14^2 = 4, because -2 is now "positive" as 14. Because of that I wanted to move away from the modulo arithmetic, and instead operate on integers.

I get the "simulated" integers by extending everything to a bit width that will be sufficient for all required operations. Let's assume for a moment that we're not interested in inequalities, but only in the exact solutions to q(n) = 0. In integers there can only be 0, 1 or 2 such solutions, but with the modulo arithmetic that we started with there can be infinitely many solutions. Those extra solutions show up when we have something like 16*16 = 0 (mod 256). So instead of solving q(n) = 0 we have to find all solutions to q(n) = k * 2^BW and pick the smallest non-negative one. A lot of the code in SolveQuadraticEquation tries to figure out how to rewrite q(n) = k * 2^BW as q(n) - p = 0, so we can still solve q'(n) = 0 for some updated q'. It does that by exploiting the properties of a quadratic function: decreasing before the vertex, increasing afterwards, assuming positive coefficient at n^2. It handles the case where we start with some, let's say positive value, then the value decreases (but remains positive), then goes back up and hits 2^BW eventually. I'll get back to this case later on.

There is still that momentary assumption that we only solve for exact n for which q(n) = 0, but the analysis that this is a part of is also interested in inequalities. The inequality here has the form of q(n) > 0 or q(n) < 0 (or greater-than-or-equal, etc), since the addrec gets shifted by the values at the ends of the range. The interesting solution here is the first step where the values change from negative to positive, or from positive to negative. In the original (modulo) arithmetic this can happen when the values cross the 2^BW boundary (any multiple of it, including 0), or when the values cross the signed overflow (i.e. an odd multiple of 2^(BW-1)). At the time of solving the equation we don't really know which one is the correct one, but since we already have results for k * 2^BW, we can stick with that. This is where the signed overflow is ignored. The returned value may not be the correct solution, but since it is validated by the callers, it is ok to return an "educated guess".

Back to the case with decreasing values---in modulo arithmetic adding -1 would actually be adding 2^BW-1. That would always be an overflow, so strictly speaking we'd have to stop the first time it happens. The problem is that this is rarely the right thing to do. To avoid that, the coefficients are treated as signed numbers.

kparzysz updated this revision to Diff 156593.Jul 20 2018, 1:28 PM
kparzysz marked an inline comment as done.

Fixed some bugs, expanded comments, added more tests.

efriedma added inline comments.Jul 20 2018, 2:35 PM
lib/Analysis/ScalarEvolution.cpp
10863

I'm not convinced this is correct: why is the "valid" solution guaranteed to be smaller?

kparzysz added inline comments.Jul 20 2018, 2:42 PM
lib/Analysis/ScalarEvolution.cpp
10863

Are you referring to line 10720? Because it's the number of iterations. Both solutions are positive and both are validated as leaving the range, so the smaller one will happen first.

kparzysz added inline comments.Jul 20 2018, 2:44 PM
lib/Analysis/ScalarEvolution.cpp
10863

In this line, either UpperValid or LowerValid are true, but not both. If UpperValid is false, then the lower solution is.

efriedma added inline comments.Jul 20 2018, 2:50 PM
lib/Analysis/ScalarEvolution.cpp
10863

I'm specifically concerned about the case where U.ult(L) is true, and UpperValid is false (so we return a "valid" solution, but not the smallest solution). Or is that impossible for some reason?

kparzysz added inline comments.Jul 20 2018, 2:57 PM
lib/Analysis/ScalarEvolution.cpp
10863

Yeah, there is a problem.

We need to handle signed overflow as well. The validity of the solution checked here only checks that it leaves the range, but it doesn't check that it's the first one to do so. The solving functions needs to give the guarantee of being the smallest solution.

kparzysz updated this revision to Diff 156872.Jul 23 2018, 2:25 PM

Separated solving of a quadratic addrec into two cases:

  1. 1. Solving for an exact solution (i.e. calculating unsigned overflow: q(n) = k * 2^BW, where 0 is considered a special case of it).
  2. 2. Solving for exiting from range (includes signed and unsigned overflow).

Both cases now guarantee that if a valid solution is found (i.e. nullifying the addrec or leaving the range), it will be the smallest non-negative one.

Updated testcase to include checks for calculating failing solutions (i.e. non-minimal, etc.).

In some cases an addrec will only become 0 after a number of signed and unsigned overflows has occurred. SolveQuadraticAddRecExact will still return the location of the first unsigned overflow (which will fail validation), however separating it from SolveQuadraticAddRecRange makes it easier to change it in the future to look for an exact solution.

I have run this though 100k randomly generated testcases to detect potential crashes (none have occurred).

I like the separate entry points for equality and inequality.

I was curious about exact solutions, so I did a bit of searching; https://arxiv.org/pdf/1711.03621.pdf describes the complete solution for equality.

lib/Analysis/ScalarEvolution.cpp
8305

negate() can overflow; if that overflow is impossible because of the way the inputs are constructed, please add an assertion.

8473

Do we have to return None here? Even if we can't return a root, we should still be able to find the point where it wraps.

If you think None is the correct representation, please ensure SolveQuadraticAddRecRange doesn't succeed in that case.

8560

Is it possible for the truncation to fail? For an AddRec of bitwidth N, evaluateAtIteration(2^N)==evaluateAtIteration(0), so any solution must be less than 2^N. (And all users should treat the backedge-taken count as an unsigned quantity.)

8581

You might want to explicitly state that there is no solution in some cases.

8682

Do we need to check which one is smaller before we check LeavesRange?

kparzysz marked 4 inline comments as done.Jul 24 2018, 12:54 PM
kparzysz added inline comments.
lib/Analysis/ScalarEvolution.cpp
8473

Yes, this case could likely be solved, but we fail to do it.

The code above that updates the C in Ax^2+Bx+C, does that in a way to find the smallest x that causes an overflow (or zero). It uses the standard "real number" analysis for that, which only assures the existence of such a solution (as a real number), but does nothing to make sure that there will be an integer that satisfies the requirements. If a single iteration skips over both zeros of the updated equation then both, the previous and the next iteration will produce positive values and this case will go "unnoticed" by the loop. The code updating C could, in theory, do that to make sure that an integer solution exists, I just haven't thought of an elegant way to do it.

The point about SolveQuadraticAddRecRange is definitely valid and I have fixed it.

8560

It is possible. The test test/Analysis/ScalarEvolution/solve-quadratic-i1.ll has coefficients in i1, but the solution is 2 (the loop iterates 3 times). I'm not sure if this is specific to i1 or not. It hasn't happened in any test I've run outside of i1.

8682

We don't need to, but we can.

kparzysz updated this revision to Diff 157108.Jul 24 2018, 12:55 PM
kparzysz marked 2 inline comments as done.

Addressed comments.

Btw, earlier on I found this paper, but I decided against it because it wouldn't solve inequalities: https://arxiv.org/abs/1507.07513 (and it would be more work...).

efriedma added inline comments.Jul 24 2018, 3:34 PM
lib/Analysis/ScalarEvolution.cpp
8513

I think BitWidth + 1 should be sufficient? At least, that should be enough to convert the AddRec to an equation without losing any information. (Like you note in the comments later, the equation is L + nM + n(n-1)/2 N = 0 mod BitWidth, or 2L + 2M n + n(n-1) N = 0 mod BitWidth+1.)

8518

Do we prefer to sign-extend just so abs(A) is small in common cases? I think the choice of sign-extension vs zero-extension is arbitrary otherwise. Probably worth explicitly noting with a comment.

8560

Oh, I'm just wrong; in general, the solution for an AddRec of bitwidth B needs up to B+1 bits. Maybe worth noting that explicitly.

kparzysz marked 2 inline comments as done.Jul 25 2018, 11:48 AM
kparzysz added inline comments.
lib/Analysis/ScalarEvolution.cpp
8513

I wanted to avoid any overflows, the motivation was 2M-N. I suppose it cannot overflow BitWidth+1 with M and N being BitWidth wide.

8518

Are you talking only about this? There is another extension in SolveQuadraticEquation (at line 8325). Originally there was only one sign-extension, but after reorganizing this code, it has been split into two. I kept them identical to the original one (i.e. both as sign-extension).

In general the choice of sign-extension was actually to avoid "false positives" when checking for overflow. For example, take q(x) = (x+1)(x-2) = x^2 - x - 2, in i3. The zeros are -1 and 2, q(0) = q(1) = -2 (i.e. nothing exceeding the signed range of i3), so we can expect the calculated solution to be 2. With zero-extended values, the function would be r(x) = x^2 + 7x + 6 with both solutions negative: -6 and -1. The equation r(x) = 0 would then become r(x) = 8 to solve for an unsigned overflow, i.e. x^2 + 7x - 2 = 0. The root of that is approx. 0.27, so the returned solution would be 1.

kparzysz updated this revision to Diff 157331.Jul 25 2018, 11:49 AM
kparzysz marked an inline comment as done.

Addressed further comments.

efriedma added inline comments.Jul 25 2018, 12:09 PM
lib/Analysis/ScalarEvolution.cpp
8518

I'm talking about specifically this one: the extension here will still be necessary even if we add an exact modular quadratic SCEV solver.

I guess the answer is essentially the same, but deserves its own comment.

efriedma added inline comments.
lib/Analysis/ScalarEvolution.cpp
8671

Min.hasValue() is always true here.

8675

How do you ensure the correct solution isn't some value between Min and Max?

Similarly, in the case where SolveForBoundary returns None, how can you be sure the correct solution isn't some value between the Max for this boundary and the Min of the other boundary?

kparzysz marked 2 inline comments as done.Jul 26 2018, 10:30 AM
kparzysz added inline comments.
lib/Analysis/ScalarEvolution.cpp
8671

So is Max.hasValue().

8675

How do you ensure the correct solution isn't some value between Min and Max?

Assuming that Min and Max are different values, one of them is when the first signed overflow happens, the other is when the first unsigned overflow happens. Crossing the range boundary is only possible via an overflow (treating 0 as a special case of it, modeling overflow as crossing k*2^W for some k).

The interesting case here is when Min was eliminated as an invalid solution, but Max was not. The argument is that if there was another overflow between Min and Max, it would also have been eliminated if it was considered.

For a given boundary, it is possible to have two overflows of the same type (signed/unsigned) without having the other type in between: this can happen when the vertex of the parabola is between the iterations corresponding to the overflows. This is only possible when the two overflows cross k*2^W for the same k. In such case, if the second one left the range (and was the first one to do so), the first overflow would have to enter the range, which would mean that either we had left the range before or that we started outside of it. Both of these cases are contradictions.

Similarly, in the case where SolveForBoundary returns None, how can you be sure the correct solution isn't some value between the Max for this boundary and the Min of the other boundary?

Assume that we had such Max_A and Min_B corresponding to range boundaries A and B and such that Max_A < Min_B. If there was a solution between Max_A and Min_B, it would have to be caused by an overflow corresponding to either A or B. It cannot correspond to B, since Min_B is the first occurrence of such an overflow. If it corresponded to A, it would have to be either a signed or an unsigned overflow that is larger than both eliminated overflows for A. But between the eliminated overflows and this overflow, the values would cover the entire value space, thus crossing the other boundary, which is a contradiction.

kparzysz updated this revision to Diff 157528.Jul 26 2018, 10:38 AM
kparzysz marked an inline comment as done.

Addressed review comments.

Added comments, an assertion to ensure that the initial value is in range in SolveQuadraticAddRecRange, more debug messages.

kparzysz added inline comments.Jul 26 2018, 10:39 AM
lib/Analysis/ScalarEvolution.cpp
8706

Doh. Leftover '?'.

efriedma accepted this revision.Jul 27 2018, 4:47 PM

LGTM, but this is complicated code, so hopefully someone else will look as well.

This revision is now accepted and ready to land.Jul 27 2018, 4:47 PM

I didn't review the math, but I think there are some easy steps we can take to make this more watertight. Firstly, the code that operates purely on APInt should be moved to APInt.h. With that, we should be able to brute-force all possible inputs for a small bitwidth (i3 or i4) and check that the solutions are correct. What do you think?

I can do the brute force tests.

kparzysz updated this revision to Diff 158353.Jul 31 2018, 11:51 AM

Moved the SolveQuadraticEquation to APIntOps and renamed it to SolveQuadraticEquationWrap.

Changed the comments to no longer refer to any multiplication by 2, now all actions related to that are only in ScalarEvolution.cpp.

Tested all widths between 2 and 10 (inclusive) using a brute-force test from D50095.

Now that the code is in APInt we should have some tests in unittests/ADT/APIntTest.cpp. Can we pull in some subset of D50095 that runs within a reasonable amount of time (only bitwidth 4, for instance)?

I also have some "black box" comments inline on how the function is documented.

include/llvm/ADT/APInt.h
2174 ↗(On Diff #158353)

Would be nice to clarify the bitwidth in which you evaluate q(n) for (b). It can't be BW since any value in [0,2^BW) will either lies in exactly one such interval or all such intervals, depending on how you slice it.

2178 ↗(On Diff #158353)

I'm not sure this is correct -- sub 5, 1 does not unsign-overflow right? (since zext(5) - zext(1) == zext(5 - 1))?

2183 ↗(On Diff #158353)

Is (a) relevant any more? All the coeffs are APInt.

2187 ↗(On Diff #158353)

Can you add a line on how this "q(n) > T, and q(n+1) < T" ties in with the return value of this function specified at the very top "n >= 1 and q(n-1) and q(n) belong to two different intervals..."?

kparzysz marked 4 inline comments as done.Aug 1 2018, 10:24 AM
kparzysz added inline comments.
include/llvm/ADT/APInt.h
2178 ↗(On Diff #158353)

That's true, but then sub 5, 1 is not the same as add 5, -1. I'm treating sub x, y as equivalent to add x, -y. I added a comment explaining this.

kparzysz updated this revision to Diff 158567.Aug 1 2018, 10:26 AM
kparzysz marked an inline comment as done.

Addressed comments. Added the verification code to APIntTest.

Only one non-trivial question inline.

unittests/ADT/APIntTest.cpp
2368 ↗(On Diff #158567)

Why do we need this special case?

2382 ↗(On Diff #158567)

s/VX/ValueAtX

2384 ↗(On Diff #158567)

I don't quite understand this. Don't you need to check that the overflow bits are different from when the equation is evaluated at X-1 (and not from when the equation is evaluated at 0, which is what it looks like you're doing here)?

2387 ↗(On Diff #158567)

Probably better to call this EquationToString?

kparzysz marked 4 inline comments as done.Aug 1 2018, 11:47 AM
kparzysz added inline comments.
unittests/ADT/APIntTest.cpp
2368 ↗(On Diff #158567)

We don't really. Removed.

2384 ↗(On Diff #158567)

The overflow bits should not change between 0 and X-1 inclusive. We're looking for the first X such that they do change, or that the value is 0 (whichever happens first).

kparzysz updated this revision to Diff 158591.Aug 1 2018, 11:48 AM
kparzysz marked 2 inline comments as done.

Addressed comments.

sanjoy accepted this revision.Aug 2 2018, 9:25 AM

lgtm

include/llvm/ADT/APInt.h
2193 ↗(On Diff #158591)

Do you meant to say that "if the coeffs are positive then use a range of BW-1 to find a solution for signed overflow"?

kparzysz added inline comments.Aug 2 2018, 11:30 AM
include/llvm/ADT/APInt.h
2193 ↗(On Diff #158591)

It should work with with any coefficients. I will clarify the comments.

This revision was automatically updated to reflect the committed changes.