This is an archive of the discontinued LLVM Phabricator instance.

[LVI] Greatly strengthen inductive reasoning on predicates
Abandoned · Public

Authored by reames on Feb 11 2016, 3:37 PM.

Details

Summary

This patch greatly improves LVI's ability to prove inductive invariants in loops.

There are two parts to the change:

  1. Teach the getValueAt interface to use the underlying LVI framework. This should have been a simple change that strengthened our ability to prove predicates, but it ran into the second issue.
  2. Depending on query order, LVI would sometimes fail to establish facts about loop phis and then poison all future queries. By adding the check in (1), this became far more common. This turns out to be the more interesting and much more subtle part.

The root problem resulting in (2) was that when directly querying the PHI node in a loop, we were applying edge constraints only after visiting every other node. We'd push the PHI onto the stack, push each operand in turn, and eventually hit the PHI again. At that point, we'd stop the search (required to prevent infinite recursion) and use Overdefined for the value of the PHI. We'd then unwind, applying each set of constraints in turn, with the edge constraints of the PHI applied only as the very last step.

This often worked out fine, but in some cases we ran into a precision problem. (Consider @test2 in the induction.ll test case.) If we have two constant ranges with no precise unique intersection (e.g. <0, 400> and <1, INT_MIN+1>), we'd end up with an imprecise range for the PHI. Since we cache this value, it ends up poisoning all later queries, even those which would have succeeded if rerun in full.
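To make the precision loss concrete, here is a minimal standalone sketch against the ConstantRange API; the constants are illustrative (chosen to force a two-piece exact intersection) rather than the exact ranges from the test. The point is only that a ConstantRange describes a single contiguous, possibly wrapped interval, so intersectWith has to return a superset whenever the exact intersection is not one interval:

#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  // [0, 400): the kind of bound the loop exit test establishes.
  ConstantRange LoopBound(APInt(32, 0), APInt(32, 400));
  // [390, 10): a wrapped range; its exact intersection with LoopBound is
  // {0..9} U {390..399}, which is not expressible as a single interval.
  ConstantRange Wrapped(APInt(32, 390), APInt(32, 10));

  // The result must be a single range covering both pieces, so it necessarily
  // also contains values that lie in neither piece; once a widened range like
  // this is cached, later queries inherit the precision loss.
  ConstantRange Result = LoopBound.intersectWith(Wrapped);
  Result.print(errs());
  errs() << "\n";
  return 0;
}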

The root issue is that we end up trying to analyze an arithmetic operation ("add i32 %iv, 1") with insufficient information available to prove it can't overflow, even when the loop itself statically guarantees that! This is what causes the intersection imprecision mentioned above.
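In the same spirit, here is a minimal sketch (again with illustrative values, not code from the patch) of how much the increment's range depends on what is already known about %iv at the point the add is analyzed:

#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  ConstantRange One(APInt(32, 1));                // the constant 1
  ConstantRange IV(APInt(32, 0), APInt(32, 400)); // %iv known to lie in [0, 400)
  ConstantRange Unknown(32, /*isFullSet=*/true);  // %iv overdefined

  // [0, 400) + 1 == [1, 401): the bound excludes overflow, so we get the
  // tight range that also shows up in the debug traces below.
  IV.add(One).print(errs());
  errs() << "\n";

  // full-set + 1 == full-set: with nothing known about %iv yet, the add can
  // prove nothing, which is the imprecision described above.
  Unknown.add(One).print(errs());
  errs() << "\n";
  return 0;
}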

The fix for this is essentially to visit the first value twice. By ensuring that the PHI node is the *last* thing pushed, when we start unwinding we get to take advantage of the edge constraints and prove a stronger constraint over all the other values (specifically, the increment).

Worth noting: this appears to be the same solution previously chosen for getEdgeValue, even though the code doesn't say so explicitly.

This brings up a general point: maybe we should be handling the recursion limit differently. While we do need to prevent *infinite* recursion, this patch only fixes part of the issue; an unfortunate choice of starting point elsewhere in the cycle can lead to the same bad result. It might be worth considering (in future work!) always allowing the recursion to visit the PHI nodes twice (which would require visiting some other nodes twice as well). That would be far more stable.

P.S. In case anyone is curious, the alternative approach (to this patch only) would be to directly query non-recursive facts after deciding not to push operands. I tried that, and it ended up being far more complicated. In particular, it required a change in how we prevent infinite recursion, which turned into a real mess. I think this approach is much cleaner.

Diff Detail

Event Timeline

reames updated this revision to Diff 47737. Feb 11 2016, 3:37 PM
reames retitled this revision to [LVI] Greatly strengthen inductive reasoning on predicates.
reames updated this object.
reames added reviewers: hfinkel, sanjoy, nicholas.
reames added a subscriber: llvm-commits.
sanjoy edited edge metadata. Feb 22 2016, 3:02 PM

ping?

I'll take a look today.

I think (but am by no means sure) that this change can pessimize some cases. Since LVI does not have -analyze, that is somewhat difficult to demonstrate, but if you have

define i8 @g(i32 %a, i32 %length, i1* %cc) {
entry:
  br label %loop

loop:
  %iv = phi i32 [0, %entry], [%iv.next, %backedge]
  %iv.next = add i32 %iv, 1
  %cnd = icmp sgt i32 %iv.next, 0
  br i1 %cnd, label %backedge, label %exit

backedge:
  %cont = icmp slt i32 %iv.next, 400
  br i1 %cont, label %loop, label %exit

exit:
  ret i8 0
}

Then on master, running opt -S -jump-threading -debug-only=lazy-value-info -disable-output < FileName (crude replacement for -analyze -lazy-value-info) gives me:

LVI Getting value   %iv.next = add i32 %iv, 1 at ''
  Result = overdefined
LVI Getting value   %iv.next = add i32 %iv, 1 at ''
  Result = overdefined
LVI Getting edge value   %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'
PUSH:   %iv.next = add i32 %iv, 1 in loop
PUSH:   %iv = phi i32 [ 0, %entry ], [ %iv.next, %backedge ] in loop
PUSH:   %iv.next = add i32 %iv, 1 in backedge
POP   %iv.next = add i32 %iv, 1 in backedge = constantrange<1, -2147483648>
POP   %iv = phi i32 [ 0, %entry ], [ %iv.next, %backedge ] in loop = constantrange<0, 400>
POP   %iv.next = add i32 %iv, 1 in loop = constantrange<1, 401>
  Result = constantrange<1, 401>
LVI Getting edge value   %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'
  Result = constantrange<1, 401>

but after your change I get

LVI Getting value   %iv.next = add i32 %iv, 1 at ''
LVI Getting block end value   %iv.next = add i32 %iv, 1 at 'loop'
PUSH:   %iv = phi i32 [ 0, %entry ], [ %iv.next, %backedge ] in loop
PUSH:   %iv.next = add i32 %iv, 1 in backedge
PUSH:   %iv.next = add i32 %iv, 1 in loop
POP   %iv.next = add i32 %iv, 1 in loop = overdefined
POP   %iv.next = add i32 %iv, 1 in backedge = constantrange<1, -2147483648>
POP   %iv = phi i32 [ 0, %entry ], [ %iv.next, %backedge ] in loop = constantrange<0, 400>
  reuse BB 'loop' val=overdefined
  Result = overdefined
  Result = overdefined
LVI Getting value   %iv.next = add i32 %iv, 1 at ''
LVI Getting block end value   %iv.next = add i32 %iv, 1 at 'backedge'
  Result = constantrange<1, -2147483648>
  Result = constantrange<1, -2147483648>
LVI Getting edge value   %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'
  Result = constantrange<1, -2147483648>
LVI Getting edge value   %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'
  Result = constantrange<1, -2147483648>

The difference is in "LVI Getting edge value %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'". However, I have yet to make this "regression" result in a missed transform, so it is very possible that there is an invariant which guarantees that the above difference does not matter by construction. Is that the case?

I think you had explained this to me in person before, but putting it in writing will both help me remember what we decided and serve as a record.

lib/Analysis/LazyValueInfo.cpp
1159

Minor stylistic thing: I'd keep the braces around the body, even if they're not syntactically required. The body is fairly big, textually.

sanjoy requested changes to this revision. Feb 25 2016, 10:33 PM
sanjoy edited edge metadata.

"Requesting changes" to get this off my queue.

This revision now requires changes to proceed. Feb 25 2016, 10:33 PM

The difference is in "LVI Getting edge value %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'". However, I have yet to make this "regression" result in a missed transform, so it is very possible that there is an invariant which guarantees that the above difference does not matter by construction. Is that the case?

I think you had explained this to me in person before, but putting it in writing will both help me remember what we decided and serve as a record.

It is possible for this patch to cause a regression in result quality in some rare cases. The tradeoff is that we're able to catch other cases which are much more common in practice.

Looking at your *particular* example, I think you've just found an unrelated bug in LVI. In particular, we've computed a tight bound for the phi, but for some reason don't use that for the next step of the unwinding analysis. That's more than a bit strange.

The only way I know of to handle the general case without regressions is the double iteration scheme I mentioned in my description: stop not when encountering the first node in the cycle, but the last node in the cycle (i.e. before pushing any value three times).

I think the tradeoff is worth it in practice.

Actually, the example is related to this change. It's still worth taking the patch as is and addressing that separately, though. The issue your example highlights is that we poison the cache when we encounter %iv.next in the cycle, and then use the cached overdefined value rather than recomputing. We should bypass the cache for the query instruction itself to avoid this problem.

The difference is in "LVI Getting edge value %iv.next = add i32 %iv, 1 from 'loop' to 'backedge'". However, I have yet to make this "regression" result in a missed transform, so it is very possible that there is an invariant which guarantees that the above difference does not matter by construction. Is that the case?

I think you had explained this to me in person before, but putting it in writing will both help me remember what we decided and serve as a record.

It is possible for this patch to cause a regression in result
quality in some rare cases. The tradeoff is that we're able to catch
other cases which are much more common in practice.

What are those cases? Is there a reason why they would be rarer in
practice? I'm okay with making a tradeoff, as long as we know what the
tradeoff is.

Looking at your *particular* example, I think you've just found an
unrelated bug in LVI. In particular, we've computed a tight bound for
the phi, but for some reason don't use that for the next step of the
unwinding analysis. That's more than a bit strange.

The only way I know of to handle the general case without regressions is
the double iteration scheme I mentioned in my description: stop not
when encountering the first node in the cycle, but the last node in
the cycle (i.e. before pushing any value three times).

Why would that scheme handle all cases? IOW, why is visiting every
value in the cycle twice enough (and not weaker than pushing every
value in the cycle thrice)?

reames abandoned this revision. Dec 3 2020, 3:06 PM