Page MenuHomePhabricator

Rework loop predication pass
ClosedPublic

Authored by apilipenko on Sep 7 2017, 7:18 AM.

Details

Summary

We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation.

The problem.

Consider the loop:

for (int i = b; i != e; i++)
  guard(i u< len)

The current implementation will ask SCEV if the guard condition is monotonic and if it is will replace the IV in the condition with the value of the IV at the end of the loop. In this loop the increment will be marked as nuw because of the guard. Basing on that SCEV will mark the corresponding add recurrence as nuw, which makes the guard condition monotonic. So, loop predication replaces the guard condition with e <u len condition. This transformation is not correct, for example, the loop is now broken for b = 5, e = 4.

Generally the facts SCEV provides are true if the backedge of the loop is taken, which implicitly assumes that the guard doesn't fail. Using these facts to optimize the guard results in a circular logic where the guard is optimized under the assumption that it never fails.

The fix.

An alternative way to reason about loop predication is to use an inductive proof
approach. Given the loop:

if (B(S)) {
  do {
    I = PHI(S, I.INC)
    I.INC = I + S
    guard(G(I));
  } while (B(I.INC));
}

where B(x) and G(x) are predicates that map integers to booleans, we want a
loop invariant expression M such the following program has the same semantics
as the above:

if (B(S)) {
  do {
    I = PHI(S, I.INC)
    I.INC = I + S
    guard(G(S) && M);
  } while (B(I.INC));
}

One solution for M is M = forall X . (G(X) && B(X + S)) => G(X + S)

For now the transformation is limited to the following case:

  • The loop has a single latch with either ult or slt icmp condition.
  • The step of the IV used in the latch condition is 1.
  • The IV of the latch condition is the same as the post increment IV of the guard condition.
  • The guard condition is ult.

In this case the latch is of the from:
++i u< latchLimit or ++i s< latchLimit
and the guard is of the form:
i u< guardLimit

For the unsigned latch comparison case M is:
forall X . X u< guardLimit && (X + 1) u< latchLimit => (X + 1) u< guardLimit

This is true if latchLimit u<= guardLimit since then
(X + 1) u< latchLimit u<= guardLimit == (X + 1) u< guardLimit.

So the widened condition is:
i.start u< guardLimit && latchLimit u<= guardLimit

For the signed latch comparison case M is:
forall X . X u< guardLimit && (X + 1) s< latchLimit => (X + 1) u< guardLimit

The only way the antecedent can be true and the consequent can be false is if
X == guardLimit - 1
(and guardLimit is non-zero, but we won't use this latter fact).
If X == guardLimit - 1 then the second half of the antecedent is
guardLimit s< latchLimit
and its negation is
latchLimit s<= guardLimit.

So the widened condition is:
i.start u< guardLimit && latchLimit s<= guardLimit

This patch implements the logic above. In the follow up changes the following will be supported:

  • ule, sle as the latch conditions
  • the guard IV and the latch IV start from different values (like, latch IV is {1,+,1}, range check IV {2,+,1})

Diff Detail

Repository
rL LLVM

Event Timeline

apilipenko created this revision.Sep 7 2017, 7:18 AM
apilipenko edited the summary of this revision. (Show Details)
apilipenko edited the summary of this revision. (Show Details)Sep 7 2017, 7:20 AM
apilipenko added a reviewer: reames.
mkazantsev added inline comments.Sep 11 2017, 5:21 AM
lib/Transforms/Scalar/LoopPredication.cpp
306 ↗(On Diff #114176)

Why do we need to expand Start? It comes from above the loop, we should be able to reuse it.

415 ↗(On Diff #114176)

Doesn't check on isOne() automatically check affinity?

422 ↗(On Diff #114176)

Should be *Step.

sanjoy accepted this revision.Sep 12 2017, 11:35 PM

I did not check the actual code changes carefully -- I've assumed they're in sync with the comments -- but let me know if you want me to take a look. The comments and the math bits LGTM.

lib/Transforms/Scalar/LoopPredication.cpp
38 ↗(On Diff #114176)

I think we should be clearer here: the facts that SCEV proves about the *increment step* of add recurrences are true if the backedge of the loop is taken

75 ↗(On Diff #114176)

Can you please add a short inductive informal proof on why this is sufficient? You should also state that M or anything "stronger" (i.e. any condition that implies M) will do.

110 ↗(On Diff #114176)

I would append this to make things 100% clear:

"""
In other words, if latchLimit s<= guardLimit then (the ranges below are written in ConstantRange notation, where [A, B) is the set for (I = A; I != B; I++ /*maywrap*/) yield(I);):

   forall X . X u< guardLimit && (X + 1) s< latchLimit =>  (X + 1) u< guardLimit
== forall X . X u< guardLimit && (X + 1) s< guardLimit =>  (X + 1) u< guardLimit
== forall X . X in [0, guardLimit) && (X + 1) in [INT_MIN, guardLimit) =>  (X + 1) in [0, guardLimit)
== forall X . X in [0, guardLimit) && X in [INT_MAX, guardLimit-1) =>  X in [-1, guardLimit-1)
== forall X . X in [0, guardLimit-1) => X in [-1, guardLimit-1)
== true

"""

397 ↗(On Diff #114176)

I don't think you need parens around A == B.

This revision is now accepted and ready to land.Sep 12 2017, 11:35 PM
This revision was automatically updated to reflect the committed changes.