We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation.
The problem.
Consider the loop:
for (int i = b; i != e; i++) guard(i u< len)
The current implementation will ask SCEV if the guard condition is monotonic and if it is will replace the IV in the condition with the value of the IV at the end of the loop. In this loop the increment will be marked as nuw because of the guard. Basing on that SCEV will mark the corresponding add recurrence as nuw, which makes the guard condition monotonic. So, loop predication replaces the guard condition with e <u len condition. This transformation is not correct, for example, the loop is now broken for b = 5, e = 4.
Generally the facts SCEV provides are true if the backedge of the loop is taken, which implicitly assumes that the guard doesn't fail. Using these facts to optimize the guard results in a circular logic where the guard is optimized under the assumption that it never fails.
The fix.
An alternative way to reason about loop predication is to use an inductive proof
approach. Given the loop:
if (B(S)) { do { I = PHI(S, I.INC) I.INC = I + S guard(G(I)); } while (B(I.INC)); }
where B(x) and G(x) are predicates that map integers to booleans, we want a
loop invariant expression M such the following program has the same semantics
as the above:
if (B(S)) { do { I = PHI(S, I.INC) I.INC = I + S guard(G(S) && M); } while (B(I.INC)); }
One solution for M is M = forall X . (G(X) && B(X + S)) => G(X + S)
For now the transformation is limited to the following case:
- The loop has a single latch with either ult or slt icmp condition.
- The step of the IV used in the latch condition is 1.
- The IV of the latch condition is the same as the post increment IV of the guard condition.
- The guard condition is ult.
In this case the latch is of the from:
++i u< latchLimit or ++i s< latchLimit
and the guard is of the form:
i u< guardLimit
For the unsigned latch comparison case M is:
forall X . X u< guardLimit && (X + 1) u< latchLimit => (X + 1) u< guardLimit
This is true if latchLimit u<= guardLimit since then
(X + 1) u< latchLimit u<= guardLimit == (X + 1) u< guardLimit.
So the widened condition is:
i.start u< guardLimit && latchLimit u<= guardLimit
For the signed latch comparison case M is:
forall X . X u< guardLimit && (X + 1) s< latchLimit => (X + 1) u< guardLimit
The only way the antecedent can be true and the consequent can be false is if
X == guardLimit - 1
(and guardLimit is non-zero, but we won't use this latter fact).
If X == guardLimit - 1 then the second half of the antecedent is
guardLimit s< latchLimit
and its negation is
latchLimit s<= guardLimit.
So the widened condition is:
i.start u< guardLimit && latchLimit s<= guardLimit
This patch implements the logic above. In the follow up changes the following will be supported:
- ule, sle as the latch conditions
- the guard IV and the latch IV start from different values (like, latch IV is {1,+,1}, range check IV {2,+,1})