When we try to prove some predicate being true on every iteration of a loop,
sometimes we cannot prove that backedge is guarded by the condition we need.
But if the predicate is monotonic, it is enough to prove it on the first and last iteration
of the loop, and this will give us its proof on every loop's iteration.
Details
- Reviewers
sanjoy apilipenko anna reames
Diff Detail
Event Timeline
This is very similar to what we had as an initial implementation of loop predication. Later we found that this approach is problematic. See https://reviews.llvm.org/D37569 description for the details. This patch likely suffers from a similar problem.
I don't think it does. The problem with Loop Predication was that we made an implicit assumption that the loop backedge is taken, and if it actually wasn't (and the loop exited by guard's condition).
In this patch, we require the number of backedge taken count be computable. If the loop fails the guard after N-th iteration, this number should not exceed N. So the end value we take is not the value on theoretical "last" iteration after which we exit by loop condition, but the value before exiting by either exit condition.
Actually if a loop exit by guard, we should either be able to understand that it happens and limit the backedge taken count to this value, or return SCEVCouldNotCompute at this query. So I don't think it's the case here.
I see the point now. Backedge taken count does not factor in the abnormal exits like guard exits. Looks fishy indeed. Need to check whether we may end up with computable number of iterations when we go via overflow, and maybe make our checks more strict to restrain such cases.
@apilipenko you were right. I was able to construct a test which demonstrates the problem:
define void @test_03(i32* %p, i32* %a) { entry: %len = load i32, i32* %p, align 4, !range !0 %len.minus.1 = add nsw i32 %len, -1 %zero_check = icmp eq i32 %len, 0 br i1 %zero_check, label %loopexit, label %preheader preheader: br label %loop loopexit: ret void loop: %iv = phi i32 [ %iv.next, %loop ], [ -2000000000, %preheader ] %iv.wide = zext i32 %iv to i64 %el = getelementptr inbounds i32, i32* %a, i64 %iv.wide %fishy = icmp sgt i32 %iv, -2000000009 call void(i1, ...) @llvm.experimental.guard(i1 %fishy) [ "deopt"() ] store atomic i32 0, i32* %el unordered, align 4 %iv.next = add nsw i32 %iv, -1 %loopcond = icmp eq i32 %iv, 2000000000 br i1 %loopcond, label %loopexit, label %loop }
-indvars here produces infinite loop instead of deopt after reachng -2000000009. I need to think on alternative solution.
Added checks to avoid circular logic in removing of side exits basing on facts that are derived from side-exits.
test/Transforms/IndVarSimplify/prove_via_monotonicity.ll | ||
---|---|---|
20 | I think we can optimize these cases without thinking about monotonicity -- in both @test_01 and @test_02 to widen the IV you need to prove that *if* the backedge is taken *then* %iv is not 0; and SCEV should be able to prove that easily from the BE condition. |
test/Transforms/IndVarSimplify/prove_via_monotonicity.ll | ||
---|---|---|
20 | It looks like isImpliedCondOperandsViaNoOverflow could have done that, but it doesn't currently support sgt predicate (which is weird). I will try to come up with alternative solution there. |
Abandoning in favor of https://reviews.llvm.org/D43759
test/Transforms/IndVarSimplify/prove_via_monotonicity.ll | ||
---|---|---|
20 |
I think we can optimize these cases without thinking about monotonicity -- in both @test_01 and @test_02 to widen the IV you need to prove that *if* the backedge is taken *then* %iv is not 0; and SCEV should be able to prove that easily from the BE condition.