When removing exiting loop conditions, we only consider checks for
which we know the exact exit count. We could also eliminate checks for
which the condition is always true/false.
Details
Diff Detail
Event Timeline
Max, I think this is probably handled by the existing optimizeLoopExits if the exit counts for the condition are computable. (i.e. you're probably looking at a gap in exit count computation)
Thanks for a clue Philip, you are kind of right, but I don't think we can fix exit count computation. Here's the motivating example:
define i32 @test_01(i32* %p) { entry: %len = load i32, i32* %p, align 4, !range !0 br label %loop loop: ; preds = %backedge, %entry %iv = phi i32 [ %len, %entry ], [ %iv.next, %backedge ] %iv.next = add nsw i32 %iv, -1 %rc = icmp slt i32 %iv.next, %len br i1 %rc, label %backedge, label %fail backedge: ; preds = %loop %loop.cond = icmp ne i32 %iv, 0 br i1 %loop.cond, label %loop, label %exit fail: ; preds = %loop ret i32 -1 exit: ; preds = %backedge ret i32 0 }
The block loop just never exits (if we leave this check alone then the loop is infinite). So formally, we do not know the exit count for this block. The only thing we can do is we can try to prove that, no matter what this exit count is, it is greater than max backedge taken count. This may work.
Reframing the solution with accordance with @reames 's comment, starting from the most trivial cases and moving towards more complex ones.
This looks about reasonable to me, but i don't feel confident reviewing this.
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp | ||
---|---|---|
2348–2352 | Should this just be return SE->isKnownPredicate(Pred, LHSS, RHSS);, | |
2376 | // Likewise, the loop latch must be dominated by the exiting BB. |
Internal testing revealed bugs, look weird, investigating.
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp | ||
---|---|---|
2348–2352 | Other changes planned on top, see dependent patch. |
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp | ||
---|---|---|
2342 | Should invert if false succ is the loop block. |
Addressed comments, fixed bug with non-loop true successor, added corresponding test.
LGTM. Nice restructure.
An idea for another way to approach this would be to add ability to SCEV to prove an exit is infinite directly. That might be useful in other places. We could consider either a new function isExitCountInfinite or a new SCEVInfinite node.
Just to be clear, this is now brainstorming, not a review comment in the sense that I expect you to act on it. :)
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp | ||
---|---|---|
2428 | Minor, pull out the cast into a variable please. |
What is the relationship between this code and SimplifyIndvar::eliminateIVComparison()? Is this basically the same thing, just for icmps against something other than the IV itself?
If I'm understanding this correctly, this optimization doesn't really seem to be related to loop exits and could apply to any icmp in the loop. Looking at it in terms of a loop exit seems to make the logic more complicated (e.g. the double inversion of the predicate).
If we want to generally handle icmps involving SCEVs which were not either uses of IVs (directly) or loop exits, I agree this code would be good to pull out and use.
The catch is the the IV logic only visits a small subset of icmps. If you have any form of complicated expression feeding the icmp, it gives up without every reaching it.
I'll note that I think it's seriously worth considering whether we want a transform which just visits all icmps in a loop and tries to eliminate them, but that's a broader task than this patch. :)
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp | ||
---|---|---|
2359 | Well, they are kind of still analyzable. We just do not know exact exit count for them, but it does not mean we know absolutely nothing. |
Should invert if false succ is the loop block.