In some particular cases eq/ne conditions can be turned into equivalent slt/sgt conditions.
This patch teaches parseLoopStructure to handle some of these cases.
Details
Diff Detail
Event Timeline
Did you consider a somewhat more general trick of using the loop's BE taken count to derive an equivalent BE condition? E.g. if the loops be taken count is 100, and the post-incremented indvar (the one the BE condition uses) is {5,+,1} (everything is i32) then the BE taken condition can be ++I s< 105 (and you can derive this without looking at whether the BE condition was expressed using SLT or NE or EQ etc.).
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
810 | I don't think this equivalence holds if the RHS is negative. |
Seems that the patch has a bug, needs rework.
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
810 | What's the problem with negative RHS? There is no conceptual difference between while (++i != 5) ---> while (++i < 5) and while (++i != -5) ---> while (++i < -5) Maybe what you meant was "if the RHS is less than the IV's initial value"? Such loops get rejected earlier, test_02 exercises this situation. Will add an assert on that. | |
850 | Bug here: if we have reduced RightSCEV by one on line 822, we should not increase RightValue by 1. | |
901 | Same here. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
810 | Actually not earlier, but under this condition we will bail out on line 842. |
I think we still need to parse the loop structure to be able to duplicate it correctly, but the idea is interesting. I will think around it.
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
810 | Tests 2, 4, 6, 8 check such situations and make sure that the IRCE doesn't apply to them. |
Some comments / questions inline.
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
818 | I'm not sure why you need to handle this case differently than the above. That is, while (true) { if (++i == len) break; } seems equivalent to while (++i != len) { } | |
852 | I'm missing a link here. IIUC:
Where does DecreasedRightValueByOne come in? |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
818 | The difference here is in LatchBrExitIdx. Here is a tricky moment. What we have here is loop: %cond = icmp eq i32 %iv.next, %len br i1 %cond, label %exit, label %loop and loop: %cond = icmp ne i32 %iv.next, %len br i1 %cond, label %loop, label %exit We could transform them both into loop: %cond = icmp le i32 %iv.next, %len br i1 %cond, label %loop, label %exit And set LatchBrExitIdx to 1. But LatchBrExitIdx is then stored into a structure and used later. It means that what we should not just change value of LatchBrExitIdx, but also change actual successors of branch instruction. And we should either remember that we did this interchange and swap them in the end of the method, or swap them just after this replacement (risking to have CFG changes before we checked that the gates pass). This approach looks more complex for understanding, and also looks more error-prone. That's why I separate these cases here. | |
852 | In case 2. After we go from (++i == length) to (++i >s (len - 1)), the new RighValue we are working with is actually len - 1. We don't need to create such an instruction, though. RightValue is only used to calculate RightValue + 1. Knowing that we have decreased the old RightValue by 1, we could just not add this 1 and use the old value of len instead. |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
818 | should be lt, not le |
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp | ||
---|---|---|
852 | I see -- do you mind (when you have time) refactoring this code a bit to make things like this more clear? For instance, it is confusing that we have RightValue and RightSCEV that are not in sync at certain points in the control flow. One way to do this would be to have LoopStructure contain a "virtual" version of the latch branch (i.e. ICmpInst::Predicate Pred, const SCEV *LHS, const SCEV *RHS, BasicBlock *TrueBranch, BasicBlock *FalseBranch) and always rewrite the latch branch to be this "canonical" version during IRCE (after the gating checks have passed). |
I don't think this equivalence holds if the RHS is negative.