This is an archive of the discontinued LLVM Phabricator instance.

[IRCE] Recognize loops with ne/eq latch conditions
ClosedPublic

Authored by mkazantsev on Jul 5 2017, 6:42 AM.

Details

Summary

In some particular cases eq/ne conditions can be turned into equivalent slt/sgt conditions.
This patch teaches parseLoopStructure to handle some of these cases.

Diff Detail

Event Timeline

mkazantsev created this revision.Jul 5 2017, 6:42 AM
sanjoy requested changes to this revision.Jul 8 2017, 12:16 PM

Did you consider a somewhat more general trick of using the loop's BE taken count to derive an equivalent BE condition? E.g. if the loops be taken count is 100, and the post-incremented indvar (the one the BE condition uses) is {5,+,1} (everything is i32) then the BE taken condition can be ++I s< 105 (and you can derive this without looking at whether the BE condition was expressed using SLT or NE or EQ etc.).

lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
810

I don't think this equivalence holds if the RHS is negative.

This revision now requires changes to proceed.Jul 8 2017, 12:16 PM
mkazantsev planned changes to this revision.Jul 9 2017, 10:34 PM

Seems that the patch has a bug, needs rework.

lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
810

What's the problem with negative RHS? There is no conceptual difference between

while (++i != 5) ---> while (++i < 5)

and

while (++i != -5) ---> while (++i < -5)

Maybe what you meant was "if the RHS is less than the IV's initial value"? Such loops get rejected earlier, test_02 exercises this situation. Will add an assert on that.

850

Bug here: if we have reduced RightSCEV by one on line 822, we should not increase RightValue by 1.

901

Same here.

mkazantsev added inline comments.Jul 9 2017, 11:40 PM
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
810

Actually not earlier, but under this condition we will bail out on line 842.

mkazantsev edited edge metadata.
mkazantsev marked 2 inline comments as done.

Extended test base, fix a bug with shifting comparison borders when we shouldn't.

Did you consider a somewhat more general trick of using the loop's BE taken count to derive an equivalent BE condition? E.g. if the loops be taken count is 100, and the post-incremented indvar (the one the BE condition uses) is {5,+,1} (everything is i32) then the BE taken condition can be ++I s< 105 (and you can derive this without looking at whether the BE condition was expressed using SLT or NE or EQ etc.).

I think we still need to parse the loop structure to be able to duplicate it correctly, but the idea is interesting. I will think around it.

lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
810

Tests 2, 4, 6, 8 check such situations and make sure that the IRCE doesn't apply to them.

sanjoy requested changes to this revision.Jul 15 2017, 3:59 PM

Some comments / questions inline.

lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
818

I'm not sure why you need to handle this case differently than the above. That is,

while (true) {
  if (++i == len)
    break;
}

seems equivalent to

while (++i != len) {
}
852

I'm missing a link here. IIUC:

  • You started with the backedge taken condition as !(++i == len)
  • !(++i == len) is equivalent to !(++i >s (len - 1)), after checking len - 1 does not overflow (this is the new step you added)
  • !(++i >s (len - 1)) is equivalent to (++i s<= (len - 1))
  • (++i s<= (len - 1)) is equivalent to (++i s< len) after checking that (len - 1) + 1 does not overflow

Where does DecreasedRightValueByOne come in?

This revision now requires changes to proceed.Jul 15 2017, 3:59 PM
mkazantsev added inline comments.Jul 16 2017, 8:37 PM
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
818

The difference here is in LatchBrExitIdx. Here is a tricky moment. What we have here is

loop:
  %cond = icmp eq i32 %iv.next, %len
  br i1 %cond, label %exit, label %loop

and

loop:
  %cond = icmp ne i32 %iv.next, %len
  br i1 %cond, label %loop, label %exit

We could transform them both into

loop:
  %cond = icmp le i32 %iv.next, %len
  br i1 %cond, label %loop, label %exit

And set LatchBrExitIdx to 1. But LatchBrExitIdx is then stored into a structure and used later. It means that what we should not just change value of LatchBrExitIdx, but also change actual successors of branch instruction. And we should either remember that we did this interchange and swap them in the end of the method, or swap them just after this replacement (risking to have CFG changes before we checked that the gates pass).

This approach looks more complex for understanding, and also looks more error-prone. That's why I separate these cases here.

852

In case 2. After we go from (++i == length) to (++i >s (len - 1)), the new RighValue we are working with is actually len - 1. We don't need to create such an instruction, though. RightValue is only used to calculate RightValue + 1. Knowing that we have decreased the old RightValue by 1, we could just not add this 1 and use the old value of len instead.

mkazantsev added inline comments.Jul 16 2017, 8:39 PM
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
818

should be lt, not le

sanjoy accepted this revision.Jul 17 2017, 5:00 PM
sanjoy added inline comments.
lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
852

I see -- do you mind (when you have time) refactoring this code a bit to make things like this more clear? For instance, it is confusing that we have RightValue and RightSCEV that are not in sync at certain points in the control flow.

One way to do this would be to have LoopStructure contain a "virtual" version of the latch branch (i.e. ICmpInst::Predicate Pred, const SCEV *LHS, const SCEV *RHS, BasicBlock *TrueBranch, BasicBlock *FalseBranch) and always rewrite the latch branch to be this "canonical" version during IRCE (after the gating checks have passed).

This revision is now accepted and ready to land.Jul 17 2017, 5:00 PM

Yes, I will try to make it more clear once I have some time for that. Thanks!

This revision was automatically updated to reflect the committed changes.