This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Fix finite loop non-strict predicate simplification (PR60944)
ClosedPublic

Authored by nikic on Mar 7 2023, 8:20 AM.

Details

Summary

There are a number of issues with the current code for converting ule -> ult (etc) predicates for comparisons controlling finite loops:

  • It sets nowrap flags, which may only hold for that particular comparison, not globally. (PR60944)
  • It doesn't check that the RHS is invariant. (I'm not sure this can cause practical issues independently of the previous point.)
  • It runs before simplifications that may be more profitable. (PR54191)

This patch moves the handling for this into computeExitLimitFromICmp(), because it is somewhat tightly coupled with assumptions in that code, and addresses the aforementioned issues.

Fixes https://github.com/llvm/llvm-project/issues/60944.
Fixes https://github.com/llvm/llvm-project/issues/54191.

Diff Detail

Event Timeline

nikic created this revision.Mar 7 2023, 8:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2023, 8:20 AM
nikic requested review of this revision.Mar 7 2023, 8:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2023, 8:20 AM
mkazantsev added inline comments.Mar 8 2023, 7:58 PM
llvm/lib/Analysis/ScalarEvolution.cpp
9184

Can this overflow?

StephenFan added inline comments.Mar 8 2023, 9:48 PM
llvm/lib/Analysis/ScalarEvolution.cpp
9184

Yes, it can. Although it can't overflow in loop L, it still may overflow somewhere out of the loop. See test file pr60944.ll below.
In loop2, %iv + 1 can not overflow as a result of mustprogress attribute, but %iv + 1 can still overflow in loop.

mkazantsev added inline comments.Mar 13 2023, 12:43 AM
llvm/lib/Analysis/ScalarEvolution.cpp
9184

I don't quite get the logic here. Imagine that the loops has two exits, one exit is iv <= MAX_INT (so it is never taken) and another exit is guaranteed to exit in 100 iterations (so the loop is finite). Then you just turn this first exit condition into iv < MIN_INT which is trivially false and immediately exit on the 1st iteration. To me it doesn't matter if the loop is finite or not, you cannot add one here.

nikic added inline comments.Mar 13 2023, 1:02 AM
llvm/lib/Analysis/ScalarEvolution.cpp
9184

ControllingFiniteLoop is ControlsExit && loopHasNoAbnormalExits(L) && loopIsFiniteByAssumption(L). This means that the lopo is finite *and* only has one exit.

mkazantsev added inline comments.Mar 13 2023, 8:46 PM
llvm/lib/Analysis/ScalarEvolution.cpp
9184

I don't understand. Here is what ControlsExit is from the comment of computeExitLimitFromCond:

/// \p ControlsExit is true if ExitCond directly controls the exit
/// branch. In this case, we can assume that the loop exits only if the
/// condition is true and can infer that failing to meet the condition prior
/// to integer wraparound results in undefined behavior.

It doesn't mean that there is only one exit. It only means that, no matter what exit is taken, the condition is going to be true. How do you infer ability to add 1 and guarantee the new (non-equivalent) condition under the same circumstances?

mkazantsev accepted this revision.Mar 14 2023, 2:42 AM

Ok, thanks for clarification. Maybe rename this into IsOnlyExit or ControlsOnlyExit?

This revision is now accepted and ready to land.Mar 14 2023, 2:42 AM
This revision was landed with ongoing or failed builds.Mar 14 2023, 2:59 AM
This revision was automatically updated to reflect the committed changes.

I'm seeing a regression for the following loop:

define void @foo1(ptr %A, i32 %m, i32 %n) mustprogress {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i.06 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %inc = add i32 %i.06, 1
  %cmp.not = icmp ugt i32 %inc, %m
  br i1 %cmp.not, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}

We used to produce a backedge-taken count of %m, but now we produce (-1 + (1 umax (1 + %m))).

We used to produce a backedge-taken count of %m, but now we produce (-1 + (1 umax (1 + %m))).

On your reduced example, it's interesting the the loop backedge is guaranteed to execute when %m is defined. This might be an over-reduction, but we might be able to do a corrected version of D148107 based on the defining scope bound of RHS being guaranteed to reach the backedge test. I'm a bit worried about both the compile time of that, and that I'm over specializing for your possible over-reduced example.

The original C code is something like for (unsigned i = 0; i <= m; i++). The reduction just cut out the loop body.

At a high level, determining that the "max" is unnecessary shouldn't be hard; the only reason it's tricky is that howManyLessThans doesn't know that "m+1" doesn't overflow. If we had a dedicated howManyLessThanOrEquals, we would do the right thing without any special logic.