This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Allow negative steps for LT exit count computation for unsigned comparisons
ClosedPublic

Authored by reames on Jun 11 2021, 11:35 AM.

Details

Summary

This bit of code is incredibly suspicious. It allows fully unknown (but potentially negative) steps, but not steps known to be negative. The comment about scev flag inference is worrying, but also not correct to my knowledge.

At best, this might be covering up some related miscompile. However, there's no test in tree for it, the review history doesn't include obvious motivation, and the C++ example doesn't appear to give wrong results when hand translated to IR. I think it's time to remove this and see what falls out.

Diff Detail

Event Timeline

reames created this revision.Jun 11 2021, 11:35 AM
reames requested review of this revision.Jun 11 2021, 11:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2021, 11:35 AM
efriedma added inline comments.Jun 13 2021, 6:56 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

Does the logic here actually work correctly for step=0? It looks like we end up dividing by zero.

If we can prove the step is non-zero, loopIsFiniteByAssumption seems too aggressive; a lack of abnormal exits should be enough. We'll hit UB after a finite number of steps.

Using isKnownPositive doesn't really make sense in the unsigned case; an unsigned number can't be negative.

reames added inline comments.Jun 14 2021, 9:54 AM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

Step can't be zero, or the loop would be infinite. Though, actually, writing that, I see a latent bug here. The loop could be finite *because we took another exit*, the step of this IV could still be zero, and we could still have a divide by zero along this path.

I don't think that has anything to do with this change though.

p.s. Yes, we can infer non-zero other ways. I even have one patch out to do that right now. :)

efriedma added inline comments.Jun 14 2021, 10:38 AM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

I agree this patch makes sense, just saying it might expose other issues.

*because we took another exit*

Even if we take this exit. If the backedge-taken count is zero, the step doesn't matter.

reames added inline comments.Jun 15 2021, 8:58 AM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

Eli, I think you're missing the point slightly. For the induction step to be zero, and for this to be the sole exit, then the loop must be infinite. Since mustprogress infinite loops are undefined, any result from this function is allowed. If there's a divide by zero which causes a compiler crash, that would be bad, but literally any result is legal at that point.

efriedma added inline comments.Jun 15 2021, 11:55 AM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

If the induction step is zero, and this is the sole exit, the loop is either infinite, or has a backedge-taken count of zero. See, for example, https://godbolt.org/z/9djfj1Ycq .

reames added inline comments.Jun 15 2021, 12:15 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11710–11718

Gah, yeah, you're correct. No idea why I didn't see that the first time.

reames updated this revision to Diff 359395.Jul 16 2021, 11:17 AM

Rebase over tests which actually exercise code and show difference.

The zero stride issue was fixed separately. This still depends on Eli's patch for overflow correctness. Once that's fixed, I think this will be safe.

It's worth noting that given a condition which dominates the latch, an IV which has either nsw or nuw, and a negative step, the exit backedge taken must be either 0 or 1. For it to be more than that, we must perform at least two adds of negative value which must by definition wrap. As a result, we could do much better in terms of formulas specific for negative strides, but I'm not really interested in bothering. The main value of this patch is in making it easier to exercise the general path via tests.

efriedma added inline comments.Jul 16 2021, 12:35 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11556–11560

Is the unconditional use of "smax" here going to cause issues?

11571

Instead of computing min(End, Limit) - Start, should we be using max(End, Start) - Start like we do elsewhere?

11704

This is inductive logic, right? If the first iteration doesn't the loop, the following iterations also can't exit the loop.

I think this logic requires that RHS is invariant? Not that we would compute BECount anyway in that case, but I think we might underestimate MaxBECount.

efriedma added inline comments.Jul 16 2021, 12:46 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11556–11560

Oh, also, if BitWidth is one, "One" is a negative number. isKnownNonPositive(Stride) guards against this, since it's always true for an i1 value. :)

efriedma added inline comments.Jul 16 2021, 1:03 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11556–11560

I'll write a patch for computeMaxBECountForLT.

Posted D106197. That addresses all my review comments here except the one about the RHS being invariant.

reames updated this revision to Diff 359947.Jul 19 2021, 3:48 PM

Address Eli's review comment which isn't covered by D106197.

I'm going to land some tests for this case and probably split this patch since this appears to be a flaw in the existing logic for potentially zero strides.

efriedma added inline comments.Jul 19 2021, 4:03 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11701

I think this needs to be "if the stride is negative and rhs is invariant".

reames planned changes to this revision.Jul 19 2021, 5:40 PM

Needs to be rebased over D106327 once landed.

reames added inline comments.Jul 26 2021, 4:31 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11701

I've given this a lot of thought and I don't believe this is true. Let me explain why I think the code is correct, and you can poke holes in my reasoning. :)

A negative stride must be greater than half of the addressable space. Thus, we can only add a negative stride once without producing poison.

We check that if IV becomes poison that the loop must execute UB. (This is the purpose of the ControlsExit check combined with the flag check.) Thus, we don't have to worry about IV being poison, but another exit being taken before the one we're analyzing. (This is important as it cuts off a *lot* of subtle edge case reasoning.)

As a result, we can infer that IV must add the step at most once. Given we must increment IV on each iteration, this implies the backedge is not taken.

Note that the value of RHS did not appear in any of that logic.

Chasing through the code for computeMaxBECountForLT, we appear to produce a correct/conservative result for a negative stride.

Do you have a particular counter example in mind?

p.s. The means by which we prove poison triggers immediate UB could be generalized easily here. Maybe something to explore in the future if we want to expand support for multiple exit loops.

efriedma added inline comments.Jul 26 2021, 5:30 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11701

I was thinking of something like this:

int a(int *x) {
  int i = 0;
  do {
    i -= 2;
  } while (i < x[i]);
  return i;
}
int main() {
    int z[] = {-100000,0,0,0,};
    __builtin_printf("%d\n", a(z+4));
}

Am I missing some condition that excludes this?

reames added inline comments.Jul 27 2021, 5:45 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11701

Quick answer: I'd gotten myself confused on nsw semantics of add w/negative RHS again. Long answer to follow once I've given it a bit more thought.

reames updated this revision to Diff 369813.Aug 31 2021, 3:59 PM
reames retitled this revision from [SCEV] Allow negative steps for LT exit count computation to [SCEV] Allow negative steps for LT exit count computation for unsigned comparisons.

Narrowing focus to unsigned comparisons.

The reasoning about signed cases makes my head hurt, and even after staring at it for a while, I'm neither sure the code is correct or incorrect for signed comparisons. Give my actual interest is unsigned IVs, I propose we just defer the signed cases until someone else has time, interest, and the ability to reason through it.

@efriedma Sorry for the silence on this for so long.

p.s. I'm going to mark this as dependent on D109029 as the diff is built on top of that patch and might be confusing otherwise, but there's no semantic connection between the two.

This revision is now accepted and ready to land.Sep 9 2021, 12:37 PM
This revision was landed with ongoing or failed builds.Sep 9 2021, 2:30 PM
This revision was automatically updated to reflect the committed changes.