This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
ClosedPublic

Authored by reames on May 25 2021, 2:55 PM.

Details

Summary

The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }

Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".

The basic intuition behind this patch is as follows:

  • A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max.
  • Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this.

Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.

A couple notes for reviewers:

  • I reused the loop properties code which is overly conservative for this case. I'll follow up in another patch to generalize it for the actual UB rules.
  • We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.

Diff Detail

Event Timeline

reames created this revision.May 25 2021, 2:55 PM
reames requested review of this revision.May 25 2021, 2:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2021, 2:55 PM
marksl added a subscriber: marksl.May 25 2021, 3:18 PM
mkazantsev added inline comments.May 31 2021, 2:57 AM
llvm/lib/Analysis/ScalarEvolution.cpp
6506

Is this actually true? For C++, I was able to find that

In a valid C++ program, every thread eventually does one of the following:

terminate
makes a call to an I/O library function
reads or modifies a volatile object [!!!]
performs an atomic operation or a synchronization operation

Is it different in LLVM? A loop that only reads volatile memory is non-side-effecting, but I'd be surprised if we considered it finite.

reames added inline comments.Jun 1 2021, 1:18 PM
llvm/lib/Analysis/ScalarEvolution.cpp
6506

A volatile read is considered a side effecting instruction, so the presence of a volatile load in the loop will be enough to prevent it from being considered presumed finite.

efriedma added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
11430

Could you just check that Stride is a power of 2 here? This seems overly complicated.

Could you separate out checking "there must be some value of LHS that forces the loop to exit" from "the IV can't wrap"? We can use the former to compute a max backedge taken count even if we can't prove the IV doesn't wrap.

reames added inline comments.Jun 3 2021, 2:24 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11430

Could you just check that Stride is a power of 2 here? This seems overly complicated.

Sure, don't really care here, but I'll make the change.

Could you separate out checking "there must be some value of LHS that forces the loop to exit" from "the IV can't wrap"? We can use the former to compute a max backedge taken count even if we can't prove the IV doesn't wrap.

I'm not really following here. I *think* you're asking if we can simply save the no-self-wrap somewhere right?

I agree that in principal, no-self-wrap is enough to prove a max exit count, but in practice, the current code can't do that. I would strongly prefer to work incrementally here if that is what you're suggesting. :)

efriedma added inline comments.Jun 3 2021, 2:45 PM
llvm/lib/Analysis/ScalarEvolution.cpp
11430

I'm not really following here. I *think* you're asking if we can simply save the no-self-wrap somewhere right?

I was more thinking of computing a max backedge taken count in the case where we can't prove no-self-wrap (i.e. Stride=3).

What you're describing might be useful too.

reames updated this revision to Diff 349697.Jun 3 2021, 2:46 PM

Address reviewer suggestion.

mkazantsev accepted this revision.Jun 6 2021, 8:43 PM

Fine by me if you add a unit test which shows that it works as expected with volatile load.

This revision is now accepted and ready to land.Jun 6 2021, 8:43 PM
This revision was landed with ongoing or failed builds.Jun 7 2021, 11:24 AM
This revision was automatically updated to reflect the committed changes.