This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Support single-cond range check idiom in applyLoopGuards.
ClosedPublic

Authored by fhahn on Jun 22 2021, 1:17 PM.

Details

Summary

This patch extends applyLoopGuards to detect a single-cond range check
idiom that InstCombine generates.

It extends applyLoopGuards to detect (LHS Predicate RHS) which
represents a compare for a range check with a single condition,
as InstCombine may create. A check of the form (-1 + X) u< C
where C in [1, unsigned max) can be de-composed into 2 separate checks:

  • X u< (1 + C)
  • X u> 0.

In practice, this enables us to correctly compute a tight trip count
bounds for code as in the function below. InstCombine will fold the
minimum iteration check created by LoopRotate with the user check (< 8).

void unsigned_check(short *pred, unsigned width) {
    if (width < 8) {
        for (int x = 0; x < width; x++)
            pred[x] = pred[x] * pred[x];
    }
}

As a consequence, LLVM creates dead vector loops for the code above,
e.g. see https://godbolt.org/z/cb8eTcqET

Note that I think it is a bit unfortunate that we still need to detect
such patterns explicitly, but I am not sure if there's a better
alternative at the moment.

Attempt to model using Alive2:
https://alive2.llvm.org/ce/z/_VbcYZ

Diff Detail

Event Timeline

fhahn created this revision.Jun 22 2021, 1:17 PM
fhahn requested review of this revision.Jun 22 2021, 1:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2021, 1:17 PM
efriedma added inline comments.Jun 22 2021, 3:09 PM
llvm/lib/Analysis/ScalarEvolution.cpp
13671

Instead of trying to special-case (-1 + X) u< C, could you use ConstantRange::makeExactICmpRegion? Not that this necessarily needs to be generalized, but it would be easier to follow the logic, I think.

reames added inline comments.Jun 22 2021, 3:41 PM
llvm/lib/Analysis/ScalarEvolution.cpp
13660

I found the wording on this confusing the first several times I read it. Can you reword? One suggestion below if helpful.

Check for a condition of the form (X - 1) < C. InstCombine will create this form when combining two checks of the form x <= C and x >u 0.

(Note the use of the <= to avoid need for C+1 in example. Note avoid mention "range check" as X > 0 doesn't seem like an idiomatic range check to me.)

13666

Style wise: I'd suggest writing a lambda with the signature
matchSomeName = [&](SCEV *& X, SCEVConstant *&C).

I at least would find early return much easier to follow here.

fhahn updated this revision to Diff 353939.Jun 23 2021, 5:54 AM

Addressed both Philip's and Eli's comments, thanks! I moved the code to a lambda and updated it to use makeExactICmpRegion, which no also support arbitrary lower bound values. Generalized modeling in Alive2: https://alive2.llvm.org/ce/z/Lt2ICd

fhahn updated this revision to Diff 353940.Jun 23 2021, 5:55 AM

Fix formatting

fhahn added inline comments.Jun 23 2021, 5:58 AM
llvm/lib/Analysis/ScalarEvolution.cpp
13660

Thanks for the suggestion. I updated the wording, but adjusted it slightly to make the lower bound also an arbitrary constant.

13666

Thanks! I moved the checks + update to the map to a single lambda. Happy to only move the 'match' part as well, if you think that's beneficial.

13671

Thanks for the suggestion! I updated the code to use makeExactICmpRegion, which should be a bit simpler and it should now support arbitrary lower bounds. I think we need to make sure the computed region does not wrap and lower <= upper.

nikic added a subscriber: nikic.Jun 23 2021, 6:16 AM
nikic added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
13677

Checking getLower/getUpper here shouldn't be necessary, as you already check for a wrapped set.

13688

Ranges are half-open, so I don't think getUpper() is correct here. I recommend using getUnsignedMin() and getUnsignedMax() instead. If you do that, then the wrapped set check is also no longer relevant for correctness.

fhahn updated this revision to Diff 353967.Jun 23 2021, 7:44 AM

Address @nikic's comments, use getUnsigned{Min,Max}, remove upper/lower check. Thanks!

fhahn added inline comments.Jun 23 2021, 7:48 AM
llvm/lib/Analysis/ScalarEvolution.cpp
13677

good point, removed.

13688

That's better thanks. I still kept the wrapped check because re-writing is unlikely to add additional info for wrapped ranges.

nikic accepted this revision.Jun 23 2021, 7:59 AM

LGTM

llvm/lib/Analysis/ScalarEvolution.cpp
13664

I would check || AddExpr->getNumOperands() != 2 already here...

13668

... and then fetch LHSUnknown and check it already here. That seems like a more obvious grouping of conditions.

13688

Right. A future extension here could be to check whether it's wrapping but not sign-wrapping, in which case you could do the same with smin/smax.

This revision is now accepted and ready to land.Jun 23 2021, 7:59 AM

LG in general

llvm/lib/Analysis/ScalarEvolution.cpp
13660–13661

This comment doesn't make sense right now.
https://godbolt.org/z/n7qqsTY9e

This revision was landed with ongoing or failed builds.Jun 25 2021, 2:26 AM
This revision was automatically updated to reflect the committed changes.
fhahn marked 3 inline comments as done.Jun 25 2021, 2:27 AM
fhahn added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
13660–13661

Thanks, I adjusted the comment, but kept the form where C1 is negative in the original condition, as this seems more common in practice.

13668

Thanks, I adjusted the checks as suggested.

void added a subscriber: void.Sep 3 2021, 2:16 PM

This commit is breaking the Linux kernel built with it. The kernel's based on version 4.15. When I compile the kernel with it, the system doesn't boot.

This commit is breaking the Linux kernel built with it. The kernel's based on version 4.15. When I compile the kernel with it, the system doesn't boot.

It looks like this was tracked down in PR51760, and fixed. Can you confirm that it is in fact the same issue?

void added a comment.Sep 8 2021, 11:29 AM

This commit is breaking the Linux kernel built with it. The kernel's based on version 4.15. When I compile the kernel with it, the system doesn't boot.

It looks like this was tracked down in PR51760, and fixed. Can you confirm that it is in fact the same issue?

That fix did fix my issue. :-)