This is an archive of the discontinued LLVM Phabricator instance.

RFC: [SCEV] Add explicit representations of umin/smin
ClosedPublic

Authored by loladiro on Aug 1 2018, 6:06 PM.

Details

Summary

Currently we express umin as ~umax(~x, ~y). However, this becomes
a problem for operands in non-integral pointer spaces, because ~x
is not something we can compute for x non-integral. However, since
comparisons are generally still allowed, we are actually able to
express umin(x, y) directly as long as we don't try to express is
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions
were largely identical, except that the SMax variant used isKnownPredicate
while the UMax variant used isKnownViaNonRecursiveReasoning.

Trying to make the UMax variant also use isKnownPredicate yields to
an infinite recursion, while trying to make the SMax variant use
isKnownViaNonRecursiveReasoning causes
Transforms/IndVarSimplify/backedge-on-min-max.ll to fail.

I would appreciate any insight into which predicate is correct here.

Diff Detail

Event Timeline

loladiro created this revision.Aug 1 2018, 6:06 PM
dmgreen added a subscriber: dmgreen.Aug 3 2018, 2:52 AM
loladiro updated this revision to Diff 159239.Aug 5 2018, 6:20 PM

Fix a small bug discovered during production testing (
when expanding umin/umax, don't go to integers just because
types are unequal - the only reason to do is if one of the
operands in an integer and the other is not).

Hi Keno,

I have a general concern against such changes. In fact, you are introducing an alternative way to express the same thing. umin(a, b) and umax(~a, ~b) are the same, but now have 2 possible notations. It means that whatever pass or analysis that needs to recognize this pattern needs to be aware of both. Whenever the new node was not supported, it might be missed optimization opportunities. And we cannot know for sure how many such places there are, or will be. What motivation do you have for making this change? Is it strong enough to take a risk of missing optimization opportunities that I've just pointed out?

I'd also like to know @sanjoy 's opinion on that.

Also, I didn't get the part about infinite recursion in the commit message. Is this the behavior you are observing in current SCEV, or it only happens with your patch? In former case, please submit a bug with a test on which you can see that. From my memory, we've fixed getUMax to avoid the inifinite recursion, and maybe the same fix is required for SMax.

I have a general concern against such changes. In fact, you are introducing an alternative way to express the same thing. umin(a, b) and umax(~a, ~b) are the same, but now have 2 possible notations. It means that whatever pass or analysis that needs to recognize this pattern needs to be aware of both. Whenever the new node was not supported, it might be missed optimization opportunities. And we cannot know for sure how many such places there are, or will be. What motivation do you have for making this change? Is it strong enough to take a risk of missing optimization opportunities that I've just pointed out?

I can understand this concern. On the other hand, by obscuring the true meaning of the pattern (it's not trivial to detect that the argument is actually the negation of another argument - there's some code that tries to do it, but it only works for simple expression), so you also lose the opportunity to optimize based on the existence of umin. In general, I don't really see a good way to deal with non-integral address spaces in the absence of a general umin, short of just disabling these optimizations completely for such pointers, which is not desirable. I suppose another alternative would be to make a first class Neg node type (rather than expanding it through as subs and multiplies), such that (~umax(~x, ~y)) could be pattern matched back into the proper form in code generation. I'm not sure that's any better though, since it just pushed this problem further down the expression tree.

Also, I didn't get the part about infinite recursion in the commit message. Is this the behavior you are observing in current SCEV, or it only happens with your patch? In former case, please submit a bug with a test on which you can see that. From my memory, we've fixed getUMax to avoid the inifinite recursion, and maybe the same fix is required for SMax.

Actually, upon doing some more testing here, I misdiagnosed the test failure (there was a heuristic pattern match to detect the old umin pattern - will push a simple fix monetarily). I'll also put up a separate revision to fix up the isKnownPredicate check for SMax to avoid having that be a behavior change in this revision.

loladiro updated this revision to Diff 160241.Aug 11 2018, 4:06 AM

Fix ir names in tests broken by previous commits and proplerly match u/smin.

For some %r and an indvar %i, the SCEV for (1 + min(r - 1, i)) in smax terms is: (-1 * ((-1 * (zext i32 %r to i64))<nsw> smax {-1,+,-1}<nw><%for.outerloop>))<nsw>.
With your patch, smax gets converted to smin as: (1 + ((-1 + (zext i32 %r to i64))<nsw> smin {0,+,1}<nuw><nsw><%for.outerloop>)), which is correct.
But, it could be simplified further by distributing 1 over smin(?). I might be missing something here. Maybe your patch needs improvement or the getAddExpr() should be improved to handle smin.
P.S.: I am adding 1 explicitly in my code to the min expression i.e. SE->getAddExpr(<scevForMin>, getOne(<ty>)) in both the above cases.

Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2019, 5:49 PM
loladiro updated this revision to Diff 196769.Apr 25 2019, 5:56 PM

Undo a small spurious change I noticed while browsing the diff

Alright, I have rebased this revision. I'd be happy to make the improvement to getAddExpr requested by @tvikram, since that seems independently useful, but I'd like to come to an agreement with @sanjoy and @mkazantsev on direction first.

sanjoy accepted this revision.Apr 28 2019, 12:36 PM

This change in isolation LGTM (modulo some minor comments inline) -- at the very least having separate umin / smin nodes makes min expressions more readable (as demonstrated by the updates to the test cases).

However, I'm not sure if this is a sufficient solution of your original problem involving non-integral pointers. Fundamentally to do trip counts right with non-integral pointers, we will at least need to teach SCEV the difference between ni pointers and integers so that it does not create SCEV expressions that cannot be lowered. I think we will also need a psub instruction or intrinsic so that we can compute trip counts for loops like:

ptr0 = ... ; ni pointer
ptr1 = ... ; ni pointer at some offset from ptr0

for (i = ptr0; i != ptr1; i++)
  ...

Of course all of these may be "working" in practice today because of various reasons. :)

lib/Analysis/ScalarEvolution.cpp
3523

Let's avoid a nested ternary here. Maybe you could instead use an immediately executed lambda.

3538–3557

Probably this can just be one loop if you create two APInts, Top and Bottom, or two lambdas IsTop and IsBottom if you're worried about creating large APInts being expensive.

9892

Does this case even fire anymore?

lib/Analysis/ScalarEvolutionExpander.cpp
2153

Did this overflow 80 cols?

This revision is now accepted and ready to land.Apr 28 2019, 12:36 PM
loladiro added inline comments.May 1 2019, 8:35 AM
lib/Analysis/ScalarEvolution.cpp
9892

I wanted to avoid regressing the (probably not very likely, but possible) case that somebody might construct ~umax(~x, ~y) manually. However, in retrospect the better way to do that is probably to just pattern match that to umin on construction. Let me try that and remove this case.

loladiro updated this revision to Diff 197954.May 3 2019, 4:40 AM

Rebase and address review comments.

loladiro updated this revision to Diff 197964.May 3 2019, 5:19 AM

A few minor tweaks.

@sanjoy Could you take another look to make sure I have addressed your comments to your satisfaction before I commit this?

sanjoy accepted this revision.May 5 2019, 12:35 PM

LGTM. If possible (and you haven't already) try bootstrapping clang in a few configurations to double check things are okay.

This revision was automatically updated to reflect the committed changes.
rnk added a subscriber: rnk.May 7 2019, 2:53 PM

This broke the polly build:

[14 processes, 32/64 @ 9.3/s : 3.428s ] Building CXX object tools/polly/lib/CMakeFiles/PollyCore.dir/Support/SCEVAffinator.cpp.o
FAILED: tools/polly/lib/CMakeFiles/PollyCore.dir/Support/SCEVAffinator.cpp.o
...
In file included from /usr/local/google/home/rnk/llvm-project/polly/lib/Support/SCEVAffinator.cpp:13:                                                                            
In file included from /usr/local/google/home/rnk/llvm-project/polly/include/polly/Support/SCEVAffinator.h:16:                                                                    
/usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h:536:30: error: no member named 'visitSMinExpr' in 'polly::SCEVAffinator'         
        return ((SC *)this)->visitSMinExpr((const SCEVSMinExpr *)S);
               ~~~~~~~~~~~~  ^
/usr/local/google/home/rnk/llvm-project/polly/lib/Support/SCEVAffinator.cpp:218:48: note: in instantiation of member function 'llvm::SCEVVisitor<polly::SCEVAffinator, std::pair<isl::noexceptions::pw_aff, isl::noexceptions::set> >::visit' requested here
    PWAC = SCEVVisitor<SCEVAffinator, PWACtx>::visit(Expr);
                                               ^
In file included from /usr/local/google/home/rnk/llvm-project/polly/lib/Support/SCEVAffinator.cpp:13:                                                                            
In file included from /usr/local/google/home/rnk/llvm-project/polly/include/polly/Support/SCEVAffinator.h:16:                                                                    
/usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h:538:30: error: no member named 'visitUMinExpr' in 'polly::SCEVAffinator'         
        return ((SC *)this)->visitUMinExpr((const SCEVUMinExpr *)S);
               ~~~~~~~~~~~~  ^
2 errors generated.

Looked straightforward enough. Should be fixed by rPLO360238. I don't usually work on polly, so let me know if I misunderstood something.