[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start

Authored by mkazantsev on May 18 2017, 5:32 AM.



When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrency is the bottom-lost in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.

Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike
other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:

for (...) { // loop
  phi = {A,+,B}
X = load ...

Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may be existant.

This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.

Diff Detail

mkazantsev created this revision.May 18 2017, 5:32 AM
sanjoy requested changes to this revision.May 18 2017, 10:59 AM

Comments inline.


I'd not call this complex rec, since I don't think this is specific to PHI nodes (see below). Instead, I'd just call it SCEVDominatesLoop or something like that.


Can you use SCEVTraversal here? That'll also prevent exponential complexity.


I'm not sure why this can't happen with non-PHIs. IMO the condition should be "the expression the SCEVUnknown corresponds to must dominate L->getHeader()".

This revision now requires changes to proceed.May 18 2017, 10:59 AM
mkazantsev added inline comments.May 18 2017, 11:44 PM

Good point, the same situation could happen if we applied non-SCEVable operation with our Phi.

mkazantsev planned changes to this revision.May 18 2017, 11:45 PM

Need to address the comments. Also need to fix the typo in commit message (lost ---> most).

mkazantsev updated this revision to Diff 99537.May 19 2017, 3:36 AM
mkazantsev marked 4 inline comments as done.

Addressed the comments, and also made the analysis more accurate. We only react on SCEVUnknown if there is another loop between it and our loop, and this loop header has Phis.

sanjoy added inline comments.May 19 2017, 12:19 PM

I believe this should be more general, and we should disallow forming {%x,+,%y}<%loop> unless %x and %y both dominate %loop[0]. Whether there is a PHI involved or not should not matter.

In particular, {%x,+,%y} is an expression that evaluates to %x on the 0th iteration of the loop. This is meaningless unless %x dominates the loop header.

However, if you fix this for good, you may see some fall out of LSR, but we should just go ahead and fix those.

[0]: or, if they're complex SCEV expressions, all of the contained SCEVUnknown and SCEVAddRec expressions must dominate %loop.


Why do you need this?

Even if you do, please just check it in separately without further review.

sanjoy requested changes to this revision.May 19 2017, 5:04 PM
This revision now requires changes to proceed.May 19 2017, 5:04 PM
mkazantsev added inline comments.May 21 2017, 9:27 PM

Consider two situations:

Case 1:

for (...) {
  %phi = {a,+,b}
%x = ...
%y = %phi + %x

Case 2:

%x = ...
for (...) {
  %phi = {a,+,b}

%y = %phi + %x

If %x doesn't depend on the loop in any way, I don't see a reason why SCEV should not evaluate %y equally in both cases. In case 2, it is absolutely OK to imagine a recurrence {a+x,+,b} (it could be explicitly calculated in the loop by creating a Phi), but in case 1 such transormation is also valid because SCEV should be place-independent, and if nothing prohibits us from calculating %x before the loop (even if the actual instruction stands after it), we should be able to deal with it just like if it was calculated before the loop if it makes sense. Is it not correct?

The only problem SCEVUnknown creates is that it can be actually a varying value in some loop which does not dominate L. In this case our logic of picking the bottom-most loop in getAddExpr and getMulExpr becomes incorrect. We are just picking the wrong loop, because we can only pick a loop of AddRecExpr.

If there are no loops with Phis between our SCEVUnknown and L, all varying values that might be used by the SCEVUnknown actually dominate the loop L (and this is actually what we want in [0]). It means that our picking of the loop in AddExpr is correct. We could just prohibit all SCEVUnknown below L, but I think this would be too strong reduction of the scope.


Yeah, this wasn't intended to be here. Will be removed.

sanjoy added inline comments.May 21 2017, 10:50 PM

I don't think the two situations are the same.

Firstly, I agree that in case 2, %y should be {%x+a,+,b}.

In case 1, however, I don't think the same should hold. The definition of a {a,+,b} is that it is equal to a on the 0th iteration and that makes little sense if a does not dominate the loop. For instance, if we allow such SCEV expressions, we won't always be able to compute a "value at exit" (like we do in rewriteLoopExitValues) that is available on a loop exit.

However, repeating what I said before, I suspect we may have to compromise on this due to LSR (i.e. LSR may be relying on us also folding case 1), but we should try and see.

mkazantsev added inline comments.May 23 2017, 1:26 AM

Doing this breaks 4 tests (CHECK's fail), but at least in some of them we can get rid of SCEVUnknown making some minor improvements. I will go through them accurately and make this changes, and then we'll see if we lost something important.

mkazantsev updated this revision to Diff 99884.May 23 2017, 5:02 AM
mkazantsev retitled this revision from [SCEV] Do not fold expressions with SCEVUnknown Phis into AddRecExpr's to [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start.
mkazantsev edited the summary of this revision. (Show Details)
mkazantsev marked 2 inline comments as done.
sanjoy accepted this revision.May 23 2017, 5:04 PM

lgtm with a minor nit


Instead of hasDominatedSCEVUnknown, I'd call this isAvailableAtLoopEntry, negate the return value and push the isLoopInvariant into this this function. This seems more more semantic -- we're interested in knowing if a SCEV expression can possibly be computed at L 's preheader. You can then shift the comment about invariants (paraphrased as needed) into this function.

This revision is now accepted and ready to land.May 23 2017, 5:04 PM

No functional changes, just renames of methods for better semantical understanding.