This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
ClosedPublic

Authored by broune on Aug 7 2015, 9:34 PM.

Details

Summary

http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

for (int i = 0; i < numIterations; ++i)
  sum += ptr[i - offset];

for (int i = 0; i < numIterations; ++i)
  sum += ptr[i * stride];

for (int i = 0; i < numIterations; ++i)
  sum += ptr[3 * (i << 7)];

Diff Detail

Event Timeline

broune updated this revision to Diff 31572.Aug 7 2015, 9:34 PM
broune retitled this revision from to [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl.
broune updated this object.
broune added reviewers: sanjoy, atrick.
broune added subscribers: eliben, jingyue, meheff and 3 others.
sanjoy requested changes to this revision.Aug 11 2015, 2:55 PM
sanjoy edited edge metadata.

Comments inline.

lib/Analysis/ScalarEvolution.cpp
3378

Any non-zero value except 1. :)

3380

I'm being pedantic here, but the reasoning here may be easier to follow if you write -Foo as -1 * Foo (assuming that's what you mean).

3396

I did not quite follow the logic here -- what's is the "relevant loop" here? An example will be very useful here.

Also, why don't you need cast<SCEVAddRecExpr>(RHS)->getLoop() == RelevantLoop?

4123

Why is this needed? Aren't we ever only going to pass in mul, sub and shl?

4237

Why not hoist LHS = getSCEV(U->getOperand(0) as well?

4265

I'll be in favor of using U here instead of Op to be consistent, unless that does not work for some reason.

4408

Unfortunately, this is not quite right. There are some inconsistencies in the langref that to my knowledge have not yet been fixed: see http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html and http://reviews.llvm.org/D8890

This revision now requires changes to proceed.Aug 11 2015, 2:55 PM
broune updated this revision to Diff 31898.Aug 11 2015, 7:25 PM
broune edited edge metadata.
broune marked 7 inline comments as done.

Address Sanjoy's comments.

broune added inline comments.Aug 11 2015, 7:26 PM
lib/Analysis/ScalarEvolution.cpp
3378

It's true that getNegativeSCEV happens to represent -RHS as (-1) * RHS which as an unsigned operation would look like MAX * RHS which indeed doesn't wrap for RHS==1. My thinking is that the mathematical operation of negation does wrap for RHS==1, as you end up with MAX instead of -1. So as I think of it, it would be incorrect to pass the NUW flag to getNegativeSCEV even if you knew that RHS == 1, as the mathematical operation is not NUW, but it would be correct for getNegativeSCEV itself to recognize if RHS==1 and apply the NUW flag in that case (ignoring that it could fold to a constant in that case).

3380

-Foo is the mathematical operation being done, while (-1) * Foo is the representation that getMinusSCEV happens to choose for that operation (at least before simplification). In the way that I'm thinking of this, the flags that are passed in to getAddExpr and getNegativeSCEV concern the mathematical operation, not the representation, so the comment follows the math.

I could change this if you think that the passed-in flags should concern the representation, instead?

3396

I added a comment with an example. The difficulty that this comes out of is that flags and many SCEV operations do not have a loop/scope attached to them, only recurrences do. So we have to be careful with applying flags to subexpressions that do not involve a recurrence. In this case, the relevant loop would be the one from the recurrence for the purposes of this check.

I also realized that this check is unnecessary if NSW was proven by looking only at RHS on its own, with no use of LHS or the Flags parameter, so I added a check for that case.

4123

This function can be passed a ConstantExpr version of those, which it cannot handle. I preferred to check for that once here instead of doing it in each caller.

4408

It seems that there is agreement for shl nsw to be equivalent to mul nsw, in which case this is correct, except that no one has bothered to update the LangRef yet, so it's not official. Also, some updates in InstCombine (and possibly elsewhere) would be required to make the change and those also haven't been done yet. Is that right? I added a comment to explain this, along with a check to avoid applying flags for left shift by BitWidth - 1 until the situation is resolved.

sanjoy added inline comments.Aug 13 2015, 12:54 AM
lib/Analysis/ScalarEvolution.cpp
3381

My subjective opinion is that since there is no direct representation of -X in LLVM IR, we ultimately have to choose a specific representation to reason about. The specific representation is not obvious either since both sub 0 X and mul -1 X are equally valid, so it is nice to be explicit about these things. Moreover, here things are doubly confusing since sub 0 X and mul -1 X have identical behavior w.r.t. nsw so there is no easy way for the reader to "check" if (s)he understood the intent correctly.

However, objectively, I don't think there is any problem with directly proving things about a mathematical negation operation, so I'd suggest just putting in a sentence either here or above on what exactly you mean by -RHS.

3408

But what if you have a loop nested within other, and LHS is an addrec on the inner loop while RHS is an addrec on the outer loop? Then LHS - RHS would be an addrec on the inner loop (so the nsw is guaranteed to apply only on the inner loop), but RHS has a larger scope than the inner loop while still being an add rec.

IOW, something like this:

define void @f(i32 %outer_l, i32 %inner_l) {
 entry:
  br label %outer

 outer:
  %o_idx = phi i32 [ 0, %entry ], [ %o_idx.inc, %outer.be ]
  %o_idx.inc = add i32 %o_idx, 1
  %cond = icmp eq i32 %o_idx, 42
  br i1 %cond, label %inner, label %outer.be

 inner:
  %i_idx = phi i32 [ 0, %outer ], [ %i_idx.inc, %inner ]
  %i_idx.inc = add i32 %i_idx, 1
  %v = sub nsw i32 %i_idx, %o_idx
  %cond2 = icmp eq i32 %i_idx, %inner_l
  br i1 %cond2, label %outer.be, label %inner

 outer.be:
  %cond3 = icmp eq i32 %o_idx, %outer_l
  br i1 %cond3, label %exit, label %outer

exit:
 ret void
}

where %v is an add rec for the %inner loop while %o_idx is an add rec on the outer loop.

4125

Then I'd suggest if (isa<ConstantExpr>(V)) return FlagAnyWrap;, that makes the code more obvious.

4408–4409

SGTM.

broune marked 4 inline comments as done.Aug 13 2015, 9:01 PM
broune added inline comments.
lib/Analysis/ScalarEvolution.cpp
3381

I changed it to (-1)*RHS.

3408

Good point, I added a test case that is based on this. It could also be an addrec on the outer loop, which is what I was thinking of it as, but your example shows that that interpretation is not sufficient. There should be a workable scheme where whomever proves NSW or NUW and passes in those flags also can explicitly pass in the loop that the flags are proven within, though I'll leave that for another time.

I now only put NSW on the negation if it can be proven without using the passed-in NSW.

broune updated this revision to Diff 32126.Aug 13 2015, 9:02 PM
broune edited edge metadata.
broune marked 2 inline comments as done.
sanjoy accepted this revision.Aug 14 2015, 12:05 AM
sanjoy edited edge metadata.

lgtm

test/Analysis/ScalarEvolution/min-max-exprs.ll
36 ↗(On Diff #32126)

I'm surprised this changed in the last version -- I think you only made SCEV more conservative in this last revision?

This revision is now accepted and ready to land.Aug 14 2015, 12:05 AM
broune marked an inline comment as done.Aug 14 2015, 11:27 AM

Thank you to Sanjoy for the review!

test/Analysis/ScalarEvolution/min-max-exprs.ll
36 ↗(On Diff #32126)

The NSW on the negation is now applied (if it can be proven) even if the subtraction does not have a NSW flag.

broune closed this revision.Aug 14 2015, 3:46 PM
broune marked an inline comment as done.