This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] get a more accurate range for AddRecExpr with nuw flag
ClosedPublic

Authored by shchenz on Dec 18 2019, 11:50 PM.

Details

Summary

If AddRecExpr with nuw flag, we know that the value will never be smaller than its initial value.

This patch makes this behavior more general.

nsw will be handled in following patch.

Diff Detail

Event Timeline

shchenz created this revision.Dec 18 2019, 11:50 PM
nikic added inline comments.Dec 19 2019, 1:08 AM
llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332 ↗(On Diff #234666)

Aren't these two test changes regressions?

shchenz marked an inline comment as done.Dec 19 2019, 4:21 AM
shchenz added inline comments.
llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332 ↗(On Diff #234666)

Hi @nikic , thanks for your quick comment. For my understanding, seems the changes are improvement instead of regression.
Before LFTR, we use %iv as loop iteration variable, the loop count loop_count is abs(-2147483648) + 20 (-2147483648 is SINT32_MIN). For the increase expression %iv.inc = add nsw i32 %iv, 1, it will not generate poison value(no overflow happens for the loop before transformation).
After LFTR, we use %iv2 as loop iteration variable, %iv2 start-value is -2, so to iterates times loop_count (21 bigger than SINT32_MAX), expression [[IV2_INC]] = add nsw i32 [[IV2]], 1 will surely overflow, thus we will get poison value.
But after the change, no nsw anymore, so no poison value, we will do wrap to reach loop_count times iterations.
Could you please help to point out where am I wrong?

sanjoy.google requested changes to this revision.Dec 19 2019, 7:24 AM

This needs a specific test case.

llvm/lib/Analysis/ScalarEvolution.cpp
5658

This is the unsigned minimum value right? If so please name it UnsignedMinValue.

5661

Can you comment on why you need the else block?

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332 ↗(On Diff #234666)

Either the test changes are regressions or this change is a bugfix (and LFTR was incorrect before).

This revision now requires changes to proceed.Dec 19 2019, 7:24 AM
nikic added inline comments.Dec 19 2019, 12:43 PM
llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332 ↗(On Diff #234666)

Ah yes... LFTR does have a known issue where nowrap flags are not always stripped correctly when switching to a dynamically dead IV. It indeed looks like the drop of nsw here is correct. What isn't clear to me though is why it happens: As I understand this patch, it is about making use of nsw/nuw information on addrecs even if the start is non-constant, so I would naively assume that we should only be inferring more flags with this change, not less.

shchenz updated this revision to Diff 235093.Dec 22 2019, 10:18 PM
shchenz marked 3 inline comments as done.

fix @sanjoy.google comments

llvm/lib/Analysis/ScalarEvolution.cpp
5661

If we use getUnsignedRangeMin(AddRec->getStart()) as minimum value for SignedRange of the start of AddRec, we may get wrong result. For example, assuming we get range [137, 227) as result of getUnsignedRangeMin(AddRec->getStart()); for type i8, if we want to return it as SignedRange, it will be [-119, -29), this is invalid range for AddRec with nuw even for SignedRange?

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332 ↗(On Diff #234666)

The difference happens when we get SCEV for %iv2.inc = add nuw nsw i32 %iv2, 1.
Firstly, we get SCEV for PHI %iv2 = phi i32 [ -2, %entry ], [ %iv2.inc, %always_taken ], it is {-2,+,1}<nuw><nsw><%for.cond>
When we try to get SCEV for %iv2.inc = add nuw nsw i32 %iv2, 1, it is getAddExpr with Ops {1, {-2,+,1}<nuw><nsw><%for.cond>}

We get a different result in StrengthenNoWrapFlags called by getAddExpr.

// (A <opcode> C) --> (A <opcode> C)<nsw> if the op doesn't sign overflow.
if (!(SignOrUnsignWrap & SCEV::FlagNSW)) {
  auto NSWRegion = ConstantRange::makeGuaranteedNoWrapRegion(
      Opcode, C, OBO::NoSignedWrap);
  if (NSWRegion.contains(SE->getSignedRange(Ops[1])))
    Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW);
}

// (A <opcode> C) --> (A <opcode> C)<nuw> if the op doesn't unsign overflow.
if (!(SignOrUnsignWrap & SCEV::FlagNUW)) {
  auto NUWRegion = ConstantRange::makeGuaranteedNoWrapRegion(
      Opcode, C, OBO::NoUnsignedWrap);
  if (NUWRegion.contains(SE->getUnsignedRange(Ops[1])))
    Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW);
}

For {-2,+,1}<nuw><nsw><%for.cond>, with this patch's fixing, now the SignedRange is [-2,-2147483629), the unsigned range is [-2,0), so new range is not contained inside MGNWR, flag is 0.

Before the change, SignedRange is [-2,0), unsigned range is [-2,0). Signed Range is contained in Signed type MGNWR ([-2147483648,2147483647)), flag is SCEV::FlagNSW. But Seems SignedRange is quite conservative?

nikic added inline comments.Dec 23 2019, 2:16 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5661

Why is [-119, -29) an invalid range?

shchenz marked an inline comment as done.Dec 23 2019, 4:07 AM
shchenz added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
5661

For SCEV1 {x,+,stride}<nuw>, when we try to get a signed range for SCEV1, the result range should be contained in [0, SINT_MAX +1)?
If we use getUnsignedRangeMin(AddRec->getStart()) to get start value range, it should be any value in [0, UINT_MAX + 1), so the start min can be a value between [ SINT_MAX +1, UINT_MAX + 1), like 137 for type i8. If the signed range for SCEV1 is [137, 227), it is not inside the expected range [0, 128). For this case, we should return empty for SCEV1 SignedRange?

nikic added inline comments.Dec 26 2019, 7:39 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5661

I still don't really get it. Why would the range need to be inside [0, SINT_MAX]? For example, let's say the start range is [-100, -90], then that could still result in a valid nuw addrec, e.g. because we have iv == -10 as a loop exit condition. "nuw" doesn't imply that the numbers can't be "negative", as long as the addrec does not cross zero.

shchenz marked an inline comment as done.Dec 27 2019, 1:21 AM
shchenz added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
5661

Ah, thanks, got your idea.

we get
--> {-10,+,1}<nuw><%loop> U: [-10,0) S: [-10,0)

So signed range for a SCEV with nuw flag may still be a range in the negative part.

shchenz planned changes to this revision.Dec 27 2019, 1:21 AM
shchenz updated this revision to Diff 235532.EditedDec 29 2019, 7:37 PM

for signed range with nuw flag, we can only use the sign min value when constant range for START can not cross 0.

shchenz marked an inline comment as done.Dec 29 2019, 7:46 PM
shchenz added inline comments.
llvm/test/Analysis/ScalarEvolution/range_nw_flag.ll
23

For signed range, I think here it should also be [11,0).
but seems, there is a legacy issue for start SCEV signed range: {(1 + (10 smax %offset))<nuw>,+,1}<nuw><%loop>.

signed range for (10 smax %offset) is [10, SINT_MIN)

(1 + (10 smax %offset))<nuw> signed range should be [11, SINT_MIN), instead of full-set? @nikic

Even with the change in https://reviews.llvm.org/D64869 which takes nuw for add into account, we still get full-set as result.
Flow in addWithNoWrap:
normal add result for (1 + [10, SINT_MIN))<nuw> is [11, SINT_MIN +1)
uadd_sat result for (1 + [10, SINT_MIN))<nuw> is also [11, SINT_MIN +1), so final result is [11, SINT_MIN +1), currently this range is treated as signed wrap range. (Min value is not SINT_MIN)

shchenz updated this revision to Diff 235537.Dec 29 2019, 8:48 PM
shchenz marked an inline comment as not done.Jan 5 2020, 9:17 PM

gentle ping

shchenz updated this revision to Diff 236736.Jan 7 2020, 6:12 PM

rebase after D64869

nikic added inline comments.Jan 8 2020, 10:21 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5661

Thanks, this looks better. What isn't clear to me though is why we can't just always use getUnsignedRangeMin() here. Shouldn't that always be correct and potentially give a better result?

Or is there some rule that you are not allowed to use the unsigned range while computing a signed range?

shchenz marked an inline comment as done.Jan 8 2020, 6:57 PM
shchenz added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
5661

Thanks for your comment. @nikic I am not sure about using getUnsignedRangeMin() as signed range min value is right. For my understanding, signed range and unsigned range are different ranges, why we can use unsigned range min value as signed range min value? nuw can ensure this?

nikic added inline comments.Jan 9 2020, 1:10 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5661

Maybe @sanjoy.google can comment on that? My understanding is that both the unsigned and signed ranges are valid and we are free to use either one, so using the unsigned one is more beneficial in an unsigned context. But I'm not totally sure if my understanding here is right.

In any case, even when not always using the unsigned range (maybe better to avoid unless we know for sure), you should still be able to simplify this code to:

ConstantRange StartRange = getRangeRef(Start, SignHint);
APInt MinValue = StartRange.getUnsignedMin();

This will return zero if the range crosses zero. There is no actual requirement here that the range is all-negative or all non-negative here, you just always need to pick the unsigned minimum of the range. This will usually be zero if the range is not all-negative or all non-negative, but there are exceptions. For example in your [11, SINT_MIN + 1) example from the test below the range can have both signs, but the unsigned min is still 11.

My understanding is that both the unsigned and signed ranges are valid and we are free to use either one, so using the unsigned one is more beneficial in an unsigned context.

That is correct.

shchenz updated this revision to Diff 237227.Jan 9 2020, 5:27 PM

use unsigned range min as initial value

nikic accepted this revision.Jan 10 2020, 12:51 AM

LGTM

Hi @sanjoy.google , could you please help to have another look at this? Thanks a lot.

sanjoy accepted this revision.Jan 12 2020, 1:47 PM
sanjoy added a subscriber: sanjoy.

lgtm

This revision was not accepted when it landed; it landed in state Needs Review.Jan 12 2020, 5:28 PM
This revision was automatically updated to reflect the committed changes.