Download Raw Diff

Details

Reviewers

nikic
reames
sanjoy.google
fhahn
hfinkel
sanjoy

Commits

rGa6342c247a17: [SCEV] accurate range for addrecexpr with nuw flag

Summary

If AddRecExpr with nuw flag, we know that the value will never be smaller than its initial value.

This patch makes this behavior more general.

nsw will be handled in following patch.

Diff Detail

Event Timeline

shchenz created this revision.Dec 18 2019, 11:50 PM

Herald added a project: Restricted Project. · View Herald TranscriptDec 18 2019, 11:50 PM

Herald added subscribers: llvm-commits, • wuzish, hiraditya. · View Herald Transcript

nikic added inline comments.Dec 19 2019, 1:08 AM

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332	Aren't these two test changes regressions?

shchenz marked an inline comment as done.Dec 19 2019, 4:21 AM

shchenz added inline comments.

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332	Hi @nikic , thanks for your quick comment. For my understanding, seems the changes are improvement instead of regression. Before LFTR, we use `%iv` as loop iteration variable, the loop count `loop_count` is `abs(-2147483648) + 20` (`-2147483648` is `SINT32_MIN`). For the increase expression `%iv.inc = add nsw i32 %iv, 1`, it will not generate poison value(no overflow happens for the loop before transformation). After LFTR, we use `%iv2` as loop iteration variable, `%iv2` start-value is -2, so to iterates times `loop_count` (21 bigger than SINT32_MAX), expression `[[IV2_INC]] = add nsw i32 [[IV2]], 1` will surely overflow, thus we will get poison value. But after the change, no `nsw` anymore, so no poison value, we will do wrap to reach `loop_count` times iterations. Could you please help to point out where am I wrong?

This needs a specific test case.

llvm/lib/Analysis/ScalarEvolution.cpp
5658	This is the unsigned minimum value right? If so please name it `UnsignedMinValue`.
5661	Can you comment on why you need the else block?
llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332	Either the test changes are regressions or this change is a bugfix (and LFTR was incorrect before).

This revision now requires changes to proceed.Dec 19 2019, 7:24 AM

nikic added inline comments.Dec 19 2019, 12:43 PM

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332	Ah yes... LFTR does have a known issue where nowrap flags are not always stripped correctly when switching to a dynamically dead IV. It indeed looks like the drop of nsw here is correct. What isn't clear to me though is why it happens: As I understand this patch, it is about making use of nsw/nuw information on addrecs even if the start is non-constant, so I would naively assume that we should only be inferring more flags with this change, not less.

fix @sanjoy.google comments

llvm/lib/Analysis/ScalarEvolution.cpp
5661	If we use `getUnsignedRangeMin(AddRec->getStart())` as minimum value for SignedRange of the start of AddRec, we may get wrong result. For example, assuming we get range `[137, 227)` as result of `getUnsignedRangeMin(AddRec->getStart());` for type `i8`, if we want to return it as SignedRange, it will be `[-119, -29)`, this is invalid range for AddRec with `nuw` even for SignedRange?
llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll
332	The difference happens when we get SCEV for `%iv2.inc = add nuw nsw i32 %iv2, 1`. Firstly, we get SCEV for PHI `%iv2 = phi i32 [ -2, %entry ], [ %iv2.inc, %always_taken ]`, it is `{-2,+,1}<nuw><nsw><%for.cond>` When we try to get SCEV for `%iv2.inc = add nuw nsw i32 %iv2, 1`, it is `getAddExpr` with Ops `{1, {-2,+,1}<nuw><nsw><%for.cond>}` We get a different result in `StrengthenNoWrapFlags` called by `getAddExpr`. // (A <opcode> C) --> (A <opcode> C)<nsw> if the op doesn't sign overflow. if (!(SignOrUnsignWrap & SCEV::FlagNSW)) { auto NSWRegion = ConstantRange::makeGuaranteedNoWrapRegion( Opcode, C, OBO::NoSignedWrap); if (NSWRegion.contains(SE->getSignedRange(Ops[1]))) Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW); } // (A <opcode> C) --> (A <opcode> C)<nuw> if the op doesn't unsign overflow. if (!(SignOrUnsignWrap & SCEV::FlagNUW)) { auto NUWRegion = ConstantRange::makeGuaranteedNoWrapRegion( Opcode, C, OBO::NoUnsignedWrap); if (NUWRegion.contains(SE->getUnsignedRange(Ops[1]))) Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW); } For `{-2,+,1}<nuw><nsw><%for.cond>`, with this patch's fixing, now the SignedRange is `[-2,-2147483629)`, the unsigned range is `[-2,0)`, so new range is not contained inside MGNWR, flag is 0. Before the change, SignedRange is `[-2,0)`, unsigned range is `[-2,0)`. Signed Range is contained in Signed type MGNWR ([-2147483648,2147483647)), flag is `SCEV::FlagNSW`. But Seems SignedRange is quite conservative?

nikic added inline comments.Dec 23 2019, 2:16 AM

llvm/lib/Analysis/ScalarEvolution.cpp
5661	Why is `[-119, -29)` an invalid range?

shchenz marked an inline comment as done.Dec 23 2019, 4:07 AM

shchenz added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
5661	For SCEV1 `{x,+,stride}<nuw>`, when we try to get a signed range for SCEV1, the result range should be contained in [0, SINT_MAX +1)? If we use `getUnsignedRangeMin(AddRec->getStart())` to get start value range, it should be any value in [0, UINT_MAX + 1), so the start min can be a value between [ SINT_MAX +1, UINT_MAX + 1), like 137 for type `i8`. If the signed range for SCEV1 is [137, 227), it is not inside the expected range [0, 128). For this case, we should return empty for SCEV1 SignedRange?

nikic added inline comments.Dec 26 2019, 7:39 AM

llvm/lib/Analysis/ScalarEvolution.cpp
5661	I still don't really get it. Why would the range need to be inside [0, SINT_MAX]? For example, let's say the start range is `[-100, -90]`, then that could still result in a valid nuw addrec, e.g. because we have `iv == -10` as a loop exit condition. "nuw" doesn't imply that the numbers can't be "negative", as long as the addrec does not cross zero.

shchenz marked an inline comment as done.Dec 27 2019, 1:21 AM

shchenz added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
5661	Ah, thanks, got your idea. we get `--> {-10,+,1}<nuw><%loop> U: [-10,0) S: [-10,0)` So signed range for a SCEV with `nuw` flag may still be a range in the negative part.

shchenz planned changes to this revision.Dec 27 2019, 1:21 AM

for signed range with nuw flag, we can only use the sign min value when constant range for START can not cross 0.

shchenz marked an inline comment as done.Dec 29 2019, 7:46 PM

shchenz added inline comments.

llvm/test/Analysis/ScalarEvolution/range_nw_flag.ll
23	For signed range, I think here it should also be [11,0). but seems, there is a legacy issue for start SCEV signed range: `{(1 + (10 smax %offset))<nuw>,+,1}<nuw><%loop>`. signed range for `(10 smax %offset)` is `[10, SINT_MIN)` `(1 + (10 smax %offset))<nuw>` signed range should be `[11, SINT_MIN)`, instead of full-set? @nikic Even with the change in https://reviews.llvm.org/D64869 which takes `nuw` for add into account, we still get full-set as result. Flow in `addWithNoWrap`: normal add result for (1 + [10, SINT_MIN))<nuw> is `[11, SINT_MIN +1)` uadd_sat result for (1 + [10, SINT_MIN))<nuw> is also `[11, SINT_MIN +1)`, so final result is `[11, SINT_MIN +1)`, currently this range is treated as signed wrap range. (Min value is not SINT_MIN)

shchenz updated this revision to Diff 235537.Dec 29 2019, 8:48 PM

gentle ping

rebase after D64869

nikic added inline comments.Jan 8 2020, 10:21 AM

llvm/lib/Analysis/ScalarEvolution.cpp
5661	Thanks, this looks better. What isn't clear to me though is why we can't just always use `getUnsignedRangeMin()` here. Shouldn't that always be correct and potentially give a better result? Or is there some rule that you are not allowed to use the unsigned range while computing a signed range?

shchenz marked an inline comment as done.Jan 8 2020, 6:57 PM

shchenz added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
5661	Thanks for your comment. @nikic I am not sure about using `getUnsignedRangeMin()` as signed range min value is right. For my understanding, signed range and unsigned range are different ranges, why we can use unsigned range min value as signed range min value? `nuw` can ensure this?

nikic added inline comments.Jan 9 2020, 1:10 AM

llvm/lib/Analysis/ScalarEvolution.cpp
5661	Maybe @sanjoy.google can comment on that? My understanding is that both the unsigned and signed ranges are valid and we are free to use either one, so using the unsigned one is more beneficial in an unsigned context. But I'm not totally sure if my understanding here is right. In any case, even when not always using the unsigned range (maybe better to avoid unless we know for sure), you should still be able to simplify this code to: ConstantRange StartRange = getRangeRef(Start, SignHint); APInt MinValue = StartRange.getUnsignedMin(); This will return zero if the range crosses zero. There is no actual requirement here that the range is all-negative or all non-negative here, you just always need to pick the unsigned minimum of the range. This will usually be zero if the range is not all-negative or all non-negative, but there are exceptions. For example in your `[11, SINT_MIN + 1)` example from the test below the range can have both signs, but the unsigned min is still `11`.

My understanding is that both the unsigned and signed ranges are valid and we are free to use either one, so using the unsigned one is more beneficial in an unsigned context.

That is correct.

use unsigned range min as initial value

LGTM

Hi @sanjoy.google , could you please help to have another look at this? Thanks a lot.

lgtm

This revision was not accepted when it landed; it landed in state Needs Review.Jan 12 2020, 5:28 PM

Closed by commit rGa6342c247a17: [SCEV] accurate range for addrecexpr with nuw flag (authored by shchenz). · Explain Why

This revision was automatically updated to reflect the committed changes.

Diff 235093

llvm/lib/Analysis/ScalarEvolution.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 5,648 Lines • ▼ Show 20 Lines	if (const SCEVTruncateExpr *Trunc = dyn_cast<SCEVTruncateExpr>(S)) {
return setRange(Trunc, SignHint,		return setRange(Trunc, SignHint,
ConservativeResult.intersectWith(X.truncate(BitWidth),		ConservativeResult.intersectWith(X.truncate(BitWidth),
RangeType));		RangeType));
}		}

if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(S)) {		if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(S)) {
// If there's no unsigned wrap, the value will never be less than its		// If there's no unsigned wrap, the value will never be less than its
// initial value.		// initial value.
if (AddRec->hasNoUnsignedWrap())		if (AddRec->hasNoUnsignedWrap()) {
if (const SCEVConstant *C = dyn_cast<SCEVConstant>(AddRec->getStart()))		APInt UnsignedMinValue = APInt(BitWidth, 0);
		sanjoy.googleUnsubmitted Not Done Reply Inline Actions This is the unsigned minimum value right? If so please name it `UnsignedMinValue`. sanjoy.google: This is the unsigned minimum value right? If so please name it `UnsignedMinValue`.
if (!C->getValue()->isZero())		if (SignHint == ScalarEvolution::HINT_RANGE_UNSIGNED)
		UnsignedMinValue = getUnsignedRangeMin(AddRec->getStart());
		else {
		sanjoy.googleUnsubmitted Done Reply Inline Actions Can you comment on why you need the else block? sanjoy.google: Can you comment on why you need the else block?
		shchenzAuthorUnsubmitted Not Done Reply Inline Actions If we use `getUnsignedRangeMin(AddRec->getStart())` as minimum value for SignedRange of the start of AddRec, we may get wrong result. For example, assuming we get range `[137, 227)` as result of `getUnsignedRangeMin(AddRec->getStart());` for type `i8`, if we want to return it as SignedRange, it will be `[-119, -29)`, this is invalid range for AddRec with `nuw` even for SignedRange? shchenz: If we use `getUnsignedRangeMin(AddRec->getStart())` as minimum value for SignedRange of the…
		nikicUnsubmitted Not Done Reply Inline Actions Why is `[-119, -29)` an invalid range? nikic: Why is `[-119, -29)` an invalid range?
		shchenzAuthorUnsubmitted Done Reply Inline Actions For SCEV1 `{x,+,stride}<nuw>`, when we try to get a signed range for SCEV1, the result range should be contained in [0, SINT_MAX +1)? If we use `getUnsignedRangeMin(AddRec->getStart())` to get start value range, it should be any value in [0, UINT_MAX + 1), so the start min can be a value between [ SINT_MAX +1, UINT_MAX + 1), like 137 for type `i8`. If the signed range for SCEV1 is [137, 227), it is not inside the expected range [0, 128). For this case, we should return empty for SCEV1 SignedRange? shchenz: For SCEV1 `{x,+,stride}<nuw>`, when we try to get a signed range for SCEV1, the result range…
		nikicUnsubmitted Not Done Reply Inline Actions I still don't really get it. Why would the range need to be inside [0, SINT_MAX]? For example, let's say the start range is `[-100, -90]`, then that could still result in a valid nuw addrec, e.g. because we have `iv == -10` as a loop exit condition. "nuw" doesn't imply that the numbers can't be "negative", as long as the addrec does not cross zero. nikic: I still don't really get it. Why would the range need to be inside [0, SINT_MAX]? For example…
		shchenzAuthorUnsubmitted Done Reply Inline Actions Ah, thanks, got your idea. we get `--> {-10,+,1}<nuw><%loop> U: [-10,0) S: [-10,0)` So signed range for a SCEV with `nuw` flag may still be a range in the negative part. shchenz: Ah, thanks, got your idea. we get `--> {-10,+,1}<nuw><%loop> U: [-10,0) S: [-10,0)` So…
		nikicUnsubmitted Not Done Reply Inline Actions Thanks, this looks better. What isn't clear to me though is why we can't just always use `getUnsignedRangeMin()` here. Shouldn't that always be correct and potentially give a better result? Or is there some rule that you are not allowed to use the unsigned range while computing a signed range? nikic: Thanks, this looks better. What isn't clear to me though is why we can't just always use…
		shchenzAuthorUnsubmitted Done Reply Inline Actions Thanks for your comment. @nikic I am not sure about using `getUnsignedRangeMin()` as signed range min value is right. For my understanding, signed range and unsigned range are different ranges, why we can use unsigned range min value as signed range min value? `nuw` can ensure this? shchenz: Thanks for your comment. @nikic I am not sure about using `getUnsignedRangeMin()` as signed…
		nikicUnsubmitted Not Done Reply Inline Actions Maybe @sanjoy.google can comment on that? My understanding is that both the unsigned and signed ranges are valid and we are free to use either one, so using the unsigned one is more beneficial in an unsigned context. But I'm not totally sure if my understanding here is right. In any case, even when not always using the unsigned range (maybe better to avoid unless we know for sure), you should still be able to simplify this code to: ConstantRange StartRange = getRangeRef(Start, SignHint); APInt MinValue = StartRange.getUnsignedMin(); This will return zero if the range crosses zero. There is no actual requirement here that the range is all-negative or all non-negative here, you just always need to pick the unsigned minimum of the range. This will usually be zero if the range is not all-negative or all non-negative, but there are exceptions. For example in your `[11, SINT_MIN + 1)` example from the test below the range can have both signs, but the unsigned min is still `11`. nikic: Maybe @sanjoy.google can comment on that? My understanding is that both the unsigned and signed…
		ConstantRange StartRange =
		getRangeRef(AddRec->getStart(), HINT_RANGE_SIGNED)
		.intersectWith(
		ConstantRange(APInt(BitWidth, 0),
		APInt::getSignedMaxValue(BitWidth) + 1),
		ConstantRange::Unsigned);
		if (!StartRange.isEmptySet())
		UnsignedMinValue = StartRange.getUnsignedMin();
		}
		if (!UnsignedMinValue.isNullValue())
ConservativeResult = ConservativeResult.intersectWith(		ConservativeResult = ConservativeResult.intersectWith(
ConstantRange(C->getAPInt(), APInt(BitWidth, 0)), RangeType);		ConstantRange(UnsignedMinValue, APInt(BitWidth, 0)), RangeType);
		}

// If there's no signed wrap, and all the operands have the same sign or		// If there's no signed wrap, and all the operands have the same sign or
// zero, the value won't ever change sign.		// zero, the value won't ever change sign.
if (AddRec->hasNoSignedWrap()) {		if (AddRec->hasNoSignedWrap()) {
bool AllNonNeg = true;		bool AllNonNeg = true;
bool AllNonPos = true;		bool AllNonPos = true;
for (unsigned i = 0, e = AddRec->getNumOperands(); i != e; ++i) {		for (unsigned i = 0, e = AddRec->getNumOperands(); i != e; ++i) {
if (!isKnownNonNegative(AddRec->getOperand(i))) AllNonNeg = false;		if (!isKnownNonNegative(AddRec->getOperand(i))) AllNonNeg = false;
▲ Show 20 Lines • Show All 6,926 Lines • Show Last 20 Lines

llvm/test/Analysis/ScalarEvolution/range_nw_flag.ll

Show All 14 Lines	loop:
%exitcond = icmp eq i32 %nexti, %numIterations		%exitcond = icmp eq i32 %nexti, %numIterations
br i1 %exitcond, label %exit, label %loop		br i1 %exitcond, label %exit, label %loop

exit:		exit:
ret void		ret void
}		}

; CHECK-LABEL: @test-addrec-nuw		; CHECK-LABEL: @test-addrec-nuw
; CHECK: --> {(1 + (10 smax %offset))<nuw>,+,1}<nuw><%loop> U: full-set S: full-set		; CHECK: --> {(1 + (10 smax %offset))<nuw>,+,1}<nuw><%loop> U: [11,0) S: [11,0)
		shchenzAuthorUnsubmitted Not Done Reply Inline Actions For signed range, I think here it should also be [11,0). but seems, there is a legacy issue for start SCEV signed range: `{(1 + (10 smax %offset))<nuw>,+,1}<nuw><%loop>`. signed range for `(10 smax %offset)` is `[10, SINT_MIN)` `(1 + (10 smax %offset))<nuw>` signed range should be `[11, SINT_MIN)`, instead of full-set? @nikic Even with the change in https://reviews.llvm.org/D64869 which takes `nuw` for add into account, we still get full-set as result. Flow in `addWithNoWrap`: normal add result for (1 + [10, SINT_MIN))<nuw> is `[11, SINT_MIN +1)` uadd_sat result for (1 + [10, SINT_MIN))<nuw> is also `[11, SINT_MIN +1)`, so final result is `[11, SINT_MIN +1)`, currently this range is treated as signed wrap range. (Min value is not SINT_MIN) shchenz: For signed range, I think here it should also be [11,0). but seems, there is a legacy issue for…
define void @test-addrec-nuw(float* %input, i32 %offset, i32 %numIterations) {		define void @test-addrec-nuw(float* %input, i32 %offset, i32 %numIterations) {
entry:		entry:
%cmp = icmp sgt i32 %offset, 10		%cmp = icmp sgt i32 %offset, 10
%min.10 = select i1 %cmp, i32 %offset, i32 10		%min.10 = select i1 %cmp, i32 %offset, i32 10
br label %loop		br label %loop
loop:		loop:
%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]		%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]
%nexti = add nuw i32 %i, 1		%nexti = add nuw i32 %i, 1
Show All 10 Lines

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll

	Show First 20 Lines • Show All 280 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[IV2:%.]] = phi i32 [ -1, [[ENTRY:%.]] ], [ [[IV2_INC:%.]], [[ALWAYS_TAKEN:%.]] ]			; CHECK-NEXT: [[IV2:%.]] = phi i32 [ -1, [[ENTRY:%.]] ], [ [[IV2_INC:%.]], [[ALWAYS_TAKEN:%.]] ]
	; CHECK-NEXT: [[IV:%.]] = phi i32 [ -2147483648, [[ENTRY]] ], [ [[IV_INC:%.]], [[ALWAYS_TAKEN]] ]			; CHECK-NEXT: [[IV:%.]] = phi i32 [ -2147483648, [[ENTRY]] ], [ [[IV_INC:%.]], [[ALWAYS_TAKEN]] ]
	; CHECK-NEXT: store i32 [[IV]], i32* [[PTR:%.*]]			; CHECK-NEXT: store i32 [[IV]], i32* [[PTR:%.*]]
	; CHECK-NEXT: br i1 [[ALWAYS_FALSE:%.]], label [[NEVER_TAKEN:%.]], label [[ALWAYS_TAKEN]]			; CHECK-NEXT: br i1 [[ALWAYS_FALSE:%.]], label [[NEVER_TAKEN:%.]], label [[ALWAYS_TAKEN]]
	; CHECK: never_taken:			; CHECK: never_taken:
	; CHECK-NEXT: store volatile i32 [[IV2]], i32* [[PTR]]			; CHECK-NEXT: store volatile i32 [[IV2]], i32* [[PTR]]
	; CHECK-NEXT: br label [[ALWAYS_TAKEN]]			; CHECK-NEXT: br label [[ALWAYS_TAKEN]]
	; CHECK: always_taken:			; CHECK: always_taken:
	; CHECK-NEXT: [[IV2_INC]] = add nuw nsw i32 [[IV2]], 1			; CHECK-NEXT: [[IV2_INC]] = add nuw i32 [[IV2]], 1
	; CHECK-NEXT: [[IV_INC]] = add nsw i32 [[IV]], 1			; CHECK-NEXT: [[IV_INC]] = add nsw i32 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IV2_INC]], -2147483628			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IV2_INC]], -2147483628
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND]], label [[FOR_END:%.*]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND]], label [[FOR_END:%.*]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond
	Show All 26 Lines
	; CHECK-NEXT: [[IV2:%.]] = phi i32 [ -2, [[ENTRY:%.]] ], [ [[IV2_INC:%.]], [[ALWAYS_TAKEN:%.]] ]			; CHECK-NEXT: [[IV2:%.]] = phi i32 [ -2, [[ENTRY:%.]] ], [ [[IV2_INC:%.]], [[ALWAYS_TAKEN:%.]] ]
	; CHECK-NEXT: [[IV:%.]] = phi i32 [ -2147483648, [[ENTRY]] ], [ [[IV_INC:%.]], [[ALWAYS_TAKEN]] ]			; CHECK-NEXT: [[IV:%.]] = phi i32 [ -2147483648, [[ENTRY]] ], [ [[IV_INC:%.]], [[ALWAYS_TAKEN]] ]
	; CHECK-NEXT: store i32 [[IV]], i32* [[PTR:%.*]]			; CHECK-NEXT: store i32 [[IV]], i32* [[PTR:%.*]]
	; CHECK-NEXT: br i1 [[ALWAYS_FALSE:%.]], label [[NEVER_TAKEN:%.]], label [[ALWAYS_TAKEN]]			; CHECK-NEXT: br i1 [[ALWAYS_FALSE:%.]], label [[NEVER_TAKEN:%.]], label [[ALWAYS_TAKEN]]
	; CHECK: never_taken:			; CHECK: never_taken:
	; CHECK-NEXT: store volatile i32 [[IV2]], i32* [[PTR]]			; CHECK-NEXT: store volatile i32 [[IV2]], i32* [[PTR]]
	; CHECK-NEXT: br label [[ALWAYS_TAKEN]]			; CHECK-NEXT: br label [[ALWAYS_TAKEN]]
	; CHECK: always_taken:			; CHECK: always_taken:
	; CHECK-NEXT: [[IV2_INC]] = add nsw i32 [[IV2]], 1			; CHECK-NEXT: [[IV2_INC]] = add i32 [[IV2]], 1
				nikicUnsubmitted Not Done Reply Inline Actions Aren't these two test changes regressions? nikic: Aren't these two test changes regressions?
				shchenzAuthorUnsubmitted Done Reply Inline Actions Hi @nikic , thanks for your quick comment. For my understanding, seems the changes are improvement instead of regression. Before LFTR, we use `%iv` as loop iteration variable, the loop count `loop_count` is `abs(-2147483648) + 20` (`-2147483648` is `SINT32_MIN`). For the increase expression `%iv.inc = add nsw i32 %iv, 1`, it will not generate poison value(no overflow happens for the loop before transformation). After LFTR, we use `%iv2` as loop iteration variable, `%iv2` start-value is -2, so to iterates times `loop_count` (21 bigger than SINT32_MAX), expression `[[IV2_INC]] = add nsw i32 [[IV2]], 1` will surely overflow, thus we will get poison value. But after the change, no `nsw` anymore, so no poison value, we will do wrap to reach `loop_count` times iterations. Could you please help to point out where am I wrong? shchenz: Hi @nikic , thanks for your quick comment. For my understanding, seems the changes are…
				sanjoy.googleUnsubmitted Not Done Reply Inline Actions Either the test changes are regressions or this change is a bugfix (and LFTR was incorrect before). sanjoy.google: Either the test changes are regressions or this change is a bugfix (and LFTR was incorrect…
				nikicUnsubmitted Done Reply Inline Actions Ah yes... LFTR does have a known issue where nowrap flags are not always stripped correctly when switching to a dynamically dead IV. It indeed looks like the drop of nsw here is correct. What isn't clear to me though is why it happens: As I understand this patch, it is about making use of nsw/nuw information on addrecs even if the start is non-constant, so I would naively assume that we should only be inferring more flags with this change, not less. nikic: Ah yes... LFTR does have a known issue where nowrap flags are not always stripped correctly…
				shchenzAuthorUnsubmitted Done Reply Inline Actions The difference happens when we get SCEV for `%iv2.inc = add nuw nsw i32 %iv2, 1`. Firstly, we get SCEV for PHI `%iv2 = phi i32 [ -2, %entry ], [ %iv2.inc, %always_taken ]`, it is `{-2,+,1}<nuw><nsw><%for.cond>` When we try to get SCEV for `%iv2.inc = add nuw nsw i32 %iv2, 1`, it is `getAddExpr` with Ops `{1, {-2,+,1}<nuw><nsw><%for.cond>}` We get a different result in `StrengthenNoWrapFlags` called by `getAddExpr`. // (A <opcode> C) --> (A <opcode> C)<nsw> if the op doesn't sign overflow. if (!(SignOrUnsignWrap & SCEV::FlagNSW)) { auto NSWRegion = ConstantRange::makeGuaranteedNoWrapRegion( Opcode, C, OBO::NoSignedWrap); if (NSWRegion.contains(SE->getSignedRange(Ops[1]))) Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW); } // (A <opcode> C) --> (A <opcode> C)<nuw> if the op doesn't unsign overflow. if (!(SignOrUnsignWrap & SCEV::FlagNUW)) { auto NUWRegion = ConstantRange::makeGuaranteedNoWrapRegion( Opcode, C, OBO::NoUnsignedWrap); if (NUWRegion.contains(SE->getUnsignedRange(Ops[1]))) Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW); } For `{-2,+,1}<nuw><nsw><%for.cond>`, with this patch's fixing, now the SignedRange is `[-2,-2147483629)`, the unsigned range is `[-2,0)`, so new range is not contained inside MGNWR, flag is 0. Before the change, SignedRange is `[-2,0)`, unsigned range is `[-2,0)`. Signed Range is contained in Signed type MGNWR ([-2147483648,2147483647)), flag is `SCEV::FlagNSW`. But Seems SignedRange is quite conservative? shchenz: The difference happens when we get SCEV for `%iv2.inc = add nuw nsw i32 %iv2, 1`. Firstly, we…
	; CHECK-NEXT: [[IV_INC]] = add nsw i32 [[IV]], 1			; CHECK-NEXT: [[IV_INC]] = add nsw i32 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IV2_INC]], -2147483629			; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IV2_INC]], -2147483629
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND]], label [[FOR_END:%.*]]			; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_COND]], label [[FOR_END:%.*]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond
	Show All 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] get a more accurate range for AddRecExpr with nuw flag
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 235093

llvm/lib/Analysis/ScalarEvolution.cpp

llvm/test/Analysis/ScalarEvolution/range_nw_flag.ll

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll

This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] get a more accurate range for AddRecExpr with nuw flagClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 235093

llvm/lib/Analysis/ScalarEvolution.cpp

llvm/test/Analysis/ScalarEvolution/range_nw_flag.ll

llvm/test/Transforms/IndVarSimplify/lftr-pr31181.ll

[SCEV] get a more accurate range for AddRecExpr with nuw flag
ClosedPublic