This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Analysis/ScalarEvolution.cpp
11518	If we consider the `canIVOverflowOnLT()` case, the only thing this guarantees is that `RHS + Stride - 1` does not overflow; it doesn't make a direct statement about the addrec. If the loop exits before reaching this exit (for simplicity: abnormal exit on first iteration), then I don't think we can really make any statement about the nowrap behavior of the addrec.
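
To make the guarantee concrete, here is a minimal sketch of the unsigned form of the check on 8-bit values. The name `canOverflowOnULT8` and the fixed i8 width are assumptions for illustration only; as far as I can tell the real `canIVOverflowOnLT()` reasons about SCEV ranges rather than concrete values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical i8 model of the unsigned check: returns true when
// RHS + (Stride - 1) does not fit in 8 bits, i.e. when the compared value
// could wrap before the `ult` exit test fires.
bool canOverflowOnULT8(std::uint8_t RHS, std::uint8_t Stride) {
  assert(Stride >= 1 && "model assumes a positive stride");
  const unsigned MaxValue = 255u;
  const unsigned StrideMinusOne = Stride - 1u;
  // Equivalent to RHS + StrideMinusOne > MaxValue, written without overflow.
  return (MaxValue - StrideMinusOne) < RHS;
}
```

For example, `canOverflowOnULT8(200, 2)` is false while `canOverflowOnULT8(255, 2)` is true. Note that the start value of the addrec does not appear anywhere in the check, which is the gap raised in the comment above.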

Per above comment

This revision now requires changes to proceed. Jun 9 2021, 9:48 AM

reames added inline comments. Jun 9 2021, 2:08 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	Right, but we know that the result of the expression being considered must reach a conditional branch, and that conditional branch must dominate the latch. Given that, we know that if this condition did overflow, then the iteration of the loop would be UB unless we exit from some other exit before this one. In either case, that seems to give an upper bound on the trip count, which does imply that the AddRec can't overflow in a well-defined loop, doesn't it?

LGTM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	Ah yes, you're right. It does still provide an upper bound. I got this mixed up with concerns from D101722. I've played around with some examples and convinced myself that this is correct...

This revision is now accepted and ready to land. Jun 9 2021, 2:58 PM

nikic added inline comments. Jun 9 2021, 3:16 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	After thinking about it some more, there is still one case that may be problematic. What if the addrec starts out above the RHS (meaning the loop exits after the first iteration)? I'm thinking something like this:

  ; Assume %start is larger than 200.
  define void @test(i8 %start) {
  entry:
    br label %loop
  loop:
    %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
    %iv.next = add i8 %iv, 2
    call void @use_and_exit(i8 %iv.next)
    %cond = icmp ult i8 %iv.next, 200
    br i1 %cond, label %loop, label %exit
  exit:
    ret void
  }
  declare void @use_and_exit(i8)

I believe your change would now infer that `{2 + %start,+,2}` is `<nuw>`, but if `@use_and_exit()` exits beforehand that is not necessarily a given.
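
The example can be stepped through with a small 8-bit simulation. This is only a sketch: `runTestLoop` and `LoopTrace` are hypothetical names, and `@use_and_exit` is modelled as merely recording its argument, so the abnormal exit that the comment worries about is deliberately not modelled.

```cpp
#include <cstdint>
#include <vector>

// Result of running the hypothetical model of @test.
struct LoopTrace {
  std::vector<std::uint8_t> IVNext; // every value passed to @use_and_exit
  bool IncrementWrapped = false;    // true if any `add i8 %iv, 2` wrapped
};

// Models the loop body: %iv.next = add i8 %iv, 2, then stay in the loop
// only while %iv.next u< 200.
LoopTrace runTestLoop(std::uint8_t Start) {
  LoopTrace T;
  std::uint8_t IV = Start;
  while (true) {
    // Do the add in a wider type so unsigned wrap can be detected.
    unsigned Wide = static_cast<unsigned>(IV) + 2u;
    if (Wide > 255u)
      T.IncrementWrapped = true;
    std::uint8_t Next = static_cast<std::uint8_t>(Wide);
    T.IVNext.push_back(Next); // stands in for the call to @use_and_exit
    if (!(Next < 200))
      break; // %cond is false: leave through %exit
    IV = Next;
  }
  return T;
}
```

With `Start = 250` the model passes only 252 to the call and exits without any wrap. With `Start = 254` the first increment wraps to 0, so the call already sees a wrapped value, and the `ult` exit is not taken on that iteration.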

Change appears to be wrong as written, discussing why before adjusting.

llvm/lib/Analysis/ScalarEvolution.cpp
11518	I think you're correct, and that this change is wrong. But I think the reasoning on why is a bit different than your phrasing. Let me try to describe why, and you can tell me if this matches what you're thinking.

In order for the flags computed by canIVOverflowOnLT to hold for the IV in general, we need to know that a poison value for the comparison must reach a branch (or other full UB use). If, after poison has been generated but before the undefined use, the loop exits, the overflow flag might not hold. This can happen in either of two cases:

a) There's an abnormal or normal exit before this one which isn't control dependent on this IV. If that happens, users of the IV might see a wrapped value even though we've proved this exit can't be taken after overflow. (Aside: this is specific to the canIVOverflowOnLT path. I believe the isUBOnWrap path is sufficiently conservative to not have this issue.)

b) If the IV starts at a value above the RHS, the loop may exit on the first iteration. But, and this is the key bit, assuming the exit taken is control dependent on the IV, we can still infer the nowrap flag. Why? There are two addrecs here. There's IV = {start, +, 2}, and IV.next = {start + 2, +, 2}. The exit test is in terms of the latter. Given that, the first iteration of %iv.next can't wrap, or we wouldn't be in the exit on first iteration case.

Hm, I do notice this argument is specific to the step and rhs values here. If rhs was, say, 0, I think this argument collapses. Maybe we have another latent issue here?

nikic added inline comments. Jun 11 2021, 11:36 AM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	Sorry for the back and forth here. I just realized something important: We're talking about nowrap flags on addrecs here, not on the incrementing add itself. And nowrap flags on addrecs are directly tied to the trip count. An addrec for a loop with zero iterations is always nuw/nsw. So maybe there's no problem here after all?
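
One way to read the claim that addrec nowrap flags are tied to the trip count is the following 8-bit sketch. The helper `addRecIsNUW8` is hypothetical and encodes only the interpretation used in this thread (the flag constrains the applications of the step that actually happen); it is not taken from ScalarEvolution itself.

```cpp
#include <cstdint>

// Hypothetical i8 model: checks whether {Start,+,Step} stays free of
// unsigned wrap across the first BackedgeTakenCount applications of Step.
bool addRecIsNUW8(std::uint8_t Start, std::uint8_t Step,
                  unsigned BackedgeTakenCount) {
  unsigned Value = Start;
  for (unsigned I = 0; I < BackedgeTakenCount; ++I) {
    Value += Step; // one application of the step, in a wider type
    if (Value > 255u)
      return false; // this step would wrap in 8 bits
  }
  return true; // no step applied, or none of them wrapped
}
```

Under this reading, `addRecIsNUW8(252, 2, 0)` holds trivially because the step is never applied, whereas `addRecIsNUW8(252, 2, 2)` fails on the second application of the step.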

reames added inline comments. Jun 11 2021, 12:45 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	Oh, I see your point, but let me rephrase again. For a loop which exits on the first iteration, the exiting value of the addrec is the start value. For a post-inc IV, that start value is the start of the pre-inc IV + the step. The computation of that start value might overflow, but there's no further application of the step, and thus the addrec is trivially nsw/nuw.

Though, unless I'm missing something, this only addresses the second part of my last response. I believe the poison not reaching branch instruction issue still exists, and isn't iteration 1 specific. I believe that could happen on any iteration which happens to be the final iteration, but not exiting through this exit.

reames added inline comments. Jun 11 2021, 12:47 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	> Sorry for the back and forth here.

Also, just wanted to explicitly say please don't be sorry, this back and forth is really helpful for me in figuring out exactly what's going on with this (insanely complicated) set of code. I really appreciate the discussion.

nikic added inline comments. Jun 11 2021, 2:04 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11518	> Though, unless I'm missing something, this only addresses the second part of my last response. I believe the poison not reaching branch instruction issue still exists, and isn't iteration 1 specific. I believe that could happen on any iteration which happens to be the final iteration, but not exiting through this exit.

I think it should cover that case as well. Nowrap flags on addrecs don't make a statement about overflow behavior on the last loop iteration, so it should be okay even if there is a prior exit (on the last iteration).

reames abandoned this revision. Sep 1 2021, 1:45 PM

Revision Contents

Path                                                         Size
llvm/lib/Analysis/ScalarEvolution.cpp                        7 lines
llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll   10 lines

Diff 350417

llvm/lib/Analysis/ScalarEvolution.cpp

(11,412 preceding lines not shown)
   if (!IV && AllowPredicates) {
     IV = convertSCEVToAddRecWithPredicates(LHS, L, Predicates);
     PredicatedIV = true;
   }

   // Avoid weird loops
   if (!IV || IV->getLoop() != L || !IV->isAffine())
     return getCouldNotCompute();

-  bool NoWrap = ControlsExit &&
-                IV->getNoWrapFlags(IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW);
+  auto WrapType = IsSigned ? SCEV::FlagNSW : SCEV::FlagNUW;
+  bool NoWrap = ControlsExit && IV->getNoWrapFlags(WrapType);

   const SCEV *Stride = IV->getStepRecurrence(*this);

   bool PositiveStride = isKnownPositive(Stride);

   // Avoid negative or zero stride values.
   if (!PositiveStride) {
     // We can compute the correct backedge taken count for loops with unknown
(77 lines not shown)
     };

     // Avoid proven overflow cases: this will ensure that the backedge taken
     // count will not generate any unsigned overflow. Relaxed no-overflow
     // conditions exploit NoWrapFlags, allowing to optimize in presence of
     // undefined behaviors like the case of C language.
     if (canIVOverflowOnLT(RHS, Stride, IsSigned) && !isUBOnWrap())
       return getCouldNotCompute();
+
+    // We have proven that IV can not overflow, remember that fact.
+    setNoWrapFlags(const_cast<SCEVAddRecExpr *>(IV), WrapType);
   }

   ICmpInst::Predicate Cond = IsSigned ? ICmpInst::ICMP_SLT
                                       : ICmpInst::ICMP_ULT;
   const SCEV *Start = IV->getStart();
   const SCEV *End = RHS;
   // When the RHS is not invariant, we do not know the end bound of the loop and
   // cannot calculate the ExactBECount needed by ExitLimit. However, we can
(2,153 following lines not shown)

llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll

(73 preceding lines not shown)
 define void @test_01(i32 %a, i32 %b) {

 ; CHECK-LABEL: Classifying expressions for: @test_01
 ; CHECK: %sum1 = add i32 %phi1, %phi2
 ; CHECK-NEXT: --> {(%a + %b),+,3}<%loop1>
 ; CHECK: %sum2 = add i32 %sum1, %phi3
 ; CHECK-NEXT: --> {(6 + %a + %b),+,6}<%loop1>
 ; CHECK: %is1 = add i32 %sum2, %a
-; CHECK-NEXT: --> {(6 + (2 * %a) + %b),+,6}<%loop1>
+; CHECK-NEXT: --> {(6 + (2 * %a) + %b),+,6}<nuw><%loop1>
 ; CHECK: %sum3 = add i32 %phi4, %phi5
 ; CHECK-NEXT: --> {116,+,3}<%loop2>
 ; CHECK: %sum4 = add i32 %sum3, %phi6
 ; CHECK-NEXT: --> {159,+,6}<%loop2>
 ; CHECK: %is2 = add i32 %sum4, %b
 ; CHECK-NEXT: --> {(159 + %b),+,6}<%loop2>
 ; CHECK: %ec2 = add i32 %is1, %is2
-; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<%loop1>,+,6}<%loop2>
+; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<nw><%loop1>,+,6}<%loop2>
 ; CHECK: %s1 = add i32 %phi1, %is1
 ; CHECK-NEXT: --> {(6 + (3 * %a) + %b),+,7}<%loop1>
 ; CHECK: %s2 = add i32 %is2, %phi4
 ; CHECK-NEXT: --> {(222 + %b),+,7}<%loop2>
 ; CHECK: %s3 = add i32 %is1, %phi5
-; CHECK-NEXT: --> {{{{}}(59 + (2 * %a) + %b),+,6}<%loop1>,+,2}<%loop2>
+; CHECK-NEXT: --> {{{{}}(59 + (2 * %a) + %b),+,6}<nw><%loop1>,+,2}<%loop2>
 ; CHECK: %s4 = add i32 %phi2, %is2
 ; CHECK-NEXT: --> {{{{}}(159 + (2 * %b)),+,2}<%loop1>,+,6}<%loop2>
 ; CHECK: %s5 = add i32 %is1, %is2
-; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<%loop1>,+,6}<%loop2>
+; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<nw><%loop1>,+,6}<%loop2>
 ; CHECK: %s6 = add i32 %is2, %is1
-; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<%loop1>,+,6}<%loop2>
+; CHECK-NEXT: --> {{{{}}(165 + (2 * %a) + (2 * %b)),+,6}<nw><%loop1>,+,6}<%loop2>

 entry:
   br label %loop1

 loop1:
   %phi1 = phi i32 [ %a, %entry ], [ %phi1.inc, %loop1 ]
   %phi2 = phi i32 [ %b, %entry ], [ %phi2.inc, %loop1 ]
   %phi3 = phi i32 [ 6, %entry ], [ %phi3.inc, %loop1 ]
(523 following lines not shown)

[SCEV] Cache wrap facts for positive IVs w/LT exits
Abandoned, Public
