This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Move mustprogress based no-self-wrap logic so it applies to all exit conditions
Closed, Public

Authored by reames on Jun 9 2021, 2:48 PM.


Details

Reviewers

nikic
mkazantsev
fhahn
lebedev.ri

Commits

rGea12c2cb9c42: [SCEV] Move mustprogress based no-self-wrap logic so it applies to all exit…

Summary

This change moves logic that we'd added specifically for less-than tests so that it applies to equality and greater-than tests as well. The basic idea is that if we can show that an IV cycles infinitely through the same series once it self-wraps, and that the exit must be taken to avoid UB, then the exit must be taken before self-wrap, and we can infer the no-self-wrap flag.

The motivation here is simple loops with unsigned induction variables w/non-one steps and inequality tests. A toy example would be:
for (unsigned i = 0; i != N; i += 2) { body; }

If body contains no side effects, and this is a mustprogress function, we can assume that this must be a finite loop and thus that the exit count is N/2.
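
To make the toy example concrete, here is a minimal sketch (not part of the patch) that replays the loop in 8-bit wrapping arithmetic rather than 32-bit. `LoopRun` and `simulateNe8` are hypothetical names introduced only for illustration. It reports whether the loop exits, how many times the body ran, and whether the IV overflowed before the exit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: outcome of one simulated loop.
struct LoopRun {
  bool finite;     // did the loop ever exit?
  unsigned trips;  // body executions before exit (or before the cycle closed)
  bool wrapped;    // did the IV overflow 8 bits before exiting?
};

// Replays `for (uint8_t i = 0; i != n; i += step) { body; }` with 8-bit
// wrapping arithmetic, standing in for the 32-bit `unsigned` of the example.
LoopRun simulateNe8(uint8_t n, uint8_t step) {
  assert(step != 0 && "a zero step never makes progress");
  uint8_t i = 0;
  unsigned trips = 0;
  bool wrapped = false;
  while (i != n) {
    const uint8_t next = static_cast<uint8_t>(i + step);
    if (next < i)
      wrapped = true;  // unsigned overflow on this increment
    i = next;
    ++trips;
    if (i == 0)  // back at the start value: the sequence repeats from here
      return {false, trips, true};
  }
  return {true, trips, wrapped};
}
```

With step 2, `simulateNe8(6, 2)` exits after 3 trips (N/2) without wrapping, while an odd bound such as 5 is never hit. With a non-power-of-two step such as 3, `simulateNe8(2, 3)` does exit, but only after the IV has wrapped, which is why the inference is limited to strides where the sequence repeats on self-wrap.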

Note that we canonicalize to NE tests in LFTR, so after the previous change we'd compute an exit count, perform LFTR, and then "forget" what the exit count was again.

A couple notes on things left out of this patch:

Looking back through multiple invertible functions, and anything other than ZExt. This is easy to add, but deserves its own test coverage and review. I add ZExt only so that I could delete the less-than version of the same code.
Forming the IV as an addrec where possible. Doing this will (further) help our ability to handle extended ne tests, but will be done separately. (See the existing less than code for what I mean here - note that code will still be triggered if we prove no-self-wrap in this patch.)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. (Jun 9 2021, 2:48 PM)

Herald added subscribers: bollu, hiraditya, mcrosier. (Jun 9 2021, 2:48 PM)

reames requested review of this revision. (Jun 9 2021, 2:48 PM)

Herald added a project: Restricted Project. (Jun 9 2021, 2:48 PM)

reames mentioned this in rG4ac3dae57f27: [tests] Precommit test for D103991. (Jun 9 2021, 3:06 PM)

Harbormaster completed remote builds in B108490: Diff 350999. (Jun 9 2021, 3:30 PM)

Please add tests for the case where this exit doesn't dominate the latch.

llvm/lib/Analysis/ScalarEvolution.cpp
8287	It is only true if this condition dominates the latch. If not, it might exit at some point during the iteration space, but it was guarded by a volatile condition that prevented it. If my understanding is correct, please specify this requirement explicitly in function's comment. Maybe (I'm not sure) this still holds for non-dominating exits if it is the only exit from the loop.
8300	Please add early bail if IV is not affine. This will save compile time for complex addrecs.
8302	How about negative powers of 2? I think all your reasoning validly applies for them too.
8311	This should go before anything else to save compile time.
llvm/test/Analysis/ScalarEvolution/ne-overflow.ll
10	Please use ./utils/update_analyze_test_checks.py. It works well with SCEV.

This revision now requires changes to proceed. (Jun 9 2021, 9:32 PM)

reames added inline comments. (Jun 11 2021, 10:12 AM)

llvm/lib/Analysis/ScalarEvolution.cpp
8287	All exits being analyzed for exit counts must dominate the latch. See computeExitLimit. I might be remembering this wrong, but I think you were even the person who added that code. :)
8300	I don't think it actually saves anything in this case. Both callers filter out non-affine ARs before calling this. I'll make it an assert though.
8302	I believe it does too. Incrementalism. And not terribly relevant for LT/EQ cases; it'll be more important when applied to GT.
8311	Er, I think you misread the code? It's at the earliest point it can legally be. It's proving no-self-wrap.
llvm/test/Analysis/ScalarEvolution/ne-overflow.ll
10	No actually, it doesn't. It just claims to. I've tried it with other test changes recently, and been very unpleasantly surprised. It mostly just results in missing check lines.
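
As a rough illustration of the power-of-two discussion at line 8302 above, the following sketch (8-bit arithmetic, hypothetical helper names `period8` and `ringLaps8`, not LLVM code) counts how many times an IV travels around the value ring before it first returns to its start. A result of exactly 1 means the first self-wrap lands back on the start value, so the sequence repeats. Under these assumptions that holds for both positive and negative power-of-two steps, but not for steps such as 3 or 6.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Number of increments before an 8-bit IV stepping by `step` first returns
// to its start value. The start does not affect the period, so 0 is used.
unsigned period8(int8_t step) {
  assert(step != 0 && "a zero step has no meaningful period");
  const uint8_t ustep = static_cast<uint8_t>(step);
  uint8_t iv = 0;
  unsigned count = 0;
  do {
    iv = static_cast<uint8_t>(iv + ustep);
    ++count;
  } while (iv != 0);
  return count;
}

// Number of times the IV goes around the 256-value ring in one period.
// Exactly 1 means the values repeat as soon as the IV self-wraps.
unsigned ringLaps8(int8_t step) {
  const unsigned magnitude =
      static_cast<unsigned>(std::abs(static_cast<int>(step)));
  return period8(step) * magnitude / 256;
}
```

For example, `ringLaps8(2)`, `ringLaps8(-2)` and `ringLaps8(-128)` all give 1, while `ringLaps8(3)` gives 3.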

Next action item here is mine. (Rebase needed, tests can be landed, etc..)

This update reworks the patch, and hopefully makes it a lot more obviously correct. The patch is restructured to purely pull code up from howManyLessThans into the caller so that it handles all condition codes. This does reduce the scope slightly, but I've included planned extensions in the review description which should cover all cases.

nikic added inline comments. (Nov 18 2021, 9:02 AM)

llvm/lib/Analysis/ScalarEvolution.cpp
8335	Also check that the NW flag isn't already set?
11844	Maybe you want to extract canAssumeNoSelfWrap as a static function and reuse it? You repeat the same conditions in the new code, just without the comments.

reames added inline comments. (Nov 18 2021, 9:09 AM)

llvm/lib/Analysis/ScalarEvolution.cpp
8335	Oh, good catch.
11844	I'm hoping to delete it in a following commit. I replicated the explanation that seemed relevant, and the remaining use looks highly suspect. (I wrote the remaining use, and I'm not convinced by my own comments. At a minimum, I want to rework the comments.)

Harbormaster completed remote builds in B134914: Diff 388214. (Nov 18 2021, 9:17 AM)

Address review comments

This revision was not accepted when it landed; it landed in state Needs Review. (Nov 18 2021, 10:10 AM)

This revision was landed with ongoing or failed builds.

Closed by commit rGea12c2cb9c42: [SCEV] Move mustprogress based no-self-wrap logic so it applies to all exit… (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rGea12c2cb9c42: [SCEV] Move mustprogress based no-self-wrap logic so it applies to all exit….

Harbormaster completed remote builds in B134924: Diff 388235. (Nov 18 2021, 10:12 AM)

reames mentioned this in D114176: [SCEV] Look through invertible functions when infering no-self-wrap from mustprogres. (Nov 18 2021, 10:31 AM)

Looks like this caused an unexpected compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=7a14244cc645bbbcbf5056e7a00fadbb339e92ed&to=ea12c2cb9c4221095abfb2af7148140783040734&stat=instructions +1.5% on mafft.

In D103991#3140961, @nikic wrote:

Looks like this caused an unexpected compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=7a14244cc645bbbcbf5056e7a00fadbb339e92ed&to=ea12c2cb9c4221095abfb2af7148140783040734&stat=instructions +1.5% on mafft.

You're right, that is unexpected. I don't really see anything in this code likely to be slow; do you see anything obvious? If not, I may need help testing a few variants to see what we see.

Any evidence on impact? Are we transforming those benchmarks more?

In D103991#3141054, @reames wrote:

In D103991#3140961, @nikic wrote:

Looks like this caused an unexpected compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=7a14244cc645bbbcbf5056e7a00fadbb339e92ed&to=ea12c2cb9c4221095abfb2af7148140783040734&stat=instructions +1.5% on mafft.

You're right, that is unexpected. I don't really see anything in this code likely to be slow; do you see anything obvious? If not, I may need help testing a few variants to see what we see.

Any evidence on impact? Are we transforming those benchmarks more?

There is some impact, but it's very minor: https://llvm-compile-time-tracker.com/compare.php?from=7a14244cc645bbbcbf5056e7a00fadbb339e92ed&to=ea12c2cb9c4221095abfb2af7148140783040734&stat=size-text

I don't immediately see what could cause this.

In D103991#3141134, @nikic wrote:

I don't immediately see what could cause this.

Does 1a5666acb help? That's the only thing I could vaguely see causing this.

In D103991#3141221, @reames wrote:

In D103991#3141134, @nikic wrote:

I don't immediately see what could cause this.

Does 1a5666acb help? That's the only thing I could vaguely see causing this.

Nope, this didn't make any measurable difference.

In D103991#3141328, @nikic wrote:

Nope, this didn't make any measurable difference.

How about 734abbad7?

In D103991#3141667, @reames wrote:

In D103991#3141328, @nikic wrote:

Nope, this didn't make any measurable difference.

How about 734abbad7?

Also no difference. I suspect that we're seeing some kind of second order effect here, not an issue in the code itself.

In D103991#3142766, @nikic wrote:

Also no difference. I suspect that we're seeing some kind of second order effect here, not an issue in the code itself.

I agree and am going to revert the two speculative changes as they just complicate the code.

I don't really know what else to do here on the original regression. It sure seems like inferring flags is simply making some other piece of code slower. As things stand, I plan to leave the original code in tree as the regression is small, but I find that result unsatisfying.

nikic mentioned this in D114185: [SCEV] Leverage inferred no-self-wrap flags to refine trip counts. (Nov 19 2021, 10:14 AM)

Looking at callgrind profiles for one test case, the main additional cost is when simplifying IVs after unrolling, during the willOverflow check that zero-/sign-extends both operands. In getZeroExtendExpr we seem to spend more time in various proveNoWrapByXYZ() methods. It's not obvious to me how this change would cause that; I'd more expect the reverse effect (less need to infer additional flags because we added more here).
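
For readers unfamiliar with the idea behind an overflow check that extends both operands, here is a minimal sketch on plain 8-bit integers. It is not the LLVM implementation, which reasons about SCEV expressions rather than concrete values, and the function names are hypothetical. The check compares the sum computed in a wider type with the extension of the narrow (wrapped) sum; they differ exactly when the narrow add overflows.

```cpp
#include <cassert>
#include <cstdint>

// True if a + b wraps in 8-bit unsigned arithmetic:
// zext(a) + zext(b) differs from zext(a + b).
bool addWrapsUnsigned8(uint8_t a, uint8_t b) {
  const unsigned wideSum = unsigned{a} + unsigned{b};
  const uint8_t narrowSum = static_cast<uint8_t>(a + b);
  return wideSum != unsigned{narrowSum};
}

// True if a + b wraps in 8-bit signed arithmetic:
// sext(a) + sext(b) differs from sext(a + b).
bool addWrapsSigned8(int8_t a, int8_t b) {
  const int wideSum = int{a} + int{b};
  // Form the wrapped 8-bit sum with unsigned arithmetic, then reinterpret it
  // as signed (modular since C++20, and on mainstream compilers before that).
  const uint8_t bits = static_cast<uint8_t>(static_cast<uint8_t>(a) +
                                            static_cast<uint8_t>(b));
  const int8_t narrowSum = static_cast<int8_t>(bits);
  return wideSum != int{narrowSum};
}

// A few spot checks that double as usage examples.
void checkWideningExamples() {
  assert(addWrapsUnsigned8(200, 100));   // 300 does not fit in 8 bits
  assert(!addWrapsUnsigned8(100, 100));  // 200 fits as unsigned
  assert(addWrapsSigned8(100, 100));     // 200 does not fit as signed
  assert(!addWrapsSigned8(100, -100));   // 0 fits
}
```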

In D103991#3143636, @nikic wrote:

Looking at callgrind profiles for one test case, the main additional cost is when simplifying IVs after unrolling, during the willOverflow check that zero-/sign-extends both operands. In getZeroExtendExpr we seem to spend more time in various proveNoWrapByXYZ() methods. It's not obvious to me how this change would cause that; I'd more expect the reverse effect (less need to infer additional flags because we added more here).

All I can think of is that maybe we figure out the trip count for some loop, unroll it, and then spend time analyzing the newly introduced IVs? That's really the only interaction I can find.

Out of curiosity, what kind of unrolling do you see? Full, partial, max count full, or runtime?

In D103991#3143841, @reames wrote:

In D103991#3143636, @nikic wrote:

Looking at callgrind profiles for one test case, the main additional cost is when simplifying IVs after unrolling, during the willOverflow check that zero-/sign-extends both operands. In getZeroExtendExpr we seem to spend more time in various proveNoWrapByXYZ() methods. It's not obvious to me how this change would cause that; I'd more expect the reverse effect (less need to infer additional flags because we added more here).

All I can think of is that maybe we figure out the trip count for some loop, unroll it, and then spend time analyzing the newly introduced IVs? That's really the only interaction I can find.

This file (tddis.c from mafft) doesn't see any codegen change, so it's not new unrolling. It could be that we're now better able to analyze an unrolled loop though.

Out of curiosity, what kind of unrolling do you see? Full, partial, max count full, or runtime?

From the debug log, there's both full and runtime unrolling going on in this file -- but looking at the unrolling implementation, we only simplify IVs for non-complete unrolling, so this should be related to runtime unrolling.

@nikic I don't see anything obvious here, and don't consider the reported issue worthy of a revert. Given that, I'm going to stop looking into this. If you want to file a standalone bug with a reproducer, I can take a further look, but at the moment, the information shared so far is not easily actionable.

Revision Contents

Path                                                              Size
llvm/lib/Analysis/ScalarEvolution.cpp                             35 lines
llvm/test/Analysis/ScalarEvolution/ne-overflow.ll                 28 lines
llvm/test/Analysis/ScalarEvolution/trip-count-negative-stride.ll  16 lines

Diff 388252

llvm/lib/Analysis/ScalarEvolution.cpp


@@ ... @@ ScalarEvolution::computeExitLimitFromICmp(const Loop *L,
                                           bool AllowPredicates) {
   // If the condition was exit on true, convert the condition to exit on false
   ICmpInst::Predicate Pred;
   if (!ExitIfTrue)
     Pred = ExitCond->getPredicate();
   else
     Pred = ExitCond->getInversePredicate();
   const ICmpInst::Predicate OriginalPred = Pred;

   const SCEV *LHS = getSCEV(ExitCond->getOperand(0));
   const SCEV *RHS = getSCEV(ExitCond->getOperand(1));

   // Try to evaluate any dependencies out of the loop.
   LHS = getSCEVAtScope(LHS, L);
   RHS = getSCEVAtScope(RHS, L);

   // At this point, we would like to compute how many iterations of the
   // loop the predicate will return true for these inputs.
   if (isLoopInvariant(LHS, L) && !isLoopInvariant(RHS, L)) {
     // If there is a loop-invariant, force it into the RHS.
     std::swap(LHS, RHS);
     Pred = ICmpInst::getSwappedPredicate(Pred);
   }

   // Simplify the operands before analyzing them.
   (void)SimplifyICmpOperands(Pred, LHS, RHS);

   // If we have a comparison of a chrec against a constant, try to use value
   // ranges to answer this query.
   if (const SCEVConstant *RHSC = dyn_cast<SCEVConstant>(RHS))
     if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(LHS))
       if (AddRec->getLoop() == L) {
         // Form the constant range.
         ConstantRange CompRange =
             ConstantRange::makeExactICmpRegion(Pred, RHSC->getAPInt());

         const SCEV *Ret = AddRec->getNumIterationsInRange(CompRange, *this);
         if (!isa<SCEVCouldNotCompute>(Ret)) return Ret;
       }

+  // If this loop must exit based on this condition (or execute undefined
+  // behaviour), and we can prove the test sequence produced must repeat
+  // the same values on self-wrap of the IV, then we can infer that IV
+  // doesn't self wrap because if it did, we'd have an infinite (undefined)
+  // loop.
+  if (ControlsExit && isLoopInvariant(RHS, L) && loopHasNoAbnormalExits(L) &&
+      loopIsFiniteByAssumption(L)) {
+
+    // TODO: We can peel off any functions which are invertible in L. Loop
+    // invariant terms are effectively constants for our purposes here.
+    auto *InnerLHS = LHS;
+    if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS))
+      InnerLHS = ZExt->getOperand();
+    if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(InnerLHS)) {
+      auto *StrideC = dyn_cast<SCEVConstant>(AR->getStepRecurrence(*this));
+      if (!AR->hasNoSelfWrap() && AR->getLoop() == L && AR->isAffine() &&
+          StrideC && StrideC->getAPInt().isPowerOf2()) {
+        auto Flags = AR->getNoWrapFlags();
+        Flags = setFlags(Flags, SCEV::FlagNW);
+        SmallVector<const SCEV*> Operands{AR->operands()};
+        Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
+        setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
+      }
+    }
+  }

   switch (Pred) {
   case ICmpInst::ICMP_NE: { // while (X != Y)
     // Convert to: while (X-Y != 0)
     if (LHS->getType()->isPointerTy()) {
       LHS = getLosslessPtrToIntExpr(LHS);
       if (isa<SCEVCouldNotCompute>(LHS))
         return LHS;
     }
@@ ... @@ auto canAssumeNoSelfWrap = [&](const SCEVAddRecExpr *AR) {
     auto *StrideC = dyn_cast<SCEVConstant>(AR->getStepRecurrence(*this));
     if (!StrideC || !StrideC->getAPInt().isPowerOf2())
       return false;

     if (!ControlsExit || !loopHasNoAbnormalExits(L))
       return false;

     return loopIsFiniteByAssumption(L);
   };

   if (!IV) {
     if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS)) {
       const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(ZExt->getOperand());
       if (AR && AR->getLoop() == L && AR->isAffine()) {
-        auto Flags = AR->getNoWrapFlags();
-        if (!hasFlags(Flags, SCEV::FlagNW) && canAssumeNoSelfWrap(AR)) {
-          Flags = setFlags(Flags, SCEV::FlagNW);
-
-          SmallVector<const SCEV*> Operands{AR->operands()};
-          Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
-        }
-
         auto canProveNUW = [&]() {
           if (!isLoopInvariant(RHS, L))
             return false;

           if (!isKnownNonZero(AR->getStepRecurrence(*this)))
             // We need the sequence defined by AR to strictly increase in the
             // unsigned integer domain for the logic below to hold.
             return false;

           const unsigned InnerBitWidth = getTypeSizeInBits(AR->getType());
           const unsigned OuterBitWidth = getTypeSizeInBits(RHS->getType());
           // If RHS <=u Limit, then there must exist a value V in the sequence
           // defined by AR (e.g. {Start,+,Step}) such that V >u RHS, and
           // V <=u UINT_MAX. Thus, we must exit the loop before unsigned
           // overflow occurs. This limit also implies that a signed comparison
           // (in the wide bitwidth) is equivalent to an unsigned comparison as
           // the high bits on both sides must be zero.
           APInt StrideMax = getUnsignedRangeMax(AR->getStepRecurrence(*this));
           APInt Limit = APInt::getMaxValue(InnerBitWidth) - (StrideMax - 1);
           Limit = Limit.zext(OuterBitWidth);
           return getUnsignedRangeMax(applyLoopGuards(RHS, L)).ule(Limit);
         };
+        auto Flags = AR->getNoWrapFlags();
         if (!hasFlags(Flags, SCEV::FlagNUW) && canProveNUW())
           Flags = setFlags(Flags, SCEV::FlagNUW);

         setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
         if (AR->hasNoUnsignedWrap()) {
           // Emulate what getZeroExtendExpr would have done during construction
           // if we'd been able to infer the fact just above at that time.
           const SCEV *Step = AR->getStepRecurrence(*this);
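
The bound used by `canProveNUW` in the hunk above (`Limit = MaxValue - (StrideMax - 1)`) can be sanity-checked by brute force. The sketch below is an 8-bit analogue with hypothetical helper names, assuming a loop of the form `for (i = start; i <u rhs; i += stride)`. It is an illustration of the arithmetic, not a model of what SCEV does internally.

```cpp
#include <cassert>
#include <cstdint>

// 8-bit analogue of `Limit = MaxValue - (StrideMax - 1)`.
uint8_t nuwLimit8(uint8_t strideMax) {
  assert(strideMax != 0 && "the stride must be known non-zero");
  return static_cast<uint8_t>(255 - (strideMax - 1));
}

// Replays `for (i = start; i <u rhs; i += stride)` and reports whether the
// loop exits without any increment overflowing 8 bits.
bool exitsWithoutUnsignedWrap8(uint8_t start, uint8_t rhs, uint8_t stride) {
  assert(stride != 0 && "a zero stride never makes progress");
  unsigned i = start;  // wide copy so that overflow is observable
  while (i < rhs) {
    i += stride;
    if (i > 255)
      return false;  // the 8-bit IV would have wrapped here
  }
  return true;
}

// Exhaustively checks every start value and every rhs <= Limit for a stride.
bool boundHoldsForStride8(uint8_t stride) {
  const unsigned limit = nuwLimit8(stride);
  for (unsigned start = 0; start <= 255; ++start)
    for (unsigned rhs = 0; rhs <= limit; ++rhs)
      if (!exitsWithoutUnsignedWrap8(static_cast<uint8_t>(start),
                                     static_cast<uint8_t>(rhs), stride))
        return false;
  return true;
}
```

The bound is tight in this model: with stride 2 the limit is 254, and `exitsWithoutUnsignedWrap8(254, 255, 2)` overflows once `rhs` exceeds it.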

llvm/test/Analysis/ScalarEvolution/ne-overflow.ll

 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
 ; RUN: opt %s -passes='print<scalar-evolution>' -scalar-evolution-classify-expressions=0 -disable-output 2>&1 | FileCheck %s

 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"

 ; A collection of tests focused on exercising logic to prove no-unsigned wrap
 ; from mustprogress semantics of loops.

 define void @test(i32 %N) mustprogress {
 ; CHECK-LABEL: 'test'
 ; CHECK-NEXT: Determining loop execution counts for: @test
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is ((-2 + %N) /u 2)
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 2147483647
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((-2 + %N) /u 2)
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
 ;
 entry:
   br label %for.body

 for.body:
   %iv = phi i32 [ %iv.next, %for.body ], [ 0, %entry ]
   %iv.next = add i32 %iv, 2
   %cmp = icmp ne i32 %iv.next, %N
   br i1 %cmp, label %for.body, label %for.cond.cleanup

 for.cond.cleanup:
   ret void
 }

 define void @test_preinc(i32 %N) mustprogress {
 ; CHECK-LABEL: 'test_preinc'
 ; CHECK-NEXT: Determining loop execution counts for: @test_preinc
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (%N /u 2)
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 2147483647
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (%N /u 2)
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
 ;
 entry:
   br label %for.body

 for.body:
   %iv = phi i32 [ %iv.next, %for.body ], [ 0, %entry ]
   %iv.next = add i32 %iv, 2
   %cmp = icmp ne i32 %iv, %N
@@ ... @@ for.cond.cleanup:
   ret void

 }


 define void @test_1024(i32 %N) mustprogress {
 ; CHECK-LABEL: 'test_1024'
 ; CHECK-NEXT: Determining loop execution counts for: @test_1024
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is ((-1024 + %N) /u 1024)
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 4194303
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((-1024 + %N) /u 1024)
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
 ;
 entry:
   br label %for.body

 for.body:
   %iv = phi i32 [ %iv.next, %for.body ], [ 0, %entry ]
   %iv.next = add i32 %iv, 1024
   %cmp = icmp ne i32 %iv.next, %N
@@ ... @@ for.cond.cleanup:
   ret void
 }

 define void @test_zext(i64 %N) mustprogress {
 ; CHECK-LABEL: 'test_zext'
 ; CHECK-NEXT: Determining loop execution counts for: @test_zext
 ; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
 ; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (%N /u 2)
+; CHECK-NEXT: Predicates:
+; CHECK-NEXT: {0,+,2}<nuw><%for.body> Added Flags: <nusw>
 ;
 entry:
   br label %for.body

 for.body:
   %iv = phi i32 [ %iv.next, %for.body ], [ 0, %entry ]
   %iv.next = add i32 %iv, 2
   %zext = zext i32 %iv to i64
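
The count `((-2 + %N) /u 2)` reported for `@test` can look surprising next to the `(%N /u 2)` of `@test_preinc`. The sketch below, an 8-bit analogue with hypothetical helper names, replays the post-increment loop shape and compares the observed backedge count with the closed form computed in wrapping arithmetic. It assumes the IV starts at 0, as in the tests.

```cpp
#include <cassert>
#include <cstdint>

// Counts backedges of the post-increment loop shape used by @test:
//   do { iv.next = iv + step; } while (iv.next != n);
// Returns -1 if the exit value is never reached.
int backedgesPostInc8(uint8_t n, uint8_t step) {
  assert(step != 0 && "a zero step never makes progress");
  uint8_t iv = 0;
  // 256 iterations cover a full period of any 8-bit IV.
  for (int iter = 0; iter < 256; ++iter) {
    const uint8_t next = static_cast<uint8_t>(iv + step);
    if (next == n)
      return iter;  // the backedge was taken once per earlier iteration
    iv = next;
  }
  return -1;
}

// 8-bit analogue of the reported count ((-step + N) /u step): the
// subtraction wraps modulo 256 before the unsigned division.
unsigned closedFormPostInc8(uint8_t n, uint8_t step) {
  assert(step != 0 && "division by a zero step");
  return static_cast<uint8_t>(n - step) / step;
}
```

For `n = 0` and step 2, the IV values 2, 4, ..., 254 each take the backedge, and the next value wraps to 0 and exits, so both functions give 127. That final increment is an unsigned wrap but not a self-wrap, since the IV never passes its own start value of 2.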

llvm/test/Analysis/ScalarEvolution/trip-count-negative-stride.ll

@@ ... @@ for.end: ; preds = %for.body, %entry
   ret void
 }

 ; Same as ult_infinite, except that the loop is ill defined due to the
 ; must progress attribute
 define void @ult_infinite_ub() mustprogress {
 ; CHECK-LABEL: 'ult_infinite_ub'
 ; CHECK-NEXT: Determining loop execution counts for: @ult_infinite_ub
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is 1
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 1
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is 1
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 2
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
   %i.05 = phi i8 [ %add, %for.body ], [ 0, %entry ]
   %add = add i8 %i.05, 128
   %cmp = icmp ult i8 %add, 255
@@ ... @@ for.end: ; preds = %for.body, %entry
   ret void
 }

 ; Same as slt_infinite, except that the loop is ill defined due to the
 ; must progress attribute
 define void @slt_infinite_ub() mustprogress {
 ; CHECK-LABEL: 'slt_infinite_ub'
 ; CHECK-NEXT: Determining loop execution counts for: @slt_infinite_ub
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is 0
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 0
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is 0
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
   %i.05 = phi i8 [ %add, %for.body ], [ 0, %entry ]
   %add = add i8 %i.05, 128
   %cmp = icmp slt i8 %add, 127