Details

Reviewers

efriedma
nikic
fhahn

Commits

rGd24a0e88576d: [SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic

Summary

The basic idea here is that, given a zero-extended narrow IV, we can prove the inner IV to be NUW if we can prove that there is a value the inner IV must take before overflowing, and that this value must exit the loop.

This is a follow-up to D108651.
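
As a rough illustration of the loop shape this summary has in mind, the following C++ sketch models an 8-bit IV that is zero-extended before an unsigned compare against a 16-bit bound. The helper name `countBackedges` and the iteration cap used to detect a loop that never exits are assumptions made for this sketch, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models the IR pattern:
//   %iv.next = add i8 %iv, 1
//   %zext    = zext i8 %iv.next to i16
//   %cmp     = icmp ult i16 %zext, %n
// Returns the number of backedges taken, or -1 if the loop is still running
// after `cap` iterations (a stand-in for "never exits"; the cap is an
// assumption of this sketch).
long countBackedges(uint16_t n, long cap = 100000) {
  uint8_t iv = 0;
  for (long taken = 0; taken < cap; ++taken) {
    uint8_t next = static_cast<uint8_t>(iv + 1); // wraps from 255 to 0
    if (!(static_cast<uint16_t>(next) < n))
      return taken; // exit test failed: leave the loop
    iv = next;
  }
  return -1;
}
```

Under this model, any bound of at most 255 exits before the narrow IV wraps, while a bound of 256 or more never exits because the zero-extended IV can never reach it.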

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Sep 8 2021, 11:10 AM

Herald added subscribers: bollu, hiraditya, mcrosier. Sep 8 2021, 11:10 AM

reames requested review of this revision. Sep 8 2021, 11:10 AM

Herald added a project: Restricted Project. Sep 8 2021, 11:10 AM

reames mentioned this in D108651: [SCEV] Use no-self-wrap flags infered from exit structure to compute trip count. Sep 8 2021, 11:17 AM

Harbormaster completed remote builds in B123090: Diff 371397. Sep 8 2021, 12:28 PM

ping

efriedma added inline comments. Sep 13 2021, 3:45 PM

llvm/lib/Analysis/ScalarEvolution.cpp
11803	Why is the `StrideMax == 0` special case necessary?
11807	What is `APInt::getMaxValue(InnerBitWidth) - StrideMax` the limit of? I guess you're looking for the smallest possible value of `AR` at the step before it overflows. If that value forces the loop to exit, then the loop must exit before the overflow. A conservative way to figure that is based on the stride itself: just use the smallest value X such that X+Stride might overflow. A comment would be helpful. Also, I think you might be off by one here? `Limit` is one less than the value X I described. But maybe that cancels out somehow...
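
To make the off-by-one question concrete, here is a small worked sketch for an 8-bit IV. The three helper names are hypothetical and exist only for this note: `limitAsFirstPosted` follows the expression quoted in the comment above, and `limitInFinalDiff` mirrors the `Limit` expression that appears in the final diff.

```cpp
#include <cassert>
#include <cstdint>

// All helpers assume an 8-bit inner IV and 1 <= strideMax <= 255.

// Smallest X such that X + strideMax no longer fits in 8 bits.
unsigned smallestValueThatMayOverflow(unsigned strideMax) {
  return 256u - strideMax;
}

// The expression questioned in the review: max - StrideMax.
unsigned limitAsFirstPosted(unsigned strideMax) {
  return 255u - strideMax;
}

// The expression in the final diff: max - (StrideMax - 1).
unsigned limitInFinalDiff(unsigned strideMax) {
  return 255u - (strideMax - 1u);
}
```

For a stride of 2 this gives X = 254, which lines up with the `ult_constant_rhs_stride2` test (bound 254, backedge-taken count computed) and the `ult_constant_rhs_stride2_neg` test (bound 255, only a predicated count).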

Addressed the review comments with better code comments, and fixed a bug noticed in the process: I used max in two places, whereas one of them needed to be a min for the required purpose.

Planning to land the test, then rebase this.

reames mentioned this in rGdf7c2bcf4e45: precommit tests for D109457. Sep 16 2021, 12:43 PM

Harbormaster completed remote builds in B124252: Diff 373028. Sep 16 2021, 12:46 PM

Rebase over landed tests.

One question for reviewers: we don't currently get any value from handling non-constant strides because, due to limitations in flag inference, we basically never conclude that the step is non-zero from data flow. (We might from control flow, but that doesn't influence constant ranges.) Should I keep the complexity, or just restrict this to a constant non-zero step and be done?

Harbormaster completed remote builds in B124259: Diff 373037. Sep 16 2021, 1:54 PM

ping

p.s. In offline discussion, @nlewycky suggested a nice generalization of this idea in terms of invertible operations once we've proven the value being compared against is in the co-domain of the LHS. I want to land this as is, but I'm hoping to leverage that idea into a generalization both here, and in IndVarSimplify.

ping - this has been outstanding for a while and is blocking progress for me, any chance I can get someone to review?

JFYI, all of the other approaches I've mentioned in the review thread so far have not panned out. I can cover some cases with each, but not the motivating example. This patch's use of mustprogress to disprove the infinite loop case is critical. Every time I tried implementing this differently, I ended up just re-implementing this same idea without the infrastructure that SCEV already provides for this. I think this really does have to be in SCEV's trip count logic.

reames mentioned this in rG8b31f07cdf13: [tests] Add indvars tests showing missing transforms with small IVs. Oct 14 2021, 1:32 PM

In D109457#3064633, @reames wrote:

JFYI, all of the other approaches I've mentioned in the review thread so far have not panned out.

As often happens, taking a fresh look at something reveals another approach. I've posted https://reviews.llvm.org/D111836 which isn't really a replacement for this patch, but tackles problems in the same area, and is probably easier to review/justify.

reames mentioned this in rGc0d9bf2f6afd: [indvars] Allow rotation (narrowing) of exit test when discovering trip count. Nov 4 2021, 2:49 PM

Drop the must-exit logic entirely. As recently demonstrated in the indvars approach, none of the original motivating examples actually require it after we incorporate more precise range reasoning.

Add applyLoopGuards to constrain RHS, and fix an off-by-one bug which resulted in imprecise results at the edge. Additionally, include tests specifically focused on the edge case to demonstrate the correctness of said change.
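
One way to sanity-check the edge behavior is a brute-force model over every 8-bit start value. The sketch below is not how SCEV reasons (it works on ranges, not enumeration); the function names are made up for illustration, and a constant non-zero stride is assumed.

```cpp
#include <cassert>
#include <cstdint>

// True if the 8-bit sequence {first,+,stride} wraps while the loop
// condition `zext(value) <u rhs` still holds. Assumes stride >= 1.
bool wrapsBeforeExit(unsigned first, unsigned stride, unsigned rhs) {
  unsigned value = first; // kept in a wide type so overflow is observable
  while (value < rhs) {   // the loop keeps running
    if (value + stride > 255u)
      return true;        // the next 8-bit add would wrap
    value += stride;
  }
  return false;
}

// True if some 8-bit start value makes the IV wrap for this stride and bound.
bool anyStartWraps(unsigned stride, unsigned rhs) {
  for (unsigned first = 0; first <= 255u; ++first)
    if (wrapsBeforeExit(first, stride, rhs))
      return true;
  return false;
}
```

In this model the largest safe bound is 255 for stride 1 and 254 for stride 2, and one more than that already admits a wrapping start value, which is the edge the new tests target. A guard such as `icmp ult i16 %n, 256` (as in `ult_guarded_rhs`) restricts the bound to the safe range for stride 1.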

As can now be seen from the test changes in finite-exit-comparisons.ll (the test file exercising the new indvars logic), this sometimes lets us LFTR an exit test instead of rotating/narrowing. I've manually confirmed that we generate exit tests for all of the examples; why some still hit the rotate path will be investigated separately. (Edit: See https://bugs.llvm.org/show_bug.cgi?id=52423 for the result of that investigation. It's an LFTR limitation we should fix.) I'm hoping to be able to delete that logic again entirely, but well, we'll see.
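
The rewritten exit test in `slt_restricted_rhs` can be modeled roughly as follows. This is a hedged sketch: both functions are illustrative stand-ins written for this note, the bound is assumed to be pre-masked to [0, 255] as the test does with `and i16 %n.raw, 255`, and the comparison is modeled as unsigned, which agrees with a signed one on that range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Narrow loop shape: 8-bit IV, zero-extended, compared against the bound.
// Assumes n <= 255 so the loop is guaranteed to exit.
unsigned narrowBackedges(uint16_t n) {
  uint8_t iv = 0;
  unsigned taken = 0;
  for (;;) {
    uint8_t next = static_cast<uint8_t>(iv + 1);
    if (!(static_cast<uint16_t>(next) < n))
      return taken;
    iv = next;
    ++taken;
  }
}

// Shape after the transform: 16-bit IV, exit when iv.next == max(n, 1).
unsigned widenedBackedges(uint16_t n) {
  const uint16_t bound = std::max<uint16_t>(n, 1);
  uint16_t iv = 0;
  unsigned taken = 0;
  for (;;) {
    uint16_t next = static_cast<uint16_t>(iv + 1);
    if (next == bound)
      return taken;
    iv = next;
    ++taken;
  }
}
```

Both shapes take the backedge `max(n, 1) - 1` times for every bound in [0, 255], which is why an equality test on a widened IV is a valid replacement once the trip count is known.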

@nikic, @mkazantsev, @fhahn - Given that the must-exit logic has been removed and this is now pretty much just basic constant range reasoning, I'd greatly appreciate a review so we can get this in. After implementing the indvars approach, I'm more convinced than ever that SCEV really should just be able to compute a trip count for these loops. Everything else seems like a massive hack.

Harbormaster completed remote builds in B132731: Diff 385125. Nov 5 2021, 11:53 AM

LGTM

llvm/lib/Analysis/ScalarEvolution.cpp
11801

This revision is now accepted and ready to land. Nov 5 2021, 2:23 PM

Closed by commit rGd24a0e88576d: [SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic (authored by reames). Nov 5 2021, 3:37 PM

This revision was automatically updated to reflect the committed changes.

reames added a commit: rGd24a0e88576d: [SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic.

Diff 385208

llvm/lib/Analysis/ScalarEvolution.cpp

[... 11,786 lines not shown ...]
   if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS)) {
     const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(ZExt->getOperand());
     if (AR && AR->getLoop() == L && AR->isAffine()) {
       auto Flags = AR->getNoWrapFlags();
       if (!hasFlags(Flags, SCEV::FlagNW) && canAssumeNoSelfWrap(AR)) {
         Flags = setFlags(Flags, SCEV::FlagNW);
         SmallVector<const SCEV*> Operands{AR->operands()};
         Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
+      }
+      auto canProveNUW = [&]() {
+        if (!isLoopInvariant(RHS, L))
+          return false;
+        if (!isKnownNonZero(AR->getStepRecurrence(*this)))

    Inline comment by nikic (Unsubmitted, Not Done):
          return false;
        - if (getUnsignedRangeMin(AR->getStepRecurrence(*this)).isZero())
        + if (!isKnownNonZero(AR->getStepRecurrence(*this)))
          // We need the sequence defined by AR to strictly increase in the

+          // We need the sequence defined by AR to strictly increase in the
+          // unsigned integer domain for the logic below to hold.

    Inline comment by efriedma (Unsubmitted, Not Done):
        Why is the `StrideMax == 0` special case necessary?

+          return false;
+        const unsigned InnerBitWidth = getTypeSizeInBits(AR->getType());
+        const unsigned OuterBitWidth = getTypeSizeInBits(RHS->getType());

    Inline comment by efriedma (Unsubmitted, Not Done):
        What is `APInt::getMaxValue(InnerBitWidth) - StrideMax` the limit of?
        I guess you're looking for the smallest possible value of `AR` at the step before it overflows. If that value forces the loop to exit, then the loop must exit before the overflow. A conservative way to figure that is based on the stride itself: just use the smallest value X such that X+Stride might overflow.
        A comment would be helpful.
        Also, I think you might be off by one here? `Limit` is one less than the value X I described. But maybe that cancels out somehow...

+        // If RHS <=u Limit, then there must exist a value V in the sequence
+        // defined by AR (e.g. {Start,+,Step}) such that V >u RHS, and
+        // V <=u UINT_MAX. Thus, we must exit the loop before unsigned
+        // overflow occurs. This limit also implies that a signed comparison
+        // (in the wide bitwidth) is equivalent to an unsigned comparison as
+        // the high bits on both sides must be zero.
+        APInt StrideMax = getUnsignedRangeMax(AR->getStepRecurrence(*this));
+        APInt Limit = APInt::getMaxValue(InnerBitWidth) - (StrideMax - 1);
+        Limit = Limit.zext(OuterBitWidth);
+        return getUnsignedRangeMax(applyLoopGuards(RHS, L)).ule(Limit);
+      };
+      if (!hasFlags(Flags, SCEV::FlagNUW) && canProveNUW())
+        Flags = setFlags(Flags, SCEV::FlagNUW);
       setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
-      }
       if (AR->hasNoUnsignedWrap()) {
         // Emulate what getZeroExtendExpr would have done during construction
         // if we'd been able to infer the fact just above at that time.
         const SCEV *Step = AR->getStepRecurrence(*this);
         Type *Ty = ZExt->getType();
         auto *S = getAddRecExpr(
           getExtendAddRecStart<SCEVZeroExtendExpr>(AR, Ty, this, 0),
           getZeroExtendExpr(Step, Ty, 0), L, AR->getNoWrapFlags());
[... 2,064 lines not shown ...]

llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll

[... 273 lines not shown ...]
 for.end: ; preds = %for.body, %entry
   ret void
 }

 ; Because of the range on RHS including only values within i8, we don't need
 ; the must exit property
 define void @rhs_narrow_range(i16 %n.raw) {
 ; CHECK-LABEL: 'rhs_narrow_range'
 ; CHECK-NEXT: Determining loop execution counts for: @rhs_narrow_range
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (-1 + (1 umax (2 * (zext i7 (trunc i16 (%n.raw /u 2) to i7) to i16))<nuw><nsw>))<nsw>
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 253
 ; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (-1 + (1 umax (2 * (zext i7 (trunc i16 (%n.raw /u 2) to i7) to i16))<nuw><nsw>))<nsw>
 ; CHECK-NEXT: Predicates:
-; CHECK-NEXT: {1,+,1}<%for.body> Added Flags: <nusw>
+; CHECK: Loop %for.body: Trip multiple is 1
 ;
 entry:
   %n = and i16 %n.raw, 254
   br label %for.body

 for.body: ; preds = %entry, %for.body
   %iv = phi i8 [ %iv.next, %for.body ], [ 0, %entry ]
   %iv.next = add i8 %iv, 1
   store i8 %iv, i8* @G
   %zext = zext i8 %iv.next to i16
   %cmp = icmp ult i16 %zext, %n
   br i1 %cmp, label %for.body, label %for.end

 for.end: ; preds = %for.body, %entry
   ret void
 }

+define void @ugt_constant_rhs(i16 %n.raw, i8 %start) mustprogress {
+;
+; CHECK-LABEL: 'ugt_constant_rhs'
+; CHECK-NEXT: Determining loop execution counts for: @ugt_constant_rhs
+; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
+; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: Unpredictable predicated backedge-taken count.
+;
+entry:
+  br label %for.body
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ %start, %entry ]
+  %iv.next = add i8 %iv, 1
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ugt i16 %zext, 254
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+define void @ult_constant_rhs(i16 %n.raw, i8 %start) {
+;
+; CHECK-LABEL: 'ult_constant_rhs'
+; CHECK-NEXT: Determining loop execution counts for: @ult_constant_rhs
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (255 + (-1 * (zext i8 (1 + %start) to i16))<nsw>)<nsw>
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 255
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (255 + (-1 * (zext i8 (1 + %start) to i16))<nsw>)<nsw>
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
+;
+entry:
+  br label %for.body
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ %start, %entry ]
+  %iv.next = add i8 %iv, 1
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ult i16 %zext, 255
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+define void @ult_constant_rhs_stride2(i16 %n.raw, i8 %start) {
+;
+; CHECK-LABEL: 'ult_constant_rhs_stride2'
+; CHECK-NEXT: Determining loop execution counts for: @ult_constant_rhs_stride2
+; CHECK-NEXT: Loop %for.body: backedge-taken count is ((1 + (-1 * (zext i8 (2 + %start) to i16))<nsw> + (254 umax (zext i8 (2 + %start) to i16))) /u 2)
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 127
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((1 + (-1 * (zext i8 (2 + %start) to i16))<nsw> + (254 umax (zext i8 (2 + %start) to i16))) /u 2)
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
+;
+entry:
+  br label %for.body
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ %start, %entry ]
+  %iv.next = add i8 %iv, 2
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ult i16 %zext, 254
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+define void @ult_constant_rhs_stride2_neg(i16 %n.raw, i8 %start) {
+;
+; CHECK-LABEL: 'ult_constant_rhs_stride2_neg'
+; CHECK-NEXT: Determining loop execution counts for: @ult_constant_rhs_stride2_neg
+; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
+; CHECK-NEXT: Loop %for.body: Unpredictable max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((256 + (-1 * (zext i8 (2 + %start) to i16))<nsw>)<nsw> /u 2)
+; CHECK-NEXT: Predicates:
+; CHECK-NEXT: {(2 + %start),+,2}<%for.body> Added Flags: <nusw>
+;
+entry:
+  br label %for.body
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ %start, %entry ]
+  %iv.next = add i8 %iv, 2
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ult i16 %zext, 255
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+
+define void @ult_restricted_rhs(i16 %n.raw) {
+; CHECK-LABEL: 'ult_restricted_rhs'
+; CHECK-NEXT: Determining loop execution counts for: @ult_restricted_rhs
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (-1 + (1 umax (zext i8 (trunc i16 %n.raw to i8) to i16)))<nsw>
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is 254
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (-1 + (1 umax (zext i8 (trunc i16 %n.raw to i8) to i16)))<nsw>
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
+;
+entry:
+  %n = and i16 %n.raw, 255
+  br label %for.body
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ 0, %entry ]
+  %iv.next = add i8 %iv, 1
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ult i16 %zext, %n
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+define void @ult_guarded_rhs(i16 %n) {;
+; CHECK-LABEL: 'ult_guarded_rhs'
+; CHECK-NEXT: Determining loop execution counts for: @ult_guarded_rhs
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (-1 + (1 umax %n))
+; CHECK-NEXT: Loop %for.body: max backedge-taken count is -2
+; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is (-1 + (1 umax %n))
+; CHECK-NEXT: Predicates:
+; CHECK: Loop %for.body: Trip multiple is 1
+;
+entry:
+  %in_range = icmp ult i16 %n, 256
+  br i1 %in_range, label %for.body, label %for.end
+
+for.body: ; preds = %entry, %for.body
+  %iv = phi i8 [ %iv.next, %for.body ], [ 0, %entry ]
+  %iv.next = add i8 %iv, 1
+  %zext = zext i8 %iv.next to i16
+  %cmp = icmp ult i16 %zext, %n
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end: ; preds = %for.body, %entry
+  ret void
+}
+
+
+
 declare void @llvm.assume(i1)

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll

[... 123 lines not shown ...]
 for.end: ; preds = %for.body, %entry
   ret void
 }

 ; Case where we could prove this using range facts, but not must exit reasoning
 define void @slt_non_constant_rhs_no_mustprogress(i16 %n.raw) {
 ; CHECK-LABEL: @slt_non_constant_rhs_no_mustprogress(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[N:%.*]] = and i16 [[N_RAW:%.*]], 255
-; CHECK-NEXT: [[TMP0:%.*]] = trunc i16 [[N]] to i8
+; CHECK-NEXT: [[SMAX:%.*]] = call i16 @llvm.smax.i16(i16 [[N]], i16 1)
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
-; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
-; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[IV_NEXT]], [[TMP0]]
-; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i16 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i16 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i16 [[INDVARS_IV_NEXT]], [[SMAX]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   %n = and i16 %n.raw, 255
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 780 lines not shown ...]

 for.end: ; preds = %for.body, %entry
   ret void
 }

 define i16 @ult_multiuse_profit(i16 %n.raw, i8 %start) mustprogress {
 ; CHECK-LABEL: @ult_multiuse_profit(
 ; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = trunc i16 254 to i8
+; CHECK-NEXT: [[TMP0:%.*]] = add i8 [[START:%.*]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[TMP0]] to i16
+; CHECK-NEXT: [[TMP2:%.*]] = trunc i16 254 to i8
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
-; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[START:%.*]], [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[START]], [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
-; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
-; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[IV_NEXT]], [[TMP0]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[IV_NEXT]], [[TMP2]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
-; CHECK-NEXT: [[ZEXT_LCSSA:%.*]] = phi i16 [ [[ZEXT]], [[FOR_BODY]] ]
-; CHECK-NEXT: ret i16 [[ZEXT_LCSSA]]
+; CHECK-NEXT: [[UMAX:%.*]] = call i16 @llvm.umax.i16(i16 [[TMP1]], i16 254)
+; CHECK-NEXT: ret i16 [[UMAX]]
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
   %iv = phi i8 [ %iv.next, %for.body ], [ %start, %entry ]
   %iv.next = add i8 %iv, 1
   %zext = zext i8 %iv.next to i16
[... 34 lines not shown ...]
 for.end: ; preds = %for.body, %entry
   ret i16 %iv2
 }

 define void @slt_restricted_rhs(i16 %n.raw) mustprogress {
 ; CHECK-LABEL: @slt_restricted_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[N:%.*]] = and i16 [[N_RAW:%.*]], 255
-; CHECK-NEXT: [[TMP0:%.*]] = trunc i16 [[N]] to i8
+; CHECK-NEXT: [[SMAX:%.*]] = call i16 @llvm.smax.i16(i16 [[N]], i16 1)
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
-; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
-; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[IV_NEXT]], [[TMP0]]
-; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i16 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i16 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i16 [[INDVARS_IV_NEXT]], [[SMAX]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   %n = and i16 %n.raw, 255
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 43 lines not shown ...]

This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic
ClosedPublic
