This is an archive of the discontinued LLVM Phabricator instance.

With the help of the nsw flag in keep-nsw-nuw-flag.ll, the loop, which previously could not be vectorized because its BackedgeTakenCount was CouldNotCompute, can now be vectorized.
We can use opt -indvars keep-nsw-nuw-flag.ll -S | opt -loop-vectorize -S to verify this.
Note that if we just run -indvars -loop-vectorize back to back, the loop is vectorized even without this patch. The reason is that the nsw flag is still available when we try to vectorize the loop: a cached value is reused during the scalar-evolution analysis for the loop-vectorize pass.
In the full pipeline, forgetLoop may be called in many places, so the cached value may be cleared. So I think it is better to keep the nsw or nuw flags explicitly on the increment after widening the IV.
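
The caching effect described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: ToyInc, ToyExpr and ToyAnalysis are hypothetical names, not LLVM classes, and the model assumes that a recomputation sees nothing but the flags written on the instruction, which is a simplification of what ScalarEvolution does.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an IR increment instruction (not llvm::Instruction).
struct ToyInc {
  std::string name;
  bool hasNSW; // the flag as written in the IR
};

// Hypothetical stand-in for a cached analysis result (not llvm::SCEV).
struct ToyExpr {
  bool nsw;
};

class ToyAnalysis {
  std::map<std::string, ToyExpr> cache;

public:
  // Record a fact that was proven from another source, e.g. the narrow IV.
  void remember(const ToyInc &inc, ToyExpr e) { cache[inc.name] = e; }

  // Return the cached result if present; otherwise derive it from the
  // instruction flags alone (assumption of this toy model) and cache it.
  ToyExpr get(const ToyInc &inc) {
    auto it = cache.find(inc.name);
    if (it != cache.end())
      return it->second;
    ToyExpr fresh{inc.hasNSW};
    cache.emplace(inc.name, fresh);
    return fresh;
  }

  // Analogue of dropping cached values, as forgetLoop does for a loop.
  void forget(const ToyInc &inc) { cache.erase(inc.name); }
};
```

In this model, a widened increment without the flag still reports nsw while the cached entry survives, and loses it after forget(). An increment that carries the flag in the IR reports nsw either way, which is the situation the patch aims for.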

Harbormaster completed remote builds in B140625: Diff 396186. Dec 24 2021, 6:00 PM

Can you explain why SCEVExpander does not preserve the nowrap flag in this case? Assuming it is present on the wide IV, I would have expected it to also get expanded as such.

Also, it looks like you did not regenerate all affected tests.

nikic added a reviewer: reames. Dec 25 2021, 8:29 AM

fix the affected tests

Harbormaster completed remote builds in B140873: Diff 396501. Dec 29 2021, 1:48 AM

In D116276#3209662, @nikic wrote:

Can you explain why SCEVExpander does not preserve the nowrap flag in this case? Assuming it is present on the wide IV, I would have expected it to also get expanded as such.

Sorry for the late reply. The SCEVExpander (Rewriter) is used to create the widened IV and in widenIVUse(). The IV increment, however, is simply taken from the wide phi with the following code:

WideInc =
      cast<Instruction>(WidePhi->getIncomingValueForBlock(LatchBlock));

Of course, this way we cannot preserve the nowrap flag, because neither the WidePhi nor its incoming value for the latch block carries the flag.
I guess the reason we did not need to preserve this flag previously is that the AddRec is computed from the OrigPhi and looks like the following:

{((sext i32 %arg1 to i64) + (sext i32 %arg2 to i64)),+,(sext i32 %arg2 to i64)}<nsw><%body>

So, at this point, it does not matter whether the increment instruction carries the flag, because the SCEV is correct.
However, during the optimization pipeline, SE->forgetLoop() may be called to drop the cached value, and the SCEV is then recomputed from scratch. At that point, since the NSW flag has been lost, the BackedgeTakenCount becomes CouldNotCompute, which prevents vectorization.
With this patch, functions s122 and s172 in TSVC can now be vectorized. The following is the source code of s122:

real_t s122(struct args_t * func_args)
{
//    induction variable recognition
//    variable lower and upper bound, and stride
//    reverse data access and jump in data access
    struct{int a;int b;} * x = func_args->arg_info;
    int n1 = x->a;
    int n3 = x->b;
    initialise_arrays(__func__);
    int j, k;
#pragma clang loop vectorize(assume_safety)
    for (int nl = 0; nl < iterations; nl++) {
        j = 1;
        k = 0;
        for (int i = n1-1; i < LEN_1D; i += n3) {
            k += j;
            a[i] += b[LEN_1D - k];
        }
    }
}

Also, I guess it is OK for the widened IV to carry the same nowrap flag as its corresponding i32 IV.

@guopeilin This doesn't really answer my question. You say that the SCEV for the wide IV at this point is {((sext i32 %arg1 to i64) + (sext i32 %arg2 to i64)),+,(sext i32 %arg2 to i64)}<nsw><%body>, which has the <nsw> flag. My baseline expectation would be that SCEVExpander will also add the nsw flag to the expanded IR (see the code around https://github.com/llvm/llvm-project/blob/015ff729cb90317e4e75cf48b1e5dd7850f0cbd0/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L1300-L1304). My question is why that doesn't happen, because making sure SCEVExpander propagates the flag from SCEV to IR seems like the more principled way to address this problem.

In D116276#3212829, @nikic wrote:

@guopeilin This doesn't really answer my question. You say that the SCEV for the wide IV at this point is {((sext i32 %arg1 to i64) + (sext i32 %arg2 to i64)),+,(sext i32 %arg2 to i64)}<nsw><%body>, which has the <nsw> flag. My baseline expectation would be that SCEVExpander will also add the nsw flag to the expanded IR (see the code around https://github.com/llvm/llvm-project/blob/015ff729cb90317e4e75cf48b1e5dd7850f0cbd0/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L1300-L1304). My question is why that doesn't happen because making sure SCEVExpander propagates the flag from SCEV to IR seems like the more principled way to address this problem.

Sorry, I misunderstood your question before. I think the code you pointed to may be the root cause of the missing nowrap flags. I dumped the SCEV values of OpAfterExtend and ExtendAfterOp within the function IsIncrementNSW (https://github.com/llvm/llvm-project/blob/015ff729cb90317e4e75cf48b1e5dd7850f0cbd0/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp#L1142-L1144), which look like the following:

OpAfterExtend = {((sext i32 %arg1 to i128) + (sext i32 %arg2 to i128)),+,(sext i32 %arg2 to i128)}<nw><%body>
ExtendAfterOp = {(sext i64 ((sext i32 %arg1 to i64) + (sext i32 %arg2 to i64)) to i128),+,(sext i32 %arg2 to i128)}<nsw><%body>

They differ from each other, which makes IsIncrementNSW return false, so SCEVExpander does not propagate the flag.
I will keep debugging to find the reason for the difference and try to upstream a new patch. Please wait.
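
The idea behind comparing OpAfterExtend with ExtendAfterOp can be illustrated on concrete integers. The following is only a sketch: the real check compares symbolic SCEV expressions, whereas this analogue uses fixed widths of 32 and 64 bits, and the function names extendAfterOp, opAfterExtend and incrementIsNSW are chosen here for illustration.

```cpp
#include <cstdint>

// Add at the narrow width with wraparound, then sign-extend to 64 bits.
inline int64_t extendAfterOp(int32_t iv, int32_t step) {
  // Unsigned arithmetic gives well-defined wraparound on overflow.
  uint32_t wrapped = static_cast<uint32_t>(iv) + static_cast<uint32_t>(step);
  // Converting back to int32_t is modular since C++20 (and on mainstream
  // compilers before that).
  return static_cast<int64_t>(static_cast<int32_t>(wrapped));
}

// Sign-extend both operands first, then add at the wide width.
inline int64_t opAfterExtend(int32_t iv, int32_t step) {
  return static_cast<int64_t>(iv) + static_cast<int64_t>(step);
}

// The narrow add has no signed wrap exactly when both orders agree.
inline bool incrementIsNSW(int32_t iv, int32_t step) {
  return extendAfterOp(iv, step) == opAfterExtend(iv, step);
}
```

For example, incrementIsNSW(1, 2) holds, while incrementIsNSW(INT32_MAX, 1) does not, because the narrow add wraps to INT32_MIN before the extension. In the symbolic setting, the two expressions must additionally simplify to the same SCEV for the comparison to succeed, which is what fails in the dump above.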

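Revision Contents

Path                                                              Size
llvm/lib/Transforms/Utils/SimplifyIndVar.cpp                      4 lines
llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll    4 lines
llvm/test/Transforms/IndVarSimplify/X86/pr27133.ll                2 lines
llvm/test/Transforms/IndVarSimplify/keep-nsw-nuw-flag.ll          59 lines
llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll            2 lines
llvm/test/Transforms/LoopFlatten/widen-iv.ll                      18 lines
llvm/test/Transforms/LoopFlatten/widen-iv2.ll                     4 lines
llvm/test/Transforms/LoopFlatten/widen-iv3.ll                     10 lines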

Diff 396501

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

 PHINode *WidenIV::createWideIV(SCEVExpander &Rewriter) {
 [...]
   if (BasicBlock *LatchBlock = L->getLoopLatch()) {
     WideInc =
         cast<Instruction>(WidePhi->getIncomingValueForBlock(LatchBlock));
     WideIncExpr = SE->getSCEV(WideInc);
     // Propagate the debug location associated with the original loop increment
     // to the new (widened) increment.
     auto *OrigInc =
         cast<Instruction>(OrigPhi->getIncomingValueForBlock(LatchBlock));
+    if (isa<OverflowingBinaryOperator>(OrigInc) && OrigInc->hasNoSignedWrap())
+      WideInc->setHasNoSignedWrap(true);
+    if (isa<OverflowingBinaryOperator>(OrigInc) && OrigInc->hasNoUnsignedWrap())
+      WideInc->setHasNoUnsignedWrap(true);
     WideInc->setDebugLoc(OrigInc->getDebugLoc());
   }

   LLVM_DEBUG(dbgs() << "Wide IV: " << *WidePhi << "\n");
   ++NumWidened;

   // Traverse the def-use chain using a worklist starting at the original IV.
   assert(Widened.empty() && NarrowIVUsers.empty() && "expect initial state" );
 [...]

llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll

 [...]
 ; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY:%.*]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[SUM_0:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[FOR_BODY]] ]
 ; CHECK-NEXT: [[CMP:%.*]] = icmp ule i64 [[INDVARS_IV]], [[TMP0]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
 ; CHECK-NEXT: [[ADD]] = add nsw i32 [[SUM_0]], [[TMP1]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: br label [[FOR_COND]]
 ; CHECK: for.end:
 ; CHECK-NEXT: [[SUM_0_LCSSA:%.*]] = phi i32 [ [[SUM_0]], [[FOR_COND]] ]
 ; CHECK-NEXT: ret i32 [[SUM_0_LCSSA]]
 ;
 entry:
   br label %for.cond

 [...]
 ; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[TMP0]], [[FOR_COND_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY:%.*]] ]
 ; CHECK-NEXT: [[SUM_0:%.*]] = phi i32 [ [[ADD:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_COND_PREHEADER]] ]
 ; CHECK-NEXT: [[CMP:%.*]] = icmp ule i64 [[INDVARS_IV]], [[TMP1]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
 ; CHECK-NEXT: [[ADD]] = add nsw i32 [[SUM_0]], [[TMP2]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i64 0, [[INDVARS_IV_NEXT]]
 ; CHECK-NEXT: br i1 [[CMP2]], label [[FOR_COND]], label [[FOR_END]]
 ; CHECK: for.end:
 ; CHECK-NEXT: [[SUM_0_LCSSA:%.*]] = phi i32 [ [[SUM_0]], [[FOR_BODY]] ], [ [[SUM_0]], [[FOR_COND]] ]
 ; CHECK-NEXT: ret i32 [[SUM_0_LCSSA]]
 ; CHECK: leave:
 ; CHECK-NEXT: ret i32 0
 ;
 [...]

llvm/test/Transforms/IndVarSimplify/X86/pr27133.ll

 [...]
 ; CHECK-NEXT: [[C_0_LCSSA:%.*]] = phi i32 [ [[INDVARS1]], [[FOR_COND]] ]
 ; CHECK-NEXT: [[TMP0:%.*]] = catchswitch within none [label %catch] unwind to caller
 ; CHECK: catch:
 ; CHECK-NEXT: [[TMP1:%.*]] = catchpad within [[TMP0]] [i8* null, i32 64, i8* null]
 ; CHECK-NEXT: catchret from [[TMP1]] to label [[EXIT:%.*]]
 ; CHECK: exit:
 ; CHECK-NEXT: ret i32 [[C_0_LCSSA]]
 ; CHECK: for.inc:
-; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: br label [[FOR_COND]]
 ;
 entry:
   br label %for.cond

 for.cond: ; preds = %for.inc, %entry
   %c.0 = phi i32 [ %inc, %for.inc ], [ 0, %entry ]
   %idxprom = sext i32 %c.0 to i64
 [...]

llvm/test/Transforms/IndVarSimplify/keep-nsw-nuw-flag.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -indvars -S < %s | FileCheck %s

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

@a = external global [8000 x float], align 64
@b = external global [8000 x float], align 64
@c = external global [8000 x float], align 64

define float @foo(i32 %arg1, i32 %arg2) {
; CHECK-LABEL: @foo(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[ARG1:%.*]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[ARG2:%.*]] to i64
; CHECK-NEXT: br label [[PREHEADER:%.*]]
; CHECK: preheader:
; CHECK-NEXT: br label [[BODY:%.*]]
; CHECK: body:
; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[BODY]] ], [ [[TMP0]], [[PREHEADER]] ]
; CHECK-NEXT: [[ARRAYIDX_A:%.*]] = getelementptr inbounds [8000 x float], [8000 x float]* @a, i64 0, i64 [[INDVARS_IV]]
; CHECK-NEXT: [[TMP2:%.*]] = load float, float* [[ARRAYIDX_A]], align 4
; CHECK-NEXT: [[ARRAYIDX_B:%.*]] = getelementptr inbounds [8000 x float], [8000 x float]* @b, i64 0, i64 [[INDVARS_IV]]
; CHECK-NEXT: [[TMP3:%.*]] = load float, float* [[ARRAYIDX_B]], align 4
; CHECK-NEXT: [[ARRAYIDX_C:%.*]] = getelementptr inbounds [8000 x float], [8000 x float]* @c, i64 0, i64 [[INDVARS_IV]]
; CHECK-NEXT: [[FADD:%.*]] = fadd fast float [[TMP2]], [[TMP3]]
; CHECK-NEXT: store float [[FADD]], float* [[ARRAYIDX_C]], align 4
; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], [[TMP1]]
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], 8000
; CHECK-NEXT: br i1 [[CMP]], label [[BODY]], label [[CLEANUP:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
; CHECK: cleanup:
; CHECK-NEXT: br label [[PREHEADER]]
;
entry:
  br label %preheader

preheader: ; preds = %cleanup, %entry
  br label %body

body: ; preds = %body, %preheader
  %iv = phi i32 [ %arg1, %preheader ], [ %add, %body ]
  %sext = sext i32 %iv to i64
  %arrayidx.a = getelementptr inbounds [8000 x float], [8000 x float]* @a, i64 0, i64 %sext
  %0 = load float, float* %arrayidx.a, align 4
  %arrayidx.b = getelementptr inbounds [8000 x float], [8000 x float]* @b, i64 0, i64 %sext
  %1 = load float, float* %arrayidx.b, align 4
  %arrayidx.c = getelementptr inbounds [8000 x float], [8000 x float]* @c, i64 0, i64 %sext
  %fadd = fadd fast float %0, %1
  store float %fadd, float* %arrayidx.c, align 4
  %add = add nsw i32 %iv, %arg2
  %cmp = icmp slt i32 %add, 8000
  br i1 %cmp, label %body, label %cleanup, !llvm.loop !0

cleanup: ; preds = %body
  br label %preheader
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.mustprogress"}

llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll

 [...]
 ; CHECK-NEXT: store i8** [[ARRAYDECAY2032]], i8*** inttoptr (i64 8 to i8***), align 8
 ; CHECK-NEXT: br label [[FOR_COND2106:%.*]]
 ; CHECK: for.cond2106:
 ; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_COND2106]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[GID_0:%.*]] = phi i8* [ null, [[ENTRY]] ], [ [[INCDEC_PTR:%.*]], [[FOR_COND2106]] ]
 ; CHECK-NEXT: [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[GID_0]], i64 1
 ; CHECK-NEXT: [[ARRAYIDX2115:%.*]] = getelementptr inbounds [15 x i8*], [15 x i8*]* [[PTRIDS]], i64 0, i64 [[INDVARS_IV]]
 ; CHECK-NEXT: store i8* [[GID_0]], i8** [[ARRAYIDX2115]], align 8
-; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT: br label [[FOR_COND2106]]
 ;
 entry:
   %ptrids = alloca [15 x i8*], align 8
   %arraydecay2032 = getelementptr inbounds [15 x i8*], [15 x i8*]* %ptrids, i64 0, i64 0
   store i8** %arraydecay2032, i8*** inttoptr (i64 8 to i8***), align 8
   br label %for.cond2106

 [...]

llvm/test/Transforms/LoopFlatten/widen-iv.ll

	Show All 35 Lines
	; CHECK-NEXT: br label [[FOR_BODY4_US:%.*]]			; CHECK-NEXT: br label [[FOR_BODY4_US:%.*]]
	; CHECK: for.body4.us:			; CHECK: for.body4.us:
	; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]			; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32
	; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[TMP3]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[TMP3]], [[MUL_US]]
	; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[IDXPROM_US]]			; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[IDXPROM_US]]
	; CHECK-NEXT: tail call void @f(i32* [[ARRAYIDX_US]])			; CHECK-NEXT: tail call void @f(i32* [[ARRAYIDX_US]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw nsw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]
	; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:			; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT2]] = add i64 [[INDVAR1]], 1			; CHECK-NEXT: [[INDVAR_NEXT2]] = add nuw nsw i64 [[INDVAR1]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]
	; CHECK: for.cond.cleanup.loopexit:			; CHECK: for.cond.cleanup.loopexit:
	; CHECK-NEXT: br label [[FOR_COND_CLEANUP]]			; CHECK-NEXT: br label [[FOR_COND_CLEANUP]]
	; CHECK: for.cond.cleanup:			; CHECK: for.cond.cleanup:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[TMP5:%.*]] = add nsw i64 [[INDVAR]], [[TMP3]]			; CHECK-NEXT: [[TMP5:%.*]] = add nsw i64 [[INDVAR]], [[TMP3]]
	; CHECK-NEXT: [[TMP6:%.*]] = sext i32 [[J_016_US]] to i64			; CHECK-NEXT: [[TMP6:%.*]] = sext i32 [[J_016_US]] to i64
	; CHECK-NEXT: [[TMP7:%.*]] = add nsw i64 [[TMP6]], [[TMP3]]			; CHECK-NEXT: [[TMP7:%.*]] = add nsw i64 [[TMP6]], [[TMP3]]
	; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[J_016_US]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[J_016_US]], [[MUL_US]]
	; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[INDVAR2]]			; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[INDVAR2]]
	; CHECK-NEXT: [[TMP8:%.]] = load i32, i32 [[ARRAYIDX_US]], align 4			; CHECK-NEXT: [[TMP8:%.]] = load i32, i32 [[ARRAYIDX_US]], align 4
	; CHECK-NEXT: tail call void @g(i32 [[TMP8]])			; CHECK-NEXT: tail call void @g(i32 [[TMP8]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw nsw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[INC_US:%.*]] = add nuw nsw i32 [[J_016_US]], 1			; CHECK-NEXT: [[INC_US:%.*]] = add nuw nsw i32 [[J_016_US]], 1
	; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]
	; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:			; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add nuw nsw i64 [[INDVAR2]], 1
	; CHECK-NEXT: [[INC6_US]] = add nuw nsw i32 [[I_018_US]], 1			; CHECK-NEXT: [[INC6_US]] = add nuw nsw i32 [[I_018_US]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]
	; CHECK: for.cond1.preheader:			; CHECK: for.cond1.preheader:
	; CHECK-NEXT: [[I_018:%.]] = phi i32 [ [[INC6:%.]], [[FOR_COND1_PREHEADER]] ], [ 0, [[FOR_COND1_PREHEADER_PREHEADER]] ]			; CHECK-NEXT: [[I_018:%.]] = phi i32 [ [[INC6:%.]], [[FOR_COND1_PREHEADER]] ], [ 0, [[FOR_COND1_PREHEADER_PREHEADER]] ]
	; CHECK-NEXT: [[INC6]] = add nuw nsw i32 [[I_018]], 1			; CHECK-NEXT: [[INC6]] = add nuw nsw i32 [[I_018]], 1
	; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC6]], [[N]]			; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC6]], [[N]]
	; CHECK-NEXT: br i1 [[CMP]], label [[FOR_COND1_PREHEADER]], label [[FOR_COND_CLEANUP_LOOPEXIT19:%.*]]			; CHECK-NEXT: br i1 [[CMP]], label [[FOR_COND1_PREHEADER]], label [[FOR_COND_CLEANUP_LOOPEXIT19:%.*]]
	▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines
	; CHECK: for.body4.us:			; CHECK: for.body4.us:
	; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]			; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32
	; CHECK-NEXT: [[ADD_US:%.*]] = add i32 [[TMP3]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add i32 [[TMP3]], [[MUL_US]]
	; CHECK-NEXT: [[IDXPROM_US:%.*]] = zext i32 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[IDXPROM_US:%.*]] = zext i32 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[IDXPROM_US]]			; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i32, i32 [[A:%.*]], i64 [[IDXPROM_US]]
	; CHECK-NEXT: [[TMP4:%.]] = load i32, i32 [[ARRAYIDX_US]], align 4			; CHECK-NEXT: [[TMP4:%.]] = load i32, i32 [[ARRAYIDX_US]], align 4
	; CHECK-NEXT: tail call void @g(i32 [[TMP4]])			; CHECK-NEXT: tail call void @g(i32 [[TMP4]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP2_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP2_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND1_FOR_COND_CLEANUP3_CRIT_EDGE_US]]
	; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:			; CHECK: for.cond1.for.cond.cleanup3_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT2]] = add i64 [[INDVAR1]], 1			; CHECK-NEXT: [[INDVAR_NEXT2]] = add i64 [[INDVAR1]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT19:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT19:%.*]]
	; CHECK: for.cond1.preheader:			; CHECK: for.cond1.preheader:
	; CHECK-NEXT: [[I_018:%.]] = phi i32 [ [[INC6:%.]], [[FOR_COND1_PREHEADER]] ], [ 0, [[FOR_COND1_PREHEADER_PREHEADER]] ]			; CHECK-NEXT: [[I_018:%.]] = phi i32 [ [[INC6:%.]], [[FOR_COND1_PREHEADER]] ], [ 0, [[FOR_COND1_PREHEADER_PREHEADER]] ]
	▲ Show 20 Lines • Show All 78 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]			; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND1_PREHEADER_US]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i32
	; CHECK-NEXT: [[ADD_US:%.*]] = add i32 [[TMP3]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add i32 [[TMP3]], [[MUL_US]]
	; CHECK-NEXT: [[IDXPROM_US:%.*]] = zext i32 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[IDXPROM_US:%.*]] = zext i32 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i16, i16 [[A:%.*]], i64 [[IDXPROM_US]]			; CHECK-NEXT: [[ARRAYIDX_US:%.]] = getelementptr inbounds i16, i16 [[A:%.*]], i64 [[IDXPROM_US]]
	; CHECK-NEXT: [[TMP4:%.]] = load i16, i16 [[ARRAYIDX_US]], align 2			; CHECK-NEXT: [[TMP4:%.]] = load i16, i16 [[ARRAYIDX_US]], align 2
	; CHECK-NEXT: [[ADD5_US:%.]] = add i16 [[TMP4]], [[VAL:%.]]			; CHECK-NEXT: [[ADD5_US:%.]] = add i16 [[TMP4]], [[VAL:%.]]
	; CHECK-NEXT: store i16 [[ADD5_US]], i16* [[ARRAYIDX_US]], align 2			; CHECK-NEXT: store i16 [[ADD5_US]], i16* [[ARRAYIDX_US]], align 2
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP2_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP2_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND1_FOR_INC7_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND1_FOR_INC7_CRIT_EDGE_US]]
	; CHECK: for.cond1.for.inc7_crit_edge.us:			; CHECK: for.cond1.for.inc7_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT2]] = add i64 [[INDVAR1]], 1			; CHECK-NEXT: [[INDVAR_NEXT2]] = add i64 [[INDVAR1]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT2]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_END9_LOOPEXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_END9_LOOPEXIT:%.*]]
	; CHECK: for.end9.loopexit:			; CHECK: for.end9.loopexit:
	; CHECK-NEXT: br label [[FOR_END9]]			; CHECK-NEXT: br label [[FOR_END9]]
	▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[FLATTEN_TRUNCIV:%.*]] = trunc i64 [[INDVAR2]] to i8			; CHECK-NEXT: [[FLATTEN_TRUNCIV:%.*]] = trunc i64 [[INDVAR2]] to i8
	; CHECK-NEXT: br label [[FOR_BODY9_US:%.*]]			; CHECK-NEXT: br label [[FOR_BODY9_US:%.*]]
	; CHECK: for.body9.us:			; CHECK: for.body9.us:
	; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND3_PREHEADER_US]] ]			; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ 0, [[FOR_COND3_PREHEADER_US]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i8			; CHECK-NEXT: [[TMP3:%.*]] = trunc i64 [[INDVAR]] to i8
	; CHECK-NEXT: [[ADD_US:%.*]] = add i8 [[TMP3]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add i8 [[TMP3]], [[MUL_US]]
	; CHECK-NEXT: [[CONV14_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i32			; CHECK-NEXT: [[CONV14_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i32
	; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])			; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP6_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP6_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]
	; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:			; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT1:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT1:%.*]]
	; CHECK: for.cond3.preheader:			; CHECK: for.cond3.preheader:
	; CHECK-NEXT: [[I_026:%.]] = phi i8 [ [[INC16:%.]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]			; CHECK-NEXT: [[I_026:%.]] = phi i8 [ [[INC16:%.]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]
	▲ Show 20 Lines • Show All 92 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[CONV14_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i32			; CHECK-NEXT: [[CONV14_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i32
	; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])			; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])
	; CHECK-NEXT: [[CONV15_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i16			; CHECK-NEXT: [[CONV15_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i16
	; CHECK-NEXT: [[CALL16_US:%.*]] = tail call i32 @use_16(i16 [[CONV15_US]])			; CHECK-NEXT: [[CALL16_US:%.*]] = tail call i32 @use_16(i16 [[CONV15_US]])
	; CHECK-NEXT: [[CALL18_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])			; CHECK-NEXT: [[CALL18_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])
	; CHECK-NEXT: [[CALL20_US:%.*]] = tail call i32 @use_16(i16 [[CONV15_US]])			; CHECK-NEXT: [[CALL20_US:%.*]] = tail call i32 @use_16(i16 [[CONV15_US]])
	; CHECK-NEXT: [[CONV21_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[CONV21_US:%.*]] = zext i8 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[CALL22_US:%.*]] = tail call i32 @use_64(i64 [[CONV21_US]])			; CHECK-NEXT: [[CALL22_US:%.*]] = tail call i32 @use_64(i64 [[CONV21_US]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP6_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP6_US:%.*]] = icmp ult i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]
	; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:			; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp ult i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT1:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT1:%.*]]
	; CHECK: for.cond3.preheader:			; CHECK: for.cond3.preheader:
	; CHECK-NEXT: [[I_038:%.*]] = phi i8 [ [[INC24:%.*]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]			; CHECK-NEXT: [[I_038:%.*]] = phi i8 [ [[INC24:%.*]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]
	; CHECK-NEXT: [[ADD_US:%.*]] = add i16 [[TMP3]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add i16 [[TMP3]], [[MUL_US]]
	; CHECK-NEXT: [[CONV14_US:%.*]] = sext i16 [[FLATTEN_TRUNCIV]] to i32			; CHECK-NEXT: [[CONV14_US:%.*]] = sext i16 [[FLATTEN_TRUNCIV]] to i32
	; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])			; CHECK-NEXT: [[CALL_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])
	; CHECK-NEXT: [[CALL15_US:%.*]] = tail call i32 @use_16(i16 [[FLATTEN_TRUNCIV]])			; CHECK-NEXT: [[CALL15_US:%.*]] = tail call i32 @use_16(i16 [[FLATTEN_TRUNCIV]])
	; CHECK-NEXT: [[CALL17_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])			; CHECK-NEXT: [[CALL17_US:%.*]] = tail call i32 @use_32(i32 [[CONV14_US]])
	; CHECK-NEXT: [[CALL18_US:%.*]] = tail call i32 @use_16(i16 [[FLATTEN_TRUNCIV]])			; CHECK-NEXT: [[CALL18_US:%.*]] = tail call i32 @use_16(i16 [[FLATTEN_TRUNCIV]])
	; CHECK-NEXT: [[CONV19_US:%.*]] = sext i16 [[FLATTEN_TRUNCIV]] to i64			; CHECK-NEXT: [[CONV19_US:%.*]] = sext i16 [[FLATTEN_TRUNCIV]] to i64
	; CHECK-NEXT: [[CALL20_US:%.*]] = tail call i32 @use_64(i64 [[CONV19_US]])			; CHECK-NEXT: [[CALL20_US:%.*]] = tail call i32 @use_64(i64 [[CONV19_US]])
	; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT:%.*]] = add nuw nsw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[CMP6_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]			; CHECK-NEXT: [[CMP6_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP0]]
	; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]			; CHECK-NEXT: br label [[FOR_COND3_FOR_COND_CLEANUP8_CRIT_EDGE_US]]
	; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:			; CHECK: for.cond3.for.cond.cleanup8_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[FLATTEN_TRIPCOUNT]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND3_PREHEADER_US]], label [[FOR_COND_CLEANUP_LOOPEXIT:%.*]]
	; CHECK: for.cond3.preheader:			; CHECK: for.cond3.preheader:
	; CHECK-NEXT: [[I_039:%.*]] = phi i16 [ [[INC22:%.*]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]			; CHECK-NEXT: [[I_039:%.*]] = phi i16 [ [[INC22:%.*]], [[FOR_COND3_PREHEADER]] ], [ 0, [[FOR_COND3_PREHEADER_PREHEADER]] ]

llvm/test/Transforms/LoopFlatten/widen-iv2.ll

	; CHECK-NEXT: [[J_014_US:%.*]] = phi i32 [ 0, [[FOR_COND1_PREHEADER_US]] ], [ [[INC_US:%.*]], [[FOR_BODY3_US]] ]			; CHECK-NEXT: [[J_014_US:%.*]] = phi i32 [ 0, [[FOR_COND1_PREHEADER_US]] ], [ [[INC_US:%.*]], [[FOR_BODY3_US]] ]
	; CHECK-NEXT: [[TMP7:%.*]] = add nsw i64 [[INDVAR]], [[TMP5]]			; CHECK-NEXT: [[TMP7:%.*]] = add nsw i64 [[INDVAR]], [[TMP5]]
	; CHECK-NEXT: [[TMP8:%.*]] = sext i32 [[J_014_US]] to i64			; CHECK-NEXT: [[TMP8:%.*]] = sext i32 [[J_014_US]] to i64
	; CHECK-NEXT: [[TMP9:%.*]] = add nsw i64 [[TMP8]], [[TMP5]]			; CHECK-NEXT: [[TMP9:%.*]] = add nsw i64 [[TMP8]], [[TMP5]]
	; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[J_014_US]], [[MUL_US]]			; CHECK-NEXT: [[ADD_US:%.*]] = add nsw i32 [[J_014_US]], [[MUL_US]]
	; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[ADD_US]] to i64			; CHECK-NEXT: [[IDXPROM_US:%.*]] = sext i32 [[ADD_US]] to i64
	; CHECK-NEXT: [[ARRAYIDX_US:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i64 [[TMP7]]			; CHECK-NEXT: [[ARRAYIDX_US:%.*]] = getelementptr inbounds i32, i32* [[TMP4]], i64 [[TMP7]]
	; CHECK-NEXT: store i32 32, i32* [[ARRAYIDX_US]], align 4			; CHECK-NEXT: store i32 32, i32* [[ARRAYIDX_US]], align 4
	; CHECK-NEXT: [[INDVAR_NEXT]] = add i64 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT]] = add nuw nsw i64 [[INDVAR]], 1
	; CHECK-NEXT: [[INC_US]] = add nuw nsw i32 [[J_014_US]], 1			; CHECK-NEXT: [[INC_US]] = add nuw nsw i32 [[J_014_US]], 1
	; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP1]]			; CHECK-NEXT: [[CMP2_US:%.*]] = icmp slt i64 [[INDVAR_NEXT]], [[TMP1]]
	; CHECK-NEXT: br i1 [[CMP2_US]], label [[FOR_BODY3_US]], label [[FOR_COND1_FOR_INC4_CRIT_EDGE_US]]			; CHECK-NEXT: br i1 [[CMP2_US]], label [[FOR_BODY3_US]], label [[FOR_COND1_FOR_INC4_CRIT_EDGE_US]]
	; CHECK: for.cond1.for.inc4_crit_edge.us:			; CHECK: for.cond1.for.inc4_crit_edge.us:
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i64 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add nuw nsw i64 [[INDVAR2]], 1
	; CHECK-NEXT: [[INC5_US]] = add nuw nsw i32 [[I_016_US]], 1			; CHECK-NEXT: [[INC5_US]] = add nuw nsw i32 [[I_016_US]], 1
	; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[TMP3]]			; CHECK-NEXT: [[CMP_US:%.*]] = icmp slt i64 [[INDVAR_NEXT3]], [[TMP3]]
	; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_END6_LOOPEXIT:%.*]]			; CHECK-NEXT: br i1 [[CMP_US]], label [[FOR_COND1_PREHEADER_US]], label [[FOR_END6_LOOPEXIT:%.*]]
	; CHECK: for.end6.loopexit:			; CHECK: for.end6.loopexit:
	; CHECK-NEXT: br label [[FOR_END6]]			; CHECK-NEXT: br label [[FOR_END6]]
	; CHECK: for.end6:			; CHECK: for.end6:
	; CHECK-NEXT: ret i32 undef			; CHECK-NEXT: ret i32 undef
	;			;

llvm/test/Transforms/LoopFlatten/widen-iv3.ll


	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py

	; RUN: opt < %s -S -loop-flatten \			; RUN: opt < %s -S -loop-flatten \
	; RUN: -verify-loop-info -verify-dom-info -verify-scev -verify | \			; RUN: -verify-loop-info -verify-dom-info -verify-scev -verify | \
	; RUN: FileCheck %s --check-prefix=CHECK			; RUN: FileCheck %s --check-prefix=CHECK

	target datalayout = "n32"			target datalayout = "n32"

	@v = global [64 x i16] [i16 0, i16 1, i16 2, i16 3, i16 4, i16 5, i16 6, i16 7, i16 8, i16 9, i16 10, i16 11, i16 12, i16 13, i16 14, i16 15, i16 16, i16 17, i16 18, i16 19, i16 20, i16 21, i16 22, i16 23, i16 24, i16 25, i16 26, i16 27, i16 28, i16 29, i16 30, i16 31, i16 32, i16 33, i16 34, i16 35, i16 36, i16 37, i16 38, i16 39, i16 40, i16 41, i16 42, i16 43, i16 44, i16 45, i16 46, i16 47, i16 48, i16 49, i16 50, i16 51, i16 52, i16 53, i16 54, i16 55, i16 56, i16 57, i16 58, i16 59, i16 60, i16 61, i16 62, i16 63], align 1			@v = global [64 x i16] [i16 0, i16 1, i16 2, i16 3, i16 4, i16 5, i16 6, i16 7, i16 8, i16 9, i16 10, i16 11, i16 12, i16 13, i16 14, i16 15, i16 16, i16 17, i16 18, i16 19, i16 20, i16 21, i16 22, i16 23, i16 24, i16 25, i16 26, i16 27, i16 28, i16 29, i16 30, i16 31, i16 32, i16 33, i16 34, i16 35, i16 36, i16 37, i16 38, i16 39, i16 40, i16 41, i16 42, i16 43, i16 44, i16 45, i16 46, i16 47, i16 48, i16 49, i16 50, i16 51, i16 52, i16 53, i16 54, i16 55, i16 56, i16 57, i16 58, i16 59, i16 60, i16 61, i16 62, i16 63], align 1

	; CHECK-NEXT: [[MUL:%.*]] = mul nsw i16 [[I_013]], 16			; CHECK-NEXT: [[MUL:%.*]] = mul nsw i16 [[I_013]], 16
	; CHECK-NEXT: [[TMP1:%.*]] = zext i16 [[MUL]] to i32			; CHECK-NEXT: [[TMP1:%.*]] = zext i16 [[MUL]] to i32
	; CHECK-NEXT: br label [[FOR_BODY4:%.*]]			; CHECK-NEXT: br label [[FOR_BODY4:%.*]]
	; CHECK: for.cond.cleanup:			; CHECK: for.cond.cleanup:
	; CHECK-NEXT: [[ADD5_LCSSA_LCSSA:%.*]] = phi i16 [ [[ADD5_LCSSA]], [[FOR_COND_CLEANUP3]] ]			; CHECK-NEXT: [[ADD5_LCSSA_LCSSA:%.*]] = phi i16 [ [[ADD5_LCSSA]], [[FOR_COND_CLEANUP3]] ]
	; CHECK-NEXT: ret i16 [[ADD5_LCSSA_LCSSA]]			; CHECK-NEXT: ret i16 [[ADD5_LCSSA_LCSSA]]
	; CHECK: for.cond.cleanup3:			; CHECK: for.cond.cleanup3:
	; CHECK-NEXT: [[ADD5_LCSSA]] = phi i16 [ [[ADD5:%.*]], [[FOR_BODY4]] ]			; CHECK-NEXT: [[ADD5_LCSSA]] = phi i16 [ [[ADD5:%.*]], [[FOR_BODY4]] ]
	; CHECK-NEXT: [[INDVAR_NEXT3]] = add i32 [[INDVAR2]], 1			; CHECK-NEXT: [[INDVAR_NEXT3]] = add nuw nsw i32 [[INDVAR2]], 1
	; CHECK-NEXT: [[INC7]] = add nuw nsw i16 [[I_013]], 1			; CHECK-NEXT: [[INC7]] = add nuw nsw i16 [[I_013]], 1
	; CHECK-NEXT: [[EXITCOND14_NOT:%.*]] = icmp eq i32 [[INDVAR_NEXT3]], 4			; CHECK-NEXT: [[EXITCOND14_NOT:%.*]] = icmp eq i32 [[INDVAR_NEXT3]], 4
	; CHECK-NEXT: br i1 [[EXITCOND14_NOT]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_COND1_PREHEADER]]			; CHECK-NEXT: br i1 [[EXITCOND14_NOT]], label [[FOR_COND_CLEANUP:%.*]], label [[FOR_COND1_PREHEADER]]
	; CHECK: for.body4:			; CHECK: for.body4:
	; CHECK-NEXT: [[INDVAR:%.*]] = phi i32 [ [[INDVAR_NEXT:%.*]], [[FOR_BODY4]] ], [ 0, [[FOR_COND1_PREHEADER]] ]			; CHECK-NEXT: [[INDVAR:%.*]] = phi i32 [ [[INDVAR_NEXT:%.*]], [[FOR_BODY4]] ], [ 0, [[FOR_COND1_PREHEADER]] ]
	; CHECK-NEXT: [[J_011:%.*]] = phi i16 [ 0, [[FOR_COND1_PREHEADER]] ]			; CHECK-NEXT: [[J_011:%.*]] = phi i16 [ 0, [[FOR_COND1_PREHEADER]] ], [ [[INC:%.*]], [[FOR_BODY4]] ]
	; CHECK-NEXT: [[SUM_110:%.*]] = phi i16 [ [[SUM_012]], [[FOR_COND1_PREHEADER]] ], [ [[ADD5]], [[FOR_BODY4]] ]			; CHECK-NEXT: [[SUM_110:%.*]] = phi i16 [ [[SUM_012]], [[FOR_COND1_PREHEADER]] ], [ [[ADD5]], [[FOR_BODY4]] ]
	; CHECK-NEXT: [[TMP2:%.*]] = add nuw nsw i32 [[INDVAR]], [[TMP0]]			; CHECK-NEXT: [[TMP2:%.*]] = add nuw nsw i32 [[INDVAR]], [[TMP0]]
	; CHECK-NEXT: [[ADD:%.*]] = add nuw nsw i16 [[J_011]], [[MUL]]			; CHECK-NEXT: [[ADD:%.*]] = add nuw nsw i16 [[J_011]], [[MUL]]
	; CHECK-NEXT: [[TMP3:%.*]] = trunc i32 [[TMP2]] to i16			; CHECK-NEXT: [[TMP3:%.*]] = trunc i32 [[TMP2]] to i16
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [64 x i16], [64 x i16]* @v, i16 0, i16 [[TMP3]]			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [64 x i16], [64 x i16]* @v, i16 0, i16 [[TMP3]]
	; CHECK-NEXT: [[TMP4:%.*]] = load i16, i16* [[ARRAYIDX]], align 1			; CHECK-NEXT: [[TMP4:%.*]] = load i16, i16* [[ARRAYIDX]], align 1
	; CHECK-NEXT: [[ADD5]] = add nsw i16 [[TMP4]], [[SUM_110]]			; CHECK-NEXT: [[ADD5]] = add nsw i16 [[TMP4]], [[SUM_110]]
	; CHECK-NEXT: [[INDVAR_NEXT]] = add i32 [[INDVAR]], 1			; CHECK-NEXT: [[INDVAR_NEXT]] = add nuw nsw i32 [[INDVAR]], 1
	; CHECK-NEXT: [[INC:%.*]] = add nuw nsw i16 [[J_011]], 1			; CHECK-NEXT: [[INC]] = add nuw nsw i16 [[J_011]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[INDVAR_NEXT]], 16			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i32 [[INDVAR_NEXT]], 16
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP3]], label [[FOR_BODY4]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP3]], label [[FOR_BODY4]]
	;			;
	entry:			entry:
	br label %for.cond1.preheader			br label %for.cond1.preheader

	for.cond1.preheader: ; preds = %entry, %for.cond.cleanup3			for.cond1.preheader: ; preds = %entry, %for.cond.cleanup3
	%i.013 = phi i16 [ 0, %entry ], [ %inc7, %for.cond.cleanup3 ]			%i.013 = phi i16 [ 0, %entry ], [ %inc7, %for.cond.cleanup3 ]

[IndVarS] Keep the nsw/nuw flags after simplifyAndExtend
Needs Review, Public

Unit Tests: Failed

Diff 396501

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

llvm/test/Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll

llvm/test/Transforms/IndVarSimplify/X86/pr27133.ll

llvm/test/Transforms/IndVarSimplify/keep-nsw-nuw-flag.ll

llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll

llvm/test/Transforms/LoopFlatten/widen-iv.ll

llvm/test/Transforms/LoopFlatten/widen-iv2.ll

llvm/test/Transforms/LoopFlatten/widen-iv3.ll
