This is an archive of the discontinued LLVM Phabricator instance.

Differential D96708

[llvm] Bug fix: tightlyNested() of Loop interchange
Needs Review · Public

Authored by masakazu.ueno on Feb 15 2021, 6:26 AM.

Details

Reviewers

fhahn
hfinkel
qcolombet

Summary

tightlyNested() can wrongly decide that a loop nest is tightly nested.
When that happens, loop interchange may produce code that computes incorrect results.
To prevent this, I added a check for instructions between the nested loops.

Example

#include <stdio.h>
static   long  int i,j,k=3,m=3;
unsigned long  int x[50];
long double a[50][50];

int main() {

for (i=0; i<50; i++) for (j=0; j<50; j++) a[i][j]=i; for (i=0; i<50; i++) x[i]=1;

j=0;
for(i=40;i>1;i--) {
  k=20;
  while(k++<40) x[k]=x[j]%40;
  a[i][i]+=a[i][j]+a[j][i];
  for(m=9;;) if (--m<5) break;
}

for (i=0; i<50; i++) for (j=0; j<50; j++)
  printf("a:%ld %5.5Le \n",i,a[i][j]);

  return 0;
}

*) Specified options are "-Ofast -mllvm -enable-loopinterchange=true -flegacy-pass-manager".

The difference between the results when loop interchange is enabled and disabled is shown below.

$ clang -O2 -flegacy-pass-manager -mllvm -enable-loopinterchange=true -Rpass=loop-interchange bug1362_tightlynested.c -o a_error.out
bug1362_tightlynested.c:13:3: remark: Loop interchanged with enclosing loop. [-Rpass=loop-interchange]
  while(k++<40) x[k]=x[j]%40;
  ^
$ clang -O2 -flegacy-pass-manager -mllvm -enable-loopinterchange=false -Rpass=loop-interchange bug1362_tightlynested.c -o a_ok.out
$ ./a_error.out > a_error.res
$ ./a_ok.out > a_ok.res
$ diff a_ok.res a_error.res 
103c103
< a:2 4.00000e+00 
---
> a:2 4.20000e+01 
154c154
< a:3 6.00000e+00 
---
> a:3 6.30000e+01 
205c205
< a:4 8.00000e+00 
---
> a:4 8.40000e+01
(The remaining differences are omitted.)
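
The wrong values above follow a pattern: 42 = 21 * 2, 63 = 21 * 3, 84 = 21 * 4. That is what one would see if the statement `a[i][i]+=a[i][j]+a[j][i];` ran once per iteration of the `while` loop (20 times) instead of once per outer iteration. This is an inference from the numbers, not something the revision states. The following is a minimal C++ sketch of that arithmetic under this assumption; the function names are hypothetical and the `x[k]` update is left out because it does not affect `a`.

```cpp
#include <array>

// Simplified model of the reproducer; all names here are hypothetical.
using Matrix = std::array<std::array<long double, 50>, 50>;

// Same initialization as the reproducer: a[i][j] = i.
Matrix makeInput() {
  Matrix a{};
  for (int i = 0; i < 50; ++i)
    for (int j = 0; j < 50; ++j)
      a[i][j] = i;
  return a;
}

// Original order: the update of a[i][i] runs once per outer iteration,
// after the inner loop (20 iterations, k = 21..40) has finished.
Matrix runOriginal() {
  Matrix a = makeInput();
  const int j = 0;
  for (int i = 40; i > 1; --i) {
    for (int k = 21; k <= 40; ++k) {
      // x[k] = x[j] % 40;  (omitted: it does not affect a)
    }
    a[i][i] += a[i][j] + a[j][i];
  }
  return a;
}

// Assumed effect of the bad interchange: the former inner loop becomes the
// outer one, so the update runs once per k (20 times) for every i.
Matrix runWronglyInterchanged() {
  Matrix a = makeInput();
  const int j = 0;
  for (int k = 21; k <= 40; ++k)
    for (int i = 40; i > 1; --i)
      a[i][i] += a[i][j] + a[j][i];
  return a;
}
```

With `j == 0`, each update adds `a[i][0] + a[0][i] = i + 0`, so the original order gives `a[i][i] = 2 * i` and the assumed interchanged order gives `a[i][i] = 21 * i`, matching the `a_ok.res` and `a_error.res` lines shown in the diff output.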

Event Timeline

masakazu.ueno created this revision. · Feb 15 2021, 6:26 AM

Herald added a subscriber: hiraditya. · Feb 15 2021, 6:26 AM

masakazu.ueno requested review of this revision. · Feb 15 2021, 6:26 AM

Herald added a subscriber: llvm-commits. · Feb 15 2021, 6:26 AM

Harbormaster completed remote builds in B89220: Diff 323729. · Feb 15 2021, 7:00 AM

masakazu.ueno updated this revision to Diff 323953. · Feb 16 2021, 4:16 AM

congzhe added a subscriber: congzhe. · Feb 24 2021, 6:46 PM

congzhe added inline comments. · Feb 25 2021, 9:49 PM

llvm/lib/Transforms/Scalar/LoopInterchange.cpp
Line 590: Is there any underlying reason that we selectively allow these instructions in `InnerLoopPreheader`, for example why `zext`? I'm wondering whether there are specific reasons or whether it is an ad-hoc solution for now. The test program you provided does not seem to have these instructions in particular. I have the same question for `InnerLoopExit`.
Line 653: Maybe add a comment to explain this piece of code? If `containsUnsafeInstructionsInInnerLoop()` returns true, does that mean the loops are not tightly nested? It all depends on how we define "tightly nested" in loop interchange. For example, if we have an add instruction in the inner loop preheader, then `containsUnsafeInstructionsInInnerLoop()` would return true, but having an `add %x + 1` instruction in the inner loop preheader does not necessarily mean the loops are not tightly nested. Am I missing something?
llvm/test/Transforms/LoopInterchange/pr43797-lcssa-for-multiple-outer-loop-blocks.ll
Line 9: I understand that with this patch these tests would fail the legality check. I'm wondering whether these tests are really not interchangeable. If we did do loop interchange on the tests, would it give us something wrong?
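
The question raised at line 653 can be made concrete with a small standalone model. This is only a sketch, not LLVM code: basic blocks are modelled as lists of opcode tags, and `Op`, `Block`, `hasSideEffectsOrReads` and `preheaderRejectedByPatch` are hypothetical names. The first function mimics the spirit of the existing `containsUnsafeInstructions()` (side effects or memory reads), the second mimics the preheader allow-list added in this diff (only branch, zext and debug intrinsics pass).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical opcode tags standing in for LLVM instruction classes.
enum class Op { Br, ZExt, Phi, DbgInfo, Add, Load, Store };

using Block = std::vector<Op>;

// Rough stand-in for the existing containsUnsafeInstructions():
// in this model only loads and stores touch memory or have side effects.
bool hasSideEffectsOrReads(const Block &BB) {
  return std::any_of(BB.begin(), BB.end(), [](Op O) {
    return O == Op::Load || O == Op::Store;
  });
}

// Stand-in for the preheader rule added by the patch: anything other than
// a branch, a zext, or a debug intrinsic makes the nest "not tightly nested".
bool preheaderRejectedByPatch(const Block &Preheader) {
  return std::any_of(Preheader.begin(), Preheader.end(), [](Op O) {
    return O != Op::Br && O != Op::ZExt && O != Op::DbgInfo;
  });
}
```

In this model a preheader holding `{Add, Br}` passes the side-effect check but is rejected by the allow-list, which is the case the reviewer asks about; a preheader holding `{ZExt, Br}` passes both.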

I added some comments to the source code.

Harbormaster completed remote builds in B92219: Diff 328379. · Mar 5 2021, 2:38 PM

In D96708#2605502, @masakazu.ueno wrote:

I added some comments to the source code.

Thanks for the comments!

Whitney mentioned this in D98263: [LoopInterchange] fix tightlyNested() in LoopInterchange legality. · Mar 10 2021, 6:16 AM

Revision Contents

Path    Size

llvm/lib/Transforms/Scalar/LoopInterchange.cpp    38 lines

llvm/test/Transforms/LoopInterchange/pr43797-lcssa-for-multiple-outer-loop-blocks.ll    87 lines

llvm/test/Transforms/LoopInterchange/pr45743-move-from-inner-preheader.ll    35 lines

Diff 328379

llvm/lib/Transforms/Scalar/LoopInterchange.cpp

@@ 341 lines hidden @@ public:
   bool currentLimitations();

   const SmallPtrSetImpl<PHINode *> &getOuterInnerReductions() const {
     return OuterInnerReductions;
   }

 private:
   bool tightlyNested(Loop *Outer, Loop *Inner);
+  bool containsUnsafeInstructionsInInnerLoop(void);
   bool containsUnsafeInstructions(BasicBlock *BB);

   /// Discover induction and reduction PHIs in the header of \p L. Induction
   /// PHIs are added to \p Inductions, reductions are added to
   /// OuterInnerReductions. When the outer loop is passed, the inner loop needs
   /// to be passed as \p InnerLoop.
   bool findInductionAndReductions(Loop *L,
                                   SmallVector<PHINode *, 8> &Inductions,
@@ 215 lines hidden @@ assert(OuterLoop->isLCSSAForm(*DT) &&
            "Outer loop not left in LCSSA form after loop interchange!");

     return true;
   }
 };

 } // end anonymous namespace

+/// Returns true if there are unsafe instructions above or below
+/// the inner loop.
+bool LoopInterchangeLegality::containsUnsafeInstructionsInInnerLoop() {
+  BasicBlock *InnerLoopPreHeader = InnerLoop->getLoopPreheader();
+  BasicBlock *InnerLoopExit = InnerLoop->getExitBlock();
+  BasicBlock *OuterLoopHeader = OuterLoop->getHeader();
+  BasicBlock *OuterLoopLatch = OuterLoop->getLoopLatch();
+
+  // Check loop preheader.
+  // In Transforms/LoopInterchange/lcssa-preheader.ll, the inner loop
+  // preheader has Zext instruction.
+  if (InnerLoopPreHeader != OuterLoopHeader) {
+    for (Instruction &I : *InnerLoopPreHeader) {
+      if (!isa<BranchInst>(&I) && !isa<ZExtInst>(&I) &&
+          !isa<DbgInfoIntrinsic>(&I))
+        return true;
+    }
+  }
+
+  // Check loop latch.
+  if (InnerLoopExit != OuterLoopLatch) {
+    for (Instruction &I : *InnerLoopExit) {
+      if (!isa<BranchInst>(&I) && !isa<PHINode>(&I) &&
+          !isa<DbgInfoIntrinsic>(&I))
+        return true;
+    }
+  }
+
+  return false;
+}
+
 bool LoopInterchangeLegality::containsUnsafeInstructions(BasicBlock *BB) {
   return any_of(*BB, [](const Instruction &I) {
     return I.mayHaveSideEffects() || I.mayReadFromMemory();
   });
 }

 bool LoopInterchangeLegality::tightlyNested(Loop *OuterLoop, Loop *InnerLoop) {
   BasicBlock *OuterLoopHeader = OuterLoop->getHeader();
@@ 24 lines hidden @@ bool LoopInterchangeLegality::tightlyNested(Loop *OuterLoop, Loop *InnerLoop) {

   // Also make sure the inner loop preheader does not contain any unsafe
   // instructions. Note that all instructions in the preheader will be moved to
   // the outer loop header when interchanging.
   if (InnerLoopPreHeader != OuterLoopHeader &&
       containsUnsafeInstructions(InnerLoopPreHeader))
     return false;

+  // Check if the loops are tightly nested.
+  // Basically, if there are instructions that may be unsafe
+  // in the inner loop preheader and exit, suppress Loop-interchange.
+  if (containsUnsafeInstructionsInInnerLoop())
+    return false;
+
   LLVM_DEBUG(dbgs() << "Loops are perfectly nested\n");
   // We have a perfect loop nest.
   return true;
 }

 bool LoopInterchangeLegality::isLoopStructureUnderstood(
     PHINode *InnerInduction) {
   unsigned Num = InnerInduction->getNumOperands();
@@ 1,067 lines hidden @@

llvm/test/Transforms/LoopInterchange/pr43797-lcssa-for-multiple-outer-loop-blocks.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -loop-interchange -verify-loop-lcssa -S %s | FileCheck %s
+; RUN: opt -loop-interchange -verify-loop-lcssa -pass-remarks-output=%t -S %s
+; RUN: FileCheck %s --input-file %t --check-prefix REMARK

 ; Tests for PR43797.

 @wdtdr = external dso_local global [5 x [5 x double]], align 16

+; REMARK: NotTightlyNested
+
+;; Loops not tightly nested are not interchanged
 define void @test1() {
-; CHECK-LABEL: @test1(
-; CHECK-NEXT: entry:
-; CHECK-NEXT: br label [[INNER_HEADER_PREHEADER:%.*]]
-; CHECK: outer.header.preheader:
-; CHECK-NEXT: br label [[OUTER_HEADER:%.*]]
-; CHECK: outer.header:
-; CHECK-NEXT: [[OUTER_IDX:%.*]] = phi i64 [ [[OUTER_IDX_INC:%.*]], [[OUTER_LATCH:%.*]] ], [ 0, [[OUTER_HEADER_PREHEADER:%.*]] ]
-; CHECK-NEXT: [[ARRAYIDX8:%.*]] = getelementptr inbounds [5 x [5 x double]], [5 x [5 x double]]* @wdtdr, i64 0, i64 0, i64 [[OUTER_IDX]]
-; CHECK-NEXT: br label [[INNER_HEADER_SPLIT:%.*]]
-; CHECK: inner.header.preheader:
-; CHECK-NEXT: br label [[INNER_HEADER:%.*]]
-; CHECK: inner.header:
-; CHECK-NEXT: [[INNER_IDX:%.*]] = phi i64 [ [[TMP3:%.*]], [[INNER_LATCH_SPLIT:%.*]] ], [ 0, [[INNER_HEADER_PREHEADER]] ]
-; CHECK-NEXT: br label [[OUTER_HEADER_PREHEADER]]
-; CHECK: inner.header.split:
-; CHECK-NEXT: [[TMP0:%.*]] = load double, double* [[ARRAYIDX8]], align 8
-; CHECK-NEXT: store double undef, double* [[ARRAYIDX8]], align 8
-; CHECK-NEXT: br label [[INNER_LATCH:%.*]]
-; CHECK: inner.latch:
-; CHECK-NEXT: [[INNER_IDX_INC:%.*]] = add nsw i64 [[INNER_IDX]], 1
-; CHECK-NEXT: br label [[INNER_EXIT:%.*]]
-; CHECK: inner.latch.split:
-; CHECK-NEXT: [[TMP1:%.*]] = phi i64 [ [[OUTER_V:%.*]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP2:%.*]] = phi i64 [ [[OUTER_IDX_INC]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP3]] = add nsw i64 [[INNER_IDX]], 1
-; CHECK-NEXT: br i1 false, label [[INNER_HEADER]], label [[OUTER_EXIT:%.*]]
-; CHECK: inner.exit:
-; CHECK-NEXT: [[OUTER_V]] = add nsw i64 [[OUTER_IDX]], 1
-; CHECK-NEXT: br label [[OUTER_LATCH]]
-; CHECK: outer.latch:
-; CHECK-NEXT: [[OUTER_IDX_INC]] = add nsw i64 [[OUTER_IDX]], 1
-; CHECK-NEXT: br i1 false, label [[OUTER_HEADER]], label [[INNER_LATCH_SPLIT]]
-; CHECK: outer.exit:
-; CHECK-NEXT: [[EXIT1_LCSSA:%.*]] = phi i64 [ [[TMP1]], [[INNER_LATCH_SPLIT]] ]
-; CHECK-NEXT: [[EXIT2_LCSSA:%.*]] = phi i64 [ [[TMP2]], [[INNER_LATCH_SPLIT]] ]
-; CHECK-NEXT: ret void
-;
 entry:
   br label %outer.header

 outer.header: ; preds = %for.inc27, %entry
   %outer.idx = phi i64 [ 0, %entry ], [ %outer.idx.inc, %outer.latch ]
   %arrayidx8 = getelementptr inbounds [5 x [5 x double]], [5 x [5 x double]]* @wdtdr, i64 0, i64 0, i64 %outer.idx
   br label %inner.header

@@ 16 lines hidden @@ outer.latch: ; preds = %for.end
   br i1 undef, label %outer.header, label %outer.exit

 outer.exit: ; preds = %for.inc27
   %exit1.lcssa = phi i64 [ %outer.v, %outer.latch ]
   %exit2.lcssa = phi i64 [ %outer.idx.inc, %outer.latch ]
   ret void
 }

+; REMARK: NotTightlyNested
+
+;; Loops not tightly nested are not interchanged
 define void @test2(i1 %cond) {
-; CHECK-LABEL: @test2(
-; CHECK-NEXT: entry:
-; CHECK-NEXT: br i1 [[COND:%.*]], label [[INNER_HEADER_PREHEADER:%.*]], label [[OUTER_EXIT:%.*]]
-; CHECK: outer.header.preheader:
-; CHECK-NEXT: br label [[OUTER_HEADER:%.*]]
-; CHECK: outer.header:
-; CHECK-NEXT: [[OUTER_IDX:%.*]] = phi i64 [ [[OUTER_IDX_INC:%.*]], [[OUTER_LATCH:%.*]] ], [ 0, [[OUTER_HEADER_PREHEADER:%.*]] ]
-; CHECK-NEXT: [[ARRAYIDX8:%.*]] = getelementptr inbounds [5 x [5 x double]], [5 x [5 x double]]* @wdtdr, i64 0, i64 0, i64 [[OUTER_IDX]]
-; CHECK-NEXT: br label [[INNER_HEADER_SPLIT:%.*]]
-; CHECK: inner.header.preheader:
-; CHECK-NEXT: br label [[INNER_HEADER:%.*]]
-; CHECK: inner.header:
-; CHECK-NEXT: [[INNER_IDX:%.*]] = phi i64 [ [[TMP3:%.*]], [[INNER_LATCH_SPLIT:%.*]] ], [ 0, [[INNER_HEADER_PREHEADER]] ]
-; CHECK-NEXT: br label [[OUTER_HEADER_PREHEADER]]
-; CHECK: inner.header.split:
-; CHECK-NEXT: [[TMP0:%.*]] = load double, double* [[ARRAYIDX8]], align 8
-; CHECK-NEXT: store double undef, double* [[ARRAYIDX8]], align 8
-; CHECK-NEXT: br label [[INNER_LATCH:%.*]]
-; CHECK: inner.latch:
-; CHECK-NEXT: [[INNER_IDX_INC:%.*]] = add nsw i64 [[INNER_IDX]], 1
-; CHECK-NEXT: br label [[INNER_EXIT:%.*]]
-; CHECK: inner.latch.split:
-; CHECK-NEXT: [[TMP1:%.*]] = phi i64 [ [[OUTER_IDX_INC]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP2:%.*]] = phi i64 [ [[OUTER_V:%.*]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP3]] = add nsw i64 [[INNER_IDX]], 1
-; CHECK-NEXT: br i1 false, label [[INNER_HEADER]], label [[OUTER_EXIT_LOOPEXIT:%.*]]
-; CHECK: inner.exit:
-; CHECK-NEXT: [[OUTER_V]] = add nsw i64 [[OUTER_IDX]], 1
-; CHECK-NEXT: br label [[OUTER_LATCH]]
-; CHECK: outer.latch:
-; CHECK-NEXT: [[OUTER_IDX_INC]] = add nsw i64 [[OUTER_IDX]], 1
-; CHECK-NEXT: br i1 false, label [[OUTER_HEADER]], label [[INNER_LATCH_SPLIT]]
-; CHECK: outer.exit.loopexit:
-; CHECK-NEXT: [[OUTER_IDX_INC_LCSSA:%.*]] = phi i64 [ [[TMP1]], [[INNER_LATCH_SPLIT]] ]
-; CHECK-NEXT: [[OUTER_V_LCSSA:%.*]] = phi i64 [ [[TMP2]], [[INNER_LATCH_SPLIT]] ]
-; CHECK-NEXT: br label [[OUTER_EXIT]]
-; CHECK: outer.exit:
-; CHECK-NEXT: [[EXIT1_LCSSA:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[OUTER_V_LCSSA]], [[OUTER_EXIT_LOOPEXIT]] ]
-; CHECK-NEXT: [[EXIT2_LCSSA:%.*]] = phi i64 [ 0, [[ENTRY]] ], [ [[OUTER_IDX_INC_LCSSA]], [[OUTER_EXIT_LOOPEXIT]] ]
-; CHECK-NEXT: ret void
-;
 entry:
   br i1 %cond, label %outer.header, label %outer.exit

 outer.header: ; preds = %for.inc27, %entry
   %outer.idx = phi i64 [ 0, %entry ], [ %outer.idx.inc, %outer.latch ]
   %arrayidx8 = getelementptr inbounds [5 x [5 x double]], [5 x [5 x double]]* @wdtdr, i64 0, i64 0, i64 %outer.idx
   br label %inner.header

@@ 23 lines hidden @@

llvm/test/Transforms/LoopInterchange/pr45743-move-from-inner-preheader.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt -loop-interchange -S %s | FileCheck %s

 @global = external local_unnamed_addr global [2 x [10 x i32]], align 16

+;; Loops not tightly nested are not interchanged
 ; We need to move %tmp4 from the inner loop pre header to the outer loop header
 ; before interchanging.
 define void @test1() local_unnamed_addr #0 {
 ; CHECK-LABEL: @test1(
-; CHECK-NEXT: bb:
-; CHECK-NEXT: br label [[INNER_PH:%.*]]
-; CHECK: outer.header.preheader:
-; CHECK-NEXT: br label [[OUTER_HEADER:%.*]]
-; CHECK: outer.header:
-; CHECK-NEXT: [[OUTER_IV:%.*]] = phi i64 [ [[OUTER_IV_NEXT:%.*]], [[OUTER_LATCH:%.*]] ], [ 0, [[OUTER_HEADER_PREHEADER:%.*]] ]
-; CHECK-NEXT: [[INNER_RED:%.*]] = phi i32 [ [[OUTER_RED:%.*]], [[OUTER_HEADER_PREHEADER]] ], [ [[RED_NEXT:%.*]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP4:%.*]] = add nsw i64 [[OUTER_IV]], 9
-; CHECK-NEXT: br label [[INNER_SPLIT1:%.*]]
-; CHECK: inner.ph:
-; CHECK-NEXT: br label [[INNER:%.*]]
-; CHECK: inner:
-; CHECK-NEXT: [[INNER_IV:%.*]] = phi i64 [ 0, [[INNER_PH]] ], [ [[TMP0:%.*]], [[INNER_SPLIT:%.*]] ]
-; CHECK-NEXT: [[OUTER_RED]] = phi i32 [ [[RED_NEXT_LCSSA:%.*]], [[INNER_SPLIT]] ], [ 0, [[INNER_PH]] ]
-; CHECK-NEXT: br label [[OUTER_HEADER_PREHEADER]]
-; CHECK: inner.split1:
-; CHECK-NEXT: [[PTR:%.*]] = getelementptr inbounds [2 x [10 x i32]], [2 x [10 x i32]]* @global, i64 0, i64 [[INNER_IV]], i64 [[TMP4]]
-; CHECK-NEXT: store i32 0, i32* [[PTR]], align 4
-; CHECK-NEXT: [[RED_NEXT]] = or i32 [[INNER_RED]], 20
-; CHECK-NEXT: [[INNER_IV_NEXT:%.*]] = add nsw i64 [[INNER_IV]], 1
-; CHECK-NEXT: [[EC_1:%.*]] = icmp eq i64 [[INNER_IV_NEXT]], 400
-; CHECK-NEXT: br label [[OUTER_LATCH]]
-; CHECK: inner.split:
-; CHECK-NEXT: [[RED_NEXT_LCSSA]] = phi i32 [ [[RED_NEXT]], [[OUTER_LATCH]] ]
-; CHECK-NEXT: [[TMP0]] = add nsw i64 [[INNER_IV]], 1
-; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i64 [[TMP0]], 400
-; CHECK-NEXT: br i1 [[TMP1]], label [[EXIT:%.*]], label [[INNER]]
-; CHECK: outer.latch:
-; CHECK-NEXT: [[OUTER_IV_NEXT]] = add nsw i64 [[OUTER_IV]], 1
-; CHECK-NEXT: [[EC_2:%.*]] = icmp eq i64 [[OUTER_IV_NEXT]], 400
-; CHECK-NEXT: br i1 [[EC_2]], label [[INNER_SPLIT]], label [[OUTER_HEADER]]
-; CHECK: exit:
-; CHECK-NEXT: ret void
-;
 bb:
   br label %outer.header

 outer.header: ; preds = %bb11, %bb
   %outer.iv = phi i64 [ 0, %bb ], [ %outer.iv.next, %outer.latch ]
   %outer.red = phi i32 [ 0, %bb ], [ %red.next.lcssa, %outer.latch ]
   br label %inner.ph

@@ 90 lines hidden @@