This is an archive of the discontinued LLVM Phabricator instance.

[LoopFusion] Restrict loop fusion to rotated loops.
ClosedPublic

Authored by kbarton on Dec 4 2019, 9:32 AM.


Details

Reviewers

jdoerfert
Meinersbur
dmgreen
etiotto
Whitney
fhahn
hfinkel

Commits

rGff07fc66d9ee: [LoopFusion] Restrict loop fusion to rotated loops.

Summary

This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.
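As a rough source-level illustration of what "rotated" means here (not taken from the patch; `sumTopTested`, `sumRotated`, and `checkEquivalent` are names made up for this sketch): a rotated loop tests its exit condition at the bottom, in the latch, and is typically wrapped in a guard that skips the loop when it would run zero times. Both functions below compute the same result; only the loop shape differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Top-tested form: the exit condition is checked in the loop header
// before every iteration.
int sumTopTested(const std::vector<int> &v) {
  int sum = 0;
  std::size_t i = 0;
  while (i < v.size()) {
    sum += v[i];
    ++i;
  }
  return sum;
}

// Rotated form: a guard protects the loop, and the exit condition is
// checked at the bottom of the body (in the latch).
int sumRotated(const std::vector<int> &v) {
  int sum = 0;
  std::size_t i = 0;
  if (i < v.size()) {        // guard: skip the loop if it would not run
    do {
      sum += v[i];
      ++i;
    } while (i < v.size());  // exit test in the latch
  }
  return sum;
}

// Hypothetical helper: both shapes must agree on any input.
void checkEquivalent(const std::vector<int> &v) {
  assert(sumTopTested(v) == sumRotated(v));
}
```

At the IR level the same distinction shows up in the test changes of this revision: the exit comparison moves from the loop header into the latch block.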

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

kbarton created this revision. Dec 4 2019, 9:32 AM

Herald added a project: Restricted Project. Dec 4 2019, 9:32 AM

Herald added subscribers: llvm-commits, hiraditya.

Harbormaster completed remote builds in B41867: Diff 232160. Dec 4 2019, 9:32 AM

Herald added a subscriber: ormris. Dec 4 2019, 9:32 AM

[suggestion] Add a test case to check for "not rotated" to be rejected.

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
270	Does this test something different now?
llvm/test/Transforms/LoopFusion/diagnostics_missed.ll
3–4	[nit] can be removed

This revision is now accepted and ready to land. Dec 5 2019, 8:55 AM

Whitney added a child revision: D71165: [LoopFusion] Move instructions from FC0.Latch to FC1.Latch. Dec 7 2019, 5:29 AM

Whitney added inline comments. Dec 12 2019, 4:12 PM

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	This guard was not in the original test case; now the second loop has two guards. Is that intentional?
78	bb34 is also a predecessor of bb33.

Addressed all the review comments.
I'm ready to land this, unless @Whitney has concerns about the test case.

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	Yes, this is intentional to make the two loops not control-flow-equivalent, which is what this specific test is checking.
270	This is a good catch. I missed this. I've fixed it to check for the memory dependencies again.

Whitney added inline comments. Dec 16 2019, 9:39 AM

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	The two loops should already be not control-flow-equivalent, as bb7 is not guarded, while bb22 is guarded. Why is the extra guard needed?

kbarton marked 2 inline comments as done. Dec 16 2019, 9:56 AM

kbarton added inline comments.

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	If the loop is not guarded, the code will use the preheader; if the loop is guarded, it uses the guard. In this case, since one is guarded and the other is not guarded, it will compare the preheader for the first loop with the guard for the second loop and find them control flow equivalent. It will fail in a later check for fusion, however this specific test is meant to check that loops that are not control flow equivalent are put into different candidate sets. Putting an extra branch around the second loop satisfies that.
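The effect described in this comment can be reproduced on a toy control-flow graph. The sketch below is not LLVM code: `Graph`, `reachableAvoiding`, `controlFlowEquivalent`, and `buildNonCfe` are hypothetical names, and it assumes the usual definition that A and B are control flow equivalent when A dominates B and B post-dominates A. It compares the first loop's preheader (`bb`) with the second loop's guard block (`bb16`), once with and once without the extra `bb34` branch; the variant without `bb34` is a reconstruction for illustration only.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy CFG: block name -> successor names. Not LLVM's data structures.
using Graph = std::map<std::string, std::vector<std::string>>;

// True if `to` can be reached from `from` without entering `removed`.
bool reachableAvoiding(const Graph &G, const std::string &from,
                       const std::string &to, const std::string &removed) {
  if (from == removed)
    return false;
  std::set<std::string> seen{from};
  std::vector<std::string> work{from};
  while (!work.empty()) {
    std::string cur = work.back();
    work.pop_back();
    if (cur == to)
      return true;
    auto it = G.find(cur);
    if (it == G.end())
      continue;
    for (const std::string &succ : it->second)
      if (succ != removed && seen.insert(succ).second)
        work.push_back(succ);
  }
  return false;
}

// A dominates B (every path entry -> B passes through A) and
// B post-dominates A (every path A -> exit passes through B).
bool controlFlowEquivalent(const Graph &G, const std::string &entry,
                           const std::string &exit, const std::string &A,
                           const std::string &B) {
  assert(G.count(entry) && "entry block must be in the graph");
  bool dom = (A == B) || !reachableAvoiding(G, entry, B, A);
  bool postDom = (A == B) || !reachableAvoiding(G, A, exit, B);
  return dom && postDom;
}

// CFG shaped like @non_cfe; extraGuard selects whether bb34 is present.
// The exit block bb33 has no successors, so it needs no entry.
Graph buildNonCfe(bool extraGuard) {
  Graph G;
  G["bb"] = {"bb7"};
  G["bb7"] = {"bb14"};
  if (extraGuard) {
    G["bb14"] = {"bb7", "bb34"};
    G["bb34"] = {"bb16", "bb33"};
  } else {
    G["bb14"] = {"bb7", "bb16"};
  }
  G["bb16"] = {"bb20.preheader", "bb33"};
  G["bb20.preheader"] = {"bb22"};
  G["bb22"] = {"bb30"};
  G["bb30"] = {"bb22", "bb33.loopexit"};
  G["bb33.loopexit"] = {"bb33"};
  return G;
}
```

In this model, `controlFlowEquivalent(buildNonCfe(false), "bb", "bb33", "bb", "bb16")` holds, because every path from `bb` to the exit passes through `bb16`. With the extra guard, `bb34` can branch straight to `bb33`, so `bb16` no longer post-dominates `bb` and the two blocks are not equivalent.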

Whitney added inline comments. Dec 16 2019, 10:08 AM

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	Notice that before changing the test case to rotated form, only one loop guard was needed. I think we should change the way fusion computes control flow equivalence between two loops, since a guarded loop (assuming the condition is not always true) and an unguarded loop should be considered not control flow equivalent. I am OK with keeping that as a future improvement.

kbarton marked an inline comment as done. Dec 16 2019, 10:43 AM

kbarton added inline comments.

llvm/test/Transforms/LoopFusion/cannot_fuse.ll
42	Yes, this is an existing problem that came up recently. I don't think the current patch exacerbates the problem though. I am currently working on a fix for that, but would prefer to land this patch as is and fix the existing problem separately.

Closed by commit rGff07fc66d9ee: [LoopFusion] Restrict loop fusion to rotated loops. (authored by kbarton). Dec 16 2019, 12:17 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

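Path	Size
llvm/lib/Transforms/Scalar/LoopFuse.cpp	6 lines
llvm/test/Transforms/LoopFusion/cannot_fuse.ll	307 lines
llvm/test/Transforms/LoopFusion/diagnostics_missed.ll	364 lines
llvm/test/Transforms/LoopFusion/four_loops.ll	140 lines
llvm/test/Transforms/LoopFusion/loop_nest.ll	104 lines
llvm/test/Transforms/LoopFusion/simple.ll	309 lines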

Diff 234118

llvm/lib/Transforms/Scalar/LoopFuse.cpp

[85 lines elided]
STATISTIC(UncomputableTripCount, "SCEV cannot compute trip count of loop");		STATISTIC(UncomputableTripCount, "SCEV cannot compute trip count of loop");
STATISTIC(NonEqualTripCount, "Loop trip counts are not the same");		STATISTIC(NonEqualTripCount, "Loop trip counts are not the same");
STATISTIC(NonAdjacent, "Loops are not adjacent");		STATISTIC(NonAdjacent, "Loops are not adjacent");
STATISTIC(NonEmptyPreheader, "Loop has a non-empty preheader");		STATISTIC(NonEmptyPreheader, "Loop has a non-empty preheader");
STATISTIC(FusionNotBeneficial, "Fusion is not beneficial");		STATISTIC(FusionNotBeneficial, "Fusion is not beneficial");
STATISTIC(NonIdenticalGuards, "Candidates have different guards");		STATISTIC(NonIdenticalGuards, "Candidates have different guards");
STATISTIC(NonEmptyExitBlock, "Candidate has a non-empty exit block");		STATISTIC(NonEmptyExitBlock, "Candidate has a non-empty exit block");
STATISTIC(NonEmptyGuardBlock, "Candidate has a non-empty guard block");		STATISTIC(NonEmptyGuardBlock, "Candidate has a non-empty guard block");
		STATISTIC(NotRotated, "Candidate is not rotated");

enum FusionDependenceAnalysisChoice {		enum FusionDependenceAnalysisChoice {
FUSION_DEPENDENCE_ANALYSIS_SCEV,		FUSION_DEPENDENCE_ANALYSIS_SCEV,
FUSION_DEPENDENCE_ANALYSIS_DA,		FUSION_DEPENDENCE_ANALYSIS_DA,
FUSION_DEPENDENCE_ANALYSIS_ALL,		FUSION_DEPENDENCE_ANALYSIS_ALL,
};		};

static cl::opt<FusionDependenceAnalysisChoice> FusionDependenceAnalysis(		static cl::opt<FusionDependenceAnalysisChoice> FusionDependenceAnalysis(
[212 lines elided]	bool isEligibleForFusion(ScalarEvolution &SE) const {
}		}

if (!L->isLoopSimplifyForm()) {		if (!L->isLoopSimplifyForm()) {
LLVM_DEBUG(dbgs() << "Loop " << L->getName()		LLVM_DEBUG(dbgs() << "Loop " << L->getName()
<< " is not in simplified form!\n");		<< " is not in simplified form!\n");
return reportInvalidCandidate(NotSimplifiedForm);		return reportInvalidCandidate(NotSimplifiedForm);
}		}

		if (!isRotated()) {
		LLVM_DEBUG(dbgs() << "Loop " << L->getName() << " is not rotated!\n");
		return reportInvalidCandidate(NotRotated);
		}

return true;		return true;
}		}

private:		private:
// This is only used internally for now, to clear the MemWrites and MemReads		// This is only used internally for now, to clear the MemWrites and MemReads
// list and setting Valid to false. I can't envision other uses of this right		// list and setting Valid to false. I can't envision other uses of this right
// now, since once FusionCandidates are put into the FusionCandidateSet they		// now, since once FusionCandidates are put into the FusionCandidateSet they
// are immutable. Thus, any time we need to change/update a FusionCandidate,		// are immutable. Thus, any time we need to change/update a FusionCandidate,
[1,301 lines elided]
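
The check added to `isEligibleForFusion` relies on `isRotated()`, whose body is not shown in this excerpt. A minimal sketch of one common definition, assuming a loop counts as rotated when its latch is also an exiting block; `ToyLoop`, `Successors`, and the helper functions are hypothetical and are not LLVM's API. The two example loops are modelled on the first loop of `@notRotated` and the first loop of the rotated `@non_cfe` from the test changes in this revision.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy successor map: block name -> successor names.
using Successors = std::map<std::string, std::vector<std::string>>;

// Toy loop description (hypothetical, not llvm::Loop).
struct ToyLoop {
  std::set<std::string> blocks; // blocks that belong to the loop
  std::string latch;            // block holding the back edge to the header
};

// A block is "exiting" if at least one successor lies outside the loop.
bool isExiting(const ToyLoop &L, const Successors &succ,
               const std::string &bb) {
  auto it = succ.find(bb);
  if (it == succ.end())
    return false;
  for (const std::string &s : it->second)
    if (L.blocks.count(s) == 0)
      return true;
  return false;
}

// Assumed definition for this sketch: rotated == the latch is exiting.
bool isRotated(const ToyLoop &L, const Successors &succ) {
  assert(L.blocks.count(L.latch) && "latch must be part of the loop");
  return isExiting(L, succ, L.latch);
}

// Shape of the first loop in @notRotated: the header bb5 holds the exit
// test, and the latch bb14 only branches back to the header.
bool notRotatedExample() {
  ToyLoop L;
  L.blocks = {"bb5", "bb7", "bb14"};
  L.latch = "bb14";
  Successors succ;
  succ["bb5"] = {"bb7", "bb17"};
  succ["bb7"] = {"bb14"};
  succ["bb14"] = {"bb5"};
  return isRotated(L, succ);
}

// Shape of the first loop in the rotated @non_cfe: the latch bb14 holds
// the exit test and can leave the loop to bb34.
bool rotatedExample() {
  ToyLoop L;
  L.blocks = {"bb7", "bb14"};
  L.latch = "bb14";
  Successors succ;
  succ["bb7"] = {"bb14"};
  succ["bb14"] = {"bb7", "bb34"};
  return isRotated(L, succ);
}
```

Under this assumed definition, `notRotatedExample()` returns false and `rotatedExample()` returns true, which matches the intent of the new `notRotated` test: both of its loops should be rejected with "is not rotated!".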

llvm/test/Transforms/LoopFusion/cannot_fuse.ll

[9 lines elided]
; CHECK: Fusion Candidates:		; CHECK: Fusion Candidates:
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK: bb		; CHECK: bb
; CHECK: ****************************		; CHECK: ****************************
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK: bb20.preheader		; CHECK: bb20.preheader
; CHECK: ****************************		; CHECK: ****************************
; CHECK: Loop Fusion complete		; CHECK: Loop Fusion complete
define void @non_cfe(i32* noalias %arg) {		define void @non_cfe(i32* noalias %arg, i32 %N) {
bb:		bb:
br label %bb5		br label %bb7

bb5: ; preds = %bb14, %bb
%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb14 ], [ 0, %bb ]
%.01 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]
%exitcond4 = icmp ne i64 %indvars.iv2, 100
br i1 %exitcond4, label %bb7, label %bb16

bb7: ; preds = %bb5		bb7: ; preds = %bb, %bb14
%tmp = add nsw i32 %.01, -3		%.014 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]
%tmp8 = add nuw nsw i64 %indvars.iv2, 3		%indvars.iv23 = phi i64 [ 0, %bb ], [ %indvars.iv.next3, %bb14 ]
		%tmp = add nsw i32 %.014, -3
		%tmp8 = add nuw nsw i64 %indvars.iv23, 3
%tmp9 = trunc i64 %tmp8 to i32		%tmp9 = trunc i64 %tmp8 to i32
%tmp10 = mul nsw i32 %tmp, %tmp9		%tmp10 = mul nsw i32 %tmp, %tmp9
%tmp11 = trunc i64 %indvars.iv2 to i32		%tmp11 = trunc i64 %indvars.iv23 to i32
%tmp12 = srem i32 %tmp10, %tmp11		%tmp12 = srem i32 %tmp10, %tmp11
%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2		%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv23
store i32 %tmp12, i32* %tmp13, align 4		store i32 %tmp12, i32* %tmp13, align 4
br label %bb14		br label %bb14

bb14: ; preds = %bb7		bb14: ; preds = %bb7
%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1		%indvars.iv.next3 = add nuw nsw i64 %indvars.iv23, 1
%tmp15 = add nuw nsw i32 %.01, 1		%tmp15 = add nuw nsw i32 %.014, 1
br label %bb5		%exitcond4 = icmp ne i64 %indvars.iv.next3, 100
		br i1 %exitcond4, label %bb7, label %bb34

		bb34:
		%cmp = icmp slt i32 %N, 50
		Whitney: This guard was not in the original test case; now the second loop has two guards. Is that intentional?
		kbarton: Yes, this is intentional to make the two loops not control-flow-equivalent, which is what this specific test is checking.
		Whitney: The two loops should already be not control-flow-equivalent, as bb7 is not guarded, while bb22 is guarded. Why is the extra guard needed?
		kbarton: If the loop is not guarded, the code will use the preheader; if the loop is guarded, it uses the guard. In this case, since one is guarded and the other is not guarded, it will compare the preheader for the first loop with the guard for the second loop and find them control flow equivalent. It will fail in a later check for fusion; however, this specific test is meant to check that loops that are not control flow equivalent are put into different candidate sets. Putting an extra branch around the second loop satisfies that.
		Whitney: Notice that before changing the test case to rotated form, only one loop guard was needed. I think we should change the way fusion computes control flow equivalence between two loops, since a guarded loop (assuming the condition is not always true) and an unguarded loop should be considered not control flow equivalent. I am OK with keeping that as a future improvement.
		kbarton: Yes, this is an existing problem that came up recently. I don't think the current patch exacerbates the problem though. I am currently working on a fix for that, but would prefer to land this patch as is and fix the existing problem separately.
		br i1 %cmp, label %bb16, label %bb33

bb16: ; preds = %bb5		bb16: ; preds = %bb34
%tmp17 = load i32, i32* %arg, align 4		%tmp17 = load i32, i32* %arg, align 4
%tmp18 = icmp slt i32 %tmp17, 0		%tmp18 = icmp slt i32 %tmp17, 0
br i1 %tmp18, label %bb20, label %bb33		br i1 %tmp18, label %bb20.preheader, label %bb33

bb20: ; preds = %bb30, %bb16		bb20.preheader: ; preds = %bb16
%indvars.iv = phi i64 [ %indvars.iv.next, %bb30 ], [ 0, %bb16 ]		br label %bb22
%.0 = phi i32 [ 0, %bb16 ], [ %tmp31, %bb30 ]
%exitcond = icmp ne i64 %indvars.iv, 100
br i1 %exitcond, label %bb22, label %bb33

bb22: ; preds = %bb20		bb22: ; preds = %bb20.preheader, %bb30
%tmp23 = add nsw i32 %.0, -3		%.02 = phi i32 [ 0, %bb20.preheader ], [ %tmp31, %bb30 ]
%tmp24 = add nuw nsw i64 %indvars.iv, 3		%indvars.iv1 = phi i64 [ 0, %bb20.preheader ], [ %indvars.iv.next, %bb30 ]
		%tmp23 = add nsw i32 %.02, -3
		%tmp24 = add nuw nsw i64 %indvars.iv1, 3
%tmp25 = trunc i64 %tmp24 to i32		%tmp25 = trunc i64 %tmp24 to i32
%tmp26 = mul nsw i32 %tmp23, %tmp25		%tmp26 = mul nsw i32 %tmp23, %tmp25
%tmp27 = trunc i64 %indvars.iv to i32		%tmp27 = trunc i64 %indvars.iv1 to i32
%tmp28 = srem i32 %tmp26, %tmp27		%tmp28 = srem i32 %tmp26, %tmp27
%tmp29 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv		%tmp29 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv1
store i32 %tmp28, i32* %tmp29, align 4		store i32 %tmp28, i32* %tmp29, align 4
br label %bb30		br label %bb30

bb30: ; preds = %bb22		bb30: ; preds = %bb22
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
%tmp31 = add nuw nsw i32 %.0, 1		%tmp31 = add nuw nsw i32 %.02, 1
br label %bb20		%exitcond = icmp ne i64 %indvars.iv.next, 100
		br i1 %exitcond, label %bb22, label %bb33.loopexit

bb33: ; preds = %bb20, %bb16		bb33.loopexit: ; preds = %bb30
		br label %bb33

		bb33: ; preds = %bb33.loopexit, %bb16, %bb34
ret void		ret void
}		}

		Whitney: bb34 is also a predecessor of bb33.
; Check that fusion detects the two candidates are not adjacent (the exit block		; Check that fusion detects the two candidates are not adjacent (the exit block
; of the first candidate is not the preheader of the second candidate).		; of the first candidate is not the preheader of the second candidate).

; CHECK: Performing Loop Fusion on function non_adjacent		; CHECK: Performing Loop Fusion on function non_adjacent
; CHECK: Fusion Candidates:		; CHECK: Fusion Candidates:
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]
; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]
; CHECK-NEXT: ****************************		; CHECK-NEXT: ****************************
; CHECK: Attempting fusion on Candidate Set:		; CHECK: Attempting fusion on Candidate Set:
; CHECK-NEXT: [[LOOP1PREHEADER]]		; CHECK-NEXT: [[LOOP1PREHEADER]]
; CHECK-NEXT: [[LOOP2PREHEADER]]		; CHECK-NEXT: [[LOOP2PREHEADER]]
; CHECK: Fusion candidates are not adjacent. Not fusing.		; CHECK: Fusion candidates are not adjacent. Not fusing.
; CHECK: Loop Fusion complete		; CHECK: Loop Fusion complete
define void @non_adjacent(i32* noalias %arg) {		define void @non_adjacent(i32* noalias %arg) {
bb:		bb:
br label %bb3		br label %bb5

bb3: ; preds = %bb11, %bb
%.01 = phi i64 [ 0, %bb ], [ %tmp12, %bb11 ]
%exitcond2 = icmp ne i64 %.01, 100
br i1 %exitcond2, label %bb5, label %bb4

bb4: ; preds = %bb3		bb4: ; preds = %bb11
br label %bb13		br label %bb13

bb5: ; preds = %bb3		bb5: ; preds = %bb, %bb11
%tmp = add nsw i64 %.01, -3		%.013 = phi i64 [ 0, %bb ], [ %tmp12, %bb11 ]
%tmp6 = add nuw nsw i64 %.01, 3		%tmp = add nsw i64 %.013, -3
		%tmp6 = add nuw nsw i64 %.013, 3
%tmp7 = mul nsw i64 %tmp, %tmp6		%tmp7 = mul nsw i64 %tmp, %tmp6
%tmp8 = srem i64 %tmp7, %.01		%tmp8 = srem i64 %tmp7, %.013
%tmp9 = trunc i64 %tmp8 to i32		%tmp9 = trunc i64 %tmp8 to i32
%tmp10 = getelementptr inbounds i32, i32* %arg, i64 %.01		%tmp10 = getelementptr inbounds i32, i32* %arg, i64 %.013
store i32 %tmp9, i32* %tmp10, align 4		store i32 %tmp9, i32* %tmp10, align 4
br label %bb11		br label %bb11

bb11: ; preds = %bb5		bb11: ; preds = %bb5
%tmp12 = add nuw nsw i64 %.01, 1		%tmp12 = add nuw nsw i64 %.013, 1
br label %bb3		%exitcond2 = icmp ne i64 %tmp12, 100
		br i1 %exitcond2, label %bb5, label %bb4

bb13: ; preds = %bb4		bb13: ; preds = %bb4
br label %bb14		br label %bb16

bb14: ; preds = %bb23, %bb13
%.0 = phi i64 [ 0, %bb13 ], [ %tmp24, %bb23 ]
%exitcond = icmp ne i64 %.0, 100
br i1 %exitcond, label %bb16, label %bb15

bb15: ; preds = %bb14		bb15: ; preds = %bb23
br label %bb25		br label %bb25

bb16: ; preds = %bb14		bb16: ; preds = %bb13, %bb23
%tmp17 = add nsw i64 %.0, -3		%.02 = phi i64 [ 0, %bb13 ], [ %tmp24, %bb23 ]
%tmp18 = add nuw nsw i64 %.0, 3		%tmp17 = add nsw i64 %.02, -3
		%tmp18 = add nuw nsw i64 %.02, 3
%tmp19 = mul nsw i64 %tmp17, %tmp18		%tmp19 = mul nsw i64 %tmp17, %tmp18
%tmp20 = srem i64 %tmp19, %.0		%tmp20 = srem i64 %tmp19, %.02
%tmp21 = trunc i64 %tmp20 to i32		%tmp21 = trunc i64 %tmp20 to i32
%tmp22 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.0		%tmp22 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.02
store i32 %tmp21, i32* %tmp22, align 4		store i32 %tmp21, i32* %tmp22, align 4
br label %bb23		br label %bb23

bb23: ; preds = %bb16		bb23: ; preds = %bb16
%tmp24 = add nuw nsw i64 %.0, 1		%tmp24 = add nuw nsw i64 %.02, 1
br label %bb14		%exitcond = icmp ne i64 %tmp24, 100
		br i1 %exitcond, label %bb16, label %bb15

bb25: ; preds = %bb15		bb25: ; preds = %bb15
ret void		ret void
}		}

; Check that the different bounds are detected and prevent fusion.		; Check that the different bounds are detected and prevent fusion.

; CHECK: Performing Loop Fusion on function different_bounds		; CHECK: Performing Loop Fusion on function different_bounds
; CHECK: Fusion Candidates:		; CHECK: Fusion Candidates:
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]
; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]
; CHECK-NEXT: ****************************		; CHECK-NEXT: ****************************
; CHECK: Attempting fusion on Candidate Set:		; CHECK: Attempting fusion on Candidate Set:
; CHECK-NEXT: [[LOOP1PREHEADER]]		; CHECK-NEXT: [[LOOP1PREHEADER]]
; CHECK-NEXT: [[LOOP2PREHEADER]]		; CHECK-NEXT: [[LOOP2PREHEADER]]
; CHECK: Fusion candidates do not have identical trip counts. Not fusing.		; CHECK: Fusion candidates do not have identical trip counts. Not fusing.
; CHECK: Loop Fusion complete		; CHECK: Loop Fusion complete
define void @different_bounds(i32* noalias %arg) {		define void @different_bounds(i32* noalias %arg) {
bb:		bb:
br label %bb3		br label %bb5

bb3: ; preds = %bb11, %bb
%.01 = phi i64 [ 0, %bb ], [ %tmp12, %bb11 ]
%exitcond2 = icmp ne i64 %.01, 100
br i1 %exitcond2, label %bb5, label %bb4

bb4: ; preds = %bb3		bb4: ; preds = %bb11
br label %bb13		br label %bb13

bb5: ; preds = %bb3		bb5: ; preds = %bb, %bb11
%tmp = add nsw i64 %.01, -3		%.013 = phi i64 [ 0, %bb ], [ %tmp12, %bb11 ]
%tmp6 = add nuw nsw i64 %.01, 3		%tmp = add nsw i64 %.013, -3
		%tmp6 = add nuw nsw i64 %.013, 3
%tmp7 = mul nsw i64 %tmp, %tmp6		%tmp7 = mul nsw i64 %tmp, %tmp6
%tmp8 = srem i64 %tmp7, %.01		%tmp8 = srem i64 %tmp7, %.013
%tmp9 = trunc i64 %tmp8 to i32		%tmp9 = trunc i64 %tmp8 to i32
%tmp10 = getelementptr inbounds i32, i32* %arg, i64 %.01		%tmp10 = getelementptr inbounds i32, i32* %arg, i64 %.013
store i32 %tmp9, i32* %tmp10, align 4		store i32 %tmp9, i32* %tmp10, align 4
br label %bb11		br label %bb11

bb11: ; preds = %bb5		bb11: ; preds = %bb5
%tmp12 = add nuw nsw i64 %.01, 1		%tmp12 = add nuw nsw i64 %.013, 1
br label %bb3		%exitcond2 = icmp ne i64 %tmp12, 100
		br i1 %exitcond2, label %bb5, label %bb4

bb13: ; preds = %bb4		bb13: ; preds = %bb4
br label %bb14		br label %bb16

bb14: ; preds = %bb23, %bb13
%.0 = phi i64 [ 0, %bb13 ], [ %tmp24, %bb23 ]
%exitcond = icmp ne i64 %.0, 200
br i1 %exitcond, label %bb16, label %bb15

bb15: ; preds = %bb14		bb15: ; preds = %bb23
br label %bb25		br label %bb25

bb16: ; preds = %bb14		bb16: ; preds = %bb13, %bb23
%tmp17 = add nsw i64 %.0, -3		%.02 = phi i64 [ 0, %bb13 ], [ %tmp24, %bb23 ]
%tmp18 = add nuw nsw i64 %.0, 3		%tmp17 = add nsw i64 %.02, -3
		%tmp18 = add nuw nsw i64 %.02, 3
%tmp19 = mul nsw i64 %tmp17, %tmp18		%tmp19 = mul nsw i64 %tmp17, %tmp18
%tmp20 = srem i64 %tmp19, %.0		%tmp20 = srem i64 %tmp19, %.02
%tmp21 = trunc i64 %tmp20 to i32		%tmp21 = trunc i64 %tmp20 to i32
%tmp22 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.0		%tmp22 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.02
store i32 %tmp21, i32* %tmp22, align 4		store i32 %tmp21, i32* %tmp22, align 4
br label %bb23		br label %bb23

bb23: ; preds = %bb16		bb23: ; preds = %bb16
%tmp24 = add nuw nsw i64 %.0, 1		%tmp24 = add nuw nsw i64 %.02, 1
br label %bb14		%exitcond = icmp ne i64 %tmp24, 200
		br i1 %exitcond, label %bb16, label %bb15

bb25: ; preds = %bb15		bb25: ; preds = %bb15
ret void		ret void
}		}

; Check that the negative dependence between the two candidates is identified		; Check that the negative dependence between the two candidates is identified
; and prevents fusion.		; and prevents fusion.

; CHECK: Performing Loop Fusion on function negative_dependence		; CHECK: Performing Loop Fusion on function negative_dependence
; CHECK: Fusion Candidates:		; CHECK: Fusion Candidates:
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]
; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]
; CHECK-NEXT: ****************************		; CHECK-NEXT: ****************************
; CHECK: Attempting fusion on Candidate Set:		; CHECK: Attempting fusion on Candidate Set:
; CHECK-NEXT: [[LOOP1PREHEADER]]		; CHECK-NEXT: [[LOOP1PREHEADER]]
; CHECK-NEXT: [[LOOP2PREHEADER]]		; CHECK-NEXT: [[LOOP2PREHEADER]]
; CHECK: Memory dependencies do not allow fusion!		; CHECK: Memory dependencies do not allow fusion!
; CHECK: Loop Fusion complete		; CHECK: Loop Fusion complete
define void @negative_dependence(i32* noalias %arg) {		define void @negative_dependence(i32* noalias %arg) {
bb:		bb:
br label %bb5		br label %bb7

bb5: ; preds = %bb9, %bb		bb11.preheader: ; preds = %bb9
%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb9 ], [ 0, %bb ]		br label %bb13
%exitcond4 = icmp ne i64 %indvars.iv2, 100
br i1 %exitcond4, label %bb7, label %bb11

bb7: ; preds = %bb5		bb7: ; preds = %bb, %bb9
%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2		%indvars.iv22 = phi i64 [ 0, %bb ], [ %indvars.iv.next3, %bb9 ]
%tmp8 = trunc i64 %indvars.iv2 to i32		%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv22
		%tmp8 = trunc i64 %indvars.iv22 to i32
store i32 %tmp8, i32* %tmp, align 4		store i32 %tmp8, i32* %tmp, align 4
br label %bb9		br label %bb9

bb9: ; preds = %bb7		bb9: ; preds = %bb7
%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1		%indvars.iv.next3 = add nuw nsw i64 %indvars.iv22, 1
br label %bb5		%exitcond4 = icmp ne i64 %indvars.iv.next3, 100
		br i1 %exitcond4, label %bb7, label %bb11.preheader
bb11: ; preds = %bb18, %bb5
%indvars.iv = phi i64 [ %indvars.iv.next, %bb18 ], [ 0, %bb5 ]		bb13: ; preds = %bb11.preheader, %bb18
%exitcond = icmp ne i64 %indvars.iv, 100		%indvars.iv1 = phi i64 [ 0, %bb11.preheader ], [ %indvars.iv.next, %bb18 ]
br i1 %exitcond, label %bb13, label %bb19		%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1

bb13: ; preds = %bb11
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%tmp14 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv.next		%tmp14 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv.next
%tmp15 = load i32, i32* %tmp14, align 4		%tmp15 = load i32, i32* %tmp14, align 4
%tmp16 = shl nsw i32 %tmp15, 1		%tmp16 = shl nsw i32 %tmp15, 1
%tmp17 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv		%tmp17 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv1
store i32 %tmp16, i32* %tmp17, align 4		store i32 %tmp16, i32* %tmp17, align 4
br label %bb18		br label %bb18

bb18: ; preds = %bb13		bb18: ; preds = %bb13
br label %bb11		%exitcond = icmp ne i64 %indvars.iv.next, 100
		br i1 %exitcond, label %bb13, label %bb19

bb19: ; preds = %bb11		bb19: ; preds = %bb18
ret void		ret void
}		}

; Check for values defined in Loop 0 and used in Loop 1.		; Check for values defined in Loop 0 and used in Loop 1.
; It is not safe to fuse in this case, because the second loop has		; It is not safe to fuse in this case, because the second loop has
; a use of %.01.lcssa which is defined in the body of loop 0. The		; a use of %.01.lcssa which is defined in the body of loop 0. The
; first loop must execute completely in order to compute the correct		; first loop must execute completely in order to compute the correct
; value of %.01.lcssa to be used in the second loop.		; value of %.01.lcssa to be used in the second loop.

; CHECK: Performing Loop Fusion on function sumTest		; CHECK: Performing Loop Fusion on function sumTest
; CHECK: Fusion Candidates:		; CHECK: Fusion Candidates:
; CHECK: * Fusion Candidate Set *		; CHECK: * Fusion Candidate Set *
; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP1PREHEADER:bb[0-9]*]]
; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]		; CHECK-NEXT: [[LOOP2PREHEADER:bb[0-9]*]]
; CHECK-NEXT: ****************************		; CHECK-NEXT: ****************************
; CHECK: Attempting fusion on Candidate Set:		; CHECK: Attempting fusion on Candidate Set:
; CHECK-NEXT: [[LOOP1PREHEADER]]		; CHECK-NEXT: [[LOOP1PREHEADER]]
; CHECK-NEXT: [[LOOP2PREHEADER]]		; CHECK-NEXT: [[LOOP2PREHEADER]]
; CHECK: Memory dependencies do not allow fusion!		; CHECK: Memory dependencies do not allow fusion!
		Meinersbur: Does this test something different now?
		kbarton: This is a good catch. I missed this. I've fixed it to check for the memory dependencies again.
; CHECK: Loop Fusion complete		; CHECK: Loop Fusion complete
define i32 @sumTest(i32* noalias %arg) {		define i32 @sumTest(i32* noalias %arg) {
bb:		bb:
br label %bb6		br label %bb9

bb6: ; preds = %bb9, %bb		bb13.preheader: ; preds = %bb9
%indvars.iv3 = phi i64 [ %indvars.iv.next4, %bb9 ], [ 0, %bb ]		br label %bb15
%.01 = phi i32 [ 0, %bb ], [ %tmp11, %bb9 ]
%exitcond5 = icmp ne i64 %indvars.iv3, 100
br i1 %exitcond5, label %bb9, label %bb13

bb9: ; preds = %bb6		bb9: ; preds = %bb, %bb9
%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv3		%.01.lcssa = phi i32 [ 0, %bb ], [ %tmp11, %bb9 ]
		%.013 = phi i32 [ 0, %bb ], [ %tmp11, %bb9 ]
		%indvars.iv32 = phi i64 [ 0, %bb ], [ %indvars.iv.next4, %bb9 ]
		%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv32
%tmp10 = load i32, i32* %tmp, align 4		%tmp10 = load i32, i32* %tmp, align 4
%tmp11 = add nsw i32 %.01, %tmp10		%tmp11 = add nsw i32 %.013, %tmp10
%indvars.iv.next4 = add nuw nsw i64 %indvars.iv3, 1		%indvars.iv.next4 = add nuw nsw i64 %indvars.iv32, 1
br label %bb6		%exitcond5 = icmp ne i64 %indvars.iv.next4, 100
		br i1 %exitcond5, label %bb9, label %bb13.preheader
bb13: ; preds = %bb20, %bb6
%.01.lcssa = phi i32 [ %.01, %bb6 ], [ %.01.lcssa, %bb20 ]
%indvars.iv = phi i64 [ %indvars.iv.next, %bb20 ], [ 0, %bb6 ]
%exitcond = icmp ne i64 %indvars.iv, 100
br i1 %exitcond, label %bb15, label %bb14

bb14: ; preds = %bb13		bb14: ; preds = %bb20
br label %bb21		br label %bb21

bb15: ; preds = %bb13		bb15: ; preds = %bb13.preheader, %bb20
%tmp16 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv		%indvars.iv1 = phi i64 [ 0, %bb13.preheader ], [ %indvars.iv.next, %bb20 ]
		%tmp16 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv1
%tmp17 = load i32, i32* %tmp16, align 4		%tmp17 = load i32, i32* %tmp16, align 4
%tmp18 = sdiv i32 %tmp17, %.01.lcssa		%tmp18 = sdiv i32 %tmp17, %.01.lcssa
%tmp19 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv		%tmp19 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv1
store i32 %tmp18, i32* %tmp19, align 4		store i32 %tmp18, i32* %tmp19, align 4
br label %bb20		br label %bb20

bb20: ; preds = %bb15		bb20: ; preds = %bb15
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1		%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
br label %bb13		%exitcond = icmp ne i64 %indvars.iv.next, 100
		br i1 %exitcond, label %bb15, label %bb14

bb21: ; preds = %bb14		bb21: ; preds = %bb14
ret i32 %.01.lcssa		ret i32 %.01.lcssa
}		}

; Similar to sumTest above. The first loop computes %add and must		; Similar to sumTest above. The first loop computes %add and must
; complete before it is used in the second loop. Thus, these two loops		; complete before it is used in the second loop. Thus, these two loops
; also cannot be fused.		; also cannot be fused.
[36 lines elided]	for.body8: ; preds = %for.body, %for.body8
%inc14 = add nuw nsw i64 %i2.031, 1		%inc14 = add nuw nsw i64 %i2.031, 1
%cmp5 = icmp ult i64 %inc14, %conv		%cmp5 = icmp ult i64 %inc14, %conv
br i1 %cmp5, label %for.body8, label %for.cond.cleanup7		br i1 %cmp5, label %for.body8, label %for.cond.cleanup7

for.cond.cleanup7: ; preds = %for.body8, %entry		for.cond.cleanup7: ; preds = %for.body8, %entry
%sum1.0.lcssa36 = phi float [ 0.000000e+00, %entry ], [ %add, %for.body8 ]		%sum1.0.lcssa36 = phi float [ 0.000000e+00, %entry ], [ %add, %for.body8 ]
ret float %sum1.0.lcssa36		ret float %sum1.0.lcssa36
}		}

		; Check that non-rotated loops are not considered for fusion.
		; CHECK: Performing Loop Fusion on function notRotated
		; CHECK: Loop bb{{.*}} is not rotated!
		; CHECK: Loop bb{{.*}} is not rotated!
		define void @notRotated(i32* noalias %arg) {
		bb:
		br label %bb5

		bb5: ; preds = %bb14, %bb
		%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb14 ], [ 0, %bb ]
		%.01 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]
		%exitcond4 = icmp ne i64 %indvars.iv2, 100
		br i1 %exitcond4, label %bb7, label %bb17

		bb7: ; preds = %bb5
		%tmp = add nsw i32 %.01, -3
		%tmp8 = add nuw nsw i64 %indvars.iv2, 3
		%tmp9 = trunc i64 %tmp8 to i32
		%tmp10 = mul nsw i32 %tmp, %tmp9
		%tmp11 = trunc i64 %indvars.iv2 to i32
		%tmp12 = srem i32 %tmp10, %tmp11
		%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2
		store i32 %tmp12, i32* %tmp13, align 4
		br label %bb14

		bb14: ; preds = %bb7
		%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1
		%tmp15 = add nuw nsw i32 %.01, 1
		br label %bb5

		bb17: ; preds = %bb27, %bb5
		%indvars.iv = phi i64 [ %indvars.iv.next, %bb27 ], [ 0, %bb5 ]
		%.0 = phi i32 [ 0, %bb5 ], [ %tmp28, %bb27 ]
		%exitcond = icmp ne i64 %indvars.iv, 100
		br i1 %exitcond, label %bb19, label %bb18

		bb18: ; preds = %bb17
		br label %bb29

		bb19: ; preds = %bb17
		%tmp20 = add nsw i32 %.0, -3
		%tmp21 = add nuw nsw i64 %indvars.iv, 3
		%tmp22 = trunc i64 %tmp21 to i32
		%tmp23 = mul nsw i32 %tmp20, %tmp22
		%tmp24 = trunc i64 %indvars.iv to i32
		%tmp25 = srem i32 %tmp23, %tmp24
		%tmp26 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv
		store i32 %tmp25, i32* %tmp26, align 4
		br label %bb27

		bb27: ; preds = %bb19
		%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
		%tmp28 = add nuw nsw i32 %.0, 1
		br label %bb17

		bb29: ; preds = %bb18
		ret void
		}

llvm/test/Transforms/LoopFusion/diagnostics_missed.ll

	; RUN: opt -S -loop-fusion -pass-remarks-missed=loop-fusion -disable-output < %s 2>&1 | FileCheck %s			; RUN: opt -S -loop-fusion -pass-remarks-missed=loop-fusion -disable-output < %s 2>&1 | FileCheck %s
	;
	target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

				Meinersbur: [nit] can be removed
	@B = common global [1024 x i32] zeroinitializer, align 16, !dbg !0			@B = common global [1024 x i32] zeroinitializer, align 16, !dbg !0

	; CHECK: remark: diagnostics_missed.c:18:3: [non_adjacent]: entry and for.end: Loops are not adjacent			; CHECK: remark: diagnostics_missed.c:18:3: [non_adjacent]: entry and for.end: Loops are not adjacent
	define void @non_adjacent(i32* noalias %A) !dbg !67 {			define void @non_adjacent(i32* noalias %A) !dbg !14 {
	entry:			entry:
	br label %for.cond			br label %for.body

	for.cond: ; preds = %for.inc, %entry
	%i.0 = phi i64 [ 0, %entry ], [ %inc, %for.inc ]
	%exitcond1 = icmp ne i64 %i.0, 100
	br i1 %exitcond1, label %for.body, label %for.cond.cleanup

	for.cond.cleanup: ; preds = %for.cond			for.cond.cleanup: ; preds = %for.inc
	br label %for.end			br label %for.end

	for.body: ; preds = %for.cond			for.body: ; preds = %entry, %for.inc
	%sub = add nsw i64 %i.0, -3			%i.02 = phi i64 [ 0, %entry ], [ %inc, %for.inc ]
	%add = add nuw nsw i64 %i.0, 3			%sub = add nsw i64 %i.02, -3
				%add = add nuw nsw i64 %i.02, 3
	%mul = mul nsw i64 %sub, %add			%mul = mul nsw i64 %sub, %add
	%rem = srem i64 %mul, %i.0			%rem = srem i64 %mul, %i.02
	%conv = trunc i64 %rem to i32			%conv = trunc i64 %rem to i32
	%arrayidx = getelementptr inbounds i32, i32* %A, i64 %i.0			%arrayidx = getelementptr inbounds i32, i32* %A, i64 %i.02
	store i32 %conv, i32* %arrayidx, align 4			store i32 %conv, i32* %arrayidx, align 4
	br label %for.inc			br label %for.inc

	for.inc: ; preds = %for.body			for.inc: ; preds = %for.body
	%inc = add nuw nsw i64 %i.0, 1, !dbg !86			%inc = add nuw nsw i64 %i.02, 1, !dbg !26
	br label %for.cond, !dbg !87, !llvm.loop !88			%exitcond1 = icmp ne i64 %inc, 100
				br i1 %exitcond1, label %for.body, label %for.cond.cleanup, !llvm.loop !28

	for.end: ; preds = %for.cond.cleanup			for.end: ; preds = %for.cond.cleanup
	br label %for.cond2			br label %for.body6

	for.cond2: ; preds = %for.inc13, %for.end			for.cond.cleanup5: ; preds = %for.inc13
	%i1.0 = phi i64 [ 0, %for.end ], [ %inc14, %for.inc13 ]
	%exitcond = icmp ne i64 %i1.0, 100
	br i1 %exitcond, label %for.body6, label %for.cond.cleanup5

	for.cond.cleanup5: ; preds = %for.cond2
	br label %for.end15			br label %for.end15

	for.body6: ; preds = %for.cond2			for.body6: ; preds = %for.end, %for.inc13
	%sub7 = add nsw i64 %i1.0, -3			%i1.01 = phi i64 [ 0, %for.end ], [ %inc14, %for.inc13 ]
	%add8 = add nuw nsw i64 %i1.0, 3			%sub7 = add nsw i64 %i1.01, -3
				%add8 = add nuw nsw i64 %i1.01, 3
	%mul9 = mul nsw i64 %sub7, %add8			%mul9 = mul nsw i64 %sub7, %add8
	%rem10 = srem i64 %mul9, %i1.0			%rem10 = srem i64 %mul9, %i1.01
	%conv11 = trunc i64 %rem10 to i32			%conv11 = trunc i64 %rem10 to i32
	%arrayidx12 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %i1.0			%arrayidx12 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %i1.01
	store i32 %conv11, i32* %arrayidx12, align 4			store i32 %conv11, i32* %arrayidx12, align 4
	br label %for.inc13			br label %for.inc13

	for.inc13: ; preds = %for.body6			for.inc13: ; preds = %for.body6
	%inc14 = add nuw nsw i64 %i1.0, 1, !dbg !100			%inc14 = add nuw nsw i64 %i1.01, 1, !dbg !31
	br label %for.cond2, !dbg !101, !llvm.loop !102			%exitcond = icmp ne i64 %inc14, 100
				br i1 %exitcond, label %for.body6, label %for.cond.cleanup5, !llvm.loop !33

	for.end15: ; preds = %for.cond.cleanup5			for.end15: ; preds = %for.cond.cleanup5
	ret void			ret void
	}			}


	; CHECK: remark: diagnostics_missed.c:28:3: [different_bounds]: entry and for.end: Loop trip counts are not the same			; CHECK: remark: diagnostics_missed.c:28:3: [different_bounds]: entry and for.end: Loop trip counts are not the same
	define void @different_bounds(i32* noalias %A) !dbg !105 {			define void @different_bounds(i32* noalias %A) !dbg !36 {
	entry:			entry:
	br label %for.cond			br label %for.body

	for.cond: ; preds = %for.inc, %entry			for.cond.cleanup: ; preds = %for.inc
	%i.0 = phi i64 [ 0, %entry ], [ %inc, %for.inc ]
	%exitcond1 = icmp ne i64 %i.0, 100
	br i1 %exitcond1, label %for.body, label %for.cond.cleanup

	for.cond.cleanup: ; preds = %for.cond
	br label %for.end			br label %for.end

	for.body: ; preds = %for.cond			for.body: ; preds = %entry, %for.inc
	%sub = add nsw i64 %i.0, -3			%i.02 = phi i64 [ 0, %entry ], [ %inc, %for.inc ]
	%add = add nuw nsw i64 %i.0, 3			%sub = add nsw i64 %i.02, -3
				%add = add nuw nsw i64 %i.02, 3
	%mul = mul nsw i64 %sub, %add			%mul = mul nsw i64 %sub, %add
	%rem = srem i64 %mul, %i.0			%rem = srem i64 %mul, %i.02
	%conv = trunc i64 %rem to i32			%conv = trunc i64 %rem to i32
	%arrayidx = getelementptr inbounds i32, i32* %A, i64 %i.0			%arrayidx = getelementptr inbounds i32, i32* %A, i64 %i.02
	store i32 %conv, i32* %arrayidx, align 4			store i32 %conv, i32* %arrayidx, align 4
	br label %for.inc			br label %for.inc

	for.inc: ; preds = %for.body			for.inc: ; preds = %for.body
	%inc = add nuw nsw i64 %i.0, 1, !dbg !123			%inc = add nuw nsw i64 %i.02, 1, !dbg !43
	br label %for.cond, !dbg !124, !llvm.loop !125			%exitcond1 = icmp ne i64 %inc, 100
				br i1 %exitcond1, label %for.body, label %for.cond.cleanup, !llvm.loop !45

	for.end: ; preds = %for.cond.cleanup			for.end: ; preds = %for.cond.cleanup
	br label %for.cond2			br label %for.body6

	for.cond2: ; preds = %for.inc13, %for.end
	%i1.0 = phi i64 [ 0, %for.end ], [ %inc14, %for.inc13 ]
	%exitcond = icmp ne i64 %i1.0, 200
	br i1 %exitcond, label %for.body6, label %for.cond.cleanup5

	for.cond.cleanup5: ; preds = %for.cond2			for.cond.cleanup5: ; preds = %for.inc13
	br label %for.end15			br label %for.end15

	for.body6: ; preds = %for.cond2			for.body6: ; preds = %for.end, %for.inc13
	%sub7 = add nsw i64 %i1.0, -3			%i1.01 = phi i64 [ 0, %for.end ], [ %inc14, %for.inc13 ]
	%add8 = add nuw nsw i64 %i1.0, 3			%sub7 = add nsw i64 %i1.01, -3
				%add8 = add nuw nsw i64 %i1.01, 3
	%mul9 = mul nsw i64 %sub7, %add8			%mul9 = mul nsw i64 %sub7, %add8
	%rem10 = srem i64 %mul9, %i1.0			%rem10 = srem i64 %mul9, %i1.01
	%conv11 = trunc i64 %rem10 to i32			%conv11 = trunc i64 %rem10 to i32
	%arrayidx12 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %i1.0			%arrayidx12 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %i1.01
	store i32 %conv11, i32* %arrayidx12, align 4			store i32 %conv11, i32* %arrayidx12, align 4
	br label %for.inc13			br label %for.inc13

	for.inc13: ; preds = %for.body6			for.inc13: ; preds = %for.body6
	%inc14 = add nuw nsw i64 %i1.0, 1			%inc14 = add nuw nsw i64 %i1.01, 1
	br label %for.cond2, !dbg !138, !llvm.loop !139			%exitcond = icmp ne i64 %inc14, 200
				br i1 %exitcond, label %for.body6, label %for.cond.cleanup5, !llvm.loop !48

	for.end15: ; preds = %for.cond.cleanup5			for.end15: ; preds = %for.cond.cleanup5
	ret void			ret void
	}			}

	; CHECK: remark: diagnostics_missed.c:38:3: [negative_dependence]: entry and for.end: Loop has a non-empty preheader			; CHECK: remark: diagnostics_missed.c:38:3: [negative_dependence]: entry and for.end: Loop has a non-empty preheader
	define void @negative_dependence(i32* noalias %A) !dbg !142 {			define void @negative_dependence(i32* noalias %A) !dbg !51 {
	entry:			entry:
	br label %for.cond			br label %for.body

	for.cond: ; preds = %for.inc, %entry			for.body: ; preds = %entry, %for.inc
	%indvars.iv1 = phi i64 [ %indvars.iv.next2, %for.inc ], [ 0, %entry ]			%indvars.iv13 = phi i64 [ 0, %entry ], [ %indvars.iv.next2, %for.inc ]
	%exitcond3 = icmp ne i64 %indvars.iv1, 100			%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv13
	br i1 %exitcond3, label %for.body, label %for.end			%tmp = trunc i64 %indvars.iv13 to i32

	for.body: ; preds = %for.cond
	%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv1
	%tmp = trunc i64 %indvars.iv1 to i32
	store i32 %tmp, i32* %arrayidx, align 4			store i32 %tmp, i32* %arrayidx, align 4
	br label %for.inc			br label %for.inc

	for.inc: ; preds = %for.body			for.inc: ; preds = %for.body
	%indvars.iv.next2 = add nuw nsw i64 %indvars.iv1, 1			%indvars.iv.next2 = add nuw nsw i64 %indvars.iv13, 1
	br label %for.cond, !dbg !160, !llvm.loop !161			%exitcond3 = icmp ne i64 %indvars.iv.next2, 100
				br i1 %exitcond3, label %for.body, label %for.end, !llvm.loop !58
	for.end: ; preds = %for.cond
	call void @llvm.dbg.value(metadata i32 0, metadata !147, metadata !DIExpression()), !dbg !163			for.end: ; preds = %for.inc
	br label %for.cond2, !dbg !164			call void @llvm.dbg.value(metadata i32 0, metadata !56, metadata !DIExpression()), !dbg !61
				br label %for.body5
	for.cond2: ; preds = %for.inc10, %for.end
	%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc10 ], [ 0, %for.end ]			for.body5: ; preds = %for.end, %for.inc10
	%exitcond = icmp ne i64 %indvars.iv, 100			%indvars.iv2 = phi i64 [ 0, %for.end ], [ %indvars.iv.next, %for.inc10 ]
	br i1 %exitcond, label %for.body5, label %for.end12			%indvars.iv.next = add nuw nsw i64 %indvars.iv2, 1

	for.body5: ; preds = %for.cond2
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
	%arrayidx7 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv.next			%arrayidx7 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv.next
	%tmp4 = load i32, i32* %arrayidx7, align 4			%tmp4 = load i32, i32* %arrayidx7, align 4
	%mul = shl nsw i32 %tmp4, 1			%mul = shl nsw i32 %tmp4, 1
	%arrayidx9 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv			%arrayidx9 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv2
	store i32 %mul, i32* %arrayidx9, align 4			store i32 %mul, i32* %arrayidx9, align 4
	br label %for.inc10			br label %for.inc10

	for.inc10: ; preds = %for.body5			for.inc10: ; preds = %for.body5
	br label %for.cond2			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %for.body5, label %for.end12

	for.end12: ; preds = %for.cond.			for.end12: ; preds = %for.inc10
	ret void, !dbg !178			ret void, !dbg !62
	}			}

	; CHECK: remark: diagnostics_missed.c:51:3: [sumTest]: entry and for.cond2.preheader: Dependencies prevent fusion			; CHECK: remark: diagnostics_missed.c:51:3: [sumTest]: entry and for.cond2.preheader: Dependencies prevent fusion
	define i32 @sumTest(i32* noalias %A) !dbg !179 {			define i32 @sumTest(i32* noalias %A) !dbg !63 {
	entry:			entry:
	br label %for.cond			br label %for.body

	for.cond: ; preds = %for.inc, %entry			for.cond2.preheader: ; preds = %for.inc
	%indvars.iv1 = phi i64 [ %indvars.iv.next2, %for.inc ], [ 0, %entry ]			br label %for.body5
	%sum.0 = phi i32 [ 0, %entry ], [ %add, %for.inc ]
	%exitcond3 = icmp ne i64 %indvars.iv1, 100
	br i1 %exitcond3, label %for.body, label %for.cond2

	for.body: ; preds = %for.cond			for.body: ; preds = %entry, %for.inc
				%sum.04 = phi i32 [ 0, %entry ], [ %add, %for.inc ]
				%indvars.iv13 = phi i64 [ 0, %entry ], [ %indvars.iv.next2, %for.inc ]
	br label %for.inc			br label %for.inc

	for.inc: ; preds = %for.body			for.inc: ; preds = %for.body
	%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv1			%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv13
	%tmp = load i32, i32* %arrayidx, align 4			%tmp = load i32, i32* %arrayidx, align 4
	%add = add nsw i32 %sum.0, %tmp			%add = add nsw i32 %sum.04, %tmp
	%indvars.iv.next2 = add nuw nsw i64 %indvars.iv1, 1			%indvars.iv.next2 = add nuw nsw i64 %indvars.iv13, 1
	br label %for.cond, !dbg !199, !llvm.loop !200			%exitcond3 = icmp ne i64 %indvars.iv.next2, 100
				br i1 %exitcond3, label %for.body, label %for.cond2.preheader, !llvm.loop !73
	for.cond2: ; preds = %for.inc10, %for.cond
	%sum.0.lcssa = phi i32 [ %sum.0, %for.cond ], [ %sum.0.lcssa, %for.inc10 ]			for.body5: ; preds = %for.cond2.preheader, %for.inc10
	%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc10 ], [ 0, %for.cond ]			%indvars.iv2 = phi i64 [ 0, %for.cond2.preheader ], [ %indvars.iv.next, %for.inc10 ]
	%exitcond = icmp ne i64 %indvars.iv, 100			%arrayidx7 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv2
	br i1 %exitcond, label %for.body5, label %for.end12

	for.body5: ; preds = %for.cond2
	%arrayidx7 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
	%tmp4 = load i32, i32* %arrayidx7, align 4			%tmp4 = load i32, i32* %arrayidx7, align 4
	%div = sdiv i32 %tmp4, %sum.0.lcssa			%div = sdiv i32 %tmp4, %add
	%arrayidx9 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv			%arrayidx9 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv2
	store i32 %div, i32* %arrayidx9, align 4			store i32 %div, i32* %arrayidx9, align 4
	br label %for.inc10			br label %for.inc10

	for.inc10: ; preds = %for.body5			for.inc10: ; preds = %for.body5
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv2, 1
	br label %for.cond2			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %for.body5, label %for.end12

	for.end12: ; preds = %for.cond2			for.end12: ; preds = %for.inc10
	ret i32 %sum.0.lcssa, !dbg !215			ret i32 %add, !dbg !76
	}			}

	declare void @llvm.dbg.value(metadata, metadata, metadata)			; Function Attrs: nounwind readnone speculatable willreturn
				declare void @llvm.dbg.value(metadata, metadata, metadata) #0

				attributes #0 = { nounwind readnone speculatable willreturn }

	!llvm.dbg.cu = !{!2}			!llvm.dbg.cu = !{!2}
	!llvm.module.flags = !{!11, !12, !13, !14}			!llvm.module.flags = !{!10, !11, !12, !13}

	!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())			!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
	!1 = distinct !DIGlobalVariable(name: "B", scope: !2, file: !6, line: 46, type: !7, isLocal: false, isDefinition: true)			!1 = distinct !DIGlobalVariable(name: "B", scope: !2, file: !3, line: 46, type: !6, isLocal: false, isDefinition: true)
	!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 9.0.0 (git@github.ibm.com:compiler/llvm-project.git 23c4baaa9f5b33d2d52eda981d376c6b0a7a3180)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, nameTableKind: GNU)			!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "clang version 9.0.0 (git@github.ibm.com:compiler/llvm-project.git 23c4baaa9f5b33d2d52eda981d376c6b0a7a3180)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !4, globals: !5, nameTableKind: GNU)
	!3 = !DIFile(filename: "diagnostics_missed.c", directory: "/tmp")			!3 = !DIFile(filename: "diagnostics_missed.c", directory: "/tmp")
	!4 = !{}			!4 = !{}
	!5 = !{!0}			!5 = !{!0}
	!6 = !DIFile(filename: "diagnostics_missed.c", directory: "/tmp")			!6 = !DICompositeType(tag: DW_TAG_array_type, baseType: !7, size: 32768, elements: !8)
	!7 = !DICompositeType(tag: DW_TAG_array_type, baseType: !8, size: 32768, elements: !9)			!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
	!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)			!8 = !{!9}
	!9 = !{!10}			!9 = !DISubrange(count: 1024)
	!10 = !DISubrange(count: 1024)			!10 = !{i32 2, !"Dwarf Version", i32 4}
	!11 = !{i32 2, !"Dwarf Version", i32 4}			!11 = !{i32 2, !"Debug Info Version", i32 3}
	!12 = !{i32 2, !"Debug Info Version", i32 3}			!12 = !{i32 1, !"wchar_size", i32 4}
	!13 = !{i32 1, !"wchar_size", i32 4}			!13 = !{i32 7, !"PIC Level", i32 2}
	!14 = !{i32 7, !"PIC Level", i32 2}			!14 = distinct !DISubprogram(name: "non_adjacent", scope: !3, file: !3, line: 17, type: !15, scopeLine: 17, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !19)
	!17 = !DISubroutineType(types: !18)			!15 = !DISubroutineType(types: !16)
	!18 = !{null, !19}			!16 = !{null, !17}
	!19 = !DIDerivedType(tag: DW_TAG_restrict_type, baseType: !20)			!17 = !DIDerivedType(tag: DW_TAG_restrict_type, baseType: !18)
	!20 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !8, size: 64)			!18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64)
	!67 = distinct !DISubprogram(name: "non_adjacent", scope: !6, file: !6, line: 17, type: !17, scopeLine: 17, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !68)			!19 = !{!20, !21, !24}
	!68 = !{!69, !70, !73}			!20 = !DILocalVariable(name: "A", arg: 1, scope: !14, file: !3, line: 17, type: !17)
	!69 = !DILocalVariable(name: "A", arg: 1, scope: !67, file: !6, line: 17, type: !19)			!21 = !DILocalVariable(name: "i", scope: !22, file: !3, line: 18, type: !23)
	!70 = !DILocalVariable(name: "i", scope: !71, file: !6, line: 18, type: !72)			!22 = distinct !DILexicalBlock(scope: !14, file: !3, line: 18, column: 3)
	!71 = distinct !DILexicalBlock(scope: !67, file: !6, line: 18, column: 3)			!23 = !DIBasicType(name: "long int", size: 64, encoding: DW_ATE_signed)
	!72 = !DIBasicType(name: "long int", size: 64, encoding: DW_ATE_signed)			!24 = !DILocalVariable(name: "i", scope: !25, file: !3, line: 22, type: !23)
	!73 = !DILocalVariable(name: "i", scope: !74, file: !6, line: 22, type: !72)			!25 = distinct !DILexicalBlock(scope: !14, file: !3, line: 22, column: 3)
	!74 = distinct !DILexicalBlock(scope: !67, file: !6, line: 22, column: 3)			!26 = !DILocation(line: 18, column: 30, scope: !27)
	!79 = distinct !DILexicalBlock(scope: !71, file: !6, line: 18, column: 3)			!27 = distinct !DILexicalBlock(scope: !22, file: !3, line: 18, column: 3)
	!80 = !DILocation(line: 18, column: 3, scope: !71)			!28 = distinct !{!28, !29, !30}
	!86 = !DILocation(line: 18, column: 30, scope: !79)			!29 = !DILocation(line: 18, column: 3, scope: !22)
	!87 = !DILocation(line: 18, column: 3, scope: !79)			!30 = !DILocation(line: 20, column: 3, scope: !22)
	!88 = distinct !{!88, !80, !89}			!31 = !DILocation(line: 22, column: 30, scope: !32)
	!89 = !DILocation(line: 20, column: 3, scope: !71)			!32 = distinct !DILexicalBlock(scope: !25, file: !3, line: 22, column: 3)
	!93 = distinct !DILexicalBlock(scope: !74, file: !6, line: 22, column: 3)			!33 = distinct !{!33, !34, !35}
	!94 = !DILocation(line: 22, column: 3, scope: !74)			!34 = !DILocation(line: 22, column: 3, scope: !25)
	!100 = !DILocation(line: 22, column: 30, scope: !93)			!35 = !DILocation(line: 24, column: 3, scope: !25)
	!101 = !DILocation(line: 22, column: 3, scope: !93)			!36 = distinct !DISubprogram(name: "different_bounds", scope: !3, file: !3, line: 27, type: !15, scopeLine: 27, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !37)
	!102 = distinct !{!102, !94, !103}			!37 = !{!38, !39, !41}
	!103 = !DILocation(line: 24, column: 3, scope: !74)			!38 = !DILocalVariable(name: "A", arg: 1, scope: !36, file: !3, line: 27, type: !17)
	!105 = distinct !DISubprogram(name: "different_bounds", scope: !6, file: !6, line: 27, type: !17, scopeLine: 27, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !106)			!39 = !DILocalVariable(name: "i", scope: !40, file: !3, line: 28, type: !23)
	!106 = !{!107, !108, !110}			!40 = distinct !DILexicalBlock(scope: !36, file: !3, line: 28, column: 3)
	!107 = !DILocalVariable(name: "A", arg: 1, scope: !105, file: !6, line: 27, type: !19)			!41 = !DILocalVariable(name: "i", scope: !42, file: !3, line: 32, type: !23)
	!108 = !DILocalVariable(name: "i", scope: !109, file: !6, line: 28, type: !72)			!42 = distinct !DILexicalBlock(scope: !36, file: !3, line: 32, column: 3)
	!109 = distinct !DILexicalBlock(scope: !105, file: !6, line: 28, column: 3)			!43 = !DILocation(line: 28, column: 30, scope: !44)
	!110 = !DILocalVariable(name: "i", scope: !111, file: !6, line: 32, type: !72)			!44 = distinct !DILexicalBlock(scope: !40, file: !3, line: 28, column: 3)
	!111 = distinct !DILexicalBlock(scope: !105, file: !6, line: 32, column: 3)			!45 = distinct !{!45, !46, !47}
	!116 = distinct !DILexicalBlock(scope: !109, file: !6, line: 28, column: 3)			!46 = !DILocation(line: 28, column: 3, scope: !40)
	!117 = !DILocation(line: 28, column: 3, scope: !109)			!47 = !DILocation(line: 30, column: 3, scope: !40)
	!123 = !DILocation(line: 28, column: 30, scope: !116)			!48 = distinct !{!48, !49, !50}
	!124 = !DILocation(line: 28, column: 3, scope: !116)			!49 = !DILocation(line: 32, column: 3, scope: !42)
	!125 = distinct !{!125, !117, !126}			!50 = !DILocation(line: 34, column: 3, scope: !42)
	!126 = !DILocation(line: 30, column: 3, scope: !109)			!51 = distinct !DISubprogram(name: "negative_dependence", scope: !3, file: !3, line: 37, type: !15, scopeLine: 37, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !52)
	!130 = distinct !DILexicalBlock(scope: !111, file: !6, line: 32, column: 3)			!52 = !{!53, !54, !56}
	!131 = !DILocation(line: 32, column: 3, scope: !111)			!53 = !DILocalVariable(name: "A", arg: 1, scope: !51, file: !3, line: 37, type: !17)
	!138 = !DILocation(line: 32, column: 3, scope: !130)			!54 = !DILocalVariable(name: "i", scope: !55, file: !3, line: 38, type: !7)
	!139 = distinct !{!139, !131, !140}			!55 = distinct !DILexicalBlock(scope: !51, file: !3, line: 38, column: 3)
	!140 = !DILocation(line: 34, column: 3, scope: !111)			!56 = !DILocalVariable(name: "i", scope: !57, file: !3, line: 42, type: !7)
	!142 = distinct !DISubprogram(name: "negative_dependence", scope: !6, file: !6, line: 37, type: !17, scopeLine: 37, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !143)			!57 = distinct !DILexicalBlock(scope: !51, file: !3, line: 42, column: 3)
	!143 = !{!144, !145, !147}			!58 = distinct !{!58, !59, !60}
	!144 = !DILocalVariable(name: "A", arg: 1, scope: !142, file: !6, line: 37, type: !19)			!59 = !DILocation(line: 38, column: 3, scope: !55)
	!145 = !DILocalVariable(name: "i", scope: !146, file: !6, line: 38, type: !8)			!60 = !DILocation(line: 40, column: 3, scope: !55)
	!146 = distinct !DILexicalBlock(scope: !142, file: !6, line: 38, column: 3)			!61 = !DILocation(line: 0, scope: !57)
	!147 = !DILocalVariable(name: "i", scope: !148, file: !6, line: 42, type: !8)			!62 = !DILocation(line: 45, column: 1, scope: !51)
	!148 = distinct !DILexicalBlock(scope: !142, file: !6, line: 42, column: 3)			!63 = distinct !DISubprogram(name: "sumTest", scope: !3, file: !3, line: 48, type: !64, scopeLine: 48, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !66)
	!153 = distinct !DILexicalBlock(scope: !146, file: !6, line: 38, column: 3)			!64 = !DISubroutineType(types: !65)
	!154 = !DILocation(line: 38, column: 3, scope: !146)			!65 = !{!7, !17}
	!160 = !DILocation(line: 38, column: 3, scope: !153)			!66 = !{!67, !68, !69, !71}
	!161 = distinct !{!161, !154, !162}			!67 = !DILocalVariable(name: "A", arg: 1, scope: !63, file: !3, line: 48, type: !17)
	!162 = !DILocation(line: 40, column: 3, scope: !146)			!68 = !DILocalVariable(name: "sum", scope: !63, file: !3, line: 49, type: !7)
	!163 = !DILocation(line: 0, scope: !148)			!69 = !DILocalVariable(name: "i", scope: !70, file: !3, line: 51, type: !7)
	!164 = !DILocation(line: 42, column: 8, scope: !148)			!70 = distinct !DILexicalBlock(scope: !63, file: !3, line: 51, column: 3)
	!178 = !DILocation(line: 45, column: 1, scope: !142)			!71 = !DILocalVariable(name: "i", scope: !72, file: !3, line: 54, type: !7)
	!179 = distinct !DISubprogram(name: "sumTest", scope: !6, file: !6, line: 48, type: !180, scopeLine: 48, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !182)			!72 = distinct !DILexicalBlock(scope: !63, file: !3, line: 54, column: 3)
	!180 = !DISubroutineType(types: !181)			!73 = distinct !{!73, !74, !75}
	!181 = !{!8, !19}			!74 = !DILocation(line: 51, column: 3, scope: !70)
	!182 = !{!183, !184, !185, !187}			!75 = !DILocation(line: 52, column: 15, scope: !70)
	!183 = !DILocalVariable(name: "A", arg: 1, scope: !179, file: !6, line: 48, type: !19)			!76 = !DILocation(line: 57, column: 3, scope: !63)
	!184 = !DILocalVariable(name: "sum", scope: !179, file: !6, line: 49, type: !8)
	!185 = !DILocalVariable(name: "i", scope: !186, file: !6, line: 51, type: !8)
	!186 = distinct !DILexicalBlock(scope: !179, file: !6, line: 51, column: 3)
	!187 = !DILocalVariable(name: "i", scope: !188, file: !6, line: 54, type: !8)
	!188 = distinct !DILexicalBlock(scope: !179, file: !6, line: 54, column: 3)
	!193 = distinct !DILexicalBlock(scope: !186, file: !6, line: 51, column: 3)
	!194 = !DILocation(line: 51, column: 3, scope: !186)
	!199 = !DILocation(line: 51, column: 3, scope: !193)
	!200 = distinct !{!200, !194, !201}
	!201 = !DILocation(line: 52, column: 15, scope: !186)
	!215 = !DILocation(line: 57, column: 3, scope: !179)

llvm/test/Transforms/LoopFusion/four_loops.ll

	; RUN: opt -S -loop-fusion < %s | FileCheck %s			; RUN: opt -S -loop-fusion < %s | FileCheck %s

	@A = common global [1024 x i32] zeroinitializer, align 16			@A = common global [1024 x i32] zeroinitializer, align 16
	@B = common global [1024 x i32] zeroinitializer, align 16			@B = common global [1024 x i32] zeroinitializer, align 16
	@C = common global [1024 x i32] zeroinitializer, align 16			@C = common global [1024 x i32] zeroinitializer, align 16
	@D = common global [1024 x i32] zeroinitializer, align 16			@D = common global [1024 x i32] zeroinitializer, align 16

	; CHECK: void @dep_free			; CHECK: void @dep_free
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]+]]			; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]+]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %exitcond12, label %[[LOOP1BODY:bb[0-9]+]], label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP1BODY]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP1LATCH:bb[0-9]+]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]+]], label %[[LOOP2BODY]]
	; CHECK: [[LOOP2PREHEADER]]			; CHECK: [[LOOP2BODY]]
	; CHECK: br i1 %exitcond9, label %[[LOOP2HEADER:bb[0-9]+]], label %[[LOOP3PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP2HEADER]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP3PREHEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP3BODY:bb[0-9]+]], label %[[LOOP3BODY]]
	; CHECK: [[LOOP3PREHEADER]]			; CHECK: [[LOOP3BODY]]
	; CHECK: br i1 %exitcond6, label %[[LOOP3HEADER:bb[0-9]+]], label %[[LOOP4PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP3HEADER]]
	; CHECK: br label %[[LOOP3LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP3LATCH:bb[0-9]+]]
	; CHECK: [[LOOP3LATCH]]			; CHECK: [[LOOP3LATCH]]
	; CHECK: br label %[[LOOP4PREHEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP4BODY:bb[0-9]+]], label %[[LOOP4BODY]]
	; CHECK: [[LOOP4PREHEADER]]			; CHECK: [[LOOP4BODY]]
	; CHECK: br i1 %exitcond, label %[[LOOP4HEADER:bb[0-9]+]], label %[[LOOP4EXIT:bb[0-9]+]]
	; CHECK: [[LOOP4EXIT]]
	; CHECK: br label %[[FUNCEXIT:bb[0-9]+]]
	; CHECK: [[LOOP4HEADER]]
	; CHECK: br label %[[LOOP4LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP4LATCH:bb[0-9]+]]
	; CHECK: [[LOOP4LATCH]]			; CHECK: [[LOOP4LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %[[LOOPEXIT:bb[0-9]+]]
	; CHECK: [[FUNCEXIT]]
	; CHECK: ret void			; CHECK: ret void
	define void @dep_free() {			define void @dep_free() {
	bb:			bb:
	br label %bb13			br label %bb15

	bb13: ; preds = %bb22, %bb			bb25.preheader: ; preds = %bb22
	%indvars.iv10 = phi i64 [ %indvars.iv.next11, %bb22 ], [ 0, %bb ]			br label %bb27
	%.0 = phi i32 [ 0, %bb ], [ %tmp23, %bb22 ]
	%exitcond12 = icmp ne i64 %indvars.iv10, 100			bb15: ; preds = %bb, %bb22
	br i1 %exitcond12, label %bb15, label %bb25			%.08 = phi i32 [ 0, %bb ], [ %tmp23, %bb22 ]
				%indvars.iv107 = phi i64 [ 0, %bb ], [ %indvars.iv.next11, %bb22 ]
	bb15: ; preds = %bb13			%tmp = add nsw i32 %.08, -3
	%tmp = add nsw i32 %.0, -3			%tmp16 = add nuw nsw i64 %indvars.iv107, 3
	%tmp16 = add nuw nsw i64 %indvars.iv10, 3
	%tmp17 = trunc i64 %tmp16 to i32			%tmp17 = trunc i64 %tmp16 to i32
	%tmp18 = mul nsw i32 %tmp, %tmp17			%tmp18 = mul nsw i32 %tmp, %tmp17
	%tmp19 = trunc i64 %indvars.iv10 to i32			%tmp19 = trunc i64 %indvars.iv107 to i32
	%tmp20 = srem i32 %tmp18, %tmp19			%tmp20 = srem i32 %tmp18, %tmp19
	%tmp21 = getelementptr inbounds [1024 x i32], [1024 x i32]* @A, i64 0, i64 %indvars.iv10			%tmp21 = getelementptr inbounds [1024 x i32], [1024 x i32]* @A, i64 0, i64 %indvars.iv107
	store i32 %tmp20, i32* %tmp21, align 4			store i32 %tmp20, i32* %tmp21, align 4
	br label %bb22			br label %bb22

	bb22: ; preds = %bb15			bb22: ; preds = %bb15
	%indvars.iv.next11 = add nuw nsw i64 %indvars.iv10, 1			%indvars.iv.next11 = add nuw nsw i64 %indvars.iv107, 1
	%tmp23 = add nuw nsw i32 %.0, 1			%tmp23 = add nuw nsw i32 %.08, 1
	br label %bb13			%exitcond12 = icmp ne i64 %indvars.iv.next11, 100
				br i1 %exitcond12, label %bb15, label %bb25.preheader
	bb25: ; preds = %bb35, %bb13
	%indvars.iv7 = phi i64 [ %indvars.iv.next8, %bb35 ], [ 0, %bb13 ]			bb38.preheader: ; preds = %bb35
	%.01 = phi i32 [ 0, %bb13 ], [ %tmp36, %bb35 ]			br label %bb40
	%exitcond9 = icmp ne i64 %indvars.iv7, 100
	br i1 %exitcond9, label %bb27, label %bb38			bb27: ; preds = %bb25.preheader, %bb35
				%.016 = phi i32 [ 0, %bb25.preheader ], [ %tmp36, %bb35 ]
	bb27: ; preds = %bb25			%indvars.iv75 = phi i64 [ 0, %bb25.preheader ], [ %indvars.iv.next8, %bb35 ]
	%tmp28 = add nsw i32 %.01, -3			%tmp28 = add nsw i32 %.016, -3
	%tmp29 = add nuw nsw i64 %indvars.iv7, 3			%tmp29 = add nuw nsw i64 %indvars.iv75, 3
	%tmp30 = trunc i64 %tmp29 to i32			%tmp30 = trunc i64 %tmp29 to i32
	%tmp31 = mul nsw i32 %tmp28, %tmp30			%tmp31 = mul nsw i32 %tmp28, %tmp30
	%tmp32 = trunc i64 %indvars.iv7 to i32			%tmp32 = trunc i64 %indvars.iv75 to i32
	%tmp33 = srem i32 %tmp31, %tmp32			%tmp33 = srem i32 %tmp31, %tmp32
	%tmp34 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv7			%tmp34 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv75
	store i32 %tmp33, i32* %tmp34, align 4			store i32 %tmp33, i32* %tmp34, align 4
	br label %bb35			br label %bb35

	bb35: ; preds = %bb27			bb35: ; preds = %bb27
	%indvars.iv.next8 = add nuw nsw i64 %indvars.iv7, 1			%indvars.iv.next8 = add nuw nsw i64 %indvars.iv75, 1
	%tmp36 = add nuw nsw i32 %.01, 1			%tmp36 = add nuw nsw i32 %.016, 1
	br label %bb25			%exitcond9 = icmp ne i64 %indvars.iv.next8, 100
				br i1 %exitcond9, label %bb27, label %bb38.preheader
	bb38: ; preds = %bb48, %bb25
	%indvars.iv4 = phi i64 [ %indvars.iv.next5, %bb48 ], [ 0, %bb25 ]			bb51.preheader: ; preds = %bb48
	%.02 = phi i32 [ 0, %bb25 ], [ %tmp49, %bb48 ]			br label %bb53
	%exitcond6 = icmp ne i64 %indvars.iv4, 100
	br i1 %exitcond6, label %bb40, label %bb51			bb40: ; preds = %bb38.preheader, %bb48
				%.024 = phi i32 [ 0, %bb38.preheader ], [ %tmp49, %bb48 ]
	bb40: ; preds = %bb38			%indvars.iv43 = phi i64 [ 0, %bb38.preheader ], [ %indvars.iv.next5, %bb48 ]
	%tmp41 = add nsw i32 %.02, -3			%tmp41 = add nsw i32 %.024, -3
	%tmp42 = add nuw nsw i64 %indvars.iv4, 3			%tmp42 = add nuw nsw i64 %indvars.iv43, 3
	%tmp43 = trunc i64 %tmp42 to i32			%tmp43 = trunc i64 %tmp42 to i32
	%tmp44 = mul nsw i32 %tmp41, %tmp43			%tmp44 = mul nsw i32 %tmp41, %tmp43
	%tmp45 = trunc i64 %indvars.iv4 to i32			%tmp45 = trunc i64 %indvars.iv43 to i32
	%tmp46 = srem i32 %tmp44, %tmp45			%tmp46 = srem i32 %tmp44, %tmp45
	%tmp47 = getelementptr inbounds [1024 x i32], [1024 x i32]* @C, i64 0, i64 %indvars.iv4			%tmp47 = getelementptr inbounds [1024 x i32], [1024 x i32]* @C, i64 0, i64 %indvars.iv43
	store i32 %tmp46, i32* %tmp47, align 4			store i32 %tmp46, i32* %tmp47, align 4
	br label %bb48			br label %bb48

	bb48: ; preds = %bb40			bb48: ; preds = %bb40
	%indvars.iv.next5 = add nuw nsw i64 %indvars.iv4, 1			%indvars.iv.next5 = add nuw nsw i64 %indvars.iv43, 1
	%tmp49 = add nuw nsw i32 %.02, 1			%tmp49 = add nuw nsw i32 %.024, 1
	br label %bb38			%exitcond6 = icmp ne i64 %indvars.iv.next5, 100
				br i1 %exitcond6, label %bb40, label %bb51.preheader
	bb51: ; preds = %bb61, %bb38
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb61 ], [ 0, %bb38 ]
	%.03 = phi i32 [ 0, %bb38 ], [ %tmp62, %bb61 ]
	%exitcond = icmp ne i64 %indvars.iv, 100
	br i1 %exitcond, label %bb53, label %bb52

	bb52: ; preds = %bb51			bb52: ; preds = %bb61
	br label %bb63			br label %bb63

	bb53: ; preds = %bb51			bb53: ; preds = %bb51.preheader, %bb61
	%tmp54 = add nsw i32 %.03, -3			%.032 = phi i32 [ 0, %bb51.preheader ], [ %tmp62, %bb61 ]
	%tmp55 = add nuw nsw i64 %indvars.iv, 3			%indvars.iv1 = phi i64 [ 0, %bb51.preheader ], [ %indvars.iv.next, %bb61 ]
				%tmp54 = add nsw i32 %.032, -3
				%tmp55 = add nuw nsw i64 %indvars.iv1, 3
	%tmp56 = trunc i64 %tmp55 to i32			%tmp56 = trunc i64 %tmp55 to i32
	%tmp57 = mul nsw i32 %tmp54, %tmp56			%tmp57 = mul nsw i32 %tmp54, %tmp56
	%tmp58 = trunc i64 %indvars.iv to i32			%tmp58 = trunc i64 %indvars.iv1 to i32
	%tmp59 = srem i32 %tmp57, %tmp58			%tmp59 = srem i32 %tmp57, %tmp58
	%tmp60 = getelementptr inbounds [1024 x i32], [1024 x i32]* @D, i64 0, i64 %indvars.iv			%tmp60 = getelementptr inbounds [1024 x i32], [1024 x i32]* @D, i64 0, i64 %indvars.iv1
	store i32 %tmp59, i32* %tmp60, align 4			store i32 %tmp59, i32* %tmp60, align 4
	br label %bb61			br label %bb61

	bb61: ; preds = %bb53			bb61: ; preds = %bb53
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
	%tmp62 = add nuw nsw i32 %.03, 1			%tmp62 = add nuw nsw i32 %.032, 1
	br label %bb51			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %bb53, label %bb52

	bb63: ; preds = %bb52			bb63: ; preds = %bb52
	ret void			ret void
	}			}
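
For readers comparing the two columns above: the left side tests the exit condition in the loop header, the right side tests it in the latch on the incremented induction variable. A rough C-level sketch of the two shapes follows. The function names and the loop body are hypothetical and not part of the patch; only the constant trip count of 100 is taken from the test.

```c
#include <stdint.h>

#define N 100 /* constant trip count, as in the test above */

/* Unrotated shape (left column): the exit test sits in the loop header and
   runs before every iteration, including the first one. */
static int64_t sum_top_tested(void) {
    int64_t sum = 0;
    int64_t i = 0;
    while (i != N) { /* header: compare, then branch to body or exit */
        sum += i * 3;
        i += 1;      /* latch: unconditional branch back to the header */
    }
    return sum;
}

/* Rotated shape (right column): the exit test moves to the latch and compares
   the incremented value. No guard is emitted here because N is a nonzero
   constant, so the first iteration always runs. */
static int64_t sum_rotated(void) {
    int64_t sum = 0;
    int64_t i = 0;
    do {
        sum += i * 3;
        i += 1;
    } while (i != N); /* latch: conditional branch to header or exit */
    return sum;
}
```

Both functions execute the body the same number of times; only the position of the compare-and-branch differs.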

llvm/test/Transforms/LoopFusion/loop_nest.ll

	;			;
	@A = common global [1024 x [1024 x i32]] zeroinitializer, align 16			@A = common global [1024 x [1024 x i32]] zeroinitializer, align 16
	@B = common global [1024 x [1024 x i32]] zeroinitializer, align 16			@B = common global [1024 x [1024 x i32]] zeroinitializer, align 16

	; CHECK: void @dep_free			; CHECK: void @dep_free
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]+]]			; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]+]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %exitcond12, label %[[LOOP3PREHEADER:bb[0-9]+.preheader]], label %[[LOOP2HEADER:bb[0-9]+]]
	; CHECK: [[LOOP3PREHEADER]]
	; CHECK: br label %[[LOOP3HEADER:bb[0-9]+]]			; CHECK: br label %[[LOOP3HEADER:bb[0-9]+]]
	; CHECK: [[LOOP3HEADER]]			; CHECK: [[LOOP3HEADER]]
	; CHECK: br i1 %exitcond9, label %[[LOOP3BODY:bb[0-9]+]], label %[[LOOP1LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP3LATCH:bb[0-9]+]]
				; CHECK: [[LOOP3LATCH]]
				; CHECK: br i1 %{{.*}}, label %[[LOOP3HEADER]], label %[[LOOP1LATCH:bb[0-9]+]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2HEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2PREHEADER:bb[0-9]+]], label %[[LOOP2PREHEADER]]
	; CHECK: [[LOOP2HEADER]]			; CHECK: [[LOOP2PREHEADER]]
	; CHECK: br i1 %exitcond6, label %[[LOOP4PREHEADER:bb[0-9]+.preheader]], label %[[LOOP2EXITBLOCK:bb[0-9]+]]
	; CHECK: [[LOOP4PREHEADER]]
	; CHECK: br label %[[LOOP4HEADER:bb[0-9]+]]			; CHECK: br label %[[LOOP4HEADER:bb[0-9]+]]
	; CHECK: [[LOOP2EXITBLOCK]]
	; CHECK-NEXT: br label %[[FUNCEXIT:bb[0-9]+]]
	; CHECK: [[LOOP4HEADER]]			; CHECK: [[LOOP4HEADER]]
	; CHECK: br i1 %exitcond, label %[[LOOP4BODY:bb[0-9]+]], label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP4LATCH:bb[0-9]+]]
				; CHECK: [[LOOP4LATCH]]
				; CHECK: br i1 %{{.*}}, label %[[LOOP4HEADER]], label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %[[LOOP1EXIT:bb[0-9]*]]
	; CHECK: [[FUNCEXIT]]
	; CHECK: ret void			; CHECK: ret void

	; TODO: The current version of loop fusion does not allow the inner loops to be			; TODO: The current version of loop fusion does not allow the inner loops to be
	; fused because they are not control flow equivalent and adjacent. These are			; fused because they are not control flow equivalent and adjacent. These are
	; limitations that can be addressed in future improvements to fusion.			; limitations that can be addressed in future improvements to fusion.
	define void @dep_free() {			define void @dep_free() {
	bb:			bb:
	br label %bb13			br label %bb16

	bb13: ; preds = %bb27, %bb			bb16: ; preds = %bb, %bb27
	%indvars.iv10 = phi i64 [ %indvars.iv.next11, %bb27 ], [ 0, %bb ]			%.06 = phi i32 [ 0, %bb ], [ %tmp28, %bb27 ]
	%.0 = phi i32 [ 0, %bb ], [ %tmp28, %bb27 ]			%indvars.iv105 = phi i64 [ 0, %bb ], [ %indvars.iv.next11, %bb27 ]
	%exitcond12 = icmp ne i64 %indvars.iv10, 100			br label %bb18
	br i1 %exitcond12, label %bb16, label %bb30

	bb16: ; preds = %bb25, %bb13			bb30: ; preds = %bb27
	%indvars.iv7 = phi i64 [ %indvars.iv.next8, %bb25 ], [ 0, %bb13 ]			br label %bb33
	%exitcond9 = icmp ne i64 %indvars.iv7, 100
	br i1 %exitcond9, label %bb18, label %bb27

	bb18: ; preds = %bb16			bb18: ; preds = %bb16, %bb25
	%tmp = add nsw i32 %.0, -3			%indvars.iv74 = phi i64 [ 0, %bb16 ], [ %indvars.iv.next8, %bb25 ]
	%tmp19 = add nuw nsw i64 %indvars.iv10, 3			%tmp = add nsw i32 %.06, -3
				%tmp19 = add nuw nsw i64 %indvars.iv105, 3
	%tmp20 = trunc i64 %tmp19 to i32			%tmp20 = trunc i64 %tmp19 to i32
	%tmp21 = mul nsw i32 %tmp, %tmp20			%tmp21 = mul nsw i32 %tmp, %tmp20
	%tmp22 = trunc i64 %indvars.iv10 to i32			%tmp22 = trunc i64 %indvars.iv105 to i32
	%tmp23 = srem i32 %tmp21, %tmp22			%tmp23 = srem i32 %tmp21, %tmp22
	%tmp24 = getelementptr inbounds [1024 x [1024 x i32]], [1024 x [1024 x i32]]* @A, i64 0, i64 %indvars.iv10, i64 %indvars.iv7			%tmp24 = getelementptr inbounds [1024 x [1024 x i32]], [1024 x [1024 x i32]]* @A, i64 0, i64 %indvars.iv105, i64 %indvars.iv74
	store i32 %tmp23, i32* %tmp24, align 4			store i32 %tmp23, i32* %tmp24, align 4
	br label %bb25			br label %bb25

	bb25: ; preds = %bb18			bb25: ; preds = %bb18
	%indvars.iv.next8 = add nuw nsw i64 %indvars.iv7, 1			%indvars.iv.next8 = add nuw nsw i64 %indvars.iv74, 1
	br label %bb16			%exitcond9 = icmp ne i64 %indvars.iv.next8, 100
				br i1 %exitcond9, label %bb18, label %bb27

	bb27: ; preds = %bb16			bb27: ; preds = %bb25
	%indvars.iv.next11 = add nuw nsw i64 %indvars.iv10, 1			%indvars.iv.next11 = add nuw nsw i64 %indvars.iv105, 1
	%tmp28 = add nuw nsw i32 %.0, 1			%tmp28 = add nuw nsw i32 %.06, 1
	br label %bb13			%exitcond12 = icmp ne i64 %indvars.iv.next11, 100
				br i1 %exitcond12, label %bb16, label %bb30
	bb30: ; preds = %bb45, %bb13
	%indvars.iv4 = phi i64 [ %indvars.iv.next5, %bb45 ], [ 0, %bb13 ]
	%.02 = phi i32 [ 0, %bb13 ], [ %tmp46, %bb45 ]
	%exitcond6 = icmp ne i64 %indvars.iv4, 100
	br i1 %exitcond6, label %bb33, label %bb31

	bb31: ; preds = %bb30			bb33: ; preds = %bb30, %bb45
	br label %bb47			%.023 = phi i32 [ 0, %bb30 ], [ %tmp46, %bb45 ]
				%indvars.iv42 = phi i64 [ 0, %bb30 ], [ %indvars.iv.next5, %bb45 ]
				br label %bb35

	bb33: ; preds = %bb43, %bb30			bb31: ; preds = %bb45
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb43 ], [ 0, %bb30 ]			br label %bb47
	%exitcond = icmp ne i64 %indvars.iv, 100
	br i1 %exitcond, label %bb35, label %bb45

	bb35: ; preds = %bb33			bb35: ; preds = %bb33, %bb43
	%tmp36 = add nsw i32 %.02, -3			%indvars.iv1 = phi i64 [ 0, %bb33 ], [ %indvars.iv.next, %bb43 ]
	%tmp37 = add nuw nsw i64 %indvars.iv4, 3			%tmp36 = add nsw i32 %.023, -3
				%tmp37 = add nuw nsw i64 %indvars.iv42, 3
	%tmp38 = trunc i64 %tmp37 to i32			%tmp38 = trunc i64 %tmp37 to i32
	%tmp39 = mul nsw i32 %tmp36, %tmp38			%tmp39 = mul nsw i32 %tmp36, %tmp38
	%tmp40 = trunc i64 %indvars.iv4 to i32			%tmp40 = trunc i64 %indvars.iv42 to i32
	%tmp41 = srem i32 %tmp39, %tmp40			%tmp41 = srem i32 %tmp39, %tmp40
	%tmp42 = getelementptr inbounds [1024 x [1024 x i32]], [1024 x [1024 x i32]]* @B, i64 0, i64 %indvars.iv4, i64 %indvars.iv			%tmp42 = getelementptr inbounds [1024 x [1024 x i32]], [1024 x [1024 x i32]]* @B, i64 0, i64 %indvars.iv42, i64 %indvars.iv1
	store i32 %tmp41, i32* %tmp42, align 4			store i32 %tmp41, i32* %tmp42, align 4
	br label %bb43			br label %bb43

	bb43: ; preds = %bb35			bb43: ; preds = %bb35
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
	br label %bb33			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %bb35, label %bb45

	bb45: ; preds = %bb33			bb45: ; preds = %bb43
	%indvars.iv.next5 = add nuw nsw i64 %indvars.iv4, 1			%indvars.iv.next5 = add nuw nsw i64 %indvars.iv42, 1
	%tmp46 = add nuw nsw i32 %.02, 1			%tmp46 = add nuw nsw i32 %.023, 1
	br label %bb30			%exitcond6 = icmp ne i64 %indvars.iv.next5, 100
				br i1 %exitcond6, label %bb33, label %bb31

	bb47: ; preds = %bb31			bb47: ; preds = %bb31
	ret void			ret void
	}			}
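
The updated CHECK lines for loop_nest.ll expect the two outer loops to share one body while the inner loops (LOOP3 and LOOP4) still run one after the other, as the TODO in the test notes. A simplified C sketch of that outcome is below. The function names, the `_ref` arrays, and the reduced loop body are assumptions made for illustration; the test itself uses 1024 x 1024 globals and a longer arithmetic chain.

```c
#include <stdint.h>
#include <string.h>

#define N 100

static int32_t A[N][N], B[N][N];         /* written by the fused version */
static int32_t A_ref[N][N], B_ref[N][N]; /* written by the original version */

/* Input shape: two adjacent loop nests. */
static void nests_separate(void) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            A_ref[i][j] = (i - 3) * (i + 3);
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            B_ref[i][j] = (i - 3) * (i + 3);
}

/* Expected shape: the outer loops are fused, the inner loops are not. */
static void nests_outer_fused(void) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) /* inner loop of the first nest */
            A[i][j] = (i - 3) * (i + 3);
        for (int j = 0; j < N; ++j) /* inner loop of the second nest */
            B[i][j] = (i - 3) * (i + 3);
    }
}

/* Returns 1 when both versions leave identical data in the arrays. */
static int nests_agree(void) {
    nests_separate();
    nests_outer_fused();
    return memcmp(A, A_ref, sizeof A) == 0 &&
           memcmp(B, B_ref, sizeof B) == 0;
}
```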

llvm/test/Transforms/LoopFusion/simple.ll

	; RUN: opt -S -loop-fusion < %s \| FileCheck %s			; RUN: opt -S -loop-fusion < %s \| FileCheck %s

	@B = common global [1024 x i32] zeroinitializer, align 16			@B = common global [1024 x i32] zeroinitializer, align 16

	; CHECK: void @dep_free			; CHECK: void @dep_free
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]*]]			; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]*]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP1BODY:bb[0-9]*]], label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP1BODY]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]			; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2HEADER:bb[0-9]+]], label %[[LOOP2HEADER]]
	; CHECK: [[LOOP2PREHEADER]]			; CHECK: [[LOOP2HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]*]], label %[[LOOP2EXIT:bb[0-9]*]]
	; CHECK: [[LOOP2BODY]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %{{.*}}
	; CHECK: ret void			; CHECK: ret void
	define void @dep_free(i32* noalias %arg) {			define void @dep_free(i32* noalias %arg) {
	bb:			bb:
	br label %bb5			br label %bb7

	bb5: ; preds = %bb14, %bb			bb7: ; preds = %bb, %bb14
	%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb14 ], [ 0, %bb ]			%.014 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]
	%.01 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]			%indvars.iv23 = phi i64 [ 0, %bb ], [ %indvars.iv.next3, %bb14 ]
	%exitcond4 = icmp ne i64 %indvars.iv2, 100			%tmp = add nsw i32 %.014, -3
	br i1 %exitcond4, label %bb7, label %bb17			%tmp8 = add nuw nsw i64 %indvars.iv23, 3

	bb7: ; preds = %bb5
	%tmp = add nsw i32 %.01, -3
	%tmp8 = add nuw nsw i64 %indvars.iv2, 3
	%tmp9 = trunc i64 %tmp8 to i32			%tmp9 = trunc i64 %tmp8 to i32
	%tmp10 = mul nsw i32 %tmp, %tmp9			%tmp10 = mul nsw i32 %tmp, %tmp9
	%tmp11 = trunc i64 %indvars.iv2 to i32			%tmp11 = trunc i64 %indvars.iv23 to i32
	%tmp12 = srem i32 %tmp10, %tmp11			%tmp12 = srem i32 %tmp10, %tmp11
	%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2			%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv23
	store i32 %tmp12, i32* %tmp13, align 4			store i32 %tmp12, i32* %tmp13, align 4
	br label %bb14			br label %bb14

	bb14: ; preds = %bb7			bb14: ; preds = %bb7
	%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1			%indvars.iv.next3 = add nuw nsw i64 %indvars.iv23, 1
	%tmp15 = add nuw nsw i32 %.01, 1			%tmp15 = add nuw nsw i32 %.014, 1
	br label %bb5			%exitcond4 = icmp ne i64 %indvars.iv.next3, 100
				br i1 %exitcond4, label %bb7, label %bb17.preheader
	bb17: ; preds = %bb27, %bb5
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb27 ], [ 0, %bb5 ]			bb17.preheader: ; preds = %bb14
	%.0 = phi i32 [ 0, %bb5 ], [ %tmp28, %bb27 ]			br label %bb19
	%exitcond = icmp ne i64 %indvars.iv, 100
	br i1 %exitcond, label %bb19, label %bb18			bb19: ; preds = %bb17.preheader, %bb27
				%.02 = phi i32 [ 0, %bb17.preheader ], [ %tmp28, %bb27 ]
	bb18: ; preds = %bb17			%indvars.iv1 = phi i64 [ 0, %bb17.preheader ], [ %indvars.iv.next, %bb27 ]
	br label %bb29			%tmp20 = add nsw i32 %.02, -3
				%tmp21 = add nuw nsw i64 %indvars.iv1, 3
	bb19: ; preds = %bb17
	%tmp20 = add nsw i32 %.0, -3
	%tmp21 = add nuw nsw i64 %indvars.iv, 3
	%tmp22 = trunc i64 %tmp21 to i32			%tmp22 = trunc i64 %tmp21 to i32
	%tmp23 = mul nsw i32 %tmp20, %tmp22			%tmp23 = mul nsw i32 %tmp20, %tmp22
	%tmp24 = trunc i64 %indvars.iv to i32			%tmp24 = trunc i64 %indvars.iv1 to i32
	%tmp25 = srem i32 %tmp23, %tmp24			%tmp25 = srem i32 %tmp23, %tmp24
	%tmp26 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv			%tmp26 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv1
	store i32 %tmp25, i32* %tmp26, align 4			store i32 %tmp25, i32* %tmp26, align 4
	br label %bb27			br label %bb27

	bb27: ; preds = %bb19			bb27: ; preds = %bb19
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
	%tmp28 = add nuw nsw i32 %.0, 1			%tmp28 = add nuw nsw i32 %.02, 1
	br label %bb17			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %bb19, label %bb18

				bb18: ; preds = %bb27
				br label %bb29

	bb29: ; preds = %bb18			bb29: ; preds = %bb18
	ret void			ret void
	}			}

	; CHECK: void @dep_free_parametric			; CHECK: void @dep_free_parametric
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]*]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1PREHEADER:bb[0-9.a-z]*]], label %[[EXITBLOCK:bb[0-9]*]]
				; CHECK: [[LOOP1PREHEADER]]
				; CHECK: br label %[[LOOP1HEADER:bb[0-9]*]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP1BODY:bb[0-9]*]], label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP1BODY]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]			; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2HEADER:bb[0-9]*]], label %[[LOOP2HEADER]]
	; CHECK: [[LOOP2PREHEADER]]			; CHECK: [[LOOP2HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]*]], label %[[LOOP2EXIT:bb[0-9]*]]
	; CHECK: [[LOOP2BODY]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %[[EXITBLOCK]]
	; CHECK: ret void			; CHECK: ret void
	define void @dep_free_parametric(i32* noalias %arg, i64 %arg2) {			define void @dep_free_parametric(i32* noalias %arg, i64 %arg2) {
	bb:			bb:
	br label %bb3			%tmp3 = icmp slt i64 0, %arg2
				br i1 %tmp3, label %bb5, label %bb15.preheader

	bb3: ; preds = %bb12, %bb			bb5: ; preds = %bb, %bb12
	%.01 = phi i64 [ 0, %bb ], [ %tmp13, %bb12 ]			%.014 = phi i64 [ 0, %bb ], [ %tmp13, %bb12 ]
	%tmp = icmp slt i64 %.01, %arg2			%tmp6 = add nsw i64 %.014, -3
	br i1 %tmp, label %bb5, label %bb15			%tmp7 = add nuw nsw i64 %.014, 3

	bb5: ; preds = %bb3
	%tmp6 = add nsw i64 %.01, -3
	%tmp7 = add nuw nsw i64 %.01, 3
	%tmp8 = mul nsw i64 %tmp6, %tmp7			%tmp8 = mul nsw i64 %tmp6, %tmp7
	%tmp9 = srem i64 %tmp8, %.01			%tmp9 = srem i64 %tmp8, %.014
	%tmp10 = trunc i64 %tmp9 to i32			%tmp10 = trunc i64 %tmp9 to i32
	%tmp11 = getelementptr inbounds i32, i32* %arg, i64 %.01			%tmp11 = getelementptr inbounds i32, i32* %arg, i64 %.014
	store i32 %tmp10, i32* %tmp11, align 4			store i32 %tmp10, i32* %tmp11, align 4
	br label %bb12			br label %bb12

	bb12: ; preds = %bb5			bb12: ; preds = %bb5
	%tmp13 = add nuw nsw i64 %.01, 1			%tmp13 = add nuw nsw i64 %.014, 1
	br label %bb3			%tmp = icmp slt i64 %tmp13, %arg2
				br i1 %tmp, label %bb5, label %bb15.preheader
	bb15: ; preds = %bb25, %bb3
	%.0 = phi i64 [ 0, %bb3 ], [ %tmp26, %bb25 ]			bb15.preheader: ; preds = %bb12, %bb
	%tmp16 = icmp slt i64 %.0, %arg2			%tmp161 = icmp slt i64 0, %arg2
	br i1 %tmp16, label %bb18, label %bb17			br i1 %tmp161, label %bb18, label %bb27

	bb17: ; preds = %bb15			bb18: ; preds = %bb15.preheader, %bb25
	br label %bb27			%.02 = phi i64 [ 0, %bb15.preheader ], [ %tmp26, %bb25 ]
				%tmp19 = add nsw i64 %.02, -3
	bb18: ; preds = %bb15			%tmp20 = add nuw nsw i64 %.02, 3
	%tmp19 = add nsw i64 %.0, -3
	%tmp20 = add nuw nsw i64 %.0, 3
	%tmp21 = mul nsw i64 %tmp19, %tmp20			%tmp21 = mul nsw i64 %tmp19, %tmp20
	%tmp22 = srem i64 %tmp21, %.0			%tmp22 = srem i64 %tmp21, %.02
	%tmp23 = trunc i64 %tmp22 to i32			%tmp23 = trunc i64 %tmp22 to i32
	%tmp24 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.0			%tmp24 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %.02
	store i32 %tmp23, i32* %tmp24, align 4			store i32 %tmp23, i32* %tmp24, align 4
	br label %bb25			br label %bb25

	bb25: ; preds = %bb18			bb25: ; preds = %bb18
	%tmp26 = add nuw nsw i64 %.0, 1			%tmp26 = add nuw nsw i64 %.02, 1
	br label %bb15			%tmp16 = icmp slt i64 %tmp26, %arg2
				br i1 %tmp16, label %bb18, label %bb27

	bb27: ; preds = %bb17			bb27: ; preds = %bb15.preheader, %bb25
	ret void			ret void
	}			}
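
In the parametric case the trip count may be zero, so the rotated form on the right needs the guard `icmp slt i64 0, %arg2` before the first iteration. A minimal C sketch of that shape, assuming a trivial loop body and hypothetical function names:

```c
#include <stdint.h>

/* Unrotated shape: the bound check in the header also covers n <= 0. */
static int64_t count_top_tested(int64_t n) {
    int64_t trips = 0;
    for (int64_t i = 0; i < n; ++i)
        trips += 1;
    return trips;
}

/* Rotated shape: a guard protects the first iteration, and the latch
   re-tests the bound on the incremented value. */
static int64_t count_rotated(int64_t n) {
    int64_t trips = 0;
    if (0 < n) { /* guard, corresponding to the compare in block bb */
        int64_t i = 0;
        do {
            trips += 1;
            i += 1;
        } while (i < n); /* latch test */
    }
    return trips;
}
```

Without the guard, the `do`/`while` body would run once even when `n` is zero or negative.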

	; CHECK: void @raw_only			; CHECK: void @raw_only
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]*]]			; CHECK-NEXT: br label %[[LOOP1HEADER:bb[0-9]*]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP1BODY:bb[0-9]*]], label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP1BODY]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]			; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2HEADER:bb[0-9]+]], label %[[LOOP2HEADER]]
	; CHECK: [[LOOP2PREHEADER]]			; CHECK: [[LOOP2HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]*]], label %[[LOOP2EXIT:bb[0-9]*]]
	; CHECK: [[LOOP2BODY]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %{{.*}}
	; CHECK: ret void			; CHECK: ret void
	define void @raw_only(i32* noalias %arg) {			define void @raw_only(i32* noalias %arg) {
	bb:			bb:
	br label %bb5			br label %bb7

				bb11.preheader: ; preds = %bb9
				br label %bb13

	bb5: ; preds = %bb9, %bb			bb7: ; preds = %bb, %bb9
	%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb9 ], [ 0, %bb ]			%indvars.iv22 = phi i64 [ 0, %bb ], [ %indvars.iv.next3, %bb9 ]
	%exitcond4 = icmp ne i64 %indvars.iv2, 100			%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv22
	br i1 %exitcond4, label %bb7, label %bb11			%tmp8 = trunc i64 %indvars.iv22 to i32

	bb7: ; preds = %bb5
	%tmp = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2
	%tmp8 = trunc i64 %indvars.iv2 to i32
	store i32 %tmp8, i32* %tmp, align 4			store i32 %tmp8, i32* %tmp, align 4
	br label %bb9			br label %bb9

	bb9: ; preds = %bb7			bb9: ; preds = %bb7
	%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1			%indvars.iv.next3 = add nuw nsw i64 %indvars.iv22, 1
	br label %bb5			%exitcond4 = icmp ne i64 %indvars.iv.next3, 100
				br i1 %exitcond4, label %bb7, label %bb11.preheader
	bb11: ; preds = %bb18, %bb5
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb18 ], [ 0, %bb5 ]			bb13: ; preds = %bb11.preheader, %bb18
	%exitcond = icmp ne i64 %indvars.iv, 100			%indvars.iv1 = phi i64 [ 0, %bb11.preheader ], [ %indvars.iv.next, %bb18 ]
	br i1 %exitcond, label %bb13, label %bb19			%tmp14 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv1

	bb13: ; preds = %bb11
	%tmp14 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv
	%tmp15 = load i32, i32* %tmp14, align 4			%tmp15 = load i32, i32* %tmp14, align 4
	%tmp16 = shl nsw i32 %tmp15, 1			%tmp16 = shl nsw i32 %tmp15, 1
	%tmp17 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv			%tmp17 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv1
	store i32 %tmp16, i32* %tmp17, align 4			store i32 %tmp16, i32* %tmp17, align 4
	br label %bb18			br label %bb18

	bb18: ; preds = %bb13			bb18: ; preds = %bb13
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
	br label %bb11			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %bb13, label %bb19

	bb19: ; preds = %bb11			bb19: ; preds = %bb18
	ret void			ret void
	}			}

	; CHECK: void @raw_only_parametric			; CHECK: void @raw_only_parametric
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
				; CHECK: br i1 %{{.*}}, label %[[LOOP1PREHEADER:bb[0-9.a-z]*]], label %[[EXITBLOCK:bb[0-9]*]]
				; CHECK: [[LOOP1PREHEADER]]
	; CHECK: br label %[[LOOP1HEADER:bb[0-9]*]]			; CHECK: br label %[[LOOP1HEADER:bb[0-9]*]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP1BODY:bb[0-9]*]], label %[[LOOP2PREHEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2HEADER:bb[0-9]*]], label %[[LOOP2HEADER]]
	; CHECK: [[LOOP1BODY]]			; CHECK: [[LOOP2HEADER]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %[[EXITBLOCK]]
	; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP2PREHEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]*]], label %[[LOOP2EXIT:bb[0-9]*]]
	; CHECK: [[LOOP2BODY]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]
	; CHECK: ret void			; CHECK: ret void
	define void @raw_only_parametric(i32* noalias %arg, i32 %arg4) {			define void @raw_only_parametric(i32* noalias %arg, i32 %arg4) {
	bb:			bb:
	br label %bb5

	bb5: ; preds = %bb11, %bb
	%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb11 ], [ 0, %bb ]
	%tmp = sext i32 %arg4 to i64			%tmp = sext i32 %arg4 to i64
	%tmp6 = icmp slt i64 %indvars.iv2, %tmp			%tmp64 = icmp sgt i32 %arg4, 0
	br i1 %tmp6, label %bb8, label %bb14			br i1 %tmp64, label %bb8, label %bb23

	bb8: ; preds = %bb5			bb8: ; preds = %bb, %bb8
	%tmp9 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2			%indvars.iv25 = phi i64 [ %indvars.iv.next3, %bb8 ], [ 0, %bb ]
	%tmp10 = trunc i64 %indvars.iv2 to i32			%tmp9 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv25
				%tmp10 = trunc i64 %indvars.iv25 to i32
	store i32 %tmp10, i32* %tmp9, align 4			store i32 %tmp10, i32* %tmp9, align 4
	br label %bb11			%indvars.iv.next3 = add nuw nsw i64 %indvars.iv25, 1
				%tmp6 = icmp slt i64 %indvars.iv.next3, %tmp
	bb11: ; preds = %bb8			br i1 %tmp6, label %bb8, label %bb17
	%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1
	br label %bb5			bb17: ; preds = %bb8, %bb17
				%indvars.iv3 = phi i64 [ %indvars.iv.next, %bb17 ], [ 0, %bb8 ]
	bb14: ; preds = %bb22, %bb5			%tmp18 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv3
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb22 ], [ 0, %bb5 ]
	%tmp13 = sext i32 %arg4 to i64
	%tmp15 = icmp slt i64 %indvars.iv, %tmp13
	br i1 %tmp15, label %bb17, label %bb23

	bb17: ; preds = %bb14
	%tmp18 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv
	%tmp19 = load i32, i32* %tmp18, align 4			%tmp19 = load i32, i32* %tmp18, align 4
	%tmp20 = shl nsw i32 %tmp19, 1			%tmp20 = shl nsw i32 %tmp19, 1
	%tmp21 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv			%tmp21 = getelementptr inbounds [1024 x i32], [1024 x i32]* @B, i64 0, i64 %indvars.iv3
	store i32 %tmp20, i32* %tmp21, align 4			store i32 %tmp20, i32* %tmp21, align 4
	br label %bb22			%indvars.iv.next = add nuw nsw i64 %indvars.iv3, 1
				%tmp15 = icmp slt i64 %indvars.iv.next, %tmp
	bb22: ; preds = %bb17			br i1 %tmp15, label %bb17, label %bb23
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
	br label %bb14

	bb23: ; preds = %bb14			bb23: ; preds = %bb17, %bb
	ret void			ret void
	}			}

	; CHECK: void @forward_dep			; CHECK: void @forward_dep
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK: br label %[[LOOP1HEADER:bb[0-9]*]]			; CHECK: br label %[[LOOP1HEADER:bb[0-9]*]]
	; CHECK: [[LOOP1HEADER]]			; CHECK: [[LOOP1HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP1BODY:bb[0-9]*]], label %[[LOOP2PREHEADER:bb[0-9]+]]
	; CHECK: [[LOOP1BODY]]
	; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]			; CHECK: br label %[[LOOP1LATCH:bb[0-9]*]]
	; CHECK: [[LOOP1LATCH]]			; CHECK: [[LOOP1LATCH]]
	; CHECK: br label %[[LOOP2PREHEADER:bb[0-9]+]]			; CHECK: br i1 %{{.*}}, label %[[LOOP2HEADER:bb[0-9]+]], label %[[LOOP2HEADER]]
	; CHECK: [[LOOP2PREHEADER]]			; CHECK: [[LOOP2HEADER]]
	; CHECK: br i1 %{{.*}}, label %[[LOOP2BODY:bb[0-9]*]], label %[[LOOP2EXIT:bb[0-9]*]]
	; CHECK: [[LOOP2BODY]]
	; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]			; CHECK: br label %[[LOOP2LATCH:bb[0-9]+]]
	; CHECK: [[LOOP2LATCH]]			; CHECK: [[LOOP2LATCH]]
	; CHECK: br label %[[LOOP1HEADER]]			; CHECK: br i1 %{{.*}}, label %[[LOOP1HEADER]], label %{{.*}}
	; CHECK: ret void			; CHECK: ret void
	define void @forward_dep(i32* noalias %arg) {			define void @forward_dep(i32* noalias %arg) {
	bb:			bb:
	br label %bb5			br label %bb7

	bb5: ; preds = %bb14, %bb			bb7: ; preds = %bb, %bb14
	%indvars.iv2 = phi i64 [ %indvars.iv.next3, %bb14 ], [ 0, %bb ]			%.013 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]
	%.01 = phi i32 [ 0, %bb ], [ %tmp15, %bb14 ]			%indvars.iv22 = phi i64 [ 0, %bb ], [ %indvars.iv.next3, %bb14 ]
	%exitcond4 = icmp ne i64 %indvars.iv2, 100			%tmp = add nsw i32 %.013, -3
	br i1 %exitcond4, label %bb7, label %bb17			%tmp8 = add nuw nsw i64 %indvars.iv22, 3

	bb7: ; preds = %bb5
	%tmp = add nsw i32 %.01, -3
	%tmp8 = add nuw nsw i64 %indvars.iv2, 3
	%tmp9 = trunc i64 %tmp8 to i32			%tmp9 = trunc i64 %tmp8 to i32
	%tmp10 = mul nsw i32 %tmp, %tmp9			%tmp10 = mul nsw i32 %tmp, %tmp9
	%tmp11 = trunc i64 %indvars.iv2 to i32			%tmp11 = trunc i64 %indvars.iv22 to i32
	%tmp12 = srem i32 %tmp10, %tmp11			%tmp12 = srem i32 %tmp10, %tmp11
	%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv2			%tmp13 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv22
	store i32 %tmp12, i32* %tmp13, align 4			store i32 %tmp12, i32* %tmp13, align 4
	br label %bb14			br label %bb14

	bb14: ; preds = %bb7			bb14: ; preds = %bb7
	%indvars.iv.next3 = add nuw nsw i64 %indvars.iv2, 1			%indvars.iv.next3 = add nuw nsw i64 %indvars.iv22, 1
	%tmp15 = add nuw nsw i32 %.01, 1			%tmp15 = add nuw nsw i32 %.013, 1
	br label %bb5			%exitcond4 = icmp ne i64 %indvars.iv.next3, 100
				br i1 %exitcond4, label %bb7, label %bb19
	bb17: ; preds = %bb25, %bb5
	%indvars.iv = phi i64 [ %indvars.iv.next, %bb25 ], [ 0, %bb5 ]			bb19: ; preds = %bb14, %bb25
	%exitcond = icmp ne i64 %indvars.iv, 100			%indvars.iv1 = phi i64 [ 0, %bb14 ], [ %indvars.iv.next, %bb25 ]
	br i1 %exitcond, label %bb19, label %bb26			%tmp20 = add nsw i64 %indvars.iv1, -3

	bb19: ; preds = %bb17
	%tmp20 = add nsw i64 %indvars.iv, -3
	%tmp21 = getelementptr inbounds i32, i32* %arg, i64 %tmp20			%tmp21 = getelementptr inbounds i32, i32* %arg, i64 %tmp20
	%tmp22 = load i32, i32* %tmp21, align 4			%tmp22 = load i32, i32* %tmp21, align 4
	%tmp23 = mul nsw i32 %tmp22, 3			%tmp23 = mul nsw i32 %tmp22, 3
	%tmp24 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv			%tmp24 = getelementptr inbounds i32, i32* %arg, i64 %indvars.iv1
	store i32 %tmp23, i32* %tmp24, align 4			store i32 %tmp23, i32* %tmp24, align 4
	br label %bb25			br label %bb25

	bb25: ; preds = %bb19			bb25: ; preds = %bb19
	%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1			%indvars.iv.next = add nuw nsw i64 %indvars.iv1, 1
	br label %bb17			%exitcond = icmp ne i64 %indvars.iv.next, 100
				br i1 %exitcond, label %bb19, label %bb26

	bb26: ; preds = %bb17			bb26: ; preds = %bb25
	ret void			ret void
	}			}
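
The forward_dep test is expected to fuse because the second loop only reads `arg[i - 3]`, a location that every earlier fused iteration has already finished writing. A small C sketch of why the fused order gives the same result is below. The names, the `PAD` offset, and the unsigned element type are assumptions made here to keep the example well defined; the test itself uses `i32` and a longer arithmetic chain in the first loop.

```c
#include <stdint.h>
#include <string.h>

#define N 100
#define PAD 3 /* hypothetical padding so that arg[i - 3] stays in bounds */

/* Original shape: the first loop writes arg[i], then the second loop reads
   arg[i - 3] and writes arg[i]. */
static void forward_dep_separate(uint32_t *arg) {
    for (int i = 0; i < N; ++i)
        arg[i] = (uint32_t)i * 2u + 1u;
    for (int i = 0; i < N; ++i)
        arg[i] = arg[i - 3] * 3u;
}

/* Fused shape: both bodies run in the same iteration. The value read from
   arg[i - 3] was finalized three iterations earlier, so the dependence
   still flows forward. */
static void forward_dep_fused(uint32_t *arg) {
    for (int i = 0; i < N; ++i) {
        arg[i] = (uint32_t)i * 2u + 1u;
        arg[i] = arg[i - 3] * 3u;
    }
}

/* Returns 1 when both versions produce identical buffers. */
static int forward_dep_agree(void) {
    uint32_t a[N + PAD] = {1, 2, 3};
    uint32_t b[N + PAD] = {1, 2, 3};
    forward_dep_separate(a + PAD);
    forward_dep_fused(b + PAD);
    return memcmp(a, b, sizeof a) == 0;
}
```

Unsigned arithmetic is used so that the repeated multiplication wraps instead of overflowing a signed type.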