This is an archive of the discontinued LLVM Phabricator instance.

Differential D37463

Fix miscompile in LoopSink pass
Abandoned · Public

Authored by DaniilSuchkov on Sep 5 2017, 3:38 AM.

Details

Reviewers

danielcdh
trentxintong
mkazantsev
reames

Summary

llvm::canSinkOrHoistInst allowed non-invariant loads to be sunk into loops, but this is illegal because it can introduce new data races. For example:

b = *p;
for (int i = 0; i < N; i++)
  a[i] = b;

Assuming a is a thread local value, it should always contain a single value. If it were to contain different values at different indices, that would be a miscompile.

Diff Detail

Repository: rL LLVM

Event Timeline

DaniilSuchkov created this revision. Sep 5 2017, 3:38 AM

Thanks for raising the issue.

But I'm not sure if this is the right fix

this disables sinking of all load instructions unless it's a constant load, or has invariant load metadata
not sure why SafetyInfo would help identify the potential data race here

Additionally, regarding the original testcase: the data race exists even without the LoopSink optimization, because the load outside the loop is not guarded by any lock. I'm not a C++ standard expert and am not sure whether a data race like this is considered undefined behavior (if it is, then it's legal for the compiler to sink the load into the loop and break the "everything in a[] should be the same" assumption)

Put it another way, is it legal to hoist the load for the following load with potential racy condition?

for (i = 0; i < N; ++i)
  a[i] = *p;

In C/C++, your example has undefined behavior. From the C++ standard: "The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior."

In D37463#861162, @danielcdh wrote:

not sure why SafetyInfo would help identify the potential data race here

From comment to canSinkOrHoistInst:

/// If SafetyInfo is null, we are checking for sinking instructions from
/// preheader to loop body (no speculation).

So I'm just checking if we are going to sink load into loop.

Put it another way, is it legal to hoist the load for the following load with potential racy condition?

for (i = 0; i < N; ++i)
  a[i] = *p;

Yes. "No race" is one of the possible cases of a data race.

I need to read what the C++ specification says about this particular issue, but basically LLVM is not only used to compile C++. This situation can be illegal in other languages (again, need to dig more through specifications). My proposal is to add an option that prohibits this transform and set it to false by default, with ability to turn it off for languages where it is prohibited.

Actually the answer is here: https://llvm.org/docs/Atomics.html#notatomic

NotAtomic is the obvious, a load or store which is not atomic. (This isn’t really a level of atomicity, but is listed here for comparison.) This is essentially a regular load or store.
...
Notes for optimizers:
Introducing loads to shared variables along a codepath where they would not otherwise exist is allowed; introducing stores to shared variables is not.

So, this behavior of LoopSink is not a bug.

Actually, there might be a bug here in the handling of "unordered" atomic loads... I haven't verified, but we might need to special-case them in LoopSink.

LoopSink uses canSinkOrHoistInst from LICM to check whether instruction can be sunk or not. This check rejects all loads with isUnordered() == true (see http://llvm.org/docs/Atomics.html#atomics-and-ir-optimization). It looks like "unordered" atomics are handled in a right way: they are never sunk into loops.

In D37463#861838, @mkazantsev wrote:

I need to read what the C++ specification says about this particular issue, but basically LLVM is not only used to compile C++. This situation can be illegal in other languages (again, need to dig more through specifications). My proposal is to add an option that prohibits this transform and set it to false by default, with ability to turn it off for languages where it is prohibited.

LLVM has a memory model.
https://llvm.org/docs/LangRef.html#memory-model-for-concurrent-operations

If this violates LLVM's memory model, it's a bug.
If it doesn't, it should not be turned off, and an other-language frontend needs to make sure the code it generates is correct in LLVM's memory model.
(or of course, start a discussion about changing the memory model)

In D37463#864419, @dberlin wrote:

LLVM has a memory model.
https://llvm.org/docs/LangRef.html#memory-model-for-concurrent-operations

If this violates LLVM's memory model, it's a bug.
If it doesn't, it should not be turned off, and an other-language frontend needs to make sure the code it generates is correct in LLVM's memory model.
(or of course, start a discussion about changing the memory model)

Strongly agreed with this. Neither the C++ memory model nor the Java memory model is directly relevant to this discussion. We should focus on the LLVM memory model exclusively when defining LLVM bugs and revise the LLVM memory model if needed to support one of our frontend languages.

Reading through the discussion, it sounds like we have correctly concluded there was no bug here (at least for standard non-atomic loads and stores). Reading through the code, it also looks like ordered atomics are handled conservatively and not sunk. I think there might be some further investigation needed for unordered atomics, but I will take that offline with Daniil before returning to this thread.

In D37463#867451, @reames wrote:

I think there might be some further investigation needed for unordered atomics, but I will take that offline with Daniil before returning to this thread.

Returning to this point, the conclusion I would reach reviewing our Atomics documentation is that sinking an unordered atomic into a loop must be disallowed, because doing so duplicates the load, which is disallowed.

The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here's the full text of the section under discussion:

Notes for optimizers

In terms of the optimizer, this *prohibits any transformation that transforms a single load into multiple loads*, transforms a store into multiple stores, narrows a store, or stores a value which would not be stored otherwise. Some examples of unsafe optimizations are narrowing an assignment into a bitfield, *rematerializing a load*, and turning loads and stores into a memcpy call. Reordering unordered operations is safe, though, and optimizers should take advantage of that because unordered operations are common in languages that need them.

The important bits are in *bold*. The first highlighted clause could be interpreted to be a weaker constraint than it first seems. In particular, it could be read as only disallowing *splitting* a load into multiple pieces, and not disallowing *duplicating* a load. Internally, several folks read it that way. I believe that is an incorrect interpretation and that duplication must also be disallowed by the text. My reasoning is as follows:

Sinking a load into the header of a loop with a trip count > 1 increases the number of times the load is run dynamically.

Rematerialization (which is explicitly listed as a disallowed transformation) is the insertion of loads by the register allocator at use sites. LoopSinking and rematerialization end up producing *exactly the same code* on the following example:

L = load atomic %p
while (foo()) {
  call_which_clobbers_all_regs();
  use L
}

Both transforms would want to produce:

while (foo()) {
  call_which_clobbers_all_regs();
  L = load atomic %p
  use L
}

If this were legal for LoopSink, then it would also be legal for rematerialization. And it's definitely not, as explicitly stated in the text. As such, it must be the case that LoopSink is disallowed for this case as well, which requires the broader interpretation of the initial sentence.

Given this, I must conclude that LoopSink is currently buggy as we do sink unordered atomic loads into loops.

In D37463#882487, @reames wrote:

Given this, I must conclude that LoopSink is currently buggy as we do sink unordered atomic loads into loops.

I agree. Sinking the load into the loop is equivalent to rematerialization.

Revision Contents

Path	Size
lib/Transforms/Scalar/LICM.cpp	11 lines
test/Transforms/LICM/loopsink.ll	57 lines
test/Transforms/LICM/sink.ll	19 lines

Diff 113818

lib/Transforms/Scalar/LICM.cpp

[593 lines not shown]	if (AA->pointsToConstantMemory(LI->getOperand(0)))
return true;		return true;
if (LI->getMetadata(LLVMContext::MD_invariant_load))		if (LI->getMetadata(LLVMContext::MD_invariant_load))
return true;		return true;

// This checks for an invariant.start dominating the load.		// This checks for an invariant.start dominating the load.
if (isLoadInvariantInLoop(LI, DT, CurLoop))		if (isLoadInvariantInLoop(LI, DT, CurLoop))
return true;		return true;

		// Don't sink non-invariant loads into loops because it can introduce new
		// data races. For example:
		// b = *p;
		// for (int i = 0; i < N; i++)
		// a[i] = b;
		// Assuming `a` is a thread local value, it should always contain a single
		// value. If it were to contain different values at different indices, that
		// would be a miscompile.
		if (!SafetyInfo)
		return false;

// Don't hoist loads which have may-aliased stores in loop.		// Don't hoist loads which have may-aliased stores in loop.
uint64_t Size = 0;		uint64_t Size = 0;
if (LI->getType()->isSized())		if (LI->getType()->isSized())
Size = I.getModule()->getDataLayout().getTypeStoreSize(LI->getType());		Size = I.getModule()->getDataLayout().getTypeStoreSize(LI->getType());

AAMDNodes AAInfo;		AAMDNodes AAInfo;
LI->getAAMetadata(AAInfo);		LI->getAAMetadata(AAInfo);

[805 lines not shown]

test/Transforms/LICM/loopsink.ll

; RUN: opt -S -loop-sink < %s \| FileCheck %s		; RUN: opt -S -loop-sink < %s \| FileCheck %s
; RUN: opt -S -passes=loop-sink < %s \| FileCheck %s		; RUN: opt -S -aa-pipeline=basic-aa -passes=loop-sink < %s \| FileCheck %s

@g = global i32 0, align 4		@g = global i32 0, align 4
		@g_const = constant i32 1, align 4

; b1		; b1
; / \		; / \
; b2 b6		; b2 b6
; / \ \|		; / \ \|
; b3 b4 \|		; b3 b4 \|
; \ / \|		; \ / \|
; b5 \|		; b5 \|
; \ /		; \ /
; b7		; b7
; preheader: 1000		; preheader: 1000
; b2: 15		; b2: 15
; b3: 7		; b3: 7
; b4: 7		; b4: 7
; Sink load to b2		; Sink add to b2. Don't sink load because it's illegal to sink non-invariant
		; loads into loops.
; CHECK: t1		; CHECK: t1
; CHECK: .b2:		; CHECK: .b2:
; CHECK: load i32, i32* @g		; CHECK: add i32 %global, 1
; CHECK: .b3:
; CHECK-NOT: load i32, i32* @g		; CHECK-NOT: load i32, i32* @g
		; CHECK: .b3:
		; CHECK-NOT: add i32 %global, 1
define i32 @t1(i32, i32) #0 !prof !0 {		define i32 @t1(i32, i32) #0 !prof !0 {
%3 = icmp eq i32 %1, 0		%3 = icmp eq i32 %1, 0
br i1 %3, label %.exit, label %.preheader		br i1 %3, label %.exit, label %.preheader

.preheader:		.preheader:
%invariant = load i32, i32* @g		%global = load i32, i32* @g
		%invariant = add i32 %global, 1
br label %.b1		br label %.b1

.b1:		.b1:
%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]		%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]
%c1 = icmp sgt i32 %iv, %0		%c1 = icmp sgt i32 %iv, %0
br i1 %c1, label %.b2, label %.b6, !prof !1		br i1 %c1, label %.b2, label %.b6, !prof !1

.b2:		.b2:
[35 lines not shown]
; \ / \|		; \ / \|
; b5 \|		; b5 \|
; \ /		; \ /
; b7		; b7
; preheader: 500		; preheader: 500
; b1: 16016		; b1: 16016
; b3: 8		; b3: 8
; b6: 8		; b6: 8
; Sink load to b3 and b6		; Sink add to b3 and b6. Don't sink load because it's illegal to sink
		; non-invariant loads into loops.
; CHECK: t2		; CHECK: t2
; CHECK: .preheader:		; CHECK: .preheader:
; CHECK-NOT: load i32, i32* @g		; CHECK-NOT: add i32 %global, 1
; CHECK: .b3:		; CHECK: .b3:
; CHECK: load i32, i32* @g		; CHECK-NOT: load i32, i32* @g
		; CHECK: add i32 %global, 1
; CHECK: .b4:		; CHECK: .b4:
; CHECK: .b6:		; CHECK: .b6:
; CHECK: load i32, i32* @g		; CHECK-NOT: load i32, i32* @g
		; CHECK: add i32 %global, 1
; CHECK: .b7:		; CHECK: .b7:
define i32 @t2(i32, i32) #0 !prof !0 {		define i32 @t2(i32, i32) #0 !prof !0 {
%3 = icmp eq i32 %1, 0		%3 = icmp eq i32 %1, 0
br i1 %3, label %.exit, label %.preheader		br i1 %3, label %.exit, label %.preheader

.preheader:		.preheader:
%invariant = load i32, i32* @g		%global = load i32, i32* @g
		%invariant = add i32 %global, 1
br label %.b1		br label %.b1

.b1:		.b1:
%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]		%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]
%c1 = icmp sgt i32 %iv, %0		%c1 = icmp sgt i32 %iv, %0
br i1 %c1, label %.b2, label %.b6, !prof !2		br i1 %c1, label %.b2, label %.b6, !prof !2

.b2:		.b2:
[34 lines not shown]
; b3 b4 \|		; b3 b4 \|
; \ / \|		; \ / \|
; b5 \|		; b5 \|
; \ /		; \ /
; b7		; b7
; preheader: 500		; preheader: 500
; b3: 8		; b3: 8
; b5: 16008		; b5: 16008
; Do not sink load from preheader.		; Do not sink add from preheader.
; CHECK: t3		; CHECK: t3
; CHECK: .preheader:		; CHECK: .preheader:
; CHECK: load i32, i32* @g		; CHECK: load i32, i32* @g
		; CHECK: add i32 %global, 1
; CHECK: .b1:		; CHECK: .b1:
; CHECK-NOT: load i32, i32* @g		; CHECK-NOT: load i32, i32* @g
		; CHECK-NOT: add i32 %global, 1
define i32 @t3(i32, i32) #0 !prof !0 {		define i32 @t3(i32, i32) #0 !prof !0 {
%3 = icmp eq i32 %1, 0		%3 = icmp eq i32 %1, 0
br i1 %3, label %.exit, label %.preheader		br i1 %3, label %.exit, label %.preheader

.preheader:		.preheader:
%invariant = load i32, i32* @g		%global = load i32, i32* @g
		%invariant = add i32 %global, 1
br label %.b1		br label %.b1

.b1:		.b1:
%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]		%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]
%c1 = icmp sgt i32 %iv, %0		%c1 = icmp sgt i32 %iv, %0
br i1 %c1, label %.b2, label %.b6, !prof !2		br i1 %c1, label %.b2, label %.b6, !prof !2

.b2:		.b2:
[22 lines not shown]	.b7:
%t7 = add nuw nsw i32 %iv, 1		%t7 = add nuw nsw i32 %iv, 1
%c7 = icmp eq i32 %t7, %p7		%c7 = icmp eq i32 %t7, %p7
br i1 %c7, label %.b1, label %.exit, !prof !3		br i1 %c7, label %.b1, label %.exit, !prof !3

.exit:		.exit:
ret i32 10		ret i32 10
}		}

; For single-BB loop with <=1 avg trip count, sink load to b1		; For single-BB loop with <=1 avg trip count, sink add to b1.
		; Don't sink load because it's illegal to sink non-invariant loads into loops.
; CHECK: t4		; CHECK: t4
; CHECK: .preheader:		; CHECK: .preheader:
; CHECK-not: load i32, i32* @g
; CHECK: .b1:
; CHECK: load i32, i32* @g		; CHECK: load i32, i32* @g
		; CHECK-NOT: add i32 %global, 1
		; CHECK: .b1:
		; CHECK-NOT: load i32, i32* @g
		; CHECK: add i32 %global, 1
; CHECK: .exit:		; CHECK: .exit:
define i32 @t4(i32, i32) #0 !prof !0 {		define i32 @t4(i32, i32) #0 !prof !0 {
.preheader:		.preheader:
%invariant = load i32, i32* @g		%global = load i32, i32* @g
		%invariant = add i32 %global, 1
br label %.b1		br label %.b1

.b1:		.b1:
%iv = phi i32 [ %t1, %.b1 ], [ 0, %.preheader ]		%iv = phi i32 [ %t1, %.b1 ], [ 0, %.preheader ]
%t1 = add nsw i32 %invariant, %iv		%t1 = add nsw i32 %invariant, %iv
%c1 = icmp sgt i32 %iv, %0		%c1 = icmp sgt i32 %iv, %0
br i1 %c1, label %.b1, label %.exit, !prof !1		br i1 %c1, label %.b1, label %.exit, !prof !1

[9 lines not shown]
; \ / \|		; \ / \|
; b5 \|		; b5 \|
; \ /		; \ /
; b7		; b7
; preheader: 1000		; preheader: 1000
; b2: 15		; b2: 15
; b3: 7		; b3: 7
; b4: 7		; b4: 7
; There is alias store in loop, do not sink load		; Sink load from constant memory.
; CHECK: t5		; CHECK: t5
; CHECK: .preheader:		; CHECK: .preheader:
; CHECK: load i32, i32* @g		; CHECK-NOT: load i32, i32* @g_const
; CHECK: .b1:		; CHECK: .b2:
; CHECK-NOT: load i32, i32* @g		; CHECK: load i32, i32* @g_const
define i32 @t5(i32, i32*) #0 !prof !0 {		define i32 @t5(i32, i32*) #0 !prof !0 {
%3 = icmp eq i32 %0, 0		%3 = icmp eq i32 %0, 0
br i1 %3, label %.exit, label %.preheader		br i1 %3, label %.exit, label %.preheader

.preheader:		.preheader:
%invariant = load i32, i32* @g		%invariant = load i32, i32* @g_const
br label %.b1		br label %.b1

.b1:		.b1:
%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]		%iv = phi i32 [ %t7, %.b7 ], [ 0, %.preheader ]
%c1 = icmp sgt i32 %iv, %0		%c1 = icmp sgt i32 %iv, %0
br i1 %c1, label %.b2, label %.b6, !prof !1		br i1 %c1, label %.b2, label %.b6, !prof !1

.b2:		.b2:
[36 lines not shown]

test/Transforms/LICM/sink.ll

	; RUN: opt -S -licm < %s \| FileCheck %s --check-prefix=CHECK-LICM			; RUN: opt -S -licm < %s \| FileCheck %s --check-prefix=CHECK-LICM
	; RUN: opt -S -licm < %s \| opt -S -loop-sink \| FileCheck %s --check-prefix=CHECK-SINK			; RUN: opt -S -licm < %s \| opt -S -loop-sink \| FileCheck %s --check-prefix=CHECK-SINK
	; RUN: opt -S < %s -passes='require<opt-remark-emit>,loop(licm),loop-sink' \			; RUN: opt -S < %s -passes='require<opt-remark-emit>,loop(licm),loop-sink' \
	; RUN: \| FileCheck %s --check-prefix=CHECK-SINK			; RUN: \| FileCheck %s --check-prefix=CHECK-SINK

	; Original source code:			; Original source code:
	; int g;			; int g;
	; int foo(int p, int x) {			; int foo(int p, int x) {
	; for (int i = 0; i != x; i++)			; for (int i = 0; i != x; i++)
	; if (__builtin_expect(i == p, 0)) {			; if (__builtin_expect(i == p, 0)) {
	; x += g; x *= g;			; x += g; x *= g;
	; }			; }
	; return x;			; return x;
	; }			; }
	;			;
	; Load of global value g should not be hoisted to preheader.			; Computations depending on the load of global value @g should not be hoisted to
				; preheader. Though load of @g will be hoisted by LICM and should not be sunk
				; back by LoopSink because sinking of non-invariant loads into loops is illegal:
				; it can introduce new data races. For example:
				; b = *p;
				; for (int i = 0; i < N; i++)
				; a[i] = b;
				; Assuming `a` is a thread local value, it should always contain a single
				; value. If it were to contain different values at different indices, that
				; would be a miscompile.


	@g = global i32 0, align 4			@g = global i32 0, align 4

	define i32 @foo(i32, i32) #0 !prof !2 {			define i32 @foo(i32, i32) #0 !prof !2 {
	%3 = icmp eq i32 %1, 0			%3 = icmp eq i32 %1, 0
	br i1 %3, label %._crit_edge, label %.lr.ph.preheader			br i1 %3, label %._crit_edge, label %.lr.ph.preheader

	.lr.ph.preheader:			.lr.ph.preheader:
	br label %.lr.ph			br label %.lr.ph

	; CHECK-LICM: .lr.ph.preheader:			; CHECK-LICM: .lr.ph.preheader:
	; CHECK-LICM: load i32, i32* @g			; CHECK-LICM: load i32, i32* @g
				; CHECK-LICM: add nsw i32 %glob, 1
	; CHECK-LICM: br label %.lr.ph			; CHECK-LICM: br label %.lr.ph

	.lr.ph:			.lr.ph:
	%.03 = phi i32 [ %8, %.combine ], [ 0, %.lr.ph.preheader ]			%.03 = phi i32 [ %8, %.combine ], [ 0, %.lr.ph.preheader ]
	%.012 = phi i32 [ %.1, %.combine ], [ %1, %.lr.ph.preheader ]			%.012 = phi i32 [ %.1, %.combine ], [ %1, %.lr.ph.preheader ]
	%4 = icmp eq i32 %.03, %0			%4 = icmp eq i32 %.03, %0
	br i1 %4, label %.then, label %.combine, !prof !1			br i1 %4, label %.then, label %.combine, !prof !1

	.then:			.then:
	%5 = load i32, i32* @g, align 4			%glob = load i32, i32* @g, align 4
				%5 = add nsw i32 %glob, 1
	%6 = add nsw i32 %5, %.012			%6 = add nsw i32 %5, %.012
	%7 = mul nsw i32 %6, %5			%7 = mul nsw i32 %6, %5
	br label %.combine			br label %.combine

	; CHECK-SINK: .then:			; CHECK-SINK: .then:
	; CHECK-SINK: load i32, i32* @g			; CHECK-SINK-NOT: load i32, i32* @g
				; CHECK-SINK: add nsw i32 %glob, 1
	; CHECK-SINK: br label %.combine			; CHECK-SINK: br label %.combine

	.combine:			.combine:
	%.1 = phi i32 [ %7, %.then ], [ %.012, %.lr.ph ]			%.1 = phi i32 [ %7, %.then ], [ %.012, %.lr.ph ]
	%8 = add nuw nsw i32 %.03, 1			%8 = add nuw nsw i32 %.03, 1
	%9 = icmp eq i32 %8, %.1			%9 = icmp eq i32 %8, %.1
	br i1 %9, label %._crit_edge.loopexit, label %.lr.ph			br i1 %9, label %._crit_edge.loopexit, label %.lr.ph

	[11 lines not shown]