This is an archive of the discontinued LLVM Phabricator instance.

Differential D105802

[LoopFlatten] Fix missed LoopFlatten opportunity
Closed, Public

Authored by RosieSumpter on Jul 12 2021, 2:12 AM.

Details

Reviewers

SjoerdMeijer
dmgreen
Whitney
eopXD
fhahn

Commits

rGf117ed542fd2: [LoopFlatten] Fix missed LoopFlatten opportunity
rG2df8bf9339e4: [LoopFlatten] Fix missed LoopFlatten opportunity

Summary

When the limit of the inner loop is a known integer, the InstCombine
pass now performs a transformation such as icmp ult i32 %inc, tripcount ->
icmp ult i32 %j, tripcount-step (where %j is the inner loop induction
variable and %inc is add %j, step). This patch accounts for that form when
identifying the trip count of the loop. The rewritten compare is also an
acceptable use of %j (provided the step is 1), so it is ignored as long as
the compare is also the condition of the inner branch.
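
The equivalence the summary relies on can be checked with a small standalone sketch. It is not LLVM code: the function names are hypothetical, the step is fixed at 1, and the loops are modelled as bottom-tested loops over 32-bit unsigned values, mirroring the `icmp ult` forms above.

```cpp
#include <cassert>
#include <cstdint>

// Trip count when the latch compares the incremented value:
//   %inc = add %j, 1 ; %cmp = icmp ult i32 %inc, limit
uint32_t tripCountIncForm(uint32_t limit) {
  assert(limit >= 1 && "sketch assumes a non-zero constant limit");
  uint32_t j = 0, trips = 0;
  bool again;
  do {
    ++trips;
    uint32_t inc = j + 1;
    again = inc < limit;
    j = inc;
  } while (again);
  return trips;
}

// Trip count after the rewrite, where the latch compares the phi itself:
//   %cmp = icmp ult i32 %j, limit-1
uint32_t tripCountPhiForm(uint32_t limit) {
  assert(limit >= 1 && "sketch assumes a non-zero constant limit");
  uint32_t j = 0, trips = 0;
  bool again;
  do {
    ++trips;
    again = j < limit - 1;
    j = j + 1;
  } while (again);
  return trips;
}
```

Both functions return the same count for any limit of at least 1, which is why the limit can be recovered as RHS + 1 when the step is 1.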

Diff Detail

Unit Tests: Failed

	Time	Test
	2,730 ms	x64 debian > libarcher.critical::critical.c
	2,890 ms	x64 debian > libarcher.races::critical-unrelated.c
	2,610 ms	x64 debian > libarcher.races::lock-nested-unrelated.c
	2,570 ms	x64 debian > libarcher.races::lock-unrelated.c
	2,670 ms	x64 debian > libarcher.races::parallel-simple.c
		(17 failed in total)

Event Timeline

RosieSumpter created this revision. Jul 12 2021, 2:12 AM

Herald added a subscriber: hiraditya. Jul 12 2021, 2:12 AM

RosieSumpter requested review of this revision. Jul 12 2021, 2:12 AM

Herald added a project: Restricted Project. Jul 12 2021, 2:12 AM

Herald added a subscriber: llvm-commits.

SjoerdMeijer added reviewers: dmgreen, Whitney, eopXD, fhahn. Jul 12 2021, 2:48 AM

Harbormaster completed remote builds in B113456: Diff 357866. Jul 12 2021, 2:50 AM

SjoerdMeijer added inline comments. Jul 12 2021, 5:18 AM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
169	Make this an `else if`?
178	I am wondering what happens if the RHS is not a constant, so instead of this:
	    %cmp = icmp ult i32 %j, 18
	we have something like:
	    %cmp = icmp ult i32 %j, %N
	Do we reject that earlier, and/or do we not have these test cases yet?
181	And I was also wondering if we need to worry about RHS being INT_MAX, in which case the + 1 will overflow.
395	Perhaps here we can have:
	    // The use is in the compare which is also the condition of the inner
	    // branch. In this case the compare has been altered by another
	    // transformation (e.g icmp ult %inc, limit -> icmp ult %j, limit-1).
	    // Ignore this use as the compare gets removed later anyway.
	    if (U == FI.InnerBranch->getCondition())
	      continue;
	To make things a bit simpler.
llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348	You need to `CHECK:` something here.
379	Same here.
llvm/test/Transforms/LoopFlatten/loop-flatten.ll
589	It's more common to put the CHECK-LABEL just before ; CHECK: entry:

fhahn added inline comments. Jul 12 2021, 1:26 PM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
175	Those kinds of patterns are very fragile in general. Would it be possible to use SCEV instead, which makes it easy to analyse the induction variables & trip counts in a more robust way?

SjoerdMeijer added inline comments. Jul 13 2021, 1:09 AM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
175	You're absolutely right. In fact, there's quite some pattern matching going on for things like getLoopTest, getLoopIncrement, etc., that you'd probably expect to be present in a loop util helper function or something like that, but they aren't. The pattern matching makes some things easier though (as we are looking for some specific patterns). I feel that moving some things to helpers and using SCEV is major surgery, and if we promise not to add more pattern matching after this, I was hoping to postpone this surgery. I.e., we are on a little mission to get LoopFlatten enabled by default (we have had this enabled downstream for years now), and before we do that we wanted to fix 2 minor things: this minor extension for a case which you expect to trigger, and we need to strengthen an overflow check. After this, I think a nice follow-up project is indeed to see if we can move code to helpers and use SCEV.

Changed else to else if
Amended comments
Moved 'CHECK-LABEL' statement
Added test for when incoming phi node value for preheader is a variable (not zero)
Fixed debug output for when incoming phi node value can't be cast to ConstantInt (gives null which causes a crash when dump() is used)

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
178	It shouldn't get here unless it is constant, since this only catches the case where the compare which is the condition of the back edge has been changed from e.g.
	    %cmp = icmp ult i32 %inc, 19
	to
	    %cmp = icmp ult i32 %j, 18
	If the RHS is unknown, e.g.
	    %cmp = icmp ult i32 %inc, %N
	it won't have been changed and so will be caught by one of the previous pattern matches. I've altered the comments so that it's clearer that this match is only for the case when the limit is a constant, and also added an assert to ensure that the LHS of the compare is the induction phi.
181	So here the limit is RHS+1 because another transformation changed the compare from
	    %cmp = icmp ult i32 %inc, limit
	to
	    %cmp = icmp ult i32 %j, limit-1
	(where limit is a known constant), so the only way RHS=limit-1 would be INT_MAX is if limit had already overflowed, so I think we don't need to check for it here?
llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348	The only CHECK in this file is at the beginning:
	    CHECK-NOT: Checks all passed, doing the transformation
	which checks that all these tests fail, doesn't it? Or should I be doing more specific checks here?
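
For reference, the overflow question on line 181 can be illustrated with a standalone sketch. This is not the code from the patch: `recoverLimit` is a hypothetical helper, and it assumes a 32-bit unsigned limit (the compare is `icmp ult`), whereas the comment above speaks of INT_MAX. It only shows what a defensive check would look like if one did not want to rely on the argument that the case cannot arise.

```cpp
#include <cstdint>
#include <limits>

// Recover the original limit from the RHS of the rewritten compare
// (RHS == limit - 1). Returns false when RHS + 1 would wrap around in
// 32-bit unsigned arithmetic; in that case `limit` is left untouched.
bool recoverLimit(uint32_t rhs, uint32_t &limit) {
  if (rhs == std::numeric_limits<uint32_t>::max())
    return false; // rhs + 1 would wrap to 0
  limit = rhs + 1;
  return true;
}
```

With `rhs == 18` the helper yields a limit of 19, matching the example in the comment on line 178.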

fhahn added inline comments. Jul 14 2021, 4:57 AM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
175	> I was hoping to postpone this surgery. I.e., we are on a little mission to get LoopFlatten enabled by default
	I understand the motivation, but I am not sure it is the best way forward. If SCEV is the right tool to use, I think it would be great to make the switch before enabling it by default; I am not too familiar with the details, but it looks like the pattern matching of induction variables and compares to get the loop trip counts seems like something that SCEV already provides. After it is enabled by default, the motivation for a substantial refactoring might be smaller than before enabling it. When enabling it by default, it is also beneficial for the pass to be as general as possible, so fundamental problems can be flushed out early and the compile-time/code-size impact is also representative on a large set of inputs. Also, not using SCEV makes it harder to reason about the pass, because we can't rely on trusting SCEV to correctly handle all the difficult cases with respect to overflows and such when it comes to trip counts and inductions. We instead need to verify the manual patterns and their interaction with the dependent code. For example, it seems like using SCEV would allow us to be more confident when reasoning about some of the issues in the comments below.

Harbormaster completed remote builds in B113944: Diff 358556. Jul 14 2021, 5:15 AM

SjoerdMeijer added inline comments. Jul 14 2021, 6:42 AM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
175	Alright, fair enough, let's then first see what using SCEV exactly means and would involve.
	> I am not too familiar with the details, but it looks like the pattern matching of induction variables and compares to get the loop trip counts seems like something that SCEV already provides.
	Yeah, the trip count is the easy part, but it's also the other loop components like the increment, test, etc., which are not provided by existing APIs. Thus, with major surgery, I probably also meant getting rid of `findLoopComponents` and adding something similar to Loop or LoopUtils, possibly using SCEV more. This little pattern match extension here might still be innocent and the way to go, but like I said, let's review SCEV usage first then.

SjoerdMeijer mentioned this in D106045: [LoopFlatten] Use Loop to identify loop induction phi. NFC. Jul 15 2021, 1:42 AM

Other patches have refactored parts of LoopFlatten's code to use SCEV and Loop instead of pattern matching (see https://reviews.llvm.org/D106580 and https://reviews.llvm.org/D106256) so this diff has been updated to be on top of these changes.

Harbormaster completed remote builds in B116471: Diff 362081. Jul 27 2021, 12:19 PM

SjoerdMeijer added inline comments. Jul 28 2021, 1:32 AM

llvm/lib/Transforms/Scalar/LoopFlatten.cpp
157–158	Nit: every time I need to read this `if` a couple of times, but I think it makes sense. Just for readability, to reduce indentation inside the if a little bit, I was wondering if this would help:
	    if (SE->getSCEV(TripCount) != SCEVTripCount && !IsWidened) { .. }
	    else if (SE->getSCEV(TripCount) != SCEVTripCount) { .. }
162	Nit: what was this `true` argument again? I would guess, though, that we don't need...
llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348	Ah, sorry, I had missed that. It's indeed fine like this.

Replaced checks on RHS of compare with asserts
Slight rewriting to improve readability

Thanks, much easier to read now, LGTM!

This revision is now accepted and ready to land. Jul 28 2021, 3:40 AM

Harbormaster completed remote builds in B116644: Diff 362318. Jul 28 2021, 4:02 AM

Closed by commit rG2df8bf9339e4: [LoopFlatten] Fix missed LoopFlatten opportunity (authored by RosieSumpter). Jul 29 2021, 1:53 AM

This revision was automatically updated to reflect the committed changes.

RosieSumpter added a commit: rG2df8bf9339e4: [LoopFlatten] Fix missed LoopFlatten opportunity.

RosieSumpter added a reverting change: rGfab5659c7941: Revert "[LoopFlatten] Fix missed LoopFlatten opportunity". Jul 29 2021, 7:53 AM

RosieSumpter reopened this revision. Aug 2 2021, 1:21 AM

This revision is now accepted and ready to land. Aug 2 2021, 1:21 AM

Added check and test for trip count

I've run downstream testing for this patch, and that came back okay. LGTM.

Harbormaster completed remote builds in B117393: Diff 363391. Aug 2 2021, 2:05 AM

Closed by commit rGf117ed542fd2: [LoopFlatten] Fix missed LoopFlatten opportunity (authored by RosieSumpter). Aug 2 2021, 3:17 AM

This revision was automatically updated to reflect the committed changes.

RosieSumpter added a commit: rGf117ed542fd2: [LoopFlatten] Fix missed LoopFlatten opportunity.

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/LoopFlatten.cpp	22 lines
llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll	64 lines
llvm/test/Transforms/LoopFlatten/loop-flatten.ll	53 lines

Diff 357866

llvm/lib/Transforms/Scalar/LoopFlatten.cpp

(148 lines hidden)	static bool findLoopComponents(
ICmpInst *Compare = dyn_cast<ICmpInst>(BackBranch->getCondition());		ICmpInst *Compare = dyn_cast<ICmpInst>(BackBranch->getCondition());
if (!Compare \|\| !IsValidPredicate(Compare->getUnsignedPredicate()) \|\|		if (!Compare \|\| !IsValidPredicate(Compare->getUnsignedPredicate()) \|\|
Compare->hasNUsesOrMore(2)) {		Compare->hasNUsesOrMore(2)) {
LLVM_DEBUG(dbgs() << "Could not find valid comparison\n");		LLVM_DEBUG(dbgs() << "Could not find valid comparison\n");
return false;		return false;
}		}
IterationInstructions.insert(Compare);		IterationInstructions.insert(Compare);
LLVM_DEBUG(dbgs() << "Found comparison: "; Compare->dump());		LLVM_DEBUG(dbgs() << "Found comparison: "; Compare->dump());

// Find increment and limit from the compare		// Find increment and limit from the compare
Increment = nullptr;		Increment = nullptr;
if (match(Compare->getOperand(0),		if (match(Compare->getOperand(0),
m_c_Add(m_Specific(InductionPHI), m_ConstantInt<1>()))) {		m_c_Add(m_Specific(InductionPHI), m_ConstantInt<1>()))) {
Increment = dyn_cast<BinaryOperator>(Compare->getOperand(0));		Increment = dyn_cast<BinaryOperator>(Compare->getOperand(0));
Limit = Compare->getOperand(1);		Limit = Compare->getOperand(1);
} else if (Compare->getUnsignedPredicate() == CmpInst::ICMP_NE &&		} else if (Compare->getUnsignedPredicate() == CmpInst::ICMP_NE &&
match(Compare->getOperand(1),		match(Compare->getOperand(1),
m_c_Add(m_Specific(InductionPHI), m_ConstantInt<1>()))) {		m_c_Add(m_Specific(InductionPHI), m_ConstantInt<1>()))) {
Increment = dyn_cast<BinaryOperator>(Compare->getOperand(1));		Increment = dyn_cast<BinaryOperator>(Compare->getOperand(1));
Limit = Compare->getOperand(0);		Limit = Compare->getOperand(0);
		} else {
		// The compare may have been altered by another transformation
		// (e.g icmp ult %inc, limit -> icmp ult %j, limit-1).
		// In this case the increment is obtained from the InductionPHI
		// and the limit is the RHS of the compare + 1.
		Value *LatchValue = InductionPHI->getIncomingValueForBlock(Latch);
		if (match(LatchValue,
		m_c_Add(m_Specific(InductionPHI), m_ConstantInt<1>()))) {
		Increment = dyn_cast<BinaryOperator>(LatchValue);
		ConstantInt *RHS = cast<ConstantInt>(Compare->getOperand(1));
		ConstantInt *One = ConstantInt::get(RHS->getType(), 1, true);
		Limit = ConstantInt::get(Compare->getContext(),
		RHS->getValue() + One->getValue());
		}
}		}
if (!Increment \|\| Increment->hasNUsesOrMore(3)) {		if (!Increment \|\| Increment->hasNUsesOrMore(3)) {
LLVM_DEBUG(dbgs() << "Cound not find valid increment\n");		LLVM_DEBUG(dbgs() << "Cound not find valid increment\n");
return false;		return false;
}		}
IterationInstructions.insert(Increment);		IterationInstructions.insert(Increment);
LLVM_DEBUG(dbgs() << "Found increment: "; Increment->dump());		LLVM_DEBUG(dbgs() << "Found increment: "; Increment->dump());
LLVM_DEBUG(dbgs() << "Found limit: "; Limit->dump());		LLVM_DEBUG(dbgs() << "Found limit: "; Limit->dump());
(196 lines hidden)	for (User *U : FI.InnerInductionPHI->users()) {

// Matches the same pattern as above, except it also looks for truncs		// Matches the same pattern as above, except it also looks for truncs
// on the phi, which can be the result of widening the induction variables.		// on the phi, which can be the result of widening the induction variables.
bool IsAddTrunc = match(U, m_c_Add(m_Trunc(m_Specific(FI.InnerInductionPHI)),		bool IsAddTrunc = match(U, m_c_Add(m_Trunc(m_Specific(FI.InnerInductionPHI)),
m_Value(MatchedMul))) &&		m_Value(MatchedMul))) &&
match(MatchedMul,		match(MatchedMul,
m_c_Mul(m_Trunc(m_Specific(FI.OuterInductionPHI)),		m_c_Mul(m_Trunc(m_Specific(FI.OuterInductionPHI)),
m_Value(MatchedItCount)));		m_Value(MatchedItCount)));

if ((IsAdd \|\| IsAddTrunc) && MatchedItCount == InnerLimit) {		if ((IsAdd \|\| IsAddTrunc) && MatchedItCount == InnerLimit) {
LLVM_DEBUG(dbgs() << "Use is optimisable\n");		LLVM_DEBUG(dbgs() << "Use is optimisable\n");
ValidOuterPHIUses.insert(MatchedMul);		ValidOuterPHIUses.insert(MatchedMul);
FI.LinearIVUses.insert(U);		FI.LinearIVUses.insert(U);
} else {		} else if (U == FI.InnerBranch->getCondition())
		// The use is in the compare which is also the condition of the inner
		// branch. In this case the compare has been altered by another
		// transformation (e.g icmp ult %inc, limit -> icmp ult %j, limit-1).
		// Ignore this use as the compare gets removed later anyway.
		continue;
		else {
LLVM_DEBUG(dbgs() << "Did not match expected pattern, bailing\n");		LLVM_DEBUG(dbgs() << "Did not match expected pattern, bailing\n");
return false;		return false;
}		}
}		}

// Check that there are no uses of the outer IV other than the ones found		// Check that there are no uses of the outer IV other than the ones found
// as part of the pattern above.		// as part of the pattern above.
for (User *U : FI.OuterInductionPHI->users()) {		for (User *U : FI.OuterInductionPHI->users()) {
(343 lines hidden)

llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll

(335 lines hidden)	for.inc6: ; preds = %for.body3
%inc7 = add nuw nsw i32 %i.018, 1		%inc7 = add nuw nsw i32 %i.018, 1
%exitcond19 = icmp ne i32 %inc7, 10		%exitcond19 = icmp ne i32 %inc7, 10
br i1 %exitcond19, label %for.body, label %for.end8		br i1 %exitcond19, label %for.body, label %for.end8

for.end8: ; preds = %for.inc6		for.end8: ; preds = %for.inc6
ret i32 10		ret i32 10
}		}

		; test_10 and test_11 are for the case when the inner limit is a
		; defined integer (e.g. 20), then InstCombine makes the transformation:
		; icmp ult i32 %inc, 20 -> icmp ult i32 %j, 20-step.

		; test_10: If the step is not 1, the loop shouldn't be flattened.
		define i32 @test_10(i32* nocapture %A) {
		entry:
		br label %for.cond1.preheader

		for.cond1.preheader:
		%i.017 = phi i32 [ 0, %entry ], [ %inc, %for.cond.cleanup3 ]
		%mul = mul i32 %i.017, 20
		br label %for.body4

		for.body4:
		%j.016 = phi i32 [ 0, %for.cond1.preheader ], [ %add5, %for.body4 ]
		%add = add i32 %j.016, %mul
		%arrayidx = getelementptr inbounds i32, i32* %A, i32 %add
		store i32 30, i32* %arrayidx, align 4
		%add5 = add nuw nsw i32 %j.016, 2
		%cmp2 = icmp ult i32 %j.016, 18
		br i1 %cmp2, label %for.body4, label %for.cond.cleanup3

		for.cond.cleanup3:
		%inc = add i32 %i.017, 1
		%cmp = icmp ult i32 %inc, 11
		br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup

		for.cond.cleanup:
		%0 = load i32, i32* %A, align 4
		ret i32 %0
		}

		; test_11: The inner inducation variable is used in a compare which
		; isn't the condition of the inner branch.
		define i32 @test_11(i32* nocapture %A) {
		entry:
		br label %for.cond1.preheader

		for.cond1.preheader:
		%i.020 = phi i32 [ 0, %entry ], [ %inc7, %for.cond.cleanup3 ]
		%mul = mul i32 %i.020, 20
		br label %for.body4

		for.body4:
		%j.019 = phi i32 [ 0, %for.cond1.preheader ], [ %inc, %for.body4 ]
		%cmp5 = icmp ult i32 %j.019, 5
		%cond = select i1 %cmp5, i32 30, i32 15
		%add = add i32 %j.019, %mul
		%arrayidx = getelementptr inbounds i32, i32* %A, i32 %add
		store i32 %cond, i32* %arrayidx, align 4
		%inc = add nuw nsw i32 %j.019, 1
		%cmp2 = icmp ult i32 %j.019, 19
		br i1 %cmp2, label %for.body4, label %for.cond.cleanup3

		for.cond.cleanup3:
		%inc7 = add i32 %i.020, 1
		%cmp = icmp ult i32 %inc7, 11
		br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup

		for.cond.cleanup:
		%0 = load i32, i32* %A, align 4
		ret i32 %0
		}

; Outer loop conditional phi		; Outer loop conditional phi
define i32 @e() {		define i32 @e() {
entry:		entry:
br label %for.body		br label %for.body

for.body: ; preds = %entry, %for.end16		for.body: ; preds = %entry, %for.end16
%f.033 = phi i32 [ 0, %entry ], [ %inc18, %for.end16 ]		%f.033 = phi i32 [ 0, %entry ], [ %inc18, %for.end16 ]
(259 lines hidden)
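
Why test_10 must be rejected can be seen by modelling the two inner loops outside of LLVM. The sketch below is an illustration, not part of the patch: `indicesWritten` is a hypothetical helper that replays the stores of an 11-iteration outer loop whose bottom-tested inner loop compares the phi %j against a constant, as in test_10 (step 2, `icmp ult %j, 18`) and in the step-1 form `icmp ult %j, 19`.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Collect the indices of A written by the loop nest. The inner latch
// compares the phi value (before the increment) against cmpRhs.
std::set<uint32_t> indicesWritten(uint32_t step, uint32_t cmpRhs) {
  assert(step >= 1 && "a zero step would never terminate");
  std::set<uint32_t> written;
  for (uint32_t i = 0; i < 11; ++i) { // outer loop runs for i = 0..10
    uint32_t j = 0;
    bool again;
    do {
      written.insert(j + i * 20);     // %add = add i32 %j, %mul
      again = j < cmpRhs;             // icmp ult i32 %j, cmpRhs
      j += step;                      // increment of the inner phi
    } while (again);
  }
  return written;
}
```

With step 1 and RHS 19 the nest writes all 220 consecutive elements, which is what a flattened loop with trip count 20 * 11 would write. With step 2 and RHS 18 it writes only 110 elements (every other one), so treating RHS + 1 as the inner trip count would change behaviour.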

llvm/test/Transforms/LoopFlatten/loop-flatten.ll

	(580 lines hidden)
	; CHECK: %inc7 = add nuw nsw i32 %i.018, 1			; CHECK: %inc7 = add nuw nsw i32 %i.018, 1
	; CHECK: %exitcond19 = icmp eq i32 %inc7, %flatten.tripcount			; CHECK: %exitcond19 = icmp eq i32 %inc7, %flatten.tripcount
	; CHECK: br i1 %exitcond19, label %for.end8, label %for.body			; CHECK: br i1 %exitcond19, label %for.end8, label %for.body

	for.end8: ; preds = %for.inc6			for.end8: ; preds = %for.inc6
	ret i32 10			ret i32 10
	}			}

				; CHECK-LABEL: test9
				; When the inner loop limit is a defined integer (e.g. 20) and the step
				; is 1, InstCombine causes the transformation:
				; icmp ult i32 %inc, 20 -> icmp ult i32 %j, 19.
				; This is an 'unoptimizable' use of the inner induction variable %j but
				; we should still flatten the loop as this compare instruction is
				; removed later anyway.
				define i32 @test9(i32* nocapture %A) {
				entry:
				br label %for.cond1.preheader
				; CHECK: entry:
				; CHECK: %flatten.tripcount = mul i32 20, 11
				; CHECK: br label %for.cond1.preheader

				for.cond1.preheader:
				%i.017 = phi i32 [ 0, %entry ], [ %inc6, %for.cond.cleanup3 ]
				%mul = mul i32 %i.017, 20
				br label %for.body4
				; CHECK: for.cond1.preheader:
				; CHECK: %i.017 = phi i32 [ 0, %entry ], [ %inc6, %for.cond.cleanup3 ]
				; CHECK: %mul = mul i32 %i.017, 20
				; CHECK: br label %for.body4

				for.cond.cleanup3:
				%inc6 = add i32 %i.017, 1
				%cmp = icmp ult i32 %inc6, 11
				br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup
				; CHECK: for.cond.cleanup3:
				; CHECK: %inc6 = add i32 %i.017, 1
				; CHECK: %cmp = icmp ult i32 %inc6, %flatten.tripcount
				; CHECK: br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup

				for.body4:
				%j.016 = phi i32 [ 0, %for.cond1.preheader ], [ %inc, %for.body4 ]
				%add = add i32 %j.016, %mul
				%arrayidx = getelementptr inbounds i32, i32* %A, i32 %add
				store i32 30, i32* %arrayidx, align 4
				%inc = add nuw nsw i32 %j.016, 1
				%cmp2 = icmp ult i32 %j.016, 19
				br i1 %cmp2, label %for.body4, label %for.cond.cleanup3
				; CHECK: for.body4
				; CHECK: %j.016 = phi i32 [ 0, %for.cond1.preheader ]
				; CHECK: %add = add i32 %j.016, %mul
				; CHECK: %arrayidx = getelementptr inbounds i32, i32* %A, i32 %i.017
				; CHECK: store i32 30, i32* %arrayidx, align 4
				; CHECK: %inc = add nuw nsw i32 %j.016, 1
				; CHECK: %cmp2 = icmp ult i32 %j.016, 19
				; CHECK: br label %for.cond.cleanup3

				for.cond.cleanup:
				%0 = load i32, i32* %A, align 4
				ret i32 %0
				}

	declare i32 @func(i32)			declare i32 @func(i32)
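
The effect that the test9 CHECK lines expect (a `%flatten.tripcount` of 20 * 11 and a store indexed directly by the outer induction variable) corresponds to the following source-level rewrite. This is a hedged sketch in plain C++ rather than the pass itself; `runNested`, `runFlattened` and the buffer type are names invented for the illustration.

```cpp
#include <array>
#include <cstdint>

constexpr uint32_t OuterTrip = 11; // outer loop: icmp ult i32 %inc6, 11
constexpr uint32_t InnerTrip = 20; // inner loop: icmp ult i32 %j, 19, step 1
using Buffer = std::array<int, OuterTrip * InnerTrip>;

// Original shape: two loops, index computed as i * 20 + j.
Buffer runNested() {
  Buffer A{};
  for (uint32_t i = 0; i < OuterTrip; ++i)
    for (uint32_t j = 0; j < InnerTrip; ++j)
      A[i * InnerTrip + j] = 30;
  return A;
}

// Flattened shape: one loop over the product of the trip counts,
// indexing A with the single induction variable.
Buffer runFlattened() {
  Buffer A{};
  const uint32_t flattenTripCount = InnerTrip * OuterTrip;
  for (uint32_t i = 0; i < flattenTripCount; ++i)
    A[i] = 30;
  return A;
}
```

Both functions fill the same 220 elements, which is the property the flattening relies on when the only uses of %j are the linear index and the inner branch condition.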